PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS 2 (INTEL® SSE2)

floating-point values, and the destination operand contains the results of the operation (OP) performed in parallel 
on the corresponding values (X0 and Y0, and X1 and Y1) in each operand.

The scalar double-precision floating-point instructions operate on the low (least significant) quadwords of two 
source operands (X0 and Y0), as shown in Figure 11-4. The high quadword (X1) of the first source operand is 
passed through to the destination. The scalar operations are similar to the floating-point operations performed in 
x87 FPU data registers with the precision control field in the x87 FPU control word set for double precision (53-bit 
significand), except that x87 stack operations use a 15-bit exponent range for the result while SSE2 operations use 
an 11-bit exponent range. 
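
The difference between the packed and scalar forms can be sketched with the SSE2 compiler intrinsics from <emmintrin.h>. This is a minimal illustration, not part of the architecture definition: the function names packed_add and scalar_add are hypothetical, ADD stands in for the generic operation OP, and it is assumed that the compiler maps _mm_add_pd and _mm_add_sd to ADDPD and ADDSD (which is typical but not guaranteed).

```c
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Packed operation: out = { x[0] + y[0], x[1] + y[1] }.
   Index 0 is the low quadword (X0, Y0); index 1 is the high quadword (X1, Y1). */
void packed_add(const double x[2], const double y[2], double out[2])
{
    __m128d vx = _mm_loadu_pd(x);            /* unaligned load of X1:X0 */
    __m128d vy = _mm_loadu_pd(y);            /* unaligned load of Y1:Y0 */
    _mm_storeu_pd(out, _mm_add_pd(vx, vy));  /* typically ADDPD */
}

/* Scalar operation: out = { x[0] + y[0], x[1] }.
   Only the low quadwords are added; the high quadword X1 of the
   first source operand is passed through unchanged. */
void scalar_add(const double x[2], const double y[2], double out[2])
{
    __m128d vx = _mm_loadu_pd(x);
    __m128d vy = _mm_loadu_pd(y);
    _mm_storeu_pd(out, _mm_add_sd(vx, vy));  /* typically ADDSD */
}
```

With x = {1.0, 2.0} and y = {10.0, 20.0}, packed_add produces {11.0, 22.0}, while scalar_add produces {11.0, 2.0}.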
See Section 11.6.8, “Compatibility of SIMD and x87 FPU Floating-Point Data Types,” for more information about 
obtaining compatible results when performing scalar double-precision floating-point operations both in XMM 
registers and in x87 FPU data registers.
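
The exponent-range difference described above can be observed with a small experiment. The sketch below is illustrative only and rests on assumptions: the function names are hypothetical, and long double is used as a stand-in for an x87 data register, which holds only where the compiler implements long double as the x87 80-bit format (common with GCC and Clang on x86, not with all compilers). Long double also uses a 64-bit significand rather than the 53-bit precision-control setting, but scaling by 2 is exact in either case, so only the exponent range affects the result.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <float.h>
#include <math.h>

/* Computes (x * 2) / 2 with SSE2 scalar operations (typically MULSD, DIVSD).
   For x = DBL_MAX the product overflows the 11-bit exponent range,
   so the intermediate becomes +infinity and stays infinite after the divide. */
double scaled_roundtrip_sse2(double x)
{
    __m128d two = _mm_set_sd(2.0);
    __m128d v = _mm_set_sd(x);
    v = _mm_mul_sd(v, two);
    v = _mm_div_sd(v, two);
    return _mm_cvtsd_f64(v);
}

/* Computes (x * 2) / 2 in long double. Where long double has a wider
   exponent range than double (assumption: x87 80-bit format), the
   intermediate product of DBL_MAX and 2 remains finite. */
double scaled_roundtrip_extended(double x)
{
    long double v = x;
    v = v * 2.0L;
    v = v / 2.0L;
    return (double)v;
}
```

Under these assumptions, scaled_roundtrip_sse2(DBL_MAX) returns infinity, while scaled_roundtrip_extended(DBL_MAX) returns DBL_MAX on platforms where LDBL_MAX_EXP exceeds DBL_MAX_EXP.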

11.4.1.1   Data Movement Instructions

Data movement instructions move double-precision floating-point data between XMM registers and between XMM 
registers and memory.
The MOVAPD (move aligned packed double-precision floating-point) instruction transfers a 128-bit packed 
double-precision floating-point operand from memory to an XMM register or vice versa, or between XMM registers. 
The memory address must be aligned to a 16-byte boundary; if it is not, a general-protection exception (#GP) is 
generated.
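
The alignment requirement can be illustrated with the aligned load and store intrinsics. This is a minimal sketch under stated assumptions: is_aligned16 and copy_pair_aligned are hypothetical helper names, and compilers typically (but not necessarily) translate _mm_load_pd and _mm_store_pd into MOVAPD or an equivalent aligned move. The explicit check is added here so that a misaligned pointer is rejected in software instead of reaching the aligned access.

```c
#include <emmintrin.h>  /* SSE2 intrinsics */
#include <stdint.h>

/* Returns 1 if p lies on a 16-byte boundary, 0 otherwise. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Copies two packed doubles using an aligned 128-bit load and store.
   Returns 0 on success, or -1 (copying nothing) if either pointer
   is not 16-byte aligned. */
int copy_pair_aligned(double *dst, const double *src)
{
    if (!is_aligned16(dst) || !is_aligned16(src))
        return -1;
    _mm_store_pd(dst, _mm_load_pd(src));  /* typically MOVAPD */
    return 0;
}
```

A buffer declared with _Alignas(16) (C11) satisfies the requirement; a pointer offset by one double (8 bytes) from such a buffer does not.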

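Figure 11-3.  Packed Double-Precision Floating-Point Operations
[Diagram: source operands (X1, X0) and (Y1, Y0) feed two OP units; the destination holds (X1 OP Y1, X0 OP Y0).]

Figure 11-4.  Scalar Double-Precision Floating-Point Operations
[Diagram: source operands (X1, X0) and (Y1, Y0) feed one OP unit on the low quadwords; the destination holds (X1, X0 OP Y0).]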