

PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS 2 (INTEL® SSE2)

floating-point values, and the destination operand contains the results of the operation (OP) performed in parallel
on the corresponding values (X0 and Y0, and X1 and Y1) in each operand.
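The packed form described above can be sketched in C with compiler intrinsics. This is a minimal illustration, not part of the instruction set definition: it assumes an x86 compiler that provides `<emmintrin.h>` with SSE2 enabled, uses addition as the example OP, and the helper name `packed_add` is hypothetical. Index 0 of each array holds the low quadword (X0, Y0) and index 1 the high quadword (X1, Y1).

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics (assumes an x86 target with SSE2) */

/* Hypothetical helper: out[0] = x[0] + y[0], out[1] = x[1] + y[1].
   Each pointer must reference at least two doubles. */
void packed_add(const double *x, const double *y, double *out)
{
    assert(x && y && out);

    __m128d vx = _mm_loadu_pd(x);     /* load X1:X0, no alignment required */
    __m128d vy = _mm_loadu_pd(y);     /* load Y1:Y0, no alignment required */
    __m128d r  = _mm_add_pd(vx, vy);  /* packed add: both lanes in parallel */

    _mm_storeu_pd(out, r);            /* low lane to out[0], high lane to out[1] */
}
```

Calling `packed_add` with x = {1.0, 2.0} and y = {10.0, 20.0} stores 11.0 in `out[0]` and 22.0 in `out[1]`. The unaligned load and store intrinsics are used here only so that the sketch places no alignment requirement on the caller.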

The scalar double-precision floating-point instructions operate on the low (least significant) quadwords of two
source operands (X0 and Y0), as shown in Figure 11-4. The high quadword (X1) of the first source operand is
passed through to the destination. The scalar operations are similar to the floating-point operations performed in
x87 FPU data registers with the precision control field in the x87 FPU control word set for double precision (53-bit
significand), except that x87 stack operations use a 15-bit exponent range for the result while SSE2 operations use
an 11-bit exponent range.
See Section 11.6.8, “Compatibility of SIMD and x87 FPU Floating-Point Data Types,” for more information about
obtaining compatible results when performing scalar double-precision floating-point operations in both XMM
registers and x87 FPU data registers.
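The scalar behavior, including the pass-through of the high quadword, can be sketched the same way. As before, this assumes an x86 compiler with `<emmintrin.h>` and SSE2 enabled, uses addition as the example OP, and `scalar_add` is a hypothetical helper name introduced only for illustration.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics (assumes an x86 target with SSE2) */

/* Hypothetical helper: out[0] = x[0] + y[0], out[1] = x[1] (unchanged).
   Each pointer must reference at least two doubles. */
void scalar_add(const double *x, const double *y, double *out)
{
    assert(x && y && out);

    __m128d vx = _mm_loadu_pd(x);     /* first source operand:  X1:X0 */
    __m128d vy = _mm_loadu_pd(y);     /* second source operand: Y1:Y0 */
    __m128d r  = _mm_add_sd(vx, vy);  /* low lane = X0 + Y0, high lane = X1 */

    _mm_storeu_pd(out, r);
}
```

With x = {1.0, 2.0} and y = {10.0, 20.0}, `out[0]` receives 11.0 while `out[1]` receives 2.0, the high quadword of the first source operand; Y1 does not contribute to the result.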

11.4.1.1 Data Movement Instructions

Data movement instructions move double-precision floating-point data between XMM registers and between XMM
registers and memory.

The MOVAPD (move aligned packed double-precision floating-point) instruction transfers a 128-bit packed
double-precision floating-point operand from memory to an XMM register or vice versa, or between XMM registers.
The memory address must be aligned to a 16-byte boundary; if not, a general-protection exception (#GP) is
generated.
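The alignment requirement can be illustrated with the aligned load and store intrinsics, which compilers typically implement with MOVAPD (the exact instruction chosen is up to the compiler). This is a sketch under stated assumptions: an x86 compiler with `<emmintrin.h>` and SSE2 enabled, and two hypothetical helpers, `is_aligned16` and `copy_aligned_pd`, that check the addresses up front rather than risk the exception.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics (assumes an x86 target with SSE2) */
#include <stdint.h>

/* Hypothetical helper: nonzero if p lies on a 16-byte boundary. */
int is_aligned16(const void *p)
{
    return ((uintptr_t)p & 15u) == 0;
}

/* Hypothetical helper: copy two packed doubles from src to dst through an
   XMM register using aligned accesses. Returns 0 on success, or -1 without
   touching memory if either address is not 16-byte aligned, since an
   aligned access to such an address would raise #GP. */
int copy_aligned_pd(double *dst, const double *src)
{
    assert(dst && src);

    if (!is_aligned16(dst) || !is_aligned16(src))
        return -1;

    __m128d v = _mm_load_pd(src);  /* aligned 128-bit load  (memory to XMM) */
    _mm_store_pd(dst, v);          /* aligned 128-bit store (XMM to memory) */
    return 0;
}
```

A caller can obtain suitably aligned storage with, for example, `_Alignas(16) double buf[2];` in C11. An address 8 bytes past a 16-byte boundary, such as `&buf[1]`, fails the check and the helper returns -1.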

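Figure 11-3. Packed Double-Precision Floating-Point Operations
[Figure: the destination holds X1 OP Y1 in the high quadword and X0 OP Y0 in the low quadword.]

Figure 11-4. Scalar Double-Precision Floating-Point Operations
[Figure: the destination holds X1, passed through from the first source operand, in the high quadword and X0 OP Y0 in the low quadword.]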