11-2 Vol. 1

PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS 2 (INTEL® SSE2)

SSE2 extensions are fully compatible with all software written for IA-32 processors. All existing software continues 
to run correctly, without modification, on processors that incorporate SSE2 extensions, as well as in the presence 
of applications that incorporate these extensions. Enhancements to the CPUID instruction permit detection of the 
SSE2 extensions. Also, because the SSE2 extensions use the same registers as the SSE extensions, no new 
operating-system support is required for saving and restoring program state during a context switch beyond that 
provided for the SSE extensions.
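
As an illustration of the CPUID-based detection mentioned above, the following is a minimal sketch in C. It assumes a GCC- or Clang-compatible toolchain that provides the <cpuid.h> helper __get_cpuid; the function names edx_reports_sse2 and cpu_has_sse2 are hypothetical and chosen for this example only. The SSE2 feature flag is reported in CPUID.01H:EDX, bit 26.

```c
#include <stdint.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>   /* GCC/Clang helper header (toolchain assumption) */
#endif

/* CPUID.01H:EDX bit 26 is the SSE2 feature flag. */
#define CPUID_EDX_SSE2_BIT 26u

/* Hypothetical helper: decode the SSE2 flag from an EDX value
 * returned by CPUID leaf 1. */
static int edx_reports_sse2(uint32_t edx)
{
    return (int)((edx >> CPUID_EDX_SSE2_BIT) & 1u);
}

/* Hypothetical helper: returns 1 if the processor reports SSE2,
 * 0 otherwise (including builds for non-x86 targets). */
static int cpu_has_sse2(void)
{
#if defined(__i386__) || defined(__x86_64__)
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;                 /* CPUID leaf 1 is not available */
    return edx_reports_sse2(edx);
#else
    return 0;
#endif
}
```

An application would typically call cpu_has_sse2() once at startup and select an SSE2 or non-SSE2 code path accordingly. The sketch checks only the processor feature flag; it does not verify that the operating system has enabled support for the SSE state.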
SSE2 extensions are accessible from all IA-32 execution modes: protected mode, real-address mode, and 
virtual-8086 mode.
The following sections in this chapter describe the programming environment for SSE2 extensions, including the 
128-bit XMM floating-point register set, data types, and SSE2 instructions. They also describe exceptions that can be 
generated with the SSE and SSE2 instructions and give guidelines for writing applications with SSE and SSE2 
extensions.
For additional information about SSE2 extensions, see:

Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes 2A & 2B, provide a detailed 
description of individual SSE2 instructions.

Chapter 13, “System Programming for Instruction Set Extensions and Processor Extended States,” in the 
Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A, gives guidelines for integrating 
the SSE and SSE2 extensions into an operating-system environment.

11.2 SSE2 PROGRAMMING ENVIRONMENT

Figure 11-1 shows the programming environment for SSE2 extensions. No new registers or other instruction 
execution state are defined with SSE2 extensions. SSE2 instructions use the XMM registers, the MMX registers, 
and/or IA-32 general-purpose registers, as follows: 

XMM registers — These eight registers (see Figure 10-2) are used to operate on packed or scalar double-
precision floating-point data. Scalar operations are operations performed on individual (unpacked) double-
precision floating-point values stored in the low quadword of an XMM register. XMM registers are also used to 
perform operations on 128-bit packed integer data. They are referenced by the names XMM0 through XMM7.

MXCSR register — This 32-bit register (see Figure 10-3) provides status and control bits used in floating-point 
operations. The denormals-are-zeros and flush-to-zero flags in this register provide a higher-performance 
alternative for the handling of denormal source operands and denormal (underflow) results. For more 

Figure 11-1.  Streaming SIMD Extensions 2 Execution Environment

[Figure: address space from 0 to 2^32 - 1; eight 32-bit general-purpose registers; 32-bit EFLAGS register; eight 64-bit MMX registers; eight 128-bit XMM registers; 32-bit MXCSR register.]
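
The uses of the XMM registers and the MXCSR register described in this section can be sketched in C with compiler intrinsics. This is a minimal example under stated assumptions, not a definitive implementation: it assumes a compiler that provides <emmintrin.h>, and the function and macro names (add_packed_f64, add_scalar_f64, add_packed_i32, enable_ftz_daz, MXCSR_DAZ, MXCSR_FTZ) are hypothetical. The bit positions used are MXCSR bit 6 for denormals-are-zeros and bit 15 for flush-to-zero.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also brings in _mm_getcsr/_mm_setcsr */
#include <stdint.h>

#define MXCSR_DAZ (1u << 6)    /* denormals-are-zeros flag */
#define MXCSR_FTZ (1u << 15)   /* flush-to-zero flag */

/* Packed operation: adds both double-precision values in the register. */
static void add_packed_f64(const double a[2], const double b[2], double out[2])
{
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_pd(va, vb));
}

/* Scalar operation: adds only the low quadword; the high quadword of the
 * result is copied unchanged from the first operand. */
static void add_scalar_f64(const double a[2], const double b[2], double out[2])
{
    __m128d va = _mm_loadu_pd(a);
    __m128d vb = _mm_loadu_pd(b);
    _mm_storeu_pd(out, _mm_add_sd(va, vb));
}

/* 128-bit packed integer operation: four independent 32-bit additions. */
static void add_packed_i32(const int32_t a[4], const int32_t b[4], int32_t out[4])
{
    __m128i va = _mm_loadu_si128((const __m128i *)a);
    __m128i vb = _mm_loadu_si128((const __m128i *)b);
    _mm_storeu_si128((__m128i *)out, _mm_add_epi32(va, vb));
}

/* Sets the flush-to-zero and denormals-are-zeros flags and returns the
 * previous MXCSR value so that the caller can restore it.
 * Assumption: the processor supports the DAZ flag; setting an unsupported
 * MXCSR bit raises a general-protection exception. */
static unsigned int enable_ftz_daz(void)
{
    unsigned int old = _mm_getcsr();
    _mm_setcsr(old | MXCSR_FTZ | MXCSR_DAZ);
    return old;
}
```

With a = {1.0, 2.0} and b = {10.0, 20.0}, add_packed_f64 produces {11.0, 22.0}, while add_scalar_f64 produces {11.0, 2.0}, because only the low quadword takes part in the scalar addition. Code that calls enable_ftz_daz should normally pass the returned value to _mm_setcsr afterward so that the previous floating-point behavior is restored.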