PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS (INTEL® SSE)

10.4.6.4   SFENCE Instruction

The SFENCE (Store Fence) instruction controls write ordering by creating a fence for memory store operations. This 
instruction guarantees that the result of every store instruction that precedes the store fence in program order is 
globally visible before any store instruction that follows the fence. The SFENCE instruction provides an efficient way 
of ensuring ordering between procedures that produce weakly-ordered data and procedures that consume that 
data.
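The producer side of this pattern can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the function name `produce_block`, the `ready` flag, and the use of C11 atomics for the flag are assumptions made for the example. It assumes an x86 compiler that provides `<xmmintrin.h>`, a 16-byte aligned buffer, and an element count that is a multiple of 4.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <xmmintrin.h>   /* _mm_set1_ps, _mm_stream_ps, _mm_sfence */

/* Hypothetical producer: fills buf with non-temporal (weakly-ordered)
 * stores, fences them, then publishes a "ready" flag.
 * Assumes buf is 16-byte aligned and n is a multiple of 4. */
static void produce_block(float *buf, size_t n, float value, atomic_int *ready)
{
    __m128 v = _mm_set1_ps(value);

    for (size_t i = 0; i + 4 <= n; i += 4)
        _mm_stream_ps(buf + i, v);      /* non-temporal store of 4 floats */

    /* SFENCE: every store above becomes globally visible before
     * any store that follows the fence in program order. */
    _mm_sfence();

    /* Flag store follows the fence, so a consumer that observes
     * ready == 1 also observes the data written above. */
    atomic_store_explicit(ready, 1, memory_order_release);
}
```

A consumer would typically wait until it loads `ready == 1` with acquire semantics before reading `buf`.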

10.5   FXSAVE AND FXRSTOR INSTRUCTIONS

The FXSAVE and FXRSTOR instructions were introduced into the IA-32 architecture in the Pentium II processor 
family (prior to the introduction of the SSE extensions). The original versions of these instructions performed a fast 
save and restore, respectively, of the x87 execution environment (x87 state). (By saving the state of the x87 FPU 
data registers, the FXSAVE and FXRSTOR instructions implicitly save and restore the state of the MMX registers.) 
The SSE extensions expanded the scope of these instructions to save and restore the states of the XMM registers 
and the MXCSR register (SSE state), along with x87 state. 
The FXSAVE and FXRSTOR instructions can be used in place of the FSAVE/FNSAVE and FRSTOR instructions; 
however, the operation of the FXSAVE and FXRSTOR instructions is not identical to the operation of 
FSAVE/FNSAVE and FRSTOR.

NOTE

The FXSAVE and FXRSTOR instructions are not considered part of the SSE instruction group. A 
separate CPUID feature bit indicates whether they are present: the instructions are supported if 
CPUID.01H:EDX.FXSR[bit 24] = 1. 

The CPUID feature bit for SSE extensions does not indicate the presence of FXSAVE and FXRSTOR.
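A feature check along these lines might look as follows in C. This is a sketch under stated assumptions: it relies on the `__get_cpuid` helper from the GCC/Clang header `<cpuid.h>`, and the function and macro names are hypothetical. The FXSR bit position comes from the note above; the SSE bit position (CPUID.01H:EDX bit 25) is not stated in this section and is included only to show that the two checks are independent.

```c
#include <cpuid.h>     /* __get_cpuid: GCC/Clang helper (toolchain assumption) */
#include <stdbool.h>

#define CPUID_01_EDX_FXSR (1u << 24)   /* FXSAVE/FXRSTOR supported */
#define CPUID_01_EDX_SSE  (1u << 25)   /* SSE extensions supported */

/* Returns EDX of CPUID leaf 01H, or 0 if the leaf is unavailable. */
static unsigned int cpuid_01_edx(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return edx;
}

/* FXSAVE/FXRSTOR must be tested with their own bit, not the SSE bit. */
static bool cpu_has_fxsr(void)
{
    return (cpuid_01_edx() & CPUID_01_EDX_FXSR) != 0;
}

static bool cpu_has_sse(void)
{
    return (cpuid_01_edx() & CPUID_01_EDX_SSE) != 0;
}
```

Software that saves SSE state with FXSAVE would call both checks rather than infer one from the other.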

The FXSAVE and FXRSTOR instructions organize x87 state and SSE state in a region of memory called the FXSAVE 
area. Section 10.5.1 provides details of the FXSAVE area and its format. Section 10.5.2 describes the operation of 
FXSAVE, and Section 10.5.3 describes the operation of FXRSTOR.

10.5.1   FXSAVE Area

The FXSAVE and FXRSTOR instructions organize x87 state and SSE state in a region of memory called the FXSAVE 
area. Each of the instructions takes a memory operand that specifies the 16-byte aligned base address of the 
FXSAVE area on which it operates.
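The alignment requirement can be illustrated with a short C sketch. It assumes an x86 compiler whose `<immintrin.h>` provides the `_fxsave` and `_fxrstor` intrinsics with FXSR support enabled (the usual default for x86-64 targets). The type and function names are hypothetical, and the 512-byte size of the area is taken from the architecture definition rather than from the text above.

```c
#include <immintrin.h>   /* _fxsave, _fxrstor */
#include <stdalign.h>
#include <stdint.h>

/* Hypothetical wrapper type: a 512-byte FXSAVE area whose base
 * address the compiler aligns to 16 bytes. */
typedef struct {
    alignas(16) uint8_t bytes[512];
} fxsave_area_t;

/* Saves x87 and SSE state to the area at p.
 * Returns 0 on success, -1 if p is not 16-byte aligned. */
static int save_fx_state(void *p)
{
    if (((uintptr_t)p & 0xF) != 0)
        return -1;           /* a misaligned operand would fault */
    _fxsave(p);
    return 0;
}

/* Restores x87 and SSE state from an area previously filled by FXSAVE. */
static int restore_fx_state(void *p)
{
    if (((uintptr_t)p & 0xF) != 0)
        return -1;
    _fxrstor(p);
    return 0;
}
```

Declaring the buffer through the aligned type, instead of a plain byte array, lets the compiler enforce the alignment for both stack and static storage.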

Table 10-1.  PREFETCHh Instructions Caching Hints (Contd.)

PREFETCHh Instruction Mnemonic    Actions

PREFETCHT2                        Temporal data: fetch data into level 2 cache and higher
                                  • Pentium III processor: 2nd-level cache
                                  • Pentium 4 and Intel Xeon processor: 2nd-level cache

PREFETCHNTA                       Non-temporal data: fetch data into location close to the processor, minimizing cache pollution
                                  • Pentium III processor: 1st-level cache
                                  • Pentium 4 and Intel Xeon processor: 2nd-level cache
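The hints in Table 10-1 can be issued from C through the `_mm_prefetch` intrinsic. The sketch below is illustrative only: the function name `sum_streaming` and the prefetch distance are hypothetical, and a suitable distance depends on the processor and workload. It uses the non-temporal hint for data that is read once; `_MM_HINT_T2` would select PREFETCHT2 instead.

```c
#include <stddef.h>
#include <xmmintrin.h>   /* _mm_prefetch, _MM_HINT_NTA, _MM_HINT_T2 */

#define PREFETCH_DISTANCE 64   /* elements ahead; hypothetical tuning value */

/* Sums an array that is read exactly once, hinting upcoming data
 * with PREFETCHNTA to limit cache pollution. */
static float sum_streaming(const float *data, size_t n)
{
    float total = 0.0f;

    for (size_t i = 0; i < n; i++) {
        /* Issue one hint per 16 floats (64 bytes), staying in bounds. */
        if (i % 16 == 0 && i + PREFETCH_DISTANCE < n)
            _mm_prefetch((const char *)(data + i + PREFETCH_DISTANCE),
                         _MM_HINT_NTA);
        total += data[i];
    }
    return total;
}
```

The prefetch is only a hint, so the result is the same with or without it; only the cache behavior may differ.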