10-16 Vol. 1
PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS (INTEL® SSE)
• Byte 4 is used for an abridged version of the x87 FPU Tag Word (FTW). The following items describe its usage:
  — For each j, 0 ≤ j ≤ 7, FXSAVE saves a 0 into bit j of byte 4 if x87 FPU data register STj has an empty tag;
    otherwise, FXSAVE saves a 1 into bit j of byte 4.
  — For each j, 0 ≤ j ≤ 7, FXRSTOR establishes the tag value for x87 FPU data register STj as follows. If bit j of
    byte 4 is 0, the tag for STj in the x87 FPU tag register is marked empty (11B); otherwise, the
    x87 FPU sets the tag for STj based on the value being loaded into that register (see below).
• Bytes 15:8 are used as follows:
  — If the instruction has no REX prefix, or if REX.W = 0:
    • Bytes 11:8 are used for bits 31:0 of the x87 FPU Instruction Pointer Offset (FIP).
    • If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 0, bytes 13:12 are used for x87 FPU Instruction Pointer
      Selector (FPU CS). Otherwise, the processor deprecates the FPU CS value: FXSAVE saves it as 0000H.
    • Bytes 15:14 are not used.
  — If the instruction has a REX prefix with REX.W = 1, bytes 15:8 are used for the full 64 bits of FIP.
• Bytes 23:16 are used as follows:
  — If the instruction has no REX prefix, or if REX.W = 0:
    • Bytes 19:16 are used for bits 31:0 of the x87 FPU Data Pointer Offset (FDP).
    • If CPUID.(EAX=07H,ECX=0H):EBX[bit 13] = 0, bytes 21:20 are used for x87 FPU Data Pointer Selector
      (FPU DS). Otherwise, the processor deprecates the FPU DS value: FXSAVE saves it as 0000H.
    • Bytes 23:22 are not used.
  — If the instruction has a REX prefix with REX.W = 1, bytes 23:16 are used for the full 64 bits of FDP.
• Bytes 31:24 are used for SSE state (see Section 10.5.1.2).
• Bytes 159:32 are used for the registers ST0–ST7 (MM0–MM7). Each of the 8 registers is allocated a 128-bit
  region, with the low 80 bits used for the register and the upper 48 bits unused.
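The x87 portion of this layout can be read back from a saved image with a few offset calculations. The following C sketch is illustrative only. The names x87_fields, load_le, decode_x87_fields, st_is_empty, and st_slot are hypothetical, and the rex_w argument is an assumption standing in for whether the image was written by an FXSAVE executed with REX.W = 1. The sketch does not consult CPUID.(EAX=07H,ECX=0H):EBX[bit 13]; when that bit is 1, the selector fields simply read back as 0000H.

```c
#include <stdint.h>

/* Hypothetical container for the x87 fields discussed above. */
typedef struct {
    uint8_t  abridged_ftw; /* byte 4 */
    uint64_t fip;          /* bytes 11:8, or bytes 15:8 when REX.W = 1 */
    uint16_t fcs;          /* bytes 13:12, 32-bit format only */
    uint64_t fdp;          /* bytes 19:16, or bytes 23:16 when REX.W = 1 */
    uint16_t fds;          /* bytes 21:20, 32-bit format only */
} x87_fields;

/* Read nbytes (at most 8) in little-endian order, independent of the host. */
static uint64_t load_le(const uint8_t *p, unsigned nbytes)
{
    uint64_t v = 0;
    for (unsigned i = 0; i < nbytes; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* area points to an FXSAVE image; rex_w is nonzero if it was saved with REX.W = 1. */
static x87_fields decode_x87_fields(const uint8_t *area, int rex_w)
{
    x87_fields out;
    out.abridged_ftw = area[4];
    if (rex_w) {
        /* 64-bit format: no selector fields, full 64-bit offsets. */
        out.fip = load_le(area + 8, 8);
        out.fdp = load_le(area + 16, 8);
        out.fcs = 0;
        out.fds = 0;
    } else {
        /* 32-bit format: 32-bit offset, then 16-bit selector, then 2 unused bytes. */
        out.fip = load_le(area + 8, 4);
        out.fcs = (uint16_t)load_le(area + 12, 2);
        out.fdp = load_le(area + 16, 4);
        out.fds = (uint16_t)load_le(area + 20, 2);
    }
    return out;
}

/* Bit j of byte 4 is 0 when STj is tagged empty and 1 otherwise. */
static int st_is_empty(uint8_t abridged_ftw, unsigned j)
{
    return ((abridged_ftw >> j) & 1u) == 0;
}

/* STj occupies a 16-byte slot starting at byte 32; only the low 10 bytes are used. */
static const uint8_t *st_slot(const uint8_t *area, unsigned j)
{
    return area + 32 + 16 * j;
}
```

Reading the fields byte by byte keeps the sketch independent of host endianness and of structure padding rules, at the cost of a few extra shifts.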
10.5.1.2 SSE State
Table 10-2 illustrates how FXSAVE and FXRSTOR organize x87 state and SSE state; the SSE state is listed below,
along with details of its interactions with FXSAVE and FXRSTOR:
• Bytes 23:0 are used for x87 state (see Section 10.5.1.1).
• Bytes 27:24 are used for the MXCSR register. FXRSTOR generates a general-protection fault (#GP) in response
  to an attempt to set any of the reserved bits in the MXCSR register.
• Bytes 31:28 are used for the MXCSR_MASK value. FXRSTOR ignores this field.
• Bytes 159:32 are used for x87 state.
• Bytes 287:160 are used for the registers XMM0–XMM7.
• Bytes 415:288 are used for the registers XMM8–XMM15. These fields are used only in 64-bit mode. Executions
  of FXSAVE outside 64-bit mode do not write to these bytes; executions of FXRSTOR outside 64-bit mode do not
  read these bytes and do not update XMM8–XMM15.
If CR4.OSFXSR = 0, FXSAVE and FXRSTOR may or may not operate on SSE state; this behavior is implementation
dependent. Moreover, SSE instructions cannot be used unless CR4.OSFXSR = 1.
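The SSE offsets listed in this section can be expressed as a short C sketch. The identifiers below (the FXSAVE_* constants, xmm_offset, and mxcsr_would_fault) are hypothetical names chosen for illustration. The reserved-bit check is an assumption drawn from Intel's general guidelines for writing MXCSR rather than from this section: bits that are 0 in MXCSR_MASK are treated as reserved, and a saved MXCSR_MASK of 00000000H is interpreted as the default 0000FFBFH.

```c
#include <stdint.h>

/* Byte offsets taken from the list in this section. */
enum {
    FXSAVE_MXCSR_OFFSET      = 24,  /* bytes 27:24 */
    FXSAVE_MXCSR_MASK_OFFSET = 28,  /* bytes 31:28 */
    FXSAVE_XMM_BASE          = 160, /* XMM0 starts at byte 160 */
    FXSAVE_XMM_SIZE          = 16   /* each XMM register is 128 bits */
};

/*
 * Return the byte offset of XMMn inside an FXSAVE image, or -1 if the
 * register is not part of the image in the given mode. XMM8 to XMM15 are
 * saved and restored only in 64-bit mode.
 */
static int xmm_offset(unsigned n, int in_64bit_mode)
{
    if (n > 15 || (n > 7 && !in_64bit_mode))
        return -1;
    return FXSAVE_XMM_BASE + FXSAVE_XMM_SIZE * (int)n;
}

/*
 * Predict whether loading this MXCSR value would set a reserved bit and
 * therefore cause FXRSTOR to raise #GP. Assumption (not stated in this
 * section): a 0 bit in MXCSR_MASK marks a reserved MXCSR bit, and a mask
 * of 0 stands for the default 0x0000FFBF.
 */
static int mxcsr_would_fault(uint32_t mxcsr, uint32_t mxcsr_mask)
{
    if (mxcsr_mask == 0)
        mxcsr_mask = 0x0000FFBFu;
    return (mxcsr & ~mxcsr_mask) != 0;
}
```

Software that edits a saved image before restoring it can clear the bits that mxcsr_would_fault reports, since FXRSTOR itself ignores the MXCSR_MASK field in the image.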
10.5.2 Operation of FXSAVE
The FXSAVE instruction takes a single memory operand, which is an FXSAVE area. The instruction stores x87 state
and SSE state to the FXSAVE area. See Section 10.5.1.1 and Section 10.5.1.2 for details regarding mode-specific
operation and operation determined by instruction prefixes.
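As an illustration, FXSAVE can be invoked from C through the _fxsave compiler intrinsic declared in <immintrin.h>. The sketch below assumes a compiler that defines __FXSR__ (or targets _M_X64) when the intrinsic is usable. The fxsave_area type and the capture_fxsave function are hypothetical names. The 512-byte size and 16-byte alignment reflect the requirements of the FXSAVE memory operand, which are specified elsewhere in the documentation of the instruction rather than in this section. On other targets the function only zeroes the buffer and reports that nothing was saved.

```c
#include <stdint.h>
#include <string.h>

#if defined(__FXSR__) || defined(_M_X64)
#include <immintrin.h>
#define HAVE_FXSAVE 1
#else
#define HAVE_FXSAVE 0
#endif

/* Hypothetical wrapper type: a 512-byte FXSAVE area aligned to 16 bytes. */
typedef struct {
    _Alignas(16) uint8_t bytes[512];
} fxsave_area;

/*
 * Store the current x87 state and SSE state into *area.
 * Returns 1 if FXSAVE was executed, 0 if this build has no FXSAVE support.
 */
static int capture_fxsave(fxsave_area *area)
{
    /* Clear first so that bytes FXSAVE does not write have a known value. */
    memset(area->bytes, 0, sizeof area->bytes);
#if HAVE_FXSAVE
    _fxsave(area->bytes);
    return 1;
#else
    return 0;
#endif
}
```

In 64-bit code, compilers that provide _fxsave generally also provide _fxsave64, which corresponds to the REX.W = 1 form and so stores the full 64-bit FIP and FDP.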