PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS 2 (INTEL® SSE2)

2. Checks CR4.OSXMMEXCPT[bit 10]. If this flag is set, the processor goes to step 3; if the flag is clear, it
generates an invalid-opcode exception (#UD) and makes an implicit call to the invalid-opcode exception
handler.

3. Generates a SIMD floating-point exception (#XM) and makes an implicit call to the SIMD floating-point
exception handler.

4. If the exception handler is able to fix the source operands that generated the pre-computation exceptions or
mask the condition in such a way as to allow the processor to continue executing the instruction, the processor
resumes instruction execution as described in step 5.

5. Upon returning from the exception handler (or if no pre-computation exceptions were detected), the processor
checks for post-computation exceptions. If the processor detects any post-computation exceptions, it ORs
those exceptions, sets the appropriate exception flags, leaves the source and destination operands unaltered,
and repeats steps 2, 3, and 4.

6. Upon returning from the exception handler in step 4 (or if no post-computation exceptions were detected), the
processor completes the execution of the instruction.

The implication of this procedure is that for unmasked exceptions, the processor can generate a SIMD floating-
point exception (#XM) twice: once if it detects pre-computation exception conditions and a second time if it detects 
post-computation exception conditions. For example, if SIMD floating-point exceptions are unmasked for the 
computation shown in Figure 11-9, the processor would generate one SIMD floating-point exception for denormal 
operand conditions and a second SIMD floating-point exception for overflow and underflow (no inexact result 
exception would be generated because the multiplications of X0 and Y0 and of X1 and Y1 are exact).
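
The two-pass behavior can be sketched as a small software model. This is only an illustration, not a description of the processor's implementation: the names xm_trace, xm_phase, and xm_simulate are hypothetical, the pre-computation pass is assumed to mirror the post-computation pass described in step 5, and the handler is assumed to always fix the condition and return. The bit positions are those of the MXCSR register (exception flags in bits 0 through 5, the corresponding masks in bits 7 through 12).

```c
#include <stdbool.h>
#include <stdint.h>

/* MXCSR exception flags occupy bits 0-5; the matching mask bits are 7-12. */
#define XM_DE (1u << 1)      /* denormal operand (pre-computation) */
#define XM_OE (1u << 3)      /* overflow (post-computation) */
#define XM_UE (1u << 4)      /* underflow (post-computation) */
#define XM_FLAGS 0x3Fu
#define XM_MASK_SHIFT 7

/* Hypothetical record of what the model delivered. */
typedef struct {
    int xm_count;      /* number of #XM exceptions generated */
    int ud_count;      /* number of #UD exceptions generated */
    bool completed;    /* true if the instruction finished executing */
    uint32_t mxcsr;    /* modeled MXCSR value after the instruction */
} xm_trace;

/* One detection pass: OR the detected conditions into the flags, then
 * apply steps 2 and 3 if any of them is unmasked. Returns false when the
 * pass ends in #UD (CR4.OSXMMEXCPT clear). */
bool xm_phase(xm_trace *t, uint32_t detected, bool osxmmexcpt)
{
    uint32_t masks = (t->mxcsr >> XM_MASK_SHIFT) & XM_FLAGS;
    uint32_t unmasked = detected & XM_FLAGS & ~masks;

    t->mxcsr |= detected & XM_FLAGS;
    if (unmasked == 0)
        return true;           /* nothing unmasked: no exception generated */
    if (!osxmmexcpt) {
        t->ud_count++;         /* step 2: flag clear, so #UD */
        return false;
    }
    t->xm_count++;             /* step 3: #XM */
    return true;               /* step 4: handler assumed to fix and return */
}

/* Run the pre-computation pass, then the post-computation pass. */
xm_trace xm_simulate(uint32_t mxcsr, uint32_t pre, uint32_t post,
                     bool osxmmexcpt)
{
    xm_trace t = { 0, 0, false, mxcsr };

    if (xm_phase(&t, pre, osxmmexcpt) && xm_phase(&t, post, osxmmexcpt))
        t.completed = true;    /* step 6 */
    return t;
}
```

Under these assumptions, calling xm_simulate with all exceptions unmasked, XM_DE as the pre-computation condition, and XM_OE | XM_UE as the post-computation conditions reports two #XM exceptions, matching the example above.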

11.5.3.3   Handling Combinations of Masked and Unmasked Exceptions

In situations where both masked and unmasked exceptions are detected, the processor will set exception flags for 
the masked and the unmasked exceptions. However, it will not return masked results until after the processor has 
detected and handled unmasked post-computation exceptions and returned from the exception handler (as in step 
6 above) to finish executing the instruction.
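
As an illustration, the following sketch separates a set of detected exception conditions into masked and unmasked groups using the MXCSR mask bits. The type mix_split and the function mix_classify are hypothetical names introduced for this example; the sketch models only the bookkeeping (flags are set for both groups) and not the deferred delivery of masked results.

```c
#include <stdint.h>

#define MIX_FLAGS 0x3Fu      /* MXCSR exception flags, bits 0-5 */
#define MIX_MASK_SHIFT 7     /* MXCSR exception masks start at bit 7 */

/* Hypothetical result of classifying detected conditions. */
typedef struct {
    uint32_t flags_set;   /* flags set for masked and unmasked conditions */
    uint32_t masked;      /* conditions that receive a masked result */
    uint32_t unmasked;    /* conditions that must go to the handler first */
} mix_split;

/* Split the detected conditions according to the MXCSR mask bits. */
mix_split mix_classify(uint32_t mxcsr, uint32_t detected)
{
    uint32_t masks = (mxcsr >> MIX_MASK_SHIFT) & MIX_FLAGS;
    mix_split s;

    detected &= MIX_FLAGS;
    s.flags_set = detected;          /* both kinds set their flags */
    s.masked = detected & masks;
    s.unmasked = detected & ~masks;
    return s;
}
```

For example, with only the underflow mask set (MXCSR bit 11) and both overflow and underflow detected, the sketch reports underflow as masked and overflow as unmasked, with both flags set.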

11.5.4   Handling SIMD Floating-Point Exceptions in Software

Section 4.9.3, “Typical Actions of a Floating-Point Exception Handler,” shows actions that may be carried out by a 
SIMD floating-point exception handler. The SSE/SSE2/SSE3 state is saved with the FXSAVE instruction (see Section 
11.6.5, “Saving and Restoring the SSE/SSE2 State”).
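
A handler that works from an FXSAVE image typically needs the saved MXCSR value to find out which exception was reported. The sketch below assumes the handler already has the 512-byte image in memory and relies on the FXSAVE layout placing MXCSR at byte offset 24; the function names are hypothetical and the code does not execute FXSAVE itself.

```c
#include <stdint.h>
#include <string.h>

#define FXSAVE_AREA_SIZE 512      /* size of the FXSAVE image in bytes */
#define FXSAVE_MXCSR_OFFSET 24    /* MXCSR occupies bytes 24-27 of the image */

/* Read the saved MXCSR value out of an FXSAVE image. memcpy avoids any
 * alignment or aliasing assumptions about the buffer. */
uint32_t fxsave_read_mxcsr(const unsigned char *area)
{
    uint32_t mxcsr;

    memcpy(&mxcsr, area + FXSAVE_MXCSR_OFFSET, sizeof mxcsr);
    return mxcsr;
}

/* Exception flags (bits 0-5) whose mask bits (bits 7-12) are clear:
 * these are the conditions that can have caused the #XM exception. */
uint32_t fxsave_unmasked_flags(uint32_t mxcsr)
{
    uint32_t flags = mxcsr & 0x3Fu;
    uint32_t masks = (mxcsr >> 7) & 0x3Fu;

    return flags & ~masks;
}
```

A handler could pass the result of fxsave_read_mxcsr to fxsave_unmasked_flags and branch on the individual flag bits.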
 

11.5.5   Interaction of SIMD and x87 FPU Floating-Point Exceptions

SIMD floating-point exceptions are generated independently from x87 FPU floating-point exceptions. SIMD 
floating-point exceptions do not cause assertion of the FERR# pin (independent of the value of CR0.NE[bit 5]). 
They ignore the assertion and deassertion of the IGNNE# pin.

If applications use SSE/SSE2/SSE3 instructions along with x87 FPU instructions (in the same task or program),
consider the following:

• SIMD floating-point exceptions are reported independently from the x87 FPU floating-point exceptions. SIMD
and x87 FPU floating-point exceptions can be unmasked independently. Separate x87 FPU and SIMD
floating-point exception handlers must be provided if the same exception is unmasked for x87 FPU and for
SSE/SSE2/SSE3 operations.

• The rounding mode specified in the MXCSR register does not affect x87 FPU instructions. Likewise, the rounding
mode specified in the x87 FPU control word does not affect the SSE/SSE2/SSE3 instructions. To use the same
rounding mode, the rounding control bits in the MXCSR register and in the x87 FPU control word must be set
explicitly to the same value.

• The flush-to-zero mode set in the MXCSR register for SSE/SSE2/SSE3 instructions has no counterpart in the
x87 FPU. For compatibility with the x87 FPU, set the flush-to-zero bit to 0.
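
The last two points can be illustrated with plain bit manipulation on the register values. The sketch below assumes the caller has already read the MXCSR register and the x87 FPU control word into integers and will write the results back itself; the function names are hypothetical. It relies on the rounding control field sitting in bits 13 and 14 of MXCSR and bits 10 and 11 of the x87 FPU control word, with the same two-bit encoding in both, and on the flush-to-zero flag being MXCSR bit 15.

```c
#include <stdint.h>

#define MXCSR_RC_SHIFT 13
#define MXCSR_RC_MASK  (3u << MXCSR_RC_SHIFT)   /* MXCSR bits 13-14 */
#define MXCSR_FTZ      (1u << 15)               /* flush-to-zero */
#define X87_RC_SHIFT   10
#define X87_RC_MASK    (3u << X87_RC_SHIFT)     /* control word bits 10-11 */

/* Return the x87 control word with its rounding control field replaced
 * by the rounding control field taken from the given MXCSR value. */
uint16_t x87_cw_with_mxcsr_rounding(uint16_t cw, uint32_t mxcsr)
{
    uint32_t rc = (mxcsr & MXCSR_RC_MASK) >> MXCSR_RC_SHIFT;

    return (uint16_t)((cw & ~X87_RC_MASK) | (rc << X87_RC_SHIFT));
}

/* Return the MXCSR value with flush-to-zero cleared, since the x87 FPU
 * has no equivalent mode. */
uint32_t mxcsr_without_ftz(uint32_t mxcsr)
{
    return mxcsr & ~MXCSR_FTZ;
}
```

On compilers that provide them, the _mm_getcsr and _mm_setcsr intrinsics from xmmintrin.h are one way to read and write MXCSR; loading the x87 FPU control word is done with the FLDCW instruction.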