

PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS 2 (INTEL® SSE2)

11.4.5 Branch Hints

SSE2 extensions designate two instruction prefixes (2EH and 3EH) to provide branch hints to the processor (see 
“Instruction Prefixes” in Chapter 2 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 
2A). These prefixes can only be used with the Jcc instruction and only at the machine code level (that is, there are 
no mnemonics for the branch hints).
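Because the hints have no mnemonics, a tool that wants to use one has to emit the prefix byte itself. The following is a minimal sketch of that idea in C, not a definitive encoder. The helper name `encode_hinted_jcc_short` and its parameters are hypothetical. The meaning assigned to each prefix (2EH as "branch not taken", 3EH as "branch taken") comes from the instruction-prefix description in Volume 2A rather than from this section, and the sketch covers only the short form of Jcc (opcodes 70H through 7FH with an 8-bit displacement).

```c
#include <stddef.h>
#include <stdint.h>

/* Prefix meanings as described in the Volume 2A prefix list (not in this section). */
#define HINT_NOT_TAKEN 0x2Eu
#define HINT_TAKEN     0x3Eu

/*
 * Hypothetical helper: writes <hint prefix> <short Jcc opcode> <rel8> into out.
 * cond is the 4-bit condition code (for example 0x4 for JE/JZ, 0x5 for JNE/JNZ).
 * Returns the number of bytes written, or 0 if the arguments are not usable.
 */
size_t encode_hinted_jcc_short(uint8_t *out, size_t cap,
                               uint8_t hint, uint8_t cond, int8_t rel8)
{
    if (out == NULL || cap < 3 || cond > 0x0Fu)
        return 0;
    if (hint != HINT_NOT_TAKEN && hint != HINT_TAKEN)
        return 0;

    out[0] = hint;                      /* branch hint prefix */
    out[1] = (uint8_t)(0x70u | cond);   /* short Jcc opcode, 70H..7FH */
    out[2] = (uint8_t)rel8;             /* signed 8-bit displacement */
    return 3;
}
```

For example, calling the helper with `HINT_TAKEN`, condition `0x4`, and displacement `5` fills the buffer with the bytes 3EH 74H 05H, a JE with a "taken" hint. In an assembler source file the same effect is usually obtained by placing the raw prefix byte in front of the Jcc with a data directive.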

11.5 SSE, SSE2, AND SSE3 EXCEPTIONS

SSE/SSE2/SSE3 extensions generate two general types of exceptions:

Non-numeric exceptions

SIMD floating-point exceptions (see footnote 1)

SSE/SSE2/SSE3 instructions can generate the same type of memory-access and non-numeric exceptions as other 
IA-32 architecture instructions. Existing exception handlers can generally handle these exceptions without any 
code modification. See “Providing Non-Numeric Exception Handlers for Exceptions Generated by the SSE, SSE2 
and SSE3 Instructions” in Chapter 13 of the Intel® 64 and IA-32 Architectures Software Developer’s Manual, 
Volume 3A, for a list of the non-numeric exceptions that can be generated by SSE/SSE2/SSE3 instructions and for 
guidelines for handling these exceptions.
SSE/SSE2/SSE3 instructions do not generate numeric exceptions on packed integer operations; however, they can 
generate numeric (SIMD floating-point) exceptions on packed single-precision and double-precision floating-point 
operations. These SIMD floating-point exceptions are defined in the IEEE Standard 754 for Binary Floating-Point 
Arithmetic and are the same exceptions that are generated for x87 FPU instructions. See Section 11.5.1, “SIMD 
Floating-Point Exceptions,” for a description of these exceptions.
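The contrast between packed integer and packed floating-point operations can be observed from user code on an SSE2-capable processor. The sketch below is an illustration under stated assumptions, not part of the manual. It uses the compiler intrinsics `_mm_getcsr` and `_mm_setcsr` to read and write the MXCSR register, assumes that the six exception flags occupy MXCSR bits 0 through 5 and that 1F80H is the "all exceptions masked, no flags set" value, and uses hypothetical function names. The `volatile` temporaries are there only to keep the compiler from folding the arithmetic at build time.

```c
#include <emmintrin.h>  /* SSE2 intrinsics; also provides _mm_getcsr/_mm_setcsr */

#define MXCSR_ALL_MASKED 0x1F80u  /* assumed: all six masks set, all flags clear */
#define MXCSR_FLAG_BITS  0x003Fu  /* assumed: exception flags in bits 0..5 */

/* Packed 32-bit integer add that wraps around; returns the flags it left behind. */
unsigned int flags_after_packed_int_overflow(void)
{
    unsigned int saved = _mm_getcsr();
    _mm_setcsr(MXCSR_ALL_MASKED);

    volatile int big = 0x7FFFFFFF;
    __m128i v = _mm_set1_epi32(big);
    volatile int sink = _mm_cvtsi128_si32(_mm_add_epi32(v, v)); /* wraps silently */
    (void)sink;

    unsigned int flags = _mm_getcsr() & MXCSR_FLAG_BITS;
    _mm_setcsr(saved);                  /* restore the caller's MXCSR */
    return flags;
}

/* Packed double-precision divide by zero; returns the flags it left behind. */
unsigned int flags_after_packed_double_divide_by_zero(void)
{
    unsigned int saved = _mm_getcsr();
    _mm_setcsr(MXCSR_ALL_MASKED);

    volatile double one = 1.0, zero = 0.0;
    __m128d q = _mm_div_pd(_mm_set1_pd(one), _mm_set1_pd(zero));
    volatile double sink = _mm_cvtsd_f64(q);
    (void)sink;

    unsigned int flags = _mm_getcsr() & MXCSR_FLAG_BITS;
    _mm_setcsr(saved);
    return flags;
}
```

Under these assumptions, the integer function is expected to report no flags even though the addition overflows, while the floating-point function is expected to report the divide-by-zero flag. Both functions keep every exception masked, so neither invokes an exception handler.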

11.5.1 SIMD Floating-Point Exceptions

SIMD floating-point exceptions are those exceptions that can be generated by SSE/SSE2/SSE3 instructions that 
operate on packed or scalar floating-point operands.
Six classes of SIMD floating-point exceptions can be generated:

Invalid operation (#I)

Divide-by-zero (#Z)

Denormal operand (#D)

Numeric overflow (#O)

Numeric underflow (#U)

Inexact result (Precision) (#P)

All of these exceptions (except the denormal operand exception) are defined in IEEE Standard 754, and they are 
the same exceptions that are generated with the x87 floating-point instructions. Section 4.9, “Overview of 
Floating-Point Exceptions,” gives a detailed description of these exceptions and of how and when they are generated. 
The following sections discuss the implementation of these exceptions in SSE/SSE2/SSE3 extensions.
All SIMD floating-point exceptions are precise and occur as soon as the instruction completes execution.
Each of the six exception conditions has a corresponding flag (IE, DE, ZE, OE, UE, and PE) and mask bit (IM, DM, 
ZM, OM, UM, and PM) in the MXCSR register (see Figure 10-3). The mask bits can be set with the LDMXCSR or 
FXRSTOR instruction; the mask and flag bits can be read with the STMXCSR or FXSAVE instruction.
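The flag and mask bits described above can be inspected from C through the `_mm_getcsr` and `_mm_setcsr` intrinsics, which compilers typically implement with STMXCSR and LDMXCSR. The following is a hedged sketch rather than a reference implementation. The bit positions are an assumption taken from the MXCSR layout in Figure 10-3 (flags IE through PE in bits 0 to 5, masks IM through PM in bits 7 to 12), and the names `mxcsr_is_masked` and `flags_from_scalar_divide` are hypothetical.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, scalar single-precision intrinsics */

/* Flag bits, assumed to follow the MXCSR layout of Figure 10-3. */
#define MXCSR_IE 0x0001u  /* invalid operation */
#define MXCSR_DE 0x0002u  /* denormal operand  */
#define MXCSR_ZE 0x0004u  /* divide-by-zero    */
#define MXCSR_OE 0x0008u  /* numeric overflow  */
#define MXCSR_UE 0x0010u  /* numeric underflow */
#define MXCSR_PE 0x0020u  /* inexact result    */
#define MXCSR_ALL_FLAGS 0x003Fu
#define MXCSR_ALL_MASKS 0x1F80u  /* IM..PM, bits 7 through 12 */

/* Each mask bit sits 7 positions above its flag bit (IE is bit 0, IM is bit 7). */
int mxcsr_is_masked(unsigned int mxcsr, unsigned int flag)
{
    return (mxcsr & (flag << 7)) != 0;
}

/*
 * Performs one scalar divide with every exception masked and the flags
 * cleared beforehand, then returns the flag bits the divide set.
 */
unsigned int flags_from_scalar_divide(float num, float den)
{
    unsigned int saved = _mm_getcsr();
    _mm_setcsr((saved | MXCSR_ALL_MASKS) & ~MXCSR_ALL_FLAGS);

    volatile float n = num, d = den;  /* volatile: prevent compile-time folding */
    volatile float sink = _mm_cvtss_f32(_mm_div_ss(_mm_set_ss(n), _mm_set_ss(d)));
    (void)sink;

    unsigned int flags = _mm_getcsr() & MXCSR_ALL_FLAGS;
    _mm_setcsr(saved);                /* restore the caller's MXCSR */
    return flags;
}
```

With this sketch, `flags_from_scalar_divide(1.0f, 0.0f)` is expected to return `MXCSR_ZE`, `flags_from_scalar_divide(0.0f, 0.0f)` to return `MXCSR_IE`, and `flags_from_scalar_divide(1.0f, 3.0f)` to return `MXCSR_PE`. The function deliberately keeps all exceptions masked. Clearing a mask bit before a faulting operation would instead deliver the exception to the operating system, which is outside the scope of this example.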
The OSXMMEXCEPT flag (bit 10) of control register CR4 provides additional control over generation of SIMD 
floating-point exceptions by allowing the operating system to indicate whether or not it supports software exception 
handlers for SIMD floating-point exceptions. If an unmasked SIMD floating-point exception is generated and 
the OSXMMEXCEPT flag is set, the processor invokes a software exception handler by generating a SIMD floating-

1. The FISTTP instruction in SSE3 does not generate SIMD floating-point exceptions, but it can generate x87 FPU floating-point exceptions.