Vol. 1 12-9

PROGRAMMING WITH INTEL® SSE3, SSSE3, INTEL® SSE4 AND INTEL® AESNI

12.7

WRITING APPLICATIONS WITH SSSE3 EXTENSIONS

The following sections give guidelines for writing application programs and operating-system code that use SSSE3 
instructions. 

12.7.1 

Guidelines for Using SSSE3 Extensions

The following guidelines describe how to maximize the benefits of using SSSE3 extensions:

Check that the processor supports SSSE3 extensions.

Ensure that your operating system supports SSE/SSE2/SSE3/SSSE3 extensions. (Operating system support 
for the SSE extensions implies sufficient support for SSE2, SSE3, and SSSE3.) 

Employ the optimization and scheduling techniques described in the Intel® 64 and IA-32 Architectures
Optimization Reference Manual (see Section 1.4, “Related Literature”).

12.7.2 

Checking for SSSE3 Support

Before an application attempts to use the SSSE3 extensions, the application should follow the steps illustrated in 
Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:

Check that the processor supports SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1). 
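
The additional step can be sketched in C as follows. This is a minimal illustration, not code taken from this manual: the helper names ecx_has_ssse3 and cpu_has_ssse3 are hypothetical, and the wrapper assumes a GCC- or Clang-compatible compiler that provides <cpuid.h> and __get_cpuid. It performs only the CPUID feature-bit test; the checks from Section 11.6.2 still have to be carried out first.

```c
#include <stdint.h>

/* Assumption: GCC or Clang targeting x86, which ship <cpuid.h>. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
#include <cpuid.h>
#define HAVE_CPUID_H 1
#endif

/* CPUID.01H:ECX.SSSE3 is bit 9 of ECX. */
#define CPUID_01H_ECX_SSSE3 (UINT32_C(1) << 9)

/* Hypothetical helper: decode the SSSE3 flag from a CPUID.01H ECX value. */
int ecx_has_ssse3(uint32_t ecx)
{
    return (ecx & CPUID_01H_ECX_SSSE3) != 0;
}

/* Hypothetical helper: query the running processor.
 * Returns 0 if CPUID leaf 01H is unavailable, or if the build does not
 * target x86 with a compiler that provides <cpuid.h>. */
int cpu_has_ssse3(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 01H not supported */
    return ecx_has_ssse3(ecx);
#else
    return 0;
#endif
}
```

Separating the bit decode from the CPUID query keeps the decode testable on any host; an application would typically call cpu_has_ssse3() once at startup and select an SSSE3 or fallback code path accordingly.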

12.8 

SSE3/SSSE3 AND SSE4 EXCEPTIONS

SSE3, SSSE3, and SSE4 instructions can generate the same type of memory-access and non-numeric exceptions 
as other Intel 64 or IA-32 instructions. Existing exception handlers generally handle these exceptions without code 
modification. 
FISTTP can generate floating-point exceptions. Some SSE3 instructions can also generate SIMD floating-point 
exceptions. 
SSE3 additions and changes are noted in the following sections. See also: Section 11.5, “SSE, SSE2, and SSE3 
Exceptions”.

12.8.1 

Device Not Available (DNA) Exceptions

SSE3, SSSE3, and SSE4 instructions cause a DNA exception (#NM) if the processor attempts to execute them
while CR0.TS[bit 3] = 1. If CPUID.01H:ECX.SSE3[bit 0] = 0, execution of an SSE3 instruction will cause an
invalid opcode fault (#UD) regardless of the state of CR0.TS[bit 3].
Similarly, an attempt to execute an SSSE3 instruction on a processor that reports CPUID.01H:ECX.SSSE3[bit 9] = 
0 will cause an invalid opcode fault regardless of the state of CR0.TS[bit 3]. An attempt to execute an SSE4.1 
instruction on a processor that reports CPUID.01H:ECX.SSE4_1[bit 19] = 0 will cause an invalid opcode fault 
regardless of the state of CR0.TS[bit 3].
An attempt to execute PCMPGTQ or any one of the four string processing instructions in SSE4.2 on a processor that 
reports CPUID.01H:ECX.SSE4_2[bit 20] = 0 will cause an invalid opcode fault regardless of the state of 
CR0.TS[bit 3]. CRC32 and POPCNT do not cause #NM.
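
The rules in this section can be summarized as a small decision function. The sketch below is illustrative only: the enum and function names are hypothetical, and it models just the two conditions described here (the CPUID feature flag and CR0.TS). It deliberately ignores other control-register settings that can also produce #UD or #NM, and it does not apply to CRC32 and POPCNT, which do not cause #NM. Bit 20 of CPUID.01H:ECX is the SSE4.2 feature flag.

```c
#include <stdint.h>

/* Feature flags in CPUID.01H:ECX named in this section. */
#define ECX_SSE3    (UINT32_C(1) << 0)
#define ECX_SSSE3   (UINT32_C(1) << 9)
#define ECX_SSE4_1  (UINT32_C(1) << 19)
#define ECX_SSE4_2  (UINT32_C(1) << 20)

/* CR0.TS is bit 3 of CR0. */
#define CR0_TS      (UINT32_C(1) << 3)

/* Hypothetical result type for this sketch. */
enum fault { FAULT_NONE, FAULT_UD, FAULT_NM };

/* Hypothetical model: which fault, if any, results from executing an
 * instruction that requires the feature flag `required_flag`.
 * Simplification: only the CPUID flag and CR0.TS are considered. */
enum fault classify_fault(uint32_t cpuid_ecx, uint32_t cr0,
                          uint32_t required_flag)
{
    if ((cpuid_ecx & required_flag) == 0)
        return FAULT_UD;    /* feature absent: #UD regardless of CR0.TS */
    if (cr0 & CR0_TS)
        return FAULT_NM;    /* feature present, task switched: #NM */
    return FAULT_NONE;
}
```

For example, classify_fault(ecx, cr0, ECX_SSSE3) models an SSSE3 instruction: the feature flag is tested first, so a missing flag yields #UD even when CR0.TS is set.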

12.8.2 

Numeric Error flag and IGNNE#

Most SSE3 instructions ignore CR0.NE[bit 5] (treating it as if it were always set) and the IGNNE# pin. With one 
exception, they all use the exception 19 (#XM) software exception for error reporting. The exception is FISTTP, 
which behaves like other x87-FP instructions.
SSSE3 instructions ignore CR0.NE[bit 5] (treating it as if it were always set) and the IGNNE# pin.