Vol. 1 12-19

PROGRAMMING WITH INTEL® SSE3, SSSE3, INTEL® SSE4 AND INTEL® AESNI

12.12.2  Checking for SSE4.1 Support

Before an application attempts to use SSE4.1 instructions, the application should follow the steps illustrated in 
Section 11.6.2, “Checking for SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports SSE4.1 (if CPUID.01H:ECX.SSE4_1[bit 19] = 1), SSE3 (if 
CPUID.01H:ECX.SSE3[bit 0] = 1), and SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1). 

12.12.3  Checking for SSE4.2 Support

Before an application attempts to use the following SSE4.2 instructions: PCMPESTRI, PCMPESTRM, PCMPISTRI, 
PCMPISTRM, and PCMPGTQ, the application should follow the steps illustrated in Section 11.6.2, “Checking for 
SSE/SSE2 Support.” Next, use the additional step provided below:
Check that the processor supports SSE4.2 (if CPUID.01H:ECX.SSE4_2[bit 20] = 1), SSE4.1 (if 
CPUID.01H:ECX.SSE4_1[bit 19] = 1), and SSSE3 (if CPUID.01H:ECX.SSSE3[bit 9] = 1). 
Before an application attempts to use the CRC32 instruction, it must check that the processor supports SSE4.2 (if 
CPUID.01H:ECX.SSE4_2[bit 20] = 1).
Before an application attempts to use the POPCNT instruction, it must check that the processor supports SSE4.2 (if 
CPUID.01H:ECX.SSE4_2[bit 20] = 1) and POPCNT (if CPUID.01H:ECX.POPCNT[bit 23] = 1).
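The three cases above differ only in which CPUID.01H:ECX bits must be set, which can be sketched as three small predicates on an ECX value. The function names below are hypothetical, the ECX value is assumed to come from executing CPUID with EAX = 01H, and the steps of Section 11.6.2 are assumed to have been completed beforehand.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bits in CPUID.01H:ECX, as listed in the text above. */
#define CPUID_ECX_SSSE3   (UINT32_C(1) << 9)
#define CPUID_ECX_SSE4_1  (UINT32_C(1) << 19)
#define CPUID_ECX_SSE4_2  (UINT32_C(1) << 20)
#define CPUID_ECX_POPCNT  (UINT32_C(1) << 23)

/* True if every bit in `required` is set in `ecx`. */
static inline bool ecx_has_all(uint32_t ecx, uint32_t required)
{
    return (ecx & required) == required;
}

/* PCMPESTRI/PCMPESTRM/PCMPISTRI/PCMPISTRM, PCMPGTQ:
 * requires SSE4.2, SSE4.1, and SSSE3. */
static inline bool ecx_allows_sse42_compare(uint32_t ecx)
{
    return ecx_has_all(ecx, CPUID_ECX_SSE4_2 | CPUID_ECX_SSE4_1 |
                            CPUID_ECX_SSSE3);
}

/* CRC32: requires SSE4.2. */
static inline bool ecx_allows_crc32(uint32_t ecx)
{
    return ecx_has_all(ecx, CPUID_ECX_SSE4_2);
}

/* POPCNT: requires SSE4.2 and the separate POPCNT bit. */
static inline bool ecx_allows_popcnt(uint32_t ecx)
{
    return ecx_has_all(ecx, CPUID_ECX_SSE4_2 | CPUID_ECX_POPCNT);
}
```

For example, an ECX value with only bit 20 set would satisfy the CRC32 predicate but neither of the other two.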

12.13 AESNI OVERVIEW

The AESNI extension provides six instructions to accelerate symmetric block encryption/decryption of 128-bit data 
blocks using the Advanced Encryption Standard (AES) specified by the NIST publication FIPS 197. Specifically, two 
instructions (AESENC, AESENCLAST) target the AES encryption rounds, and two instructions (AESDEC, AESDECLAST) 
target the AES decryption rounds using the Equivalent Inverse Cipher. One instruction (AESIMC) targets the Inverse 
MixColumn transformation primitive, and one instruction (AESKEYGENASSIST) targets generation of round keys from 
the cipher key for the AES encryption/decryption rounds.
AES supports encryption/decryption using cipher key lengths of 128, 192, and 256 bits by processing the data 
block in 10, 12, or 14 rounds of predefined transformations, respectively. Figure 12-5 depicts the cryptographic 
processing of a block of 128-bit plain text into cipher text. 
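The relationship between cipher key length and round count can be written down as a small lookup. This is only an illustrative sketch: the function names are hypothetical, and the round-key count of one more than the number of rounds (RK(0) for the initial XOR plus one key per round) is taken from FIPS 197 rather than from the text above.

```c
/* Hypothetical helper: number of AES rounds for a given cipher key
 * length in bits, or 0 if the length is not one AES defines. */
static inline unsigned aes_rounds_for_key_bits(unsigned key_bits)
{
    switch (key_bits) {
    case 128: return 10;
    case 192: return 12;
    case 256: return 14;
    default:  return 0;
    }
}

/* Hypothetical helper: number of 128-bit round keys the key schedule
 * must produce (per FIPS 197: one per round plus the initial key),
 * or 0 for an unsupported key length. */
static inline unsigned aes_round_key_count(unsigned key_bits)
{
    unsigned rounds = aes_rounds_for_key_bits(key_bits);
    return rounds ? rounds + 1 : 0;
}
```

A caller would typically size its round-key array from `aes_round_key_count` before running the key expansion.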

The predefined AES transformation primitives are described in the next few sections; they are also referenced in 
the operation flow on the instruction reference pages of these instructions.

12.13.1  Little-Endian Architecture and Big-Endian Specification (FIPS 197)

The FIPS 197 document defines the Advanced Encryption Standard (AES) and includes a set of test vectors that 
cover all of the steps in the algorithm and can be used for testing and debugging. 

Figure 12-5.  AES State Flow
[Figure: Plain text is XORed with RK(0) to form the AES State, which then passes through Round 1 (RK(1)), Rounds 2..n-2, Round n-1 (RK(n-1)), and the Last round to produce the Cipher text. AES-128: n = 10; AES-192: n = 12; AES-256: n = 14.]