INSTRUCTION SET SUMMARY

5.15 FUSED-MULTIPLY-ADD (FMA)

FMA extensions enhance Intel AVX with high-throughput arithmetic capabilities covering fused multiply-add, 
fused multiply-subtract, fused multiply add/subtract interleave, and sign-reversed multiply on fused multiply-add 
and multiply-subtract. FMA extensions provide 36 256-bit floating-point instructions to perform computation on 
256-bit vectors, plus additional 128-bit and scalar FMA instructions.

Table 14-15 lists FMA instruction sets.
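
The difference between a fused operation and a separate multiply followed by an add is that the fused form rounds only once. The Python below is a scalar sketch of that behavior, not a description of the hardware: the helper names (fmadd, fmsub, fnmadd, fnmsub, fmaddsub) are hypothetical, exact rational arithmetic stands in for the unrounded intermediate product, and only finite double-precision inputs are assumed.

```python
from fractions import Fraction


def fmadd(a, b, c):
    """Model of fused multiply-add: a*b + c with a single final rounding."""
    # Fraction(float) is exact for finite floats, so the product and sum
    # are computed without intermediate rounding; float() rounds once.
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def fmsub(a, b, c):
    """Model of fused multiply-subtract: a*b - c, rounded once."""
    return float(Fraction(a) * Fraction(b) - Fraction(c))


def fnmadd(a, b, c):
    """Model of sign-reversed multiply-add: -(a*b) + c, rounded once."""
    return float(-(Fraction(a) * Fraction(b)) + Fraction(c))


def fnmsub(a, b, c):
    """Model of sign-reversed multiply-subtract: -(a*b) - c, rounded once."""
    return float(-(Fraction(a) * Fraction(b)) - Fraction(c))


def fmaddsub(xs, ys, zs):
    """Model of the add/subtract interleave over a vector:
    even-indexed elements subtract, odd-indexed elements add."""
    return [
        fmadd(x, y, z) if i % 2 else fmsub(x, y, z)
        for i, (x, y, z) in enumerate(zip(xs, ys, zs))
    ]


# Inputs chosen so that the exact product 1 - 2**-54 is not representable
# as a double: the separately rounded product becomes exactly 1.0.
a = 1.0 + 2.0 ** -27
b = 1.0 - 2.0 ** -27
unfused = a * b - 1.0        # product rounded first, then the subtraction
fused = fmadd(a, b, -1.0)    # single rounding keeps the low-order term
```

With these inputs the unfused expression loses the low-order term of the product while the fused model preserves it, which is why fused forms are preferred in accumulation-heavy kernels such as dot products and polynomial evaluation.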

5.16 INTEL® ADVANCED VECTOR EXTENSIONS 2 (INTEL® AVX2)

Intel® AVX2 extends Intel AVX by promoting most of the 128-bit SIMD integer instructions with 256-bit numeric 
processing capabilities. Intel AVX2 instructions follow the same programming model as AVX instructions. 
In addition, AVX2 provides enhanced functionality for broadcast/permute operations on data elements, vector 
shift instructions with a variable shift count per data element, and instructions to fetch non-contiguous data 
elements from memory.

Table 14-18 lists promoted vector integer instructions in AVX2.

Table 14-19 lists new instructions in AVX2 that complement AVX.
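
The three capability groups named above (broadcast/permute, per-element variable shifts, and gathers of non-contiguous elements) can be illustrated with a lane-by-lane model. The Python below is a sketch under assumptions, not the instruction definitions: vectors are plain lists of 32-bit unsigned lanes, the function names are hypothetical, and the gather mask is modelled as one boolean per lane.

```python
MASK32 = 0xFFFFFFFF
LANES = 8  # eight 32-bit lanes in a 256-bit vector


def broadcast(value):
    """Replicate one element into every lane."""
    return [value & MASK32] * LANES


def permute(values, indices):
    """Select lanes of `values` by index; only the low 3 bits of each
    index are used, as there are eight lanes to choose from."""
    return [values[i & (LANES - 1)] for i in indices]


def shift_left_variable(values, counts):
    """Logical left shift where every lane has its own shift count.
    A count of 32 or more clears the lane."""
    return [
        (v << c) & MASK32 if c < 32 else 0
        for v, c in zip(values, counts)
    ]


def shift_right_variable(values, counts):
    """Logical right shift with a per-lane count; 32 or more gives zero."""
    return [(v >> c) if c < 32 else 0 for v, c in zip(values, counts)]


def gather(memory, indices, mask, previous):
    """Fetch memory[index] for each lane whose mask entry is set.
    Lanes with a clear mask entry keep their value from `previous`."""
    return [
        memory[i] if m else old
        for i, m, old in zip(indices, mask, previous)
    ]
```

A table lookup is a typical use of the gather model: `gather(table, indices, [True] * 8, [0] * 8)` fetches eight scattered table entries in one step, where a scalar loop would issue eight separate loads.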

5.17 INTEL® TRANSACTIONAL SYNCHRONIZATION EXTENSIONS (INTEL® TSX)

XABORT      Abort an RTM transaction execution.
XACQUIRE    Prefix hint to the beginning of an HLE transaction region.
XRELEASE    Prefix hint to the end of an HLE transaction region.
XBEGIN      Transaction begin of an RTM transaction region.
XEND        Transaction end of an RTM transaction region.
XTEST       Test if executing in a transactional region.
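
The control flow of an RTM region (begin, speculative work, then either commit or abort to a fallback path) can be sketched as a toy software model. The Python below is an illustration under stated assumptions and says nothing about how the hardware tracks state: `run_transaction`, `ExplicitAbort`, and the dictionary used as memory are hypothetical, and only aborts requested by the program itself are modelled. The status layout used here (bit 0 set for an explicit abort, the 8-bit abort code in bits 31:24) follows the abort status reported in EAX.

```python
XABORT_FLAG = 1 << 0  # status bit 0: the abort came from an explicit XABORT


class ExplicitAbort(Exception):
    """Hypothetical stand-in for executing XABORT imm8 inside a region."""

    def __init__(self, imm8):
        super().__init__(imm8)
        self.imm8 = imm8 & 0xFF


def run_transaction(memory, body, fallback):
    """Run `body` speculatively; commit on success, else call `fallback`.

    Returns None when the region commits, or the abort status otherwise.
    """
    shadow = dict(memory)      # XBEGIN: updates go to a private copy
    try:
        body(shadow)           # the transactional region itself
    except ExplicitAbort as abort:
        # XABORT: discard the private copy and report why to the fallback
        status = XABORT_FLAG | (abort.imm8 << 24)
        fallback(memory, status)
        return status
    memory.clear()
    memory.update(shadow)      # XEND: make all updates visible at once
    return None
```

The point of the model is the all-or-nothing property: an aborted region leaves `memory` exactly as it was, so the fallback path (typically a conventional lock) always starts from a consistent state. In C, the corresponding operations are usually reached through the `_xbegin`, `_xend`, `_xabort`, and `_xtest` compiler intrinsics.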

5.18 INTEL® SHA EXTENSIONS

Intel® SHA extensions provide a set of instructions that target the acceleration of the Secure Hash Algorithm 
(SHA), specifically the SHA-1 and SHA-256 variants. 

SHA1MSG1       Perform an intermediate calculation for the next four SHA1 message dwords from the 
               previous message dwords.
SHA1MSG2       Perform the final calculation for the next four SHA1 message dwords from the intermediate 
               message dwords.
SHA1NEXTE      Calculate SHA1 state E after four rounds.
SHA1RNDS4      Perform four rounds of SHA1 operations.
SHA256MSG1     Perform an intermediate calculation for the next four SHA256 message dwords.
SHA256MSG2     Perform the final calculation for the next four SHA256 message dwords.
SHA256RNDS2    Perform two rounds of SHA256 operations.
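
The SHA-1 instructions above work in groups of four: four new message dwords are derived from the previous sixteen, and four rounds are applied at a time. The Python below sketches that grouping for a single 64-byte block, following the standard SHA-1 definition. It is a model of the arithmetic only: the function names are hypothetical, the packing of operands into XMM registers is not modelled, and the split into an "intermediate" and a "final" step only loosely mirrors the SHA1MSG1/SHA1MSG2 descriptions.

```python
import hashlib
import struct

MASK32 = 0xFFFFFFFF
INITIAL_STATE = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)


def rol32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) | (x >> (32 - n))) & MASK32


def next_four_words(w):
    """Given the previous 16 message dwords, return the next four."""
    # Intermediate step: XOR the terms W[t-16], W[t-14], W[t-8].
    partial = [w[i] ^ w[i + 2] ^ w[i + 8] for i in range(4)]
    # Final step: mix in W[t-3] and rotate left by one. The fourth new
    # dword depends on the first new dword, which is not in `w` yet.
    out = []
    for i in range(4):
        w_t_minus_3 = w[i + 13] if i < 3 else out[0]
        out.append(rol32(partial[i] ^ w_t_minus_3, 1))
    return out


def four_rounds(state, words, first_round):
    """Apply four consecutive SHA-1 rounds starting at `first_round`."""
    a, b, c, d, e = state
    for i, w in enumerate(words):
        t = first_round + i
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (rol32(a, 5) + (f & MASK32) + e + k + w) & MASK32
        a, b, c, d, e = temp, a, rol32(b, 30), c, d
    return a, b, c, d, e


def sha1_one_block(message):
    """SHA-1 of a message short enough to fit in one padded block."""
    if len(message) >= 56:
        raise ValueError("this sketch handles a single block only")
    block = (message + b"\x80" + b"\x00" * (55 - len(message))
             + struct.pack(">Q", 8 * len(message)))
    window = list(struct.unpack(">16I", block))
    state = INITIAL_STATE
    for first_round in range(0, 80, 4):
        if first_round < 16:
            words = window[first_round:first_round + 4]
        else:
            words = next_four_words(window)
            window = window[4:] + words
        state = four_rounds(state, words, first_round)
    return struct.pack(
        ">5I", *((h + s) & MASK32 for h, s in zip(INITIAL_STATE, state))
    )
```

A quick sanity check is to compare `sha1_one_block(b"abc")` with `hashlib.sha1(b"abc").digest()`. The SHA-256 instructions follow the same pattern with a different schedule and two rounds per step.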

5.19 INTEL® ADVANCED VECTOR EXTENSIONS 512 (INTEL® AVX-512)

The Intel® AVX-512 family comprises a collection of 512-bit SIMD instruction sets to accelerate a diverse range of 
applications. Intel AVX-512 instructions provide a wide range of functionality that supports programming in 512-bit, 
256-bit, and 128-bit vector registers, plus support for opmask registers and instructions operating on opmask registers. 
The collection of 512-bit SIMD instruction sets in Intel AVX-512 includes new functionality not available in Intel AVX 
and Intel AVX2, and promoted instructions similar to equivalent ones in Intel AVX / Intel AVX2 but with enhance-