Vol. 1 15-1

CHAPTER 15

PROGRAMMING WITH INTEL® AVX-512

15.1 OVERVIEW

The Intel AVX-512 family comprises a collection of instruction set extensions, including AVX-512 Foundation,
AVX-512 Exponential and Reciprocal instructions, AVX-512 Conflict, AVX-512 Prefetch, and additional 512-bit
SIMD instruction extensions. Intel AVX-512 instructions are natural extensions to Intel AVX and Intel AVX2. Intel
AVX-512 introduces the following architectural enhancements:

Support for 512-bit wide vectors and SIMD register set. 512-bit register state is managed by the operating 
system using XSAVE/XRSTOR instructions introduced in 45 nm Intel 64 processors (see Intel® 64 and IA-32 
Architectures Software Developer’s Manual, Volume 2B, and Intel® 64 and IA-32 Architectures Software
Developer’s Manual, Volume 3A).

Support for 16 new, 512-bit SIMD registers (for a total of 32 SIMD registers, ZMM0 through ZMM31) in 64-bit 
mode. The state of the 16 additional registers is managed by the operating system using the
XSAVE/XRSTOR/XSAVEOPT instructions.

Support for 8 new opmask registers (k0 through k7) used for conditional execution and efficient merging of 
destination operands. The opmask register state is managed by the operating system using the 
XSAVE/XRSTOR/XSAVEOPT instructions.

A new encoding prefix (referred to as EVEX) to support additional vector length encoding up to 512 bits. The 
EVEX prefix builds upon the foundations of the VEX prefix to provide compact, efficient encoding for function-
ality available to VEX encoding plus the following enhanced vector capabilities: 

Opmasks.

Embedded broadcast.

Instruction prefix-embedded rounding control.

Compressed address displacements.
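Because the new opmask and ZMM state must be enabled by the operating system before it can be saved and restored with XSAVE/XRSTOR, software typically confirms that the corresponding state-component bits are set in XCR0. The following is a minimal sketch of that check in C, not a complete detection sequence. The helper name `os_enables_avx512_state` is hypothetical, and the sketch assumes the caller has already obtained the XCR0 value (for example, with XGETBV) and has separately checked the CPUID feature flags.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits relevant to AVX-512. */
#define XCR0_SSE       (1ull << 1)  /* XMM register state */
#define XCR0_AVX       (1ull << 2)  /* upper 128 bits of YMM0-YMM15 */
#define XCR0_OPMASK    (1ull << 5)  /* opmask registers k0-k7 */
#define XCR0_ZMM_HI256 (1ull << 6)  /* upper 256 bits of ZMM0-ZMM15 */
#define XCR0_HI16_ZMM  (1ull << 7)  /* ZMM16-ZMM31 */

/*
 * Hypothetical helper (not an Intel-defined API): returns true when the
 * supplied XCR0 value shows that the OS manages every state component
 * that AVX-512 instructions rely on.
 */
static bool os_enables_avx512_state(uint64_t xcr0)
{
    const uint64_t needed = XCR0_SSE | XCR0_AVX | XCR0_OPMASK |
                            XCR0_ZMM_HI256 | XCR0_HI16_ZMM;
    return (xcr0 & needed) == needed;
}
```

Passing an XCR0 value with only the x87, SSE, and AVX bits set (0x07) makes the helper return false, because the opmask and ZMM components are not enabled.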

15.1.1 512-Bit Wide SIMD Register Support

Intel AVX-512 instructions support 512-bit wide SIMD registers (ZMM0-ZMM31). The lower 256 bits of each ZMM
register are aliased to the respective 256-bit YMM register, and the lower 128 bits are aliased to the respective
128-bit XMM register.
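The aliasing relationship can be pictured with a small software model. The sketch below is illustrative only: the type `vec_reg_model` and both helper functions are hypothetical names, and a C union merely mimics how the three views share the same low-order bytes. It says nothing about what a narrower write does to the upper bits of a real register.

```c
#include <stdint.h>
#include <string.h>

/* Software model only: one 512-bit register seen at three widths. */
typedef union {
    uint8_t zmm[64];  /* full 512 bits */
    uint8_t ymm[32];  /* low 256 bits of the same storage */
    uint8_t xmm[16];  /* low 128 bits of the same storage */
} vec_reg_model;

/* Store each byte's own index so the overlap is easy to inspect. */
static void fill_zmm(vec_reg_model *r)
{
    for (int i = 0; i < 64; i++)
        r->zmm[i] = (uint8_t)i;
}

/* Returns 1 when the YMM and XMM views match the low bytes of the ZMM view. */
static int low_views_alias(const vec_reg_model *r)
{
    return memcmp(r->ymm, r->zmm, sizeof r->ymm) == 0 &&
           memcmp(r->xmm, r->zmm, sizeof r->xmm) == 0;
}
```

After `fill_zmm`, byte 31 of the YMM view and byte 15 of the XMM view hold the values written through the ZMM view, and a store through the XMM view is visible in the low bytes of the ZMM view.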

15.1.2 32 SIMD Register Support

Intel AVX-512 instructions also support 32 SIMD registers in 64-bit mode (XMM0-XMM31, YMM0-YMM31 and 
ZMM0-ZMM31). The number of available vector registers in 32-bit mode is still 8.

15.1.3 Eight Opmask Register Support

Intel AVX-512 instructions support 8 opmask registers (k0-k7). The width of each opmask register is architectur-
ally defined as size MAX_KL (64 bits). Seven of the eight opmask registers (k1-k7) can be used in conjunction with 
EVEX-encoded AVX-512 Foundation instructions to provide conditional execution and efficient merging of data 
elements in the destination operand. The encoding of opmask register k0 is typically used to select unconditional
processing of all data elements. The opmask registers also serve as vector flags and element-level vector sources,
enabling new SIMD functionality in instructions such as VCOMPRESSPS.
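The two roles of an opmask described above, conditional execution with merging and element selection, can be sketched with a scalar model in C. This is an illustration of the semantics under stated assumptions, not the instructions themselves: the function names `masked_add_merge` and `compress_merge` are hypothetical, the vector is assumed to hold 16 single-precision elements, and only merging behavior with a register destination is modeled.

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 16  /* 16 single-precision elements in a 512-bit vector */

/*
 * Scalar model of a merge-masked add: an element is written only when
 * its mask bit is 1; all other destination elements keep their old value.
 */
static void masked_add_merge(float dst[LANES], const float a[LANES],
                             const float b[LANES], uint16_t k)
{
    for (size_t i = 0; i < LANES; i++) {
        if ((k >> i) & 1u)
            dst[i] = a[i] + b[i];
    }
}

/*
 * Scalar model of a VCOMPRESSPS-style operation with merging: source
 * elements whose mask bit is 1 are packed contiguously into the low
 * elements of dst; the remaining dst elements are left unchanged.
 * Returns the number of elements packed.
 */
static size_t compress_merge(float dst[LANES], const float src[LANES],
                             uint16_t k)
{
    size_t n = 0;
    for (size_t i = 0; i < LANES; i++) {
        if ((k >> i) & 1u)
            dst[n++] = src[i];
    }
    return n;
}
```

With a mask of 0x0005, `masked_add_merge` updates only elements 0 and 2. With a mask of 0x8005, `compress_merge` packs source elements 0, 2, and 15 into destination elements 0 through 2 and returns 3.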