Vol. 1 15-5

PROGRAMMING WITH INTEL® AVX-512

PREFETCHWT1 does not require OS support for XMM/YMM/ZMM/k-reg state or for SIMD FP exceptions.
Procedural Flow of Application Detection of other 512-bit extensions:
Prior to using the Intel AVX-512 Exponential and Reciprocal instructions, the application must identify that the 
operating system supports the XGETBV instruction and the ZMM register state, in addition to confirming the 
processor’s support for ZMM state management using XSAVE/XRSTOR and AVX-512 Foundation instructions. The 
following simplified sequence accomplishes both and is strongly recommended.
1. Detect CPUID.1:ECX.OSXSAVE[bit 27] = 1 (XGETBV enabled for application use).
2. Execute XGETBV and verify that XCR0[7:5] = ‘111b’ (OPMASK state, upper 256-bit of ZMM0-ZMM15 and 
ZMM16-ZMM31 state are enabled by OS) and that XCR0[2:1] = ‘11b’ (XMM state and YMM state are enabled 
by OS).
3. Verify both CPUID.0x7.0:EBX.AVX512F[bit 16] = 1, and CPUID.0x7.0:EBX.AVX512ER[bit 27] = 1.
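The three steps above can be sketched in C as follows. This is a minimal illustration, not a reference implementation: the function names avx512er_usable_from and avx512er_usable and the macro names are hypothetical, and the hardware-reading wrapper assumes a GCC- or Clang-compatible compiler on x86 that provides <cpuid.h> and accepts inline assembly for XGETBV. The bit positions are the ones given in the steps.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions taken from the three steps above. */
#define CPUID1_ECX_OSXSAVE   (1u << 27)
#define XCR0_SSE_AVX         0x06u  /* XCR0[2:1]: XMM and YMM state */
#define XCR0_AVX512          0xE0u  /* XCR0[7:5]: opmask, ZMM0-15 upper, ZMM16-31 */
#define CPUID7_EBX_AVX512F   (1u << 16)
#define CPUID7_EBX_AVX512ER  (1u << 27)

/* Hypothetical helper: applies the checks to register values that were
   already read, so the logic can be exercised without the hardware. */
int avx512er_usable_from(uint32_t cpuid1_ecx, uint64_t xcr0, uint32_t cpuid7_ebx)
{
    const uint64_t need_state = XCR0_SSE_AVX | XCR0_AVX512;
    const uint32_t need_feat  = CPUID7_EBX_AVX512F | CPUID7_EBX_AVX512ER;

    if (!(cpuid1_ecx & CPUID1_ECX_OSXSAVE)) return 0;  /* step 1 */
    if ((xcr0 & need_state) != need_state)  return 0;  /* step 2 */
    return (cpuid7_ebx & need_feat) == need_feat;      /* step 3 */
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Hypothetical wrapper: reads the live registers. XGETBV is executed only
   after OSXSAVE has been confirmed, matching the order of the steps. */
int avx512er_usable(void)
{
    unsigned int a, b, c, d, a7, b7 = 0, c7, d7;
    uint32_t lo, hi;

    if (!__get_cpuid(1, &a, &b, &c, &d) || !(c & CPUID1_ECX_OSXSAVE))
        return 0;
    __asm__ volatile("xgetbv" : "=a"(lo), "=d"(hi) : "c"(0)); /* read XCR0 */
    if (!__get_cpuid_count(7, 0, &a7, &b7, &c7, &d7))
        return 0;
    return avx512er_usable_from(c, ((uint64_t)hi << 32) | lo, b7);
}
#endif
```

Separating the bit tests from the register reads keeps the decision logic checkable on any machine. For example, assert(avx512er_usable_from(1u << 27, 0xE6, (1u << 16) | (1u << 27))) holds, and clearing any one of the required bits makes the function return 0.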
Prior to using the Intel AVX-512 Prefetch instructions, the application must identify that the operating system 
supports the XGETBV instruction and the ZMM register state, in addition to confirming the processor’s support for 
ZMM state management using XSAVE/XRSTOR and AVX-512 Foundation instructions. The following simplified 
sequence accomplishes both and is strongly recommended.
1. Detect CPUID.1:ECX.OSXSAVE[bit 27] = 1 (XGETBV enabled for application use).
2. Execute XGETBV and verify that XCR0[7:5] = ‘111b’ (OPMASK state, upper 256-bit of ZMM0-ZMM15 and 
ZMM16-ZMM31 state are enabled by OS) and that XCR0[2:1] = ‘11b’ (XMM state and YMM state are enabled 
by OS).
3. Verify both CPUID.0x7.0:EBX.AVX512F[bit 16] = 1, and CPUID.0x7.0:EBX.AVX512PF[bit 26] = 1.
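When the prefetch sequence fails, it can help to know which step rejected the configuration (for example, to tell missing OS support apart from a missing processor feature). The sketch below is one possible way to report that. The function name avx512pf_first_failed_step is hypothetical, and the inputs are assumed to be raw values of CPUID.1:ECX, XCR0, and CPUID.0x7.0:EBX obtained by the caller.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 0 when all three steps pass, otherwise the number (1 to 3) of the
   first step that fails. Inputs are raw register values read elsewhere. */
int avx512pf_first_failed_step(uint32_t cpuid1_ecx, uint64_t xcr0,
                               uint32_t cpuid7_ebx)
{
    const uint64_t state_mask = 0xE6;  /* XCR0[7:5] = 111b and XCR0[2:1] = 11b */
    const uint32_t feat_mask  = (1u << 16) | (1u << 26);  /* AVX512F, AVX512PF */

    if (((cpuid1_ecx >> 27) & 1u) == 0)        return 1;  /* OSXSAVE clear */
    if ((xcr0 & state_mask) != state_mask)     return 2;  /* state not enabled */
    if ((cpuid7_ebx & feat_mask) != feat_mask) return 3;  /* feature flag clear */
    return 0;
}
```

A return value of 2 indicates that the operating system has not enabled the required register state, while 3 indicates that the processor does not report AVX512F or AVX512PF.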

15.3 DETECTION OF 512-BIT INSTRUCTION GROUPS OF INTEL® AVX-512 FAMILY

In addition to the Intel AVX-512 Foundation instructions, the Intel AVX-512 family provides several groups of 
instruction extensions that can operate in vector lengths of 512/256/128 bits. Each group is enumerated by a CPUID 
leaf 7 feature flag and can be encoded via the EVEX.L’L field to support operation at vector lengths smaller than 512 
bits. These instruction groups are listed in Table 15-1.
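Because each group has its own CPUID leaf 7 feature flag, the flags can be tested with a small lookup table. The sketch below is illustrative only. The struct and function names are hypothetical, the list is partial, and the bit positions for AVX512DQ, AVX512CD, and AVX512BW come from the CPUID leaf 7 (subleaf 0) EBX definition rather than from this section. The function examines CPUID flags only; the XCR0 check for OS-enabled state is still required before any of these instructions are used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table entry: group name and its bit in CPUID.(EAX=07H,ECX=0):EBX. */
struct avx512_group { const char *name; unsigned ebx_bit; };

/* Partial list, for illustration. */
static const struct avx512_group groups[] = {
    { "AVX512DQ", 17 },
    { "AVX512CD", 28 },
    { "AVX512BW", 30 },
};

/* Stores the names of the groups reported in cpuid7_ebx into out[] (at most
   cap entries) and returns how many were stored. Reports nothing unless
   AVX512F (bit 16) is also set. */
size_t avx512_groups_present(uint32_t cpuid7_ebx, const char *out[], size_t cap)
{
    size_t n = 0;

    if (!(cpuid7_ebx & (1u << 16)))
        return 0;
    for (size_t i = 0; i < sizeof groups / sizeof groups[0] && n < cap; i++)
        if (cpuid7_ebx & (1u << groups[i].ebx_bit))
            out[n++] = groups[i].name;
    return n;
}
```

Passing an EBX value with bits 16, 17, and 30 set yields two entries, "AVX512DQ" and "AVX512BW", in table order.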

Figure 15-3.  Procedural Flow for Application Detection of 512-bit Instructions

[Flowchart: Check feature flag CPUID.1H:ECX.OSXSAVE = 1? -> Yes (OS provides processor extended state management; implied HW support for XSAVE, XRSTOR, XGETBV, XCR0) -> Check enabled state in XCR0 via XGETBV -> Opmask, YMM, ZMM states enabled -> Check AVX512F and additional 512-bit flags -> Instructions ok to use]