PROGRAMMING WITH INTEL® AVX-512
PREFETCHWT1 does not require OS support for XMM/YMM/ZMM/opmask register state or for SIMD floating-point exceptions.
Procedural Flow of Application Detection of other 512-bit extensions:
Prior to using the Intel AVX-512 Exponential and Reciprocal instructions, the application must identify that the
operating system supports the XGETBV instruction and the ZMM register state, in addition to confirming the
processor’s support for ZMM state management using XSAVE/XRSTOR and AVX-512 Foundation instructions. The
following simplified sequence accomplishes both and is strongly recommended.
1. Detect CPUID.1:ECX.OSXSAVE[bit 27] = 1 (XGETBV enabled for application use).
2. Execute XGETBV and verify that XCR0[7:5] = ‘111b’ (opmask state, the upper 256 bits of ZMM0-ZMM15, and
ZMM16-ZMM31 state are enabled by the OS) and that XCR0[2:1] = ‘11b’ (XMM state and YMM state are enabled
by the OS).
3. Verify both CPUID.0x7.0:EBX.AVX512F[bit 16] = 1, and CPUID.0x7.0:EBX.AVX512ER[bit 27] = 1.
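The three steps above can be sketched as a pure function over the register values involved. This is a minimal illustration, not a definitive implementation: the function name avx512er_usable and the macro names are hypothetical, and the caller is assumed to have already read CPUID.1:ECX, XCR0 (through XGETBV), and CPUID.(EAX=7,ECX=0):EBX, for example through a compiler's CPUID and XGETBV intrinsics.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the three detection steps (names are illustrative). */
#define CPUID1_ECX_OSXSAVE   (UINT32_C(1) << 27)  /* CPUID.1:ECX.OSXSAVE        */
#define CPUID7_EBX_AVX512F   (UINT32_C(1) << 16)  /* CPUID.0x7.0:EBX.AVX512F    */
#define CPUID7_EBX_AVX512ER  (UINT32_C(1) << 27)  /* CPUID.0x7.0:EBX.AVX512ER   */
#define XCR0_SSE_AVX         UINT64_C(0x06)       /* XCR0[2:1]: XMM and YMM     */
#define XCR0_AVX512_STATE    UINT64_C(0xE0)       /* XCR0[7:5]: opmask and ZMM  */

/* Sanity check on the combined state mask used in step 2. */
static_assert((XCR0_SSE_AVX | XCR0_AVX512_STATE) == UINT64_C(0xE6),
              "XCR0 bits 1, 2, 5, 6 and 7 must all be covered");

/* Hypothetical helper: returns true only if all three steps succeed.
 * The caller supplies register values it has already read. */
bool avx512er_usable(uint32_t cpuid1_ecx, uint64_t xcr0, uint32_t cpuid7_ebx)
{
    const uint64_t need_state = XCR0_SSE_AVX | XCR0_AVX512_STATE;
    const uint32_t need_flags = CPUID7_EBX_AVX512F | CPUID7_EBX_AVX512ER;

    /* Step 1: XGETBV must be enabled for application use. */
    if ((cpuid1_ecx & CPUID1_ECX_OSXSAVE) == 0)
        return false;

    /* Step 2: the OS must have enabled XMM, YMM, opmask and ZMM state. */
    if ((xcr0 & need_state) != need_state)
        return false;

    /* Step 3: the processor must report both AVX512F and AVX512ER. */
    return (cpuid7_ebx & need_flags) == need_flags;
}
```

Note that XGETBV itself must only be executed after step 1 has succeeded; when OSXSAVE is clear, the sketch never inspects the xcr0 argument, so the caller can pass 0 for it.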
Prior to using the Intel AVX-512 Prefetch instructions, the application must identify that the operating system
supports the XGETBV instruction and the ZMM register state, in addition to confirming the processor’s support for
ZMM state management using XSAVE/XRSTOR and AVX-512 Foundation instructions. The following simplified
sequence accomplishes both and is strongly recommended.
1. Detect CPUID.1:ECX.OSXSAVE[bit 27] = 1 (XGETBV enabled for application use).
2. Execute XGETBV and verify that XCR0[7:5] = ‘111b’ (opmask state, the upper 256 bits of ZMM0-ZMM15, and
ZMM16-ZMM31 state are enabled by the OS) and that XCR0[2:1] = ‘11b’ (XMM state and YMM state are enabled
by the OS).
3. Verify both CPUID.0x7.0:EBX.AVX512F[bit 16] = 1, and CPUID.0x7.0:EBX.AVX512PF[bit 26] = 1.
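When the prefetch check fails, it can help to know which step rejected the configuration. The sketch below is one possible way to report that, assuming the caller has already read the three register values; the enum pf_check, the function avx512pf_check, and the macro names are all hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define OSXSAVE_BIT     (UINT32_C(1) << 27)  /* CPUID.1:ECX.OSXSAVE       */
#define AVX512F_BIT     (UINT32_C(1) << 16)  /* CPUID.0x7.0:EBX.AVX512F   */
#define AVX512PF_BIT    (UINT32_C(1) << 26)  /* CPUID.0x7.0:EBX.AVX512PF  */
#define XCR0_ZMM_READY  UINT64_C(0xE6)       /* XCR0[7:5] and XCR0[2:1]   */

/* The literal mask must equal the two bit fields named in step 2. */
static_assert(XCR0_ZMM_READY == ((UINT64_C(0x7) << 5) | (UINT64_C(0x3) << 1)),
              "mask must cover XCR0[7:5] = 111b and XCR0[2:1] = 11b");

/* Illustrative result codes, one per detection step. */
enum pf_check {
    PF_OK = 0,
    PF_NO_OSXSAVE,      /* step 1 failed: XGETBV not available            */
    PF_STATE_DISABLED,  /* step 2 failed: OS has not enabled ZMM state    */
    PF_NO_CPU_SUPPORT   /* step 3 failed: AVX512F or AVX512PF not present */
};

/* Hypothetical helper: evaluates the steps in order and reports the
 * first one that fails. */
enum pf_check avx512pf_check(uint32_t cpuid1_ecx, uint64_t xcr0,
                             uint32_t cpuid7_ebx)
{
    const uint32_t need_flags = AVX512F_BIT | AVX512PF_BIT;

    if ((cpuid1_ecx & OSXSAVE_BIT) == 0)
        return PF_NO_OSXSAVE;
    if ((xcr0 & XCR0_ZMM_READY) != XCR0_ZMM_READY)
        return PF_STATE_DISABLED;
    if ((cpuid7_ebx & need_flags) != need_flags)
        return PF_NO_CPU_SUPPORT;
    return PF_OK;
}
```

Returning a distinct code per step is a design choice for diagnostics; an application that only needs a yes/no answer can compare the result against PF_OK.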
15.3 DETECTION OF 512-BIT INSTRUCTION GROUPS OF INTEL® AVX-512 FAMILY
In addition to the Intel AVX-512 Foundation instructions, the Intel AVX-512 family provides several groups of
instruction extensions that can operate at vector lengths of 512, 256, or 128 bits. Each group is enumerated by a
CPUID leaf 7 feature flag and can be encoded via the EVEX.L’L field to support operation at vector lengths smaller
than 512 bits. These instruction groups are listed in Table 15-1.
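To illustrate how the EVEX.L’L field selects a vector length, the following sketch decodes the two bits from the third EVEX payload byte, where L’L occupies bits 6:5. The function name evex_vector_length_bits is hypothetical. The sketch also assumes the common case in which L’L encodes a vector length; it deliberately ignores forms where EVEX.b repurposes these bits for rounding control.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: returns the vector length in bits selected by
 * EVEX.L'L, or 0 for the reserved encoding 11b.
 * Assumption: evex_p2 is the third EVEX payload byte and the instruction
 * does not use EVEX.b to reinterpret L'L as rounding control. */
unsigned evex_vector_length_bits(uint8_t evex_p2)
{
    /* Indexed by L'L: 00b = 128, 01b = 256, 10b = 512, 11b = reserved. */
    static const unsigned length_bits[] = { 128, 256, 512, 0 };
    static_assert(sizeof length_bits / sizeof length_bits[0] == 4,
                  "one table entry per 2-bit L'L encoding");

    unsigned ll = (evex_p2 >> 5) & 0x3u;  /* extract bits 6:5 */
    return length_bits[ll];
}
```

A decoder or disassembler could call this after it has recognized the 62H EVEX prefix and read the payload bytes that follow it.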
Figure 15-3. Procedural Flow for Application Detection of 512-bit Instructions
[Flowchart: check feature flag CPUID.1H:ECX.OSXSAVE = 1 (the OS provides processor extended state management, which implies hardware support for XSAVE, XRSTOR, XGETBV, and XCR0); if yes, check the enabled state in XCR0 via XGETBV (opmask, YMM, and ZMM states enabled); then check AVX512F and additional 512-bit flags; if all checks pass, the instructions are OK to use.]