

PROGRAMMING WITH INTEL® AVX-512

15.6.1   OPMASK Register to Predicate Vector Data Processing

AVX-512 instructions using EVEX encode a predicate operand that conditionally controls the per-element 
computational operation and the update of the result to the destination operand. The predicate operand is known 
as the opmask register. The opmask registers are a set of eight architectural registers of size MAX_KL (64-bit). Of 
these eight registers, only k1 through k7 can be addressed as a predicate operand; k0 can be used as a regular 
source or destination but cannot be encoded as a predicate operand. A predicate operand can also be used to 
enable memory fault-suppression for some instructions with a memory operand (source or destination). 
As a predicate operand, an opmask register contains one bit to govern the operation on, and the update of, each 
data element of a vector register. In general, opmask registers support instructions with all element sizes: byte 
(int8), word (int16), single-precision floating-point (float32), integer doubleword (int32), double-precision 
floating-point (float64), and integer quadword (int64). A ZMM vector register can therefore hold 8, 16, 32, or 64 
elements, depending on the element size. The length of an opmask register, MAX_KL, is 64 bits, which is sufficient 
to handle up to 64 elements with one bit per element. Masking is supported in most AVX-512 instructions. For a 
given vector length, each instruction accesses only the number of least significant mask bits needed for its data 
type. For example, AVX-512 Foundation instructions operating on 64-bit data elements with a 512-bit vector length 
use only the 8 least significant bits of the opmask register.
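The relationship between vector length, element size, and the number of mask bits consulted can be sketched as simple arithmetic. The following Python fragment is only an illustrative model; the function names `mask_bits_used` and `effective_mask` are hypothetical and do not correspond to any instruction or intrinsic.

```python
MAX_KL = 64  # opmask register width in bits, as stated in the text


def mask_bits_used(vector_bits: int, element_bits: int) -> int:
    """Return how many least significant opmask bits an instruction consults."""
    if vector_bits % element_bits != 0:
        raise ValueError("element size must divide the vector length")
    count = vector_bits // element_bits  # one mask bit per data element
    if count > MAX_KL:
        raise ValueError("more elements than opmask bits")
    return count


def effective_mask(k: int, vector_bits: int, element_bits: int) -> int:
    """Keep only the mask bits that the instruction actually reads."""
    count = mask_bits_used(vector_bits, element_bits)
    return k & ((1 << count) - 1)


# 512-bit vector of 64-bit elements: only the low 8 mask bits matter.
print(mask_bits_used(512, 64))
print(hex(effective_mask(0xFFFFFFFFFFFFFFFF, 512, 64)))
```

With a 512-bit vector and byte elements the same calculation yields 64, which is why MAX_KL is 64 bits.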
An opmask register affects an AVX-512 instruction at per-element granularity. Any numeric or non-numeric 
operation on each data element, and each per-element update of intermediate results to the destination operand, 
is predicated on the corresponding bit of the opmask register. 
An opmask serving as a predicate operand in AVX-512 obeys the following properties:

• The instruction’s operation is not performed for an element if the corresponding opmask bit is not set. This 
implies that no exception or violation can be caused by an operation on a masked-off element. Consequently, 
no MXCSR exception flag is updated as a result of a masked-off operation.

• A destination element is not updated with the result of the operation if the corresponding writemask bit is not 
set. Instead, the destination element value must be preserved (merging-masking) or it must be zeroed out 
(zeroing-masking). 

• For some instructions with a memory operand, memory faults are suppressed for elements with a mask bit of 0.
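The first two properties above can be sketched as a scalar reference model. This is a minimal sketch, not a description of the hardware: `masked_op` is a hypothetical helper, Python lists stand in for vector registers, and a Python `ZeroDivisionError` stands in for an exception that a masked-off element must never raise.

```python
def masked_op(op, dest, src1, src2, k, zeroing=False):
    """Model one masked vector operation, element by element.

    Bit i of k governs element i. For a clear bit the operation is not
    evaluated at all, so it cannot raise; the destination element is
    either preserved (merging) or set to 0 (zeroing).
    """
    result = []
    for i, old in enumerate(dest):
        if (k >> i) & 1:
            result.append(op(src1[i], src2[i]))  # enabled element
        elif zeroing:
            result.append(0)                     # zeroing-masking
        else:
            result.append(old)                   # merging-masking
    return result


dest = [9, 9, 9, 9]
a = [8, 8, 8, 8]
b = [1, 0, 2, 0]   # elements 1 and 3 would divide by zero
k = 0b0101         # only elements 0 and 2 are enabled

merged = masked_op(lambda x, y: x // y, dest, a, b, k)
zeroed = masked_op(lambda x, y: x // y, dest, a, b, k, zeroing=True)
print(merged)  # [8, 9, 4, 9]
print(zeroed)  # [8, 0, 4, 0]
```

Neither call raises, because the divisions for elements 1 and 3 are never performed.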

This feature provides a versatile construct for implementing control-flow predication, because the mask in effect 
provides a merging behavior for AVX-512 vector register destinations. Alternatively, masking can be used for 
zeroing instead of merging, so that the masked-off elements are updated with 0 instead of preserving the old 
value. The zeroing behavior is provided to remove the implicit dependency on the old value when it is not needed.
Most instructions with masking enabled accept both forms of masking. Instructions that must have EVEX.aaa bits 
different from 0 (gather and scatter) and instructions that write to memory accept only merging-masking. 
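As an illustration of control-flow predication, a scalar loop containing an `if` can be rewritten as a compare that produces a mask followed by a merging-masked operation. The sketch below models this in plain Python; the helper names (`compare_lt`, `masked_negate`, and the two `*_version` functions) are hypothetical and chosen only for this example.

```python
def compare_lt(a, b):
    """Model a vector compare: set bit i of the mask where a[i] < b[i]."""
    k = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if x < y:
            k |= 1 << i
    return k


def masked_negate(dest, src, k):
    """Merging-masked negate: dest[i] = -src[i] where bit i of k is set."""
    return [-s if (k >> i) & 1 else d
            for i, (d, s) in enumerate(zip(dest, src))]


def scalar_version(x, y):
    """Original control flow: a branch per element."""
    out = list(y)
    for i in range(len(x)):
        if x[i] < 0:
            out[i] = -x[i]
    return out


def predicated_version(x, y):
    """Branch-free form: the condition becomes a mask, the body a masked op."""
    k = compare_lt(x, [0] * len(x))
    return masked_negate(y, x, k)


x = [3, -1, 0, -7]
y = [10, 20, 30, 40]
print(scalar_version(x, y))      # [10, 1, 30, 7]
print(predicated_version(x, y))  # [10, 1, 30, 7]
```

Merging-masking is what keeps `y` unchanged in the elements where the condition is false; with zeroing-masking those elements would become 0 instead.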
The per-element destination update rule also applies when the destination operand is a memory location. Vectors 
are written on a per-element basis, based on the opmask register used as a predicate operand. 
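A masked store to memory can be modeled in the same way. In this sketch a Python list stands in for memory at element granularity and `masked_store` is a hypothetical helper; an out-of-range index plays the role of a memory fault, which is only an analogy for the fault-suppression behavior that some instructions provide.

```python
def masked_store(memory, base, values, k):
    """Write values[i] to memory[base + i] only where bit i of k is 1."""
    for i, value in enumerate(values):
        if (k >> i) & 1:
            memory[base + i] = value  # only enabled elements touch memory
    return memory


memory = [0] * 8
masked_store(memory, 2, [11, 22, 33, 44], 0b1001)
print(memory)  # [0, 0, 11, 0, 0, 44, 0, 0]

# Elements 2 and 3 would land past the end of this small buffer, but they
# are masked off, so those locations are never accessed.
small = [7] * 4
masked_store(small, 2, [1, 2, 3, 4], 0b0011)
print(small)   # [7, 7, 1, 2]
```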
The value of an opmask register can be:

• Generated as a result of a vector instruction (e.g., CMP, FPCLASS, etc.).

• Loaded from memory.

• Loaded from a general-purpose register (GPR).

• Modified by mask-to-mask operations.
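The ways of producing a mask value listed above can be modeled with ordinary integers. The sketch below assumes a 16-bit mask width for readability; all function names are hypothetical stand-ins for a move from a general-purpose register, a vector compare, and bitwise mask-to-mask operations.

```python
import operator

WIDTH = 16                 # assumed mask width for this illustration
FULL = (1 << WIDTH) - 1


def mask_from_int(value):
    """Model loading a mask from a general-purpose register or memory."""
    return value & FULL


def mask_from_compare(a, b, predicate):
    """Model a vector compare that writes one result bit per element."""
    k = 0
    for i, (x, y) in enumerate(zip(a, b)):
        if predicate(x, y):
            k |= 1 << i
    return k


def mask_and(k1, k2):
    return k1 & k2


def mask_or(k1, k2):
    return k1 | k2


def mask_not(k):
    return ~k & FULL       # stay within the modeled mask width


a = [1, 5, 3, 7]
b = [4, 4, 4, 4]
k_gt = mask_from_compare(a, b, operator.gt)
k_lt = mask_from_compare(a, b, operator.lt)
print(bin(k_gt), bin(k_lt))        # 0b1010 0b101
print(bin(mask_or(k_gt, k_lt)))    # 0b1111
print(hex(mask_not(k_gt)))         # 0xfff5
```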

Opmask registers can be used for purposes outside of predication. For example, they can be used to manipulate 
sparse sets of elements from a vector, or used to set the EFLAGS based on the 0/0xFFFFFFFFFFFFFFFF/other status 
of the OR of two opmask registers.
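The three-way status described above (all zeros, all ones, or anything else) can be sketched as follows. The model assumes the flag assignment used by the KORTEST instructions, where ZF reports an all-zero OR and CF reports an all-ones OR; the function name `or_test` and the `width` parameter are hypothetical.

```python
def or_test(k1, k2, width=64):
    """Return (ZF, CF) for the OR of two masks of the given width.

    ZF = 1 when the OR is all zeros, CF = 1 when the OR is all ones,
    and both are 0 for any other value.
    """
    full = (1 << width) - 1
    combined = (k1 | k2) & full
    zf = int(combined == 0)
    cf = int(combined == full)
    return zf, cf


print(or_test(0, 0))                                    # (1, 0)
print(or_test(0xFFFFFFFF00000000, 0x00000000FFFFFFFF))  # (0, 1)
print(or_test(0b01, 0b10))                              # (0, 0)
```

A typical use is a loop exit test: an all-zero result means no element remains active, and an all-ones result means every element satisfied the condition.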

15.6.1.1   Opmask Register K0

The only exception to the opmask rules described above is that opmask k0 cannot be encoded as a predicate 
operand for a vector operation; the encoding value that would select 
opmask k0 will instead select an implicit opmask value of 0xFFFFFFFFFFFFFFFF, thereby effectively disabling