VPGATHERDD/VPGATHERDQ—Gather Packed Dword, Packed Qword with Signed Dword Indices

INSTRUCTION SET REFERENCE, V-Z

Vol. 2C 5-249

Description

Up to 16 doubleword or up to 8 quadword memory locations, addressed by the base address BASE_ADDR and the index
vector VINDEX scaled by SCALE, are gathered. The result is written into the destination vector register (zmm1 in
the 512-bit form). The elements are specified via the VSIB (i.e., the index register is a vector register holding
packed indices). An element is loaded only if its corresponding mask bit is one. If an element's mask bit is not
set, the corresponding element of the destination register is left unchanged. The entire mask register is set to
zero by this instruction unless it triggers an exception.
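
The gather semantics described above can be sketched as a scalar model. This is a minimal illustration, not the manual's operation pseudocode: the function names `gather_dd` and `sign_extend32` are hypothetical, memory is modeled as a flat bytes-like object (an assumption), only the 512-bit doubleword form is covered, and faults are not modeled.

```python
def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed doubleword."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def gather_dd(dest, k1, memory, base_addr, vindex, scale):
    """Scalar model of the 512-bit VPGATHERDD form (16 dword elements).

    dest, vindex: sequences of 16 integers; k1: 16-bit mask as an int.
    memory: bytes-like object standing in for the address space (assumption).
    Returns (new_dest, new_k1).
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    result = list(dest)
    for j in range(16):
        if (k1 >> j) & 1:
            # Each index is a signed dword, so it may point below BASE_ADDR.
            addr = base_addr + sign_extend32(vindex[j]) * scale
            result[j] = int.from_bytes(memory[addr:addr + 4], "little")
        # Mask bit clear: result[j] keeps its old value (merging-masking).
    return result, 0  # the whole mask is cleared on normal completion
```

Returning the new mask alongside the destination mirrors the fact that k1 is both an input and an output of the instruction.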

This instruction can be suspended by an exception if at least one element is already gathered (i.e., if the
exception is triggered by an element other than the rightmost one with its mask bit set). When this happens, the
destination register and the mask register (k1) are partially updated; those elements that have been gathered are
placed into the destination register and have their mask bits set to zero. If any traps or interrupts are pending
from already gathered elements, they will be delivered in lieu of the exception; in this case, EFLAGS.RF is set to
one so an instruction breakpoint is not re-triggered when the instruction is continued.
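
The suspend-and-resume behavior can be illustrated with a small model. This is a sketch under stated assumptions: `GatherFault` and `gather_dd_restartable` are hypothetical names, a fault is simulated by an out-of-range address in a flat byte array, and elements are processed strictly from the lowest element upward, which is only one of the orders an implementation may use.

```python
class GatherFault(Exception):
    """Hypothetical stand-in for a fault raised by one element's load."""

    def __init__(self, element, dest, k1):
        super().__init__(f"fault on element {element}")
        self.element = element
        self.dest = dest  # partially updated destination
        self.k1 = k1      # mask with completed elements already cleared


def gather_dd_restartable(dest, k1, memory, base_addr, vindex, scale):
    """Model of a 16-element dword gather that can be suspended and resumed.

    vindex holds signed Python integers. Returns (new_dest, new_k1) on
    completion, or raises GatherFault carrying the partial state.
    """
    result = list(dest)
    for j in range(16):
        if not (k1 >> j) & 1:
            continue  # masked off, or already gathered before a fault
        addr = base_addr + vindex[j] * scale
        if addr < 0 or addr + 4 > len(memory):
            # Simulated fault: elements below j are complete, and their
            # mask bits are already zero in k1.
            raise GatherFault(j, result, k1)
        result[j] = int.from_bytes(memory[addr:addr + 4], "little")
        k1 &= ~(1 << j)  # mark this element as done
    return result, 0
```

Re-running the function with the `dest` and `k1` values carried by the exception, after the cause of the fault has been removed, loads only the remaining elements. This is why the cleared mask bits matter: they record the progress made before the fault.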

If the data element size is less than the index element size, the higher part of the destination register and the
mask register do not correspond to any elements being gathered. This instruction sets those higher parts to zero.
It may update these unused elements in one or both of those registers even if the instruction triggers an
exception, and even if the instruction triggers the exception before gathering any elements.

Note that:

• The values may be read from memory in any order. Memory ordering with other instructions follows the
  Intel-64 memory-ordering model.

• Faults are delivered in a right-to-left manner. That is, if a fault is triggered by an element and delivered,
  all elements closer to the LSB of the destination zmm will be completed (and non-faulting). Individual
  elements closer to the MSB may or may not be completed. If a given element triggers multiple faults, they are
  delivered in the conventional order.

• Elements may be gathered in any order, but faults must be delivered in a right-to-left order; thus, elements
  to the left of a faulting one may be gathered before the fault is delivered. A given implementation of this

| Opcode/Instruction | Op/En | 64/32 bit Mode Support | CPUID Feature Flag | Description |
|---|---|---|---|---|
| EVEX.128.66.0F38.W0 90 /vsib VPGATHERDD xmm1 {k1}, vm32x | T1S | V/V | AVX512VL AVX512F | Using signed dword indices, gather dword values from memory using writemask k1 for merging-masking. |
| EVEX.256.66.0F38.W0 90 /vsib VPGATHERDD ymm1 {k1}, vm32y | T1S | V/V | AVX512VL AVX512F | Using signed dword indices, gather dword values from memory using writemask k1 for merging-masking. |
| EVEX.512.66.0F38.W0 90 /vsib VPGATHERDD zmm1 {k1}, vm32z | T1S | V/V | AVX512F | Using signed dword indices, gather dword values from memory using writemask k1 for merging-masking. |
| EVEX.128.66.0F38.W1 90 /vsib VPGATHERDQ xmm1 {k1}, vm32x | T1S | V/V | AVX512VL AVX512F | Using signed dword indices, gather quadword values from memory using writemask k1 for merging-masking. |
| EVEX.256.66.0F38.W1 90 /vsib VPGATHERDQ ymm1 {k1}, vm32x | T1S | V/V | AVX512VL AVX512F | Using signed dword indices, gather quadword values from memory using writemask k1 for merging-masking. |
| EVEX.512.66.0F38.W1 90 /vsib VPGATHERDQ zmm1 {k1}, vm32y | T1S | V/V | AVX512F | Using signed dword indices, gather quadword values from memory using writemask k1 for merging-masking. |
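
The index operand listed for each form appears to follow from the element count: the destination holds VL / (data size) elements, and each element needs one 32-bit index. The sketch below encodes that rule, which is inferred from the rows of the table and is not stated as a rule in the text; the function name `index_operand` is hypothetical.

```python
def index_operand(vl_bits, data_bits):
    """Return (element_count, index_operand_name) for a dword-index gather.

    vl_bits: destination vector length (128, 256, or 512).
    data_bits: 32 for VPGATHERDD, 64 for VPGATHERDQ.
    """
    count = vl_bits // data_bits
    index_bits = count * 32  # one signed dword index per gathered element
    if index_bits <= 128:
        name = "vm32x"  # indices fit in an xmm register
    elif index_bits <= 256:
        name = "vm32y"  # indices fit in a ymm register
    else:
        name = "vm32z"  # indices need a full zmm register
    return count, name
```

Under this rule, the 256-bit VPGATHERDQ form gathers 4 quadwords and needs only 4 dword indices, which is consistent with its vm32x index operand.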

Instruction Operand Encoding

| Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4 |
|---|---|---|---|---|
| T1S | ModRM:reg (w) | BaseReg (R): VSIB:base, VectorReg (R): VSIB:index | NA | NA |