2-40 Vol. 2A

INSTRUCTION FORMAT

where EVEX-encoded instructions are classified using the tupletype attribute. The scale factor N of each tupletype 
is listed as a function of the vector length (VL) and the other factors that affect it.
Table 2-34 covers EVEX-encoded instructions that have load semantics in conjunction with an additional computational 
or data element movement operation, operating either on the full vector or on half the vector (due to conversion of 
numerical precision from a wider format to a narrower format). EVEX.b is supported for such instructions for data 
element sizes that are either dword or qword (see Section 2.6.11). 
EVEX-encoded instructions that are pure load/store, and instructions with “Load+op” semantics that operate on data 
element sizes less than dword, do not support broadcasting using EVEX.b. These are listed in Table 2-35. Table 2-35 
also includes many broadcast instructions that perform broadcast using a subset of data elements without using 
EVEX.b, as well as a few data element size conversion instructions. Instructions classified in Table 2-35 do not 
use EVEX.b; EVEX.b must be 0, otherwise #UD will occur.
The tupletype abbreviation is referenced in the instruction operand encoding table on the reference page of 
each instruction, providing the cross reference for the scaling factor N used to encode the memory addressing operand. 
Note that the disp8*N rules still apply when using 16-bit addressing.
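
As an illustration of how the scaling factor N is applied, the following Python sketch models compressed displacement from an encoder's point of view. It assumes the usual disp8*N convention, in which the effective displacement is the sign-extended 8-bit displacement multiplied by N. The function names compress_disp8 and expand_disp8 are hypothetical and are not part of any instruction set reference or tool.

```python
def compress_disp8(disp, n):
    """Return the signed 8-bit value that encodes `disp` under disp8*N.

    Returns None when `disp` is not a multiple of N or when the quotient
    does not fit in a signed byte; an encoder would then need a wider
    displacement form instead (assumption of this sketch).
    """
    if n <= 0:
        raise ValueError("scale factor N must be positive")
    if disp % n != 0:
        return None          # not a multiple of N, cannot be compressed
    quotient = disp // n
    if -128 <= quotient <= 127:
        return quotient      # fits in a signed byte
    return None


def expand_disp8(disp8, n):
    """Return the effective displacement for a stored disp8 and scale N."""
    if not -128 <= disp8 <= 127:
        raise ValueError("disp8 must fit in a signed byte")
    return disp8 * n
```

With N = 64 (for example, a full-vector access at VL = 512 without broadcast), a displacement of 128 compresses to a disp8 of 2, whereas a displacement of 4 cannot be compressed because it is not a multiple of 64.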

Table 2-34.  Compressed Displacement (DISP8*N) Affected by Embedded Broadcast

TupleType         EVEX.b  InputSize  EVEX.W  Broadcast  N (VL=128)  N (VL=256)  N (VL=512)  Comment
Full Vector (FV)  0       32bit      0       none       16          32          64          Load+Op (Full Vector Dword/Qword)
Full Vector (FV)  1       32bit      0       {1tox}     4           4           4
Full Vector (FV)  0       64bit      1       none       16          32          64
Full Vector (FV)  1       64bit      1       {1tox}     8           8           8
Half Vector (HV)  0       32bit      0       none       8           16          32          Load+Op (Half Vector)
Half Vector (HV)  1       32bit      0       {1tox}     4           4           4
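
The rows of Table 2-34 can be expressed as a small lookup, sketched below in Python. The dictionary transcribes the table as listed here; the names TABLE_2_34, VL_INDEX, and scale_with_broadcast are hypothetical, and raising ValueError for combinations not listed in the table is a choice of this sketch rather than a statement about processor behavior.

```python
# (tupletype, EVEX.b, EVEX.W) -> N in bytes for VL = 128, 256, 512,
# transcribed from Table 2-34.
TABLE_2_34 = {
    ("FV", 0, 0): (16, 32, 64),  # 32-bit elements, no broadcast
    ("FV", 1, 0): (4, 4, 4),     # 32-bit elements, {1tox} broadcast
    ("FV", 0, 1): (16, 32, 64),  # 64-bit elements, no broadcast
    ("FV", 1, 1): (8, 8, 8),     # 64-bit elements, {1tox} broadcast
    ("HV", 0, 0): (8, 16, 32),   # 32-bit elements, no broadcast
    ("HV", 1, 0): (4, 4, 4),     # 32-bit elements, {1tox} broadcast
}

VL_INDEX = {128: 0, 256: 1, 512: 2}


def scale_with_broadcast(tupletype, evex_b, evex_w, vl):
    """Look up the disp8*N scale factor for an FV or HV tupletype."""
    try:
        row = TABLE_2_34[(tupletype, evex_b, evex_w)]
        return row[VL_INDEX[vl]]
    except KeyError:
        raise ValueError(
            f"combination not listed in Table 2-34: "
            f"{tupletype}, b={evex_b}, W={evex_w}, VL={vl}"
        ) from None
```

For example, a Full Vector instruction with 32-bit elements at VL = 512 has N = 64 without broadcast, but N = 4 when EVEX.b selects embedded broadcast, since only one element is read from memory.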

Table 2-35.  EVEX DISP8*N for Instructions Not Affected by Embedded Broadcast

TupleType              InputSize  EVEX.W  N (VL=128)  N (VL=256)  N (VL=512)  Comment
Full Vector Mem (FVM)  N/A        N/A     16          32          64          Load/store or subDword full vector
Tuple1 Scalar (T1S)    8bit       N/A     1           1           1           1 Tuple less than Full Vector
Tuple1 Scalar (T1S)    16bit      N/A     2           2           2
Tuple1 Scalar (T1S)    32bit      0       4           4           4
Tuple1 Scalar (T1S)    64bit      1       8           8           8
Tuple1 Fixed (T1F)     32bit      N/A     4           4           4           1 Tuple memsize not affected by EVEX.W
Tuple1 Fixed (T1F)     64bit      N/A     8           8           8
Tuple2 (T2)            32bit      0       8           8           8           Broadcast (2 elements)
Tuple2 (T2)            64bit      1       NA          16          16
Tuple4 (T4)            32bit      0       NA          16          16          Broadcast (4 elements)
Tuple4 (T4)            64bit      1       NA          NA          32
Tuple8 (T8)            32bit      0       NA          NA          32          Broadcast (8 elements)
Half Mem (HVM)         N/A        N/A     8           16          32          SubQword Conversion
QuarterMem (QVM)       N/A        N/A     4           8           16          SubDword Conversion
OctMem (OVM)           N/A        N/A     2           4           8           SubWord Conversion
Mem128 (M128)          N/A        N/A     16          16          16          Shift count from memory
MOVDDUP (DUP)          N/A        N/A     8           32          64          VMOVDDUP
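
A similar lookup can be sketched in Python for Table 2-35. The dictionary transcribes the table as listed here, using None for an input size shown as N/A and for a scale factor shown as NA. The names TABLE_2_35 and scale_without_broadcast are hypothetical, and treating NA entries as errors is an assumption of this sketch.

```python
# (tupletype, input size in bits or None for N/A) -> N in bytes for
# VL = 128, 256, 512. A None inside the tuple marks an NA table entry.
TABLE_2_35 = {
    ("FVM", None): (16, 32, 64),
    ("T1S", 8): (1, 1, 1),
    ("T1S", 16): (2, 2, 2),
    ("T1S", 32): (4, 4, 4),
    ("T1S", 64): (8, 8, 8),
    ("T1F", 32): (4, 4, 4),
    ("T1F", 64): (8, 8, 8),
    ("T2", 32): (8, 8, 8),
    ("T2", 64): (None, 16, 16),
    ("T4", 32): (None, 16, 16),
    ("T4", 64): (None, None, 32),
    ("T8", 32): (None, None, 32),
    ("HVM", None): (8, 16, 32),
    ("QVM", None): (4, 8, 16),
    ("OVM", None): (2, 4, 8),
    ("M128", None): (16, 16, 16),
    ("DUP", None): (8, 32, 64),
}

VL_COLUMN = {128: 0, 256: 1, 512: 2}


def scale_without_broadcast(tupletype, vl, input_size=None):
    """Look up the disp8*N scale factor for a Table 2-35 tupletype."""
    key = (tupletype, input_size)
    if key not in TABLE_2_35 or vl not in VL_COLUMN:
        raise ValueError(f"combination not listed in Table 2-35: {key}, VL={vl}")
    n = TABLE_2_35[key][VL_COLUMN[vl]]
    if n is None:
        raise ValueError(f"{tupletype} is listed as NA at VL={vl}")
    return n
```

For example, a Tuple1 Scalar access with a 32-bit input size has N = 4 at every vector length, while a Half Mem access scales with the vector length (8, 16, or 32 bytes).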