PROGRAMMING WITH AVX, FMA AND AVX2

14.2.1 256-bit Floating-Point Arithmetic Processing Enhancements

Intel AVX provides 35 256-bit floating-point arithmetic instructions (see Table 14-2). The arithmetic operations
cover add, subtract, multiply, divide, square root, compare, max, min, and round, among others, on single-precision
and double-precision floating-point data.
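
As a rough illustration of the arithmetic operations listed above, the following C sketch chains multiply, add, max, and square root on eight single-precision values using compiler intrinsics. The function name `avx_arith_demo` is hypothetical, and the `target("avx")` attribute is a GCC/Clang extension assumed here so that the function builds without a global `-mavx` flag; it is a sketch under those assumptions, not a reference implementation.

```c
#include <immintrin.h>

/* Hypothetical helper: out[i] = sqrt(max(a[i] * b[i] + c[i], 0)) for i = 0..7.
   Assumes GCC or Clang; the target attribute enables AVX for this function only. */
__attribute__((target("avx")))
void avx_arith_demo(const float *a, const float *b, const float *c, float *out)
{
    __m256 va = _mm256_loadu_ps(a);              /* unaligned 256-bit load  */
    __m256 vb = _mm256_loadu_ps(b);
    __m256 vc = _mm256_loadu_ps(c);

    __m256 prod    = _mm256_mul_ps(va, vb);      /* VMULPS  */
    __m256 sum     = _mm256_add_ps(prod, vc);    /* VADDPS  */
    __m256 clamped = _mm256_max_ps(sum, _mm256_setzero_ps()); /* VMAXPS */
    __m256 root    = _mm256_sqrt_ps(clamped);    /* VSQRTPS */

    _mm256_storeu_ps(out, root);                 /* unaligned 256-bit store */
}
```

The pointers are plain `float` arrays of at least eight elements; the caller is expected to confirm AVX support before calling.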
AVX enhances the floating-point compare operations by providing 32 conditional predicates, which improves
programming flexibility in evaluating conditional expressions.
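
To make the role of the compare predicates more concrete, the sketch below evaluates the same two inputs under two of the 32 predicates: an ordered less-than (`_CMP_LT_OQ`) and an unordered not-greater-or-equal (`_CMP_NGE_UQ`). The two predicates differ only in how they treat NaN operands. The function names are hypothetical, and the `target("avx")` attribute again assumes GCC or Clang.

```c
#include <immintrin.h>

/* Hypothetical helper: bit i of the result is set when a[i] < b[i].
   Ordered predicate: any NaN operand makes the comparison false. */
__attribute__((target("avx")))
int mask_lt_ordered(const float *a, const float *b)
{
    __m256 m = _mm256_cmp_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b), _CMP_LT_OQ);
    return _mm256_movemask_ps(m);   /* collect the sign bit of each element */
}

/* Hypothetical helper: bit i of the result is set when !(a[i] >= b[i]).
   Unordered predicate: any NaN operand makes the comparison true. */
__attribute__((target("avx")))
int mask_nge_unordered(const float *a, const float *b)
{
    __m256 m = _mm256_cmp_ps(_mm256_loadu_ps(a), _mm256_loadu_ps(b), _CMP_NGE_UQ);
    return _mm256_movemask_ps(m);
}
```

For inputs without NaN values the two functions return the same mask; they diverge only in positions where either operand is NaN, which is the kind of distinction the extended predicate set lets a program express directly.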

14.2.2 256-bit Non-Arithmetic Instruction Enhancements

Intel AVX provides new primitives for handling data movement within 256-bit floating-point vectors and promotes
many 128-bit floating-point data processing instructions to handle 256-bit floating-point vectors.
AVX includes 39 256-bit data movement and processing instructions that are promoted from previous generations
of SIMD instruction extensions, covering logical, blend, convert, test, unpack, shuffle, load, and store operations
(see Table 14-3).
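
A small C sketch of a few of these promoted data movement operations follows: unaligned 256-bit loads and stores, a blend, and the 256-bit form of MOVSHDUP. The function name `avx_move_demo` and its parameters are hypothetical, and the `target("avx")` attribute assumes GCC or Clang.

```c
#include <immintrin.h>

/* Hypothetical helper operating on arrays of eight floats.
   blended[i] takes b[i] for odd i and a[i] for even i.
   dup_odd[i] holds a[1], a[1], a[3], a[3], a[5], a[5], a[7], a[7]. */
__attribute__((target("avx")))
void avx_move_demo(const float *a, const float *b, float *blended, float *dup_odd)
{
    __m256 va = _mm256_loadu_ps(a);   /* VMOVUPS load */
    __m256 vb = _mm256_loadu_ps(b);

    /* VBLENDPS: bit i of the immediate selects b[i] (1) or a[i] (0).
       0xAA = 10101010b, so the odd-indexed elements come from b. */
    _mm256_storeu_ps(blended, _mm256_blend_ps(va, vb, 0xAA));

    /* VMOVSHDUP: duplicates each odd-indexed element into the even slot below it. */
    _mm256_storeu_ps(dup_odd, _mm256_movehdup_ps(va));
}
```

With `a = {0, 1, ..., 7}` and `b = {10, 11, ..., 17}`, `blended` becomes `{0, 11, 2, 13, 4, 15, 6, 17}` and `dup_odd` becomes `{1, 1, 3, 3, 5, 5, 7, 7}`.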

VEX.256 Encoding   VEX.128 Encoding   Group    Instruction   If No, Reason?
                   yes                         PCMPISTRI
                   yes                         PCMPISTRM
                                      SSE4.2   POPCNT        integer

Table 14-2. Promoted 256-bit and 128-bit Arithmetic AVX Instructions

VEX.256 Encoding   VEX.128 Encoding   Legacy Instruction Mnemonic
yes                yes                SQRTPS, SQRTPD, RSQRTPS, RCPPS
yes                yes                ADDPS, ADDPD, SUBPS, SUBPD
yes                yes                MULPS, MULPD, DIVPS, DIVPD
yes                yes                CVTPS2PD, CVTPD2PS
yes                yes                CVTDQ2PS, CVTPS2DQ
yes                yes                CVTTPS2DQ, CVTTPD2DQ
yes                yes                CVTPD2DQ, CVTDQ2PD
yes                yes                MINPS, MINPD, MAXPS, MAXPD
yes                yes                HADDPD, HADDPS, HSUBPD, HSUBPS
yes                yes                CMPPS, CMPPD
yes                yes                ADDSUBPD, ADDSUBPS, DPPS
yes                yes                ROUNDPD, ROUNDPS

Table 14-3. Promoted 256-bit and 128-bit Data Movement AVX Instructions

VEX.256 Encoding   VEX.128 Encoding   Legacy Instruction Mnemonic
yes                yes                MOVAPS, MOVAPD, MOVDQA
yes                yes                MOVUPS, MOVUPD, MOVDQU
yes                yes                MOVMSKPS, MOVMSKPD
yes                yes                LDDQU, MOVNTPS, MOVNTPD, MOVNTDQ, MOVNTDQA
yes                yes                MOVSHDUP, MOVSLDUP, MOVDDUP