PROGRAMMING WITH AVX, FMA AND AVX2
14.2.1 256-bit Floating-Point Arithmetic Processing Enhancements
Intel AVX provides 35 256-bit floating-point arithmetic instructions (see Table 14-2). The arithmetic operations
cover add, subtract, multiply, divide, square root, compare, max, min, round, and others, on single-precision and
double-precision floating-point data.
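The packed arithmetic described above can be sketched with compiler intrinsics. This is a minimal illustration, not code from this manual: the function name mul_add_sqrt_avx is hypothetical, the target attribute is a GCC/Clang extension, and the sketch assumes the processor and operating system support AVX. Each intrinsic typically compiles to the VEX.256 form of the instruction named in its comment.

```c
#include <assert.h>
#include <stddef.h>
#include <immintrin.h>

/* Hypothetical helper: out[i] = sqrt(a[i] * b[i] + c[i]) for n floats.
   Assumption: n is a multiple of 8, because one 256-bit register
   holds eight single-precision values. */
__attribute__((target("avx")))
void mul_add_sqrt_avx(const float *a, const float *b, const float *c,
                      float *out, size_t n)
{
    assert(n % 8 == 0);
    for (size_t i = 0; i < n; i += 8) {
        __m256 va = _mm256_loadu_ps(a + i);      /* unaligned load, VMOVUPS */
        __m256 vb = _mm256_loadu_ps(b + i);
        __m256 vc = _mm256_loadu_ps(c + i);
        __m256 prod = _mm256_mul_ps(va, vb);     /* VMULPS  */
        __m256 sum  = _mm256_add_ps(prod, vc);   /* VADDPS  */
        __m256 root = _mm256_sqrt_ps(sum);       /* VSQRTPS */
        _mm256_storeu_ps(out + i, root);         /* unaligned store, VMOVUPS */
    }
}
```

A caller would pass four arrays of equal length, for example eight elements each, and read the eight results from out. Arrays whose length is not a multiple of 8 would need a scalar tail loop, which is omitted here.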
AVX enhances the floating-point compare operations by providing 32 conditional predicates, which improves
programming flexibility in evaluating conditional expressions.
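To illustrate why the larger predicate set is useful, the sketch below counts array elements with two of the 32 predicates: an ordered "less than" that is false for NaN, and an unordered "not greater than or equal" that is true for NaN. The function names are hypothetical, __builtin_popcount and the target attribute are GCC/Clang extensions, and AVX support at run time is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <immintrin.h>

/* Hypothetical helper: counts elements with x[i] < limit.
   _CMP_LT_OQ is an ordered, quiet predicate: a NaN element compares false.
   Assumption: n is a multiple of 8. */
__attribute__((target("avx")))
size_t count_lt_ordered(const float *x, size_t n, float limit)
{
    assert(n % 8 == 0);
    const __m256 vlimit = _mm256_set1_ps(limit);
    size_t count = 0;
    for (size_t i = 0; i < n; i += 8) {
        __m256 v = _mm256_loadu_ps(x + i);
        __m256 m = _mm256_cmp_ps(v, vlimit, _CMP_LT_OQ);   /* VCMPPS */
        /* VMOVMSKPS packs the sign bit of each element into bits 0..7. */
        count += (size_t)__builtin_popcount((unsigned)_mm256_movemask_ps(m));
    }
    return count;
}

/* Hypothetical helper: counts elements for which NOT (x[i] >= limit).
   _CMP_NGE_UQ is an unordered, quiet predicate: a NaN element compares true. */
__attribute__((target("avx")))
size_t count_nge_unordered(const float *x, size_t n, float limit)
{
    assert(n % 8 == 0);
    const __m256 vlimit = _mm256_set1_ps(limit);
    size_t count = 0;
    for (size_t i = 0; i < n; i += 8) {
        __m256 v = _mm256_loadu_ps(x + i);
        __m256 m = _mm256_cmp_ps(v, vlimit, _CMP_NGE_UQ);  /* VCMPPS */
        count += (size_t)__builtin_popcount((unsigned)_mm256_movemask_ps(m));
    }
    return count;
}
```

The two functions agree on arrays without NaN values and differ by exactly the number of NaN elements otherwise, so the choice of predicate decides how unordered inputs are treated without extra branching.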
14.2.2 256-bit Non-Arithmetic Instruction Enhancements
Intel AVX provides new primitives for handling data movement within 256-bit floating-point vectors and promotes
many 128-bit floating-point data processing instructions to handle 256-bit floating-point vectors.
AVX includes 39 256-bit data movement and processing instructions that are promoted from previous generations
of SIMD instruction extensions, covering logical, blend, convert, test, unpack, shuffle, load, and store operations
(see Table 14-3).
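As a small sketch of the promoted load, unpack, and store operations, the function below interleaves two 8-element vectors. The names interleave8 and is_aligned32 are hypothetical, the target attribute is a GCC/Clang extension, and AVX support at run time is assumed. The sketch also shows a property worth keeping in mind: the 256-bit unpack forms operate within each 128-bit half of the register independently.

```c
#include <assert.h>
#include <stdint.h>
#include <immintrin.h>

/* Hypothetical helper: true when p sits on a 32-byte boundary. */
static int is_aligned32(const void *p)
{
    return ((uintptr_t)p % 32u) == 0;
}

/* Hypothetical helper: interleaves a[0..7] and b[0..7].
   Assumption: a and b are 32-byte aligned, as the aligned load requires;
   lo and hi may have any alignment. */
__attribute__((target("avx")))
void interleave8(const float *a, const float *b, float *lo, float *hi)
{
    assert(is_aligned32(a) && is_aligned32(b));
    __m256 va  = _mm256_load_ps(a);            /* aligned load, VMOVAPS */
    __m256 vb  = _mm256_load_ps(b);
    __m256 vlo = _mm256_unpacklo_ps(va, vb);   /* VUNPCKLPS */
    __m256 vhi = _mm256_unpackhi_ps(va, vb);   /* VUNPCKHPS */
    _mm256_storeu_ps(lo, vlo);                 /* unaligned store, VMOVUPS */
    _mm256_storeu_ps(hi, vhi);
}
```

With a = {0, 1, ..., 7} and b = {10, 11, ..., 17}, lo receives {0, 10, 1, 11, 4, 14, 5, 15} and hi receives {2, 12, 3, 13, 6, 16, 7, 17}: elements 0 to 3 and elements 4 to 7 are interleaved separately rather than across the full 256 bits.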
VEX.256 Encoding  VEX.128 Encoding  Group   Instruction  If No, Reason?
no                yes                       PCMPISTRI    VI
no                yes                       PCMPISTRM    VI
no                no                SSE4.2  POPCNT       integer
Table 14-2. Promoted 256-bit and 128-bit Arithmetic AVX Instructions
VEX.256 Encoding  VEX.128 Encoding  Legacy Instruction Mnemonic
yes               yes               SQRTPS, SQRTPD, RSQRTPS, RCPPS
yes               yes               ADDPS, ADDPD, SUBPS, SUBPD
yes               yes               MULPS, MULPD, DIVPS, DIVPD
yes               yes               CVTPS2PD, CVTPD2PS
yes               yes               CVTDQ2PS, CVTPS2DQ
yes               yes               CVTTPS2DQ, CVTTPD2DQ
yes               yes               CVTPD2DQ, CVTDQ2PD
yes               yes               MINPS, MINPD, MAXPS, MAXPD
yes               yes               HADDPD, HADDPS, HSUBPD, HSUBPS
yes               yes               CMPPS, CMPPD
yes               yes               ADDSUBPD, ADDSUBPS, DPPS
yes               yes               ROUNDPD, ROUNDPS
Table 14-3. Promoted 256-bit and 128-bit Data Movement AVX Instructions
VEX.256 Encoding  VEX.128 Encoding  Legacy Instruction Mnemonic
yes               yes               MOVAPS, MOVAPD, MOVDQA
yes               yes               MOVUPS, MOVUPD, MOVDQU
yes               yes               MOVMSKPS, MOVMSKPD
yes               yes               LDDQU, MOVNTPS, MOVNTPD, MOVNTDQ, MOVNTDQA
yes               yes               MOVSHDUP, MOVSLDUP, MOVDDUP