PROGRAMMING WITH INTEL® STREAMING SIMD EXTENSIONS (INTEL® SSE)

10.4.3 SSE Conversion Instructions

SSE conversion instructions (see Figure 11-8) support packed and scalar conversions between single-precision 
floating-point and doubleword integer formats.
The CVTPI2PS (convert packed doubleword integers to packed single-precision floating-point values) instruction 
converts two packed signed doubleword integers into two packed single-precision floating-point values. When the 
conversion is inexact, the result is rounded according to the rounding mode selected in the MXCSR register. 
The CVTSI2SS (convert doubleword integer to scalar single-precision floating-point value) instruction converts a 
signed doubleword integer into a single-precision floating-point value. When the conversion is inexact, the result is 
rounded according to the rounding mode selected in the MXCSR register. 
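
The inexact case can be illustrated with a small portable C model. This is a sketch, not the instruction itself: model_cvtsi2ss and conversion_is_exact are hypothetical helper names, and the sketch assumes the compiler's integer-to-float cast rounds to nearest even, which corresponds to the default MXCSR rounding mode. A single-precision value carries a 24-bit significand, so integers needing more significant bits than that cannot all be represented exactly.

```c
#include <stdint.h>

/* Hypothetical model of CVTSI2SS: convert a signed doubleword integer to
   single precision. Assumes the cast rounds to nearest even, matching the
   default MXCSR rounding mode; other rounding modes are not modeled. */
float model_cvtsi2ss(int32_t value) {
    return (float)value;
}

/* Returns 1 if the conversion was exact, 0 if the result was rounded.
   The comparison is done in 64 bits so that a result of 2^31 (from
   rounding INT32_MAX upward) stays in range. */
int conversion_is_exact(int32_t value) {
    return (int64_t)model_cvtsi2ss(value) == (int64_t)value;
}
```

Under these assumptions, 16777216 (2^24) converts exactly, while 16777217 lies halfway between two representable values and is rounded to 16777216.0.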
The CVTPS2PI (convert packed single-precision floating-point values to packed doubleword integers) instruction 
converts two packed single-precision floating-point values into two packed signed doubleword integers. When the 
conversion is inexact, the result is rounded according to the rounding mode selected in the MXCSR register. The 
CVTTPS2PI (convert with truncation packed single-precision floating-point values to packed doubleword integers) 
instruction is similar to the CVTPS2PI instruction, except that truncation is used to round a source value to an 
integer value (see Section 4.8.4.2, “Truncation with SSE and SSE2 Conversion Instructions”).
The CVTSS2SI (convert scalar single-precision floating-point value to doubleword integer) instruction converts a 
single-precision floating-point value into a signed doubleword integer. When the conversion is inexact, the result is 
rounded according to the rounding mode selected in the MXCSR register. The CVTTSS2SI (convert with truncation 
scalar single-precision floating-point value to doubleword integer) instruction is similar to the CVTSS2SI instruction, 
except that truncation is used to round the source value to an integer value (see Section 4.8.4.2, “Truncation 
with SSE and SSE2 Conversion Instructions”).
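
The difference between the rounding and truncating forms can be sketched in portable C. The function names below are hypothetical, and the rounding model covers only round-to-nearest-even (the default MXCSR mode). NaN and out-of-range inputs are not modeled, so the sketch assumes the source value fits in a signed doubleword.

```c
#include <stdint.h>

/* Hypothetical model of CVTTSS2SI: a C cast truncates toward zero.
   Assumes x is within the range of int32_t. */
int32_t model_cvttss2si(float x) {
    return (int32_t)x;
}

/* Hypothetical model of CVTSS2SI under round-to-nearest-even only.
   The fractional part x - t is computed exactly in single precision,
   so the comparisons against 0.5 are exact. */
int32_t model_cvtss2si_nearest(float x) {
    int32_t t = (int32_t)x;          /* truncated value */
    float frac = x - (float)t;       /* signed fractional part */
    int odd = (t % 2) != 0;          /* ties go to the even neighbor */

    if (frac > 0.5f || (frac == 0.5f && odd)) {
        return t + 1;
    }
    if (frac < -0.5f || (frac == -0.5f && odd)) {
        return t - 1;
    }
    return t;
}
```

For example, 2.7 becomes 3 in the rounding model but 2 in the truncating model, and the tie cases 2.5 and 3.5 round to 2 and 4 respectively.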

10.4.4 SSE 64-Bit SIMD Integer Instructions

SSE extensions add the following 64-bit packed integer instructions to the IA-32 architecture. These instructions 
operate on data in MMX registers and 64-bit memory locations. 

NOTE

When SSE2 extensions are present in an IA-32 processor, these instructions are extended to 
operate on 128-bit operands in XMM registers and 128-bit memory locations.

The PAVGB (compute average of packed unsigned byte integers) and PAVGW (compute average of packed 
unsigned word integers) instructions compute a SIMD average of two packed unsigned byte or word integer operands, 
respectively. For each corresponding pair of data elements in the packed source operands, the elements are 
added together, a 1 is added to the temporary sum, and that result is shifted right one bit position.
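
The add, add-one, shift-right sequence described above can be written as a scalar C model. This is an illustrative sketch: avg_u8 and model_pavgb are hypothetical names, and a 64-bit MMX operand is represented as a uint64_t with byte 0 in the least significant bits. The sum is formed in a wider type so that the carry out of bit 7 is not lost before the shift.

```c
#include <stdint.h>

/* Rounded average of two unsigned bytes: (a + b + 1) >> 1.
   The sum is computed in a wider type so it cannot overflow. */
uint8_t avg_u8(uint8_t a, uint8_t b) {
    return (uint8_t)(((unsigned)a + b + 1u) >> 1);
}

/* Hypothetical model of PAVGB on 64-bit operands: apply the rounded
   average independently to each of the eight byte lanes. */
uint64_t model_pavgb(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        result |= (uint64_t)avg_u8(x, y) << (8 * i);
    }
    return result;
}
```

Because of the added 1, an average with a fractional part of one half rounds up: avg_u8(1, 2) yields 2, not 1.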
The PEXTRW (extract word) instruction copies a selected word from an MMX register into a general-purpose 
register.
The PINSRW (insert word) instruction copies a word from a general-purpose register or from memory into a 
selected word location in an MMX register.
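
A scalar C sketch of the word extract and insert operations follows. The names model_pextrw and model_pinsrw are hypothetical, and an MMX register is modeled as a uint64_t with word 0 in the least significant 16 bits. The sketch assumes that the word is selected by the two low-order bits of an immediate operand and that the extracted word is zero-extended; neither detail is stated in this section.

```c
#include <stdint.h>

/* Hypothetical model of PEXTRW (MMX form): copy the selected 16-bit word
   into a general-purpose register value, zero-extended.
   Assumption: only the low 2 bits of imm8 select the word. */
uint32_t model_pextrw(uint64_t mm, unsigned imm8) {
    unsigned shift = 16 * (imm8 & 3u);
    return (uint32_t)((mm >> shift) & 0xFFFFu);
}

/* Hypothetical model of PINSRW (MMX form): replace the selected word
   with the low 16 bits of r32, leaving the other three words unchanged. */
uint64_t model_pinsrw(uint64_t mm, uint32_t r32, unsigned imm8) {
    unsigned shift = 16 * (imm8 & 3u);
    uint64_t mask = (uint64_t)0xFFFFu << shift;
    return (mm & ~mask) | ((uint64_t)(r32 & 0xFFFFu) << shift);
}
```

With mm = 0x4444333322221111, extracting word 2 gives 0x3333, and inserting 0xBEEF into word 1 gives 0x44443333BEEF1111.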
The PMAXUB (maximum of packed unsigned byte integers) instruction compares the corresponding unsigned byte 
integers in two packed operands and returns the greater of each comparison to the destination operand.
The PMINUB (minimum of packed unsigned byte integers) instruction compares the corresponding unsigned byte 
integers in two packed operands and returns the lesser of each comparison to the destination operand.
The PMAXSW (maximum of packed signed word integers) instruction compares the corresponding signed word 
integers in two packed operands and returns the greater of each comparison to the destination operand.
The PMINSW (minimum of packed signed word integers) instruction compares the corresponding signed word integers 
in two packed operands and returns the lesser of each comparison to the destination operand.
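
The practical difference between the unsigned byte and signed word comparisons can be sketched with two scalar C models. The function names are hypothetical, 64-bit operands are modeled as uint64_t values with element 0 in the least significant bits, and the sketch assumes a two's-complement C implementation for the conversion to int16_t. PMINUB and PMAXSW follow the same pattern with the comparison reversed.

```c
#include <stdint.h>

/* Hypothetical model of PMAXUB: per-byte maximum, bytes compared as
   unsigned values (0x80 is 128, greater than 0x01). */
uint64_t model_pmaxub(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t x = (uint8_t)(a >> (8 * i));
        uint8_t y = (uint8_t)(b >> (8 * i));
        uint8_t m = (x > y) ? x : y;
        result |= (uint64_t)m << (8 * i);
    }
    return result;
}

/* Hypothetical model of PMINSW: per-word minimum, words compared as
   signed values (0x8000 is -32768, less than 0x7FFF).
   Assumes the uint16_t to int16_t conversion wraps (two's complement). */
uint64_t model_pminsw(uint64_t a, uint64_t b) {
    uint64_t result = 0;
    for (int i = 0; i < 4; i++) {
        int16_t x = (int16_t)(uint16_t)(a >> (16 * i));
        int16_t y = (int16_t)(uint16_t)(b >> (16 * i));
        int16_t m = (x < y) ? x : y;
        result |= (uint64_t)(uint16_t)m << (16 * i);
    }
    return result;
}
```

The same bit pattern can therefore win or lose a comparison depending on the instruction: 0x80 is the larger byte under PMAXUB, whereas 0x8000 is the smaller word under PMINSW.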
The PMOVMSKB (move byte mask) instruction creates an 8-bit mask from the packed byte integers in an MMX 
register and stores the result in the low byte of a general-purpose register. The mask contains the most significant 
bit of each byte in the MMX register. (When operating on 128-bit operands, a 16-bit mask is created.)
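
The mask construction for the 64-bit case can be sketched as a scalar C model. The name model_pmovmskb is hypothetical, and the MMX register is modeled as a uint64_t with byte 0 in the least significant bits, so that bit i of the mask is taken from the most significant bit of byte i.

```c
#include <stdint.h>

/* Hypothetical model of PMOVMSKB on a 64-bit operand: gather the most
   significant bit of each of the eight bytes into the low byte of the
   result. The upper bits of the result are zero. */
uint32_t model_pmovmskb(uint64_t mm) {
    uint32_t mask = 0;
    for (int i = 0; i < 8; i++) {
        uint32_t msb = (uint32_t)((mm >> (8 * i + 7)) & 1u);
        mask |= msb << i;
    }
    return mask;
}
```

A common use of such a mask is to test the result of a packed byte comparison in one step: a mask of 0 means no byte had its high bit set, and 0xFF means all eight did.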