8-8 Vol. 3A

MULTIPLE-PROCESSOR MANAGEMENT

These examples are limited to accesses to memory regions defined as write-back cacheable (WB). (Section 8.2.3.1 
describes other limitations on the generality of the examples.) The reader should understand that they describe 
only software-visible behavior. A logical processor may reorder two accesses even if one of the examples indicates that 
they may not be reordered. Such an example states only that software cannot detect that such a reordering 
occurred. Similarly, a logical processor may execute a memory access more than once as long as the behavior 
visible to software is consistent with a single execution of the memory access.

8.2.3.1  Assumptions, Terminology, and Notation

As noted above, the examples in this section are limited to accesses to memory regions defined as write-back 
cacheable (WB). They apply only to ordinary loads and stores and to locked read-modify-write instructions. They do not 
necessarily apply to any of the following: out-of-order stores for string instructions (see Section 8.2.4); accesses 
with a non-temporal hint; reads from memory by the processor as part of address translation (e.g., page walks); 
and updates to segmentation and paging structures by the processor (e.g., to update “accessed” bits).
The principles underlying the examples in this section apply to individual memory accesses and to locked read-
modify-write instructions. The Intel-64 memory-ordering model guarantees that, for each of the following 
memory-access instructions, the constituent memory operation appears to execute as a single memory access:

Instructions that read or write a single byte.

Instructions that read or write a word (2 bytes) whose address is aligned on a 2-byte boundary.

Instructions that read or write a doubleword (4 bytes) whose address is aligned on a 4-byte boundary.

Instructions that read or write a quadword (8 bytes) whose address is aligned on an 8-byte boundary.

Any locked instruction (either the XCHG instruction or another read-modify-write instruction with a LOCK prefix) 
appears to execute as an indivisible and uninterruptible sequence of load(s) followed by store(s) regardless of 
alignment.
Other instructions may be implemented with multiple memory accesses. From a memory-ordering point of view, 
there are no guarantees regarding the relative order in which the constituent memory accesses are made. There is 
also no guarantee that the constituent operations of a store are executed in the same order as the constituent 
operations of a load.
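The single-access cases listed above can be sketched as a small C predicate. This is an illustrative sketch only: the function names appears_single_access, appears_indivisible, and check_examples are hypothetical and do not come from this manual, and a false result means only that the lists above give no guarantee, not that the access will necessarily be split.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: true when an ordinary load or store of `size` bytes
 * at address `addr` matches one of the four cases listed above, so that it
 * appears to execute as a single memory access. */
bool appears_single_access(uintptr_t addr, size_t size)
{
    switch (size) {
    case 1:
        return true;                /* any single byte */
    case 2:
    case 4:
    case 8:
        return (addr % size) == 0;  /* naturally aligned word, doubleword, quadword */
    default:
        return false;               /* not covered by the list above */
    }
}

/* Hypothetical helper: a locked read-modify-write instruction is treated as
 * indivisible regardless of alignment; other accesses fall back to the
 * size and alignment rule. */
bool appears_indivisible(uintptr_t addr, size_t size, bool locked)
{
    return locked || appears_single_access(addr, size);
}

/* Usage examples with made-up addresses. */
void check_examples(void)
{
    assert(appears_single_access(0x1000, 8));      /* aligned quadword */
    assert(!appears_single_access(0x1004, 8));     /* quadword on a 4-byte boundary only */
    assert(appears_single_access(0x1003, 1));      /* single byte, any address */
    assert(appears_indivisible(0x1001, 4, true));  /* locked, misaligned */
}
```

An access for which the predicate returns false may be implemented with multiple memory accesses, and no relative order among those constituent accesses should be assumed.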
Section 8.2.3.2 through Section 8.2.3.7 give examples using the MOV instruction. The principles that underlie 
these examples apply to load and store accesses in general and to other instructions that load from or store to 
memory. Section 8.2.3.8 and Section 8.2.3.9 give examples using the XCHG instruction. The principles that 
underlie these examples apply to other locked read-modify-write instructions.
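For readers working in C rather than assembly, a locked read-modify-write can be sketched with the C11 atomic operations. This is a minimal sketch under stated assumptions: the names x_loc, swap_x, add_x, and swap_example are hypothetical, and the mapping to machine instructions is a compiler choice (on Intel 64 targets, common compilers typically emit XCHG for atomic_exchange and a LOCK-prefixed instruction such as LOCK XADD for atomic_fetch_add, but this should be confirmed in the generated code).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared location, standing in for memory location x. */
atomic_int x_loc = 0;

/* Stores new_val into x_loc and returns the previous contents as one
 * indivisible load followed by store. */
int swap_x(int new_val)
{
    return atomic_exchange(&x_loc, new_val);
}

/* Another read-modify-write: adds delta and returns the previous contents. */
int add_x(int delta)
{
    return atomic_fetch_add(&x_loc, delta);
}

/* Single-threaded usage example. */
void swap_example(void)
{
    atomic_store(&x_loc, 1);
    assert(swap_x(5) == 1);            /* old value returned, 5 now stored */
    assert(add_x(2) == 5);             /* old value returned, 7 now stored */
    assert(atomic_load(&x_loc) == 7);
}
```

No other processor can observe x_loc between the load and the store of either operation, which is the property the XCHG examples in the later sections rely on.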
This section uses the term “processor” to refer to a logical processor. The examples are written using Intel-64 
assembly-language syntax and use the following notational conventions:

Arguments beginning with an “r”, such as r1 or r2, refer to registers (e.g., EAX) visible only to the processor 
being considered.

Memory locations are denoted with x, y, z.

Stores are written as mov [ _x], val, which implies that val is being stored into the memory location x.

Loads are written as mov r, [ _x], which implies that the contents of the memory location x are being loaded 
into the register r.
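The notation above can be related to C as follows. This is a rough correspondence, not part of the manual's notation: the names loc_x, store_x, load_x, and notation_example are hypothetical, and the assumption that a relaxed C11 atomic store or load of an aligned int compiles to a single ordinary MOV on Intel 64 targets holds for common compilers but should be confirmed in the generated code.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared memory location x. */
atomic_int loc_x = 0;

/* Corresponds to: mov [ _x], val */
void store_x(int val)
{
    atomic_store_explicit(&loc_x, val, memory_order_relaxed);
}

/* Corresponds to: mov r, [ _x]
 * The return value plays the role of register r, which is visible only to
 * the processor executing the load. */
int load_x(void)
{
    return atomic_load_explicit(&loc_x, memory_order_relaxed);
}

/* Single-threaded usage example. */
void notation_example(void)
{
    store_x(1);           /* mov [ _x], 1 */
    int r1 = load_x();    /* mov r1, [ _x] */
    assert(r1 == 1);
}
```

Relaxed ordering is used so that the C code adds no ordering constraints beyond those the processor already provides for ordinary loads and stores.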

As noted earlier, the examples refer only to software-visible behavior. When the succeeding sections make 
statements such as “the two stores are reordered,” the implication is only that “the two stores appear to be 
reordered from the point of view of software.”