8-12 Vol. 3A
MULTIPLE-PROCESSOR MANAGEMENT
• Because the Intel-64 memory-ordering model prevents loads from being reordered (see Section 8.2.3.2),
  processor 3’s loads occur in order and, therefore, processor 1’s XCHG occurs before processor 3’s load from x.
• Since processor 0’s XCHG into x occurs before processor 1’s XCHG (by assumption), it occurs before
  processor 3’s load from x. Thus, r6 = 1.
A similar argument (referring instead to processor 2’s loads) applies if processor 1’s XCHG occurs before
processor 0’s XCHG.
8.2.3.9 Loads and Stores Are Not Reordered with Locked Instructions
The memory-ordering model prevents loads and stores from being reordered with locked instructions that execute
earlier or later. The examples in this section illustrate only cases in which a locked instruction is executed before a
load or a store; reordering is also prevented when the locked instruction is executed after a load or a store.
The first example illustrates that loads may not be reordered with earlier locked instructions:
As explained in Section 8.2.3.8, there is a total order of the executions of locked instructions. Without loss of
generality, suppose that processor 0’s XCHG occurs first.
Because the Intel-64 memory-ordering model prevents processor 1’s load from being reordered with its earlier
XCHG, processor 0’s XCHG occurs before processor 1’s load. This implies r4 = 1.
A similar argument (referring instead to processor 0’s accesses) applies if processor 1’s XCHG occurs before
processor 0’s XCHG; in that case r2 = 1.
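The argument for Example 8-9 can be checked mechanically. The following Python sketch is not Intel's definition of the memory-ordering model. It assumes a simplified store-buffer model: a plain store first enters a per-processor FIFO buffer, a load reads the newest matching entry in its own buffer or else memory, and XCHG executes only when the processor's buffer is empty and then updates memory atomically. All names (outcomes, project, locked, plain) are hypothetical. The sketch enumerates every interleaving and compares the XCHG version with a variant that uses plain stores.

```python
def outcomes(threads, memory, registers):
    """Enumerate every reachable final register state in the toy model (an assumption)."""
    n = len(threads)

    def freeze(pcs, mem, regs, bufs):
        # Hashable snapshot of one machine state.
        return (pcs, tuple(sorted(mem.items())), tuple(sorted(regs.items())), bufs)

    start = freeze((0,) * n, memory, registers, ((),) * n)
    seen, stack, finals = {start}, [start], set()
    while stack:
        pcs, mem_items, reg_items, bufs = stack.pop()
        mem, regs = dict(mem_items), dict(reg_items)
        if all(pcs[i] == len(threads[i]) for i in range(n)):
            finals.add(reg_items)  # every instruction has executed
        nxt = []
        for i in range(n):
            if bufs[i]:
                # Drain the oldest buffered store of processor i to memory.
                var, val = bufs[i][0]
                nxt.append(freeze(pcs, {**mem, var: val}, regs,
                                  bufs[:i] + (bufs[i][1:],) + bufs[i + 1:]))
            if pcs[i] == len(threads[i]):
                continue
            op, a, b = threads[i][pcs[i]]
            step = pcs[:i] + (pcs[i] + 1,) + pcs[i + 1:]
            if op == "store":    # mov [a], b : value b enters the store buffer
                nxt.append(freeze(step, mem, regs,
                                  bufs[:i] + (bufs[i] + ((a, b),),) + bufs[i + 1:]))
            elif op == "load":   # mov a, [b] : own buffer first, else memory
                own = [v for (x, v) in bufs[i] if x == b]
                value = own[-1] if own else mem[b]
                nxt.append(freeze(step, mem, {**regs, a: value}, bufs))
            elif op == "xchg" and not bufs[i]:
                # xchg [a], b : atomic swap, enabled only with an empty buffer
                nxt.append(freeze(step, {**mem, a: regs[b]}, {**regs, b: mem[a]}, bufs))
        for state in nxt:
            if state not in seen:
                seen.add(state)
                stack.append(state)
    return finals


def project(finals, names):
    """Keep only the named registers from each final state."""
    return sorted({tuple(dict(f)[r] for r in names) for f in finals})


MEM = {"x": 0, "y": 0}
REGS = {"r1": 1, "r2": 0, "r3": 1, "r4": 0}

# Example 8-9 as written: each processor executes XCHG, then a load.
locked = [[("xchg", "x", "r1"), ("load", "r2", "y")],
          [("xchg", "y", "r3"), ("load", "r4", "x")]]
# Contrast case (not part of Example 8-9): plain stores instead of XCHG.
plain = [[("store", "x", 1), ("load", "r2", "y")],
         [("store", "y", 1), ("load", "r4", "x")]]

locked_results = project(outcomes(locked, MEM, REGS), ("r2", "r4"))
plain_results = project(outcomes(plain, MEM, REGS), ("r2", "r4"))
print("xchg :", locked_results)
print("store:", plain_results)
```

Under these assumptions the XCHG version never reaches r2 = 0 and r4 = 0, whereas the plain-store variant does, because both stores can still be in their buffers when the loads execute.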
The second example illustrates that a store may not be reordered with an earlier locked instruction:
Assume r2 = 1.
• Because r2 = 1, processor 0’s store to y occurs before processor 1’s load from y.
• Because the memory-ordering model prevents a store from being reordered with an earlier locked instruction,
  processor 0’s XCHG into x occurs before its store to y. Thus, processor 0’s XCHG into x occurs before
  processor 1’s load from y.
• Because the memory-ordering model prevents loads from being reordered (see Section 8.2.3.2), processor 1’s
  loads occur in order and, therefore, processor 0’s XCHG into x occurs before processor 1’s load from x. Thus,
  r3 = 1.
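The chain of "occurs before" steps for Example 8-10 can also be written as a small ordering graph. The Python sketch below is illustrative only and rests on a simplifying assumption: a load that returns the initial value 0 is treated as occurring before the other processor's write to that location, and a load that returns 1 as occurring after it. The event labels and the function names (edges_for, has_total_order) are hypothetical. An outcome is reported as allowed only if its ordering constraints contain no cycle.

```python
from itertools import product

# The four memory accesses of Example 8-10.
XCHG_X = "P0: xchg [_x], r1"
STORE_Y = "P0: mov [_y], 1"
LOAD_Y = "P1: mov r2, [_y]"
LOAD_X = "P1: mov r3, [_x]"


def edges_for(r2, r3):
    """Ordering constraints (before, after) implied by one outcome."""
    edges = [
        (XCHG_X, STORE_Y),  # store not reordered with earlier locked instruction
        (LOAD_Y, LOAD_X),   # loads are not reordered with each other
    ]
    # Assumption: reading 1 means the write came first; reading 0 means it came later.
    edges.append((STORE_Y, LOAD_Y) if r2 == 1 else (LOAD_Y, STORE_Y))
    edges.append((XCHG_X, LOAD_X) if r3 == 1 else (LOAD_X, XCHG_X))
    return edges


def has_total_order(edges):
    """Return True if the constraints are acyclic (Kahn's algorithm)."""
    nodes = {n for edge in edges for n in edge}
    indegree = {n: 0 for n in nodes}
    for _, dst in edges:
        indegree[dst] += 1
    ready = [n for n in nodes if indegree[n] == 0]
    placed = 0
    while ready:
        node = ready.pop()
        placed += 1
        for src, dst in edges:
            if src == node:
                indegree[dst] -= 1
                if indegree[dst] == 0:
                    ready.append(dst)
    return placed == len(nodes)


allowed = {(r2, r3): has_total_order(edges_for(r2, r3))
           for r2, r3 in product((0, 1), repeat=2)}
for (r2, r3), ok in sorted(allowed.items()):
    print(f"r2 = {r2}, r3 = {r3}:", "allowed" if ok else "not allowed")
```

Only r2 = 1 with r3 = 0 produces a cycle (XCHG into x, store to y, load from y, load from x, and back to the XCHG), which matches the outcome the example rules out.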
8.2.4 Fast-String Operation and Out-of-Order Stores
Section 7.3.9.3 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 describes an
optimization of repeated string operations called fast-string operation.
Example 8-9. Loads Are not Reordered with Locks
Processor 0            Processor 1
xchg [_x], r1          xchg [_y], r3
mov r2, [_y]           mov r4, [_x]
Initially x = y = 0, r1 = r3 = 1
r2 = 0 and r4 = 0 is not allowed
Example 8-10. Stores Are not Reordered with Locks
Processor 0            Processor 1
xchg [_x], r1          mov r2, [_y]
mov [_y], 1            mov r3, [_x]
Initially x = y = 0, r1 = 1
r2 = 1 and r3 = 0 is not allowed