8-12 Vol. 3A

MULTIPLE-PROCESSOR MANAGEMENT

• Because the Intel-64 memory-ordering model prevents loads from being reordered (see Section 8.2.3.2), 
  processor 3’s loads occur in order and, therefore, processor 1’s XCHG occurs before processor 3’s load from x.

• Since processor 0’s XCHG into x occurs before processor 1’s XCHG (by assumption), it occurs before 
  processor 3’s load from x. Thus, r6 = 1.

A similar argument (referring instead to processor 2’s loads) applies if processor 1’s XCHG occurs before 
processor 0’s XCHG.

8.2.3.9  Loads and Stores Are Not Reordered with Locked Instructions

The memory-ordering model prevents loads and stores from being reordered with locked instructions that execute 
earlier or later. The examples in this section illustrate only cases in which a locked instruction is executed before a 
load or a store. The reader should note that reordering is prevented also if the locked instruction is executed after 
a load or a store.
The first example (Example 8-9) illustrates that loads may not be reordered with earlier locked instructions.

As explained in Section 8.2.3.8, there is a total order of the executions of locked instructions. Without loss of 
generality, suppose that processor 0’s XCHG occurs first.
Because the Intel-64 memory-ordering model prevents processor 1’s load from being reordered with its earlier 
XCHG, processor 0’s XCHG occurs before processor 1’s load. This implies r4 = 1.
A similar argument (referring instead to processor 0’s accesses) applies if processor 1’s XCHG occurs before 
processor 0’s XCHG; in that case r2 = 1. Either way, r2 = 0 and r4 = 0 cannot both hold.
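
The argument above can be checked mechanically on a small model. The following Python sketch is an illustration only and is not Intel's definition of the memory-ordering model. It assumes a simplified store-buffer machine: each processor has a FIFO store buffer, a load reads the newest matching entry in its own buffer or else memory, and a locked XCHG executes only when the issuing processor's buffer is empty and then acts atomically on memory. The function name explore and the tuple encoding of instructions are hypothetical, introduced here for the example. The sketch enumerates every execution of Example 8-9 and, for contrast, of a variant in which the XCHG instructions are replaced by plain stores.

```python
def explore(programs, mem, regs, observe):
    """Enumerate all executions of a simplified store-buffer model (an assumption,
    not Intel's specification) and return the set of observed register tuples."""
    outcomes, seen = set(), set()

    def visit(pcs, mem, regs, bufs):
        key = (pcs, tuple(sorted(mem.items())), tuple(sorted(regs.items())), bufs)
        if key in seen:
            return
        seen.add(key)
        finished = all(pc == len(prog) for pc, prog in zip(pcs, programs))
        if finished and not any(bufs):
            outcomes.add(tuple(regs[r] for r in observe))
            return
        for p, prog in enumerate(programs):
            if bufs[p]:
                # Drain the oldest buffered store of processor p to memory.
                (addr, val), rest = bufs[p][0], bufs[p][1:]
                visit(pcs, {**mem, addr: val}, regs,
                      bufs[:p] + (rest,) + bufs[p + 1:])
            if pcs[p] == len(prog):
                continue
            op, dst, src = prog[pcs[p]]
            nxt = pcs[:p] + (pcs[p] + 1,) + pcs[p + 1:]
            if op == "store":
                # Plain store: enters the store buffer, not yet globally visible.
                visit(nxt, mem, regs,
                      bufs[:p] + (bufs[p] + ((dst, src),),) + bufs[p + 1:])
            elif op == "load":
                # Load: the newest matching buffered store wins, else memory.
                hits = [v for a, v in bufs[p] if a == src]
                visit(nxt, mem, {**regs, dst: hits[-1] if hits else mem[src]}, bufs)
            elif op == "xchg" and not bufs[p]:
                # Locked exchange: needs an empty buffer, swaps atomically.
                visit(nxt, {**mem, dst: regs[src]}, {**regs, src: mem[dst]}, bufs)

    visit((0,) * len(programs), mem, regs, ((),) * len(programs))
    return outcomes


MEM = {"x": 0, "y": 0}    # initially x = y = 0
REGS = {"r1": 1, "r3": 1}  # initially r1 = r3 = 1

# Example 8-9: ("xchg", location, register), ("load", register, location)
example_8_9 = [
    [("xchg", "x", "r1"), ("load", "r2", "y")],  # processor 0
    [("xchg", "y", "r3"), ("load", "r4", "x")],  # processor 1
]
# Hypothetical variant: plain stores of the value 1 instead of XCHG.
plain_stores = [
    [("store", "x", 1), ("load", "r2", "y")],
    [("store", "y", 1), ("load", "r4", "x")],
]

locked_outcomes = explore(example_8_9, MEM, REGS, ("r2", "r4"))
plain_outcomes = explore(plain_stores, MEM, REGS, ("r2", "r4"))
print(sorted(locked_outcomes))  # [(0, 1), (1, 0), (1, 1)]
print(sorted(plain_outcomes))   # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

Under these assumptions, the locked version never produces r2 = 0 and r4 = 0, while the plain-store variant does, because in the model a load can complete while the same processor's earlier store is still buffered.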
The second example (Example 8-10) illustrates that a store may not be reordered with an earlier locked instruction.

Assume r2 = 1.

• Because r2 = 1, processor 0’s store to y occurs before processor 1’s load from y.

• Because the memory-ordering model prevents a store from being reordered with an earlier locked instruction, 
  processor 0’s XCHG into x occurs before its store to y. Thus, processor 0’s XCHG into x occurs before 
  processor 1’s load from y.

• Because the memory-ordering model prevents loads from being reordered (see Section 8.2.3.2), processor 1’s 
  loads occur in order: its load from y occurs before its load from x. Therefore, processor 0’s XCHG into x 
  occurs before processor 1’s load from x. Thus, r3 = 1.
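
The chain of "occurs before" steps above can also be checked by brute force. The Python sketch below is a simplified illustration, not the architectural definition: it assumes that each of the four memory accesses in Example 8-10 takes effect at a single point in one global order, and it enumerates every such order that respects a given list of ordering constraints. The event names, the function outcomes, and the constants LOADS_IN_ORDER and STORE_AFTER_LOCK are hypothetical names chosen for this example.

```python
from itertools import permutations

# The four memory accesses of Example 8-10 (names are illustrative).
EVENTS = ("xchg_x", "store_y", "load_y", "load_x")


def outcomes(constraints):
    """Return every (r2, r3) reachable over all global orders of EVENTS in which,
    for each pair (first, second) in constraints, first precedes second."""
    results = set()
    for order in permutations(EVENTS):
        position = {event: i for i, event in enumerate(order)}
        if any(position[first] > position[second] for first, second in constraints):
            continue  # this order violates a required ordering
        mem, regs = {"x": 0, "y": 0}, {}  # initially x = y = 0
        for event in order:
            if event == "xchg_x":
                mem["x"] = 1              # processor 0: xchg [_x], r1 with r1 = 1
            elif event == "store_y":
                mem["y"] = 1              # processor 0: mov [_y], 1
            elif event == "load_y":
                regs["r2"] = mem["y"]     # processor 1: mov r2, [_y]
            else:
                regs["r3"] = mem["x"]     # processor 1: mov r3, [_x]
        results.add((regs["r2"], regs["r3"]))
    return results


LOADS_IN_ORDER = ("load_y", "load_x")      # loads are not reordered (Section 8.2.3.2)
STORE_AFTER_LOCK = ("xchg_x", "store_y")   # store not reordered with earlier XCHG

with_rule = outcomes([LOADS_IN_ORDER, STORE_AFTER_LOCK])
without_rule = outcomes([LOADS_IN_ORDER])  # hypothetical: rule dropped
print(sorted(with_rule))     # [(0, 0), (0, 1), (1, 1)]
print(sorted(without_rule))  # [(0, 0), (0, 1), (1, 0), (1, 1)]
```

With both constraints in place, r2 = 1 and r3 = 0 never appears. Dropping the constraint that keeps the store behind the locked instruction, a purely hypothetical relaxation, makes that outcome reachable, which is why the rule is needed for the conclusion.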

8.2.4  Fast-String Operation and Out-of-Order Stores

Section 7.3.9.3 of Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 1 described an 
optimization of repeated string operations called fast-string operation.

Example 8-9.  Loads Are not Reordered with Locks

Processor 0           Processor 1
xchg [_x], r1         xchg [_y], r3
mov r2, [_y]          mov r4, [_x]

Initially x = y = 0, r1 = r3 = 1
r2 = 0 and r4 = 0 is not allowed

Example 8-10.  Stores Are not Reordered with Locks

Processor 0           Processor 1
xchg [_x], r1         mov r2, [_y]
mov [_y], 1           mov r3, [_x]

Initially x = y = 0, r1 = 1
r2 = 1 and r3 = 0 is not allowed