Vol. 3A 8-15

MULTIPLE-PROCESSOR MANAGEMENT

In Example 8-14, processor 0 starts a string operation that writes to a 512-byte memory block starting at address 
_x. Processor 0 is interrupted after k iterations of store operations, before it has updated the address _y. The 
interrupt handler that takes control on processor 0 writes to the address _z. Processor 1 may see the store to _z 
from the interrupt handler before it sees the remaining stores to the 512-byte memory block, which are executed 
when the string operation resumes.
Example 8-15 illustrates the ordering of string operations with earlier stores. No store from a string operation can 
be visible before all prior stores are visible.
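
The two outcomes can be checked with a small toy model, sketched here in C. It is an illustration under stated assumptions, not a description of processor internals: processor 0's stores become visible in one fixed order, processor 1's two loads occur in program order at arbitrary points in that order, the string stores are kept in address order for simplicity, and _y is identified by a dword index rather than a byte address. All identifiers (value_at, outcome_possible, example_8_14_outcome, example_8_15_outcome) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

enum {
    BLOCK_DWORDS = 128,           /* ECX = 128 dword stores = 512 bytes */
    LOC_Z = BLOCK_DWORDS,         /* _z is a separate location outside the block */
    MAX_STORES = BLOCK_DWORDS + 1
};

/* Value seen at location loc once the first t stores in order[] are visible.
   Every store in both examples writes 1, and memory starts out as 0. */
static int value_at(const int *order, int t, int loc)
{
    for (int i = 0; i < t; i++)
        if (order[i] == loc)
            return 1;
    return 0;
}

/* Processor 1 performs two loads in program order at times t1 <= t2.
   Returns true if some choice of times yields the requested pair of values. */
static bool outcome_possible(const int *order, int num_stores,
                             int first_loc, int second_loc,
                             int want_first, int want_second)
{
    for (int t1 = 0; t1 <= num_stores; t1++)
        for (int t2 = t1; t2 <= num_stores; t2++)
            if (value_at(order, t1, first_loc) == want_first &&
                value_at(order, t2, second_loc) == want_second)
                return true;
    return false;
}

/* Example 8-14: the string operation is interrupted after k stores, the
   handler stores to _z, then the string operation resumes.
   Processor 1 runs r1 = [_z]; r2 = [_y]. Can r1 == 1 and r2 == 0 occur? */
static bool example_8_14_outcome(int k, int y_index)
{
    assert(0 <= k && k <= y_index && y_index < BLOCK_DWORDS);
    int order[MAX_STORES];
    int n = 0;
    for (int i = 0; i < k; i++)
        order[n++] = i;
    order[n++] = LOC_Z;
    for (int i = k; i < BLOCK_DWORDS; i++)
        order[n++] = i;
    return outcome_possible(order, n, LOC_Z, y_index, 1, 0);
}

/* Example 8-15: the store to _z precedes the whole string operation.
   Processor 1 runs r1 = [_y]; r2 = [_z]. Can r1 == 1 and r2 == 0 occur? */
static bool example_8_15_outcome(int y_index)
{
    assert(0 <= y_index && y_index < BLOCK_DWORDS);
    int order[MAX_STORES];
    int n = 0;
    order[n++] = LOC_Z;
    for (int i = 0; i < BLOCK_DWORDS; i++)
        order[n++] = i;
    return outcome_possible(order, n, y_index, LOC_Z, 1, 0);
}
```

In this model, example_8_14_outcome(k, y_index) returns true for any k <= y_index, matching "r1 = 1 and r2 = 0 is allowed", while example_8_15_outcome(y_index) returns false for every position of _y in the block, matching "not allowed".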

8.2.5 Strengthening or Weakening the Memory-Ordering Model

The Intel 64 and IA-32 architectures provide several mechanisms for strengthening or weakening the memory-
ordering model to handle special programming situations. These mechanisms include:

• The I/O instructions, locking instructions, the LOCK prefix, and serializing instructions force stronger ordering 
  on the processor.

• The SFENCE instruction (introduced to the IA-32 architecture in the Pentium III processor) and the LFENCE and 
  MFENCE instructions (introduced in the Pentium 4 processor) provide memory-ordering and serialization 
  capabilities for specific types of memory operations.

• The memory type range registers (MTRRs) can be used to strengthen or weaken memory ordering for specific 
  areas of physical memory (see Section 11.11, “Memory Type Range Registers (MTRRs)”). MTRRs are available 
  only in the Pentium 4, Intel Xeon, and P6 family processors. 

• The page attribute table (PAT) can be used to strengthen memory ordering for a specific page or group of pages 
  (see Section 11.12, “Page Attribute Table (PAT)”). The PAT is available only in the Pentium 4, Intel Xeon, and 
  Pentium III processors. 
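
As a rough illustration of the fence instructions listed above, the following C sketch fills a 512-byte buffer and then executes SFENCE before setting a flag, so that the buffer contents become globally visible before the flag does. Everything here is an assumption made for illustration: the names payload, ready, publish_payload, and payload_is_ready are hypothetical, and non-temporal stores were chosen only because they are weakly ordered and therefore a typical case where SFENCE matters. The _mm_stream_si32 and _mm_sfence compiler intrinsics are used when the compiler targets SSE2; otherwise the sketch falls back to ordinary stores.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#if defined(__SSE2__)
#include <immintrin.h>   /* _mm_stream_si32 (MOVNTI), _mm_sfence (SFENCE) */
#endif

/* Hypothetical shared state: a payload buffer and a "ready" flag. */
enum { PAYLOAD_DWORDS = 128 };            /* 128 dwords = 512 bytes */
static int payload[PAYLOAD_DWORDS];
static atomic_int ready;

/* Fill the payload, fence, then publish the flag. */
static void publish_payload(int value)
{
    for (size_t i = 0; i < PAYLOAD_DWORDS; i++) {
#if defined(__SSE2__)
        _mm_stream_si32(&payload[i], value);  /* weakly ordered non-temporal store */
#else
        payload[i] = value;                   /* fallback when SSE2 is unavailable */
#endif
    }
#if defined(__SSE2__)
    /* SFENCE: every store before this point becomes globally visible
       before any store after it. */
    _mm_sfence();
#endif
    /* The release store also keeps the compiler from moving the flag
       store above the payload stores. */
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/* Consumer-side check: a nonzero result means the payload may be read. */
static int payload_is_ready(void)
{
    return atomic_load_explicit(&ready, memory_order_acquire);
}
```

A consumer on another processor would poll payload_is_ready() and read payload only after it returns nonzero. MFENCE (_mm_mfence) and LFENCE (_mm_lfence) are exposed through the same header, but their use is not shown here.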

These mechanisms can be used as follows:
Memory mapped devices and other I/O devices on the bus are often sensitive to the order of writes to their I/O 
buffers. I/O instructions (the IN and OUT instructions) can be used to impose strong write ordering on such 
accesses as follows. Prior to executing an I/O instruction, the processor waits for all previous instructions in the 
program to complete and for all buffered writes to drain to memory. Only instruction fetches and page table walks 
can pass I/O instructions. Execution of subsequent instructions does not begin until the processor determines that 
the I/O instruction has been completed.

Example 8-14.  Interrupted String Operation

Processor 0                                                 Processor 1
rep:stosd [_x] // interrupted before es:edi reaches _y      mov r1, [_z]
mov [_z], $1   // interrupt handler                         mov r2, [_y]

Initially on processor 0: EAX = 1, ECX = 128, ES:EDI = _x
Initially [_y] = [_z] = 0, [_x] to 511[_x] = 0, _x <= _y < _x+512, _z is a separate memory location
r1 = 1 and r2 = 0 is allowed

Example 8-15.  String Operations Are not Reordered with Earlier Stores

Processor 0              Processor 1
mov [_z], $1             mov r1, [_y]
rep:stosd [_x]           mov r2, [_z]

Initially on processor 0: EAX = 1, ECX = 128, ES:EDI = _x
Initially [_y] = [_z] = 0, [_x] to 511[_x] = 0, _x <= _y < _x+512, _z is a separate memory location
r1 = 1 and r2 = 0 is not allowed