8-6 Vol. 3A


illustrating the behavior of the memory-ordering model on IA-32 and Intel-64 processors. Section 8.2.4 considers 
the special treatment of stores for string operations and Section 8.2.5 discusses how memory-ordering behavior 
may be modified through the use of specific instructions.


Memory Ordering in the Intel Pentium and Intel486 Processors


The Pentium and Intel486 processors follow the processor-ordered memory model; however, they operate as 
strongly-ordered processors under most circumstances. Reads and writes always appear in programmed order at 
the system bus, with one exception in which processor ordering is exhibited: read misses are permitted to go 
ahead of buffered writes on the system bus when all the buffered writes are cache hits and, therefore, are not 
directed to the same address being accessed by the read miss.
In the case of I/O operations, both reads and writes always appear in programmed order.
Software intended to operate correctly in processor-ordered processors (such as the Pentium 4, Intel Xeon, and P6 
family processors) should not depend on the relatively strong ordering of the Pentium or Intel486 processors. 
Instead, it should ensure that accesses to shared variables that are intended to control concurrent execution 
among processors are explicitly required to obey program ordering through the use of appropriate locking or 
serializing operations (see Section 8.2.5, “Strengthening or Weakening the Memory-Ordering Model”).
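
As a minimal sketch of the locking approach recommended above, the following C11 example protects a shared counter with a test-and-set spinlock. It assumes a POSIX threads environment. The names spin_lock, spin_unlock, worker, run_locked_counter, and ITERATIONS are illustrative and do not come from this manual. On IA-32 and Intel 64 targets, compilers typically implement atomic_flag_test_and_set_explicit with a locked exchange, but that mapping is a property of the compiler and is not guaranteed by this sketch.

```c
#include <pthread.h>
#include <stdatomic.h>

#define ITERATIONS 10000            /* illustrative loop count per thread */

static atomic_flag lock_flag = ATOMIC_FLAG_INIT; /* hypothetical spinlock */
static long shared_counter;                      /* protected by lock_flag */

/* Acquire: spin until the flag was previously clear. The acquire ordering
 * keeps the critical section from moving above the lock operation. */
static void spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&lock_flag,
                                             memory_order_acquire)) {
        /* busy-wait until the holder releases the lock */
    }
}

/* Release: the release ordering keeps the critical section from moving
 * below the unlock operation. */
static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        spin_lock();
        shared_counter++;           /* plain access, made safe by the lock */
        spin_unlock();
    }
    return NULL;
}

/* Runs two workers and returns the final counter value. */
long run_locked_counter(void)
{
    pthread_t a, b;

    shared_counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_counter;
}
```

Because every increment happens inside the lock, run_locked_counter() returns 2 * ITERATIONS regardless of how strongly or weakly the underlying processor orders ordinary loads and stores.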


Memory Ordering in P6 and More Recent Processor Families

The Intel Core 2 Duo, Intel Atom, Intel Core Duo, Pentium 4, and P6 family processors also use a processor-ordered 
memory-ordering model that can be further defined as “write ordered with store-buffer forwarding.” This model 
can be characterized as follows. 
In a single-processor system for memory regions defined as write-back cacheable, the memory-ordering model 
respects the following principles (Note the memory-ordering principles for single-processor and multiple-
processor systems are written from the perspective of software executing on the processor, where the term 
“processor” refers to a logical processor. For example, a physical processor supporting multiple cores and/or Intel 
Hyper-Threading Technology is treated as a multi-processor system.):

Reads are not reordered with other reads.

Writes are not reordered with older reads.

Writes to memory are not reordered with other writes, with the following exceptions:
— streaming stores (writes) executed with the non-temporal move instructions (MOVNTI, MOVNTQ, 


— string operations (see Section

No write to memory may be reordered with an execution of the CLFLUSH instruction; a write may be reordered 
with an execution of the CLFLUSHOPT instruction that flushes a cache line other than the one being written.



Executions of the CLFLUSH instruction are not reordered with each other. Executions of CLFLUSHOPT that 
access different cache lines may be reordered with each other. An execution of CLFLUSHOPT may be reordered 
with an execution of CLFLUSH that accesses a different cache line.

Reads may be reordered with older writes to different locations but not with older writes to the same location. 

Reads or writes cannot be reordered with I/O instructions, locked instructions, or serializing instructions.

Reads cannot pass earlier LFENCE and MFENCE instructions.

Writes and executions of CLFLUSH and CLFLUSHOPT cannot pass earlier LFENCE, SFENCE, and MFENCE instructions.

LFENCE instructions cannot pass earlier reads.

SFENCE instructions cannot pass earlier writes or executions of CLFLUSH and CLFLUSHOPT.

MFENCE instructions cannot pass earlier reads, writes, or executions of CLFLUSH and CLFLUSHOPT.
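
The rule that reads may be reordered with older writes to different locations, together with the rule that reads cannot pass an earlier MFENCE, can be illustrated with the classic store-buffer litmus test. The sketch below is written in portable C11 and assumes POSIX threads. The names x, y, r1, r2, thread0, thread1, and count_both_zero are illustrative. It uses atomic_thread_fence(memory_order_seq_cst), which compilers for IA-32 and Intel 64 typically lower to MFENCE or a locked instruction; that lowering is an assumption about the toolchain, not something stated in this section.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int x, y;   /* two different shared locations */
static int r1, r2;        /* values observed by each thread */

/* Thread 0: write x, fence, then read y. */
static void *thread0(void *arg)
{
    (void)arg;
    atomic_store_explicit(&x, 1, memory_order_relaxed);
    /* Full fence: the read of y below may not pass the write to x. */
    atomic_thread_fence(memory_order_seq_cst);
    r1 = atomic_load_explicit(&y, memory_order_relaxed);
    return NULL;
}

/* Thread 1: write y, fence, then read x. */
static void *thread1(void *arg)
{
    (void)arg;
    atomic_store_explicit(&y, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    r2 = atomic_load_explicit(&x, memory_order_relaxed);
    return NULL;
}

/* Runs the litmus test `trials` times and counts how often both threads
 * read 0. With the fences in place this outcome is forbidden, so the
 * count is expected to stay at zero. */
int count_both_zero(int trials)
{
    int forbidden = 0;

    for (int i = 0; i < trials; i++) {
        pthread_t t0, t1;

        atomic_store(&x, 0);
        atomic_store(&y, 0);
        pthread_create(&t0, NULL, thread0, NULL);
        pthread_create(&t1, NULL, thread1, NULL);
        pthread_join(t0, NULL);   /* join makes r1 and r2 safe to read */
        pthread_join(t1, NULL);
        if (r1 == 0 && r2 == 0)
            forbidden++;
    }
    return forbidden;
}
```

If the two fences were removed, the outcome r1 == 0 and r2 == 0 would be permitted, since each read could then be reordered with the older write to the other location. How often that outcome actually appears depends on timing and hardware, so the sketch only checks the fenced case.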

1. Earlier versions of this manual specified that writes to memory may be reordered with executions of the CLFLUSH instruction. No 
processors implementing the CLFLUSH instruction allow such reordering.