11-10 Vol. 3A

MEMORY CACHE CONTROL

between the caches when instructions are modified. See Section 11.6, “Self-Modifying Code,” for more information 
on the implications of caching instructions.

11.5 CACHE CONTROL

The Intel 64 and IA-32 architectures provide a variety of mechanisms for controlling the caching of data and 
instructions and for controlling the ordering of reads and writes between the processor, the caches, and memory. 
These mechanisms can be divided into two groups:

Cache control registers and bits — The Intel 64 and IA-32 architectures define several dedicated registers 
and various bits within control registers and page- and directory-table entries that control the caching of system 
memory locations in the L1, L2, and L3 caches. These mechanisms control the caching of virtual memory pages 
and of regions of physical memory.

Cache control and memory ordering instructions — The Intel 64 and IA-32 architectures provide several 
instructions that control the caching of data, the ordering of memory reads and writes, and the prefetching of 
data. These instructions allow software to control the caching of specific data structures, to control memory 
coherency for specific locations in memory, and to force strong memory ordering at specific locations in a 
program.

The following sections describe these two groups of cache control mechanisms.
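
As a rough illustration of the second group, the sketch below uses compiler intrinsics to prefetch data, flush the cache lines that back a buffer, and then fence memory accesses. The choice of instructions (PREFETCHT0, CLFLUSH, MFENCE), the 64-byte line size, and the helper names prefetch_for_read and store_and_flush are assumptions made for illustration; they are not taken from this section. The intrinsics are compiled only when SSE2 is available; otherwise the helpers fall back to a plain copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#if defined(__SSE2__)
#include <emmintrin.h>  /* _mm_clflush, _mm_mfence; also provides _mm_prefetch */
#endif

#define CACHE_LINE 64u  /* assumed cache line size in bytes */

/* Hint that the line containing p will be read soon (hypothetical helper). */
static inline void prefetch_for_read(const void *p)
{
#if defined(__SSE2__)
    _mm_prefetch((const char *)p, _MM_HINT_T0);
#else
    (void)p;
#endif
}

/* Copy len bytes into dst, flush every cache line that dst spans, then
   fence so the flushes are ordered before later loads and stores
   (hypothetical helper). */
static inline void store_and_flush(char *dst, const char *src, size_t len)
{
    assert(dst != NULL && src != NULL);
    memcpy(dst, src, len);
#if defined(__SSE2__)
    uintptr_t line = (uintptr_t)dst & ~(uintptr_t)(CACHE_LINE - 1);
    uintptr_t end = (uintptr_t)dst + len;
    for (; line < end; line += CACHE_LINE)
        _mm_clflush((const void *)line);
    _mm_mfence();
#endif
}
```

A caller might run prefetch_for_read(msg) followed by store_and_flush(buf, msg, sizeof msg). The data in buf is unchanged by the flush; only its residency in the caches is affected.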

11.5.1 Cache Control Registers and Bits

Figure 11-3 depicts cache-control mechanisms in IA-32 processors. Apart from the size of the memory address 
space, these mechanisms work the same way in Intel 64 processors.

The Intel 64 and IA-32 architectures provide the following cache-control registers and bits for use in enabling or 
restricting caching to various pages or regions in memory:

CD flag, bit 30 of control register CR0 — Controls caching of system memory locations (see Section 2.5, 
“Control Registers”). If the CD flag is clear, caching is enabled for the whole of system memory, but may be 
restricted for individual pages or regions of memory by other cache-control mechanisms. When the CD flag is 
set, caching is restricted in the processor’s caches (cache hierarchy) for the P6 and more recent processor 
families and prevented for the Pentium processor (see note below). With the CD flag set, however, the caches 
will still respond to snoop traffic. Caches should be explicitly flushed to ensure memory coherency. For highest 
processor performance, both the CD and the NW flags in control register CR0 should be cleared. Table 11-5 
shows the interaction of the CD and NW flags.
The effect of setting the CD flag is somewhat different for processor families starting with the P6 family than for 
the Pentium processor (see Table 11-5). To ensure memory coherency after the CD flag is set, the caches should 
be explicitly flushed (see Section 11.5.3, “Preventing Caching”). Setting the CD flag for the P6 and more 
recent processor families modifies cache line fill and update behavior. Also, setting the CD flag on these 
processors does not force strict ordering of memory accesses unless the MTRRs are disabled and/or all memory 
is referenced as uncached (see Section 8.2.5, “Strengthening or Weakening the Memory-Ordering Model”).
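
The flag tests described above can be sketched in C on a plain integer that holds a copy of CR0. This is a minimal illustration only: the helper names are hypothetical, the NW flag is assumed to be bit 29 (per the CR0 layout referenced in Section 2.5), and the code does not read or write the real register, which requires privileged MOV to and from CR0.

```c
#include <assert.h>
#include <stdint.h>

/* CD is bit 30 of CR0, as stated in the text. NW at bit 29 is an
   assumption taken from the CR0 layout, not from this section. */
#define CR0_NW (UINT64_C(1) << 29)
#define CR0_CD (UINT64_C(1) << 30)

/* Nonzero when the CD flag is set, i.e. caching is restricted. */
static inline int cr0_caching_restricted(uint64_t cr0)
{
    return (cr0 & CR0_CD) != 0;
}

/* Nonzero when both CD and NW are clear, the setting the text
   recommends for highest processor performance. */
static inline int cr0_cd_nw_clear(uint64_t cr0)
{
    return (cr0 & (CR0_CD | CR0_NW)) == 0;
}

/* Return a copy of cr0 with CD and NW cleared; other bits are untouched. */
static inline uint64_t cr0_clear_cd_nw(uint64_t cr0)
{
    return cr0 & ~(CR0_CD | CR0_NW);
}

/* Walk through the helpers on a made-up CR0 value. */
static inline void cr0_example(void)
{
    uint64_t cr0 = CR0_CD | UINT64_C(1);  /* hypothetical: CD set, bit 0 set */

    assert(cr0_caching_restricted(cr0));
    assert(!cr0_cd_nw_clear(cr0));

    cr0 = cr0_clear_cd_nw(cr0);
    assert(!cr0_caching_restricted(cr0));
    assert(cr0_cd_nw_clear(cr0));
    assert(cr0 == UINT64_C(1));           /* unrelated bit preserved */
}
```

Calling cr0_example() exercises each helper once. In system software, a sequence that changes CD would also need the explicit cache flush that the text calls for, which this sketch leaves out.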