
11-18 Vol. 3A

MEMORY CACHE CONTROL

11.5.6   L1 Data Cache Context Mode

L1 data cache context mode is a feature of processors based on the Intel NetBurst microarchitecture that support 
Intel Hyper-Threading Technology. When CPUID.1:ECX[bit 10] = 1, the processor supports setting L1 data cache 
context mode using the L1 data cache context mode flag (IA32_MISC_ENABLE[bit 24]). Selectable modes are 
adaptive mode (default) and shared mode.
The BIOS is responsible for configuring the L1 data cache context mode.
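The two bit tests described above can be sketched in C as follows. This is a minimal illustration that operates on register values already obtained elsewhere (for example, by CPUID and RDMSR in privileged code); the function and macro names are hypothetical, and the mapping of bit 24 = 0 to adaptive mode is taken from Section 11.5.6.2.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the text; the macro names are illustrative. */
#define CPUID1_ECX_L1D_CONTEXT_BIT   10u  /* CPUID.1:ECX[bit 10] */
#define MISC_ENABLE_L1D_CONTEXT_BIT  24u  /* IA32_MISC_ENABLE[bit 24] */

typedef enum {
    L1D_MODE_ADAPTIVE = 0,  /* flag clear (default) */
    L1D_MODE_SHARED   = 1   /* flag set */
} l1d_context_mode;

/* Returns true if the given CPUID.1:ECX value reports that the
   L1 data cache context mode can be set. */
bool l1d_context_mode_supported(uint32_t cpuid1_ecx)
{
    return ((cpuid1_ecx >> CPUID1_ECX_L1D_CONTEXT_BIT) & 1u) != 0;
}

/* Decodes the context mode flag from an IA32_MISC_ENABLE value. */
l1d_context_mode l1d_context_mode_decode(uint64_t misc_enable)
{
    return ((misc_enable >> MISC_ENABLE_L1D_CONTEXT_BIT) & 1u)
               ? L1D_MODE_SHARED
               : L1D_MODE_ADAPTIVE;
}
```

A caller would check l1d_context_mode_supported() first and interpret the flag only when it returns true.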

11.5.6.1   Adaptive Mode

Adaptive mode facilitates L1 data cache sharing between logical processors. When running in adaptive mode, the 
L1 data cache is shared across logical processors in the same core if both of the following conditions hold:

• CR3 control registers for logical processors sharing the cache are identical.

• The same paging mode is used by logical processors sharing the cache.

In this situation, the entire L1 data cache is available to each logical processor (instead of being competitively 
shared).
If CR3 values are different for the logical processors sharing an L1 data cache or the logical processors use different 
paging modes, the logical processors compete for cache resources. This reduces the effective size of the cache for each 
logical processor. In adaptive mode, aliasing of the cache is not allowed, which prevents data thrashing.
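The adaptive-mode rule can be expressed as a small predicate. The sketch below is only a software model of the condition stated above, not a description of the hardware implementation; the type names and the paging-mode labels are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical labels for the paging mode of a logical processor. */
typedef enum { PAGING_MODE_32BIT, PAGING_MODE_PAE } paging_mode;

/* The state of one logical processor that matters for this rule. */
typedef struct {
    uint64_t    cr3;
    paging_mode paging;
} logical_cpu_state;

typedef enum {
    L1D_FULLY_AVAILABLE,  /* entire cache available to each logical processor */
    L1D_COMPETITIVE       /* logical processors compete for cache resources */
} l1d_sharing;

/* Adaptive mode: the whole cache is available to both logical processors
   only when CR3 and the paging mode both match. */
l1d_sharing adaptive_mode_sharing(const logical_cpu_state *a,
                                  const logical_cpu_state *b)
{
    bool same_context = (a->cr3 == b->cr3) && (a->paging == b->paging);
    return same_context ? L1D_FULLY_AVAILABLE : L1D_COMPETITIVE;
}
```

Either mismatch alone (CR3 or paging mode) is enough to fall back to competitive sharing.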

11.5.6.2   Shared Mode

In shared mode, the L1 data cache is competitively shared between logical processors. This is true even if the 
logical processors use identical CR3 registers and paging modes.
In shared mode, linear addresses in the L1 data cache can be aliased, meaning that one linear address in the cache 
can point to different physical locations. The mechanism for resolving aliasing can lead to thrashing. For this 
reason, IA32_MISC_ENABLE[bit 24] = 0 (adaptive mode) is the preferred configuration for processors based on the 
Intel NetBurst microarchitecture that support Intel Hyper-Threading Technology.
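As an illustration of selecting the preferred configuration, the following C sketch computes the IA32_MISC_ENABLE value with bit 24 cleared or set while preserving all other bits. It only manipulates a value in memory; actually reading and writing the MSR requires RDMSR and WRMSR at privilege level 0 and is normally done by the BIOS. The helper names are hypothetical.

```c
#include <stdint.h>

/* IA32_MISC_ENABLE[bit 24]: L1 data cache context mode flag. */
#define MISC_ENABLE_L1D_CONTEXT_MASK (UINT64_C(1) << 24)

/* Returns the MSR value with adaptive mode selected (bit 24 = 0).
   All other bits are left unchanged. */
uint64_t misc_enable_select_adaptive(uint64_t misc_enable)
{
    return misc_enable & ~MISC_ENABLE_L1D_CONTEXT_MASK;
}

/* Returns the MSR value with shared mode selected (bit 24 = 1).
   All other bits are left unchanged. */
uint64_t misc_enable_select_shared(uint64_t misc_enable)
{
    return misc_enable | MISC_ENABLE_L1D_CONTEXT_MASK;
}
```

The read-modify-write form matters because IA32_MISC_ENABLE carries other enable flags that must not be disturbed.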

11.6 SELF-MODIFYING CODE

A write to a memory location in a code segment that is currently cached in the processor causes the associated 
cache line (or lines) to be invalidated. This check is based on the physical address of the instruction. In addition, 
the P6 family and Pentium processors check whether a write to a code segment may modify an instruction that has 
been prefetched for execution. If the write affects a prefetched instruction, the prefetch queue is invalidated. This 
second check is based on the linear address of the instruction. For the Pentium 4 and Intel Xeon processors, a write 
to, or a snoop of, an instruction in a code segment, where the target instruction is already decoded and resident in 
the trace cache, invalidates the entire trace cache. As a result, programs that self-modify code can suffer severe 
performance degradation when run on the Pentium 4 and Intel Xeon processors.
In practice, the check on linear addresses should not create compatibility problems among IA-32 processors. 
Applications that include self-modifying code use the same linear address for modifying and fetching the instruction. 
Systems software, such as a debugger, that might modify an instruction using a different linear address than that 
used to fetch the instruction will execute a serializing operation, such as a CPUID instruction, before the 
modified instruction is executed; this automatically resynchronizes the instruction cache and prefetch queue. 
(See Section 8.1.3, “Handling Self- and Cross-Modifying Code,” for more information about the use of 
self-modifying code.)
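The pattern used by such systems software (write the new bytes, then serialize before the modified code runs) might be sketched in C as shown below. This is an illustrative sketch that assumes a GCC- or Clang-compatible compiler on an x86 target; the function names are hypothetical, and mapping the code page writable through a second linear address is outside the scope of the sketch.

```c
#include <stddef.h>
#include <string.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
/* Executes CPUID, a serializing instruction. The "memory" clobber keeps
   the compiler from moving the preceding stores past this point. */
static inline void serialize_with_cpuid(void)
{
    unsigned int eax = 0, ebx, ecx = 0, edx;
    __asm__ __volatile__("cpuid"
                         : "+a"(eax), "=b"(ebx), "+c"(ecx), "=d"(edx)
                         :
                         : "memory");
    (void)eax; (void)ebx; (void)ecx; (void)edx;
}
#else
/* Placeholder so the sketch compiles on other targets; it does NOT serialize. */
static inline void serialize_with_cpuid(void) {}
#endif

/* Writes new instruction bytes through `alias` (a writable linear address
   that may differ from the one used to fetch the code), then serializes
   before the caller transfers control to the modified instruction. */
void patch_through_alias(unsigned char *alias,
                         const unsigned char *new_bytes, size_t len)
{
    memcpy(alias, new_bytes, len);
    serialize_with_cpuid();
}
```

The serializing step is needed here because the prefetch-queue check is based on linear addresses and so cannot detect a write made through a different linear address.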
For Intel486 processors, a write to an instruction in the cache will modify it in both the cache and memory, but if 
the instruction was prefetched before the write, the old version of the instruction could be the one executed. To 
prevent the old instruction from being executed, flush the instruction prefetch unit by coding a jump instruction 
immediately after any write that modifies an instruction.
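The Intel486 recommendation can be illustrated with a short C sketch that places a jump directly after the modifying write. It assumes a GCC- or Clang-compatible compiler on an x86 target and uses hypothetical function names; it shows the pattern only and has not been validated on Intel486 hardware.

```c
#include <stddef.h>
#include <string.h>

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__i386__) || defined(__x86_64__))
/* Executes a jump to the next instruction. On the Intel486 processor a
   jump flushes the instruction prefetch unit; the "memory" clobber keeps
   the compiler from reordering the preceding stores after the jump. */
static inline void jump_after_code_write(void)
{
    __asm__ __volatile__("jmp 1f\n1:" : : : "memory");
}
#else
/* Placeholder so the sketch compiles on other targets; it does nothing. */
static inline void jump_after_code_write(void) {}
#endif

/* Overwrites an instruction in place and jumps immediately afterward so
   that a stale prefetched copy is not executed. */
void modify_instruction_i486(unsigned char *insn,
                             const unsigned char *new_bytes, size_t len)
{
    memcpy(insn, new_bytes, len);
    jump_after_code_write();
}
```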