MULTIPLE-PROCESSOR MANAGEMENT

NOTE

Some processors (prior to the introduction of Intel 64 Architecture and based on Intel NetBurst
microarchitecture) do not support simultaneous loading of a microcode update on the sibling logical
processors in the same core. All other processors support logical processors initiating an update
simultaneously. As a common approach that works in both cases, Intel recommends that the
microcode loader use the sequential technique described in Section 9.11.6.3.
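
The following C sketch illustrates one way a loader might keep sibling logical processors from updating the same core at the same time. Section 9.11.6.3 is not reproduced here, so the sketch assumes the sequential technique amounts to serializing loads per core with a software lock. `NUM_CORES`, `load_microcode_stub`, `load_update_serialized`, and `loads_seen` are hypothetical names, and the stub only counts calls in place of the privileged load a real loader would perform.

```c
#include <assert.h>
#include <stdatomic.h>

#define NUM_CORES 4   /* hypothetical core count for the sketch */

/* 0 = no update in progress on this core, 1 = an update is in progress.
 * Static storage gives every entry a valid zero state. */
static atomic_int core_busy[NUM_CORES];

/* Number of loads that reached each core (bookkeeping for the sketch only). */
static int loads_done[NUM_CORES];

/* Hypothetical stand-in for the real, privileged microcode load. */
static void load_microcode_stub(int core)
{
    loads_done[core]++;
}

/* Called by any logical processor that wants to update its own core.
 * Only one caller per core runs the load at a time; siblings spin. */
void load_update_serialized(int core)
{
    assert(core >= 0 && core < NUM_CORES);

    int expected = 0;
    while (!atomic_compare_exchange_weak(&core_busy[core], &expected, 1))
        expected = 0;   /* a sibling holds the lock (or a spurious failure): retry */

    load_microcode_stub(core);

    atomic_store(&core_busy[core], 0);   /* release so the sibling may proceed */
}

/* Accessor used to inspect the sketch's bookkeeping. */
int loads_seen(int core)
{
    assert(core >= 0 && core < NUM_CORES);
    return loads_done[core];
}
```

If both logical processors of core 2 call `load_update_serialized(2)`, the two loads run one after the other rather than concurrently. A production loader would typically also check whether the sibling has already applied the update before loading again.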

8.7.12   Self Modifying Code

Intel processors supporting Intel Hyper-Threading Technology support self-modifying code, where data writes
modify instructions cached or currently in flight. They also support cross-modifying code, where on an MP system
writes generated by one processor modify instructions cached or currently in flight on another. See Section 8.1.3,
“Handling Self- and Cross-Modifying Code,” for a description of the requirements for self- and cross-modifying
code in an IA-32 processor.

8.7.13   Implementation-Specific Intel HT Technology Facilities

The following non-architectural facilities are implementation-specific in IA-32 processors supporting Intel
Hyper-Threading Technology:

Caches

Translation lookaside buffers (TLBs)

Thermal monitoring facilities

The Intel Xeon processor MP implementation is described in the following sections.

8.7.13.1   Processor Caches

For processors supporting Intel Hyper-Threading Technology, the caches are shared. Any cache manipulation 
instruction that is executed on one logical processor has a global effect on the cache hierarchy of the physical 
processor. Note the following:

WBINVD instruction — The entire cache hierarchy is invalidated after modified data is written back to 
memory. All logical processors are stopped from executing until after the write-back and invalidate operation is 
completed. A special bus cycle is sent to all caching agents. The amount of time or cycles for WBINVD to 
complete will vary due to the size of different cache hierarchies and other factors. As a consequence, the use of 
the WBINVD instruction can have an impact on interrupt/event response time.

INVD instruction — The entire cache hierarchy is invalidated without writing back modified data to memory. 
All logical processors are stopped from executing until after the invalidate operation is completed. A special bus 
cycle is sent to all caching agents.

CLFLUSH and CLFLUSHOPT instructions — The specified cache line is invalidated from the cache hierarchy 
after any modified data is written back to memory and a bus cycle is sent to all caching agents, regardless of 
which logical processor caused the cache line to be filled.

CD flag in control register CR0 — Each logical processor has its own CR0 control register, and thus its own 
CD flag in CR0. The CD flags for the two logical processors are ORed together, such that when any logical 
processor sets its CD flag, the entire cache is nominally disabled. 
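
The behavior listed above can be illustrated with a small software model. The C sketch below is a toy, not a description of any real cache: the direct-mapped layout, the sizes, and all names (`phys_cpu`, `model_store`, `model_wbinvd`, `model_invd`, `model_clflush`, `cache_nominally_disabled`) are assumptions made for illustration. It shows the points from the text: one cache hierarchy is shared by both logical processors, WBINVD writes modified data back before invalidating while INVD discards it, CLFLUSH acts on a line regardless of which logical processor filled it, and the per-logical-processor CD flags are ORed together. The model does not attempt to reproduce what the cache does once it is nominally disabled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define NUM_LOGICAL 2    /* logical processors per physical processor */
#define NUM_LINES   4    /* toy direct-mapped cache (assumption) */
#define MEM_SIZE    16   /* toy memory, one byte per "line" (assumption) */

typedef struct {
    bool    valid;
    bool    modified;
    size_t  addr;
    uint8_t data;
} cache_line;

typedef struct {
    cache_line lines[NUM_LINES];   /* single hierarchy shared by all logical processors */
    uint8_t    memory[MEM_SIZE];
    bool       cd[NUM_LOGICAL];    /* each logical processor's own CR0.CD flag */
} phys_cpu;

/* The per-logical-processor CD flags are ORed together. */
bool cache_nominally_disabled(const phys_cpu *p)
{
    bool disabled = false;
    for (int lp = 0; lp < NUM_LOGICAL; lp++)
        disabled = disabled || p->cd[lp];
    return disabled;
}

/* Copy a modified line to memory and mark it clean. */
static void write_back(phys_cpu *p, cache_line *l)
{
    if (l->valid && l->modified)
        p->memory[l->addr] = l->data;
    l->modified = false;
}

/* Cached store issued by either logical processor. */
void model_store(phys_cpu *p, size_t addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    cache_line *l = &p->lines[addr % NUM_LINES];
    if (l->valid && l->addr != addr)
        write_back(p, l);          /* evict the previous occupant */
    l->valid = true;
    l->modified = true;
    l->addr = addr;
    l->data = value;
}

/* WBINVD: write back modified data, then invalidate the whole hierarchy. */
void model_wbinvd(phys_cpu *p)
{
    for (int i = 0; i < NUM_LINES; i++) {
        write_back(p, &p->lines[i]);
        p->lines[i].valid = false;
    }
}

/* INVD: invalidate the whole hierarchy; modified data is lost. */
void model_invd(phys_cpu *p)
{
    for (int i = 0; i < NUM_LINES; i++) {
        p->lines[i].valid = false;
        p->lines[i].modified = false;
    }
}

/* CLFLUSH: write back and invalidate one line, whoever filled it. */
void model_clflush(phys_cpu *p, size_t addr)
{
    assert(addr < MEM_SIZE);
    cache_line *l = &p->lines[addr % NUM_LINES];
    if (l->valid && l->addr == addr) {
        write_back(p, l);
        l->valid = false;
    }
}
```

None of the cache operations takes a logical-processor argument. That is deliberate: in this model, as in the text, the effect is the same no matter which logical processor issues the instruction.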

8.7.13.2   Processor Translation Lookaside Buffers (TLBs)

In processors supporting Intel Hyper-Threading Technology, data cache TLBs are shared. The instruction cache TLB
may be duplicated or shared in each logical processor, depending on implementation specifics of different processor
families.

Entries in the TLBs are tagged with an ID that indicates the logical processor that initiated the translation. This tag
applies even for translations that are marked global using the page-global feature for memory paging. See Section
4.10, “Caching Translation Information,” for information about global translations.
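
The tagging rule can be pictured with a small software model of a shared TLB. The C sketch below is illustrative only: the entry layout, the round-robin replacement, and the names `shared_tlb`, `tlb_insert`, and `tlb_lookup` are assumptions, not a description of the hardware. It reads the text as saying that a lookup matches only entries tagged with the requesting logical processor's ID, and that the global attribute does not relax that match.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 8   /* toy size (assumption) */

typedef struct {
    bool     valid;
    bool     global;   /* page-global attribute of the translation */
    uint8_t  lp_id;    /* logical processor that initiated the translation */
    uint64_t vpn;      /* virtual page number */
    uint64_t pfn;      /* physical frame number */
} tlb_entry;

typedef struct {
    tlb_entry e[TLB_ENTRIES];
    unsigned  next;    /* round-robin replacement index (assumption) */
} shared_tlb;

/* Record a translation on behalf of logical processor lp_id. */
void tlb_insert(shared_tlb *t, uint8_t lp_id, uint64_t vpn, uint64_t pfn, bool global)
{
    assert(t != NULL);
    tlb_entry *slot = &t->e[t->next];
    t->next = (t->next + 1) % TLB_ENTRIES;
    slot->valid  = true;
    slot->global = global;
    slot->lp_id  = lp_id;
    slot->vpn    = vpn;
    slot->pfn    = pfn;
}

/* Look up vpn for logical processor lp_id.
 * Returns true and stores the frame number in *pfn_out on a hit. */
bool tlb_lookup(const shared_tlb *t, uint8_t lp_id, uint64_t vpn, uint64_t *pfn_out)
{
    assert(t != NULL && pfn_out != NULL);
    for (int i = 0; i < TLB_ENTRIES; i++) {
        const tlb_entry *x = &t->e[i];
        /* The tag must match even when x->global is set. */
        if (x->valid && x->lp_id == lp_id && x->vpn == vpn) {
            *pfn_out = x->pfn;
            return true;
        }
    }
    return false;
}
```

In this model, a global translation inserted by logical processor 0 does not produce a hit for logical processor 1, even though the structure itself is shared. Logical processor 1 gets a hit only after its own translation for the same page has been inserted.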