8-48 Vol. 3A

MULTIPLE-PROCESSOR MANAGEMENT

Software should not allow voluntary context switches between MONITOR and MWAIT in the instruction flow. Note 
that execution of MWAIT does not re-arm the monitor hardware, so MONITOR/MWAIT must be executed in a 
loop. Also note that exits from the MWAIT state can be caused by conditions other than a write to the 
triggering address; software should explicitly check the triggering data location to determine whether the write 
occurred. Software should also check the value at the triggering address after executing the MONITOR instruction 
and before executing the MWAIT instruction. This check identifies any write to the triggering address that 
occurred during MONITOR execution. 
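
The loop described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: `wait_for_write`, `arm_monitor_fn`, `mwait_fn`, and the `demo_*` stubs are hypothetical names introduced here. In real kernel code the two hooks would wrap the MONITOR and MWAIT instructions; they are function pointers in this sketch so that the control flow can be exercised without privileged instructions.

```c
#include <stdint.h>

/* Hypothetical hooks (assumptions, not part of any real API): in kernel
 * code these would wrap the MONITOR and MWAIT instructions. */
typedef void (*arm_monitor_fn)(const volatile void *addr);
typedef void (*mwait_fn)(void);

/* Wait until *flag differs from `idle` and return the observed value. */
static uint32_t wait_for_write(const volatile uint32_t *flag, uint32_t idle,
                               arm_monitor_fn arm_monitor, mwait_fn do_mwait)
{
    for (;;) {
        uint32_t v = *flag;      /* explicit check of the triggering data */
        if (v != idle)
            return v;
        arm_monitor(flag);       /* MWAIT does not re-arm: arm on every pass */
        v = *flag;               /* catch a write that raced with MONITOR */
        if (v != idle)
            return v;
        do_mwait();              /* may return for reasons other than a write */
    }
}

/* Demo stubs: simulate two spurious wakeups, then a real write that
 * lands during the third wait. */
static volatile uint32_t demo_flag;
static int demo_wakeups;

static void demo_arm(const volatile void *addr) { (void)addr; }

static void demo_mwait(void)
{
    if (++demo_wakeups == 3)
        demo_flag = 1;           /* stands in for a store by another processor */
}
```

With the stubs, `wait_for_write(&demo_flag, 0, demo_arm, demo_mwait)` loops through the two simulated spurious wakeups and returns only once the flag has actually changed, which is the behavior the explicit data checks are meant to guarantee.
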
The address range provided to the MONITOR instruction must be of write-back caching type. Only write-back 
memory type stores to the monitored address range will trigger the monitor hardware. If the address range is not 
in memory of write-back type, the address monitor hardware may not be set up properly or the monitor hardware 
may not be armed. Software is also responsible for ensuring that:

• Writes that are not intended to cause the exit of a busy loop do not write to a location within the address 
region being monitored by the monitor hardware.

• Writes intended to cause the exit of a busy loop are written to locations within the monitored address region.

Not doing so will lead to more false wakeups (exits from the MWAIT state not due to a write to the intended data 
location), which have negative performance implications. Software might need to use padding to prevent false 
wakeups. CPUID provides a mechanism for determining the size of data locations for monitoring as well as a 
mechanism for determining the size of the pad.

8.10.5 Monitor/Mwait Address Range Determination

To use the MONITOR/MWAIT instructions, software should know the length of the region monitored by the 
MONITOR/MWAIT instructions and the coherence line size for cache-snoop traffic in a multiprocessor 
system. This information can be queried using the CPUID monitor leaf function (EAX = 05H). You will need the 
smallest and largest monitor line sizes:

• To avoid missed wake-ups: make sure that the data structure used to monitor writes fits within the smallest 
monitor line size. Otherwise, the processor may not wake up after a write intended to trigger an exit from 
MWAIT. 

• To avoid false wake-ups: use the largest monitor line size to pad the data structure used to monitor writes. 
Software must make sure that, beyond the data structure, no unrelated data variable exists in the triggering 
area for MWAIT. A pad may be needed to avoid this situation.
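
As an illustration of the query described above, the following sketch decodes the two sizes from the CPUID monitor leaf, where EAX[15:0] reports the smallest monitor line size and EBX[15:0] the largest, both in bytes. The type and function names (`monitor_line_sizes`, `decode_monitor_leaf`, `fits_in_smallest_line`, `query_monitor_line_sizes`) are hypothetical and introduced only for this example. The optional query helper assumes a GCC- or Clang-compatible compiler that provides `__get_cpuid` in `<cpuid.h>`, and it does not check the MONITOR feature flag first.

```c
#include <stddef.h>
#include <stdint.h>

/* Monitor line sizes in bytes, as reported by CPUID leaf 05H. */
struct monitor_line_sizes {
    uint16_t smallest;   /* EAX[15:0] */
    uint16_t largest;    /* EBX[15:0] */
};

/* Decode raw EAX/EBX values returned by CPUID with EAX = 05H. */
static struct monitor_line_sizes decode_monitor_leaf(uint32_t eax, uint32_t ebx)
{
    struct monitor_line_sizes s;
    s.smallest = (uint16_t)(eax & 0xFFFFu);
    s.largest  = (uint16_t)(ebx & 0xFFFFu);
    return s;
}

/* Missed-wakeup rule: the trigger data must fit in the smallest line. */
static int fits_in_smallest_line(size_t trigger_size,
                                 struct monitor_line_sizes s)
{
    return s.smallest != 0 && trigger_size <= s.smallest;
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
/* Assumption: GCC/Clang <cpuid.h>. Returns 1 on success, 0 if leaf 05H
 * is beyond the highest supported basic leaf. */
static int query_monitor_line_sizes(struct monitor_line_sizes *out)
{
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(0x05, &eax, &ebx, &ecx, &edx))
        return 0;
    *out = decode_monitor_leaf(eax, ebx);
    return 1;
}
#endif
```

Keeping the decoding separate from the CPUID call is a design choice for this sketch: the bit-field logic can then be checked with fixed register values on any machine.
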

These two values bear no relationship to the cache line size in the system, and software should not make any 
assumptions to that effect. Within a single-cluster system, the two parameters should default to the same value 
(the size of the monitor triggering area is the same as the system coherence line size).

Based on the monitor line sizes returned by CPUID, the OS should dynamically allocate structures with 
appropriate padding. If an OS must use static data structures, attempt to adapt the data structure and use a 
dynamically allocated data buffer for thread synchronization. When the latter technique is not possible, consider 
not using MONITOR/MWAIT with static data structures.
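
One way to sketch the dynamic allocation recommended above is shown below. It is only an illustration under stated assumptions: `round_up_to_line` and `alloc_monitor_block` are hypothetical helpers, the monitor line sizes are assumed to be powers of two (the text does not guarantee this), and C11 `aligned_alloc` stands in for whatever allocator an OS would actually use. The block starts on a boundary of the largest monitor line size and is padded to a whole number of such lines, so no unrelated variable shares the triggering area.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Round n up to a multiple of line; line must be a nonzero power of two. */
static size_t round_up_to_line(size_t n, size_t line)
{
    return (n + line - 1) & ~(line - 1);
}

/* Allocate a block whose first trigger_size bytes hold the monitored data.
 * Returns NULL if the data does not fit in the smallest monitor line, if
 * largest_line is not a power of two (assumption of this sketch), or if
 * the allocation fails. Release the block with free(). */
static void *alloc_monitor_block(size_t trigger_size, size_t smallest_line,
                                 size_t largest_line)
{
    if (largest_line == 0 || (largest_line & (largest_line - 1)) != 0)
        return NULL;
    if (trigger_size == 0 || trigger_size > smallest_line)
        return NULL;             /* would risk missed wake-ups */
    /* Aligning and padding to the largest line avoids false wake-ups
     * caused by neighboring data. */
    return aligned_alloc(largest_line,
                         round_up_to_line(trigger_size, largest_line));
}
```

For example, with both line sizes reported as 64 bytes, `alloc_monitor_block(4, 64, 64)` requests one 64-byte block on a 64-byte boundary for a 4-byte flag.
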
To set up the data structure correctly for MONITOR/MWAIT on multi-clustered systems, interaction between 
processors, chipsets, and the BIOS is required (the system coherence line size may depend on the chipset used in 
the system; the size could be different from the processor’s monitor triggering area). The BIOS is responsible for 
setting the correct value for the system coherence line size using the IA32_MONITOR_FILTER_LINE_SIZE MSR. 
Depending on the relative magnitude of the size of the monitor triggering area versus the value written into the 
IA32_MONITOR_FILTER_LINE_SIZE MSR, the smaller of the parameters will be reported as the Smallest Monitor 
Line Size. The larger of the parameters will be reported as the Largest Monitor Line Size.

8.10.6 Required Operating System Support

This section describes changes that must be made to an operating system to run on processors supporting Intel 
Hyper-Threading Technology. It also describes optimizations that can help an operating system make more efficient 
use of the logical processors sharing execution resources. The required changes and suggested optimizations are 
representative of the types of modifications that appear in Windows* XP and Linux* kernel 2.4.0 operating systems 
for Intel processors supporting Intel Hyper-Threading Technology. Additional optimizations for processors