Vol. 3A 8-47

MULTIPLE-PROCESSOR MANAGEMENT

Example 8-23.  Verifying MONITOR/MWAIT Support

boolean MONITOR_MWAIT_works = TRUE;
try {
    _asm {
        xor ecx, ecx        // ECX = 0: no extensions
        xor edx, edx        // EDX = 0: no hints
        mov eax, MemArea    // EAX = address to monitor
        monitor
    }
    // Use monitor
} except (UNWIND) {
    // If we get here, MONITOR/MWAIT is not supported
    MONITOR_MWAIT_works = FALSE;
}

8.10.4 MONITOR/MWAIT Instruction

Operating systems usually implement idle loops to handle thread synchronization. In a typical idle-loop scenario,
there may be several “busy loops,” each using a set of memory locations. An affected processor waits in a loop and
polls a memory location to determine whether work is available to execute. Work is typically posted by a write to
memory (the work queue of the waiting processor). The time needed to initiate a work request and get it scheduled
is on the order of a few bus cycles.

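
The polling side of such an idle loop can be sketched in portable C. This is a minimal illustration, not an operating system's actual scheduler code: the `work_slot` type, the function names, the single-waiter design, and the poll budget are all hypothetical and chosen only to show that posting work is a single store to a location the waiter polls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical one-slot work queue: 0 means "no work", nonzero is a work id. */
typedef struct {
    atomic_uint pending;
} work_slot;

/* Poster side: posting work is a single store to the waiter's slot. */
static void post_work(work_slot *slot, unsigned work_id)
{
    assert(slot != NULL && work_id != 0);
    atomic_store_explicit(&slot->pending, work_id, memory_order_release);
}

/* Waiter side (single waiter assumed): poll the slot up to max_polls times.
 * Returns the claimed work id, or 0 if nothing arrived within the budget. */
static unsigned poll_for_work(work_slot *slot, unsigned long max_polls)
{
    assert(slot != NULL);
    for (unsigned long i = 0; i < max_polls; i++) {
        if (atomic_load_explicit(&slot->pending, memory_order_acquire) != 0) {
            /* Claim the work and mark the slot empty again. */
            return atomic_exchange_explicit(&slot->pending, 0u,
                                            memory_order_acq_rel);
        }
        /* A real idle loop would execute PAUSE here between polls. */
    }
    return 0;
}
```

Every poll in this sketch occupies execution resources that a sibling logical processor could otherwise use, which is the cost the rest of this section sets out to reduce.
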
From a resource-sharing perspective (logical processors sharing execution resources), using the HLT instruction in
an OS idle loop is desirable but has implications. Executing HLT on an idle logical processor puts that processor
in a non-execution state. Another processor that posts work for the halted logical processor must then wake it
with an inter-processor interrupt. Posting and servicing that interrupt delays the servicing of new work requests.

In a shared-memory configuration, exits from busy loops usually occur because the state of a specific memory
location changes; such a change is usually triggered by a write to that location by another agent (typically a
processor).

MONITOR/MWAIT complement the use of HLT and PAUSE to allow for efficient partitioning and un-partitioning of
shared resources among logical processors sharing physical resources. MONITOR sets up an effective address
range that is monitored for write-to-memory activities; MWAIT places the processor in an optimized state (which
may vary between implementations) until a write to the monitored address range occurs.

In their initial implementation, MONITOR and MWAIT are available at CPL = 0 only.

Both instructions rely on the state of the processor’s monitor hardware. The monitor hardware can be either armed
(by executing the MONITOR instruction) or triggered (by a variety of events, including a store to the monitored
memory region). If the monitor hardware is in a triggered state when MWAIT executes, MWAIT behaves as a NOP
and execution continues at the next instruction in the execution stream. The state of the monitor hardware is not
architecturally visible except through the behavior of MWAIT.

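
The armed/triggered behavior described above can be illustrated with a small software model. This is only a sketch for reasoning about the state machine, not a description of any hardware implementation: the `monitor_model` type, the function names, and the 64-byte range size are hypothetical, and treating never-armed hardware as triggered is a modeling assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical monitored-range size; real processors report their own. */
#define MODEL_RANGE_BYTES 64u

typedef enum { MONITOR_TRIGGERED, MONITOR_ARMED } monitor_state;

typedef struct {
    monitor_state state;
    uintptr_t base;             /* start of the monitored range */
} monitor_model;

/* Modeling assumption: hardware that was never armed acts as triggered. */
static void model_reset(monitor_model *m)
{
    assert(m != NULL);
    m->state = MONITOR_TRIGGERED;
    m->base = 0;
}

/* MONITOR: arm the hardware on the aligned range containing addr. */
static void model_monitor(monitor_model *m, uintptr_t addr)
{
    assert(m != NULL);
    m->base = addr & ~(uintptr_t)(MODEL_RANGE_BYTES - 1u);
    m->state = MONITOR_ARMED;
}

/* A store triggers the monitor only if it is armed and hits the range. */
static void model_store(monitor_model *m, uintptr_t addr)
{
    assert(m != NULL);
    if (m->state == MONITOR_ARMED && addr - m->base < MODEL_RANGE_BYTES)
        m->state = MONITOR_TRIGGERED;
}

/* MWAIT: true if the processor would wait, false if MWAIT acts as a NOP. */
static bool model_mwait_would_wait(const monitor_model *m)
{
    assert(m != NULL);
    return m->state == MONITOR_ARMED;
}
```

In this model, as in the text, software cannot read the state directly in real code; the only observable effect is whether MWAIT waits or falls through.
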
Multiple events other than a write to the triggering address range can cause a processor that executed MWAIT to 
wake up. These include events that would lead to voluntary or involuntary context switches, such as:

• External interrupts, including NMI, SMI, INIT, BINIT, MCERR, A20M#

• Faults, Aborts (including Machine Check)

• Architectural TLB invalidations including writes to CR0, CR3, CR4 and certain MSR writes; execution of LMSW
  (occurring prior to issuing MWAIT but after setting the monitor)

• Voluntary transitions due to fast system call and far calls (occurring prior to issuing MWAIT but after setting the
  monitor)

Power-management-related events (such as Thermal Monitor 2 or chipset-driven STPCLK# assertion) do not cause
the monitor event pending flag to be cleared. Faults do not cause the monitor event pending flag to be cleared either.
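
Because MWAIT can return for reasons other than a store to the monitored range, a waiter has to re-check its condition each time it wakes. The following is a minimal sketch of that loop under stated assumptions: `arm_fn` and `wait_fn` are hypothetical hooks standing in for the MONITOR and MWAIT instructions (which require CPL = 0 in the initial implementation), and the `demo_*` stand-ins simulate two wake-ups with no store followed by one caused by a store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hooks standing in for the MONITOR and MWAIT instructions. */
typedef void (*arm_fn)(const volatile void *addr);
typedef void (*wait_fn)(void);

/* Wait until *flag becomes nonzero.
 * Returns how many times the wait hook returned before the flag was set. */
static unsigned wait_until_nonzero(const volatile unsigned *flag,
                                   arm_fn arm_hook, wait_fn wait_hook)
{
    assert(flag != NULL && arm_hook != NULL && wait_hook != NULL);
    unsigned wakeups = 0;
    while (*flag == 0) {
        arm_hook(flag);         /* MONITOR: arm on the flag's address */
        if (*flag != 0)         /* re-check: the store may have landed */
            break;              /* before the monitor was armed        */
        wait_hook();            /* MWAIT: may return without a store   */
        wakeups++;
    }
    return wakeups;
}

/* Demo stand-ins (not hardware): the third wake-up is caused by a store. */
static volatile unsigned demo_flag;
static unsigned demo_wait_calls;

static void demo_arm(const volatile void *addr)
{
    (void)addr;                 /* nothing to arm in the simulation */
}

static void demo_wait(void)
{
    if (++demo_wait_calls == 3)
        demo_flag = 1;          /* simulated store by another agent */
}
```

The loop treats every return from the wait hook the same way, since software cannot tell from MWAIT alone why it woke; only the monitored location itself says whether work has arrived.
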