Vol. 3A 8-47

MULTIPLE-PROCESSOR MANAGEMENT

Example 8-23. Verifying MONITOR/MWAIT Support

boolean MONITOR_MWAIT_works = TRUE;
try {
    _asm {
        xor ecx, ecx        // ECX = 0: no optional extensions
        xor edx, edx        // EDX = 0: no optional hints
        mov eax, MemArea    // EAX = address of the range to monitor
        monitor
    }
    // Use monitor
} except (UNWIND) {
    // if we get here, MONITOR/MWAIT is not supported
    MONITOR_MWAIT_works = FALSE;
}
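
The exception-based probe above can be complemented by a feature-flag query. The following is a minimal sketch, assuming a GCC or Clang toolchain that provides <cpuid.h> and its __get_cpuid helper; the function names monitor_mwait_reported and monitor_line_sizes are hypothetical and chosen for illustration. It reads CPUID.01H:ECX bit 3 (the MONITOR/MWAIT feature flag) and, when that flag is set, the smallest and largest monitor-line sizes from CPUID leaf 05H. On a non-x86 build both functions simply report "not available".

```c
#include <assert.h>
#include <stddef.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>              /* GCC/Clang wrapper around the CPUID instruction */
#define HAVE_X86_CPUID 1
#else
#define HAVE_X86_CPUID 0        /* assumption: no CPUID outside x86 builds */
#endif

/* Returns 1 if CPUID.01H:ECX[3] reports MONITOR/MWAIT, 0 otherwise. */
int monitor_mwait_reported(void)
{
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;               /* leaf 01H not available */
    return (int)((ecx >> 3) & 1u);
#else
    return 0;
#endif
}

/* Reads the monitor-line sizes (in bytes) from CPUID leaf 05H.
   Returns 1 and fills both outputs on success, 0 otherwise. */
int monitor_line_sizes(unsigned int *smallest, unsigned int *largest)
{
    assert(smallest != NULL && largest != NULL);
#if HAVE_X86_CPUID
    unsigned int eax, ebx, ecx, edx;
    if (!monitor_mwait_reported() || !__get_cpuid(5, &eax, &ebx, &ecx, &edx))
        return 0;
    *smallest = eax & 0xFFFFu;  /* EAX[15:0]: smallest monitor-line size */
    *largest  = ebx & 0xFFFFu;  /* EBX[15:0]: largest monitor-line size  */
    return 1;
#else
    return 0;
#endif
}
```

Unlike the probe in Example 8-23, this query can run at any privilege level because it does not execute MONITOR itself. The flag only indicates that the instructions are reported as present; it does not show whether the current privilege level may execute them.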

8.10.4 MONITOR/MWAIT Instruction

Operating systems usually implement idle loops to handle thread synchronization. In a typical idle-loop scenario,
there could be several “busy loops” and they would use a set of memory locations. An impacted processor waits in
a loop and polls a memory location to determine if there is available work to execute. The posting of work is typically
a write to memory (the work-queue of the waiting processor). The time for initiating a work request and getting it
scheduled is on the order of a few bus cycles.

From a resource sharing perspective (logical processors sharing execution resources), use of the HLT instruction in
an OS idle loop is desirable but has implications. Executing the HLT instruction on an idle logical processor puts the
targeted processor in a non-execution state. This requires another processor (when posting work for the halted
logical processor) to wake up the halted processor using an inter-processor interrupt. The posting and servicing of
such an interrupt introduces a delay in the servicing of new work requests.

In a shared memory configuration, exits from busy loops usually occur because of a state change applicable to a
specific memory location; such a change tends to be triggered by writes to the memory location by another agent
(typically a processor).

MONITOR/MWAIT complement the use of HLT and PAUSE to allow for efficient partitioning and un-partitioning of
shared resources among logical processors sharing physical resources. MONITOR sets up an effective address
range that is monitored for write-to-memory activities; MWAIT places the processor in an optimized state (this
may vary between different implementations) until a write to the monitored address range occurs.

In their initial implementation, MONITOR and MWAIT are available at CPL = 0 only.

Both instructions rely on the state of the processor’s monitor hardware. The monitor hardware can be either armed
(by executing the MONITOR instruction) or triggered (due to a variety of events, including a store to the monitored
memory region). If the monitor hardware is in a triggered state when MWAIT executes, MWAIT behaves as a NOP
and execution continues at the next instruction in the execution stream. The state of monitor hardware is not
architecturally visible except through the behavior of MWAIT.
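
The armed and triggered states described above can be illustrated with a small software model. This is only a sketch under stated assumptions: the type monitor_model and the functions model_monitor, model_store, and model_mwait are hypothetical names, the monitored range is assumed to be one aligned line of a caller-supplied size, and only a store is modeled as a triggering event. It does not execute the real instructions and says nothing about how any implementation tracks this state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical software model of the monitor hardware; not a real API. */
typedef enum { MON_IDLE, MON_ARMED, MON_TRIGGERED } mon_state;

typedef struct {
    mon_state state;
    uintptr_t base;   /* start of the monitored range                 */
    size_t    len;    /* assumed monitor-line size in bytes           */
} monitor_model;

/* Models MONITOR: arm the hardware on the aligned line containing addr. */
void model_monitor(monitor_model *m, uintptr_t addr, size_t line)
{
    assert(m != NULL && line != 0);
    m->base  = addr - (addr % line);
    m->len   = line;
    m->state = MON_ARMED;
}

/* Models a store by another agent: it triggers the monitor only if the
   monitor is armed and the store falls inside the monitored range. */
void model_store(monitor_model *m, uintptr_t addr)
{
    assert(m != NULL);
    if (m->state == MON_ARMED && addr >= m->base && addr - m->base < m->len)
        m->state = MON_TRIGGERED;
}

/* Models MWAIT: returns true if the processor would enter the optimized
   wait state, false if MWAIT falls through like a NOP (monitor already
   triggered, or never armed). */
bool model_mwait(const monitor_model *m)
{
    assert(m != NULL);
    return m->state == MON_ARMED;
}
```

In this model, a store to the monitored line between model_monitor and model_mwait makes model_mwait return false, which corresponds to the NOP behavior described above; a store outside the line leaves the monitor armed.
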
Multiple events other than a write to the triggering address range can cause a processor that executed MWAIT to
wake up. These include events that would lead to voluntary or involuntary context switches, such as:

• External interrupts, including NMI, SMI, INIT, BINIT, MCERR, A20M#

• Faults, Aborts (including Machine Check)

• Architectural TLB invalidations including writes to CR0, CR3, CR4 and certain MSR writes; execution of LMSW
  (occurring prior to issuing MWAIT but after setting the monitor)

• Voluntary transitions due to fast system call and far calls (occurring prior to issuing MWAIT but after setting the
  monitor)

Power management related events (such as Thermal Monitor 2 or chipset driven STPCLK# assertion) will not cause
the monitor event pending flag to be cleared. Faults will not cause the monitor event pending flag to be cleared.
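
Because MWAIT can return for reasons other than a store to the monitored range, software that waits on a memory location would normally re-test the location after every wake-up. The sketch below illustrates that loop shape in user-mode C. The wait_ops hooks are hypothetical stand-ins for the privileged MONITOR and MWAIT instructions, and the demo_* helpers are an assumed test double that simulates two wake-ups unrelated to the store followed by the store itself; none of these names come from the text above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hooks standing in for the CPL-0 instructions. */
typedef struct {
    void (*monitor)(const volatile void *addr);  /* stands in for MONITOR */
    void (*mwait)(void);                         /* stands in for MWAIT   */
} wait_ops;

/* Waits until *flag becomes nonzero; returns how many times MWAIT ran. */
unsigned wait_for_work(const volatile int *flag, const wait_ops *ops)
{
    unsigned waits = 0;
    assert(flag != NULL && ops != NULL);
    while (*flag == 0) {
        ops->monitor(flag);   /* arm the monitor on the flag's address      */
        if (*flag != 0)       /* re-check: work may have been posted between */
            break;            /* the first test and arming the monitor       */
        ops->mwait();         /* may return without a store having occurred  */
        waits++;
    }
    return waits;
}

/* Assumed test double: the first two wake-ups model events such as
   interrupts; the third models the store that posts work. */
volatile int demo_flag;
unsigned demo_wakeups;

void demo_monitor(const volatile void *addr) { (void)addr; }

void demo_mwait(void)
{
    if (++demo_wakeups == 3)
        demo_flag = 1;
}

unsigned run_demo(void)
{
    const wait_ops ops = { demo_monitor, demo_mwait };
    demo_flag = 0;
    demo_wakeups = 0;
    return wait_for_work(&demo_flag, &ops);
}
```

Passing the two operations as function pointers is a design choice made only so the loop can be exercised outside CPL 0; a kernel implementation would issue the instructions directly.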