Page 731

Vol. 3B 18-95

PERFORMANCE MONITORING

18.15.5.8 Generating an Interrupt on Overflow

Any performance counter can be configured to generate a performance monitor interrupt (PMI) if the counter over-
flows. The PMI interrupt service routine can then collect information about the state of the processor or program
when overflow occurred. This information can then be used with a tool like the Intel

VTune™ Performance

Analyzer to analyze and tune program performance.
To enable an interrupt on counter overflow, the OVR_PMI flag in the counter’s associated CCCR MSR must be set.
When overflow occurs, a PMI is generated through the local APIC. (Here, the performance counter entry in the local
vector table [LVT] is set up to deliver the interrupt generated by the PMI to the processor.)
The PMI service routine can use the OVF flag to determine which counter overflowed when multiple counters have
been configured to generate PMIs. Also, note that these processors mask PMIs upon receiving an interrupt. Clear
this condition before leaving the interrupt handler.
When generating interrupts on overflow, the performance counter being used should be preset to value that will
cause an overflow after a specified number of events are counted plus 1. The simplest way to select the preset
value is to write a negative number into the counter, as described in Section 18.15.5.6, “Cascading Counters.”
Here, however, if an interrupt is to be generated after 100 event counts, the counter should be preset to minus 100
plus 1 (-100 + 1), or -99. The counter will then overflow after it counts 99 events and generate an interrupt on the
next (100th) event counted. The difference of 1 for this count enables the interrupt to be generated immediately
after the selected event count has been reached, instead of waiting for the overflow to be propagation through the
counter.
Because of latency in the microarchitecture between the generation of events and the generation of interrupts on
overflow, it is sometimes difficult to generate an interrupt close to an event that caused it. In these situations, the
FORCE_OVF flag in the CCCR can be used to improve reporting. Setting this flag causes the counter to overflow on
every counter increment, which in turn triggers an interrupt after every counter increment.

18.15.5.9 Counter Usage Guideline

There are some instances where the user must take care to configure counting logic properly, so that it is not
powered down. To use any ESCR, even when it is being used just for tagging, (any) one of the counters that the
particular ESCR (or its paired ESCR) can be connected to should be enabled. If this is not done, 0 counts may
result. Likewise, to use any counter, there must be some event selected in a corresponding ESCR (other than
no_event, which generally has a select value of 0).

18.15.6 At-Retirement

Counting

At-retirement counting provides a means counting only events that represent work committed to architectural
state and ignoring work that was performed speculatively and later discarded.
One example of this speculative activity is branch prediction. When a branch misprediction occurs, the results of
instructions that were decoded and executed down the mispredicted path are canceled. If a performance counter
was set up to count all executed instructions, the count would include instructions whose results were canceled as
well as those whose results committed to architectural state.
To provide finer granularity in event counting in these situations, the performance monitoring facilities provided in
the Pentium 4 and Intel Xeon processors provide a mechanism for tagging events and then counting only those
tagged events that represent committed results. This mechanism is called “at-retirement counting.”
Tables 19-29 throug h 19-33 list predefined at-retirement events and event metrics that can be used to for tagging
events when using at retirement counting. The following terminology is used in describing at-retirement counting:

•

Bogus, non-bogus, retire — In at-retirement event descriptions, the term “bogus” refers to instructions or
μops that must be canceled because they are on a path taken from a mispredicted branch. The terms “retired”

and “non-bogus” refer to instructions or μops along the path that results in committed architectural state

changes as required by the program being executed. Thus instructions and μops are either bogus or non-

bogus, but not both. Several of the Pentium 4 and Intel Xeon processors’ performance monitoring events (such
as, Instruction_Retired and Uops_Retired in Table 19-29) can count instructions or μops that are retired based

on the characterization of bogus” versus non-bogus.