background image

Vol. 3B 18-95

PERFORMANCE MONITORING

18.15.5.8   Generating an Interrupt on Overflow

Any performance counter can be configured to generate a performance monitor interrupt (PMI) if the counter over-
flows. The PMI interrupt service routine can then collect information about the state of the processor or program 
when overflow occurred. This information can then be used with a tool like the Intel

®

 

VTune™ Performance 

Analyzer to analyze and tune program performance.
To enable an interrupt on counter overflow, the OVR_PMI flag in the counter’s associated CCCR MSR must be set. 
When overflow occurs, a PMI is generated through the local APIC. (Here, the performance counter entry in the local 
vector table [LVT] is set up to deliver the interrupt generated by the PMI to the processor.)
The PMI service routine can use the OVF flag to determine which counter overflowed when multiple counters have 
been configured to generate PMIs. Also, note that these processors mask PMIs upon receiving an interrupt. Clear 
this condition before leaving the interrupt handler.
When generating interrupts on overflow, the performance counter being used should be preset to value that will 
cause an overflow after a specified number of events are counted plus 1. The simplest way to select the preset 
value is to write a negative number into the counter, as described in Section 18.15.5.6, “Cascading Counters.” 
Here, however, if an interrupt is to be generated after 100 event counts, the counter should be preset to minus 100 
plus 1 (-100 + 1), or -99. The counter will then overflow after it counts 99 events and generate an interrupt on the 
next (100th) event counted. The difference of 1 for this count enables the interrupt to be generated immediately 
after the selected event count has been reached, instead of waiting for the overflow to be propagation through the 
counter.
Because of latency in the microarchitecture between the generation of events and the generation of interrupts on 
overflow, it is sometimes difficult to generate an interrupt close to an event that caused it. In these situations, the 
FORCE_OVF flag in the CCCR can be used to improve reporting. Setting this flag causes the counter to overflow on 
every counter increment, which in turn triggers an interrupt after every counter increment.

18.15.5.9   Counter Usage Guideline

There are some instances where the user must take care to configure counting logic properly, so that it is not 
powered down. To use any ESCR, even when it is being used just for tagging, (any) one of the counters that the 
particular ESCR (or its paired ESCR) can be connected to should be enabled. If this is not done, 0 counts may 
result. Likewise, to use any counter, there must be some event selected in a corresponding ESCR (other than 
no_event, which generally has a select value of 0). 

18.15.6 At-Retirement 

Counting

At-retirement counting provides a means counting only events that represent work committed to architectural 
state and ignoring work that was performed speculatively and later discarded.
One example of this speculative activity is branch prediction. When a branch misprediction occurs, the results of 
instructions that were decoded and executed down the mispredicted path are canceled. If a performance counter 
was set up to count all executed instructions, the count would include instructions whose results were canceled as 
well as those whose results committed to architectural state.
To provide finer granularity in event counting in these situations, the performance monitoring facilities provided in 
the Pentium 4 and Intel Xeon processors provide a mechanism for tagging events and then counting only those 
tagged events that represent committed results. This mechanism is called “at-retirement counting.” 
Tables 19-29 through 19-33 list predefined at-retirement events and event metrics that can be used to for tagging 
events when using at retirement counting. The following terminology is used in describing at-retirement counting:

Bogus, non-bogus, retire — In at-retirement event descriptions, the term “bogus” refers to instructions or 
μops that must be canceled because they are on a path taken from a mispredicted branch. The terms “retired” 

and “non-bogus” refer to instructions or μops along the path that results in committed architectural state 

changes as required by the program being executed. Thus instructions and μops are either bogus or non-

bogus, but not both. Several of the Pentium 4 and Intel Xeon processors’ performance monitoring events (such 
as, Instruction_Retired and Uops_Retired in Table 19-29) can count instructions or μops that are retired based 

on the characterization of bogus” versus non-bogus.