background image

18-6 Vol. 3B

PERFORMANCE MONITORING

This event counts mispredicted branch instructions at retirement. It counts the retirement of the last micro-op 
of a branch instruction in the architectural path of execution and experienced misprediction in the branch 
prediction hardware. 
Branch prediction hardware is implementation-specific across microarchitectures; value comparison to 
estimate performance differences is not recommended. 

NOTE

Programming decisions or software precisians on functionality should not be based on the event 
values or dependent on the existence of performance monitoring events.

18.2.2 

Architectural Performance Monitoring Version 2

The enhanced features provided by architectural performance monitoring version 2 include the following:

Fixed-function performance counter register and associated control register — Three of the architec-
tural performance events are counted using three fixed-function MSRs (IA32_FIXED_CTR0 through 
IA32_FIXED_CTR2). Each of the fixed-function PMC can count only one architectural performance event. 
Configuring the fixed-function PMCs is done by writing to bit fields in the MSR (IA32_FIXED_CTR_CTRL) located 
at address 38DH. Unlike configuring performance events for general-purpose PMCs (IA32_PMCx) via UMASK 
field in (IA32_PERFEVTSELx), configuring, programming IA32_FIXED_CTR_CTRL for fixed-function PMCs do 
not require any UMASK.

Simplified event programming — Most frequent operation in programming performance events are 
enabling/disabling event counting and checking the status of counter overflows. Architectural performance 
event version 2 provides three architectural MSRs:
— IA32_PERF_GLOBAL_CTRL allows software to enable/disable event counting of all or any combination of 

fixed-function PMCs (IA32_FIXED_CTRx) or any general-purpose PMCs via a single WRMSR.

— IA32_PERF_GLOBAL_STATUS allows software to query counter overflow conditions on any combination of 

fixed-function PMCs or general-purpose PMCs via a single RDMSR.

— IA32_PERF_GLOBAL_OVF_CTRL allows software to clear counter overflow conditions on any combination of 

fixed-function PMCs or general-purpose PMCs via a single WRMSR.

PMI Overhead Mitigation — Architectural performance monitoring version 2 introduces two bit field interface 
in IA32_DEBUGCTL for PMI service routine to accumulate performance monitoring data and LBR records with 
reduced perturbation from servicing the PMI. The two bit fields are:
— IA32_DEBUGCTL.Freeze_LBR_On_PMI(bit 11). In architectural performance monitoring version 2, only the 

legacy semantic behavior is supported. See Section 17.4.7 for details of the legacy Freeze LBRs on PMI 
control.

— IA32_DEBUGCTL.Freeze_PerfMon_On_PMI(bit 12). In architectural performance monitoring version 2, only 

the legacy semantic behavior is supported. See Section 17.4.7 for details of the legacy Freeze LBRs on PMI 
control.

The facilities provided by architectural performance monitoring version 2 can be queried from CPUID leaf 0AH by 
examining the content of register EDX:

Bits 0 through 4 of CPUID.0AH.EDX indicates the number of fixed-function performance counters available per 
core,

Bits 5 through 12 of CPUID.0AH.EDX indicates the bit-width of fixed-function performance counters. Bits 
beyond the width of the fixed-function counter are reserved and must be written as zeros.

NOTE

Early generation of processors based on Intel Core microarchitecture may report in 
CPUID.0AH:EDX of support for version 2 but indicating incorrect information of version 2 facilities.

The IA32_FIXED_CTR_CTRL MSR include multiple sets of 4-bit field, each 4 bit field controls the operation of a 
fixed-function performance counter. Figure 18-2 shows the layout of 4-bit controls for each fixed-function PMC. 
Two sub-fields are currently defined within each control. The definitions of the bit fields are: