
Vol. 3B 18-35

PERFORMANCE MONITORING

18.8.1   Enhancements of Performance Monitoring in the Processor Core

The notable enhancements in the monitoring of performance events in the processor core include:

Four general-purpose performance counters, IA32_PMCx, with associated counter configuration MSRs, 
IA32_PERFEVTSELx, and a global counter control MSR that supports simplified control of the four counters. 
Each of the four performance counters can support processor event based sampling (PEBS) and 
thread-qualification of architectural and non-architectural performance events. The width of IA32_PMCx 
supported by hardware has been increased: the counter width reported by CPUID.0AH:EAX[23:16] is 48 bits. 
The PEBS facility in Intel microarchitecture code name Nehalem has been enhanced to include a new data 
format that captures additional information, such as load latency.
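
As an illustration of how the counter width above is read, the following minimal sketch splits a CPUID.0AH:EAX value into its fields. The helper names and the sample EAX value are hypothetical and chosen only for illustration; on real hardware the value would come from executing CPUID with EAX = 0AH.

```python
def decode_cpuid_0ah_eax(eax: int) -> dict:
    """Split a CPUID.0AH:EAX value into its performance monitoring fields."""
    return {
        "version": eax & 0xFF,                  # EAX[7:0]: perfmon version ID
        "num_gp_counters": (eax >> 8) & 0xFF,   # EAX[15:8]: general-purpose counters per logical processor
        "counter_width": (eax >> 16) & 0xFF,    # EAX[23:16]: bit width of IA32_PMCx
    }


def max_count(width: int) -> int:
    """Largest value a counter of the given bit width can hold before it wraps."""
    return (1 << width) - 1


# Hypothetical sample value (assumption): version 3, four counters, 48-bit width.
SAMPLE_EAX = 0x07300403
info = decode_cpuid_0ah_eax(SAMPLE_EAX)
```

With this sample value, `info["counter_width"]` is 48 and `max_count(48)` is 0xFFFFFFFFFFFF.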

Load latency sampling facility. The average latency of memory load operations can be sampled using the 
load-latency facility in processors based on Intel microarchitecture code name Nehalem. The facility measures 
load latency from the load's first dispatch until final data writeback from the memory subsystem. The latency 
is reported for retired demand load operations, in core cycles (it accounts for re-dispatches). This facility is 
used in conjunction with the PEBS facility.
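
Because the latency is reported in core cycles, post-processing usually averages the sampled values and, if wall-clock time is wanted, converts them using the core clock. The sketch below assumes the sampled latencies have already been extracted from PEBS records into a list and that the core frequency was constant while sampling; both function names are hypothetical.

```python
def average_latency_cycles(samples: list[int]) -> float:
    """Mean of the sampled load latencies, in core cycles."""
    if not samples:
        raise ValueError("no latency samples collected")
    return sum(samples) / len(samples)


def cycles_to_ns(cycles: float, core_ghz: float) -> float:
    """Convert core cycles to nanoseconds, assuming a constant core frequency in GHz."""
    return cycles / core_ghz
```

For example, samples of 10, 20, and 30 cycles average to 20 cycles, which corresponds to 10 ns at an assumed 2 GHz core clock.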

Off-core response counting facility. This facility in the processor core allows software to count certain 
transaction responses between the processor core and subsystems outside the processor core (uncore). 
Counting off-core responses requires an additional event-qualification configuration facility, used in 
conjunction with IA32_PERFEVTSELx. Two off-core response MSRs are provided for use with specific event 
codes that must be specified in IA32_PERFEVTSELx.
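
The pairing of an event code in IA32_PERFEVTSELx with a separate off-core response MSR can be sketched as follows. The encoder uses the architectural IA32_PERFEVTSELx field positions (event select in bits 7:0, unit mask in bits 15:8, USR bit 16, OS bit 17, INT bit 20, EN bit 22, counter mask in bits 31:24). The event code B7H with unit mask 01H is an assumption shown for illustration and should be checked against the event tables for the target processor; the off-core response MSR contents are not modeled here.

```python
def perfevtsel(event: int, umask: int, usr: bool = True, os: bool = True,
               enable: bool = True, int_on_overflow: bool = False,
               cmask: int = 0) -> int:
    """Compose an IA32_PERFEVTSELx value from its architectural fields."""
    for name, field in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= field <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")
    value = event | (umask << 8) | (cmask << 24)
    if usr:
        value |= 1 << 16   # count while in user mode (CPL > 0)
    if os:
        value |= 1 << 17   # count while in ring 0
    if int_on_overflow:
        value |= 1 << 20   # raise an interrupt on counter overflow
    if enable:
        value |= 1 << 22   # enable the counter
    return value


# Assumed off-core response event code and unit mask (illustrative only).
OFFCORE_EVENT, OFFCORE_UMASK = 0xB7, 0x01
evtsel_value = perfevtsel(OFFCORE_EVENT, OFFCORE_UMASK)
```

Software would write `evtsel_value` to the chosen IA32_PERFEVTSELx and, separately, program the matching off-core response MSR with the request and response qualification it wants counted.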

18.8.1.1   Processor Event Based Sampling (PEBS)

All four general-purpose performance counters, IA32_PMCx, can be used for PEBS if the performance event 
supports PEBS. Software uses IA32_MISC_ENABLE[7] and IA32_MISC_ENABLE[12] to detect whether the 
performance monitoring facility and PEBS functionality are supported in the processor. The MSR 
IA32_PEBS_ENABLE provides 4 bits that software must use to select which IA32_PMCx overflow conditions 
cause a PEBS record to be captured.

Additionally, the PEBS record is expanded to allow latency information to be captured. The MSR 
IA32_PEBS_ENABLE provides 4 additional bits that software must use to enable latency data recording in the 
PEBS record upon the respective IA32_PMCx overflow condition. The layout of IA32_PEBS_ENABLE for 
processors based on Intel microarchitecture code name Nehalem is shown in Figure 18-21.
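
The detection and enable steps described above can be sketched as below. Two details are assumptions not stated in this text: the polarity of IA32_MISC_ENABLE[12] (treated here as a "PEBS unavailable" flag, so 0 means supported), and the bit positions in IA32_PEBS_ENABLE (PEBS enables in bits 3:0, load-latency enables in bits 35:32), which should be confirmed against Figure 18-21. The function names are hypothetical, and reading or writing the MSRs themselves is outside the scope of the sketch.

```python
PERFMON_AVAILABLE_BIT = 7    # IA32_MISC_ENABLE[7]
PEBS_UNAVAILABLE_BIT = 12    # IA32_MISC_ENABLE[12]; assumed 1 = PEBS not supported
LOAD_LATENCY_SHIFT = 32      # assumed position of the latency enable bits


def pebs_supported(misc_enable: int) -> bool:
    """True if performance monitoring is available and PEBS is not flagged unavailable."""
    perfmon = (misc_enable >> PERFMON_AVAILABLE_BIT) & 1
    pebs_unavailable = (misc_enable >> PEBS_UNAVAILABLE_BIT) & 1
    return bool(perfmon) and not pebs_unavailable


def pebs_enable_value(pebs_counters, latency_counters=()) -> int:
    """Build an IA32_PEBS_ENABLE value for the given IA32_PMCx indices (0 to 3)."""
    value = 0
    for n in pebs_counters:
        if not 0 <= n <= 3:
            raise ValueError("counter index must be 0..3")
        value |= 1 << n                          # capture a PEBS record on PMCn overflow
    for n in latency_counters:
        if not 0 <= n <= 3:
            raise ValueError("counter index must be 0..3")
        value |= 1 << (LOAD_LATENCY_SHIFT + n)   # record latency data for PMCn
    return value
```

For instance, `pebs_enable_value([0], [0])` yields 0x100000001, which enables PEBS and latency recording on IA32_PMC0 under the assumed layout.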

Figure 18-20.  IA32_PERF_GLOBAL_STATUS MSR 

[Figure: 64-bit register layout. Labeled fields: CHG (R/W); OVF_PMI (R/W); OVF_FC2, OVF_FC1, OVF_FC0 (R/O); 
OVF_PC7 (R/O), if CCNT>7; OVF_PC6 (R/O), if CCNT>6; OVF_PC5 (R/O), if CCNT>5; OVF_PC4 (R/O), if CCNT>4; 
OVF_PC3, OVF_PC2, OVF_PC1, OVF_PC0 (R/O); remaining bits Reserved. Bit positions marked in the figure: 
63, 62, 61, 60, 35, 34, 33, 32, 31, and 8 through 0.]

RESET Value: 00000000_00000000H

CCNT: CPUID.0AH:EAX[15:8]
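
A value read from IA32_PERF_GLOBAL_STATUS can be decoded along the lines of the figure. This minimal sketch assumes the general-purpose overflow flags OVF_PCn occupy bits 0 and up, and the fixed-function flags OVF_FC0 through OVF_FC2 occupy bits 32 through 34, as the bit markings in the figure suggest; the function name and the `num_gp` parameter (standing in for CCNT) are hypothetical.

```python
def overflowed_counters(status: int, num_gp: int = 4):
    """Return (general-purpose indices, fixed-function indices) whose overflow flag is set."""
    if not 0 <= num_gp <= 8:
        raise ValueError("num_gp must be between 0 and 8")
    gp = [i for i in range(num_gp) if (status >> i) & 1]         # OVF_PC0 .. OVF_PC(num_gp - 1)
    fixed = [i for i in range(3) if (status >> (32 + i)) & 1]    # OVF_FC0 .. OVF_FC2
    return gp, fixed
```

Passing the counter count reported by CPUID.0AH:EAX[15:8] as `num_gp` keeps the decoder from reporting flags for counters the processor does not implement.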