PERFORMANCE-MONITORING EVENTS

19.7 PERFORMANCE MONITORING EVENTS FOR INTEL® CORE™ I7 PROCESSOR FAMILY AND INTEL® XEON® PROCESSOR FAMILY

Processors based on the Intel microarchitecture code name Nehalem support the architectural and non-architectural
performance-monitoring events listed in Table 19-1 and Table 19-17. The events in Table 19-17 generally
apply to processors with a CPUID signature of DisplayFamily_DisplayModel encoding with the following values:
06_1AH, 06_1EH, 06_1FH, and 06_2EH. However, Intel Xeon processors with CPUID signature of
DisplayFamily_DisplayModel 06_2EH have a small number of events that are not supported in processors with
CPUID signature 06_1AH, 06_1EH, and 06_1FH. These events are noted in the comment column.
In addition, these processors (CPUID signature of DisplayFamily_DisplayModel 06_1AH, 06_1EH, 06_1FH) also
support the non-architectural, product-specific uncore performance-monitoring events listed in Table 19-18.
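
The DisplayFamily_DisplayModel signatures above are derived from the value that CPUID leaf 01H returns in EAX. As a minimal sketch, the following Python decodes a raw EAX value into that notation and checks it against the signatures named in this section. The function names and the sample EAX value are illustrative assumptions, not part of this section; the bit-field positions follow the family/model encoding documented for the CPUID instruction.

```python
# Signatures that this section associates with Table 19-17.
NEHALEM_SIGNATURES = {"06_1AH", "06_1EH", "06_1FH", "06_2EH"}


def display_signature(eax: int) -> str:
    """Decode a CPUID.01H:EAX value into DisplayFamily_DisplayModel notation."""
    model = (eax >> 4) & 0xF          # bits 7:4
    family = (eax >> 8) & 0xF         # bits 11:8
    ext_model = (eax >> 16) & 0xF     # bits 19:16
    ext_family = (eax >> 20) & 0xFF   # bits 27:20

    # The extended family is added only when the base family is 0FH.
    display_family = family + ext_family if family == 0xF else family
    # The extended model is used only for base families 06H and 0FH.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model
    return f"{display_family:02X}_{display_model:02X}H"


def uses_table_19_17(eax: int) -> bool:
    """Return True if the signature is one this section lists for Table 19-17."""
    return display_signature(eax) in NEHALEM_SIGNATURES


if __name__ == "__main__":
    sample_eax = 0x000106A5  # hypothetical raw value, used only for illustration
    print(display_signature(sample_eax), uses_table_19_17(sample_eax))
```

The set of signatures is kept separate from the decoder so that the 06_2EH-only events flagged in the comment column can be handled with an additional check if needed.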

Fixed counters in the core PMU support the architectural events defined in Table 19-2.

Table 19-16 (Contd.)

| Event Num.¹ | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 83H | 01H | UNC_ARB_COH_TRK_OCCUPANCY.ALL | Cycles weighted by number of requests pending in Coherency Tracker. | Counter 0 only. |
| 84H | 01H | UNC_ARB_COH_TRK_REQUEST.ALL | Number of requests allocated in Coherency Tracker. | |

NOTES:

1. The uncore events must be programmed using MSRs located in specific performance monitoring units in the uncore. UNC_CBO* events are supported using MSR_UNC_CBO* MSRs; UNC_ARB* events are supported using MSR_UNC_ARB* MSRs.

Table 19-17. Non-Architectural Performance Events In the Processor Core for Intel® Core™ i7 Processor and Intel® Xeon® Processor 5500 Series

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 04H | 07H | SB_DRAIN.ANY | Counts the number of store buffer drains. | |
| 06H | 04H | STORE_BLOCKS.AT_RET | Counts number of loads delayed with at-Retirement block code. The following loads need to be executed at retirement and wait for all senior stores on the same thread to be drained: load splitting across 4K boundary (page split), load accessing uncacheable (UC or WC) memory, load lock, and load with page table in UC or WC memory region. | |
| 06H | 08H | STORE_BLOCKS.L1D_BLOCK | Cacheable loads delayed with L1D block code. | |
| 07H | 01H | PARTIAL_ADDRESS_ALIAS | Counts false dependency due to partial address aliasing. | |
| 08H | 01H | DTLB_LOAD_MISSES.ANY | Counts all load misses that cause a page walk. | |
| 08H | 02H | DTLB_LOAD_MISSES.WALK_COMPLETED | Counts number of completed page walks due to load miss in the STLB. | |
| 08H | 10H | DTLB_LOAD_MISSES.STLB_HIT | Number of cache load STLB hits. | |
| 08H | 20H | DTLB_LOAD_MISSES.PDE_MISS | Number of DTLB cache load misses where the low part of the linear to physical address translation was missed. | |
| 08H | 80H | DTLB_LOAD_MISSES.LARGE_WALK_COMPLETED | Counts number of completed large page walks due to load miss in the STLB. | |
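
Each row of Table 19-17 gives an event number and a unit mask. To count an event on a general-purpose counter, the two values are combined into an event-select register value. As a hedged sketch, the Python below packs them using the architectural IA32_PERFEVTSELx bit layout (event select in bits 7:0, unit mask in bits 15:8, USR in bit 16, OS in bit 17, EN in bit 22). That layout comes from the architectural performance-monitoring description rather than from this section, and the helper names and the small event dictionary are illustrative assumptions.

```python
# A few (event number, umask) pairs copied from Table 19-17.
EVENTS = {
    "SB_DRAIN.ANY": (0x04, 0x07),
    "STORE_BLOCKS.AT_RET": (0x06, 0x04),
    "DTLB_LOAD_MISSES.ANY": (0x08, 0x01),
    "DTLB_LOAD_MISSES.STLB_HIT": (0x08, 0x10),
}

USR = 1 << 16  # count while running at privilege levels 1 through 3
OS = 1 << 17   # count while running at privilege level 0
EN = 1 << 22   # enable the counter


def raw_code(name: str) -> int:
    """Combine umask and event number into the low 16 bits of the selector."""
    event, umask = EVENTS[name]
    return (umask << 8) | event


def perfevtsel(name: str, user: bool = True, kernel: bool = True) -> int:
    """Build an enabled event-select value for the named event (hypothetical helper)."""
    value = raw_code(name) | EN
    if user:
        value |= USR
    if kernel:
        value |= OS
    return value


if __name__ == "__main__":
    for name in EVENTS:
        print(f"{name:28s} raw=0x{raw_code(name):04X} "
              f"evtsel=0x{perfevtsel(name):08X}")
```

The sketch only computes the register value; writing it to an MSR requires privileged access and is not shown. The remaining selector fields (edge detect, invert, counter mask) are left at zero here.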

Table 19-16. Non-Architectural Performance Events In the Processor Uncore for 2nd Generation Intel® Core™ i7-2xxx, Intel® Core™ i5-2xxx, Intel® Core™ i3-2xxx Processor Series (Contd.)

| Event Num.¹ | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|