
Vol. 3B 19-3

PERFORMANCE-MONITORING EVENTS

19.2 PERFORMANCE MONITORING EVENTS FOR 6TH GENERATION INTEL® CORE™ PROCESSOR

6th Generation Intel® Core™ processors are based on the Skylake microarchitecture. They support the architectural performance-monitoring events listed in Table 19-1. Fixed counters in the core PMU support the architectural events defined in Table 19-2. Non-architectural performance-monitoring events in the processor core are listed in Table 19-3. The events in Table 19-3 apply to processors with a CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_4EH and 06_5EH. Table 19-8 lists performance events supporting Intel TSX (see Section 18.11.5); these events are applicable to processors based on the Skylake microarchitecture. Where the Skylake microarchitecture implements TSX-related event semantics that differ from Table 19-8, they are listed in Table 19-4.
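
The DisplayFamily_DisplayModel check described above can be sketched as follows. This is a minimal illustration, assuming the caller has already obtained the EAX value returned by CPUID leaf 01H; the function names and the sample EAX values are hypothetical and not part of this manual.

```python
# Sketch: decode DisplayFamily_DisplayModel from CPUID.01H:EAX and test
# whether the Table 19-3 events apply. Names here are illustrative only.

# Signatures named in the text: 06_4EH and 06_5EH.
TABLE_19_3_SIGNATURES = {(0x06, 0x4E), (0x06, 0x5E)}


def display_family_model(eax):
    """Return (DisplayFamily, DisplayModel) for a CPUID.01H:EAX value."""
    model_id = (eax >> 4) & 0xF         # bits 7:4
    family_id = (eax >> 8) & 0xF        # bits 11:8
    ext_model_id = (eax >> 16) & 0xF    # bits 19:16
    ext_family_id = (eax >> 20) & 0xFF  # bits 27:20

    # The extended family is added only when the base family is 0FH.
    if family_id == 0xF:
        display_family = family_id + ext_family_id
    else:
        display_family = family_id

    # The extended model is prepended only for families 06H and 0FH.
    if family_id in (0x6, 0xF):
        display_model = (ext_model_id << 4) | model_id
    else:
        display_model = model_id

    return display_family, display_model


def table_19_3_applies(eax):
    """True if the signature is one of those listed for Table 19-3."""
    return display_family_model(eax) in TABLE_19_3_SIGNATURES


if __name__ == "__main__":
    sample_eax = 0x000506E3  # hypothetical input; decodes to 06_5EH
    family, model = display_family_model(sample_eax)
    print(f"{family:02X}_{model:02X}H", table_19_3_applies(sample_eax))
```

On a real system the EAX value would come from executing CPUID with EAX = 01H; how that is obtained is outside the scope of this sketch.
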
The comment column in Table 19-3 uses abbreviated letters to indicate additional conditions applicable to the 
Event Mask Mnemonic. For event umasks listed in Table 19-3 that do not show “AnyT”, users should refrain from 
programming “AnyThread =1” in IA32_PERF_EVTSELx.
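
The AnyThread restriction above could be enforced in software when composing an event-select value. The sketch below assumes the standard event-select layout (event select in bits 7:0, umask in bits 15:8, USR bit 16, OS bit 17, AnyThread bit 21, EN bit 22, CMASK in bits 31:24). The function name and the `any_thread_allowed` parameter are hypothetical; the caller is expected to set that parameter only for umasks whose comment column shows "AnyT".

```python
# Sketch: compose an event-select register value and refuse AnyThread = 1
# for events that are not marked "AnyT". Names are illustrative only.

USR_BIT = 1 << 16         # count in user mode (CPL > 0)
OS_BIT = 1 << 17          # count in kernel mode (CPL = 0)
ANY_THREAD_BIT = 1 << 21  # AnyThread
ENABLE_BIT = 1 << 22      # EN


def encode_event_select(event, umask, *, cmask=0, usr=True, os=True,
                        any_thread=False, any_thread_allowed=False):
    """Return the 32-bit event-select value for one counter."""
    if any_thread and not any_thread_allowed:
        raise ValueError("AnyThread = 1 requested for an event without AnyT")

    value = (event & 0xFF) | ((umask & 0xFF) << 8) | ((cmask & 0xFF) << 24)
    if usr:
        value |= USR_BIT
    if os:
        value |= OS_BIT
    if any_thread:
        value |= ANY_THREAD_BIT
    return value | ENABLE_BIT


if __name__ == "__main__":
    # LD_BLOCKS.STORE_FORWARD: event 03H, umask 02H.
    print(hex(encode_event_select(0x03, 0x02)))
```

Raising an error rather than silently clearing the bit is a design choice made here so that a misconfigured request is visible to the caller.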

| IA32_PERF_FIXED_CTR2 | 30BH | CPU_CLK_UNHALTED.REF | This event counts the number of reference cycles when the core is not in a halt state and not in a TM stop-clock state. The core enters the halt state when it is running the HLT instruction or the MWAIT instruction. This event is not affected by core frequency changes (e.g., P states) but counts at the same frequency as the time stamp counter. This event can approximate elapsed time while the core was not in a halt state and not in a TM stop-clock state. |
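
Because CPU_CLK_UNHALTED.REF counts at the same frequency as the time stamp counter, a delta of this counter can be converted into approximate non-halted time. The sketch below assumes the caller already has two counter readings and knows the TSC frequency; the function names and the sample numbers are hypothetical.

```python
# Sketch: approximate non-halted time from CPU_CLK_UNHALTED.REF deltas.
# Names and sample values are illustrative only.


def unhalted_seconds(ref_cycles_delta, tsc_hz):
    """Approximate seconds the core spent outside halt/stop-clock states."""
    if tsc_hz <= 0:
        raise ValueError("tsc_hz must be positive")
    return ref_cycles_delta / tsc_hz


def unhalted_fraction(ref_cycles_delta, tsc_delta):
    """Share of the interval (measured in TSC ticks) spent not halted."""
    if tsc_delta <= 0:
        raise ValueError("tsc_delta must be positive")
    return ref_cycles_delta / tsc_delta


if __name__ == "__main__":
    # Assumed readings: 1.5e9 reference cycles during 3.0e9 TSC ticks
    # on a part whose TSC runs at 3.0 GHz.
    print(unhalted_seconds(1_500_000_000, 3.0e9))           # seconds
    print(unhalted_fraction(1_500_000_000, 3_000_000_000))  # fraction
```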

Table 19-3. Non-Architectural Performance Events of the Processor Core Supported by Skylake Microarchitecture

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 03H | 02H | LD_BLOCKS.STORE_FORWARD | Loads blocked by overlapping with store buffer that cannot be forwarded. | |
| 03H | 08H | LD_BLOCKS.NO_SR | The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. | |
| 07H | 01H | LD_BLOCKS_PARTIAL.ADDRESS_ALIAS | False dependencies in MOB due to partial compare on address. | |
| 08H | 01H | DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK | Load misses in all TLB levels that cause a page walk of any page size. | |
| 08H | 0EH | DTLB_LOAD_MISSES.WALK_COMPLETED | Load misses in all TLB levels that cause a page walk that completes. (All page sizes.) | |
| 08H | 10H | DTLB_LOAD_MISSES.WALK_PENDING | Counts 1 per cycle for each PMH that is busy with a page walk for a load. | |
| 08H | 10H | DTLB_LOAD_MISSES.WALK_ACTIVE | Cycles when at least one PMH is busy with a walk for a load. | CMSK1 |
| 08H | 20H | DTLB_LOAD_MISSES.STLB_HIT | Loads that miss the DTLB but hit STLB. | |
| 0DH | 01H | INT_MISC.RECOVERY_CYCLES | Core cycles the allocator was stalled due to recovery from earlier machine clear event for this thread (for example, misprediction or memory order conflict). | |
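
The rows above can be turned into event specifications for a profiling tool. The sketch below is one possible mapping, assuming the `cpu/event=...,umask=...,cmask=.../` raw-event syntax of the Linux perf tool, which is not described in this manual. The dictionary covers only a few rows, the helper name is hypothetical, and CMSK1 in the comment column is read as a counter mask of 1.

```python
# Sketch: map a few Table 19-3 rows to Linux perf raw-event strings.
# The perf syntax and all names here are assumptions for illustration.

# mnemonic -> (event number, umask value, counter mask)
TABLE_19_3_SAMPLE = {
    "LD_BLOCKS.STORE_FORWARD": (0x03, 0x02, 0),
    "LD_BLOCKS.NO_SR": (0x03, 0x08, 0),
    "DTLB_LOAD_MISSES.WALK_PENDING": (0x08, 0x10, 0),
    "DTLB_LOAD_MISSES.WALK_ACTIVE": (0x08, 0x10, 1),  # CMSK1
    "INT_MISC.RECOVERY_CYCLES": (0x0D, 0x01, 0),
}


def perf_event_spec(mnemonic):
    """Return a perf-style raw event string for a known mnemonic."""
    event, umask, cmask = TABLE_19_3_SAMPLE[mnemonic]
    spec = f"event=0x{event:02x},umask=0x{umask:02x}"
    if cmask:
        spec += f",cmask={cmask}"
    return f"cpu/{spec}/"


if __name__ == "__main__":
    for name in TABLE_19_3_SAMPLE:
        print(name, perf_event_spec(name))
```

WALK_PENDING and WALK_ACTIVE share the same event number and umask; only the counter mask distinguishes them, which is why the mask is carried in the table entry rather than passed separately.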

Table 19-2.  Fixed-Function Performance Counter and Pre-defined Performance Events (Contd.)

| Fixed-Function Performance Counter | Address | Event Mask Mnemonic | Description |