Vol. 3B 19-13

PERFORMANCE-MONITORING EVENTS

19.3 PERFORMANCE MONITORING EVENTS FOR THE INTEL® CORE™ M AND 5TH GENERATION INTEL® CORE™ PROCESSORS

The Intel® Core™ M processors, the 5th generation Intel® Core™ processors, and the Intel Xeon processor E3 1200 v4 product family are based on the Broadwell microarchitecture. They support the architectural performance-monitoring events listed in Table 19-1. Non-architectural performance-monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors whose CPUID signature has one of the following DisplayFamily_DisplayModel encodings: 06_3DH or 06_47H. Table 19-8 lists performance events supporting Intel TSX (see Section 18.11.5); these events are available on processors based on the Broadwell microarchitecture. Fixed counters in the core PMU support the architectural events defined in Table 19-2.

Non-architectural performance-monitoring events located in the uncore sub-system are implementation specific: they differ between platforms that use processors based on the Broadwell microarchitecture but have different DisplayFamily_DisplayModel signatures. Processors with a CPUID signature of DisplayFamily_DisplayModel 06_3DH or 06_47H support the uncore performance events listed in Table 19-9.
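The DisplayFamily_DisplayModel check described above can be sketched as follows. This is a minimal illustration, not part of the manual: it assumes the standard CPUID.01H:EAX field layout (model in bits 7:4, family in bits 11:8, extended model in bits 19:16, extended family in bits 27:20), and the function names and the sample EAX values are hypothetical choices made for the example.

```python
from typing import Tuple

# DisplayFamily_DisplayModel values that Table 19-5 applies to (from the text above).
TABLE_19_5_SIGNATURES = {(0x06, 0x3D), (0x06, 0x47)}


def display_family_model(eax: int) -> Tuple[int, int]:
    """Derive (DisplayFamily, DisplayModel) from a CPUID.01H:EAX value."""
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0FH.
    display_family = family + ext_family if family == 0xF else family
    # The extended model is prepended only for family 06H or 0FH.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model
    return display_family, display_model


def signature_string(eax: int) -> str:
    """Format the signature in the manual's notation, e.g. '06_3DH'."""
    fam, mod = display_family_model(eax)
    return f"{fam:02X}_{mod:02X}H"


def uses_table_19_5(eax: int) -> bool:
    """True if the signature is one of those listed for Table 19-5."""
    return display_family_model(eax) in TABLE_19_5_SIGNATURES


if __name__ == "__main__":
    # Illustrative EAX values only; read the real value with CPUID leaf 01H.
    for sample in (0x000306D4, 0x00040671, 0x000306C3):
        print(f"{sample:#010x}", signature_string(sample), uses_table_19_5(sample))
```

On a real system the EAX value would come from executing CPUID with EAX = 01H; the sketch only covers the decoding step.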

Table 19-5. Non-Architectural Performance Events of the Processor Core Supported by Broadwell Microarchitecture

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 03H | 02H | LD_BLOCKS.STORE_FORWARD | Loads blocked by overlapping with store buffer that cannot be forwarded. | |
| 03H | 08H | LD_BLOCKS.NO_SR | The number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use. | |
| 05H | 01H | MISALIGN_MEM_REF.LOADS | Speculative cache-line split load uops dispatched to L1D. | |
| 05H | 02H | MISALIGN_MEM_REF.STORES | Speculative cache-line split store-address uops dispatched to L1D. | |
| 07H | 01H | LD_BLOCKS_PARTIAL.ADDRESS_ALIAS | False dependencies in MOB due to partial compare on address. | |
| 08H | 01H | DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK | Load misses in all TLB levels that cause a page walk of any page size. | |
| 08H | 02H | DTLB_LOAD_MISSES.WALK_COMPLETED_4K | Completed page walks due to demand load misses that caused 4K page walks in any TLB levels. | |
| 08H | 10H | DTLB_LOAD_MISSES.WALK_DURATION | Cycles PMH is busy with a walk. | |
| 08H | 20H | DTLB_LOAD_MISSES.STLB_HIT_4K | Load misses that missed DTLB but hit STLB (4K). | |
| 0DH | 03H | INT_MISC.RECOVERY_CYCLES | Cycles waiting to recover after Machine Clears except JEClear. Set Cmask = 1. | Set Edge to count occurrences. |
| 0EH | 01H | UOPS_ISSUED.ANY | Increments each cycle by the number of uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles of this core. | Set Cmask = 1, Inv = 1 to count stalled cycles. |
| 0EH | 10H | UOPS_ISSUED.FLAGS_MERGE | Number of flags-merge uops allocated. Such uops add delay. | |
| 0EH | 20H | UOPS_ISSUED.SLOW_LEA | Number of slow LEA or similar uops allocated. Such a uop has 3 sources (for example, 2 sources + immediate) regardless of whether it is a result of an LEA instruction or not. | |
| 0EH | 40H | UOPS_ISSUED.SINGLE_MUL | Number of multiply packed/scalar single precision uops allocated. | |
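
The Event Num., Umask Value, and the Cmask, Inv, Edge, and Any settings mentioned in the rows above combine into a single event-select value. The sketch below packs them using the architectural IA32_PERFEVTSELx bit layout (event select in bits 7:0, unit mask in bits 15:8, USR bit 16, OS bit 17, Edge bit 18, AnyThread bit 21, EN bit 22, INV bit 23, CMASK in bits 31:24). The function name `perfevtsel` and the default of counting in both user and OS modes are assumptions made for illustration, not something this table specifies.

```python
def perfevtsel(event: int, umask: int, *, usr: bool = True, os: bool = True,
               edge: bool = False, any_thread: bool = False,
               inv: bool = False, cmask: int = 0, enable: bool = True) -> int:
    """Pack event-select fields into an IA32_PERFEVTSELx value (hypothetical helper)."""
    for name, value in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= value <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits, got {value:#x}")
    value = event | (umask << 8)        # bits 7:0 and 15:8
    value |= int(usr) << 16             # USR: count at CPL > 0 (assumed default)
    value |= int(os) << 17              # OS: count at CPL = 0 (assumed default)
    value |= int(edge) << 18            # Edge detect
    value |= int(any_thread) << 21      # AnyThread
    value |= int(enable) << 22          # EN: enable the counter
    value |= int(inv) << 23             # Invert the counter-mask comparison
    value |= cmask << 24                # Counter mask, bits 31:24
    return value


if __name__ == "__main__":
    # UOPS_ISSUED.ANY (0EH/01H) with Cmask = 1, Inv = 1: stalled cycles.
    stalled = perfevtsel(0x0E, 0x01, cmask=1, inv=True)
    # INT_MISC.RECOVERY_CYCLES (0DH/03H) with Cmask = 1, Edge = 1: occurrences.
    recoveries = perfevtsel(0x0D, 0x03, cmask=1, edge=True)
    print(f"{stalled:#010x}")     # 0x01c3010e
    print(f"{recoveries:#010x}")  # 0x0147030d
```

Writing the resulting value to a counter's event-select MSR requires ring-0 access or an OS interface, which is outside the scope of this sketch.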