
19-126 Vol. 3B

PERFORMANCE-MONITORING EVENTS

42H | 10H | L1D_CACHE_LOCK_DURATION | Duration of L1 data cacheable locked operation. | This event counts the number of cycles during which any cache line is locked by any locking instruction. Locking happens at retirement and therefore the event does not occur for instructions that are speculatively executed. Locking duration is shorter than locked instruction execution duration.

43H | 01H | L1D_ALL_REF | All references to the L1 data cache. | This event counts all references to the L1 data cache, including all loads and stores with any memory types. The event counts memory accesses only when they are actually performed. For example, a load blocked by unknown store address and later performed is only counted once. The event includes non-cacheable accesses, such as I/O accesses.

43H | 02H | L1D_ALL_CACHE_REF | L1 Data cacheable reads and writes. | This event counts the number of data reads and writes from cacheable memory, including locked operations. This event is a sum of: L1D_CACHE_LD.MESI; L1D_CACHE_ST.MESI; L1D_CACHE_LOCK.MESI.

45H | 0FH | L1D_REPL | Cache lines allocated in the L1 data cache. | This event counts the number of lines brought into the L1 data cache.

46H | 00H | L1D_M_REPL | Modified cache lines allocated in the L1 data cache. | This event counts the number of modified lines brought into the L1 data cache.

47H | 00H | L1D_M_EVICT | Modified cache lines evicted from the L1 data cache. | This event counts the number of modified lines evicted from the L1 data cache, whether due to replacement or by snoop HITM intervention.

48H | 00H | L1D_PEND_MISS | Total number of outstanding L1 data cache misses at any cycle. | This event counts the number of outstanding L1 data cache misses at any cycle. An L1 data cache miss is outstanding from the cycle on which the miss is determined until the first chunk of data is available. This event counts: all cacheable demand requests; L1 data cache hardware prefetch requests; requests to write-through memory; requests to write-combining memory. Uncacheable requests are not counted. The count of this event divided by the number of L1 data cache misses, L1D_REPL, is the average duration in core cycles of an L1 data cache miss.

49H | 01H | L1D_SPLIT.LOADS | Cache line split loads from the L1 data cache. | This event counts the number of load operations that span two cache lines. Such load operations are also called split loads. Split load operations are executed at retirement.

49H | 02H | L1D_SPLIT.STORES | Cache line split stores to the L1 data cache. | This event counts the number of store operations that span two cache lines.

4BH | 00H | SSE_PRE_MISS.NTA | Streaming SIMD Extensions (SSE) Prefetch NTA instructions missing all cache levels. | This event counts the number of times the SSE instruction PREFETCHNTA was executed and missed all cache levels. Due to speculation, an executed instruction might not retire. This instruction prefetches the data to the L1 data cache.

Table 19-23.  Non-Architectural Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.)

Event Num | Umask Value | Event Name | Definition | Description and Comment
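
The derived quantities mentioned in the descriptions above can be sketched in a few lines of Python. This is a minimal illustration, not part of the manual. The function names and the `L1D_EVENTS` dictionary are hypothetical, and the sample counts in the usage note are invented. The packing of event number into bits 0-7 and unit mask into bits 8-15 follows the usual IA32_PERFEVTSELx layout, which is an assumption here because this table does not state it.

```python
# Event number and unit mask pairs copied from the table rows (hex values).
# The dictionary name and its layout are illustrative only.
L1D_EVENTS = {
    "L1D_CACHE_LOCK_DURATION": (0x42, 0x10),
    "L1D_ALL_REF": (0x43, 0x01),
    "L1D_ALL_CACHE_REF": (0x43, 0x02),
    "L1D_REPL": (0x45, 0x0F),
    "L1D_PEND_MISS": (0x48, 0x00),
}


def raw_event_code(name):
    """Pack event number (bits 0-7) and unit mask (bits 8-15) into one value.

    Assumes the IA32_PERFEVTSELx field layout; control bits such as
    USR, OS and EN are deliberately left out.
    """
    event, umask = L1D_EVENTS[name]
    return (umask << 8) | event


def avg_l1d_miss_cycles(pend_miss, repl):
    """Average L1 data cache miss duration in core cycles.

    Implements the ratio from the L1D_PEND_MISS description:
    L1D_PEND_MISS divided by L1D_REPL.
    """
    if repl == 0:
        return None  # no lines were brought in, so the average is undefined
    return pend_miss / repl


def l1d_all_cache_ref(ld_mesi, st_mesi, lock_mesi):
    """Sum described for L1D_ALL_CACHE_REF: loads + stores + locked operations."""
    return ld_mesi + st_mesi + lock_mesi
```

With made-up counts, `avg_l1d_miss_cycles(12000, 400)` gives 30.0 cycles per miss, and `hex(raw_event_code("L1D_REPL"))` gives `0xf45`. Both counters should be collected over the same interval for the ratio to be meaningful.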