PERFORMANCE-MONITORING EVENTS
42H | 10H | L1D_CACHE_LOCK_DURATION | Duration of L1 data cacheable locked operation.
This event counts the number of cycles during which any
cache line is locked by any locking instruction.
Locking happens at retirement and therefore the event does
not occur for instructions that are speculatively executed.
Locking duration is shorter than locked instruction execution
duration.
43H | 01H | L1D_ALL_REF | All references to the L1 data cache.
This event counts all references to the L1 data cache,
including all loads and stores with any memory types.
The event counts memory accesses only when they are
actually performed. For example, a load blocked by an unknown
store address and later performed is counted only once.
The event includes non-cacheable accesses, such as I/O
accesses.
43H | 02H | L1D_ALL_CACHE_REF | L1 data cacheable reads and writes.
This event counts the number of data reads and writes from
cacheable memory, including locked operations.
This event is a sum of:
• L1D_CACHE_LD.MESI
• L1D_CACHE_ST.MESI
• L1D_CACHE_LOCK.MESI
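The sum above can be illustrated with a minimal Python sketch that adds three counter values that have already been collected. The function name and the sample counts are hypothetical and chosen only for illustration; how the counters are programmed and read is outside the scope of this sketch.

```python
def l1d_all_cache_ref(cache_ld_mesi: int, cache_st_mesi: int, cache_lock_mesi: int) -> int:
    """Return the sum of L1D_CACHE_LD.MESI, L1D_CACHE_ST.MESI and L1D_CACHE_LOCK.MESI.

    Each argument is a raw count for the corresponding event (hypothetical inputs).
    """
    return cache_ld_mesi + cache_st_mesi + cache_lock_mesi


# Hypothetical sample counts, not measured data.
total = l1d_all_cache_ref(1_000, 400, 10)
```

Comparing a value computed this way against a directly measured L1D_ALL_CACHE_REF count may serve as a sanity check on a measurement setup, assuming all four events are collected over the same interval.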
45H | 0FH | L1D_REPL | Cache lines allocated in the L1 data cache.
This event counts the number of lines brought into the L1
data cache.
46H | 00H | L1D_M_REPL | Modified cache lines allocated in the L1 data cache.
This event counts the number of modified lines brought into
the L1 data cache.
47H | 00H | L1D_M_EVICT | Modified cache lines evicted from the L1 data cache.
This event counts the number of modified lines evicted from
the L1 data cache, whether due to replacement or by snoop
HITM intervention.
48H | 00H | L1D_PEND_MISS | Total number of outstanding L1 data cache misses at any cycle.
This event counts the number of outstanding L1 data cache
misses at any cycle. An L1 data cache miss is outstanding
from the cycle on which the miss is determined until the
first chunk of data is available. This event counts:
• All cacheable demand requests.
• L1 data cache hardware prefetch requests.
• Requests to write-through memory.
• Requests to write-combining memory.
Uncacheable requests are not counted. The count of this
event divided by the number of L1 data cache misses,
L1D_REPL, is the average duration in core cycles of an L1
data cache miss.
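The ratio described above can be written as a short Python sketch. The function name and the zero-miss convention (returning 0.0 when L1D_REPL is zero) are assumptions made for illustration; both inputs are taken to be raw counts gathered over the same measurement interval.

```python
def avg_l1d_miss_cycles(l1d_pend_miss: int, l1d_repl: int) -> float:
    """Average duration, in core cycles, of an L1 data cache miss.

    l1d_pend_miss: count of the L1D_PEND_MISS event.
    l1d_repl:      count of the L1D_REPL event (lines brought into the L1 data cache).
    """
    if l1d_repl == 0:
        # Assumed convention: no misses observed, so report 0.0 instead of dividing by zero.
        return 0.0
    return l1d_pend_miss / l1d_repl
```

For example, hypothetical counts of 1200 for L1D_PEND_MISS and 100 for L1D_REPL would give an average of 12 core cycles per miss.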
49H | 01H | L1D_SPLIT.LOADS | Cache line split loads from the L1 data cache.
This event counts the number of load operations that span
two cache lines. Such load operations are also called split
loads. Split load operations are executed at retirement.
49H | 02H | L1D_SPLIT.STORES | Cache line split stores to the L1 data cache.
This event counts the number of store operations that span
two cache lines.
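To make "spans two cache lines" concrete for both split loads and split stores, the following Python sketch tests whether an access of a given size at a given address crosses a line boundary. The function name is hypothetical, and the 64-byte line size is an assumption stated in the code; adjust it for the processor under study.

```python
CACHE_LINE_BYTES = 64  # Assumed line size for this sketch.


def spans_two_lines(addr: int, size: int, line: int = CACHE_LINE_BYTES) -> bool:
    """Return True if the bytes [addr, addr + size) fall in more than one cache line."""
    if size <= 0:
        raise ValueError("size must be positive")
    first_line = addr // line
    last_line = (addr + size - 1) // line
    return first_line != last_line
```

Under these assumptions, an 8-byte access at address 60 is a split access (bytes 60 through 67 cross the boundary at 64), whereas the same access at address 56 is not.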
4BH | 00H | SSE_PRE_MISS.NTA | Streaming SIMD Extensions (SSE) Prefetch NTA instructions missing all cache levels.
This event counts the number of times the SSE PREFETCHNTA
instruction was executed and missed all cache levels.
Due to speculation, an executed instruction might not retire.
This instruction prefetches the data to the L1 data cache.
Table 19-23. Non-Architectural Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.)
Event Num | Umask Value | Event Name | Definition | Description and Comment
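As a rough illustration of how the Event Num and Umask Value columns are used, the Python sketch below packs the two values into an event-select value. It assumes the IA32_PERFEVTSELx layout (event select in bits 7:0, unit mask in bits 15:8, USR in bit 16, OS in bit 17, EN in bit 22); the function name and its defaults are hypothetical, and writing the value to an MSR is not shown.

```python
def perfevtsel(event_num: int, umask: int,
               usr: bool = True, os: bool = True, enable: bool = True) -> int:
    """Build an event-select value from an event number and unit mask.

    Assumed bit layout: event select 7:0, umask 15:8, USR 16, OS 17, EN 22.
    """
    value = (event_num & 0xFF) | ((umask & 0xFF) << 8)
    if usr:
        value |= 1 << 16   # count while in user mode
    if os:
        value |= 1 << 17   # count while in kernel mode
    if enable:
        value |= 1 << 22   # enable the counter
    return value


# L1D_REPL: Event Num 45H, Umask Value 0FH.
l1d_repl_sel = perfevtsel(0x45, 0x0F)
```

With both privilege bits and the enable bit set, the L1D_REPL row yields 0x430F45; with all three flags cleared, only the event number and unit mask remain (0x0F45).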