19-148 Vol. 3B
PERFORMANCE-MONITORING EVENTS
D0H
13H
MEM_UOPS_RETIRED.DTLB_MISS
Counts uops retired that had a DTLB miss on load, store or either.
Note that when two distinct memory operations to the same page miss
the DTLB, only one of them will be recorded as a DTLB miss.
Precise Event
D0H
21H
MEM_UOPS_RETIRED.LOCK_LOADS
Counts locked memory uops retired. This includes 'regular' locks and
bus locks. To specifically count bus locks only, see the offcore response
event. A locked access is one with a lock prefix, or an exchange to
memory.
Precise Event
D0H
41H
MEM_UOPS_RETIRED.SPLIT_LOADS
Counts load uops retired where the data requested spans a 64 byte
cache line boundary.
Precise Event
D0H
42H
MEM_UOPS_RETIRED.SPLIT_STORES
Counts store uops retired where the data requested spans a 64 byte
cache line boundary.
Precise Event
D0H
43H
MEM_UOPS_RETIRED.SPLIT
Counts memory uops retired where the data requested spans a 64
byte cache line boundary.
Precise Event
D1H
01H
MEM_LOAD_UOPS_RETIRED.L1_HIT
Counts load uops retired that hit the L1 data cache.
Precise Event
D1H
08H
MEM_LOAD_UOPS_RETIRED.L1_MISS
Counts load uops retired that miss the L1 data cache.
Precise Event
D1H
02H
MEM_LOAD_UOPS_RETIRED.L2_HIT
Counts load uops retired that hit in the L2 cache.
Precise Event
D1H
10H
MEM_LOAD_UOPS_RETIRED.L2_MISS
Counts load uops retired that miss in the L2 cache.
Precise Event
D1H
20H
MEM_LOAD_UOPS_RETIRED.HITM
Counts load uops retired where the cache line containing the data was
in the modified state in another core's or module's cache (HITM). More
specifically, when the load address was checked by other caching
agents (typically another processor) in the system, one of those
caching agents indicated that it had a dirty copy of the data. Loads
that obtain a HITM response incur greater latency than is typical for a
load. In addition, since HITM indicates that some other processor had
this data in its cache, it implies that the data was shared between
processors, or potentially was a lock or semaphore value. This event is
useful for locating sharing, false sharing, and contended locks.
Precise Event
D1H
40H
MEM_LOAD_UOPS_RETIRED.WCB_HIT
Counts memory load uops retired where the data is retrieved from the
WCB (or fill buffer), indicating that the load found its data while that
data was in the process of being brought into the L1 cache. Typically a
load will receive this indication when some other load or prefetch
missed the L1 cache and was in the process of retrieving the cache line
containing the data, but that process had not yet finished (and written
the data back to the cache). For example, consider loads X and Y, both
referencing the same cache line that is not in the L1 cache. If load X
misses the cache first, it obtains a WCB (or fill buffer) and begins the
process of requesting the data. When load Y requests the data, it will
hit either the WCB or the L1 cache, depending on exactly when the
request from Y occurs.
Precise Event
D1H
80H
MEM_LOAD_UOPS_RETIRED.DRAM_HIT
Counts memory load uops retired where the data is retrieved from
DRAM. The event is counted at retirement, so speculative loads are
ignored. A memory load can hit (or miss) the L1 cache, hit (or miss) the
L2 cache, hit DRAM, hit in the WCB, or receive a HITM response.
Precise Event
Table 19-24. Non-Architectural Performance Events for the Goldmont Microarchitecture (Contd.)
Event Num.
Umask Value
Event Name
Description
Comment
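
The Event Num. and Umask Value columns are typically combined into a single raw event code when programming a counter. The following is a minimal sketch, not a definitive implementation: it assumes the architectural IA32_PERFEVTSELx layout, in which bits 7:0 hold the event select and bits 15:8 hold the unit mask, and it leaves all other control bits (USR, OS, EN, and so on) clear. The names `EVENTS` and `raw_code` are hypothetical helpers introduced for illustration only; the numeric values are copied from the rows of this table.

```python
# Event number and umask for each row of this table: name -> (event, umask).
# EVENTS and raw_code are illustrative names, not part of any tool or API.
EVENTS = {
    "MEM_UOPS_RETIRED.DTLB_MISS": (0xD0, 0x13),
    "MEM_UOPS_RETIRED.LOCK_LOADS": (0xD0, 0x21),
    "MEM_UOPS_RETIRED.SPLIT_LOADS": (0xD0, 0x41),
    "MEM_UOPS_RETIRED.SPLIT_STORES": (0xD0, 0x42),
    "MEM_UOPS_RETIRED.SPLIT": (0xD0, 0x43),
    "MEM_LOAD_UOPS_RETIRED.L1_HIT": (0xD1, 0x01),
    "MEM_LOAD_UOPS_RETIRED.L1_MISS": (0xD1, 0x08),
    "MEM_LOAD_UOPS_RETIRED.L2_HIT": (0xD1, 0x02),
    "MEM_LOAD_UOPS_RETIRED.L2_MISS": (0xD1, 0x10),
    "MEM_LOAD_UOPS_RETIRED.HITM": (0xD1, 0x20),
    "MEM_LOAD_UOPS_RETIRED.WCB_HIT": (0xD1, 0x40),
    "MEM_LOAD_UOPS_RETIRED.DRAM_HIT": (0xD1, 0x80),
}


def raw_code(name: str) -> int:
    """Return the low 16 bits of an event-select value for a table entry.

    Assumes event select in bits 7:0 and unit mask in bits 15:8.
    """
    event, umask = EVENTS[name]
    return (umask << 8) | event


if __name__ == "__main__":
    for name in EVENTS:
        print(f"{name}: 0x{raw_code(name):04X}")
```

For example, `raw_code("MEM_LOAD_UOPS_RETIRED.L1_MISS")` combines umask 08H with event D1H. Note also that, in the values listed here, the SPLIT umask (43H) equals the bitwise OR of the SPLIT_LOADS (41H) and SPLIT_STORES (42H) umasks; this is an observation about the table values, not a documented rule for deriving other umasks.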