19-148 Vol. 3B

PERFORMANCE-MONITORING EVENTS

D0H

13H

MEM_UOPS_RETIRED.DTLB_MISS

Counts uops retired that had a DTLB miss on load, store or either. Note that when two distinct memory operations to the same page miss the DTLB, only one of them will be recorded as a DTLB miss.

Precise Event
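
The Event Num. and Umask Value columns are the two values software programs into a counter. The following is a minimal sketch of that encoding, assuming the architectural IA32_PERFEVTSELx layout (event select in bits 7:0, unit mask in bits 15:8, USR in bit 16, OS in bit 17, EN in bit 22) and the `rUUEE` raw-event notation used by Linux perf. The helper names `perfevtsel` and `perf_raw` are hypothetical and do not come from this table.

```python
# Illustrative helpers; the names are hypothetical, not part of the manual.
USR = 1 << 16  # count while running in user mode
OS = 1 << 17   # count while running in kernel mode
EN = 1 << 22   # enable the counter


def perfevtsel(event_num: int, umask: int, usr: bool = True, os: bool = True) -> int:
    """Pack an event number and umask into an IA32_PERFEVTSELx-style value."""
    if not (0 <= event_num <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event number and umask must each fit in 8 bits")
    value = event_num | (umask << 8) | EN
    if usr:
        value |= USR
    if os:
        value |= OS
    return value


def perf_raw(event_num: int, umask: int) -> str:
    """Format the pair as a Linux perf raw event code (r + umask + event)."""
    return f"r{umask:02x}{event_num:02x}"


# MEM_UOPS_RETIRED.DTLB_MISS: Event Num. D0H, Umask Value 13H
print(hex(perfevtsel(0xD0, 0x13)))
print(perf_raw(0xD0, 0x13))
```

With these values the raw code comes out as `r13d0`, which could be passed to a tool such as `perf stat -e r13d0`. Precise (PEBS) sampling needs additional setup that this sketch does not cover.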

D0H

21H

MEM_UOPS_RETIRED.LOCK_LOADS

Counts locked memory uops retired. This includes 'regular' locks and bus locks. To specifically count bus locks only, see the offcore response event. A locked access is one with a lock prefix, or an exchange to memory.

Precise Event

D0H

41H

MEM_UOPS_RETIRED.SPLIT_LOADS

Counts load uops retired where the data requested spans a 64 byte cache line boundary.

Precise Event

D0H

42H

MEM_UOPS_RETIRED.SPLIT_STORES

Counts store uops retired where the data requested spans a 64 byte cache line boundary.

Precise Event

D0H

43H

MEM_UOPS_RETIRED.SPLIT

Counts memory uops retired where the data requested spans a 64 byte cache line boundary.

Precise Event
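
The umask values of the three split events compose arithmetically: 43H is the bitwise OR of 41H and 42H. The sketch below only checks that arithmetic. Reading bit 0 as a load selector, bit 1 as a store selector, and bit 6 as a "split" qualifier is an interpretation inferred from the table values, not something the table states, and the constant names are hypothetical.

```python
# Umask values copied from the D0H split rows of the table.
SPLIT_LOADS = 0x41
SPLIT_STORES = 0x42
SPLIT = 0x43

# Hypothetical bit names, inferred from the values above (an assumption).
LOAD_BIT = 0x01
STORE_BIT = 0x02
SPLIT_QUALIFIER = 0x40


def describe(umask: int) -> str:
    """Describe which operation types a split umask selects (assumed encoding)."""
    kinds = []
    if umask & LOAD_BIT:
        kinds.append("loads")
    if umask & STORE_BIT:
        kinds.append("stores")
    return " and ".join(kinds) if kinds else "none"


# The combined event's umask equals the OR of the load and store umasks.
assert SPLIT == SPLIT_LOADS | SPLIT_STORES
assert SPLIT_LOADS == SPLIT_QUALIFIER | LOAD_BIT
assert SPLIT_STORES == SPLIT_QUALIFIER | STORE_BIT

for name, umask in [("SPLIT_LOADS", SPLIT_LOADS),
                    ("SPLIT_STORES", SPLIT_STORES),
                    ("SPLIT", SPLIT)]:
    print(f"{name}: {umask:02X}H selects {describe(umask)}")
```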

D1H

01H

MEM_LOAD_UOPS_RETIRED.L1_HIT

Counts load uops retired that hit the L1 data cache.

Precise Event

D1H

08H

MEM_LOAD_UOPS_RETIRED.L1_MISS

Counts load uops retired that miss the L1 data cache.

Precise Event

D1H

02H

MEM_LOAD_UOPS_RETIRED.L2_HIT

Counts load uops retired that hit in the L2 cache.

Precise Event

D1H

10H

MEM_LOAD_UOPS_RETIRED.L2_MISS

Counts load uops retired that miss in the L2 cache.

Precise Event
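
Counts collected for the L1 and L2 hit and miss events are often reduced to miss ratios. The following is a rough sketch of that arithmetic, assuming that each retired load is counted exactly once across the hit and miss events of a given cache level (the table does not state this, and WCB hits may complicate it). The sample counts are made up for illustration, and `miss_ratio` is a hypothetical helper.

```python
def miss_ratio(hits: int, misses: int) -> float:
    """Return misses / (hits + misses), or 0.0 when nothing was counted."""
    total = hits + misses
    if total == 0:
        return 0.0
    return misses / total


# Hypothetical counter readings, not measured data.
counts = {
    "MEM_LOAD_UOPS_RETIRED.L1_HIT": 900_000,
    "MEM_LOAD_UOPS_RETIRED.L1_MISS": 100_000,
    "MEM_LOAD_UOPS_RETIRED.L2_HIT": 80_000,
    "MEM_LOAD_UOPS_RETIRED.L2_MISS": 20_000,
}

l1 = miss_ratio(counts["MEM_LOAD_UOPS_RETIRED.L1_HIT"],
                counts["MEM_LOAD_UOPS_RETIRED.L1_MISS"])
l2 = miss_ratio(counts["MEM_LOAD_UOPS_RETIRED.L2_HIT"],
                counts["MEM_LOAD_UOPS_RETIRED.L2_MISS"])

print(f"L1 miss ratio: {l1:.1%}")
print(f"L2 miss ratio: {l2:.1%}")
```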

D1H

20H

MEM_LOAD_UOPS_RETIRED.HITM

Counts load uops retired where the cache line containing the data was in the modified state of another core's or module's cache (HITM). More specifically, this means that when the load address was checked by other caching agents (typically another processor) in the system, one of those caching agents indicated that it had a dirty copy of the data. Loads that obtain a HITM response incur greater latency than is typical for a load. In addition, since HITM indicates that some other processor had this data in its cache, it implies that the data was shared between processors, or potentially was a lock or semaphore value. This event is useful for locating sharing, false sharing, and contended locks.

Precise Event

D1H

40H

MEM_LOAD_UOPS_RETIRED.WCB_HIT

Counts memory load uops retired where the data is retrieved from the WCB (or fill buffer), indicating that the load found its data while that data was in the process of being brought into the L1 cache. Typically a load will receive this indication when some other load or prefetch missed the L1 cache and was in the process of retrieving the cache line containing the data, but that process had not yet finished (and written the data back to the cache). For example, consider loads X and Y, both referencing the same cache line that is not in the L1 cache. If load X misses the cache first, it obtains a WCB (or fill buffer) and begins the process of requesting the data. When load Y requests the data, it will hit either the WCB or the L1 cache, depending on exactly when Y's request occurs.

Precise Event

D1H

80H

MEM_LOAD_UOPS_RETIRED.DRAM_HIT

Counts memory load uops retired where the data is retrieved from DRAM. The event is counted at retirement, so speculative loads are ignored. A memory load can hit (or miss) the L1 cache, hit (or miss) the L2 cache, hit DRAM, hit in the WCB or receive a HITM response.

Precise Event
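
The D1H rows each use a distinct single-bit umask, one per load outcome. As a small sketch of how the rows could be held in software, the lookup table below is transcribed from the D1H entries of this table. The dictionary and function names are illustrative only.

```python
EVENT_NUM = 0xD1  # Event Num. shared by all MEM_LOAD_UOPS_RETIRED rows

# Umask Value -> sub-event name, transcribed from the table rows.
MEM_LOAD_UOPS_RETIRED = {
    0x01: "L1_HIT",
    0x02: "L2_HIT",
    0x08: "L1_MISS",
    0x10: "L2_MISS",
    0x20: "HITM",
    0x40: "WCB_HIT",
    0x80: "DRAM_HIT",
}


def event_name(umask: int) -> str:
    """Return the full event name for a D1H umask listed in the table."""
    try:
        sub_event = MEM_LOAD_UOPS_RETIRED[umask]
    except KeyError:
        raise ValueError(f"umask {umask:02X}H is not listed for event D1H") from None
    return f"MEM_LOAD_UOPS_RETIRED.{sub_event}"


# Print the rows back in the table's Event Num. / Umask Value notation.
for umask in sorted(MEM_LOAD_UOPS_RETIRED):
    print(f"{EVENT_NUM:02X}H {umask:02X}H {event_name(umask)}")
```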

Table 19-24.  Non-Architectural Performance Events for the Goldmont Microarchitecture (Contd.)

Event Num.

Umask Value

Event Name

Description

Comment