19-100 Vol. 3B

PERFORMANCE-MONITORING EVENTS

Table 19-19.  Non-Architectural Performance Events In the Processor Core for Processors Based on Intel® Microarchitecture Code Name Westmere (Contd.)

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| D2H | 0FH | RAT_STALLS.ANY | Counts all Register Allocation Table stall cycles. These include cycles when ROB read port stalls occurred (which did not allow new micro-ops to enter the execution pipe), cycles when partial register stalls occurred, cycles when flag stalls occurred, and cycles when floating-point unit (FPU) status word stalls occurred. To count each of these conditions separately, use the events RAT_STALLS.ROB_READ_PORT, RAT_STALLS.PARTIAL, RAT_STALLS.FLAGS, and RAT_STALLS.FPSW. | |
| D4H | 01H | SEG_RENAME_STALLS | Counts the number of stall cycles due to the lack of renaming resources for the ES, DS, FS, and GS segment registers. If a segment is renamed but not retired and a second update to the same segment occurs, a stall occurs in the front end of the pipeline until the renamed segment retires. | |
| D5H | 01H | ES_REG_RENAMES | Counts the number of times the ES segment register is renamed. | |
| DBH | 01H | UOP_UNFUSION | Counts unfusion events due to a floating-point exception on a fused uop. | |
| E0H | 01H | BR_INST_DECODED | Counts the number of branch instructions decoded. | |
| E5H | 01H | BPU_MISSED_CALL_RET | Counts the number of times the Branch Prediction Unit missed predicting a call or return branch. | |
| E6H | 01H | BACLEAR.CLEAR | Counts the number of times the front end is resteered, mainly when the Branch Prediction Unit cannot provide a correct prediction and this is corrected by the Branch Address Calculator at the front end. This can occur if the code has many branches such that they cannot be consumed by the BPU. Each BACLEAR asserted by the BAC generates approximately an 8-cycle bubble in the instruction fetch pipeline. The effect on total execution time depends on the surrounding code. | |
| E6H | 02H | BACLEAR.BAD_TARGET | Counts the number of Branch Address Calculator clears (BACLEAR) asserted due to conditional branch instructions in which there was a target hit but the direction was wrong. Each BACLEAR asserted by the BAC generates approximately an 8-cycle bubble in the instruction fetch pipeline. | |
| E8H | 01H | BPU_CLEARS.EARLY | Counts early (normal) Branch Prediction Unit clears: BPU predicted a taken branch after incorrectly assuming that it was not taken. | The BPU clear leads to a 2-cycle bubble in the front end. |
| E8H | 02H | BPU_CLEARS.LATE | Counts late Branch Prediction Unit clears due to Most Recently Used conflicts. The BPU clear leads to a 3-cycle bubble in the front end. | |
| ECH | 01H | THREAD_ACTIVE | Counts cycles threads are active. | |
| F0H | 01H | L2_TRANSACTIONS.LOAD | Counts L2 load operations due to HW prefetch or demand loads. | |
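
The Event Num. and Umask Value columns are typically combined into a single raw event code, with the event number in bits 7:0 and the unit mask in bits 15:8 of the event-select register. The following Python sketch shows that packing and a rough conversion of clear counts into front-end bubble cycles using the approximate per-event figures quoted in the table. All function and variable names are hypothetical and chosen for illustration; the `rNNNN` string follows the raw-event notation accepted by Linux `perf`, which is an assumption about tooling and not something this table specifies.

```python
# Illustrative helpers only; these names are not part of any Intel or Linux API.

def parse_hex(text: str) -> int:
    """Convert manual-style hex such as 'D2H' or '0FH' to an integer."""
    return int(text.rstrip("Hh"), 16)


def raw_event_code(event_num: str, umask: str) -> int:
    """Pack the event number into bits 7:0 and the unit mask into bits 15:8."""
    event = parse_hex(event_num)
    mask = parse_hex(umask)
    if not (0 <= event <= 0xFF and 0 <= mask <= 0xFF):
        raise ValueError("event number and umask must each fit in 8 bits")
    return (mask << 8) | event


def perf_raw_name(event_num: str, umask: str) -> str:
    """Format the packed code as a raw event string (assumed perf notation)."""
    return f"r{raw_event_code(event_num, umask):04x}"


# Approximate bubble lengths in cycles, taken from the table's descriptions.
BUBBLE_CYCLES = {
    "BACLEAR.CLEAR": 8,
    "BACLEAR.BAD_TARGET": 8,
    "BPU_CLEARS.EARLY": 2,
    "BPU_CLEARS.LATE": 3,
}


def estimate_bubbles(counts: dict) -> dict:
    """Scale each event count by its approximate bubble length.

    Results are reported per event and not summed, because the table does
    not say whether these events overlap.
    """
    return {name: n * BUBBLE_CYCLES[name] for name, n in counts.items()}


if __name__ == "__main__":
    print(perf_raw_name("D2H", "0FH"))   # RAT_STALLS.ANY
    print(perf_raw_name("E6H", "01H"))   # BACLEAR.CLEAR
    print(estimate_bubbles({"BACLEAR.CLEAR": 1000, "BPU_CLEARS.LATE": 10}))
```

These estimates are only an upper-level guide: as the BACLEAR.CLEAR description notes, the effect on total execution time depends on the surrounding code.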