19-134 Vol. 3B

PERFORMANCE-MONITORING EVENTS

Event Num: B3H
Umask Value: 20H
Event Name: SIMD_UOP_TYPE_EXEC.ARITHMETIC
Definition: SIMD packed arithmetic micro-ops executed.
Description and Comment: This event counts the number of SIMD packed arithmetic micro-ops executed.

Event Num: C0H
Umask Value: 00H
Event Name: INST_RETIRED.ANY_P
Definition: Instructions retired.
Description and Comment: This event counts the number of instructions that retire execution. For instructions that consist of multiple micro-ops, this event counts the retirement of the last micro-op of the instruction. The counter continues counting during hardware interrupts, traps, and inside interrupt handlers. INST_RETIRED.ANY_P is an architectural performance event.

Event Num: C0H
Umask Value: 01H
Event Name: INST_RETIRED.LOADS
Definition: Instructions retired, which contain a load.
Description and Comment: This event counts the number of instructions retired that contain a load operation.

Event Num: C0H
Umask Value: 02H
Event Name: INST_RETIRED.STORES
Definition: Instructions retired, which contain a store.
Description and Comment: This event counts the number of instructions retired that contain a store operation.

Event Num: C0H
Umask Value: 04H
Event Name: INST_RETIRED.OTHER
Definition: Instructions retired, with no load or store operation.
Description and Comment: This event counts the number of instructions retired that do not contain a load or a store operation.
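The INST_RETIRED sub-events above are easiest to read when normalized against INST_RETIRED.ANY_P. The following is a minimal sketch, assuming the four counts were collected over the same interval; the function name and the sample counts are hypothetical. It makes no assumption that the three shares sum to one, since the table does not say how an instruction containing both a load and a store is classified.

```python
def instruction_mix(any_p, loads, stores, other):
    """Express each INST_RETIRED sub-event as a share of INST_RETIRED.ANY_P.

    All arguments are raw counter values (hypothetical inputs) gathered
    over the same measurement interval.
    """
    if any_p <= 0:
        raise ValueError("INST_RETIRED.ANY_P count must be positive")
    return {
        "loads": loads / any_p,    # INST_RETIRED.LOADS  (C0H, umask 01H)
        "stores": stores / any_p,  # INST_RETIRED.STORES (C0H, umask 02H)
        "other": other / any_p,    # INST_RETIRED.OTHER  (C0H, umask 04H)
    }


# Made-up counts, for illustration only.
mix = instruction_mix(any_p=1_000_000, loads=300_000, stores=100_000, other=650_000)
for name, share in mix.items():
    print(f"{name}: {share:.1%}")
```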

Event Num: C1H
Umask Value: 01H
Event Name: X87_OPS_RETIRED.FXCH
Definition: FXCH instructions retired.
Description and Comment: This event counts the number of FXCH instructions retired. Modern compilers generate more efficient code and are less likely to use this instruction. If you obtain a high count for this event, consider recompiling the code.

Event Num: C1H
Umask Value: FEH
Event Name: X87_OPS_RETIRED.ANY
Definition: Retired floating-point computational operations (precise event).
Description and Comment: This event counts the number of floating-point computational operations retired. It counts:
• Floating-point computational operations executed by the assist handler.
• Sub-operations of complex floating-point instructions like transcendental instructions.
This event does not count:
• Floating-point computational operations that cause traps or assists.
• Floating-point loads and stores.
When this event is captured with the precise event mechanism, the collected samples contain the address of the instruction that was executed immediately after the instruction that caused the event.
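Because a precise sample holds the address of the instruction executed after the one that caused the event, a profiler has to step back one instruction to attribute the sample. The following is a minimal sketch of that step, assuming straight-line code and a sorted list of instruction start addresses; the function name and the address list are hypothetical and would normally come from a disassembler.

```python
import bisect


def attribute_sample(sample_addr, instr_addrs):
    """Return the start address of the instruction preceding sample_addr.

    instr_addrs must be a sorted list of instruction start addresses
    (hypothetical input). Returns None when nothing precedes the sample.
    Assumes no control transfer happened between the two instructions.
    """
    i = bisect.bisect_left(instr_addrs, sample_addr)
    if i == 0:
        return None
    return instr_addrs[i - 1]


# Made-up instruction layout, for illustration only.
addrs = [0x1000, 0x1002, 0x1008, 0x100A]
culprit = attribute_sample(0x1008, addrs)
print(hex(culprit))
```

If a branch, trap, or interrupt intervened, the preceding address in the listing is not necessarily the instruction that caused the event, so the result is best treated as a hint.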

Event Num: C2H
Umask Value: 01H
Event Name: UOPS_RETIRED.LD_IND_BR
Definition: Fused load+op or load+indirect branch retired.
Description and Comment: This event counts the number of retired micro-ops that fused a load with another operation. This includes:
• Fusion of a load and an arithmetic operation, such as with the following instruction: ADD EAX, [EBX], where the content of the memory location specified by the EBX register is loaded, added to the EAX register, and the result is stored in EAX.
• Fusion of a load and a branch in an indirect branch operation, such as with the following instructions:
  • JMP [RDI+200]
  • RET
Fusion decreases the number of micro-ops in the processor pipeline. A high value for this event count indicates that the code is using the processor resources effectively.

Table 19-23. Non-Architectural Performance Events in Processors Based on Intel® Core™ Microarchitecture (Contd.)

Columns: Event Num | Umask Value | Event Name | Definition | Description and Comment
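The Event Num and Umask Value columns are the two fields software combines to select an event. The following is a minimal sketch of that packing, assuming the IA32_PERFEVTSELx layout defined elsewhere in the manual (event select in bits 7:0, unit mask in bits 15:8, USR in bit 16, OS in bit 17, EN in bit 22); this table does not restate that layout, and the function names are hypothetical.

```python
def raw_event_code(event_num, umask):
    """Pack Event Num and Umask Value into a 16-bit raw event code."""
    if not 0 <= event_num <= 0xFF or not 0 <= umask <= 0xFF:
        raise ValueError("event_num and umask must each fit in 8 bits")
    return (umask << 8) | event_num


def perfevtsel(event_num, umask, usr=True, os=True, enable=True):
    """Build an event-select register value (assumed layout, see lead-in)."""
    value = raw_event_code(event_num, umask)
    if usr:
        value |= 1 << 16  # count while running at user privilege levels
    if os:
        value |= 1 << 17  # count while running at privilege level 0
    if enable:
        value |= 1 << 22  # enable the counter
    return value


# INST_RETIRED.LOADS from the table: Event Num C0H, Umask Value 01H.
print(hex(raw_event_code(0xC0, 0x01)))
print(hex(perfevtsel(0xC0, 0x01)))
```

The sketch only computes the value; writing it to a model-specific register requires privileged access and is outside its scope.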