
19-210 Vol. 3B

PERFORMANCE-MONITORING EVENTS

Unit: Memory Ordering
Event Num.: 03H
Mnemonic Event Name: LD_BLOCKS
Unit Mask: 00H
Description: Number of load operations delayed due to store buffer blocks. Includes counts caused by preceding stores whose addresses are unknown, preceding stores whose addresses are known but whose data is unknown, and preceding stores that conflict with the load but which incompletely overlap the load.

Unit: Memory Ordering
Event Num.: 04H
Mnemonic Event Name: SB_DRAINS
Unit Mask: 00H
Description: Number of store buffer drain cycles. Incremented every cycle the store buffer is draining. Draining is caused by serializing operations like CPUID, synchronizing operations like XCHG, interrupt acknowledgment, as well as other conditions (such as cache flushing).
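Because SB_DRAINS increments once per draining cycle, a raw count is easiest to interpret as a share of elapsed cycles. The sketch below shows that arithmetic only. The function name and the sample readings are hypothetical, and the total cycle count is assumed to come from a separate cycle-counting measurement taken over the same interval.

```python
def sb_drain_fraction(sb_drain_cycles: int, total_cycles: int) -> float:
    """Return the fraction of elapsed cycles spent draining the store buffer.

    sb_drain_cycles: an SB_DRAINS reading (one increment per draining cycle).
    total_cycles:    elapsed cycles over the same interval (assumed to be
                     measured separately; not provided by SB_DRAINS itself).
    """
    if total_cycles <= 0:
        raise ValueError("total_cycles must be positive")
    if sb_drain_cycles < 0:
        raise ValueError("sb_drain_cycles must be non-negative")
    return sb_drain_cycles / total_cycles


if __name__ == "__main__":
    # Hypothetical readings for illustration, not measured data.
    fraction = sb_drain_fraction(sb_drain_cycles=1_200, total_cycles=48_000)
    print(f"{fraction:.1%}")  # prints 2.5%
```

A high fraction would suggest looking for the serializing or synchronizing operations listed in the description.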

Unit: Memory Ordering
Event Num.: 05H
Mnemonic Event Name: MISALIGN_MEM_REF
Unit Mask: 00H
Description: Number of misaligned data memory references. Incremented by 1 every cycle during which either the processor’s load or store pipeline dispatches a misaligned μop. The μop is counted if it is the first or second half, or if it is blocked, squashed, or missed. In this context, misaligned means crossing a 64-bit boundary.
Comments: MISALIGN_MEM_REF is only an approximation to the true number of misaligned memory references. The value returned is roughly proportional to the number of misaligned memory accesses (the size of the problem).
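The definition of misaligned used here (crossing a 64-bit boundary) can be illustrated with a small address check. This is a minimal sketch that assumes a 64-bit boundary means an address that is a multiple of 8 bytes. The function name is hypothetical, and because the event is only an approximation, the check does not predict actual counter values.

```python
BOUNDARY_BYTES = 8  # assumption: a 64-bit boundary is an 8-byte-aligned address


def crosses_boundary(address: int, size: int, boundary: int = BOUNDARY_BYTES) -> bool:
    """Return True if the access [address, address + size) spans a boundary.

    The access crosses a boundary when its first and last bytes fall into
    different boundary-sized chunks of the address space.
    """
    if size <= 0:
        raise ValueError("size must be positive")
    first_chunk = address // boundary
    last_chunk = (address + size - 1) // boundary
    return first_chunk != last_chunk


if __name__ == "__main__":
    print(crosses_boundary(0x1004, 4))  # False: bytes 0x1004..0x1007 stay in one chunk
    print(crosses_boundary(0x1006, 4))  # True: bytes 0x1006..0x1009 straddle 0x1008
    print(crosses_boundary(0x1000, 8))  # False: exactly one aligned chunk
    print(crosses_boundary(0x1001, 8))  # True: an unaligned 8-byte access
```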

Unit: Memory Ordering
Event Num.: 07H
Mnemonic Event Name: EMON_KNI_PREF_DISPATCHED
Description: Number of Streaming SIMD Extensions prefetch/weakly-ordered instructions dispatched (speculative prefetches are included in counting), selected by unit mask:
Unit Mask:
  00H: prefetch NTA
  01H: prefetch T1
  02H: prefetch T2
  03H: weakly ordered stores
Comments: Counters 0 and 1. Pentium III processor only.

Unit: Memory Ordering
Event Num.: 4BH
Mnemonic Event Name: EMON_KNI_PREF_MISS
Description: Number of prefetch/weakly-ordered instructions that miss all caches, selected by unit mask:
Unit Mask:
  00H: prefetch NTA
  01H: prefetch T1
  02H: prefetch T2
  03H: weakly ordered stores
Comments: Counters 0 and 1. Pentium III processor only.
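The event numbers and unit masks listed for the two EMON_KNI_PREF events are combined into a single event-select value when a counter is programmed. The sketch below shows that packing. The bit positions (event select in bits 7:0, unit mask in bits 15:8, USR at bit 16, OS at bit 17, enable at bit 22) are taken from the general event-select register layout and are an assumption here, since this table does not state them. The function and dictionary names are hypothetical, and the code only computes a value; it does not touch any register.

```python
# Event numbers and unit masks as listed in the table entries above.
EVENTS = {
    "EMON_KNI_PREF_DISPATCHED": 0x07,
    "EMON_KNI_PREF_MISS": 0x4B,
}
UNIT_MASKS = {
    "prefetch NTA": 0x00,
    "prefetch T1": 0x01,
    "prefetch T2": 0x02,
    "weakly ordered stores": 0x03,
}

# Assumed flag positions in the event-select register (not stated in this table).
USR_FLAG = 1 << 16     # count while in user mode
OS_FLAG = 1 << 17      # count while in kernel mode
ENABLE_FLAG = 1 << 22  # enable counting


def encode_event_select(event: int, unit_mask: int,
                        usr: bool = True, os: bool = True,
                        enable: bool = True) -> int:
    """Pack an event number and unit mask into an event-select value."""
    if not 0 <= event <= 0xFF:
        raise ValueError("event must fit in 8 bits")
    if not 0 <= unit_mask <= 0xFF:
        raise ValueError("unit_mask must fit in 8 bits")
    value = event | (unit_mask << 8)
    if usr:
        value |= USR_FLAG
    if os:
        value |= OS_FLAG
    if enable:
        value |= ENABLE_FLAG
    return value


if __name__ == "__main__":
    value = encode_event_select(EVENTS["EMON_KNI_PREF_MISS"],
                                UNIT_MASKS["prefetch T2"])
    print(hex(value))  # 0x43024b
```

Per the Comments column, these two events apply to counters 0 and 1 on the Pentium III processor only, so a value built this way would only be meaningful there.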

Unit: Instruction Decoding and Retirement
Event Num.: C0H
Mnemonic Event Name: INST_RETIRED
Unit Mask: 00H
Description: Number of instructions retired.
Comments: A hardware interrupt received during/after the last iteration of the REP STOS flow causes the counter to undercount by 1 instruction. An SMI received while executing a HLT instruction will cause the performance counter to not count the RSM instruction and undercount by 1.

Table 19-37. Events That Can Be Counted with the P6 Family Performance-Monitoring Counters (Contd.)

Columns: Unit | Event Num. | Mnemonic Event Name | Unit Mask | Description | Comments