
17-30 Vol. 3B

DEBUG, BRANCH PROFILE, TSC, AND RESOURCE MONITORING FEATURES

The call stack profiling capability is an enhancement of the LBR facility. The LBR stack is a ring buffer typically used
to profile control flow transitions resulting from branches. However, the finite depth of the LBR stack often makes it
less effective when profiling certain high-level languages (e.g., C++), where a transition of the execution flow is
accompanied by a large number of leaf function calls, each of which returns an individual parameter to form the list
of parameters for the main execution function call. A long list of such parameters returned by the leaf functions
would flush the data captured in the LBR stack, often losing the main execution context.
When the call stack feature is enabled, the LBR stack will capture unfiltered call data normally, but as return 
instructions are executed the last captured branch record is flushed from the on-chip registers in a last-in first-out 
(LIFO) manner. Thus, branch information relative to leaf functions will not be captured, while preserving the call 
stack information of the main line execution path.
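
The effect of LIFO filtering can be illustrated with a small software model. This is only a sketch of the behavior described above, not of the hardware logic: the class `LbrStackModel`, the helper `run`, and the function names used as addresses are all hypothetical, and the depth of 16 is the one this section gives for the LBR stack.

```python
from dataclasses import dataclass, field

LBR_DEPTH = 16  # number of From/To pairs in the LBR stack described in this section


@dataclass
class LbrStackModel:
    """Toy model of an LBR ring buffer (illustrative only)."""
    call_stack_mode: bool
    entries: list = field(default_factory=lambda: [None] * LBR_DEPTH)
    tos: int = LBR_DEPTH - 1  # chosen so that the first record lands in slot 0
    valid: int = 0            # number of valid records currently held

    def record(self, src, dst):
        # TOS is incremented modulo the depth before the next pair is written.
        self.tos = (self.tos + 1) % LBR_DEPTH
        self.entries[self.tos] = (src, dst)
        self.valid = min(self.valid + 1, LBR_DEPTH)

    def call(self, src, dst):
        self.record(src, dst)

    def ret(self, src, dst):
        if not self.call_stack_mode:
            self.record(src, dst)  # plain mode: a return is just another branch
        elif self.valid:
            # Call stack mode: drop the most recent record (last in, first out).
            self.entries[self.tos] = None
            self.tos = (self.tos - 1) % LBR_DEPTH
            self.valid -= 1

    def snapshot(self):
        """Return the valid records, oldest first."""
        return [self.entries[(self.tos - i) % LBR_DEPTH]
                for i in range(self.valid - 1, -1, -1)]


def run(call_stack_mode, leaf_calls=20):
    """main calls outer; outer calls many leaf functions, then calls work."""
    lbr = LbrStackModel(call_stack_mode)
    lbr.call("main", "outer")
    for i in range(leaf_calls):
        lbr.call("outer", f"leaf{i}")
        lbr.ret(f"leaf{i}", "outer")
    lbr.call("outer", "work")
    return lbr.snapshot()


if __name__ == "__main__":
    print("call stack mode:", run(True))
    print("plain mode keeps main->outer:", ("main", "outer") in run(False))
```

In this model, the plain ring buffer fills with leaf call and return records and loses the `main` to `outer` transition, while the call stack variant ends up holding only the two calls that are still active.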
The configuration of the call stack facility is summarized below:

Set IA32_DEBUGCTL.LBR (bit 0) to enable the LBR stack to capture branch records. The source and target 
addresses of the call branches will be captured in the 16 pairs of From/To LBR MSRs that form the LBR stack.

Program the Top of Stack (TOS) MSR that points to the last valid from/to pair. This register is incremented by 
1, modulo 16, before recording the next pair of addresses.

Program the branch filtering bits of MSR_LBR_SELECT (bits 0:8) as desired.

Program the MSR_LBR_SELECT to enable LIFO filtering of return instructions with:
— The following bits in MSR_LBR_SELECT must be set to ‘1’: JCC, NEAR_IND_JMP, NEAR_REL_JMP, FAR_BRANCH, EN_CALLSTACK;
— The following bits in MSR_LBR_SELECT must be cleared: NEAR_REL_CALL, NEAR_IND_CALL, NEAR_RET;
— At most one of CPL_EQ_0, CPL_NEQ_0 is set.
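
The MSR_LBR_SELECT constraints listed above can be expressed as a bit mask and a validity check. The following is a sketch under stated assumptions: bit positions 8 (FAR_BRANCH) and 9 (EN_CALLSTACK) appear in the Table 17-13 excerpt in this section, while the positions used for bits 0 through 7 are taken from the full table and should be checked against it. The helper names `callstack_lbr_select` and `is_valid_callstack_select` are hypothetical. The code only computes the value; writing it to the MSR requires privileged software.

```python
# MSR_LBR_SELECT bit positions (bits 0-7 assumed from the full Table 17-13).
CPL_EQ_0      = 1 << 0
CPL_NEQ_0     = 1 << 1
JCC           = 1 << 2
NEAR_REL_CALL = 1 << 3
NEAR_IND_CALL = 1 << 4
NEAR_RET      = 1 << 5
NEAR_IND_JMP  = 1 << 6
NEAR_REL_JMP  = 1 << 7
FAR_BRANCH    = 1 << 8
EN_CALLSTACK  = 1 << 9

RESERVED_SHIFT = 10  # bits 63:10 must be zero

# Bits that must be set / cleared when call stack profiling is enabled.
MUST_SET = JCC | NEAR_IND_JMP | NEAR_REL_JMP | FAR_BRANCH | EN_CALLSTACK
MUST_CLEAR = NEAR_REL_CALL | NEAR_IND_CALL | NEAR_RET


def callstack_lbr_select(cpl_filter=0):
    """Build an MSR_LBR_SELECT value for call stack mode.

    cpl_filter is 0, CPL_EQ_0 or CPL_NEQ_0 (at most one of the two).
    """
    if cpl_filter not in (0, CPL_EQ_0, CPL_NEQ_0):
        raise ValueError("at most one of CPL_EQ_0, CPL_NEQ_0 may be set")
    return MUST_SET | cpl_filter


def is_valid_callstack_select(value):
    """Check a candidate value against the rules listed in the text."""
    if value >> RESERVED_SHIFT:
        return False  # reserved bits set
    if (value & MUST_SET) != MUST_SET:
        return False  # a required filter bit is missing
    if value & MUST_CLEAR:
        return False  # calls or returns are being filtered out
    if (value & CPL_EQ_0) and (value & CPL_NEQ_0):
        return False  # both privilege filters set
    return True


if __name__ == "__main__":
    print(hex(callstack_lbr_select()))
    print(hex(callstack_lbr_select(CPL_EQ_0)))
```

Keeping the required and forbidden bits in two named masks makes it easy to reject an invalid combination before it reaches the MSR, which matters because an invalid combination leaves the LBR contents undefined.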

Note that when call stack profiling is enabled, “zero length calls” are excluded from writing into the LBRs. (A “zero
length call” uses the call instruction to push the address of the next instruction onto the stack and then pops that
address into a register. This is accomplished without any matching return for the call.)

17.9.1 LBR Stack Enhancement

Processors based on Intel microarchitecture code name Haswell provide 16 pairs of MSRs to record last branch
record information. The layout of each MSR pair is enumerated by IA32_PERF_CAPABILITIES[5:0] = 04H, and is
shown in Table 17-14 and Table 17-9.

Table 17-13.   MSR_LBR_SELECT for Intel® microarchitecture code name Haswell

| Bit Field | Bit Offset | Access | Description |
|---|---|---|---|
| FAR_BRANCH | 8 | R/W | When set, do not capture far branches |
| EN_CALLSTACK (see note 1) | 9 | | Enable LBR stack to use LIFO filtering to capture Call stack profile |
| Reserved | 63:10 | | Must be zero |

NOTES:

1. Must set valid combination of bits 0-8 in conjunction with bit 9 (as described in the call stack configuration steps), otherwise the contents of the LBR MSRs are undefined.

Table 17-14.   MSR_LASTBRANCH_x_FROM_IP with TSX Information

| Bit Field | Bit Offset | Access | Description |
|---|---|---|---|
| Data | 47:0 | R/O | This is the “branch from” address. See Section 17.4.8.1 for address format. |
| SIGN_EXT | 60:48 | R/O | Signed extension of bit 47 of this register. |
| TSX_ABORT | 61 | R/O | When set, indicates a TSX Abort entry. LBR_FROM: EIP at the time of the TSX Abort. LBR_TO: EIP of the start of HLE region, or EIP of the RTM Abort Handler. |
| IN_TSX | 62 | R/O | When set, indicates the entry occurred in a TSX region |
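
The MSR_LASTBRANCH_x_FROM_IP layout in Table 17-14 can be unpacked in software as sketched below. The function name `decode_from_ip` and the returned dictionary keys are hypothetical; the bit positions are the ones given in the table. Because bits 63:61 do not carry address information, the sketch rebuilds a full 64-bit address by sign-extending bit 47 itself, which is one possible way to handle the value and not a documented procedure.

```python
ADDR_MASK = (1 << 48) - 1   # Data field, bits 47:0
SIGN_BIT = 1 << 47          # bit replicated into SIGN_EXT (bits 60:48)
UPPER_ONES = 0xFFFF << 48   # bits 63:48, used to rebuild a 64-bit address
TSX_ABORT_BIT = 61
IN_TSX_BIT = 62


def decode_from_ip(raw):
    """Split a raw MSR_LASTBRANCH_x_FROM_IP value into its fields."""
    addr = raw & ADDR_MASK
    if addr & SIGN_BIT:
        # Sign-extend bit 47 through bit 63 so the flag bits do not
        # end up inside the reconstructed address.
        addr |= UPPER_ONES
    return {
        "from_ip": addr,
        "tsx_abort": bool((raw >> TSX_ABORT_BIT) & 1),
        "in_tsx": bool((raw >> IN_TSX_BIT) & 1),
    }


if __name__ == "__main__":
    # Low address recorded inside a TSX region (example values are made up).
    user = decode_from_ip((1 << IN_TSX_BIT) | 0x401000)
    print(hex(user["from_ip"]), user["in_tsx"], user["tsx_abort"])

    # Address with bit 47 set; SIGN_EXT (bits 60:48) holds all ones.
    high = decode_from_ip((0x1FFF << 48) | 0xFFFF81000000)
    print(hex(high["from_ip"]), high["in_tsx"], high["tsx_abort"])
```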