DEBUG, BRANCH PROFILE, TSC, AND RESOURCE MONITORING FEATURES
The call stack profiling capability is an enhancement of the LBR facility. The LBR stack is a ring buffer typically used
to profile control flow transitions resulting from branches. However, because its depth is finite, the LBR stack often
becomes less effective when profiling certain high-level languages (e.g., C++), where a transition of the execution flow is
accompanied by a large number of leaf function calls, each of which returns an individual parameter to form the list
of parameters for the main execution function call. A long list of such parameters returned by the leaf functions
would flush the data captured in the LBR stack, often losing the main execution context.
When the call stack feature is enabled, the LBR stack will capture unfiltered call data normally, but as return
instructions are executed the last captured branch record is flushed from the on-chip registers in a last-in first-out
(LIFO) manner. Thus, branch information relating to leaf functions is not retained, while the call
stack information of the main line execution path is preserved.
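The LIFO behavior described above can be illustrated with a small software model. This is a minimal sketch, not a description of the hardware implementation: the class name `CallStackLbrModel`, the `Branch` tuple, and the sample addresses are all hypothetical, and the model assumes that a return simply removes the most recent record and steps the top-of-stack index back by one.

```python
from collections import namedtuple

LBR_DEPTH = 16  # number of From/To pairs, per the text

# Hypothetical record type used only by this model.
Branch = namedtuple("Branch", ["kind", "from_ip", "to_ip"])


class CallStackLbrModel:
    """Toy model of the LBR stack in call-stack mode (not hardware-accurate)."""

    def __init__(self, depth=LBR_DEPTH):
        self.depth = depth
        self.entries = [None] * depth
        self.tos = depth - 1  # so that the first record lands in slot 0

    def record(self, branch):
        if branch.kind == "call":
            # The index is incremented modulo the depth before the pair is written.
            self.tos = (self.tos + 1) % self.depth
            self.entries[self.tos] = (branch.from_ip, branch.to_ip)
        elif branch.kind == "ret":
            # Assumption: a return discards the most recent call record (LIFO).
            self.entries[self.tos] = None
            self.tos = (self.tos - 1) % self.depth
        # Any other branch kind is treated as filtered out and ignored.

    def call_stack(self):
        """Return the surviving records, oldest first."""
        ordered = (self.entries[(self.tos + i) % self.depth]
                   for i in range(1, self.depth + 1))
        return [e for e in ordered if e is not None]


# Twenty leaf calls, each followed by its return, sit between two
# main-line calls. More than 16 branches occur, yet only the two
# main-line records remain.
lbr = CallStackLbrModel()
lbr.record(Branch("call", 0x1000, 0x2000))
for i in range(20):
    lbr.record(Branch("call", 0x2010 + i, 0x3000))
    lbr.record(Branch("ret", 0x3040, 0x2011 + i))
lbr.record(Branch("call", 0x2100, 0x4000))
```

Without the pop on return, the 20 leaf calls alone would have overwritten all 16 slots and the first main-line record would have been lost.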
The configuration of the call stack facility is summarized below:
• Set IA32_DEBUGCTL.LBR (bit 0) to enable the LBR stack to capture branch records. The source and target
  addresses of the call branches will be captured in the 16 pairs of From/To LBR MSRs that form the LBR stack.
• Program the Top of Stack (TOS) MSR that points to the last valid from/to pair. This register is incremented by
  1, modulo 16, before recording the next pair of addresses.
• Program the branch filtering bits of MSR_LBR_SELECT (bits 0:8) as desired.
• Program the MSR_LBR_SELECT to enable LIFO filtering of return instructions with:
  — The following bits in MSR_LBR_SELECT must be set to ‘1’: JCC, NEAR_IND_JMP, NEAR_REL_JMP,
    FAR_BRANCH, EN_CALLSTACK;
  — The following bits in MSR_LBR_SELECT must be cleared: NEAR_REL_CALL, NEAR_IND_CALL, NEAR_RET;
  — At most one of CPL_EQ_0, CPL_NEQ_0 is set.
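The MSR_LBR_SELECT constraints listed above can be expressed as a mask computation. The following is a minimal sketch under stated assumptions: bits 8 (FAR_BRANCH) and 9 (EN_CALLSTACK) are taken from the table rows in this section, while the positions used for bits 0 through 7 are assumed to follow the full Table 17-13 layout. The function names are hypothetical, and actually writing the value to the MSR (a privileged operation) is not shown.

```python
# MSR_LBR_SELECT bit masks. Bits 0-7 are assumed positions (see lead-in).
CPL_EQ_0      = 1 << 0
CPL_NEQ_0     = 1 << 1
JCC           = 1 << 2
NEAR_REL_CALL = 1 << 3
NEAR_IND_CALL = 1 << 4
NEAR_RET      = 1 << 5
NEAR_IND_JMP  = 1 << 6
NEAR_REL_JMP  = 1 << 7
FAR_BRANCH    = 1 << 8   # from the table rows in this section
EN_CALLSTACK  = 1 << 9   # from the table rows in this section

REQUIRED_SET = JCC | NEAR_IND_JMP | NEAR_REL_JMP | FAR_BRANCH | EN_CALLSTACK
REQUIRED_CLEAR = NEAR_REL_CALL | NEAR_IND_CALL | NEAR_RET
RESERVED_MASK = ((1 << 64) - 1) & ~((1 << 10) - 1)  # bits 63:10 must be zero


def call_stack_select(set_cpl_eq_0=False, set_cpl_neq_0=False):
    """Build an MSR_LBR_SELECT value for call-stack profiling."""
    if set_cpl_eq_0 and set_cpl_neq_0:
        raise ValueError("at most one of CPL_EQ_0, CPL_NEQ_0 may be set")
    value = REQUIRED_SET
    if set_cpl_eq_0:
        value |= CPL_EQ_0
    if set_cpl_neq_0:
        value |= CPL_NEQ_0
    return value


def is_valid_call_stack_select(value):
    """Check a candidate value against the three rules and the reserved bits."""
    if value & RESERVED_MASK:
        return False
    if (value & REQUIRED_SET) != REQUIRED_SET:
        return False
    if value & REQUIRED_CLEAR:
        return False
    both_cpl = CPL_EQ_0 | CPL_NEQ_0
    return (value & both_cpl) != both_cpl
```

With these assumed bit positions, `call_stack_select()` yields 0x3C4 and `call_stack_select(set_cpl_eq_0=True)` yields 0x3C5. Since a set filter bit suppresses its branch type (as the FAR_BRANCH row states), clearing the call and return bits is what lets calls be recorded and returns be seen for LIFO removal.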
Note that when call stack profiling is enabled, “zero length calls” are excluded from writing into the LBRs. (A “zero
length call” uses the attribute of the call instruction to push the immediate instruction pointer onto the stack and
then pops that address into a register. No return instruction matches the call.)
17.9.1 LBR Stack Enhancement
Processors based on Intel microarchitecture code name Haswell provide 16 pairs of MSRs to record last branch
information. The layout of each MSR pair is enumerated by IA32_PERF_CAPABILITIES[5:0] = 04H, and is
shown in Table 17-14 and Table 17-9.
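The enumeration check can be sketched as a simple field extraction. This assumes the caller has already read IA32_PERF_CAPABILITIES by some privileged mechanism, which is not shown. The constant and function names are hypothetical, and the sample value is synthetic.

```python
LBR_FORMAT_MASK = 0x3F       # IA32_PERF_CAPABILITIES bits 5:0
LBR_FORMAT_TSX_INFO = 0x04   # value named in the text; the constant name is hypothetical


def lbr_format(perf_capabilities):
    """Extract the LBR format field (bits 5:0) from a raw register value."""
    return perf_capabilities & LBR_FORMAT_MASK


def has_tsx_lbr_layout(perf_capabilities):
    """True if the register reports the 04H layout described in this section."""
    return lbr_format(perf_capabilities) == LBR_FORMAT_TSX_INFO


# Synthetic example value: only the low six bits matter for this check.
sample = 0x1C4
```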
FAR_BRANCH         8        R/W      When set, do not capture far branches
EN_CALLSTACK (1)   9                 Enable LBR stack to use LIFO filtering to capture Call stack profile
Reserved           63:10             Must be zero
NOTES:
1. Must set valid combination of bits 0-8 in conjunction with bit 9 (as described below), otherwise the contents of the LBR MSRs are undefined.
Table 17-14. MSR_LASTBRANCH_x_FROM_IP with TSX Information
Bit Field    Bit Offset   Access   Description
Data         47:0         R/O      This is the “branch from” address. See Section 17.4.8.1 for address format.
SIGN_EXT     60:48        R/O      Signed extension of bit 47 of this register.
TSX_ABORT    61           R/O      When set, indicates a TSX Abort entry.
                                   LBR_FROM: EIP at the time of the TSX Abort.
                                   LBR_TO: EIP of the start of HLE region, or EIP of the RTM Abort Handler.
IN_TSX       62           R/O      When set, indicates the entry occurred in a TSX region.
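The field layout of MSR_LASTBRANCH_x_FROM_IP with TSX information can be decoded as sketched here. This is a minimal example that assumes the raw 64-bit value has already been read from the MSR. The function name and the returned dictionary keys are hypothetical, and the sample values are synthetic.

```python
ADDR_MASK = (1 << 48) - 1   # Data field, bits 47:0
TSX_ABORT_BIT = 61
IN_TSX_BIT = 62


def decode_from_ip(raw):
    """Split a raw FROM_IP value into the address and the two TSX flags."""
    address = raw & ADDR_MASK
    # Rebuild a 64-bit address by extending bit 47 through bits 63:48,
    # so that the TSX flag bits do not leak into the address.
    if address & (1 << 47):
        address |= 0xFFFF << 48
    return {
        "address": address,
        "tsx_abort": bool((raw >> TSX_ABORT_BIT) & 1),
        "in_tsx": bool((raw >> IN_TSX_BIT) & 1),
    }


# Synthetic values: a low-half address recorded inside a TSX region,
# and an upper-half address with SIGN_EXT (bits 60:48) all ones.
user_entry = decode_from_ip((1 << IN_TSX_BIT) | 0x401234)
kernel_entry = decode_from_ip(0x1FFF_8000_0000_1000)
```

Masking to bits 47:0 before sign-extending matters because bits 61 and 62 carry flags rather than address bits. Using the raw value as an address would give a wrong result whenever either flag is set.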
Table 17-13. MSR_LBR_SELECT for Intel® microarchitecture code name Haswell
Bit Field    Bit Offset   Access   Description