Vol. 3C 36-31
INTEL® PROCESSOR TRACE
When restoring the trace configuration context, IA32_RTIT_CTL should be restored last:
1. Read the saved configuration MSR values, aside from IA32_RTIT_CTL, from memory, and restore them with WRMSR.
2. Read the saved IA32_RTIT_CTL value from memory, and restore it with WRMSR.
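The restore order above can be sketched as a routine that builds the list of MSR writes in the order they should be issued. This is only an illustration: `pt_saved_ctx`, `pt_msr_write`, and `pt_restore_plan` are hypothetical names, and the actual WRMSR (a privileged instruction) is left to the caller, who would walk the returned list from first to last.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Architectural MSR addresses of the Intel PT configuration registers. */
#define MSR_IA32_RTIT_OUTPUT_BASE      0x560u
#define MSR_IA32_RTIT_OUTPUT_MASK_PTRS 0x561u
#define MSR_IA32_RTIT_CTL              0x570u
#define MSR_IA32_RTIT_STATUS           0x571u
#define MSR_IA32_RTIT_CR3_MATCH        0x572u
#define MSR_IA32_RTIT_ADDR0_A          0x580u
#define MSR_IA32_RTIT_ADDR0_B          0x581u
#define MSR_IA32_RTIT_ADDR1_A          0x582u
#define MSR_IA32_RTIT_ADDR1_B          0x583u

#define PT_NUM_CFG_MSRS 9

/* Hypothetical record of the values captured when the context was saved. */
struct pt_saved_ctx {
    uint64_t ctl, output_base, output_mask_ptrs, status, cr3_match;
    uint64_t addr0_a, addr0_b, addr1_a, addr1_b;
};

/* One pending WRMSR: the MSR address and the value to write. */
struct pt_msr_write {
    uint32_t msr;
    uint64_t value;
};

/*
 * Fill plan[] with the writes in the order they should be issued and
 * return the number of entries. IA32_RTIT_CTL is always the final entry.
 */
size_t pt_restore_plan(const struct pt_saved_ctx *ctx,
                       struct pt_msr_write plan[PT_NUM_CFG_MSRS])
{
    assert(ctx != NULL && plan != NULL);
    size_t n = 0;

    /* Step 1: every configuration MSR except IA32_RTIT_CTL. */
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_OUTPUT_MASK_PTRS, ctx->output_mask_ptrs };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_STATUS, ctx->status };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_ADDR0_A, ctx->addr0_a };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_ADDR0_B, ctx->addr0_b };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_ADDR1_A, ctx->addr1_a };
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_ADDR1_B, ctx->addr1_b };

    /* Step 2: IA32_RTIT_CTL last. */
    plan[n++] = (struct pt_msr_write){ MSR_IA32_RTIT_CTL, ctx->ctl };
    return n;
}
```

Returning a plan rather than writing MSRs directly keeps the ordering rule checkable in user-mode code; the order of the writes within step 1 is an arbitrary choice of this sketch.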
36.3.5.2 Trace Configuration Context Switch Using XSAVES/XRSTORS
On processors whose XSAVE feature set supports XSAVES and XRSTORS, the trace configuration state can be
saved using XSAVES and restored by XRSTORS, in conjunction with the bit field associated with the supervisor state
component in IA32_XSS. See Chapter 13, “Managing State Using the XSAVE Feature Set” of Intel® 64 and IA-32
Architectures Software Developer’s Manual, Volume 1.
The layout of the trace configuration component state in the XSAVE area is shown in Table 36-13.¹
The IA32_XSS MSR is zero coming out of RESET. Once IA32_XSS[bit 8] is set, system software operating at
CPL = 0 can use XSAVES/XRSTORS with the appropriate requested-feature bitmap (RFBM) to manage supervisor state
components in the XSAVE map. See Chapter 13, “Managing State Using the XSAVE Feature Set” of Intel® 64 and
IA-32 Architectures Software Developer’s Manual, Volume 1.
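As a rough illustration of how IA32_XSS[bit 8] interacts with the requested-feature bitmap, the sketch below computes the RFBM for XSAVES/XRSTORS as the logical AND of (XCR0 OR IA32_XSS) with the instruction mask in EDX:EAX. The helper names `xsaves_rfbm` and `pt_state_requested` are hypothetical, and reading XCR0 or IA32_XSS (which requires privileged code) is assumed to happen elsewhere.

```c
#include <assert.h>
#include <stdint.h>

#define MSR_IA32_XSS  0xDA0u        /* architectural MSR address of IA32_XSS */
#define XSS_PT_STATE  (1ull << 8)   /* IA32_XSS[8]: trace configuration state */

/* RFBM used by XSAVES/XRSTORS: (XCR0 | IA32_XSS) AND the mask in EDX:EAX. */
static inline uint64_t xsaves_rfbm(uint64_t xcr0, uint64_t xss,
                                   uint32_t eax, uint32_t edx)
{
    uint64_t instr_mask = ((uint64_t)edx << 32) | eax;
    return (xcr0 | xss) & instr_mask;
}

/* Non-zero if the trace configuration component would be saved or restored. */
static inline int pt_state_requested(uint64_t rfbm)
{
    return (rfbm & XSS_PT_STATE) != 0;
}
```

With IA32_XSS at its reset value of zero, requesting bit 8 in EDX:EAX has no effect; the component only becomes part of the RFBM after system software sets IA32_XSS[bit 8].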
36.3.6 Cycle-Accurate Mode
Intel PT can be run in a cycle-accurate mode, which enables CYC packets (see Section 36.4.2.14) that provide
low-level information in the processor core clock domain. The cycle counter data in CYC packets can be used to
compute IPC (Instructions Per Cycle), or to track wall-clock time at a fine-grained level.
To enable cycle-accurate mode packet generation, software should set IA32_RTIT_CTL.CYCEn=1. It is
recommended that software also set TSCEn=1 any time cycle-accurate mode is in use. With cycle-accurate mode
enabled, all CYC-eligible packets will be preceded by a CYC packet, the payload of which indicates the number of
core clock cycles since the last CYC packet. In cases where multiple CYC-eligible packets are generated in a single
cycle, only a single CYC will be generated before the CYC-eligible packets; otherwise, each CYC-eligible packet
will be preceded by its own CYC. The CYC-eligible packets are:
• TNT, TIP, TIP.PGE, TIP.PGD, MODE.EXEC, MODE.TSX, PIP, VMCS, OVF, MTC, TSC, PTWRITE, EXSTOP
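One way a decoder might apply this rule is to keep a running cycle count, add each CYC payload to it, and stamp every following packet with the current total until the next CYC arrives. The sketch below assumes a simplified, hypothetical packet representation (`struct pkt`, `enum pkt_kind`); it does not parse the real packet encodings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified packet view: the kind and, for CYC, its payload. */
enum pkt_kind { PKT_CYC, PKT_TNT, PKT_TIP, PKT_OTHER };

struct pkt {
    enum pkt_kind kind;
    uint64_t cyc_delta;   /* core clock cycles since the last CYC; PKT_CYC only */
};

/*
 * Store in cycles_out[i] the running core-cycle count at packet i and
 * return the final count. Packets that follow a CYC without a CYC of
 * their own were generated in the same cycle, so they share its count.
 */
uint64_t pt_assign_cycles(const struct pkt *pkts, size_t n, uint64_t *cycles_out)
{
    assert(pkts != NULL && cycles_out != NULL);
    uint64_t now = 0;

    for (size_t i = 0; i < n; i++) {
        if (pkts[i].kind == PKT_CYC)
            now += pkts[i].cyc_delta;   /* advance time by the payload */
        cycles_out[i] = now;
    }
    return now;
}
```

Dividing the number of instructions retired between two packets (which a decoder would obtain by walking the program binary, not shown here) by the difference of their cycle counts gives an IPC estimate for that interval.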
TSC packets are generated when there is insufficient information to reconstruct wall-clock time, due to tracing
being disabled (TriggerEn=0) or to power-down scenarios such as a transition to a deep-sleep MWAIT C-state. In this
case, the CYC that is generated along with the TSC indicates the number of cycles spent actively tracing (powered
up, with TriggerEn=1) between the last CYC packet and the TSC packet. The amount of time spent while tracing was
inactive can therefore be inferred from the difference between the time expected based on the CYC value and the
actual time indicated by the TSC.
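The inference described above amounts to a subtraction once both quantities are in the same unit. The sketch below is an illustration under stated assumptions: the caller supplies the TSC value of the previous time reference, the TSC value from the new TSC packet, the active cycle count from the accompanying CYC, and a core-clock-to-TSC ratio that is assumed constant over the interval. The function name and the ratio parameters are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate the TSC ticks during which tracing was inactive.
 *
 * tsc_prev, tsc_now : TSC at the previous time reference and in the new TSC packet
 * active_cycles     : core cycles reported by the CYC sent with the TSC
 * ratio_num/den     : assumed TSC ticks per core cycle, as a fraction
 *
 * The product active_cycles * ratio_num is assumed to fit in 64 bits.
 */
uint64_t pt_inactive_tsc_ticks(uint64_t tsc_prev, uint64_t tsc_now,
                               uint64_t active_cycles,
                               uint64_t ratio_num, uint64_t ratio_den)
{
    assert(ratio_den != 0 && tsc_now >= tsc_prev);

    uint64_t elapsed  = tsc_now - tsc_prev;                     /* actual time */
    uint64_t expected = active_cycles * ratio_num / ratio_den;  /* time explained by CYC */

    /* Clamp at zero: rounding or a changed ratio can make expected exceed elapsed. */
    return elapsed > expected ? elapsed - expected : 0;
}
```

For example, with a TSC delta of 4000 ticks, 2000 active core cycles, and a ratio of 1 TSC tick per 2 core cycles, 1000 ticks are accounted for by active tracing and the remaining 3000 are attributed to the inactive period.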
Additional CYC packets may be sent stand-alone, so that the processor can ensure that the decoder is aware of the
number of cycles that have passed before the internal hardware counter wraps, or is reset due to another
micro-architectural condition. There is no guarantee at what intervals these stand-alone CYC packets will be sent,
except that they will be sent before the wrap occurs. An illustration is given below.
1. Table 36-13 documents support for the MSRs defining address ranges 0 and 1. Processors that provide XSAVE support for Intel Processor
Trace support only those address ranges.
Table 36-13. Memory Layout of the Trace Configuration State Component

Offset within Component Area    Field
0H                              IA32_RTIT_CTL
08H                             IA32_RTIT_OUTPUT_BASE
10H                             IA32_RTIT_OUTPUT_MASK_PTRS
18H                             IA32_RTIT_STATUS
20H                             IA32_RTIT_CR3_MATCH
28H                             IA32_RTIT_ADDR0_A
30H                             IA32_RTIT_ADDR0_B
38H                             IA32_RTIT_ADDR1_A
40H                             IA32_RTIT_ADDR1_B
48H–End                         Reserved
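The layout in Table 36-13 maps naturally onto a structure of consecutive 64-bit fields. The following is a minimal sketch with hypothetical type and field names; it covers only the defined fields and leaves out the reserved tail from 48H onward, whose extent depends on the component size the processor enumerates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Defined fields of the trace configuration state component (Table 36-13). */
struct pt_xsave_component {
    uint64_t rtit_ctl;               /* 00H: IA32_RTIT_CTL */
    uint64_t rtit_output_base;       /* 08H: IA32_RTIT_OUTPUT_BASE */
    uint64_t rtit_output_mask_ptrs;  /* 10H: IA32_RTIT_OUTPUT_MASK_PTRS */
    uint64_t rtit_status;            /* 18H: IA32_RTIT_STATUS */
    uint64_t rtit_cr3_match;         /* 20H: IA32_RTIT_CR3_MATCH */
    uint64_t rtit_addr0_a;           /* 28H: IA32_RTIT_ADDR0_A */
    uint64_t rtit_addr0_b;           /* 30H: IA32_RTIT_ADDR0_B */
    uint64_t rtit_addr1_a;           /* 38H: IA32_RTIT_ADDR1_A */
    uint64_t rtit_addr1_b;           /* 40H: IA32_RTIT_ADDR1_B */
    /* 48H to the end of the component is reserved and not modeled here. */
};

/* Compile-time checks that the structure matches the documented offsets. */
static_assert(offsetof(struct pt_xsave_component, rtit_ctl) == 0x00, "CTL at 0H");
static_assert(offsetof(struct pt_xsave_component, rtit_output_base) == 0x08, "OUTPUT_BASE at 08H");
static_assert(offsetof(struct pt_xsave_component, rtit_output_mask_ptrs) == 0x10, "OUTPUT_MASK_PTRS at 10H");
static_assert(offsetof(struct pt_xsave_component, rtit_status) == 0x18, "STATUS at 18H");
static_assert(offsetof(struct pt_xsave_component, rtit_cr3_match) == 0x20, "CR3_MATCH at 20H");
static_assert(offsetof(struct pt_xsave_component, rtit_addr0_a) == 0x28, "ADDR0_A at 28H");
static_assert(offsetof(struct pt_xsave_component, rtit_addr0_b) == 0x30, "ADDR0_B at 30H");
static_assert(offsetof(struct pt_xsave_component, rtit_addr1_a) == 0x38, "ADDR1_A at 38H");
static_assert(offsetof(struct pt_xsave_component, rtit_addr1_b) == 0x40, "ADDR1_B at 40H");
```

The static assertions make a mismatch between the structure and the table a build error rather than a silent misread of the XSAVE area.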