
Vol. 3C 36-31

INTEL® PROCESSOR TRACE

When restoring the trace configuration context, IA32_RTIT_CTL should be restored last:
1. Read saved configuration MSR values, aside from IA32_RTIT_CTL, from memory, and restore them with WRMSR.
2. Read saved IA32_RTIT_CTL value from memory, and restore with WRMSR.
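
The two restore steps above can be sketched in C as follows. This is a minimal illustration, not a definitive implementation: the `pt_saved_ctx` structure, the `wrmsr_fn` callback, and the recording stub `record_wrmsr` are hypothetical names introduced here so that the ordering can be exercised outside of ring 0. The sketch assumes that TraceEn is already clear in the live IA32_RTIT_CTL when the restore begins, and it omits the address-range MSRs for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* MSR addresses of the Intel PT configuration registers used in this sketch. */
enum {
    MSR_IA32_RTIT_OUTPUT_BASE      = 0x560,
    MSR_IA32_RTIT_OUTPUT_MASK_PTRS = 0x561,
    MSR_IA32_RTIT_CTL              = 0x570,
    MSR_IA32_RTIT_STATUS           = 0x571,
    MSR_IA32_RTIT_CR3_MATCH        = 0x572
};

/* Hypothetical in-memory copy of the saved trace configuration. */
struct pt_saved_ctx {
    uint64_t ctl;
    uint64_t status;
    uint64_t cr3_match;
    uint64_t output_base;
    uint64_t output_mask_ptrs;
};

/* Hypothetical stand-in for the WRMSR instruction. */
typedef void (*wrmsr_fn)(uint32_t msr, uint64_t value, void *opaque);

/* Restore every saved MSR except IA32_RTIT_CTL, then IA32_RTIT_CTL last. */
void pt_restore_ctx(const struct pt_saved_ctx *ctx, wrmsr_fn wrmsr, void *opaque)
{
    assert(ctx != NULL && wrmsr != NULL);

    /* Step 1: all configuration MSRs other than IA32_RTIT_CTL. */
    wrmsr(MSR_IA32_RTIT_OUTPUT_BASE, ctx->output_base, opaque);
    wrmsr(MSR_IA32_RTIT_OUTPUT_MASK_PTRS, ctx->output_mask_ptrs, opaque);
    wrmsr(MSR_IA32_RTIT_STATUS, ctx->status, opaque);
    wrmsr(MSR_IA32_RTIT_CR3_MATCH, ctx->cr3_match, opaque);

    /* Step 2: IA32_RTIT_CTL last, since it may set TraceEn again. */
    wrmsr(MSR_IA32_RTIT_CTL, ctx->ctl, opaque);
}

/* Recording stub: logs the order of MSR writes instead of executing WRMSR. */
struct msr_record {
    uint32_t msr[8];
    size_t count;
};

void record_wrmsr(uint32_t msr, uint64_t value, void *opaque)
{
    struct msr_record *rec = opaque;
    (void)value;
    assert(rec->count < 8);
    rec->msr[rec->count++] = msr;
}
```

In a kernel, the callback would wrap the real WRMSR instruction; the recording stub exists only so that the write order can be inspected in user space.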

36.3.5.2   Trace Configuration Context Switch Using XSAVES/XRSTORS

On processors whose XSAVE feature set supports XSAVES and XRSTORS, the Trace configuration state can be 
saved using XSAVES and restored by XRSTORS, in conjunction with the bit field associated with supervisory state 
component in IA32_XSS. See Chapter 13, “Managing State Using the XSAVE Feature Set” of Intel® 64 and IA-32 
Architectures Software Developer’s Manual, Volume 1.
The layout of the trace configuration component state in the XSAVE area is shown in Table 36-13.¹

The IA32_XSS MSR is zero coming out of RESET. Once IA32_XSS[bit 8] is set, system software operating at CPL = 0 
can use XSAVES/XRSTORS with the appropriate requested-feature bitmap (RFBM) to manage supervisor state 
components in the XSAVE map. See Chapter 13, “Managing State Using the XSAVE Feature Set” of Intel® 64 and 
IA-32 Architectures Software Developer’s Manual, Volume 1.
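
As a rough illustration of how the RFBM determines whether the trace configuration state is handled, the sketch below computes the RFBM as the logical AND of (XCR0 | IA32_XSS) with the instruction mask supplied in EDX:EAX, and then tests bit 8. The function and macro names are hypothetical; the values passed in are assumed to have been read by system software beforehand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 8 of IA32_XSS selects the Intel PT (trace configuration) state component. */
#define XFEATURE_PT_BIT  8
#define XFEATURE_MASK_PT (UINT64_C(1) << XFEATURE_PT_BIT)

/* RFBM used by XSAVES/XRSTORS: (XCR0 | IA32_XSS) AND the EDX:EAX instruction mask. */
uint64_t xsaves_rfbm(uint64_t xcr0, uint64_t ia32_xss, uint64_t insn_mask)
{
    /* The PT component is supervisor state, so it is never enabled through XCR0. */
    assert((xcr0 & XFEATURE_MASK_PT) == 0);
    return (xcr0 | ia32_xss) & insn_mask;
}

/* True if an XSAVES/XRSTORS with this mask would process the PT component. */
bool xsaves_covers_pt(uint64_t xcr0, uint64_t ia32_xss, uint64_t insn_mask)
{
    return (xsaves_rfbm(xcr0, ia32_xss, insn_mask) & XFEATURE_MASK_PT) != 0;
}
```

For example, with IA32_XSS still zero after RESET, `xsaves_covers_pt` returns false regardless of the instruction mask; it returns true only once bit 8 is set in both IA32_XSS and the instruction mask.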

36.3.6   Cycle-Accurate Mode

Intel PT can be run in a cycle-accurate mode which enables CYC packets (see Section 36.4.2.14) that provide 
low-level information in the processor core clock domain. This cycle counter data in CYC packets can be used to 
compute IPC (Instructions Per Cycle), or to track wall-clock time on a fine-grain level.
To enable cycle-accurate mode packet generation, software should set IA32_RTIT_CTL.CYCEn=1. It is recommended 
that software also set TSCEn=1 any time cycle-accurate mode is in use. With CYCEn=1, every CYC-eligible packet 
is preceded by a CYC packet, the payload of which indicates the number of core clock cycles since the last CYC 
packet. When multiple CYC-eligible packets are generated in a single cycle, only a single CYC is generated before 
them; otherwise, each CYC-eligible packet is preceded by its own CYC. The CYC-eligible packets are:

• TNT, TIP, TIP.PGE, TIP.PGD, MODE.EXEC, MODE.TSX, PIP, VMCS, OVF, MTC, TSC, PTWRITE, EXSTOP
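
A decoder can use these rules to attach a cycle count to every packet by keeping a running sum of CYC payloads. The sketch below is illustrative only: `struct pkt` is a hypothetical, already-decoded view of the trace (not the packet wire format), and the instruction count passed to `pt_ipc` is assumed to come from a separate walk of the traced binary.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical decoded packet kinds; only the CYC/non-CYC distinction matters here. */
enum pkt_kind { PKT_CYC, PKT_TNT, PKT_TIP, PKT_OTHER };

struct pkt {
    enum pkt_kind kind;
    uint64_t cyc_delta; /* PKT_CYC only: core cycles since the previous CYC */
};

/*
 * Record in cycle_at[i] the running core-cycle count in effect at packet i,
 * and return the total. Packets that follow one CYC without another CYC in
 * between were generated in the same cycle, so they share a value.
 */
uint64_t pt_assign_cycles(const struct pkt *pkts, size_t n, uint64_t *cycle_at)
{
    uint64_t now = 0;

    assert(pkts != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (pkts[i].kind == PKT_CYC)
            now += pkts[i].cyc_delta;
        if (cycle_at != NULL)
            cycle_at[i] = now;
    }
    return now;
}

/* Instructions per cycle over a span; returns 0.0 when no cycles elapsed. */
double pt_ipc(uint64_t instructions, uint64_t cycles)
{
    return cycles != 0 ? (double)instructions / (double)cycles : 0.0;
}
```

Because every CYC payload is relative to the previous CYC, the running sum treats standalone CYC packets exactly like those that precede a CYC-eligible packet.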

TSC packets are generated when there is insufficient information to reconstruct wall-clock time, due to tracing 
being disabled (TriggerEn=0), or power down scenarios like a transition to a deep-sleep MWAIT C-state. In this 
case, the CYC that is generated along with the TSC indicates the number of cycles spent actively tracing (that is, 
powered up, with TriggerEn=1) between the last CYC packet and the TSC packet. The amount of time spent while 
tracing was inactive can therefore be inferred from the difference between the time expected based on the CYC 
value and the actual time indicated by the TSC.
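
The inference described above amounts to a subtraction once both quantities are expressed in the same unit. The following sketch is a simplified illustration under stated assumptions: the TSC value at the previous reference point is known, and the core clock runs at a fixed, known ratio to the TSC over the interval (expressed here as the hypothetical parameters `tsc_per_core_num` / `tsc_per_core_den`). On real hardware the core frequency can change, so a decoder would need additional information to establish that ratio.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Estimate how many TSC ticks passed while tracing was inactive.
 *
 *   tsc_prev, tsc_now  TSC values at the previous reference point and in the
 *                      new TSC packet
 *   active_cycles      payload of the CYC that accompanies the TSC packet
 *   tsc_per_core_*     assumed fixed ratio of TSC ticks per core clock cycle
 */
uint64_t pt_inactive_tsc_ticks(uint64_t tsc_prev, uint64_t tsc_now,
                               uint64_t active_cycles,
                               uint64_t tsc_per_core_num,
                               uint64_t tsc_per_core_den)
{
    assert(tsc_now >= tsc_prev);
    assert(tsc_per_core_den != 0);
    /* The product below is assumed to fit in 64 bits. */
    assert(active_cycles == 0 || tsc_per_core_num <= UINT64_MAX / active_cycles);

    uint64_t elapsed  = tsc_now - tsc_prev;
    uint64_t expected = active_cycles * tsc_per_core_num / tsc_per_core_den;

    /* Clamp at zero in case rounding makes the expected time exceed the actual. */
    return elapsed > expected ? elapsed - expected : 0;
}
```

For instance, if 4000 TSC ticks elapsed but the CYC payload accounts for only 2000 core cycles at a 1:1 ratio, the remaining 2000 ticks are attributed to the period in which tracing was inactive.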
Additional CYC packets may be sent stand-alone, so that the processor can ensure that the decoder is aware of the 
number of cycles that have passed before the internal hardware counter wraps, or is reset due to another 
micro-architectural condition. There is no guarantee at what intervals these stand-alone CYC packets will be sent, 
except that they will be sent before the wrap occurs. An illustration is given below.

1. Table 36-13 documents support for the MSRs defining address ranges 0 and 1. Processors that provide XSAVE support for Intel Processor 
   Trace support only those address ranges.

Table 36-13. Memory Layout of the Trace Configuration State Component

Offset within                                  Offset within
Component Area   Field                         Component Area   Field

0H               IA32_RTIT_CTL                 08H              IA32_RTIT_OUTPUT_BASE
10H              IA32_RTIT_OUTPUT_MASK_PTRS    18H              IA32_RTIT_STATUS
20H              IA32_RTIT_CR3_MATCH           28H              IA32_RTIT_ADDR0_A
30H              IA32_RTIT_ADDR0_B             38H              IA32_RTIT_ADDR1_A
40H              IA32_RTIT_ADDR1_B             48H–End          Reserved
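
The layout in Table 36-13 can be mirrored by a C structure whose field offsets are checked at compile time. This is a sketch for illustration: the structure and field names are hypothetical, the reserved region from 48H to the end of the component is not modeled, and the location of the component within the overall XSAVE area is outside the scope of the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical view of the defined part of the trace configuration state component. */
struct pt_xsave_state {
    uint64_t rtit_ctl;              /* 00H: IA32_RTIT_CTL */
    uint64_t rtit_output_base;      /* 08H: IA32_RTIT_OUTPUT_BASE */
    uint64_t rtit_output_mask_ptrs; /* 10H: IA32_RTIT_OUTPUT_MASK_PTRS */
    uint64_t rtit_status;           /* 18H: IA32_RTIT_STATUS */
    uint64_t rtit_cr3_match;        /* 20H: IA32_RTIT_CR3_MATCH */
    uint64_t rtit_addr0_a;          /* 28H: IA32_RTIT_ADDR0_A */
    uint64_t rtit_addr0_b;          /* 30H: IA32_RTIT_ADDR0_B */
    uint64_t rtit_addr1_a;          /* 38H: IA32_RTIT_ADDR1_A */
    uint64_t rtit_addr1_b;          /* 40H: IA32_RTIT_ADDR1_B */
    /* 48H to end of component: reserved (not modeled in this sketch). */
};

/* Compile-time checks that the structure matches the offsets in the table. */
static_assert(offsetof(struct pt_xsave_state, rtit_ctl) == 0x00, "CTL offset");
static_assert(offsetof(struct pt_xsave_state, rtit_output_base) == 0x08, "OUTPUT_BASE offset");
static_assert(offsetof(struct pt_xsave_state, rtit_output_mask_ptrs) == 0x10, "OUTPUT_MASK_PTRS offset");
static_assert(offsetof(struct pt_xsave_state, rtit_status) == 0x18, "STATUS offset");
static_assert(offsetof(struct pt_xsave_state, rtit_cr3_match) == 0x20, "CR3_MATCH offset");
static_assert(offsetof(struct pt_xsave_state, rtit_addr0_a) == 0x28, "ADDR0_A offset");
static_assert(offsetof(struct pt_xsave_state, rtit_addr0_b) == 0x30, "ADDR0_B offset");
static_assert(offsetof(struct pt_xsave_state, rtit_addr1_a) == 0x38, "ADDR1_A offset");
static_assert(offsetof(struct pt_xsave_state, rtit_addr1_b) == 0x40, "ADDR1_B offset");
```

Checking the offsets with `static_assert` means a mismatch between the structure and the table fails the build rather than silently misreading saved state.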