background image

Vol. 3C 36-1

INTEL® PROCESSOR TRACE

CHAPTER 36

INTEL® PROCESSOR TRACE

36.1 OVERVIEW

Intel

®

 Processor Trace (Intel PT) is an extension of Intel

®

 Architecture that captures information about software 

execution using dedicated hardware facilities that cause only minimal performance perturbation to the software 
being traced. This information is collected in data packets. The initial implementations of Intel PT offer control 
flow tracing
, which generates a variety of packets to be processed by a software decoder. The packets include 
timing, program flow information (e.g. branch targets, branch taken/not taken indications) and program-induced 
mode related information (e.g. Intel TSX state transitions, CR3 changes). These packets may be buffered internally 
before being sent to the memory subsystem or other output mechanism available in the platform. Debug software 
can process the trace data and reconstruct the program flow.
Later generations include additional trace sources, including software trace instrumentation using PTWRITE, and 
Power Event tracing.

36.1.1 

Features and Capabilities

Intel PT’s control flow trace generates a variety of packets that, when combined with the binaries of a program by 
a post-processing tool, can be used to produce an exact execution trace. The packets record flow information such 
as instruction pointers (IP), indirect branch targets, and directions of conditional branches within contiguous code 
regions (basic blocks).
Intel PT can also be configured to log software-generated packets using PTWRITE, and packets describing 
processor power management events.
In addition, the packets record other contextual, timing, and bookkeeping information that enables both functional 
and performance debugging of applications. Intel PT has several control and filtering capabilities available to 
customize the tracing information collected and to append other processor state and timing information to enable 
debugging. For example, there are modes that allow packets to be filtered based on the current privilege level 
(CPL) or the value of CR3.
Configuration of the packet generation and filtering capabilities are programmed via a set of MSRs. The MSRs 
generally follow the naming convention of IA32_RTIT_*. The capability provided by these configuration MSRs are 
enumerated by CPUID, see Section 36.3. Details of the MSRs for configuring Intel PT are described in Section 
36.2.7.

36.1.1.1   Packet Summary

After a tracing tool has enabled and configured the appropriate MSRs, the processor will collect and generate trace 
information in the following categories of packets (for more details on the packets, see Section 36.4):

Packets about basic information on program execution: These include:
— Packet Stream Boundary (PSB) packets: PSB packets act as ‘heartbeats’ that are generated at regular 

intervals (e.g., every 4K trace packet bytes). These packets allow the packet decoder to find the packet 
boundaries within the output data stream; a PSB packet should be the first packet that a decoder looks for 
when beginning to decode a trace.

— Paging Information Packet (PIP): PIPs record modifications made to the CR3 register. This information, 

along with information from the operating system on the CR3 value of each process, allows the debugger 
to attribute linear addresses to their correct application source.

— Time-Stamp Counter (TSC) packets: TSC packets aid in tracking wall-clock time, and contain some portion 

of the software-visible time-stamp counter.

— Core Bus Ratio (CBR) packets: CBR packets contain the core:bus clock ratio.