
17-50 Vol. 3B

DEBUG, BRANCH PROFILE, TSC, AND RESOURCE MONITORING FEATURES

A mechanism for the OS or Hypervisor to configure the amount of a resource available to a particular Class of 
Service via a list of allocation bitmasks, 

Mechanisms for the OS or Hypervisor to signal the Class of Service to which an application belongs, and

Hardware mechanisms to guide the LLC fill policy when an application has been designated to belong to a 
specific Class of Service.

Note that for many usages, an OS or Hypervisor may not want to expose Cache Allocation Technology mechanisms 
to Ring3 software or virtualized guests.
The Cache Allocation Technology feature enables more cache resources (i.e., cache space) to be made available for 
high priority applications based on guidance from the execution environment, as shown in Figure 17-26. The 
architecture also allows dynamic resource reassignment during runtime to further optimize the performance of the 
high priority application with minimal degradation to the low priority application. Additionally, resources can be 
rebalanced for system throughput benefit across use cases of OSes, VMMs, containers and other scenarios by managing 
the CPUID and MSR interfaces. This section describes the hardware and software support required in the platform, 
including what is required of the execution environment (i.e., OS/VMM) to support such resource control. Note that 
in Figure 17-26 the L3 Cache is shown as an example resource.

17.17.1  Cache Allocation Technology Architecture

The fundamental goal of Cache Allocation Technology is to enable resource allocation based on application priority 
or Class of Service (COS or CLOS). The processor exposes a set of Classes of Service into which applications (or 
individual threads) can be assigned. Cache allocation for the respective applications or threads is then restricted 
based on the class with which they are associated. Each Class of Service can be configured using capacity bitmasks 
(CBMs) which represent capacity and indicate the degree of overlap and isolation between classes. For each logical 
processor there is a register exposed (referred to here as the IA32_PQR_ASSOC MSR or PQR) to allow the OS/VMM 
to specify a COS when an application, thread or VM is scheduled. 
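
The per-logical-processor association described above can be sketched as a pure bit-field computation. This is a minimal illustration, not the architectural definition: the MSR address (0xC8F) and the field layout (COS in bits 63:32, RMID in bits 9:0) are assumptions not stated in this section and should be checked against the IA32_PQR_ASSOC register definition. The helper names (with_cos, cos_of, on_schedule) are hypothetical.

```python
# Assumed (not stated in this section): IA32_PQR_ASSOC address and layout,
# with the COS in bits 63:32 and the low 32 bits holding other fields (RMID).
IA32_PQR_ASSOC = 0xC8F
COS_SHIFT = 32
LOW_MASK = 0xFFFFFFFF


def with_cos(pqr_value: int, cos: int) -> int:
    """Return pqr_value with its COS field replaced and the low bits kept."""
    if not 0 <= cos <= 0xFFFFFFFF:
        raise ValueError("COS does not fit in the assumed 32-bit field")
    return (pqr_value & LOW_MASK) | (cos << COS_SHIFT)


def cos_of(pqr_value: int) -> int:
    """Extract the COS field from a PQR value."""
    return (pqr_value >> COS_SHIFT) & 0xFFFFFFFF


def on_schedule(pqr_value: int, task_cos: int):
    """Hypothetical scheduler hook: compute the value to program when a task
    with class task_cos is scheduled, and whether a write is needed at all."""
    new_value = with_cos(pqr_value, task_cos)
    return new_value, new_value != pqr_value
```

In this sketch an OS/VMM would call on_schedule with the value currently programmed on the logical processor and skip the MSR write when the incoming task already belongs to the active class.
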
The usage of Classes of Service (COS) is consistent across resources, and a COS may have multiple resource 
control attributes attached, which reduces software overhead at context swap time. Rather than growing with a new 
type of COS tag per resource, for instance, the COS management overhead is constant. Cache allocation for the 
indicated application/thread/VM is then controlled automatically by the hardware based on the class and the bitmask 
associated with that class. Bitmasks are configured via the IA32_resourceType_MASK_n MSRs, where 
resourceType indicates a resource type (e.g. “L3” for the L3 cache) and n indicates a COS number. 
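
As a rough illustration of how capacity bitmasks express overlap and isolation between classes, the sketch below compares masks for a few classes and builds the mask register name from the naming pattern given above. The 8-bit mask length, the example mask values, and the reading of each set bit as an equal share of capacity are assumptions made for illustration only; the helper names are hypothetical.

```python
def mask_msr_name(resource_type: str, n: int) -> str:
    """Build the register name following the IA32_resourceType_MASK_n pattern."""
    return f"IA32_{resource_type}_MASK_{n}"


def overlap(cbm_a: int, cbm_b: int) -> int:
    """Bits set in both masks, i.e. capacity the two classes share."""
    return cbm_a & cbm_b


def is_isolated(cbm_a: int, cbm_b: int) -> bool:
    """Two classes are isolated from each other when their masks share no bit."""
    return overlap(cbm_a, cbm_b) == 0


def capacity_share(cbm: int, cbm_len: int) -> float:
    """Fraction of mask bits a class may use (assumes equal-sized portions)."""
    return bin(cbm).count("1") / cbm_len


# Hypothetical configuration with an assumed 8-bit mask length.
CBM_LEN = 8
masks = {0: 0xF0, 1: 0x0F, 2: 0x3C}

for cos, cbm in masks.items():
    name = mask_msr_name("L3", cos)
    print(f"{name} = {cbm:#04x}, share = {capacity_share(cbm, CBM_LEN):.2f}")
```

Here classes 0 and 1 are isolated from each other, while class 2 overlaps both of them, which is one way a low priority class could be confined while a third class shares capacity with its neighbors.
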
The basic ingredients of Cache Allocation Technology are as follows:

An architecturally exposed mechanism using CPUID to indicate whether CAT is supported, and which resource 
types are available for control,

For each available resourceType, CPUID also enumerates the total number of Classes of Service and the length 
of the capacity bitmasks that can be used to enforce cache allocation to applications on the platform, 

An architecturally exposed mechanism to allow the execution environment (OS/VMM) to configure the behavior 
of different classes of service using the bitmasks available, 
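
The enumeration step in the items above can be sketched as decoding raw CPUID register values. The bit positions used here are assumptions not stated in this section (resource-type bits in EBX of the allocation leaf's sub-leaf 0; in a per-resource sub-leaf, EAX[4:0] holding the mask length minus one and EDX[15:0] holding the highest COS number) and should be verified against the CPUID leaf description. The functions take register values as plain integers, so obtaining them from hardware is left out, and all names are hypothetical.

```python
# Assumed bit positions in EBX of sub-leaf 0 (not given in this section).
RESOURCE_BITS = {1: "L3", 2: "L2"}


def available_resource_types(ebx_subleaf0: int) -> list:
    """List the resource types whose (assumed) enumeration bit is set."""
    return [name for bit, name in sorted(RESOURCE_BITS.items())
            if (ebx_subleaf0 >> bit) & 1]


def decode_resource(eax: int, ebx: int, edx: int) -> dict:
    """Decode a per-resource sub-leaf under the assumed field layout."""
    cbm_len = (eax & 0x1F) + 1          # field stores the length minus one
    return {
        "cbm_len": cbm_len,
        "full_mask": (1 << cbm_len) - 1,  # every capacity bit set
        "shared_bitmap": ebx,             # passed through undecoded
        "num_cos": (edx & 0xFFFF) + 1,    # field stores the highest COS number
    }


# Made-up register values, for illustration only.
print(available_resource_types(0b0110))
print(decode_resource(eax=0x13, ebx=0x600, edx=0x0F))
```

With these made-up inputs the sketch reports a 20-bit capacity bitmask and 16 Classes of Service, which is the information software needs before configuring any class.
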

Figure 17-26.  Cache Allocation Technology Allocates More Resource to High Priority Applications

[Figure: two panels, each showing a low priority application and a high priority application running on Core 0 and Core 1 over a shared LLC. Without CAT: shared LLC, low priority got more cache. With CAT: shared LLC, high priority got more cache.]