Looking at ACPI PPTT from Userspace

The sysfs file system exposes Linux kernel constructs to user space. One place we can see ACPI-based information is /sys/firmware/acpi:

ls /sys/firmware/acpi/
bgrt  hotplug  pm_profile  tables

What are these? Well, bgrt is, I think, the Boot Graphics Resource Table.

$ ls /sys/firmware/acpi/bgrt/
image  status  type  version  xoffset  yoffset

Sure looks like it.

What I am currently interested in is the tables subdirectory.

ls /sys/firmware/acpi/tables/
APIC  BGRT  DBG2  dynamic  FACP  FPDT  HEST  MCFG  PPTT  SLIT  SPMI1  SPMI3  SSDT
BERT  data  DSDT  EINJ     FIDT  GTDT  IORT  PCCT  SDEI  SPCR  SPMI2  SRAT   WSMT

With the exception of data and dynamic, these appear to be the four-character table signatures defined by ACPI (some with a numeric suffix where a table occurs more than once). I have a passing knowledge of two of these: BERT and SSDT. BERT is the Boot Error Record Table, a log used for debugging bringup issues. We had a bug around this that was recently fixed.
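Each of these files is a raw binary table that starts with the standard 36-byte ACPI header, and the first four bytes of that header are the signature seen in the file name. Here is a minimal Python sketch of reading that header. The function names parse_acpi_header and checksum_ok are my own, and reading the files under /sys/firmware/acpi/tables usually requires root.

```python
import struct
from collections import namedtuple

# Standard ACPI system description table header: 36 bytes, little-endian.
ACPI_HEADER = struct.Struct("<4sIBB6s8sI4sI")

AcpiHeader = namedtuple(
    "AcpiHeader",
    "signature length revision checksum oem_id oem_table_id "
    "oem_revision creator_id creator_revision",
)


def parse_acpi_header(raw):
    """Decode the 36-byte header at the start of a raw ACPI table."""
    return AcpiHeader(*ACPI_HEADER.unpack_from(raw, 0))


def checksum_ok(raw):
    """All bytes covered by the header's length field must sum to 0 mod 256."""
    length = parse_acpi_header(raw).length
    return sum(raw[:length]) % 256 == 0


if __name__ == "__main__":
    # Path taken from the listing above; adjust for the table of interest.
    with open("/sys/firmware/acpi/tables/PPTT", "rb") as f:
        data = f.read()
    print(parse_acpi_header(data), checksum_ok(data))
```

The length field covers the whole table, header included, so it should match the size of the file.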

The patch I was reviewing referenced the Processor Properties Topology Table (PPTT). The file with the largest number of changes in this patch is drivers/acpi/pptt.c, with more than 500 lines. It looks like it might be entirely new in this patch, and it has grown a bit since:

wc -l pptt.c 
840 pptt.c

Git blame shows about 4 patches: the first, which added the file, and a fairly big chunk added last year. Let's look at the commit message of the first one:

ACPI/PPTT: Add Processor Properties Topology Table parsing

ACPI 6.2 adds a new table, which describes how processing units are related to each other in tree like fashion. Caches are also sprinkled throughout the tree and describe the properties of the caches in relation to other caches and processing units.

Add the code to parse the cache hierarchy and report the total number of levels of cache for a given core using acpi_find_last_cache_level() as well as fill out the individual cores cache information with cache_setup_acpi() once the cpu_cacheinfo structure has been populated by the arch specific code.

An additional patch later in the set adds the ability to report peers in the topology using find_acpi_cpu_topology() to report a unique ID for each processing unit at a given level in the tree. These unique id's can then be used to match related processing units which exist as threads, within a given package, etc.

Linux Kernel commit 2bd00bcd73e5edd5769e2a5f24c59a517582d862

My system has 80 cores in one CPU socket/chip. This is a non-trivial processor setup compared to an ARM64 Raspberry Pi, which has only four cores on current models. A cell phone most likely has 2, 4, or 8 cores; complexity is growing in your pocket!

There is a command in the discussion of the patch that shows how to make use of this table:

cat /sys/firmware/acpi/tables/PPTT > pptt.dat
iasl -d pptt.dat

What is iasl? iasl – ACPI Source Language compiler/decompiler

I ran roughly those two commands on my system and it generated a fairly large file:

wc -l /tmp/pptt.dsl
24230 /tmp/pptt.dsl

A comment at the top of the output shows how to interpret it:

Format: [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue (in hex)
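With 24,000 lines to look through, it helps to pull the fields out programmatically. The following is a rough Python sketch that splits one line of that format into its parts. The regular expression and the sample line are my own reconstruction from the format comment, not copied from real iasl output, so the exact spacing may differ.

```python
import re

# Matches lines of the form:
#   [HexOffset DecimalOffset ByteLength]  FieldName : FieldValue
# The hex offset carries a trailing "h", e.g. "024h".
FIELD_RE = re.compile(
    r"^\s*\[\s*([0-9A-Fa-f]+)h\s+(\d+)\s+(\d+)\s*\]\s*(.*?)\s*:\s*(.*)$"
)


def parse_field(line):
    """Return a dict for a field line, or None for any other line."""
    m = FIELD_RE.match(line)
    if m is None:
        return None
    hex_off, _dec_off, length, name, value = m.groups()
    return {
        "offset": int(hex_off, 16),
        "length": int(length),
        "name": name,
        "value": value.strip(),
    }


# Hypothetical sample line, for illustration only.
sample = "[024h 0036   1]                Subtable Type : 00"
print(parse_field(sample))
```

Lines that do not start with the bracketed offsets, such as comments and blank lines, come back as None and can be skipped.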

I think the HexOffset of a node is what its children's Parent fields refer to, which shows how the tree is built.

  • The top node has an offset of 0x0 and no parent key.
  • The second node has an offset of 0x24 and a parent value of 0x0.
  • The third node has an offset of 0x38 and a parent value of 0x24.
  • The fourth node has an offset of 0x4C and a parent value of 0x24.
  • Skipping a couple, I see a node with offset 0x60 and a parent of 0x38. This is the parent value for all nodes up through offset 0x54C.
  • The node at offset 0x560 has a parent value of 0x4C, which is also repeated for many more nodes.

So I think it is safe to say that we know how this part of the tree is built.
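The same tree can be rebuilt directly from the binary table. The sketch below is based on my reading of the ACPI specification rather than on the kernel code: the standard 36-byte table header occupies offset 0, each subtable starts with a one-byte type and a one-byte length, and a processor hierarchy node (type 0) stores its flags, parent offset, and ACPI processor ID as 32-bit values starting four bytes in. A parent of 0 would then mean the node has no parent. The function names are my own.

```python
import struct

ACPI_HEADER_LEN = 36   # standard ACPI table header, so nodes start at 0x24
PROCESSOR_NODE = 0     # PPTT subtable type for a processor hierarchy node
MIN_PROC_NODE_LEN = 20 # type, length, reserved, flags, parent, id, resource count


def processor_nodes(table):
    """Yield (offset, parent_offset, acpi_processor_id) for each processor node."""
    total = min(struct.unpack_from("<I", table, 4)[0], len(table))
    offset = ACPI_HEADER_LEN
    while offset + 2 <= total:
        node_type, node_len = struct.unpack_from("<BB", table, offset)
        if node_len < 2:
            break  # malformed entry; stop rather than loop forever
        if (node_type == PROCESSOR_NODE and node_len >= MIN_PROC_NODE_LEN
                and offset + MIN_PROC_NODE_LEN <= total):
            _flags, parent, proc_id, _n_res = struct.unpack_from(
                "<IIII", table, offset + 4)
            yield offset, parent, proc_id
        offset += node_len


def children_by_parent(table):
    """Map each parent offset to the list of offsets of its child nodes."""
    tree = {}
    for offset, parent, _proc_id in processor_nodes(table):
        tree.setdefault(parent, []).append(offset)
    return tree
```

Fed the pptt.dat file from earlier, children_by_parent should give one dictionary key per parent offset, which makes it easy to count how many nodes hang off 0x38 versus 0x4C.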

The MPIDR referenced in the patch is a way of identifying cores for scheduling affinity, which affects performance on high core count CPUs.
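To make the MPIDR a little more concrete, here is a small Python sketch that splits an MPIDR_EL1 value into its affinity fields, following the register layout in the Arm architecture documentation as I understand it: Aff0 in bits 7:0, Aff1 in bits 15:8, Aff2 in bits 23:16, Aff3 in bits 39:32, and the MT bit (bit 24) indicating that the lowest affinity level is made of hardware threads. The function name and the example value are made up for illustration.

```python
def decode_mpidr(mpidr):
    """Split an MPIDR_EL1 value into its affinity fields and MT flag."""
    return {
        "aff0": mpidr & 0xFF,
        "aff1": (mpidr >> 8) & 0xFF,
        "aff2": (mpidr >> 16) & 0xFF,
        "aff3": (mpidr >> 32) & 0xFF,
        "mt": bool((mpidr >> 24) & 1),  # set when Aff0 enumerates SMT threads
    }


# Illustrative value only, not read from real hardware.
print(decode_mpidr(0x80000102))
```

On a part without SMT the MT bit should be clear, and Aff0 then identifies a core rather than a thread.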

I think that is why I am not seeing the problem: this patch is (I think) explicitly for SMT systems, and ours is not. I don't know if it would be, strictly speaking, correct to call an AltraMax chip MPP (it might be?), but it certainly does not treat all CPU resources as equal. It is more correct to call it a NUMA architecture.
