# Recount

CPU resource accounting interfaces and implementation.

## Overview

Recount is a resource accounting subsystem in the kernel that tracks the CPU resources consumed by threads, tasks, coalitions, and processors.
It supports attributing counts to a specific level of the CPU topology (per-CPU and per-CPU kind).
ARM64 devices with a fast timebase read and Intel devices can track time spent in the kernel (system) separately from user space.
64-bit, non-virtualized (i.e. _not_ running under a hypervisor) devices also accumulate instructions and cycles at each context switch.
Together, these two metrics are referred to as cycles-per-instruction, or CPI, for brevity.
ARM64 devices can also track task and thread energy in nanojoules,
but only at the granularity of thread context switch,
not between user and system.

By default, Recount tracks its counters per-CPU kind (e.g. performance or efficiency) for threads, per-CPU for tasks, and per-CPU kind for coalitions.

## High-Level Interfaces

These interfaces report counter data to user space and are backed by Recount.

| Interface                   | Entity      | Target        | Tests | Time | CPI | Energy | Perf Levels | Secure |
| --------------------------: | ----------- | ------------- | :---: | :--: | :-: | :----: | :---------: | :----: |
|                 `getrusage` | task        | self/children |  FP   |  ✓¹  |     |        |             |        |
|           `proc_pid_rusage` | task        | pid           |  FP   |  ✓   |  ✓  |   ✓    |     ✓²      |   ✓²   |
|          `PROC_PIDTASKINFO` | task        | pid           |  FP   |  ✓   |  ✓  |        |     ✓²      |        |
|           `TASK_BASIC_INFO` | task        | task port     |  FP   |  ✓¹  |     |        |             |        |
|    `TASK_ABSOLUTETIME_INFO` | task        | task port     |  FP   |  ✓   |     |        |             |        |
|           `TASK_POWER_INFO` | task        | task port     |  FP   |  ✓   |     |        |             |        |
| `TASK_INSPECT_BASIC_COUNTS` | task        | task inspect  |   P   |      |  ✓  |        |             |        |
|         `THREAD_BASIC_INFO` | thread      | thread port   |   P   |  ✓   |     |        |             |        |
|      `THREAD_EXTENDED_INFO` | thread      | thread port   |       |  ✓   |     |        |             |        |
|           `proc_threadinfo` | thread      | thread ID     |       |  ✓   |     |        |             |        |
|         `proc_threadcounts` | thread      | thread ID     |   F   |  ✓   |  ✓  |   ✓    |      ✓      |        |
|         `thread_selfcounts` | thread      | self          |  FP   |  ✓   |  ✓  |   ✓    |      ✓      |        |
|          `thread_selfusage` | thread      | self          |  FP   |  ✓   |     |        |             |        |
|            `coalition_info` | coalition   | coalition ID  |   F   |  ✓   |  ✓  |   ✓    |     ✓²      |        |
|        `HOST_CPU_LOAD_INFO` | system      | all           |       |  ✓   |     |        |             |        |
|   `PROCESSOR_CPU_LOAD_INFO` | processor   | port          |       |  ✓   |     |        |             |        |
|                 `stackshot` | task/thread | all           |   P   |  ✓   |  ✓  |        |     ✓²      |        |
|                      DTrace | thread      | any           |       |  ✓   |  ✓  |        |             |        |
|                       kperf | task/thread | any           |       |  ✓   |  ✓  |        |     ✓²      |        |

- Under Tests, "F" is functional and "P" is performance.
- ¹ Time precision is microseconds.
- ² These return overall totals and hard-code a separate, P-core-only value.
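
As a minimal sketch of the first row of the table, the following reads the calling process's user and system CPU time through the POSIX `getrusage` call, whose `timeval` results carry the microsecond precision noted in footnote ¹. The helper name `self_cpu_time_us` is hypothetical and exists only for this illustration.

```c
#include <stdint.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Hypothetical helper (not part of any Apple API): report the user and
 * system CPU time consumed by the calling process, in microseconds.
 * Returns 0 on success and -1 if getrusage() fails.
 */
static int
self_cpu_time_us(uint64_t *user_us, uint64_t *system_us)
{
	struct rusage ru;

	if (getrusage(RUSAGE_SELF, &ru) != 0) {
		return -1;
	}
	/* struct timeval splits the value into seconds and microseconds. */
	*user_us = (uint64_t)ru.ru_utime.tv_sec * 1000000u +
	    (uint64_t)ru.ru_utime.tv_usec;
	*system_us = (uint64_t)ru.ru_stime.tv_sec * 1000000u +
	    (uint64_t)ru.ru_stime.tv_usec;
	return 0;
}
```

Passing `RUSAGE_CHILDREN` instead of `RUSAGE_SELF` would cover the "children" half of the target column.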

## LLDB

The `recount` macro inspects counters in an LLDB session and is generally useful for retrospective analysis of CPU usage.
Its subcommands print each metric as a column and use rows for the groupings, like per-CPU or per-CPU kind values.
Tables also include formulaic columns that can be derived from two metrics, like CPI or power.
By default, it prints the times in seconds, but the `-M` flag switches the output to Mach time values.
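
The derived columns are simple ratios of two counters. The sketch below shows the arithmetic only; the function names are hypothetical, and it assumes the time has already been converted from Mach time units to nanoseconds. Energy in nanojoules divided by time in nanoseconds gives watts directly.

```c
#include <stdint.h>

/* Cycles per instruction; returns 0.0 when no instructions were retired. */
static double
derived_cpi(uint64_t cycles, uint64_t instructions)
{
	return instructions ? (double)cycles / (double)instructions : 0.0;
}

/* Average power in watts: nJ / ns == J / s. Returns 0.0 for zero time. */
static double
derived_power_watts(uint64_t energy_nj, uint64_t time_ns)
{
	return time_ns ? (double)energy_nj / (double)time_ns : 0.0;
}
```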

- `recount thread <thread-ptr> [...]` prints a table of per-CPU kind counts for threads.

- `recount task <task-ptr> [...]` prints a table of per-CPU counts for tasks.
	- `-T` prints the task's active thread counters in additional tables.
	- `-F <name>` finds the task matching the provided name instead of using a task pointer.

- `recount coalition <coalition-ptr>` prints a table of per-CPU kind counts for each coalition, not including the currently-active tasks.
Coalition pointers can be found with the `showtaskcoalitions` macro, and should be _resource_ coalitions.

- `recount processor <processor-ptr-or-cpu-id>` prints a table of counts for a processor.
	- `-T` prints the processor's active thread counters in an additional table.
	- `-A` includes all processors in the output.

- `recount diagnose` prints information useful for debugging the Recount subsystem itself.

- `recount triage` is meant to be used by the automated panic debug scripts.

## Internals

Accounting for groups of entities like threads and tasks starts with a `recount_plan_t`, declared using `RECOUNT_PLAN_DECLARE` and defined with `RECOUNT_PLAN_DEFINE`, which takes the topology, or granularity, of the counting.
The plan topology defines how many `recount_usage` structures are needed.
To count CPU resource usage, a `struct recount_usage` has the following fields:

- `ru_metrics[RCT_LVL_COUNT]`: metrics accumulated in each exception level
- `ru_energy_nj`: the energy consumed by a CPU, in nanojoules with `CONFIG_PERVASIVE_ENERGY`

The metrics are stored in a `recount_metrics` structure with the following fields:

- `ru_time_mach`: the time spent, in Mach time units
- `ru_cycles`: the cycles run by a CPU with `CONFIG_PERVASIVE_CPI`
- `ru_instructions`: the instructions retired by a CPU with `CONFIG_PERVASIVE_CPI`

At context switch, `recount_switch_thread` captures the hardware counters with `recount_snapshot` into a `struct recount_snap`.
The CPU's previous snapshot, stored in the `_snaps_percpu` per-CPU variable, is subtracted from the new one to get a delta to add to the currently-executing entity's usage structure.
The per-CPU variable is then updated with the current snapshot for the next switch.
The user/kernel transition code calls `recount_leave_user` or `recount_enter_user`, which perform the same operation, except with `recount_snapshot_speculative`.
This relies on other synchronization barriers in the transition code to keep the snapshot precise.
During preemption, the context switch handler attributes metrics back to the level stored in each thread.
On the boundaries of secure execution handoff, `recount_enter_secure` and `recount_leave_secure` update the current thread's level and attribute metrics back to the previous level.
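
The snapshot-and-delta step can be sketched in a few lines of C. This is an illustration under stated assumptions, not the kernel's code: `struct sketch_snap`, `struct sketch_metrics`, and `sketch_account_since_last` are hypothetical stand-ins for `struct recount_snap`, `recount_metrics`, and the logic inside `recount_switch_thread`, and the hardware counter reads are replaced by values passed in by the caller.

```c
#include <stdint.h>

/* Hypothetical stand-in for a counter snapshot taken on one CPU. */
struct sketch_snap {
	uint64_t time_mach;
	uint64_t cycles;
	uint64_t instructions;
};

/* Hypothetical stand-in for the metrics accumulated by an entity. */
struct sketch_metrics {
	uint64_t time_mach;
	uint64_t cycles;
	uint64_t instructions;
};

/*
 * Charge everything that happened on this CPU since `last` to `usage`,
 * then remember `now` as the baseline for the next switch.
 */
static void
sketch_account_since_last(struct sketch_snap *last,
    const struct sketch_snap *now, struct sketch_metrics *usage)
{
	usage->time_mach += now->time_mach - last->time_mach;
	usage->cycles += now->cycles - last->cycles;
	usage->instructions += now->instructions - last->instructions;
	*last = *now;
}
```

Because the baseline is replaced on every call, each interval is charged exactly once, to whichever entity was running when the interval ended.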

Processors also track their idle time separately from the usage structure with paired calls to `recount_processor_idle` and `recount_processor_run`.
Idle time has no user component and doesn't consume instructions or cycles, so a full usage structure isn't necessary.
Each processor stores the last update time in a 64-bit value combined with a state stored in the top two bits to determine whether the processor is currently idle or active.
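
Packing a state into the top two bits of a 64-bit timestamp could look like the sketch below. The text only says that the state occupies the top two bits; the enum values, macro names, and function names here are assumptions made for illustration and are not taken from the kernel.

```c
#include <stdint.h>

/* Assumed layout: bits 63..62 hold the state, bits 61..0 hold the time. */
#define SKETCH_STATE_SHIFT 62
#define SKETCH_TIME_MASK   ((UINT64_C(1) << SKETCH_STATE_SHIFT) - 1)

/* Hypothetical state encoding; the real values may differ. */
enum sketch_proc_state {
	SKETCH_PROC_ACTIVE = 0,
	SKETCH_PROC_IDLE = 1,
};

/* Combine a state and a timestamp into one 64-bit word. */
static uint64_t
sketch_pack(enum sketch_proc_state state, uint64_t time_mach)
{
	return ((uint64_t)state << SKETCH_STATE_SHIFT) |
	       (time_mach & SKETCH_TIME_MASK);
}

/* Extract the state from the top two bits. */
static enum sketch_proc_state
sketch_unpack_state(uint64_t packed)
{
	return (enum sketch_proc_state)(packed >> SKETCH_STATE_SHIFT);
}

/* Extract the timestamp from the low 62 bits. */
static uint64_t
sketch_unpack_time(uint64_t packed)
{
	return packed & SKETCH_TIME_MASK;
}
```

Keeping both pieces in a single word lets the state and the time of the last transition be read or replaced with one 64-bit access.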

A `struct recount_track` is the primary data structure found in threads, tasks, and processors.
A track includes a `recount_usage` structure and ensures that it is updated atomically with respect to readers.

### Track Atomicity

To ensure the accuracy of formulas involving multiple metrics, like CPI, all metrics must be updated atomically from the perspective of the reader.
A traditional locking mechanism would prevent the writer from updating the counts while readers are present, so Recount uses a sequence lock instead.
Writers make a generation count odd before updating any of the values and then set it back to even when all values are updated.
Readers wait until the generation count becomes even before trying to read the values, and if the count changes by the time they're done reading, they retry the read.
Since three entities need to be updated at once (thread, task, and processor), only the last update has a release barrier to publish the writes.
When reporting just user and system time, taking the sequence lock as a reader introduced unacceptable overhead.
The sequence lock doesn't need to be taken for these metrics since they're never updated simultaneously.
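
The generation-count protocol can be sketched with C11 atomics. This is a generic single-writer sequence lock written for illustration, not Recount's implementation: `struct sketch_track` and both functions are hypothetical, it uses `<stdatomic.h>` rather than the kernel's own atomic primitives, and it covers a single track rather than the combined thread, task, and processor update described above.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical track: a generation count guarding two metrics. */
struct sketch_track {
	_Atomic uint64_t gen;
	_Atomic uint64_t cycles;
	_Atomic uint64_t instructions;
};

/* Single writer: make the generation odd, update, make it even again. */
static void
sketch_track_add(struct sketch_track *t, uint64_t cycles, uint64_t instrs)
{
	uint64_t gen = atomic_load_explicit(&t->gen, memory_order_relaxed);

	atomic_store_explicit(&t->gen, gen + 1, memory_order_relaxed);
	/* Order the odd generation before the data stores below. */
	atomic_thread_fence(memory_order_release);

	atomic_store_explicit(&t->cycles,
	    atomic_load_explicit(&t->cycles, memory_order_relaxed) + cycles,
	    memory_order_relaxed);
	atomic_store_explicit(&t->instructions,
	    atomic_load_explicit(&t->instructions, memory_order_relaxed) + instrs,
	    memory_order_relaxed);

	/* Publish the new values along with the even generation. */
	atomic_store_explicit(&t->gen, gen + 2, memory_order_release);
}

/* Reader: retry while a write is in progress or one happened mid-read. */
static void
sketch_track_read(struct sketch_track *t, uint64_t *cycles, uint64_t *instrs)
{
	uint64_t before, after;

	do {
		before = atomic_load_explicit(&t->gen, memory_order_acquire);
		*cycles = atomic_load_explicit(&t->cycles, memory_order_relaxed);
		*instrs = atomic_load_explicit(&t->instructions,
		    memory_order_relaxed);
		/* Order the data loads before re-checking the generation. */
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&t->gen, memory_order_relaxed);
	} while ((before & 1) != 0 || before != after);
}
```

The writer never blocks on readers, which is the property the text relies on: a context switch can always update the counters, and only readers pay the cost of a retry.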

The coalition counters are not updated by threads switching off-CPU and are instead protected by the coalition lock while a task exits and rolls up its counters to the coalition.
Reading the counters requires holding the lock and iterating the constituent tasks, grouping their per-CPU counters into per-CPU kind ones.
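
Grouping per-CPU counters into per-CPU kind counters amounts to a bucketed sum. The sketch below shows only that step, for a single metric; the enum, the `kind_of_cpu` mapping, and the function name are hypothetical, and locking is left to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical CPU kinds; the real topology is defined by the platform. */
enum sketch_cpu_kind {
	SKETCH_KIND_EFFICIENCY = 0,
	SKETCH_KIND_PERFORMANCE = 1,
	SKETCH_KIND_COUNT,
};

/*
 * Add one task's per-CPU values into running per-kind totals.
 * `kind_of_cpu[i]` names the kind of CPU i. The totals are accumulated
 * into, not reset, so the function can be called once per task.
 */
static void
sketch_roll_up(const uint64_t *per_cpu, const enum sketch_cpu_kind *kind_of_cpu,
    size_t ncpus, uint64_t per_kind[SKETCH_KIND_COUNT])
{
	for (size_t i = 0; i < ncpus; i++) {
		per_kind[kind_of_cpu[i]] += per_cpu[i];
	}
}
```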

### Energy

The energy counters on ARM systems count a custom unit of energy that needs to be scaled to nanojoules.
Because this unit can be very small and may overflow a 64-bit counter, it's scaled to nanojoules during context switch.

Unlike the other metrics, the energy counters are not sampled directly by Recount, so the values cannot be tracked at user/kernel/secure granularity.

## See Also

- <doc:cpu_counters>