# Recount

Recount is a resource accounting subsystem in the kernel that tracks the CPU resources consumed by threads, tasks, coalitions, and processors.
It supports attributing counts to a specific level of the CPU topology (per-CPU and per-CPU kind).
ARM64 devices with a fast timebase read and Intel devices can track time spent in the kernel (system) separately from user space.
64-bit, non-virtualized (i.e. _not_ running under a hypervisor) devices also accumulate instructions and cycles at each context switch.
These two metrics are referred to together as cycles-per-instruction, or CPI, for brevity.
ARM64 devices can also track task and thread energy in nanojoules.

By default, Recount tracks its counters per-CPU kind (e.g. performance or efficiency) for threads, per-CPU for tasks, and per-CPU kind for coalitions.

## High-Level Interfaces

These interfaces report counter data to user space and are backed by Recount.

| Interface                   | Entity      | Target        | Tests | Time | CPI | Energy | Perf Levels |
| --------------------------: | ----------- | ------------- | :---: | :--: | :-: | :----: | :---------: |
| `getrusage`                 | task        | self/children | FP    | ✓¹   |     |        |             |
| `proc_pid_rusage`           | task        | pid           | FP    | ✓    | ✓   | ✓      | ✓²          |
| `PROC_PIDTASKINFO`          | task        | pid           | FP    | ✓    | ✓   |        | ✓²          |
| `TASK_BASIC_INFO`           | task        | task port     | FP    | ✓¹   |     |        |             |
| `TASK_ABSOLUTETIME_INFO`    | task        | task port     | FP    | ✓    |     |        |             |
| `TASK_POWER_INFO`           | task        | task port     | FP    | ✓    |     |        |             |
| `TASK_INSPECT_BASIC_COUNTS` | task        | task inspect  | P     |      | ✓   |        |             |
| `THREAD_BASIC_INFO`         | thread      | thread port   | P     | ✓    |     |        |             |
| `THREAD_EXTENDED_INFO`      | thread      | thread port   |       | ✓    |     |        |             |
| `proc_threadinfo`           | thread      | thread ID     |       | ✓    |     |        |             |
| `proc_threadcounts`         | thread      | thread ID     | F     | ✓    | ✓   | ✓      | ✓           |
| `thread_selfcounts`         | thread      | self          | FP    | ✓    | ✓   | ✓      | ✓           |
| `thread_selfusage`          | thread      | self          | FP    | ✓    |     |        |             |
| `coalition_info`            | coalition   | coalition ID  | F     | ✓    | ✓   | ✓      | ✓²          |
| `HOST_CPU_LOAD_INFO`        | system      | all           |       | ✓    |     |        |             |
| `PROCESSOR_CPU_LOAD_INFO`   | processor   | port          |       | ✓    |     |        |             |
| `stackshot`                 | task/thread | all           | P     | ✓    | ✓   |        | ✓²          |
| DTrace                      | thread      | any           |       | ✓    | ✓   |        |             |
| kperf                       | task/thread | any           |       | ✓    | ✓   |        | ✓²          |

- Under Tests, "F" is functional and "P" is performance.
- ¹ Time precision is microseconds.
- ² These return overall totals and hard-code a separate, P-core-only value.

## LLDB

The `recount` macro inspects counters in an LLDB session and is generally useful for retrospective analysis of CPU usage.
Its subcommands print each metric as a column and use rows for the groupings, like per-CPU or per-CPU kind values.
Tables also include formulaic columns that are derived from two metrics, like CPI or power.
By default, it prints the times in seconds, but the `-M` flag switches the output to Mach time values.

- `recount thread <thread-ptr> [...]` prints a table of per-CPU kind counts for threads.

- `recount task <task-ptr> [...]` prints a table of per-CPU counts for tasks.
  - `-T` prints the task's active thread counters in additional tables.
  - `-F <name>` finds the task matching the provided name instead of using a task pointer.

- `recount coalition <coalition-ptr>` prints a table of per-CPU kind counts for each coalition, not including the currently-active tasks.
  Coalition pointers can be found with the `showtaskcoalitions` macro, and should be _resource_ coalitions.

- `recount processor <processor-ptr-or-cpu-id>` prints a table of counts for a processor.
  - `-T` prints the processor's active thread counters in an additional table.
  - `-A` includes all processors in the output.

- `recount diagnose` prints information useful for debugging the Recount subsystem itself.

- `recount triage` is meant to be used by the automated panic debug scripts.

## Internals

Accounting for groups of entities like threads and tasks starts with a `recount_plan_t`, declared using `RECOUNT_PLAN_DECLARE` and defined with `RECOUNT_PLAN_DEFINE`, which takes the topology, or granularity, of the counting.
The plan topology defines how many `recount_usage` structures are needed.
To count CPU resource usage, a `struct recount_usage` has the following fields:

- `ru_system_time_mach`: the total time spent in the kernel, in Mach time units
- `ru_user_time_mach`: the total time spent in user space, in Mach time units
- `ru_cycles`: the cycles run by a CPU, with `CONFIG_PERVASIVE_CPI`
- `ru_instructions`: the instructions retired by a CPU, with `CONFIG_PERVASIVE_CPI`
- `ru_energy_nj`: the energy consumed by a CPU, in nanojoules, with `CONFIG_PERVASIVE_ENERGY`

At context switch, `recount_switch_thread` captures the hardware counters with `recount_snapshot` into a `struct recount_snap`.
The CPU's previous snapshot, stored in the `_snaps_percpu` per-CPU variable, is subtracted from the new one to get a delta to add to the currently-executing entity's usage structure.
The per-CPU variable is then updated with the current snapshot for the next switch.
The user/kernel transition code calls `recount_leave_user` or `recount_enter_user`, which perform the same operation, except with `recount_snapshot_speculative`.
The speculative snapshot relies on other synchronization barriers in the transition code to keep it precise.

Processors also track their idle time separately from the usage structure with paired calls to `recount_processor_idle` and `recount_processor_run`.
Idle time has no user component and doesn't consume instructions or cycles, so a full usage structure isn't necessary.
Each processor stores the last update time in a 64-bit value, with a state in the top two bits that records whether the processor is currently idle or active.

A `struct recount_track` is the primary data structure found in threads, tasks, and processors.
A track includes a `recount_usage` structure and ensures that it is updated atomically with respect to readers.

### Track Atomicity

To ensure the accuracy of formulas involving multiple metrics, like CPI, all metrics must be updated atomically from the perspective of the reader.
A traditional locking mechanism would prevent the writer from updating the counts while readers are present, so Recount uses a sequence lock instead.
Writers make a generation count odd before updating any of the values and then set it back to even when all values are updated.
Readers wait until the generation count becomes even before trying to read the values, and if the count has changed by the time they're done reading, they retry the read.
Since three entities need to be updated at once (thread, task, and processor), only the last update has a release barrier to publish the writes.
When reporting just user and system time, taking the sequence lock as a reader introduced unacceptable overhead.
The sequence lock doesn't need to be taken for these metrics since they're never updated simultaneously.
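
The generation-count protocol described above can be sketched in portable C11. This is only an illustration under stated assumptions, not Recount's code: `sketch_track`, `sketch_track_add`, and `sketch_track_read` are hypothetical names, the sketch assumes a single writer per track, and it publishes each track individually rather than sharing one release barrier across the thread, task, and processor updates.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for a track; not the kernel's struct recount_track. */
struct sketch_track {
	atomic_uint_fast64_t gen;          /* odd while a writer is mid-update */
	atomic_uint_fast64_t cycles;
	atomic_uint_fast64_t instructions;
};

/* Writer side. Assumes only one writer updates a given track at a time. */
static void
sketch_track_add(struct sketch_track *t, uint64_t cycles, uint64_t instrs)
{
	uint64_t gen = atomic_load_explicit(&t->gen, memory_order_relaxed);

	/* Make the generation odd so readers know an update is in flight. */
	atomic_store_explicit(&t->gen, gen + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);

	atomic_fetch_add_explicit(&t->cycles, cycles, memory_order_relaxed);
	atomic_fetch_add_explicit(&t->instructions, instrs, memory_order_relaxed);

	/* Back to even; the release store publishes both values together. */
	atomic_store_explicit(&t->gen, gen + 2, memory_order_release);
}

/* Reader side: retry until a stable, even generation brackets the reads. */
static void
sketch_track_read(struct sketch_track *t, uint64_t *cycles, uint64_t *instrs)
{
	uint64_t before, after;

	do {
		before = atomic_load_explicit(&t->gen, memory_order_acquire);
		*cycles = atomic_load_explicit(&t->cycles, memory_order_relaxed);
		*instrs = atomic_load_explicit(&t->instructions, memory_order_relaxed);
		atomic_thread_fence(memory_order_acquire);
		after = atomic_load_explicit(&t->gen, memory_order_relaxed);
	} while ((before & 1) != 0 || before != after);
}
```

Because the reader never writes to the track, a writer at context switch is never blocked by concurrent readers; the cost of contention falls on the reader, which simply retries.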

The coalition counters are not updated by threads switching off-CPU and are instead protected by the coalition lock while a task exits and rolls up its counters to the coalition.
Reading the counters requires holding the lock and iterating the constituent tasks, grouping their per-CPU counters into per-CPU kind ones.

### Energy

The energy counters on ARM systems count a custom unit of energy that needs to be scaled to nanojoules.
Because this unit can be very small, raw counts may overflow a 64-bit counter, so they're scaled to nanojoules at each context switch.
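
One way to picture scaling at switch time is to convert each raw delta to nanojoules before accumulating it, so the long-lived total only ever holds the coarser unit. The sketch below is an assumption-laden illustration: the types, the function, and the rational scale factor are all hypothetical, and the real hardware unit and conversion are not described in this document.

```c
#include <stdint.h>

/* Hypothetical scale: nanojoules per raw hardware unit, as numer/denom. */
struct sketch_energy_scale {
	uint64_t numer;
	uint64_t denom;
};

/* Hypothetical per-entity state; not a kernel structure. */
struct sketch_energy {
	uint64_t last_raw;   /* raw counter value at the previous switch */
	uint64_t energy_nj;  /* accumulated energy, already in nanojoules */
};

/*
 * Called at each (hypothetical) context switch with the current raw reading.
 * Only the small per-switch delta is kept in raw units. This sketch truncates
 * any fractional nanojoule and assumes delta * numer fits in 64 bits.
 */
static void
sketch_energy_switch(struct sketch_energy *e, uint64_t raw_now,
    struct sketch_energy_scale scale)
{
	uint64_t delta_raw = raw_now - e->last_raw;

	e->last_raw = raw_now;
	e->energy_nj += delta_raw * scale.numer / scale.denom;
}
```

With an assumed scale of one quarter nanojoule per raw unit, a raw reading that advances from 0 to 1000 and then to 1600 would accumulate 250 nJ and then 150 nJ more.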