xref: /xnu-12377.41.6/doc/debugging/extensible_paniclog.md (revision bbb1b6f9e71b8cdde6e5cd6f4841f207dee3d828)
# Extensible Paniclog

This document describes the API and features of the extensible paniclog in XNU's panic flow.

## Overview

This feature provides infrastructure for kexts and dexts to insert their system state into the paniclog. Currently there is no way of knowing the state of a kext or dext without taking a full coredump. With this feature, they can record relevant state information that ends up in the paniclog and can be used to triage panics.

## UUID ↔ buffer data mapping

Every client that adopts this infrastructure must use a UUID that maps to the format of its buffer data, and must provide a mapping that specifies how to decode that data. This mapping is used to decode the data in DumpPanic or in a tool integrated into MPT.

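As an illustration of what such a mapping could look like on the decoding side, the sketch below keeps a table from UUID bytes to a decoder function. Everything in it is hypothetical: the `ext_decoder_fn` signature, the table, and the two example formats are assumptions made for illustration, not the way DumpPanic actually stores or applies mappings.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical decoder signature: renders `len` bytes of buffer data as text. */
typedef int (*ext_decoder_fn)(const uint8_t *data, uint32_t len, char *out, size_t out_size);

/* Hypothetical mapping entry: one UUID identifies one buffer format. */
struct ext_decoder_entry {
    uint8_t        uuid[16];
    ext_decoder_fn decode;
};

/* Example format (assumed): the buffer holds printable text without a NUL. */
static int decode_text(const uint8_t *data, uint32_t len, char *out, size_t out_size)
{
    assert(out != NULL && out_size > 0);
    snprintf(out, out_size, "%.*s", (int)len, (const char *)data);
    return 0;
}

/* Example format (assumed): the buffer holds one 32-bit register value. */
static int decode_reg32(const uint8_t *data, uint32_t len, char *out, size_t out_size)
{
    uint32_t reg;

    assert(out != NULL && out_size > 0);
    if (len < sizeof(reg)) {
        return -1;
    }
    memcpy(&reg, data, sizeof(reg));
    snprintf(out, out_size, "reg=0x%08x", (unsigned)reg);
    return 0;
}

/* The UUIDs are the ones used in the example code of this document. */
static const struct ext_decoder_entry decoder_table[] = {
    { { 0xE2, 0x07, 0x0C, 0x7E, 0xA1, 0xC3, 0x41, 0xDF,
        0xAB, 0xA4, 0xB9, 0x92, 0x1D, 0xAC, 0xCD, 0x87 }, decode_text },
    { { 0x28, 0x24, 0x5A, 0x8F, 0x04, 0xCA, 0x49, 0x32,
        0x8A, 0x38, 0xE6, 0xC1, 0x59, 0xFD, 0x9C, 0x92 }, decode_reg32 },
};

/* Returns the decoder registered for `uuid`, or NULL if the UUID is unknown. */
static ext_decoder_fn find_decoder(const uint8_t uuid[16])
{
    for (size_t i = 0; i < sizeof(decoder_table) / sizeof(decoder_table[0]); i++) {
        if (memcmp(decoder_table[i].uuid, uuid, 16) == 0) {
            return decoder_table[i].decode;
        }
    }
    return NULL;
}
```

A decoding tool built this way would call `find_decoder()` with the UUID read from each log entry and fall back to a raw dump when it returns NULL.
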
## IOKit APIs

Source Code: `iokit/IOKit/IOExtensiblePaniclog.h`

```c
static bool createWithUUID(uuid_t uuid, const char *data_id, uint32_t max_len, ext_paniclog_create_options_t options, IOExtensiblePaniclog **out);
```

This is the first API that a kext calls to initialize an IOExtensiblePaniclog instance. It takes a UUID, data_id, max len, and options as input and returns the instance through the out pointer. The data_id is a short description of the buffer; its maximum length is 32 bytes.

```c
int setActive();
int setInactive();
```

These functions make an IOExtensiblePaniclog instance active or inactive. An instance is collected and put into the panic file only if it is active. An inactive instance is ignored in the panic path.

```c
int insertData(void *addr, uint32_t len);
```

This function inserts the data pointed to by addr into the IOExtensiblePaniclog instance. It copies the data into the buffer starting at offset 0.

```c
int appendData(void *addr, uint32_t len);
```

This function appends the data pointed to by addr to the IOExtensiblePaniclog instance. It places the data after the previous insert or append.

```c
void *claimBuffer();
```

This function returns the buffer of the IOExtensiblePaniclog instance. It also sets the used length of the handle to the max length, so the entire buffer is copied out if the system panics after this call. yieldBuffer() has to be called before using insertData() or appendData().

```c
int yieldBuffer(uint32_t used_len);
```

This function yields the buffer and sets the used_len of the buffer.

```c
int setUsedLen(uint32_t used_len);
```

This function sets the used_len of the buffer.

## DriverKit APIs

Source Code: `iokit/DriverKit/IOExtensiblePaniclog.iig`

```cpp
static kern_return_t Create(OSData *uuid, OSString *data_id, uint32_t max_len, IOExtensiblePaniclog **out);
```

This is the first API that a dext calls to initialize an IOExtensiblePaniclog instance. It takes a UUID, data_id, and max len as input and returns the instance through the out pointer. The data_id is a short description of the buffer; its maximum length is 32 bytes.

```cpp
kern_return_t SetActive();
kern_return_t SetInactive();
```

These functions make an IOExtensiblePaniclog instance active or inactive. An instance is collected and put into the panic file only if it is active. An inactive instance is ignored in the panic path.

```cpp
kern_return_t InsertData(OSData *data);
```

This function inserts the contents of data into the IOExtensiblePaniclog instance. It copies the data into the buffer starting at offset 0.

```cpp
kern_return_t AppendData(OSData *data);
```

This function appends the contents of data to the IOExtensiblePaniclog instance. It places the data after the previous insert or append.

```cpp
kern_return_t ClaimBuffer(uint64_t *addr, uint64_t *len);
```

This function returns a pointer to the ext paniclog buffer. After it is called, the user is responsible for copying data into the buffer. The entire buffer is copied if the system panics. After claiming the buffer, YieldBuffer() has to be called to set the used_len of the buffer before calling InsertData() or AppendData().

```cpp
kern_return_t YieldBuffer(uint32_t used_len);
```

This function yields the buffer and sets the used_len of the buffer.

```cpp
kern_return_t SetUsedLen(uint32_t used_len);
```

This function sets the used_len of the buffer.

## Low-Level Kernel APIs

Source Code: `osfmk/kern/ext_paniclog.h`

### ExtensiblePaniclog Handle Struct

```c
typedef struct ext_paniclog_handle {
	LIST_ENTRY(ext_paniclog_handle) handles;
	uuid_t uuid;
	char data_id[MAX_DATA_ID_SIZE];
	void *buf_addr;
	uint32_t max_len;
	uint32_t used_len;
	ext_paniclog_create_options_t options;
	ext_paniclog_flags_t flags;
	uint8_t active;
} ext_paniclog_handle_t;
```

XNU uses handles to manage buffer lifecycles effectively, to prevent nested panics when buffers are accessed from the panic path, and to provide a durable and extensible API. The primary reason for using handles is that they let XNU oversee the entire buffer lifecycle: by tracking the buffer's state and managing its deallocation, XNU avoids issues that could otherwise arise during a panic.

```c
ext_paniclog_handle_t *ext_paniclog_handle_alloc_with_uuid(uuid_t uuid, const char *data_id, uint32_t max_len, ext_paniclog_create_options_t options);
```

This function initializes a buffer of the specified length and returns a handle, which is the input to all subsequent operations. It takes a UUID, data_id, max len, and options as input. The data_id is a short description of the buffer; its maximum length is 32 bytes. The function returns a handle on success and NULL on failure.

```c
int ext_paniclog_handle_set_active(ext_paniclog_handle_t *handle);
```

This function sets the handle as active. In the active state, the buffer is picked up by the panic path and put into the panic file.

```c
int ext_paniclog_handle_set_inactive(ext_paniclog_handle_t *handle);
```

This function sets the handle as inactive.

```c
void ext_paniclog_handle_free(ext_paniclog_handle_t *handle);
```

This function deallocates all the memory that was allocated in the alloc function. The handle has to be valid, and this function should only be called after handle_alloc has been called.

```c
int ext_paniclog_insert_data(ext_paniclog_handle_t *handle, void *addr, size_t len);
```

This function inserts the data from a buffer into the handle buffer. It takes a previously allocated handle, the address of the source buffer, and the length of that buffer. It returns 0 on success and a negative value on failure.

```c
int ext_paniclog_append_data(ext_paniclog_handle_t *handle, void *addr, uint32_t len);
```

This function appends to the data that is already present in the buffer.

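The difference between inserting and appending can be modeled in userspace as follows. This is only a sketch of the semantics described above, not the kernel implementation: `struct model_handle` stands in for the `buf_addr`, `max_len`, and `used_len` fields of the handle, and the bounds checks and the `-1` return value are assumptions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_LEN 64

/* Userspace stand-in for the buffer-related fields of the handle. */
struct model_handle {
    uint8_t  buf[MODEL_MAX_LEN];
    uint32_t max_len;
    uint32_t used_len;
};

static void model_init(struct model_handle *h)
{
    memset(h, 0, sizeof(*h));
    h->max_len = MODEL_MAX_LEN;
}

/* Insert: copy to offset 0, replacing whatever was recorded before. */
static int model_insert_data(struct model_handle *h, const void *addr, uint32_t len)
{
    assert(h != NULL && addr != NULL);
    if (len > h->max_len) {
        return -1;              /* assumed failure mode: data does not fit */
    }
    memcpy(h->buf, addr, len);
    h->used_len = len;
    return 0;
}

/* Append: copy to the end of the previously inserted or appended data. */
static int model_append_data(struct model_handle *h, const void *addr, uint32_t len)
{
    assert(h != NULL && addr != NULL);
    if (len > h->max_len - h->used_len) {
        return -1;              /* assumed failure mode: not enough room left */
    }
    memcpy(h->buf + h->used_len, addr, len);
    h->used_len += len;
    return 0;
}
```

In this model, inserting `"abc"` and then appending `"de"` leaves `"abcde"` in the buffer with a used length of 5, while a second insert starts over from offset 0.
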
```c
void *ext_paniclog_get_buffer(ext_paniclog_handle_t *handle);
```

This function returns a pointer to the ext paniclog buffer. To modify the buffer through a pointer, use `ext_paniclog_claim_buffer()` instead.

```c
void *ext_paniclog_claim_buffer(ext_paniclog_handle_t *handle);
```

This function returns a pointer to the ext paniclog buffer. After it is called, the user is responsible for copying data into the buffer. The entire buffer is copied if the system panics. After claiming the buffer, `ext_paniclog_yield_buffer()` has to be called to set the `used_len` of the buffer before calling `ext_paniclog_insert_data()` or `ext_paniclog_append_data()`.

```c
int ext_paniclog_yield_buffer(ext_paniclog_handle_t *handle, uint32_t used_len);
```

This function yields the buffer and sets the `used_len` of the buffer.

```c
int ext_paniclog_set_used_len(ext_paniclog_handle_t *handle, uint32_t used_len);
```

This function sets the `used_len` of the buffer.

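The claim and yield protocol can be sketched as a small userspace model. It follows the IOKit description earlier in this document, where claiming sets the used length to the max length so that the whole buffer is captured if a panic happens mid-update. The `model_` names, the bounds check in the yield function, and its `-1` return value are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MODEL_MAX_LEN 64

/* Userspace stand-in for the buffer-related fields of the handle. */
struct model_handle {
    uint8_t  buf[MODEL_MAX_LEN];
    uint32_t max_len;
    uint32_t used_len;
};

static void model_init(struct model_handle *h)
{
    memset(h, 0, sizeof(*h));
    h->max_len = MODEL_MAX_LEN;
}

/*
 * Claim: hand the raw buffer to the caller. While the buffer is claimed,
 * the used length covers the whole buffer.
 */
static void *model_claim_buffer(struct model_handle *h)
{
    assert(h != NULL);
    h->used_len = h->max_len;
    return h->buf;
}

/* Yield: the caller is done writing and reports how many bytes are valid. */
static int model_yield_buffer(struct model_handle *h, uint32_t used_len)
{
    assert(h != NULL);
    if (used_len > h->max_len) {
        return -1;              /* assumed failure mode */
    }
    h->used_len = used_len;
    return 0;
}

/* Number of bytes the panic path would copy if the system panicked now. */
static uint32_t model_bytes_copied_on_panic(const struct model_handle *h)
{
    return h->used_len;
}
```

Between the claim and the yield, a panic would capture all `MODEL_MAX_LEN` bytes in this model. After the yield, only the bytes the caller reported as used would be captured.
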
## panic_with_data APIs

```c
void panic_with_data(uuid_t uuid, void *addr, uint32_t len, uint64_t debugger_options_mask, const char *format, ...);
```

A kernel client calls this function when it is panicking and wants to insert data into the extensible panic log. This is treated as a special case: the data is placed at the start of the extensible panic log region. The client has to supply the UUID used to decode the buffer that is pushed to the paniclog.

```c
int panic_with_data(char *uuid, void *addr, uint32_t len, uint32_t flags, const char *msg);
```

This provides the same functionality as the kernel panic_with_data() for userspace clients.

## Special Options

### `EXT_PANICLOG_OPTIONS_ADD_SEPARATE_KEY`

If the `EXT_PANICLOG_OPTIONS_ADD_SEPARATE_KEY` option is set when creating an ExtensiblePaniclog handle, the Data ID / buffer data (key / value) pair is added directly to the paniclog instead of under the "ExtensiblePaniclog" key.

## Implementation

### Estimating the panic log size

We want to add utilization metrics for the panic log to the panic.ips file. These give us an idea of the percentage of the panic log currently in use and how big each section of the panic log is. We will use this data to estimate how big the other log section usually is and to ensure that we leave enough space for it when inserting the extensible panic log. The extensible panic log is cut off if not all the buffers fit into the free region.

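The arithmetic behind such utilization metrics is simple, and a minimal sketch is shown here. The `panic_log_usage` struct and its field names are hypothetical; they are named after the sections of the panic log and do not correspond to an actual XNU structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-section byte counts for one panic log. */
struct panic_log_usage {
    uint32_t header_len;
    uint32_t panic_log_len;
    uint32_t stackshot_len;
    uint32_t other_log_len;
    uint32_t misc_len;
    uint32_t total_len;     /* size of the whole panic log region */
};

/* Total number of bytes occupied by all sections. */
static uint32_t panic_log_used_len(const struct panic_log_usage *u)
{
    assert(u != NULL);
    return u->header_len + u->panic_log_len + u->stackshot_len +
           u->other_log_len + u->misc_len;
}

/* Integer percentage of `part` relative to `total`, rounded down. */
static uint32_t percent_of(uint32_t part, uint32_t total)
{
    if (total == 0) {
        return 0;
    }
    return (uint32_t)(((uint64_t)part * 100) / total);
}
```

Reporting `percent_of(u.other_log_len, u.total_len)` across many panics is one way to learn how much room the other log section typically needs.
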
### Registering a buffer + Writing data to the buffer

We expose APIs at different layers so that a client can use whichever suits it best. IOKit and DriverKit clients call the `createWithUUID` or `Create` method, respectively, to create an IOExtensiblePaniclog instance and use that instance to insert or append data to a buffer.

Lower-level clients use `ext_paniclog_handle_alloc_with_uuid` to allocate a handle and use that handle to insert data with the `ext_paniclog_insert_data` and `ext_paniclog_append_data` functions.

When a kernel client is panicking, it has the option to call `panic_with_data()`, which just takes a UUID, a buffer address, and a length for the data. This API makes sure that the data is copied into the extensible panic log.

### Insert data into the extensible panic log

The current structure of the panic log is as follows:

```
-------------------------
-      Panic Header     -
-------------------------
-                       -
-       Panic Log       -
-                       -
-------------------------
-                       -
-      Stack shots      -
-                       -
-------------------------
-                       -
-       Other Log       -
-                       -
-------------------------
-       Misc Data       -
-------------------------
-                       -
-                       -
-         Free          -
-                       -
-                       -
-------------------------
```

We use the free part of the panic log to insert the extensible panic log. After inserting the stackshots, we calculate how much space is left in the panic log for the extensible panic log. This calculation uses the data collected from our utilization metrics and leaves space for the other log section. We then walk the ext_paniclog linked list and insert the buffers into the panic log region until we fill the size we calculated. After this, we move on to inserting data into the other log section.

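The space calculation and the list walk described above can be sketched as follows. This is a simplified model under stated assumptions: the struct and function names are hypothetical, per-entry metadata (UUID, flags, data ID, length) is ignored, and stopping at the first buffer that does not fit is one possible cut-off policy rather than the confirmed kernel behavior.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of one registered handle. */
struct model_handle {
    struct model_handle *next;
    uint32_t used_len;
    uint8_t  active;
};

/* Space available to the extensible panic log after reserving room for the other log. */
static uint32_t ext_paniclog_budget(uint32_t free_bytes, uint32_t other_log_reserve)
{
    return free_bytes > other_log_reserve ? free_bytes - other_log_reserve : 0;
}

/*
 * Walk the handle list and pick buffers until the budget is exhausted.
 * Returns the number of bytes consumed and stores the number of selected
 * handles in *n_selected.
 */
static uint32_t select_handles(const struct model_handle *head, uint32_t budget,
    uint32_t *n_selected)
{
    uint32_t used = 0;

    assert(n_selected != NULL);
    *n_selected = 0;
    for (const struct model_handle *h = head; h != NULL; h = h->next) {
        if (!h->active) {
            continue;           /* inactive handles are ignored in the panic path */
        }
        if (h->used_len > budget - used) {
            break;              /* assumed policy: cut off at the first buffer that does not fit */
        }
        used += h->used_len;
        (*n_selected)++;
    }
    return used;
}
```

With 1000 free bytes and 650 bytes reserved for the other log, the budget in this model is 350 bytes. An active 100-byte buffer is selected, an inactive buffer is skipped, and a following active 300-byte buffer is cut off.
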
## Format / structure of the extensible panic log

```
+---------+------------+---------+---------+------------+------------+---------+---------+---------+-----------+------------+----------+
|         |            |         |         |            |            |         |         |         |           |            |          |
|Version  | No of logs | UUID 1  | Flags 1 | Data ID 1  | Data len 1 | Data 1  | UUID 2  | Flags 2 | Data ID 2 | Data len 2 | Data 2   |
|         |            |         |         |            |            |         |         |         |           |            |          |
+---------+------------+---------+---------+------------+------------+---------+---------+---------+-----------+------------+----------+
```

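To make the layout concrete, here is a sketch of writing and reading one entry in this field order. The field widths are assumptions, since the layout above gives only the order: 32-bit Version, No of logs, Flags, and Data len fields in native byte order, a 16-byte UUID, and a NUL-terminated Data ID. All function and type names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Copy `len` bytes to buf at *off; fail if they would overrun `cap`. */
static int put_bytes(uint8_t *buf, size_t cap, size_t *off, const void *src, size_t len)
{
    assert(*off <= cap);
    if (len > cap - *off) {
        return -1;
    }
    memcpy(buf + *off, src, len);
    *off += len;
    return 0;
}

/* Copy `len` bytes out of buf at *off; fail if fewer than `len` remain. */
static int get_bytes(const uint8_t *buf, size_t cap, size_t *off, void *dst, size_t len)
{
    assert(*off <= cap);
    if (len > cap - *off) {
        return -1;
    }
    memcpy(dst, buf + *off, len);
    *off += len;
    return 0;
}

/* Header: Version, No of logs. */
static int write_header(uint8_t *buf, size_t cap, size_t *off, uint32_t version, uint32_t n_logs)
{
    if (put_bytes(buf, cap, off, &version, sizeof(version)) != 0 ||
        put_bytes(buf, cap, off, &n_logs, sizeof(n_logs)) != 0) {
        return -1;
    }
    return 0;
}

static int read_header(const uint8_t *buf, size_t cap, size_t *off, uint32_t *version, uint32_t *n_logs)
{
    if (get_bytes(buf, cap, off, version, sizeof(*version)) != 0 ||
        get_bytes(buf, cap, off, n_logs, sizeof(*n_logs)) != 0) {
        return -1;
    }
    return 0;
}

/* One entry: UUID, Flags, Data ID, Data len, Data. */
static int write_entry(uint8_t *buf, size_t cap, size_t *off, const uint8_t uuid[16],
    uint32_t flags, const char *data_id, const void *data, uint32_t data_len)
{
    if (put_bytes(buf, cap, off, uuid, 16) != 0 ||
        put_bytes(buf, cap, off, &flags, sizeof(flags)) != 0 ||
        put_bytes(buf, cap, off, data_id, strlen(data_id) + 1) != 0 ||
        put_bytes(buf, cap, off, &data_len, sizeof(data_len)) != 0 ||
        put_bytes(buf, cap, off, data, data_len) != 0) {
        return -1;
    }
    return 0;
}

/* Parsed view of one entry; data_id and data point into the log buffer. */
struct entry_view {
    uint8_t        uuid[16];
    uint32_t       flags;
    const char    *data_id;
    uint32_t       data_len;
    const uint8_t *data;
};

static int read_entry(const uint8_t *buf, size_t cap, size_t *off, struct entry_view *e)
{
    const void *nul;

    if (get_bytes(buf, cap, off, e->uuid, 16) != 0 ||
        get_bytes(buf, cap, off, &e->flags, sizeof(e->flags)) != 0) {
        return -1;
    }
    nul = memchr(buf + *off, 0, cap - *off);    /* Data ID must end inside the log */
    if (nul == NULL) {
        return -1;
    }
    e->data_id = (const char *)(buf + *off);
    *off = (size_t)((const uint8_t *)nul - buf) + 1;
    if (get_bytes(buf, cap, off, &e->data_len, sizeof(e->data_len)) != 0 ||
        e->data_len > cap - *off) {
        return -1;
    }
    e->data = buf + *off;
    *off += e->data_len;
    return 0;
}
```

A reader built on this sketch would call `read_header()` once and then `read_entry()` "No of logs" times, treating any failure as a truncated log.
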
## Extract and format the extensible panic log into the panic.ips file

In DumpPanic, we extract this data from the panic log region and format it to be readable. We can group the data by UUID and sort it by data_id. An example of the extensible panic log data in the panic.ips file is shown below.

```
{
    "ExtensiblePanicLog": {
        "<UUID_1>": [
            {
                "DataID": "0x1",
                "data": <buffer1>
            },
            {
                "DataID": "0x2",
                "data": <buffer2>
            }
        ],
        "<UUID_2>": [
            {
                "DataID": "0x1",
                "data": <buffer1>
            },
            {
                "DataID": "0x2",
                "data": <buffer2>
            }
        ]
    },
    "SeparateFieldDataID1": "Separate buffer value here 1",
    "SeparateFieldDataID2": "Separate buffer value here 2"
}
```

Notice the two fields below ExtensiblePanicLog in the panic.ips example above. If you pass the option `EXT_PANICLOG_OPTIONS_ADD_SEPARATE_KEY` to the handle create function, DumpPanic processes that handle as shown above, adding it as a field directly in the panic file instead of including it in the ExtensiblePanicLog field.

## Example code

### IOKit Example

#### Creating the handle

```c
// paniclog_handle_1 and paniclog_handle_2 are assumed to be
// IOExtensiblePaniclog pointers declared elsewhere in the driver.
char uuid_string_1[] = "E2070C7E-A1C3-41DF-ABA4-B9921DACCD87";
bool res;

uuid_t uuid_1;
uuid_parse(uuid_string_1, uuid_1);

res = IOExtensiblePaniclog::createWithUUID(uuid_1, "Lha ops 1", 1024, EXT_PANICLOG_OPTIONS_NONE, &paniclog_handle_1);
if (res == false) {
    DEBUG_LOG("Failed to create ext paniclog handle 1\n");
} else {
    DEBUG_LOG("Created panic log handle 1 with UUID: %s\n", uuid_string_1);
}

char uuid_string_2[] = "28245A8F-04CA-4932-8A38-E6C159FD9C92";
uuid_t uuid_2;
uuid_parse(uuid_string_2, uuid_2);
res = IOExtensiblePaniclog::createWithUUID(uuid_2, "Lha ops 2", 1024, EXT_PANICLOG_OPTIONS_NONE, &paniclog_handle_2);
if (res == false) {
    DEBUG_LOG("Failed to create ext paniclog handle 2\n");
} else {
    DEBUG_LOG("Created panic log handle 2 with UUID: %s\n", uuid_string_2);
}
```

#### Inserting the data

```c
DEBUG_LOG("%s\n", __FUNCTION__);

// Record a human-readable description of the hardware access in handle 1.
char buff[1024] = {0};
snprintf(buff, sizeof(buff), "HW access Dir: %u Type: %u Address: %llu\n", input->direction, input->type, input->address);

paniclog_handle_1->insertData(buff, (uint32_t)strlen(buff));
paniclog_handle_1->setActive();

// Record the raw parameter struct in handle 2.
paniclog_handle_2->insertData(input, sizeof(HardwareAccessParameters));
paniclog_handle_2->setActive();
```

### DriverKit Example

#### Creating the handle

```cpp
// uuid_3, uuid_string_3, ret, and paniclog_handle_3 are assumed to be declared elsewhere.
OSData *uuid_data = OSData::withBytes(&uuid_3[0], sizeof(uuid_t));
if (!uuid_data) {
    IOLog("Data was not created\n");
    return NULL;
}

OSString *data_id = OSString::withCString("DriverKit OP 1");
if (!data_id) {
    IOLog("Data ID was not created\n");
    OSSafeReleaseNULL(uuid_data);
    return NULL;
}

ret = IOExtensiblePaniclog::Create(uuid_data, data_id, 64, kIOExtensiblePaniclogOptionsNone, &paniclog_handle_3);
// The caller still owns its references to the arguments, so release them here.
OSSafeReleaseNULL(uuid_data);
OSSafeReleaseNULL(data_id);
if (ret != kIOReturnSuccess) {
    IOLog("Failed to create paniclog handle 3\n");
    return NULL;
}
IOLog("EXT_PANICLOG: Created panic log handle 3 with UUID: %s\n", uuid_string_3);
```

#### Inserting the data

```cpp
// addr, len, buff, and buff1 are assumed to be declared elsewhere;
// buff holds the NUL-terminated text to record.
ret = paniclog_handle_3->ClaimBuffer(&addr, &len);
if (ret != kIOReturnSuccess) {
    IOLog("EXT_PANICLOG: Failed to claim buffer. Ret: %x\n", ret);
    return NULL;
}

IOLog("EXT_PANICLOG: Got buffer address %llu, %llu\n", addr, len);

buff1 = (char *)addr;

// Copy at most len bytes so the write stays inside the claimed buffer.
size_t copy_len = strlen(buff);
if (copy_len > len) {
    copy_len = (size_t)len;
}
memcpy(buff1, buff, copy_len);

// Yield the buffer and record how many bytes are valid before activating.
paniclog_handle_3->YieldBuffer((uint32_t)copy_len);

paniclog_handle_3->SetActive();
```
390