xref: /xnu-8792.81.2/doc/allocators/api-basics.md (revision 19c3b8c28c31cb8130e034cfb5df6bf9ba342d90)
# XNU Allocators best practices

## Introduction

XNU provides two ways to allocate memory:

- the VM subsystem, which provides allocations at the granularity of pages (with
  `kmem_alloc` and similar interfaces);
- the zone allocator subsystem (`<kern/zalloc.h>`), which is a slab allocator of
  objects of fixed size.

In addition to that, `<kern/kalloc.h>` provides a variable-size general purpose
allocator implemented as a collection of zones of fixed size, and overflowing to
`kmem_alloc` for allocations larger than a few pages (32KB when this
document was being written, but this is subject to change/tuning in the future).

The Core Kernel allocators rely on the following headers:

- `<kern/zalloc.h>` and `<kern/kalloc.h>` for their API surface, which most
  clients should find sufficient,
- `<kern/zalloc_internal.h>` for interfaces that need to be exported
  for introspection and implementation purposes, and that are not meant
  for general consumption.

This document presents the best practices to allocate memory
in the kernel, from a security perspective.

## Permanent allocations

The kernel sometimes needs to provide persistent allocations that depend on
parameters that aren't compile time constants, but will not vary over time (NCPU
is an obvious example here).

The zone subsystem provides a `zalloc_permanent*` family of functions that help
allocate memory in such a fashion in a very compact way.

Unlike the typical zone allocators, this allows for arbitrary sizes, in a
similar fashion to `kalloc`. These functions never fail (if the allocation
fails, the kernel panics), and always return zeroed memory. Trying to free
these allocations results in a kernel panic.

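The contract described above (arbitrary sizes, never fails, always zeroed, never freed) can be pictured with a small userspace bump allocator. This is only a sketch of the idea, not how `zalloc_permanent*` is implemented: `PermanentArena`, `all_zero`, the fixed capacity, and the use of `calloc`/`abort` as stand-ins for kernel memory and a panic are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Hypothetical userspace model of a "permanent" allocator.
// None of these names exist in XNU.
class PermanentArena {
public:
    explicit PermanentArena(std::size_t capacity)
        : base_(static_cast<std::uint8_t *>(std::calloc(1, capacity))),
          capacity_(capacity), used_(0) {
        if (base_ == nullptr) {
            std::abort();  // stand-in for a kernel panic
        }
    }
    PermanentArena(const PermanentArena &) = delete;
    PermanentArena &operator=(const PermanentArena &) = delete;

    // Hands out zeroed memory by bumping an offset. There is deliberately
    // no free(): memory is never reused, so it stays zeroed and compact.
    void *alloc(std::size_t size,
                std::size_t align = alignof(std::max_align_t)) {
        // align must be a power of two no larger than the arena's own alignment
        assert(align != 0 && (align & (align - 1)) == 0);
        assert(align <= alignof(std::max_align_t));
        std::size_t start = (used_ + align - 1) & ~(align - 1);
        if (start > capacity_ || size > capacity_ - start) {
            std::abort();  // the model never returns nullptr
        }
        used_ = start + size;
        return base_ + start;
    }

    std::size_t used() const { return used_; }

private:
    std::uint8_t *base_;   // intentionally never released
    std::size_t capacity_;
    std::size_t used_;
};

// Test helper: checks that a byte range is entirely zero.
inline bool all_zero(const void *p, std::size_t n) {
    const std::uint8_t *bytes = static_cast<const std::uint8_t *>(p);
    for (std::size_t i = 0; i < n; i++) {
        if (bytes[i] != 0) return false;
    }
    return true;
}
```

A caller would size something like a per-CPU table once at boot with `arena.alloc(ncpu * sizeof(entry))` and keep the pointer forever. The absence of a free path is what keeps the layout compact.
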
## Allocation flags

Most `zalloc` or `kalloc` functions take `zalloc_flags_t` typed flags.
When flags are expected, exactly one of `Z_WAITOK`, `Z_NOWAIT` or `Z_NOPAGEWAIT`
is to be passed:

- `Z_WAITOK` means that the zone allocator can wait and block;
- `Z_NOWAIT` can be used to require a fully non-blocking behavior, which can be
  used for allocations under spinlock and other preemption-disabled contexts;
- `Z_NOPAGEWAIT` allows the allocator to block (typically on mutexes),
  but not to wait for available pages if there are none. This is only useful
  for the buffer cache, and most clients should use either `Z_NOWAIT` or `Z_WAITOK`.

Other important flags:

- `Z_ZERO` if zeroed memory is expected (nowadays most allocations will
  be zeroed regardless, but it's always clearer to specify it). Note that it is
  often more efficient than calling `bzero`, as the allocator tends to maintain
  freed memory as zeroed in the first place;
- `Z_NOFAIL` if the caller knows the allocation can't fail: allocations that are
  made with `Z_WAITOK` from regular (non exhaustible) zones, or from `kalloc*`
  interfaces with a size smaller than `KALLOC_SAFE_ALLOC_SIZE`,
  will never fail (the kernel will instead panic if no memory can be found).
  `Z_NOFAIL` can be used to denote that the caller knows about this.
  If `Z_NOFAIL` is incorrectly used, then the zone allocator will panic at runtime.

## Zones (`zalloc`)

The first blessed way to allocate memory in the kernel is by using zones.
Zones are mostly meant to be used in Core XNU and some "BSD" kexts.

It is generally recommended to create zones early and to store the `zone_t`
pointer in read-only memory (using `SECURITY_READ_ONLY_LATE` storage).

Zones are more feature-rich than `kalloc`, and some features can only be
used when making a zone:

- the object type being allocated requires extremely strong segregation
  from other types (typically `zone_require` will be used with this zone),
- the object type implements some form of security boundary and wants to adopt
  the read-only allocator (see `ZC_READONLY`),
- the allocation must be per-cpu,
- ...

In the vast majority of cases however, using `kalloc_type` (or `IOMallocType`)
is preferred.

## The Typed allocator

Ignoring VM allocations (or wrappers like `IOMemoryDescriptor`), the only
blessed way to allocate typed memory in XNU is the typed allocator
`kalloc_type` or one of its variants (like IOKit's `IOMallocType`), and the
only blessed way to allocate untyped memory that doesn't contain pointers is
the data API `kalloc_data` or one of its variants (like IOKit's
`IOMallocData`). However, this comes with additional requirements.

Note that at this time, those interfaces aren't exported to third parties,
as their ABI has not yet converged.

### A word about types

The typed allocators assume that allocated types fit a very precise model.
If the allocations you perform do not fit the model, then your types
must be restructured to fit, for security reasons.

A general theme will be the separation of data/primitive types from pointers,
as attackers tend to use data/pointer overlaps to carry out their exploits.

The typed allocators use compiler support to infer signatures
of the types being allocated. Because some scalars actually represent
kernel pointers (like `vm_offset_t`, `vm_address_t`, `uintptr_t`, ...),
types or structure members can be decorated with `__kernel_ptr_semantics`
to denote when a data-looking type is actually a pointer.

Do note that `__kernel_data_semantics` and `__kernel_dual_semantics`
are also provided but should rarely be used.

#### Fixed-sized types

The first case is fixed-size types. This is typically a `struct`, `union`
or C++ `class`. Fixed-size types must follow certain rules:

- types should be small enough to fit in the zone allocator:
  smaller than `KALLOC_SAFE_ALLOC_SIZE`. When this is not the case,
  we have typically found that there is a large array of data,
  or some buffer in that type. The solution is to outline this allocation.
- for union types, data/pointer overlaps should be avoided if possible.
  When this isn't possible, a zone should be considered.

#### Variable-sized types

These come in two variants: arrays, and arrays prefixed with a header.
Any other case must be reduced to those, possibly by making more allocations.

An array is simply an allocation of several fixed-size types,
and the rules of "fixed-sized types" above apply to them.

The following rules are expected when dealing with variable-sized allocations:

- variable-sized allocations should have a single owner and not be refcounted;
- under the header-prefixed form, if the header contains pointers,
  then the array element type **must not** be only data.

If those rules can't be followed, then the allocation must be split, with
the header becoming a fixed-sized type that is the single owner
of an array.

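To make the header-prefixed form concrete, the sketch below computes the byte size of "one header followed by `count` elements" with an overflow check, in plain userspace C++. The helper `header_prefixed_size` and the types `table_hdr` and `entry` are hypothetical names chosen for illustration. The example types follow the rule above: the header holds a pointer, so the element type also carries a pointer rather than being pure data.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical element type: it carries a pointer, so it is not "only data".
struct entry {
    void         *owner;
    std::uint32_t id;
};

// Hypothetical header type: it contains a pointer and no refcount.
struct table_hdr {
    entry        *first_free;
    std::uint32_t count;
};

// Computes hdr_size + elem_size * count, refusing to wrap around.
// Returns false (and leaves *out untouched) when the size would overflow.
inline bool header_prefixed_size(std::size_t hdr_size, std::size_t elem_size,
                                 std::size_t count, std::size_t *out) {
    if (hdr_size > SIZE_MAX) {
        return false;  // unreachable for size_t, kept to show the intent
    }
    // elem_size * count <= SIZE_MAX - hdr_size
    //   is equivalent to count <= (SIZE_MAX - hdr_size) / elem_size
    if (elem_size != 0 && count > (SIZE_MAX - hdr_size) / elem_size) {
        return false;
    }
    *out = hdr_size + elem_size * count;
    return true;
}
```

In this sketch the element count is an argument supplied by the single owner of the allocation. This matches the guidance above that such allocations have one owner and are not refcounted.
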
#### Untyped memory

When allocating untyped memory with the data APIs, ensure that it doesn't
contain kernel pointers. If your untyped allocation contains kernel pointers,
consider splitting the allocation into two: one part that is typed and contains
the kernel pointers, and a second part that is untyped and data-only.

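The split described above can be sketched in userspace as follows. All names here (`session`, `message_mixed`, `message`, `message_create`, `message_destroy`) are hypothetical, and `calloc`/`free` merely stand in for the allocators. In kernel code the pointer-carrying part would come from a typed interface such as `kalloc_type` and the byte buffer from a data interface such as `kalloc_data`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

struct session;  // opaque stand-in for some kernel object

// Before: one allocation mixes a pointer with attacker-influenced bytes.
struct message_mixed {
    session      *owner;
    std::size_t   length;
    std::uint8_t  payload[64];
};

// After: the typed part keeps the pointers and owns a data-only buffer.
struct message {
    session      *owner;
    std::uint8_t *payload;  // separate, pointer-free allocation
    std::size_t   length;
};

inline message *message_create(session *owner, std::size_t length) {
    // Stand-in for a typed allocation of the pointer-carrying part.
    message *m = static_cast<message *>(std::calloc(1, sizeof(message)));
    if (m == nullptr) {
        return nullptr;
    }
    // Stand-in for a data-only allocation of the raw bytes.
    m->payload = static_cast<std::uint8_t *>(
        std::calloc(length != 0 ? length : 1, 1));
    if (m->payload == nullptr) {
        std::free(m);
        return nullptr;
    }
    m->owner = owner;
    m->length = length;
    return m;
}

inline void message_destroy(message *m) {
    if (m != nullptr) {
        std::free(m->payload);
        std::free(m);
    }
}
```

The cost is one extra allocation and one extra pointer. The benefit is that an out-of-bounds write into `payload` no longer lands next to `owner` in the same allocation.
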
### API surface

<table>
  <tr>
    <th>Interface</th>
    <th>API</th>
    <th>Notes</th>
  </tr>
  <tr>
    <td>Data/Primitive types</td>
    <td>
      <p>
      <b>Core Kernel</b>:<br/>
      <tt>kalloc_data(size, flags)</tt><br/>
      <tt>krealloc_data(ptr, old_size, new_size, flags)</tt><br/>
      <tt>kfree_data(ptr, size)</tt><br/>
      <tt>kfree_data_addr(ptr)</tt>
      </p>
      <p>
      <b>IOKit untyped variant (returns <tt>void *</tt>)</b>:<br/>
      <tt>IOMallocData(size)</tt><br/>
      <tt>IOMallocZeroData(size)</tt><br/>
      <tt>IOFreeData(ptr, size)</tt>
      </p>
      <p>
      <b>IOKit typed variant (returns <tt>type_t *</tt>)</b>:<br/>
      <tt>IONewData(type_t, count)</tt><br/>
      <tt>IONewZeroData(type_t, count)</tt><br/>
      <tt>IODeleteData(ptr, type_t, count)</tt>
      </p>
    </td>
    <td>This should be used only when the allocated type contains no kernel pointers</td>
  </tr>
  <tr>
    <td>Fixed-sized type</td>
    <td>
      <p>
      <b>Core Kernel</b>:<br/>
      <tt>kalloc_type(type_t, flags)</tt><br/>
      <tt>kfree_type(type_t, ptr)</tt>
      </p>
      <p>
      <b>IOKit:</b><br/>
      <tt>IOMallocType(type_t)</tt><br/>
      <tt>IOFreeType(ptr, type_t)</tt>
      </p>
    </td>
    <td>
      <p>
      Note that it is absolutely OK to use this variant
      for data/primitive types, it will be redirected to <tt>kalloc_data</tt>
      (or <tt>IOMallocData</tt>).
      </p>
    </td>
  </tr>
  <tr>
    <td>Arrays of fixed-sized type</td>
    <td>
      <p>
      <b>Core Kernel</b>:<br/>
      <tt>kalloc_type(type_t, count, flags)</tt><br/>
      <tt>kfree_type(type_t, count, ptr)</tt>
      </p>
      <p>
      <b>IOKit:</b><br/>
      <tt>IONew(type_t, count)</tt><br/>
      <tt>IONewZero(type_t, count)</tt><br/>
      <tt>IODelete(ptr, type_t, count)</tt>
      </p>
    </td>
    <td>
      <p>
      <tt>kalloc_type(type_t, ...)</tt> (resp. <tt>IOMallocType(type_t)</tt>)
      <b>isn't</b> equivalent to <tt>kalloc_type(type_t, 1, ...)</tt>
      (resp. <tt>IONew(type_t, 1)</tt>). Mixing and matching interfaces
      will result in panics.
      </p>
      <p>
      Note that it is absolutely OK to use this variant
      for data/primitive types, it will be redirected to <tt>kalloc_data</tt>.
      </p>
    </td>
  </tr>
  <tr>
    <td>Header-prefixed arrays of fixed-sized type</td>
    <td>
      <p>
      <b>Core Kernel</b>:<br/>
      <tt>kalloc_type(hdr_type_t, type_t, count, flags)</tt><br/>
      <tt>kfree_type(hdr_type_t, type_t, count, ptr)</tt>
      </p>
      <p>
      <b>IOKit:</b><br/>
      <tt>IONew(hdr_type_t, type_t, count)</tt><br/>
      <tt>IONewZero(hdr_type_t, type_t, count)</tt><br/>
      <tt>IODelete(ptr, hdr_type_t, type_t, count)</tt>
      </p>
    </td>
    <td>
      <p>
      <tt>hdr_type_t</tt> can't contain a refcount,
      and <tt>type_t</tt> can't be a primitive type.
      </p>
    </td>
  </tr>
</table>

## C++ classes and operator new

This section covers how typed allocators should be adopted to use
`operator new/delete` in C++. For C++ classes, the approach required
differs based on whether the class inherits from `OSObject` or not.

Most, if not all, C++ objects used in conjunction with IOKit APIs
should probably use `OSObject` as a base class. C++ operators
and non-POD types should be used seldom.

### `OSObject` subclasses

All subclasses of `OSObject` must use one of IOKit's `OSDeclare*` macros
together with the matching `OSDefine*` macro. As part of those, an `operator new` and
`operator delete` are injected that force objects to enroll into `kalloc_type`.

Note that idiomatic IOKit is supposed to use `OSTypeAlloc(Class)`.

### Other classes

Unlike `OSObject` subclasses, regular C++ classes must adopt typed allocators
manually. If your struct or class is POD (Plain Old Data), then replacing usage of
`new/delete` (resp. `new[]/delete[]`) with `IOMallocType/IOFreeType` (resp.
`IONew/IODelete`) is safe.

However, if your class/struct has non-default constructors or destructors, or has
members that do, you will need to manually enroll it into `kalloc_type`.
This can be accomplished through one of the following approaches, and it lets you
continue to use C++'s `new` and `delete` keywords to allocate/deallocate instances.

The first approach is to subclass the `IOTypedOperatorsMixin` struct. This will
adopt typed allocators for your class/struct by providing the appropriate
implementations for `operator new/delete`:

```cpp
struct Type : public IOTypedOperatorsMixin<Type> {
    ...
};
```

Alternatively, if you cannot use the mixin approach, you can use the
`IOOverrideTypedOperators` macro to override `operator new/delete`
within your class/struct declaration:

```cpp
struct Type {
    IOOverrideTypedOperators(Type);
    ...
};
```

Finally, if you need to decouple the declaration of the operators from
their implementation, you can use `IODeclareTypedOperators` paired with
`IODefineTypedOperators`, to declare the operators within your class/struct
declaration and then provide their definition out of line:

```cpp
// declaration
struct Type {
    IODeclareTypedOperators(Type);
    ...
};

// definition
IODefineTypedOperators(Type)
```

When a class/struct adopts typed allocators through one of those approaches,
all its subclasses must also explicitly adopt typed allocators. It is not
sufficient for a common parent within the class hierarchy to enroll in order to
automatically provide the implementation of the operators for all of its children:
each and every subclass in the class hierarchy must also explicitly do the same.

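The requirement that every subclass enrolls explicitly follows from how C++ looks up `operator new`: a subclass that declares nothing simply inherits its parent's operators. The userspace model below illustrates this. `TypeBucket` and `MODEL_OVERRIDE_TYPED_OPERATORS` are hypothetical stand-ins that only count allocations per type. They are not the IOKit macros and say nothing about what the real implementation does when a subclass forgets to enroll.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical per-type bookkeeping: one counter per enrolled type.
template <typename T>
struct TypeBucket {
    static int live;
};
template <typename T>
int TypeBucket<T>::live = 0;

// Hypothetical macro playing the role of an in-class operator override.
// It routes allocations of T through T's own bucket.
#define MODEL_OVERRIDE_TYPED_OPERATORS(T)                 \
    static void *operator new(std::size_t size) {         \
        void *p = std::malloc(size);                      \
        if (p == nullptr) std::abort();                   \
        ++TypeBucket<T>::live;                            \
        return p;                                         \
    }                                                     \
    static void operator delete(void *p) {                \
        if (p == nullptr) return;                         \
        --TypeBucket<T>::live;                            \
        std::free(p);                                     \
    }

struct Base {
    MODEL_OVERRIDE_TYPED_OPERATORS(Base)
    virtual ~Base() {}
    int id = 0;
};

// Does not enroll: name lookup finds Base's operators, so instances of
// this larger type are accounted to Base's bucket in this model.
struct Forgetful : Base {
    long extra[4] = {};
};

// Enrolls explicitly: its own operators hide the ones from Base.
struct Enrolled : Base {
    MODEL_OVERRIDE_TYPED_OPERATORS(Enrolled)
    long extra[4] = {};
};
```

In this model, `new Forgetful` is attributed to `Base` even though the object has a different size and layout, while `new Enrolled` is attributed to its own type. That kind of cross-type mixing is what type segregation is meant to avoid.
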
### The case of `operator new[]`

The ABI of `operator new[]` is unfortunate, as it denormalizes
data that we prefer to be known by the owning object
(the element sizes and array element count).

It also makes those allocations ripe for abuse in an adversarial
context, as this denormalized information is at the beginning
of the structure, making it relatively easy to attack with
out-of-bounds bugs.

For this reason, the default variants of the mixin and the macros
presented above will delete the implementation of `operator new[]`
from the class they are applied to.

However, if those must be used, you can adopt the typed
allocators on your class by using the appropriate variant
which explicitly implements the support for array operators:

- `IOTypedOperatorsMixinSupportingArrayOperators`
- `IOOverrideTypedOperatorsSupportingArrayOperators`
- `IO{Declare, Define}TypedOperatorsSupportingArrayOperators`

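The denormalized bookkeeping mentioned above can be observed from userspace. In the sketch below, a hypothetical class `Tracked` records the size its `operator new[]` is asked for, and `array_overhead` (also hypothetical) reports how many bytes were requested beyond `count * sizeof(Tracked)`. The C++ standard leaves this overhead unspecified. On platforms following the Itanium C++ ABI, a type with a non-trivial destructor typically gets an element-count cookie placed in front of the array, but treat that as an assumption about your toolchain rather than a guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical class that records what its array allocator was asked for.
struct Tracked {
    static std::size_t last_array_request;

    static void *operator new[](std::size_t size) {
        last_array_request = size;  // includes any ABI bookkeeping
        void *p = std::malloc(size);
        if (p == nullptr) std::abort();
        return p;
    }
    static void operator delete[](void *p) { std::free(p); }

    // A user-provided destructor is non-trivial, so the runtime needs to
    // know how many elements to destroy when delete[] runs.
    ~Tracked() {}

    int value = 0;
};
std::size_t Tracked::last_array_request = 0;

// Returns the number of bytes requested on top of the elements themselves.
inline std::size_t array_overhead(std::size_t count) {
    Tracked *arr = new Tracked[count];
    std::size_t overhead =
        Tracked::last_array_request - count * sizeof(Tracked);
    delete[] arr;
    return overhead;
}
```

Any non-zero overhead is metadata that lives inside the allocation, ahead of the first element, which is the placement the section above warns about.
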
### Scalar types

The only accepted ways of using `operator new/delete` and their variants are the ones
described above. You should never use the operators on scalar types. Instead, you
should use the appropriate typed allocator API based on the semantics of the memory
being allocated (i.e. `IOMallocData` for data-only buffers, and `IOMallocType`/`IONew`
for any other type).

### Wrapping C++ type allocation in container OSObjects

The blessed way of wrapping and passing a C++ type allocation for use in the
libkern collections is using `OSValueObject`. Please do not use `OSData` for this
purpose, as its backing store should not contain kernel pointers.
373