ARM Scalable Matrix Extension
=============================

Managing hardware resources related to SME state.

Introduction
------------

This document describes how xnu manages the hardware resources associated with
ARM's Scalable Matrix Extension (SME).

SME is an ARMv9 extension intended to accelerate matrix math operations. SME
builds on top of ARM's previous Scalable Vector Extension (SVE), which extends
the length of the FPSIMD register files and adds new 1D vector-math
instructions. SME extends SVE by adding a matrix register file and associated
2D matrix-math instructions. SME2 further extends SME with additional
instructions and register state.

This document summarizes SVE, SME, and SME2 hardware features that are relevant
to xnu. It is not intended as a full programming guide for SVE or SME: readers
may find a full description of these ISAs in the
[SVE supplement to the ARM ARM](https://developer.arm.com/documentation/ddi0584/latest/)
and [SME supplement to the ARM ARM](https://developer.arm.com/documentation/ddi0616/latest/),
respectively.

Hardware overview
-----------------

### EL0-accessible state

SVE, SME, and SME2 introduce four new EL0-accessible register
files<sup>[1](#feat_sve_footnote)</sup>:

- vector registers `Z0`-`Z31`
- predicate registers `P0`-`P15`
- matrix data `ZA` (SME/SME2 only)
- look-up table `ZT0` (SME2 only)

These register files are unbanked, i.e., their contents are shared across all
exception levels. Data can be copied between these registers and system memory
using specialized `ldr` and `str` variants. SME also adds `mov` variants that
can directly copy data between the vector and matrix register files.

Most of these register files supplement, rather than replace, the existing ARM
register files. However the `Z` register file effectively extends the length of
the existing FPSIMD `V` register file. Instructions targeting the `V` register
file will now access the lower 128 bits of the corresponding `Z` register.

The size of most of these files is defined by the *streaming vector length*
(SVL), a power-of-two between 128 and 2048 inclusive. Each `Z` register is SVL
bits in size; each `P` register is SVL / 8 bits in size; and `ZA` is SVL x SVL
bits in size. The value of SVL is determined by both hardware and software.
Hardware places an implementation-defined cap on SVL, and privileged software
can further reduce SVL for itself and lower exception levels.

In contrast, `ZT0` is fixed at 512 bits, independent of SVL.

SME also adds a single EL0-accessible system register `TPIDR2_EL0`. Like
`TPIDR_EL0`, `TPIDR2_EL0` is officially reserved for ABI use, but its contents
have no particular meaning to hardware.

### `PSTATE` changes

SME adds two orthogonal states to `PSTATE`.

`PSTATE.SM` moves the CPU in and out of a special execution mode called
*streaming SVE mode*. Software must enter streaming SVE mode to execute most
SME instructions. However software must then exit streaming SVE mode to execute
many legacy SIMD instructions<sup>[2](#feat_sme_fa64_footnote)</sup>.
To make things even more complicated, these transitions cause the CPU to zero
out the `V`/`Z` and `P` register files, and to set all `FPSR` flags. When
software needs to retain this state across `PSTATE.SM` transitions, it must
manually stash the state in memory.

`PSTATE.ZA` independently controls whether the contents of `ZA` and `ZT0` are
valid. Setting `PSTATE.ZA` zeroes out both register files, and enables
instructions that access them. Clearing `PSTATE.ZA` causes `ZA` and `ZT0`
accesses to trap.

Most SME instructions require both `PSTATE.SM` and `PSTATE.ZA` to be
set, so software usually toggles both bits at the same time. However setting
these bits independently can be useful when software needs to interleave SME and
FPSIMD instructions. If software needs to temporarily exit streaming SVE mode
to execute FPSIMD instructions, setting `PSTATE.{SM,ZA} = {0,1}` will do so
without clobbering the `ZA` or `ZT0` array.

`PSTATE.{SM,ZA} = {0,0}` acts as a hint to the CPU that it may power down
SME-related hardware. Hence software should clear these bits as soon as
SME state can be discarded.
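
The `PSTATE.SM` and `PSTATE.ZA` rules above can be summarized with a small
software model. This is only an illustrative sketch, not xnu code: the
`sme_model` structure, the `model_*` helpers, and the toy 128-bit SVL are all
assumptions made for the example, and the `FPSR` side effect is omitted.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Toy sizes (assumed for illustration): SVL = 128 bits. */
#define SVL_BYTES 16
#define ZA_BYTES  (SVL_BYTES * SVL_BYTES)   /* ZA is SVL x SVL bits */
#define ZT0_BYTES 64                        /* ZT0 is always 512 bits */

/* Hypothetical model of the SME-related CPU state. */
struct sme_model {
    bool    sm;                        /* PSTATE.SM */
    bool    za_en;                     /* PSTATE.ZA */
    uint8_t z[32][SVL_BYTES];          /* Z0-Z31 (V is the low 128 bits) */
    uint8_t p[16][SVL_BYTES / 8];      /* P0-P15, SVL / 8 bits each */
    uint8_t za[ZA_BYTES];
    uint8_t zt0[ZT0_BYTES];
};

/* Any change of PSTATE.SM zeroes Z and P; ZA and ZT0 are left alone. */
static void model_set_sm(struct sme_model *m, bool sm)
{
    if (m->sm != sm) {
        memset(m->z, 0, sizeof(m->z));
        memset(m->p, 0, sizeof(m->p));
        m->sm = sm;
    }
}

/* Setting PSTATE.ZA (0 -> 1) zeroes ZA and ZT0 before enabling access. */
static void model_set_za(struct sme_model *m, bool za_en)
{
    if (!m->za_en && za_en) {
        memset(m->za, 0, sizeof(m->za));
        memset(m->zt0, 0, sizeof(m->zt0));
    }
    m->za_en = za_en;
}

/* PSTATE.{SM,ZA} = {0,0} is the hint that SME hardware may power down. */
static bool model_power_down_hint(const struct sme_model *m)
{
    return !m->sm && !m->za_en;
}
```

In this model, calling `model_set_sm(&m, false)` while `za_en` is still set
corresponds to the `PSTATE.{SM,ZA} = {0,1}` case: `Z` and `P` are lost, but
`ZA` and `ZT0` survive.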

These `PSTATE` bits are accessible to software in several ways:

- Reads or writes to the `SVCR` system register, which packs both bits into
  a single register
- Writes to the `SVCRSM`, `SVCRZA`, or `SVCRSMZA` system registers with the
  immediate values `0` or `1`, which directly modify the specified bit(s)
- `sm{start,stop} (sm|za)` pseudo-instructions, which are assembler aliases for
  the above `msr` instructions

Regardless of which method is used to access these bits, software generally does
not need explicit barriers. Specifically, ARM guarantees that all direct and
indirect reads from these bits will appear in program order relative to any
direct writes.

### Other hardware resources

An implementation may share SME compute resources across multiple CPUs. In this
case, the per-CPU `SMPRI_EL1` controls the relative priority of the SME
instructions issued by that CPU. ARM guarantees that higher `SMPRI_EL1` values
indicate higher priorities, and that setting `SMPRI_EL1 = 0` on all CPUs is a
safe way to disable SME prioritization. Otherwise the exact meaning of
`SMPRI_EL1` is implementation-defined.

EL2 may trap guest reads and writes to `SMPRI_EL1` using the fine-grained trap
controls `HFGRTR_EL2.nSMPRI_EL1` and `HFGWTR_EL2.nSMPRI_EL1`, respectively.
Alternatively, EL2 may adjust the effective SME priority at EL0 and EL1 without
trapping, by populating the lookup table register `SMPRIMAP_EL2` and setting the
control bit `HCRX_EL2.SMPME`. When `HCRX_EL2.SMPME` is set, SME instructions
executed at EL0 and EL1 will interpret `SMPRI_EL1` as an index into
`SMPRIMAP_EL2` rather than as a raw priority value.

`SMIDR_EL1` advertises hardware properties about the SME implementation,
including whether SME execution priority is implemented.

`CPACR_EL1` and `CPTR_ELx` have controls that can trap SVE and SME operations.
Two of these are relevant to Apple's SME
implementation<sup>[3](#cpacr_zen_footnote)</sup>:

- `SMEN`: trap SME instructions and register accesses, including SVE
  instructions executed during streaming SVE mode.
- `FPEN`: trap FPSIMD, SME, and SVE instructions and most register accesses, but
  *not* `SVCR` accesses. Lower priority than `SMEN`.
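
The relative priority of the two trap controls listed above can be sketched as
a small decision function. This is a simplified model rather than the
architectural pseudocode: the `sme_op` categories, the `model_trap()` helper,
and the reduction of each multi-bit control field to a single "traps" boolean
are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical operation classes, following the list above. */
enum sme_op {
    OP_FPSIMD_INSN,         /* legacy FPSIMD instruction */
    OP_SME_INSN,            /* SME instruction or SME register access */
    OP_SVE_STREAMING_INSN,  /* SVE instruction in streaming SVE mode */
    OP_SVCR_ACCESS,         /* read or write of SVCR */
};

enum sme_trap { TRAP_NONE, TRAP_SMEN, TRAP_FPEN };

/*
 * smen_traps / fpen_traps: true when the corresponding control is
 * configured to trap at the current exception level (a simplification
 * of the real multi-bit fields).
 */
static enum sme_trap
model_trap(bool smen_traps, bool fpen_traps, enum sme_op op)
{
    bool sme_class = (op != OP_FPSIMD_INSN);

    /* SMEN is checked first, since it has the higher priority. */
    if (smen_traps && sme_class) {
        return TRAP_SMEN;
    }
    /* FPEN covers everything here except SVCR accesses. */
    if (fpen_traps && op != OP_SVCR_ACCESS) {
        return TRAP_FPEN;
    }
    return TRAP_NONE;
}
```

Under this model, an `SVCR` access with only `FPEN` trapping enabled proceeds
without a trap, which is what lets software inspect or clear `PSTATE.SM` even
when FPSIMD access is disabled.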

Several SME registers aren't affected by these controls, since they have their
own trapping mechanisms. `SMPRI_EL1` has fine-grained hypervisor trap controls
as described above. `SMIDR_EL1` accesses can trap to the hypervisor using the
existing `HCR_EL2.TID1` control bit. Finally `TPIDR2_EL0` has a dedicated
control bit `SCTLR_ELx.EnTP2` along with fine-grained trap controls
`HFG{R,W}TR_EL2.nTPIDR2_EL0`.


Software usage
--------------

### SME `PSTATE` management

xnu has in-kernel SIMD instructions<sup>[4](#xnu_simd_footnote)</sup> which
become illegal while the CPU is in streaming SVE mode. This poses a problem if
xnu interrupts EL0 while it is in the middle of executing SME-accelerated code.

Hence, anytime xnu enters the kernel with `PSTATE.SM` set, it saves the current
`Z`, `P`, and `SVCR` values and then clears `PSTATE.SM`. xnu later restores
these values during kernel exit. These operations occur in an assembly-only
module (`locore.s`) where we have strict control over code generation, and can
guarantee that no problematic SIMD instructions are executed while `PSTATE.SM`
is set.

Since the kernel may interrupt *itself*, kernel code is forbidden from entering
streaming SVE mode. This policy means that xnu does not need to preserve
`TPIDR2_EL0`, `ZA`, or `ZT0` during kernel entry and exit, since there are no
in-kernel SME operations that could clobber them.

### Context switching

xnu saves and restores `TPIDR2_EL0`, `ZA`, and `ZT0` inside the ARM64
implementation of `machine_switch_context()`, specifically as the routines
`machine_{save,restore}_sme_context()` in `osfmk/arm64/pcb.c`. These in turn
build on lower-level routines to save and load SME register state, located in
`osfmk/arm64/sme.c`. The low-level routines are built on top of the SME `str`
and `ldr` instructions, which can be executed outside of streaming SVE mode.

`machine_{save,restore}_sme_context()` unconditionally save and restore
`TPIDR2_EL0`, since its contents are valid even when EL0 isn't actually using
SME. However `ZA`'s and `ZT0`'s contents are often invalid and hence do not
require context-switching. `machine_save_sme_context()` reads `SVCR.ZA`
to determine if the `ZA` and `ZT0` arrays were actually valid at context-switch
time. If not, it skips saving the invalid `ZA` and `ZT0` contents.
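
The save policy described above can be sketched in portable C. This is a
minimal model under stated assumptions, not the code in `osfmk/arm64/pcb.c`:
the `model_cpu` and `model_sme_saved` structures, the
`model_save_sme_context()` helper, and the toy buffer sizes are hypothetical.
The only architectural detail relied on is that `SVCR` holds `SM` in bit 0 and
`ZA` in bit 1.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SVCR_SM (UINT64_C(1) << 0)   /* SVCR.SM */
#define SVCR_ZA (UINT64_C(1) << 1)   /* SVCR.ZA */

/* Toy sizes (assumed): ZA for SVL = 128 bits, ZT0 fixed at 512 bits. */
#define MODEL_ZA_BYTES  256
#define MODEL_ZT0_BYTES 64

/* Hypothetical stand-in for the live CPU registers. */
struct model_cpu {
    uint64_t svcr;
    uint64_t tpidr2_el0;
    uint8_t  za[MODEL_ZA_BYTES];
    uint8_t  zt0[MODEL_ZT0_BYTES];
};

/* Hypothetical stand-in for a thread's saved SME state. */
struct model_sme_saved {
    uint64_t tpidr2_el0;
    bool     za_valid;
    uint8_t  za[MODEL_ZA_BYTES];
    uint8_t  zt0[MODEL_ZT0_BYTES];
};

static void
model_save_sme_context(const struct model_cpu *cpu, struct model_sme_saved *st)
{
    /* TPIDR2_EL0 is always meaningful, so it is always saved. */
    st->tpidr2_el0 = cpu->tpidr2_el0;

    /* ZA and ZT0 only hold valid data while SVCR.ZA is set. */
    st->za_valid = (cpu->svcr & SVCR_ZA) != 0;
    if (st->za_valid) {
        memcpy(st->za, cpu->za, sizeof(st->za));
        memcpy(st->zt0, cpu->zt0, sizeof(st->zt0));
    }
}
```

Skipping the copy when `SVCR.ZA` is clear avoids moving the largest part of the
SME state for threads that are not actively using the matrix registers.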

Likewise, when context-switching back to a thread where the saved-state
`SVCR.ZA` is cleared, `machine_restore_sme_context()` simply ensures that the
CPU's `PSTATE.ZA` bit is cleared (executing `smstop za` if necessary). xnu does
not need to manually invalidate any `ZA` or `ZT0` state left by a previous
thread: the next time `PSTATE.ZA` is enabled, the CPU is architecturally
guaranteed to zero out both register files.

As noted above, xnu saves `SVCR` on kernel entry and uses it to restore
`PSTATE.SM` on kernel exit. Hence `machine_restore_sme_context()` updates
`PSTATE.ZA` to match the new process's saved state, but doesn't update
`PSTATE.SM`. Likewise `machine_restore_sme_context()` doesn't manipulate the `Z`
or `P` register files, since these will be updated on kernel exit.

Since SME thread state (`thread->machine.usme`) is large, and won't be used by
most threads, xnu lazily allocates the backing memory the first time a thread
encounters an SME instruction. This is implemented by clearing `CPACR_EL1.SMEN`
inside `machine_restore_sme_context()`, then performing the allocation during
the subsequent SME trap.
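
The lazy-allocation scheme can be sketched as follows. This is a hedged
illustration rather than xnu's implementation: `model_thread`,
`model_cpu_ctl`, and the `model_*` functions are hypothetical, the `SMEN` trap
control is reduced to a single boolean, and the size calculation counts only
the `ZA` and `ZT0` payload. The sketch also assumes that the trap is armed only
for threads that have no SME state yet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical per-thread state: NULL until the thread first uses SME. */
struct model_thread {
    void *usme;
};

/* Hypothetical per-CPU control: true when SME operations will trap. */
struct model_cpu_ctl {
    bool sme_trap_enabled;
};

/* ZA is SVL x SVL bits; ZT0 is a fixed 512 bits (64 bytes). */
static size_t model_sme_state_bytes(size_t svl_bits)
{
    size_t svl_bytes = svl_bits / 8;
    return svl_bytes * svl_bytes + 64;
}

/* On context switch: arm the trap if the incoming thread has no SME state. */
static void
model_restore(struct model_cpu_ctl *cpu, const struct model_thread *t)
{
    cpu->sme_trap_enabled = (t->usme == NULL);
}

/* On the first SME trap: allocate zeroed state, then stop trapping. */
static bool
model_handle_sme_trap(struct model_cpu_ctl *cpu, struct model_thread *t,
    size_t svl_bits)
{
    if (t->usme == NULL) {
        t->usme = calloc(1, model_sme_state_bytes(svl_bits));
        if (t->usme == NULL) {
            return false;   /* allocation failed; caller must handle it */
        }
    }
    cpu->sme_trap_enabled = false;
    return true;
}
```

With a 512-bit SVL the payload alone is 64 x 64 + 64 = 4160 bytes per thread,
which illustrates why deferring the allocation is worthwhile for threads that
never touch SME.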

### Execution priority

xnu does not currently have an API for changing SME execution priority.
Accordingly xnu resets `SMPRI_EL1` to `0` during CPU initialization, and
otherwise does not modify it at runtime.

### Power management

xnu updates `PSTATE.ZA` during `machine_switch_sme_context()` using the `SVCR`
value stashed in the new thread's SME state. If the new process has never used
SME, and hence doesn't have saved `ZA` state, xnu unconditionally clears
`PSTATE.ZA`. This policy means that xnu issues the power-down hint
`PSTATE.{SM,ZA} = {0,0}` on every context-switch, unless the new thread has live
`ZA` state. (Recall that `PSTATE.SM` was previously cleared on kernel entry.)

By extension, xnu will always issue this hint before entering WFI. In order to
reach `arm64_retention_wfi()`, xnu must first context-switch to the idle thread,
which never has `ZA` state.

### Virtualizing SME

SME introduces a number of new registers that the hypervisor needs to manage.
`SMCR_ELx` is the only one of these that's banked between EL1 and EL2.
The `SVCR`, `SMPRI_EL1`, and `TPIDR2_EL0` system registers are all shared
between the host and guest, and must be managed by the host hypervisor
accordingly.

More critically, the `Z`, `P`, `ZA`, and `ZT0` register files are also shared
across all exception levels. To minimize the cost of managing this unbanked SME
register state, xnu tries to keep the guest matrix state resident in the CPU as
long as possible, even when the guest traps to EL2. xnu will only spill the `ZA`
and `ZT0` state back to memory when one of two things happens:

(1) The `hv_vcpu_run` trap handler returns control all the way back to the VMM
    thread at host EL0

(2) xnu needs to context-switch the host VMM thread that owns the vCPU

In these cases xnu will spill the guest `ZA` and `ZT0` state back to memory,
then replace them with the VMM thread's or new thread's state (respectively).

Unfortunately since xnu has to disable streaming SVE mode to handle traps, it's
still forced to spill `Z` and `P` state to memory anytime the guest traps to EL2
with `PSTATE.SM` set.

Since xnu doesn't currently support SME prioritization, it sets `HCRX_EL2.SMPME`
and populates all `SMPRIMAP_EL2` entries with a value of `0`. Guest OSes are
still allowed to write to `SMPRI_EL1`, but currently this has no effect on
the actual hardware priority.


Footnotes
---------

<a name="feat_sve_footnote"></a>1. For simplicity, this section describes the
behavior on Apple CPUs. Details like register length and accessibility may
depend on whether the CPU is in streaming SVE mode (described later in the
document). Apple's current SME implementation simply makes SVE features
inaccessible outside this mode.

<a name="feat_sme_fa64_footnote"></a>2. The optional CPU feature FEAT_SME_FA64
allows full use of the SIMD instruction set inside streaming SVE mode.
However xnu does not currently support any CPUs which implement FEAT_SME_FA64.

<a name="cpacr_zen_footnote"></a>3. `CPACR_EL1` and `CPTR_ELx` also have a
discrete trap control `ZEN` for SVE instruction and register accesses performed
outside streaming SVE mode.
This trap control isn't currently relevant to Apple
CPUs, since Apple's current SME implementation only allows SVE accesses inside
streaming SVE mode.

<a name="xnu_simd_footnote"></a>4. LLVM is surprisingly aggressive about
emitting SIMD instructions unless explicitly inhibited by compiler flags. Even
if the xnu build started inhibiting these instructions for targets that support
SME, they could still appear in existing kext binaries.