marcan

Linux 5.13 and a new hypervisor

Added 2021-05-13 04:18:04 +0000 UTC

Hi all! It's been a few busy months and I'm way overdue for another update. There has been a lot of interesting progress lately. I will be covering it in official progress updates, but here is a quick overview of what's going on.

Basic bring-up merged!

As I'm sure many of you saw, my initial M1 bring-up series was merged into the Linux kernel last month and will be coming to kernel release 5.13 (5.13-rc1 was released a few days ago). This release just has basic "it boots" support - there are no drivers other than serial and a working framebuffer - but it lays the core foundation for all subsequent work, and keeps our efforts synced with mainline, which means we can avoid the ongoing maintenance burden and big rewrites that working on kernel forks causes. I have a pending progress report for March that will cover what it takes to get a driver like this merged into the Linux kernel, but that has somewhat ended up on the back burner, as you'll see.

The next step is to work on bringing up USB, PCIe, and NVMe, and other folks have already done initial work on those areas: Arnd from the Linux SoC team has a prototype NVMe platform binding; my friend Sven has been working on clock gating, which is a requirement for these devices to work; and kettenis has been bringing up PCIe and pinctrl. Before focusing on pushing forward with those, though, I've embarked on another tooling project.

The road to GPU acceleration

You've probably seen Alyssa's work on the M1 GPU. She's been working together with other Mesa folks on reverse engineering the rendering and shaders of the M1 GPU from macOS, and their work-in-progress Mesa driver already passes 85% of the OpenGL ES 2.0 test suite!

But the userspace driver is only half of the story. This driver currently runs on macOS, in place of Apple's Metal framework, but still using the macOS kernel driver for the GPU. In order to use this on Linux, we need to write a kernel driver for the GPU. The kernel side is in charge of managing how commands are sent from applications to the GPU itself, and how memory is allocated and shared. In addition, we also need a display driver (which is completely separate from the GPU); this part would be in charge of things like switching on external displays and changing resolutions.

The userspace side is being reverse engineered by treating Apple's Metal framework as a black box: write a Metal (or OpenGL on Metal) app, intercept the commands that the framework sends to the kernel, and then correlate them with what the app did to understand how to "speak" the GPU's language. This includes things like figuring out how the GPU shaders work.
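To make the "correlate" step concrete, here is a toy sketch of one common black-box trick: capture the command stream twice while changing a single thing in the app, then see which bytes moved. Everything here is made up for illustration; the `fake_capture` function and its command layout are hypothetical stand-ins, not the real M1 GPU command format or the actual tooling used for the Mesa driver.

```python
import struct


def diff_offsets(a: bytes, b: bytes) -> list:
    """Return the byte offsets at which two equal-length captures differ."""
    if len(a) != len(b):
        raise ValueError("captures must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def fake_capture(clear_red: float) -> bytes:
    """Pretend capture of what a framework sent to the kernel.

    Hypothetical layout: 4-byte opcode, float32 clear colour, 4-byte padding.
    """
    return struct.pack("<IfI", 0x1234, clear_red, 0)


# Same "app", with only the red clear colour changed between runs.
base = fake_capture(0.0)
changed = fake_capture(1.0)

# The differing offsets point at where that parameter lives in the stream.
print(diff_offsets(base, changed))
```

In this toy case only the bytes inside the float field change, which is exactly the kind of hint you then follow up with more experiments.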

In order to write the kernel driver, we need to do the same thing there: intercept what the kernel does with the hardware, and use that information to figure out how to do the same thing. This kind of approach has been used in the past for GPUs which have Linux kernel drivers: the Nouveau project reverse engineered Nvidia GPUs by running Nvidia's proprietary driver on a kernel modified to intercept the driver's hardware accesses and log them, using a facility called "mmiotrace".
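The idea behind this kind of tracing can be sketched in a few lines. This is only a simulation under stated assumptions: `FakeDevice`, `TracedDevice`, and `toy_driver_init` are hypothetical names, the register offsets are invented, and the real mmiotrace works by trapping accesses inside the kernel rather than by wrapping a Python object.

```python
class FakeDevice:
    """Stand-in for a block of hardware registers (hypothetical)."""

    def __init__(self):
        self.regs = {}

    def read32(self, addr):
        return self.regs.get(addr, 0)

    def write32(self, addr, value):
        self.regs[addr] = value & 0xFFFFFFFF


class TracedDevice:
    """Sits between the driver and the device, logging every access."""

    def __init__(self, dev):
        self.dev = dev
        self.log = []

    def read32(self, addr):
        value = self.dev.read32(addr)
        self.log.append(("R", addr, value))
        return value

    def write32(self, addr, value):
        self.log.append(("W", addr, value))
        self.dev.write32(addr, value)


def toy_driver_init(dev):
    """A made-up 'proprietary driver' whose register usage we want to learn."""
    dev.write32(0x00, 0x1)           # looks like an enable bit
    while dev.read32(0x04) != 0:     # looks like a busy/status poll
        pass
    dev.write32(0x08, 0xCAFE)        # some configuration value


traced = TracedDevice(FakeDevice())
toy_driver_init(traced)
for kind, addr, value in traced.log:
    print(f"{kind} 0x{addr:02x} = 0x{value:x}")
```

The driver runs unmodified, and the log is the raw material for working out what each register does.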

But here we have no Linux drivers, so whatever we do has to work with macOS instead. Since the macOS kernel is actually open source, we could do the same kind of mmiotrace trick, just ported to run on XNU (the macOS kernel) instead. However, I decided to take a different approach which should be much more powerful in the long run, and also help with other goals.

m1n1 as a hypervisor

Instead of building the tracing functionality into the macOS kernel, why not run macOS in a VM? Of course, running macOS in a "typical" VM, for example under Linux, is not easy; we would have to develop virtual devices matching all of the physical hardware in the M1, which is a massive undertaking.

However, VMs do not have to work like this. You may be familiar with the concept of "device passthrough", where you can, for example, "assign" a physical GPU to a VM and the VM will see it as a native device, using the original driver. This concept can be taken to its extreme, where all or almost all of the hardware in a system can be passed through straight into the VM. This is what I am planning to do: turn m1n1 (our Linux bootloader) into a "thin" hypervisor that can boot macOS and inspect the way it uses the hardware.

This may seem like a roundabout way of doing things, but it turns out writing a tiny hypervisor like this isn't actually a very big project; it's not much more complicated than trying to do the tracing in macOS, and it has several advantages. The hypervisor can be used not just to log hardware accesses, but also to debug the underlying OS. This means we can also use it to help develop Linux and even m1n1 itself!
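As a rough mental model (not the actual m1n1 code), a "thin" hypervisor like this lets most accesses go straight to the real hardware and only gets involved for the address ranges it has been asked to watch. The sketch below simulates that split; `ThinHypervisorModel`, its methods, and all addresses are hypothetical. On real ARM hardware the split would typically be made with second-stage page tables, mapping passthrough pages directly and leaving watched pages unmapped so that accesses fault into the hypervisor.

```python
class ThinHypervisorModel:
    """Toy model: pass everything through, but log accesses to watched ranges."""

    def __init__(self):
        self.hardware = {}   # stands in for the real physical address space
        self.trapped = []    # list of (start, end, name) watched ranges
        self.trace = []      # what the guest did to watched devices

    def trap_range(self, start, size, name):
        self.trapped.append((start, start + size, name))

    def _trapped_name(self, addr):
        for start, end, name in self.trapped:
            if start <= addr < end:
                return name
        return None

    def guest_write(self, addr, value):
        name = self._trapped_name(addr)
        if name is not None:
            self.trace.append((name, "W", addr, value))
        # Either way, the access still reaches the real device.
        self.hardware[addr] = value

    def guest_read(self, addr):
        value = self.hardware.get(addr, 0)
        name = self._trapped_name(addr)
        if name is not None:
            self.trace.append((name, "R", addr, value))
        return value


hv = ThinHypervisorModel()
hv.trap_range(0x4000, 0x100, "gpu")   # made-up address range

hv.guest_write(0x1000, 0xAA)   # not watched: pure passthrough, nothing logged
hv.guest_write(0x4010, 0x1)    # watched: logged, then forwarded
hv.guest_read(0x4014)          # watched: forwarded, then logged

print(hv.trace)
```

Because the guest still talks to the real device, no device emulation is needed; the hypervisor only has to observe.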

I've been streaming the development process of the hypervisor, and it's coming along quite well! At this time, it's at a stage where it can already boot m1n1 itself and Linux under it; here is an example session of what it looks like to boot Linux under it and debug a configuration issue in real time.

It also has basic virtual serial port support, which is very important. You may recall that one of the issues with M1 development is that you need a special serial cable (or another M1 device) for low-level debugging. But m1n1 now supports communicating via USB (which is much faster than serial and can work with any host machine, e.g. any Linux box). Since the hypervisor seamlessly routes the virtual serial port over that USB connection, anyone can now work on m1n1 and Linux development without any special hardware. The current version is a quick hack, but I'll be extending it to work as a proper, separate USB interface soon. This really opens up the machines to any developer who wants to help out or collaborate; no more special cables or having to buy a secondary M1 machine. It also makes Linux kernel debugging a lot more enjoyable. I will later add support for using the hypervisor as a GDB stub, which will enable it to work together with standard Linux debugging tools.
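Here is a minimal sketch of what a virtual serial port amounts to, assuming a very simple UART with one transmit register and one status register. The register layout, the `VirtualUart` class, and the use of an in-memory buffer in place of the USB link are all assumptions for illustration; the real UART registers and the real m1n1 transport differ.

```python
import io

# Hypothetical register layout for a very simple UART.
UART_BASE = 0x1000
UART_TX = UART_BASE + 0x0       # write a byte here to transmit it
UART_STATUS = UART_BASE + 0x4   # bit 0 set means "ready for another byte"
TX_READY = 0x1


class VirtualUart:
    """Hypervisor-side stand-in: guest register writes become bytes on a host link."""

    def __init__(self, host_link):
        self.host_link = host_link   # here a BytesIO; in practice a USB pipe

    def read32(self, addr):
        if addr == UART_STATUS:
            return TX_READY          # the virtual port is always ready
        return 0

    def write32(self, addr, value):
        if addr == UART_TX:
            self.host_link.write(bytes([value & 0xFF]))


def guest_puts(uart, text):
    """What an unmodified guest serial driver does: poll status, write a byte."""
    for byte in text.encode("ascii"):
        while not uart.read32(UART_STATUS) & TX_READY:
            pass
        uart.write32(UART_TX, byte)


usb_link = io.BytesIO()
uart = VirtualUart(usb_link)
guest_puts(uart, "Booting Linux...\n")
print(usb_link.getvalue().decode("ascii"), end="")
```

The guest believes it is talking to a physical serial port, while the bytes actually end up on whatever link the hypervisor chooses.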

As for macOS, as of a couple of days ago the kernel boots to the point of having serial and framebuffer debug output. I'm currently working through enabling some custom M1 features that macOS needs to work, which need to be configured properly in the hypervisor. If all goes well, I think we may have fully virtualized macOS booting within a week or two! (By the way, macOS already boots non-virtualized from m1n1; you can use chainload.py --xnu to boot a macOS kernel from USB.)

The hypervisor is designed as a hybrid C/Python app, with part running on the host and part running in m1n1 itself, which makes it very easy to iterate on by just editing Python scripts, even in real time (in an interactive Python debug shell). Currently rebooting the machine and booting a macOS kernel with a new version of the hypervisor code takes just 26 seconds, which is a very efficient testing cycle (most of the time is uploading the 100+MB XNU kernel). I place a lot of value on efficient tooling like this, to enable myself and others to work comfortably. (Don't miss Sven's work reverse engineering Guarded Execution, which he did using the same m1n1 Python framework the hypervisor is based on :-))
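To illustrate why the split design makes iteration fast, here is a toy model of the host side: the target reports events, and Python handlers on the host decide what to do with them. Because handlers are ordinary Python objects, they can be swapped from an interactive shell without touching the code on the target. The `HostSideModel` class, the event names, and the payload format are hypothetical and are not the real m1n1 protocol.

```python
class HostSideModel:
    """Toy host side: events from the target are dispatched to Python handlers."""

    def __init__(self):
        self.handlers = {}

    def set_handler(self, event, fn):
        # Replacing a handler takes effect on the very next event.
        self.handlers[event] = fn

    def handle(self, event, payload):
        handler = self.handlers.get(event)
        if handler is None:
            return f"unhandled {event}"
        return handler(payload)


host = HostSideModel()

# First attempt at a handler: only log the address.
host.set_handler("mmio_write", lambda p: f"write {p['addr']:#x}")
first = host.handle("mmio_write", {"addr": 0x4010, "value": 1})

# Changed our mind mid-session: also log the value. No reboot involved.
host.set_handler("mmio_write", lambda p: f"write {p['addr']:#x} <- {p['value']:#x}")
second = host.handle("mmio_write", {"addr": 0x4010, "value": 1})

print(first)
print(second)
```

The part running on the target only needs to forward events, so most experiments become edits to host-side Python.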

Once this all works... it's time to tackle the GPU kernel side! And of course, this will also make figuring out all the other drivers much easier too (in particular, audio is another big question mark). The next progress report will be another double issue, covering April and May, and will be all about this project.

New devices

You all know Apple have announced new M1 devices, and I placed an order for an M1 iMac to be able to test on it too. Thank you to all my supporters; it wouldn't be possible for me to acquire testing hardware like this without your help. This helps ensure we can cover all hardware features and squash all device-specific bugs in the future!

Thanks again to everyone, and please look forward to the upcoming progress reports and the hypervisor!