Qubes-lite with KVM and Wayland - Thomas Leonard's blog

I've been running QubesOS as my main desktop since 2015. It provides good security, by running applications in different Xen VMs. However, it is also quite slow and has some hardware problems. I've recently been trying out NixOS, KVM, Wayland and SpectrumOS, and attempting to create something similar with more modern/compatible/faster technology.

This post gives my initial impressions of these tools and describes my current setup.

Table of Contents

QubesOS
NixOS
Why use virtual machines?
SpectrumOS
Wayland
Future work

(This post also appeared on Hacker News and Lobsters.)

QubesOS

QubesOS aims to provide "a reasonably secure operating system". It does this by running multiple virtual machines under the Xen hypervisor. Each VM's windows have a different colour and tag, but they appear together as a single desktop. The VMs I run include:

com for email and similar (the only VM that sees my email password).
dev for software development.
shopping (the only VM that sees my card number).
personal (with no Internet access)
untrusted (general browsing)

The desktop environment itself is another Linux VM (dom0), used for managing the other VMs. Most of the VMs are running Fedora (the default for Qubes), although I run Debian in dev. There are also a couple of system VMs; one for dealing with the network hardware, and one providing a firewall between the VMs.

You can run qvm-copy in a VM to copy a file to another VM. dom0 pops up a dialog box asking which VM should receive the file, and it arrives there as ~/QubesIncoming/$source_vm/$file. You can also press Ctrl-Shift-C to copy a VM's clipboard to the global clipboard, and then press Ctrl-Shift-V in a window of the target VM to copy to that VM's clipboard, ready for pasting into an application.

I think Qubes does a very good job at providing a secure environment.

However, it has poor hardware compatibility and it feels sluggish, even on a powerful machine. I bought a new machine a while ago and found that the motherboard only provided a single video output, limited to 30Hz. This meant I had to buy a discrete graphics card. With the card enabled, the machine fails to resume from suspend, and locks up from time to time (it's completely stable with the card removed or disabled). I spent some time trying to understand the driver code, but I didn't know enough about graphics, the Linux kernel, PCI suspend, or Xen to fix it.

I was also having some other problems with QubesOS:

Graphics performance is terrible (especially on a 4k monitor). Qubes disables graphics acceleration in VMs for security reasons, but it was slow even for software rendering.
It recently started freezing for a couple of seconds from time to time - annoying when you're trying to type.
It uses LVM thin-pools for VM storage, which I don't understand, and which sometimes need repairing (haven't lost any data, though).
dom0 is out-of-date and generally not usable. This is intentional (you should be using VMs), but my security needs aren't that high and it would be nice to be able to do video conferencing these days. Also, being able to print over USB and use bluetooth would be handy.

Anyway, I decided it was time to try something new. Linux now has its own built-in hypervisor (KVM), and I thought that would probably work better with my hardware. I was also keen to try out Wayland, which is built around shared-memory and I thought it might therefore work better with VMs. How easy would it be to recreate a Qubes-like environment directly on Linux?

NixOS

I've been meaning to try NixOS properly for some time. Ever since I started using Linux, its package management has struck me as absurd. On Debian, Fedora, etc, installing a package means letting it put files wherever it likes, which effectively gives the package author root on your system. Not a good base for sandboxing!

Also, they make it difficult to try out 3rd-party software, or to test newer versions of just some packages.

In 2003 I created 0install to address these problems, and Nix has very similar goals. I thought Nix was a few years younger, but looking at its Git history the first commit was on Mar 12, 2003. I announced the first preview of 0install just two days later, so both projects must have started writing code within a few days of each other!

NixOS is made up of quite a few components. Here is what I've learned so far:

nix-store

The store holds the files of all the programs, and is the central component of the system. Each version of a package goes in its own directory (or file), at /nix/store/$HASH-$NAME. You can add data to the store directly, like this:

$ echo hello > file

$ nix-store --add-fixed sha256 file
/nix/store/1vap48aqggkk52ijn2prxzxv7cnzvs0w-file

$ cat /nix/store/1vap48aqggkk52ijn2prxzxv7cnzvs0w-file
hello

Here, the store location is calculated from the hash of the contents of the file we added (as with 0install store add or git hash-object).
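
To make the idea of content addressing concrete, here is a toy sketch in Python. It is not Nix's real hashing scheme (Nix uses its own digest encoding and path layout); the `add_to_store` helper and the truncated SHA-256 name are assumptions made up for illustration. It only shows the principle: the location is derived from the bytes, so adding the same contents twice yields the same path.

```python
import hashlib
import tempfile
from pathlib import Path


def add_to_store(store: Path, source: Path) -> Path:
    """Copy `source` into `store` under a name derived from its contents.

    Hypothetical helper: the digest format here is NOT the one Nix uses.
    """
    data = source.read_bytes()
    digest = hashlib.sha256(data).hexdigest()[:32]
    dest = store / f"{digest}-{source.name}"
    if not dest.exists():  # identical contents are only stored once
        dest.write_bytes(data)
    return dest


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "store"
    store.mkdir()
    src = Path(tmp) / "file"
    src.write_text("hello\n")

    first = add_to_store(store, src)
    second = add_to_store(store, src)  # same bytes, so same location
    entries = sorted(p.name for p in store.iterdir())

    src.write_text("goodbye\n")
    third = add_to_store(store, src)  # different bytes, different location

    print(first.name)
    print(first == second, first == third)
```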

However, you can also add things to the store by asking Nix to run a build script. For example, to compile some source code:

You add the source code and some build instructions (a "derivation" file) to the store.
You ask the store to build the derivation. It runs your build script in a container sandbox.
The results are added to the store, using the hash of the build instructions (not the hash of the result) as the directory name.

If a package in the store depends on another one (at build time or run time), it just refers to it by its full path. For example, a bash script in the store will start something like:

#! /nix/store/vnyfysaya7sblgdyvqjkrjbrb0cy11jf-bash-4.4-p23/bin/bash
...

If two users want to use the same build instructions, the second one will see that the hash already exists and can just reuse that. This allows users to compile software from source and share the resulting binaries, without having to trust each other.
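
The reuse described above can be sketched as a cache keyed by the hash of the build instructions. This is a toy model, not how the Nix daemon is implemented: `ToyStore`, the `/toy/store/` prefix and the `upper_case_builder` function are all invented for the example.

```python
import hashlib
import json


class ToyStore:
    """Toy model of an input-addressed store (names are hypothetical)."""

    def __init__(self):
        self.outputs = {}    # store path -> build result
        self.builds_run = 0  # how many times a builder actually ran

    def path_for(self, instructions: dict) -> str:
        # The path depends on the *instructions*, not on the build result.
        blob = json.dumps(instructions, sort_keys=True).encode()
        digest = hashlib.sha256(blob).hexdigest()[:32]
        return f"/toy/store/{digest}-{instructions['name']}"

    def build(self, instructions: dict, builder) -> str:
        path = self.path_for(instructions)
        if path not in self.outputs:  # only build if nobody has already
            self.outputs[path] = builder(instructions)
            self.builds_run += 1
        return path


def upper_case_builder(instructions: dict) -> str:
    # Stand-in for a real compile step.
    return instructions["source"].upper()


store = ToyStore()
recipe = {"name": "hello", "source": "hello world", "builder": "upper-case"}

alice_path = store.build(recipe, upper_case_builder)
bob_path = store.build(dict(recipe), upper_case_builder)  # second user, same recipe
other_path = store.build({**recipe, "source": "changed"}, upper_case_builder)

print(alice_path == bob_path, store.builds_run)
```

The second user never supplies a binary; they only name the instructions, which is why they don't need to trust the first user.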

Ideally, builds should be reproducible. To encourage this, builds which use the hash of the build instructions for the result path are built in a sandbox without network access. So, you can't submit a build job like "Download and compile whatever is the latest version of Vim". But you can discover the latest version yourself and then submit two separate jobs to the store:

"Download Vim 8.2, with hash XXX" (a fixed-output job, which therefore has network access)
"Build Vim from hash XXX"

You can run nix-collect-garbage to delete everything from the store that isn't reachable via the symlinks under /nix/var/nix/gcroots/. Users can put symlinks to things they care about keeping in /nix/var/nix/gcroots/per-user/$USER/.
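
Reachability from the roots is an ordinary graph traversal. The sketch below is a toy model in Python, with a made-up store and package names; the real collector works on store paths and their recorded references, but the mark-and-sweep idea is the same.

```python
def find_garbage(store: dict, roots: set) -> set:
    """Return the store entries that are not reachable from `roots`.

    `store` maps each entry to the set of entries it refers to.
    """
    live = set()
    stack = list(roots)
    while stack:
        entry = stack.pop()
        if entry in live or entry not in store:
            continue
        live.add(entry)              # mark
        stack.extend(store[entry])   # follow its references
    return set(store) - live         # sweep: everything left unmarked


# Hypothetical store contents, for illustration only.
toy_store = {
    "firefox": {"gtk", "glibc"},
    "gtk": {"glibc"},
    "glibc": set(),
    "old-vim": {"glibc"},
}
garbage = find_garbage(toy_store, roots={"firefox"})
print(sorted(garbage))
```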

By default, the store is also configured with a trusted binary cache service, and will try to download build results from there instead of compiling locally when possible.

nix-instantiate

Writing derivation files by hand is tedious, so Nix provides a templating language to create them easily. The Nix language is dynamically typed and based around maps/dictionaries (which it confusingly refers to as "sets"). nix-instantiate file.nix will generate a derivation from file.nix and add it to the store.

A Nix file looks like this:

derivation { system = "x86_64-linux"; builder = ./myfile; name = "foo"; }

Running nix-instantiate on this will:

Add myfile to the store.
Add the generated foo.drv to the store, including the full store path of myfile.

nixpkgs

Writing Nix expressions for every package you want would also be tedious. The nixpkgs Git repository contains a Nix expression that evaluates to a set of derivations, one for each package in the distribution. It also contains a library of useful helper functions for packages (e.g. it knows how to handle GNU autoconf packages automatically).

Rather than evaluating the whole lot, you use -A to ask for a single package. For example, you can use nix-instantiate ./nixpkgs/default.nix -A firefox to generate a derivation for Firefox.

nix-build is a quick way to create a derivation with nix-instantiate and build it with nix-store. It will also create a ./result symlink pointing to its path in the store, as well as registering ./result with the garbage collector under /nix/var/nix/gcroots/auto/. For example, to build and run Firefox:

nix-build ./nixpkgs/default.nix -A firefox
./result/bin/firefox

If you use nixpkgs without making any changes, it will be able to download a pre-built binary from the cache service.

nix-env

Keeping track of all these symlinks would be tedious too, but you can collect them all together by making a package that depends on every application you want. Its build script will produce a bin directory full of symlinks to the applications. Then you could just point your $PATH variable at that bin directory in the store.

To make updating easier, you will actually add ~/.nix-profile/bin/ to $PATH and update .nix-profile to point at the latest build of your environment package.

This is essentially what nix-env does, except with yet more symlinks to allow for switching between multiple profiles, and to allow rolling back to previous environments if something goes wrong.

For example, to install Firefox so you can run it via $PATH:

nix-env -i firefox
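
The profile mechanism is easy to imitate with plain symlinks. The following Python sketch uses a temporary directory and invented names (`env-1`, `env-2`, `switch_profile`); it is not what nix-env does internally, just the underlying trick of repointing one symlink atomically so that upgrades and rollbacks are a single rename.

```python
import os
import tempfile
from pathlib import Path


def switch_profile(profile: Path, target: Path) -> None:
    """Atomically repoint the `profile` symlink at `target`."""
    tmp = profile.with_name(profile.name + ".tmp")
    if tmp.is_symlink():
        tmp.unlink()
    tmp.symlink_to(target)
    os.replace(tmp, profile)  # rename(2) swaps the link in one step


with tempfile.TemporaryDirectory() as tmpdir:
    root = Path(tmpdir)
    # Two hypothetical generations of an environment package.
    for generation, text in [("env-1", "python 3.9.0"), ("env-2", "python 3.9.1")]:
        (root / generation / "bin").mkdir(parents=True)
        (root / generation / "bin" / "version").write_text(text)

    profile = root / "profile"
    seen = []
    for generation in ["env-1", "env-2", "env-1"]:  # install, upgrade, roll back
        switch_profile(profile, root / generation)
        seen.append((profile / "bin" / "version").read_text())

    print(seen)
```

Anything that has `profile/bin` on its path sees either the old environment or the new one, never a half-updated mixture.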

NixOS

Finally, just as nix-env can create a user environment with bin, man, etc, a similar process can create a root filesystem for a Linux distribution.

nixos-rebuild reads the /etc/nixos/configuration.nix configuration file, generates a system environment, and then updates grub and the /run/current-system symlink to point to it.

In fact, it also lists previous versions of the system environment in the grub file, so if you mess up the configuration you can just choose an earlier one from the boot menu to return to that version.

Installing NixOS

To install NixOS you boot one of the live images at https://nixos.org. Which image you choose only affects the installation UI, not the system you end up with.

The manual walks you through the installation process, showing how to partition the disk, format and mount the partitions, and how to edit the configuration file. I like this style of installation, where it teaches you things instead of just doing it for you. Most of the effort in switching to a new system is learning about it, so I'd rather spend 3 hours learning stuff following an installation guide than use a 15-minute single-click installer that teaches me nothing.

The configuration file (/etc/nixos/configuration.nix) is just another Nix expression. Most things are set to off by default (I approve), but can be changed easily. For example, if you want sound support you change that setting to sound.enable = true, and if you also want to use PulseAudio then you set hardware.pulseaudio.enable = true too.

Every system service supported by NixOS is controlled from here, with all kinds of options, from programs.vim.defaultEditor = true (so you don't get trapped in nano) to services.factorio.autosave-interval. Use man configuration.nix to see the available settings.

NixOS defaults to an X11 desktop, but I wanted to try Wayland (and Sway). Based on the NixOS wiki instructions, I used this:

  programs.sway = {
    enable = true;
    wrapperFeatures.gtk = true; # so that gtk works properly
    extraSessionCommands = "export MOZ_ENABLE_WAYLAND=1";
    extraPackages = with pkgs; [
      swaylock
      swayidle
      xwayland
      wl-clipboard
      mako
      alacritty
      dmenu
    ];
  };

The xwayland bit is important; without that you can't run any X11 applications.

My only complaint with the NixOS installation instructions is that following them will leave you with an unencrypted system, which isn't very useful. When partitioning, you have to skip ahead to the LUKS section of the manual, which just gives some options but no firm advice. I created two primary partitions: a 1G unencrypted /boot, and a LUKS partition for the rest of the disk. Then I created an LVM volume group from the /dev/mapper/crypted device and added the other partitions in that.

Once the partitions are mounted and the configuration file is complete, nixos-install downloads everything and configures grub. Then you reboot into the new system.

Once running the new system you can make further edits to the configuration file there in the same way, and use nixos-rebuild switch to generate a new system. It seems to be pretty good at updating the running system to the new settings, so you don't normally need to reboot after making changes.

The big mistake I made was forgetting to add /boot to fstab. When I ran nixos-rebuild it put all the grub configuration on the encrypted partition, rendering the system unbootable. I fixed that with chattr +i /boot on the unmounted partition. That way, trying to rebuild with /boot unmounted will just give an error message.

Thoughts on NixOS

I've been using the system for a few weeks now and I've had no problems with Nix so far. Nix has been fast and reliable and there were fairly up-to-date packages for everything I wanted (I'm using the stable release). There is a lot to learn, but plenty of documentation.

When I wanted a newer package (socat with vsock support, only just released) I just told Nix to install it from the latest Git checkout of nixpkgs. Unlike on Debian and similar systems, doing this doesn't interfere with any other packages (such as forcing a system-wide upgrade of libc).

I think Nix does download more data than most other systems, but networks are fast enough now that it doesn't seem to matter. For example, let's say you're running Python 3.9.0 and you want to update to 3.9.1:

With Debian: apt-get upgrade downloads the new version, which gets unpacked over the old one. As the files are unpacked, the system moves through an exciting series of intermediate states no-one has thought about. Running programs may crash as they find their library versions changing under them (though it's usually OK). Only root can update software.
With 0install: 0install update downloads the new version, unpacking it to a new directory. Running programs continue to use the old version. When a new program is started, 0install notices the update and runs the solver again. If the program is compatible with the new Python then it uses that. If not, it continues with the old one. You can run any previous version if there is a problem.
With Nix: nix-env -u downloads the new version, unpacking it to a new directory. It also downloads (or rebuilds) every package depending on Python, creating new directories for each of them. It then creates a new environment with symlinks to the latest version of everything. Running programs continue to use the old version. Starting a new program will use the new version. You can revert the whole environment back to the previous version if there is a problem.
With Docker: docker pull downloads the new version of a single application, downloading most or all of the application's packages, whether Python related or not. Existing containers continue running with the old version. New containers will default to using the new version. You can specify which version to use when starting a program. Other applications continue using the old version of Python until their authors update them (you must update each application individually, rather than just updating Python itself).

The main problem with NixOS is that it's quite different to other Linux systems, so there's a lot to relearn. Also, existing knowledge about how to edit fstab, sudoers, etc, isn't so useful, as you have to provide all configuration in Nix syntax. However, having a single (fairly sane) syntax for everything is a nice bonus, and being able to generate things using the templating language is useful. For example, for my network setup I use a bunch of tap devices (one for each of my VMs). It was easy to write a little Nix function (mktap) to generate them all from a simple list. Here's that section of my configuration.nix:

  networking = {
    useDHCP = false;
    interfaces =
      let mktap = ip: {
          virtual = true;
          virtualOwner = "tal";
          ipv4.addresses = [
            { address = ip; prefixLength = 31; }
          ];
        };
      in
      {
        eno2.useDHCP = true;
        wlo1.useDHCP = true;
        tapdev = mktap "10.0.0.2";
        tapcom = mktap "10.0.0.4";
        tapshopping = mktap "10.0.0.6";
        tapbanking = mktap "10.0.0.8";
        tapuntrusted = mktap "10.0.0.10";
      };
    nat = {
      enable = true;
      externalInterface = "eno2";
      internalIPs = [ "10.0.0.0/8" ];
    };
  };
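
The addresses above step by two because each tap device gets its own /31 network, which holds exactly two addresses. This Python sketch lists the pair for each device. That the guest end uses the other address of the pair is an assumption here (the configuration above only sets the host side), and `tap_pair` is a made-up helper.

```python
import ipaddress


def tap_pair(host_ip: str):
    """Return (host, peer): the two addresses of the /31 containing host_ip."""
    network = ipaddress.ip_interface(f"{host_ip}/31").network
    addresses = [str(address) for address in network]  # a /31 has exactly two
    addresses.remove(host_ip)
    return host_ip, addresses[0]


for name, ip in [("tapdev", "10.0.0.2"), ("tapcom", "10.0.0.4"), ("tapshopping", "10.0.0.6")]:
    host, peer = tap_pair(ip)
    print(f"{name}: host {host}, peer {peer}")
```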

Overall, I'm very happy with NixOS so far.

Why use virtual machines?

With NixOS I had a nice host environment, but after using Qubes I wanted to run my applications in VMs.

The basic problem is that Linux is the only thing that knows how to drive all the hardware, but Linux security is not ideal. There are several problems:

Linux is written in C. This makes security bugs rather common and, more importantly, means that a bug in one part of the code can impact any other part of the code. Nothing is secure unless everything is secure.
Linux has a rather large API (hundreds of syscalls).
The Linux (Unix) design predates the Internet, and security has been somewhat bolted on afterwards.

For example, imagine that we want to run a program with access to the network, but not to the graphical display. We can create a new Linux container for it using bubblewrap, like this:

$ ls -l /run/user/1000/wayland-0 /tmp/.X11-unix/X0
srwxr-xr-x 1 tal users 0 Feb 18 16:41 /run/user/1000/wayland-0
srwxr-xr-x 1 tal users 0 Feb 18 16:41 /tmp/.X11-unix/X0

$ bwrap \
    --ro-bind / / \
    --dev /dev \
    --tmpfs /home/tal \
    --tmpfs /run/user \
    --tmpfs /tmp \
    --unshare-all --share-net \
    bash

$ ls -l /run/user/1000/wayland-0 /tmp/.X11-unix/X0
ls: cannot access '/run/user/1000/wayland-0': No such file or directory
ls: cannot access '/tmp/.X11-unix/X0': No such file or directory

The container has an empty home directory, empty /tmp, and no access to the display sockets. If we run Firefox in this environment then... it opens its window just fine! How? strace shows what happened:

connect(4, {sa_family=AF_UNIX, sun_path="/run/user/1000/wayland-0"}, 27) = -1 ENOENT (No such file or directory)
socket(AF_UNIX, SOCK_STREAM|SOCK_CLOEXEC, 0) = 4
connect(4, {sa_family=AF_UNIX, sun_path=@"/tmp/.X11-unix/X0"}, 20) = 0

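After failing to connect to Wayland, it then tried using X11 (via Xwayland) instead. Why did that work? If the first byte of the socket pathname is \0 then Linux instead interprets it as an "abstract" socket address, not subject to the usual filesystem permission rules. Abstract sockets are scoped to the network namespace rather than the filesystem, so hiding /tmp made no difference, and --share-net left the container in the host's network namespace.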
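
Abstract sockets are easy to try out. This Linux-only Python sketch binds a listener to a name starting with a NUL byte and connects to it, without any file ever being created. The socket name is made up for the demo.

```python
import os
import socket

# Hypothetical name; the leading NUL byte selects the abstract namespace.
ADDRESS = b"\0demo-abstract-socket-%d" % os.getpid()

server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
server.bind(ADDRESS)
server.listen(1)

client = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
client.connect(ADDRESS)  # works even though nothing exists on disk
conn, _ = server.accept()

client.sendall(b"hello")
received = conn.recv(5)

# The part after the NUL byte is just a name, not a filesystem path.
on_disk = os.path.exists(ADDRESS[1:].decode())
print(received, on_disk)

for sock in (conn, client, server):
    sock.close()
```

Since there is no file, there is nothing for a bind mount or tmpfs to hide, and no file permissions to check.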

Trying to anticipate these kinds of special cases is just too much work. Linux really wants everything on by default, and you have to find and disable every feature individually. By contrast, virtual machines tend to have integrations with the host off by default. They also tend to have much smaller APIs (e.g. just reading and writing disk blocks or network frames), with the rich Unix API entirely inside the VM, provided by a separate instance of Linux.

SpectrumOS

I was able to set up a qemu guest and restore my dev Qubes VM in that, but it didn't integrate nicely with the rest of the desktop. Installing ssh allowed me to connect in with ssh -Y dev, allowing apps in the VM to open an X connection to Xwayland on the host. That was somewhat usable, but still a bit slower than Qubes had been (which was already a bit too slow).

Searching for a way to forward the Wayland connection directly, I came across the SpectrumOS project. SpectrumOS aims to use one virtual machine per application, using shared directories so that VM files are stored on the host, simplifying management. It uses crosvm from the ChromiumOS project instead of qemu, because it has a driver that allows forwarding Wayland connections (and also because it's written in Rust rather than C). The project's single developer is currently taking a break from the project, and says "I'm currently working towards a proof of concept".

However, there is some useful stuff in the SpectrumOS repository (which is a fork of nixpkgs). In particular, it contains:

A version of Linux with the virtwl kernel module, which connects to crosvm's Wayland driver.
A package for sommelier, which connects applications to virtwl.
A Nix expression to build a root filesystem for the VM.

Building that, I was able to run the project's demo, which runs the Wayfire compositor inside the VM, appearing in a window on the host. Dragging the nested window around, the pixels flowed smoothly across my screen in exactly the way that pixels on QubesOS don't.

This was encouraging, but I didn't want to run a nested window manager. I tried running Firefox directly (without Wayfire), but it complained that sommelier didn't provide a new enough version of something, and running weston-terminal immediately segfaulted sommelier.

Why do we need the sommelier process anyway? The problem is that, while virtwl mostly proxies Wayland messages directly, it can't send arbitrary FDs to the host. For example, if you want to forward a writable stream from an application to virtwl you must first create a pipe from the host using a special virtwl ioctl, then read from that and copy the data to the application's regular Linux pipe.

With help from the mailing list, I managed to get it somewhat usable:

I enabled VIRTIO_FS, allowing me to mount a host directory into the VM (for sharing files).
I created some tap devices (as mentioned above) to get guest networking going.
Adding ext4 to the kernel image allowed me to mount the VM's LVM partition.
Setting FONTCONFIG_FILE got some usable fonts (otherwise, there was no monospace font for the terminal).
I hacked sommelier to claim it supported the latest protocols, which got Firefox running.
Configuring sommelier for Xwayland let X applications run.
I replaced the non-interactive bash shell with fish so I could edit commands.
I ran (while true; do socat vsock-listen:5000 exec:dash; done) at the end of the VM's boot script. Then I could start e.g. the VM's Firefox with echo 'firefox&' | socat stdin vsock-connect:7:5000 on the host, allowing me to add launchers for guest applications.

Making changes to the root filesystem was fairly easy once I'd read the Nix manuals. To add an application (e.g. libreoffice), you import it at the start of rootfs/default.nix and add it to the path variable. The Nix expression gets the transitive dependencies of path from the Nix store and packs them into a squashfs image.

True, my squashfs image is getting a bit big. Maybe I should instead make a minimal squashfs boot image, plus a shared directory of hard links to the required files. That would allow sharing the data with the host. I could also just share the whole /nix/store directory, if I wanted to make all host software available to guests.

I made another Nix script to add various VM boot commands to my host environment. For example, running qvm-start-shopping boots my shopping VM using crosvm, with the appropriate LVM data partition, network settings, and shared host directory.

I think, ideally, this would be a systemd socket-activated user service rather than a shell script. Then attempting to run Firefox by sending a command to the VM socket would cause systemd to boot the VM (if not already running). For now, I boot each VM manually in a terminal and then press Win-Shift-2 to banish it to workspace 2, with all the other VM root consoles.

The virtwl Wayland forwarding feels pretty fast (much faster than Qubes' X graphics).

Wayland

I now had a mostly functional Qubes-like environment, running most of my applications in VMs, with their windows appearing on the host desktop like any other application. However, I also had some problems:

A stated goal of Wayland is "every frame is perfect". However, applications generally seemed to open at the wrong size and then jump to their correct size, which was a bit jarring.
Vim opened its window with the scrollbar at the far left of the window, making the text invisible until you resized the window.
Wayland is supposed to have better support for high-DPI displays. However, this doesn't work with Xwayland, which turns everything blurry, and the recommended work-around is to use a scale-factor of 1 and configure each application to use bigger fonts. This is easy enough with X applications (e.g. set Xft.dpi: 150 with xrdb), but Wayland apps must be configured individually.
Wayland doesn't have cursor themes and you have to configure every application individually to use a larger cursor too.
Copying text didn't seem to work reliably. Sometimes there would be a long delay, after which the text might or might not appear. More often, it would just paste something completely different and unexpected. Even when it did paste the right text, it would often have ^M characters inserted into it.

I decided it was time to learn more about Wayland. I discovered wayland-book.com, which does a good job of introducing it (though the book is only half finished at the moment).

Protocol

One very nice feature of Wayland is that you can run any Wayland application with WAYLAND_DEBUG=1 and it will display a fairly readable trace of all the Wayland messages it sends and receives. Let's look at a simple application that just connects to the server (compositor) and opens a window:

$ WAYLAND_DEBUG=1 test.exe
-> wl_display@1.get_registry registry:+2
-> wl_display@1.sync callback:+3

The client connects to the server's socket at /run/user/1000/wayland-0 and sends two messages to object 1 (of type wl_display), which is the only object available in a new connection. The get_registry request asks the server to add the registry to the conversation and call it object 2. The sync request just asks the server to confirm it got it, using a new callback object (with ID 3).

Both clients and servers can add objects to the conversation. To avoid numbering conflicts, clients assign low numbers and servers pick high ones.

On the wire, each message gives the object ID, the operation ID, the length in bytes, and then the arguments. Objects are thought of as being at the server, so the client sends request messages to objects, while the server emits event messages from objects. At the wire level there's no difference though.
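
As a rough illustration, here is the message header packed and unpacked in Python. It assumes the layout used by the standard wire format: two 32-bit words in host byte order, the first being the object ID and the second holding the total message size in its upper 16 bits and the opcode in its lower 16 bits. It also assumes get_registry is opcode 1 on wl_display (sync being opcode 0). The helper names are made up.

```python
import struct


def encode_message(object_id: int, opcode: int, args: bytes = b"") -> bytes:
    """Build a message: 8-byte header followed by already-encoded arguments."""
    size = 8 + len(args)  # the size field counts the header too
    return struct.pack("=II", object_id, (size << 16) | opcode) + args


def decode_header(data: bytes):
    """Return (object_id, opcode, size) from the first 8 bytes of a message."""
    object_id, word = struct.unpack_from("=II", data)
    return object_id, word & 0xFFFF, word >> 16


# wl_display@1.get_registry registry:+2 (the new object's ID is a plain uint32)
get_registry = encode_message(1, 1, struct.pack("=I", 2))
print(len(get_registry), decode_header(get_registry))
```

Note that nothing in the header says whether the message is a request or an event; that depends only on which direction it is travelling.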

When the server gets the get_registry request it adds the registry, which immediately emits one event for each available service, giving the maximum supported version. The client receives these messages, followed by the callback notification from the sync message:

<- wl_registry@2.global name:0 interface:"wl_compositor" version:4
<- wl_registry@2.global name:1 interface:"wl_subcompositor" version:1
<- wl_registry@2.global name:2 interface:"wl_shm" version:1
<- wl_registry@2.global name:3 interface:"xdg_wm_base" version:1
<- wl_registry@2.global name:4 interface:"wl_output" version:2
<- wl_registry@2.global name:5 interface:"wl_data_device_manager" version:3
<- wl_registry@2.global name:6 interface:"zxdg_output_manager_v1" version:3
<- wl_registry@2.global name:7 interface:"gtk_primary_selection_device_manager" version:1
<- wl_registry@2.global name:8 interface:"wl_seat" version:5
<- wl_callback@3.done callback_data:1129040

The callback tells the client it has seen all the available services, and so it now picks the ones it wants. It has to choose a version no higher than the one offered by the server. Protocols starting with wl_ are from the core Wayland protocol; the others are extensions. The leading z in zxdg_output_manager_v1 indicates that the protocol is "unstable" (under development).
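
The version rule can be sketched by parsing a few of the trace lines above and taking the minimum of what each side supports. The regular expression, the helper names and the client's maximum versions (in particular a client that could speak xdg_wm_base version 3) are assumptions for the example.

```python
import re

TRACE = """\
<- wl_registry@2.global name:0 interface:"wl_compositor" version:4
<- wl_registry@2.global name:2 interface:"wl_shm" version:1
<- wl_registry@2.global name:3 interface:"xdg_wm_base" version:1
<- wl_registry@2.global name:8 interface:"wl_seat" version:5
"""

GLOBAL_RE = re.compile(
    r'wl_registry@\d+\.global name:(\d+) interface:"([^"]+)" version:(\d+)'
)


def parse_globals(trace: str) -> dict:
    """Map each advertised interface to (name, maximum server version)."""
    offered = {}
    for match in GLOBAL_RE.finditer(trace):
        offered[match.group(2)] = (int(match.group(1)), int(match.group(3)))
    return offered


def choose_binds(offered: dict, client_max: dict) -> list:
    """Pick, for each wanted interface, the highest version both sides support."""
    binds = []
    for interface, wanted in client_max.items():
        if interface in offered:
            name, server_version = offered[interface]
            binds.append((name, interface, min(wanted, server_version)))
    return binds


# Hypothetical client capabilities.
CLIENT_MAX = {"wl_compositor": 4, "wl_shm": 1, "xdg_wm_base": 3}
binds = choose_binds(parse_globals(TRACE), CLIENT_MAX)
for name, interface, version in binds:
    print(f"bind name:{name} ({interface}:v{version})")
```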

The protocols are defined in various XML files, which are scattered over the web. The core protocol is defined in wayland.xml. These XML files can be used to generate typed bindings for your programming language of choice.

Here, the application picks wl_compositor (for managing drawing surfaces), wl_shm (for sharing memory with the server), and xdg_wm_base (for desktop windows).

-> wl_registry@2.bind name:0 id:+4(wl_compositor:v4)
-> wl_registry@2.bind name:2 id:+5(wl_shm:v1)
-> wl_registry@2.bind name:3 id:+6(xdg_wm_base:v1)

The bind message is unusual in that the client gives the interface and version of the object it is creating. For other messages, both sides know the type from the schema, and the version is always the same as the parent object. Because the client chose the new IDs, it doesn't need to wait for the server; it continues by using the new objects to create a top-level window:

-> wl_compositor@4.create_surface id:+7
-> xdg_wm_base@6.get_xdg_surface id:+8 surface:7
-> xdg_surface@8.get_toplevel id:+9
-> xdg_toplevel@9.set_title title:"example app"
-> wl_surface@7.commit

This API is pretty strange. The core Wayland protocol says how to make generic drawing surfaces, but not how to make windows, so the application is using the xdg_wm_base extension to do that. Logically, there's only one object here (a toplevel window), but it ends up making three separate Wayland objects representing the different aspects of it.

The commit tells the server that the client has finished setting up the window and the server should now do something with it.

The above was all in response to the callback firing. The client now processes the last message in that batch, which is the server destroying the callback:

<- wl_display@1.delete_id id:3

Object destruction is a bit strange in Wayland. Normally, clients ask for things to be destroyed (by sending a "destructor" message) and the server confirms by sending delete_id from object 1. But this isn't symmetrical: there is no standard way for a client to confirm deletion when the server calls a destructor (such as the callback's done), so these have to be handled on a case-by-case basis. Since callbacks don't accept any messages, there is no need for the client to confirm that it got the done message and the server just sends a delete message immediately.

The client now waits for the server to respond to all the messages it sent about the new window, and gets a bunch of replies:

<- wl_shm@5.format format:0
<- wl_shm@5.format format:1
<- wl_shm@5.format format:875709016
<- wl_shm@5.format format:875708993
<- xdg_wm_base@6.ping serial:1129043
-> xdg_wm_base@6.pong serial:1129043
<- xdg_toplevel@9.configure width:0 height:0 states:""
<- xdg_surface@8.configure serial:1129042
-> xdg_surface@8.ack_configure serial:1129042

It gets some messages telling it what pixel formats are supported, a ping message (which the server sends from time to time to check the client is still alive), and a configure message giving the size for the new window. Oddly, Sway has set the size to 0x0, which means the client should choose whatever size it likes.

The client picks a suitable default size, allocates some shared memory (by opening a tmpfs file and immediately unlinking it), shares the file descriptor with the server (create_pool), and then carves out a portion of the memory to use as a buffer for the pixel data:

-> wl_shm@5.create_pool id:+3 fd:(fd) size:1228800
-> wl_shm_pool@3.create_buffer id:+10 offset:0 width:640 height:480 stride:2560 format:1
-> wl_shm_pool@3.destroy
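
The numbers in this trace can be checked with a little arithmetic: format 1 is a 32-bit format (XRGB8888), so each pixel takes 4 bytes. The Python sketch below also mimics the allocation step by creating a file, unlinking it straight away, and mapping it. It uses the default temporary directory, which may or may not be a tmpfs on your system, and it doesn't talk to a compositor at all.

```python
import mmap
import os
import tempfile

WIDTH, HEIGHT, BYTES_PER_PIXEL = 640, 480, 4

stride = WIDTH * BYTES_PER_PIXEL  # bytes per row of pixels
size = stride * HEIGHT            # bytes for one full buffer

# Create a file, unlink it immediately, and keep only the descriptor.
fd, path = tempfile.mkstemp()
os.unlink(path)
os.ftruncate(fd, size)

# Map it and draw something: fill the top row with 0xff bytes.
pixels = mmap.mmap(fd, size)
pixels[:stride] = b"\xff" * stride

file_size = os.fstat(fd).st_size
still_named = os.path.exists(path)
first_row_byte, second_row_byte = pixels[0], pixels[stride]
print(stride, size, file_size, still_named)

pixels.close()
os.close(fd)
```

In the real protocol it is this file descriptor that gets handed to the server with create_pool, so both sides end up mapping the same memory.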

In this case it used the whole memory region. It could also have allocated two buffers for double-buffering. The client then draws whatever it wants into the buffer (mapping the file into its memory and writing to it directly), attaches the buffer to the window's surface, marks the whole area as "damaged" (in need of being redrawn) and calls commit, telling the server the surface is ready for display:

-> wl_surface@7.attach buffer:10 x:0 y:0
-> wl_surface@7.damage x:0 y:0 width:2147483647 height:2147483647
-> wl_surface@7.commit

At this point the window appears on the screen! The server lets the client know it has finished with the buffer and the client destroys it:

<- wl_display@1.delete_id id:3
<- wl_buffer@10.release
-> wl_buffer@10.destroy

Although the window is visible, the content is the wrong size. Sway now suddenly remembers that it's a tiling window manager. It sends another configure event with the correct size, causing the client to allocate a fresh memory pool of the correct size, allocate a fresh buffer from it, redraw everything at the new size, and tell the server to draw it.

<- xdg_toplevel@9.configure width:1534 height:1029 states:""
...

This process of telling the client to pick a size and then overruling it explains why Firefox draws itself incorrectly at first and then flickers into position a moment later. It probably also explains why Vim tries to open a 0x0 window.

Copying text

A bit of searching revealed that the ^M problem is a known Sway bug.

However, the main reason copying text wasn't working turned out to be a limitation in the design of the core wl_data_device_manager protocol. The normal way to copy text on X11 is to select the text you want to copy, then click the middle mouse button where you want it (or press Shift-Insert).

X also supports a clipboard mechanism, where you select text, then press Ctrl-C, then click at the destination, then press Ctrl-V. The original Wayland protocol only supports the clipboard system, not the selection, and so Wayland compositors have added selection support through extensions. Sommelier didn't proxy these extensions, leading to failure when copying in or out of VMs.

I also found that weston-terminal wouldn't start because I didn't have anything in my clipboard, and sommelier was trying to dereference a null pointer.

One problem with the Wayland protocol is that it's very hard to proxy. Although the wire protocol gives the length in bytes of each message, it doesn't say how many file descriptors it has. This means that you can't just pass through messages you don't understand, because you don't know which FDs go with which message. Also, the wire protocol doesn't give types for FDs (nor does the schema), which is a problem for anything that needs to proxy across a VM boundary or over a network.
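To make the proxying problem concrete, here is a small sketch of parsing a message header. On the wire, each message starts with two 32-bit words in the host's byte order: the object ID, then the message size in the upper 16 bits and the opcode in the lower 16 bits. The function name parse_header and the opcode value in the example are illustrative assumptions.

```python
import struct


def parse_header(data: bytes):
    """Split one Wayland message into (object_id, opcode, size, payload)."""
    # Two 32-bit words in the host's byte order: the object ID, then the
    # message size (upper 16 bits) and the opcode (lower 16 bits).
    object_id, word = struct.unpack_from("=II", data)
    size, opcode = word >> 16, word & 0xFFFF
    return object_id, opcode, size, data[8:size]


# A 12-byte message to object 6 carrying one uint argument (a serial).
# Opcode 3 is only an illustrative value here.
message = struct.pack("=III", 6, (12 << 16) | 3, 1129043)
object_id, opcode, size, payload = parse_header(message)
print(object_id, opcode, size, len(payload))
```

The header tells a proxy where the next message starts, but nothing in it says how many file descriptors arrived alongside the bytes. That information is only available from the schema for that particular interface and opcode.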

This all meant that VMs could only use protocols explicitly supported by sommelier, and sommelier limited the version too. Supporting extra extensions or new versions therefore means writing (and debugging) loads of C++ code.

I didn't have time to write and debug C++ code for every missing Wayland protocol, so I took a short-cut: I wrote my own Wayland library, ocaml-wayland, and then used that to write my own version of sommelier. With that, adding support for copying text was fairly easy.

For each Wayland interface we need to handle each incoming message from the client and forward it to the host, and also forward each message from the host to the client. Here's the code to handle the "selection" event in OCaml, which we receive from the host and send to the client (c):

method on_selection _ offer = C.Wl_data_device.selection c (Option.map to_client offer)

The host passes us an "offer" argument, which is a previously-created host offer object. We look up the corresponding client object with to_client and pass that as the argument to the client.

For comparison, here's sommelier's equivalent to this line of code, in C++:

static void sl_data_device_selection(void* data,
                                     struct wl_data_device* data_device,
                                     struct wl_data_offer* data_offer) {
  struct sl_host_data_device* host = static_cast<sl_host_data_device*>(
      wl_data_device_get_user_data(data_device));
  struct sl_host_data_offer* host_data_offer =
      static_cast<sl_host_data_offer*>(wl_data_offer_get_user_data(data_offer));

  wl_data_device_send_selection(host->resource, host_data_offer->resource);
}

I think this is a great demonstration of the difference between "type safety" and "type ceremony". The C++ code is covered in types, making the code very hard to read, yet it crashes at runtime because it fails to consider that data_offer can be NULL.

By contrast, the OCaml version has no type annotations, but the compiler would reject the code if I forgot to handle this case (with Option.map).
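For readers who know neither language, the behaviour that matters can be sketched in Python. This is a toy model, not the real proxy: the Proxy class, its dictionary and the object IDs are all invented for illustration, and Python will not check the missing case at compile time the way OCaml does.

```python
from typing import Optional


class Proxy:
    """Toy stand-in for the proxy's host-to-client object mapping."""

    def __init__(self):
        self.client_for_host = {}   # host offer ID -> client offer ID
        self.sent = []              # messages "sent" to the client

    def to_client(self, host_offer: int) -> int:
        return self.client_for_host[host_offer]

    def on_selection(self, host_offer: Optional[int]) -> None:
        # Like Option.map: translate the offer if there is one, and pass
        # "no selection" through unchanged instead of dereferencing it.
        client_offer = None if host_offer is None else self.to_client(host_offer)
        self.sent.append(("selection", client_offer))


proxy = Proxy()
proxy.client_for_host[12] = 4     # hypothetical IDs
proxy.on_selection(12)            # host has a selection
proxy.on_selection(None)          # clipboard is empty
print(proxy.sent)
```

The second call is the empty-clipboard case: the proxy forwards "no selection" rather than trying to look up an object that isn't there.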

Security

According to the GNOME wiki, the original justification for not supporting selection copies was "security concerns with unexpected data stealing if the mere act of selecting a text fragment makes it available to all running applications". The implication is that it is OK for applications to steal data from the clipboard instead, and that you should therefore never put anything confidential on the clipboard.

This seemed a bit odd, so I read the security section of the Wayland specification to learn more about its security model. That section of the specification is fairly short, so I'll reproduce it here in full:

Security and Authentication

mostly about access to underlying buffers, need new drm auth mechanism (the grant-to ioctl idea), need to check the cmd stream?

getting the server socket depends on the compositor type, could be a system wide name, through fd passing on the session dbus. or the client is forked by the compositor and the fd is already opened.

It looks like implementations have to figure things out for themselves.

The main advantage of Wayland over X11 here is that Wayland mostly isolates applications from each other. In X11 applications collaborate to manage a tree of windows, and any application can access any window. In the Wayland protocol, each application's connection only includes that application's objects. Applications only get events relevant to their own windows (for example, you only get pointer motion events while the pointer is over your window). Communication between applications (e.g. copy-and-paste or drag-and-drop) is all handled through the compositor.

Also, to request the contents of the clipboard you need to quote the serial number of the mouse click or key press that triggered it. If it's too far in the past, the compositor can ignore the request.
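As a purely hypothetical illustration of such a check (the class, the method names and the "last few events" policy are my own assumptions, not taken from Sway or any other compositor), a compositor could remember the serials of recent input events and honour only requests that quote one of them:

```python
from collections import deque


class SerialTracker:
    """Remembers the serials of the most recent input events."""

    def __init__(self, max_age: int = 8):
        self.recent = deque(maxlen=max_age)   # oldest serials drop off the front

    def input_event(self, serial: int) -> None:
        self.recent.append(serial)

    def accept_request(self, serial: int) -> bool:
        # Honour only requests quoting the serial of a recent input event.
        return serial in self.recent


tracker = SerialTracker(max_age=2)
for serial in (10, 11, 12):
    tracker.input_event(serial)
print(tracker.accept_request(12), tracker.accept_request(10))
```

Here the request quoting serial 12 is accepted, while serial 10 has already aged out and is refused.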

I've also heard people say that security is the reason you can't take screenshots with Wayland. However, Sway lets you take screenshots, and this worked even from inside a VM through virtwl. I didn't add screenshot support to the proxy, because I don't want VMs to be able to take screenshots, but the proxy isn't a security tool (it runs inside the VM, which isn't trusted).

Clearly, the way to fix this was with a new compositor. One that would offer a different Wayland socket to each VM, tag the windows with the VM name, colour the frames, confirm copies across VM boundaries, and work with Vim. Luckily, I already had a handy pure-OCaml Wayland protocol library available. Unluckily, at this point I ran out of holiday.

Future work

There are quite a few things left to do here:

One problem with virtwl is that, while we can receive shared memory FDs from the host, we can't export guest memory to the host. This is unfortunate, because in Wayland the shared memory for window contents is allocated by the application from guest memory, and the proxy therefore has to copy each frame. If the host provided the memory to the guest, this wouldn't be needed. There is a wl_drm protocol for allocating video memory, which might help here, but I don't know how that works and, like many Wayland specifications, it seems to be in the process of being replaced by something else. Also, if we're going to copy the memory, we should at least only copy the damaged region, not the whole thing. I only got this code working just far enough to run the Wayland applications I use (mainly Firefox and Evince).
I'm still using ssh to proxy X11 connections (mainly for Vim and gitk). I'd prefer to run Xwayland in the VM, but it seems you need to provide a bit of extra support for that, which I haven't implemented yet. Sommelier can do this, but then copying doesn't work.
The host Wayland compositor needs to be aware of VMs, so it can colour the titles appropriately and limit access to privileged operations.
For the full Qubes experience, the network card should be handled by a VM, with another VM managing the firewall. Perhaps the Mirage unikernel firewall could be made to work on KVM too. I'm not sure how guest-to-guest communication works with KVM.
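The first item mentions copying only the damaged region. A minimal sketch of that idea, assuming both buffers share the same stride and pixel format (the function copy_damage and the rectangle used are made up for illustration):

```python
def copy_damage(src, dst, stride, x, y, width, height, bytes_per_pixel=4):
    """Copy only the damaged rectangle from src to dst (same layout)."""
    row_bytes = width * bytes_per_pixel
    for row in range(y, y + height):
        start = row * stride + x * bytes_per_pixel
        dst[start:start + row_bytes] = src[start:start + row_bytes]
    return row_bytes * height   # number of bytes actually copied


stride = 640 * 4
guest = bytearray(b"\xff" * (stride * 480))   # freshly drawn guest frame
host = bytearray(stride * 480)                # host copy, still all zeros
copied = copy_damage(guest, host, stride, x=10, y=20, width=100, height=50)
print(copied, "of", len(guest), "bytes copied")
```

For a 100x50 damaged rectangle in a 640x480 window, this copies 20000 bytes instead of 1228800.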

However, because the host NixOS environment is a fully-working Linux system, I can always trade off some security to get things working (e.g. by doing video conferencing directly on the host).

I hope the SpectrumOS project will resume at some point, or that Qubes will find a solution to its hardware compatibility and performance problems.