Monthly Archives: January 2017

local display for intel vgpu starts working

The intel vgpu guest display now shows up in the qemu gtk window. This is a linux kernel booted to the dracut emergency shell prompt, with the framebuffer console running on inteldrmfb. Booting a full linux guest with X11 and/or wayland has not been tested yet. There are rendering glitches too, visible for example when running “dmesg”.

So, a bunch of issues to solve before this is ready for users, but it’s a nice start.

For the brave:

host kernel: https://www.kraxel.org/cgit/linux/log/?h=intel-vgpu
qemu: https://www.kraxel.org/cgit/qemu/log/?h=work/intel-vgpu

Take care, both branches are moving targets (aka: rebasing at times).

tweak arm images with libguestfs-tools

So, when using the official fedora arm images on your raspberry pi (or any other arm board) you might have faced the problem that it is not easy to use them for a headless (i.e. no keyboard and display connected) machine. There is no default password, fedora asks you to set one on the first boot instead. From a security point of view that is surely better than shipping with a fixed password. But for headless machines it is quite inconvenient …

Luckily there is an easy way out. You can use libguestfs-tools. The tools have been created to configure virtual machine images (this is where the name comes from). But the tools work fine with sdcards too.

I’m using a usb sdcard reader which shows up as /dev/sdc on my system. I can just pass /dev/sdc as image to the tools (take care, the device is probably something else for you). For example, to set a root password:

virt-customize -a /dev/sdc --root-password "password:<your-password-here>"

The initial setup on the first boot is a systemd service, and it can be turned off by simply removing the symlinks which enable the service:

virt-customize -a /dev/sdc \
  --delete /etc/systemd/system/multi-user.target.wants/initial-setup.service \
  --delete /etc/systemd/system/graphical.target.wants/initial-setup.service

You can use virt-copy-in (or virt-tar-in) to copy config files to the disk image. Small (or empty) configuration files can also be created with the write command:

virt-customize -a /dev/sdc --write "/.autorelabel:"

Adding the .autorelabel file will force selinux relabeling on the first boot (takes a while). It is a good idea to do that in case you copy files to the sdcard, to make sure the new files are labeled correctly, especially if you copy security sensitive things like ssh keys or ssh config files. Without relabeling selinux will not allow sshd to access those files, which in turn can break remote logins.

There is a lot more the virt-* tools can do for you. Check out the manual pages for more info. And you can easily script things: virt-customize has a --commands-from-file switch which accepts a file with a list of commands.
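As a rough sketch of how such a command file could look for the headless setup above: each line is a virt-customize option without the leading dashes, and lines starting with # are comments. The helper name write_commands_file and the file name headless.cmds are made up for this example.

```shell
# Hypothetical helper: write a virt-customize command file to the path in $1.
# It collects the commands used earlier in this post.
write_commands_file() {
    cat > "$1" <<'EOF'
# set a root password and turn off the first-boot setup service
root-password password:<your-password-here>
delete /etc/systemd/system/multi-user.target.wants/initial-setup.service
delete /etc/systemd/system/graphical.target.wants/initial-setup.service
# force selinux relabeling on first boot
write /.autorelabel:
EOF
}

# Usage (double-check the device name before running this):
#   write_commands_file headless.cmds
#   virt-customize -a /dev/sdc --commands-from-file headless.cmds
```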

virtual gpu support landing upstream

The upstreaming process of virtual gpu support (vgpu) made a big step forward with the 4.10 merge window. Two important pieces have been merged:

First, the mediated device framework (mdev). Basically this allows kernel drivers to present virtual pci devices, using the vfio framework and interfaces. Both nvidia and intel will use mdev to partition the physical gpu of the host into multiple virtual devices which then can be assigned to virtual machines.

Second, intel landed initial mdev support for the i915 driver too. There is quite some work left to do in future kernel releases though. Access to the guest display is not supported yet, so you must run x11vnc or similar tools in the guest to see the screen. Also there are some stability issues to find and fix.

If you want to play with this nevertheless, here is how to do it. But be prepared for crashes, and better don’t try this on a production machine.

On the host: create virtual devices

On the host machine you obviously need a 4.10 kernel. Also the intel graphics device (igd) must be broadwell or newer. In the kernel configuration enable vfio and mdev (all CONFIG_VFIO_* options). Enable CONFIG_DRM_I915_GVT and CONFIG_DRM_I915_GVT_KVMGT for intel vgpu support. Building the mtty sample driver (CONFIG_SAMPLE_VFIO_MDEV_MTTY, a virtual serial port) can be useful too, for testing.
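A quick way to check a kernel config for these options could look like the sketch below. The function name is made up, the list of VFIO options (CONFIG_VFIO, CONFIG_VFIO_PCI, CONFIG_VFIO_MDEV) is only a selection, and the config file location in the usage line is an assumption that holds on fedora-style installs.

```shell
# Print every option from the list that is not set to y or m in the
# kernel config file given as $1. Prints nothing if all are enabled.
missing_vgpu_options() {
    config="$1"
    for opt in CONFIG_VFIO CONFIG_VFIO_PCI CONFIG_VFIO_MDEV \
               CONFIG_DRM_I915_GVT CONFIG_DRM_I915_GVT_KVMGT; do
        grep -q "^${opt}=[ym]\$" "$config" || echo "$opt"
    done
}

# Usage (assumes the distro installs the config next to the kernel):
#   missing_vgpu_options "/boot/config-$(uname -r)"
```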

Boot the new kernel. Load all modules: vfio-pci, vfio-mdev, optionally mtty. Also i915 and kvmgt of course, but that probably happened during boot already.

Go to the /sys/class/mdev_bus directory. It should look like this:

kraxel@broadwell ~# cd /sys/class/mdev_bus
kraxel@broadwell .../class/mdev_bus# ls -l
total 0
lrwxrwxrwx. 1 root root 0 17. Jan 10:51 0000:00:02.0 -> ../../devices/pci0000:00/0000:00:02.0
lrwxrwxrwx. 1 root root 0 17. Jan 11:57 mtty -> ../../devices/virtual/mtty/mtty

Each driver with mdev support has a directory there. Go to $device/mdev_supported_types to check what kind of virtual devices you can create.

kraxel@broadwell .../class/mdev_bus# cd 0000:00:02.0/mdev_supported_types
kraxel@broadwell .../0000:00:02.0/mdev_supported_types# ls -l
total 0
drwxr-xr-x. 3 root root 0 17. Jan 11:59 i915-GVTg_V4_1
drwxr-xr-x. 3 root root 0 17. Jan 11:57 i915-GVTg_V4_2
drwxr-xr-x. 3 root root 0 17. Jan 11:59 i915-GVTg_V4_4

As you can see, intel supports three different configurations on my machine. They differ in the configuration (basically the amount of video memory) and in the number of instances you can create. Check the description and available_instances files in the directories:

kraxel@broadwell .../0000:00:02.0/mdev_supported_types# cd i915-GVTg_V4_2
kraxel@broadwell .../mdev_supported_types/i915-GVTg_V4_2# cat description 
low_gm_size: 64MB
high_gm_size: 192MB
fence: 4
kraxel@broadwell .../mdev_supported_types/i915-GVTg_V4_2# cat available_instances
2
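To get an overview of all types at once, a small loop over the type directories will do. This is only a sketch: the function name is made up, and it reads the attribute under the name the kernel documentation uses, available_instances.

```shell
# List each supported mdev type of a parent device together with the
# number of instances that can still be created.
# $1: parent device directory, e.g. /sys/class/mdev_bus/0000:00:02.0
list_mdev_types() {
    for typedir in "$1"/mdev_supported_types/*; do
        [ -d "$typedir" ] || continue
        printf '%s: %s instance(s) available\n' \
            "$(basename "$typedir")" "$(cat "$typedir/available_instances")"
    done
}

# Usage:
#   list_mdev_types /sys/class/mdev_bus/0000:00:02.0
```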

Now it is possible to create virtual devices by writing a UUID into the create file:

kraxel@broadwell .../mdev_supported_types/i915-GVTg_V4_2# uuid=$(uuidgen)
kraxel@broadwell .../mdev_supported_types/i915-GVTg_V4_2# echo $uuid
f321853c-c584-4a6b-b99a-3eee22a3919c
kraxel@broadwell .../mdev_supported_types/i915-GVTg_V4_2# sudo sh -c "echo $uuid > create"

The new vgpu device will show up as a subdirectory of the host gpu:

kraxel@broadwell .../mdev_supported_types/i915-GVTg_V4_2# cd ../../$uuid
kraxel@broadwell .../0000:00:02.0/f321853c-c584-4a6b-b99a-3eee22a3919c# ls -l
total 0
lrwxrwxrwx. 1 root root    0 17. Jan 12:31 driver -> ../../../../bus/mdev/drivers/vfio_mdev
lrwxrwxrwx. 1 root root    0 17. Jan 12:35 iommu_group -> ../../../../kernel/iommu_groups/10
lrwxrwxrwx. 1 root root    0 17. Jan 12:35 mdev_type -> ../mdev_supported_types/i915-GVTg_V4_2
drwxr-xr-x. 2 root root    0 17. Jan 12:35 power
--w-------. 1 root root 4096 17. Jan 12:35 remove
lrwxrwxrwx. 1 root root    0 17. Jan 12:31 subsystem -> ../../../../bus/mdev
-rw-r--r--. 1 root root 4096 17. Jan 12:35 uevent

You can see the device landed in iommu group 10. We’ll need that in a moment.
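For scripting, the group number can be read from the iommu_group symlink, and the remove file from the listing above is how a device gets destroyed again. A minimal sketch, with made-up function names:

```shell
# Print the iommu group number of a mediated device.
# $1: device directory, e.g. /sys/class/mdev_bus/0000:00:02.0/$uuid
mdev_iommu_group() {
    basename "$(readlink "$1/iommu_group")"
}

# Destroy the device again by writing 1 into its remove file.
# Needs root, and the device must not be in use by a guest.
remove_mdev() {
    echo 1 > "$1/remove"
}

# Usage:
#   dev=/sys/class/mdev_bus/0000:00:02.0/$uuid
#   ls -l "/dev/vfio/$(mdev_iommu_group "$dev")"
```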

On the host: configure guests

Ideally this would be as simple as adding <hostdev> to your guest's libvirt xml config. The mdev devices don’t have a pci address on the host though, so they must be passed to qemu using the sysfs device path instead. libvirt doesn’t (yet) support sysfs paths, so it is a bit more complicated for now: a lot of the setup libvirt does automatically for hostdevs must be done manually.

First, we must allow qemu to access /dev. By default libvirt uses control groups to restrict access. That must be turned off. Edit /etc/libvirt/qemu.conf. Uncomment the cgroup_controllers line. Remove "devices" from the list. Restart libvirtd.
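The qemu.conf edit can also be done with sed. The sketch below assumes the stock file carries a commented-out cgroup_controllers line in which "devices" is followed by further entries. The function name is made up, and the result goes to stdout so it can be reviewed before replacing the real file.

```shell
# Print the content of a qemu.conf ($1) with the cgroup_controllers line
# uncommented and the "devices" entry dropped from its list.
# Assumes "devices" is not the last entry (it must be followed by a comma).
drop_devices_controller() {
    sed -e 's/^#[[:space:]]*\(cgroup_controllers[[:space:]]*=\)/\1/' \
        -e '/^cgroup_controllers/s/"devices",[[:space:]]*//' "$1"
}

# Usage:
#   drop_devices_controller /etc/libvirt/qemu.conf > qemu.conf.new
#   diff -u /etc/libvirt/qemu.conf qemu.conf.new    # review, then install
#   systemctl restart libvirtd
```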

Second, we must allow qemu to access the iommu group (10 in my case). A simple chmod will do:

kraxel@broadwell ~# sudo chmod 666 /dev/vfio/10

Third, we must update the guest configuration:

<domain type='kvm' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
  [ ... ]
  <currentMemory unit='KiB'>1048576</currentMemory>
  <memoryBacking>
    <locked/>
  </memoryBacking>
  [ ... ]
  <qemu:commandline>
    <qemu:arg value='-device'/>
    <qemu:arg value='vfio-pci,addr=05.0,sysfsdev=/sys/class/mdev_bus/0000:00:02.0/f321853c-c584-4a6b-b99a-3eee22a3919c'/>
  </qemu:commandline>
</domain>

There is a special qemu namespace which can be used to pass extra command line arguments to qemu. We use it here for a qemu feature not yet supported by libvirt (sysfs paths for vfio-pci). Also we must explicitly allow guest memory to be locked (the <locked/> element).
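The two qemu:arg lines boil down to a single -device option on the qemu command line. A small sketch that builds the option value for a given UUID follows. The function name is made up, and the parent device and the addr=05.0 slot are simply the values from this post.

```shell
# Print the value for qemu's -device option for a given mdev UUID.
# $1: UUID of the mediated device
# $2: optional pci address of the parent gpu (default: 0000:00:02.0)
vfio_mdev_device_arg() {
    parent="${2:-0000:00:02.0}"
    printf 'vfio-pci,addr=05.0,sysfsdev=/sys/class/mdev_bus/%s/%s\n' \
        "$parent" "$1"
}

# Usage, when running qemu by hand instead of via libvirt:
#   qemu-system-x86_64 [ ... ] -device "$(vfio_mdev_device_arg "$uuid")"
```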

Now we are ready to go:

kraxel@broadwell ~# virsh start --console $guest

In the guest

It is a good idea to prepare the guest a bit before adding the vgpu to the guest configuration. Set up a serial console, so you can talk to it even in case graphics are broken. Blacklist the i915 module and load it manually, at least until you have a known-working configuration. Booting to runlevel 3 (aka multi-user.target) instead of 5 (aka graphical.target) and starting the xorg server manually is also better for now.
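A sketch of the blacklisting and runlevel part, to be run as root in the guest. The function name and the file name i915-manual.conf are made up. If the i915 module is also part of the initramfs, the blacklist may have to be applied there as well.

```shell
# Write a modprobe config file which keeps i915 from being autoloaded.
# $1: modprobe config directory, normally /etc/modprobe.d
write_i915_blacklist() {
    echo "blacklist i915" > "$1/i915-manual.conf"
}

# Usage:
#   write_i915_blacklist /etc/modprobe.d
#   systemctl set-default multi-user.target
#   modprobe i915     # later, by hand, from the serial console
```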

For the guest machine intel recommends the 4.8 kernel. In theory newer kernels should work too, in practice they didn’t the last time I tested (4.10-rc2). Also make sure the xorg server uses the modesetting driver; the intel driver didn’t work in my testing. This config file will do:

root@guest ~# cat /etc/X11/xorg.conf.d/intel.conf 
Section "Device"
        Identifier  "Card0"
#       Driver      "intel"
        Driver      "modesetting"
        BusID       "PCI:0:5:0"
EndSection
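The BusID matches the addr=05.0 slot from the qemu command line. If you pick a different slot, keep in mind that lspci prints addresses in hex while the xorg BusID uses decimal numbers. A small conversion sketch (the function name is made up):

```shell
# Convert an lspci style address (bus:device.function, hex) into the
# decimal BusID notation used in xorg.conf.
xorg_busid() {
    bus=${1%%:*}        # part before the colon
    rest=${1#*:}        # device.function
    dev=${rest%%.*}
    fn=${rest#*.}
    printf 'PCI:%d:%d:%d\n' "0x$bus" "0x$dev" "0x$fn"
}

# Usage:
#   xorg_busid 00:05.0
```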

I’m starting the xorg server with x11vnc, xterm and mwm (motif window manager) using this little script:

#!/bin/sh

# debug
echo "# $0: DISPLAY=$DISPLAY"

# start server
if test "$DISPLAY" != ":4"; then
        echo "# $0: starting Xorg server"
        exec startx $0 -- /usr/bin/Xorg :4
        exit 1
fi
echo "# $0: starting session"

# configure session
xrdb $HOME/.Xdefaults

# start clients
x11vnc -rfbport 5904 &
xterm &
exec mwm

The session runs on display 4, so you should be able to connect from the host this way:

kraxel@broadwell ~# vncviewer $guest_ip:4
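The :4 here and the -rfbport 5904 in the script belong together: by convention vnc display N listens on tcp port 5900 + N. As a one-liner (the function name is made up):

```shell
# Print the tcp port for a given vnc display number.
vnc_port() {
    echo $((5900 + $1))
}

# Usage:
#   x11vnc -rfbport "$(vnc_port 4)" &
```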

Have fun!