Regression in kernel 6.*
Kernel 6.x no longer supports the old way of loading the vfio-pci driver. The symptom is a freeze when udevd starts in the initramfs stage.
Definitions and assumptions
- Initramfs
- Late Userspace
- mkinitcpio usage is assumed
Easy solution
Stay on the older 5.19.13 Linux kernel for a while, until someone updates the ArchWiki. This somewhat contradicts the philosophy of using a rolling distribution like Arch Linux, and it just does not feel good (at least to me). It also leaves you on an outdated kernel without cutting-edge features, and running this less popular, less tested combination of packages may eventually cause issues.
# file: /etc/default/grub
GRUB_CMDLINE_LINUX=" [...] vfio-pci.ids=10de:13c2,10de:0fbb"
If you still want to stay on the older kernel, the script below may help.
# file: download.sh [755]
#!/usr/bin/bash
set -e
# received from link below
# https://archive.archlinux.org/repos/2022/10/13/core/os/x86_64/
LINKS=(linux-5.19.13.arch1-1-x86_64.pkg.tar.zst
linux-5.19.13.arch1-1-x86_64.pkg.tar.zst.sig
linux-api-headers-5.18.15-1-any.pkg.tar.zst
linux-api-headers-5.18.15-1-any.pkg.tar.zst.sig
linux-docs-5.19.13.arch1-1-x86_64.pkg.tar.zst
linux-docs-5.19.13.arch1-1-x86_64.pkg.tar.zst.sig
linux-firmware-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-bnx2x-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-bnx2x-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-liquidio-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-liquidio-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-marvell-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-marvell-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-mellanox-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-mellanox-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-nfp-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-nfp-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-qcom-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-qcom-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-qlogic-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-qlogic-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-firmware-whence-20220913.f09bebf-1-any.pkg.tar.zst
linux-firmware-whence-20220913.f09bebf-1-any.pkg.tar.zst.sig
linux-headers-5.19.13.arch1-1-x86_64.pkg.tar.zst
linux-headers-5.19.13.arch1-1-x86_64.pkg.tar.zst.sig)
DOWNLOAD_DIR="$(dirname "$(realpath "$0")")/packages"
mkdir -p "$DOWNLOAD_DIR"
cd "$DOWNLOAD_DIR"
for i in "${LINKS[@]}"; do
    wget "https://archive.archlinux.org/repos/2022/10/13/core/os/x86_64/$i"
done
echo "packages downloaded successfully"
Manually downgrading the packages and adding the kernel (and related packages) to the IgnorePkg option in /etc/pacman.conf should solve this issue.
# file: /etc/pacman.conf
# assumes using non lts kernel
# keep in mind that you may have even more kernel dependent packages
[...]
IgnorePkg = linux linux-*
[...]
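The manual downgrade itself can be sketched as follows. This is only a minimal sketch, assuming the packages directory created by download.sh above; list_packages is a hypothetical helper name, not part of pacman. It only prints the package files and skips the detached .sig signatures, so that the result can be handed to pacman -U.

```shell
#!/bin/sh
# Hypothetical helper: print the package files in a directory, skipping
# the detached .sig signature files that download.sh also fetches.
list_packages() {
    # $1: directory holding the downloaded files (assumed: ./packages)
    find "$1" -maxdepth 1 -type f -name '*.pkg.tar.zst' | sort
}

# Intended use, run as root (not executed by this sketch):
#   pacman -U $(list_packages ./packages)
```

Nothing is installed until you run the pacman -U line yourself, so you can review the printed list first.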
Good solution
The main requirement for this solution was that it be simple enough and compatible with the latest kernel.
Method of finding a solution
I tried various solutions posted on the internet and debugged them inside the initramfs using the break=postmount kernel command-line parameter. I discovered that loading vfio-pci inside the initrd causes the same hang as the old vfio-pci.ids=[...] command-line option, but the same command, modprobe -i vfio-pci, works flawlessly in late userspace.
Solution description
Let’s force the initramfs to run a hook before anything else (most importantly before udevd, in order to avoid surprises) and load the module later using a systemd service in late userspace. This can be achieved by adding a script file to the initramfs (controlled by the mkinitcpio.conf file), adding a hook, and adding a systemd service (systemd runs after the initramfs has done its job). This solution is almost the same as the one on the ArchWiki, with two small differences: modprobe -i vfio-pci is executed in late userspace, and an early hook performs the driver override instead of kernel module aliasing in /etc/modprobe.d.
Prerequisites
- Having an archiso USB stick at hand is a prerequisite, as the system may end up in a less usable state.
- Creating a backup copy of your system is advised.
- Fluency with Linux systems is required.
- IOMMU enabled (for AMD probably in UEFI/BIOS settings, for Intel probably in kernel cmdline)
Important remarks
Using this method:
- Do not add the modules vfio_pci vfio vfio_iommu_type1 vfio_virqfd to the MODULES array in mkinitcpio.conf, as they are loaded later by the systemd service.
- There is also no need to modify the kernel command line, because the vfio-pci driver is loaded after the boot process.
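The first remark can be checked with a few lines of shell. This is a rough sketch, not an official mkinitcpio feature: vfio_in_modules is a hypothetical helper, and it assumes the MODULES=(...) array is written on a single line, as in the default config.

```shell
#!/bin/sh
# Hypothetical check: print any vfio module names found in the MODULES
# array of a mkinitcpio config. Assumes MODULES=(...) sits on one line.
vfio_in_modules() {
    # $1: path to the config file (normally /etc/mkinitcpio.conf)
    sed -n 's/^MODULES=(\(.*\)).*$/\1/p' "$1" |
        tr -s ' ' '\n' |
        grep -E '^"?(vfio|vfio_pci|vfio-pci|vfio_iommu_type1|vfio_virqfd)"?$'
}

# Intended use:
#   vfio_in_modules /etc/mkinitcpio.conf && echo "remove these from MODULES"
```

The function exits with a non-zero status when no vfio module is listed, which is the state you want for this method.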
Steps
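- Create the /sbin/vfio-pci-override-vga.sh file.
Get the PCI bus addresses of the devices in the IOMMU group you want to pass through:
lspci
Put these addresses in the DEVS list. Note that lspci omits the leading 0000: domain prefix unless it is run with -D, while the sysfs paths used by the script need the full address.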
# file: /sbin/vfio-pci-override-vga.sh [755]
#!/bin/sh
DEVS="0000:01:00.0 0000:01:00.1"
if [ -z "$(ls -A /sys/class/iommu)" ]; then
    exit 0
fi
for DEV in $DEVS; do
    echo "vfio-pci" > "/sys/bus/pci/devices/$DEV/driver_override"
done
Make script executable
chmod +x /sbin/vfio-pci-override-vga.sh
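If you are not sure which devices share an IOMMU group with your GPU, the groups can be read from sysfs. Below is a minimal sketch; list_iommu_groups is a name I made up, and the optional argument exists only so the function can be pointed at a different directory for testing. It prints full addresses (with the 0000: prefix), ready to paste into DEVS.

```shell
#!/bin/sh
# Sketch: print "group <n> <pci-address>" for every device in every IOMMU
# group. The optional $1 overrides the sysfs root, which is only useful
# for testing; it defaults to /sys/kernel/iommu_groups.
list_iommu_groups() {
    root="${1:-/sys/kernel/iommu_groups}"
    for dev in "$root"/*/devices/*; do
        [ -e "$dev" ] || continue      # glob matched nothing
        group="${dev#"$root"/}"        # strip the root prefix
        group="${group%%/*}"           # keep only the group number
        echo "group $group ${dev##*/}"
    done
}

# Intended use:
#   list_iommu_groups
```

No output at all usually means that IOMMU is not enabled, which is the same condition the override script checks before doing anything.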
- Add initcpio hook
/etc/initcpio
├── hooks
│   └── vfio
└── install
    └── vfio
Hook script will run inside initramfs
# file: /etc/initcpio/hooks/vfio [644]
#!/usr/bin/ash
run_hook() {
    /sbin/vfio-pci-override-vga.sh
}
# vim: set ft=sh ts=4 sw=4 et:
A build hook must also be created so that the runtime hook is copied into the initramfs image; otherwise the hook would not work, as mkinitcpio would not see it.
# file: /etc/initcpio/install/vfio [644]
#!/usr/bin/env bash
build() {
    add_runscript
}
help() {
cat <<HELPEOF
vfio hook help
HELPEOF
}
The hook is named vfio here, but you may choose a different name.
- Add the hook to /etc/mkinitcpio.conf and also add the file /sbin/vfio-pci-override-vga.sh to the FILES array so that it is copied into the initramfs.
FILES=(/sbin/vfio-pci-override-vga.sh)
[...]
# place vfio after base and before udev
HOOKS=(base vfio udev [...])
[...]
- Add systemd service which loads vfio-pci driver inside late userspace.
# file: /etc/systemd/system/vfio-load.service [644]
[Unit]
Description=Insert vfio-pci driver
[Service]
Type=oneshot
ExecStart=modprobe -i vfio-pci
[Install]
WantedBy=multi-user.target
Remember to enable the service
# systemctl daemon-reload
# systemctl enable vfio-load.service
- Apply the changes from /etc/mkinitcpio.conf by generating a new initramfs:
sudo mkinitcpio -P
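Before rebooting, it may be worth confirming that the hook and the script really ended up in the image. The sketch below reads a file listing on stdin, for example from lsinitcpio (shipped with mkinitcpio). check_vfio_listing is a hypothetical helper, the image path in the usage comment is an assumption, and I match only the script's file name because the directory it lands in inside the image may differ.

```shell
#!/bin/sh
# Hypothetical check: read an initramfs file listing on stdin and confirm
# that both the runtime hook and the override script were packed.
check_vfio_listing() {
    listing=$(cat)
    printf '%s\n' "$listing" | grep -q 'hooks/vfio$' ||
        { echo "missing: hooks/vfio"; return 1; }
    printf '%s\n' "$listing" | grep -q 'vfio-pci-override-vga\.sh$' ||
        { echo "missing: vfio-pci-override-vga.sh"; return 1; }
    echo "ok"
}

# Intended use (image path is an assumption, adjust to your preset):
#   lsinitcpio /boot/initramfs-linux.img | check_vfio_listing
```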
- Reboot your computer and verify that it worked by checking that the line Kernel driver in use: vfio-pci is present for the passed-through devices.
$ lspci -nnk
[...]
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation [REDACTED]
Subsystem: Lenovo Device [17aa:[REDACTED]]
Kernel driver in use: vfio-pci
Kernel modules: nouveau
01:00.1 Audio device [0403]: NVIDIA Corporation [REDACTED]
Kernel driver in use: vfio-pci
Kernel modules: snd_hda_intel
[...]
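The same check can be scripted without parsing lspci output, since sysfs exposes the bound driver as a symlink. A small sketch follows; bound_driver is a name I made up, and the second argument is there only for testing against a fake directory tree.

```shell
#!/bin/sh
# Sketch: print the driver currently bound to a PCI device, or "none".
# $1: PCI address such as 0000:01:00.0
# $2: optional sysfs root for testing (default /sys/bus/pci/devices)
bound_driver() {
    link="${2:-/sys/bus/pci/devices}/$1/driver"
    if [ -L "$link" ]; then
        basename "$(readlink "$link")"
    else
        echo "none"
    fi
}

# Intended use, with the addresses from DEVS:
#   bound_driver 0000:01:00.0
#   bound_driver 0000:01:00.1
```

Both calls should print vfio-pci once the hook and the service have done their work.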
Debugging
- If drivers other than vfio-pci are loaded for the devices, it means that the initramfs hook didn’t execute properly.
- To debug the hook, boot your PC, press e on your Linux entry in GRUB and add the break=postmount kernel parameter, then press CTRL+x and enter the decryption password (if your disk is encrypted). You will be dropped into the initramfs shell, where you can check whether the hook file exists or why it doesn’t work. The hook worked if, in late userspace, no other driver was loaded for the devices, meaning that the line starting with Kernel driver in use: either contains vfio-pci or does not exist.
- The systemd service will work only if the initramfs hook executed correctly. If there is no Kernel driver in use: vfio-pci line for the IOMMU devices, but no other drivers were loaded either, something is wrong with the service. You can check the service status using systemctl status vfio-load.service or its logs using journalctl -u vfio-load.service.
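To tell the two failure modes apart, it can help to look at driver_override directly: if the hook ran, the file holds vfio-pci regardless of whether the service has loaded the module yet. A minimal sketch, with override_state as a hypothetical helper and the second argument only there for testing:

```shell
#!/bin/sh
# Sketch: report whether the initramfs hook wrote the override.
# $1: PCI address; $2: optional sysfs root for testing
# (default /sys/bus/pci/devices).
override_state() {
    file="${2:-/sys/bus/pci/devices}/$1/driver_override"
    if [ ! -r "$file" ]; then
        echo "unreadable"
    elif [ "$(cat "$file")" = "vfio-pci" ]; then
        echo "set"
    else
        echo "not set"
    fi
}

# Intended use:
#   override_state 0000:01:00.0
```

A result of "set" combined with a missing Kernel driver in use: line points at the service; "not set" points at the hook.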
Conclusion
I hope it helps. Feel free to suggest or point out any improvements, mistakes or bug fixes. And most importantly - Enjoy your awesome VMs!