This tutorial will guide you through setting up fully local text-to-speech and speech-to-text with Wyoming Whisper and Piper. We'll be doing this on a Proxmox 8.1.4 virtual environment, within an Ubuntu 22.04 LXC. Thanks to CUDA, this will be blazing fast, but it requires a compatible NVIDIA GPU, for example the GeForce RTX 3060.
The tutorial is based upon this modest rig:
What really matters here is that you have a GPU that supports CUDA, and that the GPU is compatible with your motherboard. If you do, your motherboard and CPU are also probably compatible with the Proxmox requirements for virtualization.
Outline:
Generally speaking, you may need to enable virtualization support in your BIOS. The problem is, different vendors and different chipsets have different menus, names and default settings. I found this general guide for Proxmox 7.x and used it as a reference for Proxmox 8.1.4.
In my particular BIOS, I did the following: first, I disabled CSM. Then I changed the following under the Advanced menu:
I may have changed other virtualization options a year ago when I first set up Proxmox, but these were the ones I tweaked specifically for virtualizing the GPU. I'm not sure all of these steps were strictly necessary, but everything worked on my first try after making them.
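Once Proxmox boots, you can sanity check from the host shell that virtualization is actually enabled. These two commands are a quick test (vmx is the Intel CPU flag, svm the AMD one; the second command assumes the IOMMU messages end up in the kernel log):
# Should print a number greater than 0 if hardware virtualization is enabled
egrep -c '(vmx|svm)' /proc/cpuinfo
# Should show DMAR/IOMMU lines if the IOMMU is enabled
dmesg | grep -e DMAR -e IOMMU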
In order to pass through your GPU to guests, you need the drivers installed on the host. Installing the drivers involves compiling them specifically for your system.
You need to be in the Linux shell of your Proxmox host as root for these steps. Use SSH or connect a keyboard and screen directly to the Proxmox host computer.
For compiling, you need gcc and make:
apt-get install gcc
apt-get install make
For tailoring to your specific system, you need to download the correct kernel headers:
uname -a
This will output something like:
Linux proxmox-node1 6.5.11-8-pve #1 SMP PREEMPT_DYNAMIC PMX 6.5.11-8 (2024-01-30T12:27Z) x86_64 GNU/Linux
Now, use the version number <6.5.11-8-pve> to install the kernel headers like this:
apt-get install pve-headers-6.5.11-8-pve
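If you'd rather not copy the version string by hand, the same install can be done in one line; this assumes uname -r reports a -pve kernel as above:
# Install the headers matching the currently running kernel
apt-get install pve-headers-$(uname -r)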
The Proxmox 8.1.4 kernel has a driver which is incompatible with the NVIDIA driver, so you need to disable the Nouveau kernel driver before you install the NVIDIA driver. This requires a few steps.
Create a file:
nano /etc/modprobe.d/blacklist-nouveau.conf
Add the following to the file:
blacklist nouveau
options nouveau modeset=0
Now run:
update-initramfs -u
reboot
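After the reboot, it's worth a quick check that Nouveau really is gone before installing the NVIDIA driver:
# Should print nothing if Nouveau is no longer loaded
lsmod | grep nouveau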
After rebooting the Proxmox host, you can download the latest NVIDIA driver from the NVIDIA website. For this setup, choose Linux x86_64/AMD64, Latest Production Branch. Download the driver, make it executable, and run the installer as root on the Proxmox host, like this:
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
chmod +x NVIDIA-Linux-x86_64-535.154.05.run
./NVIDIA-Linux-x86_64-535.154.05.run
Don't worry about the X library path message during the install. I also installed the 32-bit compatibility libraries, but I don't know if they were needed. And finally, don't let the installer update your xorg settings.
Now it's time to check if your GPU is detected. Run:
nvidia-smi -L
Find the cgroups assigned to your GPU:
ls -l /dev/nvidia*
In the output, look up the cgroup numbers (the major device numbers) for the /dev/nvidia* devices and for /dev/nvidia-uvm*.
In my case they were 195 and 511. Take note of these cgroup numbers!
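For reference, the output of ls -l /dev/nvidia* looks roughly like this (illustrative only; your dates and minor numbers will differ). The cgroup number you need is the major device number, i.e. the number just before the comma:
crw-rw-rw- 1 root root 195,   0 Feb  1 10:00 /dev/nvidia0
crw-rw-rw- 1 root root 195, 254 Feb  1 10:00 /dev/nvidia-modeset
crw-rw-rw- 1 root root 195, 255 Feb  1 10:00 /dev/nvidiactl
crw-rw-rw- 1 root root 511,   0 Feb  1 10:00 /dev/nvidia-uvm
crw-rw-rw- 1 root root 511,   1 Feb  1 10:00 /dev/nvidia-uvm-tools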
It's time to move on to the guest LXC where we'll run Whisper and Piper.
Firstly, you need to download a template. You do this from the Proxmox web GUI. You will probably need to access the GUI in a browser from another computer on the same network as your Proxmox server, at the GUI address (by default https://<your-proxmox-ip>:8006).
In the GUI, open the "local" storage on your Proxmox node and download the Ubuntu 22.04 template from there. The "local" storage you are looking for is located in the leftmost menu in Proxmox, towards the bottom - not to be confused with the "local" entry in the second-leftmost menu.
PS! If you plan on using Whisper and Piper with Home Assistant, it's important that your container gets an IP address that is routable from your Home Assistant server. This is most easily accomplished by having the guest connect via a network bridge, e.g. vmbr0, with the same gateway as your Home Assistant server. You should use a static IP, but a dynamic IP is fine if you reserve addresses on the DHCP server side. If you use VLANs, also make sure the bridge is VLAN aware.
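If you prefer the host shell over the GUI, the template download and container creation can also be done with pveam and pct. This is only a sketch: the template filename, container ID 101, storage names and IP addresses are assumptions you must adapt to your own setup:
# Refresh the template catalogue and download the Ubuntu 22.04 template to "local"
pveam update
pveam download local ubuntu-22.04-standard_22.04-1_amd64.tar.zst
# Create the container on a bridge (vmbr0) with a static IP, as discussed above
pct create 101 local:vztmpl/ubuntu-22.04-standard_22.04-1_amd64.tar.zst \
  --hostname whisper-piper --cores 4 --memory 4096 \
  --rootfs local-lvm:30 \
  --net0 name=eth0,bridge=vmbr0,ip=192.168.1.50/24,gw=192.168.1.1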
Now, go back to the Proxmox host shell as root to edit the LXC config file to pass through the GPU from the host to the guest:
nano /etc/pve/lxc/101.conf
Add the following 7 lines to the bottom of the LXC configuration file, swapping out 195 and 511 if your cgroup numbers differ:
lxc.cgroup2.devices.allow: c 195:* rwm
lxc.cgroup2.devices.allow: c 511:* rwm
lxc.mount.entry: /dev/nvidia0 dev/nvidia0 none bind,optional,create=file
lxc.mount.entry: /dev/nvidiactl dev/nvidiactl none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm dev/nvidia-uvm none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-modeset dev/nvidia-modeset none bind,optional,create=file
lxc.mount.entry: /dev/nvidia-uvm-tools dev/nvidia-uvm-tools none bind,optional,create=file
Now you are ready to start the LXC and go into it.
With the guest up and running, enter the guest shell via the console available in the Proxmox GUI.
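Alternatively, the container can be started and entered directly from the Proxmox host shell (again assuming container ID 101):
pct start 101
pct enter 101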
Installing the drivers on the guest system is easier than on the host. You need to download the same driver as for the host, but you aren't compiling anything, because we reuse the kernel modules that were built and installed on the host (the LXC shares the kernel with its host).
Download and install like this (as root on the guest):
wget https://us.download.nvidia.com/XFree86/Linux-x86_64/535.154.05/NVIDIA-Linux-x86_64-535.154.05.run
chmod +x NVIDIA-Linux-x86_64-535.154.05.run
./NVIDIA-Linux-x86_64-535.154.05.run --no-kernel-module
If the install fails in the middle, just try again; it may take 2 or 3 attempts before it goes through. As on the host, don't worry about the X/xorg messages, and install the 32-bit compatibility libraries.
In the end, check that the card is available inside the guest:
nvidia-smi -L
Your card should be listed now.
We need Docker to run Whisper and Piper. Before we install Docker, we update the system and, for the sake of cleanliness, remove any older Docker components that might be lying around. Run these commands as root on the guest system:
apt-get update
apt-get dist-upgrade
for pkg in docker.io docker-doc docker-compose docker-compose-v2 podman-docker containerd runc; do sudo apt-get remove $pkg; done
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc
# Add the repository to Apt sources:
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
$(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
# Install docker
sudo apt-get update
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
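Before adding the GPU bits, it can be worth confirming that Docker itself works with the standard hello-world test:
sudo docker run hello-world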
Technology moves fast, but not all of it at the same pace. NVIDIA seems to be moving faster with their development kits than Wyoming Whisper does in adopting the CUDA programming interface, so this guide may already be slightly outdated. cuDNN is a critical component in this mix, and the NVIDIA web pages currently only offer version 9, but we'll install version 8 (9 may work, but I haven't tried it). We'll be using a mix of old and new libraries, but trust the process!
First, install the NVIDIA container toolkit and configure it. Do this as root on the guest system:
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
&& curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-container-toolkit
sudo nvidia-ctk runtime configure --runtime=docker
Now we need to tweak the NVIDIA container runtime configuration (things didn't work on my system unless I did this, and I'm not sure why):
vi /etc/nvidia-container-runtime/config.toml
Change:
#no-cgroups = false
To:
no-cgroups = true
Now you need to restart docker:
sudo systemctl restart docker
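At this point you can verify that containers can actually see the GPU. This is the sample workload NVIDIA uses in their container toolkit documentation (it pulls a small Ubuntu image); it should print the same card as nvidia-smi on the guest:
sudo docker run --rm --runtime=nvidia --gpus all ubuntu nvidia-smi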
Additionally, we need some libraries for Whisper that we don't get from the NVIDIA container toolkit:
sudo apt install libcublas11
wget https://developer.download.nvidia.com/compute/redist/cudnn/v8.8.0/local_installers/11.8/cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb
sudo dpkg -i cudnn-local-repo-ubuntu2204-8.8.0.121_1.0-1_amd64.deb
sudo cp /var/cudnn-local-repo-ubuntu2204-8.8.0.121/cudnn-local-B66125A0-keyring.gpg /usr/share/keyrings
sudo apt-get update
sudo apt install libcudnn8
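Before moving on, check that the library files the docker-compose file below expects actually exist at these paths (exact filenames can vary slightly between package versions):
ls -l /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8 \
      /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8 \
      /usr/lib/x86_64-linux-gnu/libcublas.so.11 \
      /usr/lib/x86_64-linux-gnu/libcublasLt.so.11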
Finally, we need to install Whisper and Piper. I've set them up with Norwegian as the base language, but it's just a matter of changing parameters and language codes if you want something else.
Tips for Whisper. Tips for Piper.
Create a file on the guest system named docker-compose.yml and fill it with the following content:
version: '3'
services:
  wyoming-whisper:
    image: rhasspy/wyoming-whisper:latest
    ports:
      - "10300:10300"
    volumes:
      - ./whisper-data:/data
      - /usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8:/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8:ro
      - /usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8:/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8:ro
      - /usr/lib/x86_64-linux-gnu/libcublasLt.so.11:/usr/lib/x86_64-linux-gnu/libcublasLt.so.12:ro
      - /usr/lib/x86_64-linux-gnu/libcublas.so.11:/usr/lib/x86_64-linux-gnu/libcublas.so.12:ro
    command: --model medium-int8 --language no --beam-size 5 --device cuda
    restart: unless-stopped
    runtime: nvidia
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
  piper:
    image: rhasspy/wyoming-piper
    command: --voice no_NO-talesyntese-medium
    volumes:
      - ./piper-data:/data
    environment:
      - TZ=Europe/Oslo
    restart: unless-stopped
    ports:
      - 10200:10200
Start the services:
docker compose up -d
Debug containers:
docker ps
docker logs <id>
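Since both services were started with compose, you can also follow the logs for both containers at once, which is handy when debugging them at the same time:
docker compose logs -f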
To connect to the services from Home Assistant, use the IP of the guest system + port numbers 10300 and 10200 respectively. To find the IP of the guest, run this command from the guest shell:
ip a
If you skimped on the disk, you can resize it on the fly from the host shell as root. For example, to resize the disk of guest ID 101 to 30G:
pct resize 101 rootfs 30G
# Validate that it looks ok - the change should already be in place, and no LXC restart should be needed
vi /etc/pve/lxc/101.conf
Sometimes the cgroup numbers change between host reboots. If this happens, edit the LXC config accordingly and restart the guest system. If you experience other problems, try reinstalling the driver, but make sure you do it on both host and guest, with the same driver version, to avoid a mismatch.