Ollama runs smoothly on QubesOS

Original forum link
https://forum.qubes-os.org/t/31627
Original poster
Robert Ford
Created at
2025-01-20 12:37:52
Posts count
3
Likes count
3

If you'd like to run an AI model locally, this is how I have been running ollama in a dedicated appVM. Performance is alright depending on the size of the chosen model.

Recommended settings for the appVM (a dom0 sketch for applying them follows the list)

private storage max size: 80 GB
initial memory: 16000 MB
max memory: what you can spare
VCPUs: 4
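
These can be set from dom0; a rough sketch with qvm-volume/qvm-prefs, assuming the appVM is called ollama-vm and that 24000 MB is what you can spare as max memory (adjust both to your setup):

qvm-volume resize ollama-vm:private 80GiB
qvm-prefs ollama-vm memory 16000
qvm-prefs ollama-vm maxmem 24000
qvm-prefs ollama-vm vcpus 4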

In the template:

sudo pacman -Syu
sudo pacman -S ollama
sudo pacman -S docker docker-compose  # optional
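
The rc.local below starts the service per-VM, so it should stay disabled in the template. I'm not sure whether the Arch package enables ollama.service on install, so treat this as an optional check rather than a required step:

sudo systemctl is-enabled ollama   # check current state
sudo systemctl disable ollama      # only if it reports "enabled"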

In the appVM:

sudo mkdir -p /rw/bind-dirs/var/lib/ollama
sudo mkdir -p /rw/config/qubes-bind-dirs.d
sudo nano /rw/config/qubes-bind-dirs.d/50_user.conf

binds+=( '/var/lib/ollama' )
binds+=( '/var/lib/docker' )
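
The bind dirs take effect on the next boot of the appVM; to confirm they are really persisted afterwards, findmnt (plain util-linux, nothing Qubes-specific) should list them as bind mounts:

findmnt /var/lib/ollama
findmnt /var/lib/docker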

sudo nano /rw/config/rc.local

#!/bin/sh

# increase swap size
swapoff /dev/xvdc1
parted -s /dev/xvdc rm 1
parted -s /dev/xvdc rm 3
parted -s /dev/xvdc mkpart primary linux-swap 10G
mkswap /dev/xvdc
swapon -d /dev/xvdc

# service is disabled in template
systemctl start ollama

# several AI projects offer docker containers, you could
# run ollama in a docker container instead if you like
# systemctl start docker
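
rc.local has to be executable for Qubes to run it at boot (it usually already is), and the swap and service changes are easy to verify after the restart; a quick check from inside the appVM:

sudo chmod +x /rw/config/rc.local   # only if it isn't executable yet
swapon --show                       # should list the enlarged swap on xvdc
systemctl status ollama             # service started by rc.local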

Restart the appVM, then download a language model and run it:

ollama help
ollama pull llama3.2
ollama run llama3.2
ollama on the command line is used similarly to docker. Using run gives you a chat interface in the terminal; its service also offers an API listening on 127.0.0.1:11434. Have fun and may enough RAM be with you.
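
If you want to talk to the API directly instead of using the chat interface, here is a minimal sketch with curl against /api/generate (endpoint and fields as in the ollama API docs, using the llama3.2 model pulled above):

curl http://127.0.0.1:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'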

https://github.com/ollama/ollama