│ ├── disable-nouveau.sls
│ ├── init.sls
│ └── map.jinja
| │ ├── f41-disable-nouveau.sls
│ ├── f41.sls
│ └── default.jinja
|
Let's start with the obvious. `top.sls` is a top file. It describes high state, which is really just a combination of convetional salt formulas. Stray piece of salt configuration can be referred to as `formula`, although I've seen this word being used in various contexts. `test.sls` is a state file. It contains a configuration written in yaml. `nvidia-driver` is also a state, although it is a directory. This is an alternative way to store state for situations when you want to have multiple state (or not only state) files. When a state directory is referenced, salt evaluates `init.sls` state file inside. State files may or may not be `include`d from `init.sls` or other state files.
Yaml configuration consists of states. In this context, state refers to a module - piece of code that most often does a pretty specific thing. In a configuration, states behave like commands or functions and methods of a programming language. One valuable thing to note here is that not all modules *are* state modules. There are [a lot](https://docs.saltproject.io/en/latest/py-modindex.html) of them, and they can do various things, but here we only need the state kind.
In addition to state files, you notice `map.jinja`. [Jinja](https://palletsprojects.com/projects/jinja/) is a templating engine. What it means is that it helps you to generalize your state files by adding variables, conditions and other cool features. You can easily recognize jinja by fancy brackets: `{{ }}`, `{% %}`, `{# #}`. This file in particular stores variable definitions and is used for configuration of the whole state directory thingy (nvidia-driver).
| Let's start with the obvious. `top.sls` is a top file. It describes the high state, which is really just a combination of conventional salt formulas. A stray piece of salt configuration can be referred to as a `formula`, although I've seen this word used in various contexts. `test.sls` is a state file. It contains a configuration written in yaml. `nvidia-driver` is also a state, although it is a directory. This is an alternative way to store a state for situations when you want to have multiple state (or not only state) files. When a state directory is referenced, salt evaluates the `init.sls` state file inside it. Other state files may or may not be included from `init.sls` or from each other.
> Since in this case different formulas are used depending on distribution, it doesn't make much sense to *have* `init.sls`. In this configuration, you can't just call for `nvidia-driver`, and must specify distribution too: `nvidia-driver.f41`
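As a rough illustration, a top file that maps this formula to its targets might look like the sketch below. The `base` environment and the qube name are assumptions taken from the defaults shown in this guide; the actual `top.sls` may differ.

```yaml
# Hypothetical top file: which formulas go to which targets.
base:
  dom0:
    - nvidia-driver.f41
  fedora-41-nvidia-cuda:
    - nvidia-driver.f41
```

Note the dot notation: `nvidia-driver.f41` points at `f41.sls` inside the `nvidia-driver` directory.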
Yaml configuration consists of states. In this context, a state refers to a module: a piece of code that usually does one pretty specific thing. In a configuration, states behave like commands, or like functions and methods of a programming language. At the same time, salt formulas are distinct from conventional programming languages in their order of execution. Unless you explicitly define the order using arguments like `require`, `require_in`, and `order`, you should not rely on states executing in the order you write them.
> One thing to note here is that not all modules *are* state modules. There are [a lot](https://docs.saltproject.io/en/latest/py-modindex.html) of them, and they can do various things, but here we only need the state kind.
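To make the point about ordering concrete, here is a minimal sketch of two states where the second one explicitly waits for the first. The state IDs and commands are made up for illustration and are not part of this formula.

```yaml
# Hypothetical state IDs and commands, only to show the requisite syntax.
example--prepare:
  cmd.run:
    - name: echo "runs first"

example--finish:
  cmd.run:
    - name: echo "runs only after example--prepare"
    - require:
      - cmd: example--prepare
```

A requisite is written as `<state module>: <ID>`. If `example--prepare` fails, salt does not run `example--finish`.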
In addition to state files, you will notice `default.jinja`. [Jinja](https://palletsprojects.com/projects/jinja/) is a templating engine. This means it helps you generalize your state files by adding variables, conditions and other cool features. You can easily recognize jinja by its fancy brackets: `{{ }}`, `{% %}`, `{# #}`. This file in particular stores variable definitions and is used for configuration of the whole state tree (directory `nvidia-driver`).
|
- name: {{ prefs.standalone_name }}
| - name: {{ nvd_f41['standalone']['name'] }}
|
- template: {{ prefs.template_name }}
- label: {{ prefs.standalone_label }}
- mem: {{ prefs.standalone_memory }}
- vcpus: {{ prefs.standalone_cpus }}
| - template: {{ nvd_f41['template']['name'] }}
- label: {{ nvd_f41['standalone']['label'] }}
- flags:
- standalone
- prefs:
- vcpus: {{ nvd_f41['standalone']['vcpus'] }}
- memory: {{ nvd_f41['standalone']['memory'] }}
|
- class: StandaloneVM
- prefs:
- label: {{ prefs.standalone_label }}
- mem: {{ prefs.standalone_memory }}
- vcpus: {{ prefs.standalone_cpus }}
- pcidevs: {{ devices }}
| - pcidevs: {{ nvd_f41['devices'] }}
|
- maxmem: 0
- class: StandaloneVM
| |
Here, I use qubes-specific `qvm.vm` state module (which in reality is a wrapper around other modules, like `prefs`, `features`, [etc](https://github.com/QubesOS/qubes-mgmt-salt-dom0-qvm?tab=readme-ov-file#qvm-vm).). Pretty much all values and keys here are the same as you can set and get using `qvm-prefs` and `qvm-features`. For nvidia drivers to work, kernel must be provided by the qube - that's why the field is empty. Similarly, to pass GPU we need to set virtualization mode to `hvm` and `maxmem` to 0 (it disables memory balancing).
| Here, I use the qubes-specific `qvm.vm` state module (which in reality is a wrapper around other modules, like `prefs`, `features`, [etc](https://github.com/QubesOS/qubes-mgmt-salt-dom0-qvm?tab=readme-ov-file#qvm-vm).). Almost all values and keys here are the same as the ones you can set and get using `qvm-prefs` and `qvm-features`. For nvidia drivers to work, the kernel must be provided by the qube, which is why the field is empty. Similarly, to pass the GPU through we need to set virtualization mode to `hvm` and `maxmem` to 0 (this disables memory balancing).
|
Now, to the jinja statements. Here, they provide values for keys like label, template, name, etc. Some of them are done this way (as opposed to writing a value by hand) because the value is repeated in the state file multiple times, others are to simplify the process of configuration. In order to figure out why some of them use dot notation whereas other don't, we must check their declaration. In this state file they're imported using the following line:
| Now, to the jinja statements. Here, they provide values for keys like label, template, name, etc. Some of them are done this way (as opposed to writing a value by hand) because the value is repeated in the state file multiple times, others are there to simplify the process of configuration. In this state file, the jinja variable is imported using the following snippet:
|
{% from 'nvidia-driver/map.jinja' import prefs,devices,paths %}
| {% if nvd_f41 is not defined %}
{% from 'nvidia-driver/default.jinja' import nvd_f41 %}
{% endif %}
|
This is pretty much just python in brackets. Notice that you need to specify state directory when importing, and use actual path instead of dot notation.
| Jinja is very similar to python in its syntax. In this case, the variable from `default.jinja` gets imported only if it is not already declared in the current context. This allows us both to call this formula as is (without any jinja context) and to include it in other formulas (potentially with a custom definition of `nvd_f41`). Note that you need to specify the state directory when importing, and use the actual path instead of dot notation.
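To show the difference between the two notations side by side, here is a small sketch. The `include` of `f41-disable-nouveau` is only an illustration of dot notation; I am not claiming the formula itself includes it.

```yaml
{# jinja import: a real path with slashes and the file extension #}
{% from 'nvidia-driver/default.jinja' import nvd_f41 %}

# salt reference: dot notation, no .sls extension (illustrative only)
include:
  - nvidia-driver.f41-disable-nouveau
```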
|
```jinja
{% set prefs = {
'standalone_name': 'fedora-40-nvidia',
'standalone_label': 'yellow',
'standalone_memory': 4000,
'standalone_cpus': 4,
'template_name': 'fedora-40-xfce',
} %}
{# Don't forget to check devices before running! #}
{% set devices = [
'01:00.0',
'01:00.1',
] %}
{% set paths = {
'nvidia_conf': '/usr/share/X11/xorg.conf.d/nvidia.conf',
'grub_conf': '/etc/default/grub',
'grub_out': '/boot/grub2/grub.cfg',
| ```python
{% set nvd_f41 = {
'template':{'name':'fedora-41-xfce'},
'standalone':{
'name':'fedora-41-nvidia-cuda',
'label':'yellow',
'memory':'4000',
'vcpus':'4',
},
'devices':['01:00.0','01:00.1'],
'paths':{
'nvidia_conf':'/usr/share/X11/xorg.conf.d/nvidia.conf',
'grub_conf':'/etc/default/grub',
'grub_out':'/boot/grub2/grub.cfg',
},
|
Here, I declare dictionary `prefs`, list `devices`, and another dictionary `paths`. Since we need to pass all devices from the list to new qube, in the state file I reference the whole list using jinja expression (`{{ devices }}`). Dictionaries are used to fill parameters, and dot notation is used to reference specific values in them.
Double brackets tell the parser to "print" the value into state file before the show starts, whereas statements (`{% %}`) do logic. `{# #}` is a comment.
| Here, I declare the dictionary `nvd_f41`. It contains sub-dictionaries for template parameters and standalone qube parameters, a list of pci devices to pass through, and another dictionary for paths. Since we need to pass all devices in the list to the new qube, in the state file I reference the whole list.
Jinja behavior differs depending on which delimiter is used. Code in double brackets (called an expression) tells the parser to "print" the resulting value into the state file before the show starts. Statements (`{% %}`) do logic. `{# #}` is a comment.
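Here is a tiny sketch that uses all three delimiters at once. The variable, the state ID and the use of `test.succeed_without_changes` are chosen purely for illustration and are not part of the formula.

```yaml
{# Hypothetical example: this comment never reaches the rendered state file. #}
{% set example = {'name': 'my-qube', 'label': 'yellow'} %}

{% if example['label'] == 'yellow' %}
example--{{ example['name'] }}:
  test.succeed_without_changes:
    - name: label is {{ example['label'] }}
{% endif %}
```

After jinja is done, what salt actually parses should be plain yaml along these lines (blank lines aside):

```yaml
example--my-qube:
  test.succeed_without_changes:
    - name: label is yellow
```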
|
Now, when we have a qube at the ready (you can check it by applying it), how to install drivers? I want to discuss what's going on next, because at the moment of writing (November 2024) this guide is for fedora 40 in combination with somewhat modern hardware, and it has some distro-specific issues.
> Tip: To apply state, put your state into the folder in your salt environment together with jinja file and run
| Now that we have a qube at the ready (you can check it by applying the formula), how do we install drivers? I want to discuss what's going on next, because at the time of writing (December 2024) this guide is for fedora 41, and it has some distro-specific issues.
[details="How do I apply a formula?"]
To apply a formula, put your state into the folder in your salt environment together with the jinja file and run
|
(substitute `<name_of_your_state>`)
>
> Salt will apply the state to all targets. When not specified, dom0 is the only target. This is what we want here, because dom0 handles creation of qubes, but what to do if situation is different? Add `--skip-dom0` if you want to skip dom0 and add `--targets=<targets>` to add something else.
| (substitute `<name_of_your_state>`)
Salt will apply the state to all targets. When not specified, dom0 is the only target. This is what we want here, because dom0 handles creation of qubes. Add `--skip-dom0` if you want to skip dom0 and add `--targets=<targets>` to add something else.
[/details]
|
4. Delete `grubby-dummy`, because it conflicts with `sdubby`, and nvidia drivers depend on it.
5. Install `akmod-nvidia` and `xorg-x11-drv-nvidia-cuda`
6. Wait for building process to finish
7. Delete X config, because we don't need it where we going :sunglasses:
8. **optional** : Disable nouveau, because nvidia install script may fail to convince the system that it should use nvidia driver.
| > According to my measurements, the driver no longer needs more than 1G to build. I have decided to leave this step in, just in case the problem still occurs with different hardware.
6. Install `akmod-nvidia` and `xorg-x11-drv-nvidia-cuda`
8. Wait for the driver to build
9. Delete X config, because we don't need it where we're going :sunglasses:
10. **optional** : Disable nouveau, because nvidia install script may fail to convince the system that it should use nvidia driver.
|
Unless you are willing to write (and call for) multiple states to perform single operation, you might be wandering how to make salt apply only first state (qube creation) to dom0, and all others - to the nvidia qube. The answer is to use jinja:
| Unless you are willing to write (and call for) multiple states to perform one operation, you might be wondering how to make salt apply only the first state (qube creation) to dom0, and all the others to the nvidia qube. The answer is to use jinja:
|
{% elif grains['id'] == prefs.standalone_name %}
<!-- prefs.standalone_name stuff goes here -->
| {% elif grains['id'] == nvd_f41['standalone']['name'] %}
<!-- nvd_f41['standalone']['name'] stuff goes here -->
|
Pretty self-explanatory. `{free,nonfree}` is used to enable multiple repositories at once. This is not salt or jinja-specific.
| Pretty self-explanatory. `{free,nonfree}` is used to enable multiple repositories at once. It is a feature of the shell, not salt or jinja.
|
- name: dnf config-manager --enable rpmfusion-{free,nonfree}{,-updates}
```
| - name: dnf config-manager setopt rpmfusion-{free,nonfree}{,-updates}.enabled=1
```
> Wow, new dnf syntax. How is this bulkier and less readable than the old one?
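For reference, here is a sketch of the same command with the braces expanded by hand, assuming the shell expands them the way bash does. The state ID is a placeholder, not the one used in the formula.

```yaml
# Placeholder ID; the command is the hand-expanded form of
# rpmfusion-{free,nonfree}{,-updates}.enabled=1
example--enable-rpmfusion:
  cmd.run:
    - name: >-
        dnf config-manager setopt
        rpmfusion-free.enabled=1
        rpmfusion-free-updates.enabled=1
        rpmfusion-nonfree.enabled=1
        rpmfusion-nonfree-updates.enabled=1
```

The `>-` folds the lines into one command separated by single spaces, so this is only a more verbose way of writing the one-liner.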
|
This lasts until reboot. 4G is probably overkill.
| This lasts until reboot. As I already mentioned, you might not need this. On the other hand, it is non-persistent and generally harmless, so why not?
|
- name: mount -o remount,size=4G /tmp/
| - name: mount -o remount,size=2G /tmp/
|
## 4. Delete `grubby-dummy`
| ## 4. Install drivers
This one is very bad. The dnf syntax has changed after the update, and the salt-native package management state doesn't work anymore. I am forced to run this as a command.
Here, I use the `- require:` parameter to wait for other states to apply before installing the drivers. Note that it needs both the state module (e.g. `cmd`) and the label of the required state to function. Since `pkg.install` doesn't work with dnf yet, I resort to running a command. To keep the state tidy, I declare a jinja variable just before the command and immediately use it instead of writing out the long command. As far as I can tell, `' '.join()` does the same thing it does in python: it converts a list into a string by connecting its elements with `' '`.
|
nvidia-driver--remove-grubby:
pkg.purged:
- pkgs:
- grubby-dummy
```
## 5. Install drivers
Here, I use `- require:` parameter to wait for other states to apply before installing the drivers. Note that it needs both state (e.g. `cmd`) and label to function.
```yaml
| {% set packages = [
'akmod-nvidia',
'xorg-x11-drv-nvidia-cuda',
'vulkan',
] %}
|
pkg.installed:
- pkgs:
- akmod-nvidia
- xorg-x11-drv-nvidia-cuda
- nvtop
| cmd.run:
- name: dnf install -y {{ ' '.join(packages) }}
|
- pkg: nvidia-driver--remove-grubby
| |
- pkg: nvidia-driver--install
| - cmd: nvidia-driver--install
|
- name: {{ paths.nvidia_conf }}
| - name: {{ nvd_f41['paths']['nvidia_conf'] }}
|
- Why don't just add reboot state into the state file before this? Because only dom0 can reboot qubes, dom0 states are always applied first, and there is no way I know of to make it run part of its state, wait until a condition is met, and continue, without multiple calls to qubesctl, unless...
> To run state located inside state folder, use dot notation, e.g.: `state.sls nvidia-driver.disable-nouveau`
| - Why not just add a reboot state into the state file before this? Because only dom0 can reboot qubes, dom0 states are always applied first, and there is no way I know of to make it run part of its state, wait until a condition is met, and continue, without multiple calls to qubesctl, unless...
|
{% from 'nvidia-driver/map.jinja' import prefs,paths %}
{% if grains['id'] == prefs.standalone_name %}
| {% if nvd_f41 is not defined %}
{% from 'nvidia-driver/default.jinja' import nvd_f41 %}
{% endif %}
{% if grains['id'] == nvd_f41['standalone']['name'] %}
|
- name: {{ paths.grub_conf }}
| - name: {{ nvd_f41['paths']['grub_conf'] }}
|
- name: grub2-mkconfig -o {{ paths.grub_out }}
| - name: grub2-mkconfig -o {{ nvd_f41['paths']['grub_out'] }}
|
Make sure to change the paths if you're not running fedora 40.
| Make sure to change the paths if you're not running fedora 41.
|
***
Want to check out the complete state? Here you go:
| # Downloads
## Current version
- [github repo](https://github.com/RandyTheOtter/nvidia-driver/tree/main)
- Direct download:
- [f41.yaml|attachment](upload://nPz2haLCFV17rmFrz3FNAjUs2Cm.yaml) (1.9 KB)
- [f41-disable-nouveau.yaml|attachment](upload://yJVnEZLkahAJXKlsjt6gPQEN85H.yaml) (498 Bytes)
- [default.yaml|attachment](upload://6JKuhy0loihkwT1RA9gi1lYekc8.yaml) (379 Bytes)
## Old version (fedora 40)
This formula is quite different from the new one due to the changes in dnf, the resolution of the `grubby-dummy` dependency conflict, and updates to the way I use jinja. Here is the old procedure:
1. Prepare qube
2. Enable rpmfusion repository
3. Grow `/tmp/`, because the default 1G is too small to fit everything that the driver building process spews out. It will fail otherwise.
4. Delete `grubby-dummy`, because it conflicts with `sdubby`. Nvidia drivers depend on it. See [this issue](https://github.com/QubesOS/qubes-issues/issues/9556).
5. Install `akmod-nvidia` and `xorg-x11-drv-nvidia-cuda`
6. Wait for the driver to build
7. Delete X config, because we don't need it where we're going :sunglasses:
8. **optional** : Disable nouveau, because nvidia install script may fail to convince the system that it should use nvidia driver.
Old state files. They're kind of uggo, but I like them anyway:
|