# cuda

> Platform-specific setup guide

# Setting Up RamaLama with CUDA Support on Linux systems

This guide walks through the steps required to set up RamaLama with CUDA support.

## Install the NVIDIA Container Toolkit

Follow the installation instructions provided in the [NVIDIA Container Toolkit installation guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html).

### Installation using dnf/yum (For RPM based distros like Fedora)

* Install the NVIDIA Container Toolkit packages

  ```bash theme={"system"}
  sudo dnf install -y nvidia-container-toolkit
  ```

<Note>
  The NVIDIA Container Toolkit is required on the host for running CUDA in containers.
</Note>

<Note>
  If the above installation is not working for you and you are running Fedora, try removing it and using the [COPR](https://copr.fedorainfracloud.org/coprs/g/ai-ml/nvidia-container-toolkit/).
</Note>

### Installation using APT (For Debian based distros like Ubuntu)

* Configure the Production Repository

  ```bash
  curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | \
    sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg

  curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
    sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
    sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
  ```

* Update the packages list from the repository

  ```bash theme={"system"}
  sudo apt-get update
  ```

* Install the NVIDIA Container Toolkit packages

  ```bash
  sudo apt-get install -y nvidia-container-toolkit
  ```

<Note>
  On WSL, the NVIDIA Container Toolkit is required for containers to access CUDA resources.
</Note>
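
Before moving on, it can help to confirm that the tools used in the rest of this guide are on `PATH`. The following is a minimal sketch, not part of RamaLama or the NVIDIA Container Toolkit; `missing_commands` is a hypothetical helper name chosen for illustration.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not shipped with RamaLama or the toolkit):
# print every command from the argument list that is not found on PATH.
# Returns non-zero if at least one command is missing.
missing_commands() {
  local cmd rc=0
  for cmd in "$@"; do
    if ! command -v "$cmd" >/dev/null 2>&1; then
      echo "$cmd"
      rc=1
    fi
  done
  return "$rc"
}
```

For example, `missing_commands nvidia-ctk podman` prints nothing and returns 0 when both tools are installed; otherwise it prints the names that still need installing.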

## Setting Up CUDA Support

For additional information, see [Support for Container Device Interface](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html).

### Generate the CDI specification file

```bash theme={"system"}
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```

### Check the names of the generated devices

List the CDI devices defined in the generated specification:

```bash theme={"system"}
nvidia-ctk cdi list
INFO[0000] Found 1 CDI devices
nvidia.com/gpu=all
```

<Note>
  Generate a new CDI specification after any configuration change, most notably after a driver upgrade.
</Note>
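
In a setup script, the device check can be automated by filtering the `nvidia-ctk cdi list` output for the device name that will later be passed to `--device`. This is a sketch under the assumption that device names appear one per line in the form `nvidia.com/gpu=<id>`, as in the sample output above; `cdi_device_names` and `cdi_has_device` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read `nvidia-ctk cdi list` output on stdin and print
# only the CDI device names, dropping log lines such as "INFO[0000] ...".
cdi_device_names() {
  sed -n 's/^[[:space:]]*\(nvidia\.com\/gpu=[^[:space:]]*\).*$/\1/p'
}

# Hypothetical helper: succeed if the device named in $1 is in the list
# read from stdin (exact, whole-line match).
cdi_has_device() {
  local names
  names="$(cdi_device_names)"
  grep -Fxq -- "$1" <<<"$names"
}
```

A possible use is `nvidia-ctk cdi list | cdi_has_device nvidia.com/gpu=all || echo "regenerate the CDI spec"`.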

## Testing the Setup

**Based on this Documentation:**  [Running a Sample Workload](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/sample-workload.html)

***

### Test the Installation

Run the following command to verify setup:

```bash theme={"system"}
podman run --rm --device=nvidia.com/gpu=all fedora nvidia-smi
```

### Expected Output

If everything is configured correctly, the output looks similar to this:

```text theme={"system"}
Thu Dec  5 19:58:40 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 565.72                 Driver Version: 566.14         CUDA Version: 12.7     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3080        On  |   00000000:09:00.0  On |                  N/A |
| 34%   24C    P5             31W /  380W |     867MiB /  10240MiB |      7%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    0   N/A  N/A        35      G   /Xwayland                                   N/A      |
|    0   N/A  N/A        35      G   /Xwayland                                   N/A      |
+-----------------------------------------------------------------------------------------+
```
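
When the test runs unattended, one way to confirm that the container really reached the driver is to pull the version numbers out of the banner line. The sketch below assumes the banner keeps the `Driver Version: ... CUDA Version: ...` layout shown above; `smi_versions` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read nvidia-smi output on stdin and print
# "<driver version> <CUDA version>" taken from the banner line.
# Prints nothing if no banner line in the assumed layout is present.
smi_versions() {
  sed -n 's/.*Driver Version: *\([0-9.]*\).*CUDA Version: *\([0-9.]*\).*/\1 \2/p'
}
```

For example, piping the test command into it, `podman run --rm --device=nvidia.com/gpu=all fedora nvidia-smi | smi_versions`, yields an empty result when the GPU is not visible inside the container.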

<Note>
  On systems that have SELinux enabled, it may be necessary to turn on the `container_use_devices` boolean in order to run the `nvidia-smi` command successfully from a container.
</Note>

To check the status of the boolean, run the following:

```bash theme={"system"}
getsebool container_use_devices
```

If the result of the command shows that the boolean is `off`, run the following to turn the boolean on:

```bash theme={"system"}
sudo setsebool -P container_use_devices 1
```
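
The check and the fix can be combined so that the boolean is only changed when needed. This is a minimal sketch that assumes `getsebool` reports the state as `container_use_devices --> on` or `--> off`; `sebool_is_on` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: read one line of `getsebool <name>` output on stdin
# and succeed only if the boolean is reported as "on".
sebool_is_on() {
  local line
  IFS= read -r line || true          # tolerate a missing trailing newline
  [ "${line##* --> }" = "on" ]       # keep only the text after " --> "
}
```

With it, `getsebool container_use_devices | sebool_is_on || sudo setsebool -P container_use_devices 1` leaves an already enabled boolean untouched.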

### CUDA\_VISIBLE\_DEVICES

RamaLama respects the `CUDA_VISIBLE_DEVICES` environment variable if it's already set in your environment. If it is not set, RamaLama defaults to using all GPUs detected by `nvidia-smi`.

You can specify which GPU devices should be visible to RamaLama by setting this variable before running RamaLama commands:

```bash theme={"system"}
export CUDA_VISIBLE_DEVICES="0,1"  # Use GPUs 0 and 1
ramalama run granite
```

This is particularly useful in multi-GPU systems where you want to dedicate specific GPUs to different workloads.

If the `CUDA_VISIBLE_DEVICES` environment variable is set to an empty string, RamaLama will default to using the CPU.

```bash theme={"system"}
export CUDA_VISIBLE_DEVICES=""  # Defaults to CPU
ramalama run granite
```

To revert to using all available GPUs, unset the environment variable:

```bash theme={"system"}
unset CUDA_VISIBLE_DEVICES
```
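
The three cases above (unset, empty, and an explicit list) are easy to confuse in scripts, because an unset variable and an empty one expand to the same string. The sketch below only restates the rules described in this section and is not RamaLama's own logic; `gpu_selection` is a hypothetical helper name.

```bash
#!/usr/bin/env bash
# Hypothetical helper: report which devices would be selected for the
# current CUDA_VISIBLE_DEVICES setting, following the rules in this section.
gpu_selection() {
  if [ -z "${CUDA_VISIBLE_DEVICES+set}" ]; then
    echo "all"                            # unset: all detected GPUs
  elif [ -z "$CUDA_VISIBLE_DEVICES" ]; then
    echo "cpu"                            # set but empty: CPU
  else
    echo "gpus:$CUDA_VISIBLE_DEVICES"     # explicit list, e.g. "0,1"
  fi
}
```

The `${VAR+set}` expansion is what distinguishes "unset" from "set to an empty string". To limit the setting to a single command instead of the whole shell session, the variable can also be given as a prefix, for example `CUDA_VISIBLE_DEVICES=0 ramalama run granite`.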

## Troubleshooting

### CUDA Updates

After some CUDA software updates, RamaLama stops working and complains about missing shared NVIDIA libraries, for example:

```bash theme={"system"}
ramalama run granite
Error: crun: cannot stat `/lib64/libEGL_nvidia.so.565.77`: No such file or directory: OCI runtime attempted to invoke a command that was not found
```

Because the CUDA version has changed, the CDI specification file still points at the old library versions and must be regenerated:

```bash theme={"system"}
sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
```
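
To spot a stale specification before a container fails, the host paths it references can be checked for existence. This is a rough sketch that assumes the spec lists each mounted file on its own `hostPath: <path>` line with unquoted values; `stale_cdi_paths` is a hypothetical helper name, and the default location is the one used in this guide.

```bash
#!/usr/bin/env bash
# Hypothetical helper: print every hostPath entry in a CDI spec that no
# longer exists on the host. Assumes one unquoted "hostPath: <path>" per line.
stale_cdi_paths() {
  local spec="${1:-/etc/cdi/nvidia.yaml}" path
  sed -n 's/^[[:space:]-]*hostPath:[[:space:]]*//p' "$spec" |
    while IFS= read -r path; do
      [ -e "$path" ] || echo "$path"
    done
}
```

Any output from `stale_cdi_paths`, such as a versioned library file that was replaced by the update, suggests that the specification should be regenerated.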

## See Also

[ramalama(1)](/pages/commands/ramalama/), [podman(1)](https://github.com/containers/podman/blob/main/docs/source/markdown/podman.1.md)

***

*Jan 2025, Originally compiled by Dan Walsh \<[dwalsh@redhat.com](mailto:dwalsh@redhat.com)>*