ramalama-musa.7

Setting Up RamaLama with MUSA Support on Linux systems

This guide walks through the steps required to set up RamaLama with MUSA support.

Install the MT Linux Driver

Download the appropriate MUSA SDK and follow the installation instructions provided in the MT Linux Driver installation guide.
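As a rough sketch, assuming the SDK ships as a self-extracting installer, the step comes down to downloading the file and running it with root privileges. The filename below is a placeholder; use whichever package the MT Linux Driver installation guide points you to:

# Placeholder filename: substitute the SDK installer you actually downloaded
# from the Moore Threads developer portal.
chmod +x ./musa_<version>.run
sudo ./musa_<version>.run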

Install the MT Container Toolkit

Obtain the latest MT CloudNative Toolkits and follow the installation instructions provided in the MT Container Toolkit installation guide.
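A minimal sketch, assuming the toolkit is delivered as a Debian package (the filename is a placeholder; the official guide lists the real artifacts and the order in which to install them):

# Placeholder package name: install whatever the MT Container Toolkit
# release actually provides for your distribution.
sudo dpkg -i ./mt-container-toolkit_<version>_amd64.deb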

Setting Up MUSA Support

$ (cd /usr/bin/musa && sudo ./docker setup $PWD)
$ docker info | grep mthreads
Runtimes: mthreads mthreads-experimental runc
Default Runtime: mthreads

Testing the Setup

Test the Installation

Run the following command to verify the setup:

docker run --rm --env MTHREADS_VISIBLE_DEVICES=all ubuntu:22.04 mthreads-gmi

Expected Output

If everything is configured correctly, you should see output similar to this:

Thu May 15 01:53:39 2025
---------------------------------------------------------------
    mthreads-gmi:2.0.0          Driver Version:3.0.0
---------------------------------------------------------------
ID   Name            |PCIe                 |%GPU  Mem
     Device Type     |Pcie Lane Width      |Temp  MPC Capable
                     |                     |      ECC Mode
+-------------------------------------------------------------+
0    MTT S80         |00000000:01:00.0     |0%    3419MiB(16384MiB)
     Physical        |16x(16x)             |59C   YES
                     |                     |      N/A
---------------------------------------------------------------

---------------------------------------------------------------
Processes:
ID   PID       Process name                         GPU Memory
                                                    Usage
+-------------------------------------------------------------+
  No running processes found
---------------------------------------------------------------

MUSA_VISIBLE_DEVICES

RamaLama respects the MUSA_VISIBLE_DEVICES environment variable if it is already set in your environment. If it is not set, RamaLama defaults to using all GPUs detected by mthreads-gmi.

You can specify which GPU devices should be visible to RamaLama by setting this variable before running RamaLama commands:

export MUSA_VISIBLE_DEVICES="0,1"  # Use GPUs 0 and 1
ramalama run granite

This is particularly useful in multi-GPU systems where you want to dedicate specific GPUs to different workloads.
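For instance, here is a sketch of dedicating one GPU each to two serving workloads; the model names and ports are illustrative, and any models available to RamaLama work the same way:

# Serve one model on GPU 0 and a second model on GPU 1.
MUSA_VISIBLE_DEVICES="0" ramalama serve --port 8080 granite &
MUSA_VISIBLE_DEVICES="1" ramalama serve --port 8081 tinyllama &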


May 2025, Originally compiled by Xiaodong Ye <yeahdongcn@gmail.com>