# Models (OCI)

> Raw model files packaged as OCI artifacts for portability, provenance, and secure distribution.

RamaLama “Model” artifacts package raw model files (e.g., `.gguf`, Safetensors) in the OCI format.
They are hosted in a registry, content‑addressed, and carry provenance metadata, which makes them a good fit for reproducible deployments, enterprise controls, and air‑gapped environments.

## Why use OCI‑packaged models

* Portability: Pull the same model to any node that can reach your registry
* Provenance: Standardized annotations for origin, license, and file metadata
* Separation of concerns: Update models independently of runtimes and apps
* Air‑gapped: Mirror/pull once, distribute internally, mount read‑only

## Tags and discovery

* Use content tags like `:gguf` when pulling GGUF model files
* “Image‑as‑volume” variants use `:gguf-image` (for Podman `--mount type=image`)
* Browse tags: [https://registry.ramalama.com/projects/ramalama](https://registry.ramalama.com/projects/ramalama)
* Pull artifacts from: `rlcr.io/ramalama/...`
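
The naming scheme above can be captured in a small helper. This is a minimal sketch: `model_ref` and `REGISTRY` are hypothetical names introduced here for illustration, and the default tag `gguf` is simply the one used in the examples on this page.

```bash
# Hypothetical helper: build a full artifact reference from a model name
# and a tag, following the rlcr.io/ramalama/<name>:<tag> pattern.
REGISTRY="rlcr.io/ramalama"

model_ref() {
  # $1 = model name (e.g. gemma3-270m), $2 = tag (defaults to gguf)
  printf '%s/%s:%s\n' "$REGISTRY" "$1" "${2:-gguf}"
}

# These calls only print reference strings; nothing is pulled.
model_ref gemma3-270m             # raw GGUF artifact
model_ref gemma3-270m gguf-image  # image-as-volume variant
```

If ORAS is installed, `oras repo tags rlcr.io/ramalama/gemma3-270m` should list the tags of a repository from the command line, assuming the registry permits tag listing.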

## Pull models locally

Use a tool like ORAS to download model files to disk, or reference the artifact directly with the RamaLama CLI.

<CodeGroup>
  ```bash title="ORAS (download to ./models)" theme={"system"}
  oras pull rlcr.io/ramalama/gemma3-270m:gguf -o ./models/
  ```

  ```bash title="RamaLama CLI (serve from OCI)" theme={"system"}
  ramalama serve --image rlcr.io/ramalama/llamacpp-cpu-distroless oci://rlcr.io/ramalama/gemma3-270m:gguf
  ```
</CodeGroup>
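
After an ORAS pull, it helps to confirm which model files actually landed on disk, because the runtime needs the exact filename. The following is a minimal sketch, assuming the files were pulled into `./models`; `find_gguf` is a hypothetical helper, not part of ORAS or RamaLama.

```bash
# Hypothetical helper: list every .gguf file under a directory,
# or fail with a message if none are present.
find_gguf() {
  local dir="${1:-./models}" files
  # Errors (e.g. a missing directory) are silenced and treated as "no files".
  files="$(find "$dir" -type f -name '*.gguf' 2>/dev/null | sort)"
  if [ -z "$files" ]; then
    echo "no .gguf files found under $dir" >&2
    return 1
  fi
  printf '%s\n' "$files"
}

# Usage (after the oras pull above):
#   find_gguf ./models
```

The printed path, relative to the mounted directory, is the value to pass to `--model` when running a runtime container.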

<Tip>
  Browse the full catalogue of RamaLama Labs images in the [registry](https://registry.ramalama.com/projects/ramalama).
</Tip>

## Run with a runtime

Mount the local models directory into a runtime container as read‑only and pass the model file's in‑container path to `--model`.

<CodeGroup>
  ```bash title="Docker (CPU runtime)" theme={"system"}
  docker run --rm -p 8080:8080 \
    -v "$PWD/models:/models:ro" \
    rlcr.io/ramalama/llamacpp-cpu-distroless:latest \
    --model /models/gemma-3-270m-it-Q6_K.gguf --host 0.0.0.0 --port 8080
  ```

  ```bash title="Docker (CUDA runtime)" theme={"system"}
  docker run --rm -p 8080:8080 --gpus all \
    -v "$PWD/models:/models:ro" \
    rlcr.io/ramalama/llamacpp-cuda-distroless:latest \
    --model /models/gemma-3-270m-it-Q6_K.gguf --host 0.0.0.0 --port 8080
  ```

  ```bash title="Podman (CPU runtime)" theme={"system"}
  podman run --rm -p 8080:8080 \
    -v "$PWD/models:/models:ro" \
    rlcr.io/ramalama/llamacpp-cpu-distroless:latest \
    --model /models/gemma-3-270m-it-Q6_K.gguf --host 0.0.0.0 --port 8080
  ```
</CodeGroup>
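
Once a container is started, a quick readiness check avoids sending requests before the model has loaded. This sketch assumes the runtime images wrap the llama.cpp server, which exposes a `GET /health` endpoint; that is an inference from the image names, not something this page states. `wait_for_health` is a hypothetical helper.

```bash
# Hypothetical helper: poll a URL until it returns a successful HTTP
# status, or give up after a number of attempts.
# Usage: wait_for_health URL [ATTEMPTS]
wait_for_health() {
  local url="$1" attempts="${2:-30}" i=1
  while [ "$i" -le "$attempts" ]; do
    # -f: fail on HTTP status >= 400; -sS: quiet, but still report errors
    if curl -fsS -o /dev/null --max-time 2 "$url" 2>/dev/null; then
      return 0
    fi
    # Pause between attempts, but not after the last one.
    if [ "$i" -lt "$attempts" ]; then
      sleep 1
    fi
    i=$((i + 1))
  done
  return 1
}

# Usage (assumes the server is published on localhost:8080 as above):
#   wait_for_health http://localhost:8080/health 30 && echo "server is ready"
```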

## Podman: Image‑as‑volume

Avoid a local models directory by mounting the OCI model artifact as a read‑only image volume.

```bash title="Podman" theme={"system"}
# Podman mounts image volumes read-only by default, so no ro/rw option is needed.
podman run --rm -p 8080:8080 \
  --mount type=image,src=rlcr.io/ramalama/gemma3-270m:gguf-image,target=/artifact \
  rlcr.io/ramalama/llamacpp-cpu-distroless:latest \
  --model /artifact/models/<exact-file>.gguf --host 0.0.0.0 --port 8080
```

<Tip>
  Need the exact model filename? Inspect labels/annotations attached to artifacts.
  See the examples in `/pages/deploying/compose` under “Other Notes”.
</Tip>
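
One way to follow the tip above is to print the labels of the image after pulling it. This is a minimal sketch: `show_image_labels` is a hypothetical helper, and because this page does not say which label key carries the filename, it prints all labels as JSON rather than picking one.

```bash
# Hypothetical helper: print the labels of a locally available image as JSON.
# $1 = image reference, $2 = container engine (defaults to podman)
show_image_labels() {
  local image="$1" engine="${2:-podman}"
  "$engine" image inspect --format '{{json .Config.Labels}}' "$image"
}

# Usage (the image must be pulled first):
#   podman pull rlcr.io/ramalama/gemma3-270m:gguf-image
#   show_image_labels rlcr.io/ramalama/gemma3-270m:gguf-image
```

For the raw `:gguf` artifact, `oras manifest fetch rlcr.io/ramalama/gemma3-270m:gguf` prints the manifest, including any annotations attached to its layers.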

## See also

* Runtimes (engines only): `/pages/artifacts/runtime`
* Turnkey model images (runtime + model): `/pages/artifacts/model-image`
