Using RamaLama Labs container artifacts on your local machine.
Our containerized AI artifacts are OCI-compatible, so you can use them directly with Docker, Podman, and Kubernetes wherever you need them: in the cloud, a datacenter, or your basement.
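Because the artifacts are plain OCI images, the same image reference you would pass to `podman run` or `docker run` drops straight into a Kubernetes manifest. A minimal sketch (the image reference and port below are assumptions; substitute the image you chose from the catalogue and the port it documents):

```yaml
# Minimal Pod sketch for serving a RamaLama Labs image on Kubernetes.
apiVersion: v1
kind: Pod
metadata:
  name: ramalama-server
spec:
  containers:
    - name: server
      image: quay.io/ramalama/llama.cpp:latest   # assumed image reference
      ports:
        - containerPort: 8080                    # assumed serving port
```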
Our artifacts are regularly rebuilt, updated, and scanned for vulnerabilities to provide the smallest, fastest, and most secure runtime possible.
You can find comparisons between different images on each image's comparisons page
(e.g. llama.cpp's CUDA and CPU runtimes).
You can find the full catalogue of RamaLama Labs images here.
Get chatting
The endpoint is OpenAI‑compatible. Try a quick chat request:
ramalama chat "Say hello in one sentence"
Hello!
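Because the endpoint speaks the OpenAI API, any OpenAI-compatible client can talk to it as well. A sketch with `curl` (the `localhost:8080` address and the `"default"` model name are assumptions; point the request at wherever your container is serving and use the model name it reports):

```shell
# Write the OpenAI-style chat request body to a file so it can be reused.
cat > chat_request.json <<'EOF'
{
  "model": "default",
  "messages": [
    {"role": "user", "content": "Say hello in one sentence"}
  ]
}
EOF

# POST it to the OpenAI-compatible chat completions route.
# localhost:8080 is an assumption -- use your server's address and port.
curl -s -H "Content-Type: application/json" \
     -d @chat_request.json \
     http://localhost:8080/v1/chat/completions
```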
Many of our images come bundled with a web server GUI. If you'd prefer to chat with the agent in a browser, open the root URL where
the agent is being served (e.g. http://localhost:8080).