The RamaLama CLI is open-source and open to contributors.
Check the project out at https://github.com/containers/ramalama
Installation
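One minimal sketch of an install route, assuming a Python environment with pip available (the project also documents other install methods in its README):

```shell
# Install the RamaLama CLI from PyPI (assumes Python 3 and pip)
pip install ramalama

# Verify the installation
ramalama --version
```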
Functionality
The CLI includes a variety of useful functions, including:
- Local serving of and interaction with AI models
- Packaging containerized AI deployments
- Building optimized deployments for RAG workloads
- and more
Serve a REST API
RamaLama makes it easy to work with AI on your laptop. You can deploy an OpenAI-compatible API with a single command.
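As a sketch of that single-command workflow (the model name and port are illustrative assumptions, not from the original text):

```shell
# Serve a model locally; RamaLama pulls it first if it is not already present.
# "llama3" is an illustrative model name.
ramalama serve llama3

# In another terminal, query the OpenAI-compatible chat endpoint
# (the port shown is an assumption; check `ramalama serve --help` for defaults).
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3", "messages": [{"role": "user", "content": "Hello"}]}'
```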
Model Repositories
RamaLama can serve models from all of the major model providers, including RamaLama Labs, Hugging Face, Ollama, and ModelScope. It also supports generic OCI model artifacts, so you can run and serve models from your own or your enterprise's model registry. For example, you can serve an OCI-compatible artifact straight from Docker's model hub.
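A sketch of serving an OCI model artifact by registry reference (the registry path and model name below are illustrative assumptions):

```shell
# Serve a model artifact directly from an OCI registry via the oci:// transport.
# The docker.io path shown is illustrative, not a guaranteed artifact.
ramalama serve oci://docker.io/ai/smollm2
```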
Hardware acceleration
RamaLama inspects your system and chooses a matching runtime image (e.g., CUDA, ROCm, Intel GPU, or CPU-only). However, you can override the default explicitly with the --image option.
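A sketch of overriding the auto-detected runtime image (the image reference and model name are illustrative; check the project docs for the published image names):

```shell
# Force a specific runtime image instead of the auto-detected one.
# Both the image reference and the model name here are illustrative.
ramalama --image quay.io/ramalama/cuda run llama3
```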
Next steps
Deploy to Production
Learn how to deploy with Docker Compose or Kubernetes
Explore on GitHub
Browse the full documentation, examples, and man pages

