Welcome to RamaLama SDKs. RamaLama SDKs provide local-first AI capabilities for applications that run on any device with a container manager. The SDKs build on the RamaLama CLI to provision and run models on device.

What is RamaLama?

RamaLama is an open-source tool for running AI models in containers. With the SDKs, you can integrate local inference into your apps while keeping data on device and minimizing latency. Once models are downloaded, inference can run fully offline.

Core AI Capabilities

Every RamaLama SDK provides access to these core AI features:

LLM (Large Language Model)

On-device chat completions served through an OpenAI-compatible HTTP endpoint, so existing OpenAI-style clients and plain HTTP requests work without changes.
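As a minimal sketch of what "OpenAI-compatible" means in practice, the request below posts a chat completion to a locally served model using only the Python standard library. The base URL and model name are assumptions for illustration; use the host, port, and model from your own `ramalama serve` invocation.

```python
import json
import urllib.request

# Assumed local endpoint; the actual address comes from `ramalama serve`.
BASE_URL = "http://localhost:8080"


def build_chat_request(prompt, model="llama3"):
    """Build the JSON body for an OpenAI-compatible chat completion call."""
    return {
        "model": model,  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt, model="llama3"):
    """Send the prompt to the local server and return the reply text."""
    body = json.dumps(build_chat_request(prompt, model)).encode()
    req = urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["choices"][0]["message"]["content"]
```

Because the endpoint follows the OpenAI wire format, official OpenAI client libraries can also be pointed at `BASE_URL` instead of hand-rolling requests like this.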

STT (Speech-to-Text)

Local transcription with Whisper models running on device.
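If the local Whisper server also exposes an OpenAI-compatible transcription endpoint, uploading audio looks like the sketch below. The URL path, form field names, and model name are all assumptions; check what your server actually serves before relying on them.

```python
import io
import json
import urllib.request
import uuid

# Assumed local endpoint; adjust to your server's address.
BASE_URL = "http://localhost:8080"


def build_multipart(filename, audio_bytes, model="whisper-base"):
    """Encode a file field and a model field as multipart/form-data."""
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        (
            f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
            "Content-Type: audio/wav\r\n\r\n"
        ).encode()
    )
    buf.write(audio_bytes)
    buf.write(f"\r\n--{boundary}\r\n".encode())
    buf.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    buf.write(model.encode())
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return boundary, buf.getvalue()


def transcribe(path):
    """Post a WAV file to the assumed /v1/audio/transcriptions endpoint."""
    with open(path, "rb") as f:
        boundary, body = build_multipart(path, f.read())
    req = urllib.request.Request(
        f"{BASE_URL}/v1/audio/transcriptions",
        data=body,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["text"]
```

Since the model runs on device, the audio never leaves the machine; the HTTP hop is local only.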

Why RamaLama?

  • Privacy by design
  • Low latency
  • Offline capable
  • Container-native model provisioning

Supported SDKs

Platform     Status               Installation               Documentation
Python       Active development   pip install ramalama-sdk   /sdk/python/introduction
TypeScript   Planned              Coming soon                /sdk/typescript
Go           Planned              Coming soon                /sdk/go
Rust         Planned              Coming soon                /sdk/rust

Get Started

  1. Choose your SDK from the list above.
  2. Install the SDK for your platform.
  3. Initialize and build with the quick start guide.