
## What is RamaLama?

RamaLama is an open-source tool for running AI models locally in containers. With its SDKs, you can integrate local inference into your apps while keeping data on device and minimizing latency. Once models are downloaded, inference can run fully offline.

## Core AI Capabilities

Every RamaLama SDK provides access to these core AI features:

### LLM (Large Language Model)
On-device chat with an OpenAI-compatible HTTP endpoint for direct requests.
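As a sketch of what "OpenAI-compatible" means in practice, the snippet below sends one chat turn to a locally served model over the standard `/v1/chat/completions` route, using only the Python standard library. The base URL, port, and model name are assumptions here; adjust them to match however your model is actually being served.

```python
import json
import urllib.request

# Assumed address of the local OpenAI-compatible server; adjust to your setup.
BASE_URL = "http://localhost:8080/v1"


def build_chat_request(prompt: str, model: str = "default") -> dict:
    """Build an OpenAI-style chat-completions payload for a single user turn."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }


def chat(prompt: str) -> str:
    """POST one chat turn to the local endpoint and return the reply text."""
    payload = build_chat_request(prompt)
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers put the reply at choices[0].message.content.
    return body["choices"][0]["message"]["content"]


# Usage (requires a model being served locally):
#   print(chat("Summarize RamaLama in one line."))
```

Because the endpoint follows the OpenAI wire format, any existing OpenAI client library should also work by pointing its base URL at the local server.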
### STT (Speech-to-Text)

Local transcription with Whisper models running on device.
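For the STT side, a minimal sketch, assuming the local server also exposes an OpenAI-style `/v1/audio/transcriptions` route (an assumption; check your server's documentation for the actual path). It assembles the multipart upload with the standard library only:

```python
import io
import json
import urllib.request
import uuid


def build_multipart(file_name: str, audio: bytes, model: str = "whisper") -> tuple[bytes, str]:
    """Assemble a multipart/form-data body with `model` and `file` fields."""
    boundary = uuid.uuid4().hex
    body = io.BytesIO()
    # Plain text field for the model name.
    body.write(
        f'--{boundary}\r\nContent-Disposition: form-data; name="model"\r\n\r\n{model}\r\n'.encode()
    )
    # Binary field carrying the audio file.
    body.write(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{file_name}"\r\nContent-Type: application/octet-stream\r\n\r\n'.encode()
    )
    body.write(audio)
    body.write(f"\r\n--{boundary}--\r\n".encode())
    return body.getvalue(), f"multipart/form-data; boundary={boundary}"


def transcribe(path: str, base_url: str = "http://localhost:8080/v1") -> str:
    """Upload an audio file to the assumed transcription route and return the text."""
    with open(path, "rb") as f:
        data, content_type = build_multipart(path, f.read())
    req = urllib.request.Request(
        f"{base_url}/audio/transcriptions",
        data=data,
        headers={"Content-Type": content_type},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["text"]


# Usage (requires a Whisper model being served locally):
#   print(transcribe("meeting.wav"))
```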
## Why RamaLama?

- Privacy by design: prompts and audio stay on device.
- Low latency: no network round trip for inference.
- Offline capable: once models are downloaded, no connection is required.
- Container-native model provisioning: models are pulled and served in containers.
## Supported SDKs
| Platform | Status | Installation | Documentation |
|---|---|---|---|
| Python | Active development | `pip install ramalama-sdk` | /sdk/python/introduction |
| TypeScript | Planned | Coming soon | /sdk/typescript |
| Go | Planned | Coming soon | /sdk/go |
| Rust | Planned | Coming soon | /sdk/rust |
## Get Started

1. Choose your SDK from the list above.
2. Install the SDK for your platform.
3. Initialize and build with the quick start guide.

