The `chat()` method sends a chat completion request to a running model server and returns a `ChatMessage` payload. It is a simple API for quick prompts when you do not need to call the HTTP endpoint directly.
## Context Manager
The SDK provides both a synchronous and asynchronous model API for chat.
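A minimal sketch of both styles; the `your_sdk` module and the `Model` and `AsyncModel` names are illustrative assumptions, since only `chat()` itself is documented on this page:

```python
import asyncio

from your_sdk import Model, AsyncModel  # hypothetical names, not the SDK's confirmed API

# Synchronous: entering the context starts the server,
# leaving it shuts the server down.
with Model("example-model") as model:
    reply = model.chat("Hello!")
    print(reply["content"])


# Asynchronous: the same lifecycle, awaited.
async def main() -> None:
    async with AsyncModel("example-model") as model:
        reply = await model.chat("Hello!")
        print(reply["content"])


asyncio.run(main())
```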
### Before you call chat()

The server must be running. If you are not using a context manager, you'll need to manage the model's lifecycle yourself:
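A sketch of manual lifecycle management; `start()` and `stop()` are hypothetical stand-ins for whatever lifecycle methods the SDK actually exposes:

```python
from your_sdk import Model  # hypothetical name

model = Model("example-model")
model.start()  # hypothetical: bring the server up before calling chat()
try:
    reply = model.chat("Hello!")
    print(reply["content"])
finally:
    model.stop()  # hypothetical: always shut the server down, even on errors
```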
## Method signature
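Reconstructed from the parameter and return tables below; treat this as a sketch rather than the verbatim source:

```python
def chat(
    self,
    message: str,
    history: list[ChatMessage] | None = None,
) -> ChatMessage:
    """Send message to the running server and return the assistant reply."""
    ...
```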
### Parameters

| Parameter | Type | Description | Default |
|---|---|---|---|
| message | str | User prompt content. | required |
| history | list[ChatMessage] or None | Optional prior conversation messages. | None |
### Returns
A `ChatMessage` typed dict with the assistant response.
| Field | Type | Description |
|---|---|---|
| role | Literal['system', 'user', 'assistant', 'developer'] | Message author role. |
| content | str | Message text content. |
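As a TypedDict, reconstructed from the field table above:

```python
from typing import Literal, TypedDict


class ChatMessage(TypedDict):
    role: Literal["system", "user", "assistant", "developer"]
    content: str
```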
### Raises
`RuntimeError` if the server is not running.
## Examples
### Simple Q&A
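A sketch of a one-shot prompt, reusing the hypothetical `Model` context manager from above:

```python
with Model("example-model") as model:  # hypothetical class, as above
    answer = model.chat("What is the capital of France?")
    print(answer["role"])     # "assistant"
    print(answer["content"])  # the model's reply text
```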
### Conversation flow
The SDK does not store conversation history automatically. Pass it via the `history` parameter yourself:
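A sketch of a multi-turn exchange; it assumes `history` should contain both user and assistant turns, which matches the `ChatMessage` roles above:

```python
with Model("example-model") as model:  # hypothetical class, as above
    history: list[ChatMessage] = []

    first = model.chat("My name is Ada.", history=history)
    # Record both sides of the turn yourself; the SDK does not.
    history.append({"role": "user", "content": "My name is Ada."})
    history.append(first)

    second = model.chat("What is my name?", history=history)
    print(second["content"])  # should mention "Ada"
```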
### Error handling
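`chat()` raises `RuntimeError` when the server is not running, so a plain try/except is enough; `Model` is again a hypothetical stand-in:

```python
model = Model("example-model")  # hypothetical; note the server is never started

try:
    model.chat("Hello!")
except RuntimeError as exc:
    # Documented behavior: chat() raises RuntimeError if the server is not running.
    print(f"Server not running: {exc}")
```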
## When to use chat() vs direct HTTP

| Use case | Recommended approach |
|---|---|
| Quick responses | chat() |
| Custom payloads or full OpenAI schema control | Direct HTTP to /chat/completions |
| Interoperability with existing OpenAI clients | Direct HTTP to /chat/completions |
