# Chat Completions
Make chat completion requests to LLM providers through ZIRI.
## Endpoint

```
POST /api/chat/completions
```

## Authentication
Include your API key in the `X-API-Key` header:
```bash
curl -H "X-API-Key: ziri-user-123-a1b2c3d4e5f67890" \
  https://your-server.com/api/chat/completions
```

## Request Body
```typescript
{
  provider: string            // Required: "openai", "anthropic", etc.
  model: string               // Required: "gpt-4", "claude-3-opus", etc.
  messages: Array<{           // Required: Conversation messages
    role: string              // "system", "user", "assistant"
    content: string           // Message content
  }>
  max_tokens?: number         // Optional: Maximum tokens in response
  temperature?: number        // Optional: 0-2, controls randomness
  top_p?: number              // Optional: Nucleus sampling
  frequency_penalty?: number  // Optional: -2 to 2
  presence_penalty?: number   // Optional: -2 to 2
  stop?: string[]             // Optional: Stop sequences
  stream?: boolean            // Optional: Stream responses (not yet supported)
  // ... other provider-specific parameters
}
```

## Example Request
```bash
curl -X POST http://localhost:3100/api/chat/completions \
  -H "Content-Type: application/json" \
  -H "X-API-Key: ziri-user-123-a1b2c3d4e5f67890" \
  -d '{
    "provider": "openai",
    "model": "gpt-4o-mini",
    "messages": [
      {"role": "user", "content": "Hello, ZIRI!"}
    ],
    "max_tokens": 100
  }'
```

## Success Response
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1704067200,
  "model": "gpt-4o-mini",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 8,
    "total_tokens": 18
  }
}
```

The response format matches the underlying provider’s format (OpenAI or Anthropic).
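For an OpenAI-style response like the one above, pulling out the assistant reply and token usage looks like this. This is an illustrative sketch, not part of the ZIRI SDK; the interface below only models the fields shown in the example.

```typescript
// Minimal shape of an OpenAI-style chat completion response (illustrative).
interface ChatCompletionResponse {
  id: string;
  model: string;
  choices: {
    index: number;
    message: { role: string; content: string };
    finish_reason: string;
  }[];
  usage: { prompt_tokens: number; completion_tokens: number; total_tokens: number };
}

// Extract the first choice's text and the total token count,
// failing loudly if the provider returned no choices.
function extractReply(res: ChatCompletionResponse): { content: string; totalTokens: number } {
  const first = res.choices[0];
  if (!first) throw new Error("response contained no choices");
  return { content: first.message.content, totalTokens: res.usage.total_tokens };
}
```

Anthropic responses use a different shape, so a real client would branch on the provider before parsing.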
## Rate Limit Headers

Every response includes:

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 95
X-RateLimit-Reset: 1704067200
```

## Error Responses
### Missing API Key

```json
{
  "error": "API key required. Include X-API-Key header.",
  "code": "API_KEY_REQUIRED",
  "requestId": "req-123"
}
```

Status: 400
### Invalid API Key Format

```json
{
  "error": "Invalid API key format",
  "code": "INVALID_API_KEY_FORMAT",
  "requestId": "req-123"
}
```

Status: 401
### API Key Not Found

```json
{
  "error": "API key not found or invalid",
  "code": "API_KEY_INVALID",
  "requestId": "req-123"
}
```

Status: 403
### Missing Required Fields

```json
{
  "error": "provider, model, and messages are required",
  "code": "MISSING_FIELDS",
  "requestId": "req-123"
}
```

Status: 400
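A client can mirror this check locally so malformed requests fail fast, before a network round trip. This is a hypothetical client-side guard, not part of the ZIRI SDK; it only checks presence of the three required fields.

```typescript
// Return the names of required request fields that are absent.
// An empty array means the body passes the server's MISSING_FIELDS check.
function missingFields(body: Record<string, unknown>): string[] {
  const required = ["provider", "model", "messages"];
  return required.filter((field) => body[field] === undefined || body[field] === null);
}
```

Call it before sending and surface the result to the user instead of waiting for the 400.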
### Rate Limit Exceeded

```json
{
  "error": "Rate limit exceeded",
  "code": "RATE_LIMIT_EXCEEDED",
  "requestId": "req-123",
  "retryAfter": 15,
  "resetAt": "2025-01-01T12:00:00Z"
}
```

Status: 429

The response also includes a `Retry-After` header.
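One reasonable way for a client to handle 429s is to wait for the server-supplied `Retry-After` value (or the `retryAfter` body field) when it is present, and fall back to exponential backoff otherwise. The base delay and cap below are illustrative choices, not part of the API; checking `X-RateLimit-Remaining` proactively can avoid hitting the limit at all.

```typescript
// Compute how long to wait before retrying a rate-limited request.
// Prefers the server's Retry-After hint; otherwise backs off exponentially,
// capped at 30 seconds (both constants are arbitrary, tune to taste).
function retryDelayMs(attempt: number, retryAfterSeconds?: number): number {
  if (retryAfterSeconds !== undefined && retryAfterSeconds > 0) {
    return retryAfterSeconds * 1000;
  }
  const baseMs = 500;
  return Math.min(baseMs * 2 ** attempt, 30_000);
}
```

The caller sleeps for `retryDelayMs(attempt, retryAfter)` milliseconds, then re-sends the request.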
### Queue Full

```json
{
  "error": "Server busy - queue full",
  "code": "QUEUE_FULL",
  "requestId": "req-123"
}
```

Status: 503
### Authorization Denied

```json
{
  "error": "Authorization denied: Policy policy1 denied the request",
  "code": "AUTHORIZATION_DENIED",
  "reason": "Policy policy1 denied the request",
  "requestId": "req-123"
}
```

Status: 403

Check your policies if you see this.
### Provider Not Found

```json
{
  "error": "Provider not found",
  "code": "PROVIDER_NOT_FOUND",
  "requestId": "req-123"
}
```

Status: 404

Add the provider in the admin UI first.
### Provider Key Missing

```json
{
  "error": "Provider API key not configured",
  "code": "PROVIDER_KEY_MISSING",
  "requestId": "req-123"
}
```

Status: 500

Configure the provider’s API key in the admin UI.
### LLM Provider Error

```json
{
  "error": "OpenAI API error: Rate limit exceeded",
  "code": "LLM_PROVIDER_ERROR",
  "requestId": "req-123"
}
```

Status: 500

The LLM provider returned an error. Check the error message for details.
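The error codes above fall into a few natural groups: transient failures worth retrying, malformed requests the caller must fix, and setup problems (keys, providers, policies) that need attention in the admin UI. The dispatch below is one reasonable client-side policy, not something prescribed by ZIRI.

```typescript
type ErrorAction = "retry" | "fix_request" | "fix_config";

// Map a ZIRI error code to a coarse recovery strategy.
// The groupings here are an illustrative policy choice.
function classifyError(code: string): ErrorAction {
  switch (code) {
    case "RATE_LIMIT_EXCEEDED":
    case "QUEUE_FULL":
    case "LLM_PROVIDER_ERROR":
      return "retry"; // transient: back off and re-send
    case "MISSING_FIELDS":
    case "INVALID_API_KEY_FORMAT":
      return "fix_request"; // the request itself is malformed
    default:
      return "fix_config"; // key, provider, or policy setup problem
  }
}
```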
## Using the SDK

```typescript
import { UserSDK } from "@ziri/sdk";

const sdk = new UserSDK({
  apiKey: "ziri-user-123-a1b2c3d4e5f67890",
  proxyUrl: "http://localhost:3100",
});

const response = await sdk.chatCompletions({
  provider: "openai",
  model: "gpt-4o-mini",
  messages: [{ role: "user", content: "Hello, ZIRI!" }],
  max_tokens: 100,
});

console.log(response.choices[0].message.content);
```

## Request Flow
1. API key validated
2. Model capability checked
3. Cost estimated
4. Spend reserved
5. Rate limit checked
6. Queue slot acquired
7. Policies evaluated
8. Request forwarded to provider
9. Cost tracked
10. Response returned
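The flow above can be sketched as a pipeline of checks, where each stage either passes the request along or short-circuits with an error code. The stage and type names here are illustrative; the real pipeline lives server-side in ZIRI.

```typescript
// A stage returns null to pass the request on, or an error code to stop.
type Stage = (req: { model: string }) => string | null;

// Run stages in order; the first failure wins and later stages never run.
// A null result means every check passed and the request would be forwarded.
function runPipeline(req: { model: string }, stages: Stage[]): string | null {
  for (const stage of stages) {
    const err = stage(req);
    if (err) return err;
  }
  return null;
}
```

This ordering matters: for example, spend is reserved before the rate limit is checked, so a 429 does not mean the provider was ever contacted.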
## Common Issues

- “Authorization denied” - Check that you have a policy allowing completions. Default policies deny everything.
- “Rate limit exceeded” - You’ve hit your rate limit. Wait for the reset time or increase your limit.
- “Provider not found” - Add the provider in the admin UI first.
- “Model does not support completion” - The model you’re using doesn’t support chat completions. Switch to a chat-capable model; embedding and image models are served by their own endpoints.
## Next Steps

- See Embeddings for text embeddings
- See Images for image generation
- See Policy Examples for sample authorization policies