API
Errors, limits, migration
Errors come back in the OpenAI shape, so existing error handling keeps working:
{
  "error": {
    "message": "Human-readable explanation",
    "type": "rate_limit_error",
    "code": "daily_limit_reached"
  }
}
type is the category, code the specific condition. Match on code programmatically.
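As a minimal sketch of matching on code, the snippet below parses the sample body shown above with the standard library and branches on the specific condition first, then on the category. The helper name `parse_error` and the action labels are illustrative assumptions, not part of the API.

```python
import json


def parse_error(body: str) -> tuple:
    """Return (type, code, message) from an OpenAI-shaped error body."""
    err = json.loads(body).get("error") or {}
    # Missing or null fields become empty strings so comparisons stay simple.
    return err.get("type") or "", err.get("code") or "", err.get("message") or ""


# The sample error body from this page.
body = (
    '{"error": {"message": "Human-readable explanation", '
    '"type": "rate_limit_error", "code": "daily_limit_reached"}}'
)
err_type, err_code, message = parse_error(body)

# Branch on the specific code first, fall back to the broader type.
if err_code == "daily_limit_reached":
    action = "wait_for_reset"   # hypothetical label for this example
elif err_type == "rate_limit_error":
    action = "back_off"         # hypothetical label for this example
else:
    action = "fail"             # hypothetical label for this example
```

Checking the code before the type keeps handling precise when several conditions share one category.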
Status codes
| Status | Type | Typical codes |
|---|---|---|
| 400 | invalid_request_error | Malformed request; output_exceeds_limit when max_tokens is above the model's cap |
| 401 | authentication_error | invalid_api_key, api_key_disabled |
| 402 | billing_error | insufficient_credits, subscription_credits_exhausted |
| 403 | permission_error | model_not_allowed (model needs a higher plan), member_budget_exceeded (team budget cap), member_suspended, account_locked |
| 404 | invalid_request_error | model_not_found, model_disabled |
| 413 | request_too_large | input_exceeds_context (prompt larger than the model's context window) |
| 429 | rate_limit_error | per_minute_limit_reached, hourly_limit_reached, daily_limit_reached, monthly_limit_reached, api_key_daily_limit_reached, api_key_monthly_limit_reached |
| 5xx | api_error | Upstream provider failure |
Limits
Three independent layers can say no:
- Credit: a request that would cost more than your remaining balance is rejected up front with 402, before any tokens are spent. Top up or wait for the monthly plan refill. See Credit and invoices.
- Rate limits: burst protection per minute plus hourly, daily, and monthly request ceilings. The error message states when the window resets. On 429, back off and retry after the reset; with the official SDKs, automatic retries handle this.
- Per-key caps: each API key can carry its own daily and monthly spend limit, set by you in API keys. You are warned at 75 percent; at 100 percent the key returns 429 (api_key_daily_limit_reached or api_key_monthly_limit_reached) until the day or month rolls over or you raise the cap.
Members of a team can additionally hit their per-member monthly budget (403, member_budget_exceeded).
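To tell which layer rejected a request, one option is to map the status and code onto the layers described above. This is a sketch that uses only the statuses and codes documented on this page; the function name `limit_layer` and the returned labels are assumptions made for illustration.

```python
# Codes taken from the status table on this page.
RATE_LIMIT_CODES = {
    "per_minute_limit_reached",
    "hourly_limit_reached",
    "daily_limit_reached",
    "monthly_limit_reached",
}
KEY_CAP_CODES = {
    "api_key_daily_limit_reached",
    "api_key_monthly_limit_reached",
}


def limit_layer(status: int, code: str) -> str:
    """Classify a rejection by the limit layer that produced it."""
    if status == 402:
        return "credit"          # top up or wait for the plan refill
    if status == 429 and code in KEY_CAP_CODES:
        return "key_cap"         # wait for rollover or raise the cap
    if status == 429 and code in RATE_LIMIT_CODES:
        return "rate_limit"      # back off until the window resets
    if status == 403 and code == "member_budget_exceeded":
        return "member_budget"   # per-member monthly team budget
    return "other"
```

Separating per-key caps from the other 429 codes matters because a capped key stays blocked until the day or month rolls over, so a short backoff will not help.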
Retry guidance
Retry with exponential backoff on 429 and 5xx. Do not retry other 4xx requests unchanged: they will fail identically. 402 is fixed by credit, not retries.
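For clients that do not use an SDK with built-in retries, the policy can be sketched as below. The `send` callable, the default delays, and the 10 percent jitter are assumptions chosen for the example; the sketch also does not read the reset time from the error message, since its format is not specified here.

```python
import random
import time


def should_retry(status: int) -> bool:
    """Only 429 and 5xx are worth retrying; other 4xx fail identically."""
    return status == 429 or 500 <= status <= 599


def call_with_retries(send, max_retries=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call send() until it succeeds, fails permanently, or retries run out.

    send is a hypothetical callable returning (status, body).
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if not should_retry(status) or attempt == max_retries:
            return status, body
        # Exponential backoff: base, 2*base, 4*base, ... capped at `cap`.
        delay = min(cap, base * 2 ** attempt)
        # A little random jitter keeps many clients from retrying in lockstep.
        sleep(delay + random.uniform(0, delay * 0.1))
```

Passing `sleep` as a parameter makes the function easy to test without real waiting. A 402 returns immediately, matching the guidance that credit, not retries, resolves it.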
Migrating from OpenAI
Two lines:
from openai import OpenAI

client = OpenAI(
    base_url="https://api.kral.ai/v1",  # was: api.openai.com/v1
    api_key="sk-kral-...",              # was: sk-...
)
What changes beyond that:
- Model names: every provider's models are available; GET /v1/models lists what your plan includes. OpenAI model names work unchanged.
- Billing: usage draws from your kral credit, shared with the chat. No separate provider invoices.
- Errors: same shape, plus the additional billing_error family above.
Coming from the Anthropic or Google SDKs instead? Point them at the native endpoints and keep your request code as-is.