Route requests to OpenAI, Anthropic, Gemini, Azure, Bedrock, or Ollama. Sub-millisecond overhead. Single binary. No runtime.
Install:

```shell
cargo install crabllm crabctl
```

"Same OpenAI/Anthropic format, any provider."
Six capabilities the gateway gives you out of the box. Each is a single sentence; read them in any order.
- **Provider translation.** Send OpenAI format; CrabLLM translates to Anthropic, Gemini, Bedrock, and Azure automatically.
- **Load balancing and failover.** Weighted random selection across providers, exponential backoff retry, automatic failover.
- **Streaming.** SSE proxied without buffering, with per-chunk extension hooks and keep-alive pings.
- **Virtual keys.** Per-key model access control, rate limiting, usage tracking, and budget enforcement.
- **Caching and rate limits.** SHA-256 response cache; per-key RPM and TPM limits with sliding-window enforcement.
- **Budgets.** Per-key spend limits in USD, with cost tracked automatically from token usage and the pricing config.
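CrabLLM's own routing code isn't shown here, but the load-balancing behavior described above can be sketched in a few lines. This is an illustrative Python sketch, not the gateway's implementation; the function names and the example providers/weights are made up:

```python
import random
import time

def pick_provider(providers):
    """Weighted random selection over (name, weight) pairs."""
    total = sum(weight for _, weight in providers)
    r = random.uniform(0, total)
    for name, weight in providers:
        r -= weight
        if r <= 0:
            return name
    return providers[-1][0]  # guard against float rounding at the boundary

def call_with_failover(providers, send, max_attempts=3, base_delay=0.1):
    """Retry with exponential backoff, re-picking a provider each attempt."""
    last_err = None
    for attempt in range(max_attempts):
        provider = pick_provider(providers)
        try:
            return send(provider)
        except Exception as err:
            last_err = err
            if attempt < max_attempts - 1:
                # backoff doubles each attempt: base, 2*base, 4*base, ...
                time.sleep(base_delay * (2 ** attempt))
    raise last_err
```

Because a fresh weighted pick happens on every attempt, a failing provider is naturally routed around without any extra bookkeeping.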
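The caching and rate-limit mechanics above are standard techniques. As a sketch only (assuming nothing about CrabLLM's internals beyond what the list states), a SHA-256 cache key over a canonicalized request body and a per-key sliding-window limiter look like this:

```python
import hashlib
import json
import time
from collections import deque

def cache_key(request_body):
    """Deterministic SHA-256 key over a canonicalized JSON request body."""
    canonical = json.dumps(request_body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

class SlidingWindowLimiter:
    """Allow at most `limit` events per `window` seconds, per key."""

    def __init__(self, limit, window=60.0):
        self.limit = limit
        self.window = window
        self.events = {}  # key -> deque of event timestamps

    def allow(self, key, now=None):
        now = time.monotonic() if now is None else now
        q = self.events.setdefault(key, deque())
        # drop events that have aged out of the window
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False
        q.append(now)
        return True
```

Canonicalizing with sorted keys means two requests that differ only in JSON field order hash to the same cache entry.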
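Budget enforcement reduces to simple arithmetic: multiply token counts by per-token prices and compare the running total to a limit. A minimal sketch (the class, field names, and prices are illustrative assumptions, not CrabLLM's pricing config):

```python
class BudgetTracker:
    """Track per-key USD spend from token usage and a pricing table."""

    def __init__(self, pricing, limits):
        self.pricing = pricing  # model -> (USD per input token, USD per output token)
        self.limits = limits    # key -> spend limit in USD
        self.spent = {}         # key -> USD spent so far

    def record(self, key, model, prompt_tokens, completion_tokens):
        """Compute the request's cost and add it to the key's running total."""
        in_price, out_price = self.pricing[model]
        cost = prompt_tokens * in_price + completion_tokens * out_price
        self.spent[key] = self.spent.get(key, 0.0) + cost
        return cost

    def over_budget(self, key):
        return self.spent.get(key, 0.0) >= self.limits.get(key, float("inf"))
```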
Gateway overhead at 5,000 concurrent requests per second. All gateways ran with identical resource limits (2 CPUs, 512 MB) against a mock backend returning instant responses.
| Gateway | P50 latency | P99 latency | Peak RSS |
|---|---|---|---|
| CrabLLM | 0.26 ms | 0.54 ms | 34.9 MB |
| Bifrost | 0.61 ms | 1.26 ms | 171.7 MB |
| LiteLLM | 159 ms | 227 ms | 541.8 MB |
A minimal `crabllm.toml`:

```toml
listen = "0.0.0.0:8080"

[providers.openai]
kind = "openai"
api_key = "${OPENAI_API_KEY}"
models = ["gpt-4o"]

[providers.anthropic]
kind = "anthropic"
api_key = "${ANTHROPIC_API_KEY}"
models = ["claude-sonnet-4-20250514"]
```

Start the gateway:

```shell
crabllm --config crabllm.toml
```

Then send an OpenAI-format request:

```shell
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "claude-sonnet-4-20250514",
       "messages": [{"role": "user", "content": "Hello!"}]}'
```

Same OpenAI format, any provider. CrabLLM translates automatically.