LiteLLM is an open-source Python SDK and AI gateway that provides a unified, OpenAI-compatible interface to 100+ LLM providers — OpenAI, Anthropic, Azure, Google Vertex AI, Bedrock, Cohere, and more. With 20,000+ GitHub stars, 3.5M monthly downloads, and production deployments at Netflix and Rocket Money, it has become the de facto standard for multi-provider LLM routing and cost governance.
## Why It's in Trial
LiteLLM solves a real problem: every LLM provider has a different API format, authentication scheme, and error model. Without an abstraction layer, switching providers or running A/B tests across models requires significant refactoring. LiteLLM eliminates that with one consistent API.
It belongs in Trial rather than Adopt for two reasons:
- Recent supply chain incident (March 2026): PyPI packages v1.82.7 and v1.82.8 were identified as potentially compromised and removed. The project paused releases pending a security review. Pin to v1.82.6 or earlier until verified clean versions are released.
- Python-only: No official Java or other JVM SDK. Java teams must go through the proxy server (see the Java radar entry).
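Until the security review concludes, the safest path is an explicit pin in your dependency file; a minimal sketch for pip, using the last-known-good version named above:

```
# requirements.txt — pin to the last release before the affected versions
litellm==1.82.6
```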
Outside of those caveats, the tool is production-grade: 8ms P95 latency at 1,000 req/s, MIT licensed, actively maintained with weekly stable releases.
## Two Deployment Modes
SDK mode — embed directly in your Python application:
```python
import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What is agentic engineering?"}],
)
# Switch to GPT-5 by changing the model string — zero other changes
```
Proxy mode — deploy as a standalone AI gateway:
```yaml
# config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
```

```shell
litellm --config config.yaml --port 8000
# Now any OpenAI-compatible client points at http://localhost:8000
```
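Because the proxy speaks the OpenAI wire format, any HTTP client can call it. A minimal stdlib sketch that builds such a request (the `claude-sonnet` alias comes from the config.yaml above; the bearer token is illustrative):

```python
import json
import urllib.request

# OpenAI-style chat completion request, targeting a model alias
# defined under model_list in the proxy's config.yaml.
payload = {
    "model": "claude-sonnet",
    "messages": [{"role": "user", "content": "What is agentic engineering?"}],
}

req = urllib.request.Request(
    "http://localhost:8000/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer sk-anything",  # proxy virtual key, if configured
    },
)

# With the proxy running locally:
# response = json.load(urllib.request.urlopen(req))
# print(response["choices"][0]["message"]["content"])
```

The client never needs to know which upstream provider serves the alias; swapping `claude-sonnet` for `gpt-4o` in the payload is the only change.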
## Key Capabilities
| Capability | Details |
|---|---|
| Unified API | 100+ providers via OpenAI format; swap models with a string change |
| Model routing | Strategies: simple-shuffle, least-busy, usage-based, latency-based |
| Fallbacks | Automatic retry with fallback models on provider errors |
| Cost tracking | Per-key, per-team, per-user spend across all providers |
| Rate limiting | Per-key/team/user RPM and TPM limits |
| Caching | In-memory, Redis, and semantic caching |
| Guardrails | Pre/post call hooks for content filtering |
| Load balancing | Round-robin and weighted routing across multiple deployments |
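In proxy mode, routing strategy and fallbacks from the table above are declared in config.yaml. A hedged sketch — the key names follow LiteLLM's `router_settings`/`litellm_settings` conventions, but verify them against the current docs before relying on this:

```yaml
router_settings:
  # one of: simple-shuffle, least-busy, usage-based-routing, latency-based-routing
  routing_strategy: latency-based-routing

litellm_settings:
  fallbacks:
    - gpt-4o: ["claude-sonnet"]   # retry on claude-sonnet if gpt-4o errors
```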
## Agentic Engineering Integration
LiteLLM has first-class support for the protocols this radar tracks:
MCP (Model Context Protocol): LiteLLM proxy can act as an MCP server, exposing tool calling to connected clients. It also translates between OpenAI tool format and MCP tool format automatically.
A2A (Agent-to-Agent Protocol): Native A2A endpoint support — invoke agents via the a2a/ model prefix through the LiteLLM gateway. Enables cross-vendor agent coordination without direct SDK dependencies.
```python
# Route agent-to-agent calls through LiteLLM
response = litellm.completion(
    model="a2a/my-specialist-agent",
    messages=[{"role": "user", "content": "Analyze this PR diff"}],
)
```
## Observability
| Integration | What you get |
|---|---|
| Langfuse | Full trace visibility per LLM call, cost attribution |
| OpenTelemetry | Spans for each model interaction; works with Jaeger, Grafana Tempo |
| Prometheus | Latency, error rate, token usage metrics |
| LiteLLM Admin UI | Built-in dashboard for spend, usage, and key management |
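Enabling these integrations in proxy mode is a callback setting in config.yaml (credentials, e.g. Langfuse keys, are supplied via environment variables — a sketch, not a full setup):

```yaml
litellm_settings:
  success_callback: ["langfuse", "prometheus"]   # emit traces and metrics per call
  failure_callback: ["langfuse"]                 # log errored calls too
```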
## Provider Coverage
| Tier | Providers |
|---|---|
| Frontier | OpenAI, Anthropic, Google Gemini, xAI Grok |
| Cloud | Azure OpenAI, Amazon Bedrock, Google Vertex AI |
| Open weights | Ollama, vLLM, Together AI, Replicate, Groq |
| Specialty | Cohere, Mistral, Perplexity, AI21, Fireworks AI |
## Key Characteristics
| Property | Value |
|---|---|
| License | MIT (free); Enterprise from $250/month |
| Language | Python (proxy is language-agnostic via REST) |
| Latency | 8ms P95 at 1k req/s |
| GitHub stars | 20,000+ |
| Monthly downloads | 3.5M |
| Security note | Pin to ≤v1.82.6; supply chain review ongoing (March 2026) |
| MCP support | Yes — proxy as MCP server + tool format translation |
| A2A support | Yes — native a2a/ model prefix routing |
## Further Reading
- LiteLLM Documentation — proxy setup, routing, cost tracking
- LiteLLM GitHub — source, issues, release notes
- Provider list — full list of supported models and providers
- Security update (March 2026) — supply chain incident details and safe version guidance
- LiteLLM Proxy (Java) — using LiteLLM from Java via the proxy