LiteLLM is an open-source Python SDK and AI gateway that provides a unified, OpenAI-compatible interface to 100+ LLM providers, including OpenAI, Anthropic, Azure, Google Vertex AI, Bedrock, and Cohere. With 20,000+ GitHub stars, 3.5M monthly downloads, and production deployments at Netflix and Rocket Money, it has become a de facto standard for multi-provider LLM routing and cost governance.
Why It's in Trial
LiteLLM solves a real problem: every LLM provider has a different API format, authentication scheme, and error model. Without an abstraction layer, switching providers or running A/B tests across models requires significant refactoring. LiteLLM eliminates that with one consistent API.
It belongs in Trial rather than Adopt for two reasons:
- Critical supply chain compromise (March 24, 2026): As part of the cascading TeamPCP campaign, an attacker compromised LiteLLM's CI/CD pipeline through a poisoned Trivy GitHub Action, exfiltrated the PyPI publish token, and released backdoored versions v1.82.7 and v1.82.8 containing a three-stage credential stealer (CVE-2026-33634, CVSS 9.4). The malicious packages were downloaded 47,000 times in 46 minutes before PyPI quarantined the entire package. LiteLLM has paused releases pending a full security review. As of April 1, 2026, Mercor has confirmed a security incident tied to the attack, making it the first named downstream victim. Pin to v1.82.6 or earlier and rotate any credentials that may have been exposed. See the LiteLLM security update and the Snyk technical analysis for details.
- Python-only: No official Java or other JVM SDK. Java teams must go through the proxy server (see the Java radar entry).
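The pin guidance above can be enforced at application startup or in CI. The following is a minimal sketch, assuming that every release after v1.82.6 should be rejected until the security review concludes. The helper names `parse_version`, `is_safe_version`, and `installed_litellm_is_safe` are hypothetical and not part of LiteLLM.

```python
import re
from importlib import metadata

# Assumption: ceiling taken from the pin guidance in this entry.
MAX_SAFE = (1, 82, 6)


def parse_version(version: str) -> tuple:
    """Return the leading numeric release as a 3-tuple, e.g. '1.82.6.post1' -> (1, 82, 6)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = tuple(int(p) for p in match.group(0).split("."))[:3]
    # Pad short versions such as '1.82' with zeros so comparisons are uniform.
    return parts + (0,) * (3 - len(parts))


def is_safe_version(version: str) -> bool:
    """True when the version is at or below the pinned ceiling."""
    return parse_version(version) <= MAX_SAFE


def installed_litellm_is_safe() -> bool:
    """Check the installed distribution; an absent package counts as safe."""
    try:
        return is_safe_version(metadata.version("litellm"))
    except metadata.PackageNotFoundError:
        return True
```

A startup hook could call `installed_litellm_is_safe()` and refuse to boot when it returns `False`. This check complements, and does not replace, a hash-pinned lockfile.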
Outside of those caveats, the tool is production-grade: 8 ms P95 latency at 1,000 req/s, MIT licensed, and actively maintained, with weekly stable releases up to the current pause.
Two Deployment Modes
SDK mode — embed directly in your Python application:
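import litellm

# The Anthropic key is read from the ANTHROPIC_API_KEY environment variable
response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What is agentic engineering?"}],
)
print(response.choices[0].message.content)
# Switch to GPT-5 by changing the model string; nothing else changes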
Proxy mode — deploy as a standalone AI gateway:
# config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
litellm --config config.yaml --port 8000
# Now any OpenAI-compatible client points at http://localhost:8000
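Because the proxy speaks the OpenAI wire format, a client needs only a base URL and a key. Here is a minimal sketch using just the standard library. It assumes the proxy from the config above is listening on `localhost:8000`, that `sk-example` stands in for whatever key your proxy is configured to accept, and that `claude-sonnet` matches a `model_name` in `config.yaml`. `build_chat_request` and `send` are hypothetical helpers.

```python
import json
import urllib.request


def build_chat_request(base_url: str, api_key: str, model: str, prompt: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at the proxy."""
    payload = {
        # Use the model_name alias from config.yaml, not the provider string.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs a running proxy)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)
```

With a proxy running, `send(build_chat_request("http://localhost:8000", "sk-example", "claude-sonnet", "ping"))` should return an OpenAI-shaped response dictionary. In practice most teams would point an existing OpenAI client library at the same base URL.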
Key Capabilities
| Capability | Details |
|---|---|
| Unified API | 100+ providers via OpenAI format; swap models with a string change |
| Model routing | Strategies: simple-shuffle, least-busy, usage-based-routing, latency-based-routing |
| Fallbacks | Automatic retry with fallback models on provider errors |
| Cost tracking | Per-key, per-team, per-user spend across all providers |
| Rate limiting | Per-key/team/user RPM and TPM limits |
| Caching | In-memory, Redis, and semantic caching |
| Guardrails | Pre/post call hooks for content filtering |
| Load balancing | Round-robin and weighted routing across multiple deployments |
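
In SDK mode, the routing and fallback rows above map onto `litellm.Router`. This is a sketch under assumptions: the aliases `primary` and `backup` are made up for illustration, the model strings are reused from earlier in this entry, and the parameter values are placeholders, not recommendations. The import sits inside the function so the configuration can be inspected without LiteLLM installed.

```python
import os

# Two deployments, each addressed by an alias of our choosing (hypothetical names).
MODEL_LIST = [
    {
        "model_name": "primary",
        "litellm_params": {
            "model": "openai/gpt-4o",
            "api_key": os.getenv("OPENAI_API_KEY"),
        },
    },
    {
        "model_name": "backup",
        "litellm_params": {
            "model": "anthropic/claude-sonnet-4-6",
            "api_key": os.getenv("ANTHROPIC_API_KEY"),
        },
    },
]

# If calls to "primary" keep failing, the request is retried against "backup".
FALLBACKS = [{"primary": ["backup"]}]


def build_router():
    """Construct a Router from the configuration above."""
    from litellm import Router  # lazy import; requires LiteLLM to be installed

    return Router(
        model_list=MODEL_LIST,
        routing_strategy="simple-shuffle",
        fallbacks=FALLBACKS,
        num_retries=2,
    )
```

Callers would then use `build_router().completion(model="primary", messages=[...])` and address the alias, never the provider string, so a fallback stays invisible to application code.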
Agentic Engineering Integration
LiteLLM has first-class support for the protocols this radar tracks:
MCP (Model Context Protocol): LiteLLM proxy can act as an MCP server, exposing tool calling to connected clients. It also translates between OpenAI tool format and MCP tool format automatically.
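To make the format translation concrete: an MCP tool definition carries `name`, `description`, and `inputSchema`, while an OpenAI tool wraps the same JSON Schema under `function.parameters`. The sketch below shows that mapping in plain Python. It illustrates the shape of the conversion only and is not LiteLLM's implementation. The functions `mcp_tool_to_openai` and `openai_tool_to_mcp` and the `get_weather` tool are hypothetical.

```python
EMPTY_SCHEMA = {"type": "object", "properties": {}}


def mcp_tool_to_openai(tool: dict) -> dict:
    """Convert an MCP tool definition into an OpenAI function-tool definition."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            # Both sides use JSON Schema, so the schema passes through unchanged.
            "parameters": tool.get("inputSchema", EMPTY_SCHEMA),
        },
    }


def openai_tool_to_mcp(tool: dict) -> dict:
    """Convert an OpenAI function-tool definition back into an MCP tool definition."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "inputSchema": fn.get("parameters", EMPTY_SCHEMA),
    }


# Hypothetical tool used only to exercise the mapping.
mcp_tool = {
    "name": "get_weather",
    "description": "Look up the current weather for a city.",
    "inputSchema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}
openai_tool = mcp_tool_to_openai(mcp_tool)
```

The conversion is lossless for these three fields, which is why a gateway can expose the same tool to both kinds of client.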
A2A (Agent-to-Agent Protocol): Native A2A endpoint support — invoke agents via the a2a/ model prefix through the LiteLLM gateway. Enables cross-vendor agent coordination without direct SDK dependencies.
# Route agent-to-agent calls through LiteLLM
import litellm

response = litellm.completion(
    model="a2a/my-specialist-agent",
    messages=[{"role": "user", "content": "Analyze this PR diff"}],
)
Observability
| Integration | What you get |
|---|---|
| Langfuse | Full trace visibility per LLM call, cost attribution |
| OpenTelemetry | Spans for each model interaction; works with Jaeger, Grafana Tempo |
| Prometheus | Latency, error rate, token usage metrics |
| LiteLLM Admin UI | Built-in dashboard for spend, usage, and key management |
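
In SDK mode, integrations like these are typically switched on through LiteLLM's callback lists. A minimal sketch for the Langfuse row, assuming credentials are supplied through the `LANGFUSE_PUBLIC_KEY` and `LANGFUSE_SECRET_KEY` environment variables. `missing_langfuse_env` and `enable_langfuse_logging` are hypothetical helpers, and the fail-fast check is a design choice of this example, not LiteLLM behavior.

```python
import os

# Environment variables the Langfuse integration reads its credentials from.
LANGFUSE_ENV = ("LANGFUSE_PUBLIC_KEY", "LANGFUSE_SECRET_KEY")


def missing_langfuse_env(env=None) -> list:
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in LANGFUSE_ENV if not env.get(name)]


def enable_langfuse_logging() -> None:
    """Register Langfuse for successful and failed calls, failing fast on missing config."""
    missing = missing_langfuse_env()
    if missing:
        raise RuntimeError("missing Langfuse settings: " + ", ".join(missing))

    import litellm  # lazy import; requires LiteLLM to be installed

    litellm.success_callback = ["langfuse"]
    litellm.failure_callback = ["langfuse"]
```

Calling `enable_langfuse_logging()` once at startup is enough. Later `litellm.completion(...)` calls are then traced without further changes at the call site.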
Provider Coverage
| Tier | Providers |
|---|---|
| Frontier | OpenAI, Anthropic, Google Gemini, xAI Grok |
| Cloud | Azure OpenAI, Amazon Bedrock, Google Vertex AI |
| Open weights | Ollama, vLLM, Together AI, Replicate, Groq |
| Specialty | Cohere, Mistral, Perplexity, AI21, Fireworks AI |
Key Characteristics
| Property | Value |
|---|---|
| License | MIT (free); Enterprise from $250/month |
| Language | Python (proxy is language-agnostic via REST) |
| Provider | BerriAI |
| Security note | Pin to ≤v1.82.6; CVE-2026-33634 (CVSS 9.4); supply chain review ongoing |
| MCP support | Yes — proxy as MCP server + tool format translation |
| A2A support | Yes — native a2a/ model prefix routing |
| GitHub | BerriAI/litellm |
| Website | litellm.ai |
Further Reading
- LiteLLM Documentation — proxy setup, routing, cost tracking
- LiteLLM GitHub — source, issues, release notes
- Provider list — full list of supported models and providers
- Security update (March 2026) — supply chain incident details and safe version guidance
- LiteLLM Proxy (Java) — using LiteLLM from Java via the proxy