LiteLLM

Jun 2026

Trial

LiteLLM is an open-source Python SDK and AI gateway that provides a unified, OpenAI-compatible interface to 100+ LLM providers — OpenAI, Anthropic, Azure, Google Vertex AI, Bedrock, Cohere, and more. With 20,000+ GitHub stars, 3.5M monthly downloads, and production deployments at Netflix and Rocket Money, it has become the de facto standard for multi-provider LLM routing and cost governance.

Why It's in Trial

LiteLLM solves a real problem: every LLM provider has a different API format, authentication scheme, and error model. Without an abstraction layer, switching providers or running A/B tests across models requires significant refactoring. LiteLLM eliminates that with one consistent API.

It belongs in Trial rather than Adopt for two reasons:

Critical supply chain compromise (March 24, 2026): As part of the cascading TeamPCP campaign, an attacker compromised LiteLLM's CI/CD through a poisoned Trivy GitHub Action, exfiltrated the PyPI publish token, and released backdoored versions v1.82.7 and v1.82.8 containing a three-stage credential stealer (CVE-2026-33634, CVSS 9.4). The malicious packages were downloaded 47,000 times in 46 minutes before PyPI quarantined the entire package. LiteLLM has paused releases pending a full security review. As of April 1, 2026, Mercor has confirmed a security incident tied to the attack, making it the first named downstream victim. Pin to v1.82.6 or earlier and rotate any credentials that may have been exposed. See the LiteLLM security update and the Snyk technical analysis for details.
Python-only: No official Java or other JVM SDK. Java teams must go through the proxy server (see the Java radar entry).
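
The pinning advice above can be checked mechanically. The following is a minimal sketch, assuming plain X.Y.Z version strings; the helper names (classify, check_installed) are hypothetical and not part of LiteLLM. It flags the two backdoored releases named above and treats anything newer than v1.82.6 as unreviewed rather than safe.

```python
import re
from importlib import metadata

# Releases named above as backdoored.
COMPROMISED = {(1, 82, 7), (1, 82, 8)}
# Last release this entry recommends pinning to.
LAST_SAFE = (1, 82, 6)


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract the leading X.Y.Z numbers, ignoring a 'v' prefix and any suffix."""
    match = re.match(r"v?(\d+)\.(\d+)\.(\d+)", text.strip())
    if match is None:
        raise ValueError(f"unrecognized version string: {text!r}")
    major, minor, patch = (int(part) for part in match.groups())
    return (major, minor, patch)


def classify(version: str) -> str:
    """Return 'compromised', 'ok', or 'unreviewed' for a LiteLLM version."""
    parsed = parse_version(version)
    if parsed in COMPROMISED:
        return "compromised"
    if parsed <= LAST_SAFE:
        return "ok"
    return "unreviewed"  # newer than the recommended pin


def check_installed(package: str = "litellm") -> str:
    """Classify the installed package, or report that it is absent."""
    try:
        return classify(metadata.version(package))
    except metadata.PackageNotFoundError:
        return "not installed"


if __name__ == "__main__":
    print("litellm:", check_installed())
```

A check like this could run in CI alongside a requirements constraint such as litellm<=1.82.6. It only inspects the version number, so it does not replace rotating credentials on any machine that installed an affected release.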

Outside of those caveats, the tool is production-grade: 8 ms P95 latency at 1,000 req/s, MIT licensed, and actively maintained, with weekly stable releases until the current pause.

Two Deployment Modes

SDK mode — embed directly in your Python application:

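import litellm

# The provider key is read from the environment (here ANTHROPIC_API_KEY)
response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What is agentic engineering?"}],
)
# Responses follow the OpenAI shape regardless of provider
print(response.choices[0].message.content)
# Switch to GPT-5 by changing the model string; no other changes needed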
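
Because only the model string changes between providers, an A/B comparison reduces to a loop. The sketch below is illustrative: compare_models and fake_completion are hypothetical helpers, and the completion function is passed in so the example runs offline. In real use, litellm.completion would be supplied instead, on the assumption that it accepts model and messages keyword arguments as in the snippet above.

```python
from types import SimpleNamespace
from typing import Callable


def compare_models(
    models: list[str],
    prompt: str,
    completion_fn: Callable[..., object],
) -> dict[str, str]:
    """Send the same prompt to each model and collect the reply text."""
    messages = [{"role": "user", "content": prompt}]
    replies: dict[str, str] = {}
    for model in models:
        response = completion_fn(model=model, messages=messages)
        # OpenAI-style response shape: choices[0].message.content
        replies[model] = response.choices[0].message.content
    return replies


def fake_completion(model: str, messages: list[dict]) -> SimpleNamespace:
    """Offline stand-in that mimics the response shape (hypothetical helper)."""
    text = f"[{model}] {messages[-1]['content']}"
    message = SimpleNamespace(content=text)
    return SimpleNamespace(choices=[SimpleNamespace(message=message)])


if __name__ == "__main__":
    results = compare_models(
        ["anthropic/claude-sonnet-4-6", "openai/gpt-4o"],
        "What is agentic engineering?",
        completion_fn=fake_completion,  # swap in litellm.completion for real calls
    )
    for model, text in results.items():
        print(model, "->", text)
```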

Proxy mode — deploy as a standalone AI gateway:

# config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY

litellm --config config.yaml --port 8000
# Now any OpenAI-compatible client points at http://localhost:8000
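
To make "points at http://localhost:8000" concrete, here is a minimal standard-library sketch of a client request. It assumes the proxy serves the usual OpenAI-style /v1/chat/completions path and that the model field carries a model_name from config.yaml. The API key is a placeholder; whether the proxy checks it depends on how the gateway is configured.

```python
import json
import urllib.request


def build_chat_request(
    base_url: str, model: str, prompt: str, api_key: str
) -> urllib.request.Request:
    """Build (but do not send) an OpenAI-style chat completion request."""
    payload = {
        "model": model,  # a model_name from config.yaml, e.g. "claude-sonnet"
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",  # assumed path
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> str:
    """Send the request and return the first choice's text."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]


if __name__ == "__main__":
    req = build_chat_request(
        "http://localhost:8000", "claude-sonnet", "Hello", "sk-placeholder"
    )
    print(req.get_method(), req.full_url)
    # print(send(req))  # uncomment once the proxy is running
```

Any client that lets you override the base URL can be pointed at the gateway the same way; the request body stays in OpenAI format whichever provider backs the chosen model_name.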

Key Capabilities

Capability	Details
Unified API	100+ providers via OpenAI format; swap models with a string change
Model routing	Strategies: simple-shuffle, least-busy, usage-based, latency-based
Fallbacks	Automatic retry with fallback models on provider errors
Cost tracking	Per-key, per-team, per-user spend across all providers
Rate limiting	Per-key/team/user RPM and TPM limits
Caching	In-memory, Redis, and semantic caching
Guardrails	Pre/post call hooks for content filtering
Load balancing	Round-robin and weighted routing across multiple deployments
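
The fallback behaviour in the table can be pictured with a small sketch. This illustrates the general idea only and is not LiteLLM's implementation: complete_with_fallbacks and flaky_call are hypothetical names, and the provider call is a stub so the example runs without network access.

```python
from typing import Callable, Sequence


def complete_with_fallbacks(
    models: Sequence[str],
    call: Callable[[str], str],
    retries_per_model: int = 1,
) -> tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success."""
    errors: list[str] = []
    for model in models:
        for attempt in range(1 + retries_per_model):
            try:
                return model, call(model)
            except Exception as exc:  # a real gateway would match provider error types
                errors.append(f"{model} attempt {attempt + 1}: {exc}")
    raise RuntimeError("all models failed: " + "; ".join(errors))


def flaky_call(model: str) -> str:
    """Hypothetical stub: the primary model always errors, the backup succeeds."""
    if model == "primary":
        raise TimeoutError("provider timeout")
    return f"reply from {model}"


if __name__ == "__main__":
    used, reply = complete_with_fallbacks(["primary", "backup"], flaky_call)
    print(used, "->", reply)
```

The ordering of the model list is the policy here: the first entry is preferred, and later entries are used only after the earlier ones have exhausted their retries.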

Agentic Engineering Integration

LiteLLM has first-class support for the protocols this radar tracks:

MCP (Model Context Protocol): LiteLLM proxy can act as an MCP server, exposing tool calling to connected clients. It also translates between OpenAI tool format and MCP tool format automatically.
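
The two tool formats differ mainly in nesting and field names, which is what makes automatic translation feasible. The sketch below illustrates that mapping under the assumption that only the name, description, and JSON Schema fields matter. The function names are hypothetical, and this is not LiteLLM's own translation code.

```python
EMPTY_SCHEMA = {"type": "object", "properties": {}}


def openai_tool_to_mcp(tool: dict) -> dict:
    """Convert an OpenAI function-tool definition to an MCP tool definition."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        # OpenAI calls the JSON Schema "parameters"; MCP calls it "inputSchema"
        "inputSchema": fn.get("parameters", dict(EMPTY_SCHEMA)),
    }


def mcp_tool_to_openai(tool: dict) -> dict:
    """Convert an MCP tool definition to an OpenAI function-tool definition."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get("inputSchema", dict(EMPTY_SCHEMA)),
        },
    }


if __name__ == "__main__":
    openai_tool = {
        "type": "function",
        "function": {
            "name": "get_diff",  # hypothetical tool for illustration
            "description": "Fetch the diff for a pull request",
            "parameters": {
                "type": "object",
                "properties": {"pr_number": {"type": "integer"}},
                "required": ["pr_number"],
            },
        },
    }
    print(openai_tool_to_mcp(openai_tool))
```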

A2A (Agent-to-Agent Protocol): The gateway supports A2A endpoints natively, so agents can be invoked via the a2a/ model prefix. This enables cross-vendor agent coordination without direct SDK dependencies.

# Route agent-to-agent calls through LiteLLM
response = litellm.completion(
    model="a2a/my-specialist-agent",
    messages=[{"role": "user", "content": "Analyze this PR diff"}]
)

Observability

Integration	What you get
Langfuse	Full trace visibility per LLM call, cost attribution
OpenTelemetry	Spans for each model interaction; works with Jaeger, Grafana Tempo
Prometheus	Latency, error rate, token usage metrics
LiteLLM Admin UI	Built-in dashboard for spend, usage, and key management

Provider Coverage

Tier	Providers
Frontier	OpenAI, Anthropic, Google Gemini, xAI Grok
Cloud	Azure OpenAI, Amazon Bedrock, Google Vertex AI
Open weights	Ollama, vLLM, Together AI, Replicate, Groq
Specialty	Cohere, Mistral, Perplexity, AI21, Fireworks AI

Key Characteristics

Property	Value
License	MIT (free); Enterprise from $250/month
Language	Python (proxy is language-agnostic via REST)
Provider	BerriAI
Security note	Pin to ≤v1.82.6; CVE-2026-33634 (CVSS 9.4); supply chain review ongoing
MCP support	Yes — proxy as MCP server + tool format translation
A2A support	Yes, via native a2a/ model prefix routing
GitHub	BerriAI/litellm
Website	litellm.ai