Technology Radar
Trial

LiteLLM is an open-source Python SDK and AI gateway that provides a unified, OpenAI-compatible interface to 100+ LLM providers — OpenAI, Anthropic, Azure, Google Vertex AI, Bedrock, Cohere, and more. With 20,000+ GitHub stars, 3.5M monthly downloads, and production deployments at Netflix and Rocket Money, it has become the de facto standard for multi-provider LLM routing and cost governance.

Why It's in Trial

LiteLLM solves a real problem: every LLM provider has a different API format, authentication scheme, and error model. Without an abstraction layer, switching providers or running A/B tests across models requires significant refactoring. LiteLLM eliminates that with one consistent API.

It belongs in Trial rather than Adopt for two reasons:

  1. Recent supply chain incident (March 2026): PyPI packages v1.82.7 and v1.82.8 were identified as potentially compromised and removed. The project paused releases pending a security review. Pin to v1.82.6 or earlier until verified clean versions are released.
  2. Python-only: No official Java or other JVM SDK. Java teams must go through the proxy server (see the Java radar entry).
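For the first caveat, the simplest mitigation is a hard version pin in your dependency file. A minimal sketch as a `requirements.txt` fragment (adapt to your dependency manager of choice):

```
# requirements.txt — hold back until the supply chain review clears newer releases
litellm<=1.82.6
```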

Those caveats aside, the tool is production-grade: 8 ms P95 latency at 1,000 req/s, MIT licensed, and actively maintained with weekly stable releases.

Two Deployment Modes

SDK mode — embed directly in your Python application:

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What is agentic engineering?"}]
)
# Switch to GPT-5 by changing the model string — zero other changes
```

Proxy mode — deploy as a standalone AI gateway:

```yaml
# config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
```

```shell
litellm --config config.yaml --port 8000
# Now any OpenAI-compatible client points at http://localhost:8000
```
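Once the proxy is running, any OpenAI-compatible client talks to it over plain REST. As an illustrative sketch (the `build_chat_request` helper and the `sk-proxy-key` placeholder are not part of LiteLLM), this is the OpenAI-format request such a client would POST to the gateway:

```python
import json

# Chat completions endpoint exposed by the LiteLLM proxy started above
PROXY_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, user_message, api_key="sk-proxy-key"):
    """Construct the headers and JSON body an OpenAI-compatible client
    sends to the gateway. Illustrative helper, not LiteLLM code."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # a LiteLLM virtual key
        "Content-Type": "application/json",
    }
    body = json.dumps({
        # "claude-sonnet" resolves via model_list in config.yaml
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return headers, body

headers, body = build_chat_request("claude-sonnet", "What is agentic engineering?")
```

The `Authorization` header carries a LiteLLM virtual key, which is how the proxy attributes spend and enforces per-key limits.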

Key Capabilities

| Capability | Details |
| --- | --- |
| Unified API | 100+ providers via OpenAI format; swap models with a string change |
| Model routing | Strategies: simple-shuffle, least-busy, usage-based, latency-based |
| Fallbacks | Automatic retry with fallback models on provider errors |
| Cost tracking | Per-key, per-team, per-user spend across all providers |
| Rate limiting | Per-key/team/user RPM and TPM limits |
| Caching | In-memory, Redis, and semantic caching |
| Guardrails | Pre/post call hooks for content filtering |
| Load balancing | Round-robin and weighted routing across multiple deployments |
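The fallback behavior above amounts to a try-in-order loop across models. A simplified sketch of the idea, not LiteLLM's actual implementation (`complete_with_fallbacks` and `fake_call` are invented for illustration):

```python
def complete_with_fallbacks(call, models):
    """Try each model in order; return the first successful response.
    Simplified sketch of fallback-on-provider-error behavior."""
    last_err = None
    for model in models:
        try:
            return call(model)
        except Exception as err:  # provider error: fall through to next model
            last_err = err
    raise last_err  # every model failed; surface the last error

# Simulated provider: the primary model fails, the fallback succeeds.
def fake_call(model):
    if model == "gpt-4o":
        raise RuntimeError("rate limited")
    return f"response from {model}"

result = complete_with_fallbacks(fake_call, ["gpt-4o", "claude-sonnet"])
```

In LiteLLM itself you declare this declaratively via the router or proxy fallback configuration rather than writing the loop by hand.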

Agentic Engineering Integration

LiteLLM has first-class support for the protocols this radar tracks:

MCP (Model Context Protocol): LiteLLM proxy can act as an MCP server, exposing tool calling to connected clients. It also translates between OpenAI tool format and MCP tool format automatically.

A2A (Agent-to-Agent Protocol): Native A2A endpoint support — invoke agents via the a2a/ model prefix through the LiteLLM gateway. Enables cross-vendor agent coordination without direct SDK dependencies.

```python
# Route agent-to-agent calls through LiteLLM
response = litellm.completion(
    model="a2a/my-specialist-agent",
    messages=[{"role": "user", "content": "Analyze this PR diff"}]
)
```

Observability

| Integration | What you get |
| --- | --- |
| Langfuse | Full trace visibility per LLM call, cost attribution |
| OpenTelemetry | Spans for each model interaction; works with Jaeger, Grafana Tempo |
| Prometheus | Latency, error rate, token usage metrics |
| LiteLLM Admin UI | Built-in dashboard for spend, usage, and key management |

Provider Coverage

| Tier | Providers |
| --- | --- |
| Frontier | OpenAI, Anthropic, Google Gemini, xAI Grok |
| Cloud | Azure OpenAI, Amazon Bedrock, Google Vertex AI |
| Open weights | Ollama, vLLM, Together AI, Replicate, Groq |
| Specialty | Cohere, Mistral, Perplexity, AI21, Fireworks AI |

Key Characteristics

| Property | Value |
| --- | --- |
| License | MIT (free); Enterprise from $250/month |
| Language | Python (proxy is language-agnostic via REST) |
| Latency | 8 ms P95 at 1k req/s |
| GitHub stars | 20,000+ |
| Monthly downloads | 3.5M |
| Security note | Pin to ≤v1.82.6; supply chain review ongoing (March 2026) |
| MCP support | Yes — proxy as MCP server + tool format translation |
| A2A support | Yes — native a2a/ model prefix routing |

Further Reading