Technology Radar
Trial

LiteLLM is an open-source Python SDK and AI gateway that provides a unified, OpenAI-compatible interface to 100+ LLM providers — OpenAI, Anthropic, Azure, Google Vertex AI, Bedrock, Cohere, and more. With 20,000+ GitHub stars, 3.5M monthly downloads, and production deployments at Netflix and Rocket Money, it has become the de facto standard for multi-provider LLM routing and cost governance.

Why It's in Trial

LiteLLM solves a real problem: every LLM provider has a different API format, authentication scheme, and error model. Without an abstraction layer, switching providers or running A/B tests across models requires significant refactoring. LiteLLM eliminates that with one consistent API.

It belongs in Trial rather than Adopt for two reasons:

  1. Recent supply chain incident (March 2026): PyPI packages v1.82.7 and v1.82.8 were identified as potentially compromised and removed. The project paused releases pending a security review. Pin to v1.82.6 or earlier until verified clean versions are released.
  2. Python-only: No official Java or other JVM SDK. Java teams must go through the proxy server (see the Java radar entry).
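For the first caveat, the simplest mitigation is a hard version pin in your dependency file. A minimal sketch as a `requirements.txt` fragment (adapt to your dependency manager of choice):

```
# requirements.txt — hold back until the supply chain review clears newer releases
litellm<=1.82.6
```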

Those caveats aside, the tool is production-grade: 8 ms P95 latency at 1,000 req/s, MIT licensed, and actively maintained with weekly stable releases.

Two Deployment Modes

SDK mode — embed directly in your Python application:

```python
import litellm

response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What is agentic engineering?"}]
)
# Switch to GPT-5 by changing the model string — zero other changes
```

Proxy mode — deploy as a standalone AI gateway:

```yaml
# config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY
```

```shell
litellm --config config.yaml --port 8000
# Now any OpenAI-compatible client points at http://localhost:8000
```
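Once the proxy is running, any OpenAI-compatible client talks to it over plain REST. As an illustrative sketch (the `build_chat_request` helper and the `sk-proxy-key` placeholder are not part of LiteLLM), this is the OpenAI-format request such a client would POST to the gateway:

```python
import json

# Chat completions endpoint exposed by the LiteLLM proxy started above
PROXY_URL = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model, user_message, api_key="sk-proxy-key"):
    """Construct the headers and JSON body an OpenAI-compatible client
    sends to the gateway. Illustrative helper, not LiteLLM code."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # a LiteLLM virtual key
        "Content-Type": "application/json",
    }
    body = json.dumps({
        # "claude-sonnet" resolves via model_list in config.yaml
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    })
    return headers, body

headers, body = build_chat_request("claude-sonnet", "What is agentic engineering?")
```

The `Authorization` header carries a LiteLLM virtual key, which is how the proxy attributes spend and enforces per-key limits.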

Key Capabilities

| Capability | Details |
| --- | --- |
| Unified API | 100+ providers via OpenAI format; swap models with a string change |
| Model routing | Strategies: simple-shuffle, least-busy, usage-based, latency-based |
| Fallbacks | Automatic retry with fallback models on provider errors |
| Cost tracking | Per-key, per-team, per-user spend across all providers |
| Rate limiting | Per-key/team/user RPM and TPM limits |
| Caching | In-memory, Redis, and semantic caching |
| Guardrails | Pre/post call hooks for content filtering |
| Load balancing | Round-robin and weighted routing across multiple deployments |
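The fallback behavior above amounts to a try-in-order loop across models. A simplified sketch of the idea, not LiteLLM's actual implementation (`complete_with_fallbacks` and `fake_call` are invented for illustration):

```python
def complete_with_fallbacks(call, models):
    """Try each model in order; return the first successful response.
    Simplified sketch of fallback-on-provider-error behavior."""
    last_err = None
    for model in models:
        try:
            return call(model)
        except Exception as err:  # provider error: fall through to next model
            last_err = err
    raise last_err  # every model failed; surface the last error

# Simulated provider: the primary model fails, the fallback succeeds.
def fake_call(model):
    if model == "gpt-4o":
        raise RuntimeError("rate limited")
    return f"response from {model}"

result = complete_with_fallbacks(fake_call, ["gpt-4o", "claude-sonnet"])
```

In LiteLLM itself you declare this declaratively via the router or proxy fallback configuration rather than writing the loop by hand.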

Agentic Engineering Integration

LiteLLM has first-class support for the protocols this radar tracks:

MCP (Model Context Protocol): LiteLLM proxy can act as an MCP server, exposing tool calling to connected clients. It also translates between OpenAI tool format and MCP tool format automatically.

A2A (Agent-to-Agent Protocol): Native A2A endpoint support — invoke agents via the a2a/ model prefix through the LiteLLM gateway. Enables cross-vendor agent coordination without direct SDK dependencies.

```python
# Route agent-to-agent calls through LiteLLM
response = litellm.completion(
    model="a2a/my-specialist-agent",
    messages=[{"role": "user", "content": "Analyze this PR diff"}]
)
```

Observability

| Integration | What you get |
| --- | --- |
| Langfuse | Full trace visibility per LLM call, cost attribution |
| OpenTelemetry | Spans for each model interaction; works with Jaeger, Grafana Tempo |
| Prometheus | Latency, error rate, token usage metrics |
| LiteLLM Admin UI | Built-in dashboard for spend, usage, and key management |

Provider Coverage

| Tier | Providers |
| --- | --- |
| Frontier | OpenAI, Anthropic, Google Gemini, xAI Grok |
| Cloud | Azure OpenAI, Amazon Bedrock, Google Vertex AI |
| Open weights | Ollama, vLLM, Together AI, Replicate, Groq |
| Specialty | Cohere, Mistral, Perplexity, AI21, Fireworks AI |

Key Characteristics

| Property | Value |
| --- | --- |
| License | MIT (free); Enterprise from $250/month |
| Language | Python (proxy is language-agnostic via REST) |
| Latency | 8 ms P95 at 1k req/s |
| GitHub stars | 20,000+ |
| Monthly downloads | 3.5M |
| Security note | Pin to ≤v1.82.6; supply chain review ongoing (March 2026) |
| MCP support | Yes — proxy as MCP server + tool format translation |
| A2A support | Yes — native a2a/ model prefix routing |

Further Reading