Technology Radar
Trial

LiteLLM is an open-source Python SDK and AI gateway that provides a unified, OpenAI-compatible interface to 100+ LLM providers — OpenAI, Anthropic, Azure, Google Vertex AI, Bedrock, Cohere, and more. With 20,000+ GitHub stars, 3.5M monthly downloads, and production deployments at Netflix and Rocket Money, it has become the de facto standard for multi-provider LLM routing and cost governance.

Why It's in Trial

LiteLLM solves a real problem: every LLM provider has a different API format, authentication scheme, and error model. Without an abstraction layer, switching providers or running A/B tests across models requires significant refactoring. LiteLLM eliminates that with one consistent API.

It belongs in Trial rather than Adopt for two reasons:

  1. Critical supply chain compromise (March 24, 2026): As part of the cascading TeamPCP campaign, an attacker compromised LiteLLM's CI/CD through a poisoned Trivy GitHub Action, exfiltrated the PyPI publish token, and released backdoored versions v1.82.7 and v1.82.8 containing a three-stage credential stealer (CVE-2026-33634, CVSS 9.4). The malicious packages were downloaded 47,000 times in 46 minutes before PyPI quarantined the entire package. LiteLLM has paused releases pending a full security review. As of April 1, 2026, Mercor has confirmed a security incident tied to the attack, making it the first named downstream victim. Pin to v1.82.6 or earlier and rotate any credentials that may have been exposed. See the LiteLLM security update and the Snyk technical analysis for details.
  2. Python-only: No official Java or other JVM SDK. Java teams must go through the proxy server (see the Java radar entry).
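The pinning advice in the first caveat can be checked mechanically. The following is a minimal sketch, using only the standard library, that flags an installed version above the recommended pin. The helper names (`parse_version`, `is_pinned_safely`, `check_installed`) are hypothetical and the version parsing is deliberately simple; a real audit would more likely rely on a constraint such as `litellm<=1.82.6` in the dependency file plus a lockfile scan.

```python
from importlib import metadata

# Versions named as backdoored in the caveat above.
COMPROMISED = {(1, 82, 7), (1, 82, 8)}
# Last release this entry recommends pinning to.
LAST_KNOWN_GOOD = (1, 82, 6)


def parse_version(text: str) -> tuple:
    """Turn '1.82.6' into (1, 82, 6), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def is_pinned_safely(version_text: str) -> bool:
    """True when the version is at or below the recommended pin."""
    version = parse_version(version_text)
    if not version:
        return False  # unparseable versions are treated as unsafe
    return version not in COMPROMISED and version <= LAST_KNOWN_GOOD


def check_installed(package: str = "litellm") -> str:
    """Report on whatever version of the package is installed, if any."""
    try:
        installed = metadata.version(package)
    except metadata.PackageNotFoundError:
        return f"{package} is not installed"
    status = "ok" if is_pinned_safely(installed) else "UNSAFE: downgrade and rotate credentials"
    return f"{package} {installed}: {status}"
```

Calling `check_installed()` in a CI step gives a quick signal, but it does not replace rotating credentials on any machine that ever installed the affected releases.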

Outside of those caveats, the tool is production-grade: 8 ms P95 latency at 1,000 req/s, MIT licensed, and actively maintained, with weekly stable releases up to the current release pause.

Two Deployment Modes

SDK mode — embed directly in your Python application:

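import litellm

# The provider key is read from the environment (here ANTHROPIC_API_KEY)
response = litellm.completion(
    model="anthropic/claude-sonnet-4-6",
    messages=[{"role": "user", "content": "What is agentic engineering?"}]
)
# The response follows the OpenAI response shape
print(response.choices[0].message.content)
# Switch to GPT-5 by changing the model string; nothing else changes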

Proxy mode — deploy as a standalone AI gateway:

# config.yaml
model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-sonnet-4-6
      api_key: os.environ/ANTHROPIC_API_KEY

Start the proxy with that config:

litellm --config config.yaml --port 8000
# Now any OpenAI-compatible client points at http://localhost:8000
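To make "any OpenAI-compatible client" concrete, here is a small sketch that builds a chat completion request against the proxy using only the Python standard library. It assumes the proxy from the config above is listening on port 8000 and that `claude-sonnet` is the alias defined under `model_name`; the API key value and the helper names `build_chat_request` and `send` are placeholders for illustration.

```python
import json
import urllib.request


def build_chat_request(base_url: str, model: str, prompt: str, api_key: str) -> urllib.request.Request:
    """Build an OpenAI-style chat completion request aimed at the proxy."""
    payload = {
        # Use the model_name alias from config.yaml, not the provider string.
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send the request and decode the JSON body (needs a running proxy)."""
    with urllib.request.urlopen(request, timeout=30) as resp:
        return json.load(resp)


request = build_chat_request(
    base_url="http://localhost:8000",
    model="claude-sonnet",
    prompt="What is agentic engineering?",
    api_key="sk-placeholder",  # hypothetical key; use one issued by your proxy
)
```

With the proxy running, `send(request)["choices"][0]["message"]["content"]` would hold the reply text. The same request shape works from Java or any other language, which is why the proxy is the suggested route for non-Python teams.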

Key Capabilities

| Capability | Details |
| --- | --- |
| Unified API | 100+ providers via OpenAI format; swap models with a string change |
| Model routing | Strategies: simple-shuffle, least-busy, usage-based, latency-based |
| Fallbacks | Automatic retry with fallback models on provider errors |
| Cost tracking | Per-key, per-team, per-user spend across all providers |
| Rate limiting | Per-key/team/user RPM and TPM limits |
| Caching | In-memory, Redis, and semantic caching |
| Guardrails | Pre/post call hooks for content filtering |
| Load balancing | Round-robin and weighted routing across multiple deployments |
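The fallback behaviour listed above can be pictured with a few lines of plain Python. This is a conceptual sketch only, not LiteLLM's implementation: `ProviderError`, `complete_with_fallbacks`, and `fake_call` are hypothetical names, and the simulated outage stands in for a real timeout or 5xx from a provider.

```python
from typing import Callable, Optional, Sequence, Tuple


class ProviderError(Exception):
    """Stand-in for a provider-side failure (timeout, 429, 5xx)."""


def complete_with_fallbacks(
    call_model: Callable[[str, str], str],
    models: Sequence[str],
    prompt: str,
    retries_per_model: int = 2,
) -> Tuple[str, str]:
    """Try each model in order; return (model_used, reply) from the first success."""
    last_error: Optional[Exception] = None
    for model in models:
        for _attempt in range(retries_per_model):
            try:
                return model, call_model(model, prompt)
            except ProviderError as err:
                # Remember the failure, then retry or move to the next model.
                last_error = err
    raise RuntimeError("all models failed") from last_error


def fake_call(model: str, prompt: str) -> str:
    """Hypothetical backend: the primary model is down, the fallback answers."""
    if model == "gpt-4o":
        raise ProviderError("simulated outage")
    return f"[{model}] echo: {prompt}"


used, reply = complete_with_fallbacks(fake_call, ["gpt-4o", "claude-sonnet"], "ping")
print(used, reply)
```

In the gateway this ordering lives in configuration instead of application code, so callers keep sending the same request while the proxy decides which deployment answers.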

Agentic Engineering Integration

LiteLLM has first-class support for the protocols this radar tracks:

MCP (Model Context Protocol): LiteLLM proxy can act as an MCP server, exposing tool calling to connected clients. It also translates between OpenAI tool format and MCP tool format automatically.
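The tool format translation is mostly a matter of moving the JSON Schema between two envelopes: OpenAI wraps it as `function.parameters`, while MCP carries it as `inputSchema`. The sketch below illustrates that mapping under simplifying assumptions; it is not LiteLLM's translation code, and the function names and the `get_pr_diff` tool are hypothetical.

```python
EMPTY_SCHEMA = {"type": "object", "properties": {}}


def openai_tool_to_mcp(tool: dict) -> dict:
    """Convert an OpenAI function-tool definition to an MCP tool definition."""
    fn = tool["function"]
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "inputSchema": fn.get("parameters", EMPTY_SCHEMA),
    }


def mcp_tool_to_openai(tool: dict) -> dict:
    """Convert an MCP tool definition to an OpenAI function-tool definition."""
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "parameters": tool.get("inputSchema", EMPTY_SCHEMA),
        },
    }


# Hypothetical tool used only to exercise the two converters.
openai_tool = {
    "type": "function",
    "function": {
        "name": "get_pr_diff",
        "description": "Fetch the diff for a pull request",
        "parameters": {
            "type": "object",
            "properties": {"pr_number": {"type": "integer"}},
            "required": ["pr_number"],
        },
    },
}

mcp_tool = openai_tool_to_mcp(openai_tool)
round_tripped = mcp_tool_to_openai(mcp_tool)
```

Because both sides describe arguments with JSON Schema, the round trip is lossless for simple tools like this one; fields that exist in only one format would need explicit handling.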

A2A (Agent-to-Agent Protocol): Native A2A endpoint support — invoke agents via the a2a/ model prefix through the LiteLLM gateway. Enables cross-vendor agent coordination without direct SDK dependencies.

import litellm

# Route agent-to-agent calls through LiteLLM
response = litellm.completion(
    model="a2a/my-specialist-agent",
    messages=[{"role": "user", "content": "Analyze this PR diff"}]
)

Observability

| Integration | What you get |
| --- | --- |
| Langfuse | Full trace visibility per LLM call, cost attribution |
| OpenTelemetry | Spans for each model interaction; works with Jaeger, Grafana Tempo |
| Prometheus | Latency, error rate, token usage metrics |
| LiteLLM Admin UI | Built-in dashboard for spend, usage, and key management |

Provider Coverage

| Tier | Providers |
| --- | --- |
| Frontier | OpenAI, Anthropic, Google Gemini, xAI Grok |
| Cloud | Azure OpenAI, Amazon Bedrock, Google Vertex AI |
| Open weights | Ollama, vLLM, Together AI, Replicate, Groq |
| Specialty | Cohere, Mistral, Perplexity, AI21, Fireworks AI |

Key Characteristics

| Property | Value |
| --- | --- |
| License | MIT (free); Enterprise from $250/month |
| Language | Python (proxy is language-agnostic via REST) |
| Provider | BerriAI |
| Security note | Pin to ≤v1.82.6; CVE-2026-33634 (CVSS 9.4); supply chain review ongoing |
| MCP support | Yes: proxy as MCP server + tool format translation |
| A2A support | Yes: native a2a/ model prefix routing |
| GitHub | BerriAI/litellm |
| Website | litellm.ai |

Further Reading