Technology Radar

LiteLLM Proxy (Java)

sdk · streaming
Assess

LiteLLM's proxy server exposes a fully OpenAI-compatible REST API, making it usable from Java without any LiteLLM-specific code. Java teams can route calls through a self-hosted LiteLLM instance using the OpenAI Java SDK or Spring AI's OpenAI integration — gaining multi-provider routing, cost tracking, and fallbacks without vendor lock-in.

Why It's in Assess

There is no official LiteLLM Java SDK. The approach works in practice — the proxy speaks OpenAI's REST protocol, which Java already handles well — but the architecture is indirect: you run a Python sidecar to gain the benefits. That adds operational complexity.

Additionally, a supply chain security incident (March 2026) affected LiteLLM PyPI packages v1.82.7 and v1.82.8. Pin the proxy image to v1.82.6 or earlier until verified clean versions are released.

Assess is appropriate: understand the pattern and prototype it; don't adopt it wholesale without validating it fits your deployment model.

How It Works

Java App
  └── OpenAI Java SDK (baseUrl = http://litellm-proxy:8000)
        └── LiteLLM Proxy (Python, self-hosted)
              ├── Anthropic Claude
              ├── OpenAI GPT-5
              ├── Azure OpenAI
              └── Amazon Bedrock

The proxy handles provider-specific auth, format translation, retries, and routing. Your Java code only ever speaks OpenAI format.
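As a concrete deployment sketch, the sidecar arrangement above might look like this in Docker Compose. Service names, the image tag format, and file paths are illustrative assumptions; the pinned version follows the security note above:

```yaml
# docker-compose.yml — illustrative sidecar deployment sketch
services:
  litellm-proxy:
    # Tag format is an assumption; pin to <= v1.82.6 per the supply chain note
    image: ghcr.io/berriai/litellm:main-v1.82.6
    command: ["--config", "/app/config.yaml", "--port", "8000"]
    ports:
      - "8000:8000"
    volumes:
      - ./litellm-config.yaml:/app/config.yaml
    environment:
      # Provider keys live with the proxy, never in the Java app
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  java-app:
    build: .
    environment:
      - OPENAI_BASE_URL=http://litellm-proxy:8000
    depends_on:
      - litellm-proxy
```

The Java container needs only the proxy's URL and a virtual key; provider credentials stay on the proxy side.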

Spring AI Integration

Spring AI's OpenAI integration accepts a custom base URL — point it at LiteLLM:

# application.properties
spring.ai.openai.base-url=http://litellm-proxy:8000
spring.ai.openai.api-key=your-litellm-virtual-key
# LiteLLM model alias, resolved server-side
spring.ai.openai.chat.options.model=claude-sonnet

// Build a ChatClient from Spring AI's auto-configured builder
private final ChatClient chatClient;

ReviewService(ChatClient.Builder builder) {
    this.chatClient = builder.build();
}

String response = chatClient.prompt()
    .user("Review this code for security issues")
    .call()
    .content();
// Routed through LiteLLM to whichever provider is configured

Swap providers by changing LiteLLM's config — no Java code changes.
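A minimal sketch of what that proxy-side swap looks like, using LiteLLM's `model_list` config format (the alias, model IDs, and the commented alternative are illustrative):

```yaml
# litellm-config.yaml — the alias "claude-sonnet" is what Java code requests
model_list:
  - model_name: claude-sonnet            # alias seen by the Java client
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514   # illustrative model ID
      api_key: os.environ/ANTHROPIC_API_KEY

  # To swap providers, repoint the same alias — no Java change required:
  # - model_name: claude-sonnet
  #   litellm_params:
  #     model: azure/my-gpt-deployment
  #     api_key: os.environ/AZURE_API_KEY
```

Because the Java client only ever sees the alias, the provider behind it can change on a proxy restart.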

OpenAI Java SDK Integration

OpenAIClient client = OpenAIOkHttpClient.builder()
    .baseUrl("http://litellm-proxy:8000")
    .apiKey("your-litellm-virtual-key")
    .build();

ChatCompletion response = client.chat().completions().create(
    ChatCompletionCreateParams.builder()
        .model("claude-sonnet")   // LiteLLM alias, resolved server-side
        .addUserMessage("Explain the builder pattern")
        .build()
);

Streaming

Both Spring AI and the OpenAI Java SDK support streaming; LiteLLM proxies SSE streams without modification:

Flux<String> stream = chatClient.prompt()
    .user("Generate a JUnit 5 test suite for this class")
    .stream()
    .content();
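The equivalent with the OpenAI Java SDK looks roughly like this. This is a sketch that assumes a running proxy; package paths and the `createStreaming` method follow recent SDK releases and may differ in older versions:

```java
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.chat.completions.ChatCompletionChunk;
import com.openai.models.chat.completions.ChatCompletionCreateParams;

public class StreamingExample {
    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .baseUrl("http://litellm-proxy:8000")
                .apiKey("your-litellm-virtual-key")
                .build();

        ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
                .model("claude-sonnet")   // LiteLLM alias, resolved server-side
                .addUserMessage("Generate a JUnit 5 test suite for this class")
                .build();

        // SSE chunks flow through the proxy unmodified; print deltas as they arrive
        try (StreamResponse<ChatCompletionChunk> stream =
                 client.chat().completions().createStreaming(params)) {
            stream.stream()
                  .flatMap(chunk -> chunk.choices().stream())
                  .forEach(choice ->
                      choice.delta().content().ifPresent(System.out::print));
        }
    }
}
```

The try-with-resources block matters: closing the stream releases the underlying HTTP connection when the SSE stream ends.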

What You Get from the Proxy

Feature                | Java benefit
---------------------- | ------------
Multi-provider routing | Switch Anthropic → OpenAI → Bedrock in proxy config, not Java code
Automatic fallbacks    | Provider outage handled transparently; Java sees a normal response
Cost tracking          | Per-virtual-key spend tracked in LiteLLM admin UI
Rate limiting          | Protect downstream providers from runaway agents
Caching                | Semantic cache cuts token spend on repeated prompts
Load balancing         | Distribute across multiple Claude deployments or regions

Community Java SDK

A community-developed litellm-java-sdk exists with Project Reactor async support and type-safe interfaces. It is not officially maintained and has limited adoption — evaluate carefully before using in production. The proxy-via-OpenAI-SDK approach above is more stable.

When to Use LiteLLM Proxy vs Direct SDKs

Scenario                                  | Recommendation
----------------------------------------- | --------------
Single provider, simple use case          | Direct SDK (Anthropic Java SDK, OpenAI Java SDK)
Multiple providers or A/B testing models  | LiteLLM proxy — centralises routing
Enterprise: per-team cost allocation      | LiteLLM proxy — built-in virtual key management
Air-gapped / strict security requirements | Evaluate carefully; adds a Python runtime dependency
Spring AI already in use                  | Spring AI handles multi-provider natively; LiteLLM adds a routing layer

Key Characteristics

Property            | Value
------------------- | -----
Java SDK            | None official; community SDK available
Integration path    | OpenAI-compatible REST via OpenAI Java SDK or Spring AI
Proxy language      | Python (sidecar/separate service)
License             | MIT (proxy); Enterprise from $250/month
Security note       | Pin proxy to ≤v1.82.6 (supply chain incident, March 2026)
Streaming support   | Yes — SSE proxied transparently
Spring AI compatible| Yes — set spring.ai.openai.base-url

Further Reading