Technology Radar

LiteLLM Proxy (Java)

sdk · streaming
Assess

LiteLLM's proxy server exposes a fully OpenAI-compatible REST API, making it usable from Java without any LiteLLM-specific code. Java teams can route calls through a self-hosted LiteLLM instance using the OpenAI Java SDK or Spring AI's OpenAI integration — gaining multi-provider routing, cost tracking, and fallbacks without vendor lock-in.

Why It's in Assess

There is no official LiteLLM Java SDK. The approach works in practice — the proxy speaks OpenAI's REST protocol, which Java already handles well — but the architecture is indirect: you run a Python sidecar to gain the benefits. That adds operational complexity.

Additionally, a supply chain security incident (March 2026) affected LiteLLM PyPI packages v1.82.7 and v1.82.8. Pin the proxy image to v1.82.6 or earlier until verified clean versions are released.

Assess is appropriate: understand the pattern and prototype it; don't adopt it wholesale without validating it fits your deployment model.

How It Works

Java App
  └── OpenAI Java SDK (baseUrl = http://litellm-proxy:8000)
        └── LiteLLM Proxy (Python, self-hosted)
              ├── Anthropic Claude
              ├── OpenAI GPT-5
              ├── Azure OpenAI
              └── Amazon Bedrock

The proxy handles provider-specific auth, format translation, retries, and routing. Your Java code only ever speaks OpenAI format.
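As a concrete deployment sketch, the sidecar arrangement above might look like this in Docker Compose. Service names, the image tag format, and file paths are illustrative assumptions; the pinned version follows the security note above:

```yaml
# docker-compose.yml — illustrative sidecar deployment sketch
services:
  litellm-proxy:
    # Tag format is an assumption; pin to <= v1.82.6 per the supply chain note
    image: ghcr.io/berriai/litellm:main-v1.82.6
    command: ["--config", "/app/config.yaml", "--port", "8000"]
    ports:
      - "8000:8000"
    volumes:
      - ./litellm-config.yaml:/app/config.yaml
    environment:
      # Provider keys live with the proxy, never in the Java app
      - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
      - OPENAI_API_KEY=${OPENAI_API_KEY}

  java-app:
    build: .
    environment:
      - OPENAI_BASE_URL=http://litellm-proxy:8000
    depends_on:
      - litellm-proxy
```

The Java container needs only the proxy's URL and a virtual key; provider credentials stay on the proxy side.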

Spring AI Integration

Spring AI's OpenAI integration accepts a custom base URL — point it at LiteLLM:

# application.properties
spring.ai.openai.base-url=http://litellm-proxy:8000
spring.ai.openai.api-key=your-litellm-virtual-key
# LiteLLM model alias, resolved server-side
spring.ai.openai.chat.options.model=claude-sonnet

// Build a ChatClient from Spring AI's auto-configured builder
private final ChatClient chatClient;

ReviewService(ChatClient.Builder builder) {
    this.chatClient = builder.build();
}

String response = chatClient.prompt()
    .user("Review this code for security issues")
    .call()
    .content();
// Routed through LiteLLM to whichever provider is configured

Swap providers by changing LiteLLM's config — no Java code changes.
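A minimal sketch of what that proxy-side swap looks like, using LiteLLM's `model_list` config format (the alias, model IDs, and the commented alternative are illustrative):

```yaml
# litellm-config.yaml — the alias "claude-sonnet" is what Java code requests
model_list:
  - model_name: claude-sonnet            # alias seen by the Java client
    litellm_params:
      model: anthropic/claude-sonnet-4-20250514   # illustrative model ID
      api_key: os.environ/ANTHROPIC_API_KEY

  # To swap providers, repoint the same alias — no Java change required:
  # - model_name: claude-sonnet
  #   litellm_params:
  #     model: azure/my-gpt-deployment
  #     api_key: os.environ/AZURE_API_KEY
```

Because the Java client only ever sees the alias, the provider behind it can change on a proxy restart.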

OpenAI Java SDK Integration

OpenAIClient client = OpenAIOkHttpClient.builder()
    .baseUrl("http://litellm-proxy:8000")
    .apiKey("your-litellm-virtual-key")
    .build();

ChatCompletion response = client.chat().completions().create(
    ChatCompletionCreateParams.builder()
        .model("claude-sonnet")   // LiteLLM alias, resolved server-side
        .addUserMessage("Explain the builder pattern")
        .build()
);

Streaming

Both Spring AI and the OpenAI Java SDK support streaming; LiteLLM proxies SSE streams without modification:

Flux<String> stream = chatClient.prompt()
    .user("Generate a JUnit 5 test suite for this class")
    .stream()
    .content();
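The equivalent with the OpenAI Java SDK looks roughly like this. This is a sketch that assumes a running proxy; package paths and the `createStreaming` method follow recent SDK releases and may differ in older versions:

```java
import com.openai.client.OpenAIClient;
import com.openai.client.okhttp.OpenAIOkHttpClient;
import com.openai.core.http.StreamResponse;
import com.openai.models.chat.completions.ChatCompletionChunk;
import com.openai.models.chat.completions.ChatCompletionCreateParams;

public class StreamingExample {
    public static void main(String[] args) {
        OpenAIClient client = OpenAIOkHttpClient.builder()
                .baseUrl("http://litellm-proxy:8000")
                .apiKey("your-litellm-virtual-key")
                .build();

        ChatCompletionCreateParams params = ChatCompletionCreateParams.builder()
                .model("claude-sonnet")   // LiteLLM alias, resolved server-side
                .addUserMessage("Generate a JUnit 5 test suite for this class")
                .build();

        // SSE chunks flow through the proxy unmodified; print deltas as they arrive
        try (StreamResponse<ChatCompletionChunk> stream =
                 client.chat().completions().createStreaming(params)) {
            stream.stream()
                  .flatMap(chunk -> chunk.choices().stream())
                  .forEach(choice ->
                      choice.delta().content().ifPresent(System.out::print));
        }
    }
}
```

The try-with-resources block matters: closing the stream releases the underlying HTTP connection when the SSE stream ends.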

What You Get from the Proxy

Feature                | Java benefit
---------------------- | ------------
Multi-provider routing | Switch Anthropic → OpenAI → Bedrock in proxy config, not Java code
Automatic fallbacks    | Provider outage handled transparently; Java sees a normal response
Cost tracking          | Per-virtual-key spend tracked in LiteLLM admin UI
Rate limiting          | Protect downstream providers from runaway agents
Caching                | Semantic cache cuts token spend on repeated prompts
Load balancing         | Distribute across multiple Claude deployments or regions

Community Java SDK

A community-developed litellm-java-sdk exists with Project Reactor async support and type-safe interfaces. It is not officially maintained and has limited adoption — evaluate carefully before using in production. The proxy-via-OpenAI-SDK approach above is more stable.

When to Use LiteLLM Proxy vs Direct SDKs

Scenario                                  | Recommendation
----------------------------------------- | --------------
Single provider, simple use case          | Direct SDK (Anthropic Java SDK, OpenAI Java SDK)
Multiple providers or A/B testing models  | LiteLLM proxy — centralises routing
Enterprise: per-team cost allocation      | LiteLLM proxy — built-in virtual key management
Air-gapped / strict security requirements | Evaluate carefully; adds a Python runtime dependency
Spring AI already in use                  | Spring AI handles multi-provider natively; LiteLLM adds a routing layer

Key Characteristics

Property            | Value
------------------- | -----
Java SDK            | None official; community SDK available
Integration path    | OpenAI-compatible REST via OpenAI Java SDK or Spring AI
Proxy language      | Python (sidecar/separate service)
License             | MIT (proxy); Enterprise from $250/month
Security note       | Pin proxy to ≤v1.82.6 (supply chain incident, March 2026)
Streaming support   | Yes — SSE proxied transparently
Spring AI compatible| Yes — set spring.ai.openai.base-url

Further Reading