Helicone

Jun 2026

Trial

Helicone is an open-source LLM observability platform and AI gateway. Route your provider calls through a single proxy URL and you get cost tracking, request logging, caching, and rate limiting by changing the client's base URL and adding one auth header. It is one of the quickest ways to get LLM cost visibility across providers.

The Proxy Approach

Unlike SDK-based observability tools, Helicone works at the HTTP level: point the client at Helicone's base URL, add an auth header, and every request is logged automatically, regardless of how your application calls the LLM.

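import os
import openai

# Assumes both keys are set as environment variables
OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
HELICONE_API_KEY = os.environ["HELICONE_API_KEY"]

# Before: direct OpenAI call
client = openai.OpenAI(api_key=OPENAI_API_KEY)

# After: route through Helicone (new base URL plus an auth header)
client = openai.OpenAI(
    api_key=OPENAI_API_KEY,
    base_url="https://oai.helicone.ai/v1",
    default_headers={"Helicone-Auth": f"Bearer {HELICONE_API_KEY}"},
)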

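The same pattern applies to other providers through a single gateway, including Anthropic, Google, Groq, AWS Bedrock, Azure OpenAI, and 100+ others. Providers whose APIs are not OpenAI-compatible may need their own gateway URL or client configuration, so check the docs for the provider you use.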
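Because the integration sits at the HTTP level, the same routing can be sketched without the OpenAI SDK. The example below only builds the request; it does not send it. The base URL and the Helicone-Auth header come from the snippet above, the /chat/completions path and payload shape follow OpenAI's chat completions API, and build_chat_request is a hypothetical helper name used here for illustration.

```python
import json

# Base URL and Helicone-Auth header are taken from the SDK snippet above.
HELICONE_OPENAI_BASE = "https://oai.helicone.ai/v1"


def build_chat_request(openai_key, helicone_key, model, messages):
    """Return (url, headers, body) for a chat completion routed via Helicone.

    Hypothetical helper: it only assembles the request and sends nothing.
    """
    url = f"{HELICONE_OPENAI_BASE}/chat/completions"
    headers = {
        "Authorization": f"Bearer {openai_key}",    # provider credential
        "Helicone-Auth": f"Bearer {helicone_key}",  # Helicone credential
        "Content-Type": "application/json",
    }
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return url, headers, body
```

The returned tuple can be handed to any HTTP client, which is the point of the proxy approach: nothing in the request depends on a particular SDK.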

What You Get

Cost attribution: Per-request and aggregated cost breakdown across models and users
Request logging: Full prompt/response logging with session and user tracking
Caching: Semantic and exact-match caching to reduce duplicate LLM calls
Rate limiting: Per-user or per-key request limits
Prompt management: Version and deploy prompts through the gateway without code changes
Smart routing: Route to fastest, cheapest, or most reliable provider based on real-time availability
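Several of these features are typically switched on per request through extra headers. The sketch below assembles such headers in one place. Only Helicone-Auth appears in this document; the other header names (Helicone-User-Id, Helicone-Session-Id, Helicone-Cache-Enabled, Helicone-RateLimit-Policy) are assumptions based on Helicone's header conventions and should be verified against docs.helicone.ai. The function name helicone_feature_headers is hypothetical.

```python
def helicone_feature_headers(helicone_key, user_id=None, session_id=None,
                             cache=False, rate_limit_policy=None):
    """Build Helicone headers for optional per-request features.

    Header names other than Helicone-Auth are assumptions; check them
    against the Helicone docs before relying on them.
    """
    headers = {"Helicone-Auth": f"Bearer {helicone_key}"}
    if user_id is not None:
        headers["Helicone-User-Id"] = user_id          # per-user attribution
    if session_id is not None:
        headers["Helicone-Session-Id"] = session_id    # session tracking
    if cache:
        headers["Helicone-Cache-Enabled"] = "true"     # opt in to caching
    if rate_limit_policy is not None:
        # Policy string is passed through unchanged; its format is
        # defined by Helicone, not by this sketch.
        headers["Helicone-RateLimit-Policy"] = rate_limit_policy
    return headers
```

The resulting dict can be passed as default_headers when constructing the client, or merged into the headers of a single request when the values differ per call.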

AI Gateway (Rust, 2026)

Helicone's standalone ai-gateway is a separate, lightweight Rust-based gateway focused purely on routing and reliability. It is designed for minimal latency overhead and high request volumes (millions of requests), with provider failover and rate-limit awareness. It is open source and self-hostable.
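To make "provider failover" concrete, here is a minimal sketch of the general idea: try providers in priority order and return the first successful response. It illustrates the technique only and is not Helicone's implementation. The names ProviderError and call_with_failover, and the callables passed in, are all hypothetical.

```python
class ProviderError(Exception):
    """Raised by a provider callable when a request fails."""


def call_with_failover(providers, request):
    """Try each (name, callable) pair in order.

    Returns (name, response) from the first provider that succeeds and
    raises ProviderError listing every failure if none do.
    """
    errors = []
    for name, call in providers:
        try:
            return name, call(request)
        except ProviderError as exc:
            errors.append(f"{name}: {exc}")  # record the failure, move on
    raise ProviderError("all providers failed: " + "; ".join(errors))
```

A production gateway would add timeouts, health tracking, and rate-limit signals on top of this loop; the sketch shows only the ordering and fallback logic.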

Scale

2.1B+ requests logged. 5,400+ GitHub stars on the main repo.

Key Characteristics

Property	Value
License	Apache 2.0
Provider	Helicone (YC W23)
GitHub	Helicone/helicone
Website	helicone.ai
Docs	docs.helicone.ai