Arize Phoenix is an open-source LLM observability platform built on OpenTelemetry and the OpenInference instrumentation standard. It provides tracing, evaluation, and debugging for AI applications without tying you to a single vendor. For teams that want framework-agnostic, OTel-native observability, it is one of the strongest open-source alternatives to LangSmith.
Why Phoenix Over LangSmith
LangSmith's tight LangChain coupling means breaking framework changes can disrupt your observability stack. Phoenix sidesteps this: its OpenInference instrumentation emits standard OpenTelemetry spans over OTLP, the OpenTelemetry wire protocol that backends such as Jaeger can also ingest, so your traces are portable and not tied to any orchestration framework.
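As a rough illustration of that portability, the sketch below builds exporter settings from the standard OpenTelemetry environment variables and shows that moving from one OTLP backend to another changes only the endpoint. The helper `otlp_env`, the service name `my-agent`, and the two local URLs are assumptions for illustration (typical local defaults for Phoenix and for a Jaeger instance with OTLP/HTTP enabled), not values taken from either project's configuration.

```python
def otlp_env(base_url: str, service_name: str) -> dict[str, str]:
    """Build standard OpenTelemetry exporter settings for an OTLP/HTTP collector.

    Hypothetical helper: the variable names come from the OpenTelemetry
    specification, but the function itself is only an illustration.
    """
    return {
        "OTEL_EXPORTER_OTLP_TRACES_ENDPOINT": base_url.rstrip("/") + "/v1/traces",
        "OTEL_EXPORTER_OTLP_PROTOCOL": "http/protobuf",
        "OTEL_SERVICE_NAME": service_name,
    }


# Assumed local addresses; adjust to your own deployment.
phoenix_env = otlp_env("http://localhost:6006", "my-agent")
jaeger_env = otlp_env("http://localhost:4318", "my-agent")

# Only the endpoint differs between the two backends.
changed = {key for key in phoenix_env if phoenix_env[key] != jaeger_env[key]}
print(changed)
```

The instrumentation code itself stays the same in both cases; only the destination of the spans moves.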
What It Tracks
- Distributed traces: Full span-level visibility into LLM calls, tool executions, retrieval operations, and agent reasoning loops
- LLM-specific spans: Captures model inputs/outputs, token counts, latency, and cost per provider
- Evaluations: Built-in LLM-as-a-judge scorers plus integration with Ragas, DeepEval, and Cleanlab
- Span Replay: Replay individual LLM calls with modified inputs for debugging without re-running your full pipeline
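To make the LLM-specific span data above more concrete, here is a minimal sketch of the flat attributes such a span might carry, using attribute names from the OpenInference semantic conventions. The helper functions, the `Price` class, the model name, and the per-token prices are hypothetical and exist only for this example; Phoenix and the instrumentors populate these attributes for you.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical USD prices per million tokens (not real provider rates)."""
    prompt_per_million: float
    completion_per_million: float


def llm_span_attributes(model: str, prompt: str, completion: str,
                        prompt_tokens: int, completion_tokens: int) -> dict:
    """Return flat span attributes keyed by OpenInference attribute names."""
    return {
        "openinference.span.kind": "LLM",
        "llm.model_name": model,
        "input.value": prompt,
        "output.value": completion,
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": prompt_tokens + completion_tokens,
    }


def estimate_cost(attrs: dict, price: Price) -> float:
    """Estimate the cost of one call from its token counts (illustrative only)."""
    prompt_cost = attrs["llm.token_count.prompt"] * price.prompt_per_million
    completion_cost = attrs["llm.token_count.completion"] * price.completion_per_million
    return (prompt_cost + completion_cost) / 1_000_000


attrs = llm_span_attributes("example-model", "What is OTLP?",
                            "OTLP is the OpenTelemetry Protocol.", 1200, 300)
cost = estimate_cost(attrs, Price(prompt_per_million=2.0, completion_per_million=8.0))
```

Because the attributes are plain key-value pairs on an OpenTelemetry span, any OTLP-compatible backend can store them, even one that does not render them as richly as Phoenix does.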
Framework Support
Out-of-the-box auto-instrumentation is available for OpenAI Agents SDK, Claude Agent SDK, LangGraph, LlamaIndex, CrewAI, DSPy, Vercel AI SDK, Mastra, Google ADK, AWS Bedrock, LiteLLM, and more.
Self-Hosting vs Cloud
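```bash
pip install arize-phoenix openinference-instrumentation-openai
```

```python
import phoenix as px
from openinference.instrumentation.openai import OpenAIInstrumentor
from phoenix.otel import register

px.launch_app()  # local UI at http://localhost:6006
# Register a tracer provider that exports spans to the local Phoenix collector
tracer_provider = register()
OpenAIInstrumentor().instrument(tracer_provider=tracer_provider)
# All subsequent OpenAI calls are now traced automatically
```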
Phoenix runs as a local server (Python process or Docker) for development. Phoenix Cloud (hosted, starting ~$50/month) is the managed option for production teams.
Key Characteristics
| Property | Value |
|---|---|
| License | Elastic License 2.0 (ELv2) |
| Provider | Arize AI |
| GitHub | Arize-ai/phoenix |
| Website | phoenix.arize.com |
| Docs | arize.com/docs/phoenix |