Trial
Voyage Code 3 (`voyage-code-3`) is a code embedding model from Voyage AI optimized for code retrieval. According to Voyage AI's published evaluation, it outperforms OpenAI's text-embedding-3-large and CodeSage-large by an average of 13.8% and 16.8% respectively across 32 code retrieval datasets. It also supports multiple output dimensions and quantized output types for cost-efficient deployment.
Why It's in Trial
- Best-in-class code retrieval: Consistently outperforms alternatives across a broad benchmark suite — not a single cherry-picked task but 32 diverse code retrieval datasets.
- Production-ready flexibility: Supports Matryoshka embeddings at 2048, 1024, 512, or 256 dimensions and quantized output types (int8, uint8, binary, ubinary). Quantization alone gives a 4x (8-bit) to 32x (1-bit) storage reduction relative to 32-bit floats, with minimal quality loss.
- Wide adoption: Voyage AI reports rapid adoption of its code models among coding assistant and agent startups for RAG-based code retrieval since the predecessor, voyage-code-2, launched in January 2024.
- 32K context length: Long enough to embed entire files, large code blocks, or multi-file context — important for agentic retrieval where queries span significant code context.
- Not yet the default: Although it leads on benchmarks, it has not displaced OpenAI embeddings as the default in most frameworks and tutorials. That is why it sits in Trial: use it on real projects, but expect ecosystem integration to still be maturing.
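The storage figures above follow from simple arithmetic on bits per value. As a rough sketch, the snippet below computes bytes per vector for each dimension and output type. It assumes tightly packed values and ignores any index or metadata overhead a vector store adds; the function names are illustrative, not part of any Voyage API.

```python
# Bits used per embedding value for each output type listed in this entry.
BITS_PER_VALUE = {"float": 32, "int8": 8, "uint8": 8, "binary": 1, "ubinary": 1}
DIMENSIONS = (2048, 1024, 512, 256)


def bytes_per_vector(dimension, dtype):
    """Raw storage for one embedding, assuming tightly packed values."""
    return dimension * BITS_PER_VALUE[dtype] // 8


def reduction_factor(dimension, dtype, baseline_dimension=1024):
    """How many times smaller a vector is than a float vector at the baseline dimension."""
    baseline = bytes_per_vector(baseline_dimension, "float")
    return baseline / bytes_per_vector(dimension, dtype)


if __name__ == "__main__":
    # Print a small comparison table against the 1024-dimension float default.
    for dimension in DIMENSIONS:
        for dtype in ("float", "int8", "binary"):
            size = bytes_per_vector(dimension, dtype)
            factor = reduction_factor(dimension, dtype)
            print(f"{dimension:>5} dims  {dtype:<7} {size:>5} bytes  {factor:g}x vs default")
```

At the default 1,024 dimensions, a float vector takes 4,096 bytes, an int8 vector 1,024 bytes, and a binary vector 128 bytes. Combining quantization with a smaller dimension compounds the savings, at the cost of some retrieval quality that should be measured on your own data.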
Key Capabilities
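- Matryoshka learning: The model is trained so that the leading dimensions of its 2048-dimension embedding are usable embeddings on their own. You can keep only the first 1024, 512, or 256 values with graceful quality degradation and no retraining, provided stored documents and queries use the same dimension.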
- Quantized embeddings: int8 (4x savings) or binary (32x savings) compared to 32-bit float, enabling cost-effective large-scale code search.
- AWS SageMaker deployment: Available for private deployment in your VPC — 90ms latency per query, ~$0.22/M tokens on ml.g6.xlarge.
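To make the first two capabilities concrete, here is a minimal local sketch of Matryoshka-style truncation followed by quantization. It is illustrative only: the int8 scaling and the sign-bit packing are generic techniques chosen for this example, not necessarily the exact schemes the Voyage API uses for its `int8` and `binary` outputs, and all function names are hypothetical.

```python
import math


def truncate_and_normalize(vec, dim):
    """Keep the first `dim` values and rescale the result to unit length."""
    head = list(vec[:dim])
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        return head
    return [x / norm for x in head]


def quantize_int8(vec):
    """Symmetric scalar quantization: map floats in [-1, 1] to integers in [-127, 127]."""
    return [max(-127, min(127, round(x * 127))) for x in vec]


def quantize_binary(vec):
    """Keep one sign bit per value (1 if positive), packed 8 values per byte."""
    if len(vec) % 8 != 0:
        raise ValueError("vector length must be a multiple of 8")
    packed = bytearray()
    for start in range(0, len(vec), 8):
        byte = 0
        for x in vec[start:start + 8]:
            byte = (byte << 1) | (1 if x > 0 else 0)
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a, b):
    """Number of differing bits between two packed binary embeddings."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


if __name__ == "__main__":
    # Toy 16-value "embedding"; real vectors would have 256 to 2048 values.
    full = [0.3, -0.2, 0.1, -0.4, 0.25, 0.05, -0.15, -0.1,
            0.02, -0.03, 0.01, 0.04, -0.02, 0.03, -0.01, 0.02]
    short = truncate_and_normalize(full, 8)
    print(quantize_int8(short))
    print(quantize_binary(short).hex())
```

Binary embeddings are usually compared with Hamming distance, as in `hamming_distance`, often as a cheap first pass before rescoring the top candidates with higher-precision vectors.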
Key Characteristics
| Property | Value |
|---|---|
| Developer | Voyage AI |
| Model | voyage-code-3 |
| Type | Code embedding (not generative) |
| Released | December 2024 |
| Context length | 32,768 tokens |
| Default dimensions | 1,024 |
| Max dimensions | 2,048 |
| Quantization | int8, uint8, binary, ubinary |
| Benchmark | +13.8% vs OpenAI text-embedding-3-large, +16.8% vs CodeSage-large (average over 32 datasets) |
| Deployment | Voyage API, AWS SageMaker |
| Pricing | API-based (see voyageai.com) |
| Hugging Face | voyageai/voyage-code-3 (tokenizer; the model itself is served via API) |
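The properties in the table map onto request options when calling the hosted API. The sketch below validates those options locally and then, in a separate function, passes them to the `voyageai` Python client. The `embed` parameter names (`input_type`, `output_dimension`, `output_dtype`) follow the client as documented at the time of writing and should be treated as assumptions to verify against the current API reference; `build_embed_kwargs` and `embed_code` are hypothetical helpers written for this example.

```python
ALLOWED_DIMENSIONS = {256, 512, 1024, 2048}
ALLOWED_DTYPES = {"float", "int8", "uint8", "binary", "ubinary"}
ALLOWED_INPUT_TYPES = {None, "query", "document"}


def build_embed_kwargs(texts, input_type="document",
                       output_dimension=1024, output_dtype="float"):
    """Validate options against the table above and build keyword arguments."""
    if output_dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unsupported output_dimension: {output_dimension}")
    if output_dtype not in ALLOWED_DTYPES:
        raise ValueError(f"unsupported output_dtype: {output_dtype}")
    if input_type not in ALLOWED_INPUT_TYPES:
        raise ValueError(f"unsupported input_type: {input_type}")
    return {
        "texts": list(texts),
        "model": "voyage-code-3",
        "input_type": input_type,
        "output_dimension": output_dimension,
        "output_dtype": output_dtype,
    }


def embed_code(texts, **options):
    """Call the hosted API (assumes `pip install voyageai` and VOYAGE_API_KEY set)."""
    import voyageai  # imported lazily so the validation helper works offline

    client = voyageai.Client()
    result = client.embed(**build_embed_kwargs(texts, **options))
    return result.embeddings
```

A typical setup would embed the corpus with `input_type="document"` and each search with `input_type="query"`, keeping `output_dimension` and `output_dtype` identical on both sides so that the vectors remain comparable.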