Why it matters
- Consistently strong MTEB benchmark performance — among the highest embedding quality available via API for many retrieval tasks.
- Domain-specific models (code, finance, legal) are not available from OpenAI or Cohere — a unique capability.
- Reranking API enables two-stage retrieval that significantly improves RAG precision over single-stage embedding search.
- Recommended by Anthropic — Anthropic's documentation points developers to Voyage AI for embeddings to pair with Claude in RAG pipelines.
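The two-stage retrieval pattern above (fast embedding recall over the whole corpus, then reranking a small shortlist) can be sketched in plain Python. The vectors and the rerank scoring function here are illustrative stand-ins, not Voyage API output:

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def two_stage_retrieve(query_vec, doc_vecs, rerank_score, recall_k=3, top_k=1):
    # Stage 1: cheap embedding similarity narrows the corpus to a shortlist.
    candidates = sorted(range(len(doc_vecs)),
                        key=lambda i: cosine(query_vec, doc_vecs[i]),
                        reverse=True)[:recall_k]
    # Stage 2: a (hypothetically more accurate) reranker scores only the shortlist.
    return sorted(candidates, key=rerank_score, reverse=True)[:top_k]

# Toy corpus: the reranker stub prefers doc 2, correcting stage 1's order.
docs = [[1.0, 0.0], [0.9, 0.1], [0.7, 0.3], [0.0, 1.0]]
best = two_stage_retrieve([1.0, 0.05], docs,
                          rerank_score={0: 0.2, 1: 0.5, 2: 0.9, 3: 0.1}.get)
print(best)  # → [2]
```

The point of the split: stage 1 keeps latency and cost linear in corpus size with a cheap similarity, while the expensive reranker only ever sees `recall_k` candidates.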
Key capabilities
- voyage-3 (general): State-of-the-art general embedding model; top MTEB performance.
- voyage-code-2: Domain-specific embedding for code search, code Q&A, and technical retrieval.
- voyage-finance-2: Optimized for financial document retrieval (10-Ks, earnings calls, market research).
- voyage-law-2: Legal document retrieval (contracts, case law, regulations).
- voyage-multilingual-2: 100+ language support with strong cross-lingual retrieval.
- Rerankers: rerank-1 and rerank-lite-1 for post-retrieval precision improvement.
- Instruction following: Embedding models accept task instructions (e.g., retrieval query vs. document vs. classification input) so the embedding is tuned to how the text will be used.
- REST API: Simple POST endpoint; Python and TypeScript SDKs.
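A minimal sketch of that POST request: the endpoint path and field names below follow Voyage's public REST documentation (including `input_type` for query/document-aware embeddings), but verify them against the current API reference before relying on them.

```python
def build_embed_request(texts, model="voyage-3", input_type=None,
                        api_key="YOUR_API_KEY"):
    # Assemble URL, headers, and JSON body for the embeddings endpoint.
    # input_type ("query" or "document") lets the model tailor the
    # embedding to the retrieval role of the text.
    body = {"input": texts, "model": model}
    if input_type is not None:
        body["input_type"] = input_type
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    return "https://api.voyageai.com/v1/embeddings", headers, body

url, headers, body = build_embed_request(
    ["def add(a, b): return a + b"],
    model="voyage-code-2",
    input_type="document",
)
```

Sending it is then a single `requests.post(url, headers=headers, json=body)`; the response carries one embedding per input text.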
Technical notes
- Models: voyage-3, voyage-3-lite, voyage-code-2, voyage-finance-2, voyage-law-2, voyage-multilingual-2
- Dimensions: 1024 (voyage-3); 512 (voyage-3-lite) — configurable
- Max tokens: 32,000 tokens per input (voyage-3)
- API: REST; Python SDK (pip install voyageai); TypeScript SDK
- Pricing: Pay-per-token; ~$0.06/1M tokens (voyage-3); free tier available
- Company: Voyage AI; Stanford-affiliated; founded 2023 by Tengyu Ma and colleagues
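The rerankers listed above use the same call shape. A hedged sketch of the rerank payload (endpoint path, field names, and the top_k parameter per Voyage's docs; double-check against the live API reference):

```python
def build_rerank_request(query, documents, model="rerank-lite-1",
                         top_k=None, api_key="YOUR_API_KEY"):
    # JSON body for the reranking endpoint: the model scores each
    # candidate document against the query and returns them ordered
    # by relevance, optionally truncated to the top_k best.
    body = {"query": query, "documents": documents, "model": model}
    if top_k is not None:
        body["top_k"] = top_k
    headers = {"Authorization": f"Bearer {api_key}",
               "Content-Type": "application/json"}
    return "https://api.voyageai.com/v1/rerank", headers, body

url, _, body = build_rerank_request(
    "warranty terms",
    ["shipping policy text", "limited warranty clause"],
    top_k=1,
)
```

In a two-stage pipeline, `documents` would be the shortlist returned by the embedding search, not the full corpus.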
Ideal for
- RAG applications in specialized domains (legal, finance, code) where domain-specific embeddings significantly improve retrieval.
- Teams who have measured their RAG performance and need the highest possible retrieval accuracy.
- Applications needing two-stage retrieval with reranking for precision-critical use cases.
Not ideal for
- Simple general-purpose semantic search where OpenAI or Cohere embeddings are sufficient.
- Teams heavily invested in OpenAI's ecosystem who want tighter integration.
- Large-scale embeddings on a tight budget — premium quality comes at premium pricing.
See also
- Cohere Embed — Strong competitor with better enterprise cloud marketplace availability.
- Pinecone — Vector database for storing and searching Voyage AI embeddings.
- Jina Embeddings — Open-source-first embedding models with 8K token context window.