Why it matters
- Hybrid search (vector + BM25) outperforms pure vector search for most RAG use cases — critical for production-quality retrieval where some queries are semantic and others are exact-match.
- Built-in LLM modules auto-vectorize data on ingestion — point at OpenAI/Cohere/HuggingFace and skip manual embedding generation pipelines.
- Open-source self-hosted option provides complete data privacy and cost control — essential for enterprises that can't send data to third-party cloud vector databases.
- Multi-tenancy support handles SaaS applications where each customer has isolated vector data.
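The weighted hybrid-search idea above can be sketched in a few lines of plain Python. This is a simplified illustration of an alpha blend over normalized scores, not Weaviate's production implementation (Weaviate's actual fusion algorithms, ranked fusion and relative score fusion, differ in detail), and the scores are hypothetical:

```python
# Simplified sketch of score fusion behind a hybrid-search "alpha" parameter.
# alpha=1.0 -> pure vector search, alpha=0.0 -> pure BM25 keyword search.

def min_max_normalize(scores):
    """Scale a list of scores into [0, 1] so the two retrievers are comparable."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(vector_scores, bm25_scores, alpha=0.5):
    """Blend normalized vector and BM25 scores for the same candidate documents."""
    v = min_max_normalize(vector_scores)
    k = min_max_normalize(bm25_scores)
    return [alpha * vi + (1 - alpha) * ki for vi, ki in zip(v, k)]

# Three hypothetical candidate documents scored by both retrievers:
vector_scores = [0.9, 0.2, 0.5]   # e.g. cosine similarities
bm25_scores = [1.2, 7.8, 3.0]     # raw BM25 scores (unbounded scale)

print(hybrid_scores(vector_scores, bm25_scores, alpha=0.5))
```

A document that ranks poorly on semantic similarity but matches query terms exactly (the second one here) still surfaces — the core reason hybrid search helps on mixed query workloads.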
Key capabilities
- Vector search: Approximate nearest neighbor (ANN) search with HNSW indexing.
- BM25 search: Traditional keyword/term frequency search.
- Hybrid search: Combine vector and keyword search with configurable weighting.
- Auto-vectorization: Native modules for OpenAI, Cohere, HuggingFace, and local model embedding.
- GraphQL API: Query with GraphQL in addition to REST.
- Multi-tenancy: Data isolation between tenants for SaaS applications.
- Generative AI: RAG modules for generating answers from retrieved context.
- Self-hosted: Run on Docker/Kubernetes; no vendor lock-in.
- Weaviate Cloud: Managed service with free sandbox tier.
- Python/JS SDKs: Official client libraries.
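The GraphQL API above exposes the same hybrid search as the client SDKs. A sketch of the query shape (the `Documents` class and its `title`/`content` properties are assumed schema, not part of Weaviate itself):

```graphql
{
  Get {
    Documents(
      hybrid: { query: "machine learning transformers", alpha: 0.5 }
      limit: 5
    ) {
      title
      content
    }
  }
}
```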
Technical notes
- License: BSD 3-Clause
- GitHub: github.com/weaviate/weaviate
- Stars: 11K+
- Self-hosted: Docker/Kubernetes; free
- Cloud: Weaviate Cloud; Sandbox free (14-day TTL); paid from $25/month
- API: GraphQL + REST
- Embedding modules: OpenAI, Cohere, HuggingFace, Google PaLM, custom
- Languages: Python, JavaScript/TypeScript, Go, Java client libraries
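For the self-hosted route, a minimal single-node Docker Compose file looks roughly like this (a sketch: the module list, ports, and settings are assumptions for a dev setup — use Weaviate's configuration tooling for production):

```yaml
services:
  weaviate:
    image: cr.weaviate.io/semitechnologies/weaviate:latest
    ports:
      - "8080:8080"    # REST + GraphQL
      - "50051:50051"  # gRPC (used by the v4 Python client)
    environment:
      AUTHENTICATION_ANONYMOUS_ACCESS_ENABLED: "true"
      PERSISTENCE_DATA_PATH: "/var/lib/weaviate"
      ENABLE_MODULES: "text2vec-openai,generative-openai"
    volumes:
      - weaviate_data:/var/lib/weaviate
volumes:
  weaviate_data:
```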
Usage example
import weaviate
from weaviate.classes.init import Auth

# Connect to Weaviate Cloud
client = weaviate.connect_to_weaviate_cloud(
    cluster_url="your-cluster.weaviate.network",
    auth_credentials=Auth.api_key("your-api-key"),
    headers={"X-OpenAI-Api-Key": "your-openai-key"},  # for auto-vectorization
)

# Hybrid search (alpha=1.0 is pure vector, alpha=0.0 is pure BM25)
response = client.collections.get("Documents").query.hybrid(
    query="machine learning transformers",
    alpha=0.5,  # balance vector and keyword search
    limit=5,
)

for obj in response.objects:
    print(obj.properties)
Ideal for
- RAG applications requiring hybrid search for high-quality retrieval across semantic and keyword queries.
- Enterprise teams who need self-hosted vector storage with full data privacy.
- Multi-tenant SaaS applications where each customer's data must be isolated.
Not ideal for
- Simple vector search without keyword matching needs — Pinecone or Qdrant may be simpler.
- Teams that want the simplest possible managed setup with no infrastructure to think about — Pinecone has less operational overhead.
- Very high vector counts (100M+) without managed cloud support — self-hosted Weaviate requires careful capacity planning.
See also
- LlamaIndex — RAG framework with native Weaviate integration; use together for full RAG pipeline.
- Jina Embeddings — High-quality embeddings for populating Weaviate.
- Haystack — NLP pipeline framework with Weaviate as a document store backend.