Why it matters
- The fastest way to add vector search to a Python app — literally 3 lines of code to store and query embeddings.
- Runs in-process (no server needed) or as a lightweight HTTP server — ideal for prototyping and small production apps.
- First-class LangChain and LlamaIndex integration makes it the default vector store in most RAG tutorials.
- Open source under Apache 2.0 — no license fees, no vendor lock-in, no usage caps.
Key capabilities
- In-process mode: Run entirely within your Python process with no separate server; just `import chromadb`.
- Persistent storage: Save collections to disk with `PersistentClient`; data survives restarts without a database server.
- Client-server mode: Run Chroma as an HTTP server and connect from multiple processes or languages.
- Automatic embedding: Pass raw text and Chroma embeds it automatically using sentence-transformers or your chosen model.
- Metadata filtering: Filter by structured metadata fields alongside vector similarity in a single query.
- Multi-modal support: Store and query text, image, and audio embeddings in the same collection.
- LangChain integration: `from langchain_chroma import Chroma` gives a drop-in vector store.
- LlamaIndex integration: `ChromaVectorStore` is available in `llama-index-vector-stores-chroma`.
Technical notes
- Language: Python (primary); JavaScript/TypeScript client available
- Storage backends: In-memory (ephemeral), persistent disk (SQLite + HNSW index), or HTTP server mode
- Embedding functions: sentence-transformers (default), OpenAI, Cohere, HuggingFace, Google, Ollama
- Indexing algorithm: HNSW (Hierarchical Navigable Small World) for approximate nearest neighbor search
- License: Apache 2.0 — open source
- Chroma Cloud: Managed hosted version in development; self-hosted is the primary deployment today
- Founded: 2022 by Jeff Huber and Anton Troynikov; backed by a16z
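To make the HNSW note concrete: HNSW is an approximation of exact nearest-neighbor search. The exact version it approximates is a brute-force scan, sketched here in plain Python (not Chroma's implementation):

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b))
    return dot / norm

def brute_force_nearest(query, vectors, k=1):
    """Exact k-nearest-neighbor search: score every vector, O(n * d) per query.
    HNSW instead builds a layered proximity graph so a query follows roughly
    O(log n) graph hops, trading a little recall for large speedups."""
    scored = sorted(
        vectors.items(),
        key=lambda item: cosine_similarity(query, item[1]),
        reverse=True,
    )
    return [vec_id for vec_id, _ in scored[:k]]

vectors = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
print(brute_force_nearest([0.9, 0.1], vectors, k=2))  # → ['a', 'c']
```

At small collection sizes the brute-force scan is often fast enough; the approximate index pays off as the collection grows.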
Ideal for
- Developers prototyping RAG applications who want the fastest setup without a separate database process.
- Privacy-sensitive applications where embeddings must stay local and never touch a third-party service.
- Small-to-medium production apps (up to millions of vectors) where simplicity is more valuable than horizontal scale.
Not ideal for
- Very large-scale deployments (100M+ vectors) — dedicated vector databases like Pinecone or Milvus scale better.
- Production apps needing managed infrastructure, SLAs, and enterprise support — consider Pinecone or Qdrant Cloud.
- Non-Python applications requiring a native SDK — the JavaScript client is less mature.