Why it matters
- Single `completion()` function that works with 100+ providers — swap models by changing one string, not rewriting your code.
- LiteLLM Proxy turns any codebase that uses the OpenAI SDK into one that can use Anthropic, Gemini, or any other provider instantly.
- Cost tracking and budget limits across all providers in one place — essential for teams using multiple LLMs.
- The de facto standard for building multi-provider LLM applications and agent frameworks in Python.
Key capabilities
- Unified completion API: `litellm.completion("anthropic/claude-3-5-sonnet", messages=[...])` — same interface for all providers.
- 100+ provider support: OpenAI, Anthropic, Google Gemini, Azure OpenAI, AWS Bedrock, Cohere, Groq, Together AI, Replicate, Ollama, and more.
- Input/output normalization: Translates between provider-specific formats automatically.
- Streaming support: Unified streaming interface across all providers that support it.
- Async support: `litellm.acompletion()` for asyncio-based applications.
- LiteLLM Proxy: Self-hosted OpenAI-compatible server with load balancing, fallbacks, rate limiting, and virtual API keys.
- Cost calculation: Built-in cost tracking per model with `litellm.completion_cost()`.
- Error handling: Consistent error types and retry logic across providers.
- Embedding support: `litellm.embedding()` for unified access to embedding models.
Technical notes
- Language: Python 3.8+; `pip install litellm`
- License: MIT — fully open source at github.com/BerriAI/litellm
- Proxy deployment: Docker (`docker pull ghcr.io/berriai/litellm`); config via YAML
- OpenAI compatibility: Full OpenAI API compatibility — any OpenAI SDK client works with LiteLLM Proxy
- LiteLLM Enterprise: Custom pricing — adds SSO, audit logs, HIPAA compliance, dedicated support
- Providers: 100+ including all major US and international LLM API providers
- Founded: 2023 by Krish Dholakia and Ishaan Jaffer; San Francisco; YC backed
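The YAML proxy configuration follows LiteLLM's `model_list` shape; the deployment names, aliases, and env-var references below are illustrative assumptions, not defaults:

```yaml
# config.yaml for LiteLLM Proxy (values are examples)
model_list:
  - model_name: gpt-4o                  # alias clients request
    litellm_params:
      model: azure/my-gpt4o-deployment
      api_base: https://example.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY # read from environment
  - model_name: claude-sonnet
    litellm_params:
      model: anthropic/claude-3-5-sonnet
      api_key: os.environ/ANTHROPIC_API_KEY

litellm_settings:
  num_retries: 2
```

Start the server with `litellm --config config.yaml`, or mount the file into the Docker image.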
Ideal for
- Python developers building LLM applications who want provider flexibility without rewriting their code.
- Teams running a self-hosted LLM proxy to add load balancing, rate limiting, and cost controls on top of their LLM API usage.
- Companies using multiple LLM providers (e.g., Claude for quality, Groq for speed) and wanting a unified interface.
Not ideal for
- Teams who want a hosted multi-LLM API without self-hosting infrastructure — use OpenRouter instead.
- Non-Python applications — LiteLLM itself is a Python library; the proxy exposes an OpenAI-compatible REST API any language can call, but running it requires DevOps effort.
- Simple single-provider setups — just use the official SDK; LiteLLM adds complexity if you only use one provider.
See also
- OpenRouter — Hosted API service for 100+ LLMs — simpler, no self-hosting.
- Portkey — AI gateway with LLM routing, fallbacks, and observability as a service.
- LangChain — LLM framework that integrates LiteLLM for multi-provider support.