Why it matters
- The fastest integration path for LLM observability — a single line of code (changing the base URL) versus full SDK instrumentation.
- Captures 100% of LLM calls automatically as a transparent proxy — nothing slips through even if you forget to add tracing.
- Real-time cost tracking across providers — essential for teams managing LLM API budgets.
- Open source, so self-hosting is an option for teams with strict data privacy requirements.
Key capabilities
- Proxy-based logging: Change your API base URL to Helicone's proxy; every request is logged automatically.
- Cost tracking: Per-request and aggregate cost in USD by provider, model, and user property.
- Latency analytics: P50/P95/P99 latency; slow request identification; model comparison.
- Request/response logging: Full prompt and completion stored with configurable retention.
- User tracking: Add a `Helicone-User-Id` header to attribute usage to specific users or sessions.
- Rate limiting: Configure per-user rate limits via Helicone headers — protect against runaway usage.
- Caching: Cache identical LLM requests to reduce costs and latency.
- Custom properties: Tag requests with metadata (experiment name, environment, model version) for filtering.
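The metadata capabilities above are all driven by per-request headers. A minimal sketch of building them, assuming Helicone's documented header conventions (`Helicone-User-Id`, `Helicone-Cache-Enabled`, `Helicone-Property-<Name>`) — verify exact names against the current docs:

```python
# Sketch: build the optional Helicone metadata headers to attach to a
# proxied LLM request. Header names are assumptions based on Helicone's
# documented conventions; the values are placeholders.
def helicone_headers(user_id: str, cache: bool = False, **properties: str) -> dict:
    """Return per-request headers for user tracking, caching, and custom properties."""
    headers = {"Helicone-User-Id": user_id}  # attribute the call to a user/session
    if cache:
        headers["Helicone-Cache-Enabled"] = "true"  # serve identical requests from cache
    for name, value in properties.items():
        # e.g. environment="staging" -> Helicone-Property-Environment
        headers[f"Helicone-Property-{name.replace('_', '-').title()}"] = value
    return headers

hdrs = helicone_headers("user-42", cache=True, environment="staging")
```

These headers ride along with the normal provider request; Helicone strips and records them, so the upstream API never sees them.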
Technical notes
- Integration: Change the base URL to the Helicone proxy and add a `Helicone-Auth` header — no SDK needed.
- Proxy URLs: `oai.helicone.ai` (OpenAI), `anthropic.helicone.ai` (Anthropic), `groq.helicone.ai` (Groq).
- Open source: github.com/Helicone/helicone — MIT license; self-hostable.
- Pricing: Free (10K req/mo); Growth $80/mo; Pro $200/mo; Enterprise custom
- Streaming: Full SSE streaming support
- Data retention: 30 days on free; 90 days on Growth; Enterprise configurable
- Founded: 2023 by Justin Torre; San Francisco; YC W23
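The base-URL swap can be sketched with only the standard library, so the routing is visible without any SDK. The proxy host and `Helicone-Auth` header come from the notes above; the model name and API keys are placeholders, and the request is built but not sent:

```python
# Sketch: route an OpenAI-style chat completion through Helicone's proxy.
# Only the hostname and one extra header change versus calling OpenAI directly.
import json
import urllib.request

HELICONE_BASE = "https://oai.helicone.ai/v1"  # was https://api.openai.com/v1

def build_chat_request(openai_key: str, helicone_key: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a chat completion request routed via Helicone."""
    body = json.dumps({"model": "gpt-4o-mini", "messages": messages}).encode()
    return urllib.request.Request(
        f"{HELICONE_BASE}/chat/completions",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {openai_key}",    # still your provider key
            "Helicone-Auth": f"Bearer {helicone_key}",  # authenticates to Helicone
        },
        method="POST",
    )

req = build_chat_request("sk-...", "sk-helicone-...", [{"role": "user", "content": "hi"}])
```

Because the request body and provider auth are untouched, existing client code keeps working; Helicone logs the call in transit and forwards it upstream.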
Ideal for
- Teams who want LLM observability with minimal integration effort — no code instrumentation required.
- Applications already using OpenAI SDK where changing one URL is the preferred integration path.
- Teams tracking multi-user LLM usage and costs who need per-user attribution and rate limiting.
Not ideal for
- Complex evaluation and prompt management workflows — LangSmith or LangFuse are more comprehensive.
- Teams using custom LLM architectures that don't fit the proxy model.
- Teams with strict data residency requirements who can't route API traffic through Helicone's servers (self-host instead).