Why it matters
- 18K+ GitHub stars — one of the most popular structured generation libraries, backed by Microsoft Research's credibility.
- Interleaved code and generation approach enables richer programs than pure prompt-and-parse pipelines.
- Token healing produces higher-quality constrained output by fixing tokenization boundary artifacts — technically superior to naive constrained decoding.
- Works with local models (via transformers, llama.cpp) for full control and privacy.
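The token-healing claim above can be illustrated with a minimal sketch. This is not Guidance's actual implementation — the toy vocabulary and greedy tokenizer are made up for illustration. The idea: if a prompt ends mid-token (e.g. `"http:"`), naive decoding is stuck after the `":"` token and can never emit the merged `"://"` token; token healing backs up the last prompt token and constrains the next generated token to start with its text.

```python
# Toy sketch of token healing (illustrative only; not Guidance's internals).
VOCAB = ["http", ":", "://", "//", "www", ".", "com"]

def tokenize(text):
    """Greedy longest-match tokenizer over the toy vocabulary."""
    tokens = []
    while text:
        match = max((t for t in VOCAB if text.startswith(t)), key=len)
        tokens.append(match)
        text = text[len(match):]
    return tokens

def heal_and_constrain(prompt):
    """Return (healed prompt tokens, allowed first generated tokens).

    Backs up one token, then restricts generation to tokens that
    begin with the removed token's text.
    """
    tokens = tokenize(prompt)
    last = tokens.pop()  # back up past the boundary artifact
    allowed = [t for t in VOCAB if t.startswith(last)]
    return tokens, allowed

tokens, allowed = heal_and_constrain("http:")
# The model may now pick "://" in one step instead of being stuck after ":".
```

With healing, `allowed` contains both `":"` and `"://"`, so the model can choose the token it would have produced had the prompt been tokenized as a whole.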
Key capabilities
- `gen()` function: Mark generation points within templates — LLM generates only where specified.
- Constrained generation: `select(['option1', 'option2'])` for classification; regex patterns; JSON schemas.
- Token healing: Automatically fixes tokenization boundary issues for higher-quality constrained output.
- Template language: Mix Python variables, f-string-like syntax, and LLM generation in one template.
- Control structures: Conditionals and loops that incorporate LLM output as conditions or loop variables.
- Stateful sessions: Maintain conversation context efficiently — reuse prefills, reduce redundant computation.
- Multi-model support: OpenAI, Anthropic, Azure OpenAI, local models (transformers, llama.cpp).
- Efficient caching: Cache shared prefixes across multiple generations for lower latency and cost.
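The constrained-generation capability can be sketched in a few lines. This is a conceptual stand-in, not Guidance's API: the `make_scorer` stub plays the role of an LLM's scoring of candidate continuations, and the point is that the output is guaranteed to be one of the allowed options, so no post-hoc parsing is needed.

```python
# Minimal sketch of option-constrained generation in the spirit of select()
# (illustrative; the stub scorer stands in for a real language model).

def select(options, score):
    """Pick the option the model scores highest; the result is always
    a member of `options`, by construction."""
    return max(options, key=score)

def make_scorer(context):
    """Stub 'model': prefers the option sharing the most leading chars
    with the context string (a hypothetical toy heuristic)."""
    def score(option):
        return sum(a == b for a, b in zip(context, option))
    return score

label = select(["positive", "negative"], make_scorer("positively great"))
# → "positive"
```

A real constrained decoder applies the same idea at the token level, masking logits so only tokens consistent with an allowed option (or regex, or JSON schema) can be sampled.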
Technical notes
- License: MIT (open source)
- GitHub: github.com/guidance-ai/guidance (18K+ stars)
- Install: `pip install guidance`
- Backends: OpenAI, Azure OpenAI, Anthropic, Hugging Face transformers, llama.cpp
- Python: 3.8+
- Developed by: Microsoft Research
- Alternative to: LangChain chains, Outlines, Instructor for structured generation
Ideal for
- Teams building complex LLM programs where the flow of generation depends on intermediate outputs — not just linear prompt chains.
- Researchers who need precise control over the generation process for experiments and comparisons.
- Applications where efficiency matters — Guidance's caching and constrained decoding reduce token usage vs. naive prompting.
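The caching claim above can be sketched with memoization. This is a schematic analogy, not Guidance's implementation: `encode_prefix` is a hypothetical stand-in for the expensive forward pass over a shared prompt prefix, and `generate` is a stub completion.

```python
# Sketch of prefix caching: when many generations share a prompt prefix,
# the prefix's work is done once and reused (illustrative stub, not Guidance).
from functools import lru_cache

CALLS = {"count": 0}

@lru_cache(maxsize=128)
def encode_prefix(prefix):
    """Stand-in for the expensive model pass over a shared prefix."""
    CALLS["count"] += 1
    return hash(prefix)  # pretend this is cached model state

def generate(prefix, suffix):
    state = encode_prefix(prefix)      # cache hit after the first call
    return f"{suffix}@{state % 1000}"  # stub completion

system = "You are a helpful classifier."
results = [generate(system, doc) for doc in ["doc one", "doc two", "doc three"]]
# encode_prefix ran only once despite three generations.
```

In a real inference stack the cached state is the KV cache for the shared prefix, which is why reusing prefills lowers both latency and cost.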
Not ideal for
- Simple JSON extraction, where Outlines or OpenAI's native structured outputs are the simpler choice.
- Teams without Python expertise — Guidance's template language has a learning curve.
- Commercial API users who just need basic schema compliance — OpenAI's native structured outputs feature may be sufficient.
See also
- Outlines — Alternative structured generation library; stronger for schema-based constraints.
- DSPy — Higher-level LLM program optimization; treats prompts as learnable parameters.
- Instructor — Simpler Pydantic-based structured output for commercial LLM APIs.