Why it matters
- 18K+ GitHub stars — one of the most popular structured generation libraries, backed by Microsoft Research's credibility.
- Interleaved code and generation approach enables richer programs than pure prompt-and-parse pipelines.
- Token healing produces higher-quality constrained output by fixing tokenization boundary artifacts — technically superior to naive constrained decoding.
- Works with local models (via transformers, llama.cpp) for full control and privacy.
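The token-healing claim above can be illustrated with a minimal sketch. This is not Guidance's actual implementation — the toy vocabulary and greedy tokenizer are made up for illustration. The idea: if a prompt ends mid-token (e.g. `"http:"`), naive decoding is stuck after the `":"` token and can never emit the merged `"://"` token; token healing backs up the last prompt token and constrains the next generated token to start with its text.

```python
# Toy sketch of token healing (illustrative only; not Guidance's internals).
VOCAB = ["http", ":", "://", "//", "www", ".", "com"]

def tokenize(text):
    """Greedy longest-match tokenizer over the toy vocabulary."""
    tokens = []
    while text:
        match = max((t for t in VOCAB if text.startswith(t)), key=len)
        tokens.append(match)
        text = text[len(match):]
    return tokens

def heal_and_constrain(prompt):
    """Return (healed prompt tokens, allowed first generated tokens).

    Backs up one token, then restricts generation to tokens that
    begin with the removed token's text.
    """
    tokens = tokenize(prompt)
    last = tokens.pop()  # back up past the boundary artifact
    allowed = [t for t in VOCAB if t.startswith(last)]
    return tokens, allowed

tokens, allowed = heal_and_constrain("http:")
# The model may now pick "://" in one step instead of being stuck after ":".
```

With healing, `allowed` contains both `":"` and `"://"`, so the model can choose the token it would have produced had the prompt been tokenized as a whole.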
Key capabilities
- `gen()` function: Mark generation points within templates — LLM generates only where specified.
- Constrained generation: `select(['option1', 'option2'])` for classification; regex patterns; JSON schemas.
- Token healing: Automatically fixes tokenization boundary issues for higher-quality constrained output.
- Template language: Mix Python variables, f-string-like syntax, and LLM generation in one template.
- Control structures: Conditionals and loops that incorporate LLM output as conditions or loop variables.
- Stateful sessions: Maintain conversation context efficiently — reuse prefills, reduce redundant computation.
- Multi-model support: OpenAI, Anthropic, Azure OpenAI, local models (transformers, llama.cpp).
- Efficient caching: Cache shared prefixes across multiple generations for lower latency and cost.
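The constrained-generation capability can be sketched in a few lines. This is a conceptual stand-in, not Guidance's API: the `make_scorer` stub plays the role of an LLM's scoring of candidate continuations, and the point is that the output is guaranteed to be one of the allowed options, so no post-hoc parsing is needed.

```python
# Minimal sketch of option-constrained generation in the spirit of select()
# (illustrative; the stub scorer stands in for a real language model).

def select(options, score):
    """Pick the option the model scores highest; the result is always
    a member of `options`, by construction."""
    return max(options, key=score)

def make_scorer(context):
    """Stub 'model': prefers the option sharing the most leading chars
    with the context string (a hypothetical toy heuristic)."""
    def score(option):
        return sum(a == b for a, b in zip(context, option))
    return score

label = select(["positive", "negative"], make_scorer("positively great"))
# → "positive"
```

A real constrained decoder applies the same idea at the token level, masking logits so only tokens consistent with an allowed option (or regex, or JSON schema) can be sampled.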
Technical notes
- License: MIT (open source)
- GitHub: github.com/guidance-ai/guidance (18K+ stars)
- Install: `pip install guidance`
- Backends: OpenAI, Azure OpenAI, Anthropic, Hugging Face transformers, llama.cpp
- Python: 3.8+
- Developed by: Microsoft Research
- Alternative to: LangChain chains, Outlines, Instructor for structured generation
Ideal for
- Teams building complex LLM programs where the flow of generation depends on intermediate outputs — not just linear prompt chains.
- Researchers who need precise control over the generation process for experiments and comparisons.
- Applications where efficiency matters — Guidance's caching and constrained decoding reduce token usage vs. naive prompting.
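The caching claim above can be sketched with memoization. This is a schematic analogy, not Guidance's implementation: `encode_prefix` is a hypothetical stand-in for the expensive forward pass over a shared prompt prefix, and `generate` is a stub completion.

```python
# Sketch of prefix caching: when many generations share a prompt prefix,
# the prefix's work is done once and reused (illustrative stub, not Guidance).
from functools import lru_cache

CALLS = {"count": 0}

@lru_cache(maxsize=128)
def encode_prefix(prefix):
    """Stand-in for the expensive model pass over a shared prefix."""
    CALLS["count"] += 1
    return hash(prefix)  # pretend this is cached model state

def generate(prefix, suffix):
    state = encode_prefix(prefix)      # cache hit after the first call
    return f"{suffix}@{state % 1000}"  # stub completion

system = "You are a helpful classifier."
results = [generate(system, doc) for doc in ["doc one", "doc two", "doc three"]]
# encode_prefix ran only once despite three generations.
```

In a real inference stack the cached state is the KV cache for the shared prefix, which is why reusing prefills lowers both latency and cost.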
Not ideal for
- Simple JSON extraction, where Outlines or OpenAI's native structured outputs are the simpler choice.
- Teams without Python expertise — Guidance's template language has a learning curve.
- Commercial API users who just need basic schema compliance — OpenAI's native structured outputs feature may be sufficient.
See also
- Outlines — Alternative structured generation library; stronger for schema-based constraints.
- DSPy — Higher-level LLM program optimization; treats prompts as learnable parameters.
- Instructor — Simpler Pydantic-based structured output for commercial LLM APIs.