Why it matters
- Eliminates JSON parsing boilerplate: no more json.loads(response.content) wrapped in try/except.
- Automatic retry with validation feedback dramatically improves structured extraction reliability vs. naive JSON prompting.
- Works with every major LLM API (OpenAI, Anthropic, Google, Cohere, Mistral) via provider-specific patches.
- Created by Jason Liu (jxnl), a respected practitioner in the LLM engineering community — built from real production patterns.
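To make the boilerplate claim concrete, here is a sketch of the hand-rolled pattern Instructor replaces (the parse_user helper and its field checks are illustrative, not from the library):

```python
import json

def parse_user(raw: str) -> dict:
    # Manual parsing with defensive error handling -- the kind of
    # boilerplate that response_model + auto-retry makes unnecessary.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {}  # no retry, no feedback to the model -- just give up
    if not isinstance(data.get("age"), int):
        return {}  # hand-written validation, one field at a time
    return data
```

With Instructor, the schema lives in one Pydantic model and parsing, validation, and retry all happen behind the call.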
Key capabilities
- Pydantic integration: Define data models with Pydantic; receive typed, validated Python objects.
- Auto-retry: Validation errors are fed back to the LLM for automatic correction (configurable max_retries).
- Multi-provider: OpenAI, Anthropic Claude, Google Gemini, Cohere, Mistral, Groq, Ollama.
- Streaming: Stream structured outputs token-by-token with partial object construction.
- Nested models: Support for complex nested Pydantic schemas with relationships.
- Validators: Custom Pydantic validators run on extracted data; failures trigger retry.
- Async support: AsyncInstructor for async/await usage with async LLM clients.
- Hooks: Pre/post-processing hooks for logging, caching, and monitoring.
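The validator-driven retry capability can be sketched as a plain Pydantic model (the User model and age_must_be_plausible validator are illustrative; max_retries is Instructor's documented retry knob):

```python
from pydantic import BaseModel, field_validator

class User(BaseModel):
    name: str
    age: int

    @field_validator("age")
    @classmethod
    def age_must_be_plausible(cls, v: int) -> int:
        # When this raises, Instructor feeds the error message back to
        # the LLM and retries the call, up to max_retries times.
        if not 0 <= v <= 130:
            raise ValueError("age must be between 0 and 130")
        return v
```

Passing this model as response_model=User together with max_retries=2 means a hallucinated age of, say, 999 triggers a corrective re-prompt instead of a silent bad value.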
Technical notes
- Install: pip install instructor
- License: MIT (open source)
- GitHub: github.com/jxnl/instructor (9K+ stars)
- Providers: OpenAI, Anthropic, Google, Cohere, Mistral, Groq, Ollama, Bedrock
- Python: 3.9+
- Created by: Jason Liu (jxnl)
Usage example
```python
import instructor
from openai import OpenAI
from pydantic import BaseModel

client = instructor.from_openai(OpenAI())

class User(BaseModel):
    name: str
    age: int

user = client.chat.completions.create(
    model="gpt-4o",
    response_model=User,
    messages=[{"role": "user", "content": "Extract: John is 30 years old"}],
)
# user.name == "John", user.age == 30
```
Ideal for
- Building data extraction pipelines where LLM output must conform to specific schemas (invoices, entities, classifications).
- API developers who want typed responses from LLMs without building custom parsing and validation logic.
- Teams using Pydantic already who want to extend it to LLM responses naturally.
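For the invoice-style extraction pipelines above, the schema side is ordinary nested Pydantic. A sketch (LineItem, Invoice, and the total property are hypothetical names, not part of Instructor):

```python
from pydantic import BaseModel

class LineItem(BaseModel):
    description: str
    quantity: int
    unit_price: float

class Invoice(BaseModel):
    vendor: str
    items: list[LineItem]

    @property
    def total(self) -> float:
        # Derived locally from validated fields, not asked of the LLM.
        return sum(i.quantity * i.unit_price for i in self.items)
```

Passed as response_model=Invoice, the nested items list is populated and validated per-item, so downstream code works with typed objects rather than raw JSON.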
Not ideal for
- Local model users who need guaranteed-valid output: use Outlines, which constrains decoding itself so generations always match the schema.
- Free-form text generation where structure isn't needed.
- Non-Python environments — Instructor is Python-only (JavaScript users can use Zod + Vercel AI SDK).
See also
- Outlines — Constrained generation library; mathematical guarantees for local models.
- Guidance — Microsoft's constrained generation framework; more complex but more powerful.
- Pydantic — The validation library Instructor is built on.