Instructor is a library for getting structured, validated outputs from LLMs. Instead of parsing messy text, you define a Pydantic model and get a typed, validated object back.
Why Instructor Fixes LLM Output
A developer was using regex to parse LLM responses into JSON. It broke 30% of the time: hallucinated fields, wrong types, missing data. Instructor validates each response against a Pydantic model and automatically retries when validation fails.
Key Features:
- Structured Output — Get Pydantic models from any LLM
- Validation — Automatic retry on validation errors
- Streaming — Stream partial objects as they're generated
- Multi-Provider — OpenAI, Anthropic, Cohere, Mistral, local models
- Type-Safe — Full IDE support with type inference
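The validation-and-retry idea in the list above can be sketched without an LLM or any third-party package. This is only an illustration of the general pattern, not Instructor's actual implementation: `parse_user`, `extract_with_retries`, and `fake_model` are hypothetical names, and a plain dataclass stands in for a Pydantic model.

```python
import json
from dataclasses import dataclass


@dataclass
class User:
    name: str
    age: int


def parse_user(raw: str) -> User:
    """Parse and validate one raw model reply; raise ValueError on bad data."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")
    if not isinstance(data.get("name"), str):
        raise ValueError("name must be a string")
    if not isinstance(data.get("age"), int):
        raise ValueError("age must be an integer")
    return User(name=data["name"], age=data["age"])


def extract_with_retries(ask_model, prompt: str, max_retries: int = 2) -> User:
    """Call ask_model, validate the reply, and re-ask with the error on failure."""
    message = prompt
    last_error = None
    for _ in range(max_retries + 1):
        raw = ask_model(message)
        try:
            return parse_user(raw)
        except ValueError as exc:
            last_error = exc
            # Feed the validation error back so the model can correct itself.
            message = f"{prompt}\nPrevious reply was invalid: {exc}"
    raise RuntimeError(f"no valid reply after {max_retries + 1} attempts") from last_error


# Hypothetical stand-in for an LLM: wrong type first, valid JSON second.
replies = iter(['{"name": "John", "age": "thirty"}', '{"name": "John", "age": 30}'])
calls = []


def fake_model(message: str) -> str:
    calls.append(message)
    return next(replies)


user = extract_with_retries(fake_model, "Extract: John is 30")
print(user)  # User(name='John', age=30)
```

The first reply fails the type check, so the second call carries the error message along with the original prompt, and the second reply validates.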
Quick Start
pip install instructor
import instructor
from openai import OpenAI
from pydantic import BaseModel

# Wrap the OpenAI client so that create() accepts response_model.
# OpenAI() reads the API key from the OPENAI_API_KEY environment variable.
client = instructor.from_openai(OpenAI())

# The shape you want back from the model.
class User(BaseModel):
    name: str
    age: int
    email: str

user = client.chat.completions.create(
    model="gpt-4",
    response_model=User,
    messages=[{"role": "user", "content": "Extract: John is 30, email john@example.com"}],
)

print(user.name)   # "John"
print(user.age)    # 30
print(user.email)  # "john@example.com"
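Type checks are only part of validation. Custom rules can be attached to the model as well, and those are the errors a retry would be reacting to. The sketch below runs Pydantic on its own, with no LLM call, to show what passes and what fails. The age range is a made-up rule for illustration. Instructor's documentation describes a `max_retries` argument to `create` that controls how many times a failed validation is retried, so check the docs for your version before relying on it.

```python
from pydantic import BaseModel, ValidationError, field_validator


class User(BaseModel):
    name: str
    age: int
    email: str

    @field_validator("age")
    @classmethod
    def age_must_be_plausible(cls, value: int) -> int:
        # Hypothetical business rule, added for illustration only.
        if not 0 <= value <= 150:
            raise ValueError("age must be between 0 and 150")
        return value


# A well-formed payload validates into a typed object.
ok = User.model_validate({"name": "John", "age": 30, "email": "john@example.com"})
print(ok.age)  # 30

# A payload that breaks the rule raises ValidationError.
try:
    User.model_validate({"name": "John", "age": 300, "email": "john@example.com"})
    errors = []
except ValidationError as exc:
    errors = exc.errors()

print(errors[0]["loc"])  # ('age',)
```

The error list names the failing field, which is the kind of feedback a retry can pass back to the model.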
Complex Extraction
# Reuses `client` and `BaseModel` from the Quick Start.
class Address(BaseModel):
    street: str
    city: str
    country: str

class Company(BaseModel):
    name: str
    employees: int
    address: Address  # nested model
    tags: list[str]

# `article_text` is assumed to hold the text to extract from.
company = client.chat.completions.create(
    model="gpt-4",
    response_model=Company,
    messages=[{"role": "user", "content": article_text}],
)
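To see what the nested model expects, it can help to validate a payload by hand, without calling an LLM. The payload below is invented for illustration. It shows that the nested `address` becomes an `Address` instance and that a missing nested field is reported with its full path.

```python
from pydantic import BaseModel, ValidationError


class Address(BaseModel):
    street: str
    city: str
    country: str


class Company(BaseModel):
    name: str
    employees: int
    address: Address
    tags: list[str]


# Hypothetical payload, shaped like what the model would need to return.
payload = {
    "name": "Acme",
    "employees": 120,
    "address": {"street": "1 Main St", "city": "Berlin", "country": "Germany"},
    "tags": ["manufacturing", "b2b"],
}

company = Company.model_validate(payload)
print(company.address.city)            # Berlin
print(type(company.address).__name__)  # Address

# Dropping a nested field fails validation, and the error path points at it.
broken = {**payload, "address": {"street": "1 Main St", "city": "Berlin"}}
try:
    Company.model_validate(broken)
    missing = None
except ValidationError as exc:
    missing = exc.errors()[0]["loc"]

print(missing)  # ('address', 'country')
```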
Why Choose Instructor
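- Structured output: responses come back as Pydantic models, not text you parse by hand
- Validation + retries: a response that fails validation triggers an automatic retry
- Multi-provider: works with OpenAI, Anthropic, Cohere, Mistral, and local models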
Check out the Instructor docs to get started.
Building AI data pipelines? Check out my Apify actors or email spinov001@gmail.com for custom solutions.