LLMs sometimes return invalid JSON. Outlines rules that out by constraining token generation to match your schema.
What Is Outlines?
Outlines is a Python library that constrains LLM text generation using JSON schemas, regular expressions, and formal grammars. Unlike Instructor, which validates after generation, Outlines constrains DURING generation: at every step the LLM can only pick tokens that keep the output valid.
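# Written against the pre-1.0 Outlines API (outlines.models, outlines.generate); newer releases may differ.
import outlines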
model = outlines.models.transformers("mistralai/Mistral-7B-v0.3")
# JSON schema constraint
schema = '''{
"type": "object",
"properties": {
"name": {"type": "string"},
"age": {"type": "integer"},
"city": {"type": "string"}
},
"required": ["name", "age", "city"]
}'''
generator = outlines.generate.json(model, schema)
result = generator("Extract info: John is 30 and lives in NYC")
# GUARANTEED valid JSON matching schema
# {"name": "John", "age": 30, "city": "NYC"}
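If you want to confirm the guarantee in your own tests, one option is to check the returned value against the same schema. The sketch below uses only the standard library and handles just the flat object schema from the example. `conforms` and `PY_TYPES` are hypothetical names introduced here, not part of Outlines.

```python
import json

# Same flat schema as in the example above.
SCHEMA = json.loads('''{
  "type": "object",
  "properties": {
    "name": {"type": "string"},
    "age": {"type": "integer"},
    "city": {"type": "string"}
  },
  "required": ["name", "age", "city"]
}''')

# Maps JSON Schema type names to Python types (only the ones used here).
PY_TYPES = {"string": str, "integer": int, "object": dict}


def conforms(value, schema):
    """Hypothetical helper: checks a flat object schema only (no nesting, no formats)."""
    if not isinstance(value, dict):
        return False
    if any(key not in value for key in schema.get("required", [])):
        return False
    for key, spec in schema["properties"].items():
        if key not in value:
            continue
        expected = PY_TYPES[spec["type"]]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value[key], bool) or not isinstance(value[key], expected):
            return False
    return True


good = json.loads('{"name": "John", "age": 30, "city": "NYC"}')
bad = json.loads('{"name": "John", "age": "thirty"}')

print(conforms(good, SCHEMA))  # True
print(conforms(bad, SCHEMA))   # False: wrong type for age, city missing
```

For anything beyond a flat schema, a full JSON Schema validator is the better tool. This is only meant as a quick sanity check.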
Regex Constraints
# Phone number — only valid phone patterns
phone_gen = outlines.generate.regex(model, r"\+1-\d{3}-\d{3}-\d{4}")
phone = phone_gen("Give me a US phone number")
# "+1-555-123-4567" — ALWAYS matches regex
# Email
email_gen = outlines.generate.regex(model, r"[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}")
email = email_gen("Generate an email for John Doe")
# "john.doe@example.com"
# Date
date_gen = outlines.generate.regex(model, r"\d{4}-\d{2}-\d{2}")
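One caveat worth seeing in code: a regex constrains the shape of the output, not its meaning. The date pattern above accepts any four digits, two digits, two digits, so a string like "2024-13-45" would still satisfy it. A minimal sketch of a follow-up check, using only the standard library (`is_real_date` is a hypothetical helper name):

```python
import re
from datetime import date

DATE_PATTERN = r"\d{4}-\d{2}-\d{2}"  # same pattern as date_gen above


def is_real_date(text):
    """True only if text matches the pattern AND names a calendar date."""
    if re.fullmatch(DATE_PATTERN, text) is None:
        return False
    try:
        date.fromisoformat(text)
    except ValueError:
        return False
    return True


print(is_real_date("2024-02-29"))  # True: 2024 is a leap year
print(is_real_date("2024-13-45"))  # False: matches the regex, but no such date
```

If the distinction matters for your use case, you can either tighten the regex or keep a cheap semantic check like this after generation.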
Enum/Choice
from enum import Enum
class Sentiment(str, Enum):
    POSITIVE = "positive"
    NEGATIVE = "negative"
    NEUTRAL = "neutral"
sentiment_gen = outlines.generate.choice(model, list(Sentiment))
result = sentiment_gen("The movie was absolutely terrible")
# "negative" — can ONLY be one of the three options
How It Works
Outlines builds a finite-state machine from your schema/regex. At each generation step, it masks all tokens that would lead to invalid output. The LLM can only choose valid next tokens.
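This means: structurally valid output by construction. Not 99%. Not "retry until valid." The one condition is that generation runs to completion instead of being cut off by a token limit. The constraint covers format only; whether the extracted values are correct is still up to the model.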
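The masking idea from this section can be illustrated with a toy example. This is not how Outlines is implemented internally: the state machine is hand-written for the pattern `\d\d-\d\d`, the vocabulary is eight made-up tokens, and `fake_logits` is a stand-in for a real model that deliberately prefers junk. All names here are assumptions for illustration.

```python
# Toy illustration only: a hand-written DFA for the pattern \d\d-\d\d and a fake "model".
# None of these names come from Outlines.

DIGITS = set("0123456789")
ACCEPT = 5
VOCAB = ["hello", "12", "7", "-", "-3", "99-", " ", "4x"]


def step(state, ch):
    """Advance the DFA by one character; return None on a dead end."""
    # States 0, 1, 3, 4 expect a digit; state 2 expects "-"; state 5 is accepting.
    if state in (0, 1, 3, 4) and ch in DIGITS:
        return state + 1
    if state == 2 and ch == "-":
        return 3
    return None


def walk(state, token):
    """Feed a whole (multi-character) token through the DFA."""
    for ch in token:
        state = step(state, ch)
        if state is None:
            return None
    return state


def fake_logits(prefix):
    """Stand-in for an LLM: scores tokens by vocabulary order, so "hello" always wins."""
    return {tok: float(len(VOCAB) - i) for i, tok in enumerate(VOCAB)}


def constrained_generate():
    state, out = 0, ""
    while state != ACCEPT:
        logits = fake_logits(out)
        # Mask: keep only tokens that leave the DFA in a live state.
        allowed = {t: s for t, s in logits.items() if walk(state, t) is not None}
        token = max(allowed, key=allowed.get)  # greedy pick among valid tokens
        state = walk(state, token)
        out += token
    return out


print(constrained_generate())  # 12-12
```

Even though the fake model always ranks "hello" highest, that token never survives the mask, so the output still matches the pattern. Re-walking every token at every step is fine for eight tokens; with a real vocabulary you would want to precompute the allowed tokens per state.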
Outlines vs Instructor
| Feature | Instructor | Outlines |
|---|---|---|
| Method | Validate after generation | Constrain during generation |
| Guarantee | Retry on failure | 100% valid by construction |
| Speed | May need retries | One pass |
| Models | Mostly hosted APIs | Mostly local models (e.g. transformers) |
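To make the "may need retries" row concrete, here is a simplified sketch of the validate-after-generation pattern. It does not use Instructor's API. `unconstrained_model` is a hypothetical stand-in that returns canned replies (two broken, one valid) so the example runs without any model.

```python
import json

# Hypothetical stand-in for a model call; not Instructor's or Outlines' API.
# It returns broken output on the first two calls and valid JSON on the third.
REPLIES = iter([
    '{"name": "John", "age": ',      # truncated JSON
    "Sure! Here is the JSON:",       # chatty preamble, no JSON at all
    '{"name": "John", "age": 30}',   # valid
])


def unconstrained_model(prompt):
    return next(REPLIES)


def generate_with_retries(prompt, max_retries=3):
    """Validate after generation; ask again when parsing fails."""
    for attempt in range(1, max_retries + 1):
        raw = unconstrained_model(prompt)
        try:
            return json.loads(raw), attempt
        except json.JSONDecodeError:
            continue
    raise ValueError(f"no valid JSON after {max_retries} attempts")


data, calls = generate_with_retries("Extract info: John is 30")
print(data, calls)  # {'name': 'John', 'age': 30} 3
```

Each failed attempt costs a full generation, which is the overhead that constraining during generation avoids.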
Building AI tools? Check out my data extraction toolkit or email spinov001@gmail.com.