Every test suite has the same dirty secret: `name="Test User"`, `email="test@test.com"`, `bio="Lorem ipsum"`. Copy-pasted across 50 tests, never catching real edge cases, never feeling like production data.
I built FixtureForge to fix this — but along the way, I learned that AI is the wrong tool for most of the problem. Here's what I mean.
## The Problem With "Just Use Faker"
Faker is great for structured fields — names, emails, phone numbers, addresses. But it can't generate a realistic user bio, a convincing product review, or an angry customer complaint that actually tests your edge cases.
```python
# This is what most test data looks like:
user = User(name="Test User", email="test@test.com", bio="Lorem ipsum...")

# It doesn't catch real-world edge cases.
# It doesn't feel like production data.
# Writing 500 of them by hand? Not happening.
```
The obvious answer in 2026 is "use AI." But sending every field to an LLM is expensive, slow, and unnecessary. An email address doesn't need AI. An auto-incrementing ID definitely doesn't need AI.
## The Insight: Only Semantic Fields Need AI
FixtureForge splits every model field into four tiers:
| Tier | Examples | Generator | API Cost |
|---|---|---|---|
| Structural | `id`, `user_id`, `created_at` | Internal counters / FK registry | Free |
| Standard | `name`, `email`, `phone` | Faker | Free |
| Computed | `@computed_field` properties | Pydantic | Free |
| Semantic | `bio`, `description`, `review` | LLM (batched) | API tokens |
The key: 100 users with 2 semantic fields = 2 API calls, not 200. FixtureForge batches all semantic values into a single prompt and asks the LLM to return a JSON array.
```python
from fixtureforge import Forge
from pydantic import BaseModel

class User(BaseModel):
    id: int
    name: str
    email: str
    bio: str

forge = Forge()
users = forge.create_batch(User, count=50, context="SaaS platform users")
```
FixtureForge routes `id` to a counter, `name` and `email` to Faker, and only `bio` hits the AI — once, for all 50 records.
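To make the one-call-per-field math concrete, here's a sketch of what batching could look like internally (`build_batch_prompt` and the sample reply are my own illustration, not FixtureForge's actual implementation):

```python
import json

def build_batch_prompt(field: str, count: int, context: str) -> str:
    """Hypothetical: one prompt covering every record's value for one field."""
    return (
        f"Generate {count} realistic values for the field '{field}' "
        f"for: {context}. Return a JSON array of {count} strings."
    )

# 50 records x 1 semantic field -> a single prompt, hence a single API call
prompt = build_batch_prompt("bio", 50, "SaaS platform users")

# The model's reply would then be parsed like this (truncated example):
reply = '["Indie hacker shipping side projects.", "Data engineer, coffee-fueled."]'
bios = json.loads(reply)
```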
## CI Mode: No AI, No Network, No Flakiness
This is the part that matters most. In CI, you don't want non-deterministic AI calls making your pipeline flaky. FixtureForge has a deterministic mode:
```python
forge = Forge(use_ai=False, seed=42)
users = forge.create_batch(User, count=100)
# Identical output every run — no network calls
```
`seed=42` guarantees byte-identical output across every run, on every machine. Faker handles the standard fields deterministically, and semantic fields fall back to template-based generation. No API key required.
## The Context Parameter Is Where It Gets Interesting
The real power isn't generating random data — it's generating data that tests specific scenarios:
```python
angry_users = forge.create_batch(
    Review,
    count=20,
    context="1-star reviews from angry holiday shoppers"
)
```
Each `bio` or `review` field comes back with realistic frustration, specific complaints, and edge-case formatting (ALL CAPS, emoji, long rants). This is the kind of data that catches bugs in text processing, truncation, rendering, and content moderation — bugs that "Lorem ipsum" never finds.
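To see why this matters, here's a hand-written sample of the kinds of strings such reviews contain, run through a naive preview truncator (`truncate_preview` and these strings are illustrative, not part of FixtureForge):

```python
# Hand-written stand-ins for the edge cases AI-generated reviews surface
edge_reviews = [
    "TERRIBLE SERVICE!!! NEVER AGAIN " * 50,      # very long ALL-CAPS rant
    "broke after one day \U0001F620\U0001F620",   # emoji in the text
    "line one\n\nline two\n\nline three",         # embedded blank lines
]

def truncate_preview(text: str, limit: int = 80) -> str:
    """Naive preview truncation; appends an ellipsis when the text is cut."""
    if len(text) <= limit:
        return text
    return text[:limit].rstrip() + "..."

previews = [truncate_preview(r) for r in edge_reviews]
```

A short lorem-ipsum string sails through code like this; a 1,550-character rant or an emoji-laden complaint is what actually exercises the truncation path.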
## pytest Integration
In `conftest.py`:

```python
from fixtureforge import forge_fixture
from myapp.models import User, Order

forge_fixture(User, count=50)
forge_fixture(Order, count=200)
```
In your tests:

```python
def test_users_have_emails(users):
    assert all(u.email for u in users)

def test_order_count(orders):
    assert len(orders) == 200
```

Each `forge_fixture` call auto-registers a pytest fixture (here `users` and `orders`), so there are no factory classes to maintain and no fixture files to update.
## When You Should NOT Use This
I want to be honest about the limitations:
Don't use FixtureForge if:
- Your tests only need IDs and emails — Faker alone is sufficient and simpler
- You're in a strict air-gapped environment with no API access — CI mode works, but you lose the AI-generated quality
- Your test data needs to match a specific production database schema exactly — use database dumps or migrations instead
Do use FixtureForge if:
- You need realistic text content (bios, reviews, descriptions) at scale
- You want to test edge cases in text processing without writing them by hand
- You need deterministic CI with realistic dev-time data from one tool
- You're tired of maintaining factory_boy factory classes for every model change
## How It Compares
| | FixtureForge | factory_boy | faker | hypothesis |
|---|---|---|---|---|
| AI-generated content | Yes | No | No | No |
| Deterministic seed | Yes | Yes | Yes | Yes |
| FK relationships | Auto | Manual | No | No |
| pytest plugin | Yes | Via pytest-factoryboy | No | Yes |
| Large datasets (100k+) | Yes | Manual loops | Manual loops | No |
| Zero config | Yes | Factory classes needed | Provider setup | Strategy setup |
FixtureForge isn't a replacement for Faker — it uses Faker internally. It's the layer between "I need data" and "I need it to feel real."
## Try It

```shell
pip install fixtureforge
```
GitHub: Yaniv2809/fixtureforge
Docs: yaniv2809.github.io/fixtureforge
If you've built something similar or have opinions on AI-generated test data vs traditional fixtures, I'd like to hear about it.
Yaniv Metuku (yaniv2809) — QA Automation Engineer. Also building Financial-Integrity-Ecosystem and Failscope.