Building a Chatbot API From Scratch - Part 3: UUID Migration, Product Search, and Tests That Caught a Real Bug

Chris Kechagias

Two months in, both APIs are running. Before RAG touches either of them, I needed to fix decisions I'd been living with since day one.

The retail API had integer primary keys. The chatbot had no tests. Both needed to change before I bolted anything else on.

Five PRs across two repos. One Alembic migration I was genuinely nervous about. A test suite that found a bug I didn't know existed.


UUID primary keys on the retail API

The retail API launched with auto-incrementing integers for product and variant IDs. They work. They're also predictable — a scraper can enumerate your entire catalog by counting up from 1.

More importantly: integers don't compose well when you're building a RAG pipeline where embeddings need stable, unique identifiers across multiple services. id=42 is not what you want there. id=550e8400-e29b-41d4-a716-446655440000 is.

So I migrated.

The model change

SQLModel makes the declaration straightforward:

from uuid import UUID, uuid4
from sqlmodel import Field

id: UUID = Field(default_factory=uuid4, primary_key=True)

default_factory=uuid4 means every new product gets a UUID generated by Python. No sequence, no extra database round-trip.

Variants were trickier. The product_id foreign key needs to store a UUID and tell SQLAlchemy to use PostgreSQL's native UUID type:

from uuid import UUID, uuid4
from sqlalchemy import Column, ForeignKey
from sqlalchemy.dialects.postgresql import UUID as PGUUID

product_id: UUID = Field(
    sa_column=Column(
        PGUUID(as_uuid=True),
        ForeignKey("product.id", ondelete="CASCADE"),
        nullable=False,
    )
)

as_uuid=True tells SQLAlchemy to return Python UUID objects, not raw strings. Skip that and you'll spend 20 minutes confused about why comparisons fail.
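The failure mode is easy to reproduce in plain Python — a UUID object never compares equal to its string form, which is exactly what you hit when the driver hands back raw strings:

```python
from uuid import UUID

u = UUID("550e8400-e29b-41d4-a716-446655440000")

# Without as_uuid=True the column round-trips as a string,
# and a str never equals a UUID object:
assert u != "550e8400-e29b-41d4-a716-446655440000"
assert u == UUID("550e8400-e29b-41d4-a716-446655440000")
```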

The migration

The model change is two lines. The Alembic migration is where it gets real.

You can't simply drop the integer column and add a UUID one. There's existing data, and the variant table has a foreign key pointing at the product table. The migration has to be surgical.

Eight steps:

  1. Add id_new UUID to product with gen_random_uuid() as server default
  2. Add product_id_new UUID to product_variant
  3. Assign SKUs via ROW_NUMBER() OVER (PARTITION BY category) while we're already in there
  4. Link variant UUIDs to their products via a join
  5. Drop the FK constraint on product_variant.product_id
  6. Drop the old integer columns on both tables
  7. Rename the new columns, recreate PK and FK constraints
  8. Add unique constraint on sku
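For orientation, those steps roughly map onto an Alembic upgrade() like the sketch below. Constraint names and the exact SQL are assumptions, and step 3's SKU backfill is elided — treat it as an outline of the shape, not the real migration:

```python
import sqlalchemy as sa
from alembic import op
from sqlalchemy.dialects.postgresql import UUID

def upgrade() -> None:
    # 1-2. New UUID columns; gen_random_uuid() backfills existing rows
    op.add_column("product", sa.Column(
        "id_new", UUID(as_uuid=True),
        server_default=sa.text("gen_random_uuid()"), nullable=False,
    ))
    op.add_column("product_variant", sa.Column("product_id_new", UUID(as_uuid=True)))

    # 3. (SKU backfill with ROW_NUMBER() omitted here)

    # 4. Link variant UUIDs to products through the old integer key
    op.execute("""
        UPDATE product_variant v
        SET product_id_new = p.id_new
        FROM product p
        WHERE v.product_id = p.id
    """)

    # 5-6. Drop the old FK, then the integer columns
    op.drop_constraint("product_variant_product_id_fkey", "product_variant")
    op.drop_column("product_variant", "product_id")
    op.drop_column("product", "id")

    # 7. Rename, then recreate PK and FK on the UUID columns
    op.alter_column("product", "id_new", new_column_name="id")
    op.alter_column("product_variant", "product_id_new", new_column_name="product_id")
    op.create_primary_key("product_pkey", "product", ["id"])
    op.create_foreign_key(
        "product_variant_product_id_fkey", "product_variant", "product",
        ["product_id"], ["id"], ondelete="CASCADE",
    )

    # 8. Unique constraint on the new sku column
    op.create_unique_constraint("product_sku_key", "product", ["sku"])
```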

The downgrade() raises NotImplementedError. This migration is one-way.

def downgrade() -> None:
    raise NotImplementedError("UUID migration is irreversible.")

I've seen empty downgrade functions. That's worse! It runs silently and does nothing, which means you think you rolled back when you haven't. An explicit error is honest.


SKU generation

Every product now gets a SKU on creation. Format: two-letter category prefix plus four-digit sequential count within that category.

TS-0001. SW-0003. Seven characters max.

The prefix map lives in utils/sku.py:

CATEGORY_PREFIX: dict[Category, str] = {
    Category.TEES: "TS",
    Category.SWEATERS: "SW",
    Category.SHIRTS: "ST",
    Category.PANTS: "PT",
    Category.SHORTS: "SH",
    Category.TANK_TOPS: "TT",
    Category.OTHER: "OR",
}

def generate_sku(category: Category, count: int) -> str:
    prefix = CATEGORY_PREFIX[category]
    return f"{prefix}-{count:04d}"
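A self-contained check of the format, with a minimal stand-in for the real Category enum:

```python
from enum import Enum

class Category(str, Enum):  # minimal stand-in for the project's enum
    TEES = "tees"
    SWEATERS = "sweaters"

CATEGORY_PREFIX: dict[Category, str] = {
    Category.TEES: "TS",
    Category.SWEATERS: "SW",
}

def generate_sku(category: Category, count: int) -> str:
    prefix = CATEGORY_PREFIX[category]
    return f"{prefix}-{count:04d}"

assert generate_sku(Category.TEES, 1) == "TS-0001"
assert generate_sku(Category.SWEATERS, 3) == "SW-0003"
assert len(generate_sku(Category.TEES, 9999)) == 7  # fits max_length=7
```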

The controller counts existing products in that category and passes the next number:

category_count = session.exec(
    select(func.count()).where(Product.category == product.category)
).one()
sku = generate_sku(product.category, category_count + 1)
db_product = Product.model_validate(product, update={"sku": sku})

.one() on a COUNT query always returns exactly one row. That's why it's .one() and not .first().

One thing to catch: max_length=7 on the SKU field. TS-0001 is 7 characters. I had it set to 10 initially and only caught it during review.


Product search

One endpoint. One query parameter. Two behaviours.

GET /products/search?q=TS-0001 should return an exact match. GET /products/search?q=blue tee should return everything with that phrase in the name. Same URL, different logic based on what the query looks like.

def looks_like_sku(q: str) -> bool:
    return bool(re.match(r'^[A-Za-z]{2}-\d{4}$', q.strip()))

def search_products_controller(q: str, session: SessionDep):
    if looks_like_sku(q):
        product = session.exec(
            select(Product).where(Product.sku == q.upper())
        ).first()
        return [product] if product else []
    else:
        statement = select(Product).where(Product.name.ilike(f"%{q}%"))
        return session.exec(statement).all()

Always returns a list. The router signature is list[Product] regardless of which branch ran. Returning a single object from a list endpoint causes a ResponseValidationError. I found this out from a 500.

The route goes above /{product_id} in the router file. FastAPI matches routes top-to-bottom. If /{product_id} comes first, /search gets matched as a product ID and FastAPI tries to cast the string "search" to a UUID. You get a 422. I found this out from a 422.

The long-term plan: ilike becomes semantic search when ChromaDB is wired in. Same endpoint, smarter backend. This PR is the foundation.


Chatbot tests

Part 2 shipped streaming, composable prompts, and Docker. The chatbot had no tests.

Fine when you're moving fast. Until you want to add RAG on top of it, at which point you really want to know the foundation holds.

The setup

Same pattern as the retail API: SQLite in-memory, dependency override, lifespan override.

engine = create_engine(
    "sqlite://",
    connect_args={"check_same_thread": False},
    poolclass=StaticPool,
)

@pytest.fixture(name="client")
def client_fixture(session):
    app.dependency_overrides[get_session] = lambda: session
    app.router.lifespan_context = lifespan_override
    with TestClient(app) as client:
        yield client
    app.dependency_overrides.clear()

One gotcha: without pythonpath = ["."] in pyproject.toml, VS Code's test runner breaks on import. It overrides PYTHONPATH and relative imports stop resolving. The fix is one line in config, but it costs you a debugging session to find it.
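For reference, the line lives under pytest's config table in pyproject.toml (the pythonpath ini option is built into pytest 7+; older versions needed a plugin):

```toml
[tool.pytest.ini_options]
pythonpath = ["."]
```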

Mocking the stream

The chat endpoints return StreamingResponse. TestClient handles that: it collects the full stream and gives you the complete response body. The OpenAI call still needs mocking.

The streaming function is an async generator, so the mock has to be too:

async def mock_streaming_response():
    chunks = [
        MagicMock(choices=[MagicMock(delta=MagicMock(content="Hello"))], usage=None),
        MagicMock(choices=[MagicMock(delta=MagicMock(content=" world"))], usage=None),
        MagicMock(choices=[], usage=MagicMock(total_tokens=10)),
    ]
    for chunk in chunks:
        yield chunk
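A quick sanity check that the mock behaves like the real stream — draining it the way the controller presumably does and reassembling the deltas:

```python
import asyncio
from unittest.mock import MagicMock

async def mock_streaming_response():
    chunks = [
        MagicMock(choices=[MagicMock(delta=MagicMock(content="Hello"))], usage=None),
        MagicMock(choices=[MagicMock(delta=MagicMock(content=" world"))], usage=None),
        MagicMock(choices=[], usage=MagicMock(total_tokens=10)),
    ]
    for chunk in chunks:
        yield chunk

async def drain(stream) -> str:
    # Collect content deltas, skipping the final usage-only chunk
    parts = []
    async for chunk in stream:
        if chunk.choices:
            parts.append(chunk.choices[0].delta.content)
    return "".join(parts)

assert asyncio.run(drain(mock_streaming_response())) == "Hello world"
```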

Then patch it:

@patch(
    "app.controllers.chat.get_chat_completion_stream",
    return_value=mock_streaming_response()
)
def test_create_chat_success(mock_stream, client):
    ...

One thing that burned me: using MagicMock for async functions. get_chat_completion and generate_conversation_title are both async def, so they need AsyncMock:

@patch("app.services.summarizer.get_chat_completion", new_callable=AsyncMock)

Use MagicMock in place of an async function and the caller gets a mock back, not the awaited value. The test can still pass while asserting against the wrong thing. Silent failure, the worst kind.
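The difference is easy to demonstrate outside the test suite — AsyncMock behaves like a real coroutine function, while awaiting a MagicMock's return value blows up:

```python
import asyncio
from unittest.mock import AsyncMock, MagicMock

async def call(fn):
    # Stand-in for code that awaits get_chat_completion
    return await fn()

good = AsyncMock(return_value="Summary text")
assert asyncio.run(call(good)) == "Summary text"

bad = MagicMock(return_value="Summary text")
try:
    asyncio.run(call(bad))
    raise AssertionError("expected TypeError")
except TypeError:
    pass  # MagicMock's return value is a plain str, which is not awaitable
```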

The summarizer tests use side_effect to simulate the nano model failing and the mini model picking it up:

mock_get_completion.side_effect = [
    OpenAIServiceException(...),  # nano fails
    MagicMock(content="Summary text"),  # mini succeeds
]

What the tests actually caught

While writing the streaming tests, I noticed POST /chat/ had status_code=status.HTTP_201_CREATED on the router decorator.

But StreamingResponse ignores the decorator's status code. It always returns 200. The 201 was a lie.

Fixed to status_code=status.HTTP_200_OK. Small thing. But clients checking status codes against the documented 201 would have gotten 200 and assumed something was broken.

Tests found it. Code review didn't.

Final count: 10 tests on the chat endpoints, 3 on the token utility, 6 on the summarizer background tasks.


The formatting rule

The composable prompt system from Part 2 earned its keep here.

Testing the chatbot on Telegram, responses came back with markdown bold text, bullet points with asterisks, full headers. Fine in a web UI. Ugly in a messaging app where asterisks are just asterisks.

The fix: one new file, prompts/rules/formatting.md:

Do not use markdown formatting unless explicitly asked.
No headers, no bullet points, no numbered lists, no bold text.
Write in plain prose only.
Keep responses concise — this is a messaging interface, not a document.

And one line in prompts.yaml:

stoic:
  rules: [communication, factuality, formatting]

A new markdown file and a YAML entry. That's the whole change.

That's exactly what the prompt system was designed for.


What's next

Both APIs are solid. UUID keys, human-readable SKUs, search endpoints, proper test coverage, a chatbot that doesn't spam markdown at Telegram.

RAG is next. ChromaDB for the retail API, semantic search replacing ilike, product embeddings generated on creation. The search endpoint is already wired; it just needs a smarter backend.

The repo is public: GitHub

Follow the journey: LinkedIn | Medium

Built PR by PR. Mistakes included.
