Discussion on: Building Serverless APIs with TDD and AI-Powered Spec Generation


AI-generated specs as the failing test source is a really underrated pattern. One gotcha I've hit: when you let the model write the OpenAPI schema and the tests from the same prompt, any hallucination gets "verified" by its own stub — the test passes, the spec is wrong, and the bug ships. I now run the spec and tests through separate model calls with different system prompts so at least one catches drift. Curious if you've layered schema validation (like pydantic / zod) on top of the generated tests to catch that third failure mode?

Salih Guler AWS • Apr 17

Thanks for your comment!

"I now run the spec and tests through separate model calls with different system prompts so at least one catches drift."

This is a great practice. I think as developers we are still learning how to work with AI tools, and one of the mistakes we make is trying to do everything at once instead of directing the tools to work effectively.

I use zod for schema validation (I am a big TS fan; if you check my GitHub, you will see my dislike of "any"). Also, as projects grow and definitions become clearer, the "system prompts" that act as a guide and guardrail on top of your "instruction prompt" for generating schemas will evolve as well.
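To make the layering concrete, here is a minimal sketch of a hand-written validation step that sits between the generated stub and the generated test. Everything in it is an assumption for illustration: the `Order` shape, the `validateOrder` helper, and the `qty` field are hypothetical, and the validator is a dependency-free stand-in for what a zod schema's `safeParse` would normally do.

```typescript
// Hypothetical response shape, written by hand rather than generated by the model.
interface Order {
  id: string;
  quantity: number;
}

type ValidationResult =
  | { success: true; data: Order }
  | { success: false; errors: string[] };

// Dependency-free stand-in for a zod schema's safeParse (illustrative only).
function validateOrder(input: unknown): ValidationResult {
  if (typeof input !== "object" || input === null) {
    return { success: false, errors: ["expected an object"] };
  }
  const record = input as Record<string, unknown>;
  const errors: string[] = [];

  if (typeof record.id !== "string" || record.id.length === 0) {
    errors.push("id must be a non-empty string");
  }
  if (
    typeof record.quantity !== "number" ||
    !Number.isInteger(record.quantity) ||
    record.quantity < 1
  ) {
    errors.push("quantity must be a positive integer");
  }
  // Reject unknown keys so a hallucinated field cannot slip through unnoticed.
  for (const key of Object.keys(record)) {
    if (key !== "id" && key !== "quantity") {
      errors.push(`unexpected field: ${key}`);
    }
  }

  if (errors.length > 0) {
    return { success: false, errors };
  }
  return {
    success: true,
    data: { id: record.id as string, quantity: record.quantity as number },
  };
}

// A stub as the model might emit it next to its own test: the two agree with
// each other, but "qty" is a hallucinated field name (hypothetical example).
const generatedStub: unknown = { id: "order-1", qty: 2 };
const correctStub: unknown = { id: "order-1", quantity: 2 };

const driftCheck = validateOrder(generatedStub);
const cleanCheck = validateOrder(correctStub);
console.log(driftCheck.success, cleanCheck.success);
```

The point of the design is that the validator is authored independently of the prompt that produced the spec and tests, so a stub that only agrees with its own test still fails this check.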

In this blog post I wanted to show folks how to think about generating specs and how to think in features. Maybe a follow-up post could go deeper into this concept.