Last month, I had to build a dozen API endpoints for a new microservice. I knew the patterns – CRUD operations, SQLAlchemy models, Pydantic schemas – but typing out all that boilerplate felt soul-crushing.
I turned to AI, hoping it would save me hours. What followed was a rollercoaster of bad outputs, hallucinations, and frustration. But after a week of failed attempts, I found a prompting approach that actually produced reliable, copy-pasteable code.
This isn't a “just use this tool” story. It's about the technique that finally worked for me, with real examples you can adapt.
The Problem: AI Kept Writing Garbage
I started with the obvious: “Write a Flask route for creating a user.” The AI spat back something like:
@app.route('/users', methods=['POST'])
def create_user():
    data = request.json
    # Create user logic
    return jsonify({'id': 1, 'username': 'john'})
Hardcoded response! Not even using the input. I tried again: “Write a proper endpoint with validation.” It gave me a mix of Flask and FastAPI syntax. Maddening.
I realized the problem was me. I was asking AI to read my mind. It didn't know my database schema, error handling patterns, or even which ORM I used.
What Didn't Work
Several things I tried made the output worse:
- Vague prompts: “Generate a user CRUD” → got a generic mess with no imports
- Zero-shot without context: No examples → model guessed wildly
- Assuming it remembered previous messages: In stateless API calls, every prompt is a fresh start
- Asking for too much at once: A single prompt to write the entire module → output often cut off or inconsistent
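The third point is easy to miss. With a stateless chat API, the model only sees the messages included in the current request, so any earlier turn it should remember has to be sent again. Below is a minimal sketch of carrying history forward. The `Conversation` class and its method names are hypothetical helpers, not part of any SDK, and no network call is made here.

```python
from dataclasses import dataclass, field


@dataclass
class Conversation:
    """Hypothetical helper that keeps the message history for a stateless chat API."""

    system_msg: str
    turns: list[dict] = field(default_factory=list)

    def add_user(self, content: str) -> None:
        self.turns.append({"role": "user", "content": content})

    def add_assistant(self, content: str) -> None:
        self.turns.append({"role": "assistant", "content": content})

    def to_messages(self) -> list[dict]:
        # The full history is rebuilt for every request; the API keeps no state.
        return [{"role": "system", "content": self.system_msg}, *self.turns]


chat = Conversation("You are a Python backend developer.")
chat.add_user("We use Flask, SQLAlchemy, and Marshmallow.")
chat.add_assistant("Understood.")
chat.add_user("Write a route for creating a user.")
messages = chat.to_messages()  # send this whole list with the next request
```

The cost is that every request grows with the history, which ties into the context-length limit discussed later.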
The Breakthrough: Structure Before Generation
I stopped treating the AI like a senior developer and started treating it like a diligent intern who needs extremely detailed instructions. The key was to provide:
- Explicit input/output format – “Given a model definition in JSON, produce a Flask blueprint file.”
- Few-shot examples – Show two or three complete input/output pairs before asking it to generate.
- A system message setting the role – “You are a Python backend developer. You write clean, type-annotated code following Flask best practices.”
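One possible way to combine these three pieces is to send each example as a user/assistant message pair after the system message, then append the real request last. This is a sketch under assumptions: `build_messages` is a hypothetical helper, the two sample pairs are placeholders with the output bodies elided, and the role/content structure is the one used by chat-style APIs.

```python
import json


def build_messages(
    system_msg: str, examples: list[tuple[dict, str]], request: dict
) -> list[dict]:
    """Assemble a chat payload: system role, few-shot pairs, then the new input."""
    messages = [{"role": "system", "content": system_msg}]
    for example_input, example_output in examples:
        # Each pair shows the model one complete input -> output mapping.
        messages.append({"role": "user", "content": json.dumps(example_input)})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": json.dumps(request)})
    return messages


# Placeholder pairs; real examples would hold complete blueprint files.
examples = [
    ({"model": "Project", "fields": ["id", "name"]}, "# routes/projects.py\n..."),
    ({"model": "Task", "fields": ["id", "title"]}, "# routes/tasks.py\n..."),
]
messages = build_messages(
    "You are a Python backend developer. You write clean, type-annotated code.",
    examples,
    {"model": "User", "fields": ["id", "username", "email"]},
)
```

Keeping the examples as separate messages makes it easy to add or drop a pair when the prompt gets too long.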
Here's the actual approach that clicked for me.
The Few-Shot Prompt Template
I built a prompt generator in Python. It takes a schema description and produces a prompt with examples:
import json
import os

import requests


def generate_endpoint_code(schema_json: dict) -> str:
    system_msg = (
        "You are a Python backend developer. You write Flask routes with "
        "SQLAlchemy and Marshmallow. Always include error handling and docstrings."
    )
    examples = [
        {
            "input": '{"model": "Project", "fields": ["id", "name", "created_at"]}',
            "output": '''
# routes/projects.py
from flask import Blueprint, request, jsonify
from marshmallow import ValidationError
from models import db, Project
from schemas import ProjectSchema

projects_bp = Blueprint('projects', __name__)

@projects_bp.route('/projects', methods=['POST'])
def create_project():
    """Create a project from the JSON request body."""
    schema = ProjectSchema()
    try:
        data = schema.load(request.json)
    except ValidationError as err:
        return jsonify(err.messages), 400
    project = Project(**data)
    db.session.add(project)
    db.session.commit()
    return jsonify(schema.dump(project)), 201
''',
        }
    ]
    # Serialize the schema as JSON so it matches the format of the example input
    schema_str = json.dumps(schema_json)
    # Build the prompt with the example + new request
    user_prompt = f"""
Generate a Flask endpoint for creating a resource based on the schema given at the end.
Use the same style as the example below.
Example input:
{examples[0]['input']}
Example output:
{examples[0]['output']}
Now generate for this input:
{schema_str}
"""
    response = requests.post(
        "https://api.openai.com/v1/chat/completions",
        # Assumes the key is exported as OPENAI_API_KEY rather than hardcoded
        headers={"Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}"},
        json={
            "model": "gpt-4",
            "messages": [
                {"role": "system", "content": system_msg},
                {"role": "user", "content": user_prompt},
            ],
            "temperature": 0.2,  # low temperature for more predictable output
        },
        timeout=60,
    )
    response.raise_for_status()  # surface HTTP errors instead of a KeyError below
    return response.json()['choices'][0]['message']['content']
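Chat models often wrap their answer in Markdown code fences or add a sentence of commentary, which gets in the way of pasting the result into a file. A small post-processing step can help. The `extract_code` helper below is a hypothetical addition, not part of the function above, and it assumes the reply contains at most one fenced block worth keeping.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the literal out of the source
FENCE_RE = re.compile(FENCE + r"[\w+-]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(reply: str) -> str:
    """Return the first fenced block in a model reply, or the whole reply if unfenced."""
    match = FENCE_RE.search(reply)
    if match:
        return match.group(1).strip() + "\n"
    return reply.strip() + "\n"
```

A call such as `extract_code(generate_endpoint_code(schema))` would then yield text that can be written straight to a `.py` file for review.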
The Result
After this change, the AI started generating code that:
- Actually used `request.json`
- Imported the correct modules
- Had consistent error handling
- Used my naming conventions (because the example showed them)
I could then copy the output directly into my project, only tweaking a few details.
Lessons Learned & Trade-offs
This approach isn't magic. Here's what I wish someone had told me:
- You still need to review everything. AI hallucinates imports or uses non-existent methods. One generated line, `from sqlalchemy import Column, Integer, String`, was fine, but another time the model invented `from flask_sqlalchemy import db`, which fails because `db` is an instance you create yourself with `SQLAlchemy()`, not a name the package exports.
- Few-shot works best for repetitive patterns. For complex business logic with multiple conditions, AI often makes mistakes. I still write those manually.
- Temperature matters. Setting temperature to 0.0 gave extremely predictable but sometimes stilted code. I settled on 0.2.
- Context length is a limit. My examples took up tokens, leaving less room for the actual generation. For large schemas, I had to shorten the examples.
- It's not free. API calls cost money. For a quick script, it's fine; for generating hundreds of endpoints, you might prefer a template engine.
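Part of the review can be automated before a human reads the code. The sketch below parses the generated source and lists imported top-level modules that cannot be found in the current environment. `check_generated_code` is a hypothetical helper. It catches syntax errors and missing packages only, so a wrong name inside a real package (such as the `flask_sqlalchemy` case above) still needs a human or a test run.

```python
import ast
import importlib.util


def check_generated_code(source: str) -> list[str]:
    """Return a list of problems found in generated Python source."""
    try:
        tree = ast.parse(source)
    except SyntaxError as err:
        return [f"syntax error on line {err.lineno}: {err.msg}"]

    problems = []
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            modules = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules = [node.module]
        else:
            continue
        for module in modules:
            top_level = module.split(".")[0]
            # find_spec returns None when the top-level module is not importable
            if importlib.util.find_spec(top_level) is None:
                problems.append(f"line {node.lineno}: cannot find module '{top_level}'")
    return problems
```

Run it from the project root so that local modules such as `models` and `schemas` resolve; otherwise they will be reported as missing.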
When NOT to Use This
- When the pattern is trivial and you can type it in 10 seconds.
- When security is critical – AI might skip authentication checks.
- When you need to adhere to a very specific code style that the model hasn't seen.
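For cases like these, and for bulk generation where API cost adds up, a plain template can stamp out the same boilerplate deterministically. Here is a minimal sketch using the standard library's `string.Template`. The template text mirrors the create-project example from earlier, and `render_route` with its naming rules is a hypothetical helper.

```python
from string import Template

ROUTE_TEMPLATE = Template('''\
@${bp}.route('/${plural}', methods=['POST'])
def create_${singular}():
    schema = ${model}Schema()
    try:
        data = schema.load(request.json)
    except ValidationError as err:
        return jsonify(err.messages), 400
    ${singular} = ${model}(**data)
    db.session.add(${singular})
    db.session.commit()
    return jsonify(schema.dump(${singular})), 201
''')


def render_route(model: str, plural: str) -> str:
    """Fill the route template for one model; the naming rules are assumptions."""
    singular = model.lower()
    return ROUTE_TEMPLATE.substitute(
        model=model, singular=singular, plural=plural, bp=f"{plural}_bp"
    )


print(render_route("Task", "tasks"))
```

The output is identical on every run and costs nothing, which is the trade-off against the flexibility of a model-generated route.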
What I'd Do Differently Next Time
- Invest in a local model. For privacy and cost, I'm now trying a quantized build of Llama 3 on my machine. The quality is close.
- Use a schema compiler. Instead of generating code with AI, I'd use a tool like `datamodel-code-generator` for Pydantic models and only use AI for the wiring (routes, error handling).
- Build a prompt library. I keep reusable prompt fragments for different tasks (models, routes, tests) so I don't start from scratch.
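A prompt library can start as a dictionary of named fragments that are composed per task. The fragment names and wording below are illustrative assumptions, not a record of the actual library.

```python
# Hypothetical fragments; the wording is illustrative only.
PROMPT_FRAGMENTS = {
    "role": "You are a Python backend developer.",
    "style": "Write clean, type-annotated code with docstrings.",
    "routes": "Produce a Flask blueprint with error handling for invalid input.",
    "models": "Produce SQLAlchemy model classes only, with no route code.",
    "tests": "Produce pytest tests that cover success and validation failure.",
}

# Which fragments each task uses, in order.
TASKS = {
    "routes": ["role", "style", "routes"],
    "models": ["role", "style", "models"],
    "tests": ["role", "tests"],
}


def build_system_prompt(task: str) -> str:
    """Join the fragments registered for a task into one system message."""
    try:
        keys = TASKS[task]
    except KeyError:
        raise ValueError(f"unknown task: {task!r}") from None
    return " ".join(PROMPT_FRAGMENTS[key] for key in keys)
```

Keeping the fragments in one place means a wording fix, such as a stricter style rule, applies to every task that uses it.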
Final Thoughts
AI code generation can save you from burnout on boilerplate, but it's not a silver bullet. The trick is to give the model the right amount of structure – not too vague, not too rigid. Embrace few-shot prompting. And always, always review the output.
I'm still early in this journey. How do you handle AI-generated code in your workflow? Do you rely on prompts, or have you built a custom tool around it? I'd love to hear what works for you.