Jedsadakorn Suma

I built a runbook generator with FastAPI and Groq — here's how it works

Every time there's an incident or a deployment, I need a runbook. The structure is always the same: Overview → Prerequisites → Procedure → Verification → Troubleshooting → Rollback.

I got tired of filling in the same skeleton over and over, so I built Auto Runbook Generator.

What it does

Paste a Kubernetes YAML, a docker-compose file, or just describe your system in plain English. Pick a runbook type (deployment, incident response, rollback, maintenance, onboarding). Get a complete Markdown runbook back in seconds.
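Under the hood it's a single POST endpoint, so you can script it too. A minimal client sketch using only the standard library; note that the JSON field names (`input`, `runbook_type`) and the response key (`runbook`) are my guesses at the schema, not confirmed from the repo:

```python
import json
import urllib.request

API_URL = "https://auto-runbook-gen-v0-1-0.onrender.com/api/generate"

def build_payload(text: str, runbook_type: str = "deployment") -> bytes:
    """Serialize the request body. Field names are assumptions."""
    return json.dumps({"input": text, "runbook_type": runbook_type}).encode()

def generate_runbook(text: str, runbook_type: str = "deployment") -> str:
    """POST the system description and return the generated Markdown."""
    req = urllib.request.Request(
        API_URL,
        data=build_payload(text, runbook_type),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["runbook"]  # response key is a guess

# Usage:
# runbook = generate_runbook("nginx Deployment with 3 replicas on GKE")
```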

Live demo: https://auto-runbook-gen-v0-1-0.onrender.com

How I built it

Backend: FastAPI + AsyncGroq

The core is a single /api/generate endpoint. It takes the user's input and runbook type, builds a prompt, and calls Groq's API with llama-3.3-70b.

response = await get_groq().chat.completions.create(
    model="llama-3.3-70b-versatile",
    messages=[
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_prompt},
    ],
    max_tokens=2048,
    temperature=0.3,
)
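The prompt-building step is mostly string assembly. Here's a sketch of how the runbook type might steer the user prompt; the mapping and wording are illustrative, not the app's actual text:

```python
# Illustrative mapping from the five runbook types to prompt instructions.
RUNBOOK_TYPES = {
    "deployment": "a step-by-step deployment runbook",
    "incident response": "an incident response runbook with triage steps",
    "rollback": "a rollback runbook with safe revert steps",
    "maintenance": "a scheduled maintenance runbook",
    "onboarding": "an onboarding runbook for new engineers",
}

def build_user_prompt(user_input: str, runbook_type: str) -> str:
    """Combine the raw input (YAML, compose file, or prose) with the chosen type."""
    kind = RUNBOOK_TYPES.get(runbook_type, "an operational runbook")
    return (
        f"Write {kind} for the following system. "
        "Follow the required section structure exactly.\n\n"
        f"System description:\n{user_input}"
    )
```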

I use AsyncGroq (not Groq) because FastAPI is async: calling a sync client inside an async endpoint blocks the event loop while the HTTP request is in flight, stalling every other request.

Rate limiting: slowapi

The free tier gets 10 generations per day per IP; requests with a Pro API key bypass the limit.

@app.post("/api/generate")
@limiter.limit("10/day")
async def generate(request: Request, ...):

The system prompt matters a lot

The system prompt enforces the same six-section structure on every run:

# [Service Name] Runbook
## Overview
## Prerequisites
## Procedure
## Verification
## Troubleshooting
## Rollback

A low temperature (0.3) keeps outputs consistent and conservative rather than creative, so the structure holds across runs.
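For illustration, a system prompt enforcing that skeleton can be quite short, and it's cheap to post-check that the model kept it. The wording below is hypothetical, not the exact prompt from the repo:

```python
SYSTEM_PROMPT = """\
You are an SRE writing operational runbooks.
Always respond with a Markdown runbook containing exactly these sections,
in this order, and nothing else:

# [Service Name] Runbook
## Overview
## Prerequisites
## Procedure
## Verification
## Troubleshooting
## Rollback

Use numbered steps with concrete commands in the Procedure section.
If details are missing from the input, state your assumptions explicitly.
"""

REQUIRED_SECTIONS = [
    "## Overview", "## Prerequisites", "## Procedure",
    "## Verification", "## Troubleshooting", "## Rollback",
]

def has_required_sections(markdown: str) -> bool:
    """Cheap post-check that the model's output kept the skeleton."""
    return all(section in markdown for section in REQUIRED_SECTIONS)
```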

Frontend: zero dependencies

A single HTML file with vanilla JS: fetch() calls the API, navigator.clipboard copies the result. No React, no build step.

Deploy: Docker on Render

FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "app.main:app", "--host", "0.0.0.0", "--port", "8000"]

Results

The output quality is good enough to use directly for most common cases. Kubernetes and Docker scenarios work especially well since the LLM has strong training data for those.

Source

GitHub: https://github.com/Jedsadakorn-Suma/auto-runbook-gen

It's self-hostable (MIT licensed); you just need a free Groq API key from console.groq.com.
