Before I ship any LLM feature to production, I run through this checklist. It takes 20 minutes. It has caught every major incident I've had in the last 18 months.
The Checklist
1. Drift & Monitoring
- [ ] Baseline outputs captured for all prompts
- [ ] Drift detection running on hourly schedule
- [ ] Alert thresholds set and tested
- [ ] Dashboard shows current drift status
- [ ] Historical drift data accessible
2. Input Validation
- [ ] Known prompt injection patterns blocked
- [ ] Input length limits enforced
- [ ] Special characters handled
- [ ] Non-text inputs (images, files) have defined limits
3. Output Validation
- [ ] JSON schema validation in place
- [ ] String match vs semantic similarity testing done
- [ ] Format regression tests exist
- [ ] Fallback behavior defined for malformed output
4. Rate Limiting & Cost Control
- [ ] Per-user rate limits set
- [ ] Per-model cost tracking in place
- [ ] Circuit breaker implemented
- [ ] Monthly budget alert configured
5. Error Handling
- [ ] Timeout handling defined (LLM calls can hang)
- [ ] Retry logic with exponential backoff implemented
- [ ] Dead letter queue for failed requests
- [ ] User-facing error messages are clear
6. Testing
- [ ] Unit tests cover prompt logic
- [ ] Integration tests cover full call chain
- [ ] Long conversation test (>20 messages) done
- [ ] Edge case prompts tested (empty input, max length, etc.)
How I Use This
I copy the checklist into Notion or a Google Doc before each release. I check each item. If something can't be checked, it either gets fixed or the launch gets delayed.
No checklist = higher incident risk. It's that simple.
The PDF Version
I turned this into a PDF checklist you can print or share with your team:
LLM Production Readiness Checklist — £19
47 items, organized by category, with space to add notes. Use it before every LLM launch.
Why This Isn't Overkill
Every incident I've had in 18 months of LLM development was a failure of one of these checklist items. Not complex requirements — basic things like "set a rate limit" or "test long conversations."
The checklist doesn't make you over-engineer. It makes you not forget the basics.
This checklist comes from 18 months of production LLM incidents. If you want the PDF: £19 one-time
Top comments (0)