DEV Community

Jamie Cole
Jamie Cole

Posted on

The LLM Production Readiness Checklist I Use Before Shipping Anything

Before I ship any LLM feature to production, I run through this checklist. It takes 20 minutes. It has caught every major incident I've had in the last 18 months.

The Checklist

1. Drift & Monitoring

  • [ ] Baseline outputs captured for all prompts
  • [ ] Drift detection running on hourly schedule
  • [ ] Alert thresholds set and tested
  • [ ] Dashboard shows current drift status
  • [ ] Historical drift data accessible

2. Input Validation

  • [ ] Known prompt injection patterns blocked
  • [ ] Input length limits enforced
  • [ ] Special characters handled
  • [ ] Non-text inputs (images, files) have defined limits

3. Output Validation

  • [ ] JSON schema validation in place
  • [ ] String match vs semantic similarity testing done
  • [ ] Format regression tests exist
  • [ ] Fallback behavior defined for malformed output

4. Rate Limiting & Cost Control

  • [ ] Per-user rate limits set
  • [ ] Per-model cost tracking in place
  • [ ] Circuit breaker implemented
  • [ ] Monthly budget alert configured

5. Error Handling

  • [ ] Timeout handling defined (LLM calls can hang)
  • [ ] Retry logic with exponential backoff implemented
  • [ ] Dead letter queue for failed requests
  • [ ] User-facing error messages are clear

6. Testing

  • [ ] Unit tests cover prompt logic
  • [ ] Integration tests cover full call chain
  • [ ] Long conversation test (>20 messages) done
  • [ ] Edge case prompts tested (empty input, max length, etc.)

How I Use This

I copy the checklist into Notion or a Google Doc before each release. I check each item. If something can't be checked, it either gets fixed or the launch gets delayed.

No checklist = higher incident risk. It's that simple.


The PDF Version

I turned this into a PDF checklist you can print or share with your team:

LLM Production Readiness Checklist — £19

47 items, organized by category, with space to add notes. Use it before every LLM launch.


Why This Isn't Overkill

Every incident I've had in 18 months of LLM development was a failure of one of these checklist items. Not complex requirements — basic things like "set a rate limit" or "test long conversations."

The checklist doesn't make you over-engineer. It makes you not forget the basics.


This checklist comes from 18 months of production LLM incidents. If you want the PDF: £19 one-time

Top comments (0)