AI app changes can fail quietly. A prompt tweak, model swap, retrieval change, or schema update can still return plausible answers while breaking required fields, citations, length limits, or safety wording.
Here is the smallest clean-room routine I use before release:
- Use synthetic fixtures only. Do not use customer logs, secrets, support tickets, private documents, or production prompts.
- Define expected output checks in plain rules: required fields, forbidden claims, required citations, length bounds, and valid JSON when needed.
- Run the same scenarios before every prompt/model/RAG change.
- Generate one pass/fail release note so the team can see exactly what changed.
- Keep one human review step for edge cases deterministic checks cannot judge.
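The routine above can be sketched as a tiny runner: synthetic fixtures carry the plain rules, each output is checked deterministically, and the results collapse into a pass/fail release note. Everything here is illustrative — the fixture shape, the `run_checks` and `release_note` names, and the sample output are assumptions, not part of any real kit.

```python
import json

# Hypothetical synthetic fixture: one scenario's raw output plus plain-rule
# expectations. No customer data — the text is invented.
FIXTURES = [
    {
        "name": "refund_policy_answer",
        "output": '{"answer": "Refunds take 5-7 days.", "citation": "policy.md"}',
        "required_fields": ["answer", "citation"],
        "forbidden": ["guaranteed", "legal advice"],
        "max_chars": 500,
    },
]

def run_checks(fixture):
    """Apply the plain rules to one output; return (passed, reasons)."""
    reasons = []
    raw = fixture["output"]
    # Valid JSON when needed.
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False, ["output is not valid JSON"]
    # Required fields exist.
    for field in fixture["required_fields"]:
        if field not in data:
            reasons.append(f"missing field: {field}")
    # Forbidden wording is absent.
    lowered = raw.lower()
    for phrase in fixture["forbidden"]:
        if phrase in lowered:
            reasons.append(f"forbidden phrase: {phrase}")
    # Output length stays inside expected bounds.
    if len(raw) > fixture["max_chars"]:
        reasons.append(f"output exceeds {fixture['max_chars']} chars")
    return not reasons, reasons

def release_note(fixtures):
    """One pass/fail line per scenario, suitable for a release note."""
    lines = []
    for f in fixtures:
        passed, reasons = run_checks(f)
        status = "PASS" if passed else "FAIL: " + "; ".join(reasons)
        lines.append(f"{f['name']}: {status}")
    return "\n".join(lines)

print(release_note(FIXTURES))  # → refund_policy_answer: PASS
```

Because every rule is deterministic, the same fixtures give the same verdicts before and after a prompt, model, or RAG change, which is what makes the diff in the release note meaningful.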
The first three checks to start with:
- required fields exist
- forbidden wording is absent
- output length stays inside expected bounds
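One way to wire these three checks in so they run before every change is to phrase them as ordinary pytest-style tests. The `get_model_output` stub below is a placeholder I invented for the example — swap in your own call that returns the raw string for one synthetic scenario.

```python
import json
import re

def get_model_output():
    # Stub standing in for a real model call against a synthetic fixture;
    # the returned text is invented example data.
    return '{"answer": "Refunds take 5-7 days.", "citation": "policy.md"}'

def test_required_fields_exist():
    # Check 1: required fields exist in the parsed output.
    data = json.loads(get_model_output())
    assert {"answer", "citation"} <= data.keys()

def test_forbidden_wording_absent():
    # Check 2: forbidden wording is absent (case-insensitive).
    out = get_model_output().lower()
    assert not re.search(r"\bguaranteed\b|\blegal advice\b", out)

def test_length_within_bounds():
    # Check 3: output length stays inside expected bounds.
    assert 20 <= len(get_model_output()) <= 500
```

Running these in CI turns "did the prompt tweak break anything?" into a red or green build instead of a judgment call.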
If useful, I packaged a tiny starter kit with synthetic examples, templates, and a local runner:
https://cleanfixture-kit.kevinskysunny.workers.dev
It is intentionally clean-room: no internal company data, no customer examples, and no claim that it replaces compliance or safety review.