DEV Community

NEXADiag Nexa
Stop copy-pasting AI code: The 6-step validation checklist for devs.

It is impossible to be 100% certain that a tool or code generated by an LLM (like ChatGPT, Claude, etc.) is bug-free. LLMs are text predictors: they generate code that looks correct, but they do not "compile" or execute the code internally. Consequently, they can invent functions that do not exist (hallucinations) or make subtle logic errors.

However, you can achieve a very high level of confidence by following a rigorous validation method. Here are the essential steps:

  1. Code Review (Never just copy-paste)

    Have the code explained: Ask the LLM: "Explain this function to me line by line." If the explanation is logically sound, that is a good sign.

    Check the business logic: Does the tool do exactly what you want, or did it simplify the problem to provide a faster answer?

    Watch for LLM "habits": LLMs tend to use popular libraries even if they aren't the best fit, or they might ignore error handling (try/catch).
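As an illustration of the error-handling habit, here is a minimal Python sketch (the `load_config` helper is hypothetical) showing the try/except wrapper an LLM draft often omits:

```python
import json

def load_config(raw: str) -> dict:
    """LLM drafts often call json.loads() bare; wrap risky calls explicitly."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError as exc:
        # Fail loudly with context instead of crashing deeper in the program.
        raise ValueError(f"config is not valid JSON: {raw[:40]!r}") from exc

print(load_config('{"debug": true}'))  # {'debug': True}
```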

  2. Edge Case Testing

This is where LLMs fail most often. A tool might work perfectly with normal data but crash with unusual data. Test for:

Empty inputs: What happens if you provide nothing?

Extreme values: A negative number where it should be positive? A text string of 10,000 characters?

Special characters: Accents, emojis, or HTML tags (<script>).

Wrong format: If the tool expects a date (DD/MM/YYYY), what happens if you type "Monday"?
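The checklist above translates directly into tests. A minimal sketch, assuming a hypothetical LLM-generated `parse_age` function you want to validate:

```python
def parse_age(value: str) -> int:
    """Hypothetical LLM-generated helper: parse a user-supplied age."""
    if not value.strip():            # empty input
        raise ValueError("empty input")
    age = int(value)                 # "Monday" -> ValueError (wrong format)
    if age < 0:                      # extreme value
        raise ValueError("age must be non-negative")
    return age

def rejects(raw: str) -> bool:
    """True if parse_age raises ValueError for this input."""
    try:
        parse_age(raw)
    except ValueError:
        return True
    return False

assert parse_age("42") == 42     # the nominal case works...
assert rejects("")               # ...but does it survive empty input,
assert rejects("-5")             # extreme values,
assert rejects("Monday")         # wrong formats,
assert rejects("<script>")       # and special characters?
```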
  3. Dependency Validation

LLMs sometimes invent package names or use obsolete functions.

Verify that every import (Python), require (Node.js), or using (C#) corresponds to an actual, existing library.

Check that the library version is compatible with your environment.
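A quick way to catch hallucinated package names is to ask the interpreter itself. A Python sketch (the module list is illustrative; `totally_made_up_pkg` stands in for a hallucination):

```python
import importlib.util

# Every name the LLM imported, copied from its script.
candidates = ["json", "pathlib", "totally_made_up_pkg"]

for name in candidates:
    # find_spec returns None when no installed module matches the name.
    found = importlib.util.find_spec(name) is not None
    print(f"{name}: {'OK' if found else 'MISSING - possible hallucination'}")
```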
  4. Use Automated Tools (Don't do everything manually)

Run the LLM's code through real development tools:

Linters: Tools like ESLint (JavaScript), Pylint (Python), or Ruff detect syntax errors and poor practices.

Type Checkers: If you use TypeScript or Python with type hints, the type checker will catch many silent errors (e.g., passing a string to a function expecting a number).

Ask the LLM to write unit tests: Prompt it with "Write unit tests (using Jest, PyTest, etc.) for this code, covering nominal and edge cases," then execute those tests yourself.
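To see what a type checker buys you, here is a small annotated function (a hypothetical example): mypy or pyright would flag the commented call statically, before the code ever runs.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent (0-100)."""
    return price * (1 - percent / 100)

# A type checker flags this line without running anything:
# apply_discount("100", 10)   # error: str is incompatible with float

print(apply_discount(100.0, 10.0))  # 90.0
```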
  5. Security Check (Crucial)

Never trust an LLM with security.

Check for hardcoded passwords or API keys in the script.

If the tool interacts with a database, ensure there is protection against SQL injections (using parameterized queries).

If the tool takes user input, ensure the data is sanitized before being displayed or processed.
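Here is the difference parameterized queries make, sketched with Python's built-in sqlite3 module (the table and the injection payload are illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.execute("INSERT INTO users VALUES ('alice')")

payload = "alice' OR '1'='1"   # classic injection attempt

# UNSAFE: string interpolation lets the payload rewrite the query logic.
unsafe_rows = conn.execute(
    f"SELECT * FROM users WHERE name = '{payload}'"
).fetchall()
print(len(unsafe_rows))   # 1 -> the injected OR clause matched every row

# SAFE: a parameterized query treats the payload as a plain string literal.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (payload,)
).fetchall()
print(len(safe_rows))     # 0 -> no user is literally named the payload
```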
  6. Cross-Checking Technique (Pitting LLMs against each other)

If you have doubts about a complex piece of code:

Take the code generated by ChatGPT.

Open Claude or Gemini and ask: "Here is code generated by an AI. Find the bugs, security flaws, or performance issues." LLMs have different biases. An error that goes unnoticed by one is often caught by another.
