Series context: This is a deep-dive follow-up to How I Automate Parts of My SDLC with AI Agents. If you haven't read that post, the short version: I built an agentic dev workflow (ADW) that automates my full development cycle: Plan → Build → Validate → Test → Review → Document. This post focuses on the Validate phase.
Why Validation Is the Most Underrated Phase
- The elephant in the room: AI-generated code is fast but imperfect
- Linters and static analysis exist for human written code why would AI-written code get a free pass?
- Without a validate phase, imperfections land directly in your test agent (or worse, in review)
- The validate phase is the quality gate that makes the rest of the pipeline trustworthy
- Quick recap of where it sits in the pipeline:
Plan → Build → [Validate ×3] → [Test ×3] → Review → Document
↑
You are here
What Validation Is NOT (Scope Clarity)
- Not running unit tests that is the Test Agent's job (separate agent, separate concerns)
- Not running the application
- No external service calls or DB connections
- Purely static analysis we only analyze the code itself, nothing needs to execute
This separation is intentional. Each agent does one thing well (SRP). Keeping validation static means it is fast enough to retry 3 times without killing your pipeline's momentum and not burning a hole in your wallet.
The Tool Stack
JavaScript / TypeScript (the original)
- ESLint with custom architectural rules
- Custom rules enforce things like: no direct fetch in components, no model imports in routes
- One command, JSON output, done
- I create custom claude commands that encapsulate the exact flow each agent needs, so a simplified version of the validation agent's command looks like this:
- For an actual example see this validate file:
cd backend && npm run validate:architecture:json
cd frontend && npm run validate:architecture:json
Java / Spring Boot (the new addition)
The same philosophy, different tools. Here is the parallel:
| Concern | JS Tool | Java Tool |
|---|---|---|
| Architecture | ESLint custom rules | ArchUnit |
| Code style | ESLint | Checkstyle |
| Code smells | ESLint plugins | PMD |
| Bug patterns | — | SpotBugs |
| Fast fail gate | implicit | mvn compile |
Execution order matters fastest checks first:
# 1. Fast fail stop here if this breaks, no point running anything else
mvn compile -q
# 2. Style + formatting
mvn checkstyle:check
# 3. Code smells and complexity
mvn pmd:check
# 4. Bytecode-level bug patterns
mvn spotbugs:check
# 5. Architecture rules only isolated by JUnit tag
mvn test -Dgroups=architecture -Dsurefire.failIfNoSpecifiedTests=false
Why compile first? Most Java static analysis tools require compiled bytecode. A compile failure is also the cheapest signal no point running ArchUnit on code that does not compile.
Architecture Rules with ArchUnit
- Brief intro: ArchUnit lets you write your architecture decisions as executable tests
- These are not regular unit tests they validate structure, not logic
- Tag them separately so the Validation Agent and Test Agent have zero overlap
@Tag("architecture")
@AnalyzeClasses(packages = "com.yourapp")
class ArchitectureRules {
// Controllers must not call repositories directly
@ArchTest
static final ArchRule no_direct_repo_in_controllers =
noClasses()
.that().resideInAPackage("..controller..")
.should().dependOnClassesThat()
.resideInAPackage("..repository..");
// Services must not import Spring MVC annotations
@ArchTest
static final ArchRule services_must_not_use_mvc =
noClasses()
.that().resideInAPackage("..service..")
.should().dependOnClassesThat()
.resideInAPackage("org.springframework.web.bind.annotation..");
// Naming conventions enforced
@ArchTest
static final ArchRule controllers_named_correctly =
classes()
.that().resideInAPackage("..controller..")
.should().haveSimpleNameEndingWith("Controller");
// @Transactional only allowed on service layer
@ArchTest
static final ArchRule transactional_only_on_services =
noClasses()
.that().resideOutsideOfPackage("..service..")
.should().beAnnotatedWith(Transactional.class);
}
The Test Agent excludes this tag so there is zero overlap between the two agents:
# Validation Agent runs ONLY architecture tests
mvn test -Dgroups=architecture -Dsurefire.failIfNoSpecifiedTests=false
# Test Agent runs everything EXCEPT architecture tests
mvn test -DexcludedGroups=architecture
The Standardized Violation Schema The Secret Sauce
- I find each tool has its own output format it can be noisy and inconsistent across tools
- The agent cannot reliably reason about what to fix if the input format varies per tool
- Solution: normalize everything into one consistent JSON schema before feeding it into the fix loop
- We can use AI to help with this normalization step write a prompt that takes raw tool output and maps it to the schema this can be done by giving it a few examples of the input and output format that are required.
- This is the same schema used in the JS version the contract does not change, only the tools that populate it do
- Tools change but the schema is stable and consistent across languages this is the key to making the rest of the pipeline tool-agnostic.
[
{
"rule": "ArchUnit/no-direct-repo-in-controllers",
"file": "src/main/java/com/app/controller/UserController.java",
"line": 34,
"column": null,
"severity": "error",
"message": "Controllers should not import repositories directly. Use a service.",
"fix_suggestion": "Replace UserRepository injection with UserService. Controllers should only depend on the service layer."
},
{
"rule": "checkstyle/MethodLength",
"file": "src/main/java/com/app/service/OrderService.java",
"line": 87,
"column": 1,
"severity": "warning",
"message": "Method length is 72 lines (max 50).",
"fix_suggestion": "Extract the validation logic into a private helper method to reduce method length."
}
]
Two severity levels, two behaviors:
-
severity: "error"→ fails validation, triggers the auto-fix retry loop -
severity: "warning"→ logged and visible but does not fail the phase
A note on fix_suggestion for Java: ESLint can generate suggestions natively. Java tools cannot. Instead, maintain a small rule registry a lookup map of rule name → suggestion string that the normalizer uses when building the output. Upfront effort, but it pays off every retry cycle.
The Auto-Fix Retry Loop
- When violations are found the agent does not stop it feeds the structured violations back to the LLM to fix, then re-validates
- Hard cap at 3 retries before escalating to human intervention
- Two failure modes to guard against:
Build Output
↓
Run Validation Tools
↓
Normalize all output → JSON violations array
↓
violations.length > 0?
YES → Feed violations to fix agent → Re-validate (max 3 attempts)
NO → Phase complete ✅
↓
Still failing after 3 retries? → Halt, surface to human 🛑
Violation diffing between retries track the count before and after each fix attempt. If the count is not going down, the agent is stuck. Escalate instead of looping.
Regression detection if new violations appear that were not present before a fix attempt, the fix introduced a regression. Treat this as a separate signal and re-run from the last clean state rather than continuing forward.
What this looks like in your pipeline output:
Phase 3: Validation
======================================================================
SUCCESS
Critical: 0, Warnings: 2, Attempts: 2
Found 4 violations on first pass
Auto-fixed: controller importing repository directly, method too long,
missing @Override annotation, unused import
Re-validated: CLEAN
What About SonarQube?
- You might already have SonarQube running in CI does it belong here too?
- Short answer: no, not inside the validation agent
-
mvn sonar:sonaris slow and expensive a bad fit for a loop that may run 3 times - This agent runs before a push, so there is no Sonar result to even poll yet
- SonarQube's natural home is post-push in CI, as a final safety net before merge
- Instead, run SonarLint in connected mode locally in your IDE same quality profile as your server, zero pipeline cost
The principle: order checks by cost. Cheap and fast first, expensive later (or delegate to CI entirely). This is why compile runs before Checkstyle, and Checkstyle before ArchUnit.
Before and After: What Your Test Agent Receives
Without a validation agent the test agent receives raw AI output that may include:
- A controller calling a repository directly (arch violation)
- A method that is 90 lines long (PMD)
- Unused imports (Checkstyle)
- A missing null check (SpotBugs)
- The test agent now has to fight bad structure AND broken tests simultaneously
With a validation agent the test agent receives code that has already passed:
- Compile check
- Style and formatting rules
- Smell and complexity thresholds
- Bug pattern analysis
- Your architectural boundaries
The test agent works with clean, well-structured code every time. That is why your test agent rarely needs all 3 of its own retries.
Key Takeaway
The validate phase is not about distrusting AI. It is about applying the same rigor to AI generated code that you apply to any code the same linters, the same architectural rules, the same standards your team already agreed on. The difference is it runs automatically, fixes itself, and only escalates to you when it genuinely cannot resolve the issue.
The code that reaches your Test Agent has already been through a compile check, style validation, smell detection, bug pattern analysis, and your architecture rules. You are not reviewing raw AI output. You are reviewing code that has already been through the gauntlet.
What's Next
- I plan on writing a post around the Review Agent next why its needed and how can we apply a but more automation if there are any issues found during that phase.
- The Review Agent did the AI actually build what the spec asked for?
Top comments (0)
Some comments may only be visible to logged-in visitors. Sign in to view all comments.