DEV Community

bob lee

How I Built a Self-Verifying AI Agent with DynamoDB and ReAct Reasoning

Built for the #H0Hackathon — Hack the Zero Stack with Vercel v0 and AWS Databases


Most AI pipelines follow a fixed script: input in, output out, nobody checks the work. For the H0 hackathon (Track 2: Monetizable B2B App), I built ChemSpectra Agent — an FTIR spectral analysis system where the AI verifies its own conclusions and self-corrects when evidence conflicts.

The ReAct Loop

Instead of hardcoding which tools to call, the agent runs a ReAct loop built on Qwen-3.7-Max function calling. The LLM selects autonomously from five tools: identify_material (130K+ reference spectra), explain_peaks, assign_functional_groups, match_library_topk, and search_public_results. A material ID request might trigger two tools; a deformulation request triggers all three analytical tools. The LLM decides, not the developer.
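To make the loop concrete, here is a minimal sketch of the control flow. The tool names come from the list above, but the tool bodies are stubs, and `choose_action` is a hypothetical stand-in for the real function-calling request to the LLM; the actual project may structure this differently.

```python
from typing import Callable, Optional

# Tool names are from the article; the bodies are placeholder stubs.
TOOLS: dict[str, Callable[[dict], dict]] = {
    "identify_material": lambda args: {"top_match": "PET", "score": 0.91},
    "explain_peaks": lambda args: {"peaks": [1715, 1240]},
    "assign_functional_groups": lambda args: {"groups": ["ester", "c=o", "aromatic"]},
    "match_library_topk": lambda args: {"candidates": ["PET", "PBT"]},
    "search_public_results": lambda args: {"hits": []},
}


def react_loop(choose_action, request: str, max_iterations: int = 4) -> list[dict]:
    """Call tools until the model stops asking for them or the cap is reached."""
    history: list[dict] = []
    for _ in range(max_iterations):
        action = choose_action(request, history)  # stands in for the LLM call
        if action is None:  # the model decided it has enough evidence
            break
        name, args = action
        if name not in TOOLS:
            # Record the bad call so the model can see it on the next turn.
            history.append({"tool": name, "error": "unknown tool"})
            continue
        history.append({"tool": name, "result": TOOLS[name](args)})
    return history


def scripted_model(request: str, history: list[dict]) -> Optional[tuple]:
    """Toy policy (hypothetical): request two tools, then stop."""
    plan = ["identify_material", "assign_functional_groups"]
    return (plan[len(history)], {}) if len(history) < len(plan) else None


trace = react_loop(scripted_model, "What material is this?")
print([step["tool"] for step in trace])
```

The point of the design is that nothing in `react_loop` names a specific tool: the sequence comes entirely from whatever `choose_action` returns, which is where the LLM sits in the real system.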

Cross-Validation and Self-Verification

After tools return results, _detect_evidence_conflicts() compares outputs. If identify_material says "PET" but assign_functional_groups found no ester groups, that's a contradiction:

expected_groups = {
    "pet": ["ester", "c=o", "aromatic"],
    "nylon": ["amide", "n-h", "c=o"],
}
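A minimal sketch of how such a check could work, assuming the conflict rule is simply "every expected group must appear in the assigned groups". The function name, signature, and message format here are illustrative, not the project's actual _detect_evidence_conflicts().

```python
EXPECTED_GROUPS = {
    "pet": ["ester", "c=o", "aromatic"],
    "nylon": ["amide", "n-h", "c=o"],
}


def detect_evidence_conflicts(top_match: str, found_groups: list[str]) -> list[str]:
    """Return one message per expected functional group missing from the evidence."""
    expected = EXPECTED_GROUPS.get(top_match.lower(), [])
    found = {group.lower() for group in found_groups}
    return [
        f"{top_match} expects '{group}' but it was not assigned"
        for group in expected
        if group not in found
    ]


# "PET" was identified, but no ester group was assigned: one conflict.
conflicts = detect_evidence_conflicts("PET", ["c=o", "aromatic"])
print(conflicts)
```

Materials missing from the lookup table produce no conflicts in this sketch, so an unknown match is treated as "nothing to contradict" rather than as an error.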

The agent estimates confidence from match scores, candidate score gaps, and functional group coverage. If confidence falls below 0.75 or any conflict is detected, a verification round fires automatically:

needs_verification = (
    confidence < 0.75 or len(conflicts) > 0
)
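The article names the three inputs to the confidence estimate but not the formula, so the following is only a sketch under assumed weights (0.5, 0.2, 0.3). `estimate_confidence` and its parameters are hypothetical names; `coverage` is taken to be the fraction of expected functional groups that were actually found.

```python
def estimate_confidence(scores: list[float], coverage: float) -> float:
    """Blend top score, gap to the runner-up, and group coverage into [0, 1]."""
    ranked = sorted(scores, reverse=True)
    top = ranked[0] if ranked else 0.0
    # With a single candidate there is no runner-up, so the gap is the score itself.
    gap = top - ranked[1] if len(ranked) > 1 else top
    # Assumed weights; the real system may combine these signals differently.
    raw = 0.5 * top + 0.2 * gap + 0.3 * coverage
    return max(0.0, min(1.0, raw))


def needs_verification(confidence: float, conflicts: list[str], threshold: float = 0.75) -> bool:
    """Same rule as the snippet above: low confidence or any conflict."""
    return confidence < threshold or len(conflicts) > 0


clear_winner = estimate_confidence([0.90, 0.60], coverage=1.0)
close_call = estimate_confidence([0.80, 0.75], coverage=2 / 3)
print(round(clear_winner, 2), needs_verification(clear_winner, []))
print(round(close_call, 2), needs_verification(close_call, []))
```

A narrow gap between the top two candidates pulls the estimate down even when the top score is high, which is what lets a close call trigger verification without any explicit conflict.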

The agent gets told exactly what went wrong and calls additional tools to investigate. The post-verification confidence is logged next to the original estimate, creating traces like [0.62, 0.84].
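One way to picture the verification round, as a sketch only: the conflicts are folded into a feedback message, a second agent pass runs, and its confidence is appended to the trace. `build_feedback`, `run_verification`, and the `reinvestigate` callable are hypothetical names standing in for the real second ReAct pass.

```python
from typing import Callable


def build_feedback(confidence: float, conflicts: list[str]) -> str:
    """Describe what went wrong so the next round can target it."""
    lines = [f"Initial confidence was {confidence:.2f}."]
    lines += [f"Conflict: {conflict}" for conflict in conflicts]
    lines.append("Call additional tools to resolve these issues.")
    return "\n".join(lines)


def run_verification(
    confidence: float,
    conflicts: list[str],
    reinvestigate: Callable[[str], float],
    threshold: float = 0.75,
) -> list[float]:
    """Return the confidence trace: one entry, or two if verification ran."""
    trace = [confidence]
    if confidence < threshold or conflicts:
        # reinvestigate stands in for a second agent round that returns
        # a fresh confidence estimate.
        trace.append(reinvestigate(build_feedback(confidence, conflicts)))
    return trace


trace = run_verification(
    0.62,
    ["PET expects 'ester' but it was not assigned"],
    reinvestigate=lambda feedback: 0.84,  # canned value for illustration
)
print(trace)
```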

DynamoDB: Beyond Key-Value Storage

Every session persists to DynamoDB with a 30-day TTL: tool call logs, confidence traces, synthesis, and final report. But we went deeper than basic CRUD:

  • Two GSIs: gsi-created (partition: ALL, sort: created_at) replaces a full-table scan with an efficient time-ordered query; gsi-material (partition: top_match, sort: created_at) enables "show me all PET analyses" aggregation
  • Atomic counters: a separate chemspectra-stats table tracks total_analyses and total_tools_called via DynamoDB ADD operations, which are safe under concurrent requests
  • Conditional writes: session writes use the condition attribute_not_exists(session_id) OR step <> :confirmed, so a concurrent request cannot overwrite a finalized report
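The three patterns above can be sketched as request parameters for the low-level boto3 DynamoDB client. To keep the example runnable without AWS credentials, the functions only build the keyword arguments. Several names are assumptions not stated in the article: the sessions table name, the `gsi_pk` attribute that holds the constant "ALL", the stats table key (`stat_id` = "global"), and the literal value "confirmed".

```python
SESSIONS_TABLE = "chemspectra-sessions"  # assumed name; not given in the article
STATS_TABLE = "chemspectra-stats"


def recent_sessions_query(limit: int = 20) -> dict:
    """kwargs for client.query: newest sessions first via gsi-created."""
    return {
        "TableName": SESSIONS_TABLE,
        "IndexName": "gsi-created",
        "KeyConditionExpression": "gsi_pk = :pk",  # gsi_pk is an assumed attribute name
        "ExpressionAttributeValues": {":pk": {"S": "ALL"}},
        "ScanIndexForward": False,  # descending by created_at
        "Limit": limit,
    }


def material_query(material: str) -> dict:
    """kwargs for client.query: all analyses whose top match is `material`."""
    return {
        "TableName": SESSIONS_TABLE,
        "IndexName": "gsi-material",
        "KeyConditionExpression": "top_match = :m",
        "ExpressionAttributeValues": {":m": {"S": material}},
        "ScanIndexForward": False,
    }


def stats_increment(tools_called: int) -> dict:
    """kwargs for client.update_item: atomic ADD on both counters."""
    return {
        "TableName": STATS_TABLE,
        "Key": {"stat_id": {"S": "global"}},  # assumed key schema
        "UpdateExpression": "ADD total_analyses :one, total_tools_called :n",
        "ExpressionAttributeValues": {
            ":one": {"N": "1"},
            ":n": {"N": str(tools_called)},
        },
    }


def guarded_put(item: dict) -> dict:
    """kwargs for client.put_item: refuse to overwrite a confirmed session."""
    return {
        "TableName": SESSIONS_TABLE,
        "Item": item,
        "ConditionExpression": "attribute_not_exists(session_id) OR #step <> :confirmed",
        # Aliasing the attribute name sidesteps any reserved-word clash.
        "ExpressionAttributeNames": {"#step": "step"},
        "ExpressionAttributeValues": {":confirmed": {"S": "confirmed"}},
    }


print(guarded_put({"session_id": {"S": "abc"}, "step": {"S": "draft"}})["ConditionExpression"])
```

With real credentials, each dict would be passed as `boto3.client("dynamodb").put_item(**guarded_put(item))` and so on. When the condition fails, the client raises ConditionalCheckFailedException, which the caller can treat as "this report is already final".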

Regulated industries (pharma, forensics) require this audit trail. DynamoDB fits because the primary access is single-item by session_id, the GSIs cover the two secondary patterns, and TTL handles cleanup automatically.
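For the TTL part, DynamoDB expects a Number attribute holding a Unix timestamp in seconds. A small sketch of stamping a session item with a 30-day expiry follows; the attribute name `expires_at` and both helper names are assumptions, and the table's TTL setting would need to point at that same attribute.

```python
import time
from typing import Optional

THIRTY_DAYS_SECONDS = 30 * 24 * 60 * 60


def ttl_epoch(now: Optional[float] = None) -> int:
    """Unix epoch seconds 30 days after `now` (defaults to the current time)."""
    base = time.time() if now is None else now
    return int(base) + THIRTY_DAYS_SECONDS


def with_ttl(item: dict, now: Optional[float] = None) -> dict:
    """Return a copy of a low-level DynamoDB item with the expiry attribute set."""
    return {**item, "expires_at": {"N": str(ttl_epoch(now))}}


session = with_ttl({"session_id": {"S": "abc"}})
print(session["expires_at"])
```

Expired items are removed by a background process rather than at the exact timestamp, so readers that must never see stale sessions would still need to filter on `expires_at` themselves.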

Results

The loop runs 2-4 iterations in under 30 seconds. Self-repair of malformed LLM JSON recovers in nearly 100% of cases. This turns "AI that gives answers" into "AI that checks its work", which is essential when reports go into regulatory filings.
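The article does not say how the JSON self-repair works, so the following is only one plausible sketch: try the raw text, then the outermost {...} span (which drops code fences and surrounding chatter), then the same span with trailing commas removed. `repair_json` is a hypothetical name.

```python
import json
import re
from typing import Optional


def repair_json(raw: str) -> Optional[dict]:
    """Try progressively more aggressive fixes; return None if all of them fail."""
    candidates = [raw]
    # Keep only the outermost {...} span, dropping fences or chatter around it.
    start, end = raw.find("{"), raw.rfind("}")
    if start != -1 and end > start:
        body = raw[start : end + 1]
        candidates.append(body)
        # Remove trailing commas that sit right before a closing brace or bracket.
        candidates.append(re.sub(r",\s*([}\]])", r"\1", body))
    for text in candidates:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue
        if isinstance(parsed, dict):
            return parsed
    return None


messy = '```json\n{"tool": "identify_material", "args": {},}\n```'
print(repair_json(messy))
```

The trailing-comma regex is deliberately crude: it would also alter a string value that happens to contain ",}", so a production version would want a tolerant parser or a retry prompt to the model as a fallback.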

Try it: chemspectra-agent-h0.vercel.app | Code: github.com/jxbaoxiaodong/chemspectra-agent-h0

This article was written as part of my participation in the H0 AWS+Vercel Hackathon.
