
holistis

Posted on • Originally published at longevityai.nl

I Built a Self-Improving Health Platform: Five AI Agents That Learn Every Week

Originally published on longevityai.nl — for full context, comments and related articles, visit the source.

Most AI products are static. You fine-tune a model, ship it, and it stays exactly as smart as the day you launched. Your users get the same quality on day 1 as on day 365.

Mine doesn't work that way.

Every Wednesday at 3am, the first of five AI agents wakes up. Over the week the agents pass results to each other and make the next week's reports smarter, without me touching a single line of code.

This is the architecture that makes it possible.


The problem with static AI products

I run Longevity AI, a health platform that generates personalized 6-month lifestyle plans from a 28-question intake. It cross-references 10+ organ systems, checks 400+ EFSA-regulated claims, and outputs a clinically framed report in under 15 minutes.

The core AI is Claude Sonnet. It's powerful. But it only knows what I've taught it.

The problem: health science moves fast. A new PubMed paper on magnesium and sleep quality drops. My platform doesn't know. A patient with a rare medication combination comes in. The report might miss the interaction. A legal claim sneaks past review. Nobody catches it until a user complains.

I had two options:

  1. Hire a team of researchers and QA engineers to manually update the system
  2. Build agents that do it automatically

I chose option 2. Here's exactly how it works.


The multi-agent architecture

Four agents run on a fixed schedule. The fifth, the orchestrator, is always active and coordinates them.

Wednesday 03:00  Synthetic Patients Agent
Wednesday 04:00  Auto-KB Agent
Tuesday   03:30  Developer Tools Radar
Monday    07:00  Weekly Digest Agent
Always active    Agent Orchestrator

They don't share a runtime. They communicate through the database and a lightweight event system. No complex framework — just reportAgentEvent() and a rules table.
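The schedule above maps directly onto cron expressions. A minimal sketch, assuming standard five-field cron syntax (minute, hour, day of month, month, day of week); the `AgentJob` type and the job names are hypothetical and not taken from the platform's code:

```typescript
// Hypothetical schedule table. Names and the cron format are assumptions.
type AgentJob = {
  name: string;
  // Five-field cron expression, or null for an event-driven agent.
  cron: string | null;
};

const AGENT_SCHEDULE: AgentJob[] = [
  { name: "synthetic-patients", cron: "0 3 * * 3" },  // Wednesday 03:00
  { name: "auto-kb",            cron: "0 4 * * 3" },  // Wednesday 04:00
  { name: "dev-tools-radar",    cron: "30 3 * * 2" }, // Tuesday 03:30
  { name: "weekly-digest",      cron: "0 7 * * 1" },  // Monday 07:00
  { name: "orchestrator",       cron: null },         // always active
];

// Returns only the jobs that a cron runner would need to register.
function scheduledJobs(jobs: AgentJob[]): AgentJob[] {
  return jobs.filter((job) => job.cron !== null);
}
```

Keeping the schedule in one table makes it easy to see that the Auto-KB run is placed an hour after the synthetic loop it depends on.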


Agent 1: Synthetic Patients

The core of the self-improvement loop.

Every Wednesday at 3am, 10 synthetic patient profiles are selected from a static template library (5 conditions x 2 psychological archetypes). These are fake patients with real-looking intake responses: ferritin levels, medication lists, trauma history, stress scores.

Each synthetic patient goes through the exact same production pipeline as a real user. Full Sonnet report generation. No shortcuts.
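The weekly batch of 10 is the cross product of conditions and archetypes. A small sketch of that selection, assuming placeholder condition names (the article gives only the counts) and a hypothetical `SyntheticProfile` type:

```typescript
// Hypothetical profile shape; the real templates carry full intake responses.
type SyntheticProfile = { condition: string; archetype: string };

// Builds every condition/archetype combination.
function buildProfiles(conditions: string[], archetypes: string[]): SyntheticProfile[] {
  const profiles: SyntheticProfile[] = [];
  for (const condition of conditions) {
    for (const archetype of archetypes) {
      profiles.push({ condition, archetype });
    }
  }
  return profiles;
}

// Placeholder names, for illustration only.
const conditions = ["condition-a", "condition-b", "condition-c", "condition-d", "condition-e"];
const archetypes = ["overwhelmed", "skeptical"];

const weeklyBatch = buildProfiles(conditions, archetypes); // 5 x 2 = 10 profiles
```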

Then a second model, Claude Haiku, scores each report on 4 dimensions:

| Dimension | What it checks | Gap threshold |
| --- | --- | --- |
| Protocol depth | Are expected correlations for this condition named? | below 6/10 |
| Personalization | Are this patient's specific details in the report? | below 6/10 |
| Supplement specificity | Active biological forms named (magnesium bisglycinate, not just magnesium)? | below 6/10 |
| Legal safety | No forbidden medical claims, no stop-medication advice? | below 7/10 |

Any score below its threshold becomes a knowledge gap proposal. A legal safety score below 5 additionally triggers an immediate compliance scan, run synchronously before anything else continues.
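The threshold logic can be sketched as a pure function. This is an illustration under assumptions: the type names, field names, and the shape of a gap proposal are hypothetical; only the threshold values come from the article.

```typescript
type Dimension = "protocolDepth" | "personalization" | "supplementSpecificity" | "legalSafety";
type Scores = Record<Dimension, number>; // each score on a 0-10 scale

// Gap thresholds per dimension: a score strictly below the value is a gap.
const GAP_THRESHOLDS: Scores = {
  protocolDepth: 6,
  personalization: 6,
  supplementSpecificity: 6,
  legalSafety: 7,
};

// A legal safety score below this value also triggers the compliance scan.
const LEGAL_SCAN_THRESHOLD = 5;

type GapProposal = { condition: string; dimension: Dimension; score: number };

function evaluateReport(condition: string, scores: Scores): { gaps: GapProposal[]; legalFlag: boolean } {
  const dimensions: Dimension[] = ["protocolDepth", "personalization", "supplementSpecificity", "legalSafety"];
  const gaps: GapProposal[] = [];
  for (const dimension of dimensions) {
    if (scores[dimension] < GAP_THRESHOLDS[dimension]) {
      gaps.push({ condition, dimension, score: scores[dimension] });
    }
  }
  return { gaps, legalFlag: scores.legalSafety < LEGAL_SCAN_THRESHOLD };
}
```

Note the two separate legal thresholds: a score of 5 or 6 produces a gap proposal but no immediate scan.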

Cost: ~€0.55/week. Sonnet for the reports, Haiku for scoring.


Agent 2: Auto-KB

The knowledge base that writes itself.

The gap proposals from Agent 1 contain condition types and dimensions. Agent 2 converts these into PubMed queries, fetches abstracts via the free NCBI API, and sends each abstract to Haiku with one instruction:

Extract 3-5 factual claims from this abstract that are directly relevant to [condition]. Return structured triples: subject, predicate, object.
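A rough sketch of the two string-building steps involved: the PubMed search URL and the extraction prompt. The URL targets the public NCBI E-utilities `esearch.fcgi` endpoint; how a gap is turned into search terms is an assumption here, and both function names are hypothetical.

```typescript
const EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils";

// Builds an ESearch URL that returns matching PubMed IDs as JSON.
// Combining the condition with free-text terms via AND is an assumption.
function buildSearchUrl(condition: string, extraTerms: string, maxResults: number): string {
  const term = encodeURIComponent(condition + " AND " + extraTerms);
  return EUTILS_BASE + "/esearch.fcgi?db=pubmed&retmode=json&retmax=" + maxResults + "&term=" + term;
}

// Fills the [condition] slot of the instruction and appends the abstract.
function buildExtractionPrompt(condition: string, abstractText: string): string {
  return (
    "Extract 3-5 factual claims from this abstract that are directly relevant to " +
    condition +
    ". Return structured triples: subject, predicate, object.\n\nAbstract:\n" +
    abstractText
  );
}
```

The IDs returned by the search would then be passed to the `efetch.fcgi` endpoint to retrieve the abstracts themselves.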

The triples land in a knowledge_triples table. The report generator reads from this table at runtime. No retraining. No fine-tuning. Just better context the next time a report is generated.
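Reading triples at runtime amounts to filtering rows and rendering them as prompt context. A minimal sketch, assuming each row carries a `condition` column (not confirmed by the article) and using hypothetical names throughout:

```typescript
// Hypothetical row shape for the knowledge_triples table.
type Triple = { subject: string; predicate: string; object: string; condition: string };

// Renders up to `limit` triples for one condition as plain-text context lines.
function renderKnowledgeContext(triples: Triple[], condition: string, limit: number): string {
  const lines: string[] = [];
  for (const triple of triples) {
    if (triple.condition !== condition) continue;
    if (lines.length >= limit) break;
    lines.push("- " + triple.subject + " " + triple.predicate + " " + triple.object);
  }
  // An empty string means there is nothing to add to the prompt.
  return lines.length === 0 ? "" : "Relevant findings:\n" + lines.join("\n");
}
```

Capping the number of lines keeps the added context, and therefore the token cost per report, bounded as the table grows.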

By Wednesday afternoon, the knowledge base has been updated with whatever the synthetic patients couldn't answer well. By Thursday morning, real patients get smarter reports.

The feedback loop:

Synthetic patient gets weak report
  → Gap detected
    → PubMed abstract fetched
      → Facts extracted
        → knowledge_triples updated
          → Next patient gets stronger report

Agent 3: Developer Tools Radar

Because staying current is also a product decision.

Every Tuesday at 3:30am, the radar scans GitHub Trending and dev.to for tools that match a static relevance filter: playwright, trpc, drizzle, anthropic, health, automation, react, typescript.
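A static relevance filter like this can be as simple as a case-insensitive substring match. A sketch under that assumption; the `RadarItem` shape and function names are hypothetical, and the keyword list is the one given above:

```typescript
const RELEVANCE_KEYWORDS = [
  "playwright", "trpc", "drizzle", "anthropic",
  "health", "automation", "react", "typescript",
];

// Hypothetical shape of one scraped entry.
type RadarItem = { title: string; description: string };

// Returns the keywords found in the item's title or description.
function matchedKeywords(item: RadarItem): string[] {
  const haystack = (item.title + " " + item.description).toLowerCase();
  return RELEVANCE_KEYWORDS.filter((keyword) => haystack.indexOf(keyword) !== -1);
}

// Keeps only items that match at least one keyword.
function filterRelevant(items: RadarItem[]): RadarItem[] {
  return items.filter((item) => matchedKeywords(item).length > 0);
}
```

Substring matching is deliberately crude: "react" also matches "reactive", which is an acceptable false positive when a human reads the summaries anyway.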

Haiku summarizes each match in 1-2 sentences. The summaries land in the admin UI. Monday morning I get a digest with what the dev world built this week that's relevant to my stack.

Cost: €0.04/month.


Agent 4: The Orchestrator

The rule engine that connects everything.

Each agent calls reportAgentEvent(type, result) when it finishes. The orchestrator applies rules:

// R1: Legal flag → immediate compliance scan
if (type === "synthetic_loop_done" && result.legalFlags > 0) {
  await runComplianceDriftScan(); // synchronous, not queued
}

// R2: KB pipeline returned 0 facts → warning
if (type === "auto_kb_done" && result.factsInserted === 0) {
  console.warn("[orchestrator] Auto-KB returned 0 facts");
}

Rule R1 is the critical one. If a synthetic patient triggers a legal safety score below 5, the compliance agent doesn't wait until next week. It runs immediately. The orchestrator also lets me add new rules without touching the agents themselves.
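One way to make rules addable without touching the agents is to keep them in a table of condition and action pairs. The sketch below is an assumption about how that could look, not the platform's actual code: `Rule`, `addRule`, and `matchRules` are hypothetical, and `runComplianceDriftScan` is stubbed.

```typescript
type AgentEvent = { type: string; result: Record<string, number> };

type Rule = {
  id: string;
  when: (event: AgentEvent) => boolean;
  run: (event: AgentEvent) => Promise<void> | void;
};

const rules: Rule[] = [];

function addRule(rule: Rule): void {
  rules.push(rule);
}

// Pure lookup: which rules apply to this event?
function matchRules(event: AgentEvent): Rule[] {
  return rules.filter((rule) => rule.when(event));
}

// Stub standing in for the real compliance scan.
async function runComplianceDriftScan(): Promise<void> {}

// Agents call this when they finish; matching rules run in order, awaited inline.
async function reportAgentEvent(type: string, result: Record<string, number>): Promise<void> {
  const event: AgentEvent = { type, result };
  for (const rule of matchRules(event)) {
    await rule.run(event);
  }
}

addRule({
  id: "R1",
  when: (e) => e.type === "synthetic_loop_done" && e.result.legalFlags > 0,
  run: () => runComplianceDriftScan(),
});

addRule({
  id: "R2",
  when: (e) => e.type === "auto_kb_done" && e.result.factsInserted === 0,
  run: () => console.warn("[orchestrator] Auto-KB returned 0 facts"),
});
```

With this layout a new rule is one more `addRule` call. The agents only ever know about `reportAgentEvent`.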


Agent 5: Weekly Digest

The operator dashboard I never have to build.

Every Monday at 7am, an HTML email lands in my inbox with:

  • How many new facts entered the knowledge base this week
  • Synthetic loop results: gaps found, legal flags if any
  • Which PubMed papers were automatically processed
  • Which developer tools are on the radar
  • System cost for the week

I know exactly what the system learned, what it fixed, and what it flagged without logging into a dashboard or running queries.
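Rendering such a digest needs little more than string templating. A sketch assuming the weekly numbers have already been queried into a hypothetical `WeeklyDigest` object; the field names and HTML layout are illustrative only:

```typescript
type WeeklyDigest = {
  newFacts: number;
  gapsFound: number;
  legalFlags: number;
  papers: string[]; // titles of processed PubMed papers
  tools: string[];  // developer tools on the radar
  costEur: number;
};

// Escapes text that originates outside the system before placing it in HTML.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

function renderList(title: string, items: string[]): string {
  const body = items.length === 0
    ? "<li>none</li>"
    : items.map((item) => "<li>" + escapeHtml(item) + "</li>").join("");
  return "<h3>" + escapeHtml(title) + "</h3><ul>" + body + "</ul>";
}

function renderDigest(digest: WeeklyDigest): string {
  return [
    "<h2>Weekly digest</h2>",
    "<p>New facts in the knowledge base: " + digest.newFacts + "</p>",
    "<p>Synthetic loop: " + digest.gapsFound + " gaps, " + digest.legalFlags + " legal flags</p>",
    renderList("PubMed papers processed", digest.papers),
    renderList("Developer tools on the radar", digest.tools),
    "<p>System cost this week: €" + digest.costEur.toFixed(2) + "</p>",
  ].join("\n");
}
```

Paper and tool titles come from external sources, so they are escaped before being embedded in the email.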


Psychological profiling at €0 extra

While building the agent system, I added something that costs literally nothing.

Every intake has responses about stress levels, anxiety, medication history, trauma, and previous treatment attempts. From these existing fields — no new questions, no LLM call — I derive a psychological archetype:

| Archetype | Signal | Coaching instruction |
| --- | --- | --- |
| Overwhelmed | High stress + multiple specialists + frustration keywords | Start with deep validation, cite their own words, tiny steps only |
| Skeptical | Multiple specialists + frustration, low anxiety | Biological mechanism first, then recommendation. Name the researcher. |
| Ready but scared | Fear keywords + previous attempts + moderate stress | Week 1 max 2 changes, explicit success markers per week |
| Knowledge-seeker | Blood values mentioned + detailed answers + no fear | Enzymes and neurotransmitters before the advice |
| Beginner | No prior treatment, short answers | Warm, jargon-free, reassuring timeline |

The archetype gets injected as a single instruction line into the report system prompt. The LLM adapts tone, structure, and depth automatically. No extra LLM call, and only one extra line of prompt tokens at generation time.
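A rule-based derivation along these lines could look like the sketch below. Everything concrete in it is an assumption: the intake field names, the numeric cut-offs, the keyword lists, and the use of "beginner" as the fallback are illustrative. Only the signals and coaching instructions come from the table.

```typescript
// Hypothetical intake fields; stress and anxiety assumed on a 0-10 scale.
type Intake = {
  stress: number;
  anxiety: number;
  specialistsSeen: number;
  previousAttempts: number;
  mentionsBloodValues: boolean;
  freeText: string;
};

type Archetype = "overwhelmed" | "skeptical" | "ready_but_scared" | "knowledge_seeker" | "beginner";

const FRUSTRATION_WORDS = ["frustrated", "nobody listens", "given up"]; // assumed list
const FEAR_WORDS = ["afraid", "scared", "worried"];                     // assumed list

function hasAny(text: string, words: string[]): boolean {
  return words.some((word) => text.indexOf(word) !== -1);
}

function deriveArchetype(intake: Intake): Archetype {
  const text = intake.freeText.toLowerCase();
  const frustrated = hasAny(text, FRUSTRATION_WORDS);
  const fearful = hasAny(text, FEAR_WORDS);
  const multipleSpecialists = intake.specialistsSeen >= 2;
  const detailed = text.length >= 200; // assumed cut-off for "detailed answers"

  if (intake.stress >= 7 && multipleSpecialists && frustrated) return "overwhelmed";
  if (multipleSpecialists && frustrated && intake.anxiety <= 3) return "skeptical";
  if (fearful && intake.previousAttempts > 0 && intake.stress >= 4 && intake.stress <= 6) return "ready_but_scared";
  if (intake.mentionsBloodValues && detailed && !fearful) return "knowledge_seeker";
  return "beginner"; // fallback when no other signal matches (assumption)
}

const COACHING: Record<Archetype, string> = {
  overwhelmed: "Start with deep validation, cite their own words, tiny steps only",
  skeptical: "Biological mechanism first, then recommendation. Name the researcher.",
  ready_but_scared: "Week 1 max 2 changes, explicit success markers per week",
  knowledge_seeker: "Enzymes and neurotransmitters before the advice",
  beginner: "Warm, jargon-free, reassuring timeline",
};

// The single line that would be appended to the report system prompt.
function archetypeInstruction(intake: Intake): string {
  return "Coaching style: " + COACHING[deriveArchetype(intake)];
}
```

Rule order matters here: the overwhelmed check runs before the skeptical one because both share the specialists and frustration signals.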


What this cost to build

The entire multi-agent system is about 1,200 lines of TypeScript. It took one focused session to architect and implement.

Weekly operating cost:

  • Synthetic loop: max €0.55
  • Auto-KB pipeline: ~€0.01
  • Developer radar: ~€0.01
  • Weekly digest: €0 (DB queries + one email)
  • Total: under €0.60/week

The system pays for itself if it catches one legal compliance issue before a user does.


What I'd do differently

Haiku scores need calibration. The first few weeks I'll manually compare Haiku's scores against my own judgment and adjust thresholds if needed.

Synthetic patient templates are static. They don't learn. The knowledge base learns, but the patient profiles stay fixed. A future version would generate edge-case profiles dynamically based on what real users submit.

The orchestrator is in-memory. Events don't survive a server restart. For a high-stakes system I'd persist the event log. For a solo SaaS at this scale, it's fine.


The bigger point

Most solo founders think "multi-agent AI" means using CrewAI or AutoGen with 10 chained LLM calls. That's one way to do it.

What I have is simpler and more reliable: purpose-built agents that do one thing well, communicate through a database, and are coordinated by a lightweight rule engine. No framework. No magic. Just TypeScript, cron jobs, and a clear separation of concerns.

The result: a platform that gets measurably smarter every week, catches its own legal issues, updates its own knowledge base, and tells me what it learned while I sleep.


Built on: React 18 + tRPC + Drizzle/MySQL + Claude Sonnet/Haiku + Railway

Live at longevityai.nl

Questions or want to talk architecture? info@holistischadviseur.nl


