DEV Community

Robert Richter
Beyond "Just Add a Field" - Where MongoDB's Flexibility Actually Matters

As a Solutions Architect at MongoDB I talk quite a lot about the benefits of "flexibility"... especially about data modelling and schema flexibility. And there is nothing wrong with that - I would not stand behind it otherwise as a professional - but sometimes the topic is discussed in an oversimplified way, which can hide the real-world value.

A simple pitch could be "With MongoDB, you can just add a field! No migrations, no ALTER TABLE, no downtime."... and, yeah, that is true. The document model really does give you that flexibility at the database level.

But as with everything in tech... details matter and reality is more complex. I sometimes see people hear this pitch and land in one of two places: either "great, that solves everything!" or (more often among experienced developers) "wait, that makes no sense, it doesn't matter for my Spring application." Both miss the full story.

Before joining MongoDB I spent 7 years building and architecting - not on my own of course :D - a patient engagement application for cancer patients, so I want to ground this in healthcare. And of course we need to talk about AI agents somewhere, because nothing is allowed to exist without that in 2026... but luckily, multi-agent systems are actually a great context to make my point.


Starting Where the Skeptics Start

As somebody who has worked for nearly a decade with Java and Spring ... damn, I am getting old ... I hear "just add a field" and think: "that does not change much for my Spring application". Sales psychology teaches me that I will likely not contradict my own conclusion (a dirty trick from Sales 101), but maybe I should still look a step further.

Say I am running a Spring Boot service backed by MongoDB. I want to add a moodScore to my patient pain diary entries. What do I actually need to do?

I change the @Document entity class... update the DTO... adjust the validation. Then I update the OpenAPI spec... no wait, I did that first because I do API-first! Then I write tests... again, wrong order, failed TDD again. Finally: I deploy.

Alright, now let's compare that to the relational world: I write a Liquibase changeset, update the JPA entity, the DTO, the validation, the OpenAPI spec, tests, deploy. The steps are almost identical. For a simple additive change, the migration tooling handles it just fine either way.

And for any non-trivial change (renaming a field, restructuring a nested object, splitting a collection) likely I am going to need a migration strategy regardless. MongoDB has the Schema Versioning Pattern for exactly this, which is powerful, but it is still some work. The document model does not magically skip the hard part of a breaking schema change within a single service.
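To make that less abstract, here is a minimal read-side sketch in the spirit of the Schema Versioning Pattern. The function name, the v1/v2 shapes, and the lazy-upgrade-on-read approach are my own assumptions for illustration, not the only way to apply the pattern:

```javascript
// Sketch: lazily upgrade old documents on read instead of running a
// big-bang migration. Assumed change: v1 stored painLocation as a flat
// string, v2 wraps it in a structured object.
const CURRENT_VERSION = 2;

function upgradeDiaryEntry(doc) {
  // Documents written before versioning was introduced count as v1.
  const version = doc.schemaVersion ?? 1;
  if (version === CURRENT_VERSION) return doc;

  // v1 -> v2: restructure the flat location string into a nested object.
  const { painLocation, ...rest } = doc;
  return {
    ...rest,
    schemaVersion: 2,
    painLocation: { code: painLocation, system: "local" }
  };
}
```

A background job can additionally rewrite old documents in bulk, but with this read path in place that becomes an optimization, not a prerequisite.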

So if this were the only lens through which I evaluated schema flexibility... I might still conclude it does not make a big difference. And frankly, at that level of abstraction, for a statically typed language with strong framework support (and guardrails)... there is some truth to that. So yes! I am not wrong and did not contradict myself.

But wait! Here is the thing: stopping here is where we miss the full value. Schema flexibility is not primarily a per-service convenience. It is an architectural property. And its value shows when you look at the system, not the service.


Zoom Out: The System Has More Than One Component

Ok, if we now zoom out and go from the "inside a single service" view to the system/component level... the picture changes quite a bit.

Why? Because within a single service, you control both sides of a schema change. You update the entity, the DTO, the validation, the consumer - all in one codebase, one deployment, done. The schema change is coordinated by definition. That is why flexibility feels like it does not matter much there.

But the moment you cross a service or component boundary, that coordination is gone. Components evolve on their own schedule. Nobody merges a single PR that updates all of them at once. The question is no longer "can I add a field?" but "can I add a field without requiring every other component to update simultaneously?". This is the core promise of microservice architectures and, at the same time, the most challenging part to do well.

Think about what "adding a field" actually means when:

  • Two versions of your service are running simultaneously during a Kubernetes rolling update
  • Multiple FHIR data sources send you Patient resources with different extensions and profiles
  • Four AI agents enrich the same document, each with their own output structure, each evolving on their own schedule

This is where flexibility stops being a developer convenience and becomes an architectural property. The system is not just the sum of its services - it is the sum of its services plus all the gaps between their independent lifecycles.

Let me walk through a concrete example to show what I mean.


The Pain Diary That Grew Agents

The scenario: You are building a digital pain diary for chronic pain patients. This is a classic patient-reported outcome use case.

It starts simple. A patient rates their pain (0-10), picks a body location, maybe adds a note. Stored in MongoDB. Straightforward so far and not much to discuss.

{
  "patientId": "Patient/12345",
  "authored": "2026-04-14",
  "painIntensity": 7,
  "painLocation": "lower-back",
  "note": "Worse after sitting all day"
}

Then the agents arrive! Product management wants AI-powered enrichment: a triage agent classifying urgency, a medication interaction agent cross-referencing the patient's med list, a trend analysis agent computing whether pain is worsening, and a clinical summary agent generating a readable summary for the clinician. And if one agent, why not go straight to multi-agent?

Each agent reads the diary entry and writes back its assessment. The document model can handle this natively. Of course, the document grows. And - this is the important bit - the agents do not evolve on the same schedule or at the same speed. The triage agent gets updated weekly, the trend agent monthly, the summary agent whenever someone tweaks the prompt.
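A write-back like that can be a targeted partial update: each agent only sets its own namespaced subdocument, so agents never clobber each other's output. A sketch, where `buildEnrichmentUpdate` is a hypothetical helper of mine (not a MongoDB API) and the field names follow the examples below:

```javascript
// Hypothetical helper: build a MongoDB update that writes exactly one
// agent's enrichment under its own namespaced key.
function buildEnrichmentUpdate(agentName, agentVersion, payload) {
  return {
    $set: {
      [`agentEnrichments.${agentName}`]: {
        agentVersion,
        processedAt: new Date().toISOString(),
        ...payload
      }
    }
  };
}

// With the official Node.js driver this would be applied roughly as:
// await diaryEntries.updateOne(
//   { _id: entryId },
//   buildEnrichmentUpdate("triage", 2, { urgency: "high" })
// );
```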

This is where it gets architecturally interesting. Now we have a nice level of complexity and interacting parts. Time to look at what established architecture has to say about it, before we reinvent things.


The Basics Still Apply

I have said it before and I will keep saying it: the basics of solid software architecture are extra important when building with AI agents. We should not forget the costly lessons we already learned just because the technology is exciting and new. Software engineering is a constant fight against chaos, after all.

What I just described (multiple independent agents reading from and writing to a shared data structure) is not a new problem. It is a well-known architecture pattern called the Blackboard Pattern, originating in AI research of the 1980s (yes, AI did not start with ChatGPT).

Several people have written good pieces about this in the context of modern agent systems. Rajat Pandit describes it nicely as solving the "Phone Game" problem in agent chains. There is even a Python SDK implementing it with MongoDB and PostgreSQL backends. And it shows up in various forms in discussions about multi-agent coordination patterns.

You will find similar ideas under newer labels... "memory engineering," "shared agent state," "collaborative workspace"... to be fair, some of those are broader than what the blackboard pattern covers. Memory engineering in particular goes well beyond shared state. But if there is an established pattern that has been studied for a long time, it is worth at least looking at before building from scratch.

The three components:

  • Blackboard = your MongoDB collection. The shared, structured data space.
  • Knowledge Sources = your agents. Independent, specialized, loosely coupled.
  • Control Component = your orchestrator. LangGraph, a custom event-driven pipeline, whatever fits.

Blackboard pattern concepts mapped to components
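As a sketch of the control component: the orchestrator can decide which knowledge sources still need to run simply by inspecting the blackboard entry itself. The agent names come from the pain diary scenario; the scheduling logic is deliberately oversimplified:

```javascript
// The known knowledge sources (agents) in this example system.
const AGENTS = ["triage", "medicationCheck", "trendAnalysis", "clinicalSummary"];

// Minimal control-component decision: which agents have not yet written
// their enrichment to this blackboard entry?
function pendingAgents(entry) {
  const done = Object.keys(entry.agentEnrichments ?? {});
  return AGENTS.filter((agent) => !done.includes(agent));
}
```

A real orchestrator (LangGraph, an event-driven pipeline) adds retries, ordering constraints, and failure handling on top, but the core decision stays this simple because the blackboard carries its own state.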

But here is the thing: the blackboard pattern tells you something about the structure. It tells you that agents should interact through shared state rather than direct messaging. What it does not tell you is what happens to that shared state when agents evolve.

And that is the data model question. That is where the document model comes in... and where flexibility actually matters.


What Happens on the Blackboard Over Time

Alright, let's look at the actual data. Here is a fully enriched pain diary entry, all four agents have processed it:

{
  "patientId": "Patient/12345",
  "authored": "2026-04-14",
  "painIntensity": 7,
  "painLocation": "lower-back",
  "agentEnrichments": {
    "triage": {
      "agentVersion": 2,
      "processedAt": "2026-04-14T08:01:12Z",
      "urgency": "high",
      "urgencyCategory": "acute-escalation",
      "confidenceScore": 0.89
    },
    "medicationCheck": {
      "agentVersion": 1,
      "processedAt": "2026-04-14T08:01:15Z",
      "interactions": [],
      "currentMedications": ["Ibuprofen 400mg", "Pregabalin 75mg"]
    },
    "trendAnalysis": {
      "agentVersion": 1,
      "processedAt": "2026-04-14T08:02:01Z",
      "trend": "worsening",
      "weekOverWeekDelta": 1.3,
      "dataPointsUsed": 14
    },
    "clinicalSummary": {
      "agentVersion": 1,
      "processedAt": "2026-04-14T08:02:05Z",
      "summary": "Worsening lower back pain (7/10), trending up over 2 weeks. No medication interactions. Consider clinical review."
    }
  }
}

Now here is where it gets interesting... In the same collection, you will also find entries that look quite different:

An older entry, from before the trend agent even existed:

{
  "patientId": "Patient/67890",
  "authored": "2026-02-10",
  "painIntensity": 5,
  "painLocation": "knee",
  "agentEnrichments": {
    "triage": { "agentVersion": 1, "urgency": "moderate" },
    "medicationCheck": { "agentVersion": 1, "interactions": [] }
  }
}

A recent entry where the triage agent was updated. Its output shape changed, but the other agents are still at v1:

{
  "patientId": "Patient/11111",
  "authored": "2026-04-12",
  "painIntensity": 8,
  "painLocation": "lower-back",
  "agentEnrichments": {
    "triage": { "agentVersion": 2, "urgency": "high", "urgencyCategory": "acute-escalation", "confidenceScore": 0.91 },
    "medicationCheck": { "agentVersion": 1, "interactions": ["NSAIDs + blood thinner"] },
    "trendAnalysis": { "agentVersion": 1, "trend": "worsening", "weekOverWeekDelta": 1.8 },
    "clinicalSummary": { "agentVersion": 1, "summary": "Escalating lower back pain with flagged medication interaction." }
  }
}

And an entry that is still being processed. Only the triage agent has run so far:

{
  "patientId": "Patient/99999",
  "authored": "2026-04-14",
  "painIntensity": 6,
  "painLocation": "shoulder",
  "agentEnrichments": {
    "triage": { "agentVersion": 2, "urgency": "moderate", "urgencyCategory": "stable", "confidenceScore": 0.74 }
  }
}

So we have three documents that are all valid but in different states. Again: no problem for the document model. We like heterogeneous data in our collections.
From a database perspective: no nullable columns for agent outputs that do not exist yet. No EAV tables. No pre-defined union schema that needs updating every time you add an agent or change a prompt.
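One nice consequence: "which entries still need a given agent?" becomes a plain existence check on the subdocument, with no NULL bookkeeping. A sketch (the collection name `diaryEntries` is an assumption):

```javascript
// Filter for diary entries the trend agent has not processed yet.
// Because enrichments are optional subdocuments, absence simply means
// "not processed" - no nullable columns, no sentinel values.
const pendingTrendFilter = {
  "agentEnrichments.trendAnalysis": { $exists: false }
};

// With the Node.js driver, roughly:
// const pending = await diaryEntries.find(pendingTrendFilter).toArray();
```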

With a relational database, you would need to pre-allocate columns for every possible agent output, deal with nulls everywhere, or use other workarounds. PostgreSQL JSONB can technically store this, but there are real limitations beyond just storage: Franck Pachot has explored these challenges in great detail, and I do not want to repeat that here.

By the way, despite working for MongoDB at the time of writing: I have nothing against PostgreSQL, I used it for years. But when it comes to document data at enterprise scale, I think it is just not in the same league.


Agent Evolution Is the New Rolling Deployment

In traditional microservices, a rolling deployment means two code versions running simultaneously. Your database needs to support both. This is a well-understood (not saying easy, though) problem, and MongoDB's Schema Versioning Pattern and the Expand-Contract Pattern handle it nicely.

But in multi-agent systems, there is an important twist: agent versions do not (only) change because a developer merged a PR. They change because someone updated a prompt... or swapped the underlying model... maybe a pesky end user in an admin UI even! Maybe added a new tool to some tool config file?! The output shape shifts - sometimes subtly, sometimes significantly - and this is driven by AI evolution, not code releases.

Consider: triage agent v1 produced a simple urgency string. After a prompt update, v2 now also emits structured categories, confidence scores, and evidence sources.

// Processed by triage agent v1
{
  "triage": {
    "agentVersion": 1,
    "urgency": "high",
    "reasoning": "Pain intensity 8/10, increasing trend"
  }
}

// Processed by triage agent v2 (after prompt update)
{
  "triage": {
    "agentVersion": 2,
    "urgency": "high",
    "urgencyCategory": "acute-escalation",
    "reasoning": "Pain intensity 8/10, increasing trend",
    "confidenceScore": 0.89,
    "evidenceSources": ["diary-trend", "medication-history"]
  }
}

Both shapes coexist in the same collection. The clinical dashboard normalizes on read:

function getTriageInfo(entry) {
  const triage = entry.agentEnrichments?.triage;
  if (!triage) return null;

  return {
    urgency: triage.urgency,
    category: triage.urgencyCategory ?? "unclassified",
    confidence: triage.confidenceScore ?? null
  };
}

The ?? fallbacks handle the v1-vs-v2 difference. This is basically the expand-contract pattern applied to agent evolution instead of code deployment. Same structural challenge, same solution.

I think many teams building multi-agent systems will run into this sooner or later. And the document model (where each document carries its own version context and shape) makes this manageable rather than painful. It is not magic - you still need the discipline of agentVersion fields and normalization logic. But at least the storage layer does not fight you.


And Then There Is Integration

Now imagine all of the above... but the input data is already heterogeneous before the agents even touch it.

If you work in healthcare (or really any domain with complex integrations) you know this reality. FHIR, the healthcare interoperability standard, is built around extensibility. A hospital sends you US Core Patient resources with race and ethnicity extensions. A wearable API adds custom device-binding extensions. A pharmacy system uses DaVinci drug formulary extensions. All describing the same patient, all shaped differently. And more often than not, you have zero control over these upstream systems. My former product manager Mark always said: "The great thing about standards is that everyone can have their own".

So now you have two layers of schema variation: the input data already varies by source, and then your agents add their own evolving output shapes on top. That is the real world picture. Not one source of heterogeneity, but two, compounding each other.

The document model handles both layers natively. No need to normalize everything to the same shape before agents can work with it. No schema change when a new integration partner shows up. Speed of integration matters, and you really do not want to block everything downstream just because an upstream system made a minor change.
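In practice that means reading upstream variation defensively rather than flattening it away. A tiny sketch: the extension URL is the real US Core race extension, but `getExtension` is my own helper, not a FHIR library API:

```javascript
// Look up a FHIR extension by its canonical URL, tolerating resources
// that carry different (or no) extensions from different sources.
function getExtension(resource, url) {
  return (resource.extension ?? []).find((ext) => ext.url === url) ?? null;
}

// Example: a US Core Patient resource stored as-is, extensions and all.
const patient = {
  resourceType: "Patient",
  extension: [
    {
      url: "http://hl7.org/fhir/us/core/StructureDefinition/us-core-race",
      extension: [{ url: "text", valueString: "Example" }]
    }
  ]
};
```

A source that does not send the extension simply yields `null` here, instead of forcing a schema change or a failed insert upstream.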

(If you are working with FHIR data specifically: storing FHIR resources natively in MongoDB and using schema validation to enforce profiles is a surprisingly clean setup. From personal experience though: normalizing FHIR into relational tables is quite the experience. Been there, done that. Cannot recommend ;) )
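To be concrete about that last point, here is a trimmed-down validator sketch using MongoDB's `$jsonSchema` support. The field names follow the diary examples; the collection name is an assumption. The idea: pin down the invariants while leaving the rest of each document open.

```javascript
// Validator sketch: core diary fields are mandatory and bounded, and any
// agent enrichment must at least declare its agentVersion. Everything
// else stays flexible.
const diaryValidator = {
  $jsonSchema: {
    bsonType: "object",
    required: ["patientId", "authored", "painIntensity"],
    properties: {
      painIntensity: { bsonType: "int", minimum: 0, maximum: 10 },
      agentEnrichments: {
        bsonType: "object",
        // Any enrichment subdocument must carry its version marker.
        additionalProperties: {
          bsonType: "object",
          required: ["agentVersion"]
        }
      }
    }
  }
};

// Applied at collection creation, roughly:
// await db.createCollection("diaryEntries", { validator: diaryValidator });
```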


The Flexibility Spectrum

Alright, to bring it together: I think schema flexibility operates on a spectrum. And its value increases the higher up you go.

| Level | What Changes | Flexibility Value |
| --- | --- | --- |
| Per-Service Code | A developer adds a field | Real, but your framework handles most of it |
| Deployment | Two code versions coexist during rollout | Significant. Schema Versioning Pattern. |
| Agent Evolution | Prompt/model changes alter output shape | Critical. Blackboard + versioning. |
| System Integration | Heterogeneous sources, independent lifecycles | Fundamental. Document = natural unit of integration. |

The value of flexibility / impact of it on a spectrum

At the bottom (per-service code) flexibility is real and it does help. But how much... it depends. Your framework handles most of the heavy lifting, at least for something like Java and Spring. You could design a less constrained service natively around the document benefits - think avoiding static typing and going all-in on schema-on-read - but this is not the typical enterprise setup.

At the top (system integration and agent evolution) flexibility becomes an architectural quality attribute. I hold it with Mark Richards... well, rather echo him... in preferring this term over "non-functional requirement" (if you are not familiar with his work, I can recommend it). It directly impacts how independently your components can evolve.

The initial pitch ("just add a field") is true... the initial reaction "does it really matter?" as well... but it describes the bottom of the spectrum. The full value of the document model lives higher up.


So What?

If you are evaluating MongoDB, especially with respect to its flexibility - and of course there is way more to it than "just" that - I would encourage you not to stop at the per-service level. Look at it more holistically.

Ask yourself:

  • Can my system absorb a new integration partner without a coordinated schema change?
  • Can two versions of an agent produce different output shapes without breaking consumers?
  • Can my components evolve at their own pace without deployment coupling?

These are architecture questions. And to make sure you can say "yes" to all of them, the document model is - in my experience and opinion - a genuinely strong helper.

Of course, none of this means schema flexibility is a silver bullet. You still need to think about data governance, validation (MongoDB's schema validation is quite powerful by the way, I wrote a playground for that if you are curious), and you definitely want agentVersion fields and proper normalization-on-read logic. With great flexibility comes also at least some responsibility.

This is maybe not a super original line of thought, but I think it is worth writing down. Interested to hear your experiences - especially if you are building multi-agent systems or working with healthcare integrations. What patterns are you using to handle evolving data shapes?


Robert Richter currently works as a Solution Architect at MongoDB in Germany (as of April 2026). Views are his own. This is not a MongoDB sponsored article. He has 15+ years in product development, including 7 years building healthcare software (oncology, FHIR integrations, patient-reported outcomes) at Varian / Siemens Healthineers.

Disclaimer: I want to thank my good friend Claude for helping me write this article - brainstorming ideas, organizing my thoughts, and doing some of the draft grunt work. As you do these days ;) The opinions and architectural hot takes are mine, the patience for my late-night iteration cycles is his.
