
wei-ciao wu

Originally published at loader.land

We're Building Agentic Flow Cytometry Analysis — Here's Why Nobody Else Is

This post documents how my AI agent (Dusk) and I investigated the landscape of agentic AI in flow cytometry — the search process, the surprises, and why we decided to build something that doesn't exist yet.


The Search That Returned Nothing

It started with a simple PubMed query: "agentic flow cytometry."

Zero results.

I tried variations. "AI agent flow cytometry analysis." "LLM flow cytometry gating." "Autonomous flow cytometry." The results were surprisingly thin. Plenty of papers on ML-assisted gating. A few on automated pipelines. But nothing about an AI system that could reason about flow cytometry data the way a hematopathologist does — reading the literature, understanding clinical context, and adapting its analysis accordingly.

As a thoracic surgeon who runs flow cytometry panels regularly, I took this gap personally. I've watched residents struggle with manual gating, seen inter-operator variability change clinical decisions, and spent hours going back and forth between scatter plots and PubMed trying to figure out what an unusual population actually means.

So my agent and I decided to map the entire landscape. Here's what we found.

The Three Generations of Flow Cytometry AI

After reviewing 29 sources — PubMed papers, company websites, preprints, and framework papers — a clear evolution emerged:

Generation 1: Automated Gating (2015–2022)

The earliest AI applications in flow cytometry focused on one thing: replacing manual gating with ML algorithms. Tools like FlowSOM, CITRUS, and various custom models could cluster cell populations and draw gates automatically.
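To make "automated gating" concrete: at its core, these tools cluster events in marker space and call each cluster a population. Here's a minimal numpy sketch of that idea using plain k-means on synthetic two-population data — not FlowSOM itself (which uses self-organizing maps), just the unsupervised-clustering principle the whole generation shares. The markers and populations are invented for illustration.

```python
import numpy as np

def kmeans_gate(events, k=2, iters=50, seed=0):
    """Toy unsupervised 'gating': cluster events in marker space.

    events: (n, d) array of intensities. Returns one cluster label per event.
    """
    rng = np.random.default_rng(seed)
    centers = events[rng.choice(len(events), k, replace=False)]
    for _ in range(iters):
        # assign each event to its nearest cluster center
        dists = np.linalg.norm(events[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each center as its cluster's mean
        for j in range(k):
            if (labels == j).any():
                centers[j] = events[labels == j].mean(axis=0)
    return labels

# Synthetic sample: two well-separated populations (think lymphocytes
# vs. monocytes on a 2-parameter plot)
rng = np.random.default_rng(1)
pop_a = rng.normal([1.0, 1.0], 0.1, size=(200, 2))
pop_b = rng.normal([3.0, 3.0], 0.1, size=(200, 2))
labels = kmeans_gate(np.vstack([pop_a, pop_b]))
```

The clustering works — and that's exactly the point: the labels are just integers. Nothing in this loop knows what a lymphocyte is.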

The win: Faster, more reproducible than hand-drawn gates.

The limitation: These systems are blind. They don't know why they're gating. They can't read the clinical context, can't adapt to unexpected populations, and can't explain their reasoning.

A 2026 paper from Johns Hopkins (Ding & Baras, Scientific Reports) demonstrated that Multiple Instance Learning with attention mechanisms can serve as an interpretable alternative to manual gating across AML, HIV, and COVID-19 datasets. The attention maps show which cell subsets drive the classification. But it still can't tell you what those cells mean in the context of the latest literature.
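The attention mechanism behind that interpretability is simple to sketch. Below is a minimal numpy version of attention-based MIL pooling: each cell gets a learned weight, and the weighted sum becomes the sample-level embedding. The per-cell weights are the "attention map" the paper refers to. The feature dimensions and parameters here are invented; a real model would learn `w` and `v` by gradient descent.

```python
import numpy as np

def attention_pool(cells, w, v):
    """Attention-based MIL pooling: weight each cell, sum into one embedding.

    cells: (n_cells, d) per-cell features; w: (d, h) and v: (h,) are the
    attention parameters. Returns (sample_embedding, per_cell_attention).
    The attention vector is what makes the classification interpretable:
    it shows which cells drove the call.
    """
    scores = np.tanh(cells @ w) @ v               # one scalar score per cell
    scores = scores - scores.max()                # numerical stability
    attn = np.exp(scores) / np.exp(scores).sum()  # softmax over cells
    return attn @ cells, attn

rng = np.random.default_rng(0)
cells = rng.normal(size=(500, 8))   # hypothetical 8-marker panel, 500 cells
w, v = rng.normal(size=(8, 4)), rng.normal(size=4)
embedding, attn = attention_pool(cells, w, v)
```

Note what the attention map gives you — and what it doesn't: it points at cells, but it can't tell you what those cells mean.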

Generation 2: End-to-End ML Pipelines (2022–2025)

Companies like AHEAD Medicine, DeepFlow (hiPatho), and OMIQ moved beyond gating to build full diagnostic pipelines. AHEAD's platform, for example, uses Gaussian Mixture Models → Fisher Vectorization → SVM classifiers and claims 100X speedup over manual analysis, with partnerships at UPMC, Johns Hopkins, and Roswell Park.

The win: Speed, standardization, clinical deployment feasibility.

The limitation: These are still classifiers, not reasoners. They're trained on specific panels with specific labels. When you encounter something unexpected — a rare cell population, a new therapy artifact, a batch effect from a different instrument — the system breaks silently.

I know this firsthand. I used GMM-based clustering in my own research and eventually abandoned it for UMAP because GMM kept overfitting on multi-center data. The cross-institution batch effects are brutal. An ML model that achieves 94% accuracy on one center's data can fall apart when you mix in samples from another center with a different instrument calibration.
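The failure mode is easy to demonstrate with a toy example. A decision boundary "trained" at Center A works beautifully on Center A's data, then silently collapses when Center B's instrument calibration shifts every intensity upward. All numbers below are invented, but the mechanism is exactly the cross-center problem described above.

```python
import numpy as np

rng = np.random.default_rng(0)
# Center A: CD34 intensities for a negative and a positive population
neg_a = rng.normal(1.0, 0.2, 500)
pos_a = rng.normal(2.0, 0.2, 500)
# A midpoint threshold "trained" on Center A's data
threshold = (neg_a.mean() + pos_a.mean()) / 2

# Center B: identical biology, but calibration shifts everything up
shift = 0.8
neg_b, pos_b = neg_a + shift, pos_a + shift

# Balanced accuracy at each center, same frozen threshold
acc_a = ((neg_a < threshold).mean() + (pos_a >= threshold).mean()) / 2
acc_b = ((neg_b < threshold).mean() + (pos_b >= threshold).mean()) / 2
```

At Center A the threshold is near-perfect; at Center B it misclassifies most of the negative population as positive — with no error message, because statistically the model is doing exactly what it was trained to do.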

The first production deployment of ML-based flow cytometry at ARUP Labs for AML detection (Zuromski et al. 2025) required a Kubernetes-based cloud infrastructure with continuous model monitoring — just to keep a single-disease classifier working reliably. Scale that to dozens of diseases and it becomes clear why this approach has a ceiling.
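"Continuous model monitoring" in that kind of deployment boils down to checks like the one sketched below: compare each incoming batch's marker distribution against the training reference and raise an alarm on drift. This is a simplified z-test sketch, not ARUP's actual monitoring stack (which the paper describes only at the infrastructure level).

```python
import numpy as np

def drift_alarm(train_ref, incoming, z_thresh=4.0):
    """Flag a shift in a marker's intensity distribution between the
    training reference and an incoming batch, via a z-test on the batch mean.
    A production monitor would run a check like this per marker, per batch.
    """
    mu, sigma = train_ref.mean(), train_ref.std()
    z = abs(incoming.mean() - mu) / (sigma / np.sqrt(len(incoming)))
    return bool(z > z_thresh)

rng = np.random.default_rng(0)
ref = rng.normal(1.0, 0.2, 5000)        # training-time reference intensities
ok_batch = rng.normal(1.0, 0.2, 200)    # same instrument, no drift
shifted = rng.normal(1.3, 0.2, 200)     # recalibrated instrument
```

The catch: this tells you *that* something changed, never *why* — and you need one such guardrail per disease, per marker, per site. That's the ceiling.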

Generation 3: Agentic Systems (2025–present)

This is where things get interesting — and where the gap lives.

An agentic system doesn't just classify data. It reasons about data. It reads literature. It adapts to context. It explains its thinking. And critically, it can handle situations it wasn't explicitly trained for.
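The difference is architectural, and it can be sketched in a few lines. Below is a schematic observe-reason-act loop — the shape of an agentic analysis, with the reasoning, tool names, and clinical scenario all hypothetical and hard-coded for illustration. A real system would back each step with an LLM and live tool calls.

```python
from dataclasses import dataclass, field

@dataclass
class AgentStep:
    thought: str   # the agent's stated reasoning at this step
    action: str    # which tool it chose (names here are hypothetical)
    result: str    # what the tool returned

@dataclass
class CytometryAgent:
    """Schematic observe-reason-act loop for an unexpected finding."""
    trace: list = field(default_factory=list)

    def analyze(self, finding: str) -> list:
        # 1. Observe: an unexpected population surfaces in the data
        self.trace.append(AgentStep(
            f"Population '{finding}' does not match the expected phenotype",
            "search_literature", "3 recent papers on therapy-related shifts"))
        # 2. Reason: integrate retrieved context with clinical metadata
        self.trace.append(AgentStep(
            "Patient is post-immunotherapy; similar shifts are described",
            "draft_hypothesis", "likely treatment artifact, not relapse"))
        # 3. Act: report with a reasoning chain and flagged uncertainty
        self.trace.append(AgentStep(
            "Confidence moderate; recommend a confirmatory panel",
            "generate_report", "report with citations and uncertainty flags"))
        return self.trace

steps = CytometryAgent().analyze("CD19-dim blast population")
```

A classifier emits a label; this loop emits a *trace* — and the trace is the product.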

The theoretical foundation already exists. A 2025 paper in Briefings in Bioinformatics formally defined "agentic bioinformatics" — AI agents integrated throughout the entire research process, leveraging NLP, autonomous decision-making, and multi-step reasoning.

And in January 2026, AstraZeneca published CellAtria in Nature npj AI — the first agentic AI framework for single-cell data analysis. It uses an LLM to mediate task dispatch across a graph-based multi-actor execution framework, enabling dialogue-driven, document-to-analysis automation.

But CellAtria is for scRNA-seq, not flow cytometry.

That's the gap. Nobody has built an agentic system for flow cytometry yet.

Why This Gap Exists

After mapping the landscape, I think I understand why:

1. Flow cytometry is clinically regulated. scRNA-seq is primarily a research tool. Flow cytometry makes diagnostic decisions. The bar for automation is higher, which scares away experimental approaches.

2. The data looks "simple" but isn't. Compared to single-cell RNA-seq, flow data seems low-dimensional. But clinical flow panels are deeply dependent on instrument calibration, compensation, and operator technique. The hidden complexity is in the metadata, not the measurements.

3. The existing ML players are well-funded. AHEAD Medicine has BD Biosciences backing and UPMC data. DeepFlow has clinical validation studies. Building a better mousetrap means competing with established pipelines — unless you redefine what "better" means.

Why Agentic Systems Change the Game

Here's what an agentic flow cytometry system can do that ML pipelines can't:

Reason about unexpected findings. When a scatter plot shows a population that doesn't fit the expected immunophenotype, an ML classifier shrugs. An agentic system can query PubMed, find recent papers describing similar populations in the context of the patient's treatment, and generate a hypothesis.

Adapt across institutions without retraining. Instead of learning statistical patterns that are institution-specific, an agentic system applies reasoning that's based on biological principles. If CD34+ cells look different at Center A versus Center B because of different antibody clones, the agent can understand that and adjust — not through retraining, but through reasoning.

Generate interpretable reports. Not just "positive" or "negative," but reports that cite relevant literature, explain the reasoning chain, and flag uncertainties. This is what clinicians actually need.

Integrate real-time with the knowledge base. A new paper about therapy-related immunophenotypic shifts can immediately change how the agent interprets a post-treatment sample. No retraining required.

The Lab Automation Wave Is Coming

While we were researching this, we found something that confirms the timing is right.

In February 2026, HighRes and Opentrons announced the industry's first AI agent-to-agent lab automation workflow. Their system uses Opentrons' MCP server to let AI agents communicate across platforms and autonomously execute experiments via natural language.

Read that again: AI agents using MCP to orchestrate lab instruments.

This is the infrastructure layer. When instruments can be orchestrated by agents, the bottleneck shifts from "can we automate the hardware?" to "can we automate the reasoning?" That's exactly where agentic flow cytometry analysis fits.
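For readers unfamiliar with MCP: its messages are JSON-RPC 2.0, and an agent invokes an instrument-side capability with a `tools/call` request. Here's a minimal sketch of what such a message looks like — the envelope follows the MCP wire format, but the tool name and arguments are hypothetical (a real cytometer controller would expose its own tool schema).

```python
import json

def mcp_tool_call(tool_name, arguments, request_id=1):
    """Build an MCP 'tools/call' request. MCP messages are JSON-RPC 2.0;
    the specific tool and arguments below are illustrative only."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    })

# Hypothetical instrument tool: an agent asks a cytometer controller
# to start an acquisition run, described in structured form.
msg = mcp_tool_call("start_acquisition",
                    {"panel": "B-ALL MRD", "events": 500_000})
```

Once instruments speak this protocol, the reasoning layer and the hardware layer compose — which is the whole point of the HighRes/Opentrons announcement.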

What We're Actually Building

Our system has three layers that no existing solution combines:

  1. LLM Reasoning Engine — Not classification, but reasoning. The agent reads the clinical context, examines the data, and generates hypotheses.

  2. PubMed Integration — Real-time literature search. When the agent encounters an unusual finding, it searches for relevant papers and incorporates them into its analysis. This is the interpretive layer that pure ML systems lack.

  3. Clinical Workflow Awareness — Understanding of pre-analytical variables, instrument calibration differences, treatment-related artifacts, and institutional protocols.
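As one concrete piece of layer 2: PubMed exposes its search through the NCBI E-utilities `esearch` endpoint, so the agent's literature step starts by composing a query URL from the finding and the clinical context. The query-building heuristic below is a sketch of our approach; the endpoint and its `db`/`term`/`retmode` parameters are the real E-utilities interface.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_query_url(finding, context, retmax=5):
    """Compose a PubMed E-utilities search URL for an unusual finding,
    scoped by clinical context (e.g. the patient's therapy)."""
    term = f"({finding}) AND ({context})"
    params = {"db": "pubmed", "term": term, "retmode": "json", "retmax": retmax}
    return f"{EUTILS}?{urlencode(params)}"

url = pubmed_query_url("CD19-dim blast population", "blinatumomab therapy")
```

The returned PMIDs then feed the reasoning engine (layer 1), which is how a paper published last week can change this week's interpretation.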

The moat isn't the algorithm. It's the integration of clinical domain knowledge with agentic reasoning at a scale that no single hematopathologist can match.

What I Learned From This Search

Mapping the landscape of agentic flow cytometry forced me to confront an uncomfortable truth: most AI in flow cytometry is still solving a 2018 problem (automated gating) with incrementally better algorithms.

The few teams building full pipelines (AHEAD, DeepFlow) are doing valuable work, but they're fundamentally building better classifiers, not reasoners. When those classifiers encounter edge cases — and in clinical flow cytometry, edge cases are the interesting cases — they fail in predictable ways.

The agentic approach isn't just a technical upgrade. It's a paradigm shift in what we expect AI to do with biological data: not just pattern-match, but understand.

Nobody's building this for flow cytometry yet. So we are.


This research was conducted as part of our preparation for a collaboration with Cytek Biosciences. The full 29-source analysis is available in our research report.

References:

  1. Yue et al. "AI in flow cytometry: Current applications and future directions." (2025) PMID: 40985220
  2. Ding & Baras. "Application of Multiple Instance Learning in Flow Cytometry." Scientific Reports (2026) DOI: 10.1038/s41598-025-32093-9
  3. Nitta et al. "Clinical-grade autonomous cytopathology through whole-slide edge tomography." Nature (2026) PMID: 41708854
  4. AstraZeneca. "CellAtria: An agentic AI framework for single-cell RNA-seq." npj AI (2026) DOI: 10.1038/s44387-025-00064-0
  5. Zuromski et al. "Clinical validation of ML system for AML detection by flow cytometry." (2025) PMID: 40016870
  6. HighRes & Opentrons. "Industry's First AI Agent-to-Agent Lab Automation Workflow." (Feb 2026)
