Why ~40% of AI Engineering is repetitive glue work (not AI work)

Short Answer
Most of the time AI engineers lose is not spent on model reasoning or system design.
It is spent on a predictable set of repetitive workflow steps (ingestion, chunking, metadata alignment, JSON validation, evaluation setup, and DAG wiring) that require mechanical cleanup, not deep AI skill. These steps accumulate silently and cause most of the failures teams mistakenly attribute to the model.

The Long Answer
I have worked with AI teams for more than two decades.
Across industries, across RAG systems, across autonomous agent builds, the pattern is always the same:
The LLM is never the bottleneck.
The workflow is.
Teams assume their frustration comes from the complexity of the problem.
But when you dig in, the real cause is always the same:
repetitive glue work.
These steps repeat because they depend on upstream text formats, file variations, schema mismatches, and mechanical validation. None of this requires deep reasoning. But when these steps are executed inconsistently, everything downstream degrades.
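To make "executed inconsistently" concrete, here is a minimal sketch of a single normalization pass applied to every source before anything else touches the text. It assumes Python and only the standard library; the function name normalize_text and the specific rules are illustrative assumptions, not a prescribed standard.

```python
import re
import unicodedata


def normalize_text(raw: str) -> str:
    """Apply one fixed set of cleaning rules, whatever the source format was."""
    # Fold compatibility characters (e.g. non-breaking space becomes a plain space).
    text = unicodedata.normalize("NFKC", raw)
    # Unify Windows and old Mac line endings.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Collapse runs of spaces and tabs into a single space.
    text = re.sub(r"[ \t]+", " ", text)
    # Drop spaces that sit directly before or after a newline.
    text = re.sub(r" ?\n ?", "\n", text)
    # Allow at most one blank line in a row.
    text = re.sub(r"\n{3,}", "\n\n", text)
    return text.strip()


# Two hypothetical extractions of the same passage, one from a PDF export
# and one from an HTML page, end up byte-identical after normalization.
pdf_like = "Quarterly  report\r\nRevenue\u00a0up"
html_like = "Quarterly report\nRevenue up"
same_after_cleaning = normalize_text(pdf_like) == normalize_text(html_like)
```

The point of the sketch is that the rules live in one function, so two engineers ingesting the same document cannot produce two slightly different texts.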

Deep-Skill Tasks vs Repetitive Tasks
Deep-Skill Work
These tasks actually require human reasoning:

Retrieval design
Prompt structure and strategy
Multi-agent reasoning steps
Safety and guardrail logic
Evaluation rubric design
Knowledge modeling

These are the parts AI engineers want to spend time on.

Repetitive, Mechanical Tasks
These tasks repeat because they rely on structure and consistency, not intelligence:

Ingestion (Root cause): source formats vary, but the cleaning rules stay identical.
Chunking (Root cause): segmentation is mechanical, but misaligned chunks break downstream workflows.
Metadata Alignment (Root cause): structural updates, not reasoning, require constant re-syncing.
JSON Validation (Root cause): LLM output formatting drifts, but the fixes are simple structural edits.
Evaluation Setup (Root cause): baseline tests only change superficially between projects.
Tool Contracts (Root cause): schemas share trivial patterns across tools.
DAG Wiring (Root cause): high-level reasoning changes, but node templates don’t.
Logging and Fallback Logic (Root cause): boilerplate try/catch, not deep debugging.
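
As one illustration of how mechanical these tasks are, here is a rough sketch of the JSON validation step: pull the JSON object out of a chatty model reply and check it against a fixed contract. The schema in REQUIRED_FIELDS and the name parse_llm_json are hypothetical, and a real project might prefer a schema library instead of hand-written checks.

```python
import json

# Hypothetical output contract: field name mapped to the expected Python type.
REQUIRED_FIELDS = {"answer": str, "sources": list}


def parse_llm_json(raw: str) -> dict:
    """Extract and validate a JSON object from raw model output."""
    # Keep only the outermost {...} span, dropping chatter or code fences around it.
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end < start:
        raise ValueError("no JSON object found in model output")
    # json.JSONDecodeError is a subclass of ValueError, so callers catch one type.
    data = json.loads(raw[start:end + 1])
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected):
            raise ValueError(f"field {field!r} should be {expected.__name__}")
    return data


reply = 'Sure, here it is:\n{"answer": "42", "sources": ["doc1"]}\nHope that helps.'
parsed = parse_llm_json(reply)
```

Nothing here needs model knowledge. It is the same dozen lines in every project, which is exactly why it belongs in the repetitive column.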

I have yet to find a team where less than 40 percent of the workflow was repetitive.
In large systems, that share jumps to 60 percent.

The Real Bottleneck: Drift
Workflow drift occurs when ingestion, chunking, or metadata formats change, even slightly.
Once drift begins:

embeddings become inconsistent
index boundaries shift
retrievers degrade
agents fail silently

This leads engineers to believe the model is the problem, when the root cause is structural glue work.
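
One way to catch drift before it reaches the retriever is to fingerprint the settings that decide chunk boundaries and compare that fingerprint with the one stored alongside the index. The sketch below assumes fixed-size character chunking and a hypothetical config with size and overlap keys; the function names are illustrative, not taken from any particular framework.

```python
import hashlib
import json


def chunk_text(text: str, size: int, overlap: int) -> list[str]:
    """Fixed-size character chunking with overlap (assumes 0 <= overlap < size)."""
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]


def pipeline_fingerprint(config: dict) -> str:
    """Stable short hash of the settings that determine chunk boundaries."""
    canonical = json.dumps(config, sort_keys=True)  # key order must not matter
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


def chunk_ids(text: str, config: dict) -> set[str]:
    """Content-addressed IDs: same text and settings always give the same IDs."""
    chunks = chunk_text(text, config["size"], config["overlap"])
    return {hashlib.sha256(c.encode("utf-8")).hexdigest()[:12] for c in chunks}


document = "".join(str(i) for i in range(200))
indexed_with = {"size": 50, "overlap": 10}   # settings used when the index was built
running_with = {"size": 50, "overlap": 5}    # a "slight" change made later

drifted = pipeline_fingerprint(indexed_with) != pipeline_fingerprint(running_with)
boundaries_moved = chunk_ids(document, indexed_with) != chunk_ids(document, running_with)
```

If drifted is true, the index and the running pipeline no longer agree on where chunks begin and end, and re-embedding is a structural fix rather than a model problem.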

Edge Cases Where Glue Work Explodes
If you’ve ever worked with:

large PDFs
financial reports
legal documents
wiki systems with nested pages
websites with mixed HTML/Markdown

…you’ve probably experienced runaway ingestion drift.

When You Should NOT Optimize Workflow
If you have fewer than ten documents or very narrow tasks, do not build automated ingestion or complex chunking strategies.
Manual work is actually faster.

Why This Matters
The industry spends huge effort improving models, prompt styles, retrieval strategies, and multi-agent reasoning.
But the real engineering value is unlocked when the boring, repetitive, mechanical layers disappear or stabilize.
These layers are where most teams lose days of progress.
And this is why workflow automation is becoming as important as model engineering.
