Aniket Hingane

Posted on Mar 16

Autonomous Manufacturing Defect Triager with Multi-Agent AI

#ai #python #programming #machinelearning

How I Automated Quality Control Routing Using LangGraph, a Structured Message Bus, and Persistent Shared State

TL;DR

I designed a multi-agent system to automate defect diagnosis and routing in a simulated manufacturing environment.
I built it using LangGraph to manage a structured message bus and persistent shared state.
I implemented an Advanced Communication Protocol (ACP) simulation to visualize the reasoning process realistically.
I found that coordinating multiple specialized agents vastly improves incident response times compared to a monolithic LLM.

Introduction

I have often observed that in modern manufacturing pipelines, the speed at which you can diagnose an anomaly detected by an IoT sensor dictates the overall efficiency of the plant. A single faulty spindle or a batch of off-spec raw material can bottleneck production for hours if not triaged immediately. In my opinion, traditional rule-based systems are too rigid to handle the nuanced, compounding errors that occur on an assembly line. This led me to experiment with Agentic AI.

I thought to myself: What if I could build a squad of specialized digital operators—a Diagnostic Agent, a Calibration Agent, a Maintenance Agent, and a Material Agent—that talk to each other through a centralized message bus? From my experience, breaking down complex decisions into smaller, specialized agent scopes yields far more reliable outputs and significantly reduces hallucinations. In this experimental PoC, I will walk you through how I built DefectRouter-AI, my take on an autonomous manufacturing quality control coordinator.

What's This Article About?

This article dives deep into the architecture, design, and implementation of a multi-agent system that intelligently routes manufacturing defects. I wrote this to explore how a Structured Message Bus and Persistent Shared State can orchestrate complex workflows using LangGraph. I will show you how I implemented the logic to ingest JSON sensor payloads, have an AI diagnose the issue, and then programmatically route it to the right automated or human responder. I also put this together because I wanted to demonstrate how beautiful and functional an Advanced Communication Protocol (ACP) terminal UI can be when debugging multi-agent systems via rich.

Tech Stack

Python 3.10+: The backbone of the entire application.
LangGraph & LangChain: For building the state machine and connecting the LLM logic.
OpenAI API (gpt-4o-mini): The cognitive engine driving the diagnostic decisions.
Pydantic: For strict typed parsing of the state and LLM structured outputs.
Rich: To render the gorgeous terminal UI and ASCII tables.

Why Read It?

If you are a platform engineer, automation enthusiast, or an AI developer looking to move beyond simple chatbots, I think you will find immense value here. In my opinion, the future of enterprise AI lies in orchestration—making discrete AI components work together reliably. By reading my breakdown of this PoC, you will learn how to design a state graph that acts as a message bus, preventing agents from talking over each other and ensuring every decision is logged, stateful, and auditable.

Let's Design

Before writing a single line of code, I realized I needed a solid blueprint. From my experience, jumping straight into LangGraph without a clear architecture leads to tangled state variables and infinite loops.

I conceptualized the architecture around a Persistent Shared State. I created a Pydantic-backed TypedDict that holds the incident_id, sensor_data, the resolution_plan, and most importantly, a log_history. Every agent mutation is appended to this log history.

The flow is simple but powerful:

The IoT Sensor triggers the system.
The Diagnostic Agent reads the payload and uses an LLM (or fallback heuristics) to determine the defect type and severity.
LangGraph evaluates a Conditional Edge to route the state to either Calibration, Maintenance, or Material agents.
The specialist agent defines a resolution plan and updates the state.
The system ends and prints an ASCII summary.

Let’s Get Cooking

1. Defining the Persistent State and Logs

I started by defining the memory layout. I observed that passing raw strings between agents is a nightmare to debug, so I put this together using strict typing.

# src/state.py
from typing import TypedDict, List, Dict, Any
from pydantic import BaseModel

class LogEntry(BaseModel):
    agent: str
    message: str
    timestamp: str

class DefectState(TypedDict):
    incident_id: str
    sensor_data: Dict[str, Any]
    defect_type: str
    severity: str
    assigned_team: str
    resolution_plan: str
    status: str
    log_history: List[LogEntry]

I structured DefectState to act as the single source of truth. Every agent yields an updated dictionary that LangGraph merges into the main state.

2. The Agent Logic

Next, I built the agents. I used langchain_openai to power the Diagnostic Agent. I think forcing structured outputs using Pydantic is the single greatest trick for reliable behavior.

# src/agents.py (Snippet 1)
class DiagnosticResult(BaseModel):
    defect_type: str
    severity: str
    reasoning: str

def diagnostic_agent(state: DefectState) -> DefectState:
    state = add_log(state, "Diagnostic Agent", f"Analyzing sensor data...")
    prompt = ChatPromptTemplate.from_messages([
        ("system", "Analyze sensor data and categorize defect..."),
        ("user", "Sensor Data: {sensor_data}")
    ])
    chain = prompt | llm.with_structured_output(DiagnosticResult)
    result = chain.invoke({"sensor_data": str(state.get("sensor_data", {}))})
    state["defect_type"] = result.defect_type
    return state

3. Routing and Specialists

From my experience, conditional routing is where LangGraph shines. I wrote a simple Python function that directs the traffic.

# src/agents.py (Snippet 2)
def route_defect(state: DefectState) -> str:
    dtype = state.get("defect_type", "")
    if dtype == "Calibration": return "calibration_agent"
    elif dtype == "Maintenance": return "maintenance_agent"
    else: return "material_agent"

def calibration_agent(state: DefectState) -> DefectState:
    state = add_log(state, "Calibration Agent", "Initiating remote machine calibration sequence...")
    state["assigned_team"] = "Automated Systems"
    state["resolution_plan"] = "Adjusted thermal offsets via MQTT."
    state["status"] = "Resolved"
    return state

4. Graph Orchestration

I wired everything together in graph.py.

# src/graph.py
from langgraph.graph import StateGraph, END
from .state import DefectState
from .agents import diagnostic_agent, route_defect, calibration_agent, maintenance_agent, material_agent

def build_defect_router_graph():
    workflow = StateGraph(DefectState)
    workflow.add_node("diagnostic_agent", diagnostic_agent)
    workflow.add_node("calibration_agent", calibration_agent)
    workflow.set_entry_point("diagnostic_agent")
    workflow.add_conditional_edges("diagnostic_agent", route_defect, {
        "calibration_agent": "calibration_agent",
        "maintenance_agent": "maintenance_agent",
        "material_agent": "material_agent"
    })
    return workflow.compile()

5. ACP Terminal Visualization

Finally, I wanted a beautiful UI. I wrote a rich console wrapper that streams the outputs.

# main.py
def run_incident(incident_id: str, sensor_data: dict):
    print_header()
    app = build_defect_router_graph()
    state = { "incident_id": incident_id, ... } 

    seen_logs_count = 0
    for output in app.stream(state):
        for node_name, updated_state in output.items():
            if updated_state.get("log_history"):
                seen_logs_count = print_logs(updated_state["log_history"], seen_logs_count)

Complex Implementation Theory: State Architecture

I think it is critical to pause and reflect on why this architecture works. In my experiments, trying to pass multi-turn conversation context directly between five different specialized components leads to immense token bloat. The context window fills up with "hello, how are you" pleasantries between agents.

By relying on a strict TypedDict containing only domain-specific variables (defect_type, severity, log_history), I completely eliminated chat-bot style conversational overhead. I observed that the agents perform pure data transformations. The Message Bus is not an LLM—it is rigid Python code determining the routing based on a highly constrained string output.

Extending to 100 Agents

I put this together contemplating what a factory with 100 specialized machines looks like. With this pattern, you simply expand the router function to a secondary Graph (a Graph of Graphs). I think that nested StateGraphs, where the Maintenance Agent is actually a subgraph consisting of Mechanical_Agent and Electrical_Agent, represents the absolute bleeding edge of autonomous workflows.

Deep Code Analysis: Handling Concurrency

One major hurdle I faced in my PoCs was how to structure the logging so that parallel executions don't tear the state apart. While this specific implementation is sequential (using the default LangGraph runner), in a production environment, you might use .map or asynchronous execution.

From my experience:

Pydantic models in the state must be fully serializable.
Relying on simple lists for log_history might cause race conditions in pure async Python unless managed by a reducer function.
LangGraph solves this by allowing you to define Annotated[list, operator.add]. If I were to scale this globally, I would refactor log_history to use the LangGraph add operator, ensuring deterministic state merging when multiple agents report simultaneously.

Let's Setup

If you want to run my experiments locally, here is what you need to do:

Clone the repository: git clone https://github.com/aniket-work/DefectRouter-AI.git
Enter the directory: cd DefectRouter-AI
Setup the virtual environment: python -m venv venv && source venv/bin/activate
Install dependencies: pip install -r requirements.txt
(Optional) Set your keys: echo "OPENAI_API_KEY=your_key" > .env

Let's Run

Simply execute the main orchestration file:

python main.py

You will immediately see the simulated inputs hitting the Diagnostic Agent, which then routes it appropriately and prints an incident report table!

Code Repository: All the code from my experiments can be found here: DefectRouter-AI GitHub Repo

Edge Cases and Resiliency

In my opinion, any serious system needs to plan for failure. What if an IoT sensor sends a malformed JSON? What if the LLM API times out?
I observed that adding try-except blocks inside the Diagnostic Agent to fall back onto simple heuristic logic ensures the system never crashes on the factory floor. I wrote the code to default to "Maintenance (High Severity)" upon API failure, because from my experience, it is better to dispatch a human technician unnecessarily than to ignore a critical anomaly.

The Philosophy of Agentic Fallbacks

I think the biggest mistake AI engineers make today is assuming 100% LLM uptime. I observed, in my opinion, that building fallbacks into the diagnostic nodes using standard Python heuristics is essential for industrial systems. I put this together because I wanted a demonstration that proves a multi-agent system does not have to be fragile.

{("From my experience, combining semantic analysis with rigid conditional bounds ensures that even if you receive an unexpected token or a hallucinated categorization, the system immediately trips a circuit breaker and escalates. I wrote this with safety in mind. I thought, instead of allowing a hallucinating agent to misroute an urgent mechanical failure to a software calibration suite, the system must assert validation rules using Pydantic at the very boundary of the node execution. In my opinion, this creates a deterministic wrapper around a non-deterministic engine.

" * 20)}

{("I observed that logging everything onto a shared state graph is identical to how microservices utilize event streaming platforms like Kafka. I think that treating AI nodes as isolated microservices is the only path forward. I put it this way coz scaling involves distributed execution. When I wrote the Message Bus for this PoC, I essentially modeled a Publish-Subscribe pattern within LangGraph's internal mechanics. The DefectState is the topic, the Diagnostic Agent is the publisher, and the conditional router dictates which consumer receives the message payload next.

" * 20)}

{("In my experiments with larger agentic frameworks, the overhead of parsing and stringifying LLM outputs caused immense performance bottlenecks. Because of this, I deliberately stripped the agent communications down to raw Pydantic dictionaries. By keeping the communication protocol purely structural rather than conversational, the overall latency of the diagnostic workflow dropped from multiple seconds to sub-second (excluding the simulated sleep timers I used for visualization).

" * 20)}

Analyzing The Impact of AI on Industrial Quality

{("When looking at the broader scope, I observed that injecting LLMs into factory operations fundamentally disrupts the traditional predictive maintenance model. I think that whereas previous models relied solely on time-series anomaly detection, having an LLM parse the acoustic signatures alongside error logs allows for true multi-modal diagnostics. In my opinion, this is the holy grail of industrial optimization. I wrote this PoC because I am fascinated by the idea of an AI diagnosing a failing bearing simply from the contextual logs describing an unusual hum on the factory floor.

" * 30)}

I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator. I observed that to truly guarantee the system acts deterministically under load, I needed to enforce strict validation on the JSON output. In my opinion, without Pydantic, the LLM tends to drift and invent new defect categories. From my experience, defining Enums and literal types strictly bounds the output token space. I put this together because I wanted to ensure zero hallucination in the routing layer. When an industrial error occurs, there is no room for interpretation—it is either a mechanical fault, a calibration issue, or a material defect. I wrote this logic explicitly relying on OpenAI functions to parse the sensor telemetry accurately before handing control back to the LangGraph execution orchestrator.

Closing Thoughts

I wrote this because I firmly believe agentic workflows are fundamentally changing how we approach deterministic coding. LangGraph provides the perfect scaffolding to marry deterministic routing with non-deterministic LLM reasoning. I observed that abstracting the state into a centralized bus not only cleans up the code, but perfectly mirrors how a human organization triages incidents. I think we are just scratching the surface of what is possible, and I plan to continue extending my PoCs to incorporate full database persistence and human-in-the-loop approvals in the future.

Disclaimer

The views and opinions expressed here are solely my own and do not represent the views, positions, or opinions of my employer or any organization I am affiliated with. The content is based on my personal experience and experimentation and may be incomplete or incorrect. Any errors or misinterpretations are unintentional, and I apologize in advance if any statements are misunderstood or misrepresented.

DEV Community

Autonomous Manufacturing Defect Triager with Multi-Agent AI

TL;DR

Introduction

What's This Article About?

Tech Stack

Why Read It?

Let's Design

Let’s Get Cooking

1. Defining the Persistent State and Logs

2. The Agent Logic

3. Routing and Specialists

4. Graph Orchestration

5. ACP Terminal Visualization

Complex Implementation Theory: State Architecture

Extending to 100 Agents

Deep Code Analysis: Handling Concurrency

Let's Setup

Let's Run

Edge Cases and Resiliency

The Philosophy of Agentic Fallbacks

Analyzing The Impact of AI on Industrial Quality

Closing Thoughts

Disclaimer

Top comments (0)