I think we’ve reached a point where our collective default answer to every single software problem is: "Just throw an LLM API at it and call it a day."
I was deep in a rabbit hole last week looking at how high-stakes, heavy industries—like underground mining, heavy manufacturing, or shipping yards—handle safety protocols and operator certifications. We are talking about environments where missing a single detail (like a pressure threshold on a specific valve) isn't just a bug; it's a massive, life-threatening disaster.
The standard modern advice is: "Just spin up a cloud-hosted vector database, build a standard RAG pipeline, and let ChatGPT handle it!"
But if you step out of the Silicon Valley bubble and look at a real industrial site, that entire plan falls apart for two simple reasons:
The Connected Dead Zone: You cannot hit an OpenAI or Anthropic endpoint from the bottom of an underground mining shaft or inside a heavily shielded concrete manufacturing bay. Internet connectivity isn't a given; it's a luxury.
The Probability Trap: Large Language Models are probabilistic. They generate text by sampling the next token from a learned probability distribution. If you ask an LLM about a shutdown protocol and "2500 PSI" comes out as a likely continuation where the manual says "1500 PSI," it will state the wrong number with complete confidence.
When people’s lives depend on strict manuals, you cannot use a system that guesses.
So I started experimenting with a different approach. What if we stop trying to make the AI "think" or "search," and instead use it purely to format data that we’ve already locked down mathematically?
I put together a local prototype to test this approach, using zero cloud dependencies. Here’s how it works.
Step 1: Ditch Vector Search for a Local Graph
Standard RAG relies on vector similarity, which retrieves passages that sit close to the query in embedding space rather than passages that match exactly. I didn't want approximate matches. I used Neo4j running locally and parsed the unstructured safety manuals into a strict Knowledge Graph. Every safety rule becomes its own node with exact attributes attached.
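To make "a node with exact attributes" concrete, here is a minimal sketch of what one rule could look like before it is written to the graph. The labels (`Section`, `Rule`), the `HAS_RULE` relationship, the property names, and the sample rule are all my own illustrative assumptions, not the prototype's actual schema.

```python
from dataclasses import dataclass, asdict

# Hypothetical Cypher upsert. Labels, relationship type, and property
# names are assumptions made for this sketch.
UPSERT_RULE = """
MERGE (s:Section {name: $section})
MERGE (r:Rule {rule_id: $rule_id})
SET r.text = $text,
    r.category = $category,
    r.risk_score = $risk_score,
    r.cognitive_depth = $cognitive_depth
MERGE (s)-[:HAS_RULE]->(r)
"""


@dataclass(frozen=True)
class SafetyRule:
    """One safety rule with exact, typed attributes (hypothetical shape)."""
    rule_id: str
    section: str
    category: str
    text: str
    risk_score: int        # 1-10, as described in the post
    cognitive_depth: int   # scale is an assumption

    def __post_init__(self):
        # Reject out-of-range metadata before it ever reaches the graph.
        if not 1 <= self.risk_score <= 10:
            raise ValueError(f"risk_score must be 1-10, got {self.risk_score}")


def to_params(rule: SafetyRule) -> dict:
    """Turn a rule into the parameter dict the Cypher query expects."""
    return asdict(rule)


# Made-up example rule, for illustration only.
example = SafetyRule(
    rule_id="R-001",
    section="Hydraulic Press",
    category="Emergency Protocols",
    text="Shut down the pump if line pressure exceeds 1500 PSI.",
    risk_score=9,
    cognitive_depth=2,
)
params = to_params(example)
```

With the official Neo4j Python driver, something like `session.run(UPSERT_RULE, params)` should write the node, but I have left the database call out so the sketch runs without a Neo4j instance.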
Step 2: Math Before AI
If a safety inspector needs an exam that is exactly 40% "Emergency Protocols" and 60% "Routine Checks," an LLM cannot reliably count or balance that.
So, before the AI is even involved, a Python script queries the graph. It reads the category percentages from the UI sliders, converts them into exact rule counts, and pulls exactly the rules needed to meet the quota. Those facts are then frozen and passed along unchanged.
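The percentage-to-count step is plain arithmetic, and it can be sketched in a few lines. This version uses the largest-remainder method so the counts always add up to the requested total; the function name and the choice of rounding method are my assumptions, not necessarily what the prototype does.

```python
def allocate_quota(total_questions: int, percentages: dict) -> dict:
    """Convert integer percentages into exact per-category counts.

    Uses largest-remainder rounding (an assumption for this sketch) so
    that the counts always sum to total_questions.
    """
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must sum to 100")

    # Integer arithmetic only: no floating-point rounding surprises.
    counts = {}
    remainders = {}
    for category, pct in percentages.items():
        counts[category], remainders[category] = divmod(total_questions * pct, 100)

    # Hand out the leftover questions to the largest remainders first.
    leftover = total_questions - sum(counts.values())
    by_remainder = sorted(remainders, key=remainders.get, reverse=True)
    for category in by_remainder[:leftover]:
        counts[category] += 1
    return counts


sliders = {"Emergency Protocols": 40, "Routine Checks": 60}
ten = allocate_quota(10, sliders)    # 4 emergency, 6 routine
seven = allocate_quota(7, sliders)   # 2.8 and 4.2 round to 3 and 4
```

The point of doing this outside the model is that the split is deterministic: the same sliders always produce the same counts.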
Step 3: The Traversal Algorithm (DCWGT)
To actually make that math work, I wrote a custom algorithm called Dynamic Cognitive-Weighted Graph Traversal (DCWGT).
It treats the Neo4j database like a multi-dimensional optimization puzzle. When it receives the constraints from the UI, it doesn't just blindly grab the first nodes it sees. It sorts the available rules based on their embedded metadata—like their Risk Score (1-10) and Cognitive Depth.
But here is the coolest part: What happens if the user requests 5 Emergency rules, but that specific machine section only has 3? Standard logic would just throw a system error. Instead, the DCWGT algorithm uses a "recursive sibling fallback." It dynamically traverses up the graph tree and laterally searches for contextually linked "sibling" categories to borrow rules from.
As long as the linked sibling categories hold enough rules, the quota is met exactly. The algorithm then grabs those verified facts and locks them into an immutable data payload.
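Here is a simplified, in-memory sketch of the ranking and sibling-fallback idea. It is not the real DCWGT code: the `Section` and `Rule` classes, the sort order (risk score first, then cognitive depth), and the choice to raise an error once the whole tree is exhausted are all assumptions I made to keep the example small and runnable without Neo4j.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Rule:
    rule_id: str
    category: str
    risk_score: int
    cognitive_depth: int


@dataclass(eq=False)
class Section:
    name: str
    parent: Optional["Section"] = None
    children: list = field(default_factory=list)
    rules: list = field(default_factory=list)

    def add_child(self, child: "Section") -> "Section":
        child.parent = self
        self.children.append(child)
        return child


def ranked(rules, category):
    """Rules of one category, highest risk first, then deepest, then by id."""
    matches = [r for r in rules if r.category == category]
    return sorted(matches, key=lambda r: (-r.risk_score, -r.cognitive_depth, r.rule_id))


def borrow_from_siblings(node, category, missing):
    """Recursively climb the tree, borrowing from siblings at each level."""
    if missing <= 0 or node.parent is None:
        return []
    pool = []
    for sibling in node.parent.children:
        if sibling is not node:
            pool.extend(sibling.rules)
    borrowed = ranked(pool, category)[:missing]
    return borrowed + borrow_from_siblings(node.parent, category, missing - len(borrowed))


def select_rules(section, category, needed):
    """Pick `needed` rules, falling back to sibling sections when short."""
    own = ranked(section.rules, category)[:needed]
    picked = own + borrow_from_siblings(section, category, needed - len(own))
    if len(picked) < needed:
        raise LookupError(f"only {len(picked)} of {needed} '{category}' rules exist")
    return tuple(picked)  # tuple of frozen rules: the "locked" payload


# Made-up data: Press A has 3 emergency rules, Press B has 2 more.
plant = Section("Plant")
press_a = plant.add_child(Section("Press A"))
press_b = plant.add_child(Section("Press B"))
press_a.rules = [Rule("A1", "Emergency", 9, 2), Rule("A2", "Emergency", 7, 1),
                 Rule("A3", "Emergency", 5, 3)]
press_b.rules = [Rule("B1", "Emergency", 8, 2), Rule("B2", "Emergency", 6, 1),
                 Rule("B3", "Routine", 10, 1)]

payload = select_rules(press_a, "Emergency", 5)
picked_ids = [r.rule_id for r in payload]
```

Asking Press A for 5 emergency rules returns its own 3 first, then the 2 highest-risk emergency rules from Press B. Asking for 6 fails loudly in this sketch, because silently returning a short exam seemed worse than an error.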
Step 4: The Local LLM as a "Dumb" Formatter
Only after the algorithm has 100% locked down the facts do we hand them to an AI.
Instead of paying for expensive cloud APIs, I set up Llama 3 to run entirely locally on my machine's CPU using Ollama. I feed the locked rules into a strict prompt that tells the LLM: "You are an absolute compiler. Turn these exact facts into a multiple-choice test. Do not invent anything outside this context."
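Because the model runs offline, it cannot browse the internet, and because every fact it needs is already in the prompt, its job shrinks to text formatting. The prompt alone cannot stop the model from drawing on its trained weights, so the generated test should still be checked against the locked payload, but the room for hallucination is far smaller than in an open-ended RAG setup.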
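A minimal sketch of this formatter step, assuming Ollama's local REST endpoint on its default port and the `llama3` model tag. The helper names and the number check are my own additions for illustration. The check is deliberately narrow: it only confirms that every number in the locked facts still appears in the model's output, which catches a silently altered threshold but not every possible error.

```python
import json
import re
import urllib.request

INSTRUCTION = (
    "You are an absolute compiler. Turn these exact facts into a "
    "multiple-choice test. Do not invent anything outside this context."
)
NUMBER = re.compile(r"\d+(?:\.\d+)?")


def build_prompt(locked_facts):
    """Wrap the locked facts in the strict formatter instruction."""
    numbered = "\n".join(f"{i}. {fact}" for i, fact in enumerate(locked_facts, 1))
    return f"{INSTRUCTION}\n\nFACTS:\n{numbered}\n"


def format_with_ollama(prompt, model="llama3",
                       url="http://localhost:11434/api/generate"):
    """Send the prompt to a local Ollama server (assumed to be running)."""
    body = json.dumps({
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"temperature": 0},  # reduce sampling randomness
    }).encode("utf-8")
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]


def missing_numbers(output, locked_facts):
    """Numbers present in the locked facts but absent from the output."""
    required = set()
    for fact in locked_facts:
        required.update(NUMBER.findall(fact))
    return sorted(required - set(NUMBER.findall(output)))


# Made-up fact and outputs, for illustration only.
facts = ("Shut down the pump if line pressure exceeds 1500 PSI.",)
prompt = build_prompt(facts)
good = "Q: At what pressure must the pump be shut down? A) 1500 PSI B) 2500 PSI"
bad = "Q: At what pressure must the pump be shut down? A) 2500 PSI B) 3000 PSI"
good_missing = missing_numbers(good, facts)
bad_missing = missing_numbers(bad, facts)
```

In use, you would call `format_with_ollama(prompt)` and reject any output for which `missing_numbers` returns a non-empty list. Distractor options legitimately contain wrong numbers, which is why the check looks for missing values rather than extra ones.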



