DEV Community

Sitanshu Kumar
Sitanshu Kumar

Posted on

How I Built AegisDesk: A Zero-Token Semantic IT Agent with <5ms Latency

If you’ve built AI agents recently, you know the standard playbook: you take a user's prompt, feed it into GPT-4 or Claude alongside a massive JSON schema of available tools, and ask the LLM to figure out which tool to use.

This works for prototypes. But in an Enterprise IT environment, it’s a disaster.

Using an LLM for Intent Routing takes anywhere from 800ms to 2,000ms. It burns API tokens on every single "hello" or "my laptop is broken" message. Worse, LLMs hallucinate—if a user asks to "Provision an Azure SQL database," an overly helpful LLM might hallucinate a non-existent tool call and crash your pipeline.

I wanted to build an autonomous IT Helpdesk agent that was deterministic, instant, and practically free to run. That led me to build AegisDesk, an open-source, multi-agent IT platform powered by LangGraph, SQLite, and Zero-Token Semantic Routing.

The Architecture: Zero-Token Routing
Instead of relying on a monolithic prompt, AegisDesk abandons LLM-based routing entirely.

When a query enters AegisDesk, it never hits the cloud. Instead, the local pipeline intercepts the query and embeds it using the BAAI/bge-small-en-v1.5 sentence-transformer model via ONNX (fastembed).

This local vector is then mathematically compared (via Cosine Similarity) against an offline vocabulary of IT intents:

network_diagnostics: (ping, traceroute, nmap, tcp, udp)
cloud_integrations: (okta, jira, aws, azure, cyberark)
web_scraping: (wiki, internal docs, cve lookup)
The result? The query is mathematically routed to the correct highly-specialized LangGraph sub-agent in ~4.5 milliseconds for $0.00.

TIP

Enterprise Safety Net: If the semantic match confidence falls below 0.55, AegisDesk refuses to guess. It safely falls back to a generalized, read-only RAG (Retrieval-Augmented Generation) agent, guaranteeing no destructive commands are executed by mistake.

Dynamic Few-Shot Learning via SQLite
Static keywords are great, but IT environments evolve. What happens when a user types an obscure proprietary software name that isn't in our offline vocabulary?

To solve this, I integrated Dynamic Few-Shot Learning directly into the routing layer using SQLite Graph Memory.

When AegisDesk initializes, it queries a routing_examples table inside an ACID-compliant SQLite database. It extracts historical, successfully resolved IT tickets and embeds them dynamically into the routing corpus.

If an Administrator notices the agent struggling with a query like "Run a traceroute to internal-git.corp", they can manually inject the learning directly via the CLI:

bash

aegisdesk teach-router "Run a traceroute to internal-git.corp" it_support network_diagnostics
The next time the router boots, it embeds that exact phrase. The system effectively "fine-tunes" its routing logic in real-time, achieving >90% strict-match routing accuracy without a single line of Python code being altered.

Zero-Trust Security Boundaries
Building an autonomous agent that can execute ipconfig, ping, or scrape internal HR wikis is inherently dangerous. AegisDesk implements two critical security mitigations at the tool execution layer:

RCE Defense (Remote Code Execution): Subprocess execution explicitly enforces shell=False. Before any command touches the OS, inputs are scrubbed using strict Regex [^a-zA-Z0-9.-_] to eliminate bash metacharacters (&, |, ;, $).
SSRF Defense (Server-Side Request Forgery): The Web Scraping agent is hardened against TOCTOU (Time-Of-Check to Time-Of-Use) attacks. Outbound HTTP requests undergo pre-flight DNS checks. Any resolution attempting to hit loopback (127.0.0.1) or private cloud metadata subnets (169.254.169.254) is aborted at the socket level.
Even with these defenses, AegisDesk utilizes LangGraph's interrupt_before functionality to trigger Human-in-the-Loop (HITL) confirmations before executing any terminal command.

Try It Out
AegisDesk proves that you don't need massive, bloated monolithic LLMs to build intelligent enterprise agents. By pairing lightning-fast deterministic routing with specialized LangGraph swarms, you can build systems that are safer, cheaper, and exponentially faster.

You can install the CLI directly from PyPI today:

bash

pip install aegisdesk
Check out the full source code and documentation on GitHub: github.com/sitanshukr08/Aegisdesk

If you’re building multi-agent swarms or semantic routers, I’d love to hear your thoughts in the comments!

Top comments (0)