DEV Community: Sitanshu Kumar

SynaptoRoute v0.4.0: Re-Architecting for Massive Concurrency & Zero-Downtime Indexing

Sitanshu Kumar — Wed, 03 Jun 2026 17:36:25 +0000

This is a follow-up to SynaptoRoute v0.3.0: Matching Semantic Router While Scaling to 50,000 Routes. If you're new here: SynaptoRoute is a high-performance semantic routing engine that classifies user queries into deterministic software logic locally, without API calls.

The Wall We Hit

In v0.3.0, we proved that SynaptoRoute could match the accuracy of industry standards on standard benchmarks (Banking77, CLINC150) while retaining <50ms P99 latency across 50,000 dense routes.

But scale isn't just about total capacity. It's about concurrent mutation.

Under heavy asynchronous load, specifically when a system routes incoming queries while simultaneously adding hundreds of new routes, the architecture began to show stress fractures. The FaissIndex required global locks to rebuild. FastEmbed inference was starving the asyncio event loop. SQLite connections threw ProgrammingError exceptions when used across multiple threads. And our new RedisSyncManager created an O(N^2) broadcast storm when 10 replicas all synced identical state changes simultaneously.

In v0.4.0, we ripped the internal engine apart and completely re-architected it to survive extreme adversarial chaos.

What's Architecturally New

1. ThreadPoolExecutor Isolation

In previous versions, FastEmbedEncoder executed mathematically dense ONNX inference on the same execution path as the router. Under high traffic, this sequential compute starved the asynchronous event loop.

In v0.4.0, we explicitly isolated the embedding engine into a dedicated ThreadPoolExecutor. ONNX hardware inference is now completely decoupled from asyncio, preventing sequential compute starvation and radically smoothing tail latencies on asynchronous traffic spikes.
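The isolation pattern described above can be sketched with the standard library alone. This is a minimal illustration, not SynaptoRoute's actual code: `encode_blocking` is a hypothetical stand-in for the ONNX call, and the pool size is an arbitrary assumption.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import List

# Dedicated pool so heavy encoding never runs on the event-loop thread.
# The worker count is an illustrative guess, not a SynaptoRoute default.
_ENCODER_POOL = ThreadPoolExecutor(max_workers=2, thread_name_prefix="encoder")


def encode_blocking(text: str) -> List[float]:
    """Hypothetical stand-in for a CPU-bound inference call."""
    total = sum(ord(ch) for ch in text)  # placeholder for real compute
    return [float(total % 97), float(len(text))]


async def encode(text: str) -> List[float]:
    """Run the blocking encoder on the dedicated pool and await the result."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_ENCODER_POOL, encode_blocking, text)


async def encode_many(texts: List[str]) -> List[List[float]]:
    # The event loop stays free to accept other work while these run.
    return await asyncio.gather(*(encode(t) for t in texts))
```

Threads only help here when the blocking call releases the GIL while it computes, which native inference runtimes typically do; a pure-Python encoder would need a process pool instead.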

2. The In-Memory Write-Ahead Log (WAL)

When FaissIndex exhausts its pre-allocated capacity, it must rebuild. In v0.3.0, this meant locking the router, blocking all incoming mutations and routing requests until the memory reallocation completed.

We deployed a custom In-Memory Write-Ahead Log (WAL). Now, when the index is actively rebuilding, the router buffers mutations (add_route, delete_route) into the WAL. Buffering a mutation is an O(1) append, and incoming queries scan both the stale index and the WAL sequentially, so the router keeps serving traffic with zero downtime while the index rebuilds in the background.
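A toy version of the idea, assuming an exact-match dictionary in place of the real vector index. The class and method names other than add_route and delete_route are hypothetical, and the real router performs similarity search rather than key lookup.

```python
import threading
from typing import Dict, List, Optional, Tuple

Mutation = Tuple[str, str, Optional[str]]  # (op, utterance, route name or None)


class BufferedIndex:
    """Toy exact-match route index that keeps serving while it rebuilds."""

    def __init__(self) -> None:
        self._index: Dict[str, str] = {}
        self._wal: List[Mutation] = []
        self._rebuilding = False
        self._lock = threading.Lock()  # held only for short state changes

    def add_route(self, utterance: str, route: str) -> None:
        self._mutate(("add", utterance, route))

    def delete_route(self, utterance: str) -> None:
        self._mutate(("delete", utterance, None))

    def _mutate(self, mutation: Mutation) -> None:
        with self._lock:
            if self._rebuilding:
                self._wal.append(mutation)  # O(1) append, never waits for the rebuild
            else:
                self._apply(self._index, mutation)

    @staticmethod
    def _apply(index: Dict[str, str], mutation: Mutation) -> None:
        op, utterance, route = mutation
        if op == "add":
            index[utterance] = route
        else:
            index.pop(utterance, None)

    def lookup(self, utterance: str) -> Optional[str]:
        with self._lock:
            result = self._index.get(utterance)  # possibly stale answer
            for op, key, route in self._wal:     # newer buffered mutations win
                if key == utterance:
                    result = route if op == "add" else None
            return result

    def begin_rebuild(self) -> Dict[str, str]:
        with self._lock:
            self._rebuilding = True
            return dict(self._index)  # snapshot for the background rebuild

    def finish_rebuild(self, rebuilt: Dict[str, str]) -> None:
        with self._lock:
            for mutation in self._wal:  # replay what arrived during the rebuild
                self._apply(rebuilt, mutation)
            self._index = rebuilt
            self._wal.clear()
            self._rebuilding = False
```

Between begin_rebuild and finish_rebuild, reads cost one index probe plus a scan of the buffered mutations, so the approach relies on the WAL staying short relative to the index.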

3. Bounded SQLite Pooling & O(1) Redis Sync

To solve the cross-thread SQLite errors, we deployed a Bounded Connection Pool for SQLiteStorage with strict thread-local isolation (check_same_thread=True, so each connection is only ever used by the thread that created it), eliminating shared-connection contention between threads.
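One way to combine a concurrency bound with thread-local connections is sketched below. It is an assumption about the general shape, not the SQLiteStorage implementation: the class name, the semaphore-based bound, and the demo helper are all hypothetical.

```python
import os
import sqlite3
import tempfile
import threading
from contextlib import contextmanager


class ThreadLocalSQLitePool:
    """Caps concurrent database users; each thread keeps its own connection."""

    def __init__(self, path: str, max_concurrent: int = 4) -> None:
        self._path = path
        self._slots = threading.BoundedSemaphore(max_concurrent)
        self._local = threading.local()

    @contextmanager
    def connection(self):
        # Not re-entrant: nesting this in one thread can exhaust the slots.
        self._slots.acquire()
        try:
            conn = getattr(self._local, "conn", None)
            if conn is None:
                # check_same_thread=True (sqlite3's default) makes any use from
                # another thread raise ProgrammingError instead of racing.
                conn = sqlite3.connect(self._path, check_same_thread=True)
                self._local.conn = conn
            yield conn
        finally:
            self._slots.release()


def demo_concurrent_inserts(n_threads: int = 8) -> int:
    """Insert one row per thread and return the final row count."""
    path = os.path.join(tempfile.mkdtemp(), "routes.db")
    pool = ThreadLocalSQLitePool(path, max_concurrent=2)
    with pool.connection() as conn:
        conn.execute("CREATE TABLE routes (name TEXT)")
        conn.commit()

    def insert(i: int) -> None:
        with pool.connection() as conn:
            conn.execute("INSERT INTO routes VALUES (?)", (f"route-{i}",))
            conn.commit()

    threads = [threading.Thread(target=insert, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    with pool.connection() as conn:
        return conn.execute("SELECT COUNT(*) FROM routes").fetchone()[0]
```

The semaphore bounds how many threads touch the database at once, while the thread-local slot guarantees that no connection object ever crosses a thread boundary.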

To solve the cluster broadcast storm, we upgraded the RedisSyncManager to utilize explicit target_id payloads. Rather than processing every mutation broadcast recursively, replicas now instantly drop loopback events, cutting synchronization network overhead from O(N^2) to strictly linear scaling.
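The loopback-drop idea can be simulated in-process without a Redis server. In this sketch the bus, the Replica class, and the `origin_id` field are hypothetical; the post names its payload field target_id, and its exact semantics are not shown, so treat the field layout here as an assumption.

```python
import json
import uuid
from typing import Callable, List


class FakeBus:
    """In-process stand-in for a pub/sub channel: every subscriber sees every message."""

    def __init__(self) -> None:
        self._subscribers: List[Callable[[str], None]] = []

    def subscribe(self, callback: Callable[[str], None]) -> None:
        self._subscribers.append(callback)

    def publish(self, message: str) -> None:
        for callback in list(self._subscribers):
            callback(message)


class Replica:
    def __init__(self, bus: FakeBus) -> None:
        self.replica_id = uuid.uuid4().hex
        self.applied: List[str] = []
        self.dropped = 0
        self._bus = bus
        bus.subscribe(self._on_message)

    def add_route(self, name: str) -> None:
        self.applied.append(name)  # apply locally, then notify peers once
        payload = {"origin_id": self.replica_id, "op": "add", "route": name}
        self._bus.publish(json.dumps(payload))

    def _on_message(self, raw: str) -> None:
        event = json.loads(raw)
        if event["origin_id"] == self.replica_id:
            self.dropped += 1  # our own broadcast echoed back: ignore it
            return
        # Apply the peer's mutation without re-broadcasting it; re-broadcasting
        # is what turns N messages into N^2.
        self.applied.append(event["route"])
```

With N replicas, one mutation now produces one publish and N deliveries, and no receiver publishes in response.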

The Chaos Simulation

To empirically prove these architectural changes worked, we stopped running standard sequential unit tests and built an Adversarial Chaos Simulation.

We hammered the in-memory SQLite and FAISS instances with 100 simultaneous threads:

50 Concurrent Writers rapidly injecting corrupted routes and forcing rollbacks.
50 Concurrent Readers aggressively triggering the indexing boundaries.

The Results (85-second duration):

1,000 successful route mutations.
2,500 successful reads.
0 Thread Crashes
0 SQLite Locks
0 Memory Leaks
0 Utterance Duplications

The ThreadPool isolation and WAL context managers held perfectly.

Independent Hardware Validation

One recurring question in local-first AI is hardware determinism. If you run a semantic router on a cloud GPU vs a consumer laptop, do the mathematical boundaries shift?

We tested SynaptoRoute v0.4.0 independently across five distinct consumer CPUs (from Intel 4C/8T to AMD 16C/24T).

Banking77 Dataset Results:

Top-1 Accuracy: 92.85% ± 0.00% across all machines.

CLINC150 Dataset Results:

Top-1 Accuracy: 75.04% ± 0.00% across all machines.

These runs show that the underlying ONNX inference and L2-normalized cosine thresholds behaved deterministically on every CPU we tested. Your routing logic should behave the same on an edge device as it does on a massive Kubernetes cluster. Raw latency scales with hardware; logical accuracy does not.

What's Next? (v0.5.0)

We have stabilized the underlying infrastructure for massive concurrency. Now, we move up the stack.

The v0.5.0 roadmap is focused on Dynamic Boundary Generation and Multi-Modal Integration:

LLM-assisted synthetic utterance generation to automatically seed intents from Python docstrings.
Native LangGraph ToolNode injection.
CLIP/ImageBind integrations to accept visual data (PIL.Image) directly into the router.

If you are building Agentic workflows or orchestration layers, give v0.4.0 a spin.

pip install synaptoroute==0.4.0

Repository: github.com/sitanshukr08/SynaptoRoute
PyPI: pypi.org/project/synaptoroute

If you run this in production under high load, I'd like to hear about it. Drop a comment below or open an issue on GitHub.

SynaptoRoute v0.3.0: Matching Semantic Router While Scaling to 50,000 Routes

Sitanshu Kumar — Mon, 01 Jun 2026 15:51:25 +0000

This is a follow-up to SynaptoRoute: A Study in Local Semantic Routing. If you haven't read it, the short version is: SynaptoRoute is a zero-token semantic routing engine that classifies user queries into intents using local embeddings instead of LLM API calls.

What Changed Since v0.2.0

When I published the first post, SynaptoRoute had just shipped dynamic batching and O(1) hot-reload. The throughput numbers were promising, but the accuracy story was incomplete. I had internal benchmarks but no comparison against a widely adopted baseline under identical, reproducible conditions.

That gap is now closed.

v0.3.0 is live on PyPI:

pip install synaptoroute==0.3.0

The Benchmarking Journey

Getting to these numbers took multiple benchmark revisions.

Early synthetic datasets produced catastrophic accuracy collapse and initially suggested that both SynaptoRoute and Semantic Router were performing poorly. After deeper investigation, the root cause turned out to be flaws in the dataset generation pipeline rather than limitations of the routing engines themselves.

Several rounds of validation, failure analysis, threshold tuning, adversarial testing, and external benchmarking followed. All final results presented in this article come from independent public datasets with strict train/test separation, eliminating dataset leakage and benchmark inflation.

That process was valuable because it forced the project to validate assumptions against real-world data instead of relying on synthetic benchmarks.

The Benchmark That Actually Matters

I evaluated SynaptoRoute against Semantic Router on two standard NLU datasets. Same embedding model (BAAI/bge-small-en-v1.5). Same hardware. Same evaluation script. Same train/test splits loaded from HuggingFace.

CLINC150

150 intents spanning 10 domains, plus an out-of-domain class. This is the standard stress test for intent routers.

Metric	SynaptoRoute	Semantic Router
Top-1 Accuracy	74.20%	73.35%
Precision	78.53%	74.68%
Recall	86.91%	88.46%
F1	81.34%	80.45%

Banking77

77 highly overlapping intents in a single domain. This dataset punishes routers that cannot distinguish between semantically adjacent queries like "card not working" and "card payment declined."

Metric	SynaptoRoute	Semantic Router
Top-1 Accuracy	91.81%	91.29%
Precision	91.29%	91.41%
Recall	91.80%	91.28%
F1	91.40%	91.28%

I want to be explicit about what this does and does not prove.

It proves that SynaptoRoute's architecture (Faiss-backed index, SQLite persistence, adaptive threshold fitting) produces classification accuracy that is competitive with the most widely adopted open-source semantic router.

It does not prove that one system is categorically better than the other. Half a percentage point on a single run is within normal benchmark variance. What it does establish is benchmark parity.

Current benchmark results show no evidence of a meaningful accuracy trade-off for SynaptoRoute's architectural advantages.

Scale Numbers

These are not accuracy benchmarks. These are infrastructure stress tests.

Metric	Result
Max Routes Tested	50,000
P99 Latency at 50k Routes	<50ms
Index Backend	Faiss FlatIP (L2-normalized)
Cold Boot (Prebuilt Index Load)	0.45s

At 50,000 routes, the system sustains approximately 302 queries per second on consumer hardware (Ryzen 7, 16GB RAM, no GPU).
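The "Faiss FlatIP (L2-normalized)" entry relies on a standard identity: once vectors are scaled to unit length, their inner product equals their cosine similarity. A dependency-free check of that identity, with helper names chosen only for illustration:

```python
import math
from typing import List, Sequence


def inner_product(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def l2_normalize(v: Sequence[float]) -> List[float]:
    """Scale a vector to unit length; the zero vector has no direction."""
    norm = math.sqrt(inner_product(v, v))
    if norm == 0.0:
        raise ValueError("cannot normalize the zero vector")
    return [x / norm for x in v]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity computed directly from unnormalized vectors."""
    return inner_product(a, b) / (
        math.sqrt(inner_product(a, a)) * math.sqrt(inner_product(b, b))
    )


query = [3.0, 4.0]
route = [4.0, 3.0]
# Same number both ways, so an inner-product index over normalized
# vectors ranks routes exactly as cosine similarity would.
via_cosine = cosine(query, route)
via_inner_product = inner_product(l2_normalize(query), l2_normalize(route))
```

Normalizing once at insert time means each query needs only a single normalization plus inner products, with no per-route norm computation.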

The significance of these numbers is not raw accuracy. They demonstrate that routing quality can remain competitive while scaling to route counts that are rarely evaluated in semantic routing systems.

What's Architecturally New

Pluggable Encoders

v0.2.0 was hardcoded to FastEmbed. v0.3.0 introduces a BaseEncoder interface. You can now route through remote embedding endpoints without modifying the core:

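from synaptoroute import AdaptiveRouter
from synaptoroute.encoder import OpenAIEncoder
from synaptoroute.storage import SQLiteStorage

# Remote encoder; dim must match the embedding model's output size
encoder = OpenAIEncoder(model="text-embedding-3-small", dim=1536)
storage = SQLiteStorage("routes.db")
router = AdaptiveRouter(encoder, storage)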

The OpenAIEncoder wraps the synchronous OpenAI client in asyncio.to_thread internally, so it does not block the batch worker's event loop.

Distributed State Sync

The biggest limitation I called out in the first post was this:

"The router is intentionally stateful. Different pods may have different local routing matrices."

That's no longer true. v0.3.0 ships a RedisSyncManager that broadcasts route mutations over Redis pub/sub. When one replica adds, updates, or deletes a route, all peers invalidate their local cache and rebuild.

from synaptoroute.sync import RedisSyncManager

sync = RedisSyncManager(redis_url="redis://localhost:6379")
router = AdaptiveRouter(encoder, storage, sync_manager=sync)

This is not a distributed consensus protocol. It is cache invalidation. The source of truth remains SQLite on each node. Redis is the notification bus.

Optimization Profiles

Rather than exposing raw batch sizes and timeout parameters, v0.3.0 introduces named profiles:

from synaptoroute.router import AdaptiveRouter, OptimizationProfile

router = AdaptiveRouter(
    encoder,
    storage,
    profile=OptimizationProfile.THROUGHPUT
)

THROUGHPUT configures larger batch sizes and longer queue drain intervals.

router = AdaptiveRouter(
    encoder,
    storage,
    profile=OptimizationProfile.LATENCY
)

LATENCY bypasses the queue entirely and encodes synchronously for single-query workloads.

Framework Integrations

SynaptoRoute can now be injected directly into LangChain and LlamaIndex pipelines:

from synaptoroute.integrations.langchain import SynaptoRouteTool

tool = SynaptoRouteTool(router=router)

What's Still Missing

I committed in the first post to being direct about limitations. That hasn't changed.

Cross-Encoder Reranking: Experimental prototypes have been evaluated and benchmarked but are not yet included in the production package. The current release continues to use a single-pass cosine similarity architecture. Production-grade reranking remains a v0.4.0 objective.
GPU Acceleration: The ONNX runtime falls back to CPU on all tested configurations. FastEmbed's CUDA provider requires specific cuDNN versions that are not trivially installable.
Multilingual Routing: Not validated. The benchmark model (bge-small-en-v1.5) is English-only. Multilingual routing requires a different embedding model and a separate evaluation.

What We Learned

A few conclusions became clear during benchmarking:

Semantic routing remains highly effective on real-world intent classification datasets.
Larger embedding models do not automatically produce better routing accuracy.
Both SynaptoRoute and Semantic Router struggle with logical reasoning tasks such as negation, double negation, and mixed-intent queries.
Most routing failures occur at semantic boundaries where multiple routes are genuinely plausible.
Architectural improvements such as batching, indexing, persistence, and state synchronization can significantly improve scalability without sacrificing benchmark accuracy.

The most important takeaway is that scaling semantic routing is primarily an infrastructure problem rather than an LLM problem.

What's Next

The next milestone is independent reproducibility.

The benchmarking work completed for v0.3.0 was performed on local hardware using publicly available datasets and documented evaluation scripts. The next release cycle will focus on building a dedicated benchmarking package that allows anyone to install SynaptoRoute, execute the same evaluations, and generate reproducible benchmark manifests containing:

Accuracy metrics
Latency metrics
Throughput metrics
Resource utilization
Hardware specifications
Software versions
Dataset metadata

The goal is simple: make every published benchmark independently verifiable.

Try It

pip install synaptoroute==0.3.0

from synaptoroute import AdaptiveRouter, Route
from synaptoroute.encoder import FastEmbedEncoder
from synaptoroute.storage import SQLiteStorage

encoder = FastEmbedEncoder()
storage = SQLiteStorage("routes.db")
router = AdaptiveRouter(encoder, storage)

router.add_route(Route(
    name="billing",
    utterances=[
        "check my balance",
        "payment history",
        "how much do I owe"
    ],
    threshold=0.5
))

result = router("what's my current balance?")
print(result.name)  # billing

The full benchmark methodology, raw numbers, and reproducibility instructions are documented in docs/BENCHMARKS.md and docs/COMPARISON.md in the repository.

Repository: https://github.com/sitanshukr08/SynaptoRoute

PyPI: https://pypi.org/project/synaptoroute/

If you run the benchmarks on your own hardware, I'd genuinely like to see the results. Open an issue, submit a benchmark manifest, or leave a comment.

SynaptoRoute: A Study in Local Semantic Routing

Sitanshu Kumar — Wed, 27 May 2026 16:09:47 +0000

1. Introduction: The "Why"

Why this project exists

In modern agentic architectures, systems often rely on Large Language Models (LLMs) to make basic routing decisions (e.g., determining if a user is asking for a password reset, a refund, or general support). While effective, this approach introduces three significant bottlenecks:

High Latency: Calling an external API takes hundreds of milliseconds.
Token Costs: Paying per-token for simple classification is economically inefficient at scale.
Non-Determinism: LLMs can occasionally hallucinate or return improperly formatted JSON.

Semantic routing solves this by locally converting the user's query into a vector embedding and using mathematical similarity (Cosine Similarity) against a predefined set of intents to make instant, free, and deterministic routing decisions.

Why we built SynaptoRoute

While exploring existing open-source solutions like Aurelio's semantic-router, we identified specific architectural bottlenecks. Existing routers often deep-copy their entire embedding array whenever a new route is added dynamically. As the dataset grows, this O(N) copy per addition makes live "hot-reloading" in production highly inefficient. Furthermore, many existing solutions evaluate queries sequentially, failing to utilize the parallel processing power of GPUs.

Our goal was to learn if we could engineer a fundamentally better architecture: a router optimized explicitly for high-throughput concurrency and efficient dynamic memory management.

2. Architecture: The "How"

How we encode the text

We utilized the BAAI/bge-small-en-v1.5 model. To push the physical limits of Python inference, we explicitly opted for an INT8 quantized version of the model via the fastembed ONNX runtime. By reducing the mathematical precision from 32-bit floats to 8-bit integers, we slashed the memory bandwidth requirements, allowing the CPU and GPU to process the tensors significantly faster with negligible accuracy loss.

How we manage memory (The Hot-Reload Problem)

Instead of deep-copying the entire vector array every time a user adds a new utterance, we implemented a lazy-compilation strategy.
New embeddings are instantly appended to a lightweight Python list (O(1) time complexity). We defer the expensive O(N) numpy.vstack reallocation penalty until the very next incoming query. While this slightly delays the next immediate request, it prevents the web server from blocking during live updates.
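The deferral can be illustrated without NumPy. In this sketch a tuple concatenation stands in for numpy.vstack, and the class name and compile counter are hypothetical additions used only to make the cost visible.

```python
from typing import List, Sequence, Tuple

Row = Tuple[float, ...]


class LazyMatrix:
    """Appends are cheap; the O(N) compile is deferred to the next read."""

    def __init__(self) -> None:
        self._compiled: Tuple[Row, ...] = ()
        self._pending: List[Row] = []
        self.compile_count = 0  # how many times the expensive step ran

    def add(self, vector: Sequence[float]) -> None:
        self._pending.append(tuple(vector))  # amortized O(1), no copy of old rows

    def rows(self) -> Tuple[Row, ...]:
        if self._pending:
            # Stand-in for numpy.vstack: one O(N) copy covering the whole burst.
            self._compiled = self._compiled + tuple(self._pending)
            self._pending.clear()
            self.compile_count += 1
        return self._compiled
```

A burst of 100 additions followed by one query pays for a single compile rather than 100, which is the tradeoff the paragraph above describes.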

How we achieve throughput (Dynamic Batching)

To fully utilize hardware acceleration, we realized that sending queries one-by-one is highly inefficient.
We introduced an asyncio.Queue and a background worker task. When a query arrives, it is dropped into the queue. The worker waits up to 5 milliseconds to collect up to 32 queries. It then passes the entire batch to the encoder to compute the cosine similarity as a single matrix multiplication.
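A compact sketch of that collection loop follows. The 5 ms window and batch size of 32 come from the text; `encode_batch`, `submit`, and the polling interval are assumptions, and the real worker may well wait on the queue differently.

```python
import asyncio
from typing import Callable, List, Tuple

MAX_BATCH = 32      # from the post
MAX_WAIT_S = 0.005  # 5 milliseconds, from the post


async def batch_worker(queue: asyncio.Queue,
                       encode_batch: Callable[[List[str]], List[str]]) -> None:
    loop = asyncio.get_running_loop()
    while True:
        batch = [await queue.get()]  # sleep until at least one query exists
        deadline = loop.time() + MAX_WAIT_S
        while len(batch) < MAX_BATCH:
            try:
                batch.append(queue.get_nowait())
            except asyncio.QueueEmpty:
                if loop.time() >= deadline:
                    break
                await asyncio.sleep(0.0005)  # yield so producers can enqueue
        results = encode_batch([text for text, _ in batch])  # one call per batch
        for (_, future), result in zip(batch, results):
            if not future.done():
                future.set_result(result)


async def submit(queue: asyncio.Queue, text: str) -> str:
    """Enqueue one query and wait for the worker to resolve its future."""
    future = asyncio.get_running_loop().create_future()
    await queue.put((text, future))
    return await future


async def demo(n: int = 40) -> Tuple[List[str], List[int]]:
    queue: asyncio.Queue = asyncio.Queue()
    sizes: List[int] = []

    def encode_batch(texts: List[str]) -> List[str]:
        sizes.append(len(texts))  # record batch sizes for inspection
        return [t.upper() for t in texts]  # placeholder for real encoding

    worker = asyncio.create_task(batch_worker(queue, encode_batch))
    results = await asyncio.gather(*(submit(queue, f"q{i}") for i in range(n)))
    worker.cancel()
    await asyncio.gather(worker, return_exceptions=True)
    return list(results), sizes
```

Calling `asyncio.run(demo(40))` returns the per-query results in submission order together with the batch sizes the worker actually formed.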

API & Deployment (FastAPI)

To transition the engine from a Python library into a scalable microservice, we wrapped the AdaptiveRouter in a fully asynchronous FastAPI application. The FastAPI lifecycle hooks are tightly coupled to the router's asyncio batching worker, ensuring graceful startup and shutdown. The system is containerized via Docker, allowing developers to deploy a ready-to-use semantic routing REST API (/route, /routes) with a single command.

How we optimize boundaries

Routing relies on a "similarity threshold" to decide if a query matches an intent. Hardcoding this threshold is brittle. We implemented a machine-learning optimizer (fit_thresholds) that automatically iterates through potential thresholds against a labeled dataset, calculating the F1-score to find the best-scoring cutoff point for every individual route.
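The search described above amounts to a sweep over candidate cutoffs. The sketch below handles a single route and is not the library's fit_thresholds: the function names, the 0.01 grid, and the tie-breaking rule are assumptions.

```python
from typing import Optional, Sequence


def f1_at(threshold: float, scores: Sequence[float], labels: Sequence[bool]) -> float:
    """F1 for one route when every score >= threshold counts as a match."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y)
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and not y)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def fit_route_threshold(scores: Sequence[float], labels: Sequence[bool],
                        candidates: Optional[Sequence[float]] = None) -> float:
    """Return the candidate with the highest F1; ties go to the stricter cutoff."""
    if candidates is None:
        candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda t: (f1_at(t, scores, labels), t))


# Similarity of five labeled queries to one route; True means "belongs here".
scores = [0.90, 0.80, 0.75, 0.40, 0.30]
labels = [True, True, True, False, False]
best = fit_route_threshold(scores, labels)
```

Breaking ties toward the higher threshold is a conservative choice that favors rejecting borderline queries; a production fitter would likely also hold out validation data.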

3. Architecture Iterations & Lessons Learned

This project was a continuous learning experience. Our initial implementations revealed severe structural flaws that we had to systematically engineer our way out of.

Iteration 1: Concurrency and Zombie Futures
When we first built the dynamic batching worker, we discovered that if the background task crashed or was cancelled during server shutdown, the queries waiting in the queue were abandoned. The asyncio.Future objects were never resolved, causing the client API requests to hang indefinitely.
The Solution: We learned to wrap asynchronous background workers in strict try/finally blocks to aggressively drain the queue and explicitly throw asyncio.CancelledError to all pending clients during a crash.
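A minimal sketch of that cleanup, under the assumption that each queue item is a (text, future) pair. The worker and demo names are hypothetical, and the sketch also covers the request that was already dequeued when the crash happened.

```python
import asyncio
from typing import Callable, List


async def guarded_worker(queue: asyncio.Queue, encode: Callable[[str], str]) -> None:
    in_flight = None  # the request currently being processed, if any
    try:
        while True:
            in_flight = await queue.get()
            text, future = in_flight
            result = encode(text)
            if not future.done():
                future.set_result(result)
            in_flight = None
    finally:
        # Runs on crash or cancellation: fail everything still waiting so that
        # no caller awaits a future nobody will ever resolve.
        pending = [in_flight] if in_flight is not None else []
        while True:
            try:
                pending.append(queue.get_nowait())
            except asyncio.QueueEmpty:
                break
        for _, future in pending:
            if not future.done():
                future.cancel()  # awaiting callers receive CancelledError


async def demo_crash() -> List[str]:
    loop = asyncio.get_running_loop()
    queue: asyncio.Queue = asyncio.Queue()
    futures = [loop.create_future() for _ in range(3)]
    for i, future in enumerate(futures):
        queue.put_nowait((f"q{i}", future))

    def encode(text: str) -> str:
        if text == "q1":
            raise RuntimeError("simulated encoder crash")
        return text.upper()

    task = asyncio.create_task(guarded_worker(queue, encode))
    await asyncio.gather(task, return_exceptions=True)  # swallow the crash here
    return ["cancelled" if f.cancelled() else f.result() for f in futures]
```

In the demo the first query succeeds, the second crashes the worker mid-flight, and the third is still queued; both unresolved futures end up cancelled instead of hanging.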

Iteration 2: DDoS Vulnerability and Backpressure
Our initial asyncio.Queue was unbounded. We quickly realized that if the router was hit by a massive traffic spike, the queue would grow infinitely until the server crashed from Out-of-Memory (OOM) errors.
The Solution: We applied a strict maxsize=10000 limit to the queue. By utilizing put_nowait(), the router instantly rejects overflow requests with a custom exception, providing vital backpressure so the web framework can gracefully return HTTP 429 Too Many Requests.
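The rejection path is small enough to show in full. The sketch assumes a wrapper class of our own invention; only the maxsize limit, put_nowait(), and the idea of a custom exception come from the text, and the exception is given the RouterOverloadedError name that the evaluation section mentions.

```python
import asyncio
from typing import Tuple


class RouterOverloadedError(Exception):
    """Raised when the bounded queue is full and a request must be shed."""


class BoundedSubmitter:
    def __init__(self, maxsize: int = 10_000) -> None:
        self._queue: asyncio.Queue = asyncio.Queue(maxsize=maxsize)

    def submit(self, text: str) -> None:
        try:
            self._queue.put_nowait(text)  # never waits: accept now or fail now
        except asyncio.QueueFull:
            raise RouterOverloadedError("routing queue is full") from None


async def demo_overload(total: int = 5, maxsize: int = 3) -> Tuple[int, int]:
    submitter = BoundedSubmitter(maxsize=maxsize)
    accepted = rejected = 0
    for i in range(total):
        try:
            submitter.submit(f"q{i}")
            accepted += 1
        except RouterOverloadedError:
            rejected += 1
    return accepted, rejected
```

A web layer can catch RouterOverloadedError at the request boundary and translate it into an HTTP 429 response, which keeps the overload decision in one place.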

Iteration 3: Stale Memory Leaks
When designing the hot-reload feature, we initially allowed users to overwrite existing routes. However, we forgot to garbage-collect the old vectors from the NumPy array. This caused memory bloat and allowed the router to incorrectly match against deleted data.
The Solution: We implemented a rigid memory-rebuild mechanism. If a route is overwritten, the router completely drops the in-memory array and safely rebuilds it from the SQLite database truth-source.

4. Evaluation & Results

Hardware & Methodology

Standard Cloud CPU: GitHub Actions ubuntu-latest Runner (Standard 2-core VM)
Local GPU: NVIDIA GeForce RTX 3050 Laptop GPU (ONNX CUDAExecutionProvider)
Dataset: bitext/customer-support-intent-dataset (80% Train / 20% Val), plus synthetic Out-of-Domain (OOD) and typographical error injections.

Latency & Scalability

Through dynamic batching and quantization, the system achieves exceptional throughput on both standard cloud infrastructure and dedicated GPUs.

Metric	Cloud CPU (2-Core)	Local GPU (RTX 3050)	Context
Inference P99 (Batch=1)	3.94 ms	~14.11 ms	Even on standard cloud hardware, the quantized architecture guarantees single-digit millisecond latency for sequential queries.
Amortized P50 (Batching)	2.69 ms	0.157 ms	Under heavy concurrent load (1,000 queries), dynamic batching processes queries in under 3ms on a cloud CPU, and 157 microseconds on a GPU.
Hot-Reload Penalty	5.04 ms	~30.19 ms	Measurements confirmed the tradeoff: deferring the O(N) np.vstack penalty allows for ~5 ms route additions without blocking the server.

Classification Accuracy

Test Type	Score	Note
In-Domain Accuracy	100.0%	Flawless mapping of known user intents in our test set.
Out-of-Domain FPR	40.0%	A baseline limitation; requires significant negative-sample tuning in production.
Adversarial Accuracy	98.0%	Highly resilient to spelling errors and character injections compared to Regex.

System Stability and Stress Testing

To validate production-readiness, the system was subjected to three stress testing scenarios:

Concurrency Limits (20,000 Concurrent Requests): The bounded internal queue (maxsize=10000) successfully managed an overload scenario. The system processed the first 10,000 queries and rejected the remaining 10,000 via RouterOverloadedError, preventing Out-of-Memory (OOM) failures with zero unhandled exceptions.
Memory Allocation Durability: The router processed 2,000 consecutive route additions and overwrites. Memory usage remained stable at a 0.32 MB peak allocation. This confirms that the O(1) NumPy mask replacement strategy resolved the memory degradation previously caused by np.vstack reallocation.
Edge-Case Input Handling: The pipeline was tested against empty strings, pure whitespace, 1-megabyte text payloads, unstructured noise, and extended Unicode characters. The ONNX runtime processed all inputs sequentially without raising critical exceptions or blocking the background worker task.

5. Unresolved Limitations

While we successfully hardened the router for local deployment, there are inherent limitations to this architecture that we chose not to solve, as they conflict with our goal of keeping the package lightweight and dependency-free.

Kubernetes Split-Brain (Cache Incoherency)
SynaptoRoute is fiercely stateful. If deployed across multiple Kubernetes pods behind a load balancer, an add_utterance request hitting Pod A will update Pod A's local NumPy matrix. Pod B will remain entirely unaware, resulting in split-brain routing logic across the cluster. Solving this would require integrating a Redis Pub/Sub event bus to broadcast memory invalidations. We explicitly opted against this to avoid heavy external dependencies.

6. Conclusion

By asking "why" semantic routers degrade in memory and "how" we could utilize GPU concurrency, we successfully built a mathematically hardened, asynchronous routing engine. The journey required us to confront the realities of asynchronous Python, threading locks, and hardware transfer overheads. SynaptoRoute stands as a highly educational study in optimizing local AI infrastructure.

How I Built AegisDesk: A Zero-Token Semantic IT Agent with <5ms Latency

Sitanshu Kumar — Fri, 22 May 2026 16:34:00 +0000

If you’ve built AI agents recently, you know the standard playbook: you take a user's prompt, feed it into GPT-4 or Claude alongside a massive JSON schema of available tools, and ask the LLM to figure out which tool to use.

This works for prototypes. But in an Enterprise IT environment, it’s a disaster.

Using an LLM for Intent Routing takes anywhere from 800ms to 2,000ms. It burns API tokens on every single "hello" or "my laptop is broken" message. Worse, LLMs hallucinate—if a user asks to "Provision an Azure SQL database," an overly helpful LLM might hallucinate a non-existent tool call and crash your pipeline.

I wanted to build an autonomous IT Helpdesk agent that was deterministic, instant, and practically free to run. That led me to build AegisDesk, an open-source, multi-agent IT platform powered by LangGraph, SQLite, and Zero-Token Semantic Routing.

The Architecture: Zero-Token Routing
Instead of relying on a monolithic prompt, AegisDesk abandons LLM-based routing entirely.

When a query enters AegisDesk, it never hits the cloud. Instead, the local pipeline intercepts the query and embeds it using the BAAI/bge-small-en-v1.5 sentence-transformer model via ONNX (fastembed).

This local vector is then mathematically compared (via Cosine Similarity) against an offline vocabulary of IT intents:

network_diagnostics: (ping, traceroute, nmap, tcp, udp)
cloud_integrations: (okta, jira, aws, azure, cyberark)
web_scraping: (wiki, internal docs, cve lookup)

The result? The query is mathematically routed to the correct highly-specialized LangGraph sub-agent in ~4.5 milliseconds for $0.00.

TIP

Enterprise Safety Net: If the semantic match confidence falls below 0.55, AegisDesk refuses to guess. It safely falls back to a generalized, read-only RAG (Retrieval-Augmented Generation) agent, guaranteeing no destructive commands are executed by mistake.
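The safety net reduces to one comparison after the similarity scores are computed. In the sketch below the 0.55 cutoff comes from the text, while the function name, the fallback agent's identifier, and the example scores are hypothetical.

```python
from typing import Dict

MIN_CONFIDENCE = 0.55             # cutoff stated in the post
FALLBACK_AGENT = "read_only_rag"  # hypothetical identifier for the RAG agent


def pick_agent(similarities: Dict[str, float]) -> str:
    """Route to the best-matching intent, or fall back when confidence is low."""
    if not similarities:
        return FALLBACK_AGENT
    best_intent, best_score = max(similarities.items(), key=lambda kv: kv[1])
    if best_score < MIN_CONFIDENCE:
        return FALLBACK_AGENT  # refuse to guess
    return best_intent


confident = pick_agent({"network_diagnostics": 0.81, "cloud_integrations": 0.40})
uncertain = pick_agent({"network_diagnostics": 0.54, "web_scraping": 0.31})
```

Because the fallback agent is read-only, a wrong low-confidence match costs a less specific answer rather than an unintended command.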

Dynamic Few-Shot Learning via SQLite
Static keywords are great, but IT environments evolve. What happens when a user types an obscure proprietary software name that isn't in our offline vocabulary?

To solve this, I integrated Dynamic Few-Shot Learning directly into the routing layer using SQLite Graph Memory.

When AegisDesk initializes, it queries a routing_examples table inside an ACID-compliant SQLite database. It extracts historical, successfully resolved IT tickets and embeds them dynamically into the routing corpus.

If an Administrator notices the agent struggling with a query like "Run a traceroute to internal-git.corp", they can manually inject the learning directly via the CLI:

aegisdesk teach-router "Run a traceroute to internal-git.corp" it_support network_diagnostics

The next time the router boots, it embeds that exact phrase. The system effectively "fine-tunes" its routing logic without any retraining, achieving >90% strict-match routing accuracy without a single line of Python code being altered.
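What the command does on the storage side can be sketched with sqlite3. Only the routing_examples table name comes from the text; the column layout, function names, and the grouping by intent are assumptions made for illustration.

```python
import sqlite3
from typing import Dict, List


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    # Hypothetical schema; the real table layout is not shown in the post.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS routing_examples ("
        "id INTEGER PRIMARY KEY, utterance TEXT NOT NULL, "
        "domain TEXT NOT NULL, intent TEXT NOT NULL)"
    )
    return conn


def teach(conn: sqlite3.Connection, utterance: str, domain: str, intent: str) -> None:
    with conn:  # commits on success, rolls back if the insert raises
        conn.execute(
            "INSERT INTO routing_examples (utterance, domain, intent) VALUES (?, ?, ?)",
            (utterance, domain, intent),
        )


def load_examples(conn: sqlite3.Connection) -> Dict[str, List[str]]:
    """Group stored utterances by intent, ready to be embedded at boot."""
    examples: Dict[str, List[str]] = {}
    rows = conn.execute("SELECT utterance, intent FROM routing_examples ORDER BY id")
    for utterance, intent in rows:
        examples.setdefault(intent, []).append(utterance)
    return examples
```

At startup the router would embed every utterance returned by load_examples alongside the static vocabulary, so a taught phrase becomes a routing example after the next boot.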

Zero-Trust Security Boundaries
Building an autonomous agent that can execute ipconfig, ping, or scrape internal HR wikis is inherently dangerous. AegisDesk implements two critical security mitigations at the tool execution layer:

RCE Defense (Remote Code Execution): Subprocess execution explicitly enforces shell=False. Before any command touches the OS, inputs are scrubbed using strict Regex [^a-zA-Z0-9._-] to eliminate bash metacharacters (&, |, ;, $). The hyphen sits last in the class so it is matched literally; written as .-_ it would form a character range that lets ; through.
SSRF Defense (Server-Side Request Forgery): The Web Scraping agent is hardened against TOCTOU (Time-Of-Check to Time-Of-Use) attacks. Outbound HTTP requests undergo pre-flight DNS checks. Any resolution attempting to hit loopback (127.0.0.1) or the link-local cloud metadata address (169.254.169.254) is aborted at the socket level.

Even with these defenses, AegisDesk utilizes LangGraph's interrupt_before functionality to trigger Human-in-the-Loop (HITL) confirmations before executing any terminal command.
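Both mitigations can be sketched with the standard library. The function names are hypothetical, the character class places the hyphen last so that it is a literal rather than a range, and the address check covers only the two categories the text names.

```python
import ipaddress
import re
import subprocess
from typing import List

# Hyphen placed last so it is a literal; ".-_" in the middle of a class would be
# a range from "." to "_" that also admits ";" and other metacharacters.
_UNSAFE_CHARS = re.compile(r"[^a-zA-Z0-9._-]")


def scrub_argument(value: str) -> str:
    """Drop every character outside the allow-list."""
    return _UNSAFE_CHARS.sub("", value)


def run_tool(argv: List[str]) -> subprocess.CompletedProcess:
    """Run a command as an argument vector; no shell ever parses the input."""
    return subprocess.run(argv, shell=False, capture_output=True, text=True, timeout=10)


def is_blocked_address(ip_text: str) -> bool:
    """True for loopback and link-local targets such as 169.254.169.254."""
    ip = ipaddress.ip_address(ip_text)
    return ip.is_loopback or ip.is_link_local


scrubbed = scrub_argument("8.8.8.8; rm -rf /")
```

To actually close the time-of-check to time-of-use gap, the address that passed is_blocked_address has to be the same one the socket connects to; resolving the hostname a second time at request time would reopen it.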

Try It Out
AegisDesk proves that you don't need massive, bloated monolithic LLMs to build intelligent enterprise agents. By pairing lightning-fast deterministic routing with specialized LangGraph swarms, you can build systems that are safer, cheaper, and orders of magnitude faster at routing.

You can install the CLI directly from PyPI today:

pip install aegisdesk

Check out the full source code and documentation on GitHub: github.com/sitanshukr08/Aegisdesk

If you’re building multi-agent swarms or semantic routers, I’d love to hear your thoughts in the comments!