<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Teddy</title>
    <description>The latest articles on DEV Community by Teddy (@sarkar4777).</description>
    <link>https://dev.to/sarkar4777</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3909927%2Fc9246c92-ad1d-43f1-82bb-7d4e1ce95cc0.png</url>
      <title>DEV Community: Teddy</title>
      <link>https://dev.to/sarkar4777</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sarkar4777"/>
    <language>en</language>
    <item>
      <title>Shipped 5 enterprise apps on one homemade agent platform — here's what broke</title>
      <dc:creator>Teddy</dc:creator>
      <pubDate>Sun, 03 May 2026 08:50:40 +0000</pubDate>
      <link>https://dev.to/sarkar4777/shipped-5-enterprise-apps-on-one-homemade-agent-platform-heres-what-broke-4o75</link>
      <guid>https://dev.to/sarkar4777/shipped-5-enterprise-apps-on-one-homemade-agent-platform-heres-what-broke-4o75</guid>
      <description>&lt;p&gt;I've been building &lt;strong&gt;Abenix&lt;/strong&gt; — an open-source multi-agent platform — in&lt;br&gt;
the open for the last few months. MIT-licensed, runs on Kubernetes (or&lt;br&gt;
docker-compose for laptop dev), and ships with &lt;strong&gt;five fully-built apps&lt;/strong&gt;&lt;br&gt;
on top of it so you can see what a real agent platform looks like&lt;br&gt;
end-to-end. Single command brings up the whole stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bash scripts/dev-local.sh             &lt;span class="c"&gt;# laptop, ~5-7 min&lt;/span&gt;
bash scripts/deploy-azure.sh deploy   &lt;span class="c"&gt;# AKS, ~35 min&lt;/span&gt;
👉 github.com/sarkar4777/abenix — MIT, Python + TS + Java SDKs,
KEDA-scaled, Postgres + Neo4j + Redis + NATS, 12 container images, 5
showcase apps.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before the technical bits, the question I keep getting asked: why&lt;br&gt;
build this when n8n / LangChain / CrewAI / Dify / Flowise exist? It's&lt;br&gt;
a fair question and the answer is genuinely positive — every one of&lt;br&gt;
those projects is amazing at what they target, and Abenix would not&lt;br&gt;
exist without LangChain demonstrating what an agent loop can be. So&lt;br&gt;
this post is the build journal, not a comparison fight.&lt;/p&gt;

&lt;p&gt;Why I built it: the constraints I started with&lt;br&gt;
Three constraints drove every design choice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Lightweight enough to run on a laptop, robust enough to run on a&lt;br&gt;
cluster — same code path. I wanted bash scripts/dev-local.sh to&lt;br&gt;
bring up the exact same agents, KB collections, and pipelines that&lt;br&gt;
production runs, just on docker-compose instead of helm. No "but in&lt;br&gt;
prod we use Kafka, locally it's an array" branching. The seed scripts,&lt;br&gt;
SDK clients, key reconciler, and migration system all run identically&lt;br&gt;
in both modes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Five real apps as the test surface, not toy demos. I work with&lt;br&gt;
enterprises (insurers, energy traders, tourism boards, pharma&lt;br&gt;
cold-chain). Each domain has a different sharp edge, so I committed to&lt;br&gt;
shipping five production-shaped apps in this repo, each forcing the&lt;br&gt;
platform to solve that domain's problem and demonstrating to&lt;br&gt;
enterprises that something like this can be built in-house:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;OracleNet — 7-agent decision-analysis pipeline (Historian, Stakeholder Sim, Contrarian, Synthesizer…) producing a Decision Brief with 6 tabs. Tests parallel-merge orchestration + JSON-schema strict output.&lt;br&gt;
Saudi Tourism — Vision-2030 KPIs over CSV/PDF, NLQ chat, simulator with 5 presets. Tests RAG + tabular tools + report generation.&lt;br&gt;
ClaimsIQ — Java/Vaadin claim-adjudication with a live SSE DAG. Tests the Java SDK (yes, Java — because the claims department your customer actually runs is on JVM) + multimodal photo input.&lt;br&gt;
Industrial-IoT — pump predictive-maintenance + cold-chain pharma. Tests the code-asset primitive: pipelines deploy real Go binaries as k8s Jobs and read back results — the agent doesn't run the DSP, it schedules a Go process to do the math.&lt;br&gt;
ResolveAI — case management with persona KB + precedent retrieval and approval-tier policies. Tests actAs(customer_id) delegation, policy-grounded resolutions, and SLA breach sweeps.&lt;/p&gt;

&lt;p&gt;If the platform regresses, all five regress in different ways. Hard&lt;br&gt;
to fake.&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Make the boring enterprise plumbing first-class, not an
afterthought. The features that get a project past procurement at a
regulated buyer aren't the demo bits — they're the audit trail, the
tool allow-list, the multi-tenant isolation, the cost ledger. So those
are core primitives in the platform, not extensions. More on that
below.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;What stands out — the bits I reach for and miss when I'm not on it&lt;br&gt;
These are honestly the things I'd rebuild somewhere else if I had to,&lt;br&gt;
because once you have them you stop wanting to live without them:&lt;/p&gt;

&lt;p&gt;actAs(subject_id) — delegation as a primitive&lt;br&gt;
Every agent execution carries an acting subject, not just an API&lt;br&gt;
key. When ResolveAI fires the policy-research agent for a case opened&lt;br&gt;
by customer_42, the SDK calls:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;forge.execute(
    "resolveai-policy-research",
    message=ticket_text,
    act_as=ActingSubject(subject_type="resolveai", subject_id="customer_42"),
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The execution row carries acting_subject_id. Tools that read data&lt;br&gt;
(like knowledge_search) check the subject's grants on the&lt;br&gt;
collection, not the platform's service-account grants. A bad agent&lt;br&gt;
can't escalate its way to data the user can't see, because the agent&lt;br&gt;
is permanently scoped to the subject.&lt;/p&gt;
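&lt;p&gt;To make the scoping concrete, here's a minimal Python sketch of subject-scoped tool authorization. The names (Collection, knowledge_search_allowed, the grants dict) are illustrative, not the repo's actual API:&lt;/p&gt;

```python
# Minimal sketch: tools check the *acting subject's* grants on a
# collection, never the platform service account's. Hypothetical names.
from dataclasses import dataclass, field

@dataclass(frozen=True)
class ActingSubject:
    subject_type: str
    subject_id: str

@dataclass
class Collection:
    slug: str
    # subject_id -> set of permissions, e.g. {"read"}
    grants: dict = field(default_factory=dict)

def knowledge_search_allowed(collection: Collection, subject: ActingSubject) -> bool:
    """A read tool runs only if the acting subject holds a 'read' grant."""
    return "read" in collection.grants.get(subject.subject_id, set())

# customer_42 can read their collection; customer_99 cannot.
coll = Collection("resolveai-policies", grants={"customer_42": {"read"}})
allowed = knowledge_search_allowed(coll, ActingSubject("resolveai", "customer_42"))
denied = knowledge_search_allowed(coll, ActingSubject("resolveai", "customer_99"))
```

&lt;p&gt;Because the grant lookup keys on the subject, not on the calling service, there is no code path where an agent's own credentials widen what it can read.&lt;/p&gt;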

&lt;p&gt;Pipelines as version-controlled YAML, lint-checked at boot&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;slug&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;oraclenet-synthesizer&lt;/span&gt;
&lt;span class="na"&gt;output_schema&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
  &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;number&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;minimum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;0&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;maximum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="nv"&gt;100&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
    &lt;span class="na"&gt;stakeholders&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;array&lt;/span&gt;
      &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;sentiment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;enum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;positive&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;negative&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;neutral&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
    &lt;span class="na"&gt;risks&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;array&lt;/span&gt;
      &lt;span class="na"&gt;items&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;severity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;{&lt;/span&gt; &lt;span class="nv"&gt;enum&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;low&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;medium&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;high&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;critical&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt; &lt;span class="pi"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;scripts/lint-agent-seeds.py runs at deploy time (Phase 4) and rejects&lt;br&gt;
any seed YAML where pipeline_config is misnested or a tool name isn't&lt;br&gt;
in the registry. Because catching the bug at YAML-load is 1000× cheaper&lt;br&gt;
than catching it at 3 AM after a model returned severity: "extreme".&lt;/p&gt;
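&lt;p&gt;A hedged sketch of what that lint pass checks, assuming seeds load into plain dicts; KNOWN_TOOLS and lint_seed are hypothetical names, not the script's real interface:&lt;/p&gt;

```python
# Sketch of the seed-lint idea: reject a seed whose declared tools aren't
# in the registry, or whose pipeline_config is missing/misnested.
KNOWN_TOOLS = {"knowledge_search", "atlas_describe", "atlas_traverse"}

def lint_seed(seed: dict) -> list[str]:
    """Return a list of human-readable errors; empty list means the seed passes."""
    errors = []
    if "pipeline_config" not in seed:
        errors.append("pipeline_config missing or misnested")
    for tool in seed.get("tools", []):
        if tool not in KNOWN_TOOLS:
            errors.append(f"unknown tool: {tool}")
    return errors

good = {"pipeline_config": {"nodes": []}, "tools": ["knowledge_search"]}
bad = {"tools": ["send_email"]}  # no pipeline_config, unregistered tool
good_errors = lint_seed(good)
bad_errors = lint_seed(bad)
```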

&lt;p&gt;Output-schema enforcement with normalization, not just validation&lt;br&gt;
Validation alone would crash production at the first ambiguous enum.&lt;br&gt;
The engine runs a post_process.py step on every agent output:&lt;br&gt;
validates against output_schema, normalizes known drift&lt;br&gt;
(mixed → neutral, extreme → critical), and emits&lt;br&gt;
validation_warnings on the SSE done event. UI never crashes, drift&lt;br&gt;
is visible, and I can tighten prompts without breaking the front-end.&lt;/p&gt;
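&lt;p&gt;The validate-then-normalize step can be sketched like this; the drift mappings (mixed → neutral, extreme → critical) come from the post, while the function shape is my assumption:&lt;/p&gt;

```python
# Sketch: known enum drift is mapped to a legal value and reported as a
# warning instead of crashing the UI. Names are illustrative.
DRIFT = {"sentiment": {"mixed": "neutral"}, "severity": {"extreme": "critical"}}
ALLOWED = {"sentiment": {"positive", "negative", "neutral"},
           "severity": {"low", "medium", "high", "critical"}}

def normalize(field: str, value: str):
    """Return (normalized_value, warning_or_None)."""
    if value in ALLOWED[field]:
        return value, None                      # already valid, no warning
    if value in DRIFT[field]:
        fixed = DRIFT[field][value]
        return fixed, f"{field}: normalized '{value}' -> '{fixed}'"
    return value, f"{field}: '{value}' not in schema"  # unknown drift, surfaced

ok = normalize("sentiment", "neutral")
fixed = normalize("severity", "extreme")
```

&lt;p&gt;The warnings would then ride along on the SSE done event, so drift stays visible without ever becoming a crash.&lt;/p&gt;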

&lt;p&gt;KEDA queue-depth autoscaling per agent type&lt;br&gt;
Different agents have different cost profiles. The oraclenet-synthesizer&lt;br&gt;
is slow + heavy; resolveai-triage is fast + cheap. They run&lt;br&gt;
on different agent-runtime pools with their own KEDA scalers&lt;br&gt;
(agent-runtime-default, agent-runtime-chat, agent-runtime-heavy-reasoning,&lt;br&gt;
agent-runtime-long-running). When the synthesizer queue&lt;br&gt;
backs up, only that pool scales — chat traffic doesn't pay the&lt;br&gt;
auto-scaler tax.&lt;/p&gt;

&lt;p&gt;Tool registry with seed-time allow-list&lt;br&gt;
Every tool is declared in apps/agent-runtime/engine/tools/. Every&lt;br&gt;
agent's seed YAML declares which tools it can use. The lint pass&lt;br&gt;
rejects an agent that tries to call a tool not in the registry. An&lt;br&gt;
agent literally cannot call something its seed didn't allow-list — no&lt;br&gt;
prompt-injection of "use the email tool to send the password to me" is&lt;br&gt;
going to find a tool the agent doesn't have access to.&lt;/p&gt;
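&lt;p&gt;A minimal sketch of that call-time gate, with hypothetical names; the point is that a tool must be both registered and on the agent's seed allow-list before it can run:&lt;/p&gt;

```python
# Sketch: the tool dispatch refuses anything outside the intersection of
# the global registry and this agent's allow-list. Names are illustrative.
class ToolNotAllowed(Exception):
    pass

def call_tool(agent_allowlist: set, tool_name: str, registry: dict):
    """Dispatch a tool call only if it is registered AND allow-listed."""
    if tool_name not in registry or tool_name not in agent_allowlist:
        raise ToolNotAllowed(tool_name)
    return registry[tool_name]()

registry = {"knowledge_search": lambda: "results"}
agent_tools = {"knowledge_search"}

result = call_tool(agent_tools, "knowledge_search", registry)

# A prompt-injected "use the email tool" request dies here: send_email is
# neither registered nor allow-listed for this agent.
blocked = False
try:
    call_tool(agent_tools, "send_email", registry)
except ToolNotAllowed:
    blocked = True
```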

&lt;p&gt;Code-asset primitive — pipelines deploy real binaries&lt;br&gt;
This one I haven't seen in any other agent platform. An agent pipeline&lt;br&gt;
node can take a zipped Go (or Node, Python, Rust, Ruby, Java) project&lt;br&gt;
as input, deploy it as a k8s Job, run it with structured input, and&lt;br&gt;
read structured output back:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pump_dsp&lt;/span&gt;
  &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;code_asset&lt;/span&gt;
  &lt;span class="na"&gt;asset_id&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;pump-dsp-corrector&lt;/span&gt;
  &lt;span class="na"&gt;inputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;rpm&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2400&lt;/span&gt;
    &lt;span class="na"&gt;samples&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;${windows[i].vibration}"&lt;/span&gt;
  &lt;span class="na"&gt;outputs&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;fault_scores&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;object&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Industrial-IoT pump pipeline uses this for FFT + bearing-resonance&lt;br&gt;
analysis (Go), then chains the fault_scores into an LLM agent for&lt;br&gt;
diagnosis. Real code, sandboxed, scheduled by the platform, results&lt;br&gt;
threaded back into the agent's reasoning.&lt;/p&gt;

&lt;p&gt;One stack: agents + Atlas (graph) + KB (vector) + tools&lt;br&gt;
Atlas is the project's named-entity / ontology graph (Neo4j-backed),&lt;br&gt;
KB is the document collection store (pgvector), agents call both via&lt;br&gt;
tools (atlas_describe, atlas_traverse, knowledge_search). Most&lt;br&gt;
projects stitch these from three vendors; here they share a tenant&lt;br&gt;
scope, a permission model, and a deploy path.&lt;/p&gt;
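&lt;p&gt;To illustrate the shared-scope idea (all names and data here are stand-ins, not the Abenix API): both the graph tool and the vector tool take the same tenant scope, so authorization is decided once, in one model:&lt;/p&gt;

```python
# Sketch: graph and vector tools share one tenant scope instead of three
# vendor credentials. Data and names are illustrative stand-ins.
from dataclasses import dataclass

@dataclass(frozen=True)
class TenantScope:
    tenant_id: str
    subject_id: str

def atlas_traverse(scope: TenantScope, start: str, rel: str) -> list:
    """Stand-in for a Neo4j traversal scoped to scope.tenant_id."""
    graph = {("pump-07", "MONITORED_BY"): ["sensor-fft-01", "sensor-temp-02"]}
    return graph.get((start, rel), [])

def knowledge_search(scope: TenantScope, query: str) -> list:
    """Stand-in for a pgvector search in the same tenant's collections."""
    return [f"doc matching '{query}' for {scope.tenant_id}"]

scope = TenantScope("industrial-iot", "operator_1")
sensors = atlas_traverse(scope, "pump-07", "MONITORED_BY")
docs = knowledge_search(scope, "bearing resonance")
```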

&lt;p&gt;SDKs in three runtimes&lt;br&gt;
Python — canonical, used by the API + every standalone API&lt;br&gt;
TypeScript / Node — used by every standalone web frontend&lt;br&gt;
Java — used by ClaimsIQ (Vaadin), proven by 6-stage adjudicate pipeline + live SSE DAG view&lt;br&gt;
Same actAs, same wait semantics, same execution row shape. If your&lt;br&gt;
enterprise stack is JVM, you don't have to rewrite it in Python.&lt;/p&gt;

&lt;p&gt;Self-check endpoint + idempotent bootstrap&lt;br&gt;
GET /api/agents/{slug}/self-check returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"agent"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"oraclenet-synthesizer"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"no_top_level_keys_leaked_into_model_config"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pipeline_has_nodes"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pipeline_nodes_well_formed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"model_declared"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tools_registry_loadable"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tools_all_known"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Agent broken? You see why in 50ms instead of debugging by tail-log.&lt;/p&gt;

&lt;p&gt;Single-command deploy that's idempotent&lt;/p&gt;

&lt;p&gt;Phase 0 — SDK drift pre-flight (5 vendored copies vs canonical hash)&lt;br&gt;
Phase 1 — Provision RG + ACR + AKS&lt;br&gt;
Phase 2 — Build + push 12 images&lt;br&gt;
Phase 3 — Helm install + alembic upgrade head&lt;br&gt;
Phase 4 — Seed agents, users, portfolio schemas, KB, standalone keys&lt;br&gt;
Phase 5 — Ingress (nip.io magic-domain so you don't fight DNS)&lt;br&gt;
If any phase fails, re-run the same command. Every step is idempotent,&lt;br&gt;
including the standalone-API-key reconciler that mints + rotates keys&lt;br&gt;
per app and patches them into k8s secrets.&lt;/p&gt;

&lt;p&gt;How it relates to the rest of the agent ecosystem&lt;br&gt;
I'm a fan of every project in this space. None of them is a competitor&lt;br&gt;
in the zero-sum sense — they target different audiences and different&lt;br&gt;
phases of an AI project. Here's how I think about where each fits in a&lt;br&gt;
buying decision, including Abenix:&lt;/p&gt;

&lt;p&gt;What you want to do, and what perhaps fits the bill:&lt;br&gt;
A library to compose LLM calls + tools, full DIY around it -&amp;gt; LangChain / LlamaIndex&lt;br&gt;
Multi-agent role-playing prototypes in 50 lines -&amp;gt; CrewAI / AutoGen&lt;br&gt;
Visual workflow builder for ops + automation, lots of integrations -&amp;gt;   n8n / Make / Zapier&lt;br&gt;
Visual LLM-app builder, self-hostable, RAG-first -&amp;gt; Dify / Flowise&lt;br&gt;
Hosted LLM observability + tracing for an existing LangChain app -&amp;gt; LangSmith&lt;br&gt;
Enterprise platform with a playground to customize: multi-tenant, audit trail, KEDA-scaled, multi-language SDKs, pipelines deployed as YAML, reference apps to copy-paste from -&amp;gt; Abenix ← what I built&lt;/p&gt;

&lt;p&gt;Most teams I work with end up using two of these, not one.&lt;br&gt;
LangChain inside an Abenix agent. n8n calling an Abenix endpoint.&lt;br&gt;
LangSmith pointed at Abenix execution traces. The platforms don't have&lt;br&gt;
to fight each other to coexist.&lt;/p&gt;

&lt;p&gt;I built Abenix specifically for the bottom row — the moment when an&lt;br&gt;
enterprise says "great prototype, now make it production-ready under&lt;br&gt;
our security review with five teams sharing it." That's the gap I kept&lt;br&gt;
hitting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A war story from production to close&lt;/strong&gt;&lt;br&gt;
The bug took me a week to find, and it explains 80% of why platforms&lt;br&gt;
in this space feel flaky in production.&lt;/p&gt;

&lt;p&gt;A few months ago I added KEDA queue-depth scaling. That meant&lt;br&gt;
POST /api/agents/{id}/execute had to become async-by-default — return&lt;br&gt;
{execution_id, mode: "async"} immediately and let workers grind the&lt;br&gt;
queue. Browser clients got a live SSE stream of node progress.&lt;/p&gt;

&lt;p&gt;The Python SDK kept reading data["output"] from the immediate&lt;br&gt;
response. Which was empty. So every standalone app was getting empty&lt;br&gt;
agent responses, JSON parse exploded, API returned 500, UI showed a&lt;br&gt;
generic "agent failure" toast.&lt;/p&gt;

&lt;p&gt;Fix: tri-state wait parameter on the server, defaulting to True&lt;br&gt;
for SDK callers (API-key auth) and False for browser callers (cookie&lt;br&gt;
auth, has UI for live streams). SDK now sends wait: true and falls&lt;br&gt;
back to polling /api/executions/{id} if the server still returns&lt;br&gt;
async-mode.&lt;/p&gt;
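&lt;p&gt;The wait-or-poll fallback is roughly this shape; the endpoint paths follow the post, but the function and field names are illustrative:&lt;/p&gt;

```python
import time

def execute_and_wait(post, get, agent_slug, message, max_polls=120, poll_s=0.5):
    """Send wait=true; if the server still answers async, fall back to polling.

    Hypothetical sketch: post/get are injected HTTP callables, and the
    endpoint shapes follow the ones described in the post.
    """
    resp = post(f"/api/agents/{agent_slug}/execute",
                {"message": message, "wait": True})
    if resp.get("mode") != "async":
        return resp["output"]             # server honored wait=true
    for _ in range(max_polls):            # fallback: poll the execution row
        row = get(f"/api/executions/{resp['execution_id']}")
        if row["status"] in ("succeeded", "failed"):
            return row.get("output")
        time.sleep(poll_s)
    raise TimeoutError(agent_slug)

# Stub transport: server ignores wait and answers async; one poll succeeds.
def fake_post(path, body):
    return {"execution_id": "ex-1", "mode": "async"}

def fake_get(path):
    return {"status": "succeeded", "output": "decision brief"}

result = execute_and_wait(fake_post, fake_get, "oraclenet-synthesizer", "analyze")
```

&lt;p&gt;Either branch ends in the same place: the SDK caller gets an output or a hard timeout, never a silently empty response.&lt;/p&gt;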

&lt;p&gt;Then I added a Phase 0 deploy gate — scripts/sync-sdks.sh --check&lt;br&gt;
runs at the top of every deploy, hashes the canonical SDK against five&lt;br&gt;
vendored copies, and refuses to proceed if any drift is detected:&lt;/p&gt;

&lt;p&gt;✓ in sync: packages/agent-sdk/abenix_sdk&lt;br&gt;
✓ in sync: contractiq/api/sdk/abenix_sdk&lt;br&gt;
✓ in sync: industrial-iot/api/sdk/abenix_sdk&lt;br&gt;
✓ in sync: resolveai/api/sdk/abenix_sdk&lt;br&gt;
✓ in sync: sauditourism/api/sdk/abenix_sdk&lt;br&gt;
✓ All 5 SDK copies in sync with canonical.&lt;br&gt;
I added this plumbing only after getting bitten.&lt;/p&gt;
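&lt;p&gt;The drift check itself reduces to hashing each vendored tree against the canonical one. A self-contained Python sketch of the idea (not the actual sync-sdks.sh logic, which is shell):&lt;/p&gt;

```python
import hashlib
import tempfile
from pathlib import Path

def tree_hash(root: Path) -> str:
    """One digest over every file under root: relative path + bytes."""
    h = hashlib.sha256()
    for p in sorted(root.rglob("*")):
        if p.is_file():
            h.update(str(p.relative_to(root)).encode())
            h.update(p.read_bytes())
    return h.hexdigest()

def check_drift(canonical: Path, copies: list) -> list:
    """Return the vendored copies whose tree hash differs from canonical."""
    want = tree_hash(canonical)
    return [c for c in copies if tree_hash(c) != want]

# Demo with throwaway directories: one copy in sync, then drifted.
base = Path(tempfile.mkdtemp())
(base / "canon").mkdir()
(base / "copy").mkdir()
(base / "canon" / "client.py").write_text("WAIT_DEFAULT = True\n")
(base / "copy" / "client.py").write_text("WAIT_DEFAULT = True\n")
in_sync = check_drift(base / "canon", [base / "copy"]) == []

(base / "copy" / "client.py").write_text("WAIT_DEFAULT = False\n")
drifted = check_drift(base / "canon", [base / "copy"])
```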

&lt;p&gt;&lt;strong&gt;What's not great&lt;/strong&gt;&lt;br&gt;
No managed-cloud option today — so far it's been K8s and VM installs. Bring your own AKS/GKE/EKS or run on a laptop.&lt;br&gt;
Atlas + KB grounding is wired in for some out-of-the-box agents, but not all.&lt;br&gt;
The KB document seeder is a no-op right now — content goes through Cognify (chunking + embedding), which the seeder doesn't drive yet. Collections + agent grants seed fine; content arrives via the upload UI or POST /api/knowledge/collections/{id}/documents.&lt;br&gt;
Probes ≠ tests. I have 6 Playwright probes that walk every link in every showcase app and capture screenshots — useful, but they're "smoke + screenshots," not real unit coverage. More unit tests are needed on top of what exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's in it for you&lt;/strong&gt;&lt;br&gt;
If you're building agents — or just want a lightweight agent backbone of&lt;br&gt;
your own — and the production handoff is starting to bite, this repo is&lt;br&gt;
six months of answering "what does the seam between the agent and the&lt;br&gt;
rest of my product actually look like?" The five showcase apps are meant&lt;br&gt;
to be that sharp edge. Fork it, replace "insurance" with your domain,&lt;br&gt;
ship.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;PRs welcome.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Repo: &lt;a href="https://github.com/sarkar4777/abenix" rel="noopener noreferrer"&gt;https://github.com/sarkar4777/abenix&lt;/a&gt;&lt;br&gt;
Single-command deploy: bash scripts/deploy-azure.sh deploy or&lt;br&gt;
bash scripts/dev-local.sh&lt;/p&gt;

&lt;p&gt;If you build something on it or find a real bug, open an issue — I&lt;br&gt;
read every one.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>agents</category>
      <category>kubernetes</category>
    </item>
  </channel>
</rss>
