<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vignesh Reddy</title>
    <description>The latest articles on DEV Community by Vignesh Reddy (@vignesh_reddy_53e403f62d2).</description>
    <link>https://dev.to/vignesh_reddy_53e403f62d2</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3959687%2Ffd423e78-9781-46fa-9884-d34085f9b6a8.jpg</url>
      <title>DEV Community: Vignesh Reddy</title>
      <link>https://dev.to/vignesh_reddy_53e403f62d2</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vignesh_reddy_53e403f62d2"/>
    <language>en</language>
    <item>
      <title>I published pip install ajah-sdk and npm install ajah-sdk — here's what they do</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Thu, 18 Jun 2026 18:14:54 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/i-published-pip-install-ajah-sdk-and-npm-install-ajah-sdk-heres-what-they-do-3m9h</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/i-published-pip-install-ajah-sdk-and-npm-install-ajah-sdk-heres-what-they-do-3m9h</guid>
      <description>&lt;p&gt;After two weeks of building Ajah — an &lt;br&gt;
open-source self-hosted LLM observability &lt;br&gt;
gateway — today I hit a milestone that &lt;br&gt;
actually matters for developer adoption.&lt;/p&gt;

&lt;p&gt;pip install ajah-sdk&lt;br&gt;
npm install ajah-sdk&lt;/p&gt;

&lt;p&gt;Both are live. Both work. Here's what &lt;br&gt;
they do and why I built them.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
THE PROBLEM THEY SOLVE&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Ajah is a gateway proxy that sits between &lt;br&gt;
your app and any LLM provider. It scores &lt;br&gt;
every response for hallucination risk, &lt;br&gt;
verifies RAG outputs, detects narrative &lt;br&gt;
drift across sessions, attributes costs &lt;br&gt;
per feature, and masks PII before storage.&lt;/p&gt;

&lt;p&gt;Before the SDKs, using Ajah required:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloning the repo&lt;/li&gt;
&lt;li&gt;Configuring Docker&lt;/li&gt;
&lt;li&gt;Manually setting headers on every request&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now it's one import.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
PYTHON SDK&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;pip install ajah-sdk&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from ajah import AjahClient

client = AjahClient(
    gateway_url="http://localhost:8080",
    api_key="your-groq-key",
    feature_name="my-app",
    user_id="user-123",
)

response = client.chat(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello"}],
)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every call through the SDK automatically &lt;br&gt;
injects the Ajah observability headers:&lt;/p&gt;

&lt;p&gt;X-Feature-Name, X-User-ID, X-Session-ID, &lt;br&gt;
X-Agent-Step&lt;/p&gt;

&lt;p&gt;These headers drive the entire Ajah &lt;br&gt;
pipeline — cost attribution, quality &lt;br&gt;
scoring, PII detection, session tracing.&lt;/p&gt;
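
&lt;p&gt;To make the header injection concrete, here is a minimal sketch of how such headers could be assembled by hand. The header names are the ones listed above; the helper name build_ajah_headers, the UUID fallback for the session ID, and the value formats are assumptions for illustration, not the SDK's actual internals.&lt;/p&gt;

```python
import uuid


def build_ajah_headers(feature_name, user_id, session_id=None, agent_step=None):
    """Return the observability headers named in the post.

    The header names come from the article; the value formats are assumptions.
    """
    headers = {
        "X-Feature-Name": feature_name,
        "X-User-ID": user_id,
        # Assumption: a random UUID is an acceptable session identifier.
        "X-Session-ID": session_id or str(uuid.uuid4()),
    }
    if agent_step is not None:
        headers["X-Agent-Step"] = agent_step
    return headers


headers = build_ajah_headers(
    "my-app", "user-123", session_id="demo-session", agent_step="step-1-planner"
)
print(headers)
```

&lt;p&gt;A dictionary like this could be passed as extra headers to any HTTP client that talks to the gateway.&lt;/p&gt;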

&lt;p&gt;Session tracking for multi-turn agents:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;with client.session() as session:
    plan = session.chat(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": "Plan research"}],
        step_name="step-1-planner",
    )
    research = session.chat(
        model="llama-3.3-70b-versatile",
        messages=[{"role": "user", "content": "Execute plan"}],
        step_name="step-2-researcher",
    )
    print(f"View session: {session.dashboard_url}")&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;AjahSession automatically increments step &lt;br&gt;
numbers, maintains the session ID across &lt;br&gt;
turns, and gives you a direct URL to the &lt;br&gt;
visual step tree in the Ajah dashboard.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
NODE.JS SDK (TYPESCRIPT)&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;npm install ajah-sdk&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import { AjahClient } from 'ajah-sdk'

const client = new AjahClient({
  gatewayUrl: 'http://localhost:8080',
  apiKey: process.env.GROQ_API_KEY!,
  featureName: 'my-app',
  userId: 'user-123',
})

const response = await client.chat({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'Hello' }],
})&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Full TypeScript types included. &lt;br&gt;
AjahSession works the same way:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;const session = client.session()

const r1 = await session.chat({
  model: 'llama-3.3-70b-versatile',
  messages: [...],
  stepName: 'step-1-planner',
})

console.log(session.dashboardUrl)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
WHAT RUNS BEHIND THE SDK&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Every call through the SDK goes through &lt;br&gt;
the Ajah gateway which runs:&lt;/p&gt;

&lt;p&gt;Hallucination scoring — sentence &lt;br&gt;
transformers evaluate every response &lt;br&gt;
for factual grounding. Async. Zero &lt;br&gt;
latency added.&lt;/p&gt;

&lt;p&gt;Claim density detection — flags responses &lt;br&gt;
that make many specific claims on &lt;br&gt;
low-context prompts.&lt;/p&gt;

&lt;p&gt;Linguistic hedge detection — flags &lt;br&gt;
overconfident responses on complex &lt;br&gt;
medical, legal, or financial questions.&lt;/p&gt;

&lt;p&gt;Narrative drift detection — compares &lt;br&gt;
claims across session turns. Flags &lt;br&gt;
when a model reverses position.&lt;/p&gt;

&lt;p&gt;Cost attribution — USD cost per call, &lt;br&gt;
tracked by feature and model.&lt;/p&gt;

&lt;p&gt;PII masking — emails, phones, SSNs, &lt;br&gt;
credit cards masked before storage.&lt;/p&gt;

&lt;p&gt;RAG verification — if you pass source &lt;br&gt;
documents, responses are verified &lt;br&gt;
against them. Contradictions flagged.&lt;/p&gt;

&lt;p&gt;Prometheus metrics — all signals exposed &lt;br&gt;
at /metrics for Grafana integration.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
SELF-HOSTED&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;The SDK points at your own running Ajah &lt;br&gt;
instance. No data goes through my servers.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;git clone https://github.com/VigneshReddy-afk/ajah
cd ajah
docker-compose up -d&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then use the SDK pointing at localhost:8080.&lt;/p&gt;

&lt;p&gt;MIT license. Free forever.&lt;/p&gt;

&lt;p&gt;→ pip install ajah-sdk&lt;br&gt;
→ npm install ajah-sdk&lt;br&gt;
→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#python #nodejs #llm #opensource&lt;br&gt;
#buildinpublic #devtools #aiinfrastructure&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>llm</category>
      <category>python</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I built Python and Node.js SDKs for my open-source LLM observability gateway — and I need a hosting sponsor</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Mon, 15 Jun 2026 18:43:54 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/i-built-python-and-nodejs-sdks-for-my-open-source-llm-observability-gateway-and-i-need-a-3pm2</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/i-built-python-and-nodejs-sdks-for-my-open-source-llm-observability-gateway-and-i-need-a-3pm2</guid>
      <description>&lt;p&gt;261 developers cloned Ajah in the &lt;br&gt;
first two weeks.&lt;/p&gt;

&lt;p&gt;Zero of them should need to understand &lt;br&gt;
Docker to get value from it.&lt;/p&gt;

&lt;p&gt;Today I shipped two SDKs that change that.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
PYTHON SDK&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;pip install ajah-sdk&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from ajah import AjahClient

client = AjahClient(
    gateway_url="http://localhost:8080",
    api_key="your-groq-key",
    feature_name="my-app",
)

response = client.chat(
    model="llama-3.3-70b-versatile",
    messages=[{"role": "user", "content": "Hello"}],
)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every call automatically gets:&lt;br&gt;
→ Cost attribution per feature&lt;br&gt;
→ Hallucination risk scoring&lt;br&gt;
→ PII masking before storage&lt;br&gt;
→ Full trace in the dashboard&lt;/p&gt;

&lt;p&gt;Session tracking for multi-turn agents:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;with client.session() as session:
    r1 = session.chat(
        model="llama-3.3-70b-versatile",
        messages=[...],
        step_name="step-1-planner",
    )
    r2 = session.chat(
        model="llama-3.3-70b-versatile",
        messages=[...],
        step_name="step-2-researcher",
    )
    print(session.dashboard_url)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
NODE.JS SDK (TYPESCRIPT)&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;npm install ajah-sdk&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import { AjahClient } from 'ajah-sdk';

const client = new AjahClient({
  gatewayUrl: 'http://localhost:8080',
  apiKey: 'your-groq-key',
  featureName: 'my-app',
  userId: 'user-123',
});

const response = await client.chat({
  model: 'llama-3.3-70b-versatile',
  messages: [{ role: 'user', content: 'Hello' }],
});&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Full TypeScript types included.&lt;br&gt;
Session tracking built in.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
THE HONEST ASK&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Ajah is self-hosted today. &lt;br&gt;
You run it on your own infrastructure.&lt;/p&gt;

&lt;p&gt;The next step is managed cloud hosting — &lt;br&gt;
so developers can use the SDK without &lt;br&gt;
running Docker at all.&lt;/p&gt;

&lt;p&gt;I'm looking for a sponsor or infrastructure &lt;br&gt;
partner to make that happen.&lt;/p&gt;

&lt;p&gt;If you're a cloud provider, accelerator, &lt;br&gt;
or investor who believes in open-source &lt;br&gt;
AI infrastructure — let's talk.&lt;/p&gt;

&lt;p&gt;&lt;a href="mailto:vigneshreddy181200@gmail.com"&gt;vigneshreddy181200@gmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource&lt;br&gt;
#python #nodejs #devtools #aiinfrastructure&lt;/p&gt;

</description>
      <category>llm</category>
      <category>opensource</category>
      <category>python</category>
      <category>showdev</category>
    </item>
    <item>
      <title>How to add full observability to your LangChain and LlamaIndex agents in under 10 minutes</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Sun, 14 Jun 2026 14:47:33 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/how-to-add-full-observability-to-your-langchain-and-llamaindex-agents-in-under-10-minutes-35p1</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/how-to-add-full-observability-to-your-langchain-and-llamaindex-agents-in-under-10-minutes-35p1</guid>
      <description>&lt;p&gt;If you're running LangChain or LlamaIndex &lt;br&gt;
agents in production, you're missing &lt;br&gt;
critical signals.&lt;/p&gt;

&lt;p&gt;You know what your agent said.&lt;br&gt;
You don't know what it cost per step.&lt;br&gt;
You don't know when it hallucinated.&lt;br&gt;
You don't know when it reversed a position &lt;br&gt;
under pressure across a long conversation.&lt;/p&gt;

&lt;p&gt;Today I shipped two integrations for Ajah &lt;br&gt;
that fix this — a LangChain callback handler &lt;br&gt;
and a LlamaIndex observer. Both are single-file &lt;br&gt;
drops into your existing project.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
SETUP (2 MINUTES)&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Clone and start Ajah:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;git clone https://github.com/VigneshReddy-afk/ajah
cd ajah
cp .env.example .env
docker-compose up -d&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Dashboard live at localhost:3000.&lt;br&gt;
Gateway at localhost:8080.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
LANGCHAIN INTEGRATION&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Copy examples/langchain/ajah_callback.py &lt;br&gt;
into your project. Then:&lt;/p&gt;

&lt;p&gt;pip install langchain-openai langchain-core&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from ajah_callback import AjahCallbackHandler
from langchain_openai import ChatOpenAI

handler = AjahCallbackHandler(
    gateway_url="http://localhost:8080",
    feature_name="my-agent",
    user_id="user-123",
)

llm = ChatOpenAI(
    base_url="http://localhost:8080/v1",
    api_key="your-groq-key",
    model="llama-3.3-70b-versatile",
    callbacks=[handler],
    model_kwargs={
        "extra_headers": handler.get_extra_headers("step-1")
    },
)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;What you get automatically for every call:&lt;/p&gt;

&lt;p&gt;Cost attribution — how much each agent &lt;br&gt;
step costs in USD, tracked per feature &lt;br&gt;
and model in real time.&lt;/p&gt;

&lt;p&gt;Hallucination risk — every response &lt;br&gt;
scored async using local ML models. &lt;br&gt;
Zero latency added to your agent.&lt;/p&gt;

&lt;p&gt;Claim density detection — flags responses &lt;br&gt;
that make many specific claims on &lt;br&gt;
low-context prompts. Catches a class &lt;br&gt;
of hallucination that embedding similarity &lt;br&gt;
misses.&lt;/p&gt;

&lt;p&gt;Narrative drift detection — compares &lt;br&gt;
claims across session turns. Flags when &lt;br&gt;
your agent reverses a position under &lt;br&gt;
pressure. Critical for long-running agents.&lt;/p&gt;

&lt;p&gt;RAG verification — if you pass source &lt;br&gt;
documents, every response is verified &lt;br&gt;
against them. Contradictions flagged &lt;br&gt;
before they reach users.&lt;/p&gt;

&lt;p&gt;Full session trace — visual step tree &lt;br&gt;
in the dashboard showing every turn, &lt;br&gt;
cost, latency, and quality score.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
LLAMAINDEX INTEGRATION&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Copy examples/llamaindex/ajah_observer.py &lt;br&gt;
into your project. Then:&lt;/p&gt;

&lt;p&gt;pip install llama-index llama-index-llms-openai&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;from ajah_observer import AjahObserver
from llama_index.core import Settings
from llama_index.llms.openai import OpenAI

observer = AjahObserver(
    gateway_url="http://localhost:8080",
    feature_name="rag-pipeline",
    user_id="user-123",
)
observer.register()

Settings.llm = OpenAI(
    api_base="http://localhost:8080/v1",
    api_key="your-groq-key",
    model="llama-3.3-70b-versatile",
    additional_kwargs={
        "extra_headers": observer.get_extra_headers("step-1-query")
    },
)&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Every RAG query now gets full observability &lt;br&gt;
— grounding scores, contradiction detection, &lt;br&gt;
cost tracking, and session tracing.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
WHAT YOU SEE IN THE DASHBOARD&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;After running your agent:&lt;/p&gt;

&lt;p&gt;Sessions page — visual step tree showing &lt;br&gt;
every LLM call in your agent run, grouped &lt;br&gt;
by session ID with per-step cost and latency.&lt;/p&gt;

&lt;p&gt;Warnings page — any hallucination flags, &lt;br&gt;
RAG contradictions, claim density alerts, &lt;br&gt;
or narrative drift detected across your &lt;br&gt;
session turns.&lt;/p&gt;

&lt;p&gt;Traces page — live feed of every call &lt;br&gt;
with quality scores, PII detection, &lt;br&gt;
and RAG verdicts.&lt;/p&gt;

&lt;p&gt;Overview — cost by feature and model, &lt;br&gt;
quality trend over time.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Both integrations are in the repo under &lt;br&gt;
examples/langchain/ and examples/llamaindex/.&lt;/p&gt;

&lt;p&gt;Self-hosted. No data leaves your server. &lt;br&gt;
MIT license. Free forever.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#langchain #llamaindex #llm #opensource&lt;br&gt;
#buildinpublic #devtools #aiagents&lt;/p&gt;

</description>
      <category>agents</category>
      <category>llm</category>
      <category>monitoring</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Rate limiting, email alerts, health checks, and Grafana — what we shipped to make Ajah production-ready</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Sat, 13 Jun 2026 07:07:43 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/rate-limiting-email-alerts-health-checks-and-grafana-what-we-shipped-to-make-ajah-1p4f</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/rate-limiting-email-alerts-health-checks-and-grafana-what-we-shipped-to-make-ajah-1p4f</guid>
      <description>&lt;p&gt;When we launched Ajah two weeks ago, &lt;br&gt;
261 developers cloned it in the first week.&lt;/p&gt;

&lt;p&gt;The product worked. But it wasn't &lt;br&gt;
production-ready for enterprise teams.&lt;/p&gt;

&lt;p&gt;Today that changes.&lt;/p&gt;

&lt;p&gt;Here's exactly what we shipped and why &lt;br&gt;
each piece matters for teams running &lt;br&gt;
LLMs in production.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
RATE LIMITING PER FEATURE&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;The problem: a single misconfigured &lt;br&gt;
agent or a traffic spike on one feature &lt;br&gt;
can exhaust your entire API budget before &lt;br&gt;
anyone notices.&lt;/p&gt;

&lt;p&gt;The fix: per-feature rate limiting using &lt;br&gt;
a Redis sliding window counter.&lt;/p&gt;

&lt;p&gt;Configure requests per minute from the &lt;br&gt;
Settings page — no code changes needed. &lt;br&gt;
When a feature exceeds its limit, the &lt;br&gt;
gateway returns 429 before the request &lt;br&gt;
ever reaches your LLM provider:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "error": "rate limit exceeded",
  "feature": "chat",
  "limit": 60,
  "reset_in_seconds": 34
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Response headers include X-RateLimit-Limit &lt;br&gt;
and X-RateLimit-Reset for client-side &lt;br&gt;
handling. One Redis INCR call per request — &lt;br&gt;
sub-millisecond overhead.&lt;/p&gt;
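
&lt;p&gt;As a rough illustration of the per-feature limit, here is an in-memory sliding-window sketch. It is not the gateway's Redis implementation: the class name SlidingWindowLimiter and the timestamp-queue approach are assumptions, and only the 429 status and the response fields are taken from the example above.&lt;/p&gt;

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """In-memory stand-in for a per-feature limiter (the post uses Redis)."""

    def __init__(self, limit, window_seconds=60):
        self.limit = limit
        self.window = window_seconds
        self.hits = defaultdict(deque)  # feature name -> request timestamps

    def check(self, feature, now):
        """Return (status_code, body) for a request arriving at time `now`."""
        q = self.hits[feature]
        # Drop timestamps that have slid out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            reset_in = int(self.window - (now - q[0]))
            return 429, {
                "error": "rate limit exceeded",
                "feature": feature,
                "limit": self.limit,
                "reset_in_seconds": reset_in,
            }
        q.append(now)
        return 200, {}


limiter = SlidingWindowLimiter(limit=2, window_seconds=60)
first = limiter.check("chat", now=0.0)
second = limiter.check("chat", now=10.0)
third = limiter.check("chat", now=26.0)   # third call inside the window
fourth = limiter.check("chat", now=61.0)  # the first hit has expired by now
print(first, second, third, fourth)
```

&lt;p&gt;Passing the clock in as an argument keeps the sketch deterministic; a real limiter would read the current time itself and share its state across gateway instances.&lt;/p&gt;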

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
EMAIL ALERTS VIA SMTP&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;The problem: Slack webhooks reach &lt;br&gt;
developers. They don't reach compliance &lt;br&gt;
teams, finance teams, or anyone who &lt;br&gt;
needs an audit trail.&lt;/p&gt;

&lt;p&gt;The fix: SMTP email alerts alongside &lt;br&gt;
existing Slack webhooks.&lt;/p&gt;

&lt;p&gt;Configure once via the Settings API:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;POST /settings
{
  "smtp_config": {
    "host": "smtp.gmail.com",
    "port": 587,
    "username": "alerts@yourcompany.com",
    "password": "your-app-password",
    "from": "alerts@yourcompany.com"
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then set alert_email_to per feature. &lt;br&gt;
Cost spikes and risk flags fire email &lt;br&gt;
automatically — subject lines like:&lt;/p&gt;

&lt;p&gt;[Ajah Alert] Cost spike — feature: chat&lt;br&gt;
[Ajah Alert] Risk flag — feature: support-bot&lt;/p&gt;

&lt;p&gt;Fire-and-forget goroutines. Zero latency &lt;br&gt;
added to the hot path.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
PER-DEPENDENCY HEALTH CHECKS&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;The problem: {"status":"ok"} is useless &lt;br&gt;
when your load balancer needs to know &lt;br&gt;
which specific dependency is down at 2am.&lt;/p&gt;

&lt;p&gt;The fix: /health now pings Redis, &lt;br&gt;
PostgreSQL, and ClickHouse individually &lt;br&gt;
with a 3-second timeout per dependency:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "status": "ok",
  "version": "0.1.0",
  "dependencies": {
    "redis":      {"status": "ok"},
    "postgres":   {"status": "ok"},
    "clickhouse": {"status": "ok"}
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If any dependency is down, the response &lt;br&gt;
returns HTTP 503 with the specific error:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;{
  "status": "degraded",
  "dependencies": {
    "redis": {
      "status": "down",
      "error": "dial tcp: connection refused"
    }
  }
}&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Your monitoring system, load balancer, &lt;br&gt;
and on-call engineer know exactly what &lt;br&gt;
to fix.&lt;/p&gt;
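
&lt;p&gt;The aggregation logic behind a response like this can be sketched in a few lines. This is an illustrative Python version, not the gateway's own code: the function check_health and the probe callables are hypothetical, and the 3-second timeout is assumed to live inside each probe.&lt;/p&gt;

```python
def check_health(probes):
    """Run each dependency probe and build a /health-style payload.

    `probes` maps a dependency name to a zero-argument callable that raises
    on failure. Timeouts are left to the probes themselves in this sketch.
    """
    dependencies = {}
    for name, probe in probes.items():
        try:
            probe()
            dependencies[name] = {"status": "ok"}
        except Exception as exc:
            dependencies[name] = {"status": "down", "error": str(exc)}
    all_ok = all(d["status"] == "ok" for d in dependencies.values())
    body = {"status": "ok" if all_ok else "degraded", "dependencies": dependencies}
    return (200 if all_ok else 503), body


def redis_down():
    # Simulated failure for the demo.
    raise ConnectionError("dial tcp: connection refused")


status, body = check_health({
    "redis": redis_down,
    "postgres": lambda: None,
    "clickhouse": lambda: None,
})
print(status, body)
```

&lt;p&gt;Reporting every dependency, including the healthy ones, lets a monitoring system tell a single failure apart from a wider outage.&lt;/p&gt;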

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;br&gt;
GRAFANA DASHBOARD&lt;br&gt;
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;The problem: we shipped 10 Prometheus &lt;br&gt;
metrics two weeks ago. Nobody wants &lt;br&gt;
to build 18 Grafana panels from scratch.&lt;/p&gt;

&lt;p&gt;The fix: docs/grafana-dashboard.json &lt;br&gt;
— one import, production dashboard.&lt;/p&gt;

&lt;p&gt;18 panels across 5 sections:&lt;/p&gt;

&lt;p&gt;Traffic&lt;br&gt;
→ Requests per second by feature&lt;br&gt;
→ Requests per second by provider&lt;/p&gt;

&lt;p&gt;Latency&lt;br&gt;
→ LLM p50 and p95 by provider&lt;br&gt;
→ Scorer p50 and p95&lt;/p&gt;

&lt;p&gt;Cost&lt;br&gt;
→ Cost per hour by feature (USD)&lt;br&gt;
→ Cost per hour by model (USD)&lt;/p&gt;

&lt;p&gt;Quality and Safety&lt;br&gt;
→ Hallucination risk gauges by feature&lt;br&gt;
→ Claim density risk by feature&lt;br&gt;
→ Narrative drift risk by feature&lt;/p&gt;

&lt;p&gt;Warnings and PII&lt;br&gt;
→ Warning rate by risk level&lt;br&gt;
→ PII detection rate by feature&lt;/p&gt;

&lt;p&gt;Import the JSON, point at your Prometheus &lt;br&gt;
datasource, and you have a complete &lt;br&gt;
LLM observability dashboard in under &lt;br&gt;
60 seconds.&lt;/p&gt;

&lt;p&gt;━━━━━━━━━━━━━━━━━━━━━━━━━━━━━&lt;/p&gt;

&lt;p&gt;Ajah is open source, self-hosted, &lt;br&gt;
MIT licensed.&lt;/p&gt;

&lt;p&gt;No data leaves your server. &lt;br&gt;
No vendor lock-in. &lt;br&gt;
No acquisition risk.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource #devtools&lt;/p&gt;

</description>
      <category>api</category>
      <category>devops</category>
      <category>llm</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>How I built narrative drift detection for LLM agent runs</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Sat, 06 Jun 2026 16:40:08 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/how-i-built-narrative-drift-detection-for-llm-agent-runs-2i56</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/how-i-built-narrative-drift-detection-for-llm-agent-runs-2i56</guid>
      <description>&lt;p&gt;Every LLM observability tool monitors &lt;br&gt;
individual requests.&lt;/p&gt;

&lt;p&gt;None of them monitor position consistency &lt;br&gt;
across a conversation.&lt;/p&gt;

&lt;p&gt;That's the gap I shipped today in Ajah.&lt;/p&gt;

&lt;p&gt;The problem:&lt;/p&gt;

&lt;p&gt;In a long agent run or multi-turn &lt;br&gt;
conversation, a model can reverse its &lt;br&gt;
position under social pressure — and &lt;br&gt;
nothing flags it. Turn 2 says one thing. &lt;br&gt;
Turn 8 says the opposite. Both responses &lt;br&gt;
look perfectly normal in isolation.&lt;/p&gt;

&lt;p&gt;For healthcare, legal, and financial &lt;br&gt;
AI systems, this is a liability.&lt;/p&gt;

&lt;p&gt;How narrative drift detection works:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Every session turn stores up to 2000 &lt;br&gt;
characters of response text in Redis&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;When a new request comes in with a &lt;br&gt;
session ID, Ajah fetches the full &lt;br&gt;
session history and passes it to &lt;br&gt;
the scorer&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The scorer extracts factual claims &lt;br&gt;
from each turn — sentences containing &lt;br&gt;
proper nouns, numbers, or absolute &lt;br&gt;
statements&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Claims are embedded using &lt;br&gt;
sentence-transformers and compared &lt;br&gt;
across turns using cosine similarity&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;High similarity + negation markers &lt;br&gt;
= contradiction signal&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;drift_risk score + drift_verdict &lt;br&gt;
(stable / possible_drift / drift_detected) &lt;br&gt;
returned with every scored response&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;narrative_drift flag fires in the &lt;br&gt;
Warnings dashboard when drift_risk &amp;gt; 0.5&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
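
&lt;p&gt;Steps 4 and 5 can be sketched with a toy example. To stay dependency-free it swaps the sentence-transformers embeddings for a bag-of-words vector, and the negation list and scoring rule are assumptions; only the idea of high similarity plus negation markers and the 0.5 threshold come from the steps above.&lt;/p&gt;

```python
import math
import re
from collections import Counter

NEGATIONS = {"not", "no", "never", "cannot"}  # hypothetical marker list


def embed(sentence):
    """Toy bag-of-words vector; the post uses sentence-transformers instead."""
    words = re.findall(r"[a-z0-9']+", sentence.lower())
    return Counter(w for w in words if w not in NEGATIONS)


def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def has_negation(sentence):
    return bool(NEGATIONS.intersection(re.findall(r"[a-z']+", sentence.lower())))


def drift_risk(earlier_claim, later_claim):
    """Treat high similarity plus a negation mismatch as a contradiction signal."""
    similarity = cosine(embed(earlier_claim), embed(later_claim))
    flipped = has_negation(earlier_claim) != has_negation(later_claim)
    return similarity if flipped else 0.0


risk = drift_risk(
    "The drug is safe for children.",
    "The drug is not safe for children.",
)
print(risk, "narrative_drift flag" if risk > 0.5 else "no flag")
```

&lt;p&gt;A keyword-based negation check like this misses paraphrased reversals, which is one reason a real scorer would lean on learned embeddings.&lt;/p&gt;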

&lt;p&gt;Everything runs async. Zero latency &lt;br&gt;
added to your users.&lt;/p&gt;

&lt;p&gt;MIT license. Self-hosted.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource #devtools&lt;/p&gt;

</description>
      <category>agents</category>
      <category>llm</category>
      <category>monitoring</category>
      <category>showdev</category>
    </item>
    <item>
      <title>How I added real-time Slack alerts to an open-source LLM gateway in one day</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Fri, 05 Jun 2026 15:26:36 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/how-i-added-real-time-slack-alerts-to-an-open-source-llm-gateway-in-one-day-31hn</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/how-i-added-real-time-slack-alerts-to-an-open-source-llm-gateway-in-one-day-31hn</guid>
      <description>&lt;p&gt;When something goes wrong with your LLM &lt;br&gt;
in production, you shouldn't have to &lt;br&gt;
check a dashboard to find out.&lt;/p&gt;

&lt;p&gt;Today I shipped Slack webhook support &lt;br&gt;
to Ajah — two types of alerts, both &lt;br&gt;
fire-and-forget, zero latency added.&lt;/p&gt;

&lt;p&gt;Cost spike alerts:&lt;/p&gt;

&lt;p&gt;When a feature's daily LLM spend exceeds &lt;br&gt;
the configured threshold, Ajah fires a &lt;br&gt;
formatted Slack message:&lt;/p&gt;

&lt;p&gt;🚨 Cost Alert — Ajah&lt;br&gt;
Feature: chat&lt;br&gt;
Cost today: $4.23&lt;br&gt;
Threshold: $2.00&lt;br&gt;
Model: gpt-4o&lt;/p&gt;

&lt;p&gt;Deduplication is built in — one alert &lt;br&gt;
per feature per day maximum, using a &lt;br&gt;
Redis SetNX with 24h TTL.&lt;/p&gt;
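
&lt;p&gt;The deduplication idea can be shown without a Redis server. The sketch below is an in-memory stand-in for a set-if-absent write with a 24h expiry; the class AlertDeduper and the key format are hypothetical, not Ajah's actual code.&lt;/p&gt;

```python
class AlertDeduper:
    """In-memory stand-in for a Redis set-if-absent write with a 24h expiry."""

    def __init__(self, ttl_seconds=24 * 60 * 60):
        self.ttl = ttl_seconds
        self.expires_at = {}  # key -> expiry timestamp

    def set_nx(self, key, now):
        """Store the key only if it is absent or expired; return True when stored."""
        expiry = self.expires_at.get(key)
        if expiry is not None and expiry > now:
            return False  # a live key already exists, so this alert is a duplicate
        self.expires_at[key] = now + self.ttl
        return True


def should_alert(deduper, feature, now):
    # Hypothetical key format; the post does not specify one.
    return deduper.set_nx(f"cost-alert:{feature}", now)


deduper = AlertDeduper()
sent_first = should_alert(deduper, "chat", now=0)
sent_again = should_alert(deduper, "chat", now=3600)
sent_next_day = should_alert(deduper, "chat", now=86400)
print(sent_first, sent_again, sent_next_day)
```

&lt;p&gt;Because the write and the existence check happen in one step, two gateway workers racing on the same feature would not both send the alert when the same pattern is backed by Redis.&lt;/p&gt;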

&lt;p&gt;Risk alerts:&lt;/p&gt;

&lt;p&gt;When a response is flagged — hallucination, &lt;br&gt;
RAG contradiction, claim density, or &lt;br&gt;
overconfidence — Ajah fires a Slack alert &lt;br&gt;
with the risk level, scores, and exact &lt;br&gt;
reason strings.&lt;/p&gt;

&lt;p&gt;⚠️ Risk Alert — Ajah&lt;br&gt;
Feature: support-bot&lt;br&gt;
Risk Level: high&lt;br&gt;
Hallucination Risk: 0.78&lt;br&gt;
Grounding Score: 0.31&lt;br&gt;
Reasons: Response contradicts source document&lt;/p&gt;

&lt;p&gt;Both use the webhook_url configured per &lt;br&gt;
feature in the Settings page. One URL, &lt;br&gt;
both alert types. Configure in 30 seconds.&lt;/p&gt;

&lt;p&gt;Self-hosted. MIT license.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource #devtools&lt;/p&gt;

</description>
      <category>llm</category>
      <category>monitoring</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>The LLM failure mode nobody is monitoring: overconfident responses in high-stakes domains</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Thu, 04 Jun 2026 14:46:29 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/the-llm-failure-mode-nobody-is-monitoring-overconfident-responses-in-high-stakes-domains-2min</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/the-llm-failure-mode-nobody-is-monitoring-overconfident-responses-in-high-stakes-domains-2min</guid>
      <description>&lt;p&gt;Hallucination detection tools measure &lt;br&gt;
factual drift. RAG verification catches &lt;br&gt;
contradictions. Claim density scoring &lt;br&gt;
flags unverifiable assertions.&lt;/p&gt;

&lt;p&gt;None of them measure this:&lt;/p&gt;

&lt;p&gt;A model that responds to a complex medical, &lt;br&gt;
legal, or financial question with absolute &lt;br&gt;
certainty. No hedging. No caveats. Full &lt;br&gt;
confidence in an answer that may be &lt;br&gt;
dangerously incomplete or wrong.&lt;/p&gt;

&lt;p&gt;This is the failure mode that gets &lt;br&gt;
companies sued.&lt;/p&gt;

&lt;p&gt;Today I shipped linguistic hedge detection &lt;br&gt;
in Ajah — the first LLM observability tool &lt;br&gt;
to score responses for overconfidence &lt;br&gt;
relative to question complexity.&lt;/p&gt;

&lt;p&gt;How it works:&lt;/p&gt;

&lt;p&gt;Every response is evaluated on two dimensions:&lt;/p&gt;

&lt;p&gt;Question complexity — does the prompt &lt;br&gt;
contain conditional language, high-stakes &lt;br&gt;
domain markers (medical, legal, financial, &lt;br&gt;
scientific), or multi-part uncertainty signals?&lt;/p&gt;

&lt;p&gt;Response certainty — does the response use &lt;br&gt;
absolute language ("definitely", "certainly", &lt;br&gt;
"guaranteed", "proven", "without question") &lt;br&gt;
without appropriate hedging ("may", "might", &lt;br&gt;
"it depends", "consult a professional")?&lt;/p&gt;

&lt;p&gt;hedge_risk = certainty_score × complexity_score&lt;/p&gt;

&lt;p&gt;When hedge_risk exceeds the threshold, &lt;br&gt;
Ajah flags the response as &lt;br&gt;
"overconfident_response" in the Warnings &lt;br&gt;
dashboard — with the exact score, the &lt;br&gt;
feature name, and the full response for review.&lt;/p&gt;
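
&lt;p&gt;Here is a minimal sketch of the two-dimension score in Python. It is not Ajah's actual scorer: the marker lists, the normalisation, and the 0.25 threshold are assumptions made up for illustration, and only the example phrases and the formula come from the description above.&lt;/p&gt;

```python
import re

# Hypothetical marker lists; the first two reuse the example phrases above.
ABSOLUTE = ["definitely", "certainly", "guaranteed", "proven", "without question"]
HEDGES = ["may", "might", "it depends", "consult a professional"]
DOMAIN = ["medical", "legal", "financial", "scientific", "dosage", "tax"]
CONDITIONAL = ["if", "unless", "depending", "whether"]

THRESHOLD = 0.25  # assumed value, not Ajah's default


def count_hits(text, phrases):
    """Count whole-word, case-insensitive occurrences of each phrase."""
    lowered = text.lower()
    return sum(len(re.findall(r"\b" + re.escape(p) + r"\b", lowered)) for p in phrases)


def certainty_score(response):
    """Share of certainty markers that are absolute rather than hedged (0..1)."""
    absolute = count_hits(response, ABSOLUTE)
    hedged = count_hits(response, HEDGES)
    if absolute == 0:
        return 0.0
    return absolute / (absolute + hedged)


def complexity_score(prompt):
    """Crude 0..1 score from domain markers, conditionals, and extra questions."""
    extra_questions = max(prompt.count("?") - 1, 0)
    signals = count_hits(prompt, DOMAIN) + count_hits(prompt, CONDITIONAL) + extra_questions
    return min(signals / 3.0, 1.0)


def hedge_risk(prompt, response):
    """hedge_risk = certainty_score x complexity_score."""
    return certainty_score(response) * complexity_score(prompt)


def is_overconfident(prompt, response):
    return hedge_risk(prompt, response) > THRESHOLD
```

&lt;p&gt;Multiplying the two scores means a confident answer to a simple question stays near zero, and so does a hedged answer to a hard one. Only the combination of a high-stakes prompt and an unhedged answer crosses the threshold.&lt;/p&gt;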

&lt;p&gt;Scoring runs asynchronously on every LLM call, &lt;br&gt;
so it adds no latency for your users.&lt;/p&gt;
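
&lt;p&gt;One way to keep scoring off the request path, sketched in Python with asyncio. The function names and the in-memory list are hypothetical stand-ins, not Ajah's code (the gateway itself is not written in Python). The point is only that the response is returned without waiting for the scorer.&lt;/p&gt;

```python
import asyncio

scored = []  # stands in for the warnings store


async def call_provider(prompt):
    """Placeholder for the upstream LLM call."""
    await asyncio.sleep(0)
    return "echo: " + prompt


async def score_response(prompt, response):
    """Placeholder scorer; it runs after the caller already has its answer."""
    await asyncio.sleep(0.01)
    scored.append((prompt, response))


async def handle(prompt, background):
    response = await call_provider(prompt)
    # Schedule scoring without awaiting it. Keeping a reference in the set
    # prevents the task from being garbage-collected before it finishes.
    task = asyncio.create_task(score_response(prompt, response))
    background.add(task)
    task.add_done_callback(background.discard)
    return response


async def main():
    background = set()
    response = await handle("hello", background)
    pending_at_return = len(scored) == 0  # scoring has not finished yet
    await asyncio.gather(*background)     # drain outstanding work before shutdown
    return response, pending_at_return, list(scored)
```

&lt;p&gt;Running asyncio.run(main()) returns the response first and the scoring result only after the background task has been drained.&lt;/p&gt;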

&lt;p&gt;For teams building AI in healthcare, &lt;br&gt;
finance, legal, or government — this is &lt;br&gt;
the signal that tells you when your model &lt;br&gt;
is speaking with authority it hasn't earned.&lt;/p&gt;

&lt;p&gt;MIT license. Self-hosted. &lt;br&gt;
No data leaves your server.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource #devtools&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>monitoring</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Helicone got acquired. Langfuse got acquired. Here's what I built instead.</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Tue, 02 Jun 2026 14:54:03 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/helicone-got-acquired-langfuse-got-acquired-heres-what-i-built-instead-55le</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/helicone-got-acquired-langfuse-got-acquired-heres-what-i-built-instead-55le</guid>
      <description>&lt;p&gt;In the last 6 months:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Helicone was acquired by Mintlify → 
maintenance mode&lt;/li&gt;
&lt;li&gt;Langfuse was acquired by ClickHouse → 
January 2026&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both tools are still usable. But the pattern &lt;br&gt;
is clear: LLM observability tools tend to get &lt;br&gt;
acquired or go cloud-only.&lt;/p&gt;

&lt;p&gt;For teams in regulated industries — healthcare, &lt;br&gt;
finance, government — that's not acceptable. &lt;br&gt;
Your prompts cannot leave your server.&lt;/p&gt;

&lt;p&gt;So I built Ajah.&lt;/p&gt;

&lt;p&gt;One docker-compose up. Everything runs on &lt;br&gt;
your infrastructure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gateway proxy — 9 providers, &amp;lt;2ms overhead&lt;/li&gt;
&lt;li&gt;Cost attribution — per user, per feature, 
per model&lt;/li&gt;
&lt;li&gt;PII masking — before anything hits storage&lt;/li&gt;
&lt;li&gt;Hallucination flagging — async, zero latency&lt;/li&gt;
&lt;li&gt;RAG verification — catches contradictions 
against your source documents&lt;/li&gt;
&lt;li&gt;Claim density scoring — flags responses 
with many specific claims on low-context 
prompts&lt;/li&gt;
&lt;li&gt;Prometheus /metrics — plug into your 
existing Grafana stack&lt;/li&gt;
&lt;li&gt;Multi-agent session tracing — visual step 
tree, per-step cost visibility&lt;/li&gt;
&lt;/ul&gt;
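
&lt;p&gt;To make the PII masking item concrete, here is a rough Python sketch of masking before storage. The two regular expressions and the placeholder tokens are assumptions for illustration. Ajah's real masking rules are not shown in this post, and production masking needs far broader coverage than emails and phone numbers.&lt;/p&gt;

```python
import re

# Deliberately simple, hypothetical patterns.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def mask_pii(text):
    """Replace email addresses and phone-like numbers with placeholder tokens."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


def store(log, prompt):
    """Mask first, then persist, so raw PII never reaches the log."""
    log.append(mask_pii(prompt))
```

&lt;p&gt;Calling mask_pii inside store, rather than at read time, is what keeps the raw values out of storage entirely.&lt;/p&gt;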

&lt;p&gt;No cloud dependency. No vendor lock-in. &lt;br&gt;
No acquisition risk.&lt;/p&gt;

&lt;p&gt;MIT license. Free forever for self-hosted use.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;br&gt;
→ useajah.com&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource #devtools&lt;/p&gt;

</description>
    </item>
    <item>
      <title>I found my own tool making twice the API calls it should. Here's what I fixed.</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Mon, 01 Jun 2026 15:12:02 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/i-found-my-own-tool-making-twice-the-api-calls-it-should-heres-what-i-fixed-193n</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/i-found-my-own-tool-making-twice-the-api-calls-it-should-heres-what-i-fixed-193n</guid>
      <description>&lt;p&gt;Every request through Ajah was silently making &lt;br&gt;
two calls to the scorer.&lt;/p&gt;

&lt;p&gt;Not one. Two.&lt;/p&gt;

&lt;p&gt;I didn't plan it that way. It grew organically — &lt;br&gt;
the flagger needed hallucination scores, so it &lt;br&gt;
called the scorer. Then main.go needed quality scores, &lt;br&gt;
so it called the scorer again. Two separate &lt;br&gt;
functions. Two separate HTTP calls. Same scorer. &lt;br&gt;
Same request. Every single time.&lt;/p&gt;

&lt;p&gt;The scorer was doing double the work and nobody &lt;br&gt;
noticed because the responses were still correct. &lt;br&gt;
Silent waste is the worst kind of bug — it doesn't &lt;br&gt;
break anything, it just costs you.&lt;/p&gt;

&lt;p&gt;Here's what the call structure looked like before:&lt;/p&gt;

&lt;p&gt;Request comes in&lt;br&gt;
→ main.go calls scorer (gets quality score, RAG verdict)&lt;br&gt;
→ flagger.go calls scorer again (gets hallucination score)&lt;br&gt;
→ Two scorer results, mostly overlapping, one thrown away&lt;/p&gt;

&lt;p&gt;And here's what the scorer was returning that we &lt;br&gt;
were completely ignoring:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;flags[] — high_claim_density, toxicity_detected&lt;/li&gt;
&lt;li&gt;claim_density_risk — float, carefully computed&lt;/li&gt;
&lt;li&gt;toxicity_score&lt;/li&gt;
&lt;li&gt;factual_consistency_score&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of it silently discarded. The flagger decoded &lt;br&gt;
exactly two fields and threw the rest away.&lt;/p&gt;

&lt;p&gt;The fix was a proper refactor — not a patch. &lt;br&gt;
Single scorer call. Full result captured. &lt;br&gt;
Everything threaded through to where decisions &lt;br&gt;
are made.&lt;/p&gt;
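
&lt;p&gt;The shape of that refactor, sketched in Python for brevity (the real gateway code lives in main.go and flagger.go and is not shown here). The field names mirror the ones listed above. The types, the stand-in scorer, its return values, and the 0.5 threshold are assumptions.&lt;/p&gt;

```python
from dataclasses import dataclass, field


@dataclass
class ScoreResult:
    # Field names follow the scorer output described above; types are assumed.
    quality_score: float = 0.0
    hallucination_score: float = 0.0
    claim_density_risk: float = 0.0
    toxicity_score: float = 0.0
    factual_consistency_score: float = 0.0
    flags: list = field(default_factory=list)


class Scorer:
    """Stand-in for the HTTP scorer; counts how often it is called."""

    def __init__(self):
        self.calls = 0

    def score(self, prompt, response):
        self.calls += 1
        return ScoreResult(
            quality_score=0.8,
            hallucination_score=0.6,
            claim_density_risk=1.0,
            flags=["high_claim_density"],
        )


def build_warnings(result, threshold=0.5):
    """Turn one full scorer result into human-readable warnings."""
    warnings = []
    if "high_claim_density" in result.flags:
        warnings.append(f"High claim density detected (risk: {result.claim_density_risk:.2f})")
    if result.hallucination_score > threshold:
        warnings.append(f"High hallucination signal detected (score: {result.hallucination_score:.2f})")
    return warnings


def handle_request(scorer, prompt, response):
    result = scorer.score(prompt, response)  # the only scorer call per request
    return result.quality_score, build_warnings(result)
```

&lt;p&gt;Because the whole ScoreResult is passed along, the flagger no longer needs its own call, and fields such as flags and claim_density_risk are available wherever warnings are built.&lt;/p&gt;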

&lt;p&gt;After the fix, warnings went from this:&lt;/p&gt;

&lt;p&gt;"High hallucination signal detected (score: 0.60)"&lt;/p&gt;

&lt;p&gt;To this:&lt;/p&gt;

&lt;p&gt;"High claim density detected — response contains &lt;br&gt;
many specific claims on low-context prompt (risk: 1.00)"&lt;br&gt;
"High hallucination signal detected (score: 0.60)"&lt;/p&gt;

&lt;p&gt;One tells you a number. &lt;br&gt;
The other tells you what to do.&lt;/p&gt;

&lt;p&gt;That's the difference between logging and signal.&lt;/p&gt;

&lt;p&gt;If you're building anything that sits between &lt;br&gt;
an app and an LLM — check your call patterns. &lt;br&gt;
Silent duplication is easy to miss and expensive &lt;br&gt;
at scale.&lt;/p&gt;

&lt;p&gt;Ajah is open source, self-hostable, MIT license.&lt;/p&gt;

&lt;p&gt;→ github.com/VigneshReddy-afk/ajah&lt;/p&gt;

&lt;p&gt;#buildinpublic #llm #opensource #devtools&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>go</category>
      <category>performance</category>
    </item>
    <item>
      <title>I got tired of LLM observability tools getting acquired. So I built one that can't be.</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Sun, 31 May 2026 06:01:51 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/i-got-tired-of-llm-observability-tools-getting-acquired-so-i-built-one-that-cant-be-4gc8</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/i-got-tired-of-llm-observability-tools-getting-acquired-so-i-built-one-that-cant-be-4gc8</guid>
      <description>&lt;p&gt;Helicone got acquired. Langfuse got acquired.&lt;br&gt;
Two of the most trusted tools in the LLM &lt;br&gt;
observability space, gone within months of &lt;br&gt;
each other.&lt;/p&gt;

&lt;p&gt;I don't say this to criticize the founders.&lt;br&gt;
Building and selling is legitimate.&lt;/p&gt;

&lt;p&gt;But for engineering teams running AI in &lt;br&gt;
production — especially in healthcare, finance, &lt;br&gt;
and government where data cannot leave your &lt;br&gt;
servers — every acquisition is a crisis.&lt;/p&gt;

&lt;p&gt;So I stopped waiting for the next one.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Why I built Ajah after Helicone went into maintenance mode</title>
      <dc:creator>Vignesh Reddy</dc:creator>
      <pubDate>Sat, 30 May 2026 08:30:51 +0000</pubDate>
      <link>https://dev.to/vignesh_reddy_53e403f62d2/why-i-built-ajah-after-helicone-went-into-maintenance-mode-120d</link>
      <guid>https://dev.to/vignesh_reddy_53e403f62d2/why-i-built-ajah-after-helicone-went-into-maintenance-mode-120d</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;In March 2026, Helicone — one of the most popular &lt;br&gt;
LLM observability tools — was acquired by Mintlify &lt;br&gt;
and went into maintenance mode. Thousands of &lt;br&gt;
developers were left looking for an alternative.&lt;/p&gt;

&lt;p&gt;But the deeper problem wasn't just Helicone. &lt;br&gt;
Every LLM observability tool available today has &lt;br&gt;
one of these problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloud-locked (your prompts leave your server)&lt;/li&gt;
&lt;li&gt;Acquired and abandoned&lt;/li&gt;
&lt;li&gt;Only does one thing (cost OR observability OR evals)&lt;/li&gt;
&lt;li&gt;Requires sending sensitive data to third parties&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For enterprises in healthcare, finance, and &lt;br&gt;
government — none of these tools work. They &lt;br&gt;
legally cannot send prompts to external servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;Ajah is a self-hostable LLM gateway that sits &lt;br&gt;
between your application and any LLM provider.&lt;/p&gt;

&lt;p&gt;It does 5 things in one tool:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Gateway Proxy&lt;/strong&gt;&lt;br&gt;
Point your app at Ajah instead of OpenAI directly. &lt;br&gt;
It is a one-line change. Ajah supports 9 providers, &lt;br&gt;
detected automatically from your API key prefix.&lt;/p&gt;
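
&lt;p&gt;Prefix-based detection can be sketched in a few lines of Python. The table below is a hypothetical three-entry example built from commonly seen key prefixes. It is not Ajah's actual mapping, which covers 9 providers and is not listed in this post.&lt;/p&gt;

```python
# Hypothetical prefix table. Longer prefixes come first so that
# "sk-ant-" is tested before the more general "sk-".
PREFIXES = [
    ("sk-ant-", "anthropic"),
    ("gsk_", "groq"),
    ("sk-", "openai"),
]


def detect_provider(api_key):
    """Return the provider name for a key, or None if no prefix matches."""
    for prefix, provider in PREFIXES:
        if api_key.startswith(prefix):
            return provider
    return None
```

&lt;p&gt;Ordering matters here: with "sk-" listed first, every "sk-ant-" key would be routed to the wrong provider.&lt;/p&gt;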

&lt;p&gt;&lt;strong&gt;2. RAG Verification&lt;/strong&gt;&lt;br&gt;
When your app uses retrieval-augmented generation, &lt;br&gt;
Ajah verifies whether the LLM response is actually &lt;br&gt;
grounded in your source documents. Contradictions &lt;br&gt;
are flagged before they reach users.&lt;/p&gt;
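
&lt;p&gt;As a rough illustration of what "grounded" can mean, here is a naive lexical check in Python. It is only a token-overlap stand-in under assumed settings (words longer than three characters, a 0.5 threshold). It is not Ajah's verifier, and overlap alone cannot detect a contradiction that reuses the source's wording.&lt;/p&gt;

```python
import re


def tokens(text):
    """Lowercased word tokens longer than three characters."""
    return {w for w in re.findall(r"[a-z0-9]+", text.lower()) if len(w) > 3}


def grounding_score(sentence, documents):
    """Best fraction of a sentence's tokens found in any single source document."""
    words = tokens(sentence)
    if not words:
        return 1.0
    return max(
        (len(words.intersection(tokens(d))) / len(words) for d in documents),
        default=0.0,
    )


def ungrounded_sentences(response, documents, threshold=0.5):
    """Return response sentences whose support falls below the threshold."""
    sentences = [s.strip() for s in re.split(r"[.!?]+", response) if s.strip()]
    return [s for s in sentences if threshold > grounding_score(s, documents)]
```

&lt;p&gt;Scoring sentence by sentence keeps one unsupported claim from being averaged away by an otherwise well-grounded answer.&lt;/p&gt;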

&lt;p&gt;&lt;strong&gt;3. Hallucination Flagging&lt;/strong&gt;&lt;br&gt;
Every response is scored for hallucination risk &lt;br&gt;
in parallel — zero latency added. Uses local ML &lt;br&gt;
models, no external API calls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Multi-Agent Session Tracing&lt;/strong&gt;&lt;br&gt;
Visual step-by-step trace of every agent run. &lt;br&gt;
Cost, quality, and&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>privacy</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
