<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Roshan Mayengbam</title>
    <description>The latest articles on DEV Community by Roshan Mayengbam (@roshan_mayengbam_3597c388).</description>
    <link>https://dev.to/roshan_mayengbam_3597c388</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3967771%2F1bb44e44-3c44-4116-a05d-265aa6db0ad8.jpg</url>
      <title>DEV Community: Roshan Mayengbam</title>
      <link>https://dev.to/roshan_mayengbam_3597c388</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/roshan_mayengbam_3597c388"/>
    <language>en</language>
    <item>
      <title>I built a self-hosted AI cost tracker after our team's bill exploded</title>
      <dc:creator>Roshan Mayengbam</dc:creator>
      <pubDate>Thu, 04 Jun 2026 07:29:40 +0000</pubDate>
      <link>https://dev.to/roshan_mayengbam_3597c388/i-built-a-self-hosted-ai-cost-tracker-after-our-teams-bill-exploded-5a25</link>
      <guid>https://dev.to/roshan_mayengbam_3597c388/i-built-a-self-hosted-ai-cost-tracker-after-our-teams-bill-exploded-5a25</guid>
      <description>&lt;p&gt;Last month something embarrassing happened.&lt;/p&gt;

&lt;p&gt;Our team's OpenAI bill doubled.&lt;/p&gt;

&lt;p&gt;Nobody knew why. Nobody knew who caused it.&lt;br&gt;
We just got one giant number at the end &lt;br&gt;
of the month with zero breakdown.&lt;/p&gt;

&lt;p&gt;I spent a week digging through logs manually &lt;br&gt;
trying to figure out which engineer or &lt;br&gt;
workflow was responsible.&lt;/p&gt;

&lt;p&gt;That week I decided to build something.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Every company using OpenAI, Claude, or &lt;br&gt;
Gemini has the same issue.&lt;/p&gt;

&lt;p&gt;You get one bill. One number. Nothing else.&lt;/p&gt;

&lt;p&gt;No breakdown by engineer.&lt;br&gt;
No breakdown by team.&lt;br&gt;
No breakdown by feature.&lt;/p&gt;

&lt;p&gt;You cannot set limits per person.&lt;br&gt;
You cannot get warned before it explodes.&lt;br&gt;
You cannot automatically stop runaway spend.&lt;/p&gt;

&lt;p&gt;Microsoft felt this at scale when they &lt;br&gt;
cancelled Claude Code licenses company wide.&lt;br&gt;
Uber felt this when they burned their entire &lt;br&gt;
2026 AI budget in 4 months.&lt;/p&gt;

&lt;p&gt;But this is not just a big company problem.&lt;br&gt;
Any team with 5+ engineers using AI APIs &lt;br&gt;
faces this every month.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Built
&lt;/h2&gt;

&lt;p&gt;TokenGuard is a self-hosted proxy that &lt;br&gt;
sits between your engineers and &lt;br&gt;
OpenAI/Claude/Gemini.&lt;/p&gt;

&lt;p&gt;Engineers change 2 lines of code:&lt;/p&gt;

&lt;p&gt;Before:&lt;br&gt;
api_key = "sk-openai-real-key"&lt;br&gt;
base_url = "&lt;a href="https://api.openai.com" rel="noopener noreferrer"&gt;https://api.openai.com&lt;/a&gt;"&lt;/p&gt;

&lt;p&gt;After:&lt;br&gt;
api_key = "tg_live_yourkey"&lt;br&gt;
base_url = "&lt;a href="http://tokenguard.yourserver.com/proxy/openai" rel="noopener noreferrer"&gt;http://tokenguard.yourserver.com/proxy/openai&lt;/a&gt;"&lt;/p&gt;

&lt;p&gt;That is the entire integration.&lt;br&gt;
Nothing else changes for engineers.&lt;/p&gt;




&lt;h2&gt;
  
  
  What You Get
&lt;/h2&gt;

&lt;p&gt;Real-time dashboard showing:&lt;br&gt;
→ Exactly who spent what this month&lt;br&gt;
→ Which team is approaching their limit&lt;br&gt;
→ Which AI model is costing the most&lt;br&gt;
→ Automatic alerts at 80% budget&lt;br&gt;
→ Auto-block or reroute when limit hit&lt;/p&gt;

&lt;p&gt;The smart routing part is my favourite feature.&lt;/p&gt;

&lt;p&gt;Instead of hard blocking an engineer &lt;br&gt;
when they hit their limit — TokenGuard &lt;br&gt;
silently switches their request to a &lt;br&gt;
cheaper model automatically.&lt;/p&gt;

&lt;p&gt;Engineer keeps working.&lt;br&gt;
Company stops overspending.&lt;br&gt;
Nobody notices anything changed.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Self-Hosted Matters
&lt;/h2&gt;

&lt;p&gt;Every competitor routes your prompts &lt;br&gt;
through their servers.&lt;/p&gt;

&lt;p&gt;That means your code, your data, your &lt;br&gt;
business logic goes through a third &lt;br&gt;
party system.&lt;/p&gt;

&lt;p&gt;No enterprise company will accept that.&lt;/p&gt;

&lt;p&gt;TokenGuard runs entirely on your own &lt;br&gt;
server. One Docker command to deploy.&lt;br&gt;
Your data never leaves your network.&lt;/p&gt;




&lt;h2&gt;
  
  
  Current Status
&lt;/h2&gt;

&lt;p&gt;Core proxy is working and tested &lt;br&gt;
with real API calls.&lt;/p&gt;

&lt;p&gt;Dashboard is complete with:&lt;br&gt;
→ Usage tracking per employee&lt;br&gt;
→ Team budget management&lt;br&gt;
→ Smart routing rules&lt;br&gt;
→ Alerts console&lt;br&gt;
→ Reports and CSV export&lt;/p&gt;

&lt;p&gt;Still finishing some parts but &lt;br&gt;
the core works.&lt;/p&gt;




&lt;h2&gt;
  
  
  Looking For
&lt;/h2&gt;

&lt;p&gt;I am looking for 3-5 engineering teams &lt;br&gt;
who want to beta test this for free &lt;br&gt;
in exchange for honest feedback.&lt;/p&gt;

&lt;p&gt;If your team uses OpenAI, Claude, or &lt;br&gt;
Gemini APIs and gets surprised by &lt;br&gt;
monthly bills — I would love to talk.&lt;/p&gt;

&lt;p&gt;Not a sales pitch.&lt;br&gt;
Just want real feedback from teams &lt;br&gt;
dealing with this problem.&lt;/p&gt;

&lt;p&gt;Drop a comment or email me:&lt;br&gt;
[your email here]&lt;/p&gt;




&lt;p&gt;Built solo over 2 months.&lt;br&gt;
Still learning. Still building.&lt;br&gt;
Honest feedback welcome.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>monitoring</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
