adamday75

Posted on Mar 28

Built a Caching Proxy for OpenAI — Saved 40% on API Bills

#beginners #node #openai #saas

A maintenance manager's first SaaS. Technical deep-dive + lessons learned.

Hey dev.to! 👋

I'm not a career developer. I supervise industrial mechanics and run a maintenance department. But we needed AI for our CMMS (Computerized Maintenance Management System), and the OpenAI API costs were getting crazy.

So I built a caching proxy. Here's how it works, what I learned, and the actual code.

───

The Problem

We're using AI for:

• Auto-generating work orders
• Predictive maintenance alerts
• Vendor communications
• Training docs

Issue: Same prompts, repeated constantly, paying every time.

User: "Generate work order for HVAC maintenance"
→ Pay $0.002

User: "Generate work order for HVAC maintenance" (same prompt)
→ Pay $0.002 again

User: "Generate work order for HVAC maintenance" (same prompt, 3rd time)
→ Pay $0.002 AGAIN

This adds up FAST at scale.

───

The Solution: Caching Proxy

Intercept OpenAI requests, hash the prompt, cache the response.

Architecture:

Your App → AI Optimizer Proxy → OpenAI API
                ↓
           SQLite Cache
                ↓
        (hash → response)

Flow:

1. App sends a request to the proxy.
2. Proxy hashes it: sha256(prompt + model + params).
3. Proxy checks the cache:
   • Hit: return the cached response (free)
   • Miss: forward to OpenAI, cache the response, return it
4. Dashboard tracks hits, misses, and savings.
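
The hashing step is where a cache like this can silently miss: two request bodies that differ only in key order produce different JSON strings, and so different hashes. Here is a minimal sketch of what a hashPrompt helper could look like, assuming the whole request body (prompt, model, and params) is hashed. The canonicalize helper is a made-up name, and this is not necessarily how the real app does it.

```javascript
const crypto = require('crypto');

// Recursively sort object keys so {a:1,b:2} and {b:2,a:1} serialize identically.
function canonicalize(value) {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === 'object') {
    return Object.keys(value).sort().reduce((out, key) => {
      out[key] = canonicalize(value[key]);
      return out;
    }, {});
  }
  return value;
}

// Hypothetical hashPrompt: SHA-256 hex digest of the canonical request body.
function hashPrompt(body) {
  const canonical = JSON.stringify(canonicalize(body));
  return crypto.createHash('sha256').update(canonical).digest('hex');
}
```

Array order is left alone on purpose, because the order of messages changes the meaning of a chat request.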

───

The Code (Simplified)

Cache lookup:

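// Look up a cached row by request hash (undefined if there is none).
// Note: row.response is still a JSON string at this point.
async function getCacheKey(hash) {
  const db = await getDb();
  return db.get('SELECT * FROM cache WHERE hash = ?', [hash]);
}

// Store a response under its hash. ttl is in seconds (default: 24 hours);
// expires_at is stored in milliseconds to match Date.now().
async function setCacheKey(hash, response, ttl = 86400) {
  const db = await getDb();
  await db.run(
    'INSERT OR REPLACE INTO cache (hash, response, expires_at) VALUES (?, ?, ?)',
    [hash, JSON.stringify(response), Date.now() + ttl * 1000]
  );
}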
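
For context, here is a sketch of a table these two functions could run against. The column names come from the INSERT above, but the types, the initCache and purgeExpired helpers, and the db.exec call (available in the promise-based sqlite wrapper) are assumptions. INSERT OR REPLACE only replaces a row when hash has a uniqueness constraint, which is why it is the primary key here.

```javascript
// Assumed schema: column names match the INSERT in the post; the types are a guess.
const CACHE_SCHEMA = `
  CREATE TABLE IF NOT EXISTS cache (
    hash       TEXT PRIMARY KEY,
    response   TEXT NOT NULL,
    expires_at INTEGER NOT NULL
  )`;

// Create the table if it does not exist yet.
async function initCache(db) {
  await db.exec(CACHE_SCHEMA);
}

// Delete rows whose TTL has passed so the database file does not grow forever.
// `now` is in milliseconds, like the expires_at values written by setCacheKey.
async function purgeExpired(db, now = Date.now()) {
  await db.run('DELETE FROM cache WHERE expires_at <= ?', [now]);
}
```

Running purgeExpired on a timer (say, hourly) would be one way to keep expired rows from piling up.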

Request Handler:

app.post('/v1/chat/completions', async (req, res) => {
  const hash = hashPrompt(req.body);
  const cached = await getCacheKey(hash);

  if (cached && cached.expires_at > Date.now()) {
    // Cache HIT: the response column holds a JSON string, so parse it first
    analytics.recordHit(true);
    return res.json(JSON.parse(cached.response));
  }

  // Cache MISS: call OpenAI
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(req.body)
  });

  const data = await response.json();

  // Pass errors (rate limits, bad requests) through without caching them
  if (!response.ok) {
    return res.status(response.status).json(data);
  }

  await setCacheKey(hash, data);
  analytics.recordHit(false);

  res.json(data);
});
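
One thing the simplified handler skips is deciding which requests are safe to cache at all. With stream: true, OpenAI sends back a stream of server-sent events instead of one JSON body, so response.json() would not work on it. Here is a small sketch of a guard for that case. The isCacheable name and its rules are assumptions, not part of the real app.

```javascript
// Hypothetical guard: decide whether a chat request should go through the cache.
function isCacheable(body) {
  // Malformed bodies go straight upstream so OpenAI can reject them.
  if (!body || !Array.isArray(body.messages)) return false;
  // Streaming responses arrive as server-sent events, not a single JSON object.
  if (body.stream === true) return false;
  return true;
}
```

A handler could check isCacheable(req.body) first and forward non-cacheable requests to OpenAI without touching the cache.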

───

The Results

My workload (1 week):

• Total requests: 1,847
• Cache hits: 1,385
• Cache misses: 462
• Cache hit rate: 75%
• Cost savings: ~40%

Real money saved: If you're spending $100/month, that ~40% works out to about $40 back. How much you save depends on how often your prompts repeat.
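
The hit rate and the cost savings are different numbers because the API bills per token, not per request. If the requests that hit the cache are cheaper on average than the ones that miss, the savings percentage comes out lower than the hit rate. Here is a sketch of both calculations. The function names and the per-request cost field are assumptions for illustration.

```javascript
// Fraction of requests served from cache.
function hitRate(hits, misses) {
  const total = hits + misses;
  return total === 0 ? 0 : hits / total;
}

// Cost-weighted savings: share of the would-be spend that was served from cache.
// `requests` is a hypothetical log: [{ cost: number, cached: boolean }, ...]
function savingsRate(requests) {
  let saved = 0;
  let total = 0;
  for (const { cost, cached } of requests) {
    total += cost;
    if (cached) saved += cost;
  }
  return total === 0 ? 0 : saved / total;
}
```

With the counts above, hitRate(1385, 462) rounds to 75%. A log where the cached requests are the cheap ones would show a savingsRate below that.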

───

Technical Challenges

Device Fingerprinting

For license validation, I needed to identify devices without login:

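const crypto = require('crypto');
const os = require('os');
const { machineIdSync } = require('node-machine-id');

const deviceId = machineIdSync();
const fingerprint = crypto.createHash('sha256').update(deviceId + os.hostname()).digest('hex');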

Stripe Webhooks

My first payment webhook failed because I handled customer.subscription.created and customer.subscription.updated the same way. Now I route by event.type:

app.post('/stripe', (req, res) => {
  const event = req.body;

  switch (event.type) {
    case 'customer.subscription.created':
      createLicense(event.data.object);
      break;
    case 'customer.subscription.deleted':
      revokeLicense(event.data.object);
      break;
    // ... 4 more event types
  }

  // Acknowledge receipt so Stripe doesn't keep retrying the event
  res.json({ received: true });
});
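
A webhook endpoint is a public URL, so it is worth checking that an event really came from Stripe before creating or revoking a license. Stripe signs each event: the Stripe-Signature header carries a timestamp (t=) and one or more HMAC-SHA256 signatures (v1=) computed over "timestamp.rawBody" with the endpoint's signing secret. Stripe's own Node library does this check for you with stripe.webhooks.constructEvent. The sketch below shows the same idea using only Node's crypto module. The verifyStripeSignature name is made up, and the 300-second tolerance is an assumption.

```javascript
const crypto = require('crypto');

// Hypothetical helper: returns true if the header contains a valid v1 signature.
// rawBody must be the unparsed request body (e.g. captured with express.raw()),
// because the signature covers the exact bytes Stripe sent.
function verifyStripeSignature(rawBody, header, secret, toleranceSec = 300,
                               nowSec = Math.floor(Date.now() / 1000)) {
  // Header looks like "t=<timestamp>,v1=<hex signature>[,v1=...]"
  const parts = String(header || '').split(',').map((p) => p.trim().split('='));
  const timestamp = Number((parts.find(([k]) => k === 't') || [])[1]);
  const signatures = parts.filter(([k]) => k === 'v1').map(([, v]) => v || '');
  if (!Number.isFinite(timestamp) || signatures.length === 0) return false;

  // Reject stale events to limit replay attacks.
  if (Math.abs(nowSec - timestamp) > toleranceSec) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(`${timestamp}.${rawBody}`)
    .digest();

  // Constant-time comparison against every v1 signature in the header.
  return signatures.some((sig) => {
    const candidate = Buffer.from(sig, 'hex');
    return candidate.length === expected.length &&
      crypto.timingSafeEqual(candidate, expected);
  });
}
```

In an Express app this would run at the top of the webhook route, returning a 400 when the check fails.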

Email Delivery

Gmail OAuth was a pain. Had to:

• Create Google Cloud project
• Enable Gmail API
• Get OAuth credentials
• Handle refresh tokens
• Deploy secrets to Fly.io

First emails failed with unauthorized_client. Turned out my refresh token was from a different OAuth client. Started fresh, worked immediately.

───

The Stack

| Component   | Tech              |
| ----------- | ----------------- |
| Backend     | Node.js + Express |
| Desktop App | Electron          |
| Database    | SQLite            |
| Hosting     | Fly.io            |
| Payments    | Stripe            |
| Email       | Gmail API         |
| Builds      | electron-builder  |

Total build time: ~1 month (nights/weekends)

───

Lessons Learned

• Schema matters: Added device_limit column after deploying. Had to recreate licenses. Check your schema BEFORE launch.
• Fresh Stripe signup > DB hacking: When my license broke, creating a new test subscription was faster than debugging the DB.
• Ship before perfect: My first build had no stats dashboard. Shipped anyway. Added it later.
• Non-devs can build SaaS: I learned enough to ship. You can too.

───

Try It...

Free 14-day trial: https://ai-optimizer-landing.vercel.app

GitHub (open source): https://github.com/adamday75/ai-optimizer-app

Drop-in replacement. Change one env var. See your savings.
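
To show what "change one env var" can look like on the client side, here is a minimal sketch that reads the API base URL from the environment and falls back to OpenAI directly. The OPENAI_BASE_URL variable name, the chatCompletionsUrl and chat helpers, and the localhost URL in the usage note are assumptions. The post does not say which variable the real app reads.

```javascript
// Hypothetical client-side config: where to send chat completion requests.
const DEFAULT_BASE_URL = 'https://api.openai.com/v1';

function chatCompletionsUrl(env = process.env) {
  // Strip trailing slashes so "http://host/v1/" and "http://host/v1" both work.
  const base = (env.OPENAI_BASE_URL || DEFAULT_BASE_URL).replace(/\/+$/, '');
  return `${base}/chat/completions`;
}

// Send a chat request to whichever endpoint the environment points at.
// Uses the global fetch available in Node 18+.
async function chat(messages, model, env = process.env) {
  const response = await fetch(chatCompletionsUrl(env), {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${env.OPENAI_API_KEY}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({ model, messages })
  });
  if (!response.ok) throw new Error(`Request failed: ${response.status}`);
  return response.json();
}
```

With OPENAI_BASE_URL=http://localhost:3000/v1 set, the same code would talk to a locally running proxy. Unset it and requests go straight to OpenAI.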

───

Questions?

I'm happy to answer anything about:

• The caching strategy
• License system
• Stripe integration
• Electron builds
• Learning Node.js as a non-dev

Drop a comment! 👇

───