The Problem
If you're building on the OpenAI or Claude APIs, you're
probably overpaying, by my estimate 60-80% of your API bill.
Here's why:
Most AI apps call GPT-4 for every single request —
even when they already have the answer cached from
a previous call. Same question, 100 different users,
100 full-price API calls.
I got tired of seeing this problem everywhere,
so I built VibeCore to fix it automatically.
What is VibeCore?
VibeCore is a middleware layer that sits between
your app and any AI API. It automatically:
- Caches repeated prompts (zero cost on duplicates)
- Understands similar prompts (semantic caching)
- Routes simple queries to free models
- Tracks your savings on every request
How it works
Layer 1 — Exact Cache
When the same prompt is asked again, VibeCore
returns the cached response instantly.
Cost: Rs.0
Speed: ~5ms
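Here's roughly what an exact-match layer looks like. This is a minimal sketch, not VibeCore's actual code: the ExactCache class, the generate helper, the lowercase/whitespace normalization, and the in-memory dict (standing in for Redis) are all assumptions made for illustration.

```python
import hashlib
from typing import Callable, Optional


class ExactCache:
    """Hypothetical exact-match cache; a plain dict stands in for Redis."""

    def __init__(self) -> None:
        self._store: dict[str, str] = {}

    @staticmethod
    def _key(prompt: str) -> str:
        # Assumption: lowercase and collapse whitespace before hashing,
        # so trivially different spellings of a prompt share one key.
        normalized = " ".join(prompt.lower().split())
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def get(self, prompt: str) -> Optional[str]:
        return self._store.get(self._key(prompt))

    def put(self, prompt: str, response: str) -> None:
        self._store[self._key(prompt)] = response


def generate(
    prompt: str, cache: ExactCache, call_model: Callable[[str], str]
) -> tuple[str, bool]:
    """Return (response, cached); call_model runs only on a cache miss."""
    hit = cache.get(prompt)
    if hit is not None:
        return hit, True
    response = call_model(prompt)
    cache.put(prompt, response)
    return response, False
```

Hashing the prompt keeps keys a fixed size no matter how long the prompt is. In a Redis-backed version you would likely also set an expiry on each entry so stale answers age out.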
Layer 2 — Semantic Cache
When a similar prompt is asked (e.g. "capital of
France?" vs "What is France's capital?"), VibeCore
finds the closest cached response using embeddings.
Cost: Rs.0
Speed: ~30ms
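The semantic lookup can be sketched as a nearest-neighbour search over prompt vectors. To keep the example self-contained, a toy bag-of-words vector stands in for a real embedding model such as Sentence Transformers. The SemanticCache class, the toy_embed function, and the 0.5 threshold are assumptions for illustration, not VibeCore's implementation.

```python
import math
import re
from collections import Counter
from typing import Optional


def toy_embed(text: str) -> Counter:
    # Stand-in for a real embedding model: a bag-of-words count vector.
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class SemanticCache:
    """Hypothetical nearest-neighbour cache over prompt vectors."""

    def __init__(self, threshold: float = 0.5) -> None:
        # Assumption: 0.5 only suits the toy vectors used here;
        # a real embedding model needs its own tuned cutoff.
        self.threshold = threshold
        self._entries: list[tuple[Counter, str]] = []

    def put(self, prompt: str, response: str) -> None:
        self._entries.append((toy_embed(prompt), response))

    def get(self, prompt: str) -> Optional[str]:
        query = toy_embed(prompt)
        best_score, best_response = 0.0, None
        for vector, response in self._entries:
            score = cosine(query, vector)
            if score > best_score:
                best_score, best_response = score, response
        # Only reuse a cached answer when the closest match is close enough.
        return best_response if best_score >= self.threshold else None
```

With "capital of France?" cached, a lookup for "What is France's capital?" clears the threshold, while an unrelated prompt does not. The toy vectors would also match "capital of Germany?", which is why the choice of embedding model and a conservative threshold matter in practice.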
Layer 3 — Smart Routing
Simple prompts (under 20 words, no complex keywords)
are routed to free hosted models, such as Llama served through the Groq API.
Cost: Rs.0
Speed: ~500ms
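The routing rule above can be sketched as a small heuristic. The 20-word limit comes from the description. The keyword list, the pick_model name, and the "groq" / "paid" labels are hypothetical placeholders, since the post doesn't list the actual keywords.

```python
# Hypothetical keyword list; the real "complex keywords" are not given.
COMPLEX_KEYWORDS = {"analyze", "compare", "debug", "prove", "refactor", "summarize"}
MAX_SIMPLE_WORDS = 20  # "under 20 words"


def pick_model(prompt: str) -> str:
    """Return "groq" for simple prompts and "paid" for everything else."""
    words = prompt.lower().split()
    # Strip surrounding punctuation so "compare," still matches "compare".
    stripped = {w.strip(".,!?;:\"'()") for w in words}
    is_short = len(words) < MAX_SIMPLE_WORDS
    has_complex_keyword = not COMPLEX_KEYWORDS.isdisjoint(stripped)
    return "groq" if is_short and not has_complex_keyword else "paid"
```

A heuristic like this is cheap to run but blunt: a short prompt can still be hard, so it's worth logging which route each request took and checking the answers.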
Integration
Install the npm package:
npm install @aadi0001/vibecore
Use it in your app:
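const VibeCore = require('@aadi0001/vibecore')
const vc = new VibeCore('YOUR_API_KEY')

// CommonJS has no top-level await, so wrap the call in an async function
async function main() {
  const result = await vc.generate('What is photosynthesis?')
  console.log(result.response)
  console.log('Saved: Rs.' + result.saved)
  console.log('Source:', result.source)
}

main().catch(console.error)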
For Python:
import requests

response = requests.post(
    'https://vibecore-07n6.onrender.com/generate',
    json={'prompt': 'What is photosynthesis?'},
    headers={'x-api-key': 'YOUR_API_KEY'},
    timeout=30,
)
response.raise_for_status()  # fail loudly on HTTP errors
data = response.json()  # parse the body once
print(data['response'])
print('Saved:', data['saved'])
Response format
Every response includes cost data:
{
  "response": "Photosynthesis is...",
  "cached": false,
  "source": "groq",
  "saved": 0.012,
  "total_saved": 0.024
}
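If you're consuming this from Python, it can help to parse the body into a typed object so a missing field fails early. The GenerateResult class and parse_result function are hypothetical names for this sketch. The field names come from the example response, and reading saved and total_saved as rupee amounts is an assumption based on the "Rs." used elsewhere in this post.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class GenerateResult:
    response: str
    cached: bool
    source: str
    saved: float        # assumed: rupees saved on this request
    total_saved: float  # assumed: running total for this API key


def parse_result(body: str) -> GenerateResult:
    """Parse a JSON response body; raises KeyError if a field is missing."""
    data = json.loads(body)
    return GenerateResult(
        response=str(data["response"]),
        cached=bool(data["cached"]),
        source=str(data["source"]),
        saved=float(data["saved"]),
        total_saved=float(data["total_saved"]),
    )
```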
Real results
In testing with 10 requests:
- 6 cache hits (60% cache rate)
- 4 groq calls (free model)
- 0 paid API calls
- Total saved: Rs.0.08
Extrapolating linearly from that test (Rs.0.08 per 10 requests) to 10,000 requests/day:
- Estimated savings: about Rs.80/day
- Monthly savings: about Rs.2,400 (30 days)
The dashboard
Every user gets a personal dashboard showing:
- Total requests made
- Total money saved
- Cache hit rate
- Live request log
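The numbers on the dashboard are simple running aggregates over the request log. A minimal sketch of how they could be tracked is below. SavingsTracker and its field names are hypothetical, not the dashboard's actual backend.

```python
from dataclasses import dataclass, field


@dataclass
class SavingsTracker:
    """Hypothetical per-user counters behind the dashboard metrics."""

    total_requests: int = 0
    cache_hits: int = 0
    total_saved: float = 0.0
    log: list = field(default_factory=list)

    def record(self, prompt: str, cached: bool, source: str, saved: float) -> None:
        self.total_requests += 1
        self.cache_hits += int(cached)
        self.total_saved += saved
        # Keep one entry per request for the live request log.
        self.log.append(
            {"prompt": prompt, "cached": cached, "source": source, "saved": saved}
        )

    @property
    def cache_hit_rate(self) -> float:
        # Avoid dividing by zero before the first request arrives.
        return self.cache_hits / self.total_requests if self.total_requests else 0.0
```

Calling record once per request keeps all four dashboard values up to date without rescanning the log.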
Get started free
Get your free API key (1000 requests, no credit card):
https://vibecore-07n6.onrender.com
Install:
npm install @aadi0001/vibecore
Replace your AI calls and savings start immediately.
Tech stack
- FastAPI (Python backend)
- Redis (caching)
- Groq API (free AI model)
- Sentence Transformers (semantic similarity)
- Node.js SDK (npm package)
- Render (deployment)
Built this in 48 hours. Would love your feedback
in the comments!
What other AI cost optimizations have you tried?