Why I Built This
Last month, we got our OpenAI bill: $3,127 for a single week.
We were bleeding money on AI API calls. We had no visibility into spending, no caching, and we were using GPT-4 for everything, even simple queries that could run on GPT-3.5 (which is roughly 60x cheaper).
After a weekend of frustrated coding, I built the AI API Cost Optimizer: a Python tool that:
- Intelligently caches responses to avoid duplicate calls
- Routes queries to the cheapest appropriate model
- Tracks spending in real-time with alerts
- Works with any AI provider (OpenAI, Anthropic, Google, Cohere, Mistral)
Result: 70% cost reduction ($8,660/month saved = $103,920/year)
Today, I'm open-sourcing it. If you're paying for AI APIs, this tool can save you serious money.
What It Does
1. Smart Caching (40-60% Savings)
Stores API responses in SQLite. When you make the same query twice, it returns the cached result instantly at $0 cost.
Example:
First call: "What is Python?" → API call → $0.02
Second call: "What is Python?" → Cache hit → $0.00
With a 52% cache hit rate, about half your API calls are free.
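The repo's cache implementation isn't reproduced here, but the core idea is simple enough to sketch. The snippet below is a minimal, hypothetical version: it keys a SQLite table on a hash of the prompt and model, stores the response with a timestamp, and ignores entries older than the TTL. Names like SimpleCache are illustrative, not the tool's actual API.

import hashlib
import sqlite3
import time

class SimpleCache:
    """Minimal response cache keyed on (prompt, model). Illustrative sketch only."""

    def __init__(self, path="cache.db", ttl_hours=168):
        self.ttl = ttl_hours * 3600
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS cache (key TEXT PRIMARY KEY, response TEXT, created REAL)"
        )

    def _key(self, prompt, model):
        # Hash prompt + model so the key is compact and exact-match only
        return hashlib.sha256(f"{model}:{prompt}".encode()).hexdigest()

    def get(self, prompt, model):
        row = self.db.execute(
            "SELECT response, created FROM cache WHERE key = ?",
            (self._key(prompt, model),),
        ).fetchone()
        if row and time.time() - row[1] < self.ttl:
            return row[0]  # cache hit: no API call, no cost
        return None

    def set(self, prompt, model, response):
        self.db.execute(
            "INSERT OR REPLACE INTO cache VALUES (?, ?, ?)",
            (self._key(prompt, model), response, time.time()),
        )
        self.db.commit()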
2. Intelligent Model Routing (20-30% Savings)
Automatically suggests cheaper models for simple queries.
Example:
- Query: "What is machine learning?"
- Your choice: GPT-4 ($0.06 per 1K tokens)
- Optimizer suggests: GPT-3.5-Turbo ($0.001 per 1K tokens)
- Savings: 98%
For simple FAQs, definitions, and explanations, you don't need expensive models.
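The optimizer's actual classifier isn't shown here; the sketch below is one plausible heuristic: score a prompt's complexity from its length and a few keyword cues, and suggest a cheaper model when the score is low. The model names and per-1K-token prices are examples, not authoritative pricing.

# Hypothetical per-1K-token input prices -- check your provider for real numbers
PRICES = {"gpt-4": 0.06, "gpt-3.5-turbo": 0.001}

COMPLEX_HINTS = ("analyze", "refactor", "step by step", "prove", "compare")

def suggest_model(prompt, requested="gpt-4"):
    """Suggest a cheaper model for simple-looking prompts. Heuristic sketch only."""
    looks_complex = len(prompt.split()) > 150 or any(
        hint in prompt.lower() for hint in COMPLEX_HINTS
    )
    if looks_complex or requested not in PRICES:
        return {"suggested": requested, "savings": 0}

    cheapest = min(PRICES, key=PRICES.get)
    savings = round(100 * (1 - PRICES[cheapest] / PRICES[requested]), 1)
    return {"suggested": cheapest, "savings": savings}

print(suggest_model("What is machine learning?"))
# {'suggested': 'gpt-3.5-turbo', 'savings': 98.3}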
3. Real-Time Cost Monitoring
Tracks every API call with:
- Cost per call
- Cache hit rates
- Spending by model
- Hourly/daily/monthly totals
- Alerts when thresholds are exceeded
Dashboard shows:
Last 24 hours:
- Total cost: $45.32
- Total calls: 1,245
- Cache hit rate: 52%
- Top model: gpt-4-turbo ($32.15)
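The per-call numbers above come from token counts and a price table. A minimal version of that bookkeeping might look like the snippet below; the per-1K-token rates are illustrative, not the tool's bundled MODEL_COSTS.

# Illustrative per-1K-token prices; substitute your provider's real rates
PRICE_PER_1K = {
    "gpt-4-turbo": {"input": 0.01, "output": 0.03},
    "gpt-3.5-turbo": {"input": 0.0005, "output": 0.0015},
}

def call_cost(model, input_tokens, output_tokens):
    # Cost = input tokens at the input rate + output tokens at the output rate
    p = PRICE_PER_1K[model]
    return (input_tokens / 1000) * p["input"] + (output_tokens / 1000) * p["output"]

print(f"${call_cost('gpt-4-turbo', 1200, 400):.4f}")  # $0.0240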
4. Beautiful Web Dashboard
A modern, animated dashboard with:
- Real-time cost tracking
- Interactive charts (Chart.js)
- Cache performance metrics
- Model distribution graphs
- Responsive design (mobile-friendly)
Installation & Setup
Quick Start (2 minutes)
# Clone the repo
git clone https://github.com/dinesh-k-elumalai/ai-cost-optimizer.git
cd ai-cost-optimizer
# Install dependencies
pip install -r requirements.txt
# Run the quick start demo
python quick_start.py
# Start the web dashboard
python app.py
# Open http://localhost:5000
That's it! The optimizer is running.
Integrate with Your Code
Option 1: Drop-in wrapper (easiest)
from ai_cost_optimizer import AIAPIOptimizer
from openai import OpenAI

client = OpenAI(api_key="your-key")
optimizer = AIAPIOptimizer()

def optimized_call(prompt, model="gpt-4"):
    # Check cache first
    cached = optimizer.cache.get(prompt, model)
    if cached:
        return cached

    # Make API call
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": prompt}]
    )

    # Track and cache
    answer = response.choices[0].message.content
    optimizer.process_request(
        prompt, model,
        response.usage.prompt_tokens,
        response.usage.completion_tokens
    )
    optimizer.cache.set(prompt, model, answer, 0.02)
    return answer

# Use it like normal!
answer = optimized_call("Explain async/await")
Option 2: Use the SDK
from ai_cost_optimizer.sdk import CostOptimizerClient

optimizer = CostOptimizerClient()

# Track any API call
optimizer.track_call(
    prompt="Your prompt",
    model="gpt-4-turbo",
    input_tokens=100,
    output_tokens=200
)

# Get suggestions
suggestion = optimizer.suggest_model("What is Python?", "gpt-4")
print(f"Use {suggestion['suggested']} to save {suggestion['savings']}%")
Option 3: Monitoring only
Just track your existing calls without changing code:
# After your API call
optimizer.process_request(prompt, model, input_tokens, output_tokens)
# Check stats anytime
stats = optimizer.tracker.get_stats(24) # Last 24 hours
print(f"Total cost: ${stats['total_cost']:.2f}")
Real Results
Here's what happened after we deployed it:
Before AI Cost Optimizer
- Monthly cost: $12,340
- Cache hit rate: 0%
- Avg response time: 2.1 seconds
- Visibility: None
After AI Cost Optimizer
- Monthly cost: $3,680 (70% reduction)
- Cache hit rate: 52% (half of calls are free)
- Avg response time: 1.4 seconds (33% faster)
- Visibility: Complete dashboard
Annual Savings
$8,660/month × 12 = $103,920/year saved
That's a junior developer's salary saved just by optimizing API calls!
Why This Tool is Different
Open Source & Free
- MIT License
- No vendor lock-in
- Community-driven
- Fork and customize
Production-Ready
- Used by 50+ startups in production
- Battle-tested code
- SQLite for simplicity (PostgreSQL for scale)
- Proper error handling
Beautiful UI
- Modern glassmorphism design
- Smooth animations
- Real-time updates
- Fully responsive
Universal Compatibility
Works with:
- OpenAI (GPT-4, GPT-3.5)
- Anthropic (Claude Opus, Sonnet, Haiku)
- Google (Gemini Pro, Flash)
- Cohere
- Mistral
- Any AI provider with token-based pricing
Actionable Insights
- Which models cost the most
- Which queries can use cheaper models
- Cache effectiveness
- Hourly/daily spending trends
- Cost per task type
Features
Core Features
- Smart response caching with SQLite
- Intelligent model routing
- Real-time cost tracking
- Web dashboard with charts
- Cost alerts and thresholds
- Multi-provider support
- Cache TTL management
- Query complexity classification
Developer Experience
- Zero-code monitoring (just track calls)
- Drop-in integration (wrap existing calls)
- SDK for easy integration
- Complete API documentation
- Example integrations (FastAPI, Django, Flask)
- Docker support (coming soon)
Analytics
- Cost by model
- Cost by task type
- Cache hit rate tracking
- Hourly/daily/monthly breakdowns
- Token usage statistics
- Model performance comparison
Use Cases
1. Startups with AI Features
Problem: Unpredictable AI bills eating into runway
Solution: 40-70% cost reduction = more months of runway
2. SaaS with AI Chatbots
Problem: High support costs with AI assistants
Solution: Cache FAQ responses, save 60% on support queries
3. Development Teams
Problem: No visibility into AI spending
Solution: Real-time tracking, alerts before overspending
4. AI Agencies
Problem: Client projects with variable AI costs
Solution: Track per-project costs, optimize spending
5. Content Platforms
Problem: Expensive content generation at scale
Solution: Cache similar requests, use cheaper models
Getting Started
1. Install
git clone https://github.com/dinesh-k-elumalai/ai-cost-optimizer.git
cd ai-cost-optimizer
pip install -r requirements.txt
2. Quick Test
python quick_start.py
This runs a demo showing:
- Cache working (second call is free)
- Model suggestions (save 90%+ on simple queries)
- Cost tracking (see all spending)
3. Start Dashboard
python app.py
# Open http://localhost:5000
View real-time:
- Cost charts
- Cache performance
- Optimization recommendations
- Spending trends
4. Integrate
Choose your integration method:
- Monitoring only - Just track calls
- Drop-in wrapper - Wrap API calls for caching
- Full integration - Use SDK for everything
See Integration Guide for details.
Configuration
Customize for your needs:
from ai_cost_optimizer import AIAPIOptimizer

optimizer = AIAPIOptimizer()

# Set alert thresholds
optimizer.tracker.alert_thresholds = {
    'hourly': 50.0,     # $50/hour
    'daily': 500.0,     # $500/day
    'monthly': 10000.0  # $10k/month
}

# Customize cache TTL
optimizer.cache.set(prompt, model, response, cost, ttl_hours=168)  # 7 days

# Add custom model costs
from ai_cost_optimizer import MODEL_COSTS
MODEL_COSTS["your-custom-model"] = {
    "input": 5.00,
    "output": 15.00
}
Roadmap
What's coming next:
- [ ] Semantic caching - Cache similar queries, not just exact matches (a rough sketch of the idea appears after this list)
- [ ] A/B testing - Compare model performance automatically
- [ ] Slack/Email alerts - Get notified of cost spikes
- [ ] Docker container - One-command deployment
- [ ] Hosted version - No setup required (coming Q2 2026)
- [ ] Multi-user support - Team dashboards
- [ ] Cost forecasting - Predict future spending
- [ ] Browser extension - Monitor OpenAI Playground usage
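Semantic caching isn't implemented yet. As a rough sketch of the idea: embed each prompt, compare new prompts to cached ones by cosine similarity, and reuse the stored answer above a threshold. The embed function here is a placeholder for whatever embedding model you'd plug in; none of this is the tool's API.

import numpy as np

def cosine(a, b):
    # Cosine similarity between two embedding vectors
    a, b = np.asarray(a), np.asarray(b)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# semantic_cache holds (embedding, response) pairs built up as calls are made
semantic_cache = []

def semantic_lookup(prompt, embed, threshold=0.92):
    """Return a cached response whose prompt embedding is close enough, else None.

    `embed` is whatever embedding function you supply (a sentence-transformers
    model, an embeddings API call, etc.); it just needs to return a vector.
    """
    query = embed(prompt)
    best = max(semantic_cache, key=lambda item: cosine(query, item[0]), default=None)
    if best is not None and cosine(query, best[0]) >= threshold:
        return best[1]  # close enough: reuse the cached answer
    return None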
Want a feature? Open an issue or contribute!
Contributing
This tool exists because developers shared their pain points. Your contributions make it better for everyone!
Ways to Contribute
- Share your savings - Tweet your results with #AIOptimizer
- Report bugs - Found an issue? Open a GitHub issue
- Add features - PRs welcome! See CONTRIBUTING.md
- Improve docs - Better examples, translations, tutorials
- Star the repo - Helps others discover it
Areas We Need Help
- Bug fixes and testing
- Support for more AI providers (Replicate, HuggingFace, etc.)
- Documentation improvements
- Dashboard enhancements
- More test coverage
- Translations
Community & Support
Get Help
- Documentation
- Report Issues
- GitHub Discussions
- Follow on X/Twitter
Share Your Results
Save money? Share it!
Tweet format:
Just saved $X/month on AI API costs using @dinesh-k-elumalai's
AI Cost Optimizer!
70% cost reduction with smart caching and model routing.
Open source and free: [GitHub link]
#AIOptimizer #OpenSource #DevTools
Tech Stack
Built with:
- Python 3.8+ - Core optimizer
- SQLite - Caching and cost tracking
- Flask - Web dashboard
- Chart.js - Data visualization
- FontAwesome - Icons
- Modern CSS - Glassmorphism design
FAQ
Q: Does this work with my AI provider?
A: Yes! Supports OpenAI, Anthropic, Google, Cohere, Mistral, and any provider with token-based pricing.
Q: How much will I save?
A: Typically 40-70%. Actual savings depend on your usage patterns. More savings if you have duplicate queries.
Q: Is this production-ready?
A: Yes! Used by 50+ startups in production. SQLite works well for small to medium loads; switch to PostgreSQL for high traffic.
Q: Can I use without code changes?
A: Yes! Monitoring mode tracks calls without any code changes. Add caching later when ready.
Q: How does caching work with dynamic content?
A: Cache TTL is configurable (default 7 days). For dynamic content, use shorter TTL or disable caching for specific queries.
Q: Does this replace my AI provider?
A: No! It's a wrapper that optimizes your existing AI API calls. You still use OpenAI, Anthropic, etc.
Q: What about privacy/security?
A: Everything runs locally. No data sent to third parties. Cache is stored in your SQLite database.
Try It Now
Quick Start
git clone https://github.com/dinesh-k-elumalai/ai-cost-optimizer.git
cd ai-cost-optimizer
pip install -r requirements.txt
python quick_start.py
Links
- GitHub: github.com/dinesh-k-elumalai/ai-cost-optimizer
- Follow me: @dinesh-k-elumalai on X/Twitter
- Docs: Full Documentation
- Discuss: GitHub Discussions
Final Thoughts
AI APIs are amazing but expensive. After getting burned by a $3K/week bill, I built this tool to:
- Give visibility - Know what you're spending
- Enable caching - Don't pay twice for the same query
- Optimize routing - Use cheaper models when possible
- Alert early - Catch cost spikes before they hurt
The result? 70% cost reduction and $103K/year saved.
If you're using AI APIs, you need cost optimization. This tool is:
- Free and open source
- Production-ready
- Easy to integrate
- Actively maintained
Give it a try. Your finance team will thank you.
Found this useful?
Star the repo: GitHub
Follow me: @dk_elumalai
Share your savings in the comments!
Questions? Drop them below! I read and respond to every comment.
Happy optimizing!
Built with love by a developer tired of surprise bills. Open source forever.