If this is useful, a ❤️ helps others find it.
I've shipped multiple apps with AI features. My AI infrastructure cost: $0/month.
Here's exactly how — every tool, every limit, every workaround.
The free AI toolkit
1. Gemini API — Google AI Studio
- Free tier: 500 req/day (Gemini 2.5 Flash), no credit card
- Best for: Strong reasoning, document analysis, code debugging
- Get it: aistudio.google.com
2. Ollama — Local LLMs
- Free tier: Unlimited (runs on your machine)
- Best for: Privacy-sensitive data, offline use, code autocomplete
- Get it: ollama.com
3. Apple Vision Framework — OCR
- Free tier: Built into every Mac, no API key
- Best for: Text extraction from scanned PDFs and images
- Access: Swift/Objective-C, callable from Tauri via sidecar
4. Whisper (via Ollama or local)
- Free tier: Unlimited local
- Best for: Speech to text, audio transcription
-
Model:
whispervia Ollama
Architecture patterns for zero-cost AI
Pattern 1: User brings their own key
User gets a free Gemini API key, pastes it into settings. Your app makes API calls on their behalf. Their key, their quota. You pay nothing.
Pattern 2: Local-first, cloud for complex tasks
Run a local model for 90% of tasks. Fall back to Gemini API for the 10% that need stronger reasoning.
Pattern 3: On-device only
Apple Vision for OCR, local Ollama for everything else. Zero network calls. Strongest privacy story.
The PII problem
Free tier APIs may use submitted data for training. Before sending anything to Gemini:
pub fn sanitize_for_ai(text: &str) -> String {
let text = IP_RE.replace_all(text, "[IP]");
let text = EMAIL_RE.replace_all(&text, "[EMAIL]");
let text = TOKEN_RE.replace_all(&text, "[TOKEN]");
text.to_string()
}
Mask before sending. Always.
Rate limit design patterns
Cache results — same input shouldn't trigger two API calls:
let hash = sha256(input);
if let Some(cached) = cache.get(&hash) {
return cached;
}
User-triggered only — never auto-fire AI on every event:
// Bad: fires on every keypress
onChange={() => callGemini(value)}
// Good: fires only on explicit action
onClickDiagnose={() => callGemini(selectedLog)}
Graceful degradation — when the free tier is exhausted, the app still works:
match call_gemini(prompt).await {
Ok(result) => show_ai_result(result),
Err(_) => show_manual_fallback(), // app works without AI
}
What $0/month actually gets you
- Gemini 2.5 Flash: 500 requests/day → ~15,000/month
- Ollama: unlimited local inference
- Apple Vision: unlimited on-device OCR
For a solo developer building tools used by hundreds (not millions) of users: this is enough. Scale up to paid when revenue justifies it.
Hiyoko PDF Vault → https://hiyokoko.gumroad.com/l/HiyokoPDFVault
X → @hiyoyok
Top comments (0)