S M Tahosin

Posted on • Originally published at github.com

HOCKS AI: I Open-Sourced a Full AI Platform With Chat, Vision, Video Analysis & Website Generation — Runs at $0/Month

TL;DR: I built and open-sourced a production-ready AI platform that combines chat, image analysis, video analysis, and website generation. It uses free models where possible and costs ~$0/month to run. Live demo | GitHub


Why I Built This

Every AI tool I tried was either:

  • Too expensive — GPT-4 API bills adding up fast
  • Single-purpose — chat OR image analysis, never both
  • Closed source — no way to learn from the architecture

I wanted a single platform that handles multiple AI modalities, uses the best free models available, and is fully open-source so other developers can learn from it.

The result is HOCKS AI — a multi-modal AI assistant platform.

🔗 Live: hocks.app
📦 Source: github.com/x-tahosin/hocks-ai


What It Does

| Feature | AI Model | Monthly Cost |
|---|---|---|
| 💬 Streaming Chat | OpenRouter GPT-OSS-120B (free) | $0 |
| 🌐 Website Generator | OpenRouter Nemotron-3 120B (free) | $0 |
| 🖼️ Image Analysis | Google Gemini 2.0 Flash | ~$0.002/call |
| 🎬 Video Analysis | Google Gemini 2.0 Flash | ~$0.003/call |
| 🧠 Memory System | Firebase Firestore | $0 (free tier) |
| 🔐 Auth + Admin | Firebase Auth | $0 |

Total monthly cost: ~$0–5 depending on vision API usage.


The Hybrid Model Strategy

This is the key architectural decision. Instead of paying for one expensive model for everything, I split by capability:

Free Models for Text Tasks

```
Chat + Code Generation → OpenRouter API
├── openai/gpt-oss-120b:free (120B params, conversational)
└── nvidia/nemotron-3-super-120b-a12b:free (code generation)
```

These free 120B parameter models are genuinely production-quality for text tasks. GPT-OSS-120B handles conversational AI beautifully — context tracking, nuanced responses, multi-turn dialogue. Nemotron-3 excels at code generation and can build full websites from prompts.
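Calling these models goes through OpenRouter's OpenAI-compatible chat completions endpoint. Here's a minimal sketch of how a request could be assembled — the endpoint URL and model slug are real, but `buildChatRequest` is a hypothetical helper name, not code from the repo:

```javascript
// Sketch: assemble a chat completion request for OpenRouter's
// OpenAI-compatible API. buildChatRequest is an illustrative helper.
const OPENROUTER_URL = "https://openrouter.ai/api/v1/chat/completions";

function buildChatRequest(apiKey, messages, { stream = false } = {}) {
  return {
    url: OPENROUTER_URL,
    options: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        model: "openai/gpt-oss-120b:free", // free model from the table above
        messages,
        stream, // true enables SSE token streaming
      }),
    },
  };
}

// Usage (inside a Cloud Function, with the key from Secret Manager):
// const { url, options } = buildChatRequest(key, [{ role: "user", content: "Hi" }]);
// const res = await fetch(url, options);
```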

Paid Models for Vision Tasks

```
Image + Video Analysis → Google Gemini 2.0 Flash
├── analyzeImage (~$0.002/call)
└── analyzeVideo (~$0.003/call)
```

Free models simply can't match Gemini's multimodal capabilities yet. Image understanding, OCR, visual reasoning — Gemini 2.0 Flash delivers production-quality results at extremely low per-call costs.
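The capability split above can be boiled down to a small routing table — free OpenRouter models for text, Gemini for vision. This is a sketch of the idea, with task and function names of my own choosing rather than the repo's:

```javascript
// Sketch of the hybrid model strategy: route each task to the cheapest
// model that can handle it. Task keys and routeTask are illustrative.
const MODEL_ROUTES = {
  chat:    { provider: "openrouter", model: "openai/gpt-oss-120b:free" },
  website: { provider: "openrouter", model: "nvidia/nemotron-3-super-120b-a12b:free" },
  image:   { provider: "gemini",     model: "gemini-2.0-flash" },
  video:   { provider: "gemini",     model: "gemini-2.0-flash" },
};

function routeTask(task) {
  const route = MODEL_ROUTES[task];
  if (!route) throw new Error(`Unknown task: ${task}`);
  return route;
}
```

The payoff of centralizing this is that swapping in a better free model later is a one-line change.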


Architecture Deep Dive

```
┌─────────────────────────────────────────────┐
│          Frontend (React 18 + Vite)         │
│         Firebase Hosting / hocks.app        │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│     Firebase Cloud Functions (Node 20)      │
├─────────────────────────────────────────────┤
│  streamChat ────► OpenRouter (GPT-OSS-120B) │
│  generateCode ──► OpenRouter (Nemotron-3)   │
│  analyzeImage ──► Google Gemini 2.0 Flash   │
│  analyzeVideo ──► Google Gemini 2.0 Flash   │
└──────────────────┬──────────────────────────┘
                   │
                   ▼
┌─────────────────────────────────────────────┐
│           Firebase Services                 │
│  • Firestore (users, memories, analytics)   │
│  • Authentication (Google + Email/Pass)     │
│  • Secret Manager (all API keys)            │
│  • Storage (file uploads)                   │
└─────────────────────────────────────────────┘
```

Key Design Decisions

1. Zero API Keys in Frontend

Every AI call is proxied through Firebase Cloud Functions. API keys live exclusively in Firebase Secret Manager — not in environment variables, not in .env files, not anywhere in client code.

```javascript
// Cloud Function reads the secret at runtime (firebase-functions v2)
const { onCall } = require("firebase-functions/v2/https");
const { defineSecret } = require("firebase-functions/params");
const { GoogleGenerativeAI } = require("@google/generative-ai");

const geminiApiKey = defineSecret("GEMINI_API_KEY");

exports.analyzeImage = onCall(
  { secrets: [geminiApiKey] },
  async (request) => {
    // Key is only resolved server-side, at invocation time
    const genAI = new GoogleGenerativeAI(geminiApiKey.value());
    const model = genAI.getGenerativeModel({ model: "gemini-2.0-flash" });
    // ...
  }
);
```

2. SSE Streaming for Real-Time Chat

Instead of waiting for the full response, the chat streams tokens in real-time using Server-Sent Events:

```javascript
// Server: decode each chunk from OpenRouter and forward it as an SSE event
const decoder = new TextDecoder();
const reader = orResponse.body.getReader();
let fullText = "";
while (true) {
  const { done, value } = await reader.read();
  if (done) break;
  const text = decoder.decode(value, { stream: true });
  fullText += text;
  res.write(`data: ${JSON.stringify({ text, fullText })}\n\n`);
}
res.end();

// Client: render as tokens arrive
eventSource.onmessage = (event) => {
  const { text } = JSON.parse(event.data);
  updateChatUI(text); // Instant visual feedback
};
```

3. Per-User Memory System

The AI remembers context across sessions. Users can save memories that persist in Firestore and are injected into every AI conversation:

```javascript
// Inject saved memories into the system prompt
let systemContent = SYSTEM_PROMPT;
if (memories.length > 0) {
  systemContent += "\n\n=== USER'S SAVED MEMORIES ===\n";
  memories.forEach((mem, i) => {
    systemContent += `${i + 1}. ${mem.content}\n`;
  });
}
```

4. Admin Dashboard with Cost Tracking

Built-in analytics track every API call in real-time:

  • Usage counters per feature (chat, image, video, website)
  • Daily cost breakdown with budget alerts
  • Feature toggles — disable any AI feature instantly
  • Audit logging for all admin actions
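With per-call prices known up front (the table at the top), the daily cost breakdown reduces to counters times rates. A sketch of that arithmetic, using the post's per-call prices — the `COST_PER_CALL` map and `estimateDailyCost` helper are my naming, not the repo's:

```javascript
// Per-call costs from the feature table; text models are free.
const COST_PER_CALL = { chat: 0, website: 0, image: 0.002, video: 0.003 };

// Given a day's usage counters, estimate total spend in dollars.
function estimateDailyCost(counts) {
  return Object.entries(counts).reduce(
    (total, [feature, n]) => total + n * (COST_PER_CALL[feature] ?? 0),
    0
  );
}

// In the Cloud Function itself, counters could be bumped atomically with
// FieldValue.increment from firebase-admin/firestore, e.g.:
// await db.doc(`analytics/${today}`).set(
//   { [`counters.${feature}`]: FieldValue.increment(1) },
//   { merge: true }
// );
```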

Security Architecture

| Layer | Implementation |
|---|---|
| API Keys | Firebase Secret Manager (never in code) |
| Data Isolation | Firestore rules enforce per-user access |
| Admin Access | Custom claims + email verification |
| Authentication | Firebase Auth (Google + email/password) |
| Audit Trail | Every admin action logged with timestamp |
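The "custom claims" row translates to a small guard inside each admin-only callable function. A sketch of that check — in the real Cloud Function the thrown error would be an `HttpsError` from `firebase-functions/v2/https`; a plain `Error` and the `toggleFeature` name stand in here as illustrations:

```javascript
// Gate: only callers whose Firebase Auth token carries the `admin`
// custom claim may proceed. `auth` is the request.auth object that
// onCall handlers receive; token holds the decoded custom claims.
function assertAdmin(auth) {
  const isAdmin = auth && auth.token && auth.token.admin === true;
  if (!isAdmin) {
    throw new Error("permission-denied: Admin access required");
  }
}

// Usage inside a callable (names hypothetical):
// exports.toggleFeature = onCall(async (request) => {
//   assertAdmin(request.auth);
//   // ...flip the feature flag, then append an audit log entry
// });
```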

Tech Stack

| Layer | Technology |
|---|---|
| Frontend | React 18, Vite, CSS3 (Glassmorphism dark UI) |
| Backend | Firebase Cloud Functions (Node.js 20) |
| AI Engine | Google Gemini 2.0 Flash + OpenRouter (free models) |
| Database | Cloud Firestore |
| Auth | Firebase Authentication |
| Hosting | Firebase Hosting (custom domain) |
| Secrets | Firebase Secret Manager |

Get Started in 5 Minutes

```shell
# Clone
git clone https://github.com/x-tahosin/hocks-ai.git
cd hocks-ai

# Install
cd functions && npm install && cd ..

# Set your API keys securely
firebase functions:secrets:set GEMINI_API_KEY
firebase functions:secrets:set OPENROUTER_API_KEY

# Deploy everything
firebase deploy
```

You need:

  • Node.js 20+
  • Firebase CLI (npm i -g firebase-tools)
  • A Gemini API key from ai.google.dev (free)
  • An OpenRouter API key from openrouter.ai (free models available)

What I Learned

  1. Free AI models are production-viable — 120B parameter models handle conversational AI surprisingly well
  2. Hybrid strategies save money — use free for text, paid only for vision
  3. Firebase Secret Manager > .env files — proper secret management matters in production
  4. SSE streaming transforms UX — users seeing real-time responses feels dramatically better than waiting
  5. Cost tracking from day one — know exactly where every dollar goes

Try It

🔗 Live: hocks.app
📦 Source: github.com/x-tahosin/hocks-ai

What free AI models are you using in production? I'd love to hear about your hybrid model strategies in the comments.
