noxlie

Posted on Jul 5

Why I Replaced My SaaS AI Tools With Self-Hosted Alternatives

#productivity #selfhosted #ai #privacy

My AI Tool Bill Was $50/Month. I Cut It to $8.

Let me add up what I was paying: ChatGPT Plus at $20/month, Claude Pro at $20/month, and Midjourney at $10/month. That's $50/month or $600/year for AI tools I used maybe 2 hours a day.

Three months ago I cancelled all of them. Now I spend about $8/month between API calls and the electricity for what I run locally. Honestly? The quality difference is smaller than you'd think, and the privacy improvement is massive.

This isn't a "SaaS is evil" rant. It's a practical breakdown of what I actually did, what it cost, and where the tradeoffs are.

The $50/Month Stack I Replaced

Here's exactly what I used each tool for:

Tool	Monthly Cost	What I Actually Used It For
ChatGPT Plus	$20	Code review, quick Q&A, writing drafts
Claude Pro	$20	Long document analysis, nuanced writing
Midjourney	$10	Occasional header images, concept art

Total: $50/month. Most days I used ChatGPT for maybe 30 minutes and Claude for about 20, and I opened Midjourney maybe twice a week. For that little usage, the per-use cost was terrible.
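To put a number on "terrible", here is a rough sketch of the per-use math. It takes the usage figures above at face value; the 30-day month and 52/12 weeks per month are simplifying assumptions, not measurements.

```python
# Usage figures come from the post; the month and week lengths are assumptions.
DAYS_PER_MONTH = 30
WEEKS_PER_MONTH = 52 / 12

chatgpt_hours = 30 * DAYS_PER_MONTH / 60    # 30 min/day -> 15 hours/month
claude_hours = 20 * DAYS_PER_MONTH / 60     # 20 min/day -> 10 hours/month
midjourney_sessions = 2 * WEEKS_PER_MONTH   # twice a week -> ~8.7 sessions/month

print(f"ChatGPT Plus: ${20 / chatgpt_hours:.2f} per hour of use")
print(f"Claude Pro:   ${20 / claude_hours:.2f} per hour of use")
print(f"Midjourney:   ${10 / midjourney_sessions:.2f} per session")
# ChatGPT Plus: $1.33 per hour of use
# Claude Pro:   $2.00 per hour of use
# Midjourney:   $1.15 per session
```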

What I Run Now

For chat/code tasks: NanoGPT API. Pay-per-use, no subscription. I spend about $5-7/month for my usage patterns. The models available are competitive with GPT-4 and Claude for most tasks. There's a solid guide to privacy-focused AI options if you want to explore alternatives.

For local tasks: Ollama running llama3.1:8b on my desktop (RTX 3090 I already owned). Free, fast, private. I use this for anything where I don't want my prompts hitting a server.

For images: Stable Diffusion XL via ComfyUI. One-time setup, runs locally. The quality isn't Midjourney-level for artistic stuff, but for blog headers and concept mockups it's more than enough.

Here's my actual spend breakdown for the last 3 months:

# My real costs (last 3 months)
costs = {
    "nanogpt_api": {
        "month_1": 5.23,
        "month_2": 6.87,
        "month_3": 7.12,
    },
    "electricity_gpu": {
        "month_1": 2.10,  # ~70W avg for local inference
        "month_2": 2.30,
        "month_3": 1.90,
    },
    "previous_saas": {
        "chatgpt": 20.00,
        "claude": 20.00,
        "midjourney": 10.00,
    }
}

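# Average monthly cost, computed from the dict above instead of retyped numbers
def monthly_avg(by_month: dict) -> float:
    return sum(by_month.values()) / len(by_month)

saas_total = sum(costs["previous_saas"].values())
self_hosted_avg = monthly_avg(costs["nanogpt_api"]) + monthly_avg(costs["electricity_gpu"])
print(f"SaaS: ${saas_total}/month")
print(f"Self-hosted: ${self_hosted_avg:.2f}/month")
print(f"Savings: ${saas_total - self_hosted_avg:.2f}/month")
# SaaS: $50.0/month
# Self-hosted: $8.51/month
# Savings: $41.49/month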

$41.49/month savings. That's $497.88/year. Not life-changing money, but it paid for a nice dinner every month.
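I already owned the GPU, so hardware isn't in these numbers. If you'd have to buy one, a quick break-even estimate is worth running first. This is only a sketch: `gpu_price` is a hypothetical placeholder, not a figure from my setup, so plug in whatever the card actually costs you.

```python
import math

monthly_savings = 41.49   # from the breakdown above
gpu_price = 700.00        # HYPOTHETICAL purchase price; replace with your own

# Round up: you only break even once a full month's savings has landed.
months_to_break_even = math.ceil(gpu_price / monthly_savings)
print(f"Break-even after {months_to_break_even} months")
```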

The Self-Hosted Setup

Here's how I actually set up the local stack. It took a weekend.

Step 1: Ollama for Text

# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull models (8B for fast tasks, 70B for complex ones if you have the VRAM)
ollama pull llama3.1:8b
ollama pull codellama:13b  # for code-specific tasks

# Test it
ollama run llama3.1:8b "Write a Python function to merge two sorted lists"

The 8B model runs fine on 8GB VRAM. If you have 24GB (like a 3090 or 4090), the 13B code model is noticeably better for programming tasks.
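Once the model is pulled, you can also call it from scripts. Ollama serves a local HTTP API on port 11434 by default, and the sketch below posts a single non-streaming prompt to its `/api/generate` endpoint using only the standard library. The helper names `build_payload` and `ask_local` are mine, chosen for illustration.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local address

def build_payload(prompt: str, model: str = "llama3.1:8b") -> dict:
    """Build a non-streaming request body for /api/generate."""
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(prompt: str, model: str = "llama3.1:8b", timeout: float = 120.0) -> str:
    """Send one prompt to the local Ollama server and return the reply text."""
    req = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(build_payload(prompt, model)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        # With "stream": False the server returns one JSON object
        # whose "response" field holds the full completion.
        return json.loads(resp.read())["response"]
```

With the server running, `ask_local("Write a Python function to merge two sorted lists")` does the same thing as the `ollama run` test above.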

Step 2: NanoGPT API for Cloud Tasks

For anything the local models can't handle (complex reasoning, very long contexts), I use NanoGPT's API. It's pay-per-token with no subscription:

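import os

import requests

def ask_nanogpt(prompt: str, model: str = "default") -> str:
    """Send a single-turn prompt to the API and return the reply text."""
    resp = requests.post(
        "https://api.nanogpt.io/v1/chat/completions",
        headers={
            # The key comes from the environment so it never lands in source control
            "Authorization": f"Bearer {os.environ['NANOGPT_API_KEY']}",
            "Content-Type": "application/json"
        },
        json={
            "model": model,  # "default" is a placeholder; pass a real model ID
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": 2000
        },
        timeout=30
    )
    resp.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
    return resp.json()["choices"][0]["message"]["content"]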

If you want to try this approach, NanoGPT is a good starting point for pay-per-use API access without subscriptions.
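Pay-per-token only helps if you know what a call costs. Here's a minimal sketch for estimating it, assuming the response carries an OpenAI-style `usage` object with `prompt_tokens` and `completion_tokens` (I haven't verified that every provider returns it). The per-million prices are made-up placeholders, so look up the real rates for the model you use.

```python
def estimate_cost(usage: dict, usd_per_m_input: float, usd_per_m_output: float) -> float:
    """Estimate the dollar cost of one call from its token counts.

    `usage` is assumed to look like an OpenAI-style usage object.
    Prices are in USD per one million tokens.
    """
    prompt_cost = usage["prompt_tokens"] / 1_000_000 * usd_per_m_input
    completion_cost = usage["completion_tokens"] / 1_000_000 * usd_per_m_output
    return prompt_cost + completion_cost

# Placeholder numbers for illustration only, not real pricing.
example_usage = {"prompt_tokens": 10_000, "completion_tokens": 1_000}
print(f"${estimate_cost(example_usage, 1.00, 2.00):.4f}")
# $0.0120
```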

Step 3: Stable Diffusion for Images

# Install ComfyUI
git clone https://github.com/comfyanonymous/ComfyUI.git
cd ComfyUI
pip install -r requirements.txt

# Download SDXL base model
wget -P models/checkpoints/ \
  "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/resolve/main/sd_xl_base_1.0.safetensors"

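# Run it (--listen 0.0.0.0 exposes the UI to every device on your network;
# drop the flag to bind to localhost only)
python main.py --listen 0.0.0.0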

Access the web UI at http://localhost:8188. For blog headers and quick concepts, I generate 4 images and pick the best one. Takes about 30 seconds per image on a 3090.
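The "generate 4 and pick the best" loop can be scripted too. The sketch below assumes you've exported your workflow from ComfyUI in API format (a JSON dict of node ID to node) and that it samples through a `KSampler` node. It queues one job per seed by posting to the server's `/prompt` endpoint. The helper names and the `workflow_api.json` filename are hypothetical.

```python
import copy
import json
import urllib.request

COMFY_URL = "http://localhost:8188/prompt"  # ComfyUI's queue endpoint on the default port

def with_seed(workflow: dict, seed: int) -> dict:
    """Return a copy of an API-format workflow with every KSampler seed replaced."""
    updated = copy.deepcopy(workflow)
    for node in updated.values():
        if node.get("class_type") == "KSampler":
            node["inputs"]["seed"] = seed
    return updated

def queue_variations(workflow: dict, seeds: list[int]) -> None:
    """Queue one generation per seed; images land in ComfyUI's output folder."""
    for seed in seeds:
        body = json.dumps({"prompt": with_seed(workflow, seed)}).encode("utf-8")
        req = urllib.request.Request(
            COMFY_URL,
            data=body,
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req, timeout=30) as resp:
            resp.read()  # the reply only acknowledges that the job was queued
```

Usage would look like `queue_variations(json.load(open("workflow_api.json")), [1, 2, 3, 4])`. If your workflow uses a different sampler node, adjust the `class_type` check.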

Where Self-Hosted Wins

Privacy. My prompts never leave my network for local tasks. For a developer, that means I can paste entire codebases into my local model without worrying about training data policies.

No rate limits. Ollama doesn't throttle me. I can run 100 queries in a row if I want.

Customization. I fine-tuned a small model on my coding style and project conventions. ChatGPT can't do that.

Offline works. Power outage? Internet down? Local models still run (on battery backup obviously, but still).

Where SaaS Still Wins

I'll be honest about the tradeoffs:

Claude's writing quality. For nuanced, long-form writing, Claude is still better than anything I can run locally. I keep a NanoGPT API call as a substitute, but it's not the same.

GPT-4's reasoning. Complex multi-step reasoning tasks still go to the cloud. The 8B local model makes mistakes that GPT-4 doesn't.

Midjourney's aesthetics. SDXL is good, but Midjourney's artistic quality is still ahead. For hero images on important posts, I sometimes use a Midjourney alternative API.

Zero maintenance. SaaS tools update themselves. My local stack needs model updates, driver updates, and occasional troubleshooting.

My Actual Daily Workflow

Here's what a typical day looks like:

Morning code review — Local Ollama (codellama:13b). Fast, private, free.
Quick questions — NanoGPT API. Cheap, fast, good enough.
Long document analysis — NanoGPT API with larger context. ~$0.02 per document.
Blog header images — Local SDXL via ComfyUI. Free.
Complex writing — NanoGPT API. Still the best for nuanced text.

The key insight: 80% of what I used ChatGPT for didn't need GPT-4 quality. A local 8B model handles "write a regex for email validation" just fine.
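If you want to make that split explicit, the workflow above boils down to a small routing table. This is a sketch rather than what I literally run: the task labels and backend strings are made up for illustration, and the one extra rule is that anything flagged sensitive stays on the local model.

```python
# Task labels and backend names are illustrative, taken from the workflow list above.
ROUTES = {
    "code_review": "ollama:codellama:13b",
    "quick_question": "nanogpt",
    "long_document": "nanogpt",
    "header_image": "comfyui:sdxl",
    "complex_writing": "nanogpt",
}
LOCAL_FALLBACK = "ollama:llama3.1:8b"

def pick_backend(task: str, sensitive: bool = False) -> str:
    """Choose where a task runs; sensitive or unknown tasks stay local."""
    backend = ROUTES.get(task, LOCAL_FALLBACK)
    if sensitive and backend == "nanogpt":
        return LOCAL_FALLBACK  # never send sensitive prompts to the cloud
    return backend

print(pick_backend("long_document"))                  # nanogpt
print(pick_backend("long_document", sensitive=True))  # ollama:llama3.1:8b
```

Defaulting unknown tasks to the local model keeps the cheap, private path as the baseline, so the cloud is something you opt into per task.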

The Privacy Angle

This matters more than people think. When you use ChatGPT, every prompt goes through OpenAI's servers. Your code, your ideas, your business logic. Their privacy policy allows using inputs to improve models (unless you opt out, and even then, who verifies?).

With local models, nothing leaves your machine. With NanoGPT, their privacy policy is explicit about not training on your data.

For personal projects, maybe it doesn't matter. For work code? Client data? Business strategies? I'd rather keep that local.
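One cheap safeguard is to scan a prompt for obvious secrets before it goes to any cloud API. The sketch below is a deliberately incomplete heuristic: the three patterns are examples I picked (an AWS-style access key ID, a PEM private key header, and `key = value` style credentials), and a clean result doesn't prove a prompt is safe to send.

```python
import re

# Example patterns only; extend this list for the secrets your projects use.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                    # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),  # PEM private key header
    re.compile(r"(?i)(api[_-]?key|password|secret)\s*[:=]\s*\S+"),  # key=value credentials
]

def looks_sensitive(text: str) -> bool:
    """Return True if any known secret pattern appears in the text."""
    return any(pattern.search(text) for pattern in SECRET_PATTERNS)

print(looks_sensitive("write a regex for email validation"))  # False
print(looks_sensitive("DB_PASSWORD = hunter2"))               # True
```

A `True` result is a good reason to keep that prompt on the local model.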

Getting Started

If you want to try this, start small:

Install Ollama, pull llama3.1:8b
Use it for a week alongside your paid tools
Track which queries actually need the paid tools
You'll find most don't
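Tracking doesn't need a tool. A list of (query, needed the paid tool?) pairs is enough to see the split after a week. A minimal sketch, with made-up log entries standing in for your own:

```python
def paid_share(log: list[tuple[str, bool]]) -> float:
    """Fraction of logged queries where the local model was not good enough."""
    if not log:
        return 0.0
    return sum(needed_paid for _, needed_paid in log) / len(log)

# Hypothetical entries for illustration; record your own as you go.
week_log = [
    ("write a regex for email validation", False),
    ("explain this stack trace", False),
    ("rename variables in this function", False),
    ("summarize a 60-page contract", True),
    ("draft a commit message", False),
]

print(f"{paid_share(week_log):.0%} of queries needed a paid tool")
# 20% of queries needed a paid tool
```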

After a month, cancel the subscriptions you're not using. Keep the ones where the quality gap matters for your workflow.

The goal isn't to never pay for AI. It's to stop paying $50/month for stuff you can get for $8/month with better privacy.

Originally published at https://privacy-ai-guide.vercel.app

DEV Community