Sam Hartley

Posted on Mar 15

How I Automated My Entire Dev Workflow with AI Agents (Running 24/7 on a Mac Mini)

#ai #productivity #automation #tutorial

I used to spend 3 hours a day on repetitive tasks. Now an AI agent handles them while I sleep.

This isn't a concept post — this setup is running right now on a Mac Mini in my home office. Here's the full picture.

What Gets Automated

Task	Before (Manual)	After (AI Agent)
Email triage	30 min/day	Automatic
Code review	45 min/day	12 seconds per PR
Customer inbox check	20 min/day	Hourly cron job
Calendar reminders	Forget constantly	Proactive alerts
Weather check	Open app manually	Agent tells me before I leave
Market data	Check 5 websites	Live on my watch face

Total time saved: ~2.5 hours/day. That's 75 hours a month I got back.

The Architecture

Mac Mini M4 (always on)
├── AI Agent (orchestrator)
│   ├── Email checker (hourly)
│   ├── Calendar scanner (every 4 hours)
│   ├── Inbox monitor (Fiverr/orders, every hour)
│   └── Market data fetcher (every 15 min)
├── Ollama (local LLM — free inference)
│   ├── Qwen 3.5 9B (general tasks)
│   └── Qwen 3 Coder 30B (via network GPU)
├── Notification layer
│   └── Telegram bot → my phone
└── Background services
    ├── Garmin watch face data feed
    └── System health monitoring

Total hardware cost: $0 additional (Mac Mini was already there).
Monthly API cost: ~$25 (only complex reasoning hits the cloud).

Key Design Principles

1. Don't Automate Everything

The 80/20 rule applies hard:

✅ Automate: Checking, monitoring, formatting, reminders
❌ Don't automate: Decisions that need judgment, creative work, human relationships

I tried automating replies to clients once. Deleted it after day one. Some things need a human touch.

2. Alert, Don't Act

My agent tells me when something needs attention — it doesn't reply to customers, send emails, or make purchases.

One wrong automated email can destroy a client relationship. AI is excellent at detection. It's not great at nuance.

3. Local First, Cloud Fallback

if query_is_simple(prompt):
    # Free, instant, private
    response = ollama.generate(model="qwen3.5:9b", prompt=prompt)
else:
    # Rare — complex reasoning only
    response = claude.complete(prompt)

95% of automation tasks are simple: "Is there a new message? What does it say? Is it urgent?"

A 9B model handles that perfectly at zero cost.

4. Fail Silently, Alert Loudly

If the weather API is down → no notification (who cares).
If a paying customer messages → instant alert to my phone.

Not all failures are equal. Treat them that way.

Real Example: The Inbox Monitor

Every hour, the agent:

Opens the browser → inbox
Scans for new messages
Checks against known spam/scam patterns (regex + AI)
If legitimate: sends me a Telegram notification with the preview
If junk: logs it and moves on quietly

Scam detection rate: 100% so far (pattern matching + LLM analysis).
False positives: 0 (I still review every alert manually before responding).
Time saved: 20 min/day × 30 days = 10 hours/month.

The Full Stack (All Free or Open Source)

Component	Tool	Cost
LLM inference	Ollama + Qwen 3.5	$0
Orchestration	Cron jobs + Python scripts	$0
Notifications	Telegram Bot API	$0
Email access	Apple Mail + CLI bridge	$0
Calendar	EventKit (macOS)	$0
Browser automation	Safari + accessibility APIs	$0
Cloud reasoning (10%)	Anthropic API	~$25/mo

Monthly total: ~$25. Before this setup I was paying $150+ just in API costs.

How I Started (Without Going Crazy)

I didn't build this in a weekend. Here's the honest timeline:

Week 1: Just the email checker. One script, one cron job.
Week 2: Added Telegram notifications. Suddenly I could see it working.
Week 3: Calendar alerts. This was a game changer.
Month 2: Inbox monitoring, market data feed.
Month 3: Everything orchestrated together.

The lesson: start with one automation, get it reliable, then add the next. Trying to build everything at once is how you end up with a half-working mess you don't trust.

Lessons From Running This in Production

Logging is everything. When something fails at 3 AM, logs are all you have.
Cron intervals matter. Checking email every minute wastes resources. Every hour is fine for most things.
AI doesn't replace thinking — it replaces the mechanical parts of thinking.
Test on your actual workflow before scaling. My setup works for me; yours will look different.
The value isn't speed — it's offloading mental overhead. I stopped worrying about forgetting to check things.

What's Next

I'm building a Telegram-based version of this that others can use — subscribe to signals, set their own alert rules, powered by the same local AI stack. If that sounds interesting, follow along.

Have you automated parts of your workflow? What was the first thing you tackled? Drop it in the comments — I'm genuinely curious what others are automating.

If you want to set something like this up for your own workflow or business, I do this as a service too: Custom AI Automation Workflows

Daily automation tips: t.me/celebibot_en

DEV Community