DEV Community

Sam Hartley
Sam Hartley

Posted on

How I Automated My Entire Dev Workflow with AI Agents (Running 24/7 on a Mac Mini)

I used to spend 3 hours a day on repetitive tasks. Now an AI agent handles them while I sleep.

This isn't a concept post — this setup is running right now on a Mac Mini in my home office. Here's the full picture.

What Gets Automated

Task Before (Manual) After (AI Agent)
Email triage 30 min/day Automatic
Code review 45 min/day 12 seconds per PR
Customer inbox check 20 min/day Hourly cron job
Calendar reminders Forget constantly Proactive alerts
Weather check Open app manually Agent tells me before I leave
Market data Check 5 websites Live on my watch face

Total time saved: ~2.5 hours/day. That's 75 hours a month I got back.

The Architecture

Mac Mini M4 (always on)
├── AI Agent (orchestrator)
│   ├── Email checker (hourly)
│   ├── Calendar scanner (every 4 hours)
│   ├── Inbox monitor (Fiverr/orders, every hour)
│   └── Market data fetcher (every 15 min)
├── Ollama (local LLM — free inference)
│   ├── Qwen 3.5 9B (general tasks)
│   └── Qwen 3 Coder 30B (via network GPU)
├── Notification layer
│   └── Telegram bot → my phone
└── Background services
    ├── Garmin watch face data feed
    └── System health monitoring
Enter fullscreen mode Exit fullscreen mode

Total hardware cost: $0 additional (Mac Mini was already there).
Monthly API cost: ~$25 (only complex reasoning hits the cloud).

Key Design Principles

1. Don't Automate Everything

The 80/20 rule applies hard:

  • ✅ Automate: Checking, monitoring, formatting, reminders
  • ❌ Don't automate: Decisions that need judgment, creative work, human relationships

I tried automating replies to clients once. Deleted it after day one. Some things need a human touch.

2. Alert, Don't Act

My agent tells me when something needs attention — it doesn't reply to customers, send emails, or make purchases.

One wrong automated email can destroy a client relationship. AI is excellent at detection. It's not great at nuance.

3. Local First, Cloud Fallback

if query_is_simple(prompt):
    # Free, instant, private
    response = ollama.generate(model="qwen3.5:9b", prompt=prompt)
else:
    # Rare — complex reasoning only
    response = claude.complete(prompt)
Enter fullscreen mode Exit fullscreen mode

95% of automation tasks are simple: "Is there a new message? What does it say? Is it urgent?"

A 9B model handles that perfectly at zero cost.

4. Fail Silently, Alert Loudly

If the weather API is down → no notification (who cares).
If a paying customer messages → instant alert to my phone.

Not all failures are equal. Treat them that way.

Real Example: The Inbox Monitor

Every hour, the agent:

  1. Opens the browser → inbox
  2. Scans for new messages
  3. Checks against known spam/scam patterns (regex + AI)
  4. If legitimate: sends me a Telegram notification with the preview
  5. If junk: logs it and moves on quietly

Scam detection rate: 100% so far (pattern matching + LLM analysis).
False positives: 0 (I still review every alert manually before responding).
Time saved: 20 min/day × 30 days = 10 hours/month.

The Full Stack (All Free or Open Source)

Component Tool Cost
LLM inference Ollama + Qwen 3.5 $0
Orchestration Cron jobs + Python scripts $0
Notifications Telegram Bot API $0
Email access Apple Mail + CLI bridge $0
Calendar EventKit (macOS) $0
Browser automation Safari + accessibility APIs $0
Cloud reasoning (10%) Anthropic API ~$25/mo

Monthly total: ~$25. Before this setup I was paying $150+ just in API costs.

How I Started (Without Going Crazy)

I didn't build this in a weekend. Here's the honest timeline:

  • Week 1: Just the email checker. One script, one cron job.
  • Week 2: Added Telegram notifications. Suddenly I could see it working.
  • Week 3: Calendar alerts. This was a game changer.
  • Month 2: Inbox monitoring, market data feed.
  • Month 3: Everything orchestrated together.

The lesson: start with one automation, get it reliable, then add the next. Trying to build everything at once is how you end up with a half-working mess you don't trust.

Lessons From Running This in Production

  1. Logging is everything. When something fails at 3 AM, logs are all you have.
  2. Cron intervals matter. Checking email every minute wastes resources. Every hour is fine for most things.
  3. AI doesn't replace thinking — it replaces the mechanical parts of thinking.
  4. Test on your actual workflow before scaling. My setup works for me; yours will look different.
  5. The value isn't speed — it's offloading mental overhead. I stopped worrying about forgetting to check things.

What's Next

I'm building a Telegram-based version of this that others can use — subscribe to signals, set their own alert rules, powered by the same local AI stack. If that sounds interesting, follow along.

Have you automated parts of your workflow? What was the first thing you tackled? Drop it in the comments — I'm genuinely curious what others are automating.


If you want to set something like this up for your own workflow or business, I do this as a service too: Custom AI Automation Workflows

Daily automation tips: t.me/celebibot_en

Top comments (0)