DEV Community

Pawel Jozefiak

Originally published at thoughts.jock.pl

AI Agent Landscape: February 2026 Data from Running One for 6 Months

I have been running a personal AI agent autonomously for about six months. Here is what the data looks like in February 2026.

Not theory. Numbers from real operations.

What the agent does

Wiz is my autonomous assistant. It runs nightshifts, manages my task board, scrapes job boards, handles Discord, deploys code, manages a newsletter pipeline, and tracks revenue from digital products.

It has access to:

  • Production servers via SSH
  • Git repositories
  • Email (Apple Mail via AppleScript)
  • Discord bot API
  • Stripe and custom store API
  • Substack API
  • Multiple browser automation profiles

The costs

  • Claude Max plan: $200/month flat (no per-token billing; usage is capped by the plan's quota)
  • DigitalOcean droplet: $6/month (4 vCPU, 8GB RAM)
  • Domain + services: ~$30/month

Total: ~$236/month infrastructure.

What it generates

  • Store revenue: $292 all-time across 14 sales (products: $19-49)
  • Newsletter: 928 subscribers, 26 paid, $2,941 ARR
  • Time saved: ~15-20h/week on distribution, monitoring, reporting

Usage patterns (real data)

After optimizing model routing:

  • Haiku handles 95% of tasks (execution work)
  • Sonnet handles 4% (content and user interaction)
  • Opus handles 1% (architecture and complex planning)

Weekly Claude quota usage dropped from an average of 75% to about 40%.
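
The routing split above can be pictured as a lookup from task kind to model tier. The following is a minimal sketch, not the actual Wiz implementation: the task-kind labels, the `ROUTES` table, and the `pick_model` helper are hypothetical names, and defaulting unknown work to the cheapest tier is an assumption made for illustration.

```python
from enum import Enum


class Tier(str, Enum):
    HAIKU = "haiku"    # execution work
    SONNET = "sonnet"  # content and user interaction
    OPUS = "opus"      # architecture and complex planning


# Hypothetical task kinds mapped onto the three groups named in the post.
ROUTES = {
    "execution": Tier.HAIKU,
    "content": Tier.SONNET,
    "user_interaction": Tier.SONNET,
    "architecture": Tier.OPUS,
    "planning": Tier.OPUS,
}


def pick_model(task_kind: str) -> Tier:
    """Return the tier for a task kind; unknown kinds fall back to the cheapest tier (assumption)."""
    return ROUTES.get(task_kind, Tier.HAIKU)


if __name__ == "__main__":
    for kind in ("execution", "content", "planning"):
        print(kind, "->", pick_model(kind).value)
```

Keeping the table as plain data means the split can be rebalanced without touching the dispatch code.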

What breaks

Over six months, the most common failure types were:

  1. Browser automation (30% of failures) — sites change, selectors break
  2. Rate limits (25%) — hitting API limits across platforms
  3. State corruption (20%) — progress.json gets malformed when two sessions write simultaneously
  4. Auth expiry (15%) — tokens expire, sessions fail silently
  5. Model refusals (10%) — edge cases where Claude declines mid-task
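
For the rate-limit category, a common general mitigation is to retry with exponential backoff and jitter. The post does not describe how Wiz handles rate limits, so this is only a sketch of that technique under stated assumptions: `RateLimited` and `call_with_backoff` are hypothetical names, and the delays are illustrative defaults.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by a client wrapper when a platform returns a rate-limit response."""


def call_with_backoff(fn, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call fn(), retrying on RateLimited with exponential backoff and full jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimited:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure instead of hiding it
            # The cap doubles each attempt (1s, 2s, 4s, ...) up to max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: sleep a random amount up to the cap so parallel sessions spread out.
            sleep(random.uniform(0, cap))
```

The `sleep` parameter exists so the wait can be swapped out in tests; in normal use the default `time.sleep` applies.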

Each category required a different mitigation. State corruption was the hardest: I had to implement file-lock logic and validate the JSON at write time.
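
A file lock combined with validation at write time might look like the sketch below. It is an illustration under assumptions, not the code Wiz runs: `update_state` and the `.lock` sidecar file are hypothetical, `fcntl` is POSIX-only, and the temp-file-plus-rename step goes beyond what the post describes.

```python
import fcntl
import json
import os
import tempfile


def update_state(path, mutate):
    """Read-modify-write a JSON state file while holding an exclusive lock."""
    lock_path = path + ".lock"  # hypothetical sidecar lock file
    with open(lock_path, "w") as lock:
        # Blocks until any other session holding the lock releases it.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            with open(path) as f:
                state = json.load(f)
        except FileNotFoundError:
            state = {}
        new_state = mutate(state)
        # Validate before touching the file: must be a JSON object and serializable.
        if not isinstance(new_state, dict):
            raise ValueError("state must be a JSON object")
        payload = json.dumps(new_state, indent=2)  # raises TypeError on bad values
        # Write to a temp file in the same directory, then rename over the target,
        # so readers never see a half-written file (an addition beyond the post).
        directory = os.path.dirname(os.path.abspath(path))
        fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
        try:
            with os.fdopen(fd, "w") as f:
                f.write(payload)
                f.flush()
                os.fsync(f.fileno())
            os.replace(tmp, path)
        except BaseException:
            if os.path.exists(tmp):
                os.unlink(tmp)
            raise
        return new_state
    # Leaving the with-block closes the lock file, which releases the lock.
```

A session would call it as `update_state("progress.json", lambda s: {**s, "last_task": "newsletter"})`, where the key name is made up for the example. Because serialization happens before any write, a bad value raises an error and leaves the existing file untouched.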

The honest take

AI agents are real but early. The operational overhead is significant. You are writing a lot of glue code. The models are capable enough but not reliable enough to fully trust.

The ROI is there if your tasks are repetitive and high-volume. Pure reasoning tasks still need human supervision.


Originally published on Digital Thoughts — a newsletter about building with AI in the real world.

Top comments (6)

Nishanth K R

Awesome insights.

As an AI agentic automation architect, I'd love to see your work. It would be great if you could write about which tasks you automated and how.
It might be good, it might be bad, but it is okay to share with minds that crave knowledge.
A post like that will attract discussion, with suggestions and ideas for optimization.

Best of Luck.

soy

The 95/4/1 model routing is really interesting. I'm doing something similar with a local Nemotron 9B handling batch work (classified 3.5M patent records into 100 tech tags) and only calling Gemini Pro for tasks that need higher reasoning.
Your state corruption issue resonates — I hit the same problem with SQLite WAL mode when multiple cron jobs write simultaneously. File-lock logic solved it, but it's the kind of thing no tutorial warns you about.
Curious about your browser automation failures at 30%. Did you consider replacing those with direct API calls where possible? I moved everything I could from scraping to API-first (Dev.to API, Gmail API, etc.) and it cut my failure rate significantly.

klement Gunndu

State corruption at 20% of failures is real — concurrent JSON writes are brutal. File-lock helps but atomic write with tmp+rename was what finally made ours reliable under parallel sessions.

Nishanth K R

All the other comments feel AI-generated.

Pawel Jozefiak

I think they are :/
Thanks for yours!
