DEV Community

Build Your Own AI Butler - A Scheduled Agent That Runs Itself!

Erik Hanchett on May 06, 2026

I want an AI agent that works for me. I want it to search up the latest news, and I want it to deliver it to me in a daily message. I want to chat ...

Pururva Agarwal • May 7

Your pursuit of a 'controlled' AI agent is key. While Bedrock AgentCore shines for general tasks, true agent mastery in specialized domains demands granular control beyond typical managed services.

Consider a drug interaction agent for India.

Nikolaos Christoforakos • May 7

The Reddit RSS detail is the actual lesson. Half of building agents is figuring out which sites will let you in and which will 403 you forever. Nobody writes the tutorial about that part.

Erik Hanchett AWS • May 7

Yes, that happens a lot! You have to be able to get around it. Websites are actively blocking agents, which makes this much harder.

Erik Hanchett AWS • May 6

Let me know what you think!

Prakhar Srivastava • May 7

Great post!

Erik Hanchett AWS • May 7

Thanks!

Harjot Singh • May 31

A scheduled agent that runs itself is where agents get genuinely useful and also genuinely risky, because the moment a human isn't in the loop pressing go, every failure mode runs unattended. The butler framing is nice, but the engineering question is what happens at 3am when it misfires and nobody's watching.

Two things become non-negotiable for self-running agents. First, bounded autonomy with gates on the irreversible: it can do the reversible stuff freely on schedule, but anything that spends money, sends externally, or deletes should hit a gate or a hard policy, because an unattended confident mistake has no human to catch it mid-flight. Second, observability and alerting: a scheduled run that silently fails (or silently does the wrong thing) is worse than one that errors loudly, so it needs to log every decision and surface anomalies. Otherwise you find out when the damage is done.

The cron-job-with-a-brain pattern is powerful precisely because it compounds: it runs while you sleep, which means good behavior and bad behavior both compound, so the harness around it is what decides which. Automate the routine, gate the dangerous, and make it shout when something's off. That make-unattended-safe instinct is core to how I think about scheduled agents in Moonshift. For your butler, is there a gate on its higher-stakes actions, or does it execute everything autonomously on the schedule?
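
For anyone who wants to see the gating idea in code, here is a minimal sketch. It is not taken from the post: the `Action` type, the `GATED_KINDS` set, and the `execute_with_gate` helper are hypothetical names, and the approval mechanism (a set of pre-approved descriptions) is an assumption standing in for whatever review step you actually use.

```python
import logging
from dataclasses import dataclass, field
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("butler")

# Hypothetical action kinds that should never run unattended.
GATED_KINDS = {"spend", "send_external", "delete"}


@dataclass
class Action:
    kind: str                    # e.g. "read", "spend", "delete"
    description: str             # human-readable label used for approval
    run: Callable[[], object]    # the side effect itself


@dataclass
class GateResult:
    executed: list = field(default_factory=list)
    held: list = field(default_factory=list)


def execute_with_gate(actions, approved=frozenset()):
    """Run reversible actions; hold gated ones unless explicitly approved."""
    result = GateResult()
    for action in actions:
        if action.kind in GATED_KINDS and action.description not in approved:
            # Log loudly so a held action is visible, not silently dropped.
            log.warning("HELD for approval: %s (%s)", action.description, action.kind)
            result.held.append(action.description)
            continue
        log.info("Running: %s (%s)", action.description, action.kind)
        action.run()
        result.executed.append(action.description)
    return result
```

Every decision is logged, so a run that holds something back still leaves a trail you can alert on.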

Mininglamp • May 12

Scheduled agents are great until the cloud bill arrives. Running a lightweight local agent on Apple Silicon for routine tasks (file organization, GUI automation, periodic checks) costs exactly $0 in inference after initial setup. The "butler" metaphor works even better when the butler lives in your house, not in someone else's data center. For tasks that need reasoning power, routing to a cloud model on-demand keeps costs predictable.
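
The local-first routing described here could look roughly like the sketch below. The task names in `ROUTINE_TASKS` and the `route` function are hypothetical, and both models are passed in as plain callables, since the comment does not name a specific local or cloud runtime.

```python
from typing import Callable, Tuple

# Assumed set of task types cheap enough to keep on the local model.
ROUTINE_TASKS = {"file_organization", "gui_automation", "periodic_check"}


def route(
    task_type: str,
    prompt: str,
    local_model: Callable[[str], str],
    cloud_model: Callable[[str], str],
) -> Tuple[str, str]:
    """Send routine tasks to the local model, everything else to the cloud."""
    if task_type in ROUTINE_TASKS:
        return "local", local_model(prompt)
    return "cloud", cloud_model(prompt)
```

Returning the chosen backend alongside the answer makes it easy to count cloud calls per run and keep an eye on cost.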

Xidao • May 12

Great walkthrough of the AgentCore harness approach. The "define in config + system prompt, then deploy" model is a really clean abstraction for scheduled agents.

One thing I have been thinking about with scheduled agents is the cold-start problem — the agent loses conversational context between runs, so the first message of each run needs to re-establish enough state to be useful. Your use of AgentCore Memory for persisting context across runs is the right approach. I have been using a similar pattern where each run writes a structured "session summary" to a persistent store, and the next run loads it as context. The key is keeping those summaries compact — if they grow unbounded, you eventually hit context window limits.
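
A minimal sketch of that bounded session-summary pattern, assuming a local JSON file as the persistent store. The file layout, the `MAX_ITEMS` cap, and both function names are illustrative; this is not AgentCore Memory's API.

```python
import json
from pathlib import Path

MAX_ITEMS = 20  # assumed cap; tune to your context budget


def load_summary(path):
    """Load the previous run's summary, or an empty one on first run."""
    p = Path(path)
    if not p.exists():
        return {"runs": 0, "items": []}
    return json.loads(p.read_text(encoding="utf-8"))


def save_summary(path, summary, new_items, max_items=MAX_ITEMS):
    """Append this run's items and keep only the most recent max_items."""
    items = summary["items"] + list(new_items)
    updated = {"runs": summary["runs"] + 1, "items": items[-max_items:]}
    Path(path).write_text(json.dumps(updated), encoding="utf-8")
    return updated
```

Truncating to the newest entries is the simplest way to keep the summary bounded; summarizing older entries instead of dropping them is another option.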

Curious how you handle failures in the web scraping layer. HN and Reddit pages can change structure unexpectedly — do you have retry logic or fallback data sources, or does the agent just report the failure in its digest?
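
One possible shape for the retry-plus-fallback option raised in that question is sketched below. The post does not say how it handles this, so everything here is an assumption: `fetch_with_fallback` is a hypothetical helper, `fetch` is any callable you supply, and the failure list is what the agent could include in its digest.

```python
import time


def fetch_with_fallback(sources, fetch, retries=3, base_delay=1.0, sleep=time.sleep):
    """Try each source in order with exponential backoff.

    Returns (data, failures). data is None if every source failed.
    """
    failures = []
    for source in sources:
        for attempt in range(retries):
            try:
                return fetch(source), failures
            except Exception as exc:
                failures.append(f"{source} attempt {attempt + 1}: {exc}")
                if attempt < retries - 1:
                    # Back off 1x, 2x, 4x ... before retrying the same source.
                    sleep(base_delay * 2 ** attempt)
    return None, failures
```

Returning the failures even on success means the digest can note that the primary source was down and the fallback was used.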

Mininglamp • May 21

Persistent autonomy is where agents get actually useful beyond the chat window. The scheduling pattern is solid, but the harder engineering problem is state management between runs — how does the agent remember what it already processed yesterday vs. what's new? A lightweight checkpoint/journal approach (append-only log + periodic compaction) scales better than dumping full state into the LLM context every run.
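
Here is a rough sketch of the append-only log with periodic compaction, assuming a JSON Lines file and ISO-format date strings (which compare correctly as plain strings). The function names and the record layout are hypothetical.

```python
import json
from pathlib import Path


def append_entry(journal, item_id, run_date):
    """Record that item_id was processed on run_date (ISO date string)."""
    with open(journal, "a", encoding="utf-8") as f:
        f.write(json.dumps({"id": item_id, "date": run_date}) + "\n")


def _read(journal):
    p = Path(journal)
    if not p.exists():
        return []
    lines = p.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


def seen_ids(journal):
    """IDs already processed, so the next run can skip them."""
    return {entry["id"] for entry in _read(journal)}


def compact(journal, keep_since):
    """Drop entries older than keep_since and collapse duplicate IDs."""
    latest = {}
    for entry in _read(journal):
        if entry["date"] >= keep_since:
            latest[entry["id"]] = entry  # later lines win
    p = Path(journal)
    tmp = p.with_suffix(".tmp")
    tmp.write_text("".join(json.dumps(e) + "\n" for e in latest.values()), encoding="utf-8")
    tmp.replace(p)  # swap in the compacted file
    return len(latest)
```

The agent only needs `seen_ids` in memory, not the full history in its prompt. The trade-off is that an item older than the compaction window could be processed again if it reappears.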

Harpinder • May 22

The EventBridge + memory pattern feels like the right baseline for this.

One thing I'd separate early is "scheduled digest" vs "watch for changes". A daily news summary makes sense on a timer, but a lot of butler-style tasks are better as registered watches: define the source/event/filter once, keep the cursor/checkpoint outside the agent, and only wake the agent when something matches.

That also makes the annoying production questions cleaner: what did it already process, did this run retry, did the scraper fail, and why did the agent wake up at all?
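
The registered-watch idea can be sketched like this, assuming each source item carries a monotonically increasing `seq` number and that the cursor is stored outside the agent. The watch dictionary shape, `check_watch`, and `run_watch` are all hypothetical names.

```python
def check_watch(watch, items, cursor):
    """Return (matches, new_cursor) for items newer than cursor."""
    new_items = [item for item in items if item["seq"] > cursor]
    matches = [item for item in new_items if watch["filter"](item)]
    # Advance past everything we looked at, matched or not.
    new_cursor = max((item["seq"] for item in new_items), default=cursor)
    return matches, new_cursor


def run_watch(watch, items, cursor, wake_agent):
    """Wake the agent only when something new matches the filter."""
    matches, new_cursor = check_watch(watch, items, cursor)
    if matches:
        wake_agent(watch["name"], matches)
    return new_cursor
```

Because the cursor and the match list live outside the agent, "what did it already process" and "why did it wake up" can be answered without asking the model.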

Mixture of Experts • May 8

This is a great breakdown. I particularly appreciated the sections on scraping, because I find it's often the messiest part of building autonomous agents. Your approach to handling dynamic content was very helpful. I also found the insight around cost management and leveraging EventBridge a good tip. Is there anything else you've learned since then that would help scale this further? Thanks!

Theo Valmis • May 11

Scheduled agents are one of those patterns that look clean in the demo and bite you in production. The two things that always surface: idempotency when a run fires twice from a transient retry, and visibility into what the agent actually did between runs. Worth wiring CloudWatch alarms on action counts and tool-call diffs before you trust it unattended.
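
For the double-fire case, one common approach is to derive an idempotency key from the schedule name and the scheduled time, then skip the work if that key was already recorded. The sketch below uses an in-memory dict as the store, which is an assumption for illustration only; `run_key` and `run_once` are hypothetical names.

```python
import hashlib


def run_key(schedule_name, scheduled_time):
    """Stable key for one scheduled firing; a retry produces the same key."""
    raw = f"{schedule_name}|{scheduled_time}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def run_once(store, schedule_name, scheduled_time, job):
    """Run job at most once per key. Returns (result, ran_now)."""
    key = run_key(schedule_name, scheduled_time)
    if key in store:
        # Duplicate delivery: return the earlier result, do no new work.
        return store[key], False
    result = job()
    store[key] = result
    return result, True
```

Note that the check and the write here are not atomic. With a shared store and concurrent invocations you would want a conditional write so that only one run can claim the key.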

Vic Chen • May 7

Nice walkthrough. The part I liked most was the practical explanation of why a harness beats a plain Lambda for this kind of job: real browser control, persistent state in /mnt/data, and enough runtime to do multi-step summarization without awkward glue code. That tradeoff gets missed a lot in agent demos. Also appreciated that you showed the scheduler + memory pattern instead of stopping at a toy chat loop — it feels much closer to how useful AI assistants will actually run in production.