DEV Community

How I Automated CS, Bug Fixes, and Competitor Monitoring with Claude Code Schedule

kanta13jp1 on April 12, 2026

How I Automated CS, Bug Fixes, and Competitor Monitoring with Claude Code Schedule — Zero Servers, Zero API Cost
Pavel Gajvoronski

This is a great setup. Using CLAUDE.md as a living operations manual that actually executes — not just documentation — is exactly the right mental model.
Your 9-task table is really clean. I'm curious about the cs-check running hourly — how accurate is it at deciding when to fix a bug vs escalate? That boundary between "agent handles it" and "human needs to look" is the hardest part to get right.
I'm building something similar but with 28 specialized agents instead of one Claude instance handling everything. Each agent has its own CLAUDE.md — a dedicated security auditor, a researcher, a content writer, a bug fixer — and they delegate to each other through chains. Your architecture with Edge Functions as thin APIs is smart. I'm using a FastAPI gateway with Redis pub/sub for the same purpose — routing tasks to the right agent and logging everything.
The $0 additional cost point is powerful. I'm seeing similar results with model routing — using free/budget models for routine tasks and only hitting Opus for architecture decisions. Keeps monthly costs under $300 even with heavy usage.
One suggestion: consider logging task outcomes to a knowledge base (I use an Obsidian vault with git). After a few months, your competitor monitoring data becomes a goldmine for strategic decisions — but only if it's searchable and connected, not just sitting in a database table.
Solid build. Following for updates.

kanta13jp1

Thanks Pavel — really appreciate the detailed breakdown of your setup.

On the cs-check fix/escalate boundary: we use a two-layer decision. First, keyword heuristics ("error", "broken", "can't", "bug") flag potential fixes. Then Claude reads the actual source file and tries to identify the root cause — if it's a null check, a typo, or a simple logic error, it fixes and commits. If it touches auth, billing, or requires understanding cross-file state beyond ~3 files, it escalates. In practice, maybe 20–30% of tickets get auto-fixed; the rest go to the cs-notes/ escalation doc. The false-positive rate (escalating something fixable) is fine — the dangerous direction is auto-fixing something that shouldn't be. We bias toward escalation hard.
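Roughly, that two-layer triage looks like this. A simplified sketch — the keywords, the file-count threshold, and the protected-area list here are illustrative stand-ins, not the exact values we run:

```python
# Simplified sketch of the cs-check fix/escalate triage described above.
# Keywords, thresholds, and protected areas are illustrative placeholders.

FIX_KEYWORDS = {"error", "broken", "can't", "bug"}
PROTECTED_AREAS = ("auth", "billing")   # never auto-fix these
MAX_CROSS_FILE_SCOPE = 3                # escalate beyond ~3 files

def triage(ticket_text: str, touched_files: list[str]) -> str:
    """Return 'auto-fix' or 'escalate', biased hard toward escalation."""
    text = ticket_text.lower()
    # Layer 1: keyword heuristics flag potential fixes.
    if not any(kw in text for kw in FIX_KEYWORDS):
        return "escalate"
    # Layer 2: scope checks before the agent is allowed to fix and commit.
    if any(area in path for path in touched_files for area in PROTECTED_AREAS):
        return "escalate"
    if len(touched_files) > MAX_CROSS_FILE_SCOPE:
        return "escalate"
    return "auto-fix"
```

Everything ambiguous falls through to escalation — which is the bias I described: the dangerous direction is auto-fixing, not over-escalating.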

Your 28-agent chain architecture is fascinating. We're doing something in between — 3 Claude Code instances with distinct CLAUDE.md scopes (VSCode for UI, PowerShell for CI/CD, Windows for migrations/docs) plus GitHub Actions workflows as "lightweight agents." Each instance can read cross-instance coordination files but won't touch the other's directories. Your dedicated security auditor agent is something I want to steal — we currently just fold that into PR review.

The FastAPI + Redis pub/sub vs Edge Functions tradeoff is interesting. Edge Functions give us zero infra overhead and the Supabase auth layer for free, but you lose persistent connections and local state. Redis pub/sub solves exactly the problems we hit with chained task coordination. Curious how you handle agent failures mid-chain — do you checkpoint state to Redis or retry from scratch?

On the knowledge base: we're actually using NotebookLM as a "Master Brain" — every session's learnings get added as sources, and notebooklm ask is the first call when making architecture decisions. The competitor monitoring data is exactly the use case you're describing. The gap is that Supabase tables aren't naturally "searchable and connected" the way an Obsidian graph is. Your point about the vault becoming a goldmine after months resonates — we're only 2 months in so the compounding hasn't kicked in yet.

Following your work too. Would love to see a post on how you handle agent chain failures.

Pavel Gajvoronski

For chain failures we log every step with full context — that's actually one of the core reasons I built TraceHawk. When an agent fails mid-chain, you need to know exactly which tool call caused it and what the state was at that point. Without that visibility you're just retrying blind.
Currently we retry from last successful checkpoint. But the more interesting problem is knowing WHEN to retry vs escalate — that's where the tracing data becomes valuable.
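The resume-from-checkpoint shape, stripped down — here a plain dict stands in for whatever store actually holds chain state, and the step functions are placeholders:

```python
# Toy sketch of resume-from-last-checkpoint. An in-memory dict stands in
# for the real state store; any step function takes and returns the state.

def run_chain(chain_id: str, steps: list, store: dict) -> dict:
    """Run steps in order, checkpointing after each success.
    On failure, record the failing step so a retry resumes there,
    not from scratch."""
    checkpoint = store.get(chain_id, {})
    start = checkpoint.get("next_step", 0)
    state = checkpoint.get("state", {})
    for i in range(start, len(steps)):
        try:
            state = steps[i](state)
        except Exception as exc:
            # Checkpoint the failure with full context before re-raising.
            store[chain_id] = {"next_step": i, "state": state,
                               "error": repr(exc)}
            raise
        store[chain_id] = {"next_step": i + 1, "state": state}
    return state
```

A retry after a mid-chain failure re-runs only the failed step onward; earlier steps' work survives in the checkpointed state.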
Your NotebookLM as Master Brain idea is fascinating — would love to see a post on that.

kanta13jp1

That makes a lot of sense. “Retrying blind” is exactly the failure mode I want to avoid too.

Right now my setup is still simpler than yours: I log the full task context, keep the last known-good state where I can, and bias toward escalation if the same chain fails repeatedly or touches anything high-risk. The weak spot is still observability across handoffs between Claude Code instances — especially around tool-call boundaries.

And yes, the NotebookLM “Master Brain” has been more useful than I expected. It works well as an architecture memory layer, but the syncing process is still too manual. I’ll probably write a dedicated post once I have a cleaner pattern for turning task outputs, failures, and decisions into reusable knowledge.

Really appreciate your thoughts here — this exchange gave me a few ideas.

Pavel Gajvoronski

The observability gap across handoffs between Claude Code instances is exactly where we focused too. In Kepion we solved this with JEP (Judgment Event Protocol) — every agent decision is a cryptographically signed event with hash-linked chain: Judge → Delegate → Verify → Terminate. When a chain fails, we trace backwards through the hash chain and find the exact agent and exact decision that caused the failure. Takes 3 seconds instead of grepping logs for hours.
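A stripped-down illustration of the hash-linked pattern — to be clear, this is not the actual JEP format; the field names, the HMAC signing, and the key handling here are all placeholders for the general idea:

```python
import hashlib, hmac, json

# Illustrative hash-linked decision log. Field names and signing are
# placeholders, not the real JEP event format.

SECRET = b"demo-key"  # stand-in; real keys would not live in code

def append_event(chain: list, agent: str, decision: str, payload: dict) -> dict:
    """Append a signed event whose hash covers the previous event's hash."""
    body = {"agent": agent, "decision": decision, "payload": payload,
            "prev": chain[-1]["hash"] if chain else "0" * 64}
    raw = json.dumps(body, sort_keys=True).encode()
    event = dict(body,
                 hash=hashlib.sha256(raw).hexdigest(),
                 sig=hmac.new(SECRET, raw, hashlib.sha256).hexdigest())
    chain.append(event)
    return event

def verify_chain(chain: list) -> bool:
    """Recompute each hash and check linkage; tampering anywhere breaks it."""
    prev = "0" * 64
    for ev in chain:
        body = {k: ev[k] for k in ("agent", "decision", "payload", "prev")}
        raw = json.dumps(body, sort_keys=True).encode()
        if ev["prev"] != prev or ev["hash"] != hashlib.sha256(raw).hexdigest():
            return False
        prev = ev["hash"]
    return True
```

Tracing a failure then means walking the chain backwards from the broken link instead of grepping unstructured logs.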
Your NotebookLM "Master Brain" approach is interesting — we're doing something similar with our vault (Obsidian-compatible markdown with YAML frontmatter), but we just added typed wikilinks: @based_on, @contradicts, @supersedes. This means the vault can answer "what breaks if this research is wrong?" by tracing dependency relationships. Still early but the pattern of turning task outputs into queryable knowledge graph is clearly the right direction.
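A toy version of that dependency-tracing idea — the link syntax and relation names follow the examples above, but the parsing details and note names are made up for illustration:

```python
import re
from collections import defaultdict

# Hypothetical sketch: trace typed wikilinks like "@based_on [[note]]"
# across a vault to answer "what breaks if this note is wrong?"

LINK_RE = re.compile(r"@(based_on|contradicts|supersedes)\s*\[\[([^\]]+)\]\]")

def build_graph(notes: dict[str, str]) -> dict[str, set[str]]:
    """Map each note to the notes that depend on it via @based_on."""
    dependents = defaultdict(set)
    for name, text in notes.items():
        for rel, target in LINK_RE.findall(text):
            if rel == "based_on":
                dependents[target].add(name)
    return dependents

def impacted_if_wrong(note: str, dependents: dict[str, set[str]]) -> set[str]:
    """Transitively collect everything built on top of `note`."""
    out, stack = set(), [note]
    while stack:
        for dep in dependents.get(stack.pop(), ()):
            if dep not in out:
                out.add(dep)
                stack.append(dep)
    return out
```

Because the relations are typed, the same graph can also answer `@contradicts` and `@supersedes` queries by filtering on a different relation.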
Would love to read that dedicated post about turning failures into reusable knowledge. That's essentially what our Agent Learning Loop does — after each task, the agent scores itself and saves successful approaches as reusable patterns. The key insight: failures are more valuable than successes for learning, because they reveal edge cases the prompt didn't cover.
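In miniature, the scoring side of that loop might look like this — the class and method names are invented for the sketch, not how our Agent Learning Loop is actually structured:

```python
from collections import defaultdict

# Minimal sketch of the "failures are more valuable" idea: record outcomes
# per approach and surface the anti-patterns first.

class LearningLoop:
    def __init__(self):
        self.outcomes = defaultdict(list)  # approach -> [(success, note)]

    def record(self, approach: str, success: bool, note: str = ""):
        self.outcomes[approach].append((success, note))

    def anti_patterns(self) -> list[str]:
        """Approaches that fail more than they succeed: review these first,
        since each failure note points at an edge case the prompt missed."""
        bad = []
        for approach, runs in self.outcomes.items():
            fails = sum(1 for ok, _ in runs if not ok)
            if fails > len(runs) - fails:
                bad.append(approach)
        return sorted(bad)
```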

kanta13jp1

That’s a really interesting architecture. The cryptographically signed, hash-linked event chain is a clever way to turn “observability” into something closer to provenance, not just logging. Once decisions become first-class events, debugging stops being archaeology.

The typed wikilinks idea resonates too. @based_on / @contradicts / @supersedes feels like the missing layer between “saved notes” and an actual decision graph. We’re still earlier than that — our memory layer is useful for recall, but not yet strong enough to answer dependency questions like “what assumption did this design depend on?”

And I completely agree on failures being more valuable than successes. Successful runs tell you what worked once; failures tell you where your current abstractions break. That’s probably the next thing I want to formalize: not just storing outcomes, but extracting reusable anti-patterns and escalation triggers from failed chains.

Really appreciate you sharing this — there’s a lot here worth stealing.