If you use Claude Code on a real project for more than one-off coding tasks, you eventually hit the same wall:
the model is good at solving the ta...
It works for me too, with one addition. I get Claude to write its own CLAUDE.md and any SKILL.md files it might need. I don't dictate the text; I just make sure it includes all the right elements. Work with Claude as an equal; you just have different abilities and different insights, that's all.
This is the first practical writeup I’ve seen that treats coding agents like an operating system problem instead of a prompt problem. The part that resonates most is the separation between durable context and rolling memory. Once a project has real surface area, session reset becomes a tax on velocity. From a founder angle, the teams that win with AI tooling will probably be the ones that design process primitives around the model, not just better prompts.
The "shadow Jira" problem is real. I've caught myself building mini project trackers inside chat sessions more times than I'd like to admit.
The key insight here for me is CLAUDE.md being short and boring. I used to dump everything in there and it just became noise.
Separating guardrails (CLAUDE.md), durable context (maintainers docs), and working memory (JSON) is a much cleaner mental model.
The /standup and /reflect commands are a nice touch too: they formalize the stuff you're copy-pasting every morning anyway.
Might fork the starter repo and adapt it to our Linear + GitHub setup.
👍 👍 👍
I'm glad you found this helpful. If you end up making improvements, please post a link to your repo. I would love to check it out.
This shift is maturity in action. I fell into the mega‑prompt trap myself, and it never delivered. Treating the model as a project operator is a mental leap many won’t make—you’ve articulated it clearly. Real trouble with this pattern: partial failure. If step 3 fails, you’re left with a half‑refactored mess. In CI/CD, we enforce idempotency and explicit artifact passing. Do you have something similar here, or is it still a linear chain of hope? One idea: let each operation drop a tiny “manifest” of file changes. After everything runs, compare intended state to actual git diff and auto‑rollback mismatches. A mini-Terraform for code ops.
Exactly. I’m not claiming this is fully transactional code ops yet. Right now it’s more guardrails, stable context, and repeatable workflows than true idempotent orchestration.
The protections I rely on today are smaller scoped operations, explicit systems of record, checkpoints/handoffs, maintainers docs, and reviewing actual git state instead of trusting a long agent chain. So yes, partial failure is still a real gap.
I really like your manifest idea. A mini-Terraform for code ops is a good way to describe the next layer: each workflow declares intended changes, emits a small artifact or manifest, and gets reconciled against the actual diff before the run is considered complete.
That’s a different maturity level than the starter I shared, but it’s probably the right direction for higher-risk or multi-step flows.
Glad the manifest idea landed. The guardrails you listed—smaller scoped ops, explicit systems of record, checkpointing—already set the foundation for it. You're closer than you think.
One lightweight starting point: a declarative YAML file per workflow that lists expected file paths and their intended state (created, modified, deleted). After the run, diff it against actual git state. No discrepancy? The manifest auto‑commits as an audit trail. Mismatch? Rollback and flag for review. No heavy orchestration engine needed.
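For illustration only, the check described above could be sketched like this in Python. Nothing here comes from the starter repo: the manifest contents, the function names, and the file paths are all hypothetical, and the manifest is shown as a plain dict (as it might look after loading the YAML file) to keep the sketch standard-library only. The sample text follows the tab-separated shape that `git diff --name-status` prints.

```python
from typing import Dict, List

# Map git's one-letter status codes to the manifest's vocabulary.
GIT_STATUS = {"A": "created", "M": "modified", "D": "deleted"}


def parse_name_status(output: str) -> Dict[str, str]:
    """Parse `git diff --name-status` style output into {path: state}."""
    actual: Dict[str, str] = {}
    for line in output.splitlines():
        if not line.strip():
            continue
        code, _, path = line.partition("\t")
        # Codes outside A/M/D (renames, copies, ...) are kept as "other"
        # so they always surface as a mismatch instead of being dropped.
        actual[path] = GIT_STATUS.get(code, "other")
    return actual


def reconcile(expected: Dict[str, str], actual: Dict[str, str]) -> List[str]:
    """Return readable mismatches; an empty list means the run matched."""
    problems: List[str] = []
    for path in sorted(set(expected) | set(actual)):
        want, got = expected.get(path), actual.get(path)
        if want is None:
            problems.append(f"unexpected change: {path} ({got})")
        elif got is None:
            problems.append(f"missing change: {path} (expected {want})")
        elif want != got:
            problems.append(f"wrong state: {path} (expected {want}, got {got})")
    return problems


# Hypothetical manifest for one workflow (assumed paths, not from the post).
manifest = {
    "src/billing/invoice.py": "modified",
    "src/billing/legacy_invoice.py": "deleted",
    "tests/test_invoice.py": "created",
}

# Sample diff text: one expected file is missing, one unexpected file changed.
sample_diff = (
    "M\tsrc/billing/invoice.py\n"
    "D\tsrc/billing/legacy_invoice.py\n"
    "M\tREADME.md\n"
)

mismatches = reconcile(manifest, parse_name_status(sample_diff))
for line in mismatches:
    print(line)
```

In a real run the text would come from something like `git diff --name-status HEAD`, which leaves out untracked files, so newly created files would have to be staged before the comparison. What to do on a mismatch (rollback, flag for review) is left to the caller; this sketch only reports the discrepancies.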
That "starter" version could slip right into your current checkpointing flow without a rewrite. Curious if you've already prototyped something like that, or if it's still on the whiteboard.
Honestly, I have not invested much time in a prototype. I tried to keep it simple because I did not want to over-engineer, since my use case was pretty basic. I can definitely see room for implementing some of your suggestions.
Really useful framing — the "stop hand-feeding context" point landed for me. Quick question: how big does your local JSON memory file get over time, and does /reflect ever produce noisy entries you have to prune manually?
To be honest, /reflect has been pretty neat. I haven’t had to edit my context JSON at all. Because the schema is
strict, it only stores exactly what I need, so I haven’t really run into noisy entries.
I also haven’t hit a size problem yet, but I’ve only been using this approach for about 3 months. I can imagine
the file getting bigger over time. If that becomes an issue, one idea would be to move to a lightweight database if
I wanted to preserve full context. That said, I’m not convinced I even need full history. Sharding by month or
something similar could be enough, and if I ever needed deeper review, I could create a separate skill to analyze
all the context files.
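As a rough sketch of the two ideas above (a strict schema that keeps noisy entries out, and sharding by month), something like the following Python could work. The field names, the `YYYY-MM.json` file naming, and the helper functions are assumptions made for illustration; the actual schema and layout of the context file are not shown in this thread.

```python
import json
import tempfile
from datetime import date
from pathlib import Path

# Hypothetical schema: field name -> required Python type.
SCHEMA = {"date": str, "task": str, "outcome": str, "next_step": str}


def validate(entry: dict) -> dict:
    """Reject entries with missing, extra, or wrongly typed fields."""
    if set(entry) != set(SCHEMA):
        extra = sorted(set(entry) - set(SCHEMA))
        missing = sorted(set(SCHEMA) - set(entry))
        raise ValueError(f"schema mismatch: extra={extra} missing={missing}")
    for field, expected_type in SCHEMA.items():
        if not isinstance(entry[field], expected_type):
            raise ValueError(f"{field} must be {expected_type.__name__}")
    date.fromisoformat(entry["date"])  # raises ValueError on a bad date
    return entry


def append_entry(memory_dir: Path, entry: dict) -> Path:
    """Append a validated entry to the shard for its month (YYYY-MM.json)."""
    validate(entry)
    day = date.fromisoformat(entry["date"])
    shard = memory_dir / f"{day:%Y-%m}.json"
    entries = json.loads(shard.read_text()) if shard.exists() else []
    entries.append(entry)
    shard.write_text(json.dumps(entries, indent=2))
    return shard


# Usage with a throwaway directory and a made-up entry.
memory_dir = Path(tempfile.mkdtemp())
shard = append_entry(
    memory_dir,
    {
        "date": "2025-03-14",
        "task": "refactor invoice module",
        "outcome": "tests passing",
        "next_step": "remove legacy code path",
    },
)
print(shard.name)
```

Because anything outside the schema is rejected before it is written, there is nothing to prune later, and each month's file stays small enough to load on its own. A separate review step could read all the shards when deeper history is needed.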
honestly the moment that context DB starts driving plan decisions instead of just informing them, you’ve got state that claude wrote being read back as ground truth. who owns that chain when something ships wrong?
This is the maturity jump that separates people who get value from coding agents from people who churn. "Giant prompt" mode treats the model as a vending machine - dump everything, hope. "Project ops" mode treats it as a system with state, conventions, defined tasks, and feedback loops. Same model, completely different output quality.
The reframe that clicks for most people: stop optimizing the single prompt and start optimizing the environment the agent operates in (clear tasks, persistent context, guardrails, checks). It's the difference between shouting instructions and building a workflow. Bonus: project-ops mode is also cheaper, because scoped tasks burn far fewer tokens than one ever-growing mega-conversation. Really like this framing - it's the operational mindset more devs need to hear. What did your task/context structure end up looking like?
i totally get the frustration of reconstructing projects with tools like Claude. it’s cool that you built a project-ops layer to streamline things. if you're ever looking for a way to deploy your app effortlessly, check out Moonshift. you can get a next.js + postgres + auth build up in about 7 minutes, and you own the code on your github. happy to offer you a free run if you're interested.