If you use Claude Code on a real project for more than one-off coding tasks, you eventually hit the same wall:
the model is good at solving the task in front of it, but every new session still has to reconstruct the project's context from scratch.
For me, that got especially annoying in a solo-dev monorepo. I was not just asking Claude to write code. I was also using it for:
- backlog triage
- bug capture
- planning the next task
- weekly status summaries
- preserving decisions across sessions
At some point I realized I was trying to solve a workflow problem with a better prompt.
That was the wrong move.
What helped was building a thin project-ops layer around Claude Code instead.
My current version uses Jira MCP for backlog work, Confluence for published reports, a local JSON context DB for working memory, maintainer docs for durable context, and a few commands like /standup, /bug, and /rfe.
Then I pulled the reusable parts into a public starter repo without shipping the private project details around them.
The repo is here: restofstack/claude-project-ops-starter.
TL;DR
The useful part of my setup is not one giant prompt.
It is:
- a short `CLAUDE.md` for guardrails
- a `docs/maintainers/` folder for durable project context
- a tiny local JSON file for rolling memory
- real systems of record for backlog, PRs, and releases
- reusable commands for common project-ops tasks
That is the pattern I extracted into a public starter repo.
The Problem With Default AI Usage on Ongoing Projects
The default interaction pattern looks like this:
- open Claude Code
- paste context
- explain the task
- repeat tomorrow
That is fine for isolated implementation work.
It breaks down when each session has to renegotiate:
- what matters in the repo
- where architecture context lives
- what work is already in progress
- which tools are authoritative
- how status should be reported
Once a project is large enough, "just paste more context" stops being a serious strategy.
The Structure I Ended Up With
This is the rough shape:
```
CLAUDE.md
docs/
  maintainers/
    README.md
    overview/
    development/
.workspace-temp/
  context-db.json
.claude/
  commands/
    standup.md
    bug.md
    rfe.md
    reflect.md
    weekly-report.md
```
Each part has a different job.
1. CLAUDE.md Is for Rules, Not Everything
I keep CLAUDE.md short and boring on purpose.
It only contains the repo-level rules that should apply in every session, things like:
- prefer the existing system of record over invented state
- finish work in progress before proposing new work
- never fabricate backlog items or counts
- keep outputs concise and actionable
That file is not where I put architecture notes, runbooks, or a giant project brain dump.
If you overload it, both you and the model stop trusting it.
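For a sense of scale, the whole file can stay under a dozen lines. This is an illustrative sketch of the shape, not the actual contents of mine:

```markdown
# Repo rules

- Prefer the existing systems of record (backlog, PRs, releases) over invented state.
- Finish work in progress before proposing new work.
- Never fabricate backlog items, IDs, or counts.
- Keep outputs concise and actionable.
- Durable project context lives in docs/maintainers/; read it before big changes.
```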
2. docs/maintainers/ Holds the Durable Context
Anything that should survive beyond a session goes into maintainers docs:
- system overviews
- service boundaries
- local development notes
- runbooks
- release notes
This gives Claude a clean place to start, and it has a side benefit: the docs also become useful to humans.
That matters more than it sounds. If a doc is good enough for future-you, it is usually better context for AI too.
3. Local JSON Memory Is Good Enough
I use a small local JSON file for rolling working memory.
Not a service. Not a database product. Just a file.
It stores a few useful things:
- what shipped recently
- branch or PR context
- decisions worth remembering
- estimate patterns
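Concretely, the file can be as plain as a few keyed lists. Everything below is illustrative, not a required schema:

```json
{
  "recently_shipped": ["fix report export timeout"],
  "current_branch": "feature/weekly-report",
  "decisions": ["publish weekly reports to Confluence, not the repo"],
  "estimate_notes": ["schema migrations usually take twice the first guess"]
}
```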
This has been the right level of complexity for solo work because it is:
- cheap
- easy to inspect
- easy to edit
- easy to replace later
I also use a /reflect command to append small memory items instead of trying to manually maintain that file all the time.
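The append itself is trivial, which is the point. A minimal Python sketch, assuming the flat keyed-lists layout above (file path and schema are my illustration, not the starter's exact format):

```python
import json
from datetime import date
from pathlib import Path

# Hypothetical location: a flat JSON object whose values are lists of strings.
DB = Path(".workspace-temp/context-db.json")

def append_memory(key: str, note: str) -> None:
    """Append one dated memory item under the given key, creating the file if needed."""
    db = json.loads(DB.read_text()) if DB.exists() else {}
    db.setdefault(key, []).append(f"{date.today().isoformat()}: {note}")
    DB.parent.mkdir(parents=True, exist_ok=True)
    DB.write_text(json.dumps(db, indent=2) + "\n")
```

Because it is just a file, you can inspect or hand-edit it at any time.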
4. Real Systems of Record Stay Real
Claude should not become your shadow Jira, shadow GitHub, or shadow release tracker.
The actual source of truth should stay in the actual system:
- Jira, GitHub Issues, Linear, or whatever you use for backlog
- PR system
- release history
- docs or wiki
The AI should read from those systems and synthesize useful outputs. It should not replace them.
That boundary is what keeps the workflow practical instead of magical-and-fragile.
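One way to keep that boundary concrete is to make the synthesis step a pure formatting function: it consumes state already fetched from the real systems and renders it, without storing anything of its own. A sketch in Python (the function names are mine, not from the starter):

```python
from typing import Iterable

def _section(title: str, items: Iterable[str]) -> list[str]:
    # Render one titled bullet list; show an explicit placeholder when empty.
    bullets = [f"- {item}" for item in items]
    return [f"{title}:"] + (bullets or ["- (none)"])

def standup_summary(in_progress: Iterable[str],
                    open_prs: Iterable[str],
                    next_up: Iterable[str]) -> str:
    """Format already-fetched backlog and PR state into a short standup note.

    The inputs come from the real systems of record (Jira, GitHub, ...);
    this function only reads and renders, so it never becomes a shadow tracker.
    """
    lines = ["## Standup"]
    for title, items in [("In progress", in_progress),
                         ("Open PRs", open_prs),
                         ("Next up", next_up)]:
        lines += _section(title, items)
    return "\n".join(lines)
```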
The Commands Were the Biggest UX Upgrade
The structure matters, but the commands are what made the whole thing usable day to day.
I extracted the workflows I kept repeating:
`/standup`, `/bug`, `/rfe`, `/reflect`, and `/weekly-report`.
And when I cleaned the setup for a public starter, a few surrounding workflows became part of the picture too: /checkpoint, /sanitize, /docs-sync, /release-notes, and /root-cause.
Each one has a defined job.
For example:
- `/standup` checks memory, git state, PR state, backlog state, and maintainers docs, then recommends the next actions
- `/bug` captures a clean bug report without turning it into a full debugging session
- `/weekly-report` turns project signals into a durable report instead of a one-off chat response
That consistency removed a lot of prompt thrash.
Without commands, every request is basically a blank page. With commands, the common project tasks have defaults.
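Commands themselves are just markdown prompt files under `.claude/commands/`. As a hypothetical sketch of the shape (not the starter's actual file), a trimmed `/bug` command could look like:

```markdown
# bug.md

Capture a clean bug report. Steps:

1. Gather symptom, expected behavior, and repro steps from the conversation; ask only for what is missing.
2. Search the backlog for duplicates before filing.
3. File the issue in the real system of record (Jira or GitHub Issues), not a local note.
4. Reply with the issue link and stop. Do not start debugging.
```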
Where This Fits Relative to Spec Kit
I still use Spec Kit, and I do not think this starter replaces it.
Spec Kit is useful when I want to take one feature or product change and push it toward a clearer spec and implementation path.
This starter handles a different layer: working memory, maintainer docs, standups, bug and RFE capture, reports, handoff, and the repeatable repo workflows that help Claude pick up the thread again tomorrow.
So for me this fills a different gap than Spec Kit or other Claude "superpowers" style workflows.
Why This Was Especially Useful in a Monorepo
Monorepos create a context problem fast.
Even as a solo developer, I still need a reliable way to answer:
- what changed recently?
- what is in progress?
- what got forgotten?
- what should be picked up next?
- what decisions should persist beyond this session?
I did not want to build a custom agent platform to solve that.
I also did not want to keep improvising.
This setup gave me a middle path.
The Portable Part
The original version of this workflow was tied pretty closely to my own stack.
The part worth sharing was the pattern, not the exact tools.
It also needed a cleanup pass before it was publishable. A real working setup usually contains things you should not ship as-is:
- local Claude settings
- project-specific IDs and URLs
- live working memory files
- internal naming conventions
- backlog details that only make sense inside the project
That is why I split the starter into adapters like:
- Jira + Confluence
- GitHub Issues + repo docs
- local JSON + markdown only
So if your stack is different, you can still keep the same model:
- stable guardrails
- durable docs
- lightweight memory
- real systems of record
- repeatable workflows
That was the real extraction goal: publish the useful workflows, not the private residue of my specific project.
What I Would Recommend If You Try This
If you want to copy the idea, I would start with this:
- Keep `CLAUDE.md` short.
- Move durable project context into maintainers docs.
- Use a tiny local memory file before building anything fancier.
- Pick 3-5 workflows you repeat all the time and formalize them.
- Keep the real backlog and release data in the systems you already trust.
That is enough to get most of the value.
Closing
The useful change was not "add more prompt."
It was "design better interfaces for the model."
That is what made Claude Code feel less like a clever autocomplete session and more like a practical project-ops layer for an ongoing codebase.
If you are already using AI on a real repo, that is where I think the leverage is.