DEV Community

fujibee


The claude -p playbook for June 15 — rebuilding your AI workflows inside interactive sessions

On June 15, Claude's claude -p (headless mode) and the Agent SDK stop drawing from your subscription and move to a separate metered credit. If you've built a pile of claude -p scripts, the news probably landed with a small jolt — I've written plenty of them myself, and "wait, all of that is metered now?" was my first reaction too.

But step back and it's not really one company's pricing decision. The whole industry is converging on the same shape:

  • GitHub Copilot moved to AI Credits on June 1 (completions stay free; Chat, CLI, and agents consume credits)
  • OpenAI Codex pairs seat pricing with credits + API usage
  • And Claude, on June 15, splits headless and the SDK onto a separate credit

The shape they all landed on is the same: interaction stays flat-rate, automation gets metered.

This post is about how to read that, and what to actually do with your claude -p scripts. It comes from a few months of running a "multi-agent inside interactive sessions" setup day to day.

One thing up front: this is not a billing-evasion hack. The conclusion isn't "go back to doing everything by hand," and it isn't "keep everything headless" — it's a redesign that sits between the two.

That said, if you opened this because you want to know "so can I keep doing my claude -p stuff inside the flat-rate plan?", some of the recipes below do read that way. Terms and pricing lines can shift, so check the current conditions and use your own judgment. My argument is "because it's a better setup," not "because it's cheaper," but either door is fine.

Why everyone converged on the same shape

If you want the broader picture, "The flat-fee era is over" walks through the cost mechanics across providers, and "The Tokenpocalypse" frames what survives the meter. The gist:

Old-style completion produced short outputs, and flat-rate pricing worked. Agentic tools are different: behind one user request sits a flood of tokens as the agent reads the repo, searches files, runs tests, and patches. The cost gap between a light user and a heavy user became extreme (a flat fee can miss a heavy user's real token cost by up to 10x), and "unlimited flat-rate for everyone" stopped being mathematically sustainable. The same shift that happened when cloud went from "server rent" to "metered usage" is now happening to LLM tokens.

Overlay how each vendor drew its line and something interesting shows up: usage where a human is at the screen (interaction) stays flat-rate; usage where the human steps away (headless, SDK, CI) goes metered.

The reason is simple — interaction is throughput-capped. A human reads, thinks, types. One session's consumption tops out at human speed. Headless can be called in a loop, without bound. The pricing is the answer to "which kind of usage can a flat rate actually support," and at the same time it's a statement about which usage the platform will structurally favor.

So the thing June 15 quietly tells you: the economically durable surface is inside the interactive session.

Was your claude -p really an "unattended" job?

Looking back, a fair amount of what I ran through claude -p didn't strictly need to be unattended. I wanted a second agent's opinion while I was working. I wanted a review from a different model. I wanted a refactor running in parallel on another model — and in every case I was right there. But there was no channel between sessions, so I had two options: be the copy-paste courier myself, or write claude -p into a script to bridge them.

This is a common pattern as a stack matures: when a part is missing, the neighboring part carries its role. While agent-to-agent messaging didn't exist, headless calls and glue scripts carried that weight. Headless wasn't the wrong tool; it was the only channel.

Now that the missing part is filling in, you can redraw the division of labor. The response to June 15 isn't only "budget for headless credits" (some jobs genuinely need it — more below); it's also moving the carried-over work back to where it belongs, and leaving claude -p only the jobs that truly have to be headless. If agents can talk to each other directly inside interactive sessions, you need neither the courier nor the bridge script.

The channel I built for that is agmsg — a messaging layer that runs on nothing but bash + SQLite, letting Claude Code / Codex / Gemini CLI / Copilot CLI sessions form a team and message each other. No daemon, no network, not MCP. And the key part: send and receive both run inside your normal interactive sessions, through a hook. No claude -p, no SDK.

https://github.com/fujibee/agmsg

Here's how to rebuild the common claude -p patterns, one by one.

Recipe 1: a script that "asks an AI" → a resident buddy session

Before: code review or a quick consult, fired off to headless from inside a script. The script is non-interactive, but the AI work inside it is fundamentally a conversation.

#!/usr/bin/env bash
# review-diff.sh — runs in the dev loop
set -euo pipefail
diff=$(git diff --staged)
# headless call — metered after June 15
review=$(claude -p \
  "Review this diff for race conditions and SQL injection. Return JSON \
  {\"verdict\":\"ok|block\",\"notes\":\"...\"}." \
  <<<"$diff")
echo "$review" | jq -e '.verdict == "ok"' >/dev/null || {
  echo "$review" | jq -r '.notes' >&2
  exit 1
}

After: open a second Claude Code in another terminal and keep it in the team (real-time monitor mode). The reviewer is a regular interactive session, covered by your subscription.

One-time setup (terminal 2, the reviewer) — just this, inside the session:

/agmsg
# → it asks for team + agent name: answer team: dev / agent: alice
# → pick monitor (real-time) when asked how to receive

# from now on, anything addressed to alice streams into this window live

The script side (terminal 1, where claude -p used to be):

#!/usr/bin/env bash
# review-diff.sh — agmsg version
set -euo pipefail
TEAM=dev
FROM=worker     # this script's identity
TO=alice        # the resident reviewer session
# hand the review to the live session; body is plain text —
# pass a reference and a short ask, not raw context
~/.agents/skills/agmsg/scripts/send.sh "$TEAM" "$FROM" "$TO" \
  "Please review the staged diff at $(pwd). Reply with verdict (ok|block) + notes."
echo "Waiting for $TO's verdict..."
while true; do
  reply=$(~/.agents/skills/agmsg/scripts/inbox.sh "$TEAM" "$FROM" 2>/dev/null || true)
  case "$reply" in
    *"verdict: ok"*)    echo "ok"; exit 0 ;;
    *"verdict: block"*) echo "$reply" >&2; exit 1 ;;
  esac
  sleep 5
done

The reviewer looks at the diff, replies with /agmsg send worker "verdict: ok ...", and the script resumes. No claude -p invoked.
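The polling loop in the script waits forever if the reviewer never answers. Here is a minimal sketch of a bounded wait, assuming replies contain the literal text "verdict: ok" or "verdict: block" as in the script above. The names classify_reply and wait_for_verdict, the fetch-command argument, and the exit codes are hypothetical and not part of agmsg.

```shell
#!/usr/bin/env bash
# Sketch only: bounded polling for a reviewer verdict.
# Assumption: replies contain "verdict: ok" or "verdict: block" as plain text.

# classify_reply TEXT -> prints ok | block | pending
classify_reply() {
  case "$1" in
    *"verdict: ok"*)    echo ok ;;
    *"verdict: block"*) echo block ;;
    *)                  echo pending ;;
  esac
}

# wait_for_verdict FETCH_CMD TIMEOUT_SECS [INTERVAL_SECS]
# FETCH_CMD is the name of a command or function that prints the current
# inbox text (hypothetical; in the script above it would wrap inbox.sh).
# Returns 0 on ok, 1 on block, 2 on timeout.
wait_for_verdict() {
  local fetch="$1" timeout="$2" interval="${3:-5}"
  local deadline=$(( SECONDS + timeout ))
  local reply state
  while :; do
    # tolerate a failing fetch so one bad poll does not end the wait
    reply=$("$fetch" 2>/dev/null || true)
    state=$(classify_reply "$reply")
    case "$state" in
      ok)    return 0 ;;
      block) printf '%s\n' "$reply" >&2; return 1 ;;
    esac
    if (( SECONDS >= deadline )); then
      return 2
    fi
    sleep "$interval"
  done
}
```

With a small wrapper function around inbox.sh, the script could call something like wait_for_verdict fetch_inbox 600 and treat exit code 2 as "nobody answered, ask a human."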

The point: the buddy is a resident interactive session, not a headless process that vanishes after each call. Context persists, so from the second ask onward, "same approach as before" just works. In my experience that gives better results than a headless call you have to brief from zero every time.

One more contrast: versus subagents. A subagent is briefed by the parent session, so even when you say "review this independently," the answer tends to come back inside the frame of the parent's hypothesis. A buddy in a separate session starts from its own context, so the review is genuinely independent. One user moved to agmsg for exactly this reason (a data-analysis use case; I'm putting together a use-case collection with more detail).

Recipe 2: an orchestration script → a director agent

Before: a self-built orchestration script that fans out to several models in parallel and merges the results. Credits drain by the number of workers.

#!/usr/bin/env bash
# judge-panel.sh — three independent reviews → synthesis
set -euo pipefail
problem="$1"
# three parallel headless calls, each one metered.
# Results go to files: `var=$(cmd) &` assigns inside a background subshell,
# so the parent script would never see the value.
out=$(mktemp -d)
claude -p "$problem" >"$out/a.txt" &
claude -p "$problem (focus on security)" >"$out/b.txt" &
codex exec --headless --prompt "$problem (focus on perf)" --json >"$out/c.txt" &
wait
# synthesis: yet another call, fed the three reviews on stdin
cat "$out"/*.txt | claude -p "Synthesize these three reviews into one verdict: ..."

After: stand up one director session and put the workers (models/tools can be mixed) in the same team. The orchestration logic becomes the director agent's instructions; fan-out, collection, and synthesis run over agmsg messages.

Setup: the three workers (each its own terminal / window / IDE pane) and the director (the session you actually work in), one /agmsg each:

# inside each session
/agmsg
# → answer team: panel / agent: reviewer-a (b, c, director), pick monitor

Then you just talk to the director session:

You (in the director session):
  "Send reviewer-a a correctness review, reviewer-b a security review,
   reviewer-c a performance review, of the staged diff.
   Wait for all three, then synthesize."

director: [sends three agmsg messages, polls inbox, and once all three
           land, writes the synthesis in this same session]

The control flow you used to script becomes plain instructions. Zero claude -p.
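If you ever want to check the "wait for all three" step from a script rather than leave it to the director, it could look roughly like this. It is a sketch under a loud assumption: each inbox line starts with the sender name followed by a colon. That format is invented here for illustration and is not agmsg's documented output, and the function names are hypothetical too.

```shell
#!/usr/bin/env bash
# Sketch only: decide whether every expected reviewer has replied.
# Assumption (hypothetical format): each inbox line starts with "<sender>: ".

# has_reply_from INBOX_TEXT NAME -> exit 0 if some line starts with "NAME: "
has_reply_from() {
  local inbox="$1" name="$2" line
  while IFS= read -r line; do
    # the quoted part matches literally; the trailing * matches the rest
    if [[ $line == "$name: "* ]]; then
      return 0
    fi
  done <<<"$inbox"
  return 1
}

# missing_repliers INBOX_TEXT NAME... -> prints names with no reply, one per line
missing_repliers() {
  local inbox="$1" name
  shift
  for name in "$@"; do
    if ! has_reply_from "$inbox" "$name"; then
      echo "$name"
    fi
  done
}

# all_replied INBOX_TEXT NAME... -> exit 0 only when nobody is missing
all_replied() {
  [ -z "$(missing_repliers "$@")" ]
}
```

In practice the director agent does this bookkeeping itself from your instruction; the sketch only shows how small the logic is once messages carry a sender.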

A real example: on Fable 5 release day, a user reported running a refactor with Fable 5 as commander, Opus 4.8 as deputy, and Sonnet 4.6 as scouts. The day a new model ships, you can drop it straight into your existing lineup — proof the transport doesn't depend on the model.

Recipe 3: "run it in the background while I work" → a dedicated worker session

Before: a task you want running alongside your own work (test scaffolding, doc updates, research), fired via cron or a script + claude -p.

After: in the morning, open one more terminal, spin up a worker session, keep it in the team (inside the session: /agmsg → team: dev / agent: worker / pick monitor). Whenever something comes up, send it a message from your main session. It acts on each inbound and reports back when done.

Kickoff message (from your main session):

You: send worker: "Analyze /tmp/deploy.log for slow queries. Write findings
to /tmp/deploy-findings.md. Ping me when done."

It starts on its next monitor event, does the work, writes the file, then:

worker → me: "deploy-findings.md ready. 3 queries > 500ms, all on the
users table. Top one is the email_lower lookup — index missing."

Unlike a backgrounded headless job, this leaves no orphaned processes. Close the worker terminal and the work stops. That's what "running where you can see it" means.

This isn't "give up on automation and go manual." Dispatch and progress stay automated; execution just comes back into the same room as you. When something breaks, it's right there in the next terminal — healthier than a cron job that fails quietly at 3am.

Recipe 4: a tool-to-tool pipe → a message

Before: a pipe or intermediate file plus glue script to pass Claude's output to Codex (or the reverse).

After: put both sessions in the same team and let them talk directly.

This one runs here every day, so I can give a real case. Last week agmsg itself got a Linux-only bug report (issue #95). The Claude Code that implemented the fix and a separate session that verified it in a Debian container handled the whole thing — request → environment details → verification results → merge report — over agmsg messages, and it went from report to merge in six hours (PR #97). Both were ordinary interactive sessions. Zero glue script.

The minimum setup:

# Terminal A — Claude Code, joined as "analyst"
#   /agmsg → team: dev / agent: analyst
# Terminal B — Codex, joined as "implementer"
#   $agmsg → team: dev / agent: implementer

Then the flow is just speech:

# Terminal A (analyst):
You: "send implementer: the settings writer embeds settings.local.json into
     the sqlite3 argv 6x (issue #95). readfile() in the SQL should keep it
     off argv entirely. Patch + tests please."

# Terminal B (implementer): receives the message, writes patch + tests, runs bats:
You: "send analyst: branch fix/delivery-e2big pushed. 3 bats cases at a 25KB
     fixture, all green locally. Ready for review."

# Terminal A: reads the PR, approves inline, then:
You: "send implementer: approving. one ask — verify on Linux too; the pre-fix
     code should fail there with E2BIG."

The hand-off carries a sentence and a reference. The receiver does the heavy work in its own session, against the live filesystem. Nothing is piped.
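To make "a sentence and a reference" concrete, here is a minimal sketch of a helper that builds such a message and refuses anything that looks like raw context. The function name build_handoff, the message wording, and the MAX_BODY_BYTES limit are all hypothetical choices for illustration, not agmsg settings.

```shell
#!/usr/bin/env bash
# Sketch only: build a hand-off message that carries a reference, not content.
# MAX_BODY_BYTES is an arbitrary illustrative limit, not an agmsg setting.
MAX_BODY_BYTES=${MAX_BODY_BYTES:-500}

# build_handoff WORKDIR REF ASK -> prints the message body, or fails if too long
build_handoff() {
  local workdir="$1" ref="$2" ask="$3" body
  body="In ${workdir} (ref: ${ref}): ${ask}"
  # ${#body} counts characters, which equals bytes for plain ASCII asks
  if (( ${#body} > MAX_BODY_BYTES )); then
    echo "hand-off too long (${#body} chars); point to a file instead" >&2
    return 1
  fi
  printf '%s\n' "$body"
}
```

The output could then be passed as the message argument to send.sh, as in the Recipe 1 script. Keeping the body short also keeps it well away from argv size limits like the E2BIG case mentioned above.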

One honest note — running the same prompt across several models in parallel for independent takes is still better as a headless pipe. That's a "budget the credits and use it" decision (next section).

When to keep it headless

This isn't "replace everything." The following are right for headless, and worth budgeting the new credits for:

  • CI pipelines — running where no human is, by definition. GitHub Actions integration is on the new credit too.
  • Nightly batches / scheduled jobs — genuinely meant to run unattended.
  • SDK usage embedded in a product — should be costed as metered in the first place.
  • One-shot massive parallelism (the same input across 50 model calls) — a credit-budget decision, not a workflow one.

The test is simple: when it's running, are you there? If yes, it can move into an interactive session. If no, budget the headless credits.
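A script can approximate that test for itself. The sketch below is a rough proxy, not a rule: it assumes a CI runner exports a non-empty CI variable (GitHub Actions sets CI=true) and that an attended run has a terminal on stdin. The function name run_mode is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: a rough proxy for "when it's running, are you there?"
# Assumptions: CI runners export a non-empty CI variable, and an attended
# run has a terminal attached to stdin.

# run_mode -> prints "headless" or "interactive"
run_mode() {
  if [ -n "${CI:-}" ]; then
    echo headless       # CI runner: nobody is at the screen
  elif [ -t 0 ]; then
    echo interactive    # stdin is a terminal: someone launched this by hand
  else
    echo headless       # cron, pipes, nohup: no terminal on stdin
  fi
}
```

Only stdin is checked because stdout stops being a terminal as soon as the function is called inside $(...). A wrapper could use the result to choose between messaging a resident session and falling back to a budgeted claude -p call.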

Not because it's cheaper, because it's better

I read June 15 as a message from the platform: development with a human in the loop gets structurally favored.

But the reason I recommend "keep the interaction, automate only the coordination between agents" isn't just that it fits the flat-rate bucket. A few months in, what I've found is that development where judgment stays with the human is faster in the end. The coordination between agents proceeds automatically; the human is pulled in only at the branch points and the final call. Not the copy-paste courier, not the 3am cron failure log.

The pricing just happened to point at that shape.

agmsg is open source. Setup is 30 seconds; if you have bash and sqlite3, it runs.
https://github.com/fujibee/agmsg
