Mirza Iqbal

Posted on May 25

€40 n8n vs 28% weekly Anthropic quota. Which /goal layer should you actually run?

#n8n #claudecode #agents #llm

Vasu Yadav published a sharp piece on Medium this week.

He called /goal "the most important agent primitive of 2026."

He defined it as a "thread-scoped completion contract" with six components.

Outcome. Verification surface. Constraints. Boundaries. Iteration policy. Stop conditions.

He is half right.

The primitive is real.

The implementation is two years late.

The cost gap between his Anthropic Max setup and the n8n version that DACH enterprises already run is large. Pick the wrong layer for the wrong job and that mistake becomes the single biggest line on your monthly AI bill.

Read his piece first if you have not: "The /goal command is the most important agent primitive of 2026" (Medium, Vasu Yadav, 25 May 2026).

What Vasu's piece actually claims

Quoting him directly so I am not building a straw man.

"You define an outcome. The agent works toward that outcome across multiple turns, evaluating its own progress against evidence, until either the outcome is met, the budget is exhausted, or you pause it."

He frames the broader shift as "prompt-driven work to outcome-driven work."

He gives an honest cost number. One 15-hour /goal run consumed 28 percent of his weekly Claude Code Max quota.

He admits the risks. Scope creep. Rubber-stamping in multi-agent chains. Empirical data thin.

He references three implementations. Claude Code, Codex, and Cursor Background Agents.

The six components, mapped to n8n nodes

This is where it gets uncomfortable.

/goal component	What n8n calls it
Outcome	Workflow end node plus Set node output shape
Verification surface	IF node plus Function node plus custom code
Constraints	Typed expressions plus JSON Schema validator node
Boundaries	Sub-workflow scope plus credential isolation
Iteration policy	Loop Over Items plus Wait node plus Batch Items
Stop conditions	Error Trigger workflow plus max executions per workflow

All six are core n8n primitives. n8n itself shipped its first public release in October 2019 and the workflow engine has carried these node types since the early versions.

Documentation for each node lives in the n8n core-nodes docs at https://docs.n8n.io/integrations/builtin/core-nodes/.

The match is one-to-one against the definition Vasu wrote himself.

Where /goal does something n8n cannot

If n8n already had the contract, what changed in 2026?

One thing.

The contract became LLM-readable.

n8n's outcome is a JSON shape. /goal's outcome is a sentence like "p95 latency below 120ms."

n8n's verification is {{$json.latency_ms < 120}} inside an IF node. /goal's verification is "run pnpm bench, parse the p95 column, compare it to target, decide."

Forget "goal as primitive."

What is new is that fuzzy intent now compiles into deterministic verification.

That is real and worth respect. It also costs differently.
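A minimal sketch of what that compiled check could look like once the sentence "p95 latency below 120ms" has been turned into code. The CSV output format, the `p95_ms` column name, and the function name are assumptions made for illustration; real `pnpm bench` output depends on the benchmark tool behind the script.

```python
import csv
import io


def p95_meets_target(bench_output: str, target_ms: float) -> bool:
    """Return True when every benchmark row has a p95 below the target.

    Assumes (hypothetically) that the benchmark prints CSV with a
    header row that contains a `p95_ms` column.
    """
    rows = list(csv.DictReader(io.StringIO(bench_output.strip())))
    if not rows:
        # No data counts as a failed verification, never as a pass.
        return False
    return all(float(row["p95_ms"]) < target_ms for row in rows)


# Hypothetical benchmark output, used only to exercise the check.
sample = """name,p50_ms,p95_ms
checkout,41.0,118.2
search,22.5,97.0
"""

print(p95_meets_target(sample, 120))  # True
print(p95_meets_target(sample, 100))  # False
```

The check itself is as deterministic as the IF node expression. The difference is who wrote it: in n8n you author it at design time, in /goal the agent derives it from the sentence.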

The cost math

Anthropic Max pricing is at https://www.anthropic.com/pricing.

Whatever tier you run, 28 percent of weekly quota for one 15-hour run leaves room for three full runs per week (3 × 28 = 84 percent). A fourth would hit the cap a little over halfway through.
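The quota arithmetic, as a quick sketch. It assumes every run consumes the same share of quota as the 15-hour run Vasu reported, which is a simplification; the function name is hypothetical.

```python
def runs_before_cap(percent_per_run: int, hours_per_run: int) -> dict:
    """Split one weekly quota (100 percent) into whole runs."""
    if not 0 < percent_per_run <= 100:
        raise ValueError("percent_per_run must be between 1 and 100")
    full_runs = 100 // percent_per_run
    leftover_percent = 100 - full_runs * percent_per_run
    return {
        "full_runs": full_runs,
        "agent_hours": full_runs * hours_per_run,
        "leftover_percent": leftover_percent,
    }


print(runs_before_cap(28, 15))
# {'full_runs': 3, 'agent_hours': 45, 'leftover_percent': 16}
```

Three full runs is about 45 agent-hours per week, with 16 percent of quota left over, less than one more run needs.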

Now the n8n side.

Hetzner Cloud CX22 in Germany costs €4.59 per month for 2 vCPU and 4GB RAM (https://www.hetzner.com/cloud).

n8n self-hosted has €0 license cost (https://docs.n8n.io/hosting/installation/).

If you need n8n Cloud with team features instead of self-hosted, the Pro tier is €40 per month (https://n8n.io/pricing).

That is the €40 number from the title.

Same outcome class as the /goal stack for deterministic work, with a different reliability profile and a catalogue of 400-plus SaaS integrations on top.

Where each layer actually wins

n8n wins on:

Reliability per dollar at high run volume
Deterministic execution that survives compliance review
400-plus pre-built SaaS integrations (https://n8n.io/integrations)
Multi-tenant isolation auditors actually understand

/goal wins on:

Subtasks where the steps are unknowable at design time
Refactor work where the spec is "make tests pass"
Investigation flows where each iteration depends on the previous output
Anything where an LLM-readable spec is more compact than a deterministic flow graph

These do not compete. They layer.

The hybrid pattern, copy-pasteable

This is the architectural piece I have not seen written down yet.

The outer goal sits in n8n. The fuzzy interior subtasks sit in /goal.

Shape of a compliance intake flow.

n8n Webhook receives an inbound document
n8n Code node (formerly the Function node) extracts metadata, deterministic, milliseconds
n8n HTTP Request node POSTs the document to a Claude Code /goal endpoint running on the same host
/goal works through a single fuzzy task, for example "classify this document against our compliance taxonomy and return the matching category plus citations"
/goal returns a structured response or a typed rejection
n8n IF node routes the document to the matching downstream workflow, deterministic
n8n logs the full chain to PostgreSQL for the audit trail

Deterministic outer loop. Fuzzy interior. Cost stays bounded because /goal only fires on the one ambiguous step, not the whole flow.

Here is the n8n side, trimmed.

{
  "nodes": [
    {
      "name": "Compliance Webhook",
      "type": "n8n-nodes-base.webhook",
      "parameters": { "path": "compliance-intake", "httpMethod": "POST" }
    },
    {
      "name": "Extract Metadata",
      "type": "n8n-nodes-base.code",
      "typeVersion": 2,
      "parameters": {
        "jsCode": "return [{json: {doc_id: $input.first().json.body.id, received_at: new Date().toISOString()}}];"
      }
    },
    {
      "name": "Classify via /goal",
      "type": "n8n-nodes-base.httpRequest",
      "typeVersion": 4.2,
      "parameters": {
        "url": "http://localhost:8787/goal/classify",
        "method": "POST",
        "sendBody": true,
        "specifyBody": "json",
        "jsonBody": "={{ JSON.stringify({document_id: $json.doc_id, taxonomy: 'compliance-v3'}) }}",
        "options": { "timeout": 300000 }
      }
    },
    {
      "name": "Route by Category",
      "type": "n8n-nodes-base.if",
      "parameters": {
        "conditions": {
          "string": [{ "value1": "={{$json.category}}", "operation": "regex", "value2": "^(legal|finance|hr|operations)$" }]
        }
      }
    }
  ]
}

And the goal.md the /goal endpoint serves to Claude Code.

# Goal

Classify the document against compliance taxonomy v3.

## Outcome
Return JSON with three fields: category, confidence, citations.
Category MUST be one of: legal, finance, hr, operations, none.

## Verification
- Confidence above 0.85
- At least 2 citations from the taxonomy document
- Category present in the allowed list

## Constraints
- Taxonomy file at /workspace/taxonomy/compliance-v3.md
- Output is JSON only, no prose
- Max 3 attempts, with feedback after each failed verification

## Stop conditions
- Verification passes
- 3 failed attempts: return a rejection with the reason
- 60-second wall clock limit
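
The Verification section of the goal.md above can also be re-checked deterministically on the receiving side before the IF node routes anything. A minimal sketch, assuming the response arrives as a parsed JSON object with the three fields named in the Outcome section; the function name and failure messages are hypothetical.

```python
ALLOWED_CATEGORIES = {"legal", "finance", "hr", "operations", "none"}
MIN_CONFIDENCE = 0.85  # goal.md says "above", so the bound is exclusive
MIN_CITATIONS = 2


def verification_failures(response: dict) -> list[str]:
    """Return the goal.md verification rules that the response breaks."""
    failures = []

    if response.get("category") not in ALLOWED_CATEGORIES:
        failures.append("category not in allowed list")

    confidence = response.get("confidence")
    is_number = isinstance(confidence, (int, float)) and not isinstance(confidence, bool)
    if not is_number or confidence <= MIN_CONFIDENCE:
        failures.append("confidence not above 0.85")

    citations = response.get("citations")
    if not isinstance(citations, list) or len(citations) < MIN_CITATIONS:
        failures.append("fewer than 2 citations")

    return failures


# Hypothetical responses, used only to exercise the checks.
good = {"category": "legal", "confidence": 0.91, "citations": ["2.1", "4.3"]}
bad = {"category": "marketing", "confidence": 0.85, "citations": ["2.1"]}

print(verification_failures(good))  # []
print(verification_failures(bad))
```

An empty list means the response can be routed. Anything else can be treated the same way as a typed rejection, so a rubber-stamped answer never reaches the downstream workflow.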

How to estimate cost for your own flow

Run volume times per-step cost is the math.

The n8n outer loop scales with VPS resources, not per-execution charges. The CX22 above handles tens of thousands of executions per day without changing the €4.59 line.

The /goal interior is metered. Multiply your ambiguous-doc count by your average per-document token cost (input plus output tokens, times any retries) at the rates listed on https://www.anthropic.com/pricing.

The point of the hybrid is to push as much volume as possible onto the deterministic outer loop and reserve /goal for the steps that actually need fuzzy reasoning.

If every step in your flow is fuzzy, /goal end to end is fine. Most enterprise flows are not.
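A minimal sketch of that estimate. Every number in the example call is a placeholder, not a real price or a measured token count; take the per-million-token rates from the pricing page and the token counts from your own logs. The function and parameter names are hypothetical.

```python
def monthly_goal_cost(
    docs_per_month: int,
    fuzzy_share: float,
    input_tokens: int,
    output_tokens: int,
    input_price_per_mtok: float,
    output_price_per_mtok: float,
    attempts_per_doc: float = 1.0,
) -> float:
    """Estimate the metered cost of the /goal interior for one month.

    Only the fuzzy share of documents reaches /goal; the rest stays
    on the fixed-cost deterministic outer loop.
    """
    fuzzy_docs = docs_per_month * fuzzy_share
    cost_per_attempt = (
        input_tokens * input_price_per_mtok
        + output_tokens * output_price_per_mtok
    ) / 1_000_000
    return fuzzy_docs * attempts_per_doc * cost_per_attempt


# Placeholder inputs: 10,000 docs, 10 percent of them ambiguous.
hybrid = monthly_goal_cost(10_000, 0.10, 8_000, 500, 2.0, 10.0)
# Same placeholders, but every document goes through /goal.
everything = monthly_goal_cost(10_000, 1.0, 8_000, 500, 2.0, 10.0)

print(round(hybrid, 2), round(everything, 2))  # 21.0 210.0
```

The metered line scales linearly with the fuzzy share, which is why moving volume onto the outer loop is the main cost lever.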

Decision table

Situation	Run it where
Same shape every time, high volume	n8n
Hard reliability requirement (compliance, billing)	n8n outer loop
Steps unknowable at design time	/goal interior
Need a deterministic audit trail	n8n logs the chain
Spec is "make tests pass" or "find the bug"	/goal
Cost matters and the work is repeatable	n8n
Cost matters and the work is novel	/goal

The lazy answer is "use the agent for everything."

The expensive answer is "use the agent for everything."

Same answer, paid in cash.

What I would change about Vasu's framing

Two things.

First, /goal is not the most important primitive of 2026. It is the most important primitive for novel work. Most enterprise work is not novel. The hybrid pattern is the primitive that actually matters.

Second, the empirical data gap he flags is real. I run this pattern in DACH production every week. Happy to share what fails and what holds when other people start measuring.

Question for you.

What is the most expensive /goal run you shipped this month?

Was the same outcome reachable with a deterministic outer loop calling /goal on the one fuzzy step?