DEV Community

Fryderyk

Posted on • Originally published at github.com

I built Arness: a Claude Code plugin marketplace you drive with four slash commands

Arness - The H? Handled!

Claude Code is powerful. Without structure around it, every session starts cold, plans live in chat history, and the spec you cared about is buried in a thread you will never re-read.

I built Arness because I got tired of two things at once: the ad-hoc-prompting ceiling, and the ceremony every framework adds when it tries to fix it. It is an open-source Claude Code plugin marketplace, and you drive it with four slash commands.

The four commands

These are the user-facing surface. You do not pick between dozens of skills. You pick a verb that matches what you are doing right now, and the entry skill routes the rest.

/arn-brainstorming     → start a new product idea from scratch
/arn-planning          → turn a feature idea into a phased plan
/arn-implementing      → build the plan task-by-task
/arn-infra-wizard      → set up, deploy, or change infrastructure

That is the whole vocabulary. If you can describe what you are doing in one verb, you know which command to run.

What happens underneath

Each entry skill dispatches to dozens of specialist skills and agents. /arn-planning calls feature-spec generation, codebase-pattern discovery, plan-writer, and plan-reviewer. /arn-implementing runs a task-executor and a task-reviewer agent per task, with self-healing test loops between them. /arn-infra-wizard walks discovery, define, containerize, deploy, verify, and change management.
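The per-task executor and reviewer loop can be pictured with a minimal Python sketch. This is an illustration of the idea, not Arness's actual implementation: the names `run_task` and `TaskResult`, the `max_attempts` cap, and the convention that an empty string means a green test run are all assumptions, and the executor and test runner are stubbed with plain callables.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class TaskResult:
    task: str
    attempts: int
    passed: bool


def run_task(
    task: str,
    execute: Callable[[str, str], None],
    run_tests: Callable[[str], str],
    max_attempts: int = 3,  # hypothetical cap, not an Arness setting
) -> TaskResult:
    """Run a task, feeding test failures back in until tests pass or attempts run out."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        execute(task, feedback)      # executor sees the previous failure output
        feedback = run_tests(task)   # assumed convention: "" means all tests green
        if not feedback:
            return TaskResult(task, attempt, True)
    return TaskResult(task, max_attempts, False)


# Stubbed demo: tests fail once, then pass on the second attempt.
outcomes = iter(["test_login failed", ""])
result = run_task(
    "add login",
    execute=lambda task, feedback: None,
    run_tests=lambda task: next(outcomes),
)
```

In this stubbed run the task passes on the second attempt. A task that never goes green comes back with `passed=False` instead of looping forever.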

You do not learn the names. The entry skill reads your ## Arness config and the current state of your repo, then picks the right next move. If your project has no spec yet, it writes one. If a spec exists but no plan, it produces a plan. If a plan exists but tasks are pending, it executes them. The progressive disclosure is by design: the surface is small, the depth is real, and you only meet the depth when something needs your attention.
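The routing rule in that paragraph is a small decision ladder. Here is a hedged Python sketch of it, assuming the repo state has already been reduced to three facts. `RepoState`, `next_move`, and the returned labels are hypothetical names for illustration, and the final "nothing pending" branch is my assumption, since the paragraph does not describe that case.

```python
from dataclasses import dataclass


@dataclass
class RepoState:
    has_spec: bool
    has_plan: bool
    pending_tasks: int


def next_move(state: RepoState) -> str:
    """Pick the next step from what already exists on disk."""
    if not state.has_spec:
        return "write-spec"        # no spec yet: write one
    if not state.has_plan:
        return "write-plan"        # spec exists, no plan: produce a plan
    if state.pending_tasks > 0:
        return "execute-tasks"     # plan exists, tasks pending: execute them
    return "nothing-pending"       # assumption: not covered by the source text


move = next_move(RepoState(has_spec=True, has_plan=False, pending_tasks=0))
```

The function reads only durable state, so the same inputs give the same next step whether the session is fresh or resumed.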

What this looks like in practice

You start a new product idea. You run /arn-brainstorming. It walks discovery, drafts personas, proposes an architecture vision, and scaffolds a working skeleton you can run. You move into the build phase: /arn-planning for the next feature, then /arn-implementing to walk the plan task-by-task. When the feature needs a deploy, /arn-infra-wizard handles the IaC, the deploy, and the post-deploy verification.

Four slash commands. Full lifecycle. The depth is there when you need it (each entry skill exposes its sub-skills if you want to drive at a finer grain), but most of the time you do not.

Install one, or install all three

Arness ships as three independently-installable plugins. Each plugin stands alone.

Plugin      Stage                   Entry skill
arn-spark   Greenfield exploration  /arn-brainstorming
arn-code    Development pipeline    /arn-planning, /arn-implementing
arn-infra   Infrastructure          /arn-infra-wizard

Install Spark for a brand-new product idea and stop there. Install Code on an existing codebase you want to add structure to. Install Infra to manage deployment without touching the dev pipeline. Or install all three and ride the full chain from idea to deployed feature.

When you install a second plugin alongside an existing one, it reuses the ## Arness config block in your CLAUDE.md. The new plugin inherits what the first one already learned about your project. No re-init, no re-discovery, no contradictory state.
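To make the shared-config idea concrete, here is a rough Python sketch of how a tool could locate a `## Arness` section in a CLAUDE.md file before deciding whether to initialise. It is an illustration under assumptions: the function name `find_arness_block` is hypothetical, the section is assumed to run until the next level-1 or level-2 heading, and the `plans-dir` key in the sample is invented because the post does not show the block's real contents.

```python
import re
from typing import Optional


def find_arness_block(claude_md: str) -> Optional[str]:
    """Return the text under a '## Arness' heading, or None if the heading is absent."""
    lines = claude_md.splitlines()
    start = None
    for i, line in enumerate(lines):
        if line.strip() == "## Arness":
            start = i + 1
            break
    if start is None:
        return None

    body = []
    for line in lines[start:]:
        # Assumption: the block ends at the next '# ' or '## ' heading;
        # deeper headings ('### ...') stay inside the block.
        if re.match(r"#{1,2} ", line):
            break
        body.append(line)
    return "\n".join(body).strip()


# Sample file content; 'plans-dir' is a made-up key for illustration only.
sample = "# Project\n\nNotes.\n\n## Arness\n\nplans-dir: .arness/plans\n\n## Other\n\nmore\n"
block = find_arness_block(sample)
```

A second plugin that finds an existing block can read it instead of asking its setup questions again. A `None` result is the signal that init has not run yet.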

Why it actually works: the artifact contract

The single design rule across the marketplace: the human is the only writer of intent. Every skill writes structured output to disk. Every skill reads structured input from disk. The conversation is scaffolding, not the source of truth.

That is what makes the four-verb surface possible. You can stop a session, switch projects, switch plugins, and the chain still composes because every step left a file behind. A feature spec written by /arn-planning is a plain Markdown file your colleague can read, your code reviewer can grep, and /arn-implementing can pick up tomorrow.

Three concrete things this changes:

  1. Every decision is inspectable. Spec, plan, task list, review verdict, deploy report all live as plain Markdown or JSON in your repo. You diff them, PR-review them, grep them.
  2. Stages are interruptible and resumable. Lose the session, restart Claude tomorrow, point the next entry skill at the artifact. The pipeline picks up where it left off.
  3. The output of one stage gates the next. A plan with no acceptance criteria does not produce executable tasks. An execution with no green test run does not produce a change record. The structure is checked, not assumed.
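The third point, checked structure between stages, can be sketched as a small validation gate. This Python example is a guess at the shape of such a check, not Arness's code: the plan layout (`phases`, `name`, `acceptance_criteria`) and the function name `gate_plan` are hypothetical.

```python
from typing import Dict, List


def gate_plan(plan: Dict) -> List[str]:
    """Return the problems that block task generation; an empty list means the gate is open."""
    phases = plan.get("phases", [])
    if not phases:
        return ["plan has no phases"]

    problems = []
    for phase in phases:
        name = phase.get("name", "?")
        # A phase with a missing or empty criteria list fails the gate.
        if not phase.get("acceptance_criteria"):
            problems.append(f"phase '{name}' has no acceptance criteria")
    return problems


# Hypothetical plan artifact, as it might look after being loaded from JSON.
plan = {
    "phases": [
        {"name": "auth", "acceptance_criteria": ["login returns a session token"]},
        {"name": "billing", "acceptance_criteria": []},
    ]
}
problems = gate_plan(plan)
```

Returning a list of problems rather than a bare boolean keeps the refusal inspectable: the reason a stage did not run can be written to disk like any other artifact.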

Who this is for

  • Solo builders who lose context between sessions and want a chain of artifacts instead of a thread of prompts.
  • Skeptical staff engineers who refuse to trust AI output without an inspectable audit trail. Plain-text artifacts mean code review still works.
  • Stretched operators who re-paste 2,400 words of infra context every session. arn-infra owns that context as artifacts so the operator does not.
  • Engineering managers who need uneven AI productivity to converge. The structure of the pipeline is the convergence mechanism.

Install

# Add the marketplace
/plugin marketplace add AppsVortex/arness

# Install the plugins you need (or all three)
/plugin install arn-spark@arn-marketplace
/plugin install arn-code@arn-marketplace
/plugin install arn-infra@arn-marketplace

After install, run /arn-spark-init, /arn-code-init, or /arn-infra-init once per project. Each init writes the ## Arness block to your CLAUDE.md and asks four short setup questions. After that, the four entry skills are usable.

Status

Arness opened publicly a few weeks ago at v1.0.0. Current versions: arn-code 3.3.0 (35 skills), arn-spark 2.2.0 (28 skills), arn-infra 2.2.0 (25 skills). MIT licensed. No telemetry, no server component, runs entirely inside your Claude Code session.

What I am still working out: how often each entry skill should pause for confirmation versus just proceed. Right now /arn-implementing halts at each phase boundary; some users want that, others find it ceremony. The current answer is a ## Arness config flag, but the right default is not settled.

What is your current Claude Code setup, and where does the chain of intent break down for you the most?


AppsVortex / arness

Structured AI workflows for Claude Code — from first idea to production deploy. Three plugins: Spark (discovery & prototyping), Code (development pipeline), Infra (infrastructure & deployment).

Arness — H not required.

Structured AI workflows for Claude Code. From first idea to production deploy.

Seven entry commands. That's all you need to remember. Behind them, 134 specialist skills and agents handle the details across three independent plugins — ideation, development, and infrastructure.

Most AI coding tools help you write code faster. Arness helps you build software better. It gives your Claude Code session a structured pipeline: specs before code, plans before execution, reviews before shipping. Every stage produces a human-readable artifact that feeds the next. Nothing is hidden, nothing is locked in.

Three Plugins, One Lifecycle

Arness Spark — Where ideas come alive

Most projects fail before the first commit — wrong problem, wrong audience, wrong architecture. Spark takes a raw idea and puts it through product discovery, stress testing, brand naming, use case writing, architecture evaluation, and interactive prototyping. By the time you write real…




Drafted with Claude Code, edited by me. Which is, recursively, the workflow Arness is for.