DEV Community

Abhishek Pandit

One Command to Rule Them All: The /ship Chatmode That Reviews, Audits, and Cleans Before Every Merge

Here's a problem I had.

I'd built three specialist agents in Copilot Chat: @code-reviewer, @security-auditor, and @simplifier. Each one was genuinely useful. Each one caught things the others missed.

But using all three before a merge meant:

  1. Invoking @code-reviewer, reading the output, addressing findings
  2. Separately invoking @security-auditor, reading that output, addressing findings
  3. Separately invoking @simplifier, reading that output, addressing findings
  4. Mentally combining three reports into one decision

That's not automation. That's just delegation with extra steps.

What I actually wanted: one command that does all three and tells me to ship or not.

That's what the /ship chatmode is.

This is Part 5 of the copilot-workflow series — and it's the part where the individual pieces become a real automated workflow.


What's a Chatmode?

Before we get into /ship, a quick explanation of chatmodes — because it's a feature most Copilot users don't know about.

Think of Copilot Chat as a radio. By default, it's tuned to a general-purpose frequency. Chatmodes are preset stations — each one configures Copilot with a specific role, tools, and approach before you say anything.

You switch chatmodes the same way you'd switch radio stations: one click in the chatmode selector in VS Code Copilot Chat, then start talking.

The difference: when you open the ship chatmode, Copilot already knows what you're trying to do. It runs the full three-pass review without you having to orchestrate it manually.


The Flight Checklist Analogy

Pilots don't improvise pre-flight checks.

Before every flight, they run through a standardized checklist — the same one, in the same order, every time. Not because pilots are forgetful. Because when something is important enough, you systematize it. You make the right thing the default thing.

Pilots discovered this the hard way. Before checklists became standard, perfectly competent pilots died in perfectly functional aircraft because they forgot one step under pressure.

The /ship chatmode is your pre-merge checklist. Every merge. Same order. No improvising. No forgetting the security pass when you're rushing to meet a deadline.


How /ship Works

When you activate the ship chatmode and describe your changes, three passes run in sequence:

Pass 1: Code Review

@code-reviewer evaluates the changes across five dimensions: correctness, readability, architecture, security, performance.

Every finding is labeled:

  • Critical — blocks the merge
  • Important — should fix before merging
  • Suggestion — optional improvement

Pass 2: Security Audit

@security-auditor starts from trust boundaries, runs STRIDE analysis, and maps exploitable vulnerabilities to the OWASP Top 10.

Findings in this pass are security-only and rated Critical, High, Medium, or Low. Every Critical or High finding includes a proof of concept and a specific fix, not vague "consider validating input" advice.
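To make "a specific fix" concrete, here is a minimal sketch of the kind of change a SQL injection finding might ask for. The table name `payment_intents`, the column names, and both function names are hypothetical. The `{ text, values }` query shape follows the convention accepted by drivers such as node-postgres. None of this is code from the template.

```typescript
// Hypothetical query shape: SQL text with placeholders plus a separate values array.
interface ParameterizedQuery {
  text: string;
  values: unknown[];
}

// Vulnerable pattern: user input is spliced directly into the SQL text.
function buildInsertUnsafe(userId: string, amount: string): string {
  return `INSERT INTO payment_intents (user_id, amount) VALUES ('${userId}', ${amount})`;
}

// Safer pattern: the SQL text is constant and input travels only in `values`,
// so the driver never interprets it as SQL.
function buildInsertSafe(userId: string, amount: string): ParameterizedQuery {
  return {
    text: "INSERT INTO payment_intents (user_id, amount) VALUES ($1, $2)",
    values: [userId, amount],
  };
}
```

With the unsafe builder, an `amount` such as `0); DROP TABLE payment_intents; --` becomes part of the statement. With the safe builder, the same string stays in `values` and the SQL text never changes.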

Pass 3: Simplification

@simplifier scans for complexity that can be removed without changing behavior: deeply nested logic, generic names, duplicated code, unnecessary abstractions.

This is the last pass — not because it matters least, but because it only makes sense after you've verified the code is correct and secure.
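The ordering can be modelled in a few lines. This is only a sketch of the sequencing idea, not how the chatmode is implemented: the type names, `PASS_ORDER`, and `runPasses` are hypothetical, and each pass is reduced to a function that returns a list of messages.

```typescript
// Hypothetical model: a pass takes a description of the change and returns messages.
type PassName = "code-review" | "security-audit" | "simplification";

interface PassFinding {
  pass: PassName;
  message: string;
}

type ReviewPass = (changeDescription: string) => string[];

// Fixed order: correctness first, security second, simplification last.
const PASS_ORDER: PassName[] = ["code-review", "security-audit", "simplification"];

function runPasses(
  passes: Record<PassName, ReviewPass>,
  changeDescription: string,
): PassFinding[] {
  const findings: PassFinding[] = [];
  for (const name of PASS_ORDER) {
    // Each finding is tagged with the pass that produced it.
    for (const message of passes[name](changeDescription)) {
      findings.push({ pass: name, message });
    }
  }
  return findings;
}
```

Because the order lives in one constant rather than in the caller's memory, the security pass cannot be skipped or reordered by accident.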

The Verdict

After all three passes, one consolidated report:

## Pre-Merge Review

### Verdict: DO NOT SHIP ❌

**Summary:** The payment flow changes contain two critical issues: a SQL
injection vulnerability and a reset token that is not invalidated after
use. The logic is otherwise clean and well-structured.

### Must Fix Before Merge
- [CRITICAL] SQL injection in payment intent creation — payments.ts:47
  Fix: parameterize the amount field in the INSERT statement
- [CRITICAL] Reset token not invalidated after use — auth.ts:123
  Fix: SET reset_token = NULL in the UPDATE statement

### Should Fix Before Merge
- [IMPORTANT] Missing test for payment failure path — payments.test.ts
  The happy path is covered but no test for insufficient funds

### Optional Improvements
- [SUGGESTION] extractPaymentIntent() could be extracted to a helper — payments.ts:40-65

### Clean ✅
- Correct use of parameterized queries in all existing endpoints
- Rate limiting already applied to auth endpoints

SHIP = no Critical issues. You make the call on Important and Suggestion items — the chatmode doesn't make that judgment for you.
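The rule is simple enough to write down as code. The sketch below assumes a hypothetical `Finding` shape and a hypothetical `decideVerdict` function. Only the labels and the "no Critical issues" rule come from the description above.

```typescript
// Labels taken from the three passes; the data shapes are hypothetical.
type Label = "Critical" | "High" | "Medium" | "Low" | "Important" | "Suggestion";

interface Finding {
  label: Label;
  summary: string;
  location?: string;
}

type Verdict = "SHIP" | "DO NOT SHIP";

// The stated rule: any Critical finding blocks the merge.
// Everything else is left to the author's judgment.
function decideVerdict(findings: Finding[]): Verdict {
  return findings.some((f) => f.label === "Critical") ? "DO NOT SHIP" : "SHIP";
}
```

Fed the findings from the sample report, this returns `DO NOT SHIP` because of the two Critical items. Remove them and the Important and Suggestion items alone yield `SHIP`.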


Real Scenario: What /ship Catches That You'd Miss

Let me show you three things the ship chatmode caught in a single review on a real feature — a task sharing system.

What I thought I was shipping: A feature that lets users share tasks with teammates. Looked clean. Tests passed. I'd been working on it for two days and was confident in it.

What /ship found:

Code review — correctness: The permission check was on the wrong side of the async boundary. If two users tried to accept the same share invitation simultaneously, both could succeed — creating a race condition that allowed a task to have two owners.
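Here is a sketch of one way this kind of race can be closed, assuming the root cause is an `await` sitting between the check and the write. The `Invitation` shape, the in-memory store, and `acceptInvitation` are hypothetical stand-ins, not the actual feature code.

```typescript
interface Invitation {
  taskId: string;
  acceptedBy: string | null;
}

// In-memory stand-in for a database table (hypothetical).
type InvitationStore = Record<string, Invitation | undefined>;

// The check and the claim happen in one synchronous step with no await in
// between, so a second caller always observes the first caller's write.
function acceptInvitation(
  store: InvitationStore,
  invitationId: string,
  userId: string,
): boolean {
  const invitation = store[invitationId];
  if (invitation === undefined || invitation.acceptedBy !== null) {
    return false; // unknown invitation, or someone already accepted it
  }
  invitation.acceptedBy = userId;
  return true;
}
```

Against a real database, the equivalent is usually a single conditional update (for example, `UPDATE ... WHERE accepted_by IS NULL`) followed by a check on the number of affected rows, so the database itself decides which caller wins.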

Security audit — broken access control: The GET /api/tasks/:id/share-link endpoint didn't verify the requester owned the task. Any authenticated user could generate a share link for any task by guessing the ID.
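The missing check is small. The sketch below shows the shape of the fix with a hypothetical `Task` type, a hypothetical in-memory task store, and a made-up link format. It is not the handler from the real feature.

```typescript
interface Task {
  id: string;
  ownerId: string;
}

type ShareLinkResult =
  | { status: 200; link: string }
  | { status: 403 | 404; error: string };

function createShareLink(
  tasks: Record<string, Task | undefined>,
  taskId: string,
  requesterId: string,
): ShareLinkResult {
  const task = tasks[taskId];
  if (task === undefined) {
    return { status: 404, error: "Task not found" };
  }
  // The missing check: being authenticated is not enough,
  // the requester must own the task.
  if (task.ownerId !== requesterId) {
    return { status: 403, error: "Not the task owner" };
  }
  return { status: 200, link: `/share/${task.id}` }; // link format is illustrative
}
```

Some teams prefer to return 404 for tasks the requester does not own, so the response does not confirm that a guessed ID exists.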

Simplification: The permission checking logic was duplicated in three different places. One helper function would have made it easier to maintain — and would have meant fixing the race condition in one place instead of three.

Three different problem types. One pass caught each one. Single command.


Setting Up /ship: Zero Configuration

The chatmode is already in the template. There's nothing to configure.

  1. Go to github.com/panditAbhis/copilot-workflow
  2. Click Use this template → create your repo
  3. Open VS Code with the Copilot Chat panel
  4. Click the chatmode selector (the dropdown at the top of Copilot Chat)
  5. Select ship
  6. Describe your changes

That's it. The chatmode is defined in .github/chatmodes/ship.chatmode.md — Copilot picks it up automatically.


The Two Other Chatmodes

The /ship chatmode isn't the only one in the template. There are two others for different phases of development.

/spec — Idea to Implementation Plan

Before writing a single line of code, /spec walks you through:

  1. Idea refinement — sharpens vague concepts, generates variations, forces you to name your "Not Doing" list
  2. Spec writing — produces a structured specification with success criteria, boundaries, and open questions
  3. Task breakdown — converts the spec into ordered, verifiable tasks with acceptance criteria

Nothing gets implemented until the spec is approved. This sounds slow. It's actually the fastest path — because you're not rewriting code that was built on wrong assumptions.

[In /spec chatmode]
I want to add email notifications when tasks are assigned to team members.

The chatmode asks clarifying questions, surfaces assumptions ("should unassigned tasks also trigger notifications?"), proposes 3 approaches with trade-offs, writes the spec, and produces a numbered task list. Then it stops. You implement, guided by the tasks.

/debug — Systematic Root Cause Analysis

When something breaks, /debug runs the Prove-It Pattern:

  1. Reproduce first — make the failure happen reliably before touching code
  2. Localize — which layer? Which commit introduced it? (git bisect)
  3. Root cause — fix the cause, not the symptom
  4. Regression test — write a failing test that proves the bug existed
  5. Verify — full suite passes with the fix applied

The chatmode won't let you skip straight to the fix. The reproduction step is mandatory. The regression test is mandatory. This discipline is tedious the first few times. After your first "this fixed it" that actually fixed it — permanently, with a test to prove it — you stop finding it tedious.
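As an illustration of step 4, here is a sketch of a regression test for a "reset token can be reused" bug. `ResetTokenStore` and the test function are hypothetical and use a plain thrown error instead of any particular test framework.

```typescript
// Hypothetical token store illustrating the "token not invalidated" class of bug.
class ResetTokenStore {
  private tokens: Record<string, string | undefined> = {};

  issue(token: string, userId: string): void {
    this.tokens[token] = userId;
  }

  // Returns the user ID once, then invalidates the token.
  consume(token: string): string | null {
    const userId = this.tokens[token];
    if (userId === undefined) {
      return null;
    }
    delete this.tokens[token]; // the fix: without this line the token stays valid
    return userId;
  }
}

// Regression test: fails against the buggy version, passes once the fix is in.
function testResetTokenCannotBeReused(): void {
  const store = new ResetTokenStore();
  store.issue("abc123", "user-1");
  const first = store.consume("abc123");
  const second = store.consume("abc123");
  if (first !== "user-1") throw new Error("first use should succeed");
  if (second !== null) throw new Error("token was reusable after first use");
}
```

Written before the fix, this test fails on the second `consume` call, which is the proof that the bug existed. After the fix it stays in the suite and guards against the bug coming back.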


The Automation Gap

Here's what separates a "Copilot user" from a "Copilot workflow":

A Copilot user asks Copilot questions. They get answers. Sometimes the answers are good. The quality depends on how well they prompt, how much context they provide, how consistently they remember to check security, and how often they run reviews.

A Copilot workflow is systematic. It defines the process once, then enforces it automatically. /ship runs the same three-pass review every time, whether or not you remember to, whether or not the code feels risky, and not only on "important" PRs.

The difference isn't the AI. It's the discipline baked into the tooling.


What's Coming Next

This template started with three agents and grew to 17. The next phase is agentic workflows — where Copilot doesn't just respond to your prompts but initiates workflows autonomously:

  • Copilot in GitHub Actions — automatic review triggered on every PR open, no human needed to remember
  • MCP server integration — agents that read your actual database schema, live error logs, and monitoring metrics when reviewing code
  • The /spec to production pipeline — from idea to deployed feature, with AI assistance at every step and human approval gates between them

The foundations are already in the template. Watch the repo for updates as each piece ships.


Get the Template

Everything in this series — all 17 agents, all 3 chatmodes, the CI pipeline, MCP config — is in the template.

👉 github.com/panditAbhis/copilot-workflow

One click. Use this template. Every repo you create from it has the full workflow.

