Abhishek Pandit

Posted on Jun 12

Think Like an Attacker: How I Use @security-auditor Before Every Production Deploy

#github #githubcopilot #security #webdev

A lock on your front door doesn't make your house secure.

Not if the back door is open. Not if the window latch is broken. Not if you're handing spare keys to strangers without realizing it.

Security works the same way. Adding password hashing to your login endpoint doesn't make your app secure — not if the password reset endpoint doesn't expire tokens, or the file upload handler accepts any file type, or the database query for your admin panel is built by concatenating user input.

Most developers secure the obvious thing and leave a dozen attack surfaces unchecked. Not because they're careless — because they're only looking at one door at a time.

@security-auditor changes that. Instead of checking individual controls, it starts from a map of your entire attack surface.

This is Part 4 of the copilot-workflow series.

The Bank Vault Analogy

Imagine you're designing a bank vault.

A naive approach: put a very thick door on the vault. Done.

A security engineer's approach: draw a map of everything that needs protecting, then ask "how would a thief get to it?" They'd check the vault door — but also the air vents, the maintenance tunnels, the manager's office, the cleaning crew's access, and the bank's Wi-Fi network.

This is threat modeling. You don't start with controls. You start with a map of what you're protecting and all the ways someone could get to it.

@security-auditor runs this process on your code automatically. It's not a checklist bot. It's an attacker's mindset, systematically applied.

How @security-auditor Thinks

Before flagging a single vulnerability, the agent does two things:

1. Maps your trust boundaries

Trust boundaries are places where data crosses from one trust level to another. HTTP requests come in — untrusted. Database data goes out — trusted. A user uploads a file — untrusted. Your auth middleware processes it — trusted (if the middleware is correct).

Every trust boundary is potential attack surface.

2. Runs STRIDE

STRIDE is a structured way to think about threats at each boundary:

Letter	Threat	Question
S	Spoofing	Can someone pretend to be a legitimate user or service?
T	Tampering	Can someone modify data in transit or at rest?
R	Repudiation	Can someone deny an action they took?
I	Information disclosure	Can data leak to unauthorized parties?
D	Denial of service	Can the system be overwhelmed?
E	Elevation of privilege	Can someone gain more access than they should have?

Then it maps findings to OWASP Top 10 — the industry standard list of the most critical web security risks.
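Here's a minimal sketch of what those two steps amount to: pair each trust boundary with each STRIDE question to get a review checklist. The boundary names and the `buildChecklist` helper are hypothetical, made up for illustration; this is not the agent's implementation.

```javascript
// Hypothetical sketch: cross every trust boundary with every STRIDE question.
const STRIDE = [
  { letter: 'S', threat: 'Spoofing', question: 'Can someone pretend to be a legitimate user or service?' },
  { letter: 'T', threat: 'Tampering', question: 'Can someone modify data in transit or at rest?' },
  { letter: 'R', threat: 'Repudiation', question: 'Can someone deny an action they took?' },
  { letter: 'I', threat: 'Information disclosure', question: 'Can data leak to unauthorized parties?' },
  { letter: 'D', threat: 'Denial of service', question: 'Can the system be overwhelmed?' },
  { letter: 'E', threat: 'Elevation of privilege', question: 'Can someone gain more access than they should have?' },
];

// Made-up example boundaries; a real list comes from reading the codebase.
const boundaries = [
  { name: 'HTTP request body', source: 'untrusted' },
  { name: 'File upload', source: 'untrusted' },
  { name: 'Database read', source: 'trusted' },
];

function buildChecklist(boundaryList, threats = STRIDE) {
  // One entry per (boundary, threat) pair, in boundary order.
  return boundaryList.flatMap((b) =>
    threats.map((t) => ({ boundary: b.name, threat: t.threat, question: t.question }))
  );
}

const checklist = buildChecklist(boundaries);
console.log(`${checklist.length} questions to answer`);
```

Even three boundaries produce a long list, which is roughly why doing this by hand tends to get skipped.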

A Real Example: The Password Reset Trap

Here's an endpoint I've seen in real codebases — a password reset handler:

app.post('/api/auth/reset-password', async (req, res) => {
  const { token, newPassword } = req.body;
  const user = await db.query(
    `SELECT * FROM users WHERE reset_token = '${token}'`
  );
  if (!user) return res.status(404).json({ error: 'Invalid token' });
  await db.query(
    `UPDATE users SET password = '${newPassword}' WHERE id = ${user.id}`
  );
  res.json({ success: true });
});

It "works." Users can reset their passwords. Tests pass. Looks fine.

Let me show you what @security-auditor finds:

@security-auditor Audit this password reset endpoint for security vulnerabilities.

Security Audit Report

Critical: 4 | High: 2 | Medium: 1

[CRITICAL] SQL Injection — token lookup

Location: Line 3
Description: User-supplied token is concatenated directly into SQL
Impact: Attacker can bypass authentication entirely. Input ' OR '1'='1 retrieves any user.
Proof of concept: token = "' OR '1'='1"
Fix: db.query('SELECT * FROM users WHERE reset_token = $1', [token])

[CRITICAL] Plaintext Password Storage

Location: Line 7
Description: newPassword stored as raw string — no hashing
Impact: Database breach exposes every user's actual password
Fix: const hashed = await bcrypt.hash(newPassword, 12); db.query('UPDATE users SET password = $1 WHERE id = $2', [hashed, user.id])

[CRITICAL] SQL Injection — UPDATE statement

Location: Line 7
Description: Both newPassword and user.id concatenated into SQL
Fix: Use parameterized query as shown above

[CRITICAL] Reset token not invalidated after use

Location: Missing
Description: Token remains valid after password reset — replayable
Impact: Anyone who intercepts the token can reset the password again
Fix: UPDATE users SET password = $1, reset_token = NULL WHERE id = $2

[HIGH] No token expiry check

Location: Missing
Description: Tokens never expire — valid indefinitely
Fix: Add reset_token_expires_at column, check > NOW() in query, expire after 15 minutes

[HIGH] No rate limiting

Location: Missing
Description: Unlimited attempts — brute-force token guessing is possible
Fix: app.use('/api/auth/', rateLimit({ windowMs: 900000, max: 10 }))

[MEDIUM] Token enumeration via error message

Location: Line 6
Description: 404 reveals whether a token exists — helps attackers
Fix: Return 200 with generic message regardless of outcome

That endpoint looked functional. The actual security posture: four critical vulnerabilities, any one of which could result in account takeover or full database exposure.
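For reference, here's a sketch of the reset logic with the report's fixes folded together. It's written as a plain function so it can be exercised without a server. Assumptions that are mine, not the report's: `db.query(sql, params)` resolves to `{ rows }` (node-postgres style), `hashPassword` is injected by the caller, the minimum password length is an arbitrary policy value, and every token failure shares one generic response.

```javascript
// Sketch only. Assumes db.query(sql, params) resolves to { rows }, and that
// hashPassword is supplied by the caller, e.g. (p) => bcrypt.hash(p, 12).
const MIN_PASSWORD_LENGTH = 12; // hypothetical policy value

const GENERIC_FAILURE = { status: 400, body: { error: 'Invalid or expired reset link' } };

async function resetPassword({ db, hashPassword }, { token, newPassword } = {}) {
  // Validate untrusted input before it goes anywhere near the database.
  if (typeof token !== 'string' || token.length === 0) return GENERIC_FAILURE;
  if (typeof newPassword !== 'string' || newPassword.length < MIN_PASSWORD_LENGTH) {
    return { status: 400, body: { error: 'Password does not meet requirements' } };
  }

  // Parameterized lookup; the expiry check is part of the query.
  const { rows } = await db.query(
    'SELECT id FROM users WHERE reset_token = $1 AND reset_token_expires_at > NOW()',
    [token]
  );
  if (rows.length !== 1) return GENERIC_FAILURE;

  // Hash first, then store the hash and invalidate the token in one statement.
  const hashed = await hashPassword(newPassword);
  await db.query(
    'UPDATE users SET password = $1, reset_token = NULL, reset_token_expires_at = NULL WHERE id = $2',
    [hashed, rows[0].id]
  );
  return { status: 200, body: { success: true } };
}
```

In an Express route this would be called roughly as `const result = await resetPassword({ db, hashPassword }, req.body); res.status(result.status).json(result.body);`, with the rate limiter from the report mounted in front of it.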

The OWASP Top 10: Your Security Baseline

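@security-auditor maps every finding to the OWASP Top 10, the list of the most critical web application security risks, which OWASP revises every few years using contributed vulnerability data and a community survey. The category numbers below follow the 2021 edition.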

You don't need to memorize it. But understanding the categories helps you recognize when to invoke the agent:

#	Risk	"Invoke when you're..."
A01	Broken Access Control	Building any endpoint that checks ownership
A02	Cryptographic Failures	Storing passwords, tokens, or PII
A03	Injection	Writing any database query with user input
A04	Insecure Design	Designing auth flows, payment logic
A05	Security Misconfiguration	Setting up CORS, headers, error messages
A06	Vulnerable Components	Adding a new npm dependency
A07	Authentication Failures	Building login, registration, password reset
A08	Software Integrity	Setting up CI/CD or deployment pipelines
A09	Logging Failures	Building audit trails or error handlers
A10	SSRF	Building webhooks, URL imports, link previews
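As one concrete example for the A10 row, here's a hedged sketch of a first-pass check on a user-supplied URL before the server fetches it. The `checkOutboundUrl` and `isPrivateIPv4` names are hypothetical. It only inspects the URL text and does not resolve DNS, so a hostname that points at an internal address would still pass; treat it as one filter, not a complete SSRF defense.

```javascript
const net = require('node:net');

// True for loopback, link-local, and the common private IPv4 ranges.
function isPrivateIPv4(host) {
  const [a, b] = host.split('.').map(Number);
  return (
    a === 0 ||
    a === 10 ||
    a === 127 ||
    (a === 169 && b === 254) ||
    (a === 172 && b >= 16 && b <= 31) ||
    (a === 192 && b === 168)
  );
}

// Hypothetical allow-check for webhook / link-preview URLs.
function checkOutboundUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return { ok: false, reason: 'not a valid URL' };
  }
  if (url.protocol !== 'https:') return { ok: false, reason: 'only https is allowed' };
  if (url.username || url.password) return { ok: false, reason: 'credentials in URL' };

  const host = url.hostname;
  if (host === 'localhost' || host.endsWith('.localhost')) {
    return { ok: false, reason: 'loopback host' };
  }
  if (net.isIP(host) === 4 && isPrivateIPv4(host)) {
    return { ok: false, reason: 'private IPv4 address' };
  }
  // IPv6 literals show up bracketed in url.hostname; this sketch rejects them all.
  if (host.startsWith('[')) return { ok: false, reason: 'IPv6 literals not allowed' };

  return { ok: true, url };
}
```

A stricter variant would resolve the hostname and re-check the resulting addresses right before connecting.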

The Three-Tier Rule

The most useful mental model from @security-auditor is the three-tier boundary system. Before writing any security-sensitive code, you know exactly what category it falls into:

Always do (no human approval needed):

Validate all external input at the boundary
Parameterize all database queries
Hash passwords with bcrypt (cost factor ≥12) or argon2
Set security headers (CSP, HSTS, X-Frame-Options)
Use httpOnly + secure + sameSite cookies for sessions
Run npm audit before every release
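Two of the "always do" items, security headers and session cookie flags, can be sketched with nothing but strings and a response object. The header values, the `session` cookie name, and the helper names below are illustrative assumptions, not settings taken from the template; in an Express app a middleware such as helmet would normally handle the headers.

```javascript
// Illustrative defaults; tune the CSP for the assets your app really loads.
function securityHeaders() {
  return {
    'Content-Security-Policy': "default-src 'self'",
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
    'X-Frame-Options': 'DENY',
  };
}

// Builds a Set-Cookie value with the httpOnly + secure + sameSite attributes.
function sessionCookie(sessionId, maxAgeSeconds = 3600) {
  return [
    `session=${encodeURIComponent(sessionId)}`,
    'Path=/',
    `Max-Age=${maxAgeSeconds}`,
    'HttpOnly',
    'Secure',
    'SameSite=Lax',
  ].join('; ');
}

// Works with any response object exposing setHeader (Node http, Express).
function applySecurityDefaults(res, sessionId) {
  for (const [name, value] of Object.entries(securityHeaders())) {
    res.setHeader(name, value);
  }
  res.setHeader('Set-Cookie', sessionCookie(sessionId));
}
```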

Ask first (requires explicit approval):

Adding new authentication flows
Storing new categories of sensitive data (PII, payment info)
Adding new external service integrations
Changing CORS configuration
Adding file upload handlers
Modifying rate limits

Never do (hard stops):

Commit secrets to version control
Log passwords, tokens, or full credit card numbers
Trust client-side validation as a security boundary
Use eval() or innerHTML with user-provided data
Store auth tokens in localStorage
Expose stack traces or internal error details to users
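Two of these hard stops are easy to violate by accident in logging and error handling. Here's a minimal sketch of guarding against both. The key list, the `redact` and `toClientError` helper names, and the `requestId` field are hypothetical, and `redact` only handles plain JSON-like data.

```javascript
// Keys whose values should never reach a log line (compared case-insensitively).
const SENSITIVE_KEYS = new Set(['password', 'newpassword', 'token', 'authorization', 'cardnumber']);

// Returns a copy of plain JSON-like data with sensitive values masked.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, v]) =>
        SENSITIVE_KEYS.has(key.toLowerCase()) ? [key, '[REDACTED]'] : [key, redact(v)]
      )
    );
  }
  return value;
}

// The full error stays in server logs; the client gets a generic body
// plus an id that support can use to find the matching log entry.
function toClientError(err, requestId) {
  console.error({ requestId, message: err.message, stack: err.stack });
  return { status: 500, body: { error: 'Internal server error', requestId } };
}
```

For example, `redact(req.body)` would be what gets logged for a failed reset, never `req.body` itself.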

When to Invoke @security-auditor

Before shipping anything that:

Accepts user input (forms, query params, file uploads)
Handles authentication or sessions
Fetches URLs provided by users (webhooks, import-from-URL features)
Calls external APIs with stored credentials
Processes payment or PII data

After:

Adding a new npm dependency (npm audit + ask the auditor to check supply chain risk)
Changing CORS configuration
Adding any new public endpoint

Alongside @code-reviewer for:

AI-generated code (especially auth logic — Copilot generates plausible but often vulnerable patterns)
Code you inherited that nobody has security-reviewed

The LLM Security Trap

Here's a threat category most security guides don't cover: your own AI assistant.

If you're building features that use LLMs — chatbots, summarizers, AI agents — the model's output is untrusted data. Full stop. It must be treated exactly like user input.

// DANGEROUS: passing LLM output directly to the database
const sqlQuery = await llm.generate(`Write SQL to find tasks for: ${userQuery}`);
await db.query(sqlQuery);  // arbitrary SQL execution

// SAFE: parse defensively, validate, then act
let intent;
try {
  intent = TaskQuerySchema.parse(JSON.parse(await llm.replyJson(userQuery)));
} catch {
  throw new ValidationError('Could not parse request');
}
const tasks = await db.tasks.findMany({ where: buildSafeWhere(intent) });

@security-auditor checks for this specifically. Prompt injection (an attacker embedding instructions in text your LLM processes), excessive agency (your agent doing things it shouldn't have permission to do), and unbounded consumption (a crafted input that runs up your API costs) are all on its checklist.
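The safe snippet above leans on `TaskQuerySchema` and `buildSafeWhere` without showing them. Here is one hypothetical shape they could take, hand-rolled to stay dependency-free (a schema library would normally do this job). The field names and allowed statuses are invented for the example.

```javascript
// Hypothetical stand-ins for the names used in the snippet above.
const ALLOWED_STATUSES = ['open', 'in_progress', 'done']; // made-up task statuses

const TaskQuerySchema = {
  // Throws on anything that isn't a small, known-shape object.
  parse(input) {
    if (input === null || typeof input !== 'object' || Array.isArray(input)) {
      throw new TypeError('intent must be an object');
    }
    const { status, assignee, limit = 20 } = input;
    if (status !== undefined && !ALLOWED_STATUSES.includes(status)) {
      throw new TypeError('unknown status');
    }
    if (assignee !== undefined && (typeof assignee !== 'string' || assignee.length > 64)) {
      throw new TypeError('invalid assignee');
    }
    if (!Number.isInteger(limit) || limit < 1 || limit > 100) {
      throw new TypeError('invalid limit');
    }
    // Return only known fields, so anything extra the model emitted is dropped.
    return { status, assignee, limit };
  },
};

// Builds the filter from validated fields only; the model never writes SQL.
function buildSafeWhere(intent) {
  const where = {};
  if (intent.status !== undefined) where.status = intent.status;
  if (intent.assignee !== undefined) where.assignee = intent.assignee;
  return where;
}
```

The design choice is that the model can only pick values from an allow-list; the query itself is always assembled by your code.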

The Security Mindset Shift

Before using @security-auditor, I thought about security as a feature — something you add to a working system.

After: security is a constraint. You don't add it after the code works. You build it into every decision.

The difference in practice: when I write a database query, I parameterize it as I'm writing it — not after. When I build an endpoint, I add rate limiting and auth checks in the same commit — not later. When I store a token, I immediately ask what happens if that token is compromised.

The agent didn't teach me to be more careful. It taught me to look in more places.

Get the Template

@security-auditor is included in the copilot-workflow template — one setup, automatic on every session.

👉 github.com/panditAbhis/copilot-workflow

Next in the series: Part 5 — The /ship chatmode. One command that fans out code review, security audit, and simplification in sequence and gives you a single SHIP / DO NOT SHIP verdict.

Series navigation

Part	Title
1	Your Copilot Has No Memory. Here's How I Fixed That in 5 Minutes.
2	Stop Merging Blind: How I Use @code-reviewer Before Every PR
3	Never Fix a Bug Without Proof: The @test-engineer Prove-It Pattern
4	Think Like an Attacker: How I Use @security-auditor Before Every Production Deploy
5	One Command to Rule Them All: The /ship Chatmode
6	Stop Building the Wrong Thing: @spec-writer and @planner
7	A Day in the Life: Complete Session Walkthrough

DEV Community