Abhishek Pandit

Think Like an Attacker: How I Use @security-auditor Before Every Production Deploy

A lock on your front door doesn't make your house secure.

Not if the back door is open. Not if the window latch is broken. Not if you're handing spare keys to strangers without realizing it.

Security works the same way. Adding password hashing to your login endpoint doesn't make your app secure — not if the password reset endpoint doesn't expire tokens, or the file upload handler accepts any file type, or the database query for your admin panel is built by concatenating user input.

Most developers secure the obvious thing and leave a dozen attack surfaces unchecked. Not because they're careless — because they're only looking at one door at a time.

@security-auditor changes that. Instead of checking individual controls, it starts from a map of your entire attack surface.

This is Part 4 of the copilot-workflow series.


The Bank Vault Analogy

Imagine you're designing a bank vault.

A naive approach: put a very thick door on the vault. Done.

A security engineer's approach: draw a map of everything that needs protecting, then ask "how would a thief get to it?" They'd check the vault door — but also the air vents, the maintenance tunnels, the manager's office, the cleaning crew's access, and the bank's Wi-Fi network.

This is threat modeling. You don't start with controls. You start with a map of what you're protecting and all the ways someone could get to it.

@security-auditor runs this process on your code automatically. It's not a checklist bot. It's an attacker's mindset, systematically applied.


How @security-auditor Thinks

Before flagging a single vulnerability, the agent does two things:

1. Maps your trust boundaries

Trust boundaries are places where data crosses from one trust level to another. An incoming HTTP request is untrusted. Data read from your own database is usually treated as trusted, even though it may hold user input that was stored earlier. An uploaded file is untrusted. Whatever your auth middleware has processed is trusted, but only if the middleware is correct.

Every trust boundary is potential attack surface.

2. Runs STRIDE

STRIDE is a structured way to think about threats at each boundary:

| Letter | Threat | Question |
|--------|--------|----------|
| S | Spoofing | Can someone pretend to be a legitimate user or service? |
| T | Tampering | Can someone modify data in transit or at rest? |
| R | Repudiation | Can someone deny an action they took? |
| I | Information disclosure | Can data leak to unauthorized parties? |
| D | Denial of service | Can the system be overwhelmed? |
| E | Elevation of privilege | Can someone gain more access than they should have? |

Then it maps findings to the OWASP Top 10, the industry-standard list of the most critical web security risks.
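To make the two steps concrete, here is a minimal sketch of how boundaries and STRIDE combine into a review checklist. It is not how @security-auditor is implemented; the boundary names and the `buildChecklist` helper are assumptions made up for illustration.

```javascript
// Hypothetical trust boundaries for a small web app (illustrative names only).
const boundaries = [
  { name: 'HTTP request -> API handler', dataFrom: 'untrusted' },
  { name: 'File upload -> storage', dataFrom: 'untrusted' },
  { name: 'API handler -> database', dataFrom: 'trusted' },
];

// The six STRIDE categories, each paired with its question from the table.
const STRIDE = [
  { letter: 'S', threat: 'Spoofing', question: 'Can someone pretend to be a legitimate user or service?' },
  { letter: 'T', threat: 'Tampering', question: 'Can someone modify data in transit or at rest?' },
  { letter: 'R', threat: 'Repudiation', question: 'Can someone deny an action they took?' },
  { letter: 'I', threat: 'Information disclosure', question: 'Can data leak to unauthorized parties?' },
  { letter: 'D', threat: 'Denial of service', question: 'Can the system be overwhelmed?' },
  { letter: 'E', threat: 'Elevation of privilege', question: 'Can someone gain more access than they should have?' },
];

// Cross every boundary with every STRIDE category: one question per pair.
function buildChecklist(list) {
  return list.flatMap((b) =>
    STRIDE.map((s) => ({ boundary: b.name, threat: s.threat, question: s.question }))
  );
}

const checklist = buildChecklist(boundaries);
console.log(`${checklist.length} questions to answer`);
```

Three boundaries times six categories gives eighteen questions, which is why the map comes first: every boundary you forget to list removes six questions from the review.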


A Real Example: The Password Reset Trap

Here's an endpoint I've seen in real codebases — a password reset handler:

app.post('/api/auth/reset-password', async (req, res) => {
  const { token, newPassword } = req.body;
  const user = await db.query(
    `SELECT * FROM users WHERE reset_token = '${token}'`
  );
  if (!user) return res.status(404).json({ error: 'Invalid token' });
  await db.query(
    `UPDATE users SET password = '${newPassword}' WHERE id = ${user.id}`
  );
  res.json({ success: true });
});

It "works." Users can reset their passwords. Tests pass. Looks fine.

Let me show you what @security-auditor finds:

@security-auditor Audit this password reset endpoint for security vulnerabilities.

Security Audit Report

Critical: 4 | High: 2 | Medium: 1

[CRITICAL] SQL Injection — token lookup

  • Location: Line 3
  • Description: User-supplied token is concatenated directly into SQL
  • Impact: Attacker can bypass authentication entirely. Input ' OR '1'='1 retrieves any user.
  • Proof of concept: token = "' OR '1'='1"
  • Fix: db.query('SELECT * FROM users WHERE reset_token = $1', [token])

[CRITICAL] Plaintext Password Storage

  • Location: Line 7
  • Description: newPassword stored as raw string — no hashing
  • Impact: Database breach exposes every user's actual password
  • Fix: const hashed = await bcrypt.hash(newPassword, 12); db.query('UPDATE users SET password = $1 WHERE id = $2', [hashed, user.id])

[CRITICAL] SQL Injection — UPDATE statement

  • Location: Line 7
  • Description: Both newPassword and user.id concatenated into SQL
  • Fix: Use parameterized query as shown above

[CRITICAL] Reset token not invalidated after use

  • Location: Missing
  • Description: Token remains valid after password reset — replayable
  • Impact: Anyone who intercepts the token can reset the password again
  • Fix: UPDATE users SET password = $1, reset_token = NULL WHERE id = $2

[HIGH] No token expiry check

  • Location: Missing
  • Description: Tokens never expire — valid indefinitely
  • Fix: Add reset_token_expires_at column, check > NOW() in query, expire after 15 minutes

[HIGH] No rate limiting

  • Location: Missing
  • Description: Unlimited attempts — brute-force token guessing is possible
  • Fix: app.use('/api/auth/', rateLimit({ windowMs: 900000, max: 10 }))

[MEDIUM] Token enumeration via error message

  • Location: Line 6
  • Description: 404 reveals whether a token exists — helps attackers
  • Fix: Return 200 with generic message regardless of outcome

That endpoint looked functional. The actual security posture: four critical vulnerabilities, any one of which could result in account takeover or full database exposure.
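Applying the report's fixes together might look like the sketch below. It is one possible shape under stated assumptions, not the agent's output: `db.query(text, params)` is assumed to resolve to `{ rows }` in the style of node-postgres, `reset_token_expires_at` is the column name suggested in the report, and the logic is pulled into a hypothetical `resetPassword` function so it can be exercised without Express. Hashing uses scrypt from Node's standard library only so the sketch runs without dependencies; the report's own fix uses `bcrypt.hash(newPassword, 12)`.

```javascript
const crypto = require('node:crypto');

// Salted scrypt hash, stored as "salt:hash" in hex.
function hashPassword(password) {
  const salt = crypto.randomBytes(16);
  const hash = crypto.scryptSync(password, salt, 64);
  return `${salt.toString('hex')}:${hash.toString('hex')}`;
}

// Returns true when the password was changed, false for any invalid request.
async function resetPassword(db, token, newPassword) {
  // Validate input at the boundary before it reaches the database.
  if (typeof token !== 'string' || typeof newPassword !== 'string' || newPassword === '') {
    return false;
  }

  // Parameterized lookup that also enforces token expiry.
  const found = await db.query(
    'SELECT id FROM users WHERE reset_token = $1 AND reset_token_expires_at > NOW()',
    [token]
  );
  if (found.rows.length !== 1) return false;

  // Store only the hash, and clear the token so it cannot be replayed.
  await db.query(
    'UPDATE users SET password = $1, reset_token = NULL, reset_token_expires_at = NULL WHERE id = $2',
    [hashPassword(newPassword), found.rows[0].id]
  );
  return true;
}

// Express wiring (not executed here). The response is the same either way,
// following the report's MEDIUM finding:
// app.post('/api/auth/reset-password', async (req, res) => {
//   await resetPassword(db, req.body.token, req.body.newPassword);
//   res.json({ message: 'If the token was valid, the password has been updated.' });
// });
```

Rate limiting is left to middleware mounted on `/api/auth/`, as in the report. A stricter variant could fold the lookup and the update into a single `UPDATE ... RETURNING id` statement so that two concurrent requests cannot both consume the same token.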


The OWASP Top 10: Your Security Baseline

@security-auditor maps every finding to the OWASP Top 10, a list of the most critical web application security risks that is updated every few years from contributed vulnerability data and a community survey. The categories below are from the 2021 edition.

You don't need to memorize it. But understanding the categories helps you recognize when to invoke the agent:

| # | Risk | "Invoke when you're..." |
|---|------|-------------------------|
| A01 | Broken Access Control | Building any endpoint that checks ownership |
| A02 | Cryptographic Failures | Storing passwords, tokens, or PII |
| A03 | Injection | Writing any database query with user input |
| A04 | Insecure Design | Designing auth flows, payment logic |
| A05 | Security Misconfiguration | Setting up CORS, headers, error messages |
| A06 | Vulnerable Components | Adding a new npm dependency |
| A07 | Authentication Failures | Building login, registration, password reset |
| A08 | Software Integrity | Setting up CI/CD or deployment pipelines |
| A09 | Logging Failures | Building audit trails or error handlers |
| A10 | SSRF | Building webhooks, URL imports, link previews |

The Three-Tier Rule

The most useful mental model from @security-auditor is the three-tier boundary system. Before writing any security-sensitive code, you know exactly what category it falls into:

Always do (no human approval needed):

  • Validate all external input at the boundary
  • Parameterize all database queries
  • Hash passwords with bcrypt (cost factor ≥12) or argon2
  • Set security headers (CSP, HSTS, X-Frame-Options)
  • Use httpOnly + secure + sameSite cookies for sessions
  • Run npm audit before every release

Ask first (requires explicit approval):

  • Adding new authentication flows
  • Storing new categories of sensitive data (PII, payment info)
  • Adding new external service integrations
  • Changing CORS configuration
  • Adding file upload handlers
  • Modifying rate limits

Never do (hard stops):

  • Commit secrets to version control
  • Log passwords, tokens, or full credit card numbers
  • Trust client-side validation as a security boundary
  • Use eval() or innerHTML with user-provided data
  • Store auth tokens in localStorage
  • Expose stack traces or internal error details to users
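
The "never log passwords or tokens" rule is easier to keep when redaction happens in one place rather than at every call site. The sketch below shows one way to do that; the `redact` helper and its key list are hypothetical, and it assumes plain JSON-like objects (no Dates, Maps, or class instances).

```javascript
// Keys treated as sensitive. An illustrative list: extend it for your own payloads.
const SENSITIVE_KEYS = new Set([
  'password', 'newpassword', 'token', 'reset_token', 'authorization', 'cardnumber',
]);

// Return a deep copy of `value` with sensitive fields replaced by a marker.
function redact(value) {
  if (Array.isArray(value)) return value.map(redact);
  if (value !== null && typeof value === 'object') {
    return Object.fromEntries(
      Object.entries(value).map(([key, inner]) =>
        SENSITIVE_KEYS.has(key.toLowerCase()) ? [key, '[REDACTED]'] : [key, redact(inner)]
      )
    );
  }
  return value; // strings, numbers, booleans, null pass through unchanged
}

const body = { email: 'a@example.com', newPassword: 'hunter2', meta: { token: 'abc' } };
console.log(JSON.stringify(redact(body)));
```

Calling `redact` on request bodies before they reach the logger keeps the original object untouched for the handler while the log line carries only the marker.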

When to Invoke @security-auditor

Before shipping anything that:

  • Accepts user input (forms, query params, file uploads)
  • Handles authentication or sessions
  • Fetches URLs provided by users (webhooks, import-from-URL features)
  • Calls external APIs with stored credentials
  • Processes payment or PII data

After:

  • Adding a new npm dependency (npm audit + ask the auditor to check supply chain risk)
  • Changing CORS configuration
  • Adding any new public endpoint

Alongside @code-reviewer for:

  • AI-generated code (especially auth logic — Copilot generates plausible but often vulnerable patterns)
  • Code you inherited that nobody has security-reviewed
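
One item from the "before shipping" list, fetching URLs provided by users, is the classic SSRF entry point. A minimal sketch of an allow-list check follows. The host names and the `parseAllowedUrl` helper are assumptions for illustration, and an allow-list is only one possible policy.

```javascript
// Hypothetical allow-list: the only hosts this app is willing to fetch from.
const ALLOWED_HOSTS = new Set(['hooks.example.com', 'api.example.com']);

// Returns a parsed URL if it passes the policy, otherwise null.
function parseAllowedUrl(raw) {
  let url;
  try {
    url = new URL(raw);
  } catch {
    return null; // not a valid absolute URL
  }
  if (url.protocol !== 'https:') return null;        // rejects http:, file:, and others
  if (url.username || url.password) return null;     // no embedded credentials
  if (!ALLOWED_HOSTS.has(url.hostname)) return null; // exact host match only
  return url;
}

console.log(parseAllowedUrl('https://hooks.example.com/notify') !== null); // true
console.log(parseAllowedUrl('http://169.254.169.254/latest/meta-data'));   // null
```

The check only covers the URL you were given. If the HTTP client follows redirects, each redirect target needs the same check, or redirects should be disabled for these requests.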

The LLM Security Trap

Here's a threat category most security guides don't cover: your own AI assistant.

If you're building features that use LLMs — chatbots, summarizers, AI agents — the model's output is untrusted data. Full stop. It must be treated exactly like user input.

// DANGEROUS: passing LLM output directly to the database
const sqlQuery = await llm.generate(`Write SQL to find tasks for: ${userQuery}`);
await db.query(sqlQuery);  // arbitrary SQL execution

// SAFE: parse defensively, validate, then act
let intent;
try {
  intent = TaskQuerySchema.parse(JSON.parse(await llm.replyJson(userQuery)));
} catch {
  throw new ValidationError('Could not parse request');
}
const tasks = await db.tasks.findMany({ where: buildSafeWhere(intent) });

@security-auditor checks for this specifically. Prompt injection (an attacker embedding instructions in text your LLM processes), excessive agency (your agent doing things it shouldn't have permission to do), and unbounded consumption (a crafted input that runs up your API costs) are all on its checklist.
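
The SAFE branch above leans on `TaskQuerySchema` and `buildSafeWhere` without showing them. Here is one way they could look, written without a schema library so it runs as-is. The field names (`status`, `assignee`), the allowed values, and the `parseTaskQuery` stand-in are assumptions for illustration, not the template's actual code.

```javascript
// Hypothetical allowed values for the status filter.
const ALLOWED_STATUS = new Set(['open', 'done']);

// Stand-in for TaskQuerySchema.parse(JSON.parse(...)): accept only a known shape.
function parseTaskQuery(raw) {
  const data = JSON.parse(raw); // throws SyntaxError when the model returns non-JSON
  if (data === null || typeof data !== 'object' || Array.isArray(data)) {
    throw new TypeError('Expected a JSON object');
  }
  const intent = {};
  for (const [key, value] of Object.entries(data)) {
    if (key === 'status' && ALLOWED_STATUS.has(value)) {
      intent.status = value;
    } else if (key === 'assignee' && typeof value === 'string' && value.length <= 64) {
      intent.assignee = value;
    } else {
      throw new TypeError(`Unexpected or invalid field: ${key}`);
    }
  }
  return intent;
}

// Build the filter from validated fields only. Nothing from the model is used
// as SQL text or as a column name.
function buildSafeWhere(intent) {
  const where = {};
  if (intent.status) where.status = intent.status;
  if (intent.assignee) where.assignee = intent.assignee;
  return where;
}

console.log(buildSafeWhere(parseTaskQuery('{"status":"open","assignee":"sam"}')));
```

The model can only choose among filters the code already knows how to run. Anything outside that shape is rejected before it gets near the database.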


The Security Mindset Shift

Before using @security-auditor, I thought about security as a feature — something you add to a working system.

After: security is a constraint. You don't add it after the code works. You build it into every decision.

The difference in practice: when I write a database query, I parameterize it as I'm writing it — not after. When I build an endpoint, I add rate limiting and auth checks in the same commit — not later. When I store a token, I immediately ask what happens if that token is compromised.
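
For the token question, one common answer is to store only a hash of the token alongside an expiry, so a leaked database row cannot be replayed as a reset link. The sketch below assumes hypothetical helper names (`issueResetToken`, `isTokenValid`) and reuses the 15-minute lifetime suggested in the audit report.

```javascript
const crypto = require('node:crypto');

const TOKEN_TTL_MS = 15 * 60 * 1000; // 15 minutes

// Only this digest is stored; the raw token goes to the user and nowhere else.
function hashToken(token) {
  return crypto.createHash('sha256').update(token).digest('hex');
}

// Returns the raw token to email and the record to persist.
function issueResetToken(now = Date.now()) {
  const token = crypto.randomBytes(32).toString('hex'); // 256 bits of randomness
  return { token, record: { tokenHash: hashToken(token), expiresAt: now + TOKEN_TTL_MS } };
}

// Check a presented token against the stored record.
function isTokenValid(presented, record, now = Date.now()) {
  if (typeof presented !== 'string' || now >= record.expiresAt) return false;
  const a = Buffer.from(hashToken(presented), 'hex');
  const b = Buffer.from(record.tokenHash, 'hex');
  // Constant-time comparison of the two digests.
  return a.length === b.length && crypto.timingSafeEqual(a, b);
}
```

A fast hash such as SHA-256 is reasonable here because the token is long and random. Passwords are a different case and still need a slow hash such as bcrypt or argon2.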

The agent didn't teach me to be more careful. It taught me to look in more places.


Get the Template

@security-auditor is included in the copilot-workflow template — one setup, automatic on every session.

👉 github.com/panditAbhis/copilot-workflow

Next in the series: Part 5 — The /ship chatmode. One command that fans out code review, security audit, and simplification in sequence and gives you a single SHIP / DO NOT SHIP verdict.

