DEV Community

Patience Mpofu
Writing Custom SAST Rules for Vulnerabilities Your Scanner Doesn't Cover

Every SAST tool ships with a default ruleset. And every default ruleset has gaps.

Sometimes the gap is a framework-specific vulnerability that the tool's authors didn't anticipate. Sometimes it's an internal pattern unique to your organisation — a custom authentication library, a legacy data access layer, a home-grown serialisation format that every engineer knows is sensitive but no off-the-shelf rule covers.

This is the article where I show you how to close those gaps using the YAML rule engine I built. No Python required. No rebuilding the scanner. Just a YAML file and an understanding of what you're trying to detect.

By the end, you'll have written three custom rules from scratch — a Java-specific one, a Node.js-specific one, and an organisation-level one that catches usage of a fictional internal library pattern. The process is the same for any vulnerability you want to target.


Before You Write a Rule: The Four Questions

Every good detection rule starts with the same four questions. Skip them and you end up with either a rule that fires on everything or a rule that fires on nothing.

1. What does the vulnerable code actually look like in text?
Not the conceptual vulnerability — the literal characters that appear on screen when a developer writes the bad pattern. Be specific. "SQL injection" is not an answer. "SELECT * FROM users WHERE id = " + userId is an answer.

2. What does safe code look like?
You need the counterexample. If your pattern would also match safe code, you have a false positive problem. If you can't articulate what safe code looks like, you don't understand the vulnerability well enough to write a rule yet.

3. Which languages does this apply to?
Some patterns are universal — hardcoded secrets look similar everywhere. Others are language- or framework-specific. Writing a broad rule when a narrow one is appropriate generates noise and erodes trust in the scanner.

4. What's the right confidence level?
HIGH means "this is almost certainly a real vulnerability." MEDIUM means "this warrants human review." LOW means "this is suspicious but probably benign." If you're unsure, start at MEDIUM and tighten it after you see the results on real code.

Now let's write some rules.


The Rule Format (Quick Reference)

rules:
  - id: CUSTOM-001
    title: Short descriptive title
    description: >
      What the vulnerability is and why it matters.
    severity: CRITICAL | HIGH | MEDIUM | LOW
    category: Injection | Secrets | Cryptography | Authentication | Misconfiguration | Path Traversal
    cwe: CWE-XXX
    owasp: AXX:2021 - Category Name
    languages: ["python", "java", "javascript", "typescript", "csharp", "kotlin", "go", "ruby", "php"]
    remediation: >
      What the developer should do instead.
    patterns:
      - regex: 'your-pattern-here'
        confidence: HIGH | MEDIUM | LOW

Save it anywhere — the scanner discovers all YAML files in the rules/ directory automatically. If you want to keep your custom rules separate from the core ruleset, create a rules/custom/ subdirectory and point the scanner at it:

python main.py ./src --rules ./rules/custom/
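Internally, that discovery can be as simple as a recursive glob for YAML files. The helper below is a hedged sketch of the behaviour — the scanner's actual loader may differ in details such as ordering, schema validation, and duplicate-ID handling:

```python
from pathlib import Path

def discover_rule_files(rules_dir: str) -> list[Path]:
    """Recursively collect every .yaml/.yml file under rules_dir.

    A sketch of the scanner's auto-discovery; the real implementation
    may also validate rule schemas and de-duplicate rule IDs.
    """
    root = Path(rules_dir)
    return sorted(p for p in root.rglob("*") if p.suffix in {".yaml", ".yml"})
```

Because discovery is recursive, a rules/custom/ subdirectory needs no extra registration — its files are picked up on the next scan.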

Rule 1: Java — Spring @Transactional on Public Methods Exposing Sensitive Data

This one is Java-specific and framework-specific. It's the kind of vulnerability that no generic SAST tool covers because it requires understanding Spring's transaction management model.

The vulnerability: In Spring, @Transactional annotations on public methods in @Service or @Repository classes work as expected because Spring creates a proxy. But when @Transactional is placed on a private method, Spring's proxy-based AOP cannot intercept it — the transaction is silently ignored. This is especially dangerous when the private method performs database writes that need to be atomic.

This isn't a traditional security vulnerability in the CVE sense — it's a correctness issue that can become a security issue when the failed transaction silently corrupts data, leaves partial writes in the database, or bypasses audit logging that was supposed to be transactional.

What safe code looks like: @Transactional on public methods, or using TransactionTemplate for programmatic transaction management on private methods.

What vulnerable code looks like:

@Service
public class PaymentService {

    @Transactional  // silent no-op — Spring proxy can't intercept private methods
    private void processRefund(String accountId, BigDecimal amount) {
        ledgerRepo.debit(accountId, amount);
        auditRepo.log("REFUND", accountId, amount);  // may not be in same transaction
    }
}

The rule:

rules:
  - id: JAVA-001
    title: "@Transactional on Private Method: Transaction Silently Ignored"
    description: >
      Spring's proxy-based AOP cannot intercept @Transactional annotations on
      private methods. The annotation is silently ignored, meaning the method
      executes without transaction management. This can cause partial writes,
      data corruption, and bypassed audit logging in database operations.
    severity: HIGH
    category: Misconfiguration
    cwe: CWE-362
    owasp: A05:2021 - Security Misconfiguration
    languages: ["java"]
    remediation: >
      Move @Transactional to public methods only. For private methods that
      require transaction management, either make them public, use
      TransactionTemplate for programmatic transactions, or restructure
      the code so the public caller method is annotated instead.
    patterns:
      - regex: '@Transactional[\s\S]{0,100}private\s+\w+\s+\w+\s*\('
        confidence: HIGH
      - regex: 'private\s+\w+\s+\w+\s*\([\s\S]{0,100}@Transactional'
        confidence: MEDIUM
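Before building a full test file, the HIGH-confidence pattern can be sanity-checked directly with Python's re module — the pattern string below is copied verbatim from the rule. Note that `private\s+\w+\s+\w+` assumes a simple return type; a generic return like `List<String>` would need a broader pattern:

```python
import re

# JAVA-001's first pattern, copied verbatim from the rule above
pattern = re.compile(r'@Transactional[\s\S]{0,100}private\s+\w+\s+\w+\s*\(')

vulnerable = """@Transactional
private void processRefund(String accountId, BigDecimal amount) {"""

safe = """@Transactional
public void processPayment(String id, BigDecimal amount) {"""

assert pattern.search(vulnerable)   # fires: annotation followed by private method
assert not pattern.search(safe)     # silent: public method is fine
```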

Testing your rule — create a test file test_java_transactional.java and verify it fires:

// Should fire — JAVA-001
@Transactional
private void updateBalance(String id, BigDecimal amount) { }

// Should NOT fire — public method is fine
@Transactional
public void processPayment(String id, BigDecimal amount) { }

Run:

python main.py ./test_java_transactional.java --rules ./rules/custom/java-rules.yaml

Rule 2: Node.js — child_process.exec with Template Literals

This one targets a Node.js-specific pattern that's extremely common in backend services written by developers who came from a systems programming background.

The vulnerability: child_process.exec() passes its argument to the shell for execution. If that argument contains user-controlled input — even through a template literal that looks clean — it enables OS command injection. The shell will happily interpret special characters like ;, &&, |, and backticks as command separators or subshell operators.

What safe code looks like: child_process.execFile() or child_process.spawn() with arguments as an array — these bypass the shell entirely and treat the command and arguments as separate values.

What vulnerable code looks like:

// Dangerous — shell injection possible
const filename = req.body.filename;
exec(`convert ${filename} -resize 800x600 output.jpg`, callback);

// Also dangerous — looks safer but isn't
exec("ffmpeg -i " + userInput + " output.mp4", callback);

What safe code looks like:

// Safe — no shell involved
execFile('convert', [filename, '-resize', '800x600', 'output.jpg'], callback);

// Safe — spawn with args array
spawn('ffmpeg', ['-i', userInput, 'output.mp4']);

The rule:

rules:
  - id: NODE-001
    title: "child_process.exec with Dynamic Input: OS Command Injection"
    description: >
      child_process.exec() passes its argument to the system shell, enabling
      OS command injection when the argument includes user-controlled input,
      template literals, or string concatenation. Attackers can inject shell
      metacharacters to execute arbitrary commands on the host system.
    severity: CRITICAL
    category: Injection
    cwe: CWE-78
    owasp: A03:2021 - Injection
    languages: ["javascript", "typescript"]
    remediation: >
      Replace exec() with execFile() or spawn() and pass command arguments
      as an array. These functions bypass the shell entirely and treat each
      argument as a literal string, preventing shell metacharacter injection.
      Never concatenate user input into exec() arguments.
    patterns:
      - regex: 'exec\s*\(\s*`[^`]*\$\{'
        confidence: HIGH
      - regex: 'exec\s*\(\s*["''][^"'']*["'']\s*\+\s*\w'
        confidence: HIGH
      - regex: 'exec\s*\(\s*\w+\s*\+'
        confidence: MEDIUM

The three patterns cover the three common forms: template literals with interpolation, concatenation with a string prefix, and concatenation with a variable. The last one is MEDIUM because exec("mycommand" + options) where options is a static config value is less dangerous — but still warrants review.
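You can confirm the intent of the template-literal and variable-concatenation patterns the same way — both regexes below are copied verbatim from the rule:

```python
import re

# NODE-001: template-literal interpolation and variable concatenation
tpl = re.compile(r'exec\s*\(\s*`[^`]*\$\{')
var_concat = re.compile(r'exec\s*\(\s*\w+\s*\+')

assert tpl.search('exec(`convert ${filename} output.jpg`, callback);')
assert var_concat.search('exec(cmd + " --verbose", callback);')

# execFile/spawn never match: both patterns require "(" directly after "exec"
assert not tpl.search("execFile('convert', [filename], callback);")
assert not var_concat.search("spawn('ffmpeg', ['-i', userInput, 'output.mp4']);")
```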


Rule 3: Organisation-Level — Internal Audit Logger Bypass

This is the most interesting type of custom rule: one that only makes sense for your specific codebase.

Imagine your organisation has an internal library called AuditLogger that must be called for any database mutation. The security policy is clear: every write operation must produce an audit event. But the library has a skipAudit() method that was added for performance testing and was never supposed to reach production code.

This isn't in any public CVE database. No off-the-shelf SAST tool would ever flag it. But it's a real security control bypass in your organisation's context.

The rule:

rules:
  - id: ORG-001
    title: "AuditLogger.skipAudit(): Security Control Bypass"
    description: >
      The skipAudit() method on AuditLogger disables audit event generation
      for database mutations. This method was introduced for load testing
      only and must never appear in production code. Its presence bypasses
      the organisation's regulatory audit trail requirement and may
      constitute a compliance violation.
    severity: CRITICAL
    category: Misconfiguration
    cwe: CWE-778
    owasp: A09:2021 - Security Logging and Monitoring Failures
    languages: ["java", "kotlin", "csharp"]
    remediation: >
      Remove skipAudit() immediately. All database mutations must generate
      audit events via AuditLogger. If performance is a concern, use
      AuditLogger.asyncLog() instead, which queues events without blocking
      the main thread. Contact the security team if an exemption is required.
    patterns:
      - regex: '\.skipAudit\s*\('
        confidence: HIGH
      - regex: 'AuditLogger\s*\.\s*skip'
        confidence: HIGH

Notice what this rule does that a generic tool can't: it encodes your organisation's security policy directly into the scanner. The remediation text names the correct alternative (asyncLog()). The description mentions the regulatory context. The severity is CRITICAL because in this fictional organisation, bypassing audit logging is a compliance issue, not just a best practice.

This is the highest-value type of custom rule because it's completely unavailable from any third-party source.


Multi-Pattern Rules: Increasing Coverage Without Losing Precision

One pattern rarely catches all instances of a vulnerability. The best rules use multiple patterns with appropriate confidence levels to maximise coverage while communicating certainty to the reviewer.

Here's a well-structured multi-pattern rule for detecting hardcoded database credentials in connection strings — a pattern that appears differently across languages and frameworks:

rules:
  - id: CUSTOM-DB-001
    title: "Hardcoded Database Credentials in Connection String"
    description: >
      Database connection strings with embedded credentials expose sensitive
      authentication material in source code, version control history, and
      build artifacts.
    severity: HIGH
    category: Secrets
    cwe: CWE-798
    owasp: A07:2021 - Identification and Authentication Failures
    languages: ["java", "csharp", "python", "javascript", "typescript", "kotlin"]
    remediation: >
      Move credentials to environment variables or a secrets manager such as
      AWS Secrets Manager, HashiCorp Vault, or Azure Key Vault. Never commit
      credentials to version control.
    patterns:
      # JDBC connection strings
      - regex: 'jdbc:[a-z]+://[^/]+/[^?]+\?.*password=[^&\s"'']{3,}'
        confidence: HIGH
      # .NET connection strings
      - regex: 'Password\s*=\s*[^;"\s]{4,}\s*;'
        confidence: HIGH
      # Generic password assignment near connection context
      - regex: '(conn|connection|db).*password\s*=\s*["''][^"'']{4,}["'']'
        confidence: MEDIUM
      # SQLAlchemy / Django database URLs
      - regex: '(postgresql|mysql|sqlite|mongodb)://\w+:[^@\s"'']{4,}@'
        confidence: HIGH

Each pattern has a different confidence because each has a different false positive profile. JDBC connection strings with password parameters are nearly always real findings. The generic connection.password = pattern might match configuration loading code where the value comes from an environment variable on the right-hand side.
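The difference in false positive profile is easy to demonstrate. Below, the JDBC pattern and the generic pattern from the rule are run against a real hardcoded secret and a harmless configuration placeholder — only the generic one fires on the placeholder, which is exactly why it carries MEDIUM confidence:

```python
import re

# CUSTOM-DB-001: the HIGH-confidence JDBC pattern and the MEDIUM generic one
jdbc = re.compile(r"""jdbc:[a-z]+://[^/]+/[^?]+\?.*password=[^&\s"']{3,}""")
generic = re.compile(r"""(conn|connection|db).*password\s*=\s*["'][^"']{4,}["']""")

hardcoded = 'url = "jdbc:postgresql://db01/prod?user=app&password=hunter2hunter2"'
placeholder = 'connection.password = "${DB_PASSWORD}"  # resolved from env at startup'

assert jdbc.search(hardcoded)        # HIGH: almost always a real secret
assert not jdbc.search(placeholder)
assert generic.search(placeholder)   # MEDIUM: fires on a harmless placeholder
```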


Testing Your Custom Rules

Before you add a rule to your pipeline, test it against both positive and negative cases.

Create a dedicated test file with clearly labelled sections:

# test_custom_rules.py

# --- SHOULD FIRE ---
# NODE-001: exec with template literal
exec(`convert ${userInput} output.jpg`)

# CUSTOM-DB-001: hardcoded JDBC credentials
conn = "jdbc:postgresql://localhost/mydb?user=admin&password=supersecret123"

# --- SHOULD NOT FIRE ---
# Safe: spawn with args array
spawn('convert', [userInput, 'output.jpg'])

# Safe: password from environment
conn = f"jdbc:postgresql://localhost/mydb?user=admin&password={os.getenv('DB_PASS')}"

Then run the scanner and verify the output matches your expectations:

python main.py ./test_custom_rules.py --rules ./rules/custom/ --format json

Check that:

  • Every SHOULD FIRE comment corresponds to a finding in the output
  • Every SHOULD NOT FIRE comment has no corresponding finding
  • The confidence and severity levels match what you intended

If a false positive appears, either tighten the regex or downgrade the confidence level. If a true positive is missed, your pattern isn't covering that form of the vulnerability.
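The same expectations can be automated so a later pattern edit can't silently regress. A minimal, hypothetical harness — using NODE-001's first pattern as the example — pairs each snippet with whether it should fire:

```python
import re

# Hypothetical regression harness for rule patterns (NODE-001's first pattern)
patterns = [r'exec\s*\(\s*`[^`]*\$\{']

cases = [
    ("exec(`convert ${userInput} output.jpg`)", True),      # SHOULD FIRE
    ("spawn('convert', [userInput, 'output.jpg'])", False), # SHOULD NOT FIRE
]

for snippet, should_fire in cases:
    fired = any(re.search(p, snippet) for p in patterns)
    assert fired == should_fire, f"unexpected result for: {snippet}"
```

Run it in CI next to the scanner itself; any pattern change that alters behaviour on the labelled cases fails immediately.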

The Broader Point: Rules as Institutional Knowledge

The most valuable thing about a YAML-driven rule engine isn't the rules it ships with. It's the rules your team writes over time.

Every time a security engineer finds a vulnerability in a code review, there's a question worth asking: could this have been caught by a scanner rule? If the answer is yes, write the rule. Now the scanner catches that pattern forever, across every future PR, without anyone needing to remember it.

Rules become institutional knowledge. They encode the hard-won understanding of what goes wrong in your specific codebase, your specific frameworks, your specific compliance requirements. That's something no off-the-shelf tool can give you — and it compounds over time.


The full scanner and core ruleset are at github.com/pgmpofu/sast-tool. Drop your custom rules in rules/ and they're picked up automatically on the next scan.

Next up: embedding the scanner in a CI/CD pipeline with configurable severity thresholds — how to go from zero security gates to blocking builds on critical findings without breaking your team's deployment workflow.
