DIEGO FABRIZIO ANDIA NAVARRO

Applying a SAST Tool to a Real Application: Finding and Fixing 10 Vulnerabilities with Bandit

Abstract

Static Application Security Testing (SAST) tools read source code without running it, looking for patterns that are known to be dangerous: string-built SQL queries, hardcoded secrets, unsafe deserialization, and similar mistakes that are easy to write and easy to miss in review. This article applies Bandit — a SAST tool for Python listed in OWASP's Source Code Analysis Tools directory — to a small order-lookup service for an online store. The vulnerable version was scanned first, producing 10 real findings across three severity levels. Each finding was then fixed individually, and the fixed file was scanned again, surfacing one additional issue Bandit caught that had been missed during manual remediation, before reaching a fully clean scan. The goal isn't to demonstrate Bandit's feature list — it's to show what a SAST tool actually catches in a piece of code that looks unremarkable at a glance.

Why static analysis catches what code review often doesn't

A SAST tool doesn't understand what your application does — it doesn't know what a "discount" or an "order" means. What it knows is a large catalog of dangerous patterns: eval() on user input, subprocess calls with shell=True, MD5 used where a password hash is expected, pickle.loads() on data that could come from outside the process. A human reviewer skimming a pull request can miss these because they're often one line buried in otherwise-correct logic. A SAST tool reads every line, every time, without getting tired.
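
To make the idea of pattern matching concrete, here is a minimal sketch of a toy scanner built on Python's standard ast module. It is not how Bandit is implemented; the function name find_risky_calls and the two rules are assumptions chosen for illustration, and a real tool carries a far larger catalog.

```python
import ast


def find_risky_calls(source: str) -> list[tuple[int, str]]:
    """Return (line_number, message) pairs for two dangerous call patterns."""
    findings = []
    # Parse the source into a syntax tree; the code is never executed.
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        # Pattern 1: a direct call to the built-in eval().
        if isinstance(node.func, ast.Name) and node.func.id == "eval":
            findings.append((node.lineno, "call to eval()"))
        # Pattern 2: any call that passes the literal keyword shell=True.
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                findings.append((node.lineno, "call with shell=True"))
    return sorted(findings)


# Hypothetical input, modeled on the kind of code discussed in this article.
SAMPLE = '''
import subprocess

def discount(expr):
    return eval(expr)

def export(name):
    subprocess.call("cp orders.db /tmp/" + name, shell=True)
'''

if __name__ == "__main__":
    for line, message in find_risky_calls(SAMPLE):
        print(f"line {line}: {message}")
```

The scanner has no idea what discount or export mean. It only recognizes the shape of the call, which is why it never skips a line and also why it cannot judge intent.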

The real-world example: an order service with 7 intentional mistakes

The starting point is a small Flask service with the kind of issues that show up in real codebases — not contrived examples, but the exact shape of mistakes that get written under deadline pressure:

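# order_service.py (excerpt)
import hashlib
import pickle
import subprocess

DB_PASSWORD = "Sup3rSecret!2024"  # hardcoded secret

def find_order_by_customer(customer_name):
    conn = get_connection()  # helper defined elsewhere in the service
    cursor = conn.cursor()
    # User input is formatted straight into the SQL text
    query = "SELECT * FROM orders WHERE customer_name = '%s'" % customer_name
    cursor.execute(query)
    return cursor.fetchall()

def calculate_discount_expression(expression):
    # eval() runs whatever Python expression the caller supplies
    return eval(expression)

def export_orders_to_csv(filename):
    # filename is concatenated into a command that a shell interprets
    subprocess.call("cp orders.db /tmp/" + filename, shell=True)

def hash_password(password):
    # MD5 is fast and unsalted, so it is unsuitable for password storage
    return hashlib.md5(password.encode()).hexdigest()

def load_cached_cart(serialized_cart):
    # Unpickling attacker-controlled bytes can execute arbitrary code
    return pickle.loads(serialized_cart)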

Running Bandit against it

pip install bandit
bandit order_service.py

Real output, trimmed here to six of the ten findings:

Run metrics:
    Total issues (by severity):
        Low: 3
        Medium: 4
        High: 3

>> Issue: [B608:hardcoded_sql_expressions] Possible SQL injection vector through string-based query construction.
   Severity: Medium   Confidence: Low   CWE: CWE-89
   Location: ./order_service.py:31

>> Issue: [B307:blacklist] Use of possibly insecure function - consider using safer ast.literal_eval.
   Severity: Medium   Confidence: High   CWE: CWE-78
   Location: ./order_service.py:39

>> Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with shell=True identified, security issue.
   Severity: High   Confidence: High   CWE: CWE-78
   Location: ./order_service.py:44

>> Issue: [B324:hashlib] Use of weak MD5 hash for security. Consider usedforsecurity=False
   Severity: High   Confidence: High   CWE: CWE-327
   Location: ./order_service.py:49

>> Issue: [B301:blacklist] Pickle and modules that wrap it can be unsafe when used to deserialize untrusted data.
   Severity: Medium   Confidence: High   CWE: CWE-502
   Location: ./order_service.py:54

>> Issue: [B201:flask_debug_true] A Flask app appears to be run with debug=True, which exposes the Werkzeug
   debugger and allows the execution of arbitrary code.
   Severity: High   Confidence: Medium   CWE: CWE-94
   Location: ./order_service.py:71

Ten findings total — three Low, four Medium, three High. Note that Bandit doesn't just flag the call site; it attaches a CWE (Common Weakness Enumeration) ID to each one, which is what lets a finding map directly to a recognized vulnerability category instead of being just an opinion.
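
Because each finding carries a severity and a CWE ID, the results are easy to tally by machine. The sketch below assumes the report was produced with bandit -f json and that each entry under "results" exposes test_id, issue_severity, and issue_cwe.id; those key names are written from memory of Bandit's JSON formatter and should be checked against a real report. SAMPLE_REPORT is hand-written for illustration, reusing three findings from the output above, and is not actual tool output.

```python
import json
from collections import Counter

# Hand-written sample shaped like a Bandit JSON report (an assumption,
# not real tool output). Values mirror three findings shown above.
SAMPLE_REPORT = json.dumps({
    "results": [
        {"test_id": "B608", "issue_severity": "MEDIUM",
         "issue_cwe": {"id": 89}, "line_number": 31},
        {"test_id": "B602", "issue_severity": "HIGH",
         "issue_cwe": {"id": 78}, "line_number": 44},
        {"test_id": "B324", "issue_severity": "HIGH",
         "issue_cwe": {"id": 327}, "line_number": 49},
    ]
})


def summarize(report_text: str) -> tuple[Counter, Counter]:
    """Count findings per severity level and per CWE identifier."""
    results = json.loads(report_text)["results"]
    by_severity = Counter(r["issue_severity"] for r in results)
    by_cwe = Counter(f"CWE-{r['issue_cwe']['id']}" for r in results)
    return by_severity, by_cwe


if __name__ == "__main__":
    severity_counts, cwe_counts = summarize(SAMPLE_REPORT)
    print(dict(severity_counts))
    print(dict(cwe_counts))
```

Grouping by CWE rather than by Bandit's own test ID is what lets the same tally be compared with findings from other tools.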

Fixing each finding

| Finding | Fix |
| --- | --- |
| Hardcoded password & API key | Read from environment variables (os.environ.get(...)) |
| SQL injection via % formatting | Parameterized query: cursor.execute(query, (customer_name,)) |
| eval() on user input | ast.literal_eval(), which only evaluates literals, never arbitrary code |
| subprocess with shell=True | shutil.copy() with no shell, plus os.path.basename() to strip path traversal |
| MD5 for password hashing | bcrypt.hashpw() |
| pickle.loads() on untrusted data | json.loads() instead; JSON can't execute code during deserialization |
| Flask debug=True + bind to 0.0.0.0 | debug driven by an explicit environment flag, bind to 127.0.0.1 by default |

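# order_service_fixed.py (excerpt)
import ast

import bcrypt  # third-party package: pip install bcrypt

def find_order_by_customer(customer_name):
    conn = get_connection()  # helper defined elsewhere in the service
    cursor = conn.cursor()
    # The value is bound by the driver, separately from the SQL text
    query = "SELECT * FROM orders WHERE customer_name = ?"
    cursor.execute(query, (customer_name,))
    return cursor.fetchall()

def calculate_discount_expression(expression):
    # literal_eval accepts only Python literals, never function calls
    try:
        return ast.literal_eval(expression)
    except (ValueError, SyntaxError):
        raise ValueError("Invalid expression")

def hash_password(password):
    # gensalt() produces a fresh random salt for every hash
    return bcrypt.hashpw(password.encode(), bcrypt.gensalt())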
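
To see why the parameterized query matters, the following sketch runs both versions against an in-memory SQLite database. The table contents, the helper names (make_db, find_unsafe, find_safe), and the payload are assumptions made up for this demonstration; the real service's get_connection() and schema are not shown in the article.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database with two sample orders."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, customer_name TEXT)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)", [(1, "alice"), (2, "bob")]
    )
    return conn


def find_unsafe(conn, customer_name):
    # Vulnerable: the input becomes part of the SQL statement itself.
    query = "SELECT * FROM orders WHERE customer_name = '%s'" % customer_name
    return conn.execute(query).fetchall()


def find_safe(conn, customer_name):
    # Safe: the input is bound as a value and never parsed as SQL.
    query = "SELECT * FROM orders WHERE customer_name = ?"
    return conn.execute(query, (customer_name,)).fetchall()


# A classic injection payload that turns the WHERE clause into a tautology.
PAYLOAD = "x' OR '1'='1"

if __name__ == "__main__":
    db = make_db()
    print("unsafe:", find_unsafe(db, PAYLOAD))  # every row leaks
    print("safe:  ", find_safe(db, PAYLOAD))    # no customer has that name
```

With the string-formatted query the payload returns every order in the table. With the placeholder it is treated as an odd customer name and matches nothing.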

The re-scan caught something the manual fix missed

Running Bandit again after the fixes:

$ bandit order_service_fixed.py

>> Issue: [B108:hardcoded_tmp_directory] Probable insecure usage of temp file/directory.
   Severity: Medium   Confidence: Medium   CWE: CWE-377
   Location: ./order_service_fixed.py:47

The shell-injection fix had replaced subprocess with shutil.copy(), but it still wrote to a hardcoded /tmp path — a pattern that's vulnerable to race conditions on shared systems. This is the actual value of a SAST tool in a pipeline: it doesn't just catch the obvious first pass, it catches what a human fixing five things in a row reasonably overlooks on the sixth. Swapping in tempfile.gettempdir() resolved it:

$ bandit order_service_fixed.py

Test results:
    No issues identified.

Run metrics:
    Total issues (by severity):
        Low: 0
        Medium: 0
        High: 0
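
The article does not show the final export function, so the following is only a sketch of what it could look like, combining the fixes named above: shutil.copy() instead of a shell, os.path.basename() to drop directory components, and tempfile.gettempdir() instead of a literal /tmp. The source parameter and the rejection of "." and ".." are assumptions added for this illustration.

```python
import os
import shutil
import tempfile


def export_orders_to_csv(filename: str, source: str = "orders.db") -> str:
    """Copy the orders file into the system temp directory and return its path."""
    # basename() keeps only the final path component, so "../../x.csv"
    # becomes "x.csv" and cannot escape the target directory.
    safe_name = os.path.basename(filename)
    # basename("..") is still "..", so reject the special names explicitly.
    if safe_name in ("", ".", ".."):
        raise ValueError("filename must name a regular file")
    destination = os.path.join(tempfile.gettempdir(), safe_name)
    # shutil.copy() performs the copy in-process; no shell is involved.
    shutil.copy(source, destination)
    return destination
```

One caveat on the design: the directory returned by tempfile.gettempdir() is usually still shared between users. Where the race condition is a real concern, tempfile.mkdtemp() creates a fresh directory that only the creating user can access, which may be the more robust choice.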

Wiring it into CI

The same command that ran locally becomes a merge gate:

# .github/workflows/sast.yml
name: SAST scan (Bandit)

on:
  push:
    branches: ["main"]
  pull_request:

jobs:
  bandit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - run: pip install bandit
      - run: bandit -r . -x ./tests -ll

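The -ll flag limits the report to findings of Medium severity and above. Because Bandit exits with a non-zero status whenever it reports a finding, the build fails only on Medium and High issues. That is a reasonable starting threshold: failing on every Low finding tends to train a team to ignore the tool's output entirely.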

What a SAST tool is and isn't good for

Bandit found real, valid issues here — but it's worth being precise about its limits. It works on syntax and known-dangerous-call patterns, so it won't catch a logic flaw like an authorization check that's present but wrong, and it can produce false positives (notice the SQL-injection finding above was flagged at "Confidence: Low" — the tool is also telling you how sure it is). SAST is one layer: it belongs alongside code review, dependency scanning, and — for anything handling real user data — a second pair of human eyes on anything it flags as High severity.
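
As an illustration of the kind of logic flaw meant here, consider the sketch below. The User and Order classes and both functions are hypothetical and do not come from the order service. The flawed version contains an authorization check, but it tests the wrong condition, and nothing in it resembles the call patterns discussed in this article, so a pattern-based scanner has nothing to match.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    is_admin: bool = False


@dataclass
class Order:
    id: int
    owner_id: int


def can_view_order_flawed(user, order):
    # A check is present, but it only verifies that someone is logged in.
    # Any authenticated user can read any other customer's order.
    return user is not None


def can_view_order(user, order):
    # Intended rule: only the order's owner or an admin may view it.
    return user is not None and (user.is_admin or user.id == order.owner_id)


if __name__ == "__main__":
    alice_order = Order(id=10, owner_id=1)
    mallory = User(id=2)
    print(can_view_order_flawed(mallory, alice_order))  # access granted
    print(can_view_order(mallory, alice_order))         # access denied
```

Both functions are syntactically unremarkable. Telling them apart requires knowing what the application is supposed to allow, which is the reviewer's job rather than the scanner's.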

Conclusion

Every one of the 10 findings here came from code that runs and would pass a casual review, which is exactly the kind of mistake static analysis exists to catch. The more interesting result wasn't the first scan, it was the second one: fixing the flagged vulnerabilities by hand still left one more, smaller issue in the remediation itself. That's the actual argument for running a SAST tool in CI rather than as a one-time audit: it checks the fix the same way it checked the original bug.
