DAYAN ELVIS JAHUIRA PILCO

Posted on Jul 4

Finding Security Bugs Before They Ship: Applying Bandit (SAST) to a Python Web App

#security #python #devsecops #sast

TL;DR: I took a small Flask application with 7 classic security flaws, ran Bandit (a SAST tool from the OWASP Source Code Analysis Tools list) against it, fixed everything it found, and automated the scan with GitHub Actions so new findings fail the build before they reach main. Full code: github.com/Dayan-18/bandit-demo

What is SAST and why should you care?

Static Application Security Testing (SAST) tools analyze your source code without executing it, looking for patterns that are known to cause vulnerabilities: SQL injection, command injection, weak cryptography, hardcoded secrets, and more.

The main advantage over finding bugs in production? Cost and timing. A SAST scan runs in seconds, inside your IDE or CI pipeline, before the code ever ships. According to OWASP, SAST scales well precisely because it can run repeatedly on every build.

For this experiment I chose Bandit: an open-source SAST tool designed specifically for Python, originally developed within the OpenStack project and now maintained by PyCQA. It builds an AST (Abstract Syntax Tree) from each file and runs security plugins against it.
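
To make the AST idea concrete, here is a minimal sketch of what a single check could look like. It is not Bandit's actual plugin code; the function name find_shell_true and the sample source are my own illustration, built only on Python's standard ast module.

```python
import ast

# Sample source to analyze. It is only parsed, never executed.
SOURCE = '''
import subprocess
subprocess.check_output("ping -c 1 " + host, shell=True)
subprocess.check_output(["/usr/bin/ping", "-c", "1", host])
'''


def find_shell_true(source: str) -> list[int]:
    """Return line numbers of calls that pass the literal keyword shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


print(find_shell_true(SOURCE))  # [3]
```

Only the first call is flagged. The check looks at syntax alone, which is why it is fast and also why it cannot tell whether host is attacker-controlled.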

The target: a deliberately vulnerable Flask app

I wrote a small Flask application that concentrates 7 mistakes you can still find in real-world codebases:

import hashlib
import pickle
import random
import sqlite3
import subprocess

from flask import Flask, request

app = Flask(__name__)

# Issue 1: hardcoded credentials
DB_PASSWORD = "SuperSecret123!"


@app.route("/user")
def get_user():
    # Issue 2: SQL injection (query built with string formatting)
    username = request.args.get("username")
    conn = sqlite3.connect("users.db")
    query = "SELECT * FROM users WHERE name = '%s'" % username
    return str(conn.execute(query).fetchall())


@app.route("/ping")
def ping():
    # Issue 3: command injection (shell=True with user input)
    host = request.args.get("host")
    result = subprocess.check_output("ping -c 1 " + host, shell=True)
    return result


@app.route("/login", methods=["POST"])
def login():
    # Issue 4: weak hashing algorithm (MD5)
    password = request.form.get("password", "")
    return hashlib.md5(password.encode()).hexdigest()


@app.route("/token")
def token():
    # Issue 5: insecure randomness for security tokens
    return str(random.randint(100000, 999999))


@app.route("/load", methods=["POST"])
def load():
    # Issue 6: insecure deserialization
    return str(pickle.loads(request.data))


if __name__ == "__main__":
    # Issue 7: debug mode enabled in production
    app.run(host="0.0.0.0", debug=True)

Running Bandit

Installation and scan are two commands:

pip install bandit
bandit app.py

In under a second, Bandit reported 10 issues in 35 lines of code:

Run metrics:
    Total issues (by severity):
        Low: 4
        Medium: 3
        High: 3

Here are the most interesting findings:

🔴 High: command injection (B602, CWE-78)

>> Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with
   shell=True identified, security issue.
   Severity: High   Confidence: High
   Location: ./app.py:32

With shell=True and unsanitized user input, requesting /ping?host=;cat /etc/passwd executes arbitrary commands on the server.

🔴 High: weak hashing (B324, CWE-327)

>> Issue: [B324:hashlib] Use of weak MD5 hash for security.
   Severity: High   Confidence: High
   Location: ./app.py:40

MD5 has been broken for decades — modern GPUs compute billions of MD5 hashes per second, so leaked password hashes fall quickly.
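
For comparison, a salted and deliberately slow hash can be sketched with the standard library alone. The helper names hash_password and verify_password are hypothetical; the iteration count is the 600_000 used in the fix table of this post.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # work factor taken from the fix table in this post


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for a new password."""
    salt = os.urandom(16)  # a fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(candidate, expected)
```

The salt and the digest are stored together; the salt is not secret, it only ensures that two users with the same password get different hashes.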

🔴 High: Flask debug mode (B201, CWE-94)

>> Issue: [B201:flask_debug_true] A Flask app appears to be run with
   debug=True, which exposes the Werkzeug debugger and allows the
   execution of arbitrary code.
   Severity: High   Confidence: Medium

The Werkzeug debugger includes an interactive Python console. Exposing it publicly is handing out remote code execution.
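
One way to keep debug mode and secrets out of the source is to read them from the environment with safe defaults. This is a sketch under assumptions: load_settings, APP_DEBUG and APP_HOST are hypothetical names I picked for illustration, and DB_PASSWORD is the variable name from the fix table in this post.

```python
import os
from collections.abc import Mapping


def load_settings(env: Mapping[str, str] = os.environ) -> dict:
    """Build runtime settings from environment variables, failing fast on gaps."""
    password = env.get("DB_PASSWORD")
    if not password:
        raise RuntimeError("DB_PASSWORD is not set")
    return {
        "db_password": password,
        # Debug stays off unless someone explicitly opts in (hypothetical variable)
        "debug": env.get("APP_DEBUG", "0") == "1",
        # Loopback by default; exposing the app is a deliberate choice
        "host": env.get("APP_HOST", "127.0.0.1"),
    }
```

The resulting values can then be passed to app.run(host=..., debug=...), so the safe configuration is what you get when nothing is set.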

🟡 Medium: SQL injection (B608, CWE-89)

Bandit flags the string-formatted query. Note the Low confidence here — pattern-based detection can't always know if the interpolated value is user-controlled. That's a good reminder that SAST findings need human triage.
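
To see why the pattern matters when the value is user-controlled, here is a small self-contained sketch using an in-memory SQLite database. The table contents and the payload are made up for the demonstration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "x' OR '1'='1"

# Vulnerable: the payload becomes part of the SQL text and rewrites the WHERE clause.
unsafe_query = "SELECT * FROM users WHERE name = '%s'" % payload
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Safe: the driver binds the payload as a plain value, so it matches no user.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (payload,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))  # 2 0
```

The formatted query returns every row in the table, while the parameterized one returns none.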

Other findings included hardcoded credentials (B105), insecure pickle deserialization (B301, Medium), non-cryptographic random for tokens (B311), and binding to 0.0.0.0 (B104).
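
Two of these have short standard-library replacements. The sketch below uses hypothetical helper names (new_token, load_payload) and assumes the /load endpoint only needs plain JSON objects.

```python
import json
import secrets


def new_token() -> str:
    """Return a token built from 16 bytes of OS-provided randomness."""
    return secrets.token_urlsafe(16)


def load_payload(raw: bytes) -> dict:
    """Parse untrusted input as JSON and accept only a top-level object."""
    # json.loads builds plain data types only; unlike pickle.loads,
    # it does not import or call anything named in the input.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    return data
```

Unlike random.randint(100000, 999999), which has under a million possible outputs and a predictable generator behind it, the token carries 128 bits of randomness from the operating system.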

Fixing the findings

Every fix is small and idiomatic:

| Vulnerability | Fix |
| --- | --- |
| Hardcoded password | `os.environ.get("DB_PASSWORD")` |
| SQL injection | Parameterized query: `execute(query, (username,))` |
| Command injection | Argument list, no `shell=True`, absolute binary path |
| MD5 password hash | `hashlib.pbkdf2_hmac("sha256", ..., 600_000)` |
| Insecure random | `secrets.token_urlsafe(16)` |
| Pickle deserialization | `json.loads()` for untrusted data |
| Debug mode + 0.0.0.0 | `debug=False`, bind to `127.0.0.1` |

For example, the ping endpoint becomes:

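@app.route("/ping")
def ping():
    host = request.args.get("host", "")
    # With an argument list no shell parses the value, so shlex.quote() is not
    # needed (it would only add literal quotes). Reject empty or option-like input.
    if not host or host.startswith("-"):
        return "invalid host", 400
    result = subprocess.check_output(["/usr/bin/ping", "-c", "1", "--", host])  # nosec B603
    return result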

The # nosec B603 comment tells Bandit this specific call was reviewed and accepted — an important feature for managing false positives without silencing the whole scanner.
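
An argument list keeps the shell out of the picture, but the value still reaches ping as an argument, so it is worth validating before the call. Here is a minimal sketch of an allow-list check; is_valid_host and the hostname pattern are my own assumptions, not part of the demo repo.

```python
import ipaddress
import re

# Hypothetical allow-list: dot-separated labels of letters, digits and hyphens,
# each label starting and ending with a letter or digit, 253 characters at most.
_HOSTNAME_RE = re.compile(
    r"(?=.{1,253}$)[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?"
    r"(\.[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?)*"
)


def is_valid_host(value: str) -> bool:
    """Accept IPv4/IPv6 literals and hostname-shaped strings, nothing else."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return _HOSTNAME_RE.fullmatch(value) is not None
```

With a check like this, ;cat /etc/passwd and option-like values such as -f are rejected before subprocess is ever called.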

Re-running the scan on the fixed version:

Run metrics:
    Total issues (by severity):
        Low: 0
        Medium: 0
        High: 0

From 10 findings to 0. ✅

Automating with GitHub Actions

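A scan you run once is a snapshot; a scan that runs on every push and pull request to main is a security control. This workflow fails the build whenever the fixed app (app_fixed.py) has findings; the deliberately vulnerable app.py is scanned for reference only: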

name: Bandit SAST Scan

on:
  push:
    branches: [main]
  pull_request:
    branches: [main]

jobs:
  bandit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install Bandit
        run: pip install bandit
      - name: Scan vulnerable app (findings expected, does not block)
        run: bandit app.py || true
      - name: Scan fixed app (security gate)
        run: bandit app_fixed.py
      - name: Generate JSON report
        if: always()
        run: bandit -r . -f json -o bandit-report.json --exclude ./.github || true
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: bandit-report
          path: bandit-report.json

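Because Bandit exits with a non-zero code when it finds issues, the security-gate step fails the job, and with a branch protection rule that requires this check the merge is blocked. The other two scans end in || true, so they report findings without failing the job. The JSON report is archived as an artifact for later review.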
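
The archived report can also drive a custom gate, for example one that tolerates Low findings but fails on Medium and above. This is a sketch that assumes the report has a top-level results list whose entries carry issue_severity and test_id; the function names and the sample data are hypothetical and hand-written, not real Bandit output.

```python
from collections import Counter

SEVERITY_ORDER = {"LOW": 1, "MEDIUM": 2, "HIGH": 3}


def count_by_severity(report: dict) -> Counter:
    """Count findings per severity level."""
    return Counter(r["issue_severity"] for r in report.get("results", []))


def gate_passes(report: dict, fail_at: str = "MEDIUM") -> bool:
    """Return True when no finding reaches the fail_at severity."""
    threshold = SEVERITY_ORDER[fail_at]
    return all(
        SEVERITY_ORDER[r["issue_severity"]] < threshold
        for r in report.get("results", [])
    )


# Hand-written sample in the assumed report shape
sample = {
    "results": [
        {"test_id": "B311", "issue_severity": "LOW"},
        {"test_id": "B602", "issue_severity": "HIGH"},
    ]
}

print(count_by_severity(sample))
print(gate_passes(sample))
```

In CI, the dictionary would come from json.load() on bandit-report.json, and a False result would translate into a non-zero exit code.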

Strengths and limitations

What I liked about Bandit: zero-config startup, fast scans that fit any CI budget, findings mapped to CWE with links to documentation, and granular suppression with # nosec.

What it won't do for you, and OWASP is explicit about this for SAST in general: authentication and access-control flaws are hard to find automatically, the scan doesn't track data flow across services, and some findings (like B608) are educated guesses that require human review. SAST is one layer; combine it with dependency scanning (SCA) and dynamic testing (DAST) for real coverage.

Conclusion

In less than an hour, a free tool found 10 real security issues, taught me the CWE behind each one, and became a permanent quality gate in my pipeline. If you write Python and your CI doesn't run a SAST tool yet, pip install bandit is one of the highest-return security investments you can make today.

Full demo code + workflow: github.com/Dayan-18/bandit-demo

What SAST tool does your team use? Let me know in the comments! 👇

References: OWASP Source Code Analysis Tools · Bandit documentation · CWE Top 25

Top comments (2)

Cristhian Carlos MAMANI CORI • Jul 5

Great article, Dayan! Here is my abstract and technical observation on this work: This piece shows the full SAST lifecycle applied to a Flask application — seven intentional vulnerabilities, a Bandit scan that surfaces 10 findings (three of them high severity, like command injection via shell=True and weak MD5 hashing), a complete remediation that brings the count to zero, and a GitHub Actions gate so insecure code can never reach main again. Important Observation: what I appreciate most is your point about the B608 SQL injection finding having low confidence — this is the reality of pattern-based analysis. In a production team I would complement Bandit with taint-tracking tools and mandatory code review, because SAST can tell you where to look but not always whether the data flow is actually exploitable. The nosec-with-justification approach you used is exactly how false positives should be managed. Excellent starting point for anyone adopting DevSecOps.

Alex Shev • Jul 9

Bandit is useful when teams treat it as a habit, not a one-time report. The important part is tuning the findings into a small set of checks that developers trust, then wiring those into CI before insecure patterns become copy-pasted project style.