TL;DR: I took a small Flask application with 7 classic security flaws, ran Bandit (a SAST tool from the OWASP Source Code Analysis Tools list) against it, fixed everything it found, and automated the scan with GitHub Actions so no insecure code reaches main again. Full code: github.com/Dayan-18/bandit-demo
What is SAST and why should you care?
Static Application Security Testing (SAST) tools analyze your source code without executing it, looking for patterns that are known to cause vulnerabilities: SQL injection, command injection, weak cryptography, hardcoded secrets, and more.
The main advantage over finding bugs in production? Cost and timing. A SAST scan runs in seconds, inside your IDE or CI pipeline, before the code ever ships. According to OWASP, SAST scales well precisely because it can run repeatedly on every build.
For this experiment I chose Bandit: an open-source SAST tool designed specifically for Python, originally developed within the OpenStack project and now maintained by PyCQA. It builds an AST (Abstract Syntax Tree) from each file and runs security plugins against it.
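To make the "AST plus plugins" idea concrete, here is a minimal sketch of a single check written with Python's standard `ast` module. It is not Bandit's implementation; the function name `find_shell_true` and the sample `SOURCE` string are made up for illustration. It only shows how a rule can inspect call nodes without ever running the code.

```python
import ast

# Sample input for the checker; it is parsed, never executed,
# so the undefined name `host` is not a problem.
SOURCE = '''
import subprocess
subprocess.check_output("ping -c 1 " + host, shell=True)
subprocess.check_output(["/usr/bin/ping", "-c", "1", host])
'''


def find_shell_true(source: str) -> list[int]:
    """Return the line numbers of calls that pass the keyword shell=True."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        for kw in node.keywords:
            if (
                kw.arg == "shell"
                and isinstance(kw.value, ast.Constant)
                and kw.value.value is True
            ):
                hits.append(node.lineno)
    return sorted(hits)


print(find_shell_true(SOURCE))
```

Only the first call is reported, because the second one passes an argument list and no `shell` keyword. A real plugin would also look at which function is being called and how confident it can be, which is where Bandit's severity and confidence ratings come from.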
The target: a deliberately vulnerable Flask app
I wrote a small Flask application that concentrates 7 mistakes you can still find in real-world codebases:
import hashlib
import pickle
import random
import sqlite3
import subprocess

from flask import Flask, request

app = Flask(__name__)

# Issue 1: hardcoded credentials
DB_PASSWORD = "SuperSecret123!"


@app.route("/user")
def get_user():
    # Issue 2: SQL injection (string concatenation)
    username = request.args.get("username")
    conn = sqlite3.connect("users.db")
    query = "SELECT * FROM users WHERE name = '%s'" % username
    return str(conn.execute(query).fetchall())


@app.route("/ping")
def ping():
    # Issue 3: command injection (shell=True with user input)
    host = request.args.get("host")
    result = subprocess.check_output("ping -c 1 " + host, shell=True)
    return result


@app.route("/login", methods=["POST"])
def login():
    # Issue 4: weak hashing algorithm (MD5)
    password = request.form.get("password", "")
    return hashlib.md5(password.encode()).hexdigest()


@app.route("/token")
def token():
    # Issue 5: insecure randomness for security tokens
    return str(random.randint(100000, 999999))


@app.route("/load", methods=["POST"])
def load():
    # Issue 6: insecure deserialization
    return str(pickle.loads(request.data))


if __name__ == "__main__":
    # Issue 7: debug mode enabled in production
    app.run(host="0.0.0.0", debug=True)
Running Bandit
Installation and scan are two commands:
pip install bandit
bandit app.py
In under a second, Bandit reported 10 issues in 35 lines of code:
Run metrics:
Total issues (by severity):
Low: 4
Medium: 3
High: 3
Here are the most interesting findings:
🔴 High: command injection (B602, CWE-78)
>> Issue: [B602:subprocess_popen_with_shell_equals_true] subprocess call with
shell=True identified, security issue.
Severity: High Confidence: High
Location: ./app.py:32
With shell=True and unsanitized user input, requesting /ping?host=;cat /etc/passwd executes arbitrary commands on the server.
🔴 High: weak hashing (B324, CWE-327)
>> Issue: [B324:hashlib] Use of weak MD5 hash for security.
Severity: High Confidence: High
Location: ./app.py:40
MD5's collision resistance has been broken since 2004, and it is far too fast for password storage: modern GPUs compute billions of MD5 hashes per second, so leaked password hashes fall quickly.
🔴 High: Flask debug mode (B201, CWE-94)
>> Issue: [B201:flask_debug_true] A Flask app appears to be run with
debug=True, which exposes the Werkzeug debugger and allows the
execution of arbitrary code.
Severity: High Confidence: Medium
The Werkzeug debugger includes an interactive Python console. Exposing it publicly is handing out remote code execution.
🟡 Medium: SQL injection (B608, CWE-89)
Bandit flags the string-formatted query. Note the Low confidence here: pattern-based detection can't always know whether the interpolated value is user-controlled. That's a good reminder that SAST findings need human triage.
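To see why the pattern deserves attention even at Low confidence, here is a small sketch that runs the same query shape both ways. The `users` table and `name` column come from the app; the in-memory database, the sample rows, and the payload are assumptions added for the demonstration.

```python
import sqlite3

# Throwaway in-memory database with two illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)", [("alice",), ("bob",)])

payload = "nobody' OR '1'='1"

# Vulnerable: the payload becomes part of the SQL text and rewrites the WHERE clause.
unsafe_query = "SELECT * FROM users WHERE name = '%s'" % payload
unsafe_rows = conn.execute(unsafe_query).fetchall()

# Parameterized: the driver binds the payload as a plain string value.
safe_rows = conn.execute(
    "SELECT * FROM users WHERE name = ?", (payload,)
).fetchall()

print(len(unsafe_rows), len(safe_rows))
```

The concatenated query returns every row, while the parameterized one returns none because no user is literally named `nobody' OR '1'='1`.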
Other findings included hardcoded credentials (B105), insecure pickle deserialization (B301, Medium), non-cryptographic random for tokens (B311), and binding to 0.0.0.0 (B104).
Fixing the findings
Every fix is small and idiomatic:
| Vulnerability | Fix |
|---|---|
| Hardcoded password | os.environ.get("DB_PASSWORD") |
| SQL injection | Parameterized query: execute(query, (username,)) |
| Command injection | Argument list, no shell=True, absolute binary path |
| MD5 password hash | hashlib.pbkdf2_hmac("sha256", ..., 600_000) |
| Insecure random | secrets.token_urlsafe(16) |
| Pickle deserialization | json.loads() for untrusted data |
| Debug mode + 0.0.0.0 | debug=False, bind to 127.0.0.1 |
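The hashing and token rows can be sketched as follows. This is one possible shape, not necessarily the code in the repo: the helper names `hash_password`, `verify_password`, and `new_token`, the 16-byte salt, and the constant-time comparison are assumptions layered on top of the `pbkdf2_hmac` and `secrets.token_urlsafe` calls named in the table.

```python
import hashlib
import hmac
import os
import secrets
from typing import Optional

ITERATIONS = 600_000  # iteration count taken from the table


def hash_password(password: str, salt: Optional[bytes] = None) -> tuple[bytes, bytes]:
    """Derive a PBKDF2-HMAC-SHA256 digest; returns (salt, digest)."""
    if salt is None:
        salt = os.urandom(16)  # assumed salt size: fresh random salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest with the stored salt and compare in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, expected)


def new_token() -> str:
    """URL-safe token from the OS CSPRNG, replacing random.randint."""
    return secrets.token_urlsafe(16)
```

Storing the salt next to the digest is what lets `verify_password` recompute the same value later; the high iteration count is what makes each guess expensive for an attacker holding a leaked database.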
For example, the ping endpoint becomes:
@app.route("/ping")
def ping():
    host = shlex.quote(request.args.get("host", ""))
    result = subprocess.check_output(["/usr/bin/ping", "-c", "1", host])  # nosec B603
    return result
The # nosec B603 comment tells Bandit this specific call was reviewed and accepted, an important feature for managing false positives without silencing the whole scanner.
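One caveat worth knowing: `shlex.quote` is meant for building shell command strings. With an argument list there is no shell to strip the quotes it adds, so an input like `a b` reaches `ping` as the literal `'a b'`. A stricter variant is to validate the host and pass it through unchanged. The sketch below is an alternative, not the code from the repo; `is_valid_host`, the hostname pattern, and the 5-second timeout are assumptions.

```python
import ipaddress
import re
import subprocess

# Hypothetical allow-list: starts and ends with a letter or digit,
# with only letters, digits, dots and hyphens in between.
HOSTNAME_RE = re.compile(r"[A-Za-z0-9]([A-Za-z0-9.-]{0,251}[A-Za-z0-9])?")


def is_valid_host(host: str) -> bool:
    """Accept IP address literals or conservative hostnames."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return HOSTNAME_RE.fullmatch(host) is not None


def ping(host: str) -> bytes:
    """Ping a validated host once, without involving a shell."""
    if not is_valid_host(host):
        raise ValueError("invalid host")
    # The host is passed as a single argv element, exactly as received.
    return subprocess.check_output(
        ["/usr/bin/ping", "-c", "1", host], timeout=5
    )  # nosec B603
```

Rejecting values that start with `-` matters here: even without a shell, such a value could be read by `ping` as an option rather than a host.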
Re-running the scan on the fixed version:
Run metrics:
Total issues (by severity):
Low: 0
Medium: 0
High: 0
From 10 findings to 0.
Automating with GitHub Actions
A scan you run once is a snapshot; a scan that runs on every push is a security control. This workflow blocks any PR that introduces new findings:
name: Bandit SAST Scan
on:
  push:
    branches: [main]
  pull_request:
    branches: [main]
jobs:
  bandit:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.12"
      - name: Install Bandit
        run: pip install bandit
      - name: Scan vulnerable app (findings expected, does not block)
        run: bandit app.py || true
      - name: Scan fixed app (security gate)
        run: bandit app_fixed.py
      - name: Upload JSON report
        if: always()
        run: bandit -r . -f json -o bandit-report.json --exclude ./.github || true
      - uses: actions/upload-artifact@v4
        if: always()
        with:
          name: bandit-report
          path: bandit-report.json
Because Bandit exits with a non-zero code when it finds issues, the security-gate step fails the job; once that check is marked as required in the branch protection rules for main, the merge is blocked. The JSON report is archived as an artifact for later review.
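The archived report can also be post-processed, for example to fail only on Medium and High findings. The sketch below assumes the report has a top-level `results` list whose entries carry `test_id` and `issue_severity` fields; check that against a report from your own Bandit version. `SAMPLE_REPORT` is a hand-written stand-in, not real scan output, and both helper names are hypothetical.

```python
import json
from collections import Counter

# Hand-written stand-in for bandit-report.json (field names are assumptions).
SAMPLE_REPORT = json.dumps({
    "results": [
        {"test_id": "B602", "issue_severity": "HIGH", "filename": "app.py"},
        {"test_id": "B608", "issue_severity": "MEDIUM", "filename": "app.py"},
        {"test_id": "B105", "issue_severity": "LOW", "filename": "app.py"},
    ]
})


def count_by_severity(report_text: str) -> Counter:
    """Count findings per severity level in a JSON report string."""
    report = json.loads(report_text)
    return Counter(item["issue_severity"] for item in report.get("results", []))


def should_fail(counts: Counter, blocking=("HIGH", "MEDIUM")) -> bool:
    """Return True when any blocking severity has at least one finding."""
    return any(counts[level] > 0 for level in blocking)


counts = count_by_severity(SAMPLE_REPORT)
print(dict(counts), should_fail(counts))
```

In a workflow step, the same functions could read `bandit-report.json` from disk and call `sys.exit(1)` when `should_fail` returns True.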
Strengths and limitations
What I liked about Bandit: zero-config startup, fast scans that fit any CI budget, findings mapped to CWE with links to documentation, and granular suppression with # nosec.
What it won't do for you (and OWASP is explicit about this for SAST in general): it can't detect authentication or access-control flaws, it doesn't track data flow across services, and some findings (like B608) are educated guesses that require human review. SAST is one layer; combine it with dependency scanning (SCA) and dynamic testing (DAST) for real coverage.
Conclusion
In less than an hour, a free tool found 10 real security issues, taught me the CWE behind each one, and became a permanent quality gate in my pipeline. If you write Python and your CI doesn't run a SAST tool yet, pip install bandit is one of the highest-return security investments you can make today.
Full demo code + workflow: github.com/Dayan-18/bandit-demo
What SAST tool does your team use? Let me know in the comments!
References: OWASP Source Code Analysis Tools · Bandit documentation · CWE Top 25