Hello all
Tired of scanners that dump a flat, unordered list of findings, I built a Python CLI that audits Flask/Nginx/Docker/Linux stacks and adds two things I couldn't find in other tools:
- Cost-aware prioritisation
A failing check that appears higher in the output should carry higher priority, so each failing check is scored with:
priority = (severity_score × impact_weight) ÷ effort_score
A HIGH finding that only needs one config flag therefore scores above a HIGH finding that requires an architectural change. The result is a remediation plan ordered into Day 1 / Day 7 / Day 30 tasks.
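The prioritisation idea above can be sketched in a few lines. Everything here is illustrative — `Check`, `SEVERITY`, and the example check IDs are hypothetical stand-ins, not the tool's real API:

```python
# Hedged sketch of the priority formula: priority = (severity * impact) / effort.
# Names and weights are assumptions for illustration only.
from dataclasses import dataclass

SEVERITY = {"LOW": 1, "MEDIUM": 2, "HIGH": 3, "CRITICAL": 4}

@dataclass
class Check:
    check_id: str
    severity: str          # LOW / MEDIUM / HIGH / CRITICAL
    impact_weight: float   # how much of the stack the finding touches
    effort_score: float    # 1 = flip a config flag, 5 = architectural change

def priority(check: Check) -> float:
    return SEVERITY[check.severity] * check.impact_weight / check.effort_score

checks = [
    Check("APP-COOKIE-001", "HIGH", 1.0, 1.0),  # one config flag
    Check("APP-ARCH-004", "HIGH", 1.0, 5.0),    # architectural change
]
ranked = sorted(checks, key=priority, reverse=True)
```

With these weights the one-flag HIGH finding (priority 3.0) outranks the architectural HIGH finding (priority 0.6), which is exactly the ordering described above.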
- What-if simulation
You don't even have to make changes to see the effect of fixing an arbitrary subset of checks — you can simulate it:
python audit.py --simulate HOST-FW-001,APP-COOKIE-001
It updates your grade and attack-path count as if those checks had passed. Simulating the full roadmap might look like:
| Phase | Grade | Score | Attack Paths |
|---|---|---|---|
| Current | F | 40.9% | 1 |
| Day 1 | D | 63.6% | 1 |
| Day 7 | C | 78.2% | 0 |
| Day 30 | A | 95.0% | 0 |
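The core of the what-if mechanic is simple: re-grade the result set with the selected checks flipped to passing. This is a minimal sketch of that idea, assuming a plain pass/fail map and an equal-weight score — the real tool's scoring (and its grade thresholds) may differ:

```python
# Hypothetical sketch of --simulate: mark the given check IDs as passed,
# then recompute the score. Equal weighting is an assumption.
def simulate(results: dict[str, bool], fixed_ids: list[str]) -> float:
    patched = {cid: (passed or cid in fixed_ids) for cid, passed in results.items()}
    return 100.0 * sum(patched.values()) / len(patched)

def grade(score: float) -> str:
    # Illustrative letter-grade thresholds, not the tool's actual scale.
    for threshold, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= threshold:
            return letter
    return "F"

results = {"HOST-FW-001": False, "APP-COOKIE-001": False,
           "NGX-TLS-002": False, "DKR-ROOT-003": True}
score = simulate(results, ["HOST-FW-001", "APP-COOKIE-001"])
```

Here fixing two of the three failing checks lifts the score to 75.0% — a C on this illustrative scale — without touching the host at all.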
There is also a --profile option (student / devops / cto) that adapts the OWASP narrative to different levels of understanding without altering the list of findings.
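One way to picture the profile mechanic: the finding set is fixed, and only the narrative template per audience changes. The texts and the `render` helper below are made up for illustration — the actual templates live in the tool:

```python
# Illustrative profile-to-narrative mapping; the same finding ID always
# appears, only the wording changes per audience.
NARRATIVES = {
    "student": "The session cookie lacks the Secure flag, so it can leak over plain HTTP (OWASP A05).",
    "devops":  "Set SESSION_COOKIE_SECURE=True in the Flask config (OWASP A05).",
    "cto":     "Session hijacking risk from insecure cookies; one-line fix, low effort (OWASP A05).",
}

def render(finding_id: str, profile: str) -> str:
    return f"[{finding_id}] {NARRATIVES[profile]}"
```

Because the mapping only touches the message, every profile reports the identical finding IDs, which keeps the audit results comparable across audiences.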
24+ checks, OWASP Top 10:2025 mapped, ReportLab PDF generation.
Full write-up (no paywall): https://vickkykruzprogramming.dev/blog/manual-web-app-security-checks-don-t-scale-inside-our-automated-assessment-remediation-framework
GitHub: https://github.com/vickkykruz/sec_audit_framework
Happy to explain the scoring model or the attack-path detection logic.
