Alex Spinov

Posted on Mar 25 • Edited on Mar 26

I Built a Free API Vulnerability Scanner — It Found 23 Issues in My Own Code

#python #security #api #webdev

Last month, I ran a security audit on a side project I'd been working on for 6 months. I thought the code was clean — proper auth headers, HTTPS everywhere, input validation on forms.

Then I pointed my scanner at it and found 23 vulnerabilities. Not theoretical ones. Real, exploitable issues.

Here's what happened, what the scanner found, and how you can build one yourself.

The Wake-Up Call

I was building an API aggregation service — something that pulls data from multiple free APIs and normalizes it. I had:

API key rotation ✅
Rate limiting ✅
HTTPS only ✅
Input sanitization ✅

So what could go wrong? A lot.

The 5 Categories of API Vulnerabilities

After researching OWASP API Security Top 10 (2023 edition), I built a scanner that checks for these categories:

1. Broken Object Level Authorization (BOLA)

BOLA sits at the top of the OWASP API Security Top 10. Your endpoint /api/users/123/data works for user 123, but does it also work when user 456 requests it?

import re
import requests

def check_bola(base_url, endpoints, auth_token):
    """Test if changing object IDs exposes other users' data"""
    vulnerabilities = []

    for endpoint in endpoints:
        # Try accessing with incremented/decremented IDs
        for id_offset in [-1, 1, 100, -100]:
            modified_path = modify_id_in_url(endpoint, id_offset)
            if modified_path == endpoint:
                continue  # No numeric ID in this path, nothing to test
            modified_url = f'{base_url}{modified_path}'

            resp = requests.get(
                modified_url,
                headers={'Authorization': f'Bearer {auth_token}'},
                timeout=10
            )

            # A 200 for an ID the token does not own suggests a missing ownership check
            if resp.status_code == 200:
                vulnerabilities.append({
                    'type': 'BOLA',
                    'severity': 'HIGH',
                    'endpoint': modified_url,
                    'detail': f'Accessible with modified ID (offset {id_offset})'
                })

    return vulnerabilities

def modify_id_in_url(url, offset):
    """Find the first numeric ID in the path and shift it by offset"""
    def replace_id(match):
        return str(int(match.group()) + offset)
    return re.sub(r'\b\d{1,10}\b', replace_id, url, count=1)

My scanner found: 4 endpoints where changing the user ID returned other users' data. I had authentication but no authorization check.
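
The gap between authentication and authorization can be shown in a few lines. This is a minimal Python sketch, not the project's actual code: the Resource class, the owner_id field, and the get_user_data handler are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical stored object; owner_id is an assumed field name."""
    resource_id: int
    owner_id: int


def can_access(requesting_user_id: int, resource: Resource) -> bool:
    """Authorization: the caller must own the resource, not merely be logged in."""
    return requesting_user_id == resource.owner_id


def get_user_data(requesting_user_id: int, resource: Resource):
    """Return (status_code, body) the way a request handler might."""
    if not can_access(requesting_user_id, resource):
        # Answering 404 rather than 403 avoids confirming that the ID exists
        return 404, {'error': 'not found'}
    return 200, {'resource_id': resource.resource_id}
```

With a check like this in place, the ID-shifting probe above would see a 404 instead of a 200 for every record the token does not own.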

2. Broken Authentication

Not just "is there a token?" — but "is the token properly validated?"

from base64 import b64encode

def check_auth_issues(base_url, endpoints):
    """Test various authentication bypass techniques"""
    issues = []

    bypass_tokens = [
        '',                          # Empty token
        'null',                      # Literal null
        'undefined',                 # Literal undefined
        'Bearer ',                   # Empty bearer
        'Bearer null',               # Null bearer
        'Basic ' + b64encode(b':').decode(),  # Empty basic auth (b64encode returns bytes)
    ]

    for endpoint in endpoints:
        for token in bypass_tokens:
            resp = requests.get(
                f'{base_url}{endpoint}',
                headers={'Authorization': token}
            )

            if resp.status_code not in (401, 403):  # Both codes are correct rejections
                issues.append({
                    'type': 'AUTH_BYPASS',
                    'severity': 'CRITICAL',
                    'endpoint': endpoint,
                    'detail': f'Accessible with token: {repr(token)}'
                })

    return issues

My scanner found: 2 endpoints that returned data with an empty Bearer token. My middleware was checking if the header existed, not if it had a valid value.
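
To make the difference between "header exists" and "header is valid" concrete, here is a minimal Python sketch. It is an illustration under assumptions, not the rewritten middleware: VALID_TOKENS is a hypothetical in-memory store, and a real service would more likely verify a signed token or look up a session.

```python
import hmac

# Hypothetical token store, for illustration only
VALID_TOKENS = {'s3cr3t-token-for-user-123': 123}


def authenticate(authorization_header):
    """Return the user ID for a valid Bearer token, or None."""
    if not authorization_header:
        return None
    scheme, _, token = authorization_header.partition(' ')
    token = token.strip()
    # Reject wrong schemes and empty tokens before any lookup
    if scheme.lower() != 'bearer' or not token:
        return None
    for known, user_id in VALID_TOKENS.items():
        # compare_digest performs a constant-time comparison
        if hmac.compare_digest(token.encode(), known.encode()):
            return user_id
    return None
```

Every value in the bypass_tokens list above makes this function return None, so the handler can answer 401 before touching any data.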

3. Excessive Data Exposure

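This happens when an API returns entire database objects although the client only needs two fields. The 2023 edition of the OWASP list folds this category into Broken Object Property Level Authorization.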

def check_data_exposure(base_url, endpoint, auth_headers):
    """Check if API returns more data than needed"""
    resp = requests.get(f'{base_url}{endpoint}', headers=auth_headers)

    if resp.status_code != 200:
        return None

    data = resp.json()

    sensitive_fields = [
        'password', 'hash', 'salt', 'secret', 'token',
        'ssn', 'social_security', 'credit_card', 'card_number',
        'private_key', 'api_key', 'internal_id', 'email',
        'phone', 'address', 'ip_address', 'session'
    ]

    found = find_sensitive_fields(data, sensitive_fields)

    if found:
        return {
            'type': 'DATA_EXPOSURE',
            'severity': 'HIGH',
            'endpoint': endpoint,
            'detail': f'Exposes sensitive fields: {found}'
        }

def find_sensitive_fields(obj, sensitive, path=''):
    """Recursively search JSON for sensitive field names"""
    found = []
    if isinstance(obj, dict):
        for key, val in obj.items():
            current = f'{path}.{key}' if path else key
            if any(s in key.lower() for s in sensitive):
                found.append(current)
            found.extend(find_sensitive_fields(val, sensitive, current))
    elif isinstance(obj, list) and obj:
        found.extend(find_sensitive_fields(obj[0], sensitive, f'{path}[0]'))
    return found

My scanner found: 8 fields being returned unnecessarily, including email addresses and internal IDs in a public endpoint.
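
One common remedy is to serialize responses through an explicit allowlist. The sketch below assumes hypothetical field names (id, display_name, avatar_url) and a plain dict as the database record; it shows the idea only, not the project's serializers.

```python
# Hypothetical allowlist for a public user endpoint
PUBLIC_USER_FIELDS = ('id', 'display_name', 'avatar_url')


def serialize(record, allowed_fields):
    """Copy only allowlisted keys, so new database columns never leak by default."""
    return {key: record[key] for key in allowed_fields if key in record}
```

Because the output is built from the allowlist rather than by deleting known-bad keys, a column added to the table later stays private until someone deliberately exposes it.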

4. Rate Limiting Gaps

import time
from concurrent.futures import ThreadPoolExecutor

def check_rate_limiting(url, auth_headers, num_requests=100):
    """Test if rate limiting is properly enforced"""

    def make_request(_):
        return requests.get(url, headers=auth_headers).status_code

    start = time.time()
    with ThreadPoolExecutor(max_workers=20) as executor:
        results = list(executor.map(make_request, range(num_requests)))
    elapsed = time.time() - start

    success_count = results.count(200)
    rate_limited = results.count(429)

    if rate_limited == 0 and success_count > 50:
        return {
            'type': 'NO_RATE_LIMIT',
            'severity': 'MEDIUM',
            'endpoint': url,
            'detail': f'{success_count}/{num_requests} succeeded in {elapsed:.1f}s, 0 rate-limited'
        }

My scanner found: 5 endpoints with no rate limiting at all. An attacker could have hammered them at thousands of requests per second.
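
For reference, the server-side behavior the check above looks for can be sketched as a sliding-window limiter. This is a single-process illustration with hypothetical names (SlidingWindowLimiter, allow); a production setup would usually keep the counters in shared storage so every worker sees the same state.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_requests per client within window_seconds."""

    def __init__(self, max_requests, window_seconds, clock=time.monotonic):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the behavior can be tested without sleeping
        self._hits = defaultdict(deque)

    def allow(self, client_id):
        now = self.clock()
        hits = self._hits[client_id]
        # Drop timestamps that have aged out of the window
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # the caller should respond with 429
        hits.append(now)
        return True
```

A handler would call allow() with the API key or client IP and return 429 whenever it gets False, which is exactly the status code the scanner counts.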

5. Security Header Analysis

def check_security_headers(url):
    """Verify security headers are present"""
    resp = requests.get(url)
    headers = resp.headers

    required_headers = {
        'X-Content-Type-Options': 'nosniff',
        'X-Frame-Options': ['DENY', 'SAMEORIGIN'],
        'Strict-Transport-Security': None,  # Just needs to exist
        'Content-Security-Policy': None,
        'X-XSS-Protection': '1; mode=block',  # Legacy header: modern browsers ignore it
    }

    missing = []
    for header, expected in required_headers.items():
        if header not in headers:
            missing.append({'header': header, 'status': 'MISSING'})
        elif expected and headers[header] not in (expected if isinstance(expected, list) else [expected]):
            missing.append({'header': header, 'status': 'MISCONFIGURED', 'value': headers[header]})

    return missing

My scanner found: 4 missing security headers across different endpoints.
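
On the serving side, the missing headers can be merged into every response in one place. The values below are assumptions for a JSON-only API (in particular the Content-Security-Policy and the one-year HSTS max-age) and the helper name with_security_headers is hypothetical.

```python
# Assumed defaults for a JSON-only API; adjust per application
SECURITY_HEADERS = {
    'X-Content-Type-Options': 'nosniff',
    'X-Frame-Options': 'DENY',
    'Strict-Transport-Security': 'max-age=31536000; includeSubDomains',
    'Content-Security-Policy': "default-src 'none'; frame-ancestors 'none'",
}


def with_security_headers(response_headers):
    """Return a new dict with the defaults applied; explicit per-route values win."""
    merged = dict(SECURITY_HEADERS)
    merged.update(response_headers)
    return merged
```

Applying the defaults centrally avoids the situation the scan revealed, where some endpoints had the headers and others did not.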

The Full Results

Category	Issues Found	Severity
BOLA	4	🔴 High
Auth Bypass	2	🔴 Critical
Data Exposure	8	🟡 Medium-High
No Rate Limit	5	🟡 Medium
Missing Headers	4	🟢 Low
Total	23
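
A table like this can be produced directly from the finding dicts the check functions return. The helper below is a small sketch; the function name summarize is hypothetical and it only relies on the 'type' key used throughout the checks above.

```python
from collections import Counter


def summarize(findings):
    """Count findings per type, most frequent first."""
    return Counter(finding['type'] for finding in findings).most_common()
```

Feeding it the combined output of all checks gives one (type, count) pair per row.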

What I Did About It

BOLA fixes: Added middleware that checks req.user.id === resource.owner_id on every data endpoint
Auth fixes: Rewrote auth middleware to validate token content, not just presence
Data exposure: Created response serializers that whitelist fields per endpoint
Rate limiting: Added express-rate-limit with tiered limits (100/min for reads, 10/min for writes)
Headers: Added helmet.js with strict CSP

Run It On Your Own APIs

I've open-sourced the scanner. To use it:

git clone https://github.com/spinov001-art/api-vulnerability-scanner
cd api-vulnerability-scanner
pip install -r requirements.txt

# Quick scan
python scanner.py --url https://your-api.com --endpoints endpoints.json

# Full scan with auth
python scanner.py --url https://your-api.com --token YOUR_TOKEN --full

The endpoints.json format:

{
  "endpoints": [
    {"method": "GET", "path": "/api/users/{id}"},
    {"method": "GET", "path": "/api/orders/{id}"},
    {"method": "POST", "path": "/api/data"}
  ]
}
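
How the scanner itself parses this file is not shown here, so the following is only a guess at one reasonable approach: read the document and substitute a concrete value for the {id} placeholder before probing. The function name load_endpoints and the sample_id parameter are hypothetical.

```python
import json


def load_endpoints(text, sample_id=1):
    """Parse the endpoints document and fill {id} with a concrete value."""
    spec = json.loads(text)
    return [
        (entry['method'].upper(), entry['path'].replace('{id}', str(sample_id)))
        for entry in spec['endpoints']
    ]
```

Paths without a placeholder, such as /api/data, pass through unchanged.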

Key Lessons

Auth ≠ Authorization — Having a login doesn't mean your data is protected
Never trust the client — Server must validate every request independently
Return minimum data — If the UI needs 3 fields, return 3 fields, not 30
Automate security checks — Run this in CI/CD, not once a quarter
OWASP API Top 10 is your checklist — Read it. Seriously.
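
For the CI/CD point, the simplest gate is an exit code derived from the findings. This sketch assumes the 'severity' values used in the checks above (LOW, MEDIUM, HIGH, CRITICAL); the function name exit_code and the fail_at threshold are hypothetical.

```python
SEVERITY_RANK = {'LOW': 0, 'MEDIUM': 1, 'HIGH': 2, 'CRITICAL': 3}


def exit_code(findings, fail_at='HIGH'):
    """Return 1 if any finding is at or above the threshold, else 0."""
    threshold = SEVERITY_RANK[fail_at]
    failed = any(SEVERITY_RANK[f['severity']] >= threshold for f in findings)
    return 1 if failed else 0
```

A pipeline step could pass the result to sys.exit(), so a HIGH or CRITICAL finding fails the build while MEDIUM issues are only reported.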

Have you run a security scanner on your APIs recently? I'm curious what people find — drop your experience in the comments.
