DEV Community: Arnold M S

Open Source OWASP API Security Scanner with AI-Assisted Testing

Arnold M S — Thu, 30 Apr 2026 10:06:49 +0000

Most security scanners produce a list of vulnerabilities ranked by severity and leave the remediation work to you. After working on projects where that list grew long and the question "can someone actually exploit this right now?" remained unanswered, I built something different.

The result is Breach Gate, an open source CLI tool that combines static analysis, container scanning, dynamic API testing, and AI-assisted behavioral testing into a single pipeline. It outputs one clear answer: SAFE, UNSAFE, or REVIEW REQUIRED.

The Core Problem

Traditional scanners answer: "What vulnerabilities exist?"

Breach Gate answers: "Can an attacker actually compromise the system right now?"

The distinction matters in CI pipelines. A list of medium-severity findings does not tell you whether to block a deployment. A confirmed exploit does.

Breach Gate scores every finding using a multiplicative formula:
Risk = Reachability x Exploitability x Impact x Confidence

A vulnerability that is hard to reach, has no working proof-of-concept, and low confidence stays at a low risk score. A confirmed exploit with a working payload gets boosted to critical regardless of how the individual factors score.
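
To make the scoring rule concrete, here is a minimal sketch. It assumes each factor is normalized to the range 0 to 1 and that a confirmed exploit is raised to a fixed critical floor. The function name, field names, and the 0.95 floor are illustrative assumptions, not Breach Gate's actual implementation.

```javascript
// Hypothetical sketch of a multiplicative risk score.
// Assumption: every factor is a number between 0 and 1.
function riskScore(finding) {
  const { reachability, exploitability, impact, confidence } = finding;
  const product = reachability * exploitability * impact * confidence;

  // Assumption: a confirmed exploit is boosted to at least a "critical"
  // floor, whatever the individual factors multiply out to.
  const CRITICAL_FLOOR = 0.95;
  return finding.confirmedExploit ? Math.max(product, CRITICAL_FLOOR) : product;
}

const factors = { reachability: 0.2, exploitability: 0.3, impact: 0.9, confidence: 0.4 };
const hardToReach = riskScore({ ...factors, confirmedExploit: false });
const confirmed = riskScore({ ...factors, confirmedExploit: true });

console.log(hardToReach.toFixed(4), confirmed.toFixed(2));
```

Because the factors multiply, one weak factor pulls the whole score down, which is why an unconfirmed, hard-to-reach finding stays low while the confirmed one does not.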

What It Tests

AI-Assisted Behavioral Testing

The scanner generates OWASP-based test cases per endpoint and executes them against your live API. Two mechanisms keep false positives low:

Baseline diffing -- a benign request is sent to each endpoint before any attack probes. Response tokens that appear in the baseline are filtered from vulnerability indicators, eliminating a large class of false positives where generic words like "error" or "id" triggered matches.
Time-based blind injection -- responses delayed more than 3 seconds AND more than 3 times the baseline timing are flagged as potential blind SQL or command injection, which cannot be detected from response bodies alone.
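
The two filters above can be sketched roughly as follows. The tokenizer, function names, and sample bodies are assumptions made for illustration; only the 3-second and 3x-baseline thresholds come from the description above.

```javascript
// Hypothetical sketch of the two false-positive filters.
// Tokenizing on non-word characters is an assumption made for illustration.
function tokens(body) {
  return new Set(body.toLowerCase().split(/\W+/).filter(Boolean));
}

// Keep only indicator words that appear in the attack response
// but not in the benign baseline response.
function novelIndicators(baselineBody, attackBody, indicators) {
  const base = tokens(baselineBody);
  const attack = tokens(attackBody);
  return indicators.filter((word) => attack.has(word) && !base.has(word));
}

// Flag a possible blind injection only when the response is slow in
// absolute terms (> 3 s) AND slow relative to the baseline (> 3x).
function looksLikeBlindInjection(baselineMs, attackMs) {
  return attackMs > 3000 && attackMs > 3 * baselineMs;
}

const hits = novelIndicators(
  '{"id": 1, "error": null}',
  '{"id": 1, "error": "SQL syntax near SELECT"}',
  ["error", "id", "syntax"]
);
console.log(hits);
console.log(looksLikeBlindInjection(200, 5200));  // fast baseline, slow probe
console.log(looksLikeBlindInjection(2500, 5200)); // endpoint is slow anyway
```

Requiring both timing conditions means a naturally slow endpoint is not flagged just because a probe took longer than 3 seconds.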

Attack categories covered out of the box:

Category	Detection Method
SQL Injection	Response body, error text, blind timing
Command Injection	Response body, blind timing
XSS	Reflected probe in response
Broken Access Control	Status code shift vs baseline
SSRF	Cloud metadata endpoint probing
Mass Assignment	Privilege field echo in response
JWT Attacks	Algorithm confusion, claim tampering, expired token
Path Traversal	File content indicators in response

Static Analysis via Trivy

Scans your source code and dependencies for known CVEs, exposed secrets, and misconfigurations. Results feed into the same scoring pipeline as dynamic findings.

Container Scanning

Pulls your Docker image and runs Trivy against the filesystem and OS packages. Findings are correlated with the API endpoint they affect where possible.

GraphQL Security Probing

For GraphQL APIs, Breach Gate runs five dedicated probes: introspection exposure, depth-limit denial of service, field suggestion enumeration, variable injection, and IDOR by ID enumeration.

Dynamic Testing via OWASP ZAP

When ZAP is available (local or Docker), the scanner runs an active API scan and merges the results with findings from other scanners.

The Output

SECURITY VERDICT:
╔════════════════════════════════════════════════════════╗
║ UNSAFE TO DEPLOY ║
╚════════════════════════════════════════════════════════╝
Reason: Confirmed exploitation: SQL Injection, Command Injection.
Active attacks succeeded during testing.
2 CONFIRMED EXPLOITS:
SQL Injection on POST /api/data
Command Injection on POST /api/execute
Attack Surface (by endpoint):
POST /api/execute
Risk: 95%
Command Injection
Attack chain: Command Injection -> Full System Compromise
POST /api/data
Risk: 90%
SQL Injection
Attack chain: Injection -> System Compromise

Reports are generated in JSON, Markdown, SARIF, and HTML. The HTML report includes a category filter bar and one-click evidence copy.

CI Integration

Breach Gate is published to the GitHub Marketplace as a composite action:

- name: Run Breach Gate
  uses: epten08/breach-gate@v1
  with:
    target: ${{ vars.STAGING_API_URL }}
    anthropic-api-key: ${{ secrets.ANTHROPIC_API_KEY }}
    format: json,markdown,sarif
    output: security-reports

The action outputs a verdict value (PASS or FAIL) that downstream steps can consume, and the SARIF report integrates directly with GitHub Code Scanning.

For teams not on GitHub, the same scan runs via npm:

npx breach-gate scan --target https://staging.api.example.com --ci

The --ci flag sets a non-zero exit code on UNSAFE verdicts, which blocks the deployment step in any CI system.

Watch Mode

For continuous environments, a watch command runs scans on a configurable interval and diffs findings between runs:

breach-gate watch --target http://localhost:3000 --interval 300

New findings are logged as warnings. Resolved findings are logged as informational. This is useful for staging environments that receive frequent deployments.
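
Conceptually, the diff between two runs is a set comparison. The sketch below assumes each finding carries a stable `id`. The field name and the sample IDs are hypothetical, not taken from Breach Gate's report format.

```javascript
// Hypothetical sketch: compare two scan runs by finding ID.
// The `id` field is an assumed stable identifier for a finding.
function diffFindings(previous, current) {
  const prevIds = new Set(previous.map((f) => f.id));
  const currIds = new Set(current.map((f) => f.id));
  return {
    added: current.filter((f) => !prevIds.has(f.id)),     // would be logged as warnings
    resolved: previous.filter((f) => !currIds.has(f.id)), // would be logged as informational
  };
}

const runA = [{ id: "sqli-post-api-data" }, { id: "missing-header-health" }];
const runB = [{ id: "missing-header-health" }, { id: "cmdi-post-api-execute" }];
const delta = diffFindings(runA, runB);

console.log(delta.added.map((f) => f.id), delta.resolved.map((f) => f.id));
```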

Suppressing Known Findings

Teams working on legacy APIs often have accepted known issues that are tracked. A .breachgateignore file prevents those from blocking pipelines:

suppress:
  - id: "finding-abc123"
    reason: "Tracked in JIRA-456, fix scheduled for next sprint"
    expires: "2026-06-01"

  - pattern: "Missing security header"
    endpoint: "/api/health"
    reason: "Health check endpoint, intentionally minimal headers"

Rules with an expires date automatically stop suppressing after that date, which prevents forgotten suppressions from masking real regressions.
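
The expiry behaviour could be implemented along these lines. This is a sketch under stated assumptions: `expires` is an ISO date, the rule stays active through the end of that day in UTC, and rules without `expires` never lapse. The function name is hypothetical.

```javascript
// Hypothetical sketch of expiry handling for suppression rules.
// Assumption: `expires` is an ISO date (YYYY-MM-DD) and the rule applies
// through the end of that day in UTC; rules without `expires` never lapse.
function isRuleActive(rule, now) {
  if (!rule.expires) return true;
  const endOfDay = new Date(`${rule.expires}T23:59:59.999Z`);
  return now.getTime() <= endOfDay.getTime();
}

const tracked = { id: "finding-abc123", expires: "2026-06-01" };
const permanent = { pattern: "Missing security header", endpoint: "/api/health" };

console.log(isRuleActive(tracked, new Date("2026-05-31T12:00:00Z")));   // before expiry
console.log(isRuleActive(tracked, new Date("2026-06-02T00:00:00Z")));   // after expiry
console.log(isRuleActive(permanent, new Date("2030-01-01T00:00:00Z"))); // no expiry set
```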

Getting Started

# Install globally
npm install -g breach-gate

# Run against your API
breach-gate scan --target http://localhost:3000

# Run the built-in demo to see a full vulnerable API scan
git clone https://github.com/epten08/breach-gate
cd breach-gate
npm install
npm run demo        # starts a deliberately vulnerable API
npm run scan        # scans it

An OpenAPI spec can be passed to give the scanner full endpoint coverage:

breach-gate scan --target http://localhost:3000 --openapi ./openapi.yml

Without a spec, the scanner infers common endpoint patterns and uses them as a starting point.

Lessons Learned

Reducing the false positive rate was more challenging than building the detection logic. Early versions flagged nearly everything because words like "error", "id", and "success" appeared in every API response. Combining baseline diffing with restricting body matches to 2xx responses brought the false positive rate to a manageable level.

Prompt design for the Anthropic API also required careful iteration. Prompts using direct offensive language were blocked by content filtering. Reframing the same tests as "authorized penetration testing" and "OWASP-based assessment probes" passed the filter while generating identical test cases.

What K6 Load Testing Found in My POS SaaS API (Before It Hit Production)

Arnold M S — Thu, 30 Apr 2026 08:33:32 +0000

Background

I'm a solo developer building a multi-tenant Point of Sale SaaS system, the kind where dozens of shops might be processing sales at the same time through a shared Laravel API. My unit tests passed. My Postman collection passed. Everything worked perfectly when I tested it by hand.

So naturally, I decided to see what happened when I threw 50 simultaneous virtual users at it.

The answer: chaos.

This post walks through exactly what broke, the root causes, and the fixes. The patterns here (deadlocks, N+1 queries, race conditions on unique values, synchronous work on hot paths) show up in almost every database-backed API. I just didn't know they were hiding in mine until I looked.

The Setup

Stack: Laravel 11 + MySQL + Redis queues, running in Docker Compose
Test tool: K6 v0.53.0
Scenarios:

multi-tenant.js — multiple tenants each with concurrent cashiers
busy-shop.js — one shop, 50 virtual users (VUs) ramping up over 10 minutes
end-of-day.js - simultaneous cash-ups across all tenants (not yet run)

The critical path I was testing: POST /api/sales, the sale transaction that validates items, reduces inventory, records payment, creates journal entries, and updates the till balance.

Before I could even run a test, I had to fix a long list of test harness issues (wrong Docker volume paths, K6 SharedArray type mismatches, Windows Git Bash path mangling, tenant soft-delete not cleaning up properly). If you're setting up K6 with Docker on Windows, budget extra time for that part. I'm considering publishing my Docker Compose + K6 configuration separately if there's interest.

Bug 1: MySQL Deadlock on `tills.current_balance` — Critical

Error rate at 50 VUs: 2.76% of all sales

SQLSTATE[40001]: Serialization failure: 1213 Deadlock found when trying to
get lock; try restarting transaction

Every cashier hitting "Complete Sale" had about a 1-in-36 chance of getting an error. The sale would silently roll back and the cashier would have to retry.

Why it happened

The sale transaction went roughly:

Sale::create() → triggers SaleObserver::created()
Observer immediately increments tills.current_balance (acquires exclusive lock on tills row)
Loop through items → InventoryRepository::reduceStock() → SELECT ... FOR UPDATE on inventories rows

Under concurrent load with 50 VUs sharing 3 tills, this created a classic circular dependency:

Transaction A (cash, products [1, 3]): holds tills[1] lock → waiting for inventories[product_3]
Transaction B (cash, products [3, 1]): holds inventories[product_3] lock → waiting for tills[1]

MySQL breaks the cycle by rolling back one transaction. The "victim" returns HTTP 400 to the cashier.

The lock on tills was being held for ~500ms, the entire duration of inventory processing, because it was acquired inside the open transaction, before the inventory loop.

The fix

Defer the till balance update to after the transaction commits:

// app/Observers/SaleObserver.php
public function created(Sale $sale): void
{
    if ($sale->isCompleted() && $sale->payment_method === 'cash') {
        DB::afterCommit(function () use ($sale) {
            $sale->till->increment('current_balance', $sale->total);
        });
    }
}

DB::afterCommit() fires after DB::commit() completes. The tills lock now lasts ~1ms (a single UPDATE outside any transaction) instead of ~500ms.

Result: 0 deadlocks at 50 VUs.

Bug 2: N+1 Queries in the Sale Service — High

Before fix: sale latency minimum 377ms, p(95) = 1.74s at 6 VUs

This one wasn't crashing anything; it was just quietly making every sale slower and slower as load increased.

Why it happened

createSale() had two separate loops over the sale items. Each loop called productRepository->find($item['product_id']) independently — so a 4-item sale triggered 8 product queries. Additionally, inventoryRepository->getQuantity() was called once in the validation loop and again in the deduction loop, doubling the inventory queries.

With 50 concurrent users each processing 3–5 item sales, the database was doing 8–10 queries per sale where 1 was needed.

The fix

Batch-load all products before the loops, and cache inventory quantities from the validation pass:

// One query for all products in this sale
$productMap = Product::whereIn('id', array_column($data['items'], 'product_id'))
    ->get()->keyBy('id');
$stockMap = [];

// Validation loop: cache stock alongside the check
if ($product->track_inventory) {
    $availableStock = $this->inventoryRepository->getQuantity(...);
    $stockMap[$item['product_id']] = $availableStock;
}

// Deduction loop: no DB hits
$product        = $productMap->get($item['product_id']);
$quantityBefore = $stockMap[$item['product_id']] ?? ...;

Result: Sale minimum latency: 377ms → 253ms (−33%). p(95) at 6 VUs: 1.74s → 814ms (−53%).
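
The difference in query count can be illustrated with a toy in-memory repository that counts lookups. This is a sketch, not the application's code: `makeRepo`, `find`, and `findMany` are hypothetical stand-ins for the database layer.

```javascript
// Hypothetical in-memory "repository" that counts lookups, standing in
// for the database. Names here are illustrative, not the app's code.
function makeRepo(products) {
  let queries = 0;
  return {
    find(id) { queries++; return products.find((p) => p.id === id); },
    findMany(ids) { queries++; return products.filter((p) => ids.includes(p.id)); },
    get queries() { return queries; },
  };
}

const catalog = [1, 2, 3, 4].map((id) => ({ id, price: id * 10 }));
const saleItems = catalog.map((p) => ({ product_id: p.id }));

// Before: two loops, each looking every product up again.
const naive = makeRepo(catalog);
for (const item of saleItems) naive.find(item.product_id); // validation loop
for (const item of saleItems) naive.find(item.product_id); // deduction loop

// After: one batched lookup, then both loops read from a Map.
const batched = makeRepo(catalog);
const productMap = new Map(
  batched.findMany(saleItems.map((i) => i.product_id)).map((p) => [p.id, p])
);
for (const item of saleItems) productMap.get(item.product_id); // validation loop
for (const item of saleItems) productMap.get(item.product_id); // deduction loop

console.log(naive.queries, batched.queries);
```

For the 4-item sale this reproduces the 8 product lookups described above, against a single batched lookup after the fix.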

Bug 3: Synchronous Accounting on the HTTP Hot Path — High

120–140ms added to every single sale response

Why it happened

After the sale committed, SaleService called saleAccountingService->recordSaleEntry($sale) synchronously on the same HTTP worker thread that the cashier's request was waiting on. That method made 4–6 separate ChartOfAccount lookups, inserted JournalEntry and JournalEntryLine rows, and did all of this while blocking the response.

The cashier doesn't need to wait for the accounting ledger to update before the "Sale complete" screen appears.

The fix

Dispatch a queued job instead:

// Before (blocking):
$this->saleAccountingService->recordSaleEntry($sale);

// After (async):
RecordSaleEntryJob::dispatch($sale->id);

The existing Redis queue worker picks it up in ~100–140ms. Journal entries are still written correctly, which I verified by inspecting the database after a 60-sale test run. The cashier just doesn't have to wait for it.

One thing to consider with this approach: if the job fails, the sale exists but the journal entries don't. I handle this with Laravel's built-in job retry mechanism (tries = 3 with exponential backoff) and a failed job alert that notifies me via the existing monitoring. In practice, journal entry creation is simple enough that transient failures (brief DB hiccups) resolve on retry, and permanent failures (code bugs) get caught in development.

I also added an instance-level $accountCache to SaleAccountingService to avoid redundant ChartOfAccount lookups within a single job execution.

Bug 4: Duplicate `sale_number` Under Concurrency — Medium

SQLSTATE[23000]: Integrity constraint violation: 1062 Duplicate entry 'SAL-...'

Why it happened

generateSaleNumber() used PHP's uniqid(), which is time-based. Under 50 concurrent VUs, multiple calls within the same microsecond returned identical values.

The fix

Replace with cryptographically random bytes and rely on the database's UNIQUE constraint as the true guarantee:

// Requires at the top of the file: use Illuminate\Database\QueryException;
protected function generateSaleNumber(): string
{
    $prefix = 'SAL';
    $date   = date('Ymd');
    $maxAttempts = 5;

    for ($i = 0; $i < $maxAttempts; $i++) {
        try {
            // 3 random bytes = 6 hex characters
            $random = strtoupper(bin2hex(random_bytes(3)));
            $number = "{$prefix}-{$date}-{$random}";

            // The UNIQUE constraint on sale_number is the real safety net.
            // This existence check is an optimistic pre-filter to avoid
            // hitting the constraint in the common case.
            if (!Sale::where('sale_number', $number)->exists()) {
                return $number;
            }
        } catch (QueryException $e) {
            // Only the exists() lookup can throw here; rethrow on the last attempt.
            // A duplicate-key error surfaces where the sale row is inserted,
            // so that insert needs its own retry with a fresh number.
            if ($i === $maxAttempts - 1) throw $e;
        }
    }

    throw new \RuntimeException('Failed to generate unique sale number');
}

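6 hex characters = 16.7 million combinations per day. With 4,000 daily sales, any single new number has at most about a 0.024% chance (4,000 / 16.7 million) of matching one already issued that day, but by the birthday bound the chance of at least one collision somewhere in the day is roughly 38%, so collisions should be expected regularly. The exists() check handles the common case, but under true concurrency two transactions can pass the check simultaneously. That's why the UNIQUE constraint, combined with a retry when the insert violates it, is the actual guarantee.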
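
The collision odds for a 6-hex-character suffix can be checked with the standard birthday-bound approximation. The sketch below computes two different quantities: the chance that at least two of the day's numbers collide, and the chance that one new number matches an existing one. The 4,000-sales figure comes from the text above; the function name is my own.

```javascript
// Birthday-bound approximation: probability that at least two of n
// uniformly random values drawn from a space of size N are equal.
function collisionProbability(n, N) {
  return 1 - Math.exp(-(n * (n - 1)) / (2 * N));
}

const space = 16 ** 6;   // 6 hex characters = 16,777,216 values
const dailySales = 4000;

// Chance that any two of the day's sale numbers collide.
const anyCollision = collisionProbability(dailySales, space);

// Chance that one new number matches one of 4,000 already issued.
const singleInsert = dailySales / space;

console.log((anyCollision * 100).toFixed(1) + "%");
console.log((singleInsert * 100).toFixed(3) + "%");
```

The per-insert chance is tiny, but the whole-day chance comes out at well over a third, which is why the database constraint and a retry path matter more than the pre-check.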

Lesson: uniqid() is not unique under concurrency. Any identifier you generate at the application layer and store in a UNIQUE column needs either a random component with enough entropy, or a database-side sequence/auto-increment. And the database constraint, not an application-level check, must be your source of truth.

Bug 5: Duplicate `inventory_adjustments.reference_number` — Medium

This one had two causes that had to be fixed separately.

Cause A: The reference was built as 'SALE-' . $sale->sale_number . '-' . $item['product_id']. If the same product appeared twice in a sale, two items produced an identical reference.

Fix A: Add a 1-based index suffix: 'SALE-' . $sale->sale_number . '-' . $product_id . '-' . ($idx + 1)
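
A quick way to see why the index suffix matters is to build the references for a sale that lists the same product twice. This is a sketch only: the helper name and the sample sale number are made up for illustration.

```javascript
// Hypothetical sketch: build one reference per line item, with or
// without the 1-based index suffix from Fix A.
function buildReferences(saleNumber, items, withIndex) {
  return items.map((item, idx) =>
    withIndex
      ? `SALE-${saleNumber}-${item.product_id}-${idx + 1}`
      : `SALE-${saleNumber}-${item.product_id}`
  );
}

// Product 7 appears twice in the same sale (example values).
const lineItems = [{ product_id: 7 }, { product_id: 7 }, { product_id: 9 }];
const before = buildReferences("SAL-EXAMPLE", lineItems, false);
const after = buildReferences("SAL-EXAMPLE", lineItems, true);

console.log(new Set(before).size, new Set(after).size);
```

Without the suffix the three line items yield only two distinct references; with it, every line item gets its own.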

Cause B: The K6 test payload builder was picking products randomly with replacement, so it could include the same product twice, which exposed the above bug. This was a test harness issue, not an application bug, but it was testing a valid edge case.

Fix B: Deduplicate product selection in K6:

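// usedIds is shared across all items in one sale payload
const usedIds = new Set();
let product;
let attempts = 0;
do {
    product = randomProduct(products);
    attempts++; // bounded retries so the loop always terminates
} while (usedIds.has(product.id) && attempts < 20);
usedIds.add(product.id);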

Bug 6: Inventory Lock Ordering — Low (Latent)

This one didn't cause observable failures during the test because Bug 1 was the active deadlock source. But it was sitting there waiting.

Why it was a risk

SELECT ... FOR UPDATE locks inventory rows in the order items appear in the request payload, which is arbitrary. Two concurrent transactions with overlapping products in reverse order create a textbook circular-wait deadlock.

The fix

Sort items by product_id before both loops:

usort($data['items'], fn($a, $b) => $a['product_id'] <=> $b['product_id']);

All transactions now acquire inventory locks in ascending product_id order. Circular waits on the inventory dimension are impossible.

This is a standard database technique: if multiple transactions need locks on the same set of rows, they must acquire them in a consistent order.
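
The effect of the sort can be shown without a database. In this sketch, two requests list the same products in opposite order, and after sorting both would request their row locks in the same sequence. The `lockOrder` helper is a hypothetical name for illustration.

```javascript
// Sketch: derive the order in which a transaction would lock inventory
// rows once its items are sorted by product_id (ascending).
function lockOrder(items) {
  return [...items]
    .sort((a, b) => a.product_id - b.product_id)
    .map((item) => item.product_id);
}

const txA = [{ product_id: 1 }, { product_id: 3 }];
const txB = [{ product_id: 3 }, { product_id: 1 }];

console.log(lockOrder(txA), lockOrder(txB));
```

With identical acquisition order, whichever transaction gets the first lock simply makes the other wait, and neither can hold a row the other needs first.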

The Numbers

All comparisons below are at 50 VUs over a 10-minute run to keep the baseline and post-fix numbers directly comparable.

Metric	Before	After
`http_req_failed`	1.44%	0.00%
Sale error rate	2.76%	0.00%
All checks passed	98.55%	100.00%
Sale p(95)	3.32s	2.07s
Sale minimum	377ms	253ms
Deadlock errors	~68 / 2,457 sales	0

What I Learned

1. Unit tests cannot find concurrency bugs. Every one of these bugs was invisible to my test suite because tests run sequentially. Deadlocks, race conditions on unique values, and lock ordering issues only appear when multiple transactions run simultaneously.
2. Load testing reveals the real hot path. Profiling under load showed that synchronous accounting, redundant queries, and observer side effects were all adding up on a code path that runs on every single sale. None of those costs were obvious from reading the code.
3. uniqid() is not unique. At any meaningful concurrency, uniqid() generates collisions. Use random_bytes() for application-layer identifiers stored in unique-constrained columns, and let the database constraint be the final enforcer, not an application-level exists() check.
4. Observers that do DB writes inside a transaction are a trap. Eloquent observers fire during the wrapping transaction. Any row lock acquired in an observer is held for the full transaction duration, including any slower work that happens after. Use DB::afterCommit() for side effects that don't need to be part of the main transaction.
5. Consistent lock ordering prevents deadlocks. If multiple transactions acquire locks on the same rows, sorting by primary key before processing guarantees they'll always request locks in the same order, making circular waits impossible.
6. The test harness itself has bugs. About half of my debugging time was spent on the test infrastructure (Docker volumes, Windows path issues, tenant cleanup, K6 API quirks) before I could even get a clean test run. Budget for this.

What's Still Open

The end-of-day.js scenario (simultaneous cash-ups across all tenants) hasn't run yet. End-of-day is the highest-risk moment: all shops closing, all GL accounts being touched at once. That's next.

There's also a latent entry_number collision in the accounting service that's going to need the same random_bytes treatment as the sale number fix. Found it in the queue worker logs; haven't fixed it yet.

If you're building any kind of transactional API and haven't run a load test, I'd strongly recommend it. The bugs above weren't edge cases; they were waiting to hit real customers on any moderately busy day.
