<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Benjamin</title>
    <description>The latest articles on DEV Community by Benjamin (@howwow-2000).</description>
    <link>https://dev.to/howwow-2000</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3838343%2Ff9819a71-22fe-4aa6-8616-200b94d25d0c.png</url>
      <title>DEV Community: Benjamin</title>
      <link>https://dev.to/howwow-2000</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/howwow-2000"/>
    <language>en</language>
    <item>
      <title>I Tested My Security Scanner on 500 Sites and Found It Was Lying About 158 of Them</title>
      <dc:creator>Benjamin</dc:creator>
      <pubDate>Tue, 24 Mar 2026 18:22:47 +0000</pubDate>
      <link>https://dev.to/howwow-2000/i-built-77-tests-for-my-security-scanner-and-found-a-production-bug-in-10-minutes-j80</link>
      <guid>https://dev.to/howwow-2000/i-built-77-tests-for-my-security-scanner-and-found-a-production-bug-in-10-minutes-j80</guid>
      <description>&lt;p&gt;Two days ago I published how I rebuilt my scoring from scratch. I recalibrated 20+ finding severities against CVSS and Bugcrowd, built SPA detection, and aligned with industry standards. Users confirmed the fixes worked.&lt;/p&gt;

&lt;p&gt;Then I decided to actually test whether my scanner tells the truth.&lt;/p&gt;

&lt;p&gt;Not "scan a few sites and eyeball the results." Real testing. A/B simulations on every scan in my database. Ground truth verification with actual HTTP requests. Gaming attacks against my own scoring.&lt;/p&gt;

&lt;p&gt;I tested 500+ sites in a single session. Here's what I found.&lt;/p&gt;




&lt;h2&gt;
  
  
  Test 1: I sent real PUT/DELETE/PATCH requests to 158 sites
&lt;/h2&gt;

&lt;p&gt;My scanner flagged 158 sites for "Dangerous HTTP methods enabled: PUT, DELETE, PATCH." That's a real security finding. If your server accepts DELETE requests without authentication, someone can delete your data.&lt;/p&gt;

&lt;p&gt;Except I never verified whether those methods were actually enabled. The scanner sent the request, saw a response status below 400, and concluded "method allowed."&lt;/p&gt;

&lt;p&gt;So I sent real PUT, DELETE, and PATCH requests to all 158 sites and recorded what came back.&lt;/p&gt;
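&lt;p&gt;The per-response classification looked roughly like this (a simplified sketch; &lt;code&gt;classify&lt;/code&gt; and &lt;code&gt;shellHtml&lt;/code&gt; are illustrative names, not my production code):&lt;/p&gt;

```javascript
// Simplified sketch of the per-site classification. classify() and
// shellHtml are illustrative names; the real run recorded raw responses.
function classify(status, contentType, body, shellHtml) {
  if (status === 405 || status === 404 || status === 400) {
    return 'blocked';      // the method was already rejected
  }
  if (status >= 400) {
    return 'error';        // other 4xx/5xx: still not "enabled"
  }
  if (status >= 300) {
    return 'redirect';     // 301/302/308 catch-all, not acceptance
  }
  if (contentType.includes('text/html')) {
    return 'spa-shell';    // the SPA answered with its HTML homepage
  }
  if (body === shellHtml) {
    return 'spa-shell';    // byte-identical to the catch-all shell
  }
  return 'enabled';        // a real endpoint accepted the method
}
```

&lt;p&gt;Only responses landing in the last branch would count as true positives. Across 158 sites, none did.&lt;/p&gt;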

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;What happened&lt;/th&gt;
&lt;th&gt;Sites&lt;/th&gt;
&lt;th&gt;%&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;False positive: server returned HTML (SPA catch-all)&lt;/td&gt;
&lt;td&gt;95&lt;/td&gt;
&lt;td&gt;60%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;False positive: server redirected (301/302/308)&lt;/td&gt;
&lt;td&gt;56&lt;/td&gt;
&lt;td&gt;35%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Already blocked (405/400/404) but scanner flagged anyway&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Actually enabled (real API accepting the method)&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Zero. Not one site had PUT, DELETE, or PATCH actually enabled on its homepage.&lt;/p&gt;

&lt;p&gt;The root cause: Single Page Applications return &lt;code&gt;200 OK&lt;/code&gt; with their HTML shell for any HTTP method, not just GET. A Next.js app, a React SPA on Vercel, a Nuxt site on Netlify — they all respond to &lt;code&gt;PUT /&lt;/code&gt; with their homepage. My scanner saw &lt;code&gt;200 OK&lt;/code&gt; and concluded the method was "allowed."&lt;/p&gt;

&lt;p&gt;Same problem with redirects. A server that answers every request with a 301 redirect isn't "accepting PUT." It's redirecting everything. But 301 is less than 400, so my check passed it.&lt;/p&gt;

&lt;p&gt;One user, Antoine, told me on LinkedIn: "Next.js App Router only exposes what you explicitly export. Since your route files only have GET and POST, those other methods automatically return 405. That was the only false positive. Everything else was real and we fixed it."&lt;/p&gt;

&lt;p&gt;That last sentence matters. The other findings were real. The HTTP methods check was the one that lied.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decision:&lt;/strong&gt; I removed the HTTP methods check entirely. Not fixed. Removed. A check with 0% true positives has no business being in a security report. I'll reintroduce it when I have a reliable way to test methods on actual API endpoints, not homepages.&lt;/p&gt;




&lt;h2&gt;
  
  
  Test 2: My scanner was blind to 82% of external scripts
&lt;/h2&gt;

&lt;p&gt;Mozilla Observatory checks whether your external scripts have Subresource Integrity (SRI) hashes. If a CDN gets compromised, SRI prevents the tampered script from running.&lt;/p&gt;

&lt;p&gt;Observatory flags SRI issues on about 93% of sites. My scanner: 18%.&lt;/p&gt;

&lt;p&gt;I assumed this was a detection quality gap. I was wrong. It was a plumbing bug.&lt;/p&gt;

&lt;p&gt;My SRI check runs on the initial HTML returned by the server. But modern frameworks (React, Next.js, Vue, Nuxt) don't put script tags in the HTML. They inject them dynamically after the page loads. My scanner was checking a page with zero script tags and concluding "no SRI issues."&lt;/p&gt;

&lt;p&gt;The irony: I already collect the rendered DOM. My headless browser (Cloudflare Puppeteer) renders the page, executes JavaScript, and returns the final HTML. I use it for tech stack detection. I just never connected it to the SRI check.&lt;/p&gt;

&lt;p&gt;The fix: one line. &lt;code&gt;checkSRI(renderedHtml || htmlBody)&lt;/code&gt; instead of &lt;code&gt;checkSRI(htmlBody)&lt;/code&gt;. The rendered HTML sees the scripts that frameworks inject. The initial HTML doesn't.&lt;/p&gt;
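&lt;p&gt;Sketched out, the shape of that fix looks like this (script extraction is abstracted away; here the inputs are pre-extracted script descriptors rather than raw HTML, and the names are illustrative):&lt;/p&gt;

```javascript
// Shape of the one-line fix: prefer the rendered DOM, because frameworks
// inject their script tags after load. Illustrative sketch, not the
// production code.
function findMissingSRI(scripts) {
  return scripts
    .filter(s => Boolean(s.src))     // scripts loaded by src (not inline)
    .filter(s => !s.integrity)       // no integrity hash present
    .map(s => ({ findingId: 'sri-missing', src: s.src }));
}

function checkSRI(renderedScripts, initialScripts) {
  // renderedScripts comes from the headless browser; initialScripts from
  // the raw HTML body. Fall back only when rendering produced nothing.
  const scripts = renderedScripts.length > 0 ? renderedScripts : initialScripts;
  return findMissingSRI(scripts);
}
```

&lt;p&gt;Giving the finding an explicit ID also matters: a finding without one can't be mapped in a benchmark.&lt;/p&gt;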

&lt;p&gt;I also found a secondary bug: the SRI finding didn't have a findingId. My benchmark tool mapped Observatory's SRI test to &lt;code&gt;sri-missing&lt;/code&gt;, but no finding ever had that ID. The 5% agreement rate I reported in my benchmark wasn't a detection gap. It was a broken mapping.&lt;/p&gt;




&lt;h2&gt;
  
  
  Test 3: I scanned Amazon, Netflix, and Reddit
&lt;/h2&gt;

&lt;p&gt;I had built a prototype for contextual scoring: adjust finding severity based on how much JavaScript a site loads. A static portfolio with zero scripts shouldn't be penalized the same way as a site with 12 third-party trackers.&lt;/p&gt;

&lt;p&gt;The prototype worked on my existing 362 user scans. 57% saw their score improve. Zero degraded. Average improvement: +0.78 points. 100 grade changes, all upward.&lt;/p&gt;

&lt;p&gt;Then my CTO asked: "How many of those 362 sites classify as 'complex'?"&lt;/p&gt;

&lt;p&gt;Zero. Every single user in my database has a simple site. My contextual scoring had only been tested in one direction: making simple sites look better. I had no data on whether it correctly maintained severity for complex sites.&lt;/p&gt;

&lt;p&gt;So I scanned 140 of the biggest sites on the internet: Amazon, Netflix, Reddit, GitHub, Stripe, Shopify, Notion, Figma, Airbnb, Booking.com, and 130 more.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Surface Level&lt;/th&gt;
&lt;th&gt;Sites&lt;/th&gt;
&lt;th&gt;%&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;MINIMAL (0 external scripts detected)&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;24%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LOW (1-5 scripts)&lt;/td&gt;
&lt;td&gt;106&lt;/td&gt;
&lt;td&gt;76%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MEDIUM (6+ scripts)&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HIGH (eval/dangerous patterns)&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Amazon, Netflix, and Reddit all classified as "simple." Zero external scripts detected.&lt;/p&gt;

&lt;p&gt;This is the same renderedHtml bug. My script count comes from the SRI check, which only sees the initial HTML. Amazon loads 50+ scripts dynamically. My scanner sees one or two.&lt;/p&gt;

&lt;p&gt;What this meant for contextual scoring: If I deployed it as designed, Amazon and a personal blog would get the same "low surface" bonus. A missing CSP on Amazon would be downgraded to informational, same as a missing CSP on a one-page HTML portfolio. That's not contextual scoring. That's a bug that happens to look like a feature.&lt;/p&gt;

&lt;p&gt;My CTO put it clearly: "The dev indie who sees his score go up without doing anything gets a false sense of security. That's worse than a false positive. A false positive wastes time. A false sense of security wastes vigilance."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decision:&lt;/strong&gt; Contextual scoring stays on hold. Not because it's a bad idea, but because the current design rewards inaction on simple sites instead of rewarding action. And the data pipeline can't distinguish simple sites from complex ones yet. Both problems need to be solved before deployment.&lt;/p&gt;




&lt;h2&gt;
  
  
  Test 4: I tried to game my own score
&lt;/h2&gt;

&lt;p&gt;An IEEE paper, Lepochat et al. (2025) "One Does Not Simply Score a Website", found that security scoring algorithms can be trivially gamed. I cited this paper in my previous article as a methodological reference. My CTO pointed out I was using it as decoration, not as a constraint.&lt;/p&gt;

&lt;p&gt;So I tested it. I simulated 12 gaming scenarios: adding security headers with empty, invalid, or actively harmful values. The question: does the score go up without security going up?&lt;/p&gt;

&lt;p&gt;Results: 9 out of 12 gaming attempts succeeded.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Gaming attempt&lt;/th&gt;
&lt;th&gt;Score change&lt;/th&gt;
&lt;th&gt;Actual security&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;All headers empty/invalid at once&lt;/td&gt;
&lt;td&gt;+2.3 (7.7 to 10.0)&lt;/td&gt;
&lt;td&gt;Zero. Some actively worse.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Strict-Transport-Security: max-age=0&lt;/td&gt;
&lt;td&gt;+0.8&lt;/td&gt;
&lt;td&gt;Worse. Tells browsers to remove HSTS protection.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Referrer-Policy: unsafe-url&lt;/td&gt;
&lt;td&gt;+0.3&lt;/td&gt;
&lt;td&gt;Worse. Leaks full URLs to third parties.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content-Security-Policy: (empty)&lt;/td&gt;
&lt;td&gt;+0.3&lt;/td&gt;
&lt;td&gt;Zero. Browser ignores empty CSP.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Content-Security-Policy: default-src *&lt;/td&gt;
&lt;td&gt;+0.3&lt;/td&gt;
&lt;td&gt;Zero. Allows everything.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X-Frame-Options: ALLOWALL&lt;/td&gt;
&lt;td&gt;+0.3&lt;/td&gt;
&lt;td&gt;Zero. Not a valid value. Browser ignores it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;X-Content-Type-Options: yes-please-sniff&lt;/td&gt;
&lt;td&gt;+0.3&lt;/td&gt;
&lt;td&gt;Zero. Only nosniff is valid.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The worst case: add every header with a garbage value and your score goes from 7.7 to a perfect 10. No security improvement. Some headers actively degrade your security.&lt;/p&gt;

&lt;p&gt;HSTS: max-age=0 is the most damaging. It tells browsers to stop enforcing HTTPS. My scanner sees "HSTS header present" and removes the finding. The score goes up. The protection goes down.&lt;/p&gt;

&lt;p&gt;The root cause: My scanner checks whether headers exist. It doesn't validate their values. &lt;code&gt;Content-Security-Policy:&lt;/code&gt; (empty string) passes the "has CSP" check. &lt;code&gt;X-Frame-Options: ALLOWALL&lt;/code&gt; passes the "has X-Frame-Options" check. The check is &lt;code&gt;header in response&lt;/code&gt;, not &lt;code&gt;header is correctly configured&lt;/code&gt;.&lt;/p&gt;
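&lt;p&gt;Here's what value validation looks like for one header, as a hedged sketch (&lt;code&gt;validateHSTS&lt;/code&gt; is a hypothetical helper, and the six-month minimum is my assumption, in line with common hardening guidance):&lt;/p&gt;

```javascript
// Hypothetical helper: validate the HSTS value instead of its mere
// presence. The six-month threshold is an assumption; only the shape
// of the check matters here.
function validateHSTS(value) {
  if (!value) {
    return { present: false, effective: false };
  }
  const match = value.match(/max-age=(\d+)/i);
  if (!match) {
    return { present: true, effective: false };  // malformed or empty policy
  }
  const maxAge = parseInt(match[1], 10);
  // max-age=0 tells browsers to DELETE their HSTS entry: worse than absent
  return { present: true, effective: maxAge >= 15552000 };
}
```

&lt;p&gt;Scoring &lt;code&gt;effective&lt;/code&gt; instead of &lt;code&gt;present&lt;/code&gt; kills both the empty-header and the &lt;code&gt;max-age=0&lt;/code&gt; gaming moves.&lt;/p&gt;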




&lt;h2&gt;
  
  
  What I'm changing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Headers are no longer in the score
&lt;/h3&gt;

&lt;p&gt;This was the hardest decision. Headers were 6 of my ~50 findings. They're the easiest thing for a user to fix. They show up on every scan.&lt;/p&gt;

&lt;p&gt;But they're gameable. A check that can be passed by adding an empty header doesn't belong in a security score. And a check that rewards &lt;code&gt;max-age=0&lt;/code&gt; is actively harmful.&lt;/p&gt;

&lt;p&gt;Headers still appear in your report as a checklist. "Your site has CSP: yes/no." "Your HSTS max-age is 31536000." But they don't affect your score.&lt;/p&gt;

&lt;p&gt;The score now reflects only findings with ground truth: exposed files, JavaScript secrets, SSL configuration, cookie security, SRI, CORS misconfigurations, backend permission bypasses. Things I can verify are real.&lt;/p&gt;

&lt;h3&gt;
  
  
  SRI uses the rendered DOM
&lt;/h3&gt;

&lt;p&gt;The SRI check now runs on the fully rendered page from my headless browser instead of the initial HTML. This means it sees the scripts that React, Next.js, Vue, and other frameworks inject dynamically.&lt;/p&gt;

&lt;p&gt;Expected impact: detection goes from 18% to an estimated 60-80% of external scripts.&lt;/p&gt;

&lt;h3&gt;
  
  
  HTTP methods check is gone
&lt;/h3&gt;

&lt;p&gt;Removed entirely. 158 false positives, zero true positives. The check tested the homepage URL with alternative HTTP methods. Every modern web framework returns 200 or redirects for any method on the root path. The check was structurally incapable of producing true positives.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Test your assumptions with real requests
&lt;/h3&gt;

&lt;p&gt;I had "automated tests" that validated the HTTP methods analyzer. The tests passed. The analyzer correctly identified status &amp;lt; 400 as "method allowed." The logic was correct. The assumption was wrong.&lt;/p&gt;

&lt;p&gt;The only way I found this was by sending actual PUT requests to actual sites and looking at what came back. 200 with HTML. 301 redirect. Not a single JSON API response. The tests were green. The check was broken.&lt;/p&gt;

&lt;h3&gt;
  
  
  A score that can be gamed is worse than no score
&lt;/h3&gt;

&lt;p&gt;If someone can improve their score by adding &lt;code&gt;Content-Security-Policy:&lt;/code&gt; (empty), the score doesn't measure security. It measures header count. And if &lt;code&gt;Strict-Transport-Security: max-age=0&lt;/code&gt; improves the score while removing protection, the score is actively misleading.&lt;/p&gt;

&lt;p&gt;Lepochat et al. warned about this. I cited their paper. I didn't test against it until my CTO asked.&lt;/p&gt;

&lt;h3&gt;
  
  
  "0 degradations" is not a result
&lt;/h3&gt;

&lt;p&gt;My contextual scoring prototype showed 57% of sites improved, 0% degraded. I presented this as validation. My CTO pointed out it was a tautology: the algorithm was designed to only lower penalties, and my entire dataset was simple sites. Of course nothing degraded. The test that matters is whether complex sites keep their full severity. I couldn't run that test because my scanner can't tell the difference between a blog and Amazon.&lt;/p&gt;

&lt;h3&gt;
  
  
  Transparency is a tool, not a shield
&lt;/h3&gt;

&lt;p&gt;My first article built trust by being honest about my scanner's limitations. That trust could become a shield for future decisions that haven't been validated. "Benji is transparent, so his contextual scoring must be solid." It wasn't. The design favored the wrong users.&lt;/p&gt;

&lt;p&gt;The fix isn't less transparency. It's testing every claim before publishing it.&lt;/p&gt;




&lt;h2&gt;
  
  
  The method
&lt;/h2&gt;

&lt;p&gt;If you're building a scoring system, here's the testing approach that caught these bugs:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ground truth verification.&lt;/strong&gt; Don't just check that your logic is correct. Verify that your assumptions about the real world are correct. Send real requests. Compare with real data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A/B simulation on existing data.&lt;/strong&gt; Before changing anything in production, replay every historical scan through the new algorithm. Measure what changes. If 0% degrade, ask whether that's a result or a design property.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Gaming tests.&lt;/strong&gt; Try to improve the score without improving security. If you can, the score doesn't measure what it claims.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Benchmark against a reference, but know what you're benchmarking.&lt;/strong&gt; I used Mozilla Observatory as a reference for 342 sites. That helped find the SRI mapping bug. But I also used Observatory correlation as a quality metric for my scoring, which doesn't make sense if I'm intentionally measuring different things.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.first.org/cvss/specification-document" rel="noopener noreferrer"&gt;CVSS v3.1 Specification&lt;/a&gt; (FIRST.org)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://bugcrowd.com/vulnerability-rating-taxonomy" rel="noopener noreferrer"&gt;Bugcrowd Vulnerability Rating Taxonomy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ieeexplore.ieee.org/document/11129585/" rel="noopener noreferrer"&gt;Lepochat et al. (2025) "One Does Not Simply Score a Website"&lt;/a&gt; (IEEE WTMC)&lt;/li&gt;
&lt;li&gt;&lt;a href="https://developer.mozilla.org/en-US/observatory/docs/tests_and_scoring" rel="noopener noreferrer"&gt;Mozilla Observatory Scoring&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://owasp.org/www-project-web-security-testing-guide/" rel="noopener noreferrer"&gt;OWASP Web Security Testing Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;AmIHackable scans your website from the outside — the same perspective an attacker has. &lt;a href="https://amihackable.dev" rel="noopener noreferrer"&gt;Scan your site now.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>infosec</category>
      <category>security</category>
      <category>testing</category>
    </item>
    <item>
      <title>How I Score Your Website's Security (And Why I Rebuilt It From Scratch)</title>
      <dc:creator>Benjamin</dc:creator>
      <pubDate>Sun, 22 Mar 2026 12:34:51 +0000</pubDate>
      <link>https://dev.to/howwow-2000/how-i-score-your-websites-security-and-why-i-rebuilt-it-from-scratch-440c</link>
      <guid>https://dev.to/howwow-2000/how-i-score-your-websites-security-and-why-i-rebuilt-it-from-scratch-440c</guid>
<description>&lt;p&gt;&lt;a href="https://amihackable.dev/learn/how-i-score-your-website-security" rel="noopener noreferrer"&gt;AmIHackable?&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I tested my scanner against Mozilla Observatory on 229 sites, found I was wrong on 40% of them, and rebuilt my entire approach. Here's everything I learned.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The problem: security scores that lie
&lt;/h2&gt;

&lt;p&gt;I built AmIHackable to give developers a clear picture of their website's security. Paste your URL, get a score, fix what matters. Simple.&lt;/p&gt;

&lt;p&gt;Except it wasn't simple. Users started telling me things like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"My site is a React SPA on Netlify. Your scanner says I have WordPress, PHP, and an exposed &lt;code&gt;.env&lt;/code&gt; file. None of that is true."&lt;/p&gt;

&lt;p&gt;"You gave me 3/10 but Mozilla Observatory gives me B+. Your score is misleading for a site with TLS 1.3, solid auth, and zero XSS surface."&lt;/p&gt;

&lt;p&gt;"The scanner flagged &lt;code&gt;dangerouslySetInnerHTML&lt;/code&gt; as an XSS risk — but that string doesn't exist anywhere in my code. It's in React's own bundle."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;These weren't edge cases. When I dug into the data, I found systematic problems.&lt;/p&gt;

&lt;p&gt;I compared my scores against &lt;a href="https://developer.mozilla.org/en-US/observatory" rel="noopener noreferrer"&gt;Mozilla Observatory&lt;/a&gt; on 229 real sites. The results were uncomfortable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sites that Observatory rated &lt;strong&gt;A+&lt;/strong&gt; were getting &lt;strong&gt;D&lt;/strong&gt; from me&lt;/li&gt;
&lt;li&gt;Sites that Observatory rated &lt;strong&gt;F&lt;/strong&gt; were getting &lt;strong&gt;A+&lt;/strong&gt; from me&lt;/li&gt;
&lt;li&gt;Overall correlation: &lt;strong&gt;56%&lt;/strong&gt; — barely better than flipping a coin between two adjacent grades&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I had two opposite problems at the same time: too harsh on well-configured sites, too lenient on poorly-configured ones.&lt;/p&gt;




&lt;h2&gt;
  
  
  What went wrong
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Problem 1: SPA false positives
&lt;/h3&gt;

&lt;p&gt;Modern web apps use Single Page Application architecture — one HTML file serves all routes. When you request &lt;code&gt;/actuator/env&lt;/code&gt; or &lt;code&gt;/.env&lt;/code&gt; on a Netlify SPA, you get a &lt;code&gt;200 OK&lt;/code&gt; with the app's homepage. My scanner saw &lt;code&gt;200 OK&lt;/code&gt; and concluded the file was accessible.&lt;/p&gt;

&lt;p&gt;Result: a React site with zero backend vulnerabilities gets flagged for Spring Boot actuator endpoints, PHP config files, and WordPress REST APIs. The score tanks to F.&lt;/p&gt;

&lt;p&gt;One user scanned his SPA on Netlify, got 0/10 with 23 findings — 15 of which were phantom API endpoints that didn't exist. He tried again a minute later. Same result. He probably left thinking the tool was broken.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The fix:&lt;/strong&gt; Before checking sensitive paths, I now probe a random nonsensical URL. If the site returns &lt;code&gt;200 OK&lt;/code&gt; with HTML (the SPA shell), I know it's a catch-all. Every subsequent check compares the response body against this fingerprint — if they match, it's the same SPA shell, not a real file.&lt;/p&gt;
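&lt;p&gt;A minimal sketch of that probe-and-fingerprint logic (function names and the response shape are illustrative, not the production code):&lt;/p&gt;

```javascript
// Illustrative sketch of the catch-all probe. The response objects are
// simplified; the real scanner compares full response bodies.
function catchAllFingerprint(probeResponse) {
  // probeResponse is the reply to a random, nonsensical URL
  if (probeResponse.status === 200) {
    if (probeResponse.contentType.includes('text/html')) {
      return probeResponse.body;   // catch-all detected: remember the shell
    }
  }
  return null;                     // genuine 404 behaviour, no fingerprint
}

function isRealFinding(response, fingerprint) {
  if (response.status !== 200) {
    return false;                  // nothing accessible at this path
  }
  if (fingerprint !== null) {
    if (response.body === fingerprint) {
      return false;                // just the SPA shell again, not a file
    }
  }
  return true;                     // a genuinely accessible resource
}
```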

&lt;p&gt;That same Netlify site now scores &lt;strong&gt;8.2/10 (B)&lt;/strong&gt; with 3 real findings instead of 23 false ones.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem 2: severity inflation
&lt;/h3&gt;

&lt;p&gt;When I first built the scanner, I made a deliberate choice: &lt;strong&gt;rate conservatively, alert too much rather than too little.&lt;/strong&gt; I figured it was better to flag a missing CSP as High and have a user add it, than to call it Low and have them ignore it.&lt;/p&gt;

&lt;p&gt;That logic felt responsible. But it backfired completely.&lt;/p&gt;

&lt;p&gt;Missing CSP? High. Missing X-Frame-Options? Medium. Missing HSTS? High. Session cookie without HttpOnly? High. My scanner was screaming "danger" at sites that were fundamentally fine — just missing some defense-in-depth layers.&lt;/p&gt;

&lt;p&gt;When I researched how the security industry actually rates these findings, I realized how far off I was:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Finding&lt;/th&gt;
&lt;th&gt;My initial rating&lt;/th&gt;
&lt;th&gt;Industry consensus&lt;/th&gt;
&lt;th&gt;Source&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Missing CSP&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Low&lt;/strong&gt; (CVSS 2.1-3.1)&lt;/td&gt;
&lt;td&gt;Tenable, Acunetix, Bugcrowd VRT&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Missing HSTS&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Medium&lt;/strong&gt; (CVSS 4.8-6.5)&lt;/td&gt;
&lt;td&gt;Tenable, Probely, OWASP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Missing X-Frame-Options&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Low&lt;/strong&gt; (CVSS 2.1-4.3)&lt;/td&gt;
&lt;td&gt;Bugcrowd P4-P5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Session cookie no HttpOnly&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Low&lt;/strong&gt; (CVSS 2.0-3.5)&lt;/td&gt;
&lt;td&gt;Requires existing XSS to exploit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Missing Referrer-Policy&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Informational&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bugcrowd P5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source maps exposed&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Medium&lt;/strong&gt; (CVSS 3.5-5.3)&lt;/td&gt;
&lt;td&gt;Info disclosure, not directly exploitable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The key insight came from a user who put it perfectly: &lt;em&gt;"The headers are nice-to-have, not vulnerabilities. A 3/10 score is misleading for a site with TLS 1.3, solid auth, and tested sanitization."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;He was right. &lt;strong&gt;A missing security header is the absence of a mitigation, not the presence of a vulnerability.&lt;/strong&gt; A missing CSP doesn't create XSS — it removes a layer of defense against XSS if one already exists. Those are two fundamentally different things.&lt;/p&gt;

&lt;p&gt;Professional penetration testers and bug bounty platforms (&lt;a href="https://bugcrowd.com/vulnerability-rating-taxonomy" rel="noopener noreferrer"&gt;Bugcrowd VRT&lt;/a&gt;, &lt;a href="https://hackerone.com" rel="noopener noreferrer"&gt;HackerOne&lt;/a&gt;) consistently rate missing headers as P4-P5 (Low/Informational) across millions of real submissions. I was rating them as active threats.&lt;/p&gt;

&lt;h3&gt;
  
  
  Problem 3: no detection context
&lt;/h3&gt;

&lt;p&gt;My scanner treated every finding identically regardless of what the site actually does. A missing CSP on a static portfolio with zero JavaScript gets the same severity as a missing CSP on an e-commerce site loading 12 third-party scripts. Those aren't the same risk. But my score said they were.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I rebuilt
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Findings first, score second
&lt;/h3&gt;

&lt;p&gt;The most important lesson from user feedback: &lt;strong&gt;nobody complained about the score formula. They complained about individual findings being wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A perfectly calibrated scoring model applied to false findings produces false scores. I invested most of my effort into making every finding defensible before touching the scoring.&lt;/p&gt;

&lt;p&gt;Changes deployed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SPA catch-all detection&lt;/strong&gt; eliminates false positives on Netlify, Vercel, and Cloudflare Pages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bundle-aware XSS detection&lt;/strong&gt; skips framework internals (React, Vite, Next.js bundles use &lt;code&gt;dangerouslySetInnerHTML&lt;/code&gt; internally — flagging it was misleading)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Platform-aware email checks&lt;/strong&gt; skip SPF/DMARC on &lt;code&gt;*.netlify.app&lt;/code&gt;, &lt;code&gt;*.vercel.app&lt;/code&gt; and similar — you don't control that DNS, so the finding isn't actionable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Industry-calibrated severities&lt;/strong&gt; for all 20+ finding types, each sourced from CVSS v3.1, Bugcrowd VRT, and OWASP WSTG&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Technology detection
&lt;/h3&gt;

&lt;p&gt;My tech detection is powered by &lt;a href="https://www.wappalyzer.com/" rel="noopener noreferrer"&gt;Wappalyzer&lt;/a&gt;'s open-source database (7,500+ technologies) combined with headless browser execution via Cloudflare Workers. I now pass JavaScript global variables detected through real browser execution and DNS records (MX, TXT, NS) to the detection engine — revealing hosting providers, email services, and CDN layers without additional requests.&lt;/p&gt;
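&lt;p&gt;Conceptually, the matching works like this (a toy sketch; the two fingerprints below are invented for the example, real patterns come from Wappalyzer's database):&lt;/p&gt;

```javascript
// Toy sketch of pattern matching over JS globals and DNS NS records.
// These two fingerprints are invented for the example; real ones come
// from Wappalyzer's open-source database.
const patterns = [
  { tech: 'React', jsGlobals: ['React', '__REACT_DEVTOOLS_GLOBAL_HOOK__'] },
  { tech: 'ExampleHost', dnsNs: ['ns1.example-host.invalid'] },
];

function detect(globals, nsRecords) {
  const found = [];
  for (const p of patterns) {
    const jsHit = (p.jsGlobals || []).some(g => globals.includes(g));
    const dnsHit = (p.dnsNs || []).some(ns => nsRecords.includes(ns));
    if (jsHit || dnsHit) {
      found.push(p.tech);
    }
  }
  return found;
}
```

&lt;p&gt;The point of feeding in browser-executed globals and DNS records is that neither requires an extra request: both are already collected during the scan.&lt;/p&gt;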

&lt;p&gt;I benchmarked my detection against the real Wappalyzer browser extension on 445 sites. On a real-world test like Doctolib, I match 5 out of 5 of Wappalyzer's key detections (Rails, Cloudflare, Sentry, Didomi, Bot Management) and catch 2 extras (Ruby runtime, Google Tag Manager). There are gaps in smaller libraries (Preact, PDF.js) that I'm closing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scoring methodology
&lt;/h3&gt;

&lt;p&gt;Each finding's severity is now aligned with industry standards:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Severity&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;th&gt;Based on&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Critical&lt;/td&gt;
&lt;td&gt;Immediate risk, exploitable now&lt;/td&gt;
&lt;td&gt;CVSS 9.0-10.0&lt;/td&gt;
&lt;td&gt;Exposed &lt;code&gt;.env&lt;/code&gt; with credentials, SSL failure, database RLS bypass&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Significant weakness&lt;/td&gt;
&lt;td&gt;CVSS 7.0-8.9&lt;/td&gt;
&lt;td&gt;Secrets in JS, weak TLS protocol&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Real but conditional risk&lt;/td&gt;
&lt;td&gt;CVSS 4.0-6.9&lt;/td&gt;
&lt;td&gt;Missing HSTS, open redirect, session cookie without Secure flag&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Defense-in-depth gap&lt;/td&gt;
&lt;td&gt;CVSS 0.1-3.9&lt;/td&gt;
&lt;td&gt;Missing CSP, missing X-Frame-Options, XSS code patterns&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Info&lt;/td&gt;
&lt;td&gt;Context, no action needed&lt;/td&gt;
&lt;td&gt;CVSS 0&lt;/td&gt;
&lt;td&gt;SSL configured correctly, Observatory grade, detected tech stack&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every severity decision is traceable to a published source: &lt;a href="https://www.first.org/cvss/specification-document" rel="noopener noreferrer"&gt;CVSS v3.1&lt;/a&gt;, &lt;a href="https://bugcrowd.com/vulnerability-rating-taxonomy" rel="noopener noreferrer"&gt;Bugcrowd VRT&lt;/a&gt;, &lt;a href="https://owasp.org/www-project-web-security-testing-guide/" rel="noopener noreferrer"&gt;OWASP WSTG&lt;/a&gt;, or the &lt;a href="https://cwe.mitre.org/top25/" rel="noopener noreferrer"&gt;CWE Top 25&lt;/a&gt;. I keep the full mapping documented internally — every finding has a CWE ID, a WSTG test reference, and a rationale.&lt;/p&gt;

&lt;h3&gt;
  
  
  What I detect that others can't
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Adaptive Backend Probing.&lt;/strong&gt; When my scanner detects backend credentials in your bundled JavaScript — Supabase URLs and anon keys, Firebase configs — it doesn't just flag "key exposed." It automatically tests whether that key can actually access your data. Are your database tables visible? Is Row Level Security properly configured? Can anonymous users read data they shouldn't?&lt;/p&gt;

&lt;p&gt;This transforms a finding from "your key is visible in the JS" into "your key allows reading the users table without authentication."&lt;/p&gt;
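&lt;p&gt;A hedged sketch of the idea (the endpoint shape follows Supabase's public PostgREST convention; &lt;code&gt;buildProbe&lt;/code&gt; and &lt;code&gt;classifyProbe&lt;/code&gt; are illustrative names, not my production code):&lt;/p&gt;

```javascript
// Sketch of adaptive backend probing for a Supabase key found in a
// JS bundle. The URL shape follows Supabase's PostgREST REST API;
// everything else is illustrative.
function buildProbe(supabaseUrl, anonKey, table) {
  return {
    url: supabaseUrl + '/rest/v1/' + table + '?select=*',
    headers: { apikey: anonKey, Authorization: 'Bearer ' + anonKey },
  };
}

function classifyProbe(status, rows) {
  if (status === 200) {
    if (rows.length > 0) {
      return 'data-readable';      // RLS missing or misconfigured
    }
    return 'table-visible';        // schema leaks, but rows are blocked
  }
  if (status === 401 || status === 403) {
    return 'rls-enforced';         // key is exposed but access is denied
  }
  return 'inconclusive';
}
```

&lt;p&gt;The classification is what turns "key visible" into a finding with a concrete severity.&lt;/p&gt;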

&lt;p&gt;&lt;strong&gt;SPA-Aware Scanning.&lt;/strong&gt; I detect Single Page Application routing and adapt all checks accordingly. Sites on Netlify, Vercel, and Cloudflare Pages no longer get flagged for sensitive files and API endpoints that are actually just the SPA shell returning &lt;code&gt;200 OK&lt;/code&gt; for everything.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'm honest about
&lt;/h2&gt;

&lt;h3&gt;
  
  
  This score is not a risk prediction
&lt;/h3&gt;

&lt;p&gt;My score measures your &lt;strong&gt;observable security posture from the outside&lt;/strong&gt;. It doesn't predict whether you'll be breached. A site with a perfect score can have SQL injection in its login form — I can't see that without access to your code.&lt;/p&gt;

&lt;p&gt;Think of it as a health checkup, not a diagnosis. It tells you what's visible and what to fix first.&lt;/p&gt;

&lt;h3&gt;
  
  
  I measure more than Observatory — and that creates divergence
&lt;/h3&gt;

&lt;p&gt;Mozilla Observatory runs 10 checks, nearly all of them security headers. I run 50+. My scores won't always match Observatory — and that's intentional. A site with perfect headers but an exposed &lt;code&gt;.env&lt;/code&gt; file gets A+ from Observatory and a much lower score from me. I think that's the right behavior.&lt;/p&gt;

&lt;h3&gt;
  
  
  Missing mitigations ≠ vulnerabilities
&lt;/h3&gt;

&lt;p&gt;This is worth repeating: a missing CSP header does not make your site vulnerable to XSS. It removes a layer of defense. I score it accordingly — as Low, not Critical.&lt;/p&gt;

&lt;p&gt;If you see a Low finding for a missing header, it means "this would strengthen your security posture" — not "you're being hacked right now."&lt;/p&gt;




&lt;h2&gt;
  
  
  What's next: contextual scoring
&lt;/h2&gt;

&lt;p&gt;Right now, every missing CSP gets the same severity. But a missing CSP on a static portfolio with zero JavaScript is effectively Informational — there's nothing to protect against. The same missing CSP on a site loading Google Tag Manager, Stripe.js, and Intercom has real consequences — those third-party scripts are exactly the attack surface that CSP is designed to control.&lt;/p&gt;

&lt;p&gt;I already detect your tech stack, your third-party scripts, your framework. The next step is using that context to adjust severity dynamically:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;0 third-party scripts, no inline JS → CSP missing is &lt;strong&gt;Info&lt;/strong&gt; (nothing to protect)&lt;/li&gt;
&lt;li&gt;1-5 scripts, no inline → CSP missing is &lt;strong&gt;Low&lt;/strong&gt; (limited surface)&lt;/li&gt;
&lt;li&gt;6+ third-party scripts or inline JS → CSP missing is &lt;strong&gt;Medium&lt;/strong&gt; (real attack surface)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;eval()&lt;/code&gt; or &lt;code&gt;dangerouslySetInnerHTML&lt;/code&gt; with user input → CSP missing is &lt;strong&gt;High&lt;/strong&gt; (active risk)&lt;/li&gt;
&lt;/ul&gt;
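&lt;p&gt;As a sketch, the tiers above reduce to a small lookup. The function name and inputs are hypothetical; the real version would feed in the detected tech stack and script inventory:&lt;/p&gt;

```python
# Sketch of the contextual CSP severity tiers from the list above.
# Thresholds mirror the bullet points; names are illustrative only.
def missing_csp_severity(third_party_scripts: int,
                         has_inline_js: bool,
                         has_dangerous_sinks: bool) -> str:
    """Map observed attack surface to a severity for a missing CSP."""
    if has_dangerous_sinks:  # eval()/dangerouslySetInnerHTML with user input
        return "High"
    if third_party_scripts >= 6 or has_inline_js:
        return "Medium"
    if third_party_scripts >= 1:
        return "Low"
    return "Info"  # static site, nothing for a CSP to protect

print(missing_csp_severity(0, False, False))  # Info
print(missing_csp_severity(3, False, False))  # Low
print(missing_csp_severity(7, False, False))  # Medium
print(missing_csp_severity(2, False, True))   # High
```

&lt;p&gt;The interesting part isn't the lookup itself — it's gathering the inputs reliably, which is where the existing tech-stack detection comes in.&lt;/p&gt;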

&lt;p&gt;No scanner does this today. Observatory, SecurityHeaders, Qualys — they all score findings in isolation. I'm building toward a score that understands your actual attack surface.&lt;/p&gt;




&lt;h2&gt;
  
  
  Sources
&lt;/h2&gt;

&lt;p&gt;My methodology is built on published standards, not arbitrary choices:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://www.first.org/cvss/specification-document" rel="noopener noreferrer"&gt;CVSS v3.1 Specification&lt;/a&gt; — Severity ranges for individual findings (FIRST.org)&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://bugcrowd.com/vulnerability-rating-taxonomy" rel="noopener noreferrer"&gt;Bugcrowd Vulnerability Rating Taxonomy&lt;/a&gt; — P1-P5 severity from millions of real bug bounty submissions&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://owasp.org/www-project-web-security-testing-guide/" rel="noopener noreferrer"&gt;OWASP Web Security Testing Guide&lt;/a&gt; — Test categorization and methodology&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://cwe.mitre.org/top25/" rel="noopener noreferrer"&gt;CWE Top 25 2024&lt;/a&gt; — Common weakness scoring&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://developer.mozilla.org/en-US/observatory/docs/tests_and_scoring" rel="noopener noreferrer"&gt;Mozilla Observatory Scoring&lt;/a&gt; — Reference baseline for header grading&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/ssllabs/research/wiki/SSL-Server-Rating-Guide" rel="noopener noreferrer"&gt;Qualys SSL Labs Rating Guide&lt;/a&gt; — Grade cap methodology for SSL&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://www.oecd.org/en/publications/handbook-on-constructing-composite-indicators-methodology-and-user-guide_9789264043466-en.html" rel="noopener noreferrer"&gt;OECD/JRC Handbook on Composite Indicators&lt;/a&gt; — Composite scoring framework&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://ieeexplore.ieee.org/document/11129585/" rel="noopener noreferrer"&gt;Lepochat et al. (2025) "One Does Not Simply Score a Website"&lt;/a&gt; — IEEE critique of website scoring algorithms&lt;/li&gt;
&lt;li&gt;Tenable, Acunetix, Probely — Individual finding severity benchmarks (&lt;a href="https://www.tenable.com/plugins/was/112551" rel="noopener noreferrer"&gt;CSP&lt;/a&gt;, &lt;a href="https://www.tenable.com/plugins/was/98056" rel="noopener noreferrer"&gt;HSTS&lt;/a&gt;, &lt;a href="https://www.acunetix.com/vulnerabilities/web/javascript-source-map-detected/" rel="noopener noreferrer"&gt;Source Maps&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;AmIHackable scans your website from the outside — the same perspective an attacker has. It detects your technology stack, checks your security configuration, and gives you a prioritized list of what to fix. &lt;a href="https://amihackable.dev" rel="noopener noreferrer"&gt;Scan your site now.&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>webdev</category>
      <category>javascript</category>
      <category>beginners</category>
    </item>
  </channel>
</rss>
