<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: onfafanutifafa</title>
    <description>The latest articles on DEV Community by onfafanutifafa (@onfafanutifafa).</description>
    <link>https://dev.to/onfafanutifafa</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3847460%2F3573a7ee-31f0-4f9f-9183-66269f09a332.png</url>
      <title>DEV Community: onfafanutifafa</title>
      <link>https://dev.to/onfafanutifafa</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/onfafanutifafa"/>
    <language>en</language>
    <item>
      <title>I benchmarked Python AI-app security scanners. Here's what each catches.</title>
      <dc:creator>onfafanutifafa</dc:creator>
      <pubDate>Fri, 05 Jun 2026 13:12:00 +0000</pubDate>
      <link>https://dev.to/onfafanutifafa/i-benchmarked-python-ai-app-security-scanners-heres-what-each-catches-49je</link>
      <guid>https://dev.to/onfafanutifafa/i-benchmarked-python-ai-app-security-scanners-heres-what-each-catches-49je</guid>
      <description>&lt;p&gt;This week I shipped Python AI-app regex prefilters in getdebug 0.4.0 and benchmarked them against Bandit and Semgrep on real Python code. Here are the numbers and what each tool actually catches.&lt;/p&gt;

&lt;h2&gt;The four tools&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Bandit&lt;/strong&gt; (PyCQA) — the Python-OSS standard security linter. Hand-written rules, free, fast, Python only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semgrep&lt;/strong&gt; — multi-language SAST with community rule packs. Hand-written rules, free, fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;vulnhuntr&lt;/strong&gt; (Protect AI, open source) — the stated category leader for LLM-driven AI-app static analysis. Python only.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;getdebug&lt;/strong&gt; — pattern-based regex prefilters in JS/TS + Python (new in 0.4.0). Plus optional local-LLM SAST via Ollama (free, on-device) and hosted (paid).&lt;/p&gt;

&lt;h2&gt;Test 1: paired vulnerable/safe fixtures&lt;/h2&gt;

&lt;p&gt;10 hand-written Python fixtures, 5 vulnerable + 5 safe, one pair per AI-app category (pii-in-prompt, unsafe-role-merge, prompt-injection, unbounded-stream, unsafe-tool-output).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tool        TP  FP  FN   Precision  Recall
getdebug     5   0   0    100%       100%
bandit       1   1   4    50%        20%
semgrep      1   1   4    50%        20%
vulnhuntr    —   —   —    (unable to complete; see below)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
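
&lt;p&gt;The Precision and Recall columns follow directly from the TP/FP/FN counts in the table. A minimal sketch of that arithmetic, where the function name &lt;code&gt;precision_recall&lt;/code&gt; is illustrative and not part of any of the benchmarked tools:&lt;/p&gt;

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from raw counts, guarding against empty denominators."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# (TP, FP, FN) per tool, copied from the Test 1 table.
results = {
    "getdebug": (5, 0, 0),
    "bandit": (1, 1, 4),
    "semgrep": (1, 1, 4),
}

for tool, counts in results.items():
    p, r = precision_recall(*counts)
    print(f"{tool:10s} precision={p:.0%} recall={r:.0%}")
```

&lt;p&gt;With only five vulnerable fixtures, a single extra hit or miss moves recall by 20 points, so the percentages are best read alongside the raw counts.&lt;/p&gt;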



&lt;p&gt;Bandit and Semgrep both catch the &lt;code&gt;unsafe-tool-output&lt;/code&gt; fixture via their generic &lt;code&gt;subprocess.run(shell=True)&lt;/code&gt; rules. That's a TP on the vulnerable variant. But they also fire on the &lt;strong&gt;safe&lt;/strong&gt; variant — the allowlist-then-run pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Safe pattern — Bandit + Semgrep both flag this as a FP
&lt;/span&gt;&lt;span class="n"&gt;ALLOWED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hosts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cat /etc/hosts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uptime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;uptime&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cmd&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ALLOWED&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tool_call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rejected&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;subprocess&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cmd&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shell&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;capture_output&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="n"&gt;stdout&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Neither tool knows &lt;code&gt;cmd&lt;/code&gt; came from a static dict, not the model. They see &lt;code&gt;shell=True&lt;/code&gt; and fire. getdebug's regex specifically requires the &lt;code&gt;tool_call.input.X&lt;/code&gt; / &lt;code&gt;block.input.X&lt;/code&gt; reference in the sink arg, so the allowlist-then-run pattern stays clean.&lt;/p&gt;
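
&lt;p&gt;To illustrate the idea, here is a rough sketch of a prefilter that only fires when a &lt;code&gt;tool_call.input.X&lt;/code&gt; or &lt;code&gt;block.input.X&lt;/code&gt; reference appears inside the &lt;code&gt;subprocess.run&lt;/code&gt; call. The pattern and the function name are assumptions made for this example; they are not getdebug's actual rule.&lt;/p&gt;

```python
import re

# Hypothetical pattern (not getdebug's real rule): match subprocess.run(...)
# with shell=True only when the arguments mention tool_call.input.X or block.input.X.
UNSAFE_TOOL_OUTPUT = re.compile(
    r"subprocess\.run\([^)]*\b(?:tool_call|block)\.input\.\w+[^)]*shell\s*=\s*True"
)


def flags_unsafe_tool_output(source):
    """Return True when model-controlled input appears in a shell=True sink."""
    return bool(UNSAFE_TOOL_OUTPUT.search(source))


vulnerable = "subprocess.run(tool_call.input.cmd, shell=True, capture_output=True)"
safe = "subprocess.run(cmd, shell=True, capture_output=True)"

print(flags_unsafe_tool_output(vulnerable))  # prints True
print(flags_unsafe_tool_output(safe))        # prints False
```

&lt;p&gt;A single-line regex like this cannot follow a value through intermediate variables, which is the usual trade-off of a prefilter compared with data-flow analysis.&lt;/p&gt;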

&lt;p&gt;Both tools miss the other four behavioural categories (pii-in-prompt, unsafe-role-merge, prompt-injection, unbounded-stream) entirely. The rule packs don't contain patterns for &lt;code&gt;{"role": "system", "content": f"...{name}..."}&lt;/code&gt;. That's the gap.&lt;/p&gt;

&lt;h2&gt;Test 2: real-world signal/noise&lt;/h2&gt;

&lt;p&gt;We ran all three (working) tools against &lt;a href="https://github.com/simonw/llm" rel="noopener noreferrer"&gt;simonw/llm&lt;/a&gt;, Simon Willison's clean CLI for LLMs, 48 Python files.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Tool        Total findings    Signal
bandit      1,189            1,158 are 'assert_used' (pytest);
                              zero AI-app coverage
semgrep     3                3 generic-SAST hits;
                              zero AI-app coverage
getdebug    6                6 AI-app findings: 1 prompt-injection,
                              5 unbounded-stream
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Bandit's 1,189 findings on 48 files are almost entirely the &lt;code&gt;assert_used&lt;/code&gt; warning on pytest assertions (1,158 of them), a well-known default that most real configs disable. Semgrep's 3 findings are real, but none is AI-app specific. getdebug's 6 all fall into AI-app categories.&lt;/p&gt;

&lt;h2&gt;About vulnhuntr&lt;/h2&gt;

&lt;p&gt;vulnhuntr is the stated category leader. We wanted a clean cross-check. We couldn't get one:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;--llm claude-code&lt;/code&gt; mode (no-API-key option) crashes with &lt;code&gt;ModuleNotFoundError&lt;/code&gt; in 1.2.2.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--llm gpt&lt;/code&gt; with &lt;code&gt;gpt-4o-mini&lt;/code&gt; fails Pydantic validation on the response.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;--llm gpt&lt;/code&gt; with &lt;code&gt;gpt-4o&lt;/code&gt; hits OpenAI's default 30K TPM rate limit on small accounts.&lt;/li&gt;
&lt;li&gt;Default file-selection heuristic identifies "network-exposed" entry points — simonw/llm is a CLI, so vulnhuntr selected zero files to analyze.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We'll re-benchmark when its 2026 stack stabilises.&lt;/p&gt;

&lt;h2&gt;What this means for you&lt;/h2&gt;

&lt;p&gt;If you ship Python code that calls an LLM, run all three. They're complementary:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;bandit &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="nb"&gt;.&lt;/span&gt;                              &lt;span class="c"&gt;# general Python hygiene&lt;/span&gt;
semgrep &lt;span class="nt"&gt;--config&lt;/span&gt; auto &lt;span class="nb"&gt;.&lt;/span&gt;                  &lt;span class="c"&gt;# cross-language SAST coverage&lt;/span&gt;
npx @getdebug/cli@0.4.0 analyze &lt;span class="nb"&gt;.&lt;/span&gt;       &lt;span class="c"&gt;# AI-app behavioural patterns&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;None of them subsumes the others. The first two cover general SAST; getdebug catches the "serialised the whole user object into the prompt" class of bug, for which you can't hand-write a sustainable rule in a generic SAST tool.&lt;/p&gt;

&lt;p&gt;Reproduce every number at &lt;a href="https://www.getdebug.dev/bench" rel="noopener noreferrer"&gt;getdebug.dev/bench&lt;/a&gt;. Corpus and methodology are open at &lt;a href="https://github.com/getdebug-ai/codesecbench" rel="noopener noreferrer"&gt;getdebug-ai/codesecbench&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Read the full post here: &lt;a href="https://www.getdebug.dev/blog/python-ai-app-prefilters" rel="noopener noreferrer"&gt;https://www.getdebug.dev/blog/python-ai-app-prefilters&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>security</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>We scanned 20 AI repos for leaked keys. Every scanner alert was a false positive.</title>
      <dc:creator>onfafanutifafa</dc:creator>
      <pubDate>Fri, 05 Jun 2026 01:49:05 +0000</pubDate>
      <link>https://dev.to/onfafanutifafa/we-scanned-20-ai-repos-for-leaked-keys-every-scanner-alert-was-a-false-positive-23i3</link>
      <guid>https://dev.to/onfafanutifafa/we-scanned-20-ai-repos-for-leaked-keys-every-scanner-alert-was-a-false-positive-23i3</guid>
      <description>&lt;p&gt;getdebug ships a secret scanner as part of its free tier — committed credentials are the one finding category we surface without an account, because the cost of a leaked key is high enough that even a 30-second check is worth running. So we did the obvious thing: we ran our scanner against 20 public AI-starter repos on GitHub, expecting to find some real leaks. The premise was that someone in a corpus of mid-popularity AI scaffolds must have committed a real OpenAI key.&lt;/p&gt;

&lt;p&gt;Every single scanner alert was a false positive.&lt;/p&gt;

&lt;h2&gt;The numbers&lt;/h2&gt;

&lt;p&gt;Across the 20-repo sweep, our scanner produced 12 alerts at critical severity. Zero of them were real credentials. Two repos accounted for all 12:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stackitcloud/rag-template: 7 scanner alerts, all false positives. Every hit was a placeholder value in a .env.template file (e.g. STACKIT_VLLM_API_KEY=your-stackit-vllm-api-key) or an import.meta.env.X env-var name read. None of them were real credentials.&lt;/li&gt;
&lt;li&gt;A popular Claude Code starter template: 5 scanner alerts, all false positives. Three were "Private key block" matches inside CHANGELOG.md and SNAPSHOT.md showing PEM-formatted example output. The other two were the funniest: PEM markers appearing in comments next to grep patterns and redaction regexes that exist to strip the same shape. Secret detectors tripping on secret-detector code.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;What we shipped because of it&lt;/h2&gt;

&lt;p&gt;A false-positive rate that high on a corpus this small is a real problem. So we read every hit, classified the failure modes, and shipped three detector rules into both the CLI (@getdebug/cli) and the hosted analyze worker:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Broader env-template matching. Any file whose path or extension matches .env.template, .env.example, or .env.sample, or that sits under a parent directory named examples/, is treated as a template by default. Findings inside still surface, but at info severity, not critical.&lt;/li&gt;
&lt;li&gt;Doc-context suppression. Hits inside fenced code blocks in markdown, or under headings like "Example output" / "Sample response", no longer trip critical severity. The detector still records them; they just don't page anyone.&lt;/li&gt;
&lt;li&gt;Env-var-read skip in entropy. The entropy-based detector now recognizes process.env.X, import.meta.env.X, and os.environ["X"] as identifier reads, not opaque high-entropy strings. The variable name being long and random-looking doesn't make the access a leak.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Re-running the same 20-repo sweep after these three rules landed gave an 83% reduction in critical false positives (12 down to 2). Two FPs remain, both in the same Claude Code starter template, and they need a fourth rule we haven't shipped yet (PEM-in-comment suppression, which requires cross-language comment parsing). We are upfront about that: it's an open detector gap, not a closed one.&lt;/p&gt;
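
&lt;p&gt;As a rough illustration of how the env-template and env-var-read rules could fit together, here is a small sketch. The names &lt;code&gt;is_template_path&lt;/code&gt; and &lt;code&gt;severity&lt;/code&gt;, the regex, and the return values are assumptions made for this example and do not reflect getdebug's actual implementation; doc-context suppression is left out for brevity.&lt;/p&gt;

```python
import re
from pathlib import PurePosixPath

# File-name endings treated as env templates (taken from the rule description).
TEMPLATE_SUFFIXES = (".env.template", ".env.example", ".env.sample")

# Hypothetical pattern for env-var reads: an identifier access, not a secret value.
ENV_READ = re.compile(
    r"""process\.env\.\w+|import\.meta\.env\.\w+|os\.environ\[["']\w+["']\]"""
)


def is_template_path(path):
    """True for env-template files or anything under a directory named examples."""
    p = PurePosixPath(path)
    return p.name.endswith(TEMPLATE_SUFFIXES) or "examples" in p.parts[:-1]


def severity(path, matched_text):
    """Return None (skip), 'info', or 'critical' for a candidate secret hit."""
    if ENV_READ.fullmatch(matched_text.strip()):
        return None  # an env-var read is not a leaked value
    if is_template_path(path):
        return "info"  # still recorded, but does not page anyone
    return "critical"


print(severity(".env.template", "STACKIT_VLLM_API_KEY=your-stackit-vllm-api-key"))
print(severity("src/app.ts", "import.meta.env.VITE_API_KEY"))

# The reported reduction: 12 critical false positives before, 2 after.
print(f"{(12 - 2) / 12:.0%}")
```

&lt;p&gt;Downgrading template hits to info instead of dropping them keeps a record in case a real key is ever pasted into a template file.&lt;/p&gt;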

&lt;h2&gt;Why publish this&lt;/h2&gt;

&lt;p&gt;Every security vendor's landing page claims a low false positive rate. Almost nobody shows the work. We'd rather be the team that publishes its scanner being wrong, ships fixes in public, and re-runs the numbers, because that's what we'd want from a tool we were thinking of buying.&lt;/p&gt;

&lt;p&gt;The corpus is reproducible. The methodology, the per-tool numbers, and the JSON output schema live under /bench. If you find a case the scanner gets wrong on your own code, the issue tracker on github.com/getdebug-ai/cli is the right place to put it. Detector tuning is still early, and the easiest way to improve it is to point us at the noise.&lt;/p&gt;

&lt;h2&gt;Try it&lt;/h2&gt;

&lt;p&gt;The secret-scanning detectors that came out of this sweep are in the free tier of the CLI. No account needed; nothing leaves your laptop.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# macOS / Linux
brew install getdebug-ai/tap/getdebug
getdebug analyze .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Full install instructions and the rest of the commands are in the docs.&lt;/p&gt;

&lt;p&gt;You can read it here: &lt;a href="https://www.getdebug.dev/blog/credibility-scan" rel="noopener noreferrer"&gt;https://www.getdebug.dev/blog/credibility-scan&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>security</category>
    </item>
    <item>
      <title>I used to guard buildings. Now I guard codebases.</title>
      <dc:creator>onfafanutifafa</dc:creator>
      <pubDate>Wed, 03 Jun 2026 23:25:41 +0000</pubDate>
      <link>https://dev.to/onfafanutifafa/i-used-to-guard-buildings-now-i-guard-codebases-22p3</link>
      <guid>https://dev.to/onfafanutifafa/i-used-to-guard-buildings-now-i-guard-codebases-22p3</guid>
      <description>&lt;p&gt;I come from a physical security space, mainly man-guarding and asset protection. I recently took the challenge to venture into information and cyber security. So far I can say the mentality for both is the same; they differ in technique but the outcomes are the same, in that both are primarily focused on asset or data privacy and protection. Offensive cybersecurity often happens between nation states, but that does not mean corporate entities or individuals do not indulge. They do so cautiously, because breaking into unauthorised networks and domains is a crime. More often than not, countries get away with it, but corporations and individuals face the sharp end of the sword. Offensive information and cyber security experts act under strict regulations and laws to safeguard the data and sovereignty of corporations and nations. This is to lay emphasis on the sameness of the core principles of both physical and information and cyber security, in that they are focused on protection rather than exploiting. &lt;/p&gt;

&lt;p&gt;The difference is in technique: the tools each field needs to work through a problem are different, but the goals are the same. Private security, like health care, only becomes top of mind when things go wrong. Research shows that businesses and people see security as critical to their business and to their brand, but fewer actually reach for their wallet.[1, 2] The price we pay for lacking security outstrips the immediate cost of buying it. This is why I am a security man. When I talk to my clients I always make the same point: security is a mindset shift. You can buy security, but you can never buy safety. Selling you security does not mean I can promise you will never be breached, because security by its very nature is not absolute. The systems you call safe are the same systems someone else walks through with ease. For example, when Anthropic launched Mythos[3, 4, 5], it uncovered tens of thousands of vulnerabilities[6, 7] in systems long assumed to be safe, including a twenty-seven-year-old flaw in OpenBSD, one of the most hardened operating systems in the world.[3, 7] Safe, until it was not. So I do not sell certainty. I am honest about my methods and honest about the odds, and that honesty is what earns a client’s confidence in how protected they really are. A client who understands the true odds is in a far stronger position than one who has been sold the impossible.&lt;/p&gt;

&lt;p&gt;Now more than ever, everything is moving to the web or onto a network, and the cost of moving has fallen drastically. Moving online is the no-brainer choice for any organisation that needs to store information and customer data. The great multiplier and enabler of this wind of change is artificial intelligence. AI is great at what it does: it generates working code faster than any team can review it. Having come to understand what it is, I use it with caution, and this is why I think network protection, and security in general, will see an uptick in growth in the coming years. Senior programmers have admitted they cannot keep up with AI-written codebases: the issues are rarely simple, and the sheer volume of code they have to comb through to find them is overwhelming.[8, 9, 10] Past a few hundred lines a review stops being a review and becomes a rubber stamp[11], so when a developer is handed ten thousand lines of clean, confident-looking AI code, the easy choice, the human choice, is to trust it and ship it as is.[8] The bug ships with it. This is why I strongly believe AI can be the key mediator here, narrowing down bugs so that devs and dev teams can navigate them successfully.&lt;/p&gt;

&lt;p&gt;This is why I created getdebug.dev. getdebug.dev is an AI-powered codebase analyser and auto-fixer. It works simply: you connect a codebase or repository from a version control platform like GitHub or GitLab, and getdebug indexes it. That index is what makes it possible to analyse the code and detect bugs, business-logic gaps, and broken access controls. And that last one is the whole point — broken access controls are a security failure, not just a coding one. This is where my two worlds meet. Whether you are guarding a building or guarding a codebase, the job is the same: find the gap before someone else does. getdebug is how I bring the protection mindset to the place everything is now moving — the code itself.&lt;/p&gt;

&lt;p&gt;Now, I am not the first person to think of this. There are good tools out there already doing code review and bug hunting, and some of them are very good. I know them. I did not build getdebug because the others are bad. I built it because they think like engineers and I think like a security man. To most of these tools a bug is a bug, one more item on a list to clean up. To me every bug is a door. Some doors lead to nothing, and others lead straight into the house. A broken access control is not a code quality problem, it is an unlocked door waiting for someone to walk through. I cannot see it any other way. So getdebug does not just ask “is this code clean,” it asks “where can someone get in.” That is the difference, and it is a difference in how I see the work, not just in features.&lt;/p&gt;

&lt;p&gt;Two things follow from that. The first is that getdebug is built for the new kind of software people are shipping now, the AI apps. The mistakes AI apps make are their own breed: prompt injection, leaking keys to the browser, trusting output they should never trust. Most tools catch these by accident, if at all. getdebug looks for them on purpose, because that is where the doors are being left open today.&lt;/p&gt;

&lt;p&gt;The second is privacy, and I mean a real choice, not a slogan. You can connect your repo and let getdebug work in the cloud, or you can run it entirely on your own machine where your code never leaves your hands. Some teams cannot let their code travel, and they should still be able to secure it. So I built both.&lt;/p&gt;

&lt;p&gt;But the part I care about most is that getdebug learns. When you tell it a flagged line is fine because you meant it that way, it remembers, and it stops bothering you about it. Good review tools do this for code style now. getdebug does it for security: it learns which doors you have deliberately left open and which it should keep watching, and it gets sharper at the difference the longer it guards your codebase. That is the part I am building everything else around. You can try it at &lt;a href="https://www.getdebug.dev/" rel="noopener noreferrer"&gt;https://www.getdebug.dev/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;References&lt;/p&gt;

&lt;p&gt;[1] Cybersecurity Dive. “Are businesses underinvesting in cybersecurity?”&lt;br&gt;
&lt;a href="https://www.cybersecuritydive.com/news/security-budgets-enterprise-CISO/595036/" rel="noopener noreferrer"&gt;https://www.cybersecuritydive.com/news/security-budgets-enterprise-CISO/595036/&lt;/a&gt;&lt;br&gt;
[2] Help Net Security. “Cybersecurity spending keeps rising, so why is business impact still hard to explain?” (Jan 15, 2026).&lt;br&gt;
&lt;a href="https://www.helpnetsecurity.com/2026/01/15/expel-cybersecurity-investment-decisions/" rel="noopener noreferrer"&gt;https://www.helpnetsecurity.com/2026/01/15/expel-cybersecurity-investment-decisions/&lt;/a&gt;&lt;br&gt;
[3] Anthropic. “Claude Mythos Preview” (primary source — Mythos launch and the 27-year-old OpenBSD SACK flaw).&lt;br&gt;
&lt;a href="https://red.anthropic.com/2026/mythos-preview/" rel="noopener noreferrer"&gt;https://red.anthropic.com/2026/mythos-preview/&lt;/a&gt;&lt;br&gt;
[4] Anthropic. “Project Glasswing: Securing critical software for the AI era.” &lt;a href="https://www.anthropic.com/glasswing" rel="noopener noreferrer"&gt;https://www.anthropic.com/glasswing&lt;/a&gt;&lt;br&gt;
[5] TechCrunch. “Anthropic scales Claude Mythos to critical infrastructure in 15+ countries” (Jun 2, 2026).&lt;br&gt;
&lt;a href="https://techcrunch.com/2026/06/02/anthropic-scales-claude-mythos-to-critical-infrastructure-in-15-countries/" rel="noopener noreferrer"&gt;https://techcrunch.com/2026/06/02/anthropic-scales-claude-mythos-to-critical-infrastructure-in-15-countries/&lt;/a&gt;&lt;br&gt;
[6] SecurityWeek. “Anthropic: Mythos Detected 23,000 Potential Vulnerabilities Across 1,000 OSS Projects.”&lt;br&gt;
&lt;a href="https://www.securityweek.com/anthropic-mythos-detected-23000-potential-vulnerabilities-across-1000-oss-projects/" rel="noopener noreferrer"&gt;https://www.securityweek.com/anthropic-mythos-detected-23000-potential-vulnerabilities-across-1000-oss-projects/&lt;/a&gt;&lt;br&gt;
[7] Crypto Briefing. “Anthropic’s Mythos detects 23,000 vulnerabilities in open-source projects, including a 27-year-old OpenBSD flaw.”&lt;br&gt;
&lt;a href="https://cryptobriefing.com/anthropic-mythos-open-source-vulnerabilities/" rel="noopener noreferrer"&gt;https://cryptobriefing.com/anthropic-mythos-open-source-vulnerabilities/&lt;/a&gt;&lt;br&gt;
[8] GitClear. “AI Copilot Code Quality: 2025 Research” (10M+ commits; code churn, copy/paste, the “illusion of correctness”).&lt;br&gt;
&lt;a href="https://www.gitclear.com/ai_assistant_code_quality_2025_research" rel="noopener noreferrer"&gt;https://www.gitclear.com/ai_assistant_code_quality_2025_research&lt;/a&gt;&lt;br&gt;
[9] The Register. “AI-authored code contains worse bugs than software crafted by humans” (Dec 17, 2025).&lt;br&gt;
&lt;a href="https://www.theregister.com/2025/12/17/ai_code_bugs/" rel="noopener noreferrer"&gt;https://www.theregister.com/2025/12/17/ai_code_bugs/&lt;/a&gt;&lt;br&gt;
[10] arXiv. “Human-Written vs. AI-Generated Code: A Large-Scale Study of Defects, Vulnerabilities, and Complexity” (2025).&lt;br&gt;
&lt;a href="https://arxiv.org/abs/2508.21634" rel="noopener noreferrer"&gt;https://arxiv.org/abs/2508.21634&lt;/a&gt;&lt;br&gt;
[11] Salesforce Engineering. “Scaling Code Reviews: Adapting to a Surge in AI-Generated Code” (on review degradation past a few hundred lines).&lt;br&gt;
&lt;a href="https://engineering.salesforce.com/scaling-code-reviews-adapting-to-a-surge-in-ai-generated-code/" rel="noopener noreferrer"&gt;https://engineering.salesforce.com/scaling-code-reviews-adapting-to-a-surge-in-ai-generated-code/&lt;/a&gt;&lt;/p&gt;

</description>
      <category>career</category>
      <category>cybersecurity</category>
      <category>infosec</category>
      <category>security</category>
    </item>
  </channel>
</rss>
