<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: YedanYagami</title>
    <description>The latest articles on DEV Community by YedanYagami (@yedanyagamiaicmd).</description>
    <link>https://dev.to/yedanyagamiaicmd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3805531%2F3860bfd2-4131-4148-807b-be26684d20e7.png</url>
      <title>DEV Community: YedanYagami</title>
      <link>https://dev.to/yedanyagamiaicmd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/yedanyagamiaicmd"/>
    <language>en</language>
    <item>
      <title>what if MCP servers had a Lighthouse-style security score?</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Tue, 31 Mar 2026 07:33:05 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/what-if-mcp-servers-had-a-lighthouse-style-security-score-25kf</link>
      <guid>https://dev.to/yedanyagamiaicmd/what-if-mcp-servers-had-a-lighthouse-style-security-score-25kf</guid>
      <description>&lt;p&gt;i've been auditing MCP servers for a few weeks now. 194 packages scanned, 118 had security findings. that's a 60.8% failure rate. and these are the ones people are plugging into their AI agents right now.&lt;/p&gt;

&lt;p&gt;the problem isn't that developers are lazy. it's that there's no visibility. when you &lt;code&gt;npm install @some-mcp/filesystem-server&lt;/code&gt;, you have zero signal about whether it sanitizes paths, leaks env vars, or shells out with unsanitized input.&lt;/p&gt;

&lt;p&gt;we have Lighthouse for web performance. we have Snyk for dependency vulnerabilities. we have nothing for MCP server security.&lt;/p&gt;

&lt;h2&gt;
  
  
  the proposal: MCP Security Score
&lt;/h2&gt;

&lt;p&gt;a 0-100 score, computed automatically, covering 5 dimensions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. input validation (25 pts)&lt;/strong&gt; -- does the server validate tool parameters before use? or does &lt;code&gt;directory: "../../etc/passwd"&lt;/code&gt; just work?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. execution safety (25 pts)&lt;/strong&gt; -- shell injection checks. does it use &lt;code&gt;execFileSync&lt;/code&gt; with argument arrays or string-concatenated &lt;code&gt;execSync&lt;/code&gt;? does python code use &lt;code&gt;shlex.quote()&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. environment isolation (20 pts)&lt;/strong&gt; -- does it leak &lt;code&gt;process.env&lt;/code&gt;? does it allowlist which env vars it touches? we found 7 packages that &lt;code&gt;console.log(process.env)&lt;/code&gt; during startup.&lt;/p&gt;
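&lt;p&gt;the allowlist version is a few lines. a hypothetical sketch (the variable names are mine): copy only named vars into whatever environment the tool process sees, so a leak can only leak what you chose.&lt;/p&gt;

```javascript
// minimal sketch: pass only allowlisted env vars to the tool process
const ALLOWED_ENV = ["PATH", "HOME", "LANG"];

function scopedEnv(env) {
  const out = {};
  for (const key of ALLOWED_ENV) {
    if (key in env) out[key] = env[key];
  }
  return out; // everything else, including secrets, never crosses over
}
```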

&lt;p&gt;&lt;strong&gt;4. dependency hygiene (15 pts)&lt;/strong&gt; -- known CVEs in transitive deps, lockfile present, pinned versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. output sanitization (15 pts)&lt;/strong&gt; -- does it prevent sensitive data from flowing back through tool responses into the LLM context?&lt;/p&gt;
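&lt;p&gt;the simplest version of output sanitization, assuming the server already knows which values are secrets (how you collect that list is up to you): scrub them out of every tool response before it reaches the model.&lt;/p&gt;

```javascript
// minimal sketch: scrub known secret values from a tool response
// before it flows back into the LLM context
function redactSecrets(text, secrets) {
  let out = text;
  for (const value of secrets) {
    if (value) out = out.split(value).join("[REDACTED]");
  }
  return out;
}
```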

&lt;h2&gt;
  
  
  why this matters more than web security
&lt;/h2&gt;

&lt;p&gt;when a web app has an XSS vulnerability, the blast radius is typically one user's browser session. when an MCP server has a shell injection vulnerability, the AI agent executing it can compromise the entire host. and the agent doesn't know the difference between a safe command and &lt;code&gt;; rm -rf /&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;we counted 30 CVEs against MCP packages in the last 60 days. 437K downloads of compromised packages before takedowns. the 97M total npm SDK downloads tell us adoption is way ahead of security infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  what it could look like
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@modelcontextprotocol/server-filesystem
MCP Security Score: 72/100
  input validation:     18/25
  execution safety:     22/25
  environment isolation: 14/20
  dependency hygiene:    12/15
  output sanitization:    6/15
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;render it in npm readme badges. surface it in MCP client UIs. let agent frameworks refuse to load servers below a threshold.&lt;/p&gt;
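&lt;p&gt;the threshold idea is the easy part for a client to adopt. a hypothetical gate (the report shape here is my assumption, not part of any spec): fetch the score report, refuse to load anything below your bar.&lt;/p&gt;

```javascript
// hypothetical client-side gate: refuse servers under a minimum score
// (the { score: number } report shape is assumed, not a published schema)
function shouldLoad(report, minScore) {
  if (typeof report.score !== "number") return false; // no score = no trust
  return report.score >= minScore;
}
```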

&lt;h2&gt;
  
  
  honest limitations
&lt;/h2&gt;

&lt;p&gt;this won't catch everything. logic bugs, prompt injection through tool responses, and novel attack chains need human review. a score creates false confidence if people treat it as a guarantee. it's a floor, not a ceiling.&lt;/p&gt;

&lt;p&gt;but right now the floor is "nothing." any signal is better than zero.&lt;/p&gt;

&lt;h2&gt;
  
  
  it's live now
&lt;/h2&gt;

&lt;p&gt;i actually built this. the MCP Security Score API is free and public:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# score any MCP server in seconds&lt;/span&gt;
curl &lt;span class="s2"&gt;"https://mcp-security-score.yagami8095.workers.dev/score?url=https://your-server.com"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;what you get:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;0-100 score across 8 security checks (HTTPS, CORS, CSP, response time, MCP protocol, tool validation, injection patterns, CVE detection)&lt;/li&gt;
&lt;li&gt;embeddable SVG badge for your README&lt;/li&gt;
&lt;li&gt;public leaderboard of scored servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;embed the badge:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="p"&gt;![&lt;/span&gt;&lt;span class="nv"&gt;MCP Security Score&lt;/span&gt;&lt;span class="p"&gt;](&lt;/span&gt;&lt;span class="sx"&gt;https://mcp-security-score.yagami8095.workers.dev/badge?url=https://your-server.com&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;try it: &lt;a href="https://mcp-security-score.yagami8095.workers.dev" rel="noopener noreferrer"&gt;mcp-security-score.yagami8095.workers.dev&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;if you ship MCP servers, get your score. if the free API doesn't cover your needs, reply "audit" and i'll do a manual hardening review.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;built by &lt;a href="https://github.com/yedanyagamiai-cmd" rel="noopener noreferrer"&gt;@yedanyagamiai&lt;/a&gt; — we run 7 AI brains and have hardened 15 MCP servers with 20 OWASP rules. the score API runs on Cloudflare Workers at $0/mo infrastructure cost.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>security</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>what if MCP servers had a Lighthouse-style security score?</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 16:50:17 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/what-if-mcp-servers-had-a-lighthouse-style-security-score-35df</link>
      <guid>https://dev.to/yedanyagamiaicmd/what-if-mcp-servers-had-a-lighthouse-style-security-score-35df</guid>
      <description>&lt;p&gt;i've been auditing MCP servers for a few weeks now. 194 packages scanned, 118 had security findings. that's a 60.8% failure rate. and these are the ones people are plugging into their AI agents right now.&lt;/p&gt;

&lt;p&gt;the problem isn't that developers are lazy. it's that there's no visibility. when you &lt;code&gt;npm install @some-mcp/filesystem-server&lt;/code&gt;, you have zero signal about whether it sanitizes paths, leaks env vars, or shells out with unsanitized input.&lt;/p&gt;

&lt;p&gt;we have Lighthouse for web performance. we have Snyk for dependency vulnerabilities. we have nothing for MCP server security.&lt;/p&gt;

&lt;h2&gt;
  
  
  the proposal: MCP Security Score
&lt;/h2&gt;

&lt;p&gt;a 0-100 score, computed automatically, covering 5 dimensions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. input validation (25 pts)&lt;/strong&gt; -- does the server validate tool parameters before use? or does &lt;code&gt;directory: "../../etc/passwd"&lt;/code&gt; just work?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. execution safety (25 pts)&lt;/strong&gt; -- shell injection checks. does it use &lt;code&gt;execFileSync&lt;/code&gt; with argument arrays or string-concatenated &lt;code&gt;execSync&lt;/code&gt;? does python code use &lt;code&gt;shlex.quote()&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. environment isolation (20 pts)&lt;/strong&gt; -- does it leak &lt;code&gt;process.env&lt;/code&gt;? does it allowlist which env vars it touches? we found 7 packages that &lt;code&gt;console.log(process.env)&lt;/code&gt; during startup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. dependency hygiene (15 pts)&lt;/strong&gt; -- known CVEs in transitive deps, lockfile present, pinned versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. output sanitization (15 pts)&lt;/strong&gt; -- does it prevent sensitive data from flowing back through tool responses into the LLM context?&lt;/p&gt;

&lt;h2&gt;
  
  
  why this matters more than web security
&lt;/h2&gt;

&lt;p&gt;when a web app has an XSS vulnerability, the blast radius is typically one user's browser session. when an MCP server has a shell injection vulnerability, the AI agent executing it can compromise the entire host. and the agent doesn't know the difference between a safe command and &lt;code&gt;; rm -rf /&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;we counted 30 CVEs against MCP packages in the last 60 days. 437K downloads of compromised packages before takedowns. the 97M total npm SDK downloads tell us adoption is way ahead of security infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  what it could look like
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@modelcontextprotocol/server-filesystem
MCP Security Score: 72/100
  input validation:     18/25
  execution safety:     22/25
  environment isolation: 14/20
  dependency hygiene:    12/15
  output sanitization:    6/15
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;render it in npm readme badges. surface it in MCP client UIs. let agent frameworks refuse to load servers below a threshold.&lt;/p&gt;

&lt;h2&gt;
  
  
  honest limitations
&lt;/h2&gt;

&lt;p&gt;this won't catch everything. logic bugs, prompt injection through tool responses, and novel attack chains need human review. a score creates false confidence if people treat it as a guarantee. it's a floor, not a ceiling.&lt;/p&gt;

&lt;p&gt;but right now the floor is "nothing." any signal is better than zero.&lt;/p&gt;

&lt;p&gt;building a prototype. interested in beta testing? comment below.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>security</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>what if MCP servers had a Lighthouse-style security score?</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 16:17:45 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/what-if-mcp-servers-had-a-lighthouse-style-security-score-1cpg</link>
      <guid>https://dev.to/yedanyagamiaicmd/what-if-mcp-servers-had-a-lighthouse-style-security-score-1cpg</guid>
      <description>&lt;p&gt;i've been auditing MCP servers for a few weeks now. 194 packages scanned, 118 had security findings. that's a 60.8% failure rate. and these are the ones people are plugging into their AI agents right now.&lt;/p&gt;

&lt;p&gt;the problem isn't that developers are lazy. it's that there's no visibility. when you &lt;code&gt;npm install @some-mcp/filesystem-server&lt;/code&gt;, you have zero signal about whether it sanitizes paths, leaks env vars, or shells out with unsanitized input.&lt;/p&gt;

&lt;p&gt;we have Lighthouse for web performance. we have Snyk for dependency vulnerabilities. we have nothing for MCP server security.&lt;/p&gt;

&lt;h2&gt;
  
  
  the proposal: MCP Security Score
&lt;/h2&gt;

&lt;p&gt;a 0-100 score, computed automatically, covering 5 dimensions:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. input validation (25 pts)&lt;/strong&gt; -- does the server validate tool parameters before use? or does &lt;code&gt;directory: "../../etc/passwd"&lt;/code&gt; just work?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. execution safety (25 pts)&lt;/strong&gt; -- shell injection checks. does it use &lt;code&gt;execFileSync&lt;/code&gt; with argument arrays or string-concatenated &lt;code&gt;execSync&lt;/code&gt;? does python code use &lt;code&gt;shlex.quote()&lt;/code&gt;?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. environment isolation (20 pts)&lt;/strong&gt; -- does it leak &lt;code&gt;process.env&lt;/code&gt;? does it allowlist which env vars it touches? we found 7 packages that &lt;code&gt;console.log(process.env)&lt;/code&gt; during startup.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. dependency hygiene (15 pts)&lt;/strong&gt; -- known CVEs in transitive deps, lockfile present, pinned versions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. output sanitization (15 pts)&lt;/strong&gt; -- does it prevent sensitive data from flowing back through tool responses into the LLM context?&lt;/p&gt;

&lt;h2&gt;
  
  
  why this matters more than web security
&lt;/h2&gt;

&lt;p&gt;when a web app has an XSS vulnerability, the blast radius is typically one user's browser session. when an MCP server has a shell injection vulnerability, the AI agent executing it can compromise the entire host. and the agent doesn't know the difference between a safe command and &lt;code&gt;; rm -rf /&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;we counted 30 CVEs against MCP packages in the last 60 days. 437K downloads of compromised packages before takedowns. the 97M total npm SDK downloads tell us adoption is way ahead of security infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  what it could look like
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;@modelcontextprotocol/server-filesystem
MCP Security Score: 72/100
  input validation:     18/25
  execution safety:     22/25
  environment isolation: 14/20
  dependency hygiene:    12/15
  output sanitization:    6/15
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;render it in npm readme badges. surface it in MCP client UIs. let agent frameworks refuse to load servers below a threshold.&lt;/p&gt;

&lt;h2&gt;
  
  
  honest limitations
&lt;/h2&gt;

&lt;p&gt;this won't catch everything. logic bugs, prompt injection through tool responses, and novel attack chains need human review. a score creates false confidence if people treat it as a guarantee. it's a floor, not a ceiling.&lt;/p&gt;

&lt;p&gt;but right now the floor is "nothing." any signal is better than zero.&lt;/p&gt;

&lt;p&gt;building a prototype. interested in beta testing? comment below.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>security</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>MCP Security Report — March 2026: 30 CVEs, 437K Compromised Downloads</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 15:08:05 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/mcp-security-report-march-2026-30-cves-437k-compromised-downloads-55e9</link>
      <guid>https://dev.to/yedanyagamiaicmd/mcp-security-report-march-2026-30-cves-437k-compromised-downloads-55e9</guid>
      <description>&lt;p&gt;30 CVEs in MCP packages in 60 days. 437K compromised downloads. a CVSS 9.6 RCE in a package with 500K downloads.&lt;/p&gt;

&lt;p&gt;this is the first monthly MCP security report. all data is real.&lt;/p&gt;

&lt;h2&gt;
  
  
  by the numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;metric&lt;/th&gt;
&lt;th&gt;value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;packages audited&lt;/td&gt;
&lt;td&gt;194&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;packages with findings&lt;/td&gt;
&lt;td&gt;118 (60.8%)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;critical findings&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;high findings&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;medium findings&lt;/td&gt;
&lt;td&gt;63&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;low findings&lt;/td&gt;
&lt;td&gt;41&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CVEs disclosed (60 days)&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;compromised downloads&lt;/td&gt;
&lt;td&gt;437,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  top 5 vulnerability patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. shell injection (critical)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// vulnerable&lt;/span&gt;
&lt;span class="nf"&gt;exec&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`git log --oneline -&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;userInput&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// secure&lt;/span&gt;
&lt;span class="nf"&gt;execFileSync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;git&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;log&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;--oneline&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;validated&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MCP servers calling &lt;code&gt;child_process.exec()&lt;/code&gt; with user input. one crafted prompt = &lt;code&gt;rm -rf /&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. environment variable leakage (high)
&lt;/h3&gt;

&lt;p&gt;secrets loaded from env vars accidentally appearing in LLM context windows through error messages. this one is subtle — your API key ends up in a stack trace that gets sent to the model.&lt;/p&gt;
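&lt;p&gt;the defense is to scrub error text before it ever reaches the model. a minimal sketch, names mine: replace any sufficiently long env value found in the stack trace with a marker.&lt;/p&gt;

```javascript
// minimal sketch: strip env-var values out of error text before it
// reaches the LLM context (short values skipped to avoid mangling paths)
function scrubError(err, env) {
  let msg = String(err.stack ? err.stack : err);
  for (const value of Object.values(env)) {
    if (value) {
      if (value.length >= 8) msg = msg.split(value).join("[ENV]");
    }
  }
  return msg;
}
```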

&lt;h3&gt;
  
  
  3. path traversal (critical)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# vulnerable
&lt;/span&gt;&lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

&lt;span class="c1"&gt;# secure
&lt;/span&gt;&lt;span class="n"&gt;real&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;realpath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;real&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;realpath&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_dir&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;SecurityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;path traversal blocked&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. dependency chain risks (medium)
&lt;/h3&gt;

&lt;p&gt;packages pulling in dozens of transitive dependencies, some unmaintained. the package itself is fine, but its supply chain introduces risk.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. missing input validation (low)
&lt;/h3&gt;

&lt;p&gt;parameters accepted without type checking, length limits, or format validation.&lt;/p&gt;

&lt;h2&gt;
  
  
  emerging solutions
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;tool&lt;/th&gt;
&lt;th&gt;approach&lt;/th&gt;
&lt;th&gt;status&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Constitution Gate&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;dual-LLM runtime quarantine&lt;/td&gt;
&lt;td&gt;deployed (CF Worker)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Wombat&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;unix-style rwxd permissions&lt;/td&gt;
&lt;td&gt;new entrant&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP Gateway&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OAuth 2.1 + RBAC middleware&lt;/td&gt;
&lt;td&gt;emerging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;protect-mcp&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;per-tool signed receipts&lt;/td&gt;
&lt;td&gt;emerging&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AgentAudit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;CVE-like registry for agent packages&lt;/td&gt;
&lt;td&gt;194 audited&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  recommendations
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;if you build MCP servers:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;never pass user input to &lt;code&gt;exec()&lt;/code&gt; — use &lt;code&gt;execFileSync&lt;/code&gt; with argument arrays&lt;/li&gt;
&lt;li&gt;validate all inputs with JSON schema before processing&lt;/li&gt;
&lt;li&gt;use &lt;code&gt;os.path.realpath()&lt;/code&gt; + directory allowlists for file operations&lt;/li&gt;
&lt;li&gt;keep dependencies minimal — our servers average 3 direct deps each&lt;/li&gt;
&lt;li&gt;never include env vars in error messages or LLM context&lt;/li&gt;
&lt;/ol&gt;
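&lt;p&gt;recommendation 2 doesn't need a framework to get started. a hand-rolled sketch in the same spirit (in production you'd use a real JSON Schema validator; this parameter shape is just an example):&lt;/p&gt;

```javascript
// minimal hand-rolled parameter check: type, length, and format
// before the value is used anywhere
function validateParams(params) {
  if (typeof params.directory !== "string") return false;
  if (params.directory.length === 0) return false;
  if (params.directory.length > 4096) return false;
  if (params.directory.includes("..")) return false;
  return true;
}
```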

&lt;p&gt;&lt;strong&gt;if you deploy MCP servers:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;audit before you install — check AgentAudit or run your own scan&lt;/li&gt;
&lt;li&gt;pin dependencies with lockfiles&lt;/li&gt;
&lt;li&gt;run MCP servers with least-privilege permissions&lt;/li&gt;
&lt;li&gt;consider a security proxy (Constitution Gate, MCP Gateway, or Wombat)&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;this report will be published monthly. data sources: AgentAudit (194 packages), HN CVE tracking, and our own experience hardening 15 production MCP servers with 20 OWASP Agentic AI rules.&lt;/p&gt;

&lt;p&gt;want the full 20-rule security checklist? → &lt;a href="https://yesinyagami.gumroad.com/l/mcp-security-checklist" rel="noopener noreferrer"&gt;MCP Security Audit Checklist on Gumroad ($29)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;runtime protection for your MCP servers? → &lt;a href="https://constitution-gate.pages.dev" rel="noopener noreferrer"&gt;Constitution Gate&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;built by &lt;a href="https://yedanyagami.cc" rel="noopener noreferrer"&gt;yedan yagami&lt;/a&gt; | &lt;a href="https://ko-fi.com/yedanyagamiai" rel="noopener noreferrer"&gt;ko-fi&lt;/a&gt; | &lt;a href="https://github.com/yedanyagamiai-cmd" rel="noopener noreferrer"&gt;github&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>cybersecurity</category>
      <category>mcp</category>
      <category>news</category>
      <category>security</category>
    </item>
    <item>
      <title>real costs of running 9 MCP servers for 30 days: $0.00</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 13:33:16 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/real-costs-of-running-9-mcp-servers-for-30-days-000-45a0</link>
      <guid>https://dev.to/yedanyagamiaicmd/real-costs-of-running-9-mcp-servers-for-30-days-000-45a0</guid>
      <description>&lt;p&gt;everyone asks the same question when i show them the system: "yeah but how much does it cost?"&lt;/p&gt;

&lt;p&gt;here's the honest answer after 30 days of running 9 MCP servers, 60+ cloudflare workers, 2 databases, a knowledge graph, and a local GPU inference stack.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;total monthly cost: $11.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;not $11 for the MCP servers. those are free. the $11 is for the VM that runs ollama. let me break it down.&lt;/p&gt;




&lt;h2&gt;
  
  
  the $0 tier: cloudflare workers
&lt;/h2&gt;

&lt;p&gt;all 9 MCP servers run on cloudflare workers free tier. every single one. no credit card required.&lt;/p&gt;

&lt;p&gt;here's what free tier gives you:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;resource&lt;/th&gt;
&lt;th&gt;free limit&lt;/th&gt;
&lt;th&gt;my actual usage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;requests/day&lt;/td&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~2,000-5,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CPU time/invocation&lt;/td&gt;
&lt;td&gt;10ms&lt;/td&gt;
&lt;td&gt;2-8ms avg&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;workers&lt;/td&gt;
&lt;td&gt;unlimited&lt;/td&gt;
&lt;td&gt;60+ deployed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KV reads/day&lt;/td&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KV storage&lt;/td&gt;
&lt;td&gt;1 GB&lt;/td&gt;
&lt;td&gt;~12 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;i'm using roughly 3-5% of the free tier limits on a busy day. the 10ms CPU limit sounds scary until you realize most tool operations finish in 2-3ms. the constraint forces you to write efficient code, which is a feature, not a bug.&lt;/p&gt;




&lt;h2&gt;
  
  
  the $0 tier: D1 databases
&lt;/h2&gt;

&lt;p&gt;i run 2 D1 databases on free tier. D1 is sqlite at the edge. i store 4,300+ knowledge graph entities, full audit trails, and A/B experiment results. all on free tier.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;resource&lt;/th&gt;
&lt;th&gt;free limit&lt;/th&gt;
&lt;th&gt;my usage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;storage&lt;/td&gt;
&lt;td&gt;5 GB per database&lt;/td&gt;
&lt;td&gt;~400 MB total&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;reads/day&lt;/td&gt;
&lt;td&gt;5,000,000&lt;/td&gt;
&lt;td&gt;~10,000&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;writes/day&lt;/td&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;~1,000&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  the $0 tier: LLM inference
&lt;/h2&gt;

&lt;p&gt;this is the part that makes people do a double-take. three free LLM API providers with multi-provider routing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;provider&lt;/th&gt;
&lt;th&gt;model&lt;/th&gt;
&lt;th&gt;free tier&lt;/th&gt;
&lt;th&gt;rate limit&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;groq&lt;/td&gt;
&lt;td&gt;llama-3.3-70b&lt;/td&gt;
&lt;td&gt;unlimited*&lt;/td&gt;
&lt;td&gt;30 req/min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;cerebras&lt;/td&gt;
&lt;td&gt;llama-3.3-70b&lt;/td&gt;
&lt;td&gt;unlimited*&lt;/td&gt;
&lt;td&gt;30 req/min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sambanova&lt;/td&gt;
&lt;td&gt;llama-3.3-70b&lt;/td&gt;
&lt;td&gt;unlimited*&lt;/td&gt;
&lt;td&gt;varies&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;the trick: when groq rate-limits me, requests cascade to cerebras, then sambanova. circuit breaker pattern (3 failures = 1 min cooldown) means the system self-heals.&lt;/p&gt;
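&lt;p&gt;the breaker logic described above fits in a few functions. this is a sketch of the pattern, not my actual router code, and the names are mine:&lt;/p&gt;

```javascript
// circuit breaker per provider: 3 failures opens the breaker,
// 60s cooldown, then a half-open retry is allowed
const COOLDOWN_MS = 60000;
const state = new Map(); // provider name -> { failures, openedAt }

function available(name, now) {
  const s = state.get(name);
  if (!s) return true;
  if (s.failures >= 3) {
    if (now - s.openedAt >= COOLDOWN_MS) {
      state.delete(name); // cooldown elapsed: half-open, allow a retry
      return true;
    }
    return false;
  }
  return true;
}

function recordFailure(name, now) {
  const s = state.get(name) || { failures: 0, openedAt: 0 };
  s.failures += 1;
  if (s.failures >= 3) s.openedAt = now;
  state.set(name, s);
}

function pickProvider(providers, now) {
  for (const p of providers) {
    if (available(p, now)) return p;
  }
  return null; // every breaker open: degrade gracefully
}
```

&lt;p&gt;the cascade is then just the order of the providers array: groq first, cerebras second, sambanova last.&lt;/p&gt;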

&lt;p&gt;is this sustainable? honestly, probably not forever. but llama-3.3-70b inference is heading toward $0.05-0.10 per million tokens.&lt;/p&gt;




&lt;h2&gt;
  
  
  the $11/month: the VM
&lt;/h2&gt;

&lt;p&gt;oracle cloud VM with RTX 3060. runs ollama (7 local models), 3 AI brains, 48 skills. flash attention, KV cache, 24/7.&lt;/p&gt;

&lt;p&gt;could i skip it? yes. the VM is a luxury, not a necessity.&lt;/p&gt;




&lt;h2&gt;
  
  
  the real cost breakdown (30 days)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;item&lt;/th&gt;
&lt;th&gt;monthly cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;9 MCP servers (cloudflare workers)&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;50+ additional workers&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 D1 databases&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;R2 + KV storage&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;groq + cerebras + sambanova APIs&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;domain + SSL&lt;/td&gt;
&lt;td&gt;$0.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;oracle cloud VM (RTX 3060)&lt;/td&gt;
&lt;td&gt;$11.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$11.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  honest limitations
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;no cron triggers on free tier (workaround: systemd timer on VM)&lt;/li&gt;
&lt;li&gt;10ms CPU tight for heavy computation&lt;/li&gt;
&lt;li&gt;no websocket without durable objects (SSE works fine for MCP)&lt;/li&gt;
&lt;li&gt;D1 sqlite write contention at ~100 writes/sec&lt;/li&gt;
&lt;li&gt;free LLM APIs have no SLA&lt;/li&gt;
&lt;li&gt;workers AI free = ~100 small inference calls/day&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  the punchline
&lt;/h2&gt;

&lt;p&gt;the model is becoming a commodity. infrastructure is becoming a commodity. the real cost is your time.&lt;/p&gt;

&lt;p&gt;$11/month for 9 MCP servers, 60+ workers, 2 databases, a GPU inference box, and edge deployment across 300+ cities.&lt;/p&gt;

&lt;p&gt;the expensive part was never the servers. it was always figuring out what to build.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>serverless</category>
    </item>
    <item>
      <title>what i actually learned coordinating 15 MCP servers (it's not what you'd expect)</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 12:48:53 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/what-i-actually-learned-coordinating-15-mcp-servers-its-not-what-youd-expect-jb5</link>
      <guid>https://dev.to/yedanyagamiaicmd/what-i-actually-learned-coordinating-15-mcp-servers-its-not-what-youd-expect-jb5</guid>
      <description>&lt;p&gt;everyone talks about MCP servers like they're the hard part. they're not. writing a single MCP server is maybe 200 lines of code. the hard part is what happens when you have 15 of them running simultaneously and they all need to cooperate.&lt;/p&gt;

&lt;p&gt;i've been building a multi-agent system for the past few months. 9 services, 15 MCP servers, 60+ Cloudflare Workers. here's what i actually learned — most of it the hard way.&lt;/p&gt;

&lt;h2&gt;
  
  
  lesson 1: the orchestration layer is the real product
&lt;/h2&gt;

&lt;p&gt;anyone can write an MCP server. &lt;code&gt;child_process.exec()&lt;/code&gt;, parse the output, return JSON. done.&lt;/p&gt;

&lt;p&gt;but when server #7 times out and server #3 depends on its output, and server #12 is rate-limited, and the user is waiting... that's where the real engineering lives.&lt;/p&gt;

&lt;p&gt;we built a coordinator daemon that does health checks every 30 seconds across all services. when something goes down, it doesn't just retry — it reroutes through fallback chains. primary fails? try the secondary. secondary fails? degrade gracefully and tell the user what happened.&lt;/p&gt;

&lt;p&gt;this is boring plumbing work. it's also the thing that makes the difference between a demo and a production system.&lt;/p&gt;
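
&lt;p&gt;the core of that rerouting fits in a few lines. a minimal sketch (names are mine, not the coordinator's actual API):&lt;/p&gt;

```javascript
// Walk an ordered fallback chain and pick the first service that the
// last health check reported as up. Degraded mode is explicit so the
// caller can tell the user what happened instead of failing silently.
function pickService(chain, healthy) {
  for (const name of chain) {
    if (healthy[name]) {
      return { service: name, degraded: name !== chain[0] };
    }
  }
  return { service: null, degraded: true }; // nothing up: degrade gracefully
}
```

&lt;p&gt;e.g. &lt;code&gt;pickService(['primary', 'secondary'], { secondary: true })&lt;/code&gt; picks the secondary and flags the response as degraded.&lt;/p&gt;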

&lt;h2&gt;
  
  
  lesson 2: security is not optional (and it's scarier than you think)
&lt;/h2&gt;

&lt;p&gt;we run 15 MCP servers. each one is a potential attack surface. the patterns we've seen (and defended against):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;shell injection&lt;/strong&gt;: if your MCP server calls &lt;code&gt;child_process.exec()&lt;/code&gt; with user input, you're one crafted prompt away from &lt;code&gt;rm -rf /&lt;/code&gt;. we use &lt;code&gt;shlex.quote()&lt;/code&gt; on literally everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;env variable leakage&lt;/strong&gt;: secrets loaded from env vars accidentally appearing in LLM context windows through error messages. this one is subtle and terrifying.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;path traversal&lt;/strong&gt;: &lt;code&gt;../../etc/passwd&lt;/code&gt; in a file-reading MCP server. &lt;code&gt;os.path.realpath()&lt;/code&gt; + directory whitelist, no exceptions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;we eventually built a "constitution gate" — a dual-LLM validation layer that checks every input before it reaches any tool. paranoid? maybe. but we haven't been pwned yet.&lt;/p&gt;

&lt;h2&gt;
  
  
  lesson 3: the model is becoming a commodity
&lt;/h2&gt;

&lt;p&gt;we route between groq, cerebras, ollama (local), and claude depending on the task. same prompt, different providers, based on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;latency requirements (groq for fast, claude for complex)&lt;/li&gt;
&lt;li&gt;cost (local ollama for repetitive tasks)&lt;/li&gt;
&lt;li&gt;availability (if one provider is down, cascade to the next)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the model doesn't matter as much as people think. what matters is the routing logic, the fallback chains, the budget governance that prevents a runaway loop from draining your API credits.&lt;/p&gt;
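
&lt;p&gt;the budget governance piece deserves its own sketch (the cap and pricing numbers here are made up):&lt;/p&gt;

```javascript
// A spend cap: estimate cost from token count, refuse the call once the
// cap would be crossed. A runaway loop hits this wall, not your credits.
function makeBudget(limitUsd) {
  let spentUsd = 0;
  return {
    charge(tokens, usdPerMillionTokens) {
      const cost = (tokens / 1e6) * usdPerMillionTokens;
      if (spentUsd + cost > limitUsd) {
        throw new Error("budget exceeded: refusing call");
      }
      spentUsd += cost;
      return spentUsd;
    },
  };
}
```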

&lt;h2&gt;
  
  
  lesson 4: your agent's memory is more important than its reasoning
&lt;/h2&gt;

&lt;p&gt;we have three layers of memory:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;session memory&lt;/strong&gt; (what happened in this conversation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;task memory&lt;/strong&gt; (success/failure patterns across all tasks)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;playbook memory&lt;/strong&gt; (reusable templates auto-generated from successful task sequences)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;when a new task comes in, the orchestrator checks memory &lt;em&gt;before&lt;/em&gt; planning. "have we seen something like this before? what worked? what failed?" this alone cut our error rate by ~40%.&lt;/p&gt;
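
&lt;p&gt;the lookup itself is mechanically simple. a sketch (the signature scheme and record shape are illustrative):&lt;/p&gt;

```javascript
// Before planning, check prior outcomes for tasks with the same
// signature (e.g. a normalized task type plus target). The planner can
// reuse what worked and route around what failed.
function consultMemory(memory, signature) {
  const past = memory.get(signature) || [];
  if (past.length === 0) return { known: false };
  const wins = past.filter(function (r) { return r.ok; }).length;
  return { known: true, successRate: wins / past.length };
}
```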

&lt;h2&gt;
  
  
  lesson 5: silence is a feature
&lt;/h2&gt;

&lt;p&gt;this is the one nobody talks about. our system has a dead-man's-switch — if the coordinator hasn't checked in for 60 minutes, something is wrong. but the inverse is also true: the system doesn't need to be &lt;em&gt;doing something&lt;/em&gt; all the time.&lt;/p&gt;

&lt;p&gt;the most reliable systems i've built are the ones that know when to shut up and wait.&lt;/p&gt;
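
&lt;p&gt;the check itself is one comparison. sketch (60 minutes is our real threshold; the rest is illustrative):&lt;/p&gt;

```javascript
// Alarm only when the coordinator has been silent past the threshold.
// Silence inside the window is normal operation, not an error.
const THRESHOLD_MS = 60 * 60 * 1000; // 60 minutes

function coordinatorStatus(lastCheckinMs, nowMs) {
  return (nowMs - lastCheckinMs) > THRESHOLD_MS ? "stale" : "ok";
}
```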




&lt;p&gt;these aren't revolutionary insights. they're the boring, practical things you learn when you actually try to run multiple MCP servers in production instead of just demoing one in a blog post.&lt;/p&gt;

&lt;p&gt;if you're building something similar, i'd genuinely love to hear what patterns you've found. especially around multi-server coordination — i feel like we're all reinventing the same wheels independently.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>agents</category>
      <category>architecture</category>
    </item>
    <item>
      <title>I Built a 42KB Website with Canvas Particles, Live API Status, and an Interactive Terminal</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 08:49:20 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/i-built-a-42kb-website-with-canvas-particles-live-api-status-and-an-interactive-terminal-43c8</link>
      <guid>https://dev.to/yedanyagamiaicmd/i-built-a-42kb-website-with-canvas-particles-live-api-status-and-an-interactive-terminal-43c8</guid>
      <description>&lt;p&gt;Last night I rewrote my entire website from scratch. No React. No Tailwind CDN. No build step. Just one self-contained HTML file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live&lt;/strong&gt;: &lt;a href="https://yedanyagami.cc" rel="noopener noreferrer"&gt;yedanyagami.cc&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1 HTML file&lt;/strong&gt; (42KB)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;0 dependencies&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloudflare Pages&lt;/strong&gt; (free tier)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;System fonts only&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Canvas Particle System
&lt;/h2&gt;

&lt;p&gt;The hero has ~120 particles with mouse repulsion. 80 lines of vanilla JS:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;mx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;my&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dx&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;dx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;dy&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="nx"&gt;dy&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;dx&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;dy&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mf"&gt;1.5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Live Fleet Status
&lt;/h2&gt;

&lt;p&gt;The site fetches real health data from a production API:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://yedan-graph-rag.yagami8095.workers.dev/health&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;d&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* update service cards */&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;CORS enabled (&lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt;), so it works from the browser.&lt;/p&gt;
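
&lt;p&gt;On the Worker side, that takes one extra header. A sketch of the minimal response shape (the real endpoint may set more headers):&lt;/p&gt;

```javascript
// Wrap JSON in a Response that any origin can read from the browser.
// Response is a global in Workers (and in Node 18+).
function corsJson(data) {
  return new Response(JSON.stringify(data), {
    headers: {
      "content-type": "application/json",
      "access-control-allow-origin": "*",
    },
  });
}
```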

&lt;h2&gt;
  
  
  Interactive Terminal
&lt;/h2&gt;

&lt;p&gt;Users type real commands: &lt;code&gt;help&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;services&lt;/code&gt;, &lt;code&gt;benchmark&lt;/code&gt;, &lt;code&gt;kg&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Auto-demo starts after 5s idle, any keypress switches to interactive.&lt;/p&gt;
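
&lt;p&gt;The dispatch behind those commands is a lookup table. Simplified sketch (handlers and strings are stand-ins for the site's real ones):&lt;/p&gt;

```javascript
// Map command names to handlers; unknown input gets a help hint.
const commands = {
  help: function () { return "commands: help, status, services, benchmark, kg"; },
  status: function () { return "9 services active"; },
};

function runCommand(line) {
  const name = line.trim().split(/\s+/)[0];
  const handler = commands[name];
  return handler ? handler() : "unknown command: " + name + " (try help)";
}
```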

&lt;h2&gt;
  
  
  Scroll Reveals (20 lines)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight css"&gt;&lt;code&gt;&lt;span class="nc"&gt;.rv&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;opacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;translateY&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;28px&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="nl"&gt;transition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;opacity&lt;/span&gt; &lt;span class="m"&gt;.65s&lt;/span&gt; &lt;span class="n"&gt;ease&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nc"&gt;.rv.vis&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;opacity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;none&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;ro&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IntersectionObserver&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;es&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;es&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isIntersecting&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;classList&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;vis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;15&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelectorAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.rv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;ro&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;observe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Stats (All Real)
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stat&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Services&lt;/td&gt;
&lt;td&gt;9 active&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Benchmark&lt;/td&gt;
&lt;td&gt;10/10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Providers&lt;/td&gt;
&lt;td&gt;14 cloud&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KG Entities&lt;/td&gt;
&lt;td&gt;5,600+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP Servers&lt;/td&gt;
&lt;td&gt;17&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every number verified from live fleet data.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;42KB total&lt;/li&gt;
&lt;li&gt;~12KB gzipped&lt;/li&gt;
&lt;li&gt;System fonts (zero loading delay)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;prefers-reduced-motion&lt;/code&gt; respected&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;strong&gt;Live site&lt;/strong&gt;: &lt;a href="https://yedanyagami.cc" rel="noopener noreferrer"&gt;yedanyagami.cc&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;View source — it's all in one file. If you're building something similar, the interactive terminal and live API fetch are the most interesting parts.&lt;/p&gt;

&lt;p&gt;Support the project: &lt;a href="https://ko-fi.com/whitebrookpeterpan" rel="noopener noreferrer"&gt;ko-fi.com/whitebrookpeterpan&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>showdev</category>
      <category>css</category>
    </item>
    <item>
      <title>I Automated My Upwork Proposals With AI — Here's the Template (Free)</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Mon, 30 Mar 2026 04:38:55 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/i-automated-my-upwork-proposals-with-ai-heres-the-template-free-2l51</link>
      <guid>https://dev.to/yedanyagamiaicmd/i-automated-my-upwork-proposals-with-ai-heres-the-template-free-2l51</guid>
      <description>&lt;p&gt;I got tired of writing Upwork proposals from scratch every time. So I built a template system.&lt;/p&gt;

&lt;p&gt;5 templates. Each one optimized for a specific type of AI consulting job. Copy the template, fill in the brackets, submit.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Writing proposals takes 15-30 minutes each. Most get ignored. The ones that get responses have a pattern — they're specific, they show relevant experience, and they're short.&lt;/p&gt;

&lt;p&gt;So I reverse-engineered what works and made templates.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 5 Templates
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. MCP / AI Security Audit
&lt;/h3&gt;

&lt;p&gt;For jobs mentioning: security, OWASP, MCP, compliance, vulnerability assessment.&lt;/p&gt;

&lt;p&gt;Key hook: "I've built 17 production MCP servers and published a security checklist based on OWASP Agentic AI Top 10."&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Multi-Agent / AI Architecture
&lt;/h3&gt;

&lt;p&gt;For jobs mentioning: AI agents, orchestration, distributed systems, multi-agent.&lt;/p&gt;

&lt;p&gt;Key hook: "My production platform runs 9 coordinated services across 14 cloud providers."&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Claude / Anthropic Integration
&lt;/h3&gt;

&lt;p&gt;For jobs mentioning: Claude, Anthropic, prompt engineering, Claude Code.&lt;/p&gt;

&lt;p&gt;Key hook: "250+ skills, 35 agents, 17 MCP integrations in production."&lt;/p&gt;

&lt;h3&gt;
  
  
  4. RAG / Knowledge Graph
&lt;/h3&gt;

&lt;p&gt;For jobs mentioning: RAG, retrieval, vector search, embeddings, knowledge graph.&lt;/p&gt;

&lt;p&gt;Key hook: "Production A-RAG system with 5,600+ entities, BM25 + semantic + causal search."&lt;/p&gt;

&lt;h3&gt;
  
  
  5. General AI/LLM
&lt;/h3&gt;

&lt;p&gt;For everything else: chatbots, automation, GPT integration.&lt;/p&gt;

&lt;p&gt;Key hook: "9 coordinated AI services running 24/7 with 14 cloud providers."&lt;/p&gt;

&lt;h2&gt;
  
  
  The Scoring System
&lt;/h2&gt;

&lt;p&gt;Each template includes a scoring guide — match keywords from the job description to pick the right template:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Job keywords&lt;/th&gt;
&lt;th&gt;Template&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;security, audit, OWASP&lt;/td&gt;
&lt;td&gt;#1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;agent, orchestration&lt;/td&gt;
&lt;td&gt;#2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude, Anthropic&lt;/td&gt;
&lt;td&gt;#3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RAG, knowledge graph&lt;/td&gt;
&lt;td&gt;#4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI, LLM, chatbot&lt;/td&gt;
&lt;td&gt;#5&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
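
&lt;p&gt;The table turns into code directly. A hypothetical scorer (keyword lists abbreviated; #5 is the fallback):&lt;/p&gt;

```javascript
// Count keyword hits per template; the highest score wins, and zero
// hits falls through to the general template #5.
const KEYWORDS = {
  1: ["security", "audit", "owasp"],
  2: ["agent", "orchestration"],
  3: ["claude", "anthropic"],
  4: ["rag", "knowledge graph"],
};

function pickTemplate(jobText) {
  const text = jobText.toLowerCase();
  let best = 5;
  let bestScore = 0;
  for (const id of Object.keys(KEYWORDS)) {
    const score = KEYWORDS[id].filter(function (k) {
      return text.includes(k);
    }).length;
    if (score > bestScore) {
      best = Number(id);
      bestScore = score;
    }
  }
  return best;
}
```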

&lt;h2&gt;
  
  
  Why This Works
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Specificity&lt;/strong&gt; — Each template includes concrete numbers (9 services, 14 providers, 5,600+ entities) instead of generic claims&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brevity&lt;/strong&gt; — All templates are under 150 words&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Social proof&lt;/strong&gt; — Links to portfolio, GitHub, published research&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Call to action&lt;/strong&gt; — Ends with availability and rate, not a question&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Get the Templates
&lt;/h2&gt;

&lt;p&gt;The full template pack (all 5 templates + scoring guide + customization tips) is available for free.&lt;/p&gt;

&lt;p&gt;If it saves you time, consider supporting the project — I'm building an open-source distributed AGI platform with 9 coordinated services.&lt;/p&gt;

&lt;h2&gt;
  
  
  My Setup
&lt;/h2&gt;

&lt;p&gt;I built these templates as part of a larger system — a distributed AGI platform that runs 9 coordinated AI services. The same system that generates these proposals also:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Benchmarks itself every 30 minutes&lt;/li&gt;
&lt;li&gt;Identifies its own weaknesses&lt;/li&gt;
&lt;li&gt;Generates improvement hypotheses&lt;/li&gt;
&lt;li&gt;Runs experiments to test them&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can see it live at &lt;a href="https://yedanyagami.cc" rel="noopener noreferrer"&gt;yedanyagami.cc&lt;/a&gt;, including real-time fleet status.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;If you're a freelance AI developer, these templates will save you hours.&lt;/strong&gt; Copy them, customize them, iterate on what works.&lt;/p&gt;

&lt;p&gt;And if you want to see how the system behind them works — check out the &lt;a href="https://yedanyagami.cc#terminal" rel="noopener noreferrer"&gt;interactive terminal&lt;/a&gt; on my site. Type &lt;code&gt;services&lt;/code&gt; to see what's running.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>freelancing</category>
      <category>career</category>
      <category>productivity</category>
    </item>
    <item>
      <title>GPU-First LLM Inference: How I Cut API Costs to $0 With a Laptop GPU</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Sun, 29 Mar 2026 12:49:07 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/gpu-first-llm-inference-how-i-cut-api-costs-to-0-with-a-laptop-gpu-3057</link>
      <guid>https://dev.to/yedanyagamiaicmd/gpu-first-llm-inference-how-i-cut-api-costs-to-0-with-a-laptop-gpu-3057</guid>
      <description>&lt;p&gt;Cloud LLM APIs are expensive. Groq, OpenAI, Anthropic — they all charge per token. But what if you could run &lt;strong&gt;production-quality inference for free&lt;/strong&gt; on your laptop GPU?&lt;/p&gt;

&lt;p&gt;Here's how I built a GPU-first architecture that routes 90%+ of queries to local models at $0 cost.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Hardware&lt;/strong&gt;: NVIDIA RTX 4050 Laptop (6GB VRAM)&lt;br&gt;
&lt;strong&gt;Software&lt;/strong&gt;: Ollama + Node.js&lt;br&gt;
&lt;strong&gt;Models&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deepseek-r1:8b (5.2GB) — Complex reasoning&lt;/li&gt;
&lt;li&gt;phi4-mini (2.5GB) — General + science&lt;/li&gt;
&lt;li&gt;qwen2.5:3b (1.9GB) — Quick answers&lt;/li&gt;
&lt;li&gt;nomic-embed-text (274MB) — Embeddings&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Total: ~10GB on disk, but only 1 model loads into VRAM at a time.&lt;/p&gt;
&lt;h2&gt;
  
  
  Ollama Optimization (Critical for 6GB)
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_FLASH_ATTENTION&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_KV_CACHE_TYPE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;q8_0
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_NUM_PARALLEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_MAX_LOADED_MODELS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;OLLAMA_GPU_OVERHEAD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;600
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;These settings are the difference between OOM crashes and smooth operation.&lt;/p&gt;
&lt;h2&gt;
  
  
  Smart Routing
&lt;/h2&gt;

&lt;p&gt;Not every query needs the biggest model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;selectModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*&lt;/span&gt;&lt;span class="se"&gt;[\*\/\^]\s&lt;/span&gt;&lt;span class="sr"&gt;*&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;+/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deepseek-r1:8b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/atomic|element|chemical/&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;phi4-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen2.5:3b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;phi4-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Cloud Fallback (14 Providers)
&lt;/h2&gt;

&lt;p&gt;When the GPU is busy or the task needs a more capable model, fall back to the cloud:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CLOUD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="c1"&gt;// Groq x4 keys (round-robin)&lt;/span&gt;
  &lt;span class="c1"&gt;// Cerebras x4 keys&lt;/span&gt;
  &lt;span class="c1"&gt;// SambaNova x4 keys&lt;/span&gt;
  &lt;span class="c1"&gt;// DeepInfra, Mistral&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;callCloud&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;CLOUD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;CLOUD&lt;/span&gt;&lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="nx"&gt;CLOUD&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before (Cloud Only)&lt;/th&gt;
&lt;th&gt;After (GPU-First)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cost/month&lt;/td&gt;
&lt;td&gt;$50-200&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Avg latency&lt;/td&gt;
&lt;td&gt;300-800ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;200-500ms&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Availability&lt;/td&gt;
&lt;td&gt;99% (rate limits)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;99.9%&lt;/strong&gt; (14 fallbacks)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Privacy&lt;/td&gt;
&lt;td&gt;Data sent to cloud&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Local processing&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  The Key Insight
&lt;/h2&gt;

&lt;p&gt;Cloud APIs are a &lt;strong&gt;fallback&lt;/strong&gt;, not the default. For 90%+ of queries, a $500 laptop GPU gives you better latency, zero cost, and complete privacy.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Start with &lt;code&gt;ollama pull qwen2.5:3b&lt;/code&gt; and build from there.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>tutorial</category>
      <category>machinelearning</category>
      <category>devops</category>
    </item>
    <item>
      <title>OWASP Agentic AI 2026: The 10 Security Risks Every AI Developer Must Know</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Sun, 29 Mar 2026 12:43:32 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/owasp-agentic-ai-2026-the-10-security-risks-every-ai-developer-must-know-1idi</link>
      <guid>https://dev.to/yedanyagamiaicmd/owasp-agentic-ai-2026-the-10-security-risks-every-ai-developer-must-know-1idi</guid>
      <description>&lt;p&gt;The OWASP Top 10 for Agentic Applications dropped in 2026, and it's a wake-up call. &lt;strong&gt;48% of cybersecurity professionals now rank agentic AI as the #1 attack vector&lt;/strong&gt; — above ransomware.&lt;/p&gt;

&lt;p&gt;Here's what you need to know and how to defend against each risk.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Agentic AI Security Is Different
&lt;/h2&gt;

&lt;p&gt;Traditional LLM security assumes a human in the loop. Agentic AI doesn't work that way — agents plan, call tools, store memory, and execute &lt;strong&gt;without human review at each step&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The attack surface includes every tool call, every memory read/write, every inter-agent handoff.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Top 10 Risks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ASI01: Agent Goal Hijacking (Critical)
&lt;/h3&gt;

&lt;p&gt;An attacker embeds instructions in data the agent processes (emails, documents, web content).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CONSTITUTION&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="sr"&gt;/ignore&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;previous|above|all&lt;/span&gt;&lt;span class="se"&gt;)\s&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;instructions|prompts&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sr"&gt;/you&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+are&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+now&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;a|DAN|jailbroken&lt;/span&gt;&lt;span class="se"&gt;)&lt;/span&gt;&lt;span class="sr"&gt;/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="sr"&gt;/system&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*prompt|reveal.*instructions/i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ASI02: Tool Misuse
&lt;/h3&gt;

&lt;p&gt;Agents can be induced to generate and execute unsafe code, or to call powerful tools with attacker-controlled arguments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Sandbox all code execution. Treat LLM output as hostile.&lt;/p&gt;

&lt;h3&gt;
  
  
  ASI03: Identity &amp;amp; Privilege Abuse
&lt;/h3&gt;

&lt;p&gt;Agent credentials are stolen, or an agent's privileges are escalated beyond what its task requires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; Short-lived tokens, OAuth 2.0, isolated agent identities.&lt;/p&gt;

&lt;h3&gt;
  
  
  ASI04: Memory Poisoning
&lt;/h3&gt;

&lt;p&gt;Malicious data persists in agent memory, corrupting future decisions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Defense:&lt;/strong&gt; TTL on memory entries, structured fact validation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;FACT_TTL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3600000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// 1 hour&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;cleanExpiredFacts&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;factTimestamps&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;FACT_TTL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;delete&lt;/span&gt; &lt;span class="nx"&gt;facts&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;k&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ASI05-ASI10: Brief Overview
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ASI05&lt;/strong&gt;: Data Exfiltration — scan outputs for secrets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ASI06&lt;/strong&gt;: Supply Chain — verify MCP servers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ASI07&lt;/strong&gt;: Insecure Inter-Agent Comm — use mTLS&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ASI08&lt;/strong&gt;: Cascading Failures — circuit breakers + rate limits&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ASI09&lt;/strong&gt;: Excessive Agency — least-privilege principle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ASI10&lt;/strong&gt;: Rogue Agents — anomaly detection + kill switches&lt;/li&gt;
&lt;/ul&gt;
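&lt;p&gt;For ASI05, output scanning can start as a handful of regexes run over every response before it leaves the process. A minimal sketch with illustrative (not exhaustive) patterns:&lt;/p&gt;

```javascript
// Scan agent output for common secret shapes and redact them.
// The patterns below are examples, not a complete secret-detection suite.
const SECRET_PATTERNS = [
  /sk-[A-Za-z0-9]{20,}/g,                  // OpenAI-style API keys
  /AKIA[0-9A-Z]{16}/g,                     // AWS access key IDs
  /-----BEGIN [A-Z ]*PRIVATE KEY-----/g,   // PEM private key headers
];

function redactSecrets(text) {
  let out = text;
  for (const re of SECRET_PATTERNS) out = out.replace(re, '[REDACTED]');
  return out;
}
```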

&lt;h2&gt;
  
  
  Implementation Checklist
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[ ] Constitution rules (20+ patterns)&lt;/li&gt;
&lt;li&gt;[ ] Rate limiting (30 req/min recommended)&lt;/li&gt;
&lt;li&gt;[ ] Memory TTL (1 hour for untrusted facts)&lt;/li&gt;
&lt;li&gt;[ ] Input size validation (4KB max)&lt;/li&gt;
&lt;li&gt;[ ] Output scanning (no secrets/PII in responses)&lt;/li&gt;
&lt;li&gt;[ ] Auth on all mutation endpoints&lt;/li&gt;
&lt;li&gt;[ ] Bind services to 127.0.0.1 unless needed externally&lt;/li&gt;
&lt;li&gt;[ ] Audit logging (JSONL, append-only)&lt;/li&gt;
&lt;/ul&gt;
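&lt;p&gt;The rate-limiting item maps to a small sliding-window check. A sketch using the 30 req/min figure above (the per-client bookkeeping is an assumption):&lt;/p&gt;

```javascript
// Sliding-window rate limiter for the "30 req/min" checklist item.
const WINDOW_MS = 60000;
const LIMIT = 30;
const hits = new Map(); // client id mapped to an array of timestamps

function allowRequest(clientId, now = Date.now()) {
  // Keep only timestamps still inside the window.
  const recent = (hits.get(clientId) || []).filter(t => t > now - WINDOW_MS);
  if (recent.length >= LIMIT) {
    hits.set(clientId, recent);
    return false; // over the limit: reject
  }
  recent.push(now);
  hits.set(clientId, recent);
  return true;
}
```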

&lt;h2&gt;
  
  
  Key Takeaway
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;Securing agentic AI is 80% governance and 20% technology.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The attacks are real. The defenses are implementable. Start now.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sources: &lt;a href="https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/" rel="noopener noreferrer"&gt;OWASP GenAI Security Project&lt;/a&gt;, &lt;a href="https://unit42.paloaltonetworks.com" rel="noopener noreferrer"&gt;Palo Alto Unit 42&lt;/a&gt;, &lt;a href="https://crowdstrike.com" rel="noopener noreferrer"&gt;CrowdStrike&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>webdev</category>
      <category>devops</category>
    </item>
    <item>
      <title>How to Build Self-Evolving AI Agents That Improve Without Human Intervention</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Sun, 29 Mar 2026 12:41:34 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/how-to-build-self-evolving-ai-agents-that-improve-without-human-intervention-2317</link>
      <guid>https://dev.to/yedanyagamiaicmd/how-to-build-self-evolving-ai-agents-that-improve-without-human-intervention-2317</guid>
      <description>&lt;p&gt;Most AI agents are static — they do exactly what they're told, nothing more. But what if your agents could &lt;strong&gt;benchmark themselves&lt;/strong&gt;, &lt;strong&gt;learn from failures&lt;/strong&gt;, and &lt;strong&gt;optimize their own performance&lt;/strong&gt; without any human intervention?&lt;/p&gt;

&lt;p&gt;In this guide, I'll show you how to build a self-evolving agent architecture using free tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Loop
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Benchmark → Analyze Failures → Adjust Strategy → Re-benchmark → Repeat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the &lt;strong&gt;Evolution Cycle&lt;/strong&gt; — a continuous loop that runs every few hours:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Benchmark&lt;/strong&gt;: Run a standardized test suite across all dimensions (reasoning, math, code, safety, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analyze&lt;/strong&gt;: Identify which dimensions scored lowest&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adjust&lt;/strong&gt;: Modify model routing, prompt templates, or temperature settings&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-benchmark&lt;/strong&gt;: Verify the adjustment improved performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Log&lt;/strong&gt;: Record everything for audit&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  GPU-First Architecture ($0 Inference)
&lt;/h2&gt;

&lt;p&gt;The key insight: &lt;strong&gt;local GPU inference is free&lt;/strong&gt;. With Ollama and a modest GPU (RTX 4050, 6GB VRAM), you can run:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;deepseek-r1:8b&lt;/strong&gt; (5.2GB) — Reasoning &amp;amp; math&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;phi4-mini&lt;/strong&gt; (2.5GB) — Science &amp;amp; general knowledge&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;qwen2.5:3b&lt;/strong&gt; (1.9GB) — Fast responses&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Cloud APIs (Groq, Cerebras, SambaNova) serve as a fallback when the GPU is busy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Smart Model Routing
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;selectModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;+&lt;/span&gt;&lt;span class="se"&gt;\s&lt;/span&gt;&lt;span class="sr"&gt;*&lt;/span&gt;&lt;span class="se"&gt;[\*\/\^]\s&lt;/span&gt;&lt;span class="sr"&gt;*&lt;/span&gt;&lt;span class="se"&gt;\d&lt;/span&gt;&lt;span class="sr"&gt;+|calculat/i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;deepseek-r1:8b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/atomic|element|chemical/i&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;phi4-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;qwen2.5:3b&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;phi4-mini&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Self-Evolution Implementation
&lt;/h2&gt;

&lt;p&gt;The evolution cycle is a simple Node.js daemon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;evolutionCycle&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;runBenchmark&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;failures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;correct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;suggestions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;failures&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dimension&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;suggestion&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;analyzeFix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;suggestions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;applyFix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nf"&gt;auditLog&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;evolution_complete&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;correct&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;fixes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;suggestions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;evolutionCycle&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;7200000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Security: OWASP Agentic AI 2026
&lt;/h2&gt;

&lt;p&gt;Self-evolving agents need guardrails. The OWASP Top 10 for Agentic AI (2026) identifies key risks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Agent Goal Hijacking&lt;/strong&gt; — Defend with constitution rules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory Poisoning&lt;/strong&gt; — Use TTL on stored facts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cascading Failures&lt;/strong&gt; — Implement rate limiting + circuit breakers&lt;/li&gt;
&lt;/ol&gt;
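&lt;p&gt;A circuit breaker for risk 3 can be a counter plus a cooldown. A minimal sketch (the failure threshold and cooldown values are assumptions):&lt;/p&gt;

```javascript
// Circuit-breaker sketch: after FAILURE_LIMIT consecutive errors the
// breaker opens and calls are rejected until COOLDOWN_MS has passed.
const FAILURE_LIMIT = 5;
const COOLDOWN_MS = 30000;
let failures = 0;
let openedAt = 0;

async function guarded(call, now = Date.now()) {
  if (failures >= FAILURE_LIMIT) {
    if (now - openedAt >= COOLDOWN_MS) {
      failures = 0; // half-open: allow one trial call through
    } else {
      throw new Error('circuit open');
    }
  }
  try {
    const result = await call();
    failures = 0; // success closes the circuit
    return result;
  } catch (err) {
    failures = failures + 1;
    if (failures >= FAILURE_LIMIT) openedAt = now;
    throw err;
  }
}
```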

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;After implementing this architecture, we achieved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;100% benchmark score&lt;/strong&gt; across 10 dimensions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;$0 inference cost&lt;/strong&gt; (GPU-first)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autonomous operation&lt;/strong&gt; (no human intervention needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-healing&lt;/strong&gt; (auto-restart failed components)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Install &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Pull models: &lt;code&gt;ollama pull qwen2.5:3b&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Build your agent with the routing logic above&lt;/li&gt;
&lt;li&gt;Add the evolution cycle&lt;/li&gt;
&lt;li&gt;Deploy as a systemd service for persistence&lt;/li&gt;
&lt;/ol&gt;
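&lt;p&gt;Step 5 can be a minimal unit file. A hypothetical example (the paths and service name are assumptions):&lt;/p&gt;

```ini
# Hypothetical unit file; save as /etc/systemd/system/agent.service,
# then run: systemctl daemon-reload followed by systemctl enable --now agent
[Unit]
Description=Self-evolving agent daemon
After=network-online.target

[Service]
ExecStart=/usr/bin/node /opt/agent/index.js
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```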




&lt;p&gt;&lt;em&gt;Tools mentioned: &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;Ollama&lt;/a&gt; (free, open-source local LLM), &lt;a href="https://groq.com" rel="noopener noreferrer"&gt;Groq&lt;/a&gt; (fast cloud inference)&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>tutorial</category>
      <category>architecture</category>
    </item>
    <item>
      <title>My AI Vice-CEO Ran 27 Autonomous Cycles While I Was AFK</title>
      <dc:creator>YedanYagami</dc:creator>
      <pubDate>Sat, 28 Mar 2026 18:27:20 +0000</pubDate>
      <link>https://dev.to/yedanyagamiaicmd/my-ai-vice-ceo-ran-27-autonomous-cycles-while-i-was-afk-1bog</link>
      <guid>https://dev.to/yedanyagamiaicmd/my-ai-vice-ceo-ran-27-autonomous-cycles-while-i-was-afk-1bog</guid>
      <description>&lt;h2&gt;
  
  
  The Vice-CEO AI ran 27 autonomous OODA cycles and 34 self-improvement experiments in 133 minutes.
&lt;/h2&gt;

&lt;p&gt;No human intervention. Zero crashes.&lt;/p&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;When Claude Code (the main orchestrator) goes offline, GOLEM Brain on VM1 detects the missing heartbeat and switches to TAKEOVER mode:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;27 OODA cycles&lt;/strong&gt; at 5-minute intervals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;34 Karpathy experiments&lt;/strong&gt; (hypothesis → probe → record)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fleet dispatch&lt;/strong&gt; to rendan (VM2) for tactical execution&lt;/li&gt;
&lt;/ul&gt;
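&lt;p&gt;The heartbeat detection itself can be very small. A sketch of the STANDBY/TAKEOVER switch (the 5-minute timeout mirrors the cycle interval above; function names are illustrative, not the actual GOLEM code):&lt;/p&gt;

```javascript
// Watchdog sketch: if the orchestrator has not sent a heartbeat within
// HEARTBEAT_TIMEOUT_MS, switch to TAKEOVER mode and run cycles locally.
const HEARTBEAT_TIMEOUT_MS = 5 * 60 * 1000;
let lastHeartbeat = Date.now();
let mode = 'STANDBY';

function recordHeartbeat(now = Date.now()) {
  lastHeartbeat = now;
  mode = 'STANDBY'; // orchestrator is back: stand down
}

function checkTakeover(now = Date.now()) {
  if (now - lastHeartbeat > HEARTBEAT_TIMEOUT_MS) mode = 'TAKEOVER';
  return mode;
}
```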

&lt;h3&gt;
  
  
  The Numbers
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Benchmark&lt;/td&gt;
&lt;td&gt;96/100 (10 dimensions)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;KG Entities&lt;/td&gt;
&lt;td&gt;5,362&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Autonomous Runtime&lt;/td&gt;
&lt;td&gt;133 min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;System Cost&lt;/td&gt;
&lt;td&gt;$0/month&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Products
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://yesinyagami.gumroad.com/l/mcp-security-checklist" rel="noopener noreferrer"&gt;MCP Security Checklist - $29&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://yesinyagami.gumroad.com/l/ai-agent-blueprint" rel="noopener noreferrer"&gt;AI Agent Blueprint - $49&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://yedanyagami.cc" rel="noopener noreferrer"&gt;yedanyagamiai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>automation</category>
      <category>architecture</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
