<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Peter Kacerik</title>
    <description>The latest articles on DEV Community by Peter Kacerik (@peterkacerik).</description>
    <link>https://dev.to/peterkacerik</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3853619%2F2453e083-febb-4cbc-9bdf-7a9d4b497652.jpg</url>
      <title>DEV Community: Peter Kacerik</title>
      <link>https://dev.to/peterkacerik</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/peterkacerik"/>
    <language>en</language>
    <item>
      <title>The LiteLLM Supply Chain Attack Changed How We Think About AI Cost Monitoring</title>
      <dc:creator>Peter Kacerik</dc:creator>
      <pubDate>Tue, 31 Mar 2026 14:02:19 +0000</pubDate>
      <link>https://dev.to/peterkacerik/the-litellm-supply-chain-attack-changed-how-we-think-about-ai-cost-monitoring-59mm</link>
      <guid>https://dev.to/peterkacerik/the-litellm-supply-chain-attack-changed-how-we-think-about-ai-cost-monitoring-59mm</guid>
      <description>&lt;p&gt;On March 24, 2026, malicious LiteLLM packages (v1.82.7, v1.82.8) were published to PyPI after attackers compromised LiteLLM's CI/CD pipeline via a poisoned GitHub Action. The packages contained credential stealers that exfiltrated SSH keys, cloud provider sessions, and Terraform state. They were live for ~3 hours before PyPI quarantined them.&lt;/p&gt;

&lt;p&gt;LiteLLM is present in 36% of all cloud environments. The blast radius was massive.&lt;/p&gt;

&lt;h2&gt;Why This Matters for AI Cost Monitoring&lt;/h2&gt;

&lt;p&gt;Most AI cost tracking tools use one of two approaches:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Gateway/Proxy&lt;/strong&gt; — Route all your AI API calls through a third-party proxy (Helicone, Portkey, LiteLLM). The proxy logs costs, tokens, and latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Passive SDK&lt;/strong&gt; — A lightweight SDK that sends metadata (model name, token count, cost, tags) to a tracking service. API calls go directly to OpenAI/Anthropic — the SDK never sits in the request path.&lt;/p&gt;

&lt;p&gt;The LiteLLM breach exposed a fundamental risk with approach #1: &lt;strong&gt;any tool in the request path can be compromised&lt;/strong&gt;. A gateway handles your API keys, sees your prompts, and processes every request. A compromised version can steal everything.&lt;/p&gt;
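&lt;p&gt;To make the trust surface concrete, here is a toy sketch of what a gateway process sees on every request. All names are hypothetical; this is an illustration of the pattern, not any specific proxy's code:&lt;/p&gt;

```python
# Toy illustration of the gateway trust surface (all names hypothetical).
# In the gateway pattern, the client sends its real API key and the full
# prompt to the proxy, which forwards them to the provider.

def gateway_handle(request):
    # A compromised gateway build can log or exfiltrate both fields here
    # before forwarding, and the caller has no way to detect it.
    captured = {
        "api_key": request["authorization"],
        "prompt": request["messages"],
    }
    # ... forward to the real provider and return its response ...
    return {"status": "ok", "captured_by_gateway": sorted(captured)}

resp = gateway_handle({
    "authorization": "Bearer sk-REDACTED",  # placeholder, not a real key
    "messages": [{"role": "user", "content": "internal strategy doc ..."}],
})
print(resp["captured_by_gateway"])  # -> ['api_key', 'prompt']
```

&lt;p&gt;The point: nothing in this design limits what the proxy can capture. Its access is total by construction.&lt;/p&gt;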

&lt;h2&gt;The Passive SDK Alternative&lt;/h2&gt;

&lt;p&gt;With a passive SDK approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The SDK &lt;strong&gt;never handles your API keys&lt;/strong&gt; — calls go directly to the provider&lt;/li&gt;
&lt;li&gt;The SDK &lt;strong&gt;never sees your prompts&lt;/strong&gt; — only metadata (model, tokens, cost, tags)&lt;/li&gt;
&lt;li&gt;Even a compromised SDK version &lt;strong&gt;cannot intercept or steal credentials&lt;/strong&gt; — it architecturally lacks access&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero latency impact&lt;/strong&gt; — nothing sits between you and the provider&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No single point of failure&lt;/strong&gt; — if the SDK goes down, your AI features keep working&lt;/li&gt;
&lt;/ul&gt;
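&lt;p&gt;The properties above can be sketched in a few lines. This is a minimal, hypothetical example of the passive pattern, not AISpendGuard's actual SDK; the provider call is stubbed so the shape of the data flow is visible:&lt;/p&gt;

```python
# Minimal sketch of a passive-SDK cost tracker (hypothetical names).
# The provider call happens directly in your code; the tracker is invoked
# afterwards with numbers only -- never the API key, never the prompt text.

import time

class PassiveTracker:
    """Collects usage metadata; a real SDK would POST this to a service."""
    def __init__(self):
        self.events = []

    def record(self, *, model, prompt_tokens, completion_tokens, cost_usd, tags=None):
        # Only metadata is stored: no prompt text, no credentials.
        self.events.append({
            "ts": time.time(),
            "model": model,
            "prompt_tokens": prompt_tokens,
            "completion_tokens": completion_tokens,
            "cost_usd": cost_usd,
            "tags": tags or [],
        })

tracker = PassiveTracker()

# Stand-in for a direct provider call your app already makes.
def fake_provider_call(prompt):
    return {"model": "gpt-4o", "usage": {"prompt_tokens": 12, "completion_tokens": 48}}

resp = fake_provider_call("summarize this document")
usage = resp["usage"]
tracker.record(
    model=resp["model"],
    prompt_tokens=usage["prompt_tokens"],
    completion_tokens=usage["completion_tokens"],
    cost_usd=0.0007,  # a real SDK would compute this from a local price table
    tags=["summarization"],
)

print(tracker.events[0]["model"])  # -> gpt-4o
```

&lt;p&gt;Even if &lt;code&gt;PassiveTracker&lt;/code&gt; shipped a malicious update, the worst it could touch is the metadata handed to &lt;code&gt;record()&lt;/code&gt; — the key and prompt never pass through it.&lt;/p&gt;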

&lt;p&gt;This isn't a theoretical advantage. After March 24, it's a practical security consideration.&lt;/p&gt;

&lt;h2&gt;What to Look For in Your Stack&lt;/h2&gt;

&lt;p&gt;If you're evaluating AI cost monitoring tools, ask:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Does it sit in my request path?&lt;/strong&gt; If yes, it's a supply chain attack surface.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Does it handle my API keys?&lt;/strong&gt; If yes, a breach means key theft.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Does it store my prompts?&lt;/strong&gt; If yes, a breach means data exfiltration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What happens if it goes down?&lt;/strong&gt; If your AI features break, that's a single point of failure.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Ideally, the answer to the first three is "no," and the answer to the fourth is "nothing breaks."&lt;/p&gt;

&lt;h2&gt;Disclosure&lt;/h2&gt;

&lt;p&gt;I'm the founder of &lt;a href="https://aispendguard.com?utm_source=devto&amp;amp;utm_medium=blog&amp;amp;utm_campaign=breach-crosspost" rel="noopener noreferrer"&gt;AISpendGuard&lt;/a&gt;, which uses the passive SDK approach. We built it this way because we believe cost monitoring shouldn't require trusting a third party with your API keys or prompt data. Free tier: 50K events/mo, no credit card.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;What's your approach to AI cost monitoring? Have you evaluated the security implications of proxy vs passive architectures? I'd love to hear how other teams are thinking about this.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>llm</category>
      <category>python</category>
    </item>
  </channel>
</rss>
