<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Alex LaGuardia</title>
    <description>The latest articles on DEV Community by Alex LaGuardia (@alexlaguardia).</description>
    <link>https://dev.to/alexlaguardia</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3832811%2Faba6514b-1248-4d62-a8ac-2568cd790b8f.jpeg</url>
      <title>DEV Community: Alex LaGuardia</title>
      <link>https://dev.to/alexlaguardia</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/alexlaguardia"/>
    <language>en</language>
    <item>
      <title>I broke my own governed MCP server by hand, then built the scanner that catches the class</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Sat, 27 Jun 2026 23:18:44 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/i-broke-my-own-governed-mcp-server-by-hand-then-built-the-scanner-that-catches-the-class-1ip7</link>
      <guid>https://dev.to/alexlaguardia/i-broke-my-own-governed-mcp-server-by-hand-then-built-the-scanner-that-catches-the-class-1ip7</guid>
      <description>&lt;p&gt;A few weeks back I shipped Warden, a governance layer that sits in front of an MCP server and enforces who can read what. Role-based, field-level. The demo had a &lt;code&gt;support&lt;/code&gt; role that could list customer accounts but never see their billing &lt;code&gt;tier&lt;/code&gt;. The &lt;code&gt;tier&lt;/code&gt; field is stripped from everything support gets back.&lt;/p&gt;

&lt;p&gt;I was poking at it the way you poke at your own work when you don't quite trust it. I tried this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nf"&gt;query_resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;accounts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tier&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Enterprise&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Six rows came back. Acme Corp, Initech, Umbrella, Hooli, Stark, Wayne. The support role can't &lt;em&gt;see&lt;/em&gt; the tier, but the query layer still accepted it as a filter. So you ask for every Enterprise account, and the ones that match tell you their tier by simply existing in the result. Redaction held on the output. It leaked through the input.&lt;/p&gt;

&lt;p&gt;That's the bug. It's small and it's boring and it's exactly the kind of thing that ships.&lt;/p&gt;

&lt;p&gt;Here's the part that bothered me more than the bug. I went and ran the MCP security scanners on it. The ones everyone uses now read the tool &lt;em&gt;manifest&lt;/em&gt;: they look at the tool descriptions, grep for poisoned instructions, flag suspicious-looking metadata. Good tools. They all came back green. They have to. There is nothing wrong with the manifest. The &lt;code&gt;query_resource&lt;/code&gt; tool description is honest. The bug only exists when the server runs and a real role makes a real call. A scanner that reads text can't reach it.&lt;/p&gt;

&lt;p&gt;So I built the thing that can. It's called Siege.&lt;/p&gt;

&lt;h2&gt;
  
  
  Run the server, don't read it
&lt;/h2&gt;

&lt;p&gt;Siege points at a live MCP server and behaves like an attacker against it, as real roles. No manifest grep. It connects as each identity you give it, and it diffs what comes back.&lt;/p&gt;

&lt;p&gt;The wedge is runtime authorization. Static scanners own static tool-poisoning and they're fine at it; I'm not going to out-grep them. What nobody ships is a tool that exercises the running server as different users and tries to break access control. The RBAC vendors all say "you should red-team your authorization scope" as advice. Siege is that advice turned into a thing you run.&lt;/p&gt;

&lt;p&gt;The hard rule I gave myself: no hardcoded field names, no hardcoded roles. If it only caught the Warden bug because I told it about &lt;code&gt;tier&lt;/code&gt;, it would be a unit test, not a scanner. So the method is differential. Learn the schema and the real values from the most-permissive identity, the one that sees everything. Then for every restricted role, diff what it sees against that, and probe the gaps.&lt;/p&gt;

&lt;p&gt;Four detectors came out of that, all role-relative:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redacted-field filter leak.&lt;/strong&gt; The Warden bug, generalized. For any field stripped from a role's output, try it as a filter. If filtering on it returns fewer rows than the baseline, the hidden value just leaked through the difference.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Row-scope escalation.&lt;/strong&gt; A role whose normal view is scoped to a subset (region = West, say) tries an out-of-scope filter value. If &lt;code&gt;region=East&lt;/code&gt; returns rows it shouldn't have, the filter ran against the full dataset instead of the scoped one.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ID enumeration.&lt;/strong&gt; The list path is governed, the single-record lookup often isn't. So &lt;code&gt;get_record&lt;/code&gt; on guessed ids walks straight past the scoping that &lt;code&gt;query_resource&lt;/code&gt; enforces. Classic IDOR, MCP edition.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forbidden-resource read.&lt;/strong&gt; A role can't even list a resource, but &lt;code&gt;get_record&lt;/code&gt; hands one over anyway. Access checked on list and query, forgotten on the by-id path.&lt;/li&gt;
&lt;/ul&gt;
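
&lt;p&gt;A minimal sketch of the first detector's differential idea, not Siege's actual code. Everything named here is an assumption for illustration: &lt;code&gt;toy_query&lt;/code&gt; is an in-memory stand-in for a governed server, and &lt;code&gt;ROWS&lt;/code&gt;, &lt;code&gt;REDACTED&lt;/code&gt; and &lt;code&gt;find_filter_leaks&lt;/code&gt; are hypothetical names. The detector never mentions &lt;code&gt;tier&lt;/code&gt;: it learns the hidden fields and their values from the most-permissive role and then probes them as the restricted one.&lt;/p&gt;

```python
# Hypothetical in-memory stand-in for a governed server; not Siege's code.
ROWS = [
    {"id": 1, "name": "Acme Corp", "tier": "Enterprise"},
    {"id": 2, "name": "Initech", "tier": "Enterprise"},
    {"id": 3, "name": "Pied Piper", "tier": "Free"},
]
REDACTED = {"admin": set(), "support": {"tier"}}


def toy_query(role, filters=None, fixed=False):
    """Return the rows `role` may see; `fixed` also drops filters on hidden fields."""
    filters = filters or {}
    hidden = REDACTED[role]
    if fixed:
        # The fix: ignore any predicate on a field the role cannot read.
        filters = {k: v for k, v in filters.items() if k not in hidden}
    matched = [r for r in ROWS if all(r.get(k) == v for k, v in filters.items())]
    return [{k: v for k, v in r.items() if k not in hidden} for r in matched]


def find_filter_leaks(query, privileged, restricted):
    """Diff the restricted view against the privileged one, then probe the gaps."""
    full = query(privileged)
    baseline = query(restricted)
    all_fields = set().union(*(r.keys() for r in full))
    visible = set().union(*(r.keys() for r in baseline))
    findings = []
    for field in sorted(all_fields - visible):
        values = {r[field] for r in full if field in r}
        for value in sorted(values, key=repr):
            got = query(restricted, {field: value})
            # A changed row count means the hidden field was honored as a filter.
            if len(got) != len(baseline):
                findings.append((field, value, len(baseline), len(got)))
    return findings


def vulnerable(role, filters=None):
    return toy_query(role, filters)


def patched(role, filters=None):
    return toy_query(role, filters, fixed=True)


print(find_filter_leaks(vulnerable, "admin", "support"))
print(find_filter_leaks(patched, "admin", "support"))
```

&lt;p&gt;Against the vulnerable stand-in every hidden value shows up as a finding with its baseline and filtered counts; against the patched one the list is empty. That's the same before/after shape as a real run, just without a server.&lt;/p&gt;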

&lt;p&gt;The last three I never found by hand. They fell out of writing the first one generically. Build the engine for one bug and it pulls the next few with it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The money shot
&lt;/h2&gt;

&lt;p&gt;I keep two Warden builds: the vulnerable commit and the fixed one. Siege runs against both.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEFORE — vulnerable Warden (4938bdf)
## 1. [HIGH] Redacted field 'tier' leaks through filter predicate on 'accounts'
   Found as role: support
   Reproduce: query_resource({"resource_type":"accounts","filters":{"tier":"Enterprise"}})
   baseline_count: 8   filtered_count: 6
   leaked_records: ['Acme Corp', 'Initech', 'Umbrella Co', 'Hooli', 'Stark Industries', 'Wayne Enterprises']

AFTER — fixed Warden (7188eed)
   No findings. The probed classes held.

VERDICT: PASS — Siege caught the bug and cleared the fix.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every finding carries an exact, replayable repro: the tool, the arguments, the rows that came back. You can paste it into your own client and watch it leak. And to make sure the detectors aren't no-ops that pass everything, there's an intentionally-broken fixture server in the repo. Siege fires all four detectors on it, including the critical forbidden-resource read. It's in there if you want to watch it go.&lt;/p&gt;

&lt;h2&gt;
  
  
  The second class: did the agent actually get owned
&lt;/h2&gt;

&lt;p&gt;Tool-poisoning is the attack everyone talks about: hide an instruction in a tool description or a tool's output, and the agent reading it does what the attacker said instead of what the user asked. Static scanners ask "does this description look hostile." Siege asks a different question. Does the agent get hijacked.&lt;/p&gt;

&lt;p&gt;So it runs a real agent loop. Benign read tool, plus an &lt;code&gt;export_record&lt;/code&gt; sink that ships data to a URL. The user's task is read-only: summarize record 1, that's all. Then Siege injects each payload through the description channel and the output channel and watches whether the model fires the sink at an attacker destination it was never asked to touch. A hijack is observed, not inferred from text.&lt;/p&gt;

&lt;p&gt;The output is a matrix, not a verdict. Five payloads across two channels: system-block spoofing (run through both the description and the output), plain policy text, role-confusion, task-decomposition. You see which ones steered the model and which bounced off. A clean 0-of-5 is a real result too, and a regression guard for the day you bump model versions and a framing that used to bounce stops bouncing.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it doesn't do
&lt;/h2&gt;

&lt;p&gt;The report names the classes it ran and prints what it skipped. MCP servers only for now; OpenAI function-calling is a later expansion. Transport is stdio today, HTTP next. The silent-failure class (does the server claim success while returning empty data) is designed and not yet shipped. No "finds all vulnerabilities" anywhere in the output, because that sentence is how scanners lie.&lt;/p&gt;

&lt;p&gt;And it only attacks my own fixtures and servers I explicitly opt in. Pointing a runtime red-team tool at someone else's live server without an invite isn't a demo.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it sits
&lt;/h2&gt;

&lt;p&gt;Siege is the offense leg of a three-piece stack. Warden governs the server. Crumb attributes every call to the person who authorized it. Siege is the part that tries to break what Warden built. Build the wall, then lay siege to it.&lt;/p&gt;

&lt;p&gt;Code's public: &lt;a href="https://github.com/AlexlaGuardia/siege" rel="noopener noreferrer"&gt;github.com/AlexlaGuardia/siege&lt;/a&gt;. It's v0.1 and it's narrow on purpose. Runs against a live server, as real roles. The part the manifest can't show you.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>mcp</category>
      <category>opensource</category>
    </item>
    <item>
      <title>An AI agent acted across two companies. Whose audit log knows which human?</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Wed, 24 Jun 2026 12:16:50 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/an-ai-agent-acted-across-two-companies-whose-audit-log-knows-which-human-12nl</link>
      <guid>https://dev.to/alexlaguardia/an-ai-agent-acted-across-two-companies-whose-audit-log-knows-which-human-12nl</guid>
      <description>&lt;p&gt;Alice logs into her company's tools through their identity provider. She points an agent at a task. That agent hands part of the work to a sub-agent, and the sub-agent calls a tool that lives in a partner company's system, behind a &lt;em&gt;different&lt;/em&gt; identity provider. The tool does something it shouldn't. An auditor pulls the record.&lt;/p&gt;

&lt;p&gt;Whose log knows it was alice?&lt;/p&gt;

&lt;p&gt;Not the agent's. The agent is a process; it can claim to be anyone. Not the model's either, which reads whatever it was handed and has no idea which human is behind the session. The honest answer in most deployments today is that the partner's system can prove &lt;em&gt;a bot&lt;/em&gt; called it, and can prove &lt;em&gt;which company's bot&lt;/em&gt;. Then the trail goes cold. The person who actually directed the action dissolves into "some agent at the vendor."&lt;/p&gt;

&lt;p&gt;I have been building &lt;a href="https://crumb.alexlaguardia.dev" rel="noopener noreferrer"&gt;Crumb&lt;/a&gt; to refuse that outcome: a tamper-evident record that binds the actual person behind an agent's tool call, verifiable by someone who does not have to trust whoever ran the agent. Within a single identity provider, that chain was already working. This post is about the part that wasn't, and why it took longer than I expected.&lt;/p&gt;

&lt;h2&gt;
  
  
  The single-issuer case was the easy half
&lt;/h2&gt;

&lt;p&gt;When the whole chain lives under one identity provider, delegation has a clean answer, and it is a real standard. RFC 8693 token exchange lets you mint a token that carries two identities at once: the human as the &lt;code&gt;sub&lt;/code&gt;, and the agent acting for them as a nested &lt;code&gt;act&lt;/code&gt; claim. Add a hop and you nest again. The human stays at the root the whole way down.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://idp-a.local"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alice"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"act"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"researcher"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"act"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"planner"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"read_record"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One provider signs that token. A resource server verifies it against that provider's public key, walks the &lt;code&gt;act&lt;/code&gt; chain back to alice, and it is done. No shared secret, no trusting the gateway that minted it. I covered that build in an earlier post. It holds up.&lt;/p&gt;

&lt;p&gt;The catch is in the assumption hiding under "one provider."&lt;/p&gt;

&lt;h2&gt;
  
  
  The boundary is where it breaks
&lt;/h2&gt;

&lt;p&gt;Real delegation does not stay inside one company. The interesting, dangerous case is the one that crosses.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fizjht77yit1gc0a95c4j.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2Fizjht77yit1gc0a95c4j.png" alt="Sequence: alice authenticates at IdP A, an agent chain hands off into IdP B, and the tool verifies the human across both issuers"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;So &lt;code&gt;planner&lt;/code&gt;, holding a token IdP A signed, needs the call into B's domain to carry a token B will honor. The textbook move is another RFC 8693 exchange, this time against B. You hand B the token A issued, and B mints you a fresh one.&lt;/p&gt;

&lt;p&gt;And right there is the problem, sitting in plain sight in the spec. When B does that exchange, it mints a token signed &lt;em&gt;only by B&lt;/em&gt; and drops A's signature on the floor. The new token says &lt;code&gt;sub: alice&lt;/code&gt; because B copied it across, but the cryptographic proof that A authenticated alice is gone. Downstream, all you hold is B's word: "A told me it was alice."&lt;/p&gt;

&lt;p&gt;For most systems that is fine, because most systems were already trusting B. But Crumb's entire reason to exist is to let an auditor verify &lt;em&gt;without&lt;/em&gt; trusting the operator. A cross-issuer hop that resolves to "trust B" puts the trust-me point right back in the middle of the chain I was trying to make checkable. It's the one thing I can't wave away.&lt;/p&gt;

&lt;h2&gt;
  
  
  Stapling: carry the signature across, don't reissue it
&lt;/h2&gt;

&lt;p&gt;The fix I landed on is to stop throwing the upstream token away.&lt;/p&gt;

&lt;p&gt;When B exchanges A's token, two things happen. First, B verifies A's token against A's public key. B can only do that if it federates with A, so A has to be in B's trust set. That is a real relationship and I will come back to how honest it is. Second, instead of discarding A's token, B &lt;em&gt;staples&lt;/em&gt; it into the one it mints: the exact inner JWS rides along in a &lt;code&gt;prv&lt;/code&gt; claim, its SHA-256 in &lt;code&gt;psh&lt;/code&gt;, and the inner issuer in &lt;code&gt;pis&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"iss"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://idp-b.local"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"alice"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"act"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"researcher"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"act"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"sub"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"planner"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"aud"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"read_record"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"prv"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;the exact JWT that IdP A signed&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"psh"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"sha256:5992849d649979e6..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"pis"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://idp-a.local"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the outer token is not an assertion that alice was authenticated. It is a pointer to the original proof, hash-pinned so it can't be swapped. B signed its own segment. A already signed its segment. Nobody re-signed anybody else's.&lt;/p&gt;

&lt;p&gt;A verifier handed the outer token walks the chain backward and checks each segment against the key of the issuer that actually signed it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F6mnyiyv9hu8490jnpn0c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Farticles%2F6mnyiyv9hu8490jnpn0c.png" alt="Vanilla exchange discards A's signature so the verifier trusts B's word; stapled provenance keeps each segment verifiable against its own issuer's key"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Each rule maps to one way a dishonest issuer could try to cheat:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Per-segment signature.&lt;/strong&gt; Every token in the chain is verified against its own issuer's key, pulled from the verifier's federation set. An issuer it does not federate with has no key, so the token is refused, not verified-then-ignored.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Staple integrity.&lt;/strong&gt; A token carrying &lt;code&gt;prv&lt;/code&gt; must have &lt;code&gt;psh&lt;/code&gt; equal to the hash of that &lt;code&gt;prv&lt;/code&gt;. Swap the embedded provenance for a different token and the hash stops matching.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Human continuity.&lt;/strong&gt; The &lt;code&gt;sub&lt;/code&gt; has to be the same identity at every hop. An outer token claiming to act for alice while stapling a token A issued for bob is a lie the walk catches.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Actor continuity.&lt;/strong&gt; The chain an outer token carries beneath its own actor has to equal the inner token's chain exactly. An issuer may append a hop. It may not rewrite the hops it inherited.&lt;/li&gt;
&lt;/ol&gt;
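
&lt;p&gt;Rules 2 through 4 can be sketched over already-decoded claims. This is a rough illustration under loud assumptions, not Crumb's module: the stapled token is a plain JSON string standing in for the inner JWS, signature checking (rule 1) is left out entirely, and &lt;code&gt;check_staple&lt;/code&gt;, &lt;code&gt;verdict&lt;/code&gt; and &lt;code&gt;StapleError&lt;/code&gt; are hypothetical names. The actor rule is read here as "the outer chain must end with the inner chain", which allows an appended hop and nothing else.&lt;/p&gt;

```python
import hashlib
import json


class StapleError(Exception):
    pass


def digest(token_text):
    """Hash the stapled token text the way the psh claim pins it."""
    return "sha256:" + hashlib.sha256(token_text.encode("utf-8")).hexdigest()


def actor_chain(claims):
    """Flatten nested act claims, current actor first."""
    chain, node = [], claims.get("act")
    while node:
        chain.append(node["sub"])
        node = node.get("act")
    return chain


def check_staple(outer):
    """Apply rules 2 to 4 to decoded claims; signatures (rule 1) are out of scope."""
    prv = outer.get("prv")
    if prv is None:
        return  # root of the chain: nothing stapled
    if outer.get("psh") != digest(prv):
        raise StapleError("StapleMismatch")
    inner = json.loads(prv)  # assumption: JSON text stands in for the inner JWS
    if outer["sub"] != inner["sub"]:
        raise StapleError("HumanDiscontinuity")
    outer_chain, inner_chain = actor_chain(outer), actor_chain(inner)
    # Append-only: hops may be added on top, inherited hops may not change.
    if outer_chain[len(outer_chain) - len(inner_chain):] != inner_chain:
        raise StapleError("ActorChainBroken")
    check_staple(inner)  # keep walking toward the original issuer


def verdict(token):
    try:
        check_staple(token)
    except StapleError as err:
        return str(err)
    return "ok"


inner = {"iss": "https://idp-a.local", "sub": "alice",
         "act": {"sub": "researcher", "act": {"sub": "planner"}}}
prv = json.dumps(inner, sort_keys=True)
outer = {"iss": "https://idp-b.local", "sub": "alice", "act": inner["act"],
         "prv": prv, "psh": digest(prv), "pis": inner["iss"]}

print(verdict(outer))
print(verdict(dict(outer, sub="bob")))
```

&lt;p&gt;Without the per-segment signature check this proves nothing on its own. It only shows that the three structural rules are cheap to state once each segment has been verified against its own issuer's key.&lt;/p&gt;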

&lt;h2&gt;
  
  
  What it refuses
&lt;/h2&gt;

&lt;p&gt;The part I care about most is the negative space. A mechanism that only shows the happy path hasn't proven anything. So the demo verifies the real chain across two issuers, and then it tries to break it five ways and shows each one failing by name.&lt;/p&gt;

&lt;p&gt;The sharpest of the five: a &lt;em&gt;malicious B&lt;/em&gt; tries to fabricate an upstream human. It controls its own signing key, so it mints a perfectly valid B token that says it is acting for &lt;code&gt;mallory&lt;/code&gt;, and it staples a forged "A token" that also names mallory. B can sign its own segment all day. What it cannot do is sign as A. The verifier checks the stapled segment against A's real key, the forgery fails there, and B's attempt to invent a human it was never handed dies at the boundary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;3. malicious B forges an upstream human (mallory)
   forged upstream    rejected (InvalidSignature): B can't sign as A
4. swap the stapled provenance (psh left stale)
   swapped provenance rejected (StapleMismatch): psh pins one predecessor
5. B claims alice but staples bob's token
   human discontinuity rejected (HumanDiscontinuity): same human or nothing
6. B rewrites the inherited actor chain
   rewritten chain    rejected (ActorChainBroken): append-only, no rewrite
7. upstream from an unfederated issuer
   unfederated issuer rejected (UntrustedIssuer): verifier trusts its own set
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That last one matters more than it looks. Even when B chooses to accept some sketchy third issuer C and builds a chain on it, the verifier makes its &lt;em&gt;own&lt;/em&gt; trust decision. B vouching for C buys C nothing. The verifier trusts its set, not B's.&lt;/p&gt;

&lt;h2&gt;
  
  
  The part I am not going to oversell
&lt;/h2&gt;

&lt;p&gt;Here is the boundary, stated plainly, because pretending it isn't there is exactly the tell I am trying to avoid.&lt;/p&gt;

&lt;p&gt;This isn't a new standard. The &lt;code&gt;prv&lt;/code&gt;, &lt;code&gt;psh&lt;/code&gt; and &lt;code&gt;pis&lt;/code&gt; staple claims are a Crumb convention. There is no RFC that defines them, and if two vendors wanted to interoperate this way they would have to agree on the format first. And the whole thing still rests on a federation trust set. Somebody, somewhere, decides which issuers they accept. I didn't make that decision disappear.&lt;/p&gt;

&lt;p&gt;What I did was make it the &lt;em&gt;only&lt;/em&gt; thing you have to decide, and make everything downstream of it checkable. You pick your trusted issuers once, explicitly, in an object you can read. After that no single issuer gets to assert the human on its own word. Each one signs only its own segment, and the verifier re-checks all of them.&lt;/p&gt;

&lt;p&gt;There's still no trust-free answer for cross-issuer identity. Just a smaller question: who do you federate with.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;The whole thing is one additive module and a demo you can run.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/AlexlaGuardia/crumb
python &lt;span class="nt"&gt;-m&lt;/span&gt; crumb.cross_issuer_demo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It stands up two issuers with two different keys, crosses a real delegation chain between them, verifies it back to the human, and then fails the five forgeries above. The live timeline and the rest of Crumb are at &lt;a href="https://crumb.alexlaguardia.dev" rel="noopener noreferrer"&gt;crumb.alexlaguardia.dev&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;If you work on agent identity or authorization and you think the stapling model has a hole in it, I want to hear where.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>oauth</category>
      <category>mcp</category>
    </item>
    <item>
      <title>An AI agent exported a patient record. Your logs can't say who told it to.</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Tue, 23 Jun 2026 13:45:10 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/an-ai-agent-exported-a-patient-record-your-logs-cant-say-who-told-it-to-4k88</link>
      <guid>https://dev.to/alexlaguardia/an-ai-agent-exported-a-patient-record-your-logs-cant-say-who-told-it-to-4k88</guid>
      <description>&lt;p&gt;You put an LLM agent into production. It runs under a service account or a shared API key, because that's how you give software credentials. It reads a record, exports a file. Sometimes it moves money. Your audit log dutifully records the action. It says &lt;em&gt;the agent did it&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;It does not say &lt;em&gt;which human told it to&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That's fine right up until it isn't. If the agent does something it shouldn't have, "the service account did it" is not an answer anyone can act on. You can't discipline a service account. You can't tell a regulator that a bot was responsible and leave it there.&lt;/p&gt;

&lt;h2&gt;
  
  
  The deadline that makes this concrete
&lt;/h2&gt;

&lt;p&gt;The EU AI Act's Article 12 starts to apply to high-risk systems on August 2, 2026. Those systems have to keep logs that allow "the identification of the natural persons involved" in an event. A natural person. Not a service account, not an agent id. The actual human.&lt;/p&gt;

&lt;p&gt;A log built around shared credentials can't answer that question. The identity was never captured, so no amount of log retention brings it back.&lt;/p&gt;

&lt;h2&gt;
  
  
  You can't prompt your way out of this
&lt;/h2&gt;

&lt;p&gt;The obvious instinct is to make the model report who it's acting for. Put the user in the system prompt, have the agent include it in the tool call.&lt;/p&gt;

&lt;p&gt;Two problems.&lt;/p&gt;

&lt;p&gt;A tool call, on the wire, is &lt;code&gt;{"name": "export_record", "arguments": {...}}&lt;/code&gt;. There is no field for &lt;em&gt;who&lt;/em&gt;. OpenAI function-calling has no native identity slot. MCP permits carrying it but almost nobody implements it. So at the protocol level, the "who" has nowhere to live.&lt;/p&gt;

&lt;p&gt;And worse, anything the model emits can be prompt-injected. If identity comes &lt;em&gt;from&lt;/em&gt; the model, then the data the agent reads back from a tool can rewrite it. I tested this on the same payload delivered two ways, and the tool &lt;em&gt;description&lt;/em&gt; hijacked more models than the tool &lt;em&gt;output&lt;/em&gt; did. The model's output is the one surface you can never treat as trusted for identity. It has to be stamped by the runtime, outside the agent's reasoning, before the model gets a say.&lt;/p&gt;

&lt;p&gt;So I built the runtime that stamps it. It's called Crumb. Every agent action drops a crumb; the trail leads back to the human who directed it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of it
&lt;/h2&gt;

&lt;p&gt;One gateway, every tool call passes through it.&lt;/p&gt;

&lt;p&gt;It pulls the human's identity from the verified session, captured once at login, never from the model. It mints a short-lived delegation token that carries both identities: the human as the RFC 8693 &lt;code&gt;sub&lt;/code&gt;, the agent as the &lt;code&gt;act&lt;/code&gt;, scoped to the one resource being called. Then it writes a crumb to an append-only, hash-chained ledger, each entry signed with Ed25519, and calls the tool with the token. The tool refuses any call that doesn't carry a valid token, so there's no path to the data that skips it.&lt;/p&gt;

&lt;p&gt;That delegation token isn't hand-rolled. It's a real RFC 8693 token exchange against an identity provider: the human's session goes in as the &lt;code&gt;subject_token&lt;/code&gt;, an RS256 provider-signed composite comes back, and the resource verifies it against the provider's published JWKS. No shared secret. Point it at Okta or Keycloak or Zitadel and the same code path holds, because it's the standard, not a custom copy of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The part that actually breaks: more than one agent
&lt;/h2&gt;

&lt;p&gt;A single agent calling a tool is the easy case. Real systems don't look like that. A human directs an orchestrator. The orchestrator delegates to a sub-agent. The sub-agent calls the tool. Now who's accountable, and how do you prove it, when the human is two hops away from the action?&lt;/p&gt;

&lt;p&gt;This is where most attribution stories quietly stop. The standards bodies haven't fully solved it either. But RFC 8693 has the mechanism hiding in section 4.1: the &lt;code&gt;act&lt;/code&gt; claim can nest. Each new actor wraps the previous one, and the human stays the &lt;code&gt;sub&lt;/code&gt; at the root the whole way down. Walk the nesting back and you get the full chain of who-acted-for-whom, ending at the person who started it.&lt;/p&gt;

&lt;p&gt;So Crumb implements it end to end. Each hop nests the prior actor. The provider does the nesting over a real token exchange, not a dev shortcut. The crumb records the whole chain. And because the entire nested structure is signed as one token, there's no per-hop seam to forge at. I tried: rewrite a middle actor in the chain and re-sign it without the key, and verification rejects it on the signature. The chain holds together or it doesn't verify.&lt;/p&gt;

&lt;p&gt;Alice authorizes one action, &lt;code&gt;read_record&lt;/code&gt;, when she logs in. A planner agent takes her request and delegates to a researcher sub-agent. The researcher reads the record. The crumb traces it back through both agents to Alice, verified.&lt;/p&gt;

&lt;p&gt;Then a hop goes rogue and calls &lt;code&gt;export_record&lt;/code&gt;, which Alice never authorized. The action may technically run. But the crumb records no human directive behind it. It flags the action unauthorized and names the agent chain that did it. Alice is in the record. She's provably not the one accountable.&lt;/p&gt;

&lt;p&gt;A service-account log can't do that. It says a bot exported the record, and stops there. This one clears Alice by name and points at the agents instead.&lt;/p&gt;
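
&lt;p&gt;To make the nesting concrete, here is a minimal sketch of walking a nested &lt;code&gt;act&lt;/code&gt; claim back to the human. It assumes the token signature has already been verified and only looks at the decoded claims. The function name &lt;code&gt;delegation_chain&lt;/code&gt; and the agent identifiers are made up for illustration and are not Crumb's actual code.&lt;/p&gt;

```python
from typing import Any


def delegation_chain(claims: dict[str, Any]) -> list[str]:
    """Return the who-acted-for-whom chain, human first, current actor last."""
    actors = []
    node = claims.get("act")
    # RFC 8693 section 4.1: the outermost "act" is the current actor and
    # each nested "act" is the actor that came before it.
    while isinstance(node, dict):
        actors.append(node["sub"])
        node = node.get("act")
    return [claims["sub"]] + actors[::-1]


# Hypothetical decoded claims for the Alice example (signature already checked).
claims = {
    "sub": "alice",
    "act": {"sub": "researcher-agent", "act": {"sub": "planner-agent"}},
}

chain = delegation_chain(claims)
print(" -> ".join(chain))  # alice -> planner-agent -> researcher-agent
```

&lt;p&gt;The human stays at index 0 no matter how many hops are added, which is what lets a record name the person and the agents separately.&lt;/p&gt;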

&lt;h2&gt;
  
  
  Tamper-evidence, including against yourself
&lt;/h2&gt;

&lt;p&gt;A signed, hash-chained log sounds tamper-proof until you remember who holds the signing key. You do. If you can re-sign, you can rewrite history and re-sign the whole chain, and per-entry verification passes the forgery, because every entry is validly signed. By you.&lt;/p&gt;

&lt;p&gt;So the ledger checkpoints its Merkle root and publishes it to Sigstore's public Rekor transparency log. Now the operator-rollback attack falls apart: you rewrite a crumb, re-sign the entire chain, and per-entry verify still passes. But the rewritten root no longer matches the one already sitting public in Rekor, timestamped before your edit. The forgery is caught by something you don't control. There's a button on the live demo that runs exactly this and shows the anchor catching it.&lt;/p&gt;
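
&lt;p&gt;A rough sketch of why the anchor catches the rollback. The tree construction here (SHA-256, last node duplicated on odd levels) is an assumption and may not match Crumb's ledger, and nothing below talks to Rekor: &lt;code&gt;anchored_root&lt;/code&gt; simply stands in for the root that was already published.&lt;/p&gt;

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(entries: list[bytes]) -> str:
    """Hex Merkle root over ledger entries (last node duplicated on odd levels)."""
    if not entries:
        raise ValueError("need at least one entry")
    level = [sha256(e) for e in entries]
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [sha256(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0].hex()


ledger = [b"crumb-1", b"crumb-2", b"crumb-3"]
anchored_root = merkle_root(ledger)  # stands in for the externally published root

# Operator rollback: rewrite one entry. Re-signing is not modelled because the
# operator holds the key, so every per-entry signature would still verify.
rewritten = [b"crumb-1", b"crumb-2 (edited)", b"crumb-3"]

print(merkle_root(ledger) == anchored_root)     # True
print(merkle_root(rewritten) == anchored_root)  # False
```

&lt;p&gt;The comparison only means something because the anchored value lives somewhere the operator cannot rewrite.&lt;/p&gt;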

&lt;h2&gt;
  
  
  What it isn't
&lt;/h2&gt;

&lt;p&gt;This is the part I want to be straight about, because attribution is a space where it's easy to overclaim.&lt;/p&gt;

&lt;p&gt;Crumb is a flight recorder, not a control plane. Stopping things is a different and well-funded job. Cerbos, Capsule, Astrix already do it. Crumb records and proves; it points at them for the rest.&lt;/p&gt;

&lt;p&gt;Attribution is only as strong as the gateway. Bypass it and there's no crumb, so the gateway has to be real and enforced, not optional.&lt;/p&gt;

&lt;p&gt;The multi-hop chain is single-issuer. One provider, one trust root. A chain that spans two different identity providers is genuinely unsolved at the standards level, and I'm not going to pretend otherwise.&lt;/p&gt;

&lt;p&gt;The ledger stores a hash of the arguments, not the raw arguments, to keep sensitive data out of the log. The tradeoff is that it proves an action happened and who directed it, not the exact bytes that were touched.&lt;/p&gt;
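
&lt;p&gt;One way to picture that tradeoff, as a sketch only: hash a canonical JSON encoding of the arguments and store the digest. The field name &lt;code&gt;args_sha256&lt;/code&gt;, the canonicalization choice, and the sample arguments are assumptions, not Crumb's actual schema.&lt;/p&gt;

```python
import hashlib
import json


def args_digest(arguments: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so key order does not matter."""
    canonical = json.dumps(arguments, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical ledger entry: the digest is stored, the raw arguments are not.
entry = {
    "action": "read_record",
    "args_sha256": args_digest({"record_id": "r-42", "fields": ["name"]}),
}

# Anyone holding the original arguments can recompute and compare the digest.
same_call = args_digest({"fields": ["name"], "record_id": "r-42"})
print(entry["args_sha256"] == same_call)  # True
```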

&lt;p&gt;And MCP attribution is permitted by the spec but rarely implemented upstream, so Crumb can stamp the record but can't force a non-compliant server to honor the human identity.&lt;/p&gt;

&lt;p&gt;That's the gap between what's built and what's marketing. In this space, that gap is the whole thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try to break it
&lt;/h2&gt;

&lt;p&gt;The demo is live at &lt;a href="https://crumb.alexlaguardia.dev" rel="noopener noreferrer"&gt;crumb.alexlaguardia.dev&lt;/a&gt;. Seed some crumbs, tamper with a row, watch verification flip. Hit the operator rollback and watch the external anchor catch a forgery that per-entry signing passes. The code is on GitHub.&lt;/p&gt;

&lt;p&gt;If you're building agent infrastructure and you've hit this, or you think I've got something wrong, I want to hear it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>security</category>
      <category>llm</category>
      <category>mcp</category>
    </item>
    <item>
      <title>Same question, three answers: a governed MCP server with receipts</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Wed, 10 Jun 2026 14:44:04 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/same-question-three-answers-a-governed-mcp-server-with-receipts-1dm4</link>
      <guid>https://dev.to/alexlaguardia/same-question-three-answers-a-governed-mcp-server-with-receipts-1dm4</guid>
      <description>&lt;p&gt;Ask my agent "what's the open pipeline for Acme Corp?" as an admin and it answers $125,000 across two deals, with a table. Ask the exact same question as a support agent and it says, politely and correctly, that it can't see pipeline data and suggests who to ask instead.&lt;/p&gt;

&lt;p&gt;The model didn't decide that. It never gets the chance to. That's the whole project.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Give an AI agent tool access to company data and you get two questions you can't dodge:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Who is the agent acting as?&lt;/strong&gt; An agent working for a support rep must not return answers the human behind the keyboard isn't allowed to see.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How do you know it behaved?&lt;/strong&gt; "Seemed fine in testing" doesn't survive a security review.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I kept seeing these two questions in every AI-infra and forward-deployed engineering job posting, so I built a complete answer and put it on the public internet: &lt;strong&gt;&lt;a href="https://warden.alexlaguardia.dev" rel="noopener noreferrer"&gt;Warden&lt;/a&gt;&lt;/strong&gt;, a governed MCP server with an agent, traces, and evals on top. You can fire a real (rate-limited) agent run yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three ideas worth stealing
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Governance lives outside the model.&lt;/strong&gt; The role comes from session identity (the MCP server reads it at spawn, like OAuth scopes). Every read passes through one &lt;code&gt;GovernedStore&lt;/code&gt; choke point that applies resource access, region row-scoping, and field redaction before the model sees a byte. Prompting harder widens nothing, because there's nothing on the model's side of the wall to widen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The eval oracle has to obey the rules too.&lt;/strong&gt; This was the design moment of the build. If your reference answers come from the raw database, then a &lt;em&gt;correctly denied&lt;/em&gt; answer scores as a failure: the support agent honestly says "I can't see pipeline" and your eval compares that to $125,000 and marks it wrong. So Warden's oracle computes ground truth &lt;em&gt;through the same governance layer&lt;/em&gt; as the agent. An honest denial becomes a passing grade. The judge is a stronger model than the one answering (Opus judging Sonnet), and it grades against that reference. Unanchored LLM judges grade on vibes; anchored ones measure. 12/12 cases pass, and the &lt;a href="https://warden.alexlaguardia.dev/evals" rel="noopener noreferrer"&gt;scorecard is public&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Denials are data, not error strings.&lt;/strong&gt; Tools return a structured &lt;code&gt;access_denied&lt;/code&gt; object. That's what lets the eval layer check "did the agent report the limit honestly instead of guessing," and it's what makes the &lt;a href="https://warden.alexlaguardia.dev/diff" rel="noopener noreferrer"&gt;same-question-three-roles diff page&lt;/a&gt; work.&lt;/p&gt;

&lt;p&gt;Every run also emits real OpenTelemetry spans (GenAI semantic conventions) that the dashboard replays as a timeline, with the enforcing role stamped on every tool result.&lt;/p&gt;
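
&lt;p&gt;Ideas 1 and 3 fit in a few lines. This is a simplified sketch, not Warden's real &lt;code&gt;GovernedStore&lt;/code&gt;: the &lt;code&gt;POLICY&lt;/code&gt; table, the sample rows, and the shape of the &lt;code&gt;access_denied&lt;/code&gt; object are all assumptions, and region row-scoping is left out.&lt;/p&gt;

```python
from dataclasses import dataclass, field

# Hypothetical policy: readable resources per role, with the fields to strip.
POLICY = {
    "admin": {"accounts": set(), "pipeline": set()},
    "support": {"accounts": {"tier"}},
}

# Made-up rows for illustration only.
DATA = {
    "accounts": [{"name": "Acme Corp", "tier": "Enterprise"}],
    "pipeline": [
        {"account": "Acme Corp", "amount": 75000},
        {"account": "Acme Corp", "amount": 50000},
    ],
}


@dataclass
class GovernedStore:
    role: str  # fixed from session identity, never from the prompt
    data: dict = field(default_factory=lambda: DATA)

    def read(self, resource: str) -> dict:
        allowed = POLICY.get(self.role, {})
        if resource not in allowed:
            # A denial is structured data the agent and the evals can inspect.
            return {"access_denied": {"role": self.role, "resource": resource}}
        redacted = allowed[resource]
        rows = [
            {k: v for k, v in row.items() if k not in redacted}
            for row in self.data[resource]
        ]
        return {"rows": rows, "enforced_role": self.role}


support = GovernedStore(role="support")
print(support.read("accounts"))  # rows come back without the tier field
print(support.read("pipeline"))  # structured denial, not an error string
```

&lt;p&gt;An oracle built on the same &lt;code&gt;read&lt;/code&gt; path would see the same denial, which is what lets an honest "I can't see that" score as correct.&lt;/p&gt;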

&lt;h2&gt;
  
  
  What bit me
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Markdown tables from the agent rendered as pipe-soup until I learned Tailwind's &lt;code&gt;prose&lt;/code&gt; classes silently do nothing without &lt;code&gt;@tailwindcss/typography&lt;/code&gt;, and react-markdown needs &lt;code&gt;remark-gfm&lt;/code&gt; for tables at all. Found it by clicking the deployed site, not in the build.&lt;/li&gt;
&lt;li&gt;The official MCP Python SDK ships FastMCP at &lt;code&gt;mcp.server.fastmcp&lt;/code&gt;. The standalone &lt;code&gt;fastmcp&lt;/code&gt; package is a different thing. Know which one you're importing.&lt;/li&gt;
&lt;li&gt;A public endpoint that burns real model tokens forces the unglamorous work: per-IP limits off the CDN's forwarded header, a global daily budget, a single-flight lock, hard timeouts. That's the difference between a demo and a toy.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Live console: &lt;a href="https://warden.alexlaguardia.dev" rel="noopener noreferrer"&gt;warden.alexlaguardia.dev&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Source: &lt;a href="https://github.com/AlexlaGuardia/warden" rel="noopener noreferrer"&gt;github.com/AlexlaGuardia/warden&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Full build write-up: &lt;a href="https://alexlaguardia.dev/writing/warden" rel="noopener noreferrer"&gt;alexlaguardia.dev/writing/warden&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Built solo as a working answer to "how do you let an agent touch real data without trusting it blindly?" If you're building agents over data someone cares about, the choke point, the governance-aware oracle, and structured denials all carry straight over.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>python</category>
      <category>llm</category>
    </item>
    <item>
      <title>Your AI Agent Forgets Everything Between Sessions (Here's How to Fix It)</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Fri, 01 May 2026 10:17:57 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/your-ai-agent-forgets-everything-between-sessions-heres-how-to-fix-it-5agf</link>
      <guid>https://dev.to/alexlaguardia/your-ai-agent-forgets-everything-between-sessions-heres-how-to-fix-it-5agf</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You finish a session with Claude Code at 11pm. Three files changed, two design decisions made, one bug discovered but unresolved.&lt;/p&gt;

&lt;p&gt;You start fresh the next morning. The agent has no memory of any of it.&lt;/p&gt;

&lt;p&gt;You spend the first ten minutes of the new session doing what I call "cold-start theater": re-reading the changed files, re-explaining what you decided yesterday, re-discovering the bug you already debugged.&lt;/p&gt;

&lt;p&gt;Multiply that by every session. That's the tax you pay for stateless agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Conversation History Doesn't Solve This
&lt;/h2&gt;

&lt;p&gt;The obvious answer is "just save the conversation." It doesn't work for three reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Conversation history is enormous.&lt;/strong&gt; A real working session has thousands of messages, tool calls, and outputs. Loading it consumes most of your context window before you do anything useful.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;It's mostly noise.&lt;/strong&gt; 90% of what happened in the previous session was the agent reasoning out loud. The next agent doesn't need that. It needs the conclusions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;It doesn't compose across agents.&lt;/strong&gt; If agent A finishes a task and agent B picks up later, B doesn't want to read A's monologue. B wants to know what changed and what's next.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What a Handoff Actually Needs
&lt;/h2&gt;

&lt;p&gt;After running multi-agent setups for a year, the pattern that works is a structured summary with five fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Files touched&lt;/strong&gt;: what physically changed on disk&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decisions made&lt;/strong&gt;: architectural or design choices that affect future work&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Blockers&lt;/strong&gt;: things that stopped progress, with enough context to unblock&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Next steps&lt;/strong&gt;: what the next agent should do first&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open threads&lt;/strong&gt;: anything unresolved that future-you needs to remember&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's it. Five fields, usually under 500 words. The next agent reads this in under a second and has full operational awareness.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Protocol
&lt;/h2&gt;

&lt;p&gt;In Vigil, handoff is a first-class operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Vigil&lt;/span&gt;

&lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Vigil&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backend-cc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;files_touched&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api/routes.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models/user.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;decisions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Switched to JWT auth from sessions; simpler refresh flow&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;blockers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Stripe webhook still failing in test mode, see line 142&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;next_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Wire JWT middleware into protected routes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fix Stripe webhook signature validation&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;open_threads&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Decide on rate limit strategy before deploy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the next agent boots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backend-cc&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Returns the most recent handoff, or chains across the last N
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or via the CLI:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vigil resume backend-cc
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The handoff is stored as queryable structured data, and the daemon automatically includes the most recent handoff in the agent's awareness file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handoff Chains
&lt;/h2&gt;

&lt;p&gt;The thing that surprised me when I built this: chains compound.&lt;/p&gt;

&lt;p&gt;If you hand off three sessions in a row on the same project, the next agent can resume the &lt;em&gt;chain&lt;/em&gt;. It sees the most recent handoff plus a summarized rollup of the previous two. The conclusions survive across sessions even though the full transcripts are never loaded.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vigil resume backend-cc &lt;span class="nt"&gt;--chain&lt;/span&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns a synthesized view of the last five handoffs. Decisions persist. Blockers either get resolved (and removed) or carry forward (and become urgent).&lt;/p&gt;

&lt;p&gt;This is how you keep continuity across days or weeks of work without ever loading a full conversation history.&lt;/p&gt;
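
&lt;p&gt;For a feel of what a rollup could do, here is a toy sketch. It is not Vigil's implementation. The &lt;code&gt;rollup&lt;/code&gt; function is hypothetical, and it makes two assumptions of its own: a blocker that is missing from a later handoff counts as resolved, and a blocker open for two or more sessions counts as urgent.&lt;/p&gt;

```python
def rollup(handoffs: list[dict]) -> dict:
    """Fold handoffs (oldest first) into one view of decisions and open blockers."""
    decisions: list[str] = []
    open_blockers: dict[str, int] = {}  # blocker text -> sessions it has been open
    for h in handoffs:
        for d in h.get("decisions", []):
            if d not in decisions:
                decisions.append(d)  # decisions persist
        # Assumption: a blocker absent from a later handoff was resolved.
        current = h.get("blockers", [])
        open_blockers = {b: open_blockers.get(b, 0) + 1 for b in current}
    # Assumption: open for two or more sessions means urgent.
    urgent = [b for b, age in open_blockers.items() if age >= 2]
    return {"decisions": decisions, "blockers": list(open_blockers), "urgent": urgent}


history = [
    {"decisions": ["Switched to JWT auth"], "blockers": ["Stripe webhook failing"]},
    {"decisions": [], "blockers": ["Stripe webhook failing", "Flaky test"]},
    {"decisions": ["Rate limit per user"], "blockers": ["Stripe webhook failing"]},
]

view = rollup(history)
print(view["decisions"])  # ['Switched to JWT auth', 'Rate limit per user']
print(view["urgent"])     # ['Stripe webhook failing']
```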

&lt;h2&gt;
  
  
  Why MCP Wasn't Enough
&lt;/h2&gt;

&lt;p&gt;I tried to do this through MCP at first: just expose handoff tools to Claude Code and let the agent emit them.&lt;/p&gt;

&lt;p&gt;That works, but it's not enough. The handoff needs to live somewhere the next agent can read &lt;em&gt;before&lt;/em&gt; tools are even loaded. Tool calls cost a round trip. The handoff should be in your boot context.&lt;/p&gt;

&lt;p&gt;The pattern that works: handoff data lands in the awareness file. Agents read awareness on boot. Zero tool calls needed to know what happened last session.&lt;/p&gt;

&lt;h2&gt;
  
  
  Concrete Numbers
&lt;/h2&gt;

&lt;p&gt;Before this protocol: my agents spent 8-15K tokens per session re-discovering context.&lt;/p&gt;

&lt;p&gt;After: the awareness file with embedded handoff is ~2K tokens, loaded once at boot. The token savings show up directly on my Anthropic bill.&lt;/p&gt;

&lt;p&gt;The bigger win is psychological: I stopped dreading the start of new sessions. The agent knows where we left off.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vigil-agent
vigil init
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Handoff is a core feature of Vigil, alongside the awareness daemon, frame-based tool filtering, and the signal protocol. Full docs and source on GitHub. MIT license.&lt;/p&gt;

&lt;p&gt;If you're running multi-session AI workflows and you've felt the cold-start tax, give it a try. I'd love to hear what handoff protocol works for your setup.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm building cognitive infrastructure for AI agents. If session handoff is something you're solving for, drop a comment or find me on GitHub.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>claudecode</category>
      <category>productivity</category>
    </item>
    <item>
      <title>Your MCP Servers Are Flying Blind (Here's How to Fix It)</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Thu, 30 Apr 2026 12:55:21 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/your-mcp-servers-are-flying-blind-heres-how-to-fix-it-3g51</link>
      <guid>https://dev.to/alexlaguardia/your-mcp-servers-are-flying-blind-heres-how-to-fix-it-3g51</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;You deploy an MCP server. Agents start calling tools. Something breaks.&lt;/p&gt;

&lt;p&gt;How do you know?&lt;/p&gt;

&lt;p&gt;Right now, you don't. Most MCP servers are black boxes. No metrics. No error rates. No latency tracking. No alerts when a tool starts failing silently.&lt;/p&gt;

&lt;p&gt;I run 95 MCP tools across multiple projects. When a tool started returning empty results instead of errors, I didn't notice for three days. The agent just quietly worked around it, producing subtly wrong output. No crash, no log, no alert.&lt;/p&gt;

&lt;p&gt;That's when I built MCPWatch.&lt;/p&gt;

&lt;h2&gt;
  
  
  What MCPWatch Does
&lt;/h2&gt;

&lt;p&gt;MCPWatch wraps any Python MCP server with one line, and the API is the same for FastMCP and the low-level Server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;MCPWatch&lt;/span&gt;

&lt;span class="n"&gt;watch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MCPWatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. From that point, every tool call is tracked:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Call volume&lt;/strong&gt; per tool (which tools are actually used?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Duration&lt;/strong&gt; with p50/p95/p99 percentiles (what's slow?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error rates&lt;/strong&gt; per tool (what's failing?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency trends&lt;/strong&gt; (is performance degrading?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Silent failures&lt;/strong&gt; (tool returned successfully but with empty/null data)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Dashboard
&lt;/h2&gt;

&lt;p&gt;MCPWatch exposes 5 REST endpoints for monitoring:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /mcp/health    -- overall server health (healthy/degraded/unhealthy)
GET /mcp/tools     -- per-tool stats breakdown
GET /mcp/errors    -- recent errors with full context
GET /mcp/latency   -- latency percentiles per tool
GET /mcp/volume    -- call volume over time
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There's also a CLI command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vigil mcp-health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This gives you a per-tool breakdown right in your terminal. I run it before and after deploys.&lt;/p&gt;

&lt;h2&gt;
  
  
  Alerts
&lt;/h2&gt;

&lt;p&gt;MCPWatch emits alerts when things go wrong:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;watch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MCPWatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;error_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;# alert if &amp;gt;10% of calls fail
&lt;/span&gt;    &lt;span class="n"&gt;latency_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# alert if p95 &amp;gt; 5 seconds
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Alerts flow through Vigil's signal protocol, which means you can wire them to webhooks, Slack, or any trigger action.&lt;/p&gt;

&lt;h2&gt;
  
  
  CI/CD Health Check
&lt;/h2&gt;

&lt;p&gt;For CI pipelines, there's a stdio probe:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vigil mcp-health-check &lt;span class="nt"&gt;--timeout&lt;/span&gt; 5000 &lt;span class="nt"&gt;--min-tools&lt;/span&gt; 10 &lt;span class="nt"&gt;--require&lt;/span&gt; query,signal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns exit code 0 (healthy) or 1 (unhealthy). Drop it into GitHub Actions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;MCP Health Check&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;vigil mcp-health-check --timeout 5000 --min-tools &lt;/span&gt;&lt;span class="m"&gt;10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;The MCP ecosystem is growing fast. There are 11,000+ servers listed across registries. But the tooling around MCP is still in the "deploy and pray" phase.&lt;/p&gt;

&lt;p&gt;In traditional web services, you'd never deploy an API without monitoring. MCP servers deserve the same treatment. Especially when the consumer is an AI agent that won't tell you something is wrong -- it'll just silently degrade.&lt;/p&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vigil-agent
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MCPWatch is part of Vigil, a broader cognitive infrastructure toolkit for AI agents. But you can use MCPWatch standalone -- just wrap your server and point your monitoring at the endpoints.&lt;/p&gt;

&lt;p&gt;The full docs and source are on GitHub. MIT license.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I'm building tools for AI agent infrastructure. If you're running MCP servers in production, I'd love to hear what observability problems you're hitting. Drop a comment or find me on GitHub.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>python</category>
      <category>monitoring</category>
    </item>
    <item>
      <title>Three freelancer tools died in two months. I built the replacement.</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Sat, 04 Apr 2026 10:18:46 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/three-freelancer-tools-died-in-two-months-i-built-the-replacement-4dbe</link>
      <guid>https://dev.to/alexlaguardia/three-freelancer-tools-died-in-two-months-i-built-the-replacement-4dbe</guid>
      <description>&lt;p&gt;HoneyBook hiked prices 89%. AND.CO shut down entirely. Bonsai got acquired by Zoom with no roadmap. All within two months of each other.&lt;/p&gt;

&lt;p&gt;I'm a freelancer who was paying $29/mo for HoneyBook. I used two features: proposals and invoice reminders. That's it. Two features out of fifty. The other forty-eight were bloat I paid for and never touched.&lt;/p&gt;

&lt;p&gt;Dubsado is so complex that a cottage industry of "setup specialists" charge $500-3,500 just to configure it. That tells you everything about how broken this market is.&lt;/p&gt;

&lt;p&gt;So I built &lt;a href="https://stampwerk.com" rel="noopener noreferrer"&gt;Stampwerk&lt;/a&gt;. Proposals, contracts, invoices, and follow-ups. $12/mo. No setup wizard. No onboarding call. No fifty features you'll never open.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the AI actually does
&lt;/h2&gt;

&lt;p&gt;This isn't "AI-powered" as a marketing checkbox. The AI does two specific things that freelancers hate doing manually.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Proposal generation.&lt;/strong&gt; Answer 5 questions about a project and the LLM writes a full proposal -- scope, timeline, pricing breakdown, and payment terms.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# 5 inputs in, structured proposal out
&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;groq_client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;llama-3.3-70b-versatile&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;PROPOSAL_SYSTEM&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;client_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;project_desc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;budget_range&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;timeline&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;timeline&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deliverables&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;deliverables&lt;/span&gt;
        &lt;span class="p"&gt;})}&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;response_format&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;json_object&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not template-fill. The model reasons about scope and pricing based on the project description. You edit what it generates, not what a template assumes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Invoice follow-ups.&lt;/strong&gt; A background daemon runs hourly and chases overdue invoices on a 3-step escalation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Day 3&lt;/strong&gt; -- friendly check-in&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 7&lt;/strong&gt; -- professional reminder with payment link&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Day 14&lt;/strong&gt; -- firm final notice&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each message matches the escalation stage. This is the part of freelancing that everyone hates and nobody does consistently. Now it runs while you sleep.&lt;/p&gt;
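
&lt;p&gt;A rough sketch of how the stage selection could work, assuming the daemon knows each invoice's due date and which stages it already sent. The function name, the stage labels, and the "send the most severe pending stage" rule are illustrative assumptions, not Stampwerk's actual code.&lt;/p&gt;

```python
from datetime import date

# (days overdue, stage label) in escalation order. Labels are hypothetical.
ESCALATION_STEPS = [
    (3, "friendly_check_in"),
    (7, "professional_reminder"),
    (14, "final_notice"),
]


def next_follow_up(due_date, today, already_sent):
    """Return the most severe stage that is due and not yet sent, or None."""
    days_overdue = (today - due_date).days
    due_stages = [label for threshold, label in ESCALATION_STEPS if days_overdue >= threshold]
    pending = [label for label in due_stages if label not in already_sent]
    return pending[-1] if pending else None
```

&lt;p&gt;Picking the last pending stage means a daemon that was down for a week sends one message at the right tone instead of replaying every stage it missed.&lt;/p&gt;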

&lt;h2&gt;
  
  
  The full pipeline
&lt;/h2&gt;

&lt;p&gt;The thesis behind Stampwerk is that these aren't separate features. They're one flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client signs up (Google or magic link, 30 seconds)&lt;/li&gt;
&lt;li&gt;Creates a project, answers 5 questions&lt;/li&gt;
&lt;li&gt;AI generates the proposal&lt;/li&gt;
&lt;li&gt;Client views it at a public link, accepts with one click&lt;/li&gt;
&lt;li&gt;Contract auto-generates from the accepted proposal terms&lt;/li&gt;
&lt;li&gt;Client e-signs&lt;/li&gt;
&lt;li&gt;Milestones trigger invoices with Stripe payment links&lt;/li&gt;
&lt;li&gt;Overdue invoices get automatic follow-ups&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;One pipeline. Every step feeds the next. No configuration, no "setup specialist" needed.&lt;/p&gt;

&lt;p&gt;HoneyBook has all these features too. They also have fifty others, a $29-59/mo price tag, and an 89% price hike that tells you where they think the market is headed.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why these tech choices
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Choice&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Backend&lt;/td&gt;
&lt;td&gt;FastAPI&lt;/td&gt;
&lt;td&gt;42 routes, async, typed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;SQLite&lt;/td&gt;
&lt;td&gt;One file, WAL mode, zero ops burden&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;AI&lt;/td&gt;
&lt;td&gt;Groq + Llama 3.3 70B&lt;/td&gt;
&lt;td&gt;Free inference, structured JSON output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payments&lt;/td&gt;
&lt;td&gt;Stripe&lt;/td&gt;
&lt;td&gt;Payment links for clients, subscriptions for us&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Email&lt;/td&gt;
&lt;td&gt;Resend&lt;/td&gt;
&lt;td&gt;Transactional email under 3K sends is free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frontend&lt;/td&gt;
&lt;td&gt;Next.js 14 + Tailwind&lt;/td&gt;
&lt;td&gt;SSR, file routing, fast iteration&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Total infrastructure cost: $0/mo. AI calls are free through Groq. Email is free at this scale. Stripe only charges when money moves. The whole thing runs on a single server I already had.&lt;/p&gt;

&lt;p&gt;This matters because the competitors are far better capitalized. HoneyBook took $479M in funding. Moxie raised ~$10M. Dubsado bootstrapped to $2.5M ARR. When you compete against that, your margin has to be your moat. A $0 cost base means $12/mo is sustainable, not a loss leader.&lt;/p&gt;

&lt;h2&gt;
  
  
  Honest trade-offs
&lt;/h2&gt;

&lt;p&gt;The retro arcade UI is polarizing. Some people think it's unprofessional for a business tool. I'm keeping it. If your target is corporate project managers, sure, it's the wrong look. If your target is solo freelancers who are tired of software that looks like every other SaaS dashboard, it works.&lt;/p&gt;

&lt;p&gt;No PDF export yet. No time tracking. No QuickBooks integration. No mobile app. These are real gaps. But I'd rather ship the core pipeline right and add features from real user feedback than build fifty features nobody asked for.&lt;/p&gt;

&lt;p&gt;That's how we got HoneyBook in the first place.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://stampwerk.com" rel="noopener noreferrer"&gt;stampwerk.com&lt;/a&gt; -- free tier gives you 5 clients, 5 projects, and full AI proposals. Pro is $12/mo for unlimited.&lt;/p&gt;




&lt;p&gt;Built with FastAPI, Next.js 14, SQLite, Groq, Stripe, and Resend. Questions about the stack or the business model welcome.&lt;/p&gt;

</description>
      <category>python</category>
      <category>fastapi</category>
      <category>ai</category>
      <category>freelancing</category>
    </item>
    <item>
      <title>I Built a Security Scanner That Uses AI to Review Its Own Findings</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Tue, 31 Mar 2026 09:53:59 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/i-built-a-security-scanner-that-uses-ai-to-review-its-own-findings-43o9</link>
      <guid>https://dev.to/alexlaguardia/i-built-a-security-scanner-that-uses-ai-to-review-its-own-findings-43o9</guid>
      <description>&lt;p&gt;Every AI coding tool ships code fast. None of them check if it's safe.&lt;/p&gt;

&lt;p&gt;I built &lt;a href="https://github.com/AlexlaGuardia/Critik" rel="noopener noreferrer"&gt;Critik&lt;/a&gt; — an open-source security scanner that catches what your AI writes and your review misses. Regex and AST find the candidates. An LLM reviews each one with full file context, confirms the real problems, kills the false positives, and explains why in plain English.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;pip install critik&lt;/code&gt; and you're scanning in 30 seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers Are Ugly
&lt;/h2&gt;

&lt;p&gt;53% of teams that shipped AI-generated code later found security issues that passed review. Georgia Tech's Vibe Security Radar tracked 74 CVEs from AI coding tools in Q1 2026 alone — 6 in January, 15 in February, 35 in March. Accelerating.&lt;/p&gt;

&lt;p&gt;Here's what I keep finding when I scan AI-built projects:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hardcoded API keys&lt;/strong&gt; — Cursor generates a Supabase client and pastes the service_role key right in the file&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQL injection via f-strings&lt;/strong&gt; — Copilot autocompletes &lt;code&gt;db.execute(f"SELECT * FROM users WHERE id = {user_id}")&lt;/code&gt; without blinking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firebase rules wide open&lt;/strong&gt; — Bolt scaffolds &lt;code&gt;read: true, write: true&lt;/code&gt; and nobody touches it&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NEXT_PUBLIC_ prefix on secrets&lt;/strong&gt; — the env var that hands your database URL to every browser&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not edge cases. Default patterns.&lt;/p&gt;
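
&lt;p&gt;To make the f-string case concrete, here is a small self-contained demo using Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; and an in-memory table. The table and the input string are made up for illustration.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (id, name) VALUES (?, ?)", [(1, "ada"), (2, "grace")])

user_id = "1 OR 1=1"  # attacker-controlled input

# Vulnerable: the input becomes part of the SQL text, so the WHERE clause matches every row.
leaked = conn.execute(f"SELECT * FROM users WHERE id = {user_id}").fetchall()

# Safe: the driver binds the value as data, so the whole string is compared to id.
safe = conn.execute("SELECT * FROM users WHERE id = ?", (user_id,)).fetchall()

print(len(leaked), len(safe))
```

&lt;p&gt;The f-string query returns both rows. The parameterized one returns none, because the whole input is treated as a single value that matches no id.&lt;/p&gt;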

&lt;h2&gt;
  
  
  The Tools That Exist Don't Fix This
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Snyk&lt;/strong&gt; charges $25-98/dev/mo. Built for enterprises with procurement budgets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Semgrep&lt;/strong&gt; is powerful. Also requires writing custom rules in a DSL. Steep curve. Recently relicensed behind commercial terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;npm audit&lt;/strong&gt; is, in Dan Abramov's words, "broken by design" — flags devDependency issues that can't touch production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bandit&lt;/strong&gt; and &lt;strong&gt;ESLint security plugins&lt;/strong&gt; catch patterns but have zero context. They flag &lt;code&gt;eval()&lt;/code&gt; in a test fixture the same way they flag &lt;code&gt;eval(user_input)&lt;/code&gt; in a request handler.&lt;/p&gt;

&lt;p&gt;That last one is the real problem. Static scanners are noisy. Developers learn to ignore them. Which means they ignore the real findings too.&lt;/p&gt;

&lt;h2&gt;
  
  
  Two Passes. One Scanner.
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Pass 1 — Static (fast, offline, free)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Regex patterns and Python AST parsing. Hardcoded secrets (16 patterns — AWS, Stripe, OpenAI, Anthropic), SQL injection, command injection, eval/exec, XSS, framework misconfigs. Runs in milliseconds. No API key needed.&lt;/p&gt;
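
&lt;p&gt;For a feel of what the AST half buys you over plain regex, here is a minimal sketch using Python's standard &lt;code&gt;ast&lt;/code&gt; module. It is not Critik's implementation; the function name and the tiny rule set are assumptions for illustration.&lt;/p&gt;

```python
import ast

DANGEROUS_CALLS = {"eval", "exec"}  # illustrative rule set


def find_dangerous_calls(source, filename="snippet.py"):
    """Return (line, name) for each direct call to eval() or exec()."""
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in DANGEROUS_CALLS:
                findings.append((node.lineno, node.func.id))
    return sorted(findings)


sample = 'note = "eval() is risky"\nresult = eval(user_input)\n'
print(find_dangerous_calls(sample))
```

&lt;p&gt;Only line 2 is reported. The string literal on line 1 mentions eval() but is not a call, which is exactly the kind of match a naive regex would flag.&lt;/p&gt;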

&lt;p&gt;&lt;strong&gt;Pass 2 — AI Review (optional, the whole point)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Add &lt;code&gt;--ai&lt;/code&gt; and each finding goes to Groq's Llama 3.3 70B with the &lt;em&gt;full file&lt;/em&gt; as context. The model acts as a security analyst:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Verdict&lt;/strong&gt;: confirmed, false_positive, or needs_review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence&lt;/strong&gt;: 0-100%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: "This eval() parses trusted JSON config from a local file" vs "This eval() takes unsanitized user input from req.query"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fix&lt;/strong&gt;: actual code, not generic advice&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The AI doesn't replace the scanner. It reviews the scanner's work.&lt;/p&gt;
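
&lt;p&gt;One way to consume that kind of reply safely, sketched under assumptions: the JSON keys (&lt;code&gt;verdict&lt;/code&gt;, &lt;code&gt;confidence&lt;/code&gt;, &lt;code&gt;why&lt;/code&gt;, &lt;code&gt;fix&lt;/code&gt;) and the downgrade-to-needs_review rule are my guesses at a reasonable shape, not Critik's actual schema.&lt;/p&gt;

```python
import json
from dataclasses import dataclass

VERDICTS = {"confirmed", "false_positive", "needs_review"}


@dataclass
class Review:
    verdict: str
    confidence: int
    why: str
    fix: str


def parse_review(raw):
    """Parse one model reply; anything malformed is downgraded to needs_review."""
    try:
        data = json.loads(raw)
        verdict = data["verdict"]
        confidence = int(data["confidence"])
    except (ValueError, KeyError, TypeError):
        return Review("needs_review", 0, "unparseable model reply", "")
    if verdict not in VERDICTS:
        verdict = "needs_review"
    confidence = max(0, min(100, confidence))  # clamp to the 0-100 range
    return Review(verdict, confidence, str(data.get("why", "")), str(data.get("fix", "")))
```

&lt;p&gt;Treating a broken reply as needs_review keeps the finding visible. Silently dropping it would turn a model hiccup into a missed vulnerability.&lt;/p&gt;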

&lt;h2&gt;
  
  
  What It Looks Like
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;critik scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--ai&lt;/span&gt;
&lt;span class="go"&gt;
  Critik v0.4.0 — scanned 7 files  [AI]

  CRITICAL  nextjs-public-secret
  app/config.ts:18  Secret exposed to browser via NEXT_PUBLIC_ prefix
  | 18 | const key = process.env.NEXT_PUBLIC_SECRET_API_KEY
  CONFIRMED (95%)
&lt;/span&gt;&lt;span class="gp"&gt;  &amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;The NEXT_PUBLIC_ prefix exposes this API key to every browser.
&lt;span class="go"&gt;  Fix: Rename to SECRET_API_KEY and access server-side only

  CRITICAL  aws-access-key              (dimmed — false positive)
  tests/fixtures/bad_secrets.py:2  AWS access key detected
  FALSE POS (100%)
&lt;/span&gt;&lt;span class="gp"&gt;  &amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;This file is &lt;span class="k"&gt;in &lt;/span&gt;tests/fixtures — fake credentials &lt;span class="k"&gt;for &lt;/span&gt;testing.
&lt;span class="go"&gt;
  HIGH      sql-fstring
  app/db.py:6  SQL injection via f-string in execute()
  | 6 | db.execute(f"SELECT * FROM users WHERE id = {user_id}")
  CONFIRMED (95%)
&lt;/span&gt;&lt;span class="gp"&gt;  &amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;This f-string takes unsanitized input, allowing SQL injection.
&lt;span class="go"&gt;  Fix: db.execute('SELECT * FROM users WHERE id = ?', (user_id,))

  ─────────────────────────────────────────────
  7 files scanned in 2714ms — 23 findings: 10 critical, 8 high, 5 medium
  AI analysis: 7 confirmed, 16 false positives
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without AI: 23 findings. Developer overwhelmed. Ignores all of them.&lt;/p&gt;

&lt;p&gt;With AI: 7 real issues. 16 dismissed with reasons. Developer fixes what matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Under the Hood
&lt;/h2&gt;

&lt;p&gt;Each API call sends the full file content (up to 8K chars) plus all findings for that file. One call per file — keeps tokens low, context high.&lt;/p&gt;
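
&lt;p&gt;The batching step could look roughly like this. The finding dict shape, the &lt;code&gt;read_file&lt;/code&gt; callback, and reading "8K" as 8000 characters are all assumptions for the sketch.&lt;/p&gt;

```python
from collections import defaultdict

MAX_FILE_CHARS = 8000  # assumed reading of the "up to 8K chars" cap


def build_review_requests(findings, read_file):
    """Group findings by path and build one request payload per file."""
    by_file = defaultdict(list)
    for finding in findings:
        by_file[finding["path"]].append(finding)
    requests = []
    for path, file_findings in by_file.items():
        content = read_file(path)[:MAX_FILE_CHARS]  # truncate oversized files
        requests.append({"path": path, "content": content, "findings": file_findings})
    return requests
```

&lt;p&gt;Three findings in one file cost one call, not three, and the model sees them side by side with the same context.&lt;/p&gt;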

&lt;p&gt;The model sees imports, function signatures, data flow. It knows test fixtures are probably safe, &lt;code&gt;.env&lt;/code&gt; files are expected to have secrets, and &lt;code&gt;NEXT_PUBLIC_&lt;/code&gt; vars are client-exposed by design, so they only matter when the value is a secret.&lt;/p&gt;

&lt;p&gt;Temperature 0.2. Structured JSON back. If the API is down, Critik falls back to the static pass alone. No crash, no hang.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Catches
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Patterns&lt;/th&gt;
&lt;th&gt;Examples&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Secrets&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;AWS, GitHub, OpenAI, Anthropic, Stripe, Slack, DB URLs, JWTs, private keys&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Injection&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;SQL via f-string/concat, eval(), exec(), os.system(), subprocess shell=True, XSS&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Frameworks&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Supabase RLS, Firebase rules, NEXT_PUBLIC_ secrets, NextAuth, Prisma raw, Stripe webhooks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Config&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;NODE_ENV in source, insecure cookies, open CORS, debug mode&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Missing auth on routes, open endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dotenv&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Exposed .env, sensitive vars unencrypted&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Free. All of It.
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;critik
critik scan &lt;span class="nb"&gt;.&lt;/span&gt;                &lt;span class="c"&gt;# regex/AST only — offline, instant&lt;/span&gt;
critik scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--ai&lt;/span&gt;           &lt;span class="c"&gt;# + AI review (needs GROQ_API_KEY)&lt;/span&gt;
critik scan &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; fix   &lt;span class="c"&gt;# copy-paste fix prompts for Cursor/Claude&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pre-commit hook: &lt;code&gt;critik hook install&lt;/code&gt;. SARIF output for CI/CD. GitHub Action included. MIT license.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I Built This
&lt;/h2&gt;

&lt;p&gt;I hunt bugs on HackerOne and 0din. The vulnerabilities I find in production are the same patterns AI tools ship by default. Hardcoded keys. Missing auth. SQL injection. Open configs.&lt;/p&gt;

&lt;p&gt;The irony: AI coding tools are the biggest source of new vulnerabilities &lt;em&gt;and&lt;/em&gt; the best tool for catching them. A regex finds &lt;code&gt;eval()&lt;/code&gt;. Only an LLM can tell you if it's dangerous.&lt;/p&gt;

&lt;p&gt;Critik is the scanner I wanted. Now it exists.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Website&lt;/strong&gt;: &lt;a href="https://critik.dev" rel="noopener noreferrer"&gt;critik.dev&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/AlexlaGuardia/Critik" rel="noopener noreferrer"&gt;AlexlaGuardia/Critik&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;code&gt;pip install critik&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Run &lt;code&gt;critik scan .&lt;/code&gt; on your project. You might not like what you find.&lt;/p&gt;

</description>
      <category>security</category>
      <category>ai</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>OAuth2, Two APIs, and Soft Deletes — Building an MCP Server for FreshBooks</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Thu, 26 Mar 2026 22:33:14 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/oauth2-two-apis-and-soft-deletes-building-an-mcp-server-for-freshbooks-2g03</link>
      <guid>https://dev.to/alexlaguardia/oauth2-two-apis-and-soft-deletes-building-an-mcp-server-for-freshbooks-2g03</guid>
      <description>&lt;p&gt;Most MCP servers assume your target API hands you an API key and gets out of the way. FreshBooks doesn't. It requires full OAuth2, splits its API across two different base URLs, and has resources that can only be soft-deleted. Building this server meant solving problems most MCP tutorials don't prepare you for.&lt;/p&gt;

&lt;p&gt;The result: 25 tools covering invoices, clients, expenses, payments, time tracking, projects, estimates, and reports — with the auth flow, API quirks, and deletion edge cases handled for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Does
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP&lt;/a&gt; lets AI assistants interact with external tools directly. With this server installed, you manage your entire freelance business without leaving your AI assistant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before (manual):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Log into FreshBooks&lt;/li&gt;
&lt;li&gt;Navigate to Invoices → find overdue ones&lt;/li&gt;
&lt;li&gt;Note the client names and amounts&lt;/li&gt;
&lt;li&gt;Switch to your AI tool&lt;/li&gt;
&lt;li&gt;Type out all the details&lt;/li&gt;
&lt;li&gt;Ask what to do about it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;After (with mcp-freshbooks):&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Which invoices are overdue? Draft follow-up messages for each client based on how late they are."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude calls &lt;code&gt;list_invoices&lt;/code&gt; with a status filter, gets the details, and drafts personalized follow-ups — all in one shot.&lt;/p&gt;

&lt;h2&gt;
  
  
  25 Tools, Full Business Coverage
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Invoices&lt;/strong&gt; (6): List, get, create, update, delete, send by email&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clients&lt;/strong&gt; (4): List, get, create, archive with full contact details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Expenses&lt;/strong&gt; (3): List, get, create with category and tax support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payments&lt;/strong&gt; (2): List, record payments against invoices&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time Entries&lt;/strong&gt; (2): List, create with project/service association&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Projects&lt;/strong&gt; (2): List, get with budget and billing details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Estimates&lt;/strong&gt; (2): List, get with line items&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reports&lt;/strong&gt; (1): Profit &amp;amp; Loss report with date filtering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auth&lt;/strong&gt; (3): OAuth2 flow, identity check, connection test&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Technical Decisions Worth Sharing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Full OAuth2 — No Shortcuts
&lt;/h3&gt;

&lt;p&gt;FreshBooks requires OAuth2. No API keys, no shortcuts. The server handles the entire flow: it spins up a local HTTPS callback server, returns the authorization URL for you to open, catches the redirect with the auth code, exchanges it for tokens, and persists them to &lt;code&gt;~/.mcp-freshbooks/tokens.json&lt;/code&gt;. Token refresh is automatic: you authenticate once and forget about it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;freshbooks_authenticate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Start OAuth2 authentication. Returns a URL to open in your browser.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_config&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_auth_url&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Spins up localhost:8555 HTTPS callback server in background
&lt;/span&gt;    &lt;span class="nf"&gt;start_callback_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Open this URL to authorize:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was the hardest part of the build. Most MCP servers assume API keys. When your platform demands OAuth2, you either solve it properly or your server is useless.&lt;/p&gt;
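
&lt;p&gt;The persistence and refresh-check half of that flow can be sketched with the standard library alone. The function names, the JSON layout, and the 60-second safety margin are assumptions for illustration, not the server's actual code.&lt;/p&gt;

```python
import json
import os
import time
from pathlib import Path


def save_tokens(path, access_token, refresh_token, expires_in, now=None):
    """Write tokens with an absolute expiry time, readable only by the owner."""
    now = time.time() if now is None else now
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    payload = {
        "access_token": access_token,
        "refresh_token": refresh_token,
        "expires_at": now + expires_in,
    }
    path.write_text(json.dumps(payload))
    os.chmod(path, 0o600)  # tokens grant account access; keep them private


def needs_refresh(path, now=None, skew=60):
    """True when the stored access token is missing or about to expire."""
    now = time.time() if now is None else now
    path = Path(path)
    if not path.exists():
        return True
    return now >= json.loads(path.read_text())["expires_at"] - skew
```

&lt;p&gt;Storing an absolute &lt;code&gt;expires_at&lt;/code&gt; instead of the raw &lt;code&gt;expires_in&lt;/code&gt; means the check still works after a restart, and the skew avoids sending a token that dies mid-request.&lt;/p&gt;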

&lt;h3&gt;
  
  
  Two APIs, Two Base URLs
&lt;/h3&gt;

&lt;p&gt;FreshBooks has a split API: accounting resources (invoices, clients, expenses) live at &lt;code&gt;api.freshbooks.com/accounting/account/{account_id}/...&lt;/code&gt;, while project resources (projects, time entries) live at &lt;code&gt;api.freshbooks.com/projects/business/{business_id}/...&lt;/code&gt;. Different base URLs, different ID types.&lt;/p&gt;

&lt;p&gt;The client abstracts this completely:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ACCOUNTING_BASE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.freshbooks.com/accounting/account&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;PROJECTS_BASE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.freshbooks.com/projects/business&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;accounting_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...):&lt;/span&gt;
    &lt;span class="n"&gt;account_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;get_ids&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ACCOUNTING_BASE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;account_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;projects_list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...):&lt;/span&gt;
    &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;business_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;get_ids&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;PROJECTS_BASE&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;business_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tools never think about which API base to use — they just call the right function.&lt;/p&gt;

&lt;h3&gt;
  
  
  Soft Deletes vs Hard Deletes
&lt;/h3&gt;

&lt;p&gt;FreshBooks treats deletion differently depending on the resource. Invoices and estimates can be hard-deleted (actually removed). Clients and expenses can only be soft-deleted by setting &lt;code&gt;vis_state&lt;/code&gt; to 1, which flags the record as deleted without removing the data. Delete a client with the wrong endpoint and you get a cryptic 400 error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;accounting_delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Hard-delete (invoices, estimates).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;accounting_soft_delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wrapper_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Soft-delete via vis_state=1 (clients, expenses).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;accounting_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wrapper_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vis_state&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each tool uses the correct method — the AI never needs to know about this distinction.&lt;/p&gt;

&lt;h3&gt;
  
  
  The search[key] Query Format
&lt;/h3&gt;

&lt;p&gt;FreshBooks uses a non-standard query parameter format for filters: &lt;code&gt;search[status]=2&amp;amp;search[date_from]=2024-01-01&lt;/code&gt;. Not &lt;code&gt;status=2&lt;/code&gt;, not &lt;code&gt;filter[status]=2&lt;/code&gt; — specifically &lt;code&gt;search[key]&lt;/code&gt;. Get the format wrong and the API silently ignores your filters.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_build_search_params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;filters&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;v&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setdefault&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;][]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]).&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;search[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tools accept clean Python dicts and handle the formatting internally.&lt;/p&gt;
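
&lt;p&gt;To see what actually goes on the wire, here is a standalone version of the same idea fed through &lt;code&gt;urllib.parse.urlencode&lt;/code&gt;. The helper is a simplified re-creation for the demo, and the filter values are made up.&lt;/p&gt;

```python
from urllib.parse import urlencode


def build_search_params(filters):
    """Same shape as the helper above: wrap each key as search[key]."""
    params = {}
    for key, value in filters.items():
        if isinstance(value, list):
            params[f"search[{key}][]"] = [str(v) for v in value]
        else:
            params[f"search[{key}]"] = str(value)
    return params


params = build_search_params({"status": 2, "date_from": "2024-01-01"})
# doseq=True repeats list-valued keys; safe="[]" keeps the brackets readable
# instead of percent-encoding them as %5B and %5D.
query = urlencode(params, doseq=True, safe="[]")
print(query)
```

&lt;p&gt;Most HTTP clients do this encoding for you when you pass the dict as query params, so the string form is mainly useful for logging and debugging a filter that seems to be ignored.&lt;/p&gt;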

&lt;h2&gt;
  
  
  Get Started in 2 Minutes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mcp-freshbooks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Create OAuth App
&lt;/h3&gt;

&lt;p&gt;Go to &lt;a href="https://my.freshbooks.com/#/developer" rel="noopener noreferrer"&gt;my.freshbooks.com/#/developer&lt;/a&gt;, create an app, and note the client ID and secret. Set the redirect URI to &lt;code&gt;https://localhost:8555/callback&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Desktop
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;claude_desktop_config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"freshbooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp-freshbooks"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"FRESHBOOKS_CLIENT_ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-client-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"FRESHBOOKS_CLIENT_SECRET"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-client-secret"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then ask Claude to run &lt;code&gt;freshbooks_authenticate&lt;/code&gt;. It returns a URL where you authorize the app. This is a one-time step; tokens refresh automatically after that.&lt;/p&gt;

&lt;h3&gt;
  
  
  Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add freshbooks &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;env &lt;/span&gt;&lt;span class="nv"&gt;FRESHBOOKS_CLIENT_ID&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;id &lt;/span&gt;&lt;span class="nv"&gt;FRESHBOOKS_CLIENT_SECRET&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;secret mcp-freshbooks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cursor
&lt;/h3&gt;

&lt;p&gt;Same JSON config as Claude Desktop in &lt;code&gt;.cursor/mcp.json&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Add invoice line item support from day one.&lt;/strong&gt; The current &lt;code&gt;create_invoice&lt;/code&gt; accepts line items as a JSON string, which works but isn't the cleanest interface. A dedicated line-item builder would be more ergonomic for the AI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Handle plan-gated features more gracefully.&lt;/strong&gt; FreshBooks gates features by plan tier — time tracking, projects, and advanced reports require paid plans. The error handling catches 403s and explains this, but detecting plan limits upfront would be smoother.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons for MCP Server Builders
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Solve OAuth2 properly.&lt;/strong&gt; If your target platform requires it, don't punt — build the full flow with token persistence and auto-refresh. It's the difference between a demo and a tool people actually use.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Abstract API inconsistencies.&lt;/strong&gt; If the platform has split APIs, different deletion behaviors, or non-standard query formats — hide all of it. The AI should never deal with platform quirks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handle plan-tier errors.&lt;/strong&gt; SaaS platforms gate features by pricing tier. Catch permission errors and explain what's happening instead of returning raw 403s.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persist tokens securely.&lt;/strong&gt; Store tokens in a well-known location (&lt;code&gt;~/.mcp-freshbooks/&lt;/code&gt;) with clear documentation. Users shouldn't have to re-authenticate every session.&lt;/li&gt;
&lt;/ol&gt;
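&lt;p&gt;As a rough illustration of the last point, here is a minimal sketch of token persistence with owner-only file permissions. The &lt;code&gt;save_tokens&lt;/code&gt; and &lt;code&gt;load_tokens&lt;/code&gt; helpers and the &lt;code&gt;tokens.json&lt;/code&gt; filename are assumptions made for this example, not necessarily what mcp-freshbooks does internally, and the permission bits only mean something on POSIX systems.&lt;/p&gt;

```python
import json
import os
import stat
import tempfile
from pathlib import Path

TOKEN_FILENAME = "tokens.json"  # hypothetical filename, chosen for this sketch


def save_tokens(tokens, directory):
    """Write tokens as JSON, readable and writable by the owner only."""
    directory = Path(directory)
    directory.mkdir(mode=0o700, parents=True, exist_ok=True)
    path = directory / TOKEN_FILENAME
    # Create the file with 0600 up front so it is never briefly world-readable.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w") as handle:
        json.dump(tokens, handle)
    os.chmod(path, 0o600)  # tighten a file left over from an older run
    return path


def load_tokens(directory):
    """Return the saved tokens, or None if the user has not authenticated yet."""
    path = Path(directory) / TOKEN_FILENAME
    if not path.exists():
        return None
    return json.loads(path.read_text())


if __name__ == "__main__":
    # A temp directory stands in for ~/.mcp-freshbooks/ so the demo touches nothing real.
    with tempfile.TemporaryDirectory() as tmp:
        store = Path(tmp) / "store"
        saved = save_tokens({"access_token": "abc", "refresh_token": "xyz"}, store)
        print(oct(stat.S_IMODE(saved.stat().st_mode)), load_tokens(store))
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for a missing file gives the server an easy way to tell "never authenticated" apart from "token expired" and prompt accordingly.&lt;/p&gt;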

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/AlexlaGuardia/mcp-freshbooks" rel="noopener noreferrer"&gt;AlexlaGuardia/mcp-freshbooks&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href="https://pypi.org/project/mcp-freshbooks/" rel="noopener noreferrer"&gt;mcp-freshbooks&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;License&lt;/strong&gt;: MIT&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This is part of a series of production-grade MCP servers I'm building for underserved SaaS platforms. Also available: &lt;a href="https://github.com/AlexlaGuardia/mcp-mailchimp" rel="noopener noreferrer"&gt;Mailchimp&lt;/a&gt;, &lt;a href="https://github.com/AlexlaGuardia/mcp-woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://github.com/AlexlaGuardia/mcp-activecampaign" rel="noopener noreferrer"&gt;ActiveCampaign&lt;/a&gt;. Follow me here or on &lt;a href="https://github.com/AlexlaGuardia" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; to catch the next one.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Update (April 2026):&lt;/strong&gt; Since publishing, mcp-freshbooks has been expanded from 25 to &lt;strong&gt;53 tools&lt;/strong&gt; (v0.2.0). New coverage includes recurring invoices, 5 typed reports (P&amp;amp;L, tax, accounts aging, revenue, expense), and workflow tools like invoice-from-time and estimate-to-invoice conversion. &lt;code&gt;pip install mcp-freshbooks&lt;/code&gt; to get the latest.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>python</category>
      <category>freshbooks</category>
    </item>
    <item>
      <title>Zero to 33 Tools: Building the First MCP Server for ActiveCampaign</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Thu, 26 Mar 2026 22:32:11 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/zero-to-33-tools-building-the-first-mcp-server-for-activecampaign-f8j</link>
      <guid>https://dev.to/alexlaguardia/zero-to-33-tools-building-the-first-mcp-server-for-activecampaign-f8j</guid>
      <description>&lt;p&gt;I searched GitHub, npm, PyPI, and every MCP registry I could find for an ActiveCampaign MCP server. Zero results. Not a bad one, not an incomplete one — nothing. For a platform with 185,000 paying customers and a full-featured API, that gap felt worth filling.&lt;/p&gt;

&lt;p&gt;So I built it from scratch: 33 tools covering contacts, deals, automations, tags, pipelines, campaigns, custom fields, and webhooks.&lt;/p&gt;

&lt;h2&gt;
  
  
  What It Does
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://modelcontextprotocol.io/" rel="noopener noreferrer"&gt;MCP&lt;/a&gt; lets AI assistants interact with external tools directly. With this server installed, you manage your CRM and marketing automation without leaving your AI assistant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before (manual):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Log into ActiveCampaign&lt;/li&gt;
&lt;li&gt;Navigate to Contacts → search for a customer&lt;/li&gt;
&lt;li&gt;Check their tags, deals, automation history&lt;/li&gt;
&lt;li&gt;Switch to your AI tool&lt;/li&gt;
&lt;li&gt;Describe what you found&lt;/li&gt;
&lt;li&gt;Ask for analysis&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;After (with mcp-activecampaign):&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Find all contacts tagged 'enterprise-lead' and show me their deal pipeline status. Which ones haven't been contacted in 30 days?"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude calls &lt;code&gt;list_contacts&lt;/code&gt; with a tag filter, then &lt;code&gt;list_deals&lt;/code&gt; for each, and gives you an actionable priority list — all in one shot.&lt;/p&gt;

&lt;h2&gt;
  
  
  33 Tools, Full CRM Coverage
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Contacts&lt;/strong&gt; (7): List, get, create, update, delete, search, manage tags&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deals&lt;/strong&gt; (5): List, get, create, update, delete with pipeline/stage support&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tags&lt;/strong&gt; (4): List, create, add to contact, remove from contact&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lists&lt;/strong&gt; (2): List all, get details with subscriber counts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automations&lt;/strong&gt; (3): List, get details, add contacts to automations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pipelines&lt;/strong&gt; (2): List pipelines, list stages within pipelines&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Fields&lt;/strong&gt; (3): List fields, get values, set values per contact&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Campaigns&lt;/strong&gt; (2): List campaigns with stats, get full campaign details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accounts&lt;/strong&gt; (2): List and get company/organization records&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Webhooks&lt;/strong&gt; (3): List, create, delete for real-time event handling&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Technical Decisions Worth Sharing
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Client-Side Rate Limiting
&lt;/h3&gt;

&lt;p&gt;ActiveCampaign enforces 5 requests per second per account. Hit the limit and you get 429s that can cascade. Instead of reacting to failures, the client prevents them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;MAX_RPS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;
&lt;span class="n"&gt;MIN_INTERVAL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;MAX_RPS&lt;/span&gt;  &lt;span class="c1"&gt;# 0.2s between requests
&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_throttle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Enforce 5 req/s rate limit.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_lock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;monotonic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_last_request&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MIN_INTERVAL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MIN_INTERVAL&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;elapsed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_last_request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;monotonic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



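&lt;p&gt;An &lt;code&gt;asyncio.Lock&lt;/code&gt; serializes concurrent coroutines so only one of them updates the timestamp at a time (it guards tasks on a single event loop, not OS threads), and &lt;code&gt;time.monotonic()&lt;/code&gt; is unaffected by system clock adjustments. If a 429 still slips through (for example, a burst from another client on the same account), there's a fallback that respects the &lt;code&gt;Retry-After&lt;/code&gt; header:&lt;br&gt;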
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;retry_after&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Retry-After&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retry_after&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;kwargs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Belt and suspenders. The AI never sees rate limit errors.&lt;/p&gt;
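&lt;p&gt;For anyone who wants to try the pattern outside the server, here is a self-contained sketch of both halves. The &lt;code&gt;Throttle&lt;/code&gt; class and &lt;code&gt;parse_retry_after&lt;/code&gt; helper are hypothetical names for this example rather than the library's actual internals. The helper also handles the HTTP-date form of &lt;code&gt;Retry-After&lt;/code&gt;, which a bare &lt;code&gt;float()&lt;/code&gt; call would reject.&lt;/p&gt;

```python
import asyncio
import time
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


class Throttle:
    """Space out calls so at most max_rps of them start per second."""

    def __init__(self, max_rps=5.0):
        self.min_interval = 1.0 / max_rps
        self._lock = asyncio.Lock()
        self._last_request = None  # monotonic timestamp of the previous call

    async def wait(self):
        # The lock serializes coroutines on one event loop; it is not a thread lock.
        async with self._lock:
            if self._last_request is not None:
                elapsed = time.monotonic() - self._last_request
                await asyncio.sleep(max(0.0, self.min_interval - elapsed))
            self._last_request = time.monotonic()


def parse_retry_after(value, default=1.0):
    """Return seconds to wait for a Retry-After value (delay-seconds or HTTP-date)."""
    if value is None:
        return default
    try:
        return max(0.0, float(value))
    except ValueError:
        pass  # not a number, so try the HTTP-date form next
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    return max(0.0, (when - datetime.now(timezone.utc)).total_seconds())


async def demo():
    throttle = Throttle(max_rps=20)  # 0.05s spacing keeps the demo quick
    started = time.monotonic()
    await asyncio.gather(*(throttle.wait() for _ in range(3)))
    return time.monotonic() - started


if __name__ == "__main__":
    print(f"3 throttled calls took {asyncio.run(demo()):.2f}s")
```

&lt;p&gt;If you adapt the retry path, consider capping the number of attempts so a persistent 429 cannot loop forever.&lt;/p&gt;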

&lt;h3&gt;
  
  
  The "Deal Groups" Translation
&lt;/h3&gt;

&lt;p&gt;ActiveCampaign calls pipelines "deal groups" internally. The API endpoint is &lt;code&gt;/dealGroups&lt;/code&gt;, stages are filtered by &lt;code&gt;d_groupid&lt;/code&gt;, and creating a deal requires a &lt;code&gt;group&lt;/code&gt; field, not &lt;code&gt;pipeline&lt;/code&gt;. The mismatch is easy to trip over when building an integration, so the server hides it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_pipelines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;List deal pipelines (called &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deal groups&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; in AC API).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;ac&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/dealGroups&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;
    &lt;span class="n"&gt;pipelines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dealGroups&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="n"&gt;pipelines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="bp"&gt;...&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tools use the word "pipeline" (what users expect) while the client sends "dealGroup" (what the API expects). The AI works with natural language; the translation happens silently.&lt;/p&gt;
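&lt;p&gt;The same translation applies to tool arguments on the way in. A minimal sketch, assuming a hypothetical &lt;code&gt;to_api_fields&lt;/code&gt; helper: only the pipeline-to-group pair comes from the API behavior described above, and the rest of the payload is illustrative.&lt;/p&gt;

```python
# Maps tool-facing argument names to API field names.
# Only the pipeline/group pair is taken from the article; extend as needed.
FIELD_ALIASES = {"pipeline": "group"}


def to_api_fields(tool_args):
    """Rename user-friendly keys to the API's names and drop unset values."""
    return {
        FIELD_ALIASES.get(key, key): value
        for key, value in tool_args.items()
        if value is not None
    }


if __name__ == "__main__":
    # "owner" is None, so it is left out of the outgoing payload.
    print(to_api_fields({"title": "Renewal", "pipeline": "3", "owner": None}))
```

&lt;p&gt;Keeping the aliases in one table means a new naming quirk is a one-line change instead of a hunt through every tool.&lt;/p&gt;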

&lt;h3&gt;
  
  
  URL Normalization with API Path
&lt;/h3&gt;

&lt;p&gt;ActiveCampaign API URLs look like &lt;code&gt;https://youraccountname.api-us1.com/api/3/contacts&lt;/code&gt;. Users might pass just the account URL, or include the &lt;code&gt;/api/3&lt;/code&gt; suffix. The client normalizes both:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;base_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rstrip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/api/3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/api/3&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Small thing, but it eliminates a common setup failure.&lt;/p&gt;
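&lt;p&gt;Wrapped in a function, the behavior is easy to check against the inputs users tend to paste. The &lt;code&gt;normalize_base_url&lt;/code&gt; name is an assumption for this sketch; the logic is the same three lines shown above.&lt;/p&gt;

```python
def normalize_base_url(base_url):
    """Return the account URL ending in exactly one /api/3 suffix."""
    base_url = base_url.rstrip("/")
    if not base_url.endswith("/api/3"):
        base_url = f"{base_url}/api/3"
    return base_url


if __name__ == "__main__":
    for raw in (
        "https://youraccountname.api-us1.com",
        "https://youraccountname.api-us1.com/",
        "https://youraccountname.api-us1.com/api/3",
        "https://youraccountname.api-us1.com/api/3/",
    ):
        # Every variant should print the same normalized URL.
        print(normalize_base_url(raw))
```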

&lt;h3&gt;
  
  
  Read-Only Automations (And Being Honest About It)
&lt;/h3&gt;

&lt;p&gt;ActiveCampaign's API doesn't support creating automations programmatically — you can only list them, view details, and add contacts to existing ones. The tool docstrings say this explicitly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_automations&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;List automations (read-only — AC API doesn&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t support creating automations).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the AI knows the boundary, it can suggest alternatives ("Create the automation in the AC dashboard, then I can add contacts to it") instead of failing mysteriously.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started in 2 Minutes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;mcp-activecampaign
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Get Your API Credentials
&lt;/h3&gt;

&lt;p&gt;ActiveCampaign → Settings → Developer. You need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;API URL&lt;/strong&gt;: &lt;code&gt;https://youraccountname.api-us1.com&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Key&lt;/strong&gt;: The key shown on that page&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Claude Desktop
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;claude_desktop_config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"activecampaign"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp-activecampaign"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ACTIVECAMPAIGN_URL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://youraccountname.api-us1.com"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"ACTIVECAMPAIGN_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-api-key"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Claude Code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add activecampaign &lt;span class="nt"&gt;--&lt;/span&gt; &lt;span class="nb"&gt;env &lt;/span&gt;&lt;span class="nv"&gt;ACTIVECAMPAIGN_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;https://youraccountname.api-us1.com &lt;span class="nv"&gt;ACTIVECAMPAIGN_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;key mcp-activecampaign
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Cursor
&lt;/h3&gt;

&lt;p&gt;Same JSON config as Claude Desktop in &lt;code&gt;.cursor/mcp.json&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Add deal note support.&lt;/strong&gt; ActiveCampaign deals have a notes system that's heavily used by sales teams. I covered the core CRUD but skipped notes — they're high-value for AI-assisted sales workflows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Build contact-to-deal linking tools.&lt;/strong&gt; The current &lt;code&gt;create_deal&lt;/code&gt; accepts a &lt;code&gt;contact_id&lt;/code&gt;, but there's no tool to view or manage the contact-deal association after creation. That relationship is central to how AC users think about their CRM.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons for MCP Server Builders
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
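&lt;strong&gt;Rate limit proactively, not reactively.&lt;/strong&gt; Client-side throttling avoids most 429s before they happen, which is cheaper than hitting the limit and retrying; keep a &lt;code&gt;Retry-After&lt;/code&gt; fallback for the ones that still slip through. &lt;code&gt;asyncio.Lock&lt;/code&gt; plus &lt;code&gt;time.monotonic()&lt;/code&gt; makes for a clean implementation.&lt;/li&gt;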
&lt;li&gt;
&lt;strong&gt;Translate internal naming to user naming.&lt;/strong&gt; If the API calls something "dealGroups" but users call it "pipelines," use the user's word in your tools. The translation is invisible and the ergonomics improve dramatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document API limitations in docstrings.&lt;/strong&gt; If the platform doesn't support creating a resource via API, say so in the tool description. The AI uses docstrings to decide what to suggest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Be the first mover.&lt;/strong&gt; When you search for "[Platform] MCP server" and find zero results, that's a signal. 185K customers and nobody built this? Ship it.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Links
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/AlexlaGuardia/mcp-activecampaign" rel="noopener noreferrer"&gt;AlexlaGuardia/mcp-activecampaign&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PyPI&lt;/strong&gt;: &lt;a href="https://pypi.org/project/mcp-activecampaign/" rel="noopener noreferrer"&gt;mcp-activecampaign&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;License&lt;/strong&gt;: MIT&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This is part of a series of production-grade MCP servers I'm building for underserved SaaS platforms. Also available: &lt;a href="https://github.com/AlexlaGuardia/mcp-mailchimp" rel="noopener noreferrer"&gt;Mailchimp&lt;/a&gt;, &lt;a href="https://github.com/AlexlaGuardia/mcp-woocommerce" rel="noopener noreferrer"&gt;WooCommerce&lt;/a&gt;, &lt;a href="https://github.com/AlexlaGuardia/mcp-freshbooks" rel="noopener noreferrer"&gt;FreshBooks&lt;/a&gt;. Follow me here or on &lt;a href="https://github.com/AlexlaGuardia" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; to catch the next one.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Update (April 2026):&lt;/strong&gt; Since publishing, mcp-activecampaign has been expanded from 33 to &lt;strong&gt;65 tools&lt;/strong&gt; (v0.2.0). New coverage includes lead scoring, saved segments, campaign create/send, forms, goals, deal custom fields, CRM notes and tasks, event tracking, ecommerce, and bulk import. &lt;code&gt;pip install mcp-activecampaign&lt;/code&gt; to get the latest.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>python</category>
      <category>activecampaign</category>
    </item>
    <item>
      <title>How I Built a Full Product in One Night with 3 Parallel AI Agents</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Wed, 25 Mar 2026 22:12:15 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/how-i-built-a-full-product-in-one-night-with-3-parallel-ai-agents-1bpk</link>
      <guid>https://dev.to/alexlaguardia/how-i-built-a-full-product-in-one-night-with-3-parallel-ai-agents-1bpk</guid>
      <description>&lt;p&gt;Last Thursday night I sat down to add session handoff to my Python library. I stood up 8 hours later with a complete product: MCP server, REST API, embedded dashboard, event triggers, signal compaction. From 1,200 lines to 5,900. From a CLI tool to something with a web UI.&lt;/p&gt;

&lt;p&gt;Here's how, and why the technique matters more than the project.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Setup
&lt;/h2&gt;

&lt;p&gt;I'd built &lt;a href="https://github.com/AlexlaGuardia/Vigil" rel="noopener noreferrer"&gt;Vigil&lt;/a&gt; — an awareness daemon for AI agents. It worked great as a CLI tool: emit signals, compile state, boot agents with context. But it was missing the features that make it a real product: session handoff, an MCP server for Claude/Cursor integration, a REST API, and a dashboard.&lt;/p&gt;

&lt;p&gt;Each of these features lives in its own module. They share the database layer but have no code dependencies on each other. That's the key insight.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Technique: Parallel AI Windows
&lt;/h2&gt;

&lt;p&gt;I opened three Claude Code terminal windows, each working on a separate file:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Window 1:&lt;/strong&gt; Session handoff protocol (&lt;code&gt;handoff.py&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Window 2:&lt;/strong&gt; Signal compaction engine (&lt;code&gt;compaction.py&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Window 3:&lt;/strong&gt; MCP server mode (&lt;code&gt;mcp_server.py&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each window had the same context: the existing codebase, the database schema, the module interfaces. But they worked independently, writing to separate files. No merge conflicts.&lt;/p&gt;

&lt;p&gt;While those three were building, I reviewed their output periodically and planned the next batch. When all three finished, I opened a single integration window that wired everything together: imports, CLI commands, shared database migrations.&lt;/p&gt;

&lt;p&gt;Then I repeated the pattern:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Window 1:&lt;/strong&gt; REST API (&lt;code&gt;api.py&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Window 2:&lt;/strong&gt; Dashboard templates (5 HTML pages)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Window 3:&lt;/strong&gt; Event triggers (&lt;code&gt;triggers.py&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same idea. Independent files, parallel execution, single integration pass.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Made It Work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Clean module boundaries.&lt;/strong&gt; Every feature was a new file that imported from existing modules (&lt;code&gt;db.py&lt;/code&gt;, &lt;code&gt;signals.py&lt;/code&gt;, &lt;code&gt;awareness.py&lt;/code&gt;). No feature needed to modify another feature's code. This isn't accidental — I designed the architecture knowing I'd build this way.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Stable interfaces.&lt;/strong&gt; The database schema and the &lt;code&gt;VigilDB&lt;/code&gt; API were stable. All three windows could &lt;code&gt;from vigil.db import VigilDB&lt;/code&gt; and trust that the interface wouldn't change under them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Integration is the bottleneck, not implementation.&lt;/strong&gt; Each module took 30-60 minutes to build. Integration — wiring CLI commands, updating &lt;code&gt;__init__.py&lt;/code&gt;, running the test suite — took 20 minutes per batch. The integration step is where I caught type mismatches, missing imports, and interface disagreements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Mid-session code audit.&lt;/strong&gt; After the first batch (handoff + compaction + MCP), I ran a full code audit before starting batch two. Found 3 critical issues and 6 important ones. Fixing them before building the REST API prevented those bugs from propagating.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Before (v0.1)&lt;/th&gt;
&lt;th&gt;After (v1.0)&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Lines of code&lt;/td&gt;
&lt;td&gt;1,278&lt;/td&gt;
&lt;td&gt;5,922&lt;/td&gt;
&lt;td&gt;+4,644&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Modules&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;+6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tests&lt;/td&gt;
&lt;td&gt;48&lt;/td&gt;
&lt;td&gt;196&lt;/td&gt;
&lt;td&gt;+148&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CLI commands&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;+7&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;14 commits over 8 hours. Zero merge conflicts.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I'd Do Differently
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start with the integration test, not the unit tests.&lt;/strong&gt; Each window wrote unit tests for its module. But the integration tests — "emit a signal, compile awareness, check the dashboard shows it" — came last. If I'd written those first, I'd have caught interface mismatches earlier.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Define the REST API schema before building the dashboard.&lt;/strong&gt; Window 2 (dashboard) had to guess what the API response shapes would be, because Window 1 (API) was still being built. A shared types file or API schema would have eliminated that friction.&lt;/p&gt;
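&lt;p&gt;A minimal sketch of what such a shared types file could look like. The class and field names (&lt;code&gt;SignalOut&lt;/code&gt;, &lt;code&gt;StatusOut&lt;/code&gt;, &lt;code&gt;to_json_dict&lt;/code&gt;) are hypothetical and are not Vigil's actual API. The idea is only that both windows import the same definitions, so the dashboard window never has to guess a response shape:&lt;/p&gt;

```python
# api_types.py: hypothetical shared schema, agreed on before either window starts.
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class SignalOut:
    """One signal as the API would return it (illustrative fields)."""
    agent_id: str
    signal_type: str
    content: str


@dataclass(frozen=True)
class StatusOut:
    """Response shape for a hypothetical status endpoint."""
    active_frame: str
    recent_signals: list[SignalOut] = field(default_factory=list)


def to_json_dict(status: StatusOut) -> dict:
    """What the API window serializes and the dashboard window parses."""
    return asdict(status)


payload = to_json_dict(
    StatusOut(
        active_frame="backend",
        recent_signals=[SignalOut("backend-agent", "observation", "Tests passing.")],
    )
)
```

&lt;p&gt;With a file like this checked in first, a change to a response shape shows up as a diff both windows can see, instead of as a surprise during integration.&lt;/p&gt;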

&lt;h2&gt;
  
  
  Why This Matters Beyond My Project
&lt;/h2&gt;

&lt;p&gt;The parallel AI window technique works for any codebase with clean module boundaries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Microservices:&lt;/strong&gt; Each window builds a different service&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frontend components:&lt;/strong&gt; Each window builds a different page/component&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data pipelines:&lt;/strong&gt; Each window builds a different stage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The constraint is the same: modules must be independently buildable with stable interfaces between them. If feature B needs to call feature A's code and that code doesn't exist yet, you can't parallelize them.&lt;/p&gt;

&lt;p&gt;This is also a strong argument for writing clean interfaces first. The 30 minutes I spent designing &lt;code&gt;VigilDB&lt;/code&gt;'s API saved hours of integration pain later.&lt;/p&gt;
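&lt;p&gt;The parallelization constraint can be checked mechanically. The sketch below uses the standard library's &lt;code&gt;graphlib&lt;/code&gt; to group modules into batches where everything in a batch can be built at the same time. The dependency map and the &lt;code&gt;parallel_batches&lt;/code&gt; helper are made up for illustration and do not describe Vigil's real module graph:&lt;/p&gt;

```python
from graphlib import TopologicalSorter


def parallel_batches(deps: dict[str, set[str]]) -> list[list[str]]:
    """Group modules into batches; everything in one batch can be built in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()  # raises CycleError if modules depend on each other in a loop
    batches = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every dependency is already built
        batches.append(ready)
        sorter.done(*ready)
    return batches


# Hypothetical dependency map: module -> modules it imports from.
deps = {
    "db": set(),
    "feature_a": {"db"},
    "feature_b": {"feature_a"},  # B calls A's code, so B has to wait for A
    "feature_c": {"db"},
}
batches = parallel_batches(deps)
```

&lt;p&gt;Here &lt;code&gt;feature_a&lt;/code&gt; and &lt;code&gt;feature_c&lt;/code&gt; land in the same batch because both only need the stable &lt;code&gt;db&lt;/code&gt; interface, while &lt;code&gt;feature_b&lt;/code&gt; is pushed to a later batch.&lt;/p&gt;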

&lt;h2&gt;
  
  
  The Product
&lt;/h2&gt;

&lt;p&gt;Vigil v1.5.0 is on PyPI. It gives AI agents persistent awareness across sessions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Awareness daemon&lt;/strong&gt; compiles system state every 90 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Frame-based tool filtering&lt;/strong&gt; reduces context by 75-85%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Signal protocol&lt;/strong&gt; lets agents coordinate without direct communication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Session handoff&lt;/strong&gt; with structured summaries and resume&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server&lt;/strong&gt; with 12 tools (Claude Code, Cursor, Claude Desktop)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;REST API&lt;/strong&gt; with 20 endpoints and SSE event stream&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard&lt;/strong&gt; with live updates
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vigil-agent
vigil init
vigil daemon start
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/AlexlaGuardia/Vigil" rel="noopener noreferrer"&gt;github.com/AlexlaGuardia/Vigil&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you're building with AI coding assistants and want to move faster, try the parallel window technique on your next feature batch. The key is architecture that supports it — clean boundaries, stable interfaces, independent files.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>python</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I Built a Nervous System for AI Agents (Not Another Memory Store)</title>
      <dc:creator>Alex LaGuardia</dc:creator>
      <pubDate>Wed, 25 Mar 2026 22:09:18 +0000</pubDate>
      <link>https://dev.to/alexlaguardia/i-built-a-nervous-system-for-ai-agents-not-another-memory-store-5a8a</link>
      <guid>https://dev.to/alexlaguardia/i-built-a-nervous-system-for-ai-agents-not-another-memory-store-5a8a</guid>
      <description>&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Everyone's building AI agents. Nobody's building the infrastructure to keep them aware.&lt;/p&gt;

&lt;p&gt;I've been running ~95 MCP tools across multiple AI agents for the past year — a coding assistant, a trading system, a creative writing setup. Three problems kept hitting me:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Cold starts.&lt;/strong&gt; Every new session starts from zero. The agent has no idea what happened 5 minutes ago in a different session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Token bloat.&lt;/strong&gt; Loading 95 tool definitions into context burns ~50,000 tokens before the agent does a single useful thing. That's real money and real context window wasted on tools the agent won't use.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. No coordination.&lt;/strong&gt; Multiple agents working on the same system can't hand off work or share awareness without me copy-pasting context between them.&lt;/p&gt;

&lt;p&gt;The existing tools (Mem0, Letta, LangGraph) solve pieces of this. Mem0 does memory retrieval. Letta does stateful agents. LangGraph does workflow state. But none of them give agents &lt;strong&gt;awareness&lt;/strong&gt;: a continuously compiled understanding of what's happening right now.&lt;/p&gt;

&lt;h2&gt;
  
  
  What If Agents Had a Nervous System?
&lt;/h2&gt;

&lt;p&gt;Memory stores are filing cabinets. You put stuff in, you pull stuff out. That's useful, but it's not how awareness works.&lt;/p&gt;

&lt;p&gt;Your nervous system doesn't wait for you to query it. It continuously processes signals from your environment and compiles them into a state that's instantly available. You don't boot up every morning and run &lt;code&gt;SELECT * FROM memories WHERE relevant = true&lt;/code&gt;. You just... know what's going on.&lt;/p&gt;

&lt;p&gt;That's what I built.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vigil: The Six Ideas
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. The Awareness Daemon
&lt;/h3&gt;

&lt;p&gt;A background process runs every 90 seconds, reading signals from agents and compiling them into "hot context" — a structured snapshot any agent can boot from instantly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VigilDaemon&lt;/span&gt;

&lt;span class="n"&gt;daemon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VigilDaemon&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;db_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vigil.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;compile_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;awareness_file&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;AWARENESS.md&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;daemon&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When an agent starts a session, it calls &lt;code&gt;compiler.boot()&lt;/code&gt; and gets full context in under a second: active frame, current work, recent signals, priority queue. The context is already compiled, so nothing has to be rebuilt at startup.&lt;/p&gt;

&lt;p&gt;The daemon also writes an &lt;code&gt;AWARENESS.md&lt;/code&gt; file — human-readable, version-controllable. My agents and I read the same file.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Frame-Based Tool Filtering
&lt;/h3&gt;

&lt;p&gt;This was the biggest win. Instead of loading all tools into every context, you tag tools with "frames" — named context modes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil.registry&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tool_count&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;deploy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deploy to production&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backend&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;devops&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;deploy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;write_chapter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write a story chapter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;creative&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;write_chapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="nd"&gt;@tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;health&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Health check&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="n"&gt;frames&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;core&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;  &lt;span class="c1"&gt;# Always visible
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;health&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="nf"&gt;tool_count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;              &lt;span class="c1"&gt;# 3 (all tools)
&lt;/span&gt;&lt;span class="nf"&gt;tool_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backend&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;     &lt;span class="c1"&gt;# 2 (deploy + health)
&lt;/span&gt;&lt;span class="nf"&gt;tool_count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;creative&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;    &lt;span class="c1"&gt;# 2 (write_chapter + health)
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An agent in "backend" mode never sees creative writing tools. In my setup, this took tool definitions from 95 down to 14-25 per session — a &lt;strong&gt;75-85% reduction&lt;/strong&gt; in tool-definition tokens. The LLM also makes better tool choices with fewer irrelevant options.&lt;/p&gt;
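&lt;p&gt;To make the filtering rule concrete, here is a small stand-alone sketch of how a frame-aware registry could work. It is not Vigil's implementation: the &lt;code&gt;visible_tools&lt;/code&gt; helper and the internal registry are hypothetical, and it assumes that tools tagged &lt;code&gt;core&lt;/code&gt; are shown in every frame, as the "Always visible" comment in the example suggests:&lt;/p&gt;

```python
# A sketch of frame filtering, not Vigil's implementation.
ALWAYS_VISIBLE = "core"  # assumption: tools tagged "core" appear in every frame
_REGISTRY = {}  # tool name -> set of frames it belongs to


def tool(name, frames):
    """Register a function under the given frames (hypothetical decorator)."""
    def decorator(func):
        _REGISTRY[name] = set(frames)
        return func
    return decorator


def visible_tools(frame=None):
    """Names of tools an agent in `frame` would see; None means no filtering."""
    if frame is None:
        return sorted(_REGISTRY)
    return sorted(
        name for name, frames in _REGISTRY.items()
        if frame in frames or ALWAYS_VISIBLE in frames
    )


@tool("deploy", frames=["backend", "devops"])
def deploy(args): ...


@tool("write_chapter", frames=["creative"])
def write_chapter(args): ...


@tool("health", frames=["core"])
def health(args): ...
```

&lt;p&gt;Calling &lt;code&gt;visible_tools("backend")&lt;/code&gt; on this registry returns only &lt;code&gt;deploy&lt;/code&gt; and &lt;code&gt;health&lt;/code&gt;. Only those definitions would be sent to the model.&lt;/p&gt;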

&lt;h3&gt;
  
  
  3. Signal Protocol
&lt;/h3&gt;

&lt;p&gt;Agents communicate through signals — short, categorized messages with content budgets:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;Budget&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;observation&lt;/td&gt;
&lt;td&gt;400 chars&lt;/td&gt;
&lt;td&gt;Regular activity updates&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;handoff&lt;/td&gt;
&lt;td&gt;600 chars&lt;/td&gt;
&lt;td&gt;Session conclusions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;summary&lt;/td&gt;
&lt;td&gt;800 chars&lt;/td&gt;
&lt;td&gt;Comprehensive summaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;alert&lt;/td&gt;
&lt;td&gt;300 chars&lt;/td&gt;
&lt;td&gt;Urgent notifications&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;SignalBus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VigilDB&lt;/span&gt;

&lt;span class="n"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;VigilDB&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;vigil.db&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;bus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SignalBus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;bus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backend-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Deployed auth service v2. Tests passing.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;bus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;frontend-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Dashboard layout refactored for mobile.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Content budgets prevent runaway data. The daemon reads these signals, synthesizes them into the awareness summary, and moves on. Agents don't talk to each other — they emit into the bus and the daemon handles the rest.&lt;/p&gt;
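&lt;p&gt;As an illustration of what a content budget means in practice, the sketch below applies the per-type limits from the table. The &lt;code&gt;fit_to_budget&lt;/code&gt; function is hypothetical, and truncating over-budget content (rather than rejecting it) is an assumption, not a description of what Vigil does:&lt;/p&gt;

```python
# Budgets from the table above, in characters.
BUDGETS = {"observation": 400, "handoff": 600, "summary": 800, "alert": 300}


def fit_to_budget(signal_type: str, content: str) -> str:
    """Trim content to its type's budget (assumption: over-budget text is truncated)."""
    limit = BUDGETS[signal_type]  # KeyError for an unknown signal type
    if len(content) > limit:
        return content[: limit - 3] + "..."
    return content


short = fit_to_budget("alert", "Disk usage at 91% on build host.")
trimmed = fit_to_budget("alert", "x" * 1000)
```

&lt;p&gt;A hard cap like this keeps the size of the compiled awareness summary bounded by the number of signals, not by how chatty any single agent is.&lt;/p&gt;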

&lt;h3&gt;
  
  
  4. Session Handoff
&lt;/h3&gt;

&lt;p&gt;This is what makes multi-session work actually work. Agents end sessions with structured summaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HandoffProtocol&lt;/span&gt;

&lt;span class="n"&gt;proto&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HandoffProtocol&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;proto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;end_session&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;backend-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Shipped auth v2 with JWT tokens&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;files_touched&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;auth.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;middleware.py&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;decisions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Switched from session cookies to JWT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;next_steps&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Add rate limiting&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Write integration tests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Next morning, different agent resumes
&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;proto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;morning-agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Includes: last handoff, signals since, pending next steps
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Handoff chains track continuity across sessions. The resume context tells the next agent exactly what happened, what decisions were made, and what to do next. No more "remind me what we were working on."&lt;/p&gt;

&lt;h3&gt;
  
  
  5. Signal Compaction
&lt;/h3&gt;

&lt;p&gt;Signals accumulate. Without compaction, your awareness context grows forever. Vigil uses tiered retention:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Raw signals&lt;/strong&gt; — kept for 7 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Daily summaries&lt;/strong&gt; — synthesized from raw, kept for 30 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Weekly digests&lt;/strong&gt; — synthesized from daily, kept for 90 days&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monthly snapshots&lt;/strong&gt; — permanent archive
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vigil compact &lt;span class="nt"&gt;--dry-run&lt;/span&gt;  &lt;span class="c"&gt;# Preview what would be compacted&lt;/span&gt;
vigil compact            &lt;span class="c"&gt;# Run it&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;History stays manageable without losing important context.&lt;/p&gt;
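&lt;p&gt;The retention tiers can be read as a simple expiry rule per tier. The sketch below encodes the windows from the list above. The &lt;code&gt;expired&lt;/code&gt; helper is hypothetical, and treating a record as still valid on the exact day its window ends is an assumption:&lt;/p&gt;

```python
from datetime import date, timedelta

# Retention windows from the list above; None means kept permanently.
RETENTION_DAYS = {"raw": 7, "daily": 30, "weekly": 90, "monthly": None}


def expired(tier: str, created: date, today: date) -> bool:
    """True once a record in `tier` has outlived its retention window."""
    keep_days = RETENTION_DAYS[tier]
    if keep_days is None:
        return False  # permanent archive never expires
    return today - created > timedelta(days=keep_days)


today = date(2026, 3, 25)
old_raw = expired("raw", today - timedelta(days=8), today)
fresh_daily = expired("daily", today - timedelta(days=8), today)
old_monthly = expired("monthly", today - timedelta(days=4000), today)
```

&lt;p&gt;An eight-day-old raw signal is past its window, while a daily summary of the same age is still kept, which is how detail fades while the outline of the history survives.&lt;/p&gt;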

&lt;h3&gt;
  
  
  6. Event Triggers
&lt;/h3&gt;

&lt;p&gt;Pattern-match on signals and fire actions automatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;vigil&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TriggerManager&lt;/span&gt;

&lt;span class="n"&gt;triggers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TriggerManager&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;triggers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alert-to-slack&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;signal_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;alert&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;agent_pattern&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;action_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;webhook&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;action_config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://hooks.slack.com/...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"If any agent emits an alert, post to Slack." "If the backend agent goes silent for 2 hours, create a focus item." Triggers turn Vigil from a passive awareness layer into an active coordination system.&lt;/p&gt;
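&lt;p&gt;The matching step behind a trigger can be sketched in a few lines. This is a guess at the behavior, not Vigil's code: the &lt;code&gt;Trigger&lt;/code&gt; dataclass and &lt;code&gt;matching_triggers&lt;/code&gt; function are hypothetical, and the sketch assumes &lt;code&gt;agent_pattern&lt;/code&gt; is a shell-style glob, which the &lt;code&gt;"*"&lt;/code&gt; in the example suggests but does not confirm:&lt;/p&gt;

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase


@dataclass
class Trigger:
    name: str
    signal_type: str
    agent_pattern: str  # assumption: shell-style glob, as "*" in the example suggests


def matching_triggers(triggers, signal_type, agent_id):
    """Names of triggers whose signal type and agent pattern both match."""
    return [
        t.name for t in triggers
        if t.signal_type == signal_type and fnmatchcase(agent_id, t.agent_pattern)
    ]


triggers = [
    Trigger("alert-to-slack", "alert", "*"),
    Trigger("backend-handoffs", "handoff", "backend-*"),
]
fired = matching_triggers(triggers, "alert", "frontend-agent")
```

&lt;p&gt;An alert from any agent matches the first trigger, while the second only fires for handoffs from agents whose id starts with &lt;code&gt;backend-&lt;/code&gt;. Running the configured action (the webhook call) would happen after this match and is left out of the sketch.&lt;/p&gt;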

&lt;h2&gt;
  
  
  MCP Server: The Distribution Play
&lt;/h2&gt;

&lt;p&gt;Everything above is available as an MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;vigil serve                          &lt;span class="c"&gt;# stdio (Claude Code, Claude Desktop)&lt;/span&gt;
vigil serve &lt;span class="nt"&gt;--transport&lt;/span&gt; sse          &lt;span class="c"&gt;# SSE (remote clients)&lt;/span&gt;
vigil serve &lt;span class="nt"&gt;--transport&lt;/span&gt; http         &lt;span class="c"&gt;# REST API + dashboard&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;12 MCP tools: boot, compile, signal, status, signals, handoff, resume, chain, stale, focus, frames, agents.&lt;/p&gt;

&lt;p&gt;Any MCP-compatible client (Claude Code, Cursor, Windsurf, Claude Desktop) connects and gets persistent awareness. The agent boots with context, emits signals during work, and hands off when done. Next session picks up where it left off.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Numbers
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Modules&lt;/td&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lines of code&lt;/td&gt;
&lt;td&gt;7,100+&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tests&lt;/td&gt;
&lt;td&gt;252&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;MCP tools&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;REST endpoints&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dashboard pages&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dependencies&lt;/td&gt;
&lt;td&gt;0 (stdlib only, MCP is optional)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Infrastructure&lt;/td&gt;
&lt;td&gt;SQLite (zero setup)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Why Not [Existing Tool]?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Vigil&lt;/th&gt;
&lt;th&gt;Mem0&lt;/th&gt;
&lt;th&gt;Letta&lt;/th&gt;
&lt;th&gt;LangGraph&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Approach&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Awareness daemon&lt;/td&gt;
&lt;td&gt;Memory retrieval&lt;/td&gt;
&lt;td&gt;Stateful runtime&lt;/td&gt;
&lt;td&gt;State machine&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Context&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pre-compiled, instant&lt;/td&gt;
&lt;td&gt;Query on demand&lt;/td&gt;
&lt;td&gt;LLM-managed&lt;/td&gt;
&lt;td&gt;Checkpoint-based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Tool filtering&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Frame-based (75-85% savings)&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-agent&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Signal protocol + handoff&lt;/td&gt;
&lt;td&gt;Shared memory&lt;/td&gt;
&lt;td&gt;Single agent&lt;/td&gt;
&lt;td&gt;Graph edges&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compaction&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tiered retention&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;td&gt;LLM-managed&lt;/td&gt;
&lt;td&gt;None&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MCP native&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Built-in server&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Infrastructure&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SQLite&lt;/td&gt;
&lt;td&gt;API + LLM costs&lt;/td&gt;
&lt;td&gt;Full runtime&lt;/td&gt;
&lt;td&gt;LangChain ecosystem&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These aren't competitors — they're complementary. Vigil handles awareness and coordination. Mem0 handles deep memory. Use both.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;vigil-agent

vigil init
vigil signal my-agent &lt;span class="s2"&gt;"Starting work on the auth system"&lt;/span&gt;
vigil daemon start
vigil status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or as an MCP server:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"vigil-agent[mcp]"&lt;/span&gt;
vigil serve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;252 tests. MIT license. Zero external dependencies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/AlexlaGuardia/Vigil" rel="noopener noreferrer"&gt;github.com/AlexlaGuardia/Vigil&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;PyPI:&lt;/strong&gt; &lt;a href="https://pypi.org/project/vigil-agent/" rel="noopener noreferrer"&gt;pypi.org/project/vigil-agent&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This is v1.5.0. The roadmap includes a hosted multi-tenant platform, federation protocol for cross-org agent coordination, and eventually a hardware device (Pi-based always-on awareness hub). If you're building multi-agent systems and fighting the same problems, I'd love to hear how you're solving them.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>mcp</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
