<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Cloud(x);</title>
    <description>The latest articles on DEV Community by Cloud(x); (@cloudx).</description>
    <link>https://dev.to/cloudx</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F1620%2F772f6ad0-995c-48ff-9f1d-8b28f2700a5b.jpg</url>
      <title>DEV Community: Cloud(x);</title>
      <link>https://dev.to/cloudx</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cloudx"/>
    <language>en</language>
    <item>
      <title>Stop Wasting Time on CVEs That Don't Affect You</title>
      <dc:creator>Juan M. Altamirano</dc:creator>
      <pubDate>Mon, 16 Mar 2026 18:47:53 +0000</pubDate>
      <link>https://dev.to/cloudx/stop-wasting-time-on-cves-that-dont-affect-you-1ln9</link>
      <guid>https://dev.to/cloudx/stop-wasting-time-on-cves-that-dont-affect-you-1ln9</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;Aren't you tired of pushing new code and then a few days later receiving an alert from Github's Dependabot? Well, I am.&lt;br&gt;
The most annoying part is looking for the CVE, reviewing your code and then detecting that you aren't using the affected part. Rinse and repeat for every single alert.&lt;/p&gt;


&lt;h2&gt;
  
  
  The solution?
&lt;/h2&gt;

&lt;p&gt;That's why I built dep_shield — a CLI that I can plug into my common workflow (lint -&amp;gt; dep_shield -&amp;gt; tests -&amp;gt; sonar) and get a straight answer: "this CVE affects you" or "relax, you're fine."&lt;/p&gt;


&lt;h2&gt;
  
  
  How dep_shield Works
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fcloudx-labs%2Fposts%2Fmain%2Fposts%2Fjuanmaalt%2Fassets%2Fdep_shield_flow.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fcloudx-labs%2Fposts%2Fmain%2Fposts%2Fjuanmaalt%2Fassets%2Fdep_shield_flow.png" alt="dep_shield analysis flow"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The flow is straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Parse dependencies&lt;/strong&gt; — Read &lt;code&gt;requirements.txt&lt;/code&gt; or &lt;code&gt;pyproject.toml&lt;/code&gt;, extract packages and versions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Check for CVEs&lt;/strong&gt; — Query the OSV database for known vulnerabilities
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Find usage in code&lt;/strong&gt; — Scan your Python files to see where you import vulnerable packages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;AI-powered analysis&lt;/strong&gt; — Send the CVE description + your import context to an LLM and ask: "Does this actually affect me?"&lt;/li&gt;
&lt;/ol&gt;


&lt;h2&gt;
  
  
  The Interesting Parts
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Parsing Dependencies (Both Formats)
&lt;/h3&gt;

&lt;p&gt;The tool supports &lt;code&gt;requirements.txt&lt;/code&gt; and &lt;code&gt;pyproject.toml&lt;/code&gt; — it figures out which one you're using.&lt;br&gt;
Why both? Well, &lt;code&gt;requirements.txt&lt;/code&gt; is still widely used, but &lt;code&gt;uv&lt;/code&gt; is taking over fast.&lt;/p&gt;
&lt;h3&gt;
  
  
  Querying OSV (the free vulnerability database)
&lt;/h3&gt;

&lt;p&gt;OSV over NVD or Snyk? No API key, no rate limits, no pricing tiers. NVD is slow and wants you to parse CPE identifiers. Snyk needs auth and has usage limits.&lt;br&gt;
OSV just works — POST a package name, get vulnerabilities back. For a CLI tool meant to run locally and fast, that's exactly what I needed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;OSV_API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.osv.dev/v1/query&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;query_vulnerabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Vulnerability&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;package&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ecosystem&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PyPI&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;version&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;OSV_API_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;10.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;parse_vulnerabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vulns"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GHSA-cpwx-vrp4-4pq7"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Jinja2 vulnerable to sandbox breakout through attr filter selecting format method"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"details"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"An oversight in how the Jinja sandboxed environment interacts with the `|attr` filter allows an attacker that controls the content of a template to execute arbitrary Python code.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;To exploit the vulnerability, an attacker needs to control the content of a template. Whether that is the case depends on the type of application using Jinja. This vulnerability impacts users of applications which execute untrusted templates.&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s2"&gt;Jinja's sandbox does catch calls to `str.format` and ensures they don't escape the sandbox. However, it's possible to use the `|attr` filter to get a reference to a string's plain format method, bypassing the sandbox. After the fix, the `|attr` filter no longer bypasses the environment's attribute lookup."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"aliases"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"CVE-2025-27516"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2026-02-04T04:14:58.595738Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"published"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-03-05T20:40:14Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"affected"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"package"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"jinja2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"ecosystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PyPI"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"purl"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pkg:pypi/jinja2"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
                    &lt;/span&gt;&lt;span class="nl"&gt;"ranges"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ECOSYSTEM"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"events"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"introduced"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"fixed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3.1.6"&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;em&gt;The full response has more fields (references, versions, etc.), but these are the relevant ones.&lt;/em&gt;&lt;/p&gt;


&lt;h3&gt;
  
  
  Finding Where You Actually Use the Package
&lt;/h3&gt;

&lt;p&gt;It's easy to forget where you actually use a dependency — especially in large projects with dozens of packages. And not all usage is equal: importing requests in your core API is very different from importing it in a one-off migration script.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scan_file_for_package&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;CodeUsage&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;pattern_import&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^import\s+&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;(\s|,|$|\.)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;pattern_from&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;rf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;^from\s+&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;(\s|\.)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;encoding&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;utf-8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ignore&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern_import&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;import_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;import&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pattern_from&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;import_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;CodeUsage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_path&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                &lt;span class="n"&gt;line_number&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;line_num&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;line_content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;import_type&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;import_type&lt;/span&gt;
            &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The AI Part: Asking A Model If It Matters
&lt;/h3&gt;

&lt;p&gt;Now, here's where it gets interesting. I send the LLM:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The CVE description&lt;/li&gt;
&lt;li&gt;The import statements from your code&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And ask: "Given how this code uses the package, does this vulnerability apply?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; Only import lines are sent to the LLM — not your actual business logic. The model sees &lt;code&gt;from requests import Session&lt;/code&gt;, not the body of your functions. Your code stays local.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_SYSTEM_PROMPT&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are a security analyst reviewing Python dependency vulnerabilities.
You will be given a CVE description and import-level code evidence.
Your job is to determine if the vulnerability realistically affects this codebase.

Risk level definitions:
- HIGH: The imports directly expose the vulnerable code path
- MEDIUM: Package is imported but no clear evidence the vulnerable feature is used
- LOW: Package only imported in tests or dev tooling
- NONE: CVE conditions are not plausible in this codebase&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;I use Pydantic to force the model into a consistent format, no free-form text that I'd have to parse.&lt;br&gt;
It either gives me a valid ImpactAnalysis or it fails. No "well, it depends..." answers.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ImpactAnalysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Literal&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HIGH&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDIUM&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOW&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;NONE&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;explanation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;recommendation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Making It Smarter Over Time (RAG)
&lt;/h3&gt;

&lt;p&gt;Here's the thing — the LLM doesn't know your codebase. It analyzes each CVE in isolation. But what if it could remember past analyses?&lt;/p&gt;

&lt;p&gt;That's where ChromaDB comes in. Every time the tool analyzes a vulnerability, it stores the result. Next time a similar CVE shows up (same package, similar vulnerability type), it retrieves past analyses as context.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;store_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vulnerability_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ImpactAnalysis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;code_context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;vulnerability_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;explanation&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;metadatas&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;risk&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;risk_level&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;package&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;package_name&lt;/span&gt;&lt;span class="p"&gt;}],&lt;/span&gt;
        &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;vulnerability_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_similar_analyses&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vulnerability_description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;query_texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;vulnerability_description&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;limit&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;The result? The tool gets better the more you use it. If you analyzed a &lt;code&gt;requests&lt;/code&gt; CVE last month and a new one appears, the model sees how you handled it before.&lt;/p&gt;


&lt;h2&gt;
  
  
  What the Output Looks Like
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5e6e3ghsppwylzhm2tl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm5e6e3ghsppwylzhm2tl.png" alt="TERMINAL OUTPUT"&gt;&lt;/a&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;Honestly? It reinforced an ancient rule: &lt;em&gt;context is everything&lt;/em&gt;&lt;br&gt;
A CVE that sounds critical in the abstract becomes irrelevant when you realize you only import that package in a test helper. OSV turned out to be the perfect data source — free, fast, no API keys, solid Python coverage. And while the LLM does a surprisingly good job at triage, I still treat its output as a suggestion, not a verdict. The RAG layer (ChromaDB) was an afterthought, but it's become one of the most useful parts: the tool genuinely gets better as it sees more of your codebase.&lt;/p&gt;


&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;I have a few ideas in mind:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More languages — JavaScript and Go are probably the next ones&lt;/li&gt;
&lt;li&gt;CI/CD integration — run it in your pipeline, make it fail the build if there is any 'HIGH' impact&lt;/li&gt;
&lt;li&gt;Offline mode — run it against a local LLM&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;Give it a shot and let me know what breaks. Seriously — I'd appreciate any feedback.&lt;br&gt;


&lt;/p&gt;
&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/juanmaalt" rel="noopener noreferrer"&gt;
        juanmaalt
      &lt;/a&gt; / &lt;a href="https://github.com/juanmaalt/dep_shield" rel="noopener noreferrer"&gt;
        dep_shield
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Python dependency vulnerability scanner with AI-powered impact analysis
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;
&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;dep_shield&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;A Python dependency vulnerability scanner that detects known CVEs in your project's dependencies using the OSV database and provides AI-powered impact analysis to assess how each vulnerability affects your specific codebase.&lt;/p&gt;




&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;How does it work?&lt;/h2&gt;
&lt;/div&gt;

&lt;div class="snippet-clipboard-content notranslate position-relative overflow-auto"&gt;&lt;pre class="notranslate"&gt;&lt;code&gt;requirements.txt  ↘
                   dependency list → OSV API → vulnerabilities
pyproject.toml    ↗                                   ↓
                                        Python files → import scanner → code usage
                                                      ↓
                                          (--analyze only, skipped if not imported)
                                                      ↓
                                          LLM impact analysis
                                                      ↓
                                    ChromaDB (RAG vector store)
&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;
&lt;ol&gt;
&lt;li&gt;Discover and parse &lt;code&gt;requirements.txt&lt;/code&gt; and/or &lt;code&gt;pyproject.toml&lt;/code&gt; (PEP 621 and Poetry formats) to extract package names and versions; duplicates across files are deduplicated&lt;/li&gt;
&lt;li&gt;Query the OSV API for each dependency to find known CVEs&lt;/li&gt;
&lt;li&gt;For each vulnerable dependency, scan all &lt;code&gt;.py&lt;/code&gt; files to detect import statements and usage locations&lt;/li&gt;
&lt;li&gt;If &lt;code&gt;--analyze&lt;/code&gt; is set and the package is imported, retrieve similar past CVEs from ChromaDB and call GPT-4o-mini with CVE details + import context; if the package is not imported…&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/juanmaalt/dep_shield" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;





</description>
      <category>python</category>
      <category>security</category>
      <category>ai</category>
      <category>cli</category>
    </item>
    <item>
      <title>Who tests the tests?</title>
      <dc:creator>Lucas Gabriel Sánchez</dc:creator>
      <pubDate>Fri, 20 Feb 2026 17:05:58 +0000</pubDate>
      <link>https://dev.to/cloudx/who-tests-the-tests-nhc</link>
      <guid>https://dev.to/cloudx/who-tests-the-tests-nhc</guid>
      <description>&lt;p&gt;This post is based on a &lt;a href="https://www.gophercon.com/" rel="noopener noreferrer"&gt;Gophercon&lt;/a&gt; talk by &lt;a href="https://github.com/danicat" rel="noopener noreferrer"&gt;Daniela Petruzalek&lt;/a&gt;: &lt;a href="https://www.youtube.com/watch?v=av2f2okDOyk" rel="noopener noreferrer"&gt;Who tests the tests?&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  A little bit of history
&lt;/h2&gt;

&lt;p&gt;In the beginning, we checked our code manually, running the application and trying different inputs: we called that manual testing. Then we discovered that we could write code to test application code to check if it's correct: we called that automatic testing (unit, integration, functional, etc.)&lt;/p&gt;

&lt;p&gt;Now we are in the AI era where we write fewer tests and even less code, we need a way to swiftly check that the tests are correct and that they are testing the cases we expect in our application.&lt;/p&gt;

&lt;h2&gt;
  
  
  How can we know if code written by AI is correct?
&lt;/h2&gt;

&lt;p&gt;You can use automatic tests in the same way we used to check code written by humans.&lt;/p&gt;

&lt;p&gt;One option would be to let the AI write the application code and you write the tests, &lt;a href="https://dev.to/marabesi/ai-and-tdd-a-match-that-can-work-1d59"&gt;you can even use TDD&lt;/a&gt; where you, the human, write the tests and let the AI write the implementation after.&lt;/p&gt;

&lt;p&gt;Another option is to let the AI write the application code and the tests, but then how can you be sure that those tests are testing the things you want or need? Reading and understanding the tests would be the best, but can we do that automatically?&lt;/p&gt;

&lt;h2&gt;
  
  
  Enter: mutation tests
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://en.wikipedia.org/wiki/Mutation_testing" rel="noopener noreferrer"&gt;Mutation testing&lt;/a&gt; is a way to check that the tests you have are testing the code the way you want by making small changes to the application code and checking that the tests fail as expected.&lt;/p&gt;

&lt;p&gt;It's based on a concept known as a mutant: a version of your application with a small change.&lt;/p&gt;

&lt;h3&gt;
  
  
  How does it work?
&lt;/h3&gt;

&lt;p&gt;The mutation testing cycle is this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create a mutant: change the application code by applying just one change&lt;/li&gt;
&lt;li&gt;Run your test suite&lt;/li&gt;
&lt;li&gt;Check how many mutants were killed&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After running your tests, the ones that failed are said to have killed the mutant, those tests are testing something related to the code you changed and are, from the perspective of that change, good tests.&lt;/p&gt;

&lt;p&gt;After many changes, if a test never failed it means that test didn't kill any mutant and is a weak test. You should probably delete that test or write a better one.&lt;/p&gt;

&lt;p&gt;If a mutant is never killed, then that code is not being tested (no coverage) or is being tested poorly, you have an opportunity to write a test to check that piece of code if needed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Do I have to make these changes manually?
&lt;/h3&gt;

&lt;p&gt;You can do it manually, but there are some tools to aid you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;C#, TypeScript and Scala: &lt;a href="https://stryker-mutator.io/" rel="noopener noreferrer"&gt;Stryker&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Go: &lt;a href="https://github.com/go-gremlins/gremlins" rel="noopener noreferrer"&gt;go-gremlins&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Java: &lt;a href="https://pitest.org/" rel="noopener noreferrer"&gt;pitest&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Python: &lt;a href="https://mutatest.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;mutatest&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Rust: &lt;a href="https://mutants.rs/welcome.html" rel="noopener noreferrer"&gt;mutants.rs&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These tools provide a way to make those changes automatically and some of them run the tests for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why do we need mutation testing?
&lt;/h2&gt;

&lt;p&gt;Let's see a very simple example in Python:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;can&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t divide by 0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A simple test suite we can have is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;unittest&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;divide&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;divide&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;TestDivide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;unittest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TestCase&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_divide_error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assertRaises&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_divide_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assertEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;unittest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run the tests and everything is fine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;python tests.py
..
&lt;span class="nt"&gt;----------------------------------------------------------------------&lt;/span&gt;
Ran 2 tests &lt;span class="k"&gt;in &lt;/span&gt;0.000s

OK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here we are testing both flows, when an error is raised and when we complete a successful operation, coverage is at 100% but those tests are not great, and here's why: let's change the implementation of divide from a/b to &lt;code&gt;a*b&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;can&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t divide by 0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running the tests should fail, right? They don't, because &lt;code&gt;1*1&lt;/code&gt; is still 1:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;python tests.py
..
&lt;span class="nt"&gt;----------------------------------------------------------------------&lt;/span&gt;
Ran 2 tests &lt;span class="k"&gt;in &lt;/span&gt;0.000s

OK
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a problematic test:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;test_divide_success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assertEqual&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;divide&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Even when you have 100% test coverage, that doesn't mean you have 100% test case coverage; some cases may be missing or not being tested correctly.&lt;/p&gt;

&lt;p&gt;What we did here was a mutation: from &lt;code&gt;a/b&lt;/code&gt; to &lt;code&gt;a*b&lt;/code&gt;, that mutation gives us information about our tests that we didn't have before.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to read the results of mutation testing?
&lt;/h2&gt;

&lt;p&gt;When you run your test suite against mutated code, each test can do one of two things: &lt;em&gt;kill&lt;/em&gt; the mutant (the test fails, so it detected the change) or &lt;em&gt;survive&lt;/em&gt; (the test still passes, so it missed the change). After many mutation, you can summarize how often each test killed mutants, for example:&lt;/p&gt;

&lt;p&gt;After 200 Mutations:&lt;br&gt;
TestA: 140 Kills, 60 Survived&lt;br&gt;
TestB: 200 Kills&lt;br&gt;
TestC: 30 Kills, 170 Survived&lt;/p&gt;

&lt;p&gt;How to interpret this:&lt;/p&gt;

&lt;p&gt;TestA: is good test because on most mutations it was killed.&lt;br&gt;
TestB: is your strongest test, it detects every mutation you ran.&lt;br&gt;
TestC: is the weak one, it usually doesn't detect mutations.&lt;/p&gt;

&lt;h2&gt;
  
  
  When not to use mutation testing?
&lt;/h2&gt;

&lt;p&gt;In some projects or languages, running mutation tests can take hours. For each mutant, the tool has to compile (if needed) and run the full test suite. If your test suite takes 10 minutes, each mutation will take at least that long, so keep this in mind.&lt;/p&gt;

&lt;p&gt;Use this approach on small projects or ones where the "edit, compile, test" cycle is small.&lt;/p&gt;

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;So who tests the tests? In practice, mutation testing does that: by checking that your tests react when the code is deliberately broken.&lt;/p&gt;

</description>
      <category>testing</category>
      <category>go</category>
      <category>programming</category>
    </item>
    <item>
      <title>Bicycles Are All Your AI Agents Need</title>
      <dc:creator>Federico Pascarella</dc:creator>
      <pubDate>Thu, 13 Nov 2025 12:47:16 +0000</pubDate>
      <link>https://dev.to/cloudx/bicycles-are-all-your-ai-agents-need-33cc</link>
      <guid>https://dev.to/cloudx/bicycles-are-all-your-ai-agents-need-33cc</guid>
      <description>&lt;h2&gt;
  
  
  From Condors to Code
&lt;/h2&gt;

&lt;p&gt;Somewhere between a condor and a keyboard lies human genius. Steve Jobs once told a story about how humans are terrible movers compared to animals. The condor beats us easily in the race of energy efficiency, but put a person on a bicycle and they fly.&lt;br&gt;
The bicycle, Jobs said, is "a tool that amplifies our efficiency." Computers, he added, are bicycles for the mind.&lt;br&gt;
That thought never left me. And now, with AI agents evolving super-fast, I can't help seeing the same pattern repeat.&lt;br&gt;
My humble view is that tools are still the key. Only this time, the cyclists are our AI agents with its brain (the LLM), and the bicycles are the functions we build for them.&lt;br&gt;
With the right tools, an agent moves with purpose. With clumsy tools, it stalls.&lt;/p&gt;
&lt;h2&gt;
  
  
  The Engineering of Great Tools
&lt;/h2&gt;

&lt;p&gt;Great agents sit on top of small, sharp Python functions. They are plain, predictable, and fast.&lt;/p&gt;
&lt;h3&gt;
  
  
  1. Single Responsibility
&lt;/h3&gt;

&lt;p&gt;Specialize each function. Do one job well, then compose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Swiss-army function
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;send_welcome&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;send_welcome&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Welcome!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;User &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; created&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Focused, composable tools
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;save_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;send_welcome_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;send_email&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Welcome!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Clear interfaces
&lt;/h3&gt;

&lt;p&gt;Name things so intent is obvious. Keep arguments explicit. Return data instead of printing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Vague names and side effects
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;discCalc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Discount applied: $&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Straight names and returns
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;calculate_discount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;percentage&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;price&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;percentage&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Structured outputs
&lt;/h3&gt;

&lt;p&gt;Agents prefer structure. Return dicts or JSON, not prose.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Unstructured string
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;temp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fetch_temperature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;It&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s about &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; degrees in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, partly cloudy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Good: MCP tool with schema
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pydantic&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Field&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Server&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.types&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tool&lt;/span&gt;

&lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;weather&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;WeatherData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseModel&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;City name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Temperature in Celsius&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Weather condition&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;humidity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Field&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Humidity percentage&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nd"&gt;@server.call_tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;WeatherData&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;temp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch_temperature&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;condition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch_condition&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;WeatherData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;temp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;condition&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;humidity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;65&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. Efficiency
&lt;/h3&gt;

&lt;p&gt;Use built-ins, cache where it helps, and profile before optimizing.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Manual loops
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;filter_active_users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Built-ins plus caching
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;functools&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;lru_cache&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="nd"&gt;@lru_cache&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;maxsize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;128&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;filter_active_users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;users_tuple&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users_tuple&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Robustness
&lt;/h3&gt;

&lt;p&gt;Validate inputs and fail loudly with helpful errors.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: No validation
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Validation and clear errors
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;read_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Path must be a non-empty string&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;FileNotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;File not found: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. The micro-tooling mindset
&lt;/h3&gt;

&lt;p&gt;Break big jobs into small tools you can test and swap. MCP benefits from chains of simple, named steps.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Monolith
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_user_data&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;validated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;enriched&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enrich&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;validated&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;enriched&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Good: Composable steps
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;fetch_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;validate_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;enrich_user&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enrich&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  7. Trade-offs
&lt;/h3&gt;

&lt;p&gt;Hundreds of tiny tools can create orchestration overhead. Clear names, steady input and output shapes, and basic docs keep things manageable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Show, Don't Tell: Two Decision Flows
&lt;/h2&gt;

&lt;p&gt;A concrete example makes the difference clear. Here is the same task, done with weak tools and with strong tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Task:&lt;/strong&gt; Extract newly signed customers from a CSV in cloud storage, enrich each with firmographic data, and email an account summary.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent with poorly designed tools
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Calls a generic &lt;code&gt;process_file()&lt;/code&gt; that auto-detects type and tries to parse everything.&lt;/li&gt;
&lt;li&gt;Uses one do everything &lt;code&gt;enrich_user()&lt;/code&gt; that accepts many flags, then times out on third party rate limits.&lt;/li&gt;
&lt;li&gt;Prints logs to stdout, returns a mixed string summary, and the agent fails to decide what to send.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Decision flow with weak tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Input: blob path&lt;/li&gt;
&lt;li&gt;Branch: auto-detect format, guess schema&lt;/li&gt;
&lt;li&gt;Loop: enrich with side effects&lt;/li&gt;
&lt;li&gt;Output: unstructured string&lt;/li&gt;
&lt;li&gt;Failure mode: retries loop, hallucinates missing fields, no clear errors&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Agent with well designed tools
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;code&gt;load_csv(path, schema)&lt;/code&gt; returns a typed dataframe.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;batch_enrich(users, provider, rate_limit)&lt;/code&gt; yields structured rows with retry metadata.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;render_account_summary(users)&lt;/code&gt; returns JSON for &lt;code&gt;send_email(to, subject, body_html)&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Decision flow with strong tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Input: explicit path and schema&lt;/li&gt;
&lt;li&gt;Transform: strict parser&lt;/li&gt;
&lt;li&gt;Enrich: idempotent, rate limited, returns status per row&lt;/li&gt;
&lt;li&gt;Render: deterministic template&lt;/li&gt;
&lt;li&gt;Output: email send result with IDs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Result: same goal, three clean steps, easy to test and to explain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;I believe that innovation often hides in simplicity. Building efficient AI agents isn't about giving them infinite intelligence; it's about giving them great tools. Write them clean, focused, and well-documented; think in micro-tooling: small parts, big impact.&lt;br&gt;
So, next time you're debugging that stubborn Python function, just remember: you're not fixing a bug. You're tuning a bicycle for the mind of an AI.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>productivity</category>
      <category>technology</category>
    </item>
    <item>
      <title>WhatsApp + MCP: automatic audio transcription</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Mon, 29 Sep 2025 19:39:53 +0000</pubDate>
      <link>https://dev.to/cloudx/whatsapp-mcp-automatic-audio-transcription-jbh</link>
      <guid>https://dev.to/cloudx/whatsapp-mcp-automatic-audio-transcription-jbh</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;MCP (Model Context Protocol) can look complicated until you ship something real with it. Let's use it on something practical: expose your WhatsApp voice notes with your own MCP server and turn them into transcripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is MCP?
&lt;/h2&gt;

&lt;p&gt;MCP is a connection standard that connects AI agents with external systems.&lt;/p&gt;

&lt;p&gt;It has a server and a client, and they have two different ways to talk to each other:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stdio (stdin/stdout): the standard Unix mechanism for a process to receive or send data to the environment or another process.&lt;/li&gt;
&lt;li&gt;Server-Sent Events (SSE): an HTTP mechanism where the server keeps the connection open and streams events to the client (one-way).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fln8mevhsqwmsveb51lnn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fln8mevhsqwmsveb51lnn.png" alt="STDIN/STDOUT vs SSE in MCP" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Quick comparison of stdio and SSE transports in MCP.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP architecture
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Host: Claude Desktop / Cursor / any AI agent. It coordinates the LLM, spins up MCP clients, and shows results.&lt;/li&gt;
&lt;li&gt;MCP Client: an implementation embedded in the host that connects to your server. It speaks the protocol, opens/manages the connection, and sends/receives requests.&lt;/li&gt;
&lt;li&gt;MCP Server: your program that exposes tools. It runs actions and returns data/events to the client.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;An MCP server can expose different capabilities, but in this project we stick to &lt;strong&gt;tools&lt;/strong&gt; (actions like transcribing audio). MCP also supports resources or prompts; we skip them here to keep the flow simple.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdtl4gk4p7n7kn6ojish.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdtl4gk4p7n7kn6ojish.png" alt="MCP architecture: Host -&amp;gt; MCP Client -&amp;gt; MCP Server" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Diagram of the Host → MCP Client → MCP Server flow.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the WhatsApp MCP
&lt;/h2&gt;

&lt;p&gt;WhatsApp Desktop on macOS stores everything locally: an SQLite database with chats and folders containing the media files.&lt;/p&gt;

&lt;p&gt;Our MCP server will:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Read the WhatsApp database&lt;/li&gt;
&lt;li&gt;Find audio files per contact&lt;/li&gt;
&lt;li&gt;Transcribe them with Whisper&lt;/li&gt;
&lt;li&gt;Send the text back to the Client (Cursor in this case)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The working code lives in the repository: &lt;a href="https://github.com/GBurgardt/mcp-whatsapp-whisper" rel="noopener noreferrer"&gt;mcp-whatsapp-whisper&lt;/a&gt;. Let's walk through the key pieces.&lt;/p&gt;

&lt;h3&gt;
  
  
  The STDIN/STDOUT connection
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;StdioServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/stdio.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioServerTransport&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With that the server listens to every client request on STDIN and replies through STDOUT.&lt;/p&gt;

&lt;p&gt;We pick stdio because this MCP server runs locally. It's the simplest and most stable transport on desktop/CLI: no open ports, no HTTP dependency, avoids CORS/firewalls, and hosts (Claude Desktop/Cursor) support it natively. SSE makes sense when the server lives remotely behind HTTP.&lt;/p&gt;

&lt;h3&gt;
  
  
  Exposing capabilities
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;whatsapp-audio-mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1.0.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="c1"&gt;// We will expose actions&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Designing the tools
&lt;/h3&gt;

&lt;p&gt;The server lives on three tools each with a specific role:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;getRecentAudio(contactName, count?)&lt;/code&gt;: pulls the latest audio paths for a contact.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;searchAudios(query, date?)&lt;/code&gt;: narrows the list by name or date when the history is large. We get filtering without touching SQLite directly.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;transcribeAudio(audioPath)&lt;/code&gt;: turns a path into text with Whisper. It finishes the loop by delivering the result we care about.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal was a minimal set: find, refine, transcribe. Each tool lines up with one of those stages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;transcribeAudio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Transcribe an audio file using OpenAI Whisper (SDK)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nl"&gt;audioPath&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Path to the audio file&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="nx"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audioPath&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schema follows JSON Schema. With it, Cursor knows which parameters to send.&lt;/p&gt;

&lt;h2&gt;
  
  
  Accessing WhatsApp
&lt;/h2&gt;

&lt;p&gt;WhatsApp Desktop keeps everything under predictable paths:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dbPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;homeDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Library/Group Containers/group.net.whatsapp.WhatsApp.shared/ChatStorage.sqlite&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mediaPath&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;homeDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Library/Group Containers/group.net.whatsapp.WhatsApp.shared/Message/Media&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database is SQLite:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
  SELECT DISTINCT 
    ZCONTACTJID as jid,
    ZPARTNERNAME as name,
    ZLASTMESSAGEDATE as lastMessageDate
  FROM ZWACHATSESSION
  WHERE ZPARTNERNAME IS NOT NULL
  AND ZCONTACTJID NOT LIKE '%@g.us'  -- Exclude groups
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Audio files are organized per contact. We scan recursively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;audioExtensions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.opus&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.m4a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.mp3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.aac&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;.wav&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scanDirectory&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;readdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;withFileTypes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;audioExtensions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ext&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="c1"&gt;// Found an audio file&lt;/span&gt;
      &lt;span class="nx"&gt;audioFiles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fullPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;modifiedDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The transcription: FFmpeg + Whisper
&lt;/h2&gt;

&lt;p&gt;WhatsApp ships audio in Opus, but OpenAI Whisper prefers MP3. We use FFmpeg:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ffmpeg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;spawn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ffmpeg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-i&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;inputPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// WhatsApp Opus audio&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-acodec&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;mp3&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;-b:a&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;128k&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;outputPath&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Temporary MP3&lt;/span&gt;
&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then we transcribe with OpenAI Whisper (SDK):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;OpenAI&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;openai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;node:fs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPENAI_API_KEY&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transcription&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;audio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;transcriptions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;createReadStream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;outputPath&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c1"&gt;// Temporary MP3&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;whisper-1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transcriptionText&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Configuring Cursor (the client)
&lt;/h2&gt;

&lt;p&gt;In the Cursor config (&lt;code&gt;~/.cursor/mcp.json&lt;/code&gt;) we add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"whatsapp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/mcp-whatsapp-whisper/dist/server.js"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"OPENAI_API_KEY"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"YOUR_OPENAI_KEY"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cursor can now invoke our server whenever it needs to.&lt;/p&gt;

&lt;h2&gt;
  
  
  MCP in action
&lt;/h2&gt;

&lt;p&gt;The user asks Cursor:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Send me the transcript of Elian's last audio."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Cursor automatically:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Calls &lt;code&gt;getRecentAudio(contactName: "elian")&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Receives the audio file path&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;transcribeAudio(audioPath: "/path/to/audio.opus")&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Receives the transcription&lt;/li&gt;
&lt;li&gt;Summarizes or shows the full text&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The transcription flows through the OpenAI API; the temporary MP3 is sent to get the text back. Cursor orchestrates; your server prepares the file and makes the call.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgkiuwpuuc8y7qy1o2hy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwgkiuwpuuc8y7qy1o2hy.png" alt="Cursor + MCP WhatsApp: transcription example" width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Cursor showing the transcription returned by the WhatsApp MCP server.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations: macOS only
&lt;/h2&gt;

&lt;p&gt;This server is macOS only. The WhatsApp paths are specific to Mac.&lt;/p&gt;

&lt;p&gt;It depends on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;WhatsApp Desktop installed&lt;/li&gt;
&lt;li&gt;FFmpeg (&lt;code&gt;brew install ffmpeg&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;OpenAI SDK (&lt;code&gt;npm i openai&lt;/code&gt;) with &lt;code&gt;OPENAI_API_KEY&lt;/code&gt; configured&lt;/li&gt;
&lt;li&gt;Internet connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We also skip Prompts and Resource Templates.&lt;/p&gt;

&lt;p&gt;Security depends on the host. Cursor can ask for approval before it runs tools.&lt;/p&gt;

&lt;h2&gt;
  
  
  Keep it running with PM2
&lt;/h2&gt;

&lt;p&gt;Build the project once (&lt;code&gt;npm run build&lt;/code&gt;) and keep the server alive with &lt;code&gt;pm2 start ecosystem.config.cjs&lt;/code&gt;. The provided config watches the compiled &lt;code&gt;dist/server.js&lt;/code&gt; and restarts it if it crashes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Your AI agent can now reach your data, use your tools, work in your context.&lt;/p&gt;

&lt;p&gt;The WhatsApp server is just one idea. Once you realize any program that speaks STDIN/STDOUT can be an MCP server, the possibilities get wild.&lt;/p&gt;

&lt;p&gt;Next time you think "I wish Cursor could access...", remember: it probably can. You just need to build the bridge.&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>whisper</category>
      <category>automation</category>
    </item>
    <item>
      <title>How AI Reflects Your Thinking</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Tue, 26 Aug 2025 13:36:38 +0000</pubDate>
      <link>https://dev.to/cloudx/how-ai-reflects-your-thinking-5714</link>
      <guid>https://dev.to/cloudx/how-ai-reflects-your-thinking-5714</guid>
      <description>&lt;p&gt;When we code using AI we ask ourselves: "what's the best prompt?" or "what magic prompt should I use?".&lt;/p&gt;

&lt;p&gt;We'd be better off asking: "what kind of interaction is this?". Trying to understand the nature of the interaction between us and the model.&lt;/p&gt;

&lt;p&gt;Maybe the problem isn't the technology, but us.&lt;/p&gt;

&lt;h2&gt;
  
  
  An Analogy
&lt;/h2&gt;

&lt;p&gt;Imagine you hire a remote programmer. Brilliant, but with some quirks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Never worked on your project before (0 context)&lt;/li&gt;
&lt;li&gt;Extremely literal. If you don't explicitly tell them, they never assume anything.&lt;/li&gt;
&lt;li&gt;Doesn't infer context&lt;/li&gt;
&lt;li&gt;Completely loses their memory every day, returning to their initial state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How would you communicate with them?&lt;/p&gt;

&lt;p&gt;You'd probably:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explain all the necessary context, very detailed&lt;/li&gt;
&lt;li&gt;Be very specific with requirements&lt;/li&gt;
&lt;li&gt;Not assume they'll "figure out" anything. You explain everything&lt;/li&gt;
&lt;li&gt;Expect some iterations before the final result&lt;/li&gt;
&lt;li&gt;Maybe save context files to resend them every day&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the best way to interact with an AI model.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdda3rxzav04bj1jf7y4h.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdda3rxzav04bj1jf7y4h.png" alt="Communicating with AI requires the same clarity as working with a remote programmer" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  AI As a Mirror
&lt;/h2&gt;

&lt;p&gt;The model isn't just a task executor. It's also a mirror of your clarity when communicating a problem.&lt;/p&gt;

&lt;p&gt;If you give it vague instructions, you get vague results simply because it faithfully reflects how vague your thinking was.&lt;/p&gt;

&lt;p&gt;Most of the time when the model "doesn't understand" the problem isn't the model. It's that we ourselves weren't clear about what we wanted.&lt;/p&gt;

&lt;h2&gt;
  
  
  Clarity As a Skill
&lt;/h2&gt;

&lt;p&gt;The real skill isn't "writing good prompts". It's thinking clearly about problems and communicating that clarity. This is a fundamental skill for any programmer.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3z8fou5qu5ud33t79un.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fm3z8fou5qu5ud33t79un.png" alt="Vague thinking leads to vague results, while clear communication produces precise outcomes" width="600" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Example
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;What we usually do:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Optimize this function
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why it fails:&lt;/strong&gt; Optimize in what sense? Speed? Memory? Readability? There's no success criteria.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we should do:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;The processOrders() function in orders.js takes 5 seconds with 1000 orders.
I need it to take less than 1 second.
Orders come from the database already sorted by date.
You can assume there are no duplicate orders.
Logs: &amp;lt;&amp;lt;detailed logs&amp;gt;&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is much clearer and less abstract. It describes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The problem (5 seconds is too much)&lt;/li&gt;
&lt;li&gt;The measurable goal (less than 1 second)&lt;/li&gt;
&lt;li&gt;Constraints (already sorted)&lt;/li&gt;
&lt;li&gt;Assumptions (no duplicates)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Breaking Down Problems
&lt;/h2&gt;

&lt;p&gt;One of the skills that improves working with AI is breaking problems down into smaller pieces. AI won't save you the work of thinking. The clarification process itself is valuable work in programming.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Implement a complete authentication system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You learn to think:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Step 1: Define the User model with minimum required fields: &amp;lt;fields&amp;gt;
Step 2: Create the registration endpoint with basic validation (validation type, etc)
[etc...]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Limitations
&lt;/h2&gt;

&lt;p&gt;AI can only handle 3-4 files well at a time. It's a limitation but with its bright side:&lt;/p&gt;

&lt;p&gt;It forces you to keep responsibilities separated and create clear interfaces. You need to avoid coupling and think in small modules.&lt;/p&gt;

&lt;p&gt;It incentivizes you to follow good architecture practices.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Importance of Context
&lt;/h2&gt;

&lt;p&gt;AI needs all the context possible, don't skimp.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CONTEXT: Users report the checkout page hangs
SYMPTOM: The "Pay" button stays in "Processing..." state indefinitely
FILE: checkout.js, handlePayment() function
SUSPICION: Probably missing a catch to handle API errors
TASK: Add robust error handling and visual feedback to the user
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The Value of Programming with AI
&lt;/h2&gt;

&lt;p&gt;Programming with AI trains you in thinking clearly and communicating precisely. It forces you to break problems into manageable pieces and be explicit with your requirements while constantly verifying results.&lt;/p&gt;

&lt;p&gt;These seem like fundamental skills for any dev regardless of language.&lt;/p&gt;

&lt;h2&gt;
  
  
  Final Reflection
&lt;/h2&gt;

&lt;p&gt;AI doesn't save you from thinking, or at least you shouldn't use it that way. It's the opposite, every prompt you write is an opportunity to clarify your understanding. Every response you receive is feedback on your clarity. Every iteration is a chance to improve.&lt;/p&gt;

&lt;p&gt;Next time you use AI and don't get the expected result, before blaming the model, ask yourself:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Did I really have clarity on what I wanted?&lt;/li&gt;
&lt;li&gt;Did I break down the problem into manageable parts?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These models are honest, literal collaborators. They give you exactly what you ask for, but they demand clarity. Learning to be clear is learning to think well. AI used properly makes you a better programmer.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>programming</category>
      <category>productivity</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Automate Any Repetitive Task with MCP</title>
      <dc:creator>German Burgardt</dc:creator>
      <pubDate>Mon, 28 Jul 2025 17:30:01 +0000</pubDate>
      <link>https://dev.to/cloudx/automate-any-repetitive-task-with-mcp-o9n</link>
      <guid>https://dev.to/cloudx/automate-any-repetitive-task-with-mcp-o9n</guid>
      <description>&lt;h2&gt;
  
  
  The Problem: Repetitive Detailed Prompting
&lt;/h2&gt;

&lt;p&gt;Every time I start a new task in Claude Code / Cursor, I type a detailed prompt to guide the AI through an internal monologue before proceeding. For example:&lt;/p&gt;

&lt;p&gt;"You will generate an internal monologue of 200 numbered lines where two thinkers debate the approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Pragmatic focuses on functionality and efficiency&lt;/li&gt;
&lt;li&gt;Creative on innovation and elegance&lt;/li&gt;
&lt;li&gt;Follow these rules: exactly 200 lines, each starting with [Pragmatic] or [Creative]&lt;/li&gt;
&lt;li&gt;Be specific about code without abstractions&lt;/li&gt;
&lt;li&gt;Reflect and question without solutions&lt;/li&gt;
&lt;li&gt;Mention files/functions/variables&lt;/li&gt;
&lt;li&gt;Consider edge cases/performance/maintainability/user experience&lt;/li&gt;
&lt;li&gt;Debate simplicity vs functionality&lt;/li&gt;
&lt;li&gt;Question decisions, no repeats, end without conclusion&lt;/li&gt;
&lt;li&gt;Then address the task: [actual task here]."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typing this repeatedly 20+ times a day wastes time and disrupts focus.&lt;/p&gt;

&lt;p&gt;As someone researching practical AI applications, we can fix that.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Before: 200+ word prompt every time&lt;/span&gt;
&lt;span class="c1"&gt;// After: "internal monologue 200 lines - implement auth system"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Enter MCPs: The Missing Link
&lt;/h2&gt;

&lt;p&gt;Model Context Protocols (MCPs) allow extending AI agents with custom tools. While common examples include fetching data, web browsing, or integrating with Slack, I used it in a novel way to automate my repetitive prompt.&lt;/p&gt;

&lt;h2&gt;
  
  
  From Repetition to Automation
&lt;/h2&gt;

&lt;p&gt;I built an MCP server in my Remix app (essentially the same as plain Node.js) that generates these monologues on demand. Now, Claude detects the trigger and handles it automatically.&lt;/p&gt;

&lt;p&gt;Here's a glimpse of what it generates:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. [Pragmatic] We need to implement auth - start with basic JWT in middleware.js
2. [Creative] But what about OAuth? Users expect social login nowadays...
3. [Pragmatic] OAuth adds complexity - first nail down password flow, then extend
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmi1ao6dc7pkccc5aqkj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjmi1ao6dc7pkccc5aqkj.png" alt="Before and After MCP Automation Flowchart" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The difference:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Before:&lt;/strong&gt; Type the full detailed prompt each time, then describe the task.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;After:&lt;/strong&gt; Simply say "internal monologue 200 lines about X - [task]", and Claude generates the monologue via the tool, then proceeds.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Time saved:&lt;/strong&gt; ~2 minutes per task&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Characters typed:&lt;/strong&gt; 300+ → 40&lt;/p&gt;


&lt;h2&gt;
  
  
  Building Your Own Monologue MCP
&lt;/h2&gt;

&lt;p&gt;Here's how to implement it in a Node.js server (adaptable from my Remix example).&lt;/p&gt;
&lt;h3&gt;
  
  
  Step 1: Install Dependencies
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; @modelcontextprotocol/sdk zod @anthropic-ai/sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Step 2: Create the MCP Server Handler
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;app/lib/mcp-server.ts&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Server&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/index.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SSEServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/sse.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;Anthropic&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@anthropic-ai/sdk&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createMCPServer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;monologue-mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;1.0.0&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;capabilities&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Define the monologue tool&lt;/span&gt;
  &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tools/list&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;generate-monologue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Generate a reflective internal monologue in the style of Pragmatic vs Creative thinker&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;inputSchema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="na"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Number of lines in the monologue (default: 100)&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Current conversation context&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Description of the task to perform&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;task&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;}));&lt;/span&gt;

  &lt;span class="c1"&gt;// The actual tool implementation&lt;/span&gt;
  &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tools/call&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;generate-monologue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ArgsSchema&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;object&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
        &lt;span class="na"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;optional&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;task&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ArgsSchema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;arguments&lt;/span&gt;
      &lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`You are two thinkers having an internal dialogue about programming.
Pragmatic is focused on functionality and efficiency.
Creative is obsessive about innovation and elegance.

STRICT RULES:
1. Generate EXACTLY &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; numbered lines
2. Each line must start with [Pragmatic] or [Creative]
3. NO abstractions - be specific about the code
4. NO complete solutions - REFLECT and QUESTION
5. Mention specific files, functions, variables when relevant
6. Think about: edge cases, performance, maintainability, user experience
7. Debate simplicity vs functionality
8. Question every technical decision
9. NO repeated ideas - each line must add new value
10. End without a definitive conclusion - it's reflection, not decision`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userPrompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;
          &lt;span class="nx"&gt;context&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`Previous context:\n&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;context&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;\n\n`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;""&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;Current task: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;task&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;

Generate an internal monologue of EXACTLY &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; numbered lines where the two thinkers debate the best way to approach this task.`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;claude-opus-4-20250514&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;32000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;temperature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;system&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;systemPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userPrompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;monologue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;monologue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Error generating monologue: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
          &lt;span class="na"&gt;isError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Unknown tool: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 3: Create the API Route
&lt;/h3&gt;

&lt;p&gt;Create &lt;code&gt;app/routes/api.mcp.ts&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;The MCP server needs to be exposed as an HTTP endpoint. We use Bearer authentication to secure it. Only Claude (or other authorized clients) with the correct API key can access your server. This prevents random people from using your tools.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;LoaderFunctionArgs&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@remix-run/node&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;createMCPServer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;~/lib/mcp-server&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;SSEServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@modelcontextprotocol/sdk/server/sse.js&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// SSE (Server Sent Events) keeps an open connection between Claude and your server&lt;/span&gt;
&lt;span class="c1"&gt;// This allows Claude to call your tools in real time without polling&lt;/span&gt;

&lt;span class="c1"&gt;// Simple auth check&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;verifyAuth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Authorization&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expectedKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;MCP_API_KEY&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;your-secret-key&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="s2"&gt;`Bearer &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;expectedKey&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;loader&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;LoaderFunctionArgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;verifyAuth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;responseHeaders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Headers&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text/event-stream&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Cache-Control&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;no-cache&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;Connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;keep-alive&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Access-Control-Allow-Origin&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;*&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Access-Control-Allow-Headers&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Authorization, Content-Type&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;createMCPServer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SSEServerTransport&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;endpoint&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/api/mcp&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;requestHeaders&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromEntries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="na"&gt;responseHeaders&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromEntries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;responseHeaders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;stream&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ReadableStream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Keep connection alive (SSE connections timeout after 30 seconds of silence)&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;keepAlive&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;TextEncoder&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;: keepalive&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;30000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;abort&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;keepAlive&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;controller&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;responseHeaders&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Step 4: Configure Environment Variables
&lt;/h3&gt;

&lt;p&gt;Add to your &lt;code&gt;.env&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;your-anthropic-api-key
&lt;span class="nv"&gt;MCP_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;a-secret-key-for-your-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;ANTHROPIC_API_KEY&lt;/code&gt; lets your server call Claude's API to generate monologues. The &lt;code&gt;MCP_API_KEY&lt;/code&gt; is your own secret, it's what Claude will use to authenticate with your server.&lt;/p&gt;

&lt;h3&gt;
  
  
  Step 5: Deploy and Connect
&lt;/h3&gt;

&lt;p&gt;Deploy your changes (I use Vercel, but any platform works):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git add &lt;span class="nb"&gt;.&lt;/span&gt;
git commit &lt;span class="nt"&gt;-m&lt;/span&gt; &lt;span class="s2"&gt;"Add MCP server for internal monologues"&lt;/span&gt;
git push
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then connect from Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;claude mcp add &lt;span class="nt"&gt;--transport&lt;/span&gt; sse monologue https://yourdomain.com/api/mcp &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer your-secret-key"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;sse&lt;/code&gt; transport tells Claude to use Server Sent Events (the streaming connection type we set up). Replace &lt;code&gt;your-secret-key&lt;/code&gt; with the same MCP_API_KEY from your &lt;code&gt;.env&lt;/code&gt; file.&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Works in Practice
&lt;/h2&gt;

&lt;p&gt;Now, when working in Claude:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; internal monologue 150 lines - design user experience for login flow
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Claude detects the phrase, calls the MCP tool, generates the detailed monologue (e.g., a debate on intuitive interfaces vs secure processes, navigation logic, etc.), and uses it to design the feature thoughtfully.&lt;/p&gt;

&lt;p&gt;A sample monologue excerpt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. [Creative] Login flow should be innovative and seamless – perhaps biometric integration for delight?
2. [Pragmatic] Biometrics add complexity; focus on reliable password handling in auth.js first.
3. [Creative] But user experience suffers with forms – question if we can animate transitions smoothly.
4. [Pragmatic] Animations might impact performance on mobile; consider edge cases in responsive design.
...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Finj50qff2o8aq1b9w165.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Finj50qff2o8aq1b9w165.png" alt="Monologue Generation Screen Illustration" width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Matters
&lt;/h2&gt;

&lt;p&gt;This MCP setup boosts programming efficiency by leveraging AI tools for consistent planning and productivity gains, while experimenting with a non typical application to explore MCPs more creatively and deeply.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next?
&lt;/h2&gt;

&lt;p&gt;One could build other creative tools, such as one that fetches and analyzes server logs directly, or another that integrates with external APIs for real time data checks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Your Turn
&lt;/h2&gt;

&lt;p&gt;What repetitive tasks do you deal with in your daily work? Maybe you can create an MCP. The code is ready to adapt and build something.&lt;/p&gt;




&lt;p&gt;Questions? Leave a comment below and I'll be happy to help!&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>javascript</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>A2A - Understanding the Basics and Building Multi-Agent Flight Management System</title>
      <dc:creator>Eze Quiroga</dc:creator>
      <pubDate>Wed, 09 Jul 2025 03:33:15 +0000</pubDate>
      <link>https://dev.to/cloudx/a2a-understanding-the-basics-and-building-multi-agent-flight-management-system-39c7</link>
      <guid>https://dev.to/cloudx/a2a-understanding-the-basics-and-building-multi-agent-flight-management-system-39c7</guid>
      <description>&lt;h2&gt;
  
  
  🌟 Introduction
&lt;/h2&gt;

&lt;p&gt;Continuing with the context I shared in the previous article -&lt;a href="https://dev.to/cloudx/mcp-understanding-the-basics-and-building-a-research-paper-management-chatbot-18le"&gt;MCP - Understanding the Basics and Building a Research Paper Management Chatbot&lt;/a&gt;-, where I spotted the fact there's been a growing need for a standard way to enable communication between agents and give them richer context to handle complex tasks through natural language, it's time to explore how to communicate agents or even complete agentic systems in a standard way.&lt;/p&gt;

&lt;p&gt;That's where &lt;strong&gt;Google's A2A (Agent-to-Agent) protocol&lt;/strong&gt; comes in.&lt;br&gt;
Announced by Google on &lt;strong&gt;April 9, 2025&lt;/strong&gt;, this emerging protocol standardizes how AI agents communicate with each other, enabling them to share context, delegate tasks, and collaborate on complex objectives that require multiple specialized capabilities.&lt;/p&gt;

&lt;p&gt;In this post, we'll walk through building a command-line multi-agent system using the A2A protocol. We'll learn how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create A2A agents with their cards and skills&lt;/li&gt;
&lt;li&gt;Configure how agents will return information&lt;/li&gt;
&lt;li&gt;Use a centralized LangChain ReAct agent to call A2A agents&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end, our chatbot will be able to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;(&lt;strong&gt;employee_flight_request_agent&lt;/strong&gt;) Know the status of corporate flight orders (pending purchase, purchased, and associated with a specific person)&lt;/li&gt;
&lt;li&gt;(&lt;strong&gt;airport_knowledge_base_agent&lt;/strong&gt;) Obtain information about airports and cities&lt;/li&gt;
&lt;li&gt;(&lt;strong&gt;flight_search_agent&lt;/strong&gt;) Search for real flight information departing from a specific airport&lt;/li&gt;
&lt;li&gt;Recommend airports for flights pending acquisition&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's how we'll break it down:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Context&lt;/li&gt;
&lt;li&gt;Local environment setup&lt;/li&gt;
&lt;li&gt;What is A2A?&lt;/li&gt;
&lt;li&gt;Core components&lt;/li&gt;
&lt;li&gt;Communicating agents&lt;/li&gt;
&lt;li&gt;Building A2A Agents&lt;/li&gt;
&lt;li&gt;Our chatbot&lt;/li&gt;
&lt;li&gt;Running our chatbot&lt;/li&gt;
&lt;li&gt;Key features&lt;/li&gt;
&lt;li&gt;Final thoughts&lt;/li&gt;
&lt;li&gt;Resources&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's get started! 🚀&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Important note: Only the most relevant function signatures and docstrings are shown in this post. You can find the full implementation in &lt;a href="https://github.com/ezequiroga/a2a-bases" rel="noopener noreferrer"&gt;ezequiroga/a2a-bases&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🤝 Context
&lt;/h2&gt;

&lt;p&gt;The main objective of the project is to recommend departure airports for corporate flights. For this, we will create a chatbot and three A2A agents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;employee_flight_request_agent&lt;/strong&gt;: Manages employee flight requests and booking status using an internal database. It returns results immediately: receives requests → processes → returns results.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;airport_knowledge_base_agent&lt;/strong&gt;: Acts as a knowledge database that provides airport information and city-airport mappings. Since the main purpose of this article is to explore A2A, this agent uses fuzzy matching to retrieve information. It uses &lt;code&gt;streaming&lt;/code&gt; to return its results.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;flight_search_agent&lt;/strong&gt;: Performs real-time flight search using external aviation data from the &lt;code&gt;Aviation Stack API&lt;/code&gt;. This agent uses a &lt;code&gt;ReAct Agent from LangChain&lt;/code&gt; to create filters for the tool that interacts with the &lt;code&gt;Aviation Stack API&lt;/code&gt;. It responds to requests by sending push notifications.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These three agents will be called through the chatbot, which uses a &lt;code&gt;ReAct Agent from LangChain&lt;/code&gt; to interact with the user and decide which agent should be called.&lt;/p&gt;

&lt;p&gt;Each agent uses a different communication method with our chatbot, so our chatbot needs to adapt to each of them.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjl9dxuzh3dui5n7m6mk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhjl9dxuzh3dui5n7m6mk.png" alt="Article Architecture" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🛠️ Local environment
&lt;/h2&gt;

&lt;p&gt;🐍 Python &lt;code&gt;3.13.5&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Install the required packages from requirements.txt using the &lt;code&gt;uv add -r requirements.txt&lt;/code&gt; command or &lt;code&gt;pip install -r requirements.txt&lt;/code&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;IMPORTANT NOTE&lt;/strong&gt;: The A2A protocol library must be installed using &lt;strong&gt;UV&lt;/strong&gt; to avoid installation errors. This is the recommended approach according to the official A2A documentation. Using pip may result in dependency conflicts or incomplete installations.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;Pro tip: Use Python virtual environments for cleaner dependency management.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  🤔 What Is Agent2Agent (A2A)?
&lt;/h2&gt;

&lt;p&gt;Let's explore what the A2A protocol is and how it enables seamless agent-to-agent communication. For more details, check out the Resources section at the end.&lt;/p&gt;

&lt;p&gt;The A2A protocol was created by Google with the goal of standardizing and simplifying both communication and interoperability between AI Agents or even complete Agentic Systems.&lt;/p&gt;

&lt;p&gt;As the &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/specification/#11-key-goals-of-a2a" rel="noopener noreferrer"&gt;official documentation&lt;/a&gt; states, A2A's key goals are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Interoperability&lt;/strong&gt;: Bridge the communication gap between disparate agentic systems.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Collaboration&lt;/strong&gt;: Enable agents to delegate tasks, exchange context, and work together on complex user requests.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery&lt;/strong&gt;: Allow agents to dynamically find and understand the capabilities of other agents.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility&lt;/strong&gt;: Support various interaction modes including synchronous request/response, streaming for real-time updates, and asynchronous push notifications for long-running tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security&lt;/strong&gt;: Facilitate secure communication patterns suitable for enterprise environments, relying on standard web security practices.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Asynchronicity&lt;/strong&gt;: Natively support long-running tasks and interactions that may involve human-in-the-loop scenarios.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The communication is based on HTTP(S) as the transport protocol and defines that each server exposes its services through a URL included in its &lt;em&gt;AgentCard&lt;/em&gt;. All data exchange is based on JSON-RPC 2.0, ensuring that requests and responses follow a consistent and standard format, always with Content-Type: application/json.&lt;/p&gt;

&lt;p&gt;And, for real-time updates, A2A supports streaming using Server-Sent Events (SSE). In these cases, the server returns continuous events with embedded JSON-RPC responses, allowing agents to maintain open communication flows for long-duration messages or tasks.&lt;/p&gt;
&lt;h3&gt;
  
  
  Official SDK
&lt;/h3&gt;

&lt;p&gt;The official SDK allows us to abstract away from writing JSON code by using classes and methods that facilitate communication. The recommended way to install the SDK is using &lt;code&gt;UV&lt;/code&gt; by running the following:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uv add a2a-sdk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important note&lt;/strong&gt;: You need to initialize the &lt;code&gt;uv&lt;/code&gt; project since our A2A servers will run using &lt;code&gt;uv&lt;/code&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 Core Components
&lt;/h2&gt;

&lt;p&gt;A2A communication is built around several key components that define the message structure required for proper agent interaction.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A2A Client&lt;/strong&gt;: An application or agent that initiates requests to an A2A Server on behalf of a user or another system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A2A Server (Remote Agent)&lt;/strong&gt;: An agent or agentic system that exposes an A2A-compliant HTTP endpoint, processing tasks and providing responses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Card&lt;/strong&gt;: A JSON metadata document published by an A2A Server, describing its identity, capabilities, skills, service endpoint, and authentication requirements.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Task&lt;/strong&gt;: The fundamental unit of work managed by A2A, identified by a unique ID. Tasks are stateful and progress through a defined lifecycle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message&lt;/strong&gt;: A communication turn between a client and a remote agent, having a role ("user" or "agent") and containing one or more Parts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Part&lt;/strong&gt;: The smallest unit of content within a Message or Artifact (e.g., TextPart, FilePart, DataPart).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Artifact&lt;/strong&gt;: An output (e.g., a document, image, structured data) generated by the agent as a result of a task, composed of Parts.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Important Note&lt;/strong&gt;: The protocol is based on JSON-RPC 2.0, which means all messages are sent in JSON format. To simplify development, we will use the official SDK.&lt;/p&gt;

&lt;h2&gt;
  
  
  📡 Communicating Agents
&lt;/h2&gt;

&lt;p&gt;A2A specifies three different communication patterns for A2A Servers to interact with A2A Clients:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Standard HTTP(S) Communication&lt;/strong&gt;: The client sends a request and the server sends a response, completing the standard HTTP(S) protocol cycle&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming (SSE)&lt;/strong&gt;: Real-time, incremental updates for tasks (status changes, artifact chunks) delivered via Server-Sent Events&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push Notifications&lt;/strong&gt;: Asynchronous task updates delivered via server-initiated HTTP POST requests to a client-provided webhook URL, for long-running or disconnected scenarios&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You can see a communication sequence diagram in &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/topics/what-is-a2a/#a2a-request-lifecycle" rel="noopener noreferrer"&gt;A2A Request Lifecycle&lt;/a&gt; and dive deeper into communication methods in &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/topics/streaming-and-async/" rel="noopener noreferrer"&gt;Streaming &amp;amp; Asynchronous Operations in A2A&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The Agent's communication method is defined in its &lt;em&gt;Agent Card&lt;/em&gt;. Our system has three A2A agents, each using a different communication approach. Let's go ahead and start creating our agents.&lt;/p&gt;

&lt;h2&gt;
  
  
  🏗️ Building A2A Agents
&lt;/h2&gt;

&lt;p&gt;The first step in creating an A2A agent is defining its &lt;em&gt;Agent Card&lt;/em&gt;. This card is essential as it describes the server's identity, capabilities, skills, service endpoint URL, and authentication requirements. Clients use the information in the &lt;em&gt;Agent Card&lt;/em&gt; to understand how to interact with the agent.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/specification/#5-agent-discovery-the-agent-card" rel="noopener noreferrer"&gt;Agent Card&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;As described previously, the &lt;em&gt;Agent Card&lt;/em&gt; is a JSON metadata document published by the A2A Server, describing its identity, capabilities, skills, service endpoint, authentication and how clients should interact with it.&lt;/p&gt;

&lt;p&gt;The recommended location for the &lt;em&gt;Agent Card&lt;/em&gt;, following the well-known URI strategy, is &lt;code&gt;http(s)://{server_domain}/.well-known/agent.json&lt;/code&gt;. Using the official SDK, the Agent Card will be available at that path automatically. Below, you can see the Agent Cards for each of our A2A Agents.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Agent Card: Employee Flight Request&lt;/strong&gt; -&amp;gt; This card defines that our agent will respond to every request immediately. It also specifies that the agent has three skills: &lt;code&gt;list_pending_requests_skill&lt;/code&gt;, &lt;code&gt;list_booked_requests_skill&lt;/code&gt; and &lt;code&gt;check_employee_request_skill&lt;/code&gt;. The protocol does not specify how the agent knows which skill should be performed upon a request - it's the agent's responsibility to determine which skill to execute.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;public_agent_card&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentCard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Employee Flight Request Management Agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Agent for managing and checking employee flight requests and bookings&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http://localhost:9992/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;defaultInputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;defaultOutputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AgentCapabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streaming&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;list_pending_requests_skill&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;list_booked_requests_skill&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;check_employee_request_skill&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;supportsAuthenticatedExtendedCard&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Agent Card: Airport Knowledge Base&lt;/strong&gt; -&amp;gt; Here, the card specifies that the agent will stream the response to the client using &lt;code&gt;streaming=True&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;public_agent_card&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentCard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Airport Knowledge Base Agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Knowledge base agent for retrieving correct airport names and city-airport mappings&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http://localhost:9991/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;defaultInputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;defaultOutputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AgentCapabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streaming&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;airport_knowledge_skill&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Agent Card: Flight Search&lt;/strong&gt; -&amp;gt; This agent, as its card describes by &lt;code&gt;pushNotifications=True&lt;/code&gt;, will send push notifications to the clients.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;public_agent_card&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentCard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Flight Search Agent&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Real-time flight search agent with push notification capabilities for aviation data&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http://localhost:9993/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;defaultInputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;defaultOutputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;capabilities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;AgentCapabilities&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streaming&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pushNotifications&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;skills&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;flight_search_skill&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;supportsAuthenticatedExtendedCard&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; Since the main purpose of this article and the project that implements the code explained here is to demonstrate the A2A Protocol, our chatbot knows beforehand how each agent will send the responses. However, in a real scenario, the client may need to implement a way to handle communications based on the agent cards.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/specification/#554-agentskill-object" rel="noopener noreferrer"&gt;Agent Skills&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;An &lt;em&gt;Agent Skill&lt;/em&gt; is a specific capability, function, or area of expertise the agent can perform or address. An agent can define more than one skill - as our &lt;em&gt;Employee Flight Request Management Agent&lt;/em&gt; does - in its card. Nevertheless, the protocol says nothing about how the agent knows which skill the user is trying to execute. Thus, it is the responsibility of the agent to determine which skill to perform based on the client's message.&lt;/p&gt;

&lt;p&gt;Below is the definition of one skill used by the &lt;em&gt;Employee Flight Request Management Agent&lt;/em&gt; - the other skills are defined in a similar fashion.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;list_pending_requests_skill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentSkill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;list_pending_requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;List Pending Flight Requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;List all employee flight requests that are not yet booked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;flight&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;left&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;available&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;not booked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;employee&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;examples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;list pending flight requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;show pending requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;which flights are not booked&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;display remaining requests&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;a href="https://a2aprotocol.ai/docs/guide/google-a2a-python-sdk-tutorial#_3-implementing-the-agent-executor" rel="noopener noreferrer"&gt;Agent Executor&lt;/a&gt;
&lt;/h3&gt;

&lt;p&gt;The &lt;em&gt;Agent Executor&lt;/em&gt; is the central component that handles the processing logic of A2A agents and is responsible for processing incoming requests and generating corresponding responses. The SDK provides an abstract base class &lt;code&gt;a2a.server.agent_execution.AgentExecutor&lt;/code&gt; that we must implement to create our agent. This class defines two main methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;async def execute(self, context: RequestContext, event_queue: EventQueue):&lt;/code&gt; Handles incoming requests that expect a response or a stream of events.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;async def cancel(self, context: RequestContext, event_queue: EventQueue):&lt;/code&gt; Handles requests to cancel an ongoing task.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;RequestContext&lt;/code&gt; provides information about the incoming request, and the &lt;code&gt;EventQueue&lt;/code&gt; is used to send events back to the client.&lt;/p&gt;

&lt;p&gt;This is the Agent Executor implementation for our &lt;em&gt;Employee Flight Request Management Agent&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmployeeFlightRequestAgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Employee flight request management agent executor.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EmployeeFlightRequestAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;EventQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user_input&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;new_agent_text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;EventQueue&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;new_agent_text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Flight request operation cancelled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This line &lt;code&gt;response = await self.agent.invoke(query)&lt;/code&gt; calls and executes the actual logic of our agent, querying the mocked database and returning the data.&lt;/p&gt;

&lt;p&gt;Notice the line &lt;code&gt;await event_queue.enqueue_event(new_agent_text_message(response))&lt;/code&gt;. This is really important because it's how the protocol allows the server to respond to the clients. The &lt;code&gt;event_queue.enqueue_event&lt;/code&gt; is the way to return messages even if &lt;code&gt;stream&lt;/code&gt; is &lt;strong&gt;False&lt;/strong&gt; in the &lt;em&gt;Agent Card&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Creating and sending messages
&lt;/h3&gt;

&lt;p&gt;In this section we will explore how to create messages and send them to the clients. In the section Our chatbot we will describe how it handles each kind of communication.&lt;/p&gt;

&lt;p&gt;The simplest way to create a message to send to an A2A Client is using &lt;code&gt;a2a.utils.new_agent_text_message(text: str, context_id: str | None = None, task_id: str | None = None) -&amp;gt; Message:)&lt;/code&gt;. This function returns the following object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nc"&gt;Part&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;root&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;TextPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;))],&lt;/span&gt;
    &lt;span class="n"&gt;messageId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;context_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our &lt;em&gt;Employee Flight Request Management Agent&lt;/em&gt; uses this method to create messages in response to the client. The created message is sent using the &lt;code&gt;event_queue.enqueue_event&lt;/code&gt; method. See the code below.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmployeeFlightRequestAgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;EventQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user_input&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enqueue_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;new_agent_text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our next agent, &lt;em&gt;Airport Knowledge Base&lt;/em&gt;, streams messages to the clients. To achieve this, we need to use another class provided by the SDK: &lt;code&gt;a2a.server.tasks.TaskUpdater&lt;/code&gt;. This class allows agents to publish updates to a task's event queue. Based on this, the messages to stream must contain the &lt;code&gt;task.id&lt;/code&gt; and &lt;code&gt;task.contextId&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is the Agent Executor for that agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AirportKnowledgeBaseAgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Airport knowledge base agent executor.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AirportKnowledgeBaseAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;EventQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user_input&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;new_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;updater&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TaskUpdater&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note the creation of the &lt;code&gt;TaskUpdater&lt;/code&gt; instance: it takes the event_queue, the task id and the context id from the task. Then, within the method &lt;code&gt;self.agent.invoke(...)&lt;/code&gt; we use the updater object to stream messages as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;TaskUpdater&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Retrieve airport information using fuzzy matching by name and municipality.

    Args:
        context: Request context
        event_queue: Event queue for streaming messages
        query: Search string (city or airport name)

    Returns:
        String with top 5 airport names and top 5 cities with their airports
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;airport_knowledge&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nf"&gt;new_agent_text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Airport knowledge base not loaded. Please check the database files.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;final&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;working&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nf"&gt;new_agent_text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📚 Accessing airport knowledge base...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;result_lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;messageId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Throughout this code we can also introduce some useful classes from the SDK (package &lt;code&gt;a2a.types&lt;/code&gt;):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;TaskState&lt;/em&gt;: enum representing the possible states of a Task&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Role&lt;/em&gt;: enum representing a message sender's role&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;TextPart&lt;/em&gt;: represents a text segment within parts&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Message&lt;/em&gt;: class that represents a single message exchanged between user and agent.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So far we've seen how agents respond in a single timeline: from receiving the request until sending their response, either through a single message or by streaming multiple messages until completing the cycle.&lt;/p&gt;

&lt;p&gt;But if a task could take a long time to finish, it's not a good idea to make the client wait until the end while keeping a connection alive. For these cases, we can use push notifications. The last agent we will create does exactly this.&lt;/p&gt;

&lt;p&gt;As a requirement imposed by the protocol, a client that wants to receive push notifications should explicitly specify the &lt;code&gt;endpoint&lt;/code&gt; enabled for that purpose. We will see how to do that in the section Our chatbot. Here, we describe how the push notification process works.&lt;/p&gt;

&lt;p&gt;In the section Building A2A Servers we will see that we need to use a &lt;code&gt;request_handler&lt;/code&gt; for creating an &lt;em&gt;A2A Server&lt;/em&gt;. The SDK provides us with the following implementation: &lt;code&gt;a2a.server.request_handlers.DefaultRequestHandler&lt;/code&gt;. At the time this article was written (Jul 2025), that handler does not properly manage push notifications. Therefore, you can extend that class and override the methods within it. The class &lt;a href="https://github.com/ezequiroga/a2a-bases/blob/main/flight_search_agent/custom_request_handler.py" rel="noopener noreferrer"&gt;CustomRequestHandler&lt;/a&gt; does exactly that by overriding only one method: &lt;code&gt;on_message_send_stream&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;These are the most relevant parts of the new implementation&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CustomRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;DefaultRequestHandler&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Custom request handler that extends DefaultRequestHandler.

    This handler maintains all default functionality while providing
    custom implementation for the streaming message send method.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;on_message_send_stream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;MessageSendParams&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ServerCallContext&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AsyncGenerator&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Custom handler for &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message/stream&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; (streaming).

        Starts the agent execution and yields events as they are produced
        by the agent.
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;task_manager&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TaskManager&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;context_id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;task_store&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;task_store&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;initial_message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c1"&gt;# Start new code #
&lt;/span&gt;        &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;TaskStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;submitted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;task_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save_task_event&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;task_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_task&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="c1"&gt;# End new code #
&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;task_manager&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_with_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;should_add_push_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_push_notifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PushNotifier&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;MessageSendConfiguration&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;assert&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pushNotificationConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;PushNotificationConfig&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_push_notifier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pushNotificationConfig&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;EventQueue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="bp"&gt;...&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="bp"&gt;...&lt;/span&gt;
            &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result_aggregator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;consume_and_emit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;consumer&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="bp"&gt;...&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_push_notifier&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;latest_task&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;result_aggregator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_result&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latest_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_push_notifier&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_notification&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latest_task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_cleanup_producer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;producer_task&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Moreover, to enable the ability to send push notifications in our agent, we need to add &lt;code&gt;push_notifier=InMemoryPushNotifier(httpx_client=httpx.AsyncClient())&lt;/code&gt; when creating the &lt;code&gt;request_handler&lt;/code&gt; for building the A2A Server. This is shown again in the section Running A2A servers.&lt;/p&gt;

&lt;p&gt;That's all. If &lt;code&gt;pushNotifications=True&lt;/code&gt; in the &lt;em&gt;Agent Card&lt;/em&gt;, the &lt;code&gt;PushNotifier&lt;/code&gt; is set in the &lt;code&gt;request_handler&lt;/code&gt; and the client provides the push notification endpoint, the SDK will automatically send messages to the provided endpoint each time a Task changes its status. It's important to mention that &lt;strong&gt;an instance of the class &lt;code&gt;Task&lt;/code&gt;&lt;/strong&gt; is being pushed. This is relevant because the endpoint our chatbot exposes for listening to notifications will receive that object as JSON and should be able to parse it.&lt;/p&gt;

&lt;p&gt;Because of this, the &lt;em&gt;Agent Executor&lt;/em&gt; of our agent simply sends messages using the &lt;code&gt;updater&lt;/code&gt; as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FlightSearchAgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;AgentExecutor&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Flight search agent executor with ReAct capabilities.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FlightSearchAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;RequestContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;EventQueue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute flight search request.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_user_input&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;updater&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TaskUpdater&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;submitted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nf"&gt;new_agent_text_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🤖 Flight Search Agent activated...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;current_task&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;working&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;updater&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The method &lt;code&gt;self.agent.invoke&lt;/code&gt; returns the following object:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;push_notification_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flights&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;final_response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flight_search_agent&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;metadata&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;task_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;context_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;context_id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;push_notification_payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;messageId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
    &lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;task_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;context_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Building A2A Servers
&lt;/h3&gt;

&lt;p&gt;At this point we have already achieved:&lt;/p&gt;

&lt;p&gt;1️⃣ Creating agent cards specifying their endpoint (&lt;code&gt;url&lt;/code&gt;: where the A2A service can be reached), skills (&lt;code&gt;skills&lt;/code&gt;) and capabilities (&lt;code&gt;capabilities&lt;/code&gt;: communication methods) among other properties&lt;/p&gt;

&lt;p&gt;2️⃣ Adding agent skills to the agent cards&lt;/p&gt;

&lt;p&gt;3️⃣ Implementing the agent executor for each agent&lt;/p&gt;

&lt;p&gt;4️⃣ Composing and sending messages depending on the agent's communication method&lt;/p&gt;

&lt;p&gt;Now, it's time to build the &lt;a href="https://a2aprotocol.ai/docs/guide/a2a-sdk-currency-agent-tutorial#agentserver" rel="noopener noreferrer"&gt;A2A servers&lt;/a&gt;. For this purpose, the SDK provides us with the class &lt;code&gt;a2a.server.apps.A2AStarletteApplication&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;To create an A2A Server, it's mandatory to use an &lt;code&gt;AgentCard&lt;/code&gt;, a &lt;code&gt;RequestHandler&lt;/code&gt; (to route incoming A2A RPC calls to the appropriate methods on your executor), an &lt;code&gt;AgentExecutor&lt;/code&gt; (to execute the core logic of how agents process requests and generate responses) and a &lt;code&gt;TaskStore&lt;/code&gt; (to manage the lifecycle of tasks).&lt;/p&gt;

&lt;p&gt;As can be seen in the code below, the SDK provides useful implementations of these classes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;request_handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DefaultRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_executor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;EmployeeFlightRequestAgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;task_store&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;InMemoryTaskStore&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2AStarletteApplication&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;agent_card&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;public_agent_card&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;http_handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request_handler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;uvicorn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;9992&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;&lt;strong&gt;Note&lt;/strong&gt;: Since &lt;code&gt;Starlette Application&lt;/code&gt; and &lt;code&gt;uvicorn&lt;/code&gt; are beyond the scope of this article, if you need more information, you can read about them in &lt;a href="https://www.starlette.io/" rel="noopener noreferrer"&gt;Starlette&lt;/a&gt; and &lt;a href="https://www.uvicorn.org/" rel="noopener noreferrer"&gt;uvicorn&lt;/a&gt; respectively.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Running A2A servers
&lt;/h3&gt;

&lt;p&gt;The recommended directory structure for the server is as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;├── name_of_the_agent/
│   ├── __main__.py -&amp;gt; contains AgentSkill, AgentCard, CustomRequestHandler, A2AStarletteApplication and the line to start the uvicorn server
│   └── agent_executor.py -&amp;gt; the Agent Executor implementation with the actual logic for processing incoming requests and generating responses
│   └── ...
│   └── subdirectories/
│       └── ...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the previous sections we already showed how to implement the &lt;em&gt;Agent Executor&lt;/em&gt;. Below, you can see an example of the &lt;code&gt;__main__.py&lt;/code&gt; file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# ...IMPORTS...
&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;

    &lt;span class="n"&gt;flight_search_skill&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentSkill&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;public_agent_card&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AgentCard&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;request_handler&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CustomRequestHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;agent_executor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;FlightSearchAgentExecutor&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;task_store&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;InMemoryTaskStore&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;push_notifier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;InMemoryPushNotifier&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# ONLY needed if the Agent will send push notifications
&lt;/span&gt;    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2AStarletteApplication&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;agent_card&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;public_agent_card&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;http_handler&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request_handler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;uvicorn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;build&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0.0.0.0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;port&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;9993&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With all of that in place, we can run our A2A Server by:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;name_of_the_agent/
uv run &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Important Note&lt;/strong&gt;: To run the server this way, you need to initialize a &lt;code&gt;uv&lt;/code&gt; project within the agent's folder. You can find more information in &lt;a href="https://docs.astral.sh/uv/concepts/projects/" rel="noopener noreferrer"&gt;UV Projects&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  🤖 Our chatbot
&lt;/h2&gt;

&lt;p&gt;The final piece of our AI system is our chatbot: the entry point for user interaction. Our chatbot has five important components that we'll analyze below and defines the &lt;code&gt;main&lt;/code&gt; function that contains the logic to allow users to enter prompts and display responses from our &lt;em&gt;A2A Agents&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;*Tool(BaseTool)&lt;/code&gt; classes implement &lt;code&gt;langchain.tools.BaseTool&lt;/code&gt; since they are &lt;em&gt;tools&lt;/em&gt; that our LangChain Agent can invoke. They all override the methods &lt;code&gt;def _run(self, query: str) -&amp;gt; str&lt;/code&gt; and &lt;code&gt;async def _arun(self, query: str) -&amp;gt; str&lt;/code&gt; with the actual logic of the &lt;em&gt;tool&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Note: The &lt;code&gt;chat_agent.py&lt;/code&gt; file needs to be refactored to move the classes declared in it to independent files.&lt;/em&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  A2AAgentRegistry
&lt;/h3&gt;

&lt;p&gt;A mock registry that manages A2A agent information, including their capabilities, base URLs, and initialized clients for communication.&lt;/p&gt;

&lt;p&gt;To obtain the information present in the &lt;em&gt;Agent Cards&lt;/em&gt;, we use &lt;code&gt;a2a.client.A2ACardResolver&lt;/code&gt; which automatically calls the endpoint &lt;code&gt;http(s)://{server_domain}/.well-known/agent.json&lt;/code&gt; and initiate communication with our &lt;em&gt;A2A Agents&lt;/em&gt; using &lt;code&gt;a2a.client.A2AClient&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;A2AAgentRegistry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Mock registry for A2A agents. Later will be replaced with real discovery.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;airport_knowledge_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="bp"&gt;...&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:9991&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;..&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employee_flight_requests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="bp"&gt;...&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:9992&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="p"&gt;..&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flight_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="bp"&gt;...&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:9993&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="bp"&gt;...&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize_agents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Initialize A2A clients for all registered agents.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;resolver&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2ACardResolver&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;card&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;AgentCard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;resolver&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_agent_card&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2AClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_card&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;card&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;card&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;card&lt;/span&gt;
                    &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Initialized &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; at &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;   📝 Description: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️  Could not connect to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; at &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;base_url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;card&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
                    &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Failed to initialize &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;Optional&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Any&lt;/span&gt;&lt;span class="p"&gt;]]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get agent info by ID.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;list_available_agents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;List all agents that are available (have active clients).&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="n"&gt;agent_id&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;agent_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agents&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  EmployeeFlightRequestTool
&lt;/h3&gt;

&lt;p&gt;A LangChain tool that checks the status of employee flight requests and booking information by communicating with the employee flight request agent. Our A2A Agent &lt;em&gt;Employee Flight Request Management Agent&lt;/em&gt; uses &lt;code&gt;capabilities=AgentCapabilities(streaming=False)&lt;/code&gt;, so the message is sent and the response is awaited.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;EmployeeFlightRequestTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseTool&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
&lt;/span&gt;&lt;span class="gp"&gt;    ...&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;A2AAgentRegistry&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_arun&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Async implementation to call the employee flight request agent.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employee_flight_requests&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Employee flight request agent is not available. Please check if the service is running.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="n"&gt;messageId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;send_message_payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MessageSendParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SendMessageRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; 
                &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;send_message_payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;📋 Checking flight requests for: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Error calling employee flight request agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  AirportKnowledgeTool
&lt;/h3&gt;

&lt;p&gt;A LangChain tool that retrieves airport information from the knowledge base agent when users ask about airport names or airports in specific cities. Our A2A Agent &lt;em&gt;Airport Knowledge Base Agent&lt;/em&gt; uses &lt;code&gt;capabilities=AgentCapabilities(streaming=True)&lt;/code&gt;, so this tool must receive a &lt;code&gt;stream&lt;/code&gt; of messages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AirportKnowledgeTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseTool&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
&lt;/span&gt;&lt;span class="gp"&gt;    ...&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;A2AAgentRegistry&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_arun&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Async implementation to call the airport knowledge base agent.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;airport_knowledge_base&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Airport knowledge base agent is not available. Please check if the service is running.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                &lt;span class="n"&gt;messageId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;streaming_request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SendStreamingMessageRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt; 
                &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;MessageSendParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;stream_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message_streaming&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;streaming_request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;full_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;📚 Looking up airport information for: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;stream_response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;json_chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;model_dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;json&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;exclude_none&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;json_chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;state&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;TaskState&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;full_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;✅ Knowledge base lookup completed&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json_chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="k"&gt;break&lt;/span&gt;
                &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📨 &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;json_chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;result&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;parts&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;full_response&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;full_response&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Knowledge base lookup completed - check the streaming output above.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Error calling airport knowledge base agent: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  FlightSearchTool
&lt;/h3&gt;

&lt;p&gt;A LangChain tool that searches for scheduled flights using Aviation Stack API through the flight search agent and handles results via push notifications since our A2A Agent &lt;em&gt;Flight Search Agent&lt;/em&gt; uses &lt;code&gt;capabilities=AgentCapabilities(streaming=True, pushNotifications=True)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;It's important to note how the message is created when using &lt;code&gt;a2a.types.SendStreamingMessageRequest&lt;/code&gt; since in this case we need to tell our &lt;em&gt;A2A Agent&lt;/em&gt; which &lt;strong&gt;endpoint&lt;/strong&gt; we make available to receive the &lt;em&gt;push notifications&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FlightSearchTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;BaseTool&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
&lt;/span&gt;&lt;span class="gp"&gt;    ...&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;
    &lt;span class="n"&gt;flight_search_callback_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HTTP_SERVER_PORT&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;FLIGHTS_ENDPOINT_PATH&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="c1"&gt;# TODO: DO NOT hardcode the callback URL
&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;A2AAgentRegistry&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="nf"&gt;super&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_arun&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Async implementation to call the flight search agent.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;flight_search&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Flight search agent is not available. Please check if the service is running.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;async_search&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Execute flight search asynchronously.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;A2AClient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent_info&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

                &lt;span class="n"&gt;part&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;TextPart&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;role&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;Role&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;part&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                    &lt;span class="n"&gt;messageId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
                    &lt;span class="n"&gt;contextId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
                    &lt;span class="n"&gt;taskId&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;SendStreamingMessageRequest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
                    &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;MessageSendParams&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                        &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                        &lt;span class="n"&gt;configuration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;MessageSendConfiguration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                            &lt;span class="n"&gt;acceptedOutputModes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
                            &lt;span class="n"&gt;pushNotificationConfig&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nc"&gt;PushNotificationConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                                &lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flight_search_callback_url&lt;/span&gt;
                            &lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send_message_streaming&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;pass&lt;/span&gt; &lt;span class="c1"&gt;# we don't care about the stream since messages will be received through the exposed endpoint
&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Error in background flight search: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;async_search&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="c1"&gt;# here we decouple this tool's execution from the LangChain flow
&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🛫 Flight search initiated in background for: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Flight search initiated - results will be sent via push notification once completed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  ReactChatAgent
&lt;/h3&gt;

&lt;p&gt;A LangGraph 'ReAct' Agent that orchestrates interactions between the user and our A2A agents through specialized tools. The most notable features are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Creates a FastAPI server that exposes an endpoint to receive &lt;em&gt;push notifications&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Uses &lt;code&gt;langgraph.checkpoint.memory.MemorySaver&lt;/code&gt; so our agent has short-term memory&lt;/li&gt;
&lt;li&gt;Defines the &lt;code&gt;setup_http_endpoints(...)&lt;/code&gt; method to listen and process &lt;em&gt;push notifications&lt;/em&gt; sent by the &lt;em&gt;A2A Agent&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Uses the &lt;code&gt;langgraph.prebuilt.create_react_agent&lt;/code&gt; function -ReAct Agents are deprecated so this function creates a Graph that calls tools in a loop until a stopping condition is met- to create a &lt;em&gt;Compiled Graph&lt;/em&gt; from LangChain that acts as the "brain" of our chat&lt;/li&gt;
&lt;li&gt;Defines the &lt;code&gt;chat(self, user_input: str, thread_id: str = "default") -&amp;gt; str:&lt;/code&gt; method that makes the call to the LLM
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ReactChatAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;LangGraph ReAct agent that can interact with A2A agents through tools and receive external messages via HTTP.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;A2AAgentRegistry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;api_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getenv&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ANTHROPIC_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ANTHROPIC_API_KEY environment variable must be set&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ChatAnthropic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claude-3-5-sonnet-20241022&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MemorySaver&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;external_message_queue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastAPI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ReAct Chat Agent API&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;1.0.0&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setup_http_endpoints&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;setup_http_endpoints&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Setup HTTP endpoints for receiving external messages.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

        &lt;span class="nd"&gt;@self.app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;FLIGHTS_ENDPOINT_PATH&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;receive_flight_findings&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;flight_finding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Receive flight findings and add them to the message queue.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="c1"&gt;# full implementation in the GitHub repository
&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;HTTPException&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;detail&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing flight findings: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Initialize the agent and its tools.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🤖 Initializing LangGraph ReAct Chat Agent with Anthropic Claude...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize_agents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="nc"&gt;AirportKnowledgeTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nc"&gt;EmployeeFlightRequestTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="nc"&gt;FlightSearchTool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="n"&gt;system_prompt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;You are a helpful assistant that manages employee flight requests in a corporate environment and can search for scheduled flights.
&lt;/span&gt;&lt;span class="gp"&gt;        ...&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;

        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_graph&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_react_agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;checkpointer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;memory&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;system_prompt&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;available_agents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_available_agents&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ LangGraph ReAct Agent initialized with &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;available_agents&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; available A2A agents: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;available_agents&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🧠 Using Anthropic Claude as the reasoning engine&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📡 HTTP endpoint available at: http://localhost:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HTTP_SERVER_PORT&lt;/span&gt;&lt;span class="si"&gt;}{&lt;/span&gt;&lt;span class="n"&gt;FLIGHTS_ENDPOINT_PATH&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_external_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;external_msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;InternalMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Process an external message and add it to agent memory.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="c1"&gt;# full implementation in the GitHub repository
&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;default&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;configurable&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thread_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;}}&lt;/span&gt;

            &lt;span class="n"&gt;messages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

            &lt;span class="n"&gt;last_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;
            &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;agent_graph&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;astream&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
                &lt;span class="n"&gt;config&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;config&lt;/span&gt;
            &lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;last_message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;

            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;last_message&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;last_message&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Response completed - check the output above.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ Error processing request: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;code&gt;main&lt;/code&gt; function
&lt;/h3&gt;

&lt;p&gt;When the function starts, it creates and initializes our &lt;code&gt;ReactChatAgent()&lt;/code&gt; which in turn calls the Agent Registry and creates our &lt;strong&gt;A2A Clients&lt;/strong&gt; to communicate with our &lt;strong&gt;A2A Servers&lt;/strong&gt; (A2A Agents).&lt;/p&gt;

&lt;p&gt;It also runs the FastAPI server in a separate execution thread, which allows listening to push notifications as they arrive.&lt;/p&gt;

&lt;p&gt;The code inside the &lt;code&gt;while&lt;/code&gt; loop is a bit complex since it processes the message queue from push notifications, interleaved with responses to user prompts, as they arrive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Main CLI loop for the chat agent with HTTP endpoint integration.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🚀 Starting LangGraph ReAct Chat Agent with A2A Integration&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🧠 Powered by Anthropic Claude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;📡 HTTP API Server Enabled&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;basicConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;level&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WARNING&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AsyncClient&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ReactChatAgent&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;httpx_client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🌐 Starting HTTP server on port &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HTTP_SERVER_PORT&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;http_thread&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;threading&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;target&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;run_http_server&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;,))&lt;/span&gt;
        &lt;span class="n"&gt;http_thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;daemon&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
        &lt;span class="n"&gt;http_thread&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;💬 Chat Agent Ready! (Type &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; to exit)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="bp"&gt;...&lt;/span&gt;

        &lt;span class="n"&gt;thread_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;console_session_&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;uuid4&lt;/span&gt;&lt;span class="p"&gt;())[:&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;prompt_shown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="n"&gt;should_exit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;should_exit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;external_message_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                    &lt;span class="n"&gt;external_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;external_message_queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;process_external_message&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;external_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;prompt_shown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;prompt_shown&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;👤 You: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;prompt_shown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdin&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;select&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stdin&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                    &lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;input&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                    &lt;span class="n"&gt;prompt_shown&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;quit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;exit&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bye&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;loop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_event_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;loop&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_closed&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;⚠️  Event loop is closed - cleaning up gracefully...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;✅ Event loop is healthy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ℹ️  No event loop available in current context&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;👋 Goodbye!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;should_exit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;🤖 LLM: ...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;flush&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thread_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;chunk_data&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;isinstance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;chunk_data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunk_data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunk_data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;messages&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;hasattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                                            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;**** 🤖 Agent pretty print *****&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;31&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                                            &lt;span class="k"&gt;break&lt;/span&gt;

            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;KeyboardInterrupt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;👋 Goodbye!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;should_exit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="bp"&gt;...&lt;/span&gt;

        &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Finally, to run our &lt;em&gt;chatbot&lt;/em&gt; we execute:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 chat_agent.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  🚀 Running our chatbot
&lt;/h2&gt;

&lt;p&gt;It's time to run our chatbot. But first, we need to start each of our &lt;em&gt;A2A Agents&lt;/em&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the airport knowledge base agent&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;airport_knowledge_base_agent/
uv run &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &amp;amp;

&lt;span class="nt"&gt;---&lt;/span&gt;
✅ Initialized flight request database with 10 records
INFO:     Started server process &lt;span class="o"&gt;[&lt;/span&gt;53208]
INFO:     Waiting &lt;span class="k"&gt;for &lt;/span&gt;application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9992 &lt;span class="o"&gt;(&lt;/span&gt;Press CTRL+C to quit&lt;span class="o"&gt;)&lt;/span&gt;
INFO:     127.0.0.1:60396 - &lt;span class="s2"&gt;"GET /.well-known/agent.json HTTP/1.1"&lt;/span&gt; 200 OK  &lt;span class="c"&gt;# This shows that our chatbot called the agent card&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the employee flight requests agent &lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ../employee_flight_requests_agent/
uv run &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &amp;amp;

&lt;span class="nt"&gt;---&lt;/span&gt;
✅ Loaded airport knowledge base: 8467 airports from 235 countries
INFO:     Started server process &lt;span class="o"&gt;[&lt;/span&gt;53217]
INFO:     Waiting &lt;span class="k"&gt;for &lt;/span&gt;application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9991 &lt;span class="o"&gt;(&lt;/span&gt;Press CTRL+C to quit&lt;span class="o"&gt;)&lt;/span&gt;
INFO:     127.0.0.1:60394 - &lt;span class="s2"&gt;"GET /.well-known/agent.json HTTP/1.1"&lt;/span&gt; 200 OK  &lt;span class="c"&gt;# This shows that our chatbot called the agent card&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start the flight search agent&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; ../flight_search_agent/
uv run &lt;span class="nb"&gt;.&lt;/span&gt; &lt;span class="nt"&gt;--host&lt;/span&gt; 0.0.0.0 &amp;amp;

&lt;span class="nt"&gt;---&lt;/span&gt;
✅ Initialized Flight Search ReAct Agent with Aviation Stack API
INFO:     Started server process &lt;span class="o"&gt;[&lt;/span&gt;53228]
INFO:     Waiting &lt;span class="k"&gt;for &lt;/span&gt;application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9993 &lt;span class="o"&gt;(&lt;/span&gt;Press CTRL+C to quit&lt;span class="o"&gt;)&lt;/span&gt;
INFO:     127.0.0.1:60398 - &lt;span class="s2"&gt;"GET /.well-known/agent.json HTTP/1.1"&lt;/span&gt; 200 OK  &lt;span class="c"&gt;# This shows that our chatbot called the agent card&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, we can run our chatbot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python chat_agent.py

&lt;span class="nt"&gt;---&lt;/span&gt;
🚀 Starting LangGraph ReAct Chat Agent with A2A Integration
🧠 Powered by Anthropic Claude
📡 HTTP API Server Enabled
&lt;span class="o"&gt;============================================================&lt;/span&gt;
🤖 Initializing LangGraph ReAct Chat Agent with Anthropic Claude...
✅ Initialized Airport Knowledge Base Agent at http://localhost:9991
   📝 Description: Knowledge base &lt;span class="k"&gt;for &lt;/span&gt;airport information and city-airport mappings
✅ Initialized Employee Flight Request Agent at http://localhost:9992
   📝 Description: Check employee flight requests and booking status
✅ Initialized Flight Search Agent at http://localhost:9993
   📝 Description: Scheduled flight search using Aviation Stack
✅ LangGraph ReAct Agent initialized with 3 available A2A agents: &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;'airport_knowledge_base'&lt;/span&gt;, &lt;span class="s1"&gt;'employee_flight_requests'&lt;/span&gt;, &lt;span class="s1"&gt;'flight_search'&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
🧠 Using Anthropic Claude as the reasoning engine
📡 HTTP endpoint available at: http://localhost:9990/api/flights-findings
🌐 Starting HTTP server on port 9990...

💬 Chat Agent Ready! &lt;span class="o"&gt;(&lt;/span&gt;Type &lt;span class="s1"&gt;'quit'&lt;/span&gt; to &lt;span class="nb"&gt;exit&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
You can ask about:
  - Airport knowledge base: &lt;span class="s1"&gt;'find airports in Madrid'&lt;/span&gt;
  - Airport information: &lt;span class="s1"&gt;'what airports are in Tokyo'&lt;/span&gt;
  - Employee flight requests: &lt;span class="s1"&gt;'check pending flight requests'&lt;/span&gt;
  - Employee status: &lt;span class="s1"&gt;'check John Smith flight request'&lt;/span&gt;
  - Flight search: &lt;span class="s1"&gt;'search flights from AEP on 2025-11-20'&lt;/span&gt;
  - Real-time flights: &lt;span class="s1"&gt;'find flights from JFK to LAX on 2025-12-01'&lt;/span&gt;

📡 HTTP Endpoints available:
  - POST http://localhost:9990/api/flights-findings
  - GET  http://localhost:9990/api/status
&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;

👤 You: 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First, we ask our agent "What are the pending flight requests?". Our agent will use the &lt;code&gt;employee_flight_requests&lt;/code&gt; tool to perform the search and display the results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;👤 You: What are the pending flight requests?
🤖 LLM: ...
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'agent'&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'messages'&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;AIMessage&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=[{&lt;/span&gt;&lt;span class="s1"&gt;'text'&lt;/span&gt;: &lt;span class="s2"&gt;"I'll help you check the pending flight requests using the employee_flight_requests tool."&lt;/span&gt;...
...
&lt;span class="k"&gt;****&lt;/span&gt; 🤖 Agent pretty print &lt;span class="k"&gt;*****&lt;/span&gt;
Based on the results, there are 5 pending flight requests awaiting booking:

1. Robert Johnson: New York to Los Angeles &lt;span class="o"&gt;(&lt;/span&gt;Dec 1, 2025&lt;span class="o"&gt;)&lt;/span&gt;
2. Anna Thompson: London to Dublin &lt;span class="o"&gt;(&lt;/span&gt;Oct 5, 2025&lt;span class="o"&gt;)&lt;/span&gt;
3. Sophie Martin: Paris to Rome &lt;span class="o"&gt;(&lt;/span&gt;July 12, 2025&lt;span class="o"&gt;)&lt;/span&gt;
4. Elena Popov: Berlin to Amsterdam &lt;span class="o"&gt;(&lt;/span&gt;Nov 18, 2025&lt;span class="o"&gt;)&lt;/span&gt;
5. Lisa Anderson: Sydney to Melbourne &lt;span class="o"&gt;(&lt;/span&gt;Aug 25, 2025&lt;span class="o"&gt;)&lt;/span&gt;

All these requests are currently &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="s2"&gt;"Awaiting booking"&lt;/span&gt; status. Would you like more specific information about any of these requests or would you like to check the status of booked flights as well?
&lt;span class="k"&gt;*******************************&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since our agent has memory, we can ask: "Which airports can Anna depart from?". In this case, we see that it uses the &lt;code&gt;airport_knowledge_base&lt;/code&gt; tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;👤 You: Which airports can Anna depart from?
🤖 LLM: ...
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'agent'&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'messages'&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;AIMessage&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=[{&lt;/span&gt;&lt;span class="s1"&gt;'text'&lt;/span&gt;: &lt;span class="s2"&gt;"Since Anna's request is for a flight from London, I'll use the airport_knowledge_base tool to find the available airports in London."&lt;/span&gt;...
...
&lt;span class="k"&gt;****&lt;/span&gt; 🤖 Agent pretty print &lt;span class="k"&gt;*****&lt;/span&gt;
For Anna&lt;span class="s1"&gt;'s flight from London to Dublin, she can depart from any of these major London airports in the UK:

1. London Heathrow Airport (LHR) - The largest and most well-connected airport
2. London Gatwick Airport (LGW) - Second largest airport
3. London Stansted Airport (STN) - Major hub for low-cost carriers
4. London City Airport (LCY) - Convenient for business travelers, located in the city
5. London Biggin Hill Airport (BQH) - Smaller airport primarily for private aviation
6. RAF Northolt (NHT) - Military airport with limited civilian use

The most commonly used airports for commercial flights to Dublin would be Heathrow (LHR), Gatwick (LGW), or Stansted (STN). Would you like me to search for specific flights from any of these airports to Dublin for Anna'&lt;/span&gt;s travel &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;October 5, 2025&lt;span class="o"&gt;)&lt;/span&gt;?
&lt;span class="k"&gt;*******************************&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;As the final step, we ask our agent: "Yes, what flights are available from Heathrow?". As we can see, this time it uses the &lt;code&gt;flight_search&lt;/code&gt; tool.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;👤 You: &lt;span class="s2"&gt;"Yes, what flights are available from Heathrow?"&lt;/span&gt;
🤖 LLM: ...
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'agent'&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'messages'&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;AIMessage&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=[{&lt;/span&gt;&lt;span class="s1"&gt;'text'&lt;/span&gt;: &lt;span class="s2"&gt;"I'll search for flights from London Heathrow (LHR) to Dublin (DUB) for Anna's travel date of October 5, 2025, using the flight_search tool."&lt;/span&gt;...
...
&lt;span class="k"&gt;****&lt;/span&gt; 🤖 Agent pretty print &lt;span class="k"&gt;*****&lt;/span&gt;
I&lt;span class="s1"&gt;'ve initiated the flight search from London Heathrow (LHR) to Dublin (DUB) for October 5, 2025. The search has been started and the results will be sent via push notifications. Once we receive the results, you'&lt;/span&gt;ll be able to see all available flights &lt;span class="k"&gt;for &lt;/span&gt;that route and &lt;span class="nb"&gt;date&lt;/span&gt;, including:
- Flight numbers
- Departure and arrival &lt;span class="nb"&gt;times&lt;/span&gt;
- Airlines
- Aircraft types
- Terminal information

Please &lt;span class="nb"&gt;wait &lt;/span&gt;&lt;span class="k"&gt;for &lt;/span&gt;the push notification with the detailed flight results, and &lt;span class="k"&gt;then &lt;/span&gt;we can &lt;span class="nb"&gt;help &lt;/span&gt;&lt;span class="k"&gt;select &lt;/span&gt;the most suitable flight &lt;span class="k"&gt;for &lt;/span&gt;Anna&lt;span class="s1"&gt;'s travel.
*******************************
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since the &lt;em&gt;A2A Agent&lt;/em&gt; for flight search sends push notifications, our agent displays the messages as they arrive.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;👤 You: 
🔍 External message

&lt;span class="nt"&gt;------------------------------------------------------------&lt;/span&gt;
🤖 Processing external message...
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'agent'&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'messages'&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;AIMessage&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"I'll help summarize the flight findings received. ...
...
**** 🤖 Agent Response to External Message pretty print *****
I'll help summarize the flight findings received. These are flights departing from London Heathrow (LHR) Terminal 2 at 06:00. Here's a breakdown of the available routes:

1. London (LHR) to Zurich (ZRH):
- Swiss/Air Canada codeshare flight LX345/AC6756
- Departure: 06:00, Terminal 2, Gate A18
- Arrival: 08:40, Terminal 2
- Aircraft: Airbus A220-100

2. London (LHR) to Vienna (VIE):
- Austrian Airlines flight OS458 (codeshared by Air Canada, ANA, and Asiana)
- Departure: 06:00, Terminal 2
- Arrival: 09:10, Terminal 3
- Aircraft: Airbus A320-271N

3. London (LHR) to Lisbon (LIS):
- TAP Air Portugal flight TP1363 (codeshared by Air Canada, Azul, Air India, and Azores Airlines)
- Departure: 06:00, Terminal 2, Gate A17
- Arrival: 08:45, Terminal 1
- Aircraft: Airbus A320-251N

All flights depart at the same time (06:00) from different gates at Terminal 2. These are primarily operated by European carriers with various codeshare agreements with other airlines.
*************************************************************
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We've achieved it! We've successfully searched for available flights for one of the requested trips. To exit our chatbot, we type &lt;code&gt;quit&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;👤 You: quit
✅ Event loop is healthy
👋 Goodbye!
🤖 LLM: ...
&lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'agent'&lt;/span&gt;: &lt;span class="o"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;'messages'&lt;/span&gt;: &lt;span class="o"&gt;[&lt;/span&gt;AIMessage&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'Goodbye!...
...
**** 🤖 Agent pretty print *****
Goodbye! Let me know if you need any further assistance with flight requests, bookings, or airport information in the future.
*******************************
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Note: If you're interested, you can see the entire output of the chatbot as well as the output of each of our agents in this document 👉 &lt;a href="https://github.com/ezequiroga/a2a-bases/blob/main/chatbot_responses.md" rel="noopener noreferrer"&gt;chatbot_responses.md&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  ✅ Key Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-Agent Communication&lt;/strong&gt;: Seamless coordination between specialized A2A agents using different communication patterns (standard HTTP, streaming, and push notifications)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocol Standardization&lt;/strong&gt;: Built on Google's A2A protocol ensuring interoperability and scalability across different agentic systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Flight Data&lt;/strong&gt;: Integration with Aviation Stack API for live flight information and airport recommendations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart Agent Orchestration&lt;/strong&gt;: LangGraph ReAct Agent that intelligently routes user requests to the appropriate A2A agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexible Communication Methods&lt;/strong&gt;: Demonstrates all three A2A communication patterns in a single system&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Corporate Flight Management&lt;/strong&gt;: Complete workflow for managing employee flight requests from pending to booked status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Chat Interface&lt;/strong&gt;: Command-line interface powered by Anthropic Claude for natural language interactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Push Notification Support&lt;/strong&gt;: Asynchronous task handling for long-running operations without blocking the user experience&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  💬 Final thoughts
&lt;/h2&gt;

&lt;p&gt;This post demonstrates how the A2A protocol can be used to build sophisticated multi-agent systems that coordinate and collaborate effectively. By standardizing agent-to-agent communication, A2A opens up new possibilities for creating complex AI workflows where specialized agents can work together seamlessly.&lt;/p&gt;

&lt;p&gt;The flight management system we built showcases the power of combining different communication patterns within a single application. From immediate responses for flight request status to streaming airport information and asynchronous flight searches, each agent operates optimally according to its specific requirements.&lt;/p&gt;

&lt;p&gt;As AI systems continue to evolve toward more distributed and specialized architectures, protocols like A2A will become increasingly important for enabling the next generation of collaborative AI applications.&lt;/p&gt;

&lt;h2&gt;
  
  
  📚 Resources
&lt;/h2&gt;

&lt;p&gt;Full code of this post 👉 &lt;a href="https://github.com/ezequiroga/a2a-bases" rel="noopener noreferrer"&gt;ezequiroga/a2a-bases&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google A2A Official documentation 👉 &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/" rel="noopener noreferrer"&gt;A2A Protocol&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google A2A Protocol JSON Specification 👉 &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/specification/" rel="noopener noreferrer"&gt;A2A Protocol Specification&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A2A Protocol Documentation 👉 &lt;a href="https://a2aprotocol.ai/docs/" rel="noopener noreferrer"&gt;A2A Protocol Documentation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agent2Agent (A2A) Python SDK Tutorial 👉 &lt;a href="https://a2aprotocol.ai/docs/guide/google-a2a-python-sdk-tutorial" rel="noopener noreferrer"&gt;A2A Protocol Documentation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google GitHub SDK examples repository 👉 &lt;a href="https://github.com/google-a2a/a2a-python" rel="noopener noreferrer"&gt;A2A Python SDK&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Google Python SDK Reference 👉 &lt;a href="https://a2aproject.github.io/A2A/v0.2.5/sdk/python/" rel="noopener noreferrer"&gt;Python SDK Reference&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Agent2Agent (A2A) Samples 👉 &lt;a href="https://github.com/a2aproject/a2a-samples/blob/main/README.md" rel="noopener noreferrer"&gt;A2A Samples&lt;/a&gt;&lt;/p&gt;

</description>
      <category>a2a</category>
      <category>ai</category>
      <category>chatbot</category>
      <category>agents</category>
    </item>
    <item>
      <title>Securing Microservices with JWT Validation at the Nginx Proxy Layer</title>
      <dc:creator>martinfernandezcx</dc:creator>
      <pubDate>Wed, 28 May 2025 20:16:43 +0000</pubDate>
      <link>https://dev.to/cloudx/securing-microservices-with-jwt-validation-at-the-nginx-proxy-layer-3dn</link>
      <guid>https://dev.to/cloudx/securing-microservices-with-jwt-validation-at-the-nginx-proxy-layer-3dn</guid>
      <description>&lt;p&gt;In a microservices architecture, separating concerns is critical for maintainability, scalability, and security. One key decision when building APIs is how and where to handle authentication. A common pattern is to delegate authentication to a dedicated &lt;strong&gt;authentication microservice&lt;/strong&gt;, which issues tokens (e.g., JWTs), and use those tokens to access protected resources on &lt;strong&gt;independent backend APIs&lt;/strong&gt;. When working on an infrastructure change, we faced the challenge of either integrating authentication in the Node.js backend (without the proper libraries) or maintaining a single backend solely for authorization.&lt;/p&gt;

&lt;p&gt;The options we considered were:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Having the Go backend validate the token and proxy to the Node.js backend over authenticated routes. (We tried this, but the Go proxy became messy and difficult to maintain.)&lt;/li&gt;
&lt;li&gt;Performing authentication in Node.js (infrastructure restrictions led us to abandon this approach.)&lt;/li&gt;
&lt;li&gt;Implementing a different authentication method using the existing infrastructure&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;And this third one is what we came up with after investigating.&lt;/p&gt;

&lt;p&gt;This post demonstrates how to validate JWT tokens directly in &lt;strong&gt;Nginx&lt;/strong&gt; before routing requests to your protected &lt;strong&gt;Node.js API&lt;/strong&gt;, centralizing authorization enforcement at the gateway layer.&lt;br&gt;
This keeps the authentication within the infrastructure boundaries and allows us to simplify both the Go backend and the Node.js backend by relying on the NGINX layer.&lt;/p&gt;
&lt;h2&gt;
  
  
  Why JWT at the Proxy?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Decouples concerns&lt;/strong&gt;: Authentication logic doesn't pollute your API code.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistent enforcement&lt;/strong&gt;: All routes must pass the same token checks before hitting backend services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Nginx (especially via OpenResty) is efficient and fast at handling token validation.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Options for JWT Validation
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Validate JWT in each backend service&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Pros: Full control per service.&lt;/li&gt;
&lt;li&gt;Cons: Repeated logic, potential for inconsistency.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Nginx with a third-party JWT module&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Commercial option with NGINX Plus.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use OpenResty (Nginx + Lua) with &lt;code&gt;lua-resty-jwt&lt;/code&gt;&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Open-source, flexible, and efficient.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  OpenResty + Lua
&lt;/h2&gt;

&lt;p&gt;We use OpenResty and the &lt;code&gt;lua-resty-jwt&lt;/code&gt; library to inspect JWTs in the Nginx layer. If valid, we forward requests to the backend. Otherwise, Nginx returns a 401 response.&lt;/p&gt;
&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;auth-api&lt;/code&gt;: issues JWTs via login endpoint.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;node-api&lt;/code&gt;: protected and public routes.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;nginx&lt;/code&gt;: gateway with Lua-based JWT validation.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  Security Considerations
&lt;/h2&gt;

&lt;p&gt;Some of these concerns  were left out of this POC but we would like to mention for a proper production implementation. Please read through and evaluate wether it fits to your scenario or not.&lt;/p&gt;
&lt;h3&gt;
  
  
  Protection Against Common Attacks
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Replay Attacks&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Implement token expiration (exp claim)&lt;/li&gt;
&lt;li&gt;Use short-lived tokens (15-60 minutes)&lt;/li&gt;
&lt;li&gt;Consider implementing a token blacklist for revoked tokens&lt;/li&gt;
&lt;li&gt;Use nonce values in token claims&lt;/li&gt;
&lt;li&gt;Implement request timestamp validation&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Token Theft Prevention&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Always use HTTPS for token transmission&lt;/li&gt;
&lt;li&gt;Implement secure cookie attributes (HttpOnly, Secure, SameSite)&lt;/li&gt;
&lt;li&gt;Use token binding to prevent token reuse&lt;/li&gt;
&lt;li&gt;Implement rate limiting on authentication endpoints&lt;/li&gt;
&lt;li&gt;Monitor for suspicious patterns (multiple failed validations)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Token Expiration Best Practices
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Short-lived Access Tokens&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Set expiration time between 15-60 minutes&lt;/li&gt;
&lt;li&gt;Use refresh tokens for longer sessions&lt;/li&gt;
&lt;li&gt;Implement sliding expiration for active users&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Refresh Token Strategy&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Longer expiration (days/weeks)&lt;/li&gt;
&lt;li&gt;Store refresh tokens securely&lt;/li&gt;
&lt;li&gt;Implement refresh token rotation&lt;/li&gt;
&lt;li&gt;Maintain a refresh token family tree&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Expiration Implementation&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;   &lt;span class="c1"&gt;-- Example of expiration check in Lua&lt;/span&gt;
   &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;jwt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"resty.jwt"&lt;/span&gt;
   &lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;validators&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s2"&gt;"resty.jwt-validators"&lt;/span&gt;

   &lt;span class="n"&gt;validators&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;set_system_leeway&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;-- Strict time validation&lt;/span&gt;
   &lt;span class="n"&gt;validators&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;register_validator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"exp"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;validators&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;opt_is_not_expired&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Grace Period Considerations&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Implement a small grace period (30 seconds) for clock skew&lt;/li&gt;
&lt;li&gt;Handle token expiration gracefully&lt;/li&gt;
&lt;li&gt;Provide clear error messages for expired tokens&lt;/li&gt;
&lt;li&gt;Implement automatic token refresh when possible&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Project Layout
&lt;/h2&gt;

&lt;p&gt;You can find the full source here:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;GitHub Repo: &lt;a href="https://github.com/martinfernandezcx/NGINXAUTH" rel="noopener noreferrer"&gt;martinfernandezcx/NGINXAUTH&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  How It Works
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Client logs in via &lt;code&gt;/api/auth/login&lt;/code&gt;, receives JWT.&lt;/li&gt;
&lt;li&gt;Client sends &lt;code&gt;Authorization: Bearer &amp;lt;token&amp;gt;&lt;/code&gt; on protected requests.&lt;/li&gt;
&lt;li&gt;Nginx runs a Lua script to:

&lt;ul&gt;
&lt;li&gt;Check token structure.&lt;/li&gt;
&lt;li&gt;Validate signature and expiration.&lt;/li&gt;
&lt;li&gt;Inject user ID into a request header.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Validated requests reach the Node.js service with identity attached.&lt;/li&gt;
&lt;/ol&gt;
&lt;h2&gt;
  
  
  Testing with Postman
&lt;/h2&gt;

&lt;p&gt;The project includes a comprehensive Postman test suite to verify the JWT authentication flow and API endpoints. The test suite covers authentication, public routes, and protected routes with various scenarios.&lt;/p&gt;
&lt;h3&gt;
  
  
  Test Suite Structure
&lt;/h3&gt;

&lt;p&gt;The Postman collection (&lt;code&gt;postman/jwt-nginx-auth-tests.json&lt;/code&gt;) includes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Authentication Tests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Login endpoint validation&lt;/li&gt;
&lt;li&gt;Token format verification&lt;/li&gt;
&lt;li&gt;Automatic token storage for subsequent requests&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Public Endpoint Tests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access to public routes&lt;/li&gt;
&lt;li&gt;Response format validation&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Protected Endpoint Tests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Access without token (401)&lt;/li&gt;
&lt;li&gt;Access with invalid token (401)&lt;/li&gt;
&lt;li&gt;Access with valid token (200)&lt;/li&gt;
&lt;li&gt;Response payload validation&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Running the Tests
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Prerequisites&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Install &lt;a href="https://www.postman.com/downloads/" rel="noopener noreferrer"&gt;Postman&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Start the application:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt; docker-compose up &lt;span class="nt"&gt;--build&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Import the Collection&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Open Postman&lt;/li&gt;
&lt;li&gt;Click "Import" button&lt;/li&gt;
&lt;li&gt;Select the &lt;code&gt;postman/jwt-nginx-auth-tests.json&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;select the &lt;code&gt;postman\environment.json&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;The collection will be imported with all test cases&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Run the Tests&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Select the "JWT Nginx Auth Tests" collection&lt;/li&gt;
&lt;li&gt;Click the "Run" button&lt;/li&gt;
&lt;li&gt;Postman will execute all tests in sequence&lt;/li&gt;
&lt;li&gt;View test results in the Postman console&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Test Flow&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tests run in a specific order to ensure proper token handling&lt;/li&gt;
&lt;li&gt;Login test stores the token for subsequent requests&lt;/li&gt;
&lt;li&gt;Protected route tests verify token validation&lt;/li&gt;
&lt;li&gt;Each test includes assertions for status codes and response formats&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Test Cases break down
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Login Test&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;   &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Status code is 200&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;have&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="p"&gt;});&lt;/span&gt;
   &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Response has token&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;jsonData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
       &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;jsonData&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;have&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;property&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;token&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Protected Route Test&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;   &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Status code is 200&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;have&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="p"&gt;});&lt;/span&gt;
   &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;test&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Response contains protected data&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
       &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;jsonData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
       &lt;span class="nx"&gt;pm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;jsonData&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;to&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;have&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;property&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;message&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
   &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  Environment Variables
&lt;/h3&gt;

&lt;p&gt;The test suite uses Postman environment variables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;auth_token&lt;/code&gt;: Automatically set after successful login&lt;/li&gt;
&lt;li&gt;Used in subsequent requests to protected routes&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Continuous Integration
&lt;/h3&gt;

&lt;p&gt;The Postman collection can be integrated into CI/CD pipelines using:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/postmanlabs/newman" rel="noopener noreferrer"&gt;Newman&lt;/a&gt; CLI tool&lt;/li&gt;
&lt;li&gt;Postman's CI/CD integrations&lt;/li&gt;
&lt;li&gt;Custom test runners&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example Newman command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;newman run postman/jwt-nginx-auth-tests.json &lt;span class="nt"&gt;-e&lt;/span&gt; postman/environment.json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Running the tests
&lt;/h3&gt;

&lt;p&gt;To run the tests you can use npm run test:postman:cli, or import both files on postman and run it there as mentioned above.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Centralizing JWT validation in the proxy simplifies backend services, enforces uniform security, and keeps authentication logic out of each microservice. This pattern is ideal for architectures using distinct auth and business logic APIs.&lt;/p&gt;

&lt;p&gt;In contrast, validating tokens in the Node.js API itself might allow greater control over roles or context-based access logic but at the cost of duplication and potential inconsistency.&lt;/p&gt;

&lt;p&gt;OpenResty strikes a solid balance between &lt;strong&gt;performance&lt;/strong&gt;, &lt;strong&gt;flexibility&lt;/strong&gt;, and &lt;strong&gt;maintainability&lt;/strong&gt; in JWT-based authentication.&lt;/p&gt;

&lt;h2&gt;
  
  
  Apendix-A: Problems Found and Solutions
&lt;/h2&gt;

&lt;p&gt;During the implementation of this JWT authentication system, we encountered several issues that required specific solutions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;OpenResty Dependencies&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem: Missing Perl and curl in the OpenResty Alpine image&lt;/li&gt;
&lt;li&gt;Solution: Added required packages in Dockerfile:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt; RUN apk add --no-cache perl curl
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Nginx User Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem: Missing nginx user in the container&lt;/li&gt;
&lt;li&gt;Solution: Created nginx user and group:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt; RUN addgroup -S nginx &amp;amp;&amp;amp; adduser -S -G nginx nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;MIME Types Configuration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem: Missing mime.types file in OpenResty Alpine image&lt;/li&gt;
&lt;li&gt;Solution: Created custom mime.types file and copied it to the correct location:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt; COPY mime.types /etc/nginx/mime.types
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Lua Package Path&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem: Lua package path directive in wrong context&lt;/li&gt;
&lt;li&gt;Solution: Moved lua_package_path to http context in nginx.conf:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight nginx"&gt;&lt;code&gt; &lt;span class="k"&gt;http&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
     &lt;span class="kn"&gt;lua_package_path&lt;/span&gt; &lt;span class="s"&gt;"/usr/local/openresty/lualib/?.lua&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;;&lt;span class="kn"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
     &lt;span class="kn"&gt;lua_package_cpath&lt;/span&gt; &lt;span class="s"&gt;"/usr/local/openresty/lualib/?.so&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;;&lt;span class="kn"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
 &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Log Directory Permissions&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem: Nginx couldn't write to log directory&lt;/li&gt;
&lt;li&gt;Solution: Created log directory and set proper permissions:
&lt;/li&gt;
&lt;/ul&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt; RUN mkdir -p /var/log/nginx &amp;amp;&amp;amp; \
     chown -R nginx:nginx /var/log/nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Unit tests and routes issues&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Problem: Postman tests were failing with 404 on /protected&lt;/li&gt;
&lt;li&gt;Solution: Changed auth-api/index.js /login route and node-api/index.js /protected to /user&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These solutions ensure proper functionality of the JWT authentication system while maintaining security and following best practices for containerized applications.&lt;/p&gt;

</description>
      <category>authentication</category>
      <category>nginx</category>
      <category>microservices</category>
      <category>security</category>
    </item>
    <item>
      <title>MCP - Understanding the Basics and Building a Research Paper Management Chatbot</title>
      <dc:creator>Eze Quiroga</dc:creator>
      <pubDate>Fri, 23 May 2025 18:01:52 +0000</pubDate>
      <link>https://dev.to/cloudx/mcp-understanding-the-basics-and-building-a-research-paper-management-chatbot-18le</link>
      <guid>https://dev.to/cloudx/mcp-understanding-the-basics-and-building-a-research-paper-management-chatbot-18le</guid>
      <description>&lt;h2&gt;
  
  
  🌟 Introduction
&lt;/h2&gt;

&lt;p&gt;Since early 2024, the use of AI Agents that can make autonomous decisions and leverage tools to respond to user prompts has grown rapidly. As these systems evolve, there's been a growing need for a standard way to enable communication between agents and give them richer context to handle more complex tasks through natural language.&lt;/p&gt;

&lt;p&gt;That's where the &lt;strong&gt;Model Context Protocol (MCP)&lt;/strong&gt; comes in. Announced by Anthropic on &lt;strong&gt;November 25, 2024&lt;/strong&gt;, MCP is an open-source protocol that standardizes how large language models (LLMs) interact with external tools and data sources.&lt;/p&gt;

&lt;p&gt;In this post, we'll walk through building a command-line chatbot that manages academic papers using MCP. We'll learn how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an MCP server that exposes tools, resources, and prompt templates&lt;/li&gt;
&lt;li&gt;Connect to third-party MCP servers&lt;/li&gt;
&lt;li&gt;Build a custom MCP client to interact with those servers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end, our chatbot will be able to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Search for academic papers on arXiv&lt;/li&gt;
&lt;li&gt;Organize articles by research topic&lt;/li&gt;
&lt;li&gt;Access metadata about saved papers&lt;/li&gt;
&lt;li&gt;Pull information from any URL using a third-party MCP server&lt;/li&gt;
&lt;li&gt;Generate documents using content retrieved from external sources&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's how we'll break it down:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Local environment setup&lt;/li&gt;
&lt;li&gt;What is MCP?&lt;/li&gt;
&lt;li&gt;Building the MCP server&lt;/li&gt;
&lt;li&gt;Using third-party MCP servers&lt;/li&gt;
&lt;li&gt;Creating the host and client&lt;/li&gt;
&lt;li&gt;Key features&lt;/li&gt;
&lt;li&gt;Final thoughts&lt;/li&gt;
&lt;li&gt;Resources&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Let's get started! 🚀&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Important note: Only the most relevant function signatures and docstrings are shown in this post. You can find the full implementation in &lt;a href="https://github.com/ezequiroga/mcp-bases" rel="noopener noreferrer"&gt;this GitHub repository&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🛠️ Local environment
&lt;/h2&gt;

&lt;p&gt;🐍 Python &lt;code&gt;3.12.8&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;⬢ Node &lt;code&gt;v22.13.0&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Run the following command to install the required Python packages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip3 &lt;span class="nb"&gt;install &lt;/span&gt;dotenv             &lt;span class="c"&gt;# Loads ANTHROPIC_API_KEY from a .env file&lt;/span&gt;
pip3 &lt;span class="nb"&gt;install &lt;/span&gt;anthropic          &lt;span class="c"&gt;# Client for Anthropic's Sonnet model&lt;/span&gt;
pip3 &lt;span class="nb"&gt;install &lt;/span&gt;mcp                &lt;span class="c"&gt;# Core package for MCP servers and clients&lt;/span&gt;
pip3 &lt;span class="nb"&gt;install &lt;/span&gt;arxiv              &lt;span class="c"&gt;# Used for querying arXiv articles&lt;/span&gt;
pip3 &lt;span class="nb"&gt;install &lt;/span&gt;mcp_server_fetch   &lt;span class="c"&gt;# Third-party party MPC server&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Pro tip: Use Python virtual environments and Node Version Manager (NVM) for cleaner dependency management.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  🤔 What Is MCP?
&lt;/h2&gt;

&lt;p&gt;Let's briefly cover what MCP is and how it works. For more details, check out the Resources section at the end.&lt;/p&gt;

&lt;p&gt;MCP (Model Context Protocol) is an open protocol designed to streamline the way LLMs connect to tools and data sources. It follows a client-server architecture where each MCP client maintains a direct, one-to-one connection with each server it talks to.&lt;/p&gt;

&lt;p&gt;Here's the breakdown:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Host&lt;/strong&gt;: The application that embeds the language model (e.g., Claude Desktop or a code editor)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client&lt;/strong&gt;: A component inside the host that manages the connection to one MCP server by invoking tools, querying for resources and interpolating prompts.&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Server&lt;/strong&gt;: Provides context to the LLM via three primitives:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt; – functions that can be invoked by the client. These tools allow for retrieving, searching, sending messages, updating database records are usually meant for data that might require something like a Post request or some kind of modification.&lt;br&gt;
&lt;strong&gt;Resources&lt;/strong&gt; – similar to a Get request. They are read-only data or context that's exposed by the server.similar to a Get request. They are read-only data or context that's exposed by the server.&lt;br&gt;
&lt;strong&gt;Prompt templates&lt;/strong&gt; – predefined templates that live on the server to remove the burden of prompt engineering from users by providing optimized prompts for common tasks.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How They Communicate?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The communication between client and server follows a specific lifecycle. First, there's an initialization process where the client sends a request, the server responds, and sends a confirmation notification. After initialization, both parties can freely exchange messages and notifications.&lt;/p&gt;

&lt;p&gt;To enable this communication, MCP provides different transport mechanisms that handle the actual data flow between client and server:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standard I/O for local servers - The client launches the server as a subprocess and communicates through stdin/stdout&lt;/li&gt;
&lt;li&gt;HTTP with Server-Sent Events for remote servers - Maintains stateful connections between requests&lt;/li&gt;
&lt;li&gt;Streamable HTTP (recommended) - A newer transport that flexibly supports both stateful and stateless connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;As this article focuses on understanding the basics of MCP, we'll use the &lt;code&gt;stdio&lt;/code&gt; transport mechanism in a local environment.&lt;/p&gt;

&lt;p&gt;Let's code 🚀&lt;/p&gt;

&lt;h2&gt;
  
  
  🖥️ Building the MCP Server
&lt;/h2&gt;

&lt;p&gt;We'll use the &lt;code&gt;FastMCP&lt;/code&gt; framework to build our own MCP server in &lt;code&gt;research_server.py&lt;/code&gt;. FastMCP offers handy decorators to expose:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tools via &lt;code&gt;@mcp.tool()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Resources via &lt;code&gt;@mcp.resource()&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Prompt templates via &lt;code&gt;@mcp.prompt()&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1. Define the Server
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;mcp.server.fastmcp&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;FastMCP&lt;/span&gt;

&lt;span class="c1"&gt;# Create the server
&lt;/span&gt;&lt;span class="n"&gt;mcp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FastMCP&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;research&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tools
&lt;/h3&gt;

&lt;p&gt;Once the server is defined, we can start defining primitives. Let's start by defining two tools using &lt;code&gt;@mcp.tool()&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;1- &lt;code&gt;search_papers&lt;/code&gt;: Searches arXiv for articles on a specific topic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_papers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Search for papers on arXiv based on a topic and store their information.

    Args:
        topic: The topic to search for
        max_results: Maximum number of results to retrieve (default: 5)

    Returns:
        List of paper IDs found in the search
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation details...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;2- &lt;code&gt;extract_info&lt;/code&gt;: Retrieves metadata for a specific paper;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.tool&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;extract_info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paper_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Search for information about a specific paper across all topic directories.

    Args:
        paper_id: The ID of the paper to look for

    Returns:
        JSON string with paper information if found, error message if not found
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation details...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Resources
&lt;/h3&gt;

&lt;p&gt;Next, we need to define how users can access all available research topics and retrieve articles for a specific topic. For this, we expose two resources using &lt;code&gt;@mcp.resource(...)&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;1- A list of available research topics:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;papers://folders&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_available_folders&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    List all available topic folders in the papers directory.

    This resource provides a simple list of all available topic folders.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation details...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;2- Articles stored under a given topic:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;papers://{topic}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_topic_papers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Get detailed information about papers on a specific topic.

    Args:
        topic: The research topic to retrieve papers for
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation details...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Prompt templates
&lt;/h3&gt;

&lt;p&gt;To reduce the need for manual prompt engineering, we can define prompt templates. These are listed by the MCP client, interpolated with user input, and then sent to the LLM.&lt;/p&gt;

&lt;p&gt;To expose prompt templates, we must use &lt;code&gt;@mcp.prompt()&lt;/code&gt;. In our case, we will create just one prompt template, which will perform the search for articles in arXiv:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@mcp.prompt&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_search_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_papers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Generate a prompt for Claude to find and discuss academic papers on a specific topic.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Implementation details...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing our server
&lt;/h3&gt;

&lt;p&gt;There's a great way to test our server using the &lt;strong&gt;Model Context Protocol Inspector&lt;/strong&gt;, a handy tool built to explore MCP servers. As it's written in Type Script, we need to use &lt;code&gt;npx&lt;/code&gt;:&lt;/p&gt;

&lt;p&gt;The fisrt step is to install the last npm version:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; npm@latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, run in the command line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx @modelcontextprotocol/inspector python3 research_server.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the server is up and running, the URI is displayed in the console. By clicking it, the browser will show you this:&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs0rpr39ywifrqjh6y2d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhs0rpr39ywifrqjh6y2d.png" alt="MCP Inspector Interface" width="800" height="677"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Click the "Connect" button to start interacting with the server. You can then test the available tools, resources and prompts through the inspector interface.&lt;/p&gt;
&lt;h2&gt;
  
  
  🔌 Using Third-Party MCP Servers
&lt;/h2&gt;

&lt;p&gt;Now that we've built our own server, we can also use third-party MCP servers.&lt;/p&gt;

&lt;p&gt;In this case, we'll use the following two servers provided by Anthropic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/modelcontextprotocol/servers/tree/main/src/filesystem" rel="noopener noreferrer"&gt;Filesystem MCP Server&lt;/a&gt; -&amp;gt; handles file operations&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://github.com/modelcontextprotocol/servers/tree/main/src/fetch" rel="noopener noreferrer"&gt;Fetch MCP Server&lt;/a&gt; -&amp;gt; fetches information from URIs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both MCP servers have been developed by Anthropic and are used from their &lt;a href="https://github.com/modelcontextprotocol/servers?tab=readme-ov-file" rel="noopener noreferrer"&gt;official GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For our &lt;em&gt;host&lt;/em&gt; to create the MCP clients and connect to these servers, we need to create the &lt;code&gt;server_config.json&lt;/code&gt; file.&lt;/p&gt;

&lt;p&gt;In this file, we'll define the three servers we want to connect to. For each one, we must specify how they should be run. This information can be found in each server's documentation. In our case, our &lt;em&gt;.json&lt;/em&gt; would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;

        &lt;/span&gt;&lt;span class="err"&gt;//Third-Party&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;MCP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;developed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;TypeScript&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"filesystem"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"@modelcontextprotocol/server-filesystem"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
                &lt;/span&gt;&lt;span class="s2"&gt;"."&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;

        &lt;/span&gt;&lt;span class="err"&gt;//Our&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;MCP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;developed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Python&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"research"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"python3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"research_server.py"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;

        &lt;/span&gt;&lt;span class="err"&gt;//Third-Party&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;MCP&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;server&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;developed&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;Python&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"fetch"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"python3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-m"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp_server_fetch"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  👨‍💻 Creating the Host and Clients
&lt;/h2&gt;

&lt;p&gt;Our chatbot will serve as the host and create MCP clients to connect with each server. We'll define a class called &lt;code&gt;MCP_ChatBot&lt;/code&gt;, which will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Load and connect to the configured MCP servers&lt;/li&gt;
&lt;li&gt;Establish individual server connections&lt;/li&gt;
&lt;li&gt;Manage available tools and sessions&lt;/li&gt;
&lt;li&gt;Handle user queries&lt;/li&gt;
&lt;li&gt;Gracefully shut down connections&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;connect_to_servers()&lt;/code&gt; method will load the server configurations and establish connections to all servers, while &lt;code&gt;connect_to_server()&lt;/code&gt; handles connecting to individual servers and registering their available tools.&lt;/p&gt;

&lt;p&gt;Additionally, it will contain several methods:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;chat_loop(...)&lt;/code&gt; manages the command-line UI and allows users to enter their prompts&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;process_query(...)&lt;/code&gt; processes prompts using Anthropic's Sonnet as the LLM&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;cleanup(...)&lt;/code&gt; closes all client connections to servers when the user ends the chat&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our host &lt;code&gt;mcp_chatbot.py&lt;/code&gt; therefore has the following structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MCP_ChatBot&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="c1"&gt;# Managing the MCP's servers sessions
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AsyncExitStack&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;anthropic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Anthropic&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;available_tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ToolDefinition&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="c1"&gt;# All the availables tools
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_to_session&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Dict&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="c1"&gt;# Relations between tools and servers
&lt;/span&gt;    &lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;connect_to_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_config&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;server_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StdioServerParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;server_config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;stdio_transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enter_async_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nf"&gt;stdio_client&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;stdio_transport&lt;/span&gt;
            &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit_stack&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;enter_async_context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="nc"&gt;ClientSession&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# List available tools for this session
&lt;/span&gt;            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;list_tools&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tools&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;Connected to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;server_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; with tools:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_to_session&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;available_tools&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;description&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;input_schema&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inputSchema&lt;/span&gt;
                &lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to connect to &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;server_name&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;connect_to_servers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Connect to all configured MCP servers.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Here we are loading the servers out host must connect with
&lt;/span&gt;            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;server_config.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;servers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mcpServers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;server_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_config&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;servers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="c1"&gt;# This will create a client/session for each server
&lt;/span&gt;                &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect_to_server&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;server_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;server_config&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error loading server configuration: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat_loop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;&lt;span class="bp"&gt;...&lt;/span&gt;

    &lt;span class="bp"&gt;...&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;chatbot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;MCP_ChatBot&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatbot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect_to_servers&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatbot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;chat_loop&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;finally&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;chatbot&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cleanup&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This way, our &lt;em&gt;host&lt;/em&gt; (mcp_chatbot.py):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Creates as many clients as servers are defined in the &lt;code&gt;servers_config.json&lt;/code&gt; file&lt;/li&gt;
&lt;li&gt;Stores each of these connections&lt;/li&gt;
&lt;li&gt;Lists the available tools in each server and stores them&lt;/li&gt;
&lt;li&gt;Processes user queries&lt;/li&gt;
&lt;li&gt;Manages the chat interface&lt;/li&gt;
&lt;li&gt;Cleans up connections when the user ends execution&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  ⚙️ Running the Chatbot
&lt;/h2&gt;

&lt;p&gt;To launch the chatbot, simply run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 mcp_chatbot.py
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see the chatbot connect to all three servers.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb43h1sgnj6u4l0lrghrc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb43h1sgnj6u4l0lrghrc.png" alt="Running the chatbot" width="800" height="203"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Then, try a query like: &lt;em&gt;search for two articles about deep learning and provide your summary of both&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgh2r3vii9n5qhdi2cml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqgh2r3vii9n5qhdi2cml.png" alt="Running the chatbot" width="800" height="531"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Behind the scenes, it will:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Call &lt;code&gt;tool search_papers with args {'topic': 'deep learning', 'max_results': 2}&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Call &lt;code&gt;tool extract_info with args {'paper_id': '1805.08355v1'}&lt;/code&gt; and &lt;code&gt;tool extract_info with args {'paper_id': '1806.01756v1'}&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;.json&lt;/code&gt; created by running this prompt looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"1805.08355v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Opening the black box of deep learning"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"authors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Dian Lei"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Xiaoxiao Chen"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Jianfei Zhao"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"The great success of deep learning ..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pdf_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://arxiv.org/pdf/1805.08355v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"published"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2018-05-22"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"1806.01756v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Concept-Oriented Deep Learning"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"authors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Daniel T Chang"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Concepts are the foundation of human deep learning..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pdf_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://arxiv.org/pdf/1806.01756v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"published"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2018-06-05"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The second try fetches information from a particular URI: &lt;em&gt;look into &lt;a href="https://deeplearning.ai" rel="noopener noreferrer"&gt;https://deeplearning.ai&lt;/a&gt;, extract one relevant concept and research articles about it&lt;/em&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovg9pilpeooy1aolmzve.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fovg9pilpeooy1aolmzve.png" alt="Running the chatbot" width="800" height="252"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This time, the first step is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Calling tool fetch with args {'url': 'https://deeplearning.ai'}&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then, as we haven't specifyed the number of articles to look for, it uses the default value of 5:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Calling tool search_papers with args {'topic': 'Retrieval-Augmented Generation RAG LLM', 'max_results': 5}&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Finally, the five calls for extracting information are made:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Calling tool extract_info with args {'paper_id': '2409.01666v1'}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Calling tool extract_info with args {'paper_id': '2501.00353v1'}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Calling tool extract_info with args {'paper_id': '2407.21059v1'}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Calling tool extract_info with args {'paper_id': '2501.05249v1'}&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Calling tool extract_info with args {'paper_id': '2504.08758v1'}&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The final response of this try would look like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Based on my research of DeepLearning.ai...

# Retrieval-Augmented Generation (RAG) in Modern AI

## What is RAG?
Retrieval-Augmented Generation (RAG)...

## Why RAG Matters
According to recent research...

## Applications of RAG
RAG has proven particularly valuable in:...

## Future Directions
The modular approach to RAG systems suggests...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;.json&lt;/code&gt; created by running this prompt looks similar to the previous one but contains five documents:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"2409.01666v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"title"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"In Defense of RAG..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"authors"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Tan Yu"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Anbang Xu"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Rama Akkiraju"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"summary"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"pdf_url"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"http://arxiv.org/pdf/2409.01666v1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"published"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-09-03"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"2501.00353v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"2407.21059v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"2501.05249v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"2504.08758v1"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For finishig the chatbot, just type &lt;code&gt;quit&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Awesome! We managed to connect and use many MCP servers from our host, creating MCP clients and maintaining 1:1 sessions between clients and servers.&lt;/p&gt;

&lt;h2&gt;
  
  
  ✅ Key Features
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Organized by Topic&lt;/strong&gt;: Easily browse articles by research theme&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistent Storage&lt;/strong&gt;: Articles metadata are saved locally as JSON&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Chat UI&lt;/strong&gt;: Simple and effective CLI-based interface&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart Summaries&lt;/strong&gt;: Summarizations powered by Claude&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool &amp;amp; Resource Management&lt;/strong&gt;: Clean separation between read-only data and actions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  💬 Final thoughts
&lt;/h2&gt;

&lt;p&gt;This post demonstrates how MCP can be used to build AI applications that interact with external data sources. By standardizing how AI applications connect with tools and data, MCP makes it easier to build and maintain complex AI systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  📚 Resources
&lt;/h2&gt;

&lt;p&gt;Full code of this post 👉 &lt;a href="https://github.com/ezequiroga/mcp-bases" rel="noopener noreferrer"&gt;ezequiroga/mcp-bases&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;MCP servers 👉 &lt;a href="https://github.com/modelcontextprotocol/servers" rel="noopener noreferrer"&gt;Model Context Protocol servers&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Official MCP documentation 👉 &lt;a href="https://modelcontextprotocol.io" rel="noopener noreferrer"&gt;MCP Documentation&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Anthropic article introducing MCP 👉 &lt;a href="https://www.anthropic.com/news/model-context-protocol" rel="noopener noreferrer"&gt;Introducing the Model Context Protocol&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The origins of MCP, explained by Mike Krieger 👉 &lt;a href="https://youtu.be/Js1gU6L1Zi8?si=TBI8cVGrQfcxmYSg&amp;amp;t=218" rel="noopener noreferrer"&gt;Anthropic CPO Mike Krieger: Building AI Products From the Bottom Up&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;MCP short course 👉 &lt;a href="https://www.deeplearning.ai/short-courses/mcp-build-rich-context-ai-apps-with-anthropic/" rel="noopener noreferrer"&gt;MCP: Build Rich-Context AI Apps with Anthropic&lt;/a&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>ai</category>
      <category>chatbot</category>
      <category>anthropic</category>
    </item>
    <item>
      <title>How Much Does It Really Cost to Run a Voice-AI Agent at Scale?</title>
      <dc:creator>Roman Piacquadio</dc:creator>
      <pubDate>Tue, 20 May 2025 17:00:13 +0000</pubDate>
      <link>https://dev.to/cloudx/how-much-does-it-really-cost-to-run-a-voice-ai-agent-at-scale-8en</link>
      <guid>https://dev.to/cloudx/how-much-does-it-really-cost-to-run-a-voice-ai-agent-at-scale-8en</guid>
      <description>&lt;h2&gt;
  
  
  1) Why Voice Automation Is Worth Investigating (Even If You’re Not Replacing Humans)
&lt;/h2&gt;

&lt;p&gt;Voice automation has made significant progress in recent years, powered by improvements in transcription, real-time audio routing, and large language models. What was once a clunky, robotic experience is now capable of holding fluent, natural-sounding conversations with real people—across a variety of use cases.&lt;/p&gt;

&lt;p&gt;This doesn’t mean we’re replacing human agents. Quite the opposite: automation lets us offload the &lt;em&gt;repetitive, high-frequency, low-complexity tasks&lt;/em&gt; that tend to burn out human teams, and instead reserve human attention for edge cases, escalations, and creative problem solving.&lt;/p&gt;

&lt;p&gt;Whether you're handling inbound customer service, qualifying outbound leads, or proactively following up on time-sensitive actions, a well-orchestrated voice-AI pipeline can free up valuable resources—&lt;em&gt;if the economics make sense&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That last part is key. Many developers assume that using AI to automate voice interactions is inherently cost-effective. But is it really cheaper than staffing a team? How much do you actually pay per call once you add up every component: speech recognition, synthesis, telephony, model inference, and orchestration?&lt;/p&gt;

&lt;p&gt;This article takes a grounded approach to that question. We'll break down a full-stack voice-AI system—from SIP trunk to final response—and price each piece out based on real-world usage. To make it concrete, we'll walk through a common use case: outbound phone calls of around 3 minutes in duration. But the same framework applies to inbound routing, reminders, surveys, or any other automated voice interaction.&lt;/p&gt;

&lt;p&gt;Let’s dig into how it all connects—and what it really costs to make it work at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  2) System Architecture: How a Voice-AI Pipeline Works End-to-End
&lt;/h2&gt;

&lt;p&gt;Before jumping into costs, it’s important to understand how the system is architected. At a high level, a voice-based AI interaction consists of several components working together in real time to process speech, generate responses, and keep the conversation flowing naturally.&lt;/p&gt;

&lt;p&gt;Here’s a simplified view of the architecture:&lt;/p&gt;

&lt;p&gt;Caller (PSTN or SIP)&lt;br&gt;
⬇️&lt;br&gt;
SIP Trunk (Twilio or Telnyx)&lt;br&gt;
⬇️&lt;br&gt;
LiveKit (Voice orchestration &amp;amp; media routing)&lt;br&gt;
⬇️&lt;br&gt;
Speech-to-Text (Deepgram)&lt;br&gt;
⬇️&lt;br&gt;
Language Model (OpenAI GPT-4.1 mini)&lt;br&gt;
⬇️&lt;br&gt;
Text-to-Speech (ElevenLabs or Cartesia)&lt;br&gt;
⬇️&lt;br&gt;
LiveKit (returns audio back)&lt;br&gt;
⬇️&lt;br&gt;
Caller&lt;/p&gt;

&lt;p&gt;Each component has a distinct role:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SIP Trunk (Twilio or Telnyx):&lt;/strong&gt; Bridges public phone lines to our backend system.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LiveKit:&lt;/strong&gt; Manages the real-time audio streams and orchestrates audio I/O between services.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deepgram:&lt;/strong&gt; Transcribes what the user says into text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPT-4.1 mini:&lt;/strong&gt; Interprets the message and generates a textual response.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-Speech (TTS):&lt;/strong&gt; Converts the AI response into natural-sounding speech.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LiveKit (again):&lt;/strong&gt; Sends the generated audio back to the caller, closing the loop.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're interested in building this kind of setup yourself, I’ve written a few posts on setting up each component individually (from SIP trunking to LiveKit integration). You can check those out here: &lt;strong&gt;[&lt;a href="https://dev.to/roman_piacquadio/series/31126"&gt;https://dev.to/roman_piacquadio/series/31126&lt;/a&gt;]&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In the next section, we’ll define the assumptions behind our usage model so we can start assigning real-world costs to each of these layers.&lt;/p&gt;

&lt;h2&gt;
  
  
  3) Baseline Assumptions for Cost Estimation
&lt;/h2&gt;

&lt;p&gt;To make the cost analysis meaningful, we need a realistic usage scenario. We’ll base our calculations on the following assumptions, which simulate a small team operating at steady scale:&lt;/p&gt;

&lt;h3&gt;
  
  
  📞 Call Volume
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Each automated call lasts &lt;strong&gt;3 minutes&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The AI system handles &lt;strong&gt;100 calls per day&lt;/strong&gt;, simulating the workload of one full-time agent.&lt;/li&gt;
&lt;li&gt;We simulate &lt;strong&gt;10 AI agents&lt;/strong&gt;, running &lt;strong&gt;22 business days per month&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;That results in &lt;strong&gt;22,000 calls per month&lt;/strong&gt;, totaling &lt;strong&gt;66,000 minutes of audio&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🗣️ Talk Time Distribution
&lt;/h3&gt;

&lt;p&gt;In a natural two-way conversation, only one party speaks at a time. For simplicity:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;AI speaks for ~50%&lt;/strong&gt; of the time → &lt;strong&gt;33,000 minutes of TTS&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;caller (human)&lt;/strong&gt; speaks for the other 50% → &lt;strong&gt;33,000 minutes of STT&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This split is a reasonable average for structured interactions such as confirmations, reminders, qualification flows, or outbound follow-ups.&lt;/p&gt;

&lt;h3&gt;
  
  
  🤖 Model and Service Selection
&lt;/h3&gt;

&lt;p&gt;We picked tools that balance &lt;strong&gt;quality&lt;/strong&gt;, &lt;strong&gt;latency&lt;/strong&gt;, and &lt;strong&gt;cost-effectiveness&lt;/strong&gt; for production-level use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LiveKit&lt;/strong&gt;: to orchestrate real-time audio and SIP integration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deepgram Nova-2 (Enterprise)&lt;/strong&gt;: for fast, accurate streaming transcription with low per-minute cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GPT-4.1 mini&lt;/strong&gt;: a lightweight, affordable OpenAI model that still delivers strong reasoning and fluency.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-Speech&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;ElevenLabs (Business plan)&lt;/strong&gt;: premium voices with emotional range and expressiveness.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cartesia (Scale plan)&lt;/strong&gt;: lower-cost alternative for simpler use cases.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;SIP Trunking&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Twilio&lt;/strong&gt;: simple and widely used.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telnyx&lt;/strong&gt;: cost-competitive with flexible routing.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This setup lets us explore the full range of pricing scenarios—from economy stacks to premium voice experiences—while keeping the core system consistent.&lt;/p&gt;

&lt;h2&gt;
  
  
  4) LiveKit — Orchestrating the Audio Layer
&lt;/h2&gt;

&lt;p&gt;At the center of our voice pipeline is &lt;strong&gt;LiveKit&lt;/strong&gt;, which handles real-time audio routing between the SIP trunk, transcription, TTS, and back to the caller. It’s the glue that makes low-latency, bidirectional communication possible.&lt;/p&gt;

&lt;p&gt;LiveKit offers a generous &lt;strong&gt;Scale Plan&lt;/strong&gt; designed for production workloads:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Base cost:&lt;/strong&gt; $500/month&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Includes:&lt;/strong&gt; 45,000 minutes of usage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overage rate:&lt;/strong&gt; $0.003 per additional minute&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔢 Usage Breakdown
&lt;/h3&gt;

&lt;p&gt;With 22,000 calls per month at 3 minutes each, we consume:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;66,000 total minutes&lt;/strong&gt; of LiveKit usage (audio flowing in and out).&lt;/li&gt;
&lt;li&gt;We exceed the plan’s included minutes by &lt;strong&gt;21,000 minutes&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Overage cost: 21,000 × $0.003 = &lt;strong&gt;$63&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  💰 Total Monthly Cost for LiveKit
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Base plan: $500&lt;/li&gt;
&lt;li&gt;Overage: $63&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total: $563/month&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧮 Cost per Unit
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per call&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$563 ÷ 22,000 = &lt;strong&gt;$0.0256&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per hour&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20 calls/hour → 20 × $0.0256 = &lt;strong&gt;$0.512&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;LiveKit’s pricing scales linearly with usage and is well-suited for handling concurrent calls without needing to manage media servers manually.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pricing based on public rates as of &lt;strong&gt;May 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  5) Speech-to-Text — Transcribing the Human Side with Deepgram
&lt;/h2&gt;

&lt;p&gt;To understand what the caller says, we need fast and accurate transcription. For this, we use &lt;strong&gt;Deepgram&lt;/strong&gt;, a popular speech-to-text (STT) provider known for real-time streaming, multilingual support, and competitive enterprise pricing.&lt;/p&gt;

&lt;p&gt;We selected the &lt;strong&gt;Nova-2 model (Enterprise tier)&lt;/strong&gt; for its balance of speed, accuracy, and affordability.&lt;/p&gt;

&lt;h3&gt;
  
  
  🎧 Why Streaming Matters
&lt;/h3&gt;

&lt;p&gt;In a voice AI pipeline, latency is critical. Transcription must happen as the user speaks—not after they’ve finished—so the AI can respond naturally in near real-time.&lt;/p&gt;

&lt;p&gt;Streaming STT allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lower response times (smoother dialogue)&lt;/li&gt;
&lt;li&gt;Efficient token handling for the language model&lt;/li&gt;
&lt;li&gt;Better support for interruptions and barge-in behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Deepgram’s Nova-2 model supports all of this out of the box.&lt;/p&gt;

&lt;h3&gt;
  
  
  💰 Enterprise Pricing (Nova-2)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Rate:&lt;/strong&gt; $0.0047 per minute (Enterprise tier, streaming)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Applicable usage:&lt;/strong&gt; Only transcribing the &lt;strong&gt;human side&lt;/strong&gt; of the call (50% of total time)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monthly usage:&lt;/strong&gt; 22,000 calls × 1.5 min (caller talk time) = &lt;strong&gt;33,000 minutes&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧮 Cost Breakdown
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per call&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1.5 min × $0.0047 = &lt;strong&gt;$0.00705&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per hour&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20 calls/hour → 20 × $0.00705 = &lt;strong&gt;$0.141&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total per month (10 agents)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;33,000 min × $0.0047 = &lt;strong&gt;$155.10&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This STT layer is one of the more affordable components of the pipeline, thanks to Deepgram’s usage-based pricing and efficient streaming infrastructure.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pricing based on public Enterprise rates as of &lt;strong&gt;May 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  6) Language Model — Token Costs with GPT-4.1 mini
&lt;/h2&gt;

&lt;p&gt;The core of any conversational AI system is the language model that generates responses. In our setup, we use &lt;strong&gt;OpenAI’s GPT-4.1 mini&lt;/strong&gt;, which offers a great trade-off between intelligence, latency, and price.&lt;/p&gt;

&lt;p&gt;Unlike flat-rate pricing, token billing in GPT models varies depending on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Input tokens&lt;/strong&gt; (the prompt + conversation history)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output tokens&lt;/strong&gt; (the generated reply)&lt;/li&gt;
&lt;li&gt;Whether input tokens are &lt;strong&gt;cached&lt;/strong&gt; (like a static system prompt) or &lt;strong&gt;non-cached&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s break that down.&lt;/p&gt;




&lt;h3&gt;
  
  
  📥 Input Tokens
&lt;/h3&gt;

&lt;p&gt;Each user message builds on previous context. So as the conversation progresses, the number of input tokens increases with every new request.&lt;/p&gt;

&lt;p&gt;For a 3-minute call with &lt;strong&gt;6 back-and-forth exchanges&lt;/strong&gt;, we estimate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System prompt:&lt;/strong&gt; ~2,000 tokens (sent with every request, but cacheable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conversation context:&lt;/strong&gt; grows each turn (averaging 975 tokens total)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total input tokens per call:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cached input:&lt;/strong&gt; 6 × 2,000 = 12,000 tokens&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-cached input:&lt;/strong&gt; ~975 tokens&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;




&lt;h3&gt;
  
  
  📤 Output Tokens
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;The model responds 6 times (one per turn), with ~35 tokens per reply&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total output tokens per call:&lt;/strong&gt; ~210 tokens&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  💰 GPT-4.1 mini Pricing (May 2025)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Token Type&lt;/th&gt;
&lt;th&gt;Rate per million&lt;/th&gt;
&lt;th&gt;Per-token cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cached input&lt;/td&gt;
&lt;td&gt;$0.10 / 1M&lt;/td&gt;
&lt;td&gt;$0.00000010&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Regular input&lt;/td&gt;
&lt;td&gt;$0.40 / 1M&lt;/td&gt;
&lt;td&gt;$0.00000040&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;$1.60 / 1M&lt;/td&gt;
&lt;td&gt;$0.00000160&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  🧮 Cost Breakdown (Per Call)
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Token Type&lt;/th&gt;
&lt;th&gt;Tokens&lt;/th&gt;
&lt;th&gt;Rate&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cached input&lt;/td&gt;
&lt;td&gt;12,000&lt;/td&gt;
&lt;td&gt;$0.10 / 1M&lt;/td&gt;
&lt;td&gt;$0.00120&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Non-cached input&lt;/td&gt;
&lt;td&gt;975&lt;/td&gt;
&lt;td&gt;$0.40 / 1M&lt;/td&gt;
&lt;td&gt;$0.00039&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Output&lt;/td&gt;
&lt;td&gt;210&lt;/td&gt;
&lt;td&gt;$1.60 / 1M&lt;/td&gt;
&lt;td&gt;$0.000336&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.001926&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  📊 Aggregate Costs
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per call&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.001926&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per hour&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;20 × $0.001926 = &lt;strong&gt;$0.0385&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total per month (10 agents)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;22,000 × $0.001926 = &lt;strong&gt;$42.37&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;p&gt;GPT-4.1 mini’s pricing structure rewards careful prompt engineering and context management. While the per-call cost is low, the input token growth curve makes it important to minimize unnecessary repetition in multi-turn conversations (e.g. trimming old exchanges or summarizing history).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pricing based on OpenAI’s GPT-4.1 mini public rates as of &lt;strong&gt;May 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  7) Text-to-Speech — Choosing Between ElevenLabs and Cartesia
&lt;/h2&gt;

&lt;p&gt;For the AI to respond naturally, we need to convert the model’s text output into high-quality speech. In our analysis, we compared two leading &lt;strong&gt;Text-to-Speech (TTS)&lt;/strong&gt; providers: &lt;strong&gt;ElevenLabs&lt;/strong&gt; and &lt;strong&gt;Cartesia&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Both platforms deliver excellent results:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🗣️ &lt;strong&gt;Multilingual support&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;🎭 &lt;strong&gt;Voice cloning capabilities&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;⚡ &lt;strong&gt;Optimized for low-latency streaming&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key differences lie in pricing and voice variety.&lt;/p&gt;

&lt;h3&gt;
  
  
  🧬 ElevenLabs (Business Plan)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Extensive voice library&lt;/strong&gt;, including highly expressive and emotional voices.&lt;/li&gt;
&lt;li&gt;Well-suited for customer-facing conversations where tone and nuance matter.&lt;/li&gt;
&lt;li&gt;Plan includes &lt;strong&gt;22,000 minutes&lt;/strong&gt; for &lt;strong&gt;$1,320/month&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Overage minutes&lt;/strong&gt; cost $0.06/min.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We need &lt;strong&gt;33,000 minutes&lt;/strong&gt; per month (AI speaks ~50% of every call):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Base: $1,320
&lt;/li&gt;
&lt;li&gt;Overage: 11,000 × $0.06 = $660
&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Total: $1,980/month&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧪 Cartesia (Scale Plan)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Smaller voice library, but &lt;strong&gt;still multilingual and highly intelligible&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;More cost-effective for less expressive use cases.&lt;/li&gt;
&lt;li&gt;Estimated cost: &lt;strong&gt;$0.0299/min&lt;/strong&gt; under the Scale plan.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Monthly cost for 33,000 minutes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;33,000 × $0.0299 ≈ &lt;strong&gt;$986.70/month&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🧮 Cost Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Rate&lt;/th&gt;
&lt;th&gt;Monthly Cost&lt;/th&gt;
&lt;th&gt;Per Call&lt;/th&gt;
&lt;th&gt;Per Hour&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;ElevenLabs&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$1,980/month (blended)&lt;/td&gt;
&lt;td&gt;$1,980&lt;/td&gt;
&lt;td&gt;$0.09&lt;/td&gt;
&lt;td&gt;$1.80&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cartesia&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.0299/min&lt;/td&gt;
&lt;td&gt;$986.70&lt;/td&gt;
&lt;td&gt;$0.0449&lt;/td&gt;
&lt;td&gt;$0.897&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  🎯 Which One Should You Use?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Choose &lt;strong&gt;ElevenLabs&lt;/strong&gt; if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You want &lt;strong&gt;high voice fidelity&lt;/strong&gt;, emotional range, or public-facing use.&lt;/li&gt;
&lt;li&gt;You care about building brand consistency with &lt;strong&gt;custom voices&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;Choose &lt;strong&gt;Cartesia&lt;/strong&gt; if:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You’re optimizing for &lt;strong&gt;cost and speed&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Expressiveness is less critical (e.g. follow-up reminders, routing flows).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;Both providers are strong technically, with &lt;strong&gt;low-latency streaming&lt;/strong&gt;, voice cloning, and multilingual support out of the box.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Pricing based on public rates as of &lt;strong&gt;May 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  8) SIP Trunking — Connecting to the Phone Network (Twilio vs Telnyx)
&lt;/h2&gt;

&lt;p&gt;To make and receive real phone calls, we need to connect our voice-AI system to the &lt;strong&gt;PSTN (Public Switched Telephone Network)&lt;/strong&gt;. This is where &lt;strong&gt;SIP trunking&lt;/strong&gt; comes in. It acts as the bridge between the internet and traditional phone numbers.&lt;/p&gt;

&lt;p&gt;In our setup, we evaluated two leading providers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Twilio&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Telnyx&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both integrate seamlessly with &lt;strong&gt;LiveKit&lt;/strong&gt;, enabling bi-directional SIP call routing with support for outbound and inbound audio streams.&lt;/p&gt;




&lt;h3&gt;
  
  
  🔁 Understanding the Billing: Origination vs Termination
&lt;/h3&gt;

&lt;p&gt;SIP trunking costs are typically split into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Termination&lt;/strong&gt; — outbound calls (your AI calls a user)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Origination&lt;/strong&gt; — inbound calls (users call your AI)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phone number rental&lt;/strong&gt; — flat monthly rate per number&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For this analysis, we assume &lt;strong&gt;outbound calling to U.S. local numbers&lt;/strong&gt; (the AI initiates the conversation).&lt;/p&gt;




&lt;h3&gt;
  
  
  💰 Cost Comparison: Twilio vs Telnyx
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Twilio&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;Telnyx&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Termination (outbound)&lt;/td&gt;
&lt;td&gt;$0.0011/min&lt;/td&gt;
&lt;td&gt;$0.0050/min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Origination (inbound)&lt;/td&gt;
&lt;td&gt;$0.0034/min&lt;/td&gt;
&lt;td&gt;$0.0035/min&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total per minute&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.0045/min&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$0.0085/min&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost per 3-min call&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.0135&lt;/td&gt;
&lt;td&gt;$0.0255&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost per hour (20 calls)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.27&lt;/td&gt;
&lt;td&gt;$0.51&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Monthly cost (22,000 calls)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$297.00&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$561.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: Phone number rental (e.g. $1.15/month for a local number) is a small fixed cost and not included here, since it’s negligible at volume.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  📌 Summary
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Twilio&lt;/strong&gt; is more cost-effective at lower scale, with highly transparent pricing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Telnyx&lt;/strong&gt; offers flexibility, more control over routing, and competitive rates at higher volumes, especially for international calls.&lt;/li&gt;
&lt;li&gt;Both are easy to integrate with &lt;strong&gt;LiveKit SIP features&lt;/strong&gt;, making them suitable choices depending on your cost or feature preferences.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;Pricing based on public SIP trunking rates as of &lt;strong&gt;May 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Putting It All Together — Full Stack Cost Comparison
&lt;/h2&gt;

&lt;p&gt;Now that we’ve broken down each component, let’s summarize the total cost of running a fully orchestrated voice AI system. We'll compare two realistic deployment stacks:&lt;/p&gt;

&lt;h3&gt;
  
  
  🟢 Economy Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TTS:&lt;/strong&gt; Cartesia (Scale plan)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SIP Trunking:&lt;/strong&gt; Twilio&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;STT:&lt;/strong&gt; Deepgram (Nova-2 Enterprise)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM:&lt;/strong&gt; GPT-4.1 mini&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio Orchestration:&lt;/strong&gt; LiveKit (Scale plan)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🔵 Premium Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;TTS:&lt;/strong&gt; ElevenLabs (Business plan + overage)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SIP Trunking:&lt;/strong&gt; Telnyx&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;STT:&lt;/strong&gt; Deepgram (Nova-2 Enterprise)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LLM:&lt;/strong&gt; GPT-4.1 mini&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio Orchestration:&lt;/strong&gt; LiveKit (Scale plan)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  💵 Cost Comparison Table
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Economy Stack&lt;/th&gt;
&lt;th&gt;Premium Stack&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LiveKit&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$563.00&lt;/td&gt;
&lt;td&gt;$563.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;STT (Deepgram)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$155.10&lt;/td&gt;
&lt;td&gt;$155.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM (GPT-4.1 mini)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$42.37&lt;/td&gt;
&lt;td&gt;$42.37&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TTS&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$986.70 (Cartesia)&lt;/td&gt;
&lt;td&gt;$1,980.00 (11Labs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;SIP Trunking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$297.00 (Twilio)&lt;/td&gt;
&lt;td&gt;$561.00 (Telnyx)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TOTAL&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2,044.17&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$3,301.47&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  🧮 Unit Economics
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Economy Stack&lt;/th&gt;
&lt;th&gt;Premium Stack&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per call&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$0.0929&lt;/td&gt;
&lt;td&gt;$0.1500&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Per hour (20 calls)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;$1.86&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  🏆 Which Stack Wins?
&lt;/h3&gt;

&lt;p&gt;The &lt;strong&gt;Economy Stack&lt;/strong&gt; clearly offers &lt;strong&gt;substantial savings&lt;/strong&gt;, making it a great choice for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-volume, low-complexity workflows&lt;/li&gt;
&lt;li&gt;Prototypes or early-stage deployments&lt;/li&gt;
&lt;li&gt;Use cases where expressive TTS is not critical&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Meanwhile, the &lt;strong&gt;Premium Stack&lt;/strong&gt; is ideal when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Caller experience and vocal quality are top priorities&lt;/li&gt;
&lt;li&gt;You need branded voices or enhanced emotional range&lt;/li&gt;
&lt;li&gt;You're targeting sensitive, trust-critical interactions (e.g., healthcare, finance)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both stacks are production-ready, but the &lt;strong&gt;Economy Stack costs ~38% less per call&lt;/strong&gt;, making it the winner in terms of operational efficiency.&lt;/p&gt;




&lt;h3&gt;
  
  
  📊 Visual Overview - Cost Comparison Bar Chart (Monthly Total and Per Call)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Total Monthly Cost (USD)
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Cost (USD)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Economy Stack&lt;/td&gt;
&lt;td&gt;$2,044&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Premium Stack&lt;/td&gt;
&lt;td&gt;$3,301&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Cost Per Call (USD)
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack&lt;/th&gt;
&lt;th&gt;Cost Per Call&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Economy Stack&lt;/td&gt;
&lt;td&gt;$0.093&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Premium Stack&lt;/td&gt;
&lt;td&gt;$0.150&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Note: All prices reflect public rates as of &lt;strong&gt;May 2025&lt;/strong&gt;. Each component exceeds the highest pricing tier currently listed, so &lt;strong&gt;enterprise-level negotiation is likely to yield 30–50% discounts&lt;/strong&gt; when deployed at scale.&lt;br&gt;
With those discounts, the Economy Stack could drop below &lt;strong&gt;$1,500/month&lt;/strong&gt;, and the Premium Stack below &lt;strong&gt;$2,300/month&lt;/strong&gt;, making large-scale deployment increasingly feasible.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  10) Negotiating Beyond Public Pricing Tiers
&lt;/h2&gt;

&lt;p&gt;At the scale we’re modeling—&lt;strong&gt;22,000 calls per month&lt;/strong&gt;, totaling &lt;strong&gt;66,000 minutes of voice&lt;/strong&gt;, &lt;strong&gt;33,000 minutes of TTS&lt;/strong&gt;, and &lt;strong&gt;33,000 minutes of transcription&lt;/strong&gt;—&lt;strong&gt;every major component of the stack exceeds the highest publicly available pricing tier&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;LiveKit&lt;/strong&gt; (Scale plan: 45,000 min included → we use 66,000)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deepgram&lt;/strong&gt; (Enterprise pricing already applies)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ElevenLabs&lt;/strong&gt; (Business plan includes 22,000 min → we use 33,000)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cartesia&lt;/strong&gt; (Scale plan rates exceeded)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Twilio / Telnyx&lt;/strong&gt; (volume usage beyond typical pay-as-you-go)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI GPT-4.1 mini&lt;/strong&gt; (high token volume, consistent monthly usage)&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  🧾 Why Enterprise Negotiation Matters
&lt;/h3&gt;

&lt;p&gt;When your usage becomes predictable and high-volume, vendors are often open to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Committed-use discounts&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Volume-based pricing tiers&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Bundled service contracts&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Custom SLAs and support&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Discounts in the &lt;strong&gt;30%–50% range&lt;/strong&gt; are realistic, especially when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You negotiate multi-month or annual commitments&lt;/li&gt;
&lt;li&gt;You consolidate services under a single provider&lt;/li&gt;
&lt;li&gt;You become a reference customer or provide product feedback&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  💸 Recalculated Costs with ~40% Discount
&lt;/h3&gt;

&lt;p&gt;Applying a &lt;strong&gt;conservative 40% discount&lt;/strong&gt; across the stack:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stack Type&lt;/th&gt;
&lt;th&gt;Full Price (Monthly)&lt;/th&gt;
&lt;th&gt;After Discount (–40%)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Economy Stack&lt;/td&gt;
&lt;td&gt;$2,044.17&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,226.50&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Premium Stack&lt;/td&gt;
&lt;td&gt;$3,301.47&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$1,980.88&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These adjusted prices bring the &lt;strong&gt;cost per call&lt;/strong&gt; down to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Economy Stack:&lt;/strong&gt; ~$0.056&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Premium Stack:&lt;/strong&gt; ~$0.090&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And &lt;strong&gt;cost per hour&lt;/strong&gt; down to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Economy Stack:&lt;/strong&gt; ~$1.12&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Premium Stack:&lt;/strong&gt; ~$1.80&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  ✅ Final Takeaway
&lt;/h3&gt;

&lt;p&gt;If you’re planning to scale voice-AI automation beyond a few thousand calls per month, don’t rely solely on self-serve pricing pages. Reach out to each vendor’s enterprise sales team—you may unlock significant savings that make production-scale deployment much more cost-effective than it initially appears.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;All cost assumptions based on publicly available pricing as of &lt;strong&gt;May 2025&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  11) Operational Tips &amp;amp; Optimizations
&lt;/h2&gt;

&lt;p&gt;Once your voice-AI system is up and running, there are several strategies you can apply to reduce costs, improve performance, and make the whole experience smoother—without sacrificing quality.&lt;/p&gt;

&lt;p&gt;Here are some of the most effective optimizations:&lt;/p&gt;




&lt;h3&gt;
  
  
  🧠 1. Trim the Token Window
&lt;/h3&gt;

&lt;p&gt;Language model input costs scale with conversation history. Instead of sending the full transcript on every turn:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Summarize earlier turns&lt;/strong&gt; into compact memory chunks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Remove low-value exchanges&lt;/strong&gt; like “OK,” “Sure,” or greetings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use windowing strategies&lt;/strong&gt; (e.g., keep the last 3–4 turns only).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This helps reduce input token usage, especially in longer conversations.&lt;/p&gt;




&lt;h3&gt;
  
  
  🔇 2. Silence Trimming &amp;amp; Voice Activity Detection (VAD)
&lt;/h3&gt;

&lt;p&gt;Avoid processing and transcribing empty audio:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;Voice Activity Detection&lt;/strong&gt; to skip long silences or background noise.&lt;/li&gt;
&lt;li&gt;Trim pauses before sending audio to STT or TTS services.&lt;/li&gt;
&lt;li&gt;Detect &lt;strong&gt;barge-ins&lt;/strong&gt; (caller interrupts the bot) to pause TTS playback early.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This reduces billed minutes on both STT and TTS sides.&lt;/p&gt;




&lt;h3&gt;
  
  
  🧾 3. Cache the System Prompt
&lt;/h3&gt;

&lt;p&gt;OpenAI allows &lt;strong&gt;cached input tokens&lt;/strong&gt; (like a static system prompt) at a much lower rate. Make sure you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep your &lt;strong&gt;system prompt constant across requests&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Use API options that take advantage of caching when possible.&lt;/li&gt;
&lt;li&gt;Avoid resending unchanged instructions as raw text.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  💬 4. Pre-generate Common Replies
&lt;/h3&gt;

&lt;p&gt;For deterministic workflows (like confirming an appointment or collecting a yes/no), you can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;pre-written text responses&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Skip the language model entirely for predictable branches&lt;/li&gt;
&lt;li&gt;Cut latency and token cost to zero on those turns&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  📉 5. Committed-Use Agreements
&lt;/h3&gt;

&lt;p&gt;Once your usage stabilizes, talk to each vendor about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;Volume discounts&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Annual billing options&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Custom usage tiers&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Vendors are often willing to negotiate lower prices when you commit to consistent usage or bundle multiple services.&lt;/p&gt;




&lt;h3&gt;
  
  
  🛠️ Bonus: Monitor &amp;amp; Adapt in Real Time
&lt;/h3&gt;

&lt;p&gt;Use analytics and observability tools (like SIP Insights, LiveKit metrics, or transcription confidence scores) to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spot anomalies (long silences, error spikes, dropped calls)&lt;/li&gt;
&lt;li&gt;Optimize system behavior dynamically&lt;/li&gt;
&lt;li&gt;Choose which interactions need human handoff&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;By applying even a few of these strategies, you can significantly reduce operational costs, improve response times, and deliver a more professional and polished AI voice experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  12) Conclusion — When the Numbers Make Sense, and When the Voice Matters
&lt;/h2&gt;

&lt;p&gt;Automating voice workflows isn’t about replacing people—it's about taking the repetitive, high-frequency interactions off their plates so they can focus on more meaningful work. With the right architecture and cost controls in place, voice-AI agents can handle thousands of predictable conversations efficiently and affordably.&lt;/p&gt;

&lt;h3&gt;
  
  
  📊 The Break-Even Point
&lt;/h3&gt;

&lt;p&gt;At roughly &lt;strong&gt;$0.056–$0.09 per call&lt;/strong&gt; (with enterprise pricing), you can simulate the monthly output of &lt;strong&gt;10 full-time agents&lt;/strong&gt; for &lt;strong&gt;$1,200–$2,000/month&lt;/strong&gt;. Depending on your geography, staffing model, and call volume, that’s often below the cost of a single human operator.&lt;/p&gt;

&lt;p&gt;This makes voice automation compelling for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lead qualification&lt;/li&gt;
&lt;li&gt;Appointment reminders&lt;/li&gt;
&lt;li&gt;Customer surveys&lt;/li&gt;
&lt;li&gt;Payment follow-ups&lt;/li&gt;
&lt;li&gt;Routine inbound routing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Especially when those interactions follow predictable patterns or scripted flows.&lt;/p&gt;




&lt;h3&gt;
  
  
  🔬 Where to Experiment Next
&lt;/h3&gt;

&lt;p&gt;If you're considering deploying your own voice AI assistant, the next logical steps might include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Testing real customer calls with different TTS providers&lt;/li&gt;
&lt;li&gt;Measuring drop-off rates and call completion times&lt;/li&gt;
&lt;li&gt;A/B testing voice styles or model temperatures&lt;/li&gt;
&lt;li&gt;Monitoring cost per resolved interaction over time&lt;/li&gt;
&lt;li&gt;Integrating fallback routes for complex queries (human transfer, async follow-up)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Voice automation is no longer experimental—it's becoming operational. With the right balance of cost, quality, and control, you can build something that not only saves time but feels genuinely helpful to the people on the other end of the line.&lt;/p&gt;

&lt;h2&gt;
  
  
  13) Resources &amp;amp; Links
&lt;/h2&gt;

&lt;p&gt;Here’s a list of all the official pricing and documentation pages for the tools and platforms referenced throughout this article. You can refer to these for the latest rates, usage limits, and API capabilities:&lt;/p&gt;

&lt;h3&gt;
  
  
  🔷 LiveKit
&lt;/h3&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [LiveKit Pricing](https://livekit.io/pricing)
- [LiveKit Docs](https://docs.livekit.io/home/)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  🔷 Deepgram (Speech-to-Text)
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [Deepgram Pricing](https://deepgram.com/pricing)  
- [Deepgram API Docs](https://developers.deepgram.com/home/introduction)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  🔷 OpenAI (GPT-4.1 mini)
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [OpenAI Pricing](https://openai.com/api/pricing/)  
- [OpenAI API Docs](https://platform.openai.com/docs/overview)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  🔷 ElevenLabs (Text-to-Speech)
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [ElevenLabs Pricing](https://elevenlabs.io/pricing/api)  
- [ElevenLabs Docs](https://elevenlabs.io/docs/overview)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  🔷 Cartesia
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [Cartesia Pricing](https://cartesia.ai/pricing)  
- [Cartesia API Docs](https://docs.cartesia.ai/2024-11-13/get-started/overview)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  🔷 Twilio (SIP Trunking)
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [Twilio SIP Pricing](https://www.twilio.com/en-us/sip-trunking/pricing/us)  
- [Twilio Docs](https://www.twilio.com/docs/sip-trunking)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
&lt;h3&gt;
  
  
  🔷 Telnyx (SIP Trunking)
&lt;/h3&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;- [Telnyx SIP Pricing](https://telnyx.com/pricing/elastic-sip)  
- [Telnyx Docs](https://developers.telnyx.com/)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>ai</category>
      <category>voice</category>
      <category>cost</category>
      <category>analysis</category>
    </item>
    <item>
      <title>Cracking the &lt; 1-second Voice Loop: What We Learned After 30+ Stack Benchmarks</title>
      <dc:creator>Roman Piacquadio</dc:creator>
      <pubDate>Mon, 19 May 2025 15:09:52 +0000</pubDate>
      <link>https://dev.to/cloudx/cracking-the-1-second-voice-loop-what-we-learned-after-30-stack-benchmarks-427</link>
      <guid>https://dev.to/cloudx/cracking-the-1-second-voice-loop-what-we-learned-after-30-stack-benchmarks-427</guid>
      <description>&lt;h2&gt;
  
  
  Introduction — Why We’re Racing for &lt;em&gt;Sub-Second&lt;/em&gt; Voice Loops
&lt;/h2&gt;

&lt;p&gt;In &lt;strong&gt;October 2024&lt;/strong&gt; OpenAI unveiled its &lt;strong&gt;Realtime API&lt;/strong&gt;, the first end-to-end &lt;strong&gt;multimodal model&lt;/strong&gt; able to convert speech → text → reasoning → speech fast enough to feel &lt;em&gt;human&lt;/em&gt;.&lt;br&gt;&lt;br&gt;
That launch set the &lt;strong&gt;hype machine&lt;/strong&gt; spinning: “Why bother wiring three engines together when a single neural giant can do voice-to-voice in one shot?”&lt;/p&gt;

&lt;p&gt;Reality check:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pain Point&lt;/th&gt;
&lt;th&gt;Real-time Voice API&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cost&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~&lt;strong&gt;$20/hour&lt;/strong&gt; of two-way conversation — rough for contact-center scale.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Voices&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Locked to a handful of OpenAI-curated timbres; no custom cloning or branded voices.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Swapability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;You wait for &lt;em&gt;their&lt;/em&gt; next model drop — can’t plug in a brand-new STT or TTS that shipped yesterday.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Meanwhile, the open-source and vendor ecosystem didn’t sit still. By mid-2025 we could stitch together &lt;strong&gt;Deepgram STT + GPT-4 Nano/Mini + Cartesia Sonic (or ElevenLabs)&lt;/strong&gt; and hit &lt;em&gt;similar&lt;/em&gt; latency &lt;strong&gt;for a fraction of the cost&lt;/strong&gt; — while choosing any voice we like.&lt;/p&gt;

&lt;p&gt;The trick is to keep every stage &lt;strong&gt;modular&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speech-to-Text (STT)&lt;/strong&gt; — use whatever recognizer is fastest or cheapest today.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large Language Model (LLM)&lt;/strong&gt; — swap Mini ↔ Nano ↔ Flash checkpoints as they evolve.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text-to-Speech (TTS)&lt;/strong&gt; — pick the voice library that matches your brand.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Enter &lt;strong&gt;&lt;a href="https://livekit.io" rel="noopener noreferrer"&gt;LiveKit&lt;/a&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The glue that lets us shuffle those &lt;strong&gt;building blocks&lt;/strong&gt; in real time is &lt;strong&gt;LiveKit&lt;/strong&gt; — a WebRTC orchestration layer with an SDK that can fan-out telephone legs, browser streams, and AI workers on the same SFU.&lt;/p&gt;

&lt;p&gt;New STT, LLM, or TTS drops on a Friday?&lt;br&gt;&lt;br&gt;
We just &lt;strong&gt;swap the block&lt;/strong&gt;, &lt;strong&gt;restart the worker&lt;/strong&gt;, and it's live by lunch.&lt;/p&gt;

&lt;p&gt;No retraining. No monolithic rebuilds. Just composable parts evolving at their own pace.&lt;/p&gt;




&lt;h2&gt;
  
  
  What “Latency” Really Means (and Why It Hurts)
&lt;/h2&gt;

&lt;p&gt;Human turn-taking is &lt;em&gt;fast&lt;/em&gt;. Large-scale multilingual studies show that the &lt;strong&gt;median inter-turn gap is ≈ 200 ms&lt;/strong&gt;, but the range spans from as low as &lt;strong&gt;7 ms&lt;/strong&gt; (in Japanese) to over &lt;strong&gt;440 ms&lt;/strong&gt; (in Danish), depending on the language, sentence structure, and context of the exchange &lt;a href="https://arxiv.org/pdf/2404.16053v1" rel="noopener noreferrer"&gt;[1]&lt;/a&gt;.&lt;br&gt;&lt;br&gt;
A replication focused on English measured an average gap of &lt;strong&gt;236 ms ± 520 ms SD&lt;/strong&gt;, confirming that even within a single language, there’s wide variance depending on interaction type and formality.&lt;/p&gt;

&lt;p&gt;When the silence between turns stretches, our perception shifts:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;One-way gap&lt;/th&gt;
&lt;th&gt;How it feels&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;lt; ≈ 400 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Still “natural”, but you notice a beat.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;gt; ≈ 400 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ITU-T G.114 flags this as &lt;em&gt;unacceptable&lt;/em&gt; for conversational quality.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&amp;gt; ≈ 600–700 ms&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Most people label the call “robotic” or “satellite-delayed”.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;These reference points form the benchmark we’re chasing:&lt;br&gt;&lt;br&gt;
&lt;strong&gt;get the bot’s first syllable inside the ~400 ms comfort zone&lt;/strong&gt;—or, at the very least, close enough that the pause doesn’t break the conversational rhythm.&lt;/p&gt;




&lt;h2&gt;
  
  
  Anatomy of a Voice Pipeline
&lt;/h2&gt;

&lt;p&gt;A real-time loop has &lt;strong&gt;three streaming stages&lt;/strong&gt; that run strictly in series:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;What it does&lt;/th&gt;
&lt;th&gt;Latency metric&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;STT – Speech-to-Text&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Turns audio frames into text tokens.&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Final transcript time&lt;/em&gt; (but with proper streaming this is ≈ 0 ms relative to the next stage).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM – Large Language Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Crafts the reply.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TTFT (Time to First Token):&lt;/strong&gt; delay between sending the prompt and receiving the &lt;em&gt;first&lt;/em&gt; generated token.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;TTS – Text-to-Speech&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Voices the reply.&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;TTFB (Time to First Byte):&lt;/strong&gt; delay between sending the text and receiving the first playable PCM chunk.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Key observation:&lt;/strong&gt; in every stack we measured, &lt;strong&gt;LLM TTFT + TTS TTFB account for 90 %+ of total loop time&lt;/strong&gt;; with streaming recognizers, STT is effectively negligible.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;All three stages run in streaming inference — we start passing tokens or audio frames downstream the moment we see them.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Latency / Quality / Cost Triangle
&lt;/h2&gt;

&lt;p&gt;Push one corner, the others move:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lower latency ⇢&lt;/strong&gt; smaller / quantized models, “good-enough” neural voices.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Higher quality ⇢&lt;/strong&gt; bigger LLMs, premium TTS; usually slower.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lower cost ⇢&lt;/strong&gt; open-source or micro-models; may ding both speed &lt;em&gt;and&lt;/em&gt; fidelity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our job is to find the &lt;em&gt;quickest&lt;/em&gt; loop that still sounds customer-ready and doesn’t torch the budget.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv6m969l4eot7v7un154x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv6m969l4eot7v7un154x.png" alt="Latency / Quality / Cost Triangle" width="307" height="307"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  How We Benchmarked
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Same system prompt in &lt;strong&gt;English&lt;/strong&gt; &lt;em&gt;and&lt;/em&gt; &lt;strong&gt;Spanish&lt;/strong&gt;.
&lt;/li&gt;
&lt;li&gt;Dozens of &lt;strong&gt;STT + LLM + TTS combinations&lt;/strong&gt; (cloud &amp;amp; OSS of which we have selected the top performing).
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LiveKit&lt;/strong&gt; measured STT duration, TTFT, TTFB on every turn.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  A few things we learned fast
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;LLMs &amp;amp; TTS slow down outside English.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;A long system prompt only punishes the &lt;strong&gt;first&lt;/strong&gt; turn (~ +300 ms); later turns ride the KV-cache.
&lt;/li&gt;
&lt;li&gt;The newest “nano” LLMs plus an ultra-fast TTS can get that &lt;strong&gt;first syllable under 800 ms&lt;/strong&gt;, scraping the human comfort ceiling.&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;STT&lt;/th&gt;
&lt;th&gt;LLM (version)&lt;/th&gt;
&lt;th&gt;TTS&lt;/th&gt;
&lt;th&gt;Language&lt;/th&gt;
&lt;th&gt;TTFT (1st / next)&lt;/th&gt;
&lt;th&gt;TTS TTFB&lt;/th&gt;
&lt;th&gt;First Byte Latency*&lt;/th&gt;
&lt;th&gt;Tokens/s&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Whisper-1 (no stream)&lt;/td&gt;
&lt;td&gt;GPT-4o-mini&lt;/td&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;0.34 / 0.34 s&lt;/td&gt;
&lt;td&gt;0.42–0.47 s&lt;/td&gt;
&lt;td&gt;3.1–3.9 s&lt;/td&gt;
&lt;td&gt;19–48&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4o-mini&lt;/td&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;0.31–1.63 / 0.31–0.45 s&lt;/td&gt;
&lt;td&gt;0.35–0.46 s&lt;/td&gt;
&lt;td&gt;0.7–2.1 s&lt;/td&gt;
&lt;td&gt;9–23&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-mini&lt;/td&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;0.31–0.44 / 0.31–0.40 s&lt;/td&gt;
&lt;td&gt;0.40–0.59 s&lt;/td&gt;
&lt;td&gt;0.71–1.03 s&lt;/td&gt;
&lt;td&gt;13–67&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-mini&lt;/td&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;ES&lt;/td&gt;
&lt;td&gt;0.77–1.33 / 0.75–0.95 s&lt;/td&gt;
&lt;td&gt;0.56–0.69 s&lt;/td&gt;
&lt;td&gt;1.33–2.02 s&lt;/td&gt;
&lt;td&gt;29–38&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;Gemini 1.5 Flash&lt;/td&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;0.45–0.76 / 0.35–0.55 s&lt;/td&gt;
&lt;td&gt;0.45–0.70 s&lt;/td&gt;
&lt;td&gt;1.2–1.5 s&lt;/td&gt;
&lt;td&gt;40–85&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;Gemini 1.5 Flash&lt;/td&gt;
&lt;td&gt;ElevenLabs&lt;/td&gt;
&lt;td&gt;ES&lt;/td&gt;
&lt;td&gt;1.30–2.37 / 1.10–1.40 s&lt;/td&gt;
&lt;td&gt;0.46–0.69 s&lt;/td&gt;
&lt;td&gt;1.8–3.0 s&lt;/td&gt;
&lt;td&gt;25–58&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-mini&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-2&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;1.22–1.41 / 0.42–0.90 s&lt;/td&gt;
&lt;td&gt;0.43–0.45 s&lt;/td&gt;
&lt;td&gt;1.65–1.86 s&lt;/td&gt;
&lt;td&gt;23–46&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-mini&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-2&lt;/td&gt;
&lt;td&gt;ES&lt;/td&gt;
&lt;td&gt;0.74–1.38 / 0.70–0.90 s&lt;/td&gt;
&lt;td&gt;0.48–0.52 s&lt;/td&gt;
&lt;td&gt;1.22–1.90 s&lt;/td&gt;
&lt;td&gt;22–42&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-mini&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-Turbo&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;1.15–1.24 / 0.44–0.65 s&lt;/td&gt;
&lt;td&gt;0.38–0.41 s&lt;/td&gt;
&lt;td&gt;1.53–1.65 s&lt;/td&gt;
&lt;td&gt;17–45&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-mini&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-Turbo&lt;/td&gt;
&lt;td&gt;ES&lt;/td&gt;
&lt;td&gt;0.75–1.11 / 0.30–0.40 s&lt;/td&gt;
&lt;td&gt;0.43–0.46 s&lt;/td&gt;
&lt;td&gt;1.18–1.57 s&lt;/td&gt;
&lt;td&gt;31–51&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;Gemini 1.5 Flash&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-Turbo&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;1.19–1.27 / 1.19–1.27 s&lt;/td&gt;
&lt;td&gt;0.40–0.43 s&lt;/td&gt;
&lt;td&gt;1.59–1.70 s&lt;/td&gt;
&lt;td&gt;12–44&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;Gemini 1.5 Flash&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-Turbo&lt;/td&gt;
&lt;td&gt;ES&lt;/td&gt;
&lt;td&gt;1.28–1.39 / 1.00–1.10 s&lt;/td&gt;
&lt;td&gt;0.42–0.44 s&lt;/td&gt;
&lt;td&gt;1.70–1.83 s&lt;/td&gt;
&lt;td&gt;40–56&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-nano&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-Turbo&lt;/td&gt;
&lt;td&gt;EN&lt;/td&gt;
&lt;td&gt;0.90–0.97 / 0.30–0.40 s&lt;/td&gt;
&lt;td&gt;0.42–0.52 s&lt;/td&gt;
&lt;td&gt;0.73–1.45 s&lt;/td&gt;
&lt;td&gt;40–105&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;Deepgram&lt;/td&gt;
&lt;td&gt;GPT-4.1-nano&lt;/td&gt;
&lt;td&gt;Cartesia Sonic-Turbo&lt;/td&gt;
&lt;td&gt;ES&lt;/td&gt;
&lt;td&gt;1.00–1.07 / 0.26–0.40 s&lt;/td&gt;
&lt;td&gt;0.43–0.50 s&lt;/td&gt;
&lt;td&gt;0.75–1.53 s&lt;/td&gt;
&lt;td&gt;70–116&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  What the Numbers Tell Us
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. First-Turn Overhead Is Real
&lt;/h3&gt;

&lt;p&gt;Every stack shows a &lt;strong&gt;heavier first turn&lt;/strong&gt; because the LLM must ingest the entire system prompt before it can cache the KV-state.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Example:* in the &lt;strong&gt;GPT-4 Mini + Sonic-2 (EN)&lt;/strong&gt; stack the first TTFT clocks at &lt;strong&gt;≈ 1.22 s&lt;/strong&gt;, but subsequent turns fall to &lt;strong&gt;≈ 0.42–0.90 s&lt;/strong&gt;. The “prompt tax” is ~300–800 ms, and it vanishes after turn 2 because the model re-uses its internal cache.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. We’re Getting Closer to Human Latency — But Not Quite There Yet
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Human comfort band:&lt;/strong&gt; ~0.1–0.4 s one-way; anything above &lt;strong&gt;0.6–0.7 s&lt;/strong&gt; starts to feel "robotic."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best first syllable today:&lt;/strong&gt; &lt;strong&gt;0.73 s&lt;/strong&gt; (GPT-4 Nano + Sonic-Turbo, EN) and &lt;strong&gt;0.75 s&lt;/strong&gt; (same stack, ES).
&lt;em&gt;That’s about 2× slower than a natural gap, but already below the ITU’s 400 ms threshold for unacceptable RTT.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Second turn latency:&lt;/strong&gt; Since TTFT drops to &lt;strong&gt;0.26–0.40 s&lt;/strong&gt; and TTFB remains around &lt;strong&gt;0.43 s&lt;/strong&gt;, many loops land &lt;strong&gt;just under 0.7–0.8 s&lt;/strong&gt;—close enough that most users don’t perceive a delay.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. English Still Wins the Speed Race
&lt;/h3&gt;

&lt;p&gt;Across the board, Spanish incurs an extra &lt;strong&gt;+300–500 ms&lt;/strong&gt; in TTFT, and often a few additional milliseconds in TTFB.&lt;br&gt;&lt;br&gt;
This isn't surprising: most language models are trained on English-dominant datasets, and their tokenizers are optimized for English morphology. That means fewer tokens per word, higher-confidence predictions, and faster decoding paths.  &lt;/p&gt;

&lt;p&gt;In contrast, other languages often lead to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;More tokens per sentence (due to suboptimal tokenization),&lt;/li&gt;
&lt;li&gt;Less frequent vocabulary (slower logits resolution),&lt;/li&gt;
&lt;li&gt;Slightly longer prompts (higher input load),&lt;/li&gt;
&lt;li&gt;And more uncertainty during generation (costlier decoding).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Model providers are still actively optimizing multilingual performance—but for now, English remains the latency benchmark.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. STT Is a Non-Issue (When Streamed)
&lt;/h3&gt;

&lt;p&gt;Deepgram’s streaming mode continually emits tokens, so by the time the user finishes speaking the transcript is already done. &lt;strong&gt;&amp;lt; 5 ms&lt;/strong&gt; in our logs—effectively zero.&lt;/p&gt;




&lt;h2&gt;
  
  
  Which Stack for Whom?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Goal&lt;/th&gt;
&lt;th&gt;Stack to Watch&lt;/th&gt;
&lt;th&gt;Why&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ultra-low latency (&amp;lt; 0.8 s first byte)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;GPT-4 Nano + Cartesia Sonic-Turbo&lt;/strong&gt; (rows 13–14)&lt;/td&gt;
&lt;td&gt;Fastest TTFT (&amp;lt; 1 s first turn, &amp;lt; 0.40 s thereafter). Great for IVRs, live game NPCs, or any app where “snappiness” beats eloquence. Expect slightly terser, less nuanced language.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Balanced latency &amp;amp; quality&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;GPT-4 Mini + Cartesia Sonic-2 / Sonic-Turbo&lt;/strong&gt; (rows 7–8 &amp;amp; 10–11)&lt;/td&gt;
&lt;td&gt;Adds ~150-250 ms but yields noticeably richer wording and better reasoning. Sweet spot for customer support or sales calls where tone matters.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Language coverage beyond English&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Mini or Nano stacks + Sonic-Turbo (ES)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Spanish numbers are catching up; Sonic voices remain natural and the Nano drop still delivers TTFT &amp;lt; 1.1 s.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Premium voice fidelity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;ElevenLabs + Mini stacks&lt;/strong&gt; (rows 1–4)&lt;/td&gt;
&lt;td&gt;Neural voices lead the market in prosody; latency penalty is ~0.05–0.1 s vs. Sonic-Turbo—fine for podcasts, high-touch brand experiences.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;(Quality judgments are subjective; we used blind AB tests on 30 clips per stack.)&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusions &amp;amp; Near-Term Outlook
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Composable beats monolithic—today.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Because STT, LLM, and TTS evolve on different cadences, a modular pipeline lets you upgrade components the moment something faster drops—unlike monolithic models, where you must wait for the next provider release.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sub-second voice loops are already viable&lt;/strong&gt; for English and edging in for Spanish. With smarter caching, phoneme-level streaming, and incremental TTS we expect &lt;strong&gt;&amp;lt; 500 ms&lt;/strong&gt; within a year.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Model shrinkage will continue.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
“Nano” and “flash” checkpoints show that aggressive distillation + quantization can keep quality “good enough” while halving latency every generation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Edge deployment is accelerating.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Thanks to aggressive quantization (8-bit and even 4-bit), large language and speech models are now deployable on local hardware—consumer GPUs, mobile NPUs, and even embedded systems. This allows parts of the voice loop to run &lt;strong&gt;on-device&lt;/strong&gt;, cutting out network delays and shaving &lt;strong&gt;50–150 ms&lt;/strong&gt; off total latency.&lt;br&gt;&lt;br&gt;
&lt;a href="https://substack.com/home/post/p-160808933?source=queue&amp;amp;utm_source=chatgpt.com" rel="noopener noreferrer"&gt;Source: “AI Voice Inference at the Edge is Finally Here,” &lt;em&gt;VoiceTech Insights, 2025&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Joint LLM-TTS training is emerging.&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
A new generation of end-to-end speech models is beginning to bypass traditional TTS stages entirely. These models, like &lt;strong&gt;VITA-Audio&lt;/strong&gt; (2025), predict multiple audio tokens in a single step, generating speech directly from text while drastically reducing inference time. Once stable in streaming mode, these architectures could cut TTS latency to &lt;strong&gt;mere milliseconds&lt;/strong&gt;.&lt;br&gt;&lt;br&gt;
&lt;a href="https://arxiv.org/abs/2505.03739" rel="noopener noreferrer"&gt;Source: “VITA-Audio: Parallel Token-to-Audio Generation with Context-Aware Semantic Guidance,” &lt;em&gt;arXiv, May 2025&lt;/em&gt;&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Bottom line:&lt;/strong&gt; We’re only a few iteration cycles away from voice agents that &lt;em&gt;consistently&lt;/em&gt; reply in the same temporal rhythm as humans. If you build with LiveKit-style modular pipelines today, you can ride that curve with an overnight adjustment.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Stay tuned—the sub-half-second voice loop is closer than most teams think.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>livekit</category>
      <category>voice</category>
    </item>
    <item>
      <title>Building Voice AI Agents with the OpenAI Agents SDK</title>
      <dc:creator>Roman Piacquadio</dc:creator>
      <pubDate>Mon, 05 May 2025 12:49:15 +0000</pubDate>
      <link>https://dev.to/cloudx/building-voice-ai-agents-with-the-openai-agents-sdk-2aog</link>
      <guid>https://dev.to/cloudx/building-voice-ai-agents-with-the-openai-agents-sdk-2aog</guid>
      <description>&lt;h2&gt;
  
  
  Beyond Single Turns: OpenAI Enters the Voice Agent Arena
&lt;/h2&gt;

&lt;p&gt;In our previous post, &lt;a href="https://dev.to/cloudx/building-multi-agent-conversations-with-webrtc-livekit-48f1"&gt;Building Multi-Agent Conversations with WebRTC &amp;amp; LiveKit&lt;/a&gt;, we explored how to create complex, multi-stage voice interactions using the real-time power of WebRTC and the orchestration capabilities of the LiveKit Agents framework. We saw how crucial low latency and effective state management are for natural conversations, especially when handing off between different agent roles.&lt;/p&gt;

&lt;p&gt;Recently, OpenAI has significantly enhanced its offerings for building agentic systems, including dedicated tools and SDKs for creating voice agents. While the core concept of chaining Speech-to-Text (STT), Large Language Model (LLM), and Text-to-Speech (TTS) remains, OpenAI now provides more integrated primitives and an SDK designed to simplify this process, particularly within their ecosystem.&lt;/p&gt;

&lt;p&gt;This article dives into building voice agents using the OpenAI Agents SDK. We'll examine its architecture, walk through a Python example, and critically compare this approach with the LiveKit method discussed previously, highlighting the strengths, weaknesses, and ideal use cases for each.&lt;/p&gt;

&lt;h2&gt;
  
  
  OpenAI's Vision for Agents: Primitives and Orchestration
&lt;/h2&gt;

&lt;p&gt;OpenAI positions its platform as a set of composable primitives for building agents, covering domains like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Models:&lt;/strong&gt; Core intelligence (GPT-4o, the latest GPT-4.1 and GPT-4.1-mini, etc.) capable of reasoning and handling multimodality.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools:&lt;/strong&gt; Interfaces to the outside world, including developer-defined function calling, built-in web search, file search, etc.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Knowledge &amp;amp; Memory:&lt;/strong&gt; Using Vector Stores and Embeddings for context and persistence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio &amp;amp; Speech:&lt;/strong&gt; Primitives for understanding and generating voice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guardrails:&lt;/strong&gt; Moderation and instruction hierarchy for safety and control.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Orchestration:&lt;/strong&gt; The Agents SDK, Tracing, Evaluations, and Fine-tuning to manage the agent lifecycle.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For &lt;strong&gt;Voice Agents&lt;/strong&gt;, OpenAI presents two main architectural paths:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Speech-to-Speech (Multimodal - Realtime API):&lt;/strong&gt; Uses models like gpt-4o-realtime-preview that process audio input directly and generate audio output, aiming for the lowest latency and understanding vocal nuances. This uses a specific Realtime API separate from the main Chat Completions API.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chained (Agents SDK + Voice):&lt;/strong&gt; The more traditional STT → LLM → TTS flow, but orchestrated using the openai-agents SDK with its [voice] extension. This provides more transparency (text transcripts at each stage) and control, making it easier to integrate into existing text-based agent workflows.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;This post will focus on the Chained architecture using the OpenAI Agents SDK, as it aligns more closely with common agent development patterns and provides a clearer comparison point to the plugin-based approach of LiveKit.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The OpenAI Agents SDK: Simplifying Agent Logic
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;openai-agents&lt;/code&gt; Python SDK aims to provide a lightweight way to build agents with a few core concepts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent:&lt;/strong&gt; An LLM equipped with instructions, tools, and potentially knowledge about when to hand off tasks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handoffs:&lt;/strong&gt; A mechanism allowing one agent to delegate tasks to another, more specialized agent. Agents are configured with a list of potential agents they can hand off to.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools (&lt;code&gt;@function_tool&lt;/code&gt;):&lt;/strong&gt; Decorator to easily expose Python functions to the agent, similar to standard OpenAI function calling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guardrails:&lt;/strong&gt; Functions to validate inputs or outputs and enforce constraints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Runner:&lt;/strong&gt; Executes the agent logic, handling the loop of calling the LLM, executing tools, and managing handoffs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VoicePipeline (with [voice] extra):&lt;/strong&gt; Wraps an agent workflow (like one using Runner) to handle the STT and TTS parts of a voice interaction.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The philosophy is "Python-first," relying on Python's built-in features for orchestration rather than introducing many complex abstractions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture with OpenAI Agents SDK (Chained Voice)
&lt;/h2&gt;

&lt;p&gt;When using the &lt;code&gt;VoicePipeline&lt;/code&gt; from the SDK, the typical flow for a voice turn looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Audio Input:&lt;/strong&gt; Raw audio data (e.g., from a microphone) is captured.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VoicePipeline (STT):&lt;/strong&gt; The pipeline receives audio chunks. It uses an OpenAI STT model (like &lt;code&gt;gpt-4o-transcribe&lt;/code&gt; via the API) to transcribe the user's speech into text once speech ends (or via push-to-talk).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Agent Workflow Execution (MyWorkflow.run in the example):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;The transcribed text is passed to your defined workflow (e.g., a class inheriting from &lt;code&gt;VoiceWorkflowBase&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Inside the workflow, the &lt;code&gt;Runner&lt;/code&gt; is invoked with the current &lt;code&gt;Agent&lt;/code&gt;, conversation history, and the new user text.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;Agent&lt;/code&gt; (LLM) decides whether to respond directly, call a &lt;code&gt;Tool&lt;/code&gt; (function), or &lt;code&gt;Handoff&lt;/code&gt; to another agent based on its instructions and the user input.&lt;/li&gt;
&lt;li&gt;If a tool is called, the &lt;code&gt;Runner&lt;/code&gt; executes the Python function and sends the result back to the LLM.&lt;/li&gt;
&lt;li&gt;If a handoff occurs, the &lt;code&gt;Runner&lt;/code&gt; switches context to the new agent.&lt;/li&gt;
&lt;li&gt;The LLM generates the text response.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VoicePipeline (TTS):&lt;/strong&gt; The final text response from the agent workflow is sent to an OpenAI TTS model (e.g., &lt;code&gt;gpt-4o-mini-tts&lt;/code&gt;) via the API to generate audio.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audio Output:&lt;/strong&gt; The generated audio data is streamed back to be played to the user.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8fxvbdr20c6orgccvawj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8fxvbdr20c6orgccvawj.png" alt="VoicePipeline workflow" width="800" height="404"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;(Diagram: Microphone feeds audio to VoicePipeline for STT. Text goes to Agent Workflow (using Runner, Agent, Tools, Handoffs). Text response goes back to VoicePipeline for TTS, then to Speaker.)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This contrasts with the LiveKit architecture where WebRTC handles the audio transport layer directly, and the &lt;code&gt;livekit-agents&lt;/code&gt; framework integrates STT/LLM/TTS plugins into that real-time stream.&lt;/p&gt;

&lt;h2&gt;
  
  
  Let's Build: The Multi-Lingual Assistant (Python Example)
&lt;/h2&gt;

&lt;p&gt;Let's break down the key parts of the official OpenAI Agents SDK voice example. (Link to the repository will be at the end).&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Python 3.8+&lt;/li&gt;
&lt;li&gt;OpenAI API Key.&lt;/li&gt;
&lt;li&gt;Install the SDK with voice extras:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="s2"&gt;"openai-agents[voice]"&lt;/span&gt; sounddevice numpy python-dotenv textual &lt;span class="c"&gt;# For the demo UI&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Setup (.env file)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# .env&lt;/span&gt;
&lt;span class="nv"&gt;OPENAI_API_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"sk-..."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  3. Core Agent Logic (&lt;code&gt;my_workflow.py&lt;/code&gt;)
&lt;/h2&gt;

&lt;p&gt;This file defines the agents and the workflow logic that runs after speech is transcribed to text and before the response text is sent for synthesis.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Imports:&lt;/strong&gt; Necessary components from &lt;code&gt;agents&lt;/code&gt; SDK (&lt;code&gt;Agent&lt;/code&gt;, &lt;code&gt;Runner&lt;/code&gt;, &lt;code&gt;function_tool&lt;/code&gt;, &lt;code&gt;VoiceWorkflowBase&lt;/code&gt;, etc.).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tool Definition (&lt;code&gt;get_weather&lt;/code&gt;):&lt;/strong&gt; A simple Python function decorated with &lt;code&gt;@function_tool&lt;/code&gt; to make it callable by the &lt;code&gt;agent&lt;/code&gt;. The SDK handles generating the schema for the LLM.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections.abc&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AsyncIterator&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;

&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;TResponseInputItem&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;function_tool&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.extensions.handoff_prompt&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;prompt_with_handoff_instructions&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;agents.voice&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;VoiceWorkflowBase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;VoiceWorkflowHelper&lt;/span&gt;

&lt;span class="nd"&gt;@function_tool&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get the weather for a given city.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[debug] get_weather called with city: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;choices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sunny&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;cloudy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;rainy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snowy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The weather in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; is &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;choice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Agent Definitions (&lt;code&gt;spanish_agent&lt;/code&gt;, &lt;code&gt;agent&lt;/code&gt;):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Each &lt;code&gt;Agent&lt;/code&gt; is created with a &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;instructions&lt;/code&gt; (using a helper &lt;code&gt;prompt_with_handoff_instructions&lt;/code&gt; to guide its behavior regarding handoffs), a &lt;code&gt;model&lt;/code&gt;, and optionally &lt;code&gt;tools&lt;/code&gt; it can use and other &lt;code&gt;handoffs&lt;/code&gt; it can initiate.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;handoff_description&lt;/code&gt; helps the calling agent decide which agent to hand off to.
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;spanish_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Spanish&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;handoff_description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;A spanish speaking agent.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;prompt_with_handoff_instructions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re speaking to a human, so be polite and concise. Speak in Spanish.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Agent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Assistant&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;prompt_with_handoff_instructions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;re speaking to a human, so be polite and concise. If the user speaks in Spanish, handoff to the spanish agent.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4.1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;handoffs&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;spanish_agent&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="c1"&gt;# List of agents it can hand off to
&lt;/span&gt;    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;get_weather&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;      &lt;span class="c1"&gt;# List of tools it can use
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Workflow Class (&lt;code&gt;MyWorkflow&lt;/code&gt;):&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;Inherits from &lt;code&gt;VoiceWorkflowBase&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;__init__&lt;/code&gt;: Stores configuration (like the &lt;code&gt;secret_word&lt;/code&gt; for a simple game logic) and maintains state like conversation history (&lt;code&gt;_input_history&lt;/code&gt;) and the currently active agent (&lt;code&gt;_current_agent&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;run(transcription: str)&lt;/code&gt;: This is the core method called by the &lt;code&gt;VoicePipeline&lt;/code&gt; after STT.&lt;/li&gt;
&lt;li&gt;It receives the user's transcribed text.&lt;/li&gt;
&lt;li&gt;Updates the conversation history.&lt;/li&gt;
&lt;li&gt;Contains custom logic (like checking for the secret word).&lt;/li&gt;
&lt;li&gt;Invokes &lt;code&gt;Runner.run_streamed&lt;/code&gt; with the current agent and history. This handles the interaction with the LLM, tool calls, and potential handoffs based on the agent's configuration.&lt;/li&gt;
&lt;li&gt;Uses &lt;code&gt;VoiceWorkflowHelper.stream_text_from&lt;/code&gt; to yield text chunks as they are generated by the LLM (enabling faster TTS start).&lt;/li&gt;
&lt;li&gt;Updates the history and potentially the &lt;code&gt;_current_agent&lt;/code&gt; based on the &lt;code&gt;Runner&lt;/code&gt;'s result (if a handoff occurred).
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;MyWorkflow&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;VoiceWorkflowBase&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;secret_word&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;Callable&lt;/span&gt;&lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
        &lt;span class="c1"&gt;# ... (init stores history, current_agent, secret_word, callback) ...
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_input_history&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;TResponseInputItem&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_current_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_secret_word&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;secret_word&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_on_start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;on_start&lt;/span&gt; &lt;span class="c1"&gt;# Callback for UI updates
&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;AsyncIterator&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_on_start&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c1"&gt;# Call the UI callback
&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_input_history&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_secret_word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;transcription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="c1"&gt;# Custom logic example
&lt;/span&gt;            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You guessed the secret word!&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="c1"&gt;# ... (update history) ...
&lt;/span&gt;            &lt;span class="k"&gt;return&lt;/span&gt;

        &lt;span class="c1"&gt;# Run the agent logic using the Runner
&lt;/span&gt;        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Runner&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run_streamed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_current_agent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_input_history&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Stream text chunks for faster TTS
&lt;/span&gt;        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;VoiceWorkflowHelper&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream_text_from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;

        &lt;span class="c1"&gt;# Update state for the next turn
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_input_history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_input_list&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_current_agent&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;last_agent&lt;/span&gt; &lt;span class="c1"&gt;# Agent might have changed via handoff
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4. Client &amp;amp; Pipeline Setup (&lt;code&gt;main.py&lt;/code&gt;)
&lt;/h2&gt;

&lt;p&gt;This file sets up a simple Textual-based UI and manages the audio input/output and the &lt;code&gt;VoicePipeline&lt;/code&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It initializes &lt;code&gt;sounddevice&lt;/code&gt; for microphone input and speaker output.&lt;/li&gt;
&lt;li&gt;Creates the &lt;code&gt;VoicePipeline&lt;/code&gt;, passing in the &lt;code&gt;MyWorkflow&lt;/code&gt; instance.&lt;/li&gt;
&lt;li&gt;Uses &lt;code&gt;StreamedAudioInput&lt;/code&gt; to feed microphone data into the pipeline.&lt;/li&gt;
&lt;li&gt;Starts the pipeline using &lt;code&gt;pipeline.run(self._audio_input)&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Asynchronously iterates through the &lt;code&gt;result.stream()&lt;/code&gt; to:

&lt;ul&gt;
&lt;li&gt;Play back audio chunks (&lt;code&gt;voice_stream_event_audio&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;Display lifecycle events or transcriptions in the UI.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Handles starting/stopping recording based on key presses ('k').
&lt;em&gt;(Note: We won't dive deep into the Textual UI code here, focusing instead on the agent interaction pattern.)&lt;/em&gt;
&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  5. Running the Example
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ensure .env is set up.&lt;/li&gt;
&lt;li&gt;Run the main script: python main.py&lt;/li&gt;
&lt;li&gt;Press 'k' to start recording, speak, press 'k' again to stop. The agent should respond.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Comparing Approaches: OpenAI Agents SDK vs. LiveKit Agents
&lt;/h2&gt;

&lt;p&gt;Both frameworks allow building sophisticated voice agents with multiple roles, but they excel in different areas due to their underlying philosophies and technologies:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;strong&gt;Feature&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;OpenAI Agents SDK (Chained Voice)&lt;/strong&gt;&lt;/th&gt;
&lt;th&gt;&lt;strong&gt;LiveKit Agents Framework (WebRTC)&lt;/strong&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Core Technology&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;🐍 Python SDK orchestrating OpenAI APIs (STT, LLM, TTS)&lt;/td&gt;
&lt;td&gt;🌐 Python Framework built on LiveKit &amp;amp; WebRTC&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Higher (API calls for STT, LLM, TTS per turn)&lt;/td&gt;
&lt;td&gt;✅ Lower (Direct WebRTC streaming, optimized for voice)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-time Audio&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Handled by SDK (VoicePipeline), abstracts away stream&lt;/td&gt;
&lt;td&gt;✅ Core feature via WebRTC, fine-grained control possible&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Setup Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Generally Lower (mainly SDK install &amp;amp; API keys)&lt;/td&gt;
&lt;td&gt;⚠️ Higher (Requires LiveKit server setup/cloud account)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;STT/TTS Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Primarily uses OpenAI models via API.&lt;/td&gt;
&lt;td&gt;✅ Plugin-based (OpenAI, Deepgram, Google, etc.) easy swap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Uses OpenAI models via API.&lt;/td&gt;
&lt;td&gt;✅ Plugin-based (OpenAI, Anthropic, Local models, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Interruption Handling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Not built-in for StreamedAudioInput. Requires manual implementation listening to lifecycle events.&lt;/td&gt;
&lt;td&gt;✅ Built-in using VAD plugins (e.g., Silero).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;State Management&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Managed within Python workflow (e.g., list history)&lt;/td&gt;
&lt;td&gt;✅ Explicit userdata on AgentSession, shared state&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Agent Handoff&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Declarative (handoffs list in Agent)&lt;/td&gt;
&lt;td&gt;⚠️ Imperative (Agent function returns next agent instance)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Ecosystem&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Integrated with OpenAI Tracing, Evals, Fine-tuning.&lt;/td&gt;
&lt;td&gt;⚠️ Focused on real-time communication infrastructure.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Depends on Python deployment &amp;amp; API limits.&lt;/td&gt;
&lt;td&gt;✅ Built on scalable WebRTC infrastructure (LiveKit).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Note on OpenAI Realtime API:&lt;/strong&gt; OpenAI does offer the gpt-4o-realtime-preview model via a separate Realtime API for true speech-to-speech with potentially very low latency. However, this is a different architecture than the Agents SDK VoicePipeline discussed here, uses specific models, and has its own implementation details.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Choose Which?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Choose OpenAI Agents SDK (Chained Voice) When:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;You primarily want to work within the OpenAI ecosystem (Models, Tracing, Evals).&lt;/li&gt;
&lt;li&gt;Your application can tolerate slightly higher latency inherent in the chained API calls.&lt;/li&gt;
&lt;li&gt;You prefer a simpler initial setup without managing WebRTC infrastructure.&lt;/li&gt;
&lt;li&gt;You need transparency with text transcripts at each stage (STT output, LLM input/output).&lt;/li&gt;
&lt;li&gt;Built-in, low-latency interruption handling is not a critical out-of-the-box requirement.&lt;/li&gt;
&lt;li&gt;Your core logic is already text-based, and you're adding a voice interface.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Choose LiveKit Agents Framework When:&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Minimizing latency&lt;/strong&gt; is paramount for natural turn-taking.&lt;/li&gt;
&lt;li&gt;You need &lt;strong&gt;robust, built-in interruption handling&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;You require &lt;strong&gt;flexibility to choose and easily swap&lt;/strong&gt; different STT, LLM, and TTS providers (including non-OpenAI or self-hosted).&lt;/li&gt;
&lt;li&gt;You need fine-grained control over the real-time audio/video streams (WebRTC).&lt;/li&gt;
&lt;li&gt;You are building applications that inherently benefit from a "room"-based model (e.g., multiple users, agent joining calls).&lt;/li&gt;
&lt;li&gt;Scalability for many concurrent real-time connections is a primary concern.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;OpenAI's introduction of the Agents SDK, especially with its voice capabilities, provides a compelling and relatively straightforward path for developers already invested in their ecosystem to build voice agents. The VoicePipeline abstracts away some of the complexities of the STT → LLM → TTS chain. Its strengths lie in integration with OpenAI's tools (like tracing) and the declarative nature of defining agents, tools, and handoffs.&lt;/p&gt;

&lt;p&gt;However, for applications demanding the absolute lowest latency, seamless interruption handling, and maximum flexibility in choosing underlying AI models, the WebRTC-based approach offered by frameworks like LiveKit Agents remains a very strong contender. It requires more infrastructure setup but provides unparalleled control over the real-time aspects of the conversation.&lt;/p&gt;

&lt;p&gt;The choice depends heavily on your specific project requirements, tolerance for latency, need for flexibility, and existing technology stack. Both approaches offer powerful ways to move beyond simple bots and create truly interactive voice AI experiences.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Explore the &lt;a href="https://openai.github.io/openai-agents-python/" rel="noopener noreferrer"&gt;OpenAI Agents SDK Documentation&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Check out the &lt;a href="https://github.com/openai/openai-agents-python" rel="noopener noreferrer"&gt;OpenAI Agents GitHub Repository&lt;/a&gt; and the &lt;a href="https://github.com/openai/openai-agents-python/tree/main/examples/voice" rel="noopener noreferrer"&gt;voice example&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Learn about the &lt;a href="https://platform.openai.com/docs/guides/realtime" rel="noopener noreferrer"&gt;OpenAI Realtime API&lt;/a&gt; for speech-to-speech.&lt;/li&gt;
&lt;li&gt;Revisit the &lt;a href="https://docs.livekit.io/agents/" rel="noopener noreferrer"&gt;LiveKit Agents Documentation&lt;/a&gt; for comparison.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What are your thoughts on these different approaches to building voice agents? Let me know in the comments!&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>openai</category>
      <category>python</category>
    </item>
  </channel>
</rss>
