<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Brian Spann</title>
    <description>The latest articles on DEV Community by Brian Spann (@bspann).</description>
    <link>https://dev.to/bspann</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3739953%2Fa1db08ea-8160-494c-b904-c0857ab61e24.png</url>
      <title>DEV Community: Brian Spann</title>
      <link>https://dev.to/bspann</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bspann"/>
    <language>en</language>
    <item>
      <title>Presidio as an LLM Guardrail</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Fri, 12 Jun 2026 02:47:05 +0000</pubDate>
      <link>https://dev.to/bspann/presidio-as-an-llm-guardrail-gcf</link>
      <guid>https://dev.to/bspann/presidio-as-an-llm-guardrail-gcf</guid>
      <description>&lt;p&gt;Every previous part of this series has been building toward this one. You can detect PII. You can anonymize it with the right operator for each entity type. You can build custom recognizers for your organization's specific data patterns. Now we put it all together into the architecture that matters most in 2026: a PII guardrail that sits between your users and your LLM.&lt;/p&gt;

&lt;p&gt;The problem is straightforward. Users type personal information into prompts. Support agents paste customer records into chat interfaces. Developers pipe production data into debugging workflows. All of that PII flows to your model provider's API endpoint. Even if the provider says they don't train on your data, the information still transits their infrastructure. For regulated industries, that transit itself can be a compliance violation.&lt;/p&gt;

&lt;h2&gt;The PII Proxy Pattern&lt;/h2&gt;

&lt;p&gt;The solution is a proxy that intercepts every LLM request, scrubs PII from the prompt, forwards the clean version, and then restores the PII in the response.&lt;/p&gt;

&lt;p&gt;The flow looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User sends a prompt containing PII&lt;/li&gt;
&lt;li&gt;Proxy detects and encrypts all PII entities&lt;/li&gt;
&lt;li&gt;Clean prompt (with encrypted tokens) goes to the LLM&lt;/li&gt;
&lt;li&gt;LLM responds using the encrypted tokens&lt;/li&gt;
&lt;li&gt;Proxy decrypts the tokens in the response, restoring original PII&lt;/li&gt;
&lt;li&gt;User sees a response with their real data intact&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The user never notices the proxy exists. The LLM never sees the PII that Presidio detects. The encryption key stays on your infrastructure.&lt;/p&gt;

&lt;h2&gt;Building the Proxy in Python&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnonymizerEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;DeanonymizeEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer.entities&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OperatorConfig&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize Presidio engines
&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;anonymizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnonymizerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;deanonymizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DeanonymizeEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;ENCRYPTION_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WmZq4t7w!z%C*F-J&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;  &lt;span class="c1"&gt;# In production, pull from Key Vault
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrub_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Detect and encrypt PII in the prompt.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DEFAULT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encrypt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ENCRYPTION_KEY&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;restore_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Decrypt PII tokens in the LLM response.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer.operators&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Decrypt&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;

    &lt;span class="c1"&gt;# Offsets recorded for the prompt do not apply to the response,
&lt;/span&gt;    &lt;span class="c1"&gt;# so find each encrypted token by value and decrypt it in place.
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;original&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Decrypt&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;operate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ENCRYPTION_KEY&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;chat_with_guardrail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Send a message to the LLM with PII protection.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Step 1: Scrub
&lt;/span&gt;    &lt;span class="n"&gt;clean_prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pii_items&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;scrub_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_message&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 2: Send to LLM
&lt;/span&gt;    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;AzureOpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;azure_endpoint&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://your-endpoint.openai.azure.com/&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-api-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;api_version&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;2024-02-01&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gpt-4o&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;clean_prompt&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;llm_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 3: Restore
&lt;/span&gt;    &lt;span class="n"&gt;final_response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;restore_response&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;llm_response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pii_items&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;final_response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Test it with a prompt that mixes several entity types:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;user_input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Summarize this customer case: John Smith (john.smith@acme.com, 
SSN 123-45-6789) reported unauthorized charges on his Visa 
ending 4242. He can be reached at 206-555-0147.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;chat_with_guardrail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What the LLM sees: encrypted tokens in place of each entity Presidio detected. What the user sees: a response with their real customer data. The LLM processes the request without handling the detected values in the clear.&lt;/p&gt;
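
&lt;p&gt;One detail of the restore step is worth making concrete: the model's reply has different character offsets than the prompt, so tokens have to be found by value rather than by position. The sketch below shows that round trip using only the standard library. It is not Presidio: the regex detectors and the bracketed placeholder format are hypothetical stand-ins for the analyzer and the encrypt operator.&lt;/p&gt;

```python
import re

# Toy detectors standing in for Presidio's analyzer (hypothetical patterns).
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def scrub(text):
    """Replace each match with a placeholder and return (clean_text, vault)."""
    vault = {}
    for label, pattern in PATTERNS.items():
        def swap(match, label=label):
            token = "[[{}_{}]]".format(label, len(vault) + 1)
            vault[token] = match.group(0)
            return token
        text = pattern.sub(swap, text)
    return text, vault


def restore(text, vault):
    """Put original values back wherever a placeholder appears in the reply."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text


prompt = "Contact john.smith@acme.com about SSN 123-45-6789."
clean, vault = scrub(prompt)
# A model reply reorders the tokens, so prompt offsets no longer apply.
reply = "Case summary: {} belongs to {}.".format("[[SSN_2]]", "[[EMAIL_1]]")
print(clean)
print(restore(reply, vault))
```

&lt;p&gt;The same lookup-by-value idea carries over when the tokens are encrypted strings instead of placeholders.&lt;/p&gt;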

&lt;h2&gt;Moving the Guardrail into Azure API Management&lt;/h2&gt;

&lt;p&gt;The Python proxy works, but it lives inside one application. Every team that wants the same protection has to wire in the same code and keep it current. A guardrail belongs at the edge, where every model call already passes through. On Azure, that edge is API Management.&lt;/p&gt;

&lt;p&gt;Put APIM in front of Azure OpenAI and point your applications at the APIM endpoint instead of the model endpoint. Now APIM is the one place that sees every prompt and every completion. An inbound policy scrubs PII out of the prompt before it reaches the model. An outbound policy restores it on the way back, so the caller still gets their real values. You can run either direction on its own, or both.&lt;/p&gt;

&lt;p&gt;The flow with APIM:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;App calls the APIM endpoint with a prompt containing PII&lt;/li&gt;
&lt;li&gt;Inbound policy sends the prompt to Presidio, which encrypts the PII entities&lt;/li&gt;
&lt;li&gt;APIM stashes the entity map in a context variable and forwards the scrubbed prompt to Azure OpenAI&lt;/li&gt;
&lt;li&gt;Azure OpenAI responds, echoing back the encrypted tokens&lt;/li&gt;
&lt;li&gt;Outbound policy sends the response plus the saved entity map to Presidio to decrypt&lt;/li&gt;
&lt;li&gt;APIM returns the restored response to the app&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The model never sees real PII. The encryption key and the entity map never leave your APIM instance and its backend. No application code changes.&lt;/p&gt;

&lt;p&gt;In this setup Presidio sits behind two small endpoints, &lt;code&gt;/deidentify&lt;/code&gt; and &lt;code&gt;/reidentify&lt;/code&gt;. They are not part of Presidio's stock REST API. They come from a thin wrapper container that calls the analyzer and anonymizer, encrypting on the way in and decrypting on the way out, with the key pulled from Key Vault. The APIM policy calls them with &lt;code&gt;send-request&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight xml"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;policies&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;inbound&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;base&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="c"&gt;&amp;lt;!-- Pull the user's prompt out of the chat completion body --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;set-variable&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"userPrompt"&lt;/span&gt;
      &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"@(context.Request.Body.As&amp;lt;JObject&amp;gt;(preserveContent: true)["&lt;/span&gt;&lt;span class="err"&gt;messages"].Last["content"].ToString())"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;

    &lt;span class="c"&gt;&amp;lt;!-- De-identify: send the prompt to Presidio before the model sees it --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;send-request&lt;/span&gt; &lt;span class="na"&gt;mode=&lt;/span&gt;&lt;span class="s"&gt;"new"&lt;/span&gt; &lt;span class="na"&gt;response-variable-name=&lt;/span&gt;&lt;span class="s"&gt;"deidentified"&lt;/span&gt; &lt;span class="na"&gt;timeout=&lt;/span&gt;&lt;span class="s"&gt;"10"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-url&amp;gt;&lt;/span&gt;https://presidio.internal/deidentify&lt;span class="nt"&gt;&amp;lt;/set-url&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-method&amp;gt;&lt;/span&gt;POST&lt;span class="nt"&gt;&amp;lt;/set-method&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-header&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"Content-Type"&lt;/span&gt; &lt;span class="na"&gt;exists-action=&lt;/span&gt;&lt;span class="s"&gt;"override"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;value&amp;gt;&lt;/span&gt;application/json&lt;span class="nt"&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/set-header&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-body&amp;gt;&lt;/span&gt;@(new JObject(new JProperty("text", (string)context.Variables["userPrompt"])).ToString())&lt;span class="nt"&gt;&amp;lt;/set-body&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/send-request&amp;gt;&lt;/span&gt;

    &lt;span class="c"&gt;&amp;lt;!-- Save the entity map so the outbound step can re-identify --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;set-variable&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"entityMap"&lt;/span&gt;
      &lt;span class="na"&gt;value=&lt;/span&gt;&lt;span class="s"&gt;"@(((IResponse)context.Variables["deidentified"]).Body.As&amp;lt;JObject&amp;gt;(preserveContent: true)["entities"].ToString())"&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;

    &lt;span class="c"&gt;&amp;lt;!-- Swap the scrubbed prompt back into the request before it hits the model --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;set-body&amp;gt;&lt;/span&gt;@{
      var body = context.Request.Body.As&lt;span class="nt"&gt;&amp;lt;JObject&amp;gt;&lt;/span&gt;();
      var clean = ((IResponse)context.Variables["deidentified"]).Body.As&lt;span class="nt"&gt;&amp;lt;JObject&amp;gt;&lt;/span&gt;()["text"].ToString();
      body["messages"].Last["content"] = clean;
      return body.ToString();
    }&lt;span class="nt"&gt;&amp;lt;/set-body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/inbound&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;backend&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;base&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/backend&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;outbound&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;base&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="c"&gt;&amp;lt;!-- Re-identify: decrypt the PII back into the model's response --&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;send-request&lt;/span&gt; &lt;span class="na"&gt;mode=&lt;/span&gt;&lt;span class="s"&gt;"new"&lt;/span&gt; &lt;span class="na"&gt;response-variable-name=&lt;/span&gt;&lt;span class="s"&gt;"reidentified"&lt;/span&gt; &lt;span class="na"&gt;timeout=&lt;/span&gt;&lt;span class="s"&gt;"10"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-url&amp;gt;&lt;/span&gt;https://presidio.internal/reidentify&lt;span class="nt"&gt;&amp;lt;/set-url&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-method&amp;gt;&lt;/span&gt;POST&lt;span class="nt"&gt;&amp;lt;/set-method&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-header&lt;/span&gt; &lt;span class="na"&gt;name=&lt;/span&gt;&lt;span class="s"&gt;"Content-Type"&lt;/span&gt; &lt;span class="na"&gt;exists-action=&lt;/span&gt;&lt;span class="s"&gt;"override"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
        &lt;span class="nt"&gt;&amp;lt;value&amp;gt;&lt;/span&gt;application/json&lt;span class="nt"&gt;&amp;lt;/value&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;/set-header&amp;gt;&lt;/span&gt;
      &lt;span class="nt"&gt;&amp;lt;set-body&amp;gt;&lt;/span&gt;@{
        var resp = context.Response.Body.As&lt;span class="nt"&gt;&amp;lt;JObject&amp;gt;&lt;/span&gt;(preserveContent: true);
        var content = resp["choices"][0]["message"]["content"].ToString();
        return new JObject(
          new JProperty("text", content),
          new JProperty("entities", JArray.Parse((string)context.Variables["entityMap"]))
        ).ToString();
      }&lt;span class="nt"&gt;&amp;lt;/set-body&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;/send-request&amp;gt;&lt;/span&gt;

    &lt;span class="nt"&gt;&amp;lt;set-body&amp;gt;&lt;/span&gt;@{
      var resp = context.Response.Body.As&lt;span class="nt"&gt;&amp;lt;JObject&amp;gt;&lt;/span&gt;();
      var restored = ((IResponse)context.Variables["reidentified"]).Body.As&lt;span class="nt"&gt;&amp;lt;JObject&amp;gt;&lt;/span&gt;()["text"].ToString();
      resp["choices"][0]["message"]["content"] = restored;
      return resp.ToString();
    }&lt;span class="nt"&gt;&amp;lt;/set-body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/outbound&amp;gt;&lt;/span&gt;

  &lt;span class="nt"&gt;&amp;lt;on-error&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;base&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/on-error&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/policies&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With this policy in place, every application pointed at the APIM endpoint gets PII protection without changing a line of its own code. The inbound and outbound blocks are independent: scrub on the way in only, restore on the way out only, or both, depending on whether you need the real values back in the response.&lt;/p&gt;
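
&lt;p&gt;The policy above implies a small JSON contract for the wrapper: &lt;code&gt;/deidentify&lt;/code&gt; accepts a &lt;code&gt;text&lt;/code&gt; field and returns &lt;code&gt;text&lt;/code&gt; plus an &lt;code&gt;entities&lt;/code&gt; array, and &lt;code&gt;/reidentify&lt;/code&gt; accepts both and returns &lt;code&gt;text&lt;/code&gt;. What follows is a minimal sketch of those two handlers, assuming the real Presidio calls are passed in as callables. The handler names, the &lt;code&gt;scrub&lt;/code&gt; and &lt;code&gt;restore&lt;/code&gt; parameters, and the toy stand-ins are all hypothetical; base64 here only imitates a reversible token and is not encryption.&lt;/p&gt;

```python
import base64
import json


def handle_deidentify(body, scrub):
    """POST /deidentify: JSON with "text" in, "text" and "entities" out.

    scrub is any callable returning (clean_text, entities), for example a
    function that wraps Presidio's analyzer and anonymizer.
    """
    payload = json.loads(body)
    clean_text, entities = scrub(payload["text"])
    return json.dumps({"text": clean_text, "entities": entities})


def handle_reidentify(body, restore):
    """POST /reidentify: JSON with "text" and "entities" in, "text" out."""
    payload = json.loads(body)
    restored = restore(payload["text"], payload["entities"])
    return json.dumps({"text": restored})


# Toy stand-ins so the contract can be exercised without Presidio.
def toy_scrub(text):
    secret = "123-45-6789"
    if secret not in text:
        return text, []
    # base64 is NOT encryption; it only stands in for the encrypt operator.
    token = base64.b64encode(secret.encode()).decode()
    return text.replace(secret, token), [{"entity_type": "US_SSN", "text": token}]


def toy_restore(text, entities):
    for entity in entities:
        original = base64.b64decode(entity["text"]).decode()
        text = text.replace(entity["text"], original)
    return text


scrubbed = json.loads(handle_deidentify(json.dumps({"text": "SSN 123-45-6789"}), toy_scrub))
model_reply = "Noted " + scrubbed["entities"][0]["text"]
request = json.dumps({"text": model_reply, "entities": scrubbed["entities"]})
restored = json.loads(handle_reidentify(request, toy_restore))["text"]
print(scrubbed["text"])
print(restored)
```

&lt;p&gt;Keeping the handlers free of Presidio imports makes the contract easy to test on its own, and the same functions can sit behind whatever web framework the wrapper container uses.&lt;/p&gt;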

&lt;p&gt;Two decisions shape the setup:&lt;/p&gt;

&lt;p&gt;Reversibility. The policy above uses Presidio's encrypt operator so the outbound step can decrypt. If you only need to keep PII away from the model and never need it back, switch the wrapper to Presidio's &lt;code&gt;replace&lt;/code&gt; operator and drop the outbound policy. That is simpler, and there is no key to manage.&lt;/p&gt;

&lt;p&gt;Where Presidio runs. The &lt;code&gt;send-request&lt;/code&gt; calls point at an internal Presidio endpoint. Keep it on the same VNet as APIM so prompts never touch the public internet. The next section covers those deployment options.&lt;/p&gt;
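
&lt;p&gt;On the reversibility decision: if you keep the encrypt operator, the key has to reach the wrapper without living in source code. Here is a minimal sketch, assuming the key is injected as an environment variable (for example, populated from Key Vault at deploy time). The variable name &lt;code&gt;PRESIDIO_ENCRYPTION_KEY&lt;/code&gt; and the helper are hypothetical; the length check reflects the AES key sizes of 128, 192, or 256 bits.&lt;/p&gt;

```python
import os

VALID_KEY_LENGTHS = (16, 24, 32)  # bytes: AES-128, AES-192, AES-256


def load_encryption_key(env_var="PRESIDIO_ENCRYPTION_KEY"):
    """Read the key from the environment and fail fast if it is unusable."""
    key = os.environ.get(env_var)
    if key is None:
        raise RuntimeError("{} is not set".format(env_var))
    if len(key.encode("utf-8")) not in VALID_KEY_LENGTHS:
        raise ValueError("key must be 16, 24, or 32 bytes long")
    return key


# Demo only: a real deployment injects the value and never hardcodes it.
os.environ["PRESIDIO_ENCRYPTION_KEY"] = "0123456789abcdef"
print(len(load_encryption_key()))
```

&lt;p&gt;Failing at startup on a missing or malformed key is cheaper than discovering it on the first request that contains PII.&lt;/p&gt;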

&lt;h2&gt;Deploying on Azure&lt;/h2&gt;

&lt;p&gt;For production, you need Presidio running as a service, not embedded in your application code. Here are the deployment options on Azure, from the quickest to stand up to the most production-ready.&lt;/p&gt;

&lt;h3&gt;Azure App Service&lt;/h3&gt;

&lt;p&gt;App Service is the quickest option to stand up. Deploy the Presidio Docker containers with minimal configuration:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a resource group&lt;/span&gt;
az group create &lt;span class="nt"&gt;--name&lt;/span&gt; rg-presidio &lt;span class="nt"&gt;--location&lt;/span&gt; eastus

&lt;span class="c"&gt;# Create an App Service plan&lt;/span&gt;
az appservice plan create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; presidio-plan &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-presidio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--is-linux&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--sku&lt;/span&gt; B2

&lt;span class="c"&gt;# Deploy the analyzer&lt;/span&gt;
az webapp create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; presidio-analyzer-prod &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-presidio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--plan&lt;/span&gt; presidio-plan &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--deployment-container-image-name&lt;/span&gt; mcr.microsoft.com/presidio-analyzer:latest

&lt;span class="c"&gt;# Deploy the anonymizer&lt;/span&gt;
az webapp create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; presidio-anonymizer-prod &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-presidio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--plan&lt;/span&gt; presidio-plan &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--deployment-container-image-name&lt;/span&gt; mcr.microsoft.com/presidio-anonymizer:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;Azure Container Apps&lt;/h3&gt;

&lt;p&gt;For more control over scaling, networking, and multi-container deployments:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create an ACA environment&lt;/span&gt;
az containerapp &lt;span class="nb"&gt;env &lt;/span&gt;create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; presidio-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-presidio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; eastus

&lt;span class="c"&gt;# Deploy analyzer&lt;/span&gt;
az containerapp create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; presidio-analyzer &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-presidio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment&lt;/span&gt; presidio-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image&lt;/span&gt; mcr.microsoft.com/presidio-analyzer:latest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target-port&lt;/span&gt; 3000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ingress&lt;/span&gt; internal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-replicas&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-replicas&lt;/span&gt; 10

&lt;span class="c"&gt;# Deploy anonymizer&lt;/span&gt;
az containerapp create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; presidio-anonymizer &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-presidio &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment&lt;/span&gt; presidio-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image&lt;/span&gt; mcr.microsoft.com/presidio-anonymizer:latest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target-port&lt;/span&gt; 3000 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ingress&lt;/span&gt; internal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-replicas&lt;/span&gt; 1 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-replicas&lt;/span&gt; 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;--ingress internal&lt;/code&gt; means the Presidio services aren't exposed to the internet. Only other services in the same ACA environment (or VNet) can reach them. Your &lt;code&gt;/deidentify&lt;/code&gt; and &lt;code&gt;/reidentify&lt;/code&gt; wrapper sits in the same environment and calls the analyzer and anonymizer over the internal network, and APIM calls the wrapper the same way.&lt;/p&gt;

&lt;h3&gt;Kubernetes&lt;/h3&gt;

&lt;p&gt;For enterprise deployments with existing AKS clusters, Presidio publishes Helm charts. The setup is more involved but gives you full control over resource limits, HPA scaling, pod affinity, and network policies.&lt;/p&gt;

&lt;h2&gt;Production Hardening&lt;/h2&gt;

&lt;h3&gt;Logging and Monitoring&lt;/h3&gt;

&lt;p&gt;Log every detection for audit trails, but never log the actual PII values. Log only the entity types, confidence scores, and positions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;

&lt;span class="n"&gt;logger&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;presidio-guardrail&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;scrub_with_logging&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Log detection summary (not the actual PII)
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entity_type=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;score=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;start=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; end=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; total_entities=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DEFAULT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encrypt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;ENCRYPTION_KEY&lt;/span&gt;&lt;span class="p"&gt;})}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  False Positive Handling
&lt;/h3&gt;

&lt;p&gt;Presidio will occasionally flag non-PII as PII. A place name like "Jordan" might be detected as a person name. A product SKU might match a phone number pattern. For production systems, build a feedback mechanism, starting with an allow list of confirmed false positives:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Maintain an allow list of known false positives
&lt;/span&gt;&lt;span class="n"&gt;FALSE_POSITIVE_ALLOWLIST&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jordan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Phoenix&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Austin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# Place names that are also personal names
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;555-0100&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# Known test number
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;filter_false_positives&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;filtered&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;allowlist&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;FALSE_POSITIVE_ALLOWLIST&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;allowlist&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;filtered&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;filtered&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Performance Considerations
&lt;/h3&gt;

&lt;p&gt;Presidio's analyzer is CPU-intensive, especially with the large spaCy model. For high-throughput workloads:&lt;/p&gt;

&lt;p&gt;Keep the analyzer engine warm. Initializing &lt;code&gt;AnalyzerEngine()&lt;/code&gt; loads the NLP model, which takes a few seconds. Do it once at startup, not per request.&lt;/p&gt;

&lt;p&gt;Set a score threshold. Low-confidence detections add false positives and create extra anonymization work downstream. Start with 0.5 and adjust based on your accuracy requirements.&lt;/p&gt;

&lt;p&gt;Use the right NLP model size. &lt;code&gt;en_core_web_lg&lt;/code&gt; is more accurate but slower. &lt;code&gt;en_core_web_sm&lt;/code&gt; is faster but misses more entities. Profile your specific workload to find the right tradeoff.&lt;/p&gt;

&lt;p&gt;Cache recognizer results for repeated text. If the same support template gets processed thousands of times, cache the detection results and only run the anonymizer.&lt;/p&gt;
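
&lt;p&gt;A minimal sketch of that caching idea, assuming detection is deterministic for a given text and recognizer configuration. &lt;code&gt;DetectionCache&lt;/code&gt;, &lt;code&gt;fake_detect&lt;/code&gt;, and the tuple-based &lt;code&gt;Detection&lt;/code&gt; type are hypothetical names for illustration; in a real service the &lt;code&gt;detect&lt;/code&gt; callable would wrap &lt;code&gt;analyzer.analyze&lt;/code&gt;. The cache key is a SHA-256 digest so the raw text is not held as a key:&lt;/p&gt;

```python
import hashlib
from collections import OrderedDict
from typing import Callable, List, Tuple

# (entity_type, start, end, score): a simplified stand-in for Presidio's RecognizerResult
Detection = Tuple[str, int, int, float]


class DetectionCache:
    """Small LRU cache keyed by a digest of the text (hypothetical helper)."""

    def __init__(self, detect: Callable[[str], List[Detection]], max_size: int = 1024):
        self._detect = detect  # assumed to wrap analyzer.analyze in a real service
        self._max_size = max_size
        self._entries: "OrderedDict[str, List[Detection]]" = OrderedDict()
        self.hits = 0
        self.misses = 0

    def analyze(self, text: str) -> List[Detection]:
        key = hashlib.sha256(text.encode("utf-8")).hexdigest()
        if key in self._entries:
            self.hits += 1
            self._entries.move_to_end(key)  # mark as most recently used
            return self._entries[key]
        self.misses += 1
        results = self._detect(text)
        self._entries[key] = results
        if len(self._entries) > self._max_size:
            self._entries.popitem(last=False)  # evict the least recently used entry
        return results


def fake_detect(text: str) -> List[Detection]:
    # Stand-in detector for illustration only: flags the literal name "John Smith"
    start = text.find("John Smith")
    return [("PERSON", start, start + 10, 0.85)] if start >= 0 else []


cache = DetectionCache(fake_detect, max_size=2)
template = "John Smith called about his account."
first = cache.analyze(template)
second = cache.analyze(template)  # served from the cache, detector not called again
```

&lt;p&gt;Cached results are only valid while the recognizer set and score threshold stay the same, so clear the cache whenever either changes.&lt;/p&gt;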

&lt;p&gt;When the guardrail runs inside APIM, two more things matter. Set a sane &lt;code&gt;timeout&lt;/code&gt; on the &lt;code&gt;send-request&lt;/code&gt; calls so a slow Presidio response can't hang the whole model call, and decide how to fail. Failing closed (block the request if Presidio is unreachable) protects PII at the cost of availability. Failing open does the reverse. For regulated workloads, fail closed and put Presidio behind enough replicas that it rarely comes to that.&lt;/p&gt;
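
&lt;p&gt;The same timeout and fail-closed decision applies inside the wrapper service. The sketch below is plain Python, not an APIM policy. &lt;code&gt;scrub_fail_closed&lt;/code&gt;, &lt;code&gt;GuardrailUnavailable&lt;/code&gt;, and the &lt;code&gt;scrub&lt;/code&gt; callable are hypothetical names, and the two-second default is an arbitrary placeholder:&lt;/p&gt;

```python
import concurrent.futures
import time
from typing import Callable


class GuardrailUnavailable(Exception):
    """Raised when scrubbing cannot complete, so the prompt must not be forwarded."""


# Shared pool so a timed-out call does not block the caller while it winds down
_executor = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def scrub_fail_closed(text: str, scrub: Callable[[str], str], timeout_s: float = 2.0) -> str:
    """Return scrubbed text, or raise GuardrailUnavailable on timeout or error."""
    future = _executor.submit(scrub, text)
    try:
        return future.result(timeout=timeout_s)
    except concurrent.futures.TimeoutError as exc:
        future.cancel()  # only prevents a queued call from starting; a running call continues
        raise GuardrailUnavailable("PII scrubbing timed out") from exc
    except Exception as exc:
        # Any other failure also blocks the request: the unscrubbed text is never returned
        raise GuardrailUnavailable("PII scrubbing failed") from exc


def slow_scrub(text: str) -> str:
    # Stand-in for a Presidio call that responds too slowly
    time.sleep(0.3)
    return text


try:
    scrub_fail_closed("John Smith called", slow_scrub, timeout_s=0.05)
    outcome = "forwarded"
except GuardrailUnavailable:
    outcome = "blocked"
```

&lt;p&gt;A fail-open variant would catch &lt;code&gt;GuardrailUnavailable&lt;/code&gt; and forward the original text, which is the behavior to avoid for regulated workloads.&lt;/p&gt;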

&lt;h2&gt;
  
  
  Series Wrap-Up
&lt;/h2&gt;

&lt;p&gt;Over these five parts we've gone from zero to a production-ready PII detection and anonymization pipeline. You can install and run Presidio, detect PII in text, images, and structured data, build custom recognizers for your organization's specific patterns, choose the right anonymization strategy for each use case, and deploy Presidio as an LLM guardrail at the APIM edge that keeps sensitive data off third-party infrastructure.&lt;/p&gt;

&lt;p&gt;The framework is actively maintained, the Docker images are production-ready, and the extensibility model (custom recognizers, custom operators, external NLP services) means it adapts to whatever compliance requirements your organization throws at it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 5 of the Hands-On Microsoft Presidio series. I write about PII detection, AI infrastructure, and building with Claude Code on &lt;a href="https://dev.to/bspann"&gt;Dev.to&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>presidio</category>
      <category>microsoft</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Anonymization Strategies</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Wed, 10 Jun 2026 01:10:48 +0000</pubDate>
      <link>https://dev.to/bspann/anonymization-strategies-4l91</link>
      <guid>https://dev.to/bspann/anonymization-strategies-4l91</guid>
      <description>&lt;p&gt;Detection tells you where the PII is. Anonymization decides what to do about it. Presidio's anonymizer ships with five built-in operators, each suited for different compliance requirements and use cases. Choosing wrong means either destroying data you needed to recover or leaving sensitive information exposed in ways you didn't intend.&lt;/p&gt;

&lt;p&gt;This part covers every anonymization operator, when to use each one, how to build pseudonymization with consistent name mappings, and how to process PII in PDFs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Five Built-In Operators
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Replace
&lt;/h3&gt;

&lt;p&gt;Replaces the detected entity with a specified value. This is the default operator.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnonymizerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer.entities&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OperatorConfig&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;anonymizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnonymizerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John Smith called from 206-555-0147 about his account.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Replace with custom placeholders (without new_value, the default is the entity type label)
&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;replace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[REDACTED NAME]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;replace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[REDACTED PHONE]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: [REDACTED NAME] called from [REDACTED PHONE] about his account.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use replace when you want the output to be human-readable and when the original values don't need to be recovered. Good for sharing anonymized datasets with external teams, displaying sanitized text in dashboards, and audit logs where the PII type matters but the value doesn't.&lt;/p&gt;

&lt;h3&gt;
  
  
  Redact
&lt;/h3&gt;

&lt;p&gt;Removes the entity entirely, leaving no placeholder behind.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output:  called from  about his account.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Redaction changes the text structure and can make sentences unreadable. It's appropriate for internal audit logs where readability isn't a priority, strict compliance scenarios where no trace of PII should remain, and automated pipelines where the text isn't shown to humans.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mask
&lt;/h3&gt;

&lt;p&gt;Replaces each character with a masking character, preserving the length of the original value.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;masking_char&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chars_to_mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Mask all characters
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from_end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;masking_char&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;#&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chars_to_mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;# Mask first 8 chars
&lt;/span&gt;            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from_end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: ********** called from ########0147 about his account.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Masking is useful when you need to preserve the length or partial value. Think credit card receipts showing the last four digits, or support screens where agents need to confirm partial identifiers.&lt;/p&gt;
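
&lt;p&gt;To make the "last four digits" pattern concrete, here is a small standalone sketch in plain Python. It does not call Presidio's &lt;code&gt;mask&lt;/code&gt; operator; &lt;code&gt;mask_keep_last&lt;/code&gt; is a hypothetical helper that shows the intended result when the value's length is not known in advance:&lt;/p&gt;

```python
def mask_keep_last(value: str, keep: int = 4, masking_char: str = "*") -> str:
    """Mask every character except the last `keep` characters (illustrative helper)."""
    visible = max(keep, 0)
    cut = max(len(value) - visible, 0)  # number of leading characters to mask
    return masking_char * cut + value[cut:]


card = mask_keep_last("4111111111111111")
phone = mask_keep_last("206-555-0147", keep=4, masking_char="#")
```

&lt;p&gt;Note that a value no longer than &lt;code&gt;keep&lt;/code&gt; comes back unmasked, so short identifiers need a stricter rule.&lt;/p&gt;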

&lt;h3&gt;
  
  
  Hash
&lt;/h3&gt;

&lt;p&gt;Replaces the entity with a one-way hash. The same input always produces the same hash, which makes it useful for analytics without exposing raw PII.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sha256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sha256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: (64-char SHA-256 hex digest) called from (64-char SHA-256 hex digest) about his account.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hash supports &lt;code&gt;sha256&lt;/code&gt; (default) and &lt;code&gt;sha512&lt;/code&gt;. Hashing is irreversible. You can't get the original value back from the hash. But you can compare hashes to determine if two records refer to the same person without knowing who that person is. Good for analytics pipelines, deduplication, and cross-referencing anonymized datasets.&lt;/p&gt;
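
&lt;p&gt;To illustrate why deterministic hashes support deduplication, here is a sketch using Python's standard &lt;code&gt;hashlib&lt;/code&gt; rather than Presidio itself. It assumes plain unsalted SHA-256 over the UTF-8 bytes, which may not match the digests your Presidio version produces, and &lt;code&gt;pseudonymize&lt;/code&gt; and the sample tickets are made up for the example:&lt;/p&gt;

```python
import hashlib


def pseudonymize(value: str) -> str:
    # Unsalted SHA-256 of the UTF-8 bytes (an assumption for illustration)
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


tickets = [
    {"ticket": 1, "customer": "John Smith"},
    {"ticket": 2, "customer": "Jane Doe"},
    {"ticket": 3, "customer": "John Smith"},
]

# Count tickets per customer using only the digests, never the names
counts = {}
for row in tickets:
    digest = pseudonymize(row["customer"])
    counts[digest] = counts.get(digest, 0) + 1

repeat_customers = [d for d, n in counts.items() if n > 1]
```

&lt;p&gt;Matching is exact, so "John Smith" and "john smith" produce different digests. Unsalted hashes of low-entropy values such as phone numbers can also be recovered by hashing every possible value, so treat them as pseudonymous rather than anonymous.&lt;/p&gt;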

&lt;h3&gt;
  
  
  Encrypt
&lt;/h3&gt;

&lt;p&gt;Replaces the entity with an encrypted value that can be decrypted later with the right key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DEFAULT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encrypt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WmZq4t7w!z%C*F-J&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Entities replaced with base64-encoded encrypted strings
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Encrypt is the only reversible operator. You can deanonymize later:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DeanonymizeEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer.entities&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OperatorConfig&lt;/span&gt;

&lt;span class="n"&gt;deanonymizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;DeanonymizeEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;deanonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deanonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deanonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;operators&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DEFAULT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;decrypt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WmZq4t7w!z%C*F-J&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deanonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: John Smith called from 206-555-0147 about his account.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



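&lt;p&gt;Use encrypt/decrypt for the PII proxy pattern: scrub the prompt before sending it to the LLM, then decrypt the encrypted values in the response. The key must be 128, 192, or 256 bits long (16, 24, or 32 characters for an ASCII key like the one above), and in production it should come from a secret store rather than source code. We'll build that exact pipeline in Part 5.&lt;/p&gt;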

&lt;h2&gt;
  
  
  Mixing Operators Per Entity Type
&lt;/h2&gt;

&lt;p&gt;In practice you'll want different strategies for different entity types in the same document.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;operators&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;replace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;PERSON&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMAIL_ADDRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hash_type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sha256&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;masking_char&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;*&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chars_to_mask&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;from_end&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
    &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CREDIT_CARD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;encrypt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WmZq4t7w!z%C*F-J&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;US_SSN&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;redact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DEFAULT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;OperatorConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;replace&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;new_value&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;&amp;lt;PII&amp;gt;&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;DEFAULT&lt;/code&gt; operator catches any entity type that doesn't have a specific operator assigned. Always set a default so nothing slips through unhandled.&lt;/p&gt;
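&lt;p&gt;The fallback can be pictured as a dictionary lookup. The sketch below is only a mental model, not Presidio's implementation: &lt;code&gt;pick_operator&lt;/code&gt; is a hypothetical helper, and the operators are plain strings rather than &lt;code&gt;OperatorConfig&lt;/code&gt; objects so it runs without Presidio installed.&lt;/p&gt;

```python
# Hypothetical stand-in: plain strings instead of OperatorConfig objects.
operators = {
    "PERSON": "replace",
    "US_SSN": "redact",
    "DEFAULT": "replace-with-PII-tag",
}


def pick_operator(entity_type, operators):
    """Return the operator for entity_type, falling back to DEFAULT."""
    return operators.get(entity_type, operators["DEFAULT"])


print(pick_operator("US_SSN", operators))     # has its own entry
print(pick_operator("IBAN_CODE", operators))  # no entry, so DEFAULT applies
```

&lt;p&gt;Any entity type you forgot to list, or one added by a new recognizer later, ends up in the second case.&lt;/p&gt;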

&lt;h2&gt;
  
  
  Pseudonymization with Consistent Mappings
&lt;/h2&gt;

&lt;p&gt;Standard replacement gives every entity of a type the same generic placeholder. If "John Smith" appears three times and "Jane Doe" twice, all five occurrences become &lt;code&gt;&amp;lt;PERSON&amp;gt;&lt;/code&gt;, so you can no longer tell which mentions refer to the same person. That's fine for redaction but breaks any analysis that needs to track individuals across records.&lt;/p&gt;

&lt;p&gt;Pseudonymization maps each unique value to a consistent fake value. "John Smith" always becomes "Robert Chen." "Jane Doe" always becomes "Maria Santos." The mapping is consistent within a dataset but the original values are unrecoverable without the mapping table.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnonymizerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer.entities&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OperatorConfig&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;faker&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Faker&lt;/span&gt;

&lt;span class="n"&gt;fake&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Faker&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;Faker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;seed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# Reproducible fake data
&lt;/span&gt;
&lt;span class="c1"&gt;# Maintain a mapping for consistency
&lt;/span&gt;&lt;span class="n"&gt;pii_mapping&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_consistent_replacement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;:&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;entity_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;name&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;entity_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMAIL_ADDRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;email&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;entity_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;phone_number&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="n"&gt;entity_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;LOCATION&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fake&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;city&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;_&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;]&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;pii_mapping&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To integrate this with Presidio, you can build a custom operator or post-process the analyzer results. The example below takes the second route:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;

&lt;span class="c1"&gt;# Reuses get_consistent_replacement() from the previous block
&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;John Smith emailed john@example.com about the project.
Later, John Smith called to follow up. His colleague Jane Doe 
also reached out from jane@example.com.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Sort by start position (descending) to replace from end to start
&lt;/span&gt;&lt;span class="n"&gt;sorted_results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;pseudonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sorted_results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;original&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;replacement&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_consistent_replacement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;original&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pseudonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pseudonymized&lt;/span&gt;&lt;span class="p"&gt;[:&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;replacement&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;pseudonymized&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pseudonymized&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both occurrences of "John Smith" map to the same fake name, and each email address maps to its own consistent fake address. The relationships in the data are preserved without exposing the real identities.&lt;/p&gt;
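&lt;p&gt;One thing to watch when post-processing by hand: the analyzer can return overlapping detections (for example, a URL found inside an email address), and replacing overlapping spans one after another garbles the text. A minimal sketch of one way to resolve that before replacing, assuming you keep the higher-scoring span. &lt;code&gt;Span&lt;/code&gt; and &lt;code&gt;drop_overlaps&lt;/code&gt; are hypothetical names, and the offsets and scores are made up for illustration:&lt;/p&gt;

```python
from collections import namedtuple

# Hypothetical stand-in for an analyzer result (same field names).
Span = namedtuple("Span", ["entity_type", "start", "end", "score"])


def drop_overlaps(results):
    """Keep the highest-scoring span wherever two detections overlap."""
    kept = []
    # Visit the best candidates first: higher score, then longer span.
    for r in sorted(results, key=lambda r: (-r.score, -(r.end - r.start))):
        # r is kept only if it lies entirely before or after every kept span.
        if all(k.start >= r.end or r.start >= k.end for k in kept):
            kept.append(r)
    # Return in end-to-start order, ready for in-place replacement.
    return sorted(kept, key=lambda r: r.start, reverse=True)


text = "Mail john@example.com today"
results = [
    Span("EMAIL_ADDRESS", 5, 21, 1.0),
    Span("URL", 10, 21, 0.5),  # overlaps the email span
]
for r in drop_overlaps(results):
    text = text[:r.start] + "[" + r.entity_type + "]" + text[r.end:]
print(text)
```

&lt;p&gt;In the loop above you would call &lt;code&gt;get_consistent_replacement&lt;/code&gt; instead of inserting the bracketed label.&lt;/p&gt;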

&lt;h2&gt;
  
  
  Reversible vs. Irreversible: When to Use Which
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Irreversible&lt;/strong&gt; (replace, redact, mask, hash): Use when the original values should never be recoverable. Compliance with GDPR right-to-erasure, publishing anonymized datasets, any scenario where re-identification is a risk.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reversible&lt;/strong&gt; (encrypt): Use when you need the original values back later. The PII proxy pattern (anonymize before LLM, deanonymize after), temporary anonymization for testing, workflows where an authorized user needs to see the real data.&lt;/p&gt;

&lt;p&gt;The key question: does anyone, ever, need to get the original PII back? If yes, encrypt. If no, use one of the irreversible operators. Don't hash when you need reversibility (common mistake). Don't encrypt when you need true anonymization (the key becomes a liability).&lt;/p&gt;
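&lt;p&gt;One caveat on "irreversible": a hash cannot be decrypted, but when the set of possible inputs is small (PINs, phone numbers, SSNs), an unsalted digest can be matched by enumerating candidates. The sketch below shows the idea with plain &lt;code&gt;hashlib&lt;/code&gt;, independent of how Presidio's hash operator is configured. The 4-digit PIN and the helper names are made up for illustration.&lt;/p&gt;

```python
import hashlib


def sha256_hex(value):
    """Hex SHA-256 digest of a string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def brute_force_pin(digest):
    """Try every 4-digit PIN until one hashes to the given digest."""
    for n in range(10_000):
        candidate = f"{n:04d}"
        if sha256_hex(candidate) == digest:
            return candidate
    return None


leaked = sha256_hex("0427")     # unsalted digest of a low-entropy value
print(brute_force_pin(leaked))  # the loop finds the original PIN
```

&lt;p&gt;For values like these, redact or replace is the safer irreversible choice.&lt;/p&gt;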

&lt;h2&gt;
  
  
  Processing PDFs
&lt;/h2&gt;

&lt;p&gt;Presidio doesn't process PDFs natively, but you can extract the text with PyMuPDF, detect PII in it with the analyzer, and redact the matching regions in the original PDF.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;  &lt;span class="c1"&gt;# PyMuPDF
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Open the PDF
&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_report.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_text&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Detect PII
&lt;/span&gt;    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Find the text location on the page
&lt;/span&gt;        &lt;span class="n"&gt;pii_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;instances&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;search_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pii_text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Draw redaction boxes
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;inst&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;instances&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_redact_annot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;inst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Apply all redactions on this page
&lt;/span&gt;    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply_redactions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Save the redacted PDF
&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_report_redacted.pdf&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This approach searches for each detected PII string on the PDF page and draws a black box over it. The &lt;code&gt;apply_redactions()&lt;/code&gt; call permanently removes the underlying text, so the PII is gone from the file, not just covered up visually.&lt;/p&gt;
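&lt;p&gt;Because the lookup is by string rather than by position, a name detected twice is searched for twice, and a very short match may also hit unrelated text that happens to contain it. One possible pre-processing step, sketched here with plain Python, is to deduplicate the detected strings and set very short ones aside for manual review. &lt;code&gt;plan_redactions&lt;/code&gt; and the &lt;code&gt;min_length&lt;/code&gt; threshold are hypothetical, not PyMuPDF or Presidio settings.&lt;/p&gt;

```python
def plan_redactions(text, spans, min_length=3):
    """Split detected strings into terms to search for and terms to review.

    spans is a list of (start, end) offsets into text. Each distinct string
    is returned once; strings shorter than min_length go to the review list.
    """
    search, review = [], []
    for start, end in spans:
        term = text[start:end].strip()
        bucket = search if len(term) >= min_length else review
        if term and term not in bucket:
            bucket.append(term)
    return search, review


text = "John Smith called. John Smith left a note for Al."
spans = [(0, 10), (19, 29), (46, 48)]  # made-up detection offsets
search, review = plan_redactions(text, spans)
print(search)
print(review)
```

&lt;p&gt;Only the strings in &lt;code&gt;search&lt;/code&gt; would be passed to the page search. Anything in &lt;code&gt;review&lt;/code&gt; still needs redacting, just by a more precise method than a substring search.&lt;/p&gt;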

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;You now have the full anonymization toolkit. In Part 5, we'll put it all together as an LLM guardrail: building a PII proxy that intercepts prompts, scrubs PII with encrypt, forwards the clean prompt to the model, and deanonymizes the response. We'll also cover LiteLLM integration, deployment on Azure, and production hardening.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 4 of the Hands-On Microsoft Presidio series. I write about PII detection, AI infrastructure, and building with Claude Code on &lt;a href="https://dev.to/bspann"&gt;Dev.to&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>presidio</category>
      <category>microsoft</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Building Custom Recognizers</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Mon, 08 Jun 2026 22:20:14 +0000</pubDate>
      <link>https://dev.to/bspann/building-custom-recognizers-5goe</link>
      <guid>https://dev.to/bspann/building-custom-recognizers-5goe</guid>
      <description>&lt;p&gt;Presidio's built-in recognizers cover the common PII types: names, emails, phone numbers, credit cards, SSNs. But every organization has PII that's specific to their business. Internal employee IDs that follow a custom format. Project codenames that shouldn't leak externally. Customer account numbers that don't match any standard pattern. Medical record numbers, policy IDs, internal ticket references. The built-in recognizers don't know about these.&lt;/p&gt;

&lt;p&gt;This part covers four ways to build custom recognizers, from the simplest (a list of words to flag) to the most sophisticated (connecting an external NLP service).&lt;/p&gt;

&lt;h2&gt;
  
  
  Deny-List Recognizers
&lt;/h2&gt;

&lt;p&gt;The fastest way to add a custom recognizer is a deny list. You give Presidio a list of words or phrases and it flags any exact match as a specific entity type.&lt;/p&gt;

&lt;p&gt;Use case: your company has internal project codenames (like "Project Titan," "Sapphire," "Nightingale") that are confidential and should never appear in data sent to external services.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PatternRecognizer&lt;/span&gt;

&lt;span class="c1"&gt;# Create a deny-list recognizer
&lt;/span&gt;&lt;span class="n"&gt;project_recognizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PatternRecognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;supported_entity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INTERNAL_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;deny_list&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Titan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sapphire&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Nightingale&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ironclad&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Meridian&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;deny_list_score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Add it to the analyzer
&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_recognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project_recognizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Test it
&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The Titan rollout is scheduled for Q3. Contact sarah@company.com for details.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; (score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INTERNAL_PROJECT: 'Titan' (score: 1.00)
EMAIL_ADDRESS: 'sarah@company.com' (score: 1.00)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;deny_list_score&lt;/code&gt; parameter sets the confidence level for matches. Set it to 1.0 if the deny list is curated and every match is definitely PII. Lower it if some terms might appear in non-sensitive contexts.&lt;/p&gt;

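&lt;p&gt;Deny lists are case-insensitive by default, because &lt;code&gt;PatternRecognizer&lt;/code&gt; applies &lt;code&gt;re.IGNORECASE&lt;/code&gt; through its default &lt;code&gt;global_regex_flags&lt;/code&gt;. "titan," "TITAN," and "Titan" all match. The same flags apply to regex patterns, so a character class like &lt;code&gt;[A-Z]&lt;/code&gt; also matches lowercase letters unless you override them.&lt;/p&gt;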
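
&lt;p&gt;To make the matching behavior concrete, here is a minimal standard-library sketch of how a deny list can be turned into one case-insensitive, whole-word regex. It illustrates the idea only and is not Presidio's internal code; the helper name &lt;code&gt;build_deny_list_regex&lt;/code&gt; and the sample sentence are made up for this example.&lt;/p&gt;

```python
import re


def build_deny_list_regex(terms):
    """Compile a deny list into one case-insensitive, whole-word regex."""
    # Longest terms first, so a longer term wins over a shorter prefix of it.
    escaped = sorted((re.escape(term) for term in terms), key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(escaped) + r")\b", re.IGNORECASE)


deny_regex = build_deny_list_regex(["Titan", "Sapphire", "Nightingale", "Ironclad"])

sample = "TITAN ships after sapphire; the Titanic is unrelated."
matches = [m.group(0) for m in deny_regex.finditer(sample)]
print(matches)
```

&lt;p&gt;The word boundaries keep "Titanic" from matching "Titan", which is the behavior you usually want from a curated list of code names.&lt;/p&gt;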

&lt;h2&gt;
  
  
  Regex Recognizers
&lt;/h2&gt;

&lt;p&gt;When your PII follows a pattern but the built-in recognizers don't cover it, write a regex recognizer.&lt;/p&gt;

&lt;p&gt;Use case: your company uses employee IDs in the format EMP-XXXXX (EMP- followed by 5 digits) and customer account numbers in the format ACC-XXXX-XXXX.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PatternRecognizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Pattern&lt;/span&gt;

&lt;span class="c1"&gt;# Employee ID recognizer
&lt;/span&gt;&lt;span class="n"&gt;emp_id_pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employee_id_pattern&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;regex&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bEMP-\d{5}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;emp_recognizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PatternRecognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;supported_entity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMPLOYEE_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;emp_id_pattern&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EmployeeIdRecognizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Customer account recognizer
&lt;/span&gt;&lt;span class="n"&gt;account_pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;account_number_pattern&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;regex&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\bACC-\d{4}-\d{4}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.9&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;account_recognizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PatternRecognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;supported_entity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CUSTOMER_ACCOUNT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;account_pattern&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CustomerAccountRecognizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Register both
&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_recognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;emp_recognizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_recognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;account_recognizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Employee EMP-28471 processed refund for account ACC-9921-0047.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; (score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;EMPLOYEE_ID: 'EMP-28471' (score: 0.90)
CUSTOMER_ACCOUNT: 'ACC-9921-0047' (score: 0.90)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;score&lt;/code&gt; in the Pattern object sets the base confidence. You can define multiple patterns for the same entity type if the format varies (some systems might use EMP-XXXXX and others use E-XXXXXXX).&lt;/p&gt;
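
&lt;p&gt;As a rough illustration of the multi-pattern idea, the sketch below checks two formats for the same entity using only the standard library. The &lt;code&gt;E-XXXXXXX&lt;/code&gt; regex, its 0.7 score, and the helper &lt;code&gt;find_employee_ids&lt;/code&gt; are assumptions for this example. In Presidio you would instead pass both &lt;code&gt;Pattern&lt;/code&gt; objects in the &lt;code&gt;patterns&lt;/code&gt; list of a single recognizer.&lt;/p&gt;

```python
import re

# Two hypothetical formats for the same entity, each with its own base score.
EMPLOYEE_ID_PATTERNS = [
    ("emp_prefix", re.compile(r"\bEMP-\d{5}\b"), 0.9),
    ("e_prefix", re.compile(r"\bE-\d{7}\b"), 0.7),  # assumed score for the looser format
]


def find_employee_ids(text):
    """Return (matched_text, pattern_name, score) for every pattern hit."""
    hits = []
    for name, regex, score in EMPLOYEE_ID_PATTERNS:
        for match in regex.finditer(text):
            hits.append((match.group(0), name, score))
    return hits


hits = find_employee_ids("EMP-28471 approved the request from E-0041922.")
for value, name, score in hits:
    print(f"{value} matched {name} (score: {score:.2f})")
```

&lt;p&gt;Giving the looser format a lower score is a design choice: the less distinctive the pattern, the more you want context or a threshold to confirm it.&lt;/p&gt;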

&lt;h2&gt;
  
  
  Context Enhancement
&lt;/h2&gt;

&lt;p&gt;Regex patterns alone can produce false positives. A pattern like &lt;code&gt;\d{5}&lt;/code&gt; matches any 5-digit number, not just employee IDs. Context words help Presidio distinguish between a zip code and an employee number.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PatternRecognizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Pattern&lt;/span&gt;

&lt;span class="c1"&gt;# A medical record number recognizer with context
&lt;/span&gt;&lt;span class="n"&gt;mrn_pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pattern&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mrn_pattern&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;regex&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;\b\d{7,10}\b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;  &lt;span class="c1"&gt;# Low base score because 7-10 digit numbers are common
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;mrn_recognizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PatternRecognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;supported_entity&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MEDICAL_RECORD&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;patterns&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;mrn_pattern&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="c1"&gt;# Single-word entries: the default context enhancer compares individual token lemmas
&lt;/span&gt;    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;medical&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;record&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mrn&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;patient&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;chart&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;health&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;MedicalRecordRecognizer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_recognizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;mrn_recognizer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# With context: high confidence
&lt;/span&gt;&lt;span class="n"&gt;text1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Patient medical record number: 4829173&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Score is boosted when recognized context words precede the match
&lt;/span&gt;
&lt;span class="c1"&gt;# Without context: low confidence (might be filtered by threshold)
&lt;/span&gt;&lt;span class="n"&gt;text2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Order 4829173 shipped on Tuesday&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;results2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Score stays at base 0.3 because no context words present
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The pattern starts with a low base score (0.3). When context words appear within a configurable window around the match, Presidio boosts the score. When they don't, the score stays at 0.3, and the result is dropped as long as you pass a &lt;code&gt;score_threshold&lt;/code&gt; above that value to &lt;code&gt;analyze()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This is the right approach for any pattern that's too generic on its own. Set a low base score, provide strong context words, and let the context scoring do the disambiguation.&lt;/p&gt;
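
&lt;p&gt;The arithmetic behind that disambiguation can be sketched in a few lines of plain Python. This is a simplified model, not Presidio's implementation: the 0.35 boost, the 0.4 floor, the five-token window, and the single-word context list are assumptions chosen for illustration, so check the context enhancer settings in your Presidio version.&lt;/p&gt;

```python
import re


def has_context(text, match_start, context_words, window=5):
    """Check whether a context word is among the `window` tokens before the match."""
    preceding = text[:match_start].lower().split()[-window:]
    tokens = [token.strip(".,:;") for token in preceding]
    return any(word in tokens for word in context_words)


def score_with_context(base_score, context_found, boost=0.35, floor=0.4):
    """Raise the base score when context is present; boost and floor are assumed values."""
    if not context_found:
        return base_score
    return min(max(base_score + boost, floor), 1.0)


pattern = re.compile(r"\b\d{7,10}\b")
context_words = ["medical", "mrn", "patient", "chart"]
texts = [
    "Patient medical record number: 4829173",
    "Order 4829173 shipped on Tuesday",
]

scores = []
for text in texts:
    match = pattern.search(text)
    found = has_context(text, match.start(), context_words)
    scores.append(round(score_with_context(0.3, found), 2))
    print(f"{text!r} -> {scores[-1]}")
```

&lt;p&gt;With a threshold of 0.5, the first text survives and the second is filtered, even though both contain the same seven digits.&lt;/p&gt;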

&lt;h2&gt;
  
  
  No-Code Recognizers via YAML
&lt;/h2&gt;

&lt;p&gt;For teams that want to manage recognizers without touching Python code, Presidio supports YAML-based configuration. You define recognizers in a YAML file and load them at startup.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# custom_recognizers.yaml&lt;/span&gt;
&lt;span class="na"&gt;recognizers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Project&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Code&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Recognizer"&lt;/span&gt;
    &lt;span class="na"&gt;supported_language&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en"&lt;/span&gt;
    &lt;span class="na"&gt;supported_entity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INTERNAL_PROJECT"&lt;/span&gt;
    &lt;span class="na"&gt;deny_list&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Titan"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Sapphire"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Nightingale"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Ironclad"&lt;/span&gt;
    &lt;span class="na"&gt;deny_list_score&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1.0&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Employee&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;ID&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Recognizer"&lt;/span&gt;
    &lt;span class="na"&gt;supported_language&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en"&lt;/span&gt;
    &lt;span class="na"&gt;supported_entity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMPLOYEE_ID"&lt;/span&gt;
    &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;emp_id"&lt;/span&gt;
        &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;bEMP-&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;d{5}&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;b"&lt;/span&gt;
        &lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.9&lt;/span&gt;
    &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employee"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;emp"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;staff"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;worker"&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Policy&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Number&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Recognizer"&lt;/span&gt;
    &lt;span class="na"&gt;supported_language&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en"&lt;/span&gt;
    &lt;span class="na"&gt;supported_entity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;POLICY_NUMBER"&lt;/span&gt;
    &lt;span class="na"&gt;patterns&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;policy_format"&lt;/span&gt;
        &lt;span class="na"&gt;regex&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;bPOL-[A-Z]{2}-&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;d{6}&lt;/span&gt;&lt;span class="se"&gt;\\&lt;/span&gt;&lt;span class="s"&gt;b"&lt;/span&gt;
        &lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;0.95&lt;/span&gt;
    &lt;span class="na"&gt;context&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;policy"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;insurance"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;coverage"&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;claim"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Load them into the analyzer:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;RecognizerRegistry&lt;/span&gt;

&lt;span class="c1"&gt;# Start from the built-in recognizers, then add the ones defined in YAML
&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;RecognizerRegistry&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;load_predefined_recognizers&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_recognizers_from_yaml&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;custom_recognizers.yaml&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;registry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The YAML approach is useful when non-developers (security teams, compliance officers) need to update the recognizer list. They edit a YAML file, and the service picks up the new configuration on restart. No code changes, no deployments.&lt;/p&gt;
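
&lt;p&gt;When people outside the engineering team edit the file, a quick sanity check before the restart can catch typos early. The sketch below is one possible approach, not a Presidio feature: &lt;code&gt;validate_recognizers&lt;/code&gt; is a hypothetical helper, and the config is written as a Python dict shaped like the parsed YAML so the example needs nothing beyond the standard library.&lt;/p&gt;

```python
import re


def validate_recognizers(config):
    """Return a list of human-readable problems found in a recognizer config."""
    problems = []
    for rec in config.get("recognizers", []):
        name = rec.get("name", "(unnamed)")
        if not rec.get("supported_entity"):
            problems.append(f"{name}: missing supported_entity")
        if not rec.get("patterns") and not rec.get("deny_list"):
            problems.append(f"{name}: needs patterns or deny_list")
        for pattern in rec.get("patterns", []):
            regex = pattern.get("regex", "")
            try:
                re.compile(regex)
            except re.error as exc:
                problems.append(f"{name}: bad regex {regex!r} ({exc})")
    return problems


# Shaped like the parsed YAML; the second regex has a deliberate unclosed group.
config = {
    "recognizers": [
        {"name": "Employee ID Recognizer", "supported_entity": "EMPLOYEE_ID",
         "patterns": [{"name": "emp_id", "regex": r"\bEMP-\d{5}\b", "score": 0.9}]},
        {"name": "Policy Number Recognizer", "supported_entity": "POLICY_NUMBER",
         "patterns": [{"name": "policy_format", "regex": r"\bPOL-([A-Z]{2}-\d{6}\b", "score": 0.95}]},
    ]
}

problems = validate_recognizers(config)
for problem in problems:
    print(problem)
```

&lt;p&gt;Running a check like this in CI or as a pre-restart step means a bad edit fails loudly instead of silently disabling a recognizer.&lt;/p&gt;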

&lt;h2&gt;
  
  
  Connecting External Services
&lt;/h2&gt;

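&lt;p&gt;For cases where local regex and spaCy's NER aren't enough, Presidio can lean on stronger models. It supports remote recognizers that call external NLP services such as Azure AI Language, and it lets you swap the NLP engine itself. The example below takes the second route and plugs in a Hugging Face transformer model for NER.&lt;br&gt;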
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer.nlp_engine&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;NlpEngineProvider&lt;/span&gt;

&lt;span class="c1"&gt;# Use a transformer model for NER; spaCy still handles tokenization and lemmas
&lt;/span&gt;&lt;span class="n"&gt;nlp_config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;nlp_engine_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transformers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;models&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;lang_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;model_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;spacy&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en_core_web_sm&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;transformers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;dslim/bert-base-NER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="n"&gt;nlp_engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;NlpEngineProvider&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nlp_configuration&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;nlp_config&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;create_engine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nlp_engine&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;nlp_engine&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A transformer-based NER model (&lt;code&gt;dslim/bert-base-NER&lt;/code&gt; or similar) often outperforms spaCy's default model on names and locations, especially for unusual name formats. Note that &lt;code&gt;dslim/bert-base-NER&lt;/code&gt; is trained on English, so pick a model trained for your language if you handle non-English text. The tradeoff is speed. Transformer models are slower than spaCy, so measure latency against your requirements before switching.&lt;/p&gt;
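
&lt;p&gt;One simple way to compare engines is to time the same texts through each of them. The harness below is a minimal sketch using only the standard library; &lt;code&gt;measure_latency&lt;/code&gt; and &lt;code&gt;fake_analyze&lt;/code&gt; are hypothetical names, and the stand-in function should be replaced with a call into your own analyzer.&lt;/p&gt;

```python
import statistics
import time


def measure_latency(analyze_fn, texts, runs=5):
    """Time analyze_fn over texts and return median and worst latency in ms."""
    timings_ms = []
    for _ in range(runs):
        for text in texts:
            start = time.perf_counter()
            analyze_fn(text)
            timings_ms.append((time.perf_counter() - start) * 1000)
    return {
        "median_ms": statistics.median(timings_ms),
        "max_ms": max(timings_ms),
        "samples": len(timings_ms),
    }


# Stand-in for a real call such as:
#   lambda t: analyzer.analyze(text=t, language="en")
def fake_analyze(text):
    return [word for word in text.split() if word.istitle()]


stats = measure_latency(
    fake_analyze,
    ["Sarah emailed the Titan roadmap.", "Order 4829173 shipped."],
)
print(f"median {stats['median_ms']:.3f} ms, max {stats['max_ms']:.3f} ms over {stats['samples']} calls")
```

&lt;p&gt;Run it once per engine with representative prompts, and look at the worst case as well as the median, since a guardrail adds its latency to every request.&lt;/p&gt;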

&lt;h2&gt;
  
  
  Testing Your Recognizers
&lt;/h2&gt;

&lt;p&gt;Before deploying custom recognizers, test them against labeled data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="c1"&gt;# (add your custom recognizers)
&lt;/span&gt;
&lt;span class="c1"&gt;# Test cases: (input_text, expected_entity_type, expected_value)
&lt;/span&gt;&lt;span class="n"&gt;test_cases&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Employee EMP-12345 submitted the report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMPLOYEE_ID&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMP-12345&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Contact acc-9921-0047 about the refund&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CUSTOMER_ACCOUNT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ACC-9921-0047&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Project Titan launch is next month&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INTERNAL_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Titan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;The titan submarine was discovered&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;INTERNAL_PROJECT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;titan&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# Should this match?
&lt;/span&gt;    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Order number 12345 shipped&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# Should NOT match EMPLOYEE_ID
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expected_value&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;test_cases&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;score_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;relevant&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;expected_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;expected_type&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;expected_type&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;relevant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;found_value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;relevant&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;relevant&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PASS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;found_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;expected_value&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FAIL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;expected_type&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;relevant&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PASS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;FAIL&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; -&amp;gt; &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;expected_type&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;NONE&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pay particular attention to false positives (non-PII flagged as PII) and false negatives (actual PII missed). Adjust regex patterns, context words, and score thresholds based on your test results.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;You can now extend Presidio to detect any entity type your business needs. In Part 4, we'll cover anonymization strategies: the full set of operators (replace, redact, mask, hash, encrypt), pseudonymization with consistent mappings, synthetic data generation, and when to use reversible vs. irreversible anonymization.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 3 of the Hands-On Microsoft Presidio series. I write about PII detection, AI infrastructure, and building with Claude Code on &lt;a href="https://dev.to/bspann"&gt;Dev.to&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>presidio</category>
      <category>microsoft</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Detecting PII in Real-World Text</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Sun, 07 Jun 2026 18:28:34 +0000</pubDate>
      <link>https://dev.to/bspann/detecting-pii-in-real-world-text-310p</link>
      <guid>https://dev.to/bspann/detecting-pii-in-real-world-text-310p</guid>
      <description>&lt;p&gt;In Part 1 we installed Presidio and ran a basic detection on clean sample text. Real data is messier. Emails have signatures with phone numbers buried in HTML. Support tickets mix PII with technical jargon. Chat logs have informal name references that NER models struggle with. And sometimes the PII isn't in text at all. It's in screenshots and scanned documents.&lt;/p&gt;

&lt;p&gt;This part covers how Presidio's detection engine actually works under the hood, how to process different text types you'll encounter in production, and how to handle structured data and images.&lt;/p&gt;

&lt;h2&gt;
  
  
  How the Analyzer Engine Works
&lt;/h2&gt;

&lt;p&gt;Presidio doesn't rely on a single detection method. It layers three approaches and combines their results.&lt;/p&gt;

&lt;h3&gt;
  
  
  Named Entity Recognition (NER)
&lt;/h3&gt;

&lt;p&gt;The NER model (spaCy by default) processes the text and identifies entities based on the language model's training. It's good at catching names, locations, and organizations even when they don't follow a fixed pattern. "John Smith" is easy. "Dr. J. Martinez-Garcia" is harder but the NER model handles it because it understands context and word patterns.&lt;/p&gt;

&lt;p&gt;The tradeoff is that NER is probabilistic. It can miss unusual names or flag common words as entities. That's why Presidio doesn't stop here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern Matching (Regex)
&lt;/h3&gt;

&lt;p&gt;For entities with predictable formats, Presidio uses regex recognizers. Credit card numbers, SSNs, email addresses, IP addresses, phone numbers all have known patterns. A Luhn-validated 16-digit number is almost certainly a credit card. A string matching &lt;code&gt;\d{3}-\d{2}-\d{4}&lt;/code&gt; in the right context is probably an SSN.&lt;/p&gt;

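&lt;p&gt;Pattern-based detections that also pass a validation step, such as a Luhn checksum, typically get high confidence scores because a validated match is strong evidence. Weaker patterns start with lower scores and rely on context to raise them.&lt;/p&gt;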
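
&lt;p&gt;To make the idea concrete, here is a minimal standard-library sketch of pattern matching plus checksum validation. It is not Presidio's internal implementation: the names &lt;code&gt;SSN_PATTERN&lt;/code&gt;, &lt;code&gt;CARD_PATTERN&lt;/code&gt;, &lt;code&gt;luhn_valid&lt;/code&gt;, and &lt;code&gt;find_candidates&lt;/code&gt; are made up for illustration, and the card pattern assumes exactly 16 unbroken digits.&lt;/p&gt;

```python
import re

# Hypothetical patterns for illustration; Presidio's own recognizers are more thorough.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
CARD_PATTERN = re.compile(r"\b\d{16}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(ch) for ch in number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit, doubling every second one.
    for index, digit in enumerate(reversed(digits)):
        if index % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


def find_candidates(text: str) -> list[tuple[str, str]]:
    """Return (entity_type, matched_text) pairs for SSN-like and card-like strings."""
    found = [("US_SSN", m.group()) for m in SSN_PATTERN.finditer(text)]
    found += [
        ("CREDIT_CARD", m.group())
        for m in CARD_PATTERN.finditer(text)
        if luhn_valid(m.group())  # the checksum filters out look-alike numbers
    ]
    return found


print(find_candidates("SSN 123-45-6789, card 5105105105105100"))
```

&lt;p&gt;The regex narrows the candidates and the checksum filters out 16-digit look-alikes such as order numbers, which is why a validated match deserves more trust than the pattern alone.&lt;/p&gt;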

&lt;h3&gt;
  
  
  Context Scoring
&lt;/h3&gt;

&lt;p&gt;Here's where it gets interesting. Presidio looks at the words surrounding a potential match and boosts the confidence score when they support it. If the text says "my SSN is 123-45-6789," the phrase "my SSN is" provides strong context that the number is actually a social security number and not some random ID. The context words push the confidence score higher.&lt;/p&gt;

&lt;p&gt;Without context scoring, a 9-digit number in the format XXX-XX-XXXX could be an SSN or a product SKU or an internal reference number. The surrounding words help Presidio decide.&lt;/p&gt;

&lt;p&gt;Each recognizer defines its own list of context words. The SSN recognizer looks for words like "social," "security," "ssn," "tax id." The credit card recognizer looks for "credit," "card," "visa," "mastercard," "payment."&lt;/p&gt;
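
&lt;p&gt;A rough sketch of that boosting logic, using only the standard library, might look like this. The word list, the 0.35 boost, the five-word window, and the function name &lt;code&gt;score_with_context&lt;/code&gt; are all assumptions for illustration; Presidio's real context enhancer works on lemmas and is configured per recognizer.&lt;/p&gt;

```python
import re

# Assumed values for illustration only.
CONTEXT_WORDS = {"social", "security", "ssn"}
CONTEXT_BOOST = 0.35
WINDOW = 5  # how many words before the match are inspected


def score_with_context(text: str, start: int, base_score: float) -> float:
    """Boost base_score when a context word appears shortly before `start`."""
    preceding = re.findall(r"[a-z]+", text[:start].lower())[-WINDOW:]
    if CONTEXT_WORDS.intersection(preceding):
        # Cap the boosted score at 1.0 so it stays a valid confidence value.
        return min(1.0, base_score + CONTEXT_BOOST)
    return base_score


with_context = "my SSN is 123-45-6789"
without_context = "order 123-45-6789"
print(score_with_context(with_context, with_context.index("123"), 0.5))
print(score_with_context(without_context, without_context.index("123"), 0.5))
```

&lt;p&gt;The same pattern match ends up with a different score depending on its neighbors, which is what lets a single threshold separate likely SSNs from lookalike reference numbers.&lt;/p&gt;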

&lt;h2&gt;
  
  
  Processing Different Text Types
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Emails
&lt;/h3&gt;

&lt;p&gt;Email bodies often contain PII in signatures, forwarded messages, and inline contact details. The challenge is separating the PII you care about from the structural noise (headers, disclaimer text, HTML tags).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;email_body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
From: Sarah Chen &amp;lt;sarah.chen@acme.com&amp;gt;
To: support@company.com
Subject: Account Issue

Hi, I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m having trouble with my account. My customer ID is CUS-2847391 
and the last four of my card are 4242. Please call me at (415) 555-0198 
or email me at sarah.chen@acme.com.

Thanks,
Sarah Chen
VP of Engineering, Acme Corp
Office: (415) 555-0100
Mobile: (415) 555-0198
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;email_body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;email_body&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
          &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Presidio will pick up the email addresses, phone numbers, and the person's name from both the body and the signature. Depending on how the NER model is configured, it may also flag "Acme Corp" as an organization. You'll notice the same phone number appears twice (in the body and the signature), and Presidio reports each occurrence separately with its own position.&lt;/p&gt;
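
&lt;p&gt;If you care about distinct values more than individual occurrences, you can group the spans yourself. The sketch below is standard-library only; &lt;code&gt;group_occurrences&lt;/code&gt; is a hypothetical helper, and the hand-written &lt;code&gt;(entity_type, start, end)&lt;/code&gt; tuples stand in for the &lt;code&gt;entity_type&lt;/code&gt;, &lt;code&gt;start&lt;/code&gt;, and &lt;code&gt;end&lt;/code&gt; fields of real analyzer results.&lt;/p&gt;

```python
from collections import defaultdict


def group_occurrences(text, spans):
    """Group (entity_type, start, end) spans by the text they cover."""
    grouped = defaultdict(list)
    for entity_type, start, end in spans:
        value = text[start:end].strip()
        grouped[(entity_type, value)].append((start, end))
    return dict(grouped)


sample = "Call (415) 555-0198 today. Mobile: (415) 555-0198"
first = sample.index("(415)")
second = sample.index("(415)", first + 1)
# Hand-written spans standing in for analyzer output; each phone number is 14 characters long.
spans = [
    ("PHONE_NUMBER", first, first + 14),
    ("PHONE_NUMBER", second, second + 14),
]
print(group_occurrences(sample, spans))
```

&lt;p&gt;Each key is a distinct value and each list holds every position where it appears, which is handy when you want to replace all occurrences of one value with the same placeholder.&lt;/p&gt;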

&lt;h3&gt;
  
  
  Support Tickets
&lt;/h3&gt;

&lt;p&gt;Support tickets mix PII with technical content. Users paste error messages, stack traces, and config snippets alongside their personal details.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;ticket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
User report from jane.doe@company.com:

I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m getting error 500 when trying to update my billing info. 
My account number is 7829-4451-2290 and I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;m using the card 
ending in 8847. The error started after I changed my address 
to 1234 Oak Street, Portland, OR 97201.

Stack trace:
java.lang.NullPointerException at com.billing.PaymentService.update(PaymentService.java:142)
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;ticket&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



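&lt;p&gt;Presidio copes with this mix reasonably well: the pattern-based recognizers match well-structured PII such as the email address, while the NER model picks up location names like "Portland". The account number and the full street address are not covered by a dedicated built-in recognizer, so treat any hits on them as a bonus and plan on custom recognizers for formats like these. The stack trace is unlikely to produce many false positives because Java class names and file paths rarely match PII patterns, but check the results on your own tickets.&lt;/p&gt;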

&lt;h3&gt;
  
  
  Chat Logs
&lt;/h3&gt;

&lt;p&gt;Chat logs are the hardest text type for PII detection. Messages are short, informal, and full of abbreviations. Names appear without context. Phone numbers get typed without dashes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;chat_log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
[10:42] mike_t: hey can someone help with my acct? 
[10:42] mike_t: email is m.thompson@gmail.com
[10:43] support_bot: Sure Mike! What&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s the issue?
[10:44] mike_t: charge on my visa ending 4242 wasnt mine
[10:44] mike_t: my number is 5105105105105100
[10:45] support_bot: I&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;ll look into that. Can you confirm your DOB?
[10:45] mike_t: march 15 1990
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;chat_log&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A card number typed without dashes or spaces looks like any other run of digits, but Presidio's credit card recognizer applies Luhn validation to digit sequences, so it will still flag it. The date of birth is trickier: Presidio detects dates as generic date entities, and deciding that a date is a DOB requires context. Here the preceding message, "confirm your DOB", provides that context, but you need your own logic to make use of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Confidence Scores and Thresholds
&lt;/h2&gt;

&lt;p&gt;Every result comes with a confidence score between 0 and 1. By default, Presidio returns everything above 0. In production you'll want to set thresholds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Only return high-confidence detections
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;score_threshold&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Or filter after the fact for more control
&lt;/span&gt;&lt;span class="n"&gt;high_confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;medium_confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;low_confidence&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.4&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A practical approach: use a high threshold (0.7 or above) for automated anonymization where false positives are costly, and a lower threshold (0.3-0.5) for audit/review workflows where a human checks the flagged items.&lt;/p&gt;
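
&lt;p&gt;One way to wire up that split is sketched below, using only the standard library. &lt;code&gt;Detection&lt;/code&gt; is a hypothetical stand-in for an analyzer result, and &lt;code&gt;route&lt;/code&gt; with its default thresholds is an assumption for illustration, not part of Presidio.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Stand-in for an analyzer result; only the fields used here."""
    entity_type: str
    score: float


def route(detections, auto_threshold=0.7, review_threshold=0.4):
    """Split detections into auto-anonymize, human-review, and ignored buckets."""
    auto, review, ignored = [], [], []
    for detection in detections:
        if detection.score >= auto_threshold:
            auto.append(detection)
        elif detection.score >= review_threshold:
            review.append(detection)
        else:
            ignored.append(detection)
    return auto, review, ignored


auto, review, ignored = route([
    Detection("EMAIL_ADDRESS", 1.0),
    Detection("PERSON", 0.5),
    Detection("DATE_TIME", 0.2),
])
print(len(auto), len(review), len(ignored))
```

&lt;p&gt;Keeping the two thresholds as parameters makes it easy to tune them separately for each pipeline once you have measured false positives on your own data.&lt;/p&gt;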

&lt;h2&gt;
  
  
  Batch Processing with presidio-structured
&lt;/h2&gt;

&lt;p&gt;When your PII lives in CSVs, DataFrames, or JSON files, processing text column by column is tedious. The &lt;code&gt;presidio-structured&lt;/code&gt; package handles this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;presidio-structured
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;pandas&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_structured&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;StructuredEngine&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PandasAnalysisBuilder&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnonymizerEngine&lt;/span&gt;

&lt;span class="c1"&gt;# Sample DataFrame
&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pd&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;DataFrame&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;customer_name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;John Smith&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Jane Doe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bob Wilson&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;email&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;john@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;jane@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bob@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;notes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Called about SSN 123-45-6789&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Address: 456 Elm St, Portland OR&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Card ending 4242, refund requested&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="c1"&gt;# Set up the structured engine (it uses a pandas data processor by default)
&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;structured_engine&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StructuredEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Build the analysis: which entity type each column contains
&lt;/span&gt;&lt;span class="n"&gt;analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;PandasAnalysisBuilder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;generate_analysis&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Anonymize using the generated analysis
&lt;/span&gt;&lt;span class="n"&gt;anonymized_df&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;structured_engine&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;df&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized_df&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The analysis builder inspects the DataFrame with the same analyzer and maps each column that contains PII to an entity type; the structured engine then anonymizes the values in those columns. You can also supply the column-to-entity mapping yourself and apply different anonymization operators per entity type.&lt;/p&gt;

&lt;h2&gt;
  
  
  Image Redaction with presidio-image-redactor
&lt;/h2&gt;

&lt;p&gt;Sometimes PII isn't in text at all. It's in screenshots of forms, scanned documents, or photos of ID cards. Presidio's image redactor handles this by running OCR (via Tesseract) to extract text from images, detecting PII in the extracted text, and then drawing colored boxes over the PII regions in the original image.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install the image redactor&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;presidio-image-redactor

&lt;span class="c"&gt;# Make sure Tesseract is installed&lt;/span&gt;
&lt;span class="c"&gt;# Mac: brew install tesseract&lt;/span&gt;
&lt;span class="c"&gt;# Ubuntu: apt-get install tesseract-ocr&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_image_redactor&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ImageRedactorEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;PIL&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;

&lt;span class="c1"&gt;# Load an image
&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;support_screenshot.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the redactor
&lt;/span&gt;&lt;span class="n"&gt;redactor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ImageRedactorEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Redact PII from the image
&lt;/span&gt;&lt;span class="n"&gt;redacted_image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redactor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;redact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# Save the result
&lt;/span&gt;&lt;span class="n"&gt;redacted_image&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;support_screenshot_redacted.png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;fill&lt;/code&gt; parameter sets the color of the redaction boxes. Black (0, 0, 0) is the default. You can also limit redaction to specific entity types by passing &lt;code&gt;entities&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_image_redactor&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ImageRedactorEngine&lt;/span&gt;

&lt;span class="n"&gt;redactor&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ImageRedactorEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Redact only the listed entity types (image loaded as in the previous example)
&lt;/span&gt;&lt;span class="n"&gt;redacted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redactor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;redact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;image&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;fill&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;       &lt;span class="c1"&gt;# Box color: black
&lt;/span&gt;    &lt;span class="n"&gt;entities&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PERSON&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PHONE_NUMBER&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;EMAIL_ADDRESS&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Image redaction accuracy depends heavily on the OCR quality. Clean screenshots with standard fonts work well. Handwritten text, low-resolution scans, and images with complex backgrounds will produce lower accuracy. For those cases, you may want to preprocess the image (deskew, enhance contrast) before sending it to the redactor.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;Now you understand how Presidio's detection layers work together and how to process the text types you'll actually encounter. In Part 3, we'll build custom recognizers: deny-list recognizers for company-specific terms, regex recognizers for internal ID formats, rule-based recognizers with context enhancement, and no-code recognizers via YAML configuration.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 2 of the Hands-On Microsoft Presidio series. I write about PII detection, AI infrastructure, and building with Claude Code on &lt;a href="https://dev.to/bspann"&gt;Dev.to&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>presidio</category>
      <category>microsoft</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>What Is Microsoft Presidio and Why You Need It (Setup + First Detection)</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Fri, 05 Jun 2026 12:24:35 +0000</pubDate>
      <link>https://dev.to/bspann/what-is-microsoft-presidio-and-why-you-need-it-setup-first-detection-6mh</link>
      <guid>https://dev.to/bspann/what-is-microsoft-presidio-and-why-you-need-it-setup-first-detection-6mh</guid>
      <description>&lt;p&gt;If you're building anything that touches user data and sends it to an LLM, you have a PII problem. Names, emails, phone numbers, credit card numbers, social security numbers sitting in support tickets, chat logs, documents, and database fields. Every time you pipe that data into a prompt, you're sending someone's personal information to a third-party model endpoint. Maybe that's fine for your use case. Maybe it's not. Either way, you should know what's in your data before you make that call.&lt;/p&gt;

&lt;p&gt;Microsoft Presidio is an open-source framework that detects and anonymizes PII in text, images, and structured data. It's been around since 2019, it's actively maintained, and it's what I reach for when I need to scrub data before it hits an LLM. This series walks through the entire framework from installation to production deployment. No toy examples. Real workloads.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Presidio Actually Does
&lt;/h2&gt;

&lt;p&gt;Presidio has two core modules that handle the detection and anonymization pipeline separately.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Analyzer&lt;/strong&gt; finds PII. It combines named entity recognition (NER) from spaCy or Hugging Face transformers with regex pattern matching and contextual scoring. When you feed it text, it returns a list of detected entities with types, confidence scores, and character positions. It doesn't modify the text. It just tells you what it found.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Anonymizer&lt;/strong&gt; takes the analyzer's output and does something with it. Replace detected names with &lt;code&gt;&amp;lt;PERSON&amp;gt;&lt;/code&gt;. Redact phone numbers entirely. Mask credit card numbers with asterisks. Hash emails. Encrypt values you need to reverse later. The anonymizer is where you decide how to handle each entity type.&lt;/p&gt;

&lt;p&gt;Beyond those two, Presidio has additional modules for specific use cases. &lt;strong&gt;presidio-image-redactor&lt;/strong&gt; handles OCR on images and redacts PII from screenshots and scanned documents. &lt;strong&gt;presidio-structured&lt;/strong&gt; processes tabular data in DataFrames and JSON. We'll get to those in later parts of this series.&lt;/p&gt;

&lt;h2&gt;
  
  
  Installing Presidio
&lt;/h2&gt;

&lt;p&gt;You have two paths: Python packages via pip or Docker containers. I'll cover both because you'll want pip for development and experimentation, and Docker for anything that needs to serve an API.&lt;/p&gt;

&lt;h3&gt;
  
  
  pip Installation
&lt;/h3&gt;

&lt;p&gt;Set up a virtual environment first. Presidio pulls in spaCy and NLP models that you don't want colliding with other projects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create and activate a virtual environment&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; venv presidio-env
&lt;span class="nb"&gt;source &lt;/span&gt;presidio-env/bin/activate  &lt;span class="c"&gt;# Linux/Mac&lt;/span&gt;
&lt;span class="c"&gt;# presidio-env\Scripts\activate   # Windows&lt;/span&gt;

&lt;span class="c"&gt;# Install the core packages&lt;/span&gt;
pip &lt;span class="nb"&gt;install &lt;/span&gt;presidio-analyzer presidio-anonymizer

&lt;span class="c"&gt;# Download a spaCy language model (the large model is more accurate)&lt;/span&gt;
python &lt;span class="nt"&gt;-m&lt;/span&gt; spacy download en_core_web_lg
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;en_core_web_lg&lt;/code&gt; model is about 560MB. If you're tight on space or just experimenting, &lt;code&gt;en_core_web_sm&lt;/code&gt; works but you'll see lower accuracy on name and location detection. For anything beyond a quick test, use the large model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Docker Installation
&lt;/h3&gt;

&lt;p&gt;Presidio publishes official images to Microsoft Container Registry. Each module runs as its own REST API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Pull the images&lt;/span&gt;
docker pull mcr.microsoft.com/presidio-analyzer
docker pull mcr.microsoft.com/presidio-anonymizer

&lt;span class="c"&gt;# Run the analyzer on port 5001&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 5001:3000 mcr.microsoft.com/presidio-analyzer:latest

&lt;span class="c"&gt;# Run the anonymizer on port 5002&lt;/span&gt;
docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 5002:3000 mcr.microsoft.com/presidio-anonymizer:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both containers expose REST APIs on port 3000 internally. Map them to whatever ports you want on the host. Once they're running, you can hit them with curl or any HTTP client.&lt;/p&gt;

&lt;p&gt;To verify they're up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl http://localhost:5001/health
curl http://localhost:5002/health
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Your First Detection
&lt;/h2&gt;

&lt;p&gt;Let's feed the analyzer some text and see what comes back. I'll show both the Python API and the REST API so you can pick whichever fits your workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  Python API
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;

&lt;span class="c1"&gt;# Initialize the analyzer
&lt;/span&gt;&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# Sample text with multiple PII types
&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
Hi, my name is John Smith and I live in Seattle. 
My email is john.smith@example.com and my phone 
number is 206-555-0147. My SSN is 123-45-6789 
and my credit card is 4111-1111-1111-1111.
&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;

&lt;span class="c1"&gt;# Analyze the text
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Print what we found
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entity_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
          &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;(score: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;, position: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Output:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PERSON: 'John Smith' (score: 0.85, position: 18-28)
LOCATION: 'Seattle' (score: 0.85, position: 42-49)
EMAIL_ADDRESS: 'john.smith@example.com' (score: 1.00, position: 64-86)
PHONE_NUMBER: '206-555-0147' (score: 0.75, position: 110-122)
US_SSN: '123-45-6789' (score: 0.85, position: 134-145)
CREDIT_CARD: '4111-1111-1111-1111' (score: 1.00, position: 169-188)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  REST API (Docker)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:5001/analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "text": "My name is John Smith and my email is john.smith@example.com",
    "language": "en"
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response is a JSON array of detected entities with the same fields: entity type, start position, end position, and confidence score.&lt;/p&gt;
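
&lt;p&gt;To make that response shape concrete, here is a minimal Python sketch that parses an analyzer-style JSON array and recovers the matched substrings from the character offsets. The payload is hand-written for illustration, not captured from a live container: the scores are placeholder values, and a real response may contain additional fields or detections.&lt;/p&gt;

```python
import json

# The same text that was sent to /analyze in the curl example.
text = "My name is John Smith and my email is john.smith@example.com"

# Hand-written sample payload (an assumption for illustration): the offsets
# match the text above, but the scores are placeholder values.
sample_response = """
[
  {"entity_type": "PERSON", "start": 11, "end": 21, "score": 0.85},
  {"entity_type": "EMAIL_ADDRESS", "start": 38, "end": 60, "score": 1.0}
]
"""

detections = json.loads(sample_response)

# start is inclusive and end is exclusive, so a plain slice recovers the match.
matches = [(d["entity_type"], text[d["start"]:d["end"]]) for d in detections]

for entity_type, value in matches:
    print(f"{entity_type}: {value}")
```

&lt;p&gt;Because the offsets index into the text you sent, keep that original string around on the client side if you want to show or log what was matched.&lt;/p&gt;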

&lt;h2&gt;
  
  
  Anatomy of a Recognizer Result
&lt;/h2&gt;

&lt;p&gt;Every detection result contains five fields that matter:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;entity_type&lt;/strong&gt; is what Presidio thinks it found. &lt;code&gt;PERSON&lt;/code&gt;, &lt;code&gt;EMAIL_ADDRESS&lt;/code&gt;, &lt;code&gt;PHONE_NUMBER&lt;/code&gt;, &lt;code&gt;CREDIT_CARD&lt;/code&gt;, &lt;code&gt;US_SSN&lt;/code&gt;, &lt;code&gt;LOCATION&lt;/code&gt;, and dozens more.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;start&lt;/strong&gt; and &lt;strong&gt;end&lt;/strong&gt; are character positions in the original text. This is how you know exactly which substring triggered the detection. It's also how the anonymizer knows what to replace.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;score&lt;/strong&gt; is a confidence value between 0 and 1. The credit card match returns 1.0 because the number not only fits the regex pattern but also passes checksum validation. A name detected by NER might return 0.85 because the model is making a probabilistic judgment. You can set a threshold to filter out low-confidence detections. The default threshold is 0, which keeps every detection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;analysis_explanation&lt;/strong&gt; tells you which recognizer fired and why. It is populated when you request the decision process (&lt;code&gt;return_decision_process=True&lt;/code&gt; in the Python API). Useful for debugging false positives.&lt;/p&gt;
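
&lt;p&gt;The score threshold amounts to a simple filter over the results. The sketch below shows the idea in plain Python. The &lt;code&gt;Detection&lt;/code&gt; dataclass and &lt;code&gt;filter_by_score&lt;/code&gt; helper are hypothetical stand-ins, not Presidio classes, and the detections are made-up values. In Presidio itself you would normally pass &lt;code&gt;score_threshold&lt;/code&gt; to &lt;code&gt;analyze()&lt;/code&gt; rather than filter by hand.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Detection:
    """Hypothetical stand-in for an analyzer result (not a Presidio class)."""
    entity_type: str
    start: int
    end: int
    score: float


def filter_by_score(detections, threshold):
    """Keep only detections whose score meets or exceeds the threshold."""
    return [d for d in detections if d.score >= threshold]


# Made-up detections for illustration; the scores are not real analyzer output.
detections = [
    Detection("PERSON", 11, 21, 0.85),
    Detection("EMAIL_ADDRESS", 38, 60, 1.0),
    Detection("DATE_TIME", 0, 2, 0.30),
]

kept = filter_by_score(detections, threshold=0.5)
print([d.entity_type for d in kept])
```

&lt;p&gt;A threshold of 0 keeps everything, which is useful while you are exploring your data. Raising it trades recall for fewer false positives.&lt;/p&gt;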

&lt;h2&gt;
  
  
  Supported Entities Out of the Box
&lt;/h2&gt;

&lt;p&gt;Presidio ships with recognizers for a wide range of entity types across multiple categories.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Global entities&lt;/strong&gt; (work across languages): credit card numbers, crypto wallet addresses, email addresses, IBAN codes, IP addresses, phone numbers, URLs, domain names, dates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;US-specific&lt;/strong&gt;: Social Security numbers, bank account numbers, driver's license numbers, ITIN, passport numbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;UK-specific&lt;/strong&gt;: NHS numbers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Other regions&lt;/strong&gt;: Singapore financial numbers, Australian business numbers, and more through community recognizers.&lt;/p&gt;

&lt;p&gt;The full list is in the &lt;a href="https://microsoft.github.io/presidio/supported_entities/" rel="noopener noreferrer"&gt;Presidio supported entities documentation&lt;/a&gt;. If your entity type isn't covered, you can build custom recognizers. That's Part 3 of this series.&lt;/p&gt;

&lt;h2&gt;
  
  
  Running Your First Anonymization
&lt;/h2&gt;

&lt;p&gt;Detection is only half the job. Let's anonymize the results.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_analyzer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnalyzerEngine&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;presidio_anonymizer&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;AnonymizerEngine&lt;/span&gt;

&lt;span class="n"&gt;analyzer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnalyzerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;anonymizer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;AnonymizerEngine&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;My name is John Smith and my email is john.smith@example.com&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="c1"&gt;# Detect PII
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;analyzer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;analyze&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;language&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;en&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Anonymize with default settings (replaces with entity type labels)
&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;anonymizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;anonymize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;analyzer_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymized&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# Output: My name is &amp;lt;PERSON&amp;gt; and my email is &amp;lt;EMAIL_ADDRESS&amp;gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The default behavior replaces each detected entity with its type label wrapped in angle brackets. In Part 4 we'll dig into all the anonymization operators (replace, redact, mask, hash, encrypt) and when to use each one. For now, the point is that detection and anonymization are separate steps. You can detect without anonymizing, anonymize differently per entity type, or build a pipeline that does both in one shot.&lt;/p&gt;
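
&lt;p&gt;Because detection and anonymization are separate, the second step boils down to rewriting character spans. The following is a rough sketch in plain Python of handling each entity type differently. It is not Presidio's implementation: the &lt;code&gt;operators&lt;/code&gt; mapping, the &lt;code&gt;apply_operators&lt;/code&gt; function, and the hand-written detections are all hypothetical names chosen for illustration.&lt;/p&gt;

```python
import hashlib

text = "My name is John Smith and my email is john.smith@example.com"

# Hand-written detections as (entity_type, start, end); offsets match the text.
detections = [("PERSON", 11, 21), ("EMAIL_ADDRESS", 38, 60)]


def mask(value):
    """Replace every character with an asterisk."""
    return "*" * len(value)


def sha256_hex(value):
    """Replace the value with its SHA-256 hex digest."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical per-entity mapping: each entity type gets its own handler.
operators = {"PERSON": mask, "EMAIL_ADDRESS": sha256_hex}


def apply_operators(text, detections, operators):
    """Rewrite each detected span, working right to left so that earlier
    offsets remain valid after a replacement changes the text length."""
    ordered = sorted(detections, key=lambda d: d[1], reverse=True)
    for entity_type, start, end in ordered:
        handler = operators.get(entity_type, mask)  # default to masking
        text = text[:start] + handler(text[start:end]) + text[end:]
    return text


anonymized = apply_operators(text, detections, operators)
print(anonymized)
```

&lt;p&gt;Processing spans from the end of the string backwards is the design choice worth noting: a replacement that changes the length of the text would otherwise shift every offset that follows it.&lt;/p&gt;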

&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;p&gt;That's the foundation. Presidio installed, first detection running, and you understand what the output looks like. In Part 2, we'll go deeper on the analyzer: how the NER models, regex patterns, and context scoring work together, how to process different text types (emails, support tickets, chat logs), batch processing with presidio-structured, and image redaction with presidio-image-redactor.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This is Part 1 of the Hands-On Microsoft Presidio series. I write about PII detection, AI infrastructure, and building with Claude Code on &lt;a href="https://dev.to/bspann"&gt;Dev.to&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>presidio</category>
      <category>microsoft</category>
      <category>security</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Azure Container Apps Express: The Agent-First Platform You've Been Waiting For</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Thu, 28 May 2026 03:34:09 +0000</pubDate>
      <link>https://dev.to/bspann/azure-container-apps-express-the-agent-first-platform-youve-been-waiting-for-2opf</link>
      <guid>https://dev.to/bspann/azure-container-apps-express-the-agent-first-platform-youve-been-waiting-for-2opf</guid>
      <description>&lt;p&gt;I've been running AI workloads on Azure Container Apps for over a year. Every time I spin up a new agent backend, the ritual is the same: create an environment, configure networking, set scaling rules, wire up health probes, then deploy the actual container. For a prototype agent that might live for a week, that's too much ceremony for what you get.&lt;/p&gt;

&lt;p&gt;ACA Express, which hit public preview in May 2026, kills most of that ceremony. And a separate but related announcement, Docker Compose for Agents, brings MCP gateways and model serving to standard ACA environments. They solve different problems and run on different infrastructure, but together they cover the full spectrum of agent deployment on Azure.&lt;/p&gt;

&lt;p&gt;Let me break down both.&lt;/p&gt;

&lt;h2&gt;
  
  
  ACA Express: What It Actually Is
&lt;/h2&gt;

&lt;p&gt;Express is a new environment tier within Azure Container Apps. You bring a container image. Express handles provisioning, HTTPS, scaling (including scale-from-zero with subsecond cold starts), and resource allocation. No environment to manually provision through the portal. No networking to configure. No scaling rules to write.&lt;/p&gt;

&lt;p&gt;Under the hood, Express is built on ACA Sandboxes, a platform primitive that uses prewarmed pools to deliver that subsecond startup. This isn't the standard ACA cold-start experience with a fresh coat of paint. It's a different architecture.&lt;/p&gt;

&lt;p&gt;The tradeoffs are real. Express is HTTP workloads only, consumption CPU only. No GPU. No VNet integration. No Dapr. No service discovery between apps. No managed identity at runtime. No health probes. If you need any of those, standard ACA environments are still there. But for stateless HTTP agent backends, Express is dramatically faster to deploy and cheaper to run.&lt;/p&gt;

&lt;p&gt;Here's what it takes to get a container running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create an express environment&lt;/span&gt;
az containerapp &lt;span class="nb"&gt;env &lt;/span&gt;create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; my-express-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-my-agents &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment-mode&lt;/span&gt; express &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--logs-destination&lt;/span&gt; none

&lt;span class="c"&gt;# Deploy your app&lt;/span&gt;
az containerapp create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; my-agent-api &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-my-agents &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment&lt;/span&gt; my-express-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image&lt;/span&gt; mcr.microsoft.com/k8se/quickstart:latest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target-port&lt;/span&gt; 80 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ingress&lt;/span&gt; external &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-replicas&lt;/span&gt; 0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-replicas&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your app is running in seconds. Not minutes. Seconds.&lt;/p&gt;

&lt;p&gt;Express also has its own portal experience at &lt;a href="https://containerapps.azure.com" rel="noopener noreferrer"&gt;containerapps.azure.com&lt;/a&gt;, separate from the Azure portal. If you're using the portal, you don't even need to create the environment yourself. It handles that automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "Agent-First" Is the Right Framing
&lt;/h2&gt;

&lt;p&gt;Microsoft is explicitly positioning Express for two audiences: developers who want to ship fast, and AI agents that deploy endpoints on demand. That second audience is the interesting one.&lt;/p&gt;

&lt;p&gt;Think about how modern agent architectures work. An orchestrator spins up tool-use APIs, runs them for the duration of a task, and tears them down. The infrastructure needs to provision fast, scale from zero, and cost nothing when idle. That's exactly the Express model.&lt;/p&gt;

&lt;p&gt;The platform is designed for MCP servers, tool-use endpoints, multi-step workflow APIs, and human-in-the-loop UIs that agents spin up dynamically. Scale-from-zero with subsecond cold starts means you're not paying for agent backends that aren't actively serving requests. And when a request does come in, the agent is ready almost instantly instead of waiting through a cold start.&lt;/p&gt;

&lt;h2&gt;
  
  
  Docker Compose for Agents: A Separate (and Complementary) Feature
&lt;/h2&gt;

&lt;p&gt;Here's where a lot of early coverage got confused, and where I got it wrong in my first draft of this post. Docker Compose for Agents is not an Express feature. It deploys to standard ACA environments with workload profiles, not to Express.&lt;/p&gt;

&lt;p&gt;Why? Because Compose for Agents supports GPU model serving, MCP gateway containers, sidecar processes, and multi-service stacks. All of those require capabilities that Express doesn't have (workload profiles, service discovery, sidecars). Different tool for a different job.&lt;/p&gt;

&lt;p&gt;What Compose for Agents does is let you take the same &lt;code&gt;compose.yml&lt;/code&gt; you use locally for development and deploy it directly to ACA. The CLI translates compose services into Container Apps resources automatically.&lt;/p&gt;

&lt;p&gt;Here's what a compose file looks like for an agent stack:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;my-agent-app&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;build&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;.&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;
    &lt;span class="na"&gt;environment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;MCP_GATEWAY_URL=${MCP_GATEWAY_URL}&lt;/span&gt;

  &lt;span class="na"&gt;mcp-gateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;docker/mcp-gateway&lt;/span&gt;
    &lt;span class="na"&gt;x-azure-deployment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;acateam.azurecr.io/preview-ai-compose/mcp-gateway:latest&lt;/span&gt;

&lt;span class="na"&gt;models&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;gemma&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ai/gemma3-qat&lt;/span&gt;
    &lt;span class="na"&gt;x-azure-deployment&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;workloadProfiles&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;workloadProfileType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Consumption-GPU-NC8as-T4&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;x-azure-deployment&lt;/code&gt; directive is the bridge between local and cloud. Docker ignores it locally. ACA uses it during deployment. Same file, both environments.&lt;/p&gt;

&lt;p&gt;What the CLI creates behind the scenes:&lt;/p&gt;

&lt;p&gt;Your agent app as a Container App with ingress. An MCP gateway running as its own Container App with managed identity, dynamically managing MCP tool containers. Model serving via Docker's model runner on serverless GPU. The MCP gateway handles stdio-to-SSE translation, so your MCP servers run as standard Container Apps without modification.&lt;/p&gt;

&lt;p&gt;To deploy it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Install the preview CLI extension&lt;/span&gt;
az extension remove &lt;span class="nt"&gt;--name&lt;/span&gt; containerapp
az extension add &lt;span class="nt"&gt;--source&lt;/span&gt; &lt;span class="s2"&gt;"&amp;lt;preview-extension-url&amp;gt;"&lt;/span&gt; &lt;span class="nt"&gt;--yes&lt;/span&gt;

&lt;span class="c"&gt;# Deploy your compose file to a standard ACA environment&lt;/span&gt;
az containerapp compose create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--compose-file-path&lt;/span&gt; compose.yml &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-my-agents &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment&lt;/span&gt; my-standard-env
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice that &lt;code&gt;--environment&lt;/code&gt; flag. This deploys to a standard ACA environment, not Express. That's the distinction.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Each Fits in the Azure AI Stack
&lt;/h2&gt;

&lt;p&gt;The Azure AI hosting landscape has gotten crowded. Here's how I think about the options as someone who's deployed on most of them:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Azure AI Foundry&lt;/strong&gt; is for when you want managed model endpoints with built-in safety, content filtering, and enterprise governance. You're consuming models, not hosting infrastructure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ACA Standard&lt;/strong&gt; is for when you need GPU workloads (self-hosted Ollama, vLLM), microservices with Dapr, VNet isolation, or any enterprise feature that Express doesn't have yet. This is also where Docker Compose for Agents deploys.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ACA Express&lt;/strong&gt; is for fast, cheap, stateless agent backends. Prototypes, MCP servers, tool-use APIs, webhook handlers, agent orchestrators that don't need GPU compute.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ACA Dynamic Sessions&lt;/strong&gt; is for sandboxed execution of AI-generated code. Hyper-V isolated, millisecond provisioning, MCP-integrated.&lt;/p&gt;

&lt;p&gt;Express isn't replacing anything. It's filling the gap for lightweight agent infrastructure that's too simple for standard ACA but too complex for a serverless function.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Missing in Express (For Now)
&lt;/h2&gt;

&lt;p&gt;This is a public preview, and the supported feature list reflects that. The "No" column is long:&lt;/p&gt;

&lt;p&gt;No secrets management (no Key Vault integration). No managed identity at app runtime. No health probes. No custom domains or managed certificates. No VNet integration. No CORS, session affinity, or sidecar containers. No OpenTelemetry. No autoscaling rules (KEDA). Region-limited to West Central US and East Asia.&lt;/p&gt;

&lt;p&gt;For production agent backends, these gaps matter. No managed identity means you're passing credentials through environment variables. No health probes means you're trusting the platform's defaults. No secrets means API keys sit in plain text config.&lt;/p&gt;

&lt;p&gt;But for prototypes, internal tools, and agent backends in active development? These limitations are acceptable tradeoffs for the provisioning speed and cost model. And Microsoft is shipping features on what they describe as a "rapid cadence" through the preview period.&lt;/p&gt;

&lt;h2&gt;
  
  
  When to Use Which
&lt;/h2&gt;

&lt;p&gt;If you're building a lightweight agent backend, an MCP server, or a tool-use API that handles HTTP requests and doesn't need GPU, go Express. You'll have a running endpoint in seconds with zero infrastructure decisions.&lt;/p&gt;

&lt;p&gt;If you're building a full agent stack with model serving, an MCP gateway coordinating multiple tool containers, and GPU workloads, use Docker Compose for Agents on standard ACA. The compose file gives you local-to-cloud parity and the workload profiles give you the compute you need.&lt;/p&gt;

&lt;p&gt;If you need both, use both. Express for the lightweight endpoints, standard ACA for the heavy lifting. They run on the same platform and can coexist in the same resource group.&lt;/p&gt;

</description>
      <category>azure</category>
      <category>ai</category>
      <category>containers</category>
      <category>devops</category>
    </item>
    <item>
      <title>BMAD Method + Claude Code: How I Actually Ship Projects with Spec-Driven AI Development</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Wed, 27 May 2026 02:29:17 +0000</pubDate>
      <link>https://dev.to/bspann/bmad-method-claude-code-how-i-actually-ship-projects-with-spec-driven-ai-development-1eei</link>
      <guid>https://dev.to/bspann/bmad-method-claude-code-how-i-actually-ship-projects-with-spec-driven-ai-development-1eei</guid>
      <description>&lt;p&gt;I vibe-coded my way through three months of Claude Code projects before I admitted something was off. The code worked, mostly, but I kept losing hours to the same problem: Claude and I would drift from the original intent mid-session, and by session two or three, neither of us remembered why we'd made half the decisions in the codebase.&lt;/p&gt;

&lt;p&gt;I'd been watching the BMAD Method since v3 introduced its orchestrator concept, but it felt like overhead I didn't need. Then v4 landed with a real architectural overhaul (NPM distribution, modular agents, multi-IDE support) and I gave it a real shot. It clicked almost immediately. I've spent years working on teams with PMs, architects, scrum masters, QA. The full SDLC cast. BMAD maps those same roles onto AI agents, so the workflow felt familiar instead of foreign. I wasn't learning a new process. I was running the one I already knew, just with different team members. That was roughly nine months ago. I don't build without it now.&lt;/p&gt;

&lt;h2&gt;
  
  
  BMAD in 60 Seconds
&lt;/h2&gt;

&lt;p&gt;BMAD (Breakthrough Method for Agile AI-Driven Development) is an open-source framework that structures AI-assisted coding around specifications, role-based agents, and phased workflows. The spec is the source of truth. Code is the output.&lt;/p&gt;

&lt;p&gt;As of v6, the project has 19+ specialized agents (PM, Architect, Scrum Master, Developer, QA, and others), 50+ named workflows, and a module system that hooks into Claude Code natively through skills, commands, and hooks. It's crossed 40,000 GitHub stars and the ecosystem has spawned several third-party Claude Code plugins.&lt;/p&gt;

&lt;p&gt;Numbers aside, does it actually change how you work? For me, yes. Substantially.&lt;/p&gt;

&lt;h2&gt;
  
  
  How I Set It Up
&lt;/h2&gt;

&lt;p&gt;I use BMAD across several active projects. Bridgely, CoinFolio, FiveCrowns, and Vela are all built this way. Different domains, same workflow.&lt;/p&gt;

&lt;h3&gt;
  
  
  On CLAUDE.md
&lt;/h3&gt;

&lt;p&gt;A lot of BMAD guides will tell you to set up a &lt;code&gt;CLAUDE.md&lt;/code&gt; file alongside it. I actually don't bother. BMAD's own agent configurations, skills, and workflow definitions carry enough context on their own. Adding a CLAUDE.md on top of that is redundant at best, and at worst you end up with conflicting instructions. Your CLAUDE.md says one thing, BMAD's agent config says another, and Claude Code picks whichever it sees last.&lt;/p&gt;

&lt;p&gt;I keep project-level conventions (file naming, directory structure, don't-commit-secrets type rules) in BMAD's own config. One source of truth, not two.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Actual Workflow
&lt;/h3&gt;

&lt;p&gt;A feature goes through four phases. I'll walk through what this looks like on a real task, building a Dev.to publishing skill for my blog pipeline.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spec first.&lt;/strong&gt; I describe what I want. BMAD's PM agent writes a PRD. The Architect agent reviews it and produces a technical design. Both end up as markdown files in &lt;code&gt;docs/specs/&lt;/code&gt; that persist across sessions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;gt; "I need a skill that publishes blog posts to Dev.to via their API,
&amp;gt;  handles draft mode, and manages frontmatter validation."
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The PM agent gives me user stories, acceptance criteria, and scope boundaries. The Architect maps that to file structure, dependencies, and integration points. No code yet. Just the blueprint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Story breakdown.&lt;/strong&gt; The Scrum Master agent splits the spec into implementable stories, each with clear done-criteria, each in its own file under &lt;code&gt;docs/stories/&lt;/code&gt;. This is the part that replaced my old habit of writing one giant implementation prompt. Smaller chunks mean each piece is actually testable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Implementation.&lt;/strong&gt; Claude Code writes code against the spec and story files, not against a vague prompt I typed twenty minutes ago. The Dev agent pulls in the story file, the architecture doc, and project conventions from BMAD's config. Decisions trace back to a spec instead of disappearing into chat history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Validation.&lt;/strong&gt; The QA agent checks work against acceptance criteria, runs tests, and flags gaps. Vibe-coding skips this step entirely, which is exactly why vibe-coded projects accumulate the kind of debt they do.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's Different Now
&lt;/h2&gt;

&lt;p&gt;The context-drift problem I mentioned at the top? Gone. Spec files give Claude Code something persistent to anchor to, so I'm not re-explaining decisions from last Tuesday. Features that used to take two or three sessions now finish in one because the spec does the remembering.&lt;/p&gt;

&lt;p&gt;The other shift was subtler. I used to treat "it runs" as done. Now done means the QA agent signed off against acceptance criteria. It sounds like a small distinction but it changes how much rework I do later. A lot less.&lt;/p&gt;

&lt;p&gt;Refactoring got easier too. When you need to restructure something, having the original &lt;em&gt;intent&lt;/em&gt; documented next to the &lt;em&gt;implementation&lt;/em&gt; means you can tell Claude Code what the code was supposed to do, not just what it currently does.&lt;/p&gt;

&lt;p&gt;I won't overstate it. The biggest improvement isn't raw speed. It's that I can predict what I'm going to get at the end of a session, because I defined it before I started.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where It Falls Short
&lt;/h2&gt;

&lt;p&gt;Context window pressure is real. On bigger projects, BMAD specs plus architecture docs plus story files eat context fast. I've gotten better at keeping specs concise, but there's a tension between "enough detail to be useful" and "not so much that Claude Code forgets the beginning by the time it reads the end."&lt;/p&gt;

&lt;p&gt;Agent handoffs can be rough. The Architect agent sometimes makes assumptions that don't line up with what the PM agent specified. I've started adding explicit handoff checklists in my story files to catch this, but it's a manual workaround for what should probably be a tighter integration.&lt;/p&gt;

&lt;p&gt;And for small stuff (a typo fix, a CSS tweak, a one-line config change) the full BMAD workflow is overkill. I skip it for anything that touches fewer than three files or doesn't involve a real design decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trying It Out
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://github.com/bmad-code-org/BMAD-METHOD" rel="noopener noreferrer"&gt;GitHub repo&lt;/a&gt; has install instructions. My advice: start with BMM Core (the base module) and don't install everything at once. Pick a real feature on a real project and spec it before you write any code.&lt;/p&gt;

&lt;p&gt;The thing that took me longest to internalize is that the process matters more than the prompts. I spent months tweaking how I asked Claude Code to do things. BMAD shifted that energy toward defining &lt;em&gt;what&lt;/em&gt; I wanted Claude Code to build, and the prompts mostly took care of themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  Spec-Driven Dev Is Bigger Than BMAD
&lt;/h2&gt;

&lt;p&gt;BMAD isn't the only framework pushing this direction. Kiro, GSD, and RALPH-LOOP are all built on variations of the same thesis: AI-generated code is only as good as the structure you feed it.&lt;/p&gt;

&lt;p&gt;BMAD works for me because it maps directly onto Claude Code's extension model. Skills, hooks, commands. It's not a wrapper around Claude Code. It's a playbook for the tools Claude Code already has.&lt;/p&gt;




</description>
      <category>bmad</category>
      <category>claudecode</category>
      <category>ai</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Test Architect (TEA): AI-Driven Testing That Doesn't Rot (Part 5)</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Wed, 20 May 2026 01:23:33 +0000</pubDate>
      <link>https://dev.to/bspann/test-architect-tea-ai-driven-testing-that-doesnt-rot-part-5-1b4b</link>
      <guid>https://dev.to/bspann/test-architect-tea-ai-driven-testing-that-doesnt-rot-part-5-1b4b</guid>
      <description>&lt;h2&gt;
  
  
  Test Architect (TEA): AI-Driven Testing That Doesn't Rot
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Part 5 of the BMAD-Method series&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;We've covered the core framework, workflows, custom agents with BMad Builder, and the Creative Intelligence Suite. There's one piece of the BMAD ecosystem we haven't touched yet, and it solves the problem I hear about most from teams using AI for development: the tests are garbage.&lt;/p&gt;

&lt;p&gt;Not "they don't run" garbage. They run fine. They pass. They look reasonable in a PR review. Then three sprints later, half of them are flaky, a quarter test implementation details instead of behavior, and nobody trusts the suite enough to block a deploy on it. The tests &lt;em&gt;rotted&lt;/em&gt; — not because anyone wrote bad code, but because the AI that generated them had no testing strategy. It just wrote assertions that matched the current behavior.&lt;/p&gt;

&lt;p&gt;TEA (Test Engineering Architect) is BMAD's answer to that problem. It's a module that brings the same structured, workflow-driven approach we use for product management and architecture to the testing side. Nine workflows covering everything from risk-based test planning to release gate decisions.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem TEA Solves
&lt;/h2&gt;

&lt;p&gt;Ask any AI coding tool to "write tests for this component" and you'll get tests. Lots of tests. They'll have descriptive names, reasonable assertions, and they'll pass on the first run. Ship it, right?&lt;/p&gt;

&lt;p&gt;Here's what goes wrong. The AI doesn't know which parts of your system are high-risk and need deep coverage versus which parts are stable and need a smoke test. It doesn't know that your checkout flow handles real money and needs different test rigor than your settings page. It doesn't build fixtures that compose cleanly or follow network-first patterns that eliminate flakiness. It just generates test code that looks like test code.&lt;/p&gt;

&lt;p&gt;TEA's thesis is that testing is an engineering discipline, not a code generation task. Before you write a single test, you should know what's risky, what the priorities are, and what "good enough" coverage looks like for this specific feature. TEA provides that structure through a knowledge base of testing patterns and a set of workflows that guide you from planning through execution to release decisions.&lt;/p&gt;




&lt;h2&gt;
  
  
  What TEA Actually Is
&lt;/h2&gt;

&lt;p&gt;TEA is a BMAD module — you install it the same way you install any other BMAD component. It adds a specialized agent persona (Murat, the Test Architect) and nine workflows that cover the full testing lifecycle.&lt;/p&gt;

&lt;p&gt;The workflows span BMAD's phases:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 3 (Solutioning)&lt;/strong&gt; — system-level test design, framework scaffolding, CI pipeline setup. This is where you answer "how do we test this system?" before anyone writes implementation code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 4 (Implementation)&lt;/strong&gt; — per-epic test design, ATDD (writing failing tests before code), test automation, test review, and traceability. This is where tests get written, reviewed, and mapped to requirements.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Release Gate&lt;/strong&gt; — NFR assessment and the trace workflow's gate decision (PASS / CONCERNS / FAIL / WAIVED). This is where you decide whether the build ships.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Nine Workflows
&lt;/h2&gt;

&lt;p&gt;Here's what each one does and when you'd use it:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workflow&lt;/th&gt;
&lt;th&gt;Trigger&lt;/th&gt;
&lt;th&gt;What It Does&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Teach Me Testing&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TMT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Interactive 7-session learning path — fundamentals through advanced&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Framework Setup&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TF&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Scaffolds Playwright or Cypress with config, fixtures, and sample structure&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CI Pipeline&lt;/td&gt;
&lt;td&gt;&lt;code&gt;CI&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generates CI workflow with selective test scripts and secrets checklist&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test Design&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TD&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Risk-based test planning with P0–P3 prioritization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ATDD&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AT&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generates failing acceptance tests before implementation (red phase TDD)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Automate&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TA&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generates tests for existing features with fixture composition&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Test Review&lt;/td&gt;
&lt;td&gt;&lt;code&gt;RV&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Audits test quality against the knowledge base, scores 0–100&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trace&lt;/td&gt;
&lt;td&gt;&lt;code&gt;TR&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Maps tests to requirements, generates coverage matrix, makes gate decisions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;NFR Assessment&lt;/td&gt;
&lt;td&gt;&lt;code&gt;NR&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Evaluates non-functional requirements — security, performance, reliability&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two workflows deserve special attention.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Test Design&lt;/strong&gt; is the backbone. It produces a risk assessment using probability × impact scoring, then generates a prioritized test plan. P0 items are critical path — if these fail, users can't use the product. P3 items are edge cases that matter but won't block a release. This prioritization is what prevents the "generate 200 tests and hope for the best" approach. You know exactly where to invest testing effort.&lt;/p&gt;
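
&lt;p&gt;To make the probability × impact idea concrete, here is a minimal Python sketch. It is an illustration only, not TEA's implementation: the 1 to 3 rating scale, the &lt;code&gt;Risk&lt;/code&gt; class, the &lt;code&gt;plan&lt;/code&gt; helper, and the score-to-priority table are all assumptions made for the example.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from dataclasses import dataclass

# Assumed scale: probability and impact are each rated 1 (low) to 3 (high),
# so the score is one of 1, 2, 3, 4, 6, 9. This score-to-priority table is a
# hypothetical mapping for illustration, not TEA's actual thresholds.
PRIORITY_BY_SCORE = {9: "P0", 6: "P1", 4: "P2", 3: "P2", 2: "P3", 1: "P3"}


@dataclass
class Risk:
    feature: str
    probability: int  # 1-3: how likely a defect is
    impact: int       # 1-3: how much a defect would hurt

    @property
    def score(self):
        return self.probability * self.impact

    @property
    def priority(self):
        return PRIORITY_BY_SCORE[self.score]


def plan(risks):
    """Return the risks ordered from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


risks = [
    Risk("clear completed todos", probability=2, impact=2),
    Risk("create and display todos", probability=3, impact=3),
    Risk("theme toggle", probability=1, impact=1),
]

for risk in plan(risks):
    print(risk.priority, risk.score, risk.feature)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these sample ratings the critical-path feature lands at P0 and the cosmetic toggle at P3, which is the ordering a prioritized plan is meant to surface before any test code gets written.&lt;/p&gt;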

&lt;p&gt;&lt;strong&gt;Trace&lt;/strong&gt; is the closer. It's a two-phase workflow: Phase 1 builds a traceability matrix mapping tests to requirements, and Phase 2 makes a gate decision. The gate isn't just "did the tests pass" — it evaluates coverage gaps, risk areas without tests, and NFR compliance. The output is a YAML artifact you can attach to your release process.&lt;/p&gt;
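
&lt;p&gt;The two phases can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not TEA's actual logic: the requirement IDs, the test names, and the decision rules are hypothetical, and the sketch only looks at coverage gaps and test results, leaving NFR compliance out.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Hypothetical traceability data: requirement id mapped to
# (priority, names of the tests that cover it).
REQUIREMENTS = {
    "REQ-1": ("P0", ["test_create_todo", "test_list_todos"]),
    "REQ-2": ("P1", ["test_edit_todo"]),
    "REQ-3": ("P2", []),
}


def uncovered(requirements):
    """Phase 1: group requirement ids that have no mapped tests by priority."""
    gaps = {}
    for req_id, (priority, tests) in requirements.items():
        if not tests:
            gaps.setdefault(priority, []).append(req_id)
    return gaps


def gate_decision(requirements, tests_passed, waived=False):
    """Phase 2: turn coverage gaps and test results into a gate decision.

    The rules here are assumptions chosen for the example.
    """
    if waived:
        return "WAIVED"  # someone explicitly accepted the remaining risk
    gaps = uncovered(requirements)
    if not tests_passed or "P0" in gaps:
        return "FAIL"  # failing tests or an untested critical requirement
    if gaps:
        return "CONCERNS"  # only non-critical requirements lack tests
    return "PASS"


print(uncovered(REQUIREMENTS))
print(gate_decision(REQUIREMENTS, tests_passed=True))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With the sample data every test passes, but the P2 requirement has no test mapped to it, so the sketch reports CONCERNS rather than PASS. That is the difference between a gate and a green checkmark.&lt;/p&gt;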




&lt;h2&gt;
  
  
  Getting Started: Zero to Passing Tests in 30 Minutes
&lt;/h2&gt;

&lt;p&gt;TEA has five engagement models — you don't have to go all-in. Here's the fastest path.&lt;/p&gt;

&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx bmad-method &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;span class="c"&gt;# Select: Test Architect (TEA)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Load the Agent
&lt;/h3&gt;

&lt;p&gt;In your AI coding tool (Claude Code, Cursor, Windsurf, etc.):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;bmad-tea
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loads the TEA agent with its menu of workflows.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaffold Your Test Framework
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;framework
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;TEA asks about your stack (React? Node? What test runner?) and generates a production-ready Playwright or Cypress scaffold — config, directory structure, fixtures, &lt;code&gt;.env.example&lt;/code&gt;, the works. Not a toy starter template. The generated structure follows TEA's knowledge base patterns for fixture architecture and network-first testing.&lt;/p&gt;

&lt;h3&gt;
  
  
  Create a Test Design
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;test-design
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tell TEA what you're testing and it produces a risk assessment with P0–P3 priorities. For a TodoMVC-style app, the output might flag "creating and displaying todos" as P0 (critical path) and "clearing completed todos" as P2 (medium value). Each priority level gets specific test scenarios.&lt;/p&gt;

&lt;h3&gt;
  
  
  Generate Tests
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;automate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Point it at your test design document and TEA generates tests that follow the priorities. P0 scenarios get thorough coverage. P3 scenarios get a smoke test. The generated code uses the fixture patterns from TEA's knowledge base — composable fixtures, network interception before navigation, explicit assertions instead of snapshot comparisons.&lt;/p&gt;

&lt;h3&gt;
  
  
  Run
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx playwright &lt;span class="nb"&gt;test&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the TEA Lite path. You used three workflows (&lt;code&gt;framework&lt;/code&gt;, &lt;code&gt;test-design&lt;/code&gt;, &lt;code&gt;automate&lt;/code&gt;), and you have a test suite that was designed before it was generated. The risk assessment stays with the project as documentation — when someone asks "why do we test X but not Y?" the test design document has the answer.&lt;/p&gt;




&lt;h2&gt;
  
  
  How TEA Fits Into Full BMAD Projects
&lt;/h2&gt;

&lt;p&gt;If you're running the full BMAD workflow (Parts 1–4 of this series), TEA plugs into specific phases:&lt;/p&gt;

&lt;p&gt;After your architect produces the architecture and ADRs in Phase 3, run &lt;code&gt;test-design&lt;/code&gt; in system-level mode. This produces two documents: one for the architecture team (testability gaps, validation of architecturally significant requirements, or ASRs) and one for QA (test execution recipe, coverage plan, Sprint 0 setup). Both feed into the implementation-readiness gate.&lt;/p&gt;

&lt;p&gt;During Phase 4, each epic gets its own &lt;code&gt;test-design&lt;/code&gt; run. Then for each story: optionally run &lt;code&gt;atdd&lt;/code&gt; to generate failing acceptance tests before development, run &lt;code&gt;automate&lt;/code&gt; after the feature is built, and optionally run &lt;code&gt;test-review&lt;/code&gt; to audit quality. The &lt;code&gt;trace&lt;/code&gt; workflow refreshes the coverage matrix as tests accumulate.&lt;/p&gt;

&lt;p&gt;At the release gate, &lt;code&gt;trace&lt;/code&gt; Phase 2 evaluates everything and produces a PASS/CONCERNS/FAIL/WAIVED decision with evidence.&lt;/p&gt;

&lt;p&gt;You don't have to use every workflow. Plenty of teams start with just &lt;code&gt;automate&lt;/code&gt; and add the planning workflows as they see the value. TEA is designed to be adopted incrementally.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Makes TEA's Tests Different
&lt;/h2&gt;

&lt;p&gt;Three things separate TEA-generated tests from "just ask the AI to write tests":&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Risk-based prioritization.&lt;/strong&gt; Tests aren't generated uniformly. High-risk features get deep coverage. Low-risk features get a smoke test. This matches how experienced test architects actually think — you don't spend the same effort testing a payment flow and a color theme toggle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Knowledge base patterns.&lt;/strong&gt; TEA carries a knowledge base of 42 testing fragments covering fixture architecture, network-first patterns, step-file organization, and quality standards. Every generated test follows these patterns. The fixture architecture alone — pure function → fixture → composition — prevents the most common source of test rot: fixtures that are coupled to implementation details.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Network-first approach.&lt;/strong&gt; Instead of using &lt;code&gt;page.waitForTimeout(2000)&lt;/code&gt; or hoping the page loads fast enough, TEA's patterns intercept network calls before navigating. Tests wait for actual responses, not arbitrary delays. This is the single biggest factor in eliminating flakiness.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enterprise and Brownfield Support
&lt;/h2&gt;

&lt;p&gt;TEA handles more than greenfield projects.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;brownfield projects&lt;/strong&gt; (existing codebase, existing tests), start with &lt;code&gt;trace&lt;/code&gt; to baseline your current coverage. TEA maps what's tested and what's not, identifies regression hotspots, and focuses the test design on the areas where new work intersects with existing risk. You don't throw away your existing tests — you improve them incrementally.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;enterprise projects&lt;/strong&gt; with compliance requirements, TEA's &lt;code&gt;nfr-assess&lt;/code&gt; workflow captures security, performance, and reliability requirements early. The release gate produces audit-trail artifacts that map to SOC 2 and HIPAA evidence requirements.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The fastest way to see TEA in action:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx bmad-method &lt;span class="nb"&gt;install&lt;/span&gt;  &lt;span class="c"&gt;# Select TEA&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then in your AI coding tool: &lt;code&gt;bmad-tea&lt;/code&gt; → &lt;code&gt;framework&lt;/code&gt; → &lt;code&gt;test-design&lt;/code&gt; → &lt;code&gt;automate&lt;/code&gt; → &lt;code&gt;npx playwright test&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Thirty minutes. Risk-based test plan plus passing tests. No test rot.&lt;/p&gt;

&lt;p&gt;Full documentation: &lt;a href="https://bmad-code-org.github.io/bmad-method-test-architecture-enterprise/" rel="noopener noreferrer"&gt;TEA Docs&lt;/a&gt;&lt;br&gt;
GitHub: &lt;a href="https://github.com/bmad-code-org/bmad-method-test-architecture-enterprise" rel="noopener noreferrer"&gt;bmad-code-org/bmad-method-test-architecture-enterprise&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Using TEA on a real project? I'd love to hear how the risk-based approach compares to your previous testing workflow — drop it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>testing</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>Building a C# Agent with Microsoft Agent Framework and Ollama</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Tue, 21 Apr 2026 22:01:24 +0000</pubDate>
      <link>https://dev.to/bspann/building-a-c-agent-with-microsoft-agent-framework-and-ollama-26m4</link>
      <guid>https://dev.to/bspann/building-a-c-agent-with-microsoft-agent-framework-and-ollama-26m4</guid>
      <description>&lt;h2&gt;
  
  
  Building a C# Agent with Microsoft Agent Framework and Ollama
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Part 3 of "Running LLMs &amp;amp; Agents on Azure Container Apps"&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;We've got Ollama running in Azure Container Apps with persistent storage and secure access. Now let's write an agent that talks to it.&lt;/p&gt;

&lt;p&gt;Two weeks ago, Microsoft shipped Agent Framework 1.0 -- the production-ready successor to both Semantic Kernel and AutoGen. Same team, dramatically simpler API. If you've been building agents with Semantic Kernel's &lt;code&gt;ChatCompletionAgent&lt;/code&gt; and &lt;code&gt;Kernel&lt;/code&gt; objects, the new framework strips away most of that ceremony. You get an agent in three lines of code instead of fifteen.&lt;/p&gt;

&lt;p&gt;I rewrote my Ollama agent code the week it shipped. This post walks through what that looks like.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Agent Framework Over Semantic Kernel
&lt;/h2&gt;

&lt;p&gt;I used Semantic Kernel for everything up until this month. It's solid, and I still have projects running on it. But Agent Framework fixes the things that always bothered me.&lt;/p&gt;

&lt;p&gt;In Semantic Kernel, every agent needs a &lt;code&gt;Kernel&lt;/code&gt; instance. You build a kernel, configure providers, register plugins, then pass the kernel to the agent. It's a lot of plumbing for what amounts to "talk to this model and call these functions." Agent Framework collapses that into a single extension method. You take your chat client -- whatever provider -- and call &lt;code&gt;.AsAIAgent()&lt;/code&gt;. Done.&lt;/p&gt;

&lt;p&gt;Tool registration is the other big improvement. Semantic Kernel requires &lt;code&gt;[KernelFunction]&lt;/code&gt; attributes on every method, a plugin class, and a kernel to register it on. Agent Framework uses &lt;code&gt;AIFunctionFactory.Create()&lt;/code&gt; to wrap any C# method as a tool. You pass your tools directly when you create the agent. No attributes, no plugin classes, no kernel.&lt;/p&gt;

&lt;p&gt;The underlying model abstraction is &lt;code&gt;Microsoft.Extensions.AI&lt;/code&gt;, which means any provider that implements &lt;code&gt;IChatClient&lt;/code&gt; works. Ollama, Azure OpenAI, OpenAI, Anthropic -- same agent code, different client. That portability is why I chose this stack for the series.&lt;/p&gt;

&lt;p&gt;A note on Semantic Kernel: Microsoft will keep maintaining it and fixing bugs, but new features go into Agent Framework. If you're starting fresh, start here. If you have Semantic Kernel code running in production, there's no rush to migrate -- but new projects should use the new framework.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project Setup
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;dotnet new console &lt;span class="nt"&gt;-n&lt;/span&gt; OllamaAgent
&lt;span class="nb"&gt;cd &lt;/span&gt;OllamaAgent
dotnet add package Microsoft.Agents.AI &lt;span class="nt"&gt;--prerelease&lt;/span&gt;
dotnet add package Microsoft.Extensions.AI.Ollama &lt;span class="nt"&gt;--prerelease&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two packages. &lt;code&gt;Microsoft.Agents.AI&lt;/code&gt; is the agent framework itself. &lt;code&gt;Microsoft.Extensions.AI.Ollama&lt;/code&gt; is the first-party Ollama connector built on the &lt;code&gt;IChatClient&lt;/code&gt; abstraction. Both are installed with &lt;code&gt;--prerelease&lt;/code&gt; because the NuGet packages shipped as 1.0.0-preview while the framework itself is GA. Microsoft does this sometimes. The APIs are stable.&lt;/p&gt;




&lt;h2&gt;
  
  
  Your First Agent
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;OllamaChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://your-ollama.azurecontainerapps.io"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"llama3:8b"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are a helpful assistant running on self-hosted infrastructure."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"What is Azure Container Apps?"&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole thing. Three meaningful lines: create a client, make it an agent, run it. The URI passed to &lt;code&gt;OllamaChatClient&lt;/code&gt; is the internal FQDN of your Ollama container app from Part 2. If your code runs in the same ACA environment (which it will in Part 4), it reaches Ollama directly over the internal network.&lt;/p&gt;

&lt;p&gt;Compare that to the Semantic Kernel equivalent, which needs &lt;code&gt;Kernel.CreateBuilder()&lt;/code&gt;, &lt;code&gt;AddOllamaChatCompletion()&lt;/code&gt;, &lt;code&gt;builder.Build()&lt;/code&gt;, then &lt;code&gt;kernel.InvokePromptAsync()&lt;/code&gt;. Same result, twice the ceremony.&lt;/p&gt;




&lt;h2&gt;
  
  
  Swappable Backends
&lt;/h2&gt;

&lt;p&gt;This is the pattern I use on every project. Configure a local backend for development and a cloud backend for production, then let a single flag decide which one runs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Azure.AI.OpenAI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Azure&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="nf"&gt;CreateAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="n"&gt;useLocal&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;IChatClient&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;useLocal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;OllamaChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetEnvironmentVariable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"OLLAMA_URL"&lt;/span&gt;&lt;span class="p"&gt;)!),&lt;/span&gt;
            &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"llama3:8b"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AzureOpenAIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetEnvironmentVariable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AZURE_OPENAI_ENDPOINT"&lt;/span&gt;&lt;span class="p"&gt;)!),&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AzureKeyCredential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;Environment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetEnvironmentVariable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"AZURE_OPENAI_KEY"&lt;/span&gt;&lt;span class="p"&gt;)!))&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpt-4"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are a helpful technical assistant."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In development, &lt;code&gt;useLocal&lt;/code&gt; is true. In production, it's false. Your agent instructions, tools, and orchestration stay identical. You're only changing the inference backend.&lt;/p&gt;

&lt;p&gt;This pays off in ways beyond the obvious cost savings. You can run your full test suite against a local model in CI/CD without API charges. You can demo at a conference or customer site without depending on network connectivity. I've done both.&lt;/p&gt;
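
&lt;p&gt;The flag itself has to come from somewhere. As a minimal sketch, assuming a hypothetical &lt;code&gt;USE_LOCAL_MODEL&lt;/code&gt; environment variable (a name invented here for illustration, not something the framework defines), you could parse it once at startup and pass the result to &lt;code&gt;CreateAgent&lt;/code&gt;:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

// USE_LOCAL_MODEL is a hypothetical variable name chosen for this sketch.
// bool.TryParse returns false for null or unparseable input, so an unset
// variable falls back to the cloud backend, matching CreateAgent's default.
static bool ShouldUseLocal(string? raw) =&amp;gt;
    bool.TryParse(raw, out var parsed) &amp;amp;&amp;amp; parsed;

bool useLocal = ShouldUseLocal(Environment.GetEnvironmentVariable("USE_LOCAL_MODEL"));
Console.WriteLine(useLocal ? "Backend: local (Ollama)" : "Backend: cloud (Azure OpenAI)");

// Then, with the CreateAgent function shown earlier:
// var agent = CreateAgent(useLocal);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Falling back to the cloud backend when the variable is missing is a design choice; flip it if you would rather default to local inference.&lt;/p&gt;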




&lt;h2&gt;
  
  
  Multi-Turn Conversations
&lt;/h2&gt;

&lt;p&gt;Agent Framework introduces sessions for managing conversation state. Each session tracks its own message history.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are a technical advisor for Azure deployments."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Create a session for a multi-turn conversation&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateSessionAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// First turn&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response1&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"I need to deploy a containerized ML model on Azure."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Second turn -- the agent remembers the context&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response2&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"What about GPU support?"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The session handles all the chat history management. In Semantic Kernel, you'd create a &lt;code&gt;ChatHistory&lt;/code&gt; object, manually append messages, and pass it around. Here, the session does that behind the scenes. You can also serialize sessions to JSON for persistence, which is useful when you need conversations that survive container restarts.&lt;/p&gt;




&lt;h2&gt;
  
  
  Adding Tools (Function Calling)
&lt;/h2&gt;

&lt;p&gt;Tools are where agents stop being chatbots and start doing useful work. Agent Framework makes tool registration dead simple compared to Semantic Kernel.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.ComponentModel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Just a regular C# method -- no [KernelFunction] attribute needed&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Gets the current weather for a location"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;GetWeather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// In production, this calls a weather API&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;$"Weather in &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;location&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: 72°F, Sunny"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Looks up an Azure resource's current status"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;CheckResourceStatus&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;resourceName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;$"&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resourceName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: Running, 0 errors in last 24h"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are an operations assistant with access to weather and Azure monitoring tools."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GetWeather&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CheckResourceStatus&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"What's the weather in Seattle and is my ollama-prod app healthy?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notice what's missing: no plugin class, no kernel, no &lt;code&gt;FunctionChoiceBehavior&lt;/code&gt; settings. You pass your tools as a list when you create the agent, and the framework handles the rest. The &lt;code&gt;[Description]&lt;/code&gt; attribute is optional but I always include it -- it's what the LLM reads to decide whether to call the function. A good description is the difference between the model calling your function correctly and ignoring it entirely.&lt;/p&gt;

&lt;p&gt;In Semantic Kernel, the same setup requires creating a plugin class with &lt;code&gt;[KernelFunction]&lt;/code&gt; attributes, building a kernel, registering the plugin on the kernel, configuring &lt;code&gt;FunctionChoiceBehavior.Auto()&lt;/code&gt; in execution settings, and then invoking. Agent Framework gets the same result with half the code and no framework-specific attributes on your business logic.&lt;/p&gt;




&lt;h2&gt;
  
  
  Function Calling with Local Models: What Actually Works
&lt;/h2&gt;

&lt;p&gt;Function calling with self-hosted models is not as reliable as with GPT-4. It works, but you need to pick the right models. I've burned enough time on this to have opinions.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Llama 3.1 and later&lt;/strong&gt; have solid function calling support. If you're on Llama 3 (without the .1), function calling will be flaky -- the model wasn't trained for tool use. This is the number one issue I see people hit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mistral and Mixtral&lt;/strong&gt; handle tool use well. They're my go-to when I need function calling on Ollama at a smaller size than Llama 3.1 70B.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Qwen 2.5&lt;/strong&gt; is strong on structured output and function calling, especially the 7B and 14B sizes. It's become my default for agents that need reliable tool use on modest hardware.&lt;/p&gt;

&lt;p&gt;Practical advice: write an integration test that sends a prompt requiring a function call and verifies the function actually fired. Takes five minutes, saves hours.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Quick smoke test for function calling support&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Returns the current UTC time"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;GetTime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;DateTime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;UtcNow&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ToString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"o"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;testAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Use the GetTime tool to answer time questions."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;GetTime&lt;/span&gt;&lt;span class="p"&gt;)]);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;testAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"What time is it?"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// If the response contains a real timestamp, function calling works&lt;/span&gt;
&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run that against each model you're evaluating. If it returns something like "I don't have access to real-time information" instead of an actual timestamp, the model never called the tool, and you shouldn't rely on it for tool use.&lt;/p&gt;
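
&lt;p&gt;Printing the response only shows indirectly whether the tool ran. A slightly stricter sketch, assuming the same &lt;code&gt;AsAIAgent&lt;/code&gt; and &lt;code&gt;AIFunctionFactory.Create&lt;/code&gt; calls used above and an Ollama instance on its default local port, counts invocations inside the tool and fails if the count stays at zero. The model name and the exception message are placeholders:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;
using Microsoft.Agents.AI;
using Microsoft.Extensions.AI;

// Assumption: Ollama on its default port; swap in the model under test.
var chatClient = new OllamaChatClient(
    new Uri("http://localhost:11434"), modelId: "qwen2.5:14b");

var toolCalls = 0;

// The lambda captures toolCalls, so every real invocation is counted.
var getTime = AIFunctionFactory.Create(
    () =&amp;gt;
    {
        toolCalls++;
        return DateTime.UtcNow.ToString("o");
    },
    name: "GetTime",
    description: "Returns the current UTC time");

var testAgent = chatClient.AsAIAgent(
    instructions: "Use the GetTime tool to answer time questions.",
    tools: [getTime]);

var result = await testAgent.RunAsync("What time is it?");
Console.WriteLine(result);

// Fail loudly if the model answered without ever calling the tool.
if (toolCalls == 0)
{
    throw new InvalidOperationException(
        "GetTime was never invoked; this model did not use the tool.");
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Checking the counter rather than the text avoids false passes when a model invents a plausible-looking timestamp.&lt;/p&gt;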




&lt;h2&gt;
  
  
  Smart Routing: Right Model for the Job
&lt;/h2&gt;

&lt;p&gt;Once you have both backends available, you can route requests to the model that fits the task.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;SmartRouter&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;_localAgent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;private&lt;/span&gt; &lt;span class="k"&gt;readonly&lt;/span&gt; &lt;span class="n"&gt;AIAgent&lt;/span&gt; &lt;span class="n"&gt;_cloudAgent&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;SmartRouter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;ollamaUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;azureEndpoint&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;azureKey&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;localClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;OllamaChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ollamaUrl&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"qwen2.5:14b"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;cloudClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AzureOpenAIClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;azureEndpoint&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;AzureKeyCredential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;azureKey&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;GetChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"gpt-4"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="n"&gt;_localAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;localClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are a data processing assistant."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;_cloudAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cloudClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"You are an expert analyst and writer."&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="n"&gt;AgentResponse&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;RouteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;taskType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;taskType&lt;/span&gt; &lt;span class="k"&gt;switch&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="s"&gt;"classify"&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="s"&gt;"extract"&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="s"&gt;"summarize"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_localAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s"&gt;"reason"&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="s"&gt;"analyze"&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="s"&gt;"generate"&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_cloudAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="p"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;_localAgent&lt;/span&gt;
        &lt;span class="p"&gt;};&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Local models handle classification, extraction, and summarization almost as well as GPT-4 -- well enough for production. Where GPT-4 still pulls ahead is multi-step reasoning, complex code generation, and text that needs a specific tone. A routing layer like this can cut API costs by 60-80% without a noticeable quality drop, depending on how much of your traffic falls into the first group.&lt;/p&gt;
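
&lt;p&gt;Where a figure like that comes from is easy to sanity-check for your own traffic. The sketch below rests on two simplifying assumptions: every request costs roughly the same on the cloud model, and the local model's hosting cost is ignored. Under those assumptions the saving is simply the share of requests routed locally. The request counts are made-up placeholders, not measurements:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;using System;

// Assumptions: uniform cloud cost per request; local hosting cost ignored.
static double CloudCostReduction(int localRequests, int cloudRequests)
{
    var total = localRequests + cloudRequests;
    return total == 0 ? 0.0 : (double)localRequests / total;
}

// Placeholder traffic mix: 700 classify/extract/summarize requests,
// 300 reason/analyze/generate requests.
var reduction = CloudCostReduction(localRequests: 700, cloudRequests: 300);
Console.WriteLine($"Estimated reduction in cloud API spend: {reduction * 100:F0}%");
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If your cloud-bound requests are much longer than the local ones, weight the counts by tokens instead of requests; the saving will be smaller than the raw request share suggests.&lt;/p&gt;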




&lt;h2&gt;
  
  
  Complete Example: Document Triage Agent
&lt;/h2&gt;

&lt;p&gt;Here's something closer to what I've built for real teams -- an agent that triages incoming documents, classifies them, extracts key fields, and routes them for review.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;System.ComponentModel&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Agents.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;using&lt;/span&gt; &lt;span class="nn"&gt;Microsoft.Extensions.AI&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Tool functions -- clean C# methods, no framework attributes required&lt;/span&gt;
&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Classifies a document as: invoice, contract, support-ticket, or other"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;ClassifyDocument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// In production: fine-tuned classifier or pattern matching&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"invoice"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Extracts vendor name, amount, and due date from an invoice"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;ExtractInvoiceFields&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;"""{"&lt;/span&gt;&lt;span class="n"&gt;vendor&lt;/span&gt;&lt;span class="s"&gt;": "&lt;/span&gt;&lt;span class="n"&gt;Contoso&lt;/span&gt;&lt;span class="s"&gt;", "&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="s"&gt;": 4250.00, "&lt;/span&gt;&lt;span class="n"&gt;due&lt;/span&gt;&lt;span class="s"&gt;": "&lt;/span&gt;&lt;span class="m"&gt;2026&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;05&lt;/span&gt;&lt;span class="p"&gt;-&lt;/span&gt;&lt;span class="m"&gt;15&lt;/span&gt;&lt;span class="s"&gt;"}"""&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;Description&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Routes a document to a review queue based on category and priority"&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
&lt;span class="k"&gt;static&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="nf"&gt;RouteForReview&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;$"Routed &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt; to &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;priority&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s"&gt;-priority queue"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// Create the agent with Qwen 2.5 14B -- reliable tool use, runs on CPU&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;OllamaChatClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nf"&gt;Uri&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"https://your-ollama.azurecontainerapps.io"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"qwen2.5:14b"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;triageAgent&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chatClient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;AsAIAgent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;instructions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"""
&lt;/span&gt;        &lt;span class="n"&gt;You&lt;/span&gt; &lt;span class="n"&gt;are&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="n"&gt;triage&lt;/span&gt; &lt;span class="n"&gt;agent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;When&lt;/span&gt; &lt;span class="n"&gt;given&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Classify&lt;/span&gt; &lt;span class="n"&gt;its&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;
        &lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;If&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt;&lt;span class="err"&gt;'&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="n"&gt;an&lt;/span&gt; &lt;span class="n"&gt;invoice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;extract&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="n"&gt;fields&lt;/span&gt;
        &lt;span class="m"&gt;3&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt; &lt;span class="n"&gt;Route&lt;/span&gt; &lt;span class="n"&gt;it&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;review&lt;/span&gt; &lt;span class="n"&gt;based&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt; &lt;span class="k"&gt;and&lt;/span&gt; &lt;span class="n"&gt;urgency&lt;/span&gt;
        &lt;span class="n"&gt;Use&lt;/span&gt; &lt;span class="n"&gt;the&lt;/span&gt; &lt;span class="n"&gt;available&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;each&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;
        &lt;span class="s"&gt;""",
&lt;/span&gt;    &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ClassifyDocument&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ExtractInvoiceFields&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;AIFunctionFactory&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;Create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;RouteForReview&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;triageAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;CreateSessionAsync&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;triageAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;RunAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"Process this document: Invoice from Contoso for $4,250 due May 15, 2026 for Azure consulting services."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="n"&gt;Console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;WriteLine&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm using &lt;code&gt;qwen2.5:14b&lt;/code&gt; because it chains multiple tool calls reliably -- classify, then extract, then route -- without dropping steps. It's small enough to run on CPU without painful latency. Llama 3 can't do this sequence consistently; Qwen 2.5 nails it.&lt;/p&gt;

&lt;p&gt;This is a single-agent setup. In Part 4, we'll break this apart -- a classifier agent, an extraction agent, a routing agent -- each running as its own container on ACA, communicating through Dapr, with Dynamic Sessions for sandboxed code execution.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Changed from Semantic Kernel (Quick Reference)
&lt;/h2&gt;

&lt;p&gt;If you've been following this series and have Semantic Kernel code, here's what moves where:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Semantic Kernel&lt;/th&gt;
&lt;th&gt;Agent Framework&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;Kernel.CreateBuilder()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;new OllamaChatClient(...)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;builder.AddOllamaChatCompletion(...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;(done in client constructor)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;kernel.InvokePromptAsync(...)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;agent.RunAsync(...)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;[KernelFunction]&lt;/code&gt; attribute&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AIFunctionFactory.Create(method)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;builder.Plugins.AddFromType&amp;lt;T&amp;gt;()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;tools: [...]&lt;/code&gt; parameter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;FunctionChoiceBehavior.Auto()&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;(automatic -- no config needed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;ChatHistory&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;AgentSession&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;Microsoft.SemanticKernel&lt;/code&gt; namespace&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Microsoft.Agents.AI&lt;/code&gt; + &lt;code&gt;Microsoft.Extensions.AI&lt;/code&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Semantic Kernel packages still work. If you have production code on them, there's no fire to put out. But for new projects, Agent Framework is less code, less ceremony, and where Microsoft is putting new features.&lt;/p&gt;




&lt;h2&gt;
  
  
  Next Up
&lt;/h2&gt;

&lt;p&gt;Part 4 is where the architecture gets interesting: multiple agents running as separate containers on ACA, passing messages through Dapr, with Azure Container Apps Dynamic Sessions for sandboxed code execution. We go from "one agent that triages documents" to "a team of agents that can research, code, and review."&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions about migrating from Semantic Kernel or getting Ollama working with Agent Framework? Drop them in the comments -- I migrated a project last week and the gotchas are fresh.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>azure</category>
      <category>ai</category>
      <category>csharp</category>
      <category>dotnet</category>
    </item>
    <item>
      <title>Running Ollama on Azure Container Apps</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Sun, 19 Apr 2026 17:15:39 +0000</pubDate>
      <link>https://dev.to/bspann/running-ollama-on-azure-container-apps-2550</link>
      <guid>https://dev.to/bspann/running-ollama-on-azure-container-apps-2550</guid>
      <description>&lt;h2&gt;
  
  
  Running Ollama on Azure Container Apps
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Part 2 of "Running LLMs &amp;amp; Agents on Azure Container Apps"&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;In Part 1, I made the case for why Azure Container Apps hits the sweet spot for self-hosted LLM inference. Now let's actually build it.&lt;/p&gt;

&lt;p&gt;By the end of this post, you'll have Ollama running in Azure, serving Llama 3, with persistent model storage and a secure endpoint. The basic deployment takes about 20 minutes. The production hardening we'll add (persistent volumes, auth, GPU) takes it from a demo to something you'd actually run for a team.&lt;/p&gt;

&lt;h2&gt;
  
  
  A Quick Word on Ollama
&lt;/h2&gt;

&lt;p&gt;If you haven't used Ollama before, the pitch is simple: it's the easiest way to run open-source LLMs. On your local machine, it's one command, &lt;code&gt;ollama run llama3&lt;/code&gt;, and you've got a model running with an API endpoint.&lt;/p&gt;

&lt;p&gt;The reason Ollama works so well for what we're building is the OpenAI-compatible API at &lt;code&gt;/v1/chat/completions&lt;/code&gt;. Code written against the OpenAI SDK, including Semantic Kernel (which we'll use in Part 3), works with Ollama without changes to your application logic: swap the endpoint URL and the model name and you're done. That portability is why I chose Ollama for this series over vLLM or text-generation-inference.&lt;/p&gt;




&lt;h2&gt;
  
  
  Step 1: Create the Environment
&lt;/h2&gt;

&lt;p&gt;First, set up a resource group and an ACA environment. The environment is the shared boundary for your container apps: networking, Dapr configuration, and logging all live at this level.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az group create &lt;span class="nt"&gt;--name&lt;/span&gt; rg-ollama-demo &lt;span class="nt"&gt;--location&lt;/span&gt; eastus

az containerapp &lt;span class="nb"&gt;env &lt;/span&gt;create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; eastus
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I'm using East US here because it has good availability for GPU workload profiles. If you're just doing CPU-only for development, any region works.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Deploy Ollama
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az containerapp create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--environment&lt;/span&gt; ollama-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image&lt;/span&gt; ollama/ollama:latest &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--target-port&lt;/span&gt; 11434 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--ingress&lt;/span&gt; internal &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--cpu&lt;/span&gt; 4 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--memory&lt;/span&gt; 8Gi &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-replicas&lt;/span&gt; 0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-replicas&lt;/span&gt; 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two settings here that I want to call attention to.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--ingress internal&lt;/code&gt; means this endpoint is only accessible to other containers in the same ACA environment. I've seen people deploy Ollama with &lt;code&gt;--ingress external&lt;/code&gt; in tutorials, and that's a real problem. An unauthenticated Ollama instance on the public internet means anyone who finds the URL can run arbitrary models on your hardware. You're handing out free GPU time. Start with internal ingress, and if you need external access later, add authentication first (I'll show you how below).&lt;/p&gt;

&lt;p&gt;&lt;code&gt;--min-replicas 0&lt;/code&gt; enables scale-to-zero. When nobody's sending requests, ACA shuts down the container entirely and you stop paying. The first request after idle triggers a cold start: the container needs to spin up and (if models aren't persisted) re-download the model weights. We'll fix the cold start problem with persistent storage in a minute, but even with it, expect 15-30 seconds on the first request. That's fine for development. For production, you might want &lt;code&gt;--min-replicas 1&lt;/code&gt; to keep one instance warm.&lt;/p&gt;
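
&lt;p&gt;If the cold start turns out to be too slow, keeping one replica warm is a small change. A minimal sketch, reusing the app and resource group names from the deployment above:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Keep one replica running so requests never wait on a cold start
az containerapp update \
  --name ollama \
  --resource-group rg-ollama-demo \
  --min-replicas 1

# Confirm the scale settings that are now in effect
az containerapp show \
  --name ollama \
  --resource-group rg-ollama-demo \
  --query "properties.template.scale"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Setting the value back to 0 restores scale-to-zero, so this is easy to toggle between development and production.&lt;/p&gt;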

&lt;h2&gt;
  
  
  Step 3: Pull a Model
&lt;/h2&gt;

&lt;p&gt;With internal ingress, you can't hit the endpoint directly from your local machine. You need to either exec into the container or temporarily switch to external ingress to pull your first model.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Get the internal FQDN&lt;/span&gt;
&lt;span class="nv"&gt;OLLAMA_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;az containerapp show &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"properties.configuration.ingress.fqdn"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; tsv&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# From another container in the same environment, or temporarily with external ingress:&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST &lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$OLLAMA_URL&lt;/span&gt;&lt;span class="s2"&gt;/api/pull"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"name": "llama3:8b"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
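
&lt;p&gt;The exec route can look like the sketch below. It assumes a replica is currently running (with &lt;code&gt;--min-replicas 0&lt;/code&gt; the app may be scaled to zero) and that the image ships &lt;code&gt;/bin/sh&lt;/code&gt;, which should hold for the official image but is worth checking for the tag you deploy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Open an interactive shell inside the running Ollama container
az containerapp exec \
  --name ollama \
  --resource-group rg-ollama-demo \
  --command /bin/sh

# Then, at the prompt inside the container:
ollama pull llama3:8b
ollama list   # the pulled model should show up in this list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This keeps ingress internal the whole time, at the cost of needing a live replica to attach to.&lt;/p&gt;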



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Practical tip:&lt;/strong&gt; If you're just getting started, temporarily flip to &lt;code&gt;--ingress external&lt;/code&gt;, pull your model, then flip back to internal. It's a few seconds of exposure and much simpler than setting up a jump box. For production, use the pre-baked image approach I cover later in this post. It avoids runtime downloads entirely.&lt;/p&gt;
&lt;/blockquote&gt;
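
&lt;p&gt;A sketch of that flip-pull-flip sequence, using the same app and resource group names as above. The FQDN is re-read after the switch on the assumption that the hostname can differ between internal and external ingress:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Temporarily expose the app to the internet
az containerapp ingress update \
  --name ollama \
  --resource-group rg-ollama-demo \
  --type external

# Re-read the FQDN now that ingress is external
OLLAMA_URL=$(az containerapp show \
  --name ollama \
  --resource-group rg-ollama-demo \
  --query "properties.configuration.ingress.fqdn" -o tsv)

# Pull the model; curl blocks until the streamed progress output ends
curl -X POST "https://$OLLAMA_URL/api/pull" \
  -d '{"name": "llama3:8b"}'

# Lock the app back down as soon as the pull finishes
az containerapp ingress update \
  --name ollama \
  --resource-group rg-ollama-demo \
  --type internal
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Running all four commands from one script keeps the exposure window as short as the download itself.&lt;/p&gt;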

&lt;h2&gt;
  
  
  Step 4: Test It
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$OLLAMA_URL&lt;/span&gt;&lt;span class="s2"&gt;/api/generate"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"model": "llama3:8b", "prompt": "Hello!", "stream": false}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should get back a JSON response with the model's reply. If you do, you've got a self-hosted LLM running in Azure.&lt;/p&gt;

&lt;p&gt;The OpenAI-compatible endpoint is what we'll actually use in code:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="s2"&gt;"https://&lt;/span&gt;&lt;span class="nv"&gt;$OLLAMA_URL&lt;/span&gt;&lt;span class="s2"&gt;/v1/chat/completions"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"model": "llama3:8b", "messages": [{"role": "user", "content": "Hello"}]}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the endpoint that Semantic Kernel, LangChain, and anything else built against the OpenAI API will talk to. We'll wire it up in Part 3.&lt;/p&gt;
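
&lt;p&gt;When you're checking these endpoints by hand, it helps to print only the reply text instead of the whole JSON body. A small sketch, assuming &lt;code&gt;jq&lt;/code&gt; is installed on the machine running &lt;code&gt;curl&lt;/code&gt; (it isn't part of the setup above):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Native API: the generated text is in the "response" field
curl -s "https://$OLLAMA_URL/api/generate" \
  -d '{"model": "llama3:8b", "prompt": "Hello!", "stream": false}' \
  | jq -r '.response'

# OpenAI-compatible API: the reply is in choices[0].message.content
curl -s "https://$OLLAMA_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3:8b", "messages": [{"role": "user", "content": "Hello"}]}' \
  | jq -r '.choices[0].message.content'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second shape is the one OpenAI-style clients parse, so if that command prints a sensible reply, those clients should be able to read the response too.&lt;/p&gt;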




&lt;h2&gt;
  
  
  Persistent Model Storage
&lt;/h2&gt;

&lt;p&gt;Here's a gotcha that bites everyone the first time: when your container scales to zero and back up, it loses everything in ephemeral storage. That includes your downloaded models. Llama 3 8B is about 4.7 GB. Re-downloading it on every cold start means your first request takes minutes instead of seconds, and every cold start depends on the model registry being reachable.&lt;/p&gt;

&lt;p&gt;The fix is to mount an Azure Files share so models survive container restarts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Create a storage account&lt;/span&gt;
az storage account create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; stollamademo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--location&lt;/span&gt; eastus &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--sku&lt;/span&gt; Standard_LRS

&lt;span class="c"&gt;# Create a file share&lt;/span&gt;
az storage share create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama-models &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--account-name&lt;/span&gt; stollamademo

&lt;span class="c"&gt;# Get the storage account key&lt;/span&gt;
&lt;span class="nv"&gt;STORAGE_KEY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;az storage account keys list &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--account-name&lt;/span&gt; stollamademo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--query&lt;/span&gt; &lt;span class="s2"&gt;"[0].value"&lt;/span&gt; &lt;span class="nt"&gt;-o&lt;/span&gt; tsv&lt;span class="si"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Register the storage with your ACA environment&lt;/span&gt;
az containerapp &lt;span class="nb"&gt;env &lt;/span&gt;storage &lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--storage-name&lt;/span&gt; ollama-storage &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--azure-file-account-name&lt;/span&gt; stollamademo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--azure-file-account-key&lt;/span&gt; &lt;span class="nv"&gt;$STORAGE_KEY&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--azure-file-share-name&lt;/span&gt; ollama-models &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--access-mode&lt;/span&gt; ReadWrite
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you need to mount that storage into the container. ACA requires a YAML file for volume mounts because there's no pure CLI flag for this. Create &lt;code&gt;volume-mount.yaml&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;properties&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama-models&lt;/span&gt;
        &lt;span class="na"&gt;storageName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama-storage&lt;/span&gt;
        &lt;span class="na"&gt;storageType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;AzureFile&lt;/span&gt;
    &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama/ollama:latest&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama&lt;/span&gt;
        &lt;span class="na"&gt;resources&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;cpu&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;
          &lt;span class="na"&gt;memory&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;8Gi&lt;/span&gt;
        &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;OLLAMA_MODELS&lt;/span&gt;
            &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/models&lt;/span&gt;
        &lt;span class="na"&gt;volumeMounts&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;volumeName&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;ollama-models&lt;/span&gt;
            &lt;span class="na"&gt;mountPath&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/models&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az containerapp update &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--yaml&lt;/span&gt; volume-mount.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;OLLAMA_MODELS&lt;/code&gt; environment variable tells Ollama where to store and look for model files. With this in place, a cold start still has to start the container and load the model into memory, but the weights are already on the mounted share, so nothing gets re-downloaded. You pull a model once, and every start after that skips the download.&lt;/p&gt;
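
&lt;p&gt;It's worth confirming the mount actually took before relying on it. A quick check, assuming the resource names used above and that &lt;code&gt;STORAGE_KEY&lt;/code&gt; is still set in your shell:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Confirm the volume is attached to the app template
az containerapp show \
  --name ollama \
  --resource-group rg-ollama-demo \
  --query "properties.template.volumes"

# After pulling a model, the share should contain Ollama's
# "blobs" and "manifests" directories
az storage file list \
  --share-name ollama-models \
  --account-name stollamademo \
  --account-key "$STORAGE_KEY" \
  --output table
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;If the share is still empty after a pull, the model most likely landed in the container's ephemeral filesystem, which usually points at the &lt;code&gt;OLLAMA_MODELS&lt;/code&gt; value or the mount path.&lt;/p&gt;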

&lt;h2&gt;
  
  
  Adding GPU Support
&lt;/h2&gt;

&lt;p&gt;Everything we've done so far uses CPU-only compute. For development and testing with 7-8B parameter models, CPU is fine. Llama 3 8B generates tokens at a usable speed on 4 cores with 8 GB of RAM. Not fast, but fast enough to test your agent logic without waiting.&lt;/p&gt;

&lt;p&gt;When you need production-level latency or you're working with larger models (70B+), you'll want a GPU. ACA supports this through workload profiles:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az containerapp &lt;span class="nb"&gt;env &lt;/span&gt;workload-profile add &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--workload-profile-name&lt;/span&gt; gpu &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--workload-profile-type&lt;/span&gt; NC24-A100 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--min-nodes&lt;/span&gt; 0 &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--max-nodes&lt;/span&gt; 1

az containerapp update &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--workload-profile-name&lt;/span&gt; gpu
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A word of caution on cost: A100 GPUs run about $2/hour on ACA. If you leave &lt;code&gt;--min-nodes 1&lt;/code&gt; (always on), that's roughly $1,440/month. With &lt;code&gt;--min-nodes 0&lt;/code&gt;, you only pay when there's active inference traffic, but you take a cold start hit when the GPU node needs to spin up. For most development work, stick with CPU. Add GPU when you've validated your agent logic and need to optimize for latency.&lt;/p&gt;
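
&lt;p&gt;The arithmetic behind those numbers is easy to rerun with your own rates. In this sketch the $2/hour figure is the approximate one quoted above, and the 10% utilization is purely an illustrative assumption; it also ignores time billed while a node spins up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;hourly_rate=2                    # USD per hour, approximate A100 rate from above
hours_per_month=$(( 24 * 30 ))   # 720 hours in a 30-day month
utilization_pct=10               # assumed share of hours with traffic (hypothetical)

# Always-on cost: every hour of the month is billed
always_on=$(( hourly_rate * hours_per_month ))

# Scale-to-zero cost: only the busy share of those hours is billed
bursty=$(( always_on * utilization_pct / 100 ))

echo "always-on:     \$${always_on}/month"
echo "scale-to-zero: \$${bursty}/month at ${utilization_pct}% utilization"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With these inputs the always-on figure comes out to the $1,440 mentioned above, and the bursty figure to a tenth of that.&lt;/p&gt;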

&lt;h2&gt;
  
  
  Securing External Access
&lt;/h2&gt;

&lt;p&gt;At some point you'll need external access. Maybe it's a frontend app, a mobile client, or a teammate who wants to test from their machine. Here are three approaches, in order of complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 1: ACA Built-in Authentication
&lt;/h3&gt;

&lt;p&gt;ACA has a built-in auth feature that can gate access behind Microsoft Entra ID (formerly Azure AD), Google, or other identity providers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az containerapp auth update &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--enabled&lt;/span&gt; &lt;span class="nb"&gt;true&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--unauthenticated-client-action&lt;/span&gt; RedirectToLoginPage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This works well for interactive users (browser-based access), but it's clunky for programmatic API calls.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option 2: API Key via Reverse Proxy
&lt;/h3&gt;

&lt;p&gt;For programmatic access, deploy a lightweight proxy container in front of Ollama that validates a custom &lt;code&gt;X-API-Key&lt;/code&gt; header before forwarding requests. This is what I typically set up for team development environments. Everyone gets an API key, and you can rotate or revoke keys without touching the Ollama deployment.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Switch to external ingress&lt;/span&gt;
az containerapp ingress update &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--type&lt;/span&gt; external
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



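&lt;p&gt;Then add a sidecar that acts as your auth gateway, and point the ingress target port at the sidecar rather than at Ollama's port 11434 so every external request passes through the key check. If you run the gateway as a separate container app instead, leave Ollama on internal ingress and give only the gateway external ingress.&lt;/p&gt;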
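
&lt;p&gt;From the client side, calling through such a gateway might look like the sketch below. &lt;code&gt;GATEWAY_URL&lt;/code&gt; and &lt;code&gt;OLLAMA_API_KEY&lt;/code&gt; are hypothetical placeholders for whatever hostname your proxy exposes and whatever keys it issues, and the rejection status depends on how you configure the proxy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Authenticated request: the proxy checks X-API-Key, then forwards to Ollama
curl -s "https://$GATEWAY_URL/v1/chat/completions" \
  -H "X-API-Key: $OLLAMA_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3:8b", "messages": [{"role": "user", "content": "Hello"}]}'

# Same request without the key: print only the HTTP status code.
# A correctly configured proxy should reject it (for example with 401).
curl -s -o /dev/null -w "%{http_code}\n" \
  "https://$GATEWAY_URL/v1/chat/completions" \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3:8b", "messages": [{"role": "user", "content": "Hello"}]}'
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The second call is the one to keep in a smoke test. If it ever returns a model reply, something is reaching Ollama without going through the key check.&lt;/p&gt;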

&lt;h3&gt;
  
  
  Option 3: VNet Integration
&lt;/h3&gt;

&lt;p&gt;For enterprise scenarios where you need network-level isolation, keep ingress internal and access Ollama through VNet peering, a VPN gateway, or ExpressRoute. This is the option I recommend for production workloads handling sensitive data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;az containerapp &lt;span class="nb"&gt;env &lt;/span&gt;create &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama-env &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--infrastructure-subnet-resource-id&lt;/span&gt; /subscriptions/&lt;span class="o"&gt;{&lt;/span&gt;sub&lt;span class="o"&gt;}&lt;/span&gt;/resourceGroups/&lt;span class="o"&gt;{&lt;/span&gt;rg&lt;span class="o"&gt;}&lt;/span&gt;/providers/Microsoft.Network/virtualNetworks/&lt;span class="o"&gt;{&lt;/span&gt;vnet&lt;span class="o"&gt;}&lt;/span&gt;/subnets/&lt;span class="o"&gt;{&lt;/span&gt;subnet&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



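&lt;p&gt;You're putting your entire ACA environment inside your corporate network. External access goes through whatever VPN or gateway you already have. One caveat: the subnet has to be supplied when the environment is created, so this means standing up a new environment rather than updating the one from Step 1.&lt;/p&gt;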
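
&lt;p&gt;Rather than hand-assembling the subnet resource ID, you can look it up. In this sketch &lt;code&gt;rg-network&lt;/code&gt;, &lt;code&gt;vnet-corp&lt;/code&gt;, &lt;code&gt;snet-aca&lt;/code&gt;, and the environment name &lt;code&gt;ollama-env-vnet&lt;/code&gt; are all hypothetical, and it assumes the subnet already exists and meets ACA's sizing and delegation requirements:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Look up the full resource ID of an existing subnet
SUBNET_ID=$(az network vnet subnet show \
  --resource-group rg-network \
  --vnet-name vnet-corp \
  --name snet-aca \
  --query id -o tsv)

# Create a VNet-integrated environment with no public endpoint
az containerapp env create \
  --name ollama-env-vnet \
  --resource-group rg-ollama-demo \
  --location eastus \
  --infrastructure-subnet-resource-id "$SUBNET_ID" \
  --internal-only true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;--internal-only true&lt;/code&gt; flag is an addition to the command above. It keeps the environment reachable only from inside the VNet, which matches the intent of this option.&lt;/p&gt;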

&lt;h2&gt;
  
  
  Pre-Baking Models into the Image
&lt;/h2&gt;

&lt;p&gt;For production deployments, I recommend avoiding runtime model downloads entirely. Build a custom Docker image that includes the model weights:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; ollama/ollama:latest&lt;/span&gt;

&lt;span class="c"&gt;# Pre-download the model during build&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;ollama serve &amp;amp; &lt;span class="nb"&gt;sleep &lt;/span&gt;5 &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; ollama pull llama3:8b &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; pkill ollama
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Build and push to your Azure Container Registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker build &lt;span class="nt"&gt;-t&lt;/span&gt; myregistry.azurecr.io/ollama-llama3:latest &lt;span class="nb"&gt;.&lt;/span&gt;
docker push myregistry.azurecr.io/ollama-llama3:latest

az containerapp update &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--name&lt;/span&gt; ollama &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--resource-group&lt;/span&gt; rg-ollama-demo &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--image&lt;/span&gt; myregistry.azurecr.io/ollama-llama3:latest
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The downside is image size. You're looking at 5 GB+ for even a small model. But you get deterministic deployments: every release gets exactly the model version you tested against, and cold starts don't depend on network speed to a model registry. Combined with persistent storage (which acts as a cache for any additional models you pull at runtime), this is the fastest and most reliable startup configuration.&lt;/p&gt;
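
&lt;p&gt;Two practical notes on getting an image that large into the registry, sketched with the same &lt;code&gt;myregistry&lt;/code&gt; placeholder used above. The local push needs an authenticated Docker client, and building inside ACR is an alternative that avoids uploading gigabytes from your own connection:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;# Option A: log the local Docker client in to the registry before "docker push"
az acr login --name myregistry

# Option B: send the build context to ACR and build the image there instead
az acr build \
  --registry myregistry \
  --image ollama-llama3:latest \
  .
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With Option B the model download happens inside Azure during the build, so build time depends on the registry's build agent rather than on your local bandwidth.&lt;/p&gt;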




&lt;h2&gt;
  
  
  Practical Cost Tips
&lt;/h2&gt;

&lt;p&gt;A few things I've learned from running this setup across different projects.&lt;/p&gt;

&lt;p&gt;Scale-to-zero is your biggest lever. If your workload is bursty (heavy during business hours, quiet at night), the difference between always-on and scale-to-zero can be 3-4x on your monthly bill. The cold start penalty is real, but for many use cases it's worth it.&lt;/p&gt;

&lt;p&gt;I've seen teams default to GPU instances "just in case" and spend 10x more than they needed to. Llama 3 8B runs fine on 4 cores and 8 GB of RAM. Start with CPU, measure your token generation speed, and only upgrade if it's actually too slow for your use case.&lt;/p&gt;

&lt;p&gt;Don't overlook smaller models either. Phi-3 Mini and Qwen 2.5 3B handle classification, extraction, and structured output at a fraction of the compute cost. Not everything needs a 70B model.&lt;/p&gt;

&lt;p&gt;And persistent storage is cheap insurance. An Azure Files share costs pennies per GB per month. Re-downloading models on every cold start costs far more in startup latency than the storage ever will.&lt;/p&gt;




&lt;h2&gt;
  
  
  Next Up
&lt;/h2&gt;

&lt;p&gt;In Part 3, we'll build a C# agent with Semantic Kernel that talks to this Ollama endpoint, with swappable backends so you can use self-hosted models for development and Azure OpenAI for production without changing your code.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Questions about the deployment? Hit me in the comments. I've probably hit the same wall you're about to hit.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>azure</category>
      <category>ai</category>
      <category>ollama</category>
      <category>containers</category>
    </item>
    <item>
      <title>Why Azure Container Apps for AI Workloads</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Fri, 17 Apr 2026 23:25:02 +0000</pubDate>
      <link>https://dev.to/bspann/why-azure-container-apps-for-ai-workloads-2djm</link>
      <guid>https://dev.to/bspann/why-azure-container-apps-for-ai-workloads-2djm</guid>
      <description>&lt;h2&gt;
  
  
  Why Azure Container Apps for AI Workloads
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Part 1 of "Running LLMs &amp;amp; Agents on Azure Container Apps"&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;I spend a lot of time helping teams at Microsoft figure out where to run their AI workloads. The conversation usually starts the same way: "We want to use LLMs, but we don't want to send our data to OpenAI, and we don't want to manage Kubernetes." That's a completely reasonable position. It's exactly the gap Azure Container Apps fills.&lt;/p&gt;

&lt;p&gt;In this series, I'll walk you through deploying Ollama on ACA, building C# agents with Semantic Kernel, wiring up multi-agent architectures with Dapr, and hardening the whole thing for production. But first, let's talk about &lt;em&gt;why&lt;/em&gt; ACA is the right platform for this kind of work, and when it isn't.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with Running Your Own LLMs
&lt;/h2&gt;

&lt;p&gt;The moment you decide to self-host a model, you've signed up for a set of infrastructure decisions that most application developers aren't used to making. Where does the model live? How do you serve it? What happens when nobody's using it at 2 AM? Are you still paying for a GPU?&lt;/p&gt;

&lt;p&gt;In my experience, teams end up in one of four places:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running on a laptop&lt;/strong&gt; works great for hacking on a Saturday afternoon, but it's a dead end for anything beyond that. You can't share it with a team, you can't scale it, and you can't keep it running when you close your lid.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A VM with a GPU&lt;/strong&gt; solves the sharing problem but creates a new one: you're paying 24/7 whether the model is handling requests or sitting idle. I've seen teams burn through hundreds of dollars a month on GPU VMs that were doing real work less than 10% of the time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Kubernetes (AKS)&lt;/strong&gt; gives you everything: autoscaling, GPU scheduling, health checks, the works. But now you need someone who knows how to operate a Kubernetes cluster. For a team building AI features, not a platform team, that's a big ask. Projects stall for weeks while developers learn about node pools, taints, and GPU device plugins.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Azure Container Apps&lt;/strong&gt; sits in the gap between "just give me a VM" and "I guess I need Kubernetes." You deploy a Docker image, ACA handles scaling, and you don't touch kubectl. It's built on Kubernetes under the hood, but that's an implementation detail you never have to think about.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Azure Container Apps Actually Gives You
&lt;/h2&gt;

&lt;p&gt;If you haven't worked with ACA before, the short version is: it's serverless containers. You give it a Docker image and tell it what port to listen on. ACA provisions the infrastructure, handles TLS, and scales your containers based on demand. That includes scaling to zero when there's no traffic, which means no cost when nobody's using your model.&lt;/p&gt;

&lt;p&gt;What makes it interesting for AI workloads specifically is the combination of a few features that came together over the last year or so. Workload profiles now include GPU-enabled options, so you can run inference on actual GPU hardware without managing nodes. Dapr integration is built in, which matters when you start running multiple agents that need to talk to each other (we'll get deep into this in Part 4). And KEDA-based autoscaling means you can scale on custom metrics beyond HTTP concurrency, like queue depth or even custom telemetry from your model.&lt;/p&gt;

&lt;p&gt;Think of it as the serverless experience of Azure Functions, but without being locked into the Functions programming model. You bring any container, and ACA runs it.&lt;/p&gt;




&lt;h2&gt;
  
  
  How ACA Compares to the Alternatives
&lt;/h2&gt;

&lt;p&gt;Let me break this down the way I explain it to teams I work with.&lt;/p&gt;

&lt;h3&gt;
  
  
  Azure OpenAI Service
&lt;/h3&gt;

&lt;p&gt;Azure OpenAI is the easiest path to production. Setup takes minutes, you get access to GPT-4 and the latest models, and Microsoft handles all the infrastructure. Your data stays within your Azure tenant, which satisfies most compliance requirements.&lt;/p&gt;

&lt;p&gt;Where it gets expensive is token volume. Azure OpenAI charges per token, and that math gets uncomfortable fast. A chatbot processing a million tokens a day at GPT-4 prices will run you around $600/month. Per-token pricing is fine for a prototype or a low-volume internal tool, but high-traffic production apps feel it.&lt;/p&gt;

&lt;p&gt;You also give up control. You get fine-tuning, but you don't get to run arbitrary open-source models, and you can't customize the serving infrastructure. If you need to run Llama 3 or Mistral or a fine-tuned domain model, Azure OpenAI isn't the answer.&lt;/p&gt;

&lt;h3&gt;
  
  
  Azure Kubernetes Service (AKS)
&lt;/h3&gt;

&lt;p&gt;AKS is the power tool. You get full control over scheduling, GPU node pools, custom operators like KubeRay, and the entire CNCF ecosystem. If you're running large-scale inference with a dedicated ML ops team, AKS is probably the right choice.&lt;/p&gt;

&lt;p&gt;But "full control" comes with "full responsibility." You're managing node pools, configuring GPU drivers, writing Helm charts, and debugging pod scheduling issues. One team I worked with spent more time operating their cluster than building their actual AI application. If you already have Kubernetes expertise on the team, great. Most teams building AI features don't, and for them it's a distraction.&lt;/p&gt;

&lt;h3&gt;
  
  
  Azure Container Apps
&lt;/h3&gt;

&lt;p&gt;ACA gives you most of what AKS offers for inference workloads (containerized deployments, autoscaling, GPU support, health probes) without the operational overhead. Setup takes minutes instead of hours. You don't need to know what a DaemonSet is.&lt;/p&gt;

&lt;p&gt;The catch is flexibility. ACA has fewer knobs than raw Kubernetes. GPU workload profiles are still relatively new, and you're limited to the instance types ACA supports. You can't install custom operators or run training workloads. But for inference, which is what most application teams actually need, it covers the use case well.&lt;/p&gt;




&lt;h2&gt;
  
  
  When ACA Is the Right Call
&lt;/h2&gt;

&lt;p&gt;I've found ACA works best in a few specific scenarios, and I want to be honest about where it doesn't.&lt;/p&gt;

&lt;p&gt;The strongest use case is &lt;strong&gt;development and iteration&lt;/strong&gt;. When you're building an agent and experimenting with different models, the last thing you want is to burn through API credits every time you test a prompt. Deploy Ollama to ACA, point your code at it, and iterate as much as you want. Scale to zero means you're only paying when you're actually working.&lt;/p&gt;

&lt;p&gt;It also makes sense for &lt;strong&gt;cost-sensitive production&lt;/strong&gt;. If you've done the math and your token volume is high enough that self-hosting is cheaper than API calls (I'll show you exactly where that crossover is in a minute), ACA lets you capture those savings without the operational burden of Kubernetes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Data sovereignty&lt;/strong&gt; comes up a lot in the government and financial services teams I work with. Some workloads simply can't send data to a third-party API, even one hosted in Azure. Self-hosting on ACA means your data never leaves your subscription, your VNet, or your region.&lt;/p&gt;

&lt;p&gt;Increasingly, I'm also seeing teams run &lt;strong&gt;hybrid architectures&lt;/strong&gt; where a cheap local model handles classification, summarization, and simple tasks while complex reasoning gets routed to Azure OpenAI. ACA makes it easy to run the local piece alongside the rest of your application.&lt;/p&gt;
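&lt;p&gt;To make the hybrid idea concrete, here is a minimal routing sketch. It is only an illustration: the backend labels &lt;code&gt;local-ollama&lt;/code&gt; and &lt;code&gt;azure-openai&lt;/code&gt; and the &lt;code&gt;pick_backend&lt;/code&gt; helper are hypothetical names, and a real router would likely look at more than a task label.&lt;/p&gt;

```python
# Hypothetical backend labels; substitute whatever your clients are called.
LOCAL_BACKEND = "local-ollama"
HOSTED_BACKEND = "azure-openai"

# Task types the article hands to the cheap local model.
LOCAL_TASKS = frozenset({"classification", "summarization"})


def pick_backend(task_type):
    """Route simple, well-bounded tasks locally and everything else to the hosted model."""
    if task_type.strip().lower() in LOCAL_TASKS:
        return LOCAL_BACKEND
    return HOSTED_BACKEND


if __name__ == "__main__":
    for task in ("classification", "Summarization", "multi-step reasoning"):
        print(f"{task}: {pick_backend(task)}")
```

&lt;p&gt;Keeping the decision in one small function means you can change the split later (say, after measuring quality on your own data) without touching the calling code.&lt;/p&gt;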

&lt;p&gt;Where ACA is &lt;em&gt;not&lt;/em&gt; the right call: training workloads, multi-GPU inference (70B+ parameter models that need model parallelism across GPUs), or situations where you need fine-grained control over GPU scheduling. For those, you want AKS or Azure ML.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Cost Crossover: Self-Hosted vs. API
&lt;/h2&gt;

&lt;p&gt;This is the question everyone asks, so let me lay it out with real numbers.&lt;/p&gt;

&lt;p&gt;At low token volumes, say 100K tokens per day, the math is roughly a wash. Azure OpenAI GPT-4 costs about $60/month at that volume. A self-hosted Llama 3 instance on ACA with CPU-only compute costs about the same, and GPT-4 is a better model, so the API wins on quality.&lt;/p&gt;

&lt;p&gt;The crossover happens around 200-300K tokens per day. Above that, self-hosting costs stay relatively flat (you're paying for compute time, not tokens), while API costs scale linearly with usage. At 1M tokens/day, Azure OpenAI runs about $600/month. The same workload self-hosted on ACA? Still around $60/month, maybe $120 if you're on a GPU profile.&lt;/p&gt;

&lt;p&gt;That's a 5-10x difference, and it only gets wider at higher volumes.&lt;/p&gt;
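&lt;p&gt;If you want to plug in your own numbers, the arithmetic above can be sketched in a few lines. This assumes a purely linear API price derived from the $60/month at 100K tokens/day figure and a flat self-hosted bill; it ignores scale-to-zero savings, storage, and egress, so treat the output as a rough guide only.&lt;/p&gt;

```python
# Figures taken from the article; they are rough estimates, not price quotes.
API_COST_AT_100K_PER_DAY = 60.0   # USD/month for 100K tokens/day via the API
SELF_HOSTED_CPU = 60.0            # USD/month, flat, CPU-only compute
SELF_HOSTED_GPU = 120.0           # USD/month, flat, GPU profile

# Implied linear API rate: dollars per month for each daily token.
API_RATE = API_COST_AT_100K_PER_DAY / 100_000


def api_monthly_cost(tokens_per_day):
    """API spend scales linearly with token volume."""
    return API_RATE * tokens_per_day


def break_even_tokens_per_day(self_hosted_monthly):
    """Daily volume at which API spend equals a flat self-hosted bill."""
    return self_hosted_monthly / API_RATE


if __name__ == "__main__":
    for volume in (100_000, 300_000, 1_000_000):
        print(f"{volume:,} tokens/day: API ${api_monthly_cost(volume):,.0f}/month")
    print(f"Break-even vs CPU: {break_even_tokens_per_day(SELF_HOSTED_CPU):,.0f} tokens/day")
    print(f"Break-even vs GPU: {break_even_tokens_per_day(SELF_HOSTED_GPU):,.0f} tokens/day")
```

&lt;p&gt;Under these assumptions the break-even lands at roughly 100K tokens/day against CPU-only compute and 200K against a GPU profile, which is why the low-volume case is a wash and the gap opens up quickly after that.&lt;/p&gt;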

&lt;p&gt;The caveat (and I always flag this) is that you're comparing different models. Llama 3 is good, but it's not GPT-4. For many tasks (classification, extraction, summarization, structured output), the quality gap is negligible. For complex multi-step reasoning, GPT-4 still has an edge. The hybrid approach I mentioned earlier lets you get the best of both.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; These cost estimates are based on Azure consumption pricing as of early 2026. Your actual costs will vary based on model size, workload profile, region, and usage patterns. Always check the &lt;a href="https://azure.microsoft.com/pricing/calculator/" rel="noopener noreferrer"&gt;Azure pricing calculator&lt;/a&gt; for current rates.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What We're Building in This Series
&lt;/h2&gt;

&lt;p&gt;Over the next four posts, we'll go from zero to a production-ready, multi-agent AI system running entirely on Azure Container Apps.&lt;/p&gt;

&lt;p&gt;We'll start in &lt;strong&gt;Part 2&lt;/strong&gt; by getting Ollama deployed and serving models, with persistent storage so you're not re-downloading 5GB on every cold start, and proper security so you don't accidentally expose an unauthenticated GPU endpoint to the internet. From there, &lt;strong&gt;Part 3&lt;/strong&gt; connects Semantic Kernel to your Ollama instance and builds a C# agent with function calling, the kind that can actually &lt;em&gt;do&lt;/em&gt; things, not just chat. &lt;strong&gt;Part 4&lt;/strong&gt; is where it starts to feel like a real system: multiple specialized agents communicating through Dapr, with Dynamic Sessions for safe code execution. Finally, &lt;strong&gt;Part 5&lt;/strong&gt; hardens everything for production: health probes that account for slow model loading, autoscaling that makes sense for LLM workloads, monitoring, and cost controls.&lt;/p&gt;

&lt;p&gt;I'll include working code for everything, and I'll call out the gotchas I've hit so you don't have to discover them yourself.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Next up: &lt;a href="https://dev.to/bspann/running-ollama-on-azure-container-apps-31bd-temp-slug-7631477"&gt;Deploying Ollama to Azure Container Apps&lt;/a&gt;, with persistent model storage and proper security.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>azure</category>
      <category>ai</category>
      <category>containers</category>
      <category>llm</category>
    </item>
    <item>
      <title>Creative Intelligence Suite: Innovation and Design Thinking for Developers (Part 4)</title>
      <dc:creator>Brian Spann</dc:creator>
      <pubDate>Mon, 13 Apr 2026 19:44:53 +0000</pubDate>
      <link>https://dev.to/bspann/creative-intelligence-suite-innovation-and-design-thinking-for-developers-part-4-ofn</link>
      <guid>https://dev.to/bspann/creative-intelligence-suite-innovation-and-design-thinking-for-developers-part-4-ofn</guid>
      <description>&lt;p&gt;Throughout this series, we've explored &lt;a href="https://dev.to/bspann/bmad-method-ai-driven-agile-development-that-actually-works-part-1-core-framework-4pmd-temp-slug-9742197"&gt;BMAD's core framework&lt;/a&gt;, &lt;a href="https://dev.to/bspann/bmad-method-workflows-deep-dive-from-idea-to-production-part-2-55po-temp-slug-2883363"&gt;workflows&lt;/a&gt;, and &lt;a href="https://dev.to/bspann/bmad-builder-creating-custom-ai-agents-for-your-domain-part-3-49e0-temp-slug-1665787"&gt;custom agent building&lt;/a&gt;. Now we tackle the fuzzy front-end of development: &lt;strong&gt;where ideas are born&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is the Creative Intelligence Suite?
&lt;/h2&gt;

&lt;p&gt;The Creative Intelligence Suite (CIS) extends BMAD with tools for &lt;strong&gt;structured creativity&lt;/strong&gt;. It's designed for those moments when you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Don't know &lt;em&gt;what&lt;/em&gt; to build yet&lt;/li&gt;
&lt;li&gt;Are stuck on a problem with no obvious solution&lt;/li&gt;
&lt;li&gt;Need to think beyond conventional approaches&lt;/li&gt;
&lt;li&gt;Want to validate ideas before investing in implementation&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Think differently."&lt;/em&gt; — CIS tagline&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  When to Use CIS
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;What CIS Offers&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Stuck on a problem&lt;/td&gt;
&lt;td&gt;Systematic diagnosis and root cause analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need fresh ideas&lt;/td&gt;
&lt;td&gt;Structured brainstorming with proven techniques&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Designing for users&lt;/td&gt;
&lt;td&gt;Human-centered design thinking process&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Finding market gaps&lt;/td&gt;
&lt;td&gt;Innovation and disruption analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Telling your product story&lt;/td&gt;
&lt;td&gt;Narrative crafting for products and features&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Preparing a pitch&lt;/td&gt;
&lt;td&gt;Persuasive presentation structuring&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Installation
&lt;/h2&gt;

&lt;p&gt;CIS is installed as a module during BMAD setup:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx bmad-method &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Select &lt;strong&gt;Creative Intelligence Suite&lt;/strong&gt; from the modules list.&lt;/p&gt;

&lt;h2&gt;
  
  
  The CIS Agent Lineup
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🚀 Innovation Strategist
&lt;/h3&gt;

&lt;p&gt;Identifies disruption opportunities and business model innovation.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-innovation

Innovation Strategist: Let's identify opportunities for disruption.

                       What market or industry are you exploring?

You: Developer productivity tools

Strategist: Interesting space! Let me guide you through the analysis:

            🎯 MARKET ANALYSIS
            Current pain points in developer tools:
            - Tool fragmentation
            - Context switching costs
            - Configuration complexity

            📊 DISRUPTION OPPORTUNITIES
            1. AI-native development environments
            2. Zero-config tooling
            3. Intent-based programming

            Which angle interests you most?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎨 Design Thinking Coach
&lt;/h3&gt;

&lt;p&gt;Guides you through human-centered design with empathy, ideation, and prototyping.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-design-thinking

Design Coach: Welcome to the design thinking journey!

              We'll work through 5 phases:
              1. 🎯 Empathize - Understand your users
              2. 🔍 Define - Frame the problem
              3. 💡 Ideate - Generate solutions
              4. 🔧 Prototype - Build to learn
              5. ✅ Test - Validate assumptions

              Let's start with empathy.
              Who are your target users?

You: Senior developers who mentor junior team members

Coach: Great! Let's understand their world:

       🎤 INTERVIEW QUESTIONS
       - What's most frustrating about mentoring?
       - How do you currently share knowledge?
       - What takes up most of your mentoring time?

       🔍 OBSERVATION PROMPTS
       - Watch a mentoring session
       - Review their code review comments
       - Analyze their documentation patterns

       What insights have you gathered?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🧠 Brainstorming Coach
&lt;/h3&gt;

&lt;p&gt;Facilitates creative ideation sessions with proven techniques.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-brainstorm

Brainstorming Coach: Ready to generate ideas!

                     What would you like to brainstorm about?

You: Ways to improve user onboarding

Coach: Let's use the SCAMPER technique!

       📝 SCAMPER ANALYSIS

       🔄 SUBSTITUTE
       - What if signup required no password?
       - Replace forms with conversation?

       🔗 COMBINE  
       - Merge onboarding with first use?
       - Combine tutorial with real tasks?

       🔧 ADAPT
       - How does gaming handle new players?
       - What can we learn from social apps?

       ✨ MODIFY
       - Make it 10x shorter?
       - Make it 10x more personal?

       🎯 PUT TO OTHER USE
       - Use onboarding for ongoing education?
       - Use it for feature discovery?

       ❌ ELIMINATE
       - Remove ALL steps - what happens?
       - No onboarding - users figure it out?

       🔃 REVERSE
       - End with signup, start with value?
       - User teaches the system?

       Which angle resonates? Let's dive deeper.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🔍 Problem Solver
&lt;/h3&gt;

&lt;p&gt;Systematic problem diagnosis and root cause analysis.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-problem-solve

Problem Solver: Let's diagnose this systematically.

                What problem are you facing?

You: Our API response times are inconsistent

Solver: I'll guide you through root cause analysis.

        📊 SYMPTOM MAPPING
        - When did this start?
        - What changed recently?
        - Is it all endpoints or specific ones?

        🔍 5 WHYS ANALYSIS
        Why are response times inconsistent?
        → [Your answer leads to next why]

        🐟 FISHBONE DIAGRAM
        Potential causes by category:

        PEOPLE: Team changes? Knowledge gaps?
        PROCESS: Deployment changes? Monitoring gaps?
        TECHNOLOGY: Dependencies? Infrastructure?
        ENVIRONMENT: Traffic patterns? External services?

        Let's explore each branch...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  💡 Creative Problem Solver
&lt;/h3&gt;

&lt;p&gt;Generates creative solutions using lateral thinking.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-creative-solve

Creative Solver: Let's think sideways!

                 What conventional problem are you trying to solve?

You: Users forget to complete their profiles

Solver: 🔀 LATERAL THINKING TECHNIQUES

        REVERSAL: What if incomplete profiles were the goal?
        → Maybe minimal profiles are faster to scan?
        → Progressive disclosure as relationships form?

        ANALOGY: How does nature handle incomplete information?
        → Trees grow toward light - profiles grow toward value
        → Caterpillar → butterfly - staged transformation

        RANDOM ENTRY: [Picking random concept: "music"]
        → Playlists build over time through use
        → Profile as a "greatest hits" that evolves

        PROVOCATION: "Profiles should be invisible"
        → What if actions spoke louder than descriptions?
        → Infer profile from behavior?

        Which direction excites you?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  📖 Storyteller
&lt;/h3&gt;

&lt;p&gt;Crafts compelling narratives for products and features.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-storytelling

Storyteller: Every great product has a story.

             What product or feature needs a narrative?

You: Our new AI code review feature

Storyteller: Let's build your product story.

             📖 STORY STRUCTURE

             THE HERO (Your User):
             A developer drowning in pull requests...

             THE STRUGGLE:
             Hours spent on tedious reviews, missing
             critical issues while catching style nits...

             THE DISCOVERY:
             An AI assistant that understands context,
             catches bugs, and learns team preferences...

             THE TRANSFORMATION:
             From code reviewer to code mentor,
             focusing on architecture and growth...

             THE NEW WORLD:
             Faster releases, fewer bugs, happier teams...

             Want me to develop any section further?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  🎤 Presentation Master
&lt;/h3&gt;

&lt;p&gt;Helps you structure persuasive presentations.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-presentation

Presentation Master: Let's build a compelling deck.

                     What's your presentation about?

You: Pitching our startup to investors

Master: 🎯 INVESTOR PITCH STRUCTURE

        1. HOOK (30 sec)
           One sentence that makes them lean in

        2. PROBLEM (1 min)
           Pain so real they feel it

        3. SOLUTION (2 min)
           Your unique insight

        4. TRACTION (1 min)
           Proof it's working

        5. MARKET (1 min)
           Why this can be huge

        6. TEAM (30 sec)
           Why you'll win

        7. ASK (30 sec)
           Specific and confident

        Let's craft your hook first.
        What's the one sentence?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CIS Workflows in Action
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Complete Example: Ideating a New Feature
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Scenario:&lt;/strong&gt; Your SaaS product needs a new feature, but you're not sure what would have the most impact.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-brainstorm
     We need a new feature for our project management tool.
     Users are asking for "something to help with planning."

Coach: Let's explore what "help with planning" means to users.

       DIVERGENT PHASE 🌊
       Generate as many ideas as possible:

       1. AI that predicts project timelines
       2. Visual roadmap builder
       3. Automatic dependency detection
       4. Team capacity planning
       5. Risk forecasting
       6. Meeting scheduler optimizer
       7. Sprint recommendation engine
       8. "What-if" scenario modeling
       9. Stakeholder communication templates
       10. Progress prediction vs actual

       [Generates 20+ ideas...]

       CONVERGENT PHASE 🎯
       Let's evaluate against criteria:

       | Idea | User Value | Feasibility | Differentiation |
       |------|------------|-------------|----------------|
       | AI timeline prediction | ⭐⭐⭐⭐⭐ | ⭐⭐⭐ | ⭐⭐⭐⭐ |
       | What-if modeling | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐⭐⭐ |
       | Capacity planning | ⭐⭐⭐⭐ | ⭐⭐⭐⭐ | ⭐⭐⭐ |

       Top candidates: What-if modeling + AI predictions

       Want to validate these with design thinking?

You: Yes, let's do design thinking on what-if modeling

Coach: Switching to Design Thinking mode...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
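&lt;p&gt;The convergent step in that transcript is a simple scoring exercise, and you can reproduce it outside the chat. The sketch below is not part of CIS; it copies the star ratings from the table and assumes equal weights for the three criteria, with &lt;code&gt;rank_ideas&lt;/code&gt; as a hypothetical helper name.&lt;/p&gt;

```python
# Star ratings copied from the convergent-phase table in the transcript:
# (user value, feasibility, differentiation), each out of 5.
RATINGS = {
    "AI timeline prediction": (5, 3, 4),
    "What-if modeling": (4, 4, 5),
    "Capacity planning": (4, 4, 3),
}


def rank_ideas(ratings, weights=(1, 1, 1)):
    """Return (idea, score) pairs sorted from highest to lowest weighted score."""
    scored = [
        (idea, sum(w * s for w, s in zip(weights, stars)))
        for idea, stars in ratings.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    for idea, score in rank_ideas(RATINGS):
        print(f"{score:2d}  {idea}")
```

&lt;p&gt;With equal weights, what-if modeling and AI timeline prediction come out on top, matching the coach's shortlist. Pass a different &lt;code&gt;weights&lt;/code&gt; tuple if, for example, feasibility matters more to your team than differentiation.&lt;/p&gt;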



&lt;h3&gt;
  
  
  Design Thinking: Validating the Idea
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Design Coach: 🎯 PHASE 1: EMPATHIZE

              Who would use what-if modeling?

              PERSONA HYPOTHESES:

              "Project Paula" - PM at a startup
              - Manages 3-5 concurrent projects
              - Constantly asked "when will this ship?"
              - Burned by scope creep before

              "Executive Eric" - VP of Engineering  
              - Needs to commit to quarterly goals
              - Balances multiple team requests
              - Lives in spreadsheets currently

              Which persona should we focus on?

You: Project Paula - she's our core user

Coach: 🔍 PHASE 2: DEFINE

       Based on empathy work, here's the problem statement:

       "Project Paula needs a way to CONFIDENTLY answer
       timeline questions because her current tools only
       show ONE path forward, hiding the uncertainty
       she knows exists."

       Key insight: She KNOWS there's uncertainty,
       but has no way to COMMUNICATE it.

       💡 PHASE 3: IDEATE

       How might we help Paula communicate uncertainty?

       IDEAS:
       1. Probability ranges instead of dates
       2. "Optimistic / Realistic / Pessimistic" views
       3. Risk-adjusted timelines
       4. Shareable scenarios for stakeholders
       5. Historical accuracy tracking

       🔧 PHASE 4: PROTOTYPE

       Let's sketch the simplest test:

       Low-fidelity prototype:
       - Add "best case / worst case" to any date
       - Show range visually on timeline
       - One-click share with stakeholders

       Want to define the test criteria?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Problem-Solving: Debugging a Business Issue
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /cis-problem-solve
     Our trial-to-paid conversion dropped 20% last month.

Problem Solver: Let's diagnose systematically.

                📊 SYMPTOM ANALYSIS

                First, some clarifying questions:
                1. Did trial signups change?
                2. Which plans saw the biggest drop?
                3. Any changes to pricing or features?
                4. What does the funnel look like?

You: Signups are stable. Drop is across all plans.
     We added a new onboarding flow 6 weeks ago.

Solver: 🔍 HYPOTHESIS FORMATION

        Timing correlates with onboarding change.

        Possible causes:
        H1: New onboarding is confusing users
        H2: New onboarding attracts wrong users
        H3: New onboarding front-loads value (no need to pay)
        H4: Technical issues in new flow

        Let's test H1 first:

        DATA TO CHECK:
        - Onboarding completion rates: old vs new
        - Time to first value: old vs new
        - Support tickets mentioning onboarding
        - User feedback/NPS scores

        What do you have access to?

You: Completion is UP 40%, but time-to-first-value
     increased from 5 min to 25 min.

Solver: 🎯 ROOT CAUSE IDENTIFIED

        The new onboarding is more thorough but
        DELAYS the "aha moment".

        Users who converted before:
        - Hit value quickly (5 min)
        - Explored on their own
        - Asked for features = signal to upgrade

        Users now:
        - Complete comprehensive tutorial (25 min)
        - Feel "trained" on free features
        - Don't discover premium value naturally

        💡 SOLUTION DIRECTIONS:
        1. Shorter path to first value, detailed training later
        2. Include premium feature preview in onboarding
        3. Guide toward premium use cases during trial

        Want to brainstorm implementations?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Combining CIS with Core BMAD
&lt;/h2&gt;

&lt;p&gt;CIS shines when integrated with the full BMAD workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────────┐
│                    FULL INNOVATION FLOW                       │
│                                                               │
│  CIS Phase                    BMAD Phase                     │
│  ─────────                    ──────────                     │
│  /cis-brainstorm      →      Product ideas generated        │
│  /cis-design-thinking →      User needs validated           │
│  /cis-innovation      →      Market opportunity confirmed   │
│                               ↓                              │
│                        /create-product-brief                 │
│                        /create-prd                           │
│                        /create-architecture                  │
│                        /dev-story                            │
│                               ↓                              │
│  /cis-storytelling    →      Launch narrative ready         │
│  /cis-presentation    →      Stakeholder buy-in secured     │
└──────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Practical Integration Example
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Phase 1: Discover (CIS)&lt;/span&gt;
/cis-brainstorm
&lt;span class="c"&gt;# Generate and evaluate feature ideas&lt;/span&gt;

/cis-design-thinking
&lt;span class="c"&gt;# Validate with user empathy&lt;/span&gt;

&lt;span class="c"&gt;# Phase 2: Define (BMAD)&lt;/span&gt;
/create-product-brief
&lt;span class="c"&gt;# Capture the validated idea&lt;/span&gt;

/create-prd
&lt;span class="c"&gt;# Full requirements with PM agent&lt;/span&gt;

&lt;span class="c"&gt;# Phase 3: Design (BMAD)&lt;/span&gt;
/create-architecture
&lt;span class="c"&gt;# Technical solution&lt;/span&gt;

&lt;span class="c"&gt;# Phase 4: Build (BMAD)&lt;/span&gt;
/sprint-planning
/dev-story
/code-review

&lt;span class="c"&gt;# Phase 5: Launch (CIS)&lt;/span&gt;
/cis-storytelling
&lt;span class="c"&gt;# Craft the launch narrative&lt;/span&gt;

/cis-presentation
&lt;span class="c"&gt;# Prepare stakeholder communications&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Team Collaboration with CIS
&lt;/h2&gt;

&lt;p&gt;CIS includes team configurations for collaborative creativity:&lt;/p&gt;

&lt;h3&gt;
  
  
  Creative Squad
&lt;/h3&gt;

&lt;p&gt;Bring together multiple CIS agents for cross-functional sessions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /party-mode Innovation, Design, Storyteller

Innovation: 🚀 Looking at this from a market disruption angle...

Design: 🎨 Let me consider the user experience implications...

Storyteller: 📖 Here's how we might frame this for users...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Design Pair
&lt;/h3&gt;

&lt;p&gt;Pair two agents for a focused session, for example the systematic and lateral problem solvers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: /party-mode Problem-Solver, Creative-Solver

Problem-Solver: 🔍 Systematically, the issue stems from...

Creative-Solver: 💡 But what if we flip that assumption...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  CIS Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Workflow&lt;/th&gt;
&lt;th&gt;Command&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Brainstorming&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-brainstorm&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Generating many ideas&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Design Thinking&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-design-thinking&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;User-centered solutions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Innovation&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-innovation&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Market opportunities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Problem Solving&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-problem-solve&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Root cause analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Creative Solving&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-creative-solve&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Unconventional solutions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Storytelling&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-storytelling&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Product narratives&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Presentations&lt;/td&gt;
&lt;td&gt;&lt;code&gt;/cis-presentation&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Persuasive decks&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Where CIS and BMAD Work Together
&lt;/h2&gt;

&lt;p&gt;Here's how CIS enhances different BMAD phases:&lt;/p&gt;

&lt;h3&gt;
  
  
  Analysis Phase
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CIS: /cis-brainstorm → Generate feature ideas
CIS: /cis-innovation → Identify market opportunities  
BMAD: /create-product-brief → Capture validated direction
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Planning Phase
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CIS: /cis-design-thinking → Validate user needs
CIS: /cis-problem-solve → Clarify problem space
BMAD: /create-prd → Document requirements
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Solutioning Phase
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CIS: /cis-creative-solve → Explore novel architectures
BMAD: /create-architecture → Document technical decisions
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Launch Phase
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;CIS: /cis-storytelling → Craft product narrative
CIS: /cis-presentation → Prepare launch materials
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Tips for Effective CIS Usage
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Separate Divergent from Convergent Thinking
&lt;/h3&gt;

&lt;p&gt;Don't evaluate ideas while generating them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❌ "That won't work because..." (during brainstorm)
✅ "Let's generate 20 ideas, then evaluate" (process)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Trust the Structure
&lt;/h3&gt;

&lt;p&gt;CIS builds on established techniques such as SCAMPER, the 5 Whys, and design thinking. Even when a step feels uncomfortable, follow the process:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;❌ Skipping empathy to jump to solutions
✅ Completing all design thinking phases
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Document Insights
&lt;/h3&gt;

&lt;p&gt;Capture what you learn for future reference:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/cis-brainstorm
→ Save promising ideas to _bmad-output/brainstorm-results.md

/cis-design-thinking
→ Save personas to _bmad-output/user-personas.md
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
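
&lt;p&gt;One way to make this capture step repeatable is a small script that appends notes to the files named above. The sketch below is illustrative only: &lt;code&gt;save_insight&lt;/code&gt; is a hypothetical helper, not a BMAD or CIS feature, and it assumes you copy the ideas out of the session yourself.&lt;/p&gt;

```python
from datetime import date
from pathlib import Path

# Directory name taken from the article; everything else here is an assumption.
OUTPUT_DIR = Path("_bmad-output")


def save_insight(filename, heading, notes, output_dir=OUTPUT_DIR):
    """Append a dated section of bullet points to a Markdown file.

    Hypothetical helper for illustration; it is not part of BMAD or CIS.
    """
    output_dir.mkdir(parents=True, exist_ok=True)
    target = output_dir / filename

    # Build one Markdown section: a dated heading followed by one bullet per note.
    lines = [f"## {heading} ({date.today().isoformat()})", ""]
    lines += [f"- {note}" for note in notes]

    # Append rather than overwrite so earlier sessions are preserved.
    with target.open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n\n")
    return target
```

&lt;p&gt;For example, calling &lt;code&gt;save_insight("brainstorm-results.md", "Onboarding ideas", ["Guided tour", "Sample project"])&lt;/code&gt; after a &lt;code&gt;/cis-brainstorm&lt;/code&gt; session adds a new dated section and leaves earlier entries in place.&lt;/p&gt;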



&lt;h3&gt;
  
  
  4. Combine Techniques
&lt;/h3&gt;

&lt;p&gt;Different situations need different approaches:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Vague problem → /cis-problem-solve first
Clear problem → /cis-brainstorm directly
User-facing → /cis-design-thinking required
Technical → /cis-creative-solve useful
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
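
&lt;p&gt;The mapping above is essentially a lookup table. As a rough sketch, here it is as a Python dictionary with a fallback to &lt;code&gt;/bmad-help&lt;/code&gt;. The situation labels and the &lt;code&gt;recommend_command&lt;/code&gt; function are assumptions made for this example. BMAD does not ship this code.&lt;/p&gt;

```python
# Illustrative lookup only: the labels and this helper are assumptions for
# the example, not part of BMAD or CIS.
CIS_STARTING_POINTS = {
    "vague problem": "/cis-problem-solve",
    "clear problem": "/cis-brainstorm",
    "user-facing": "/cis-design-thinking",
    "technical": "/cis-creative-solve",
}


def recommend_command(situation):
    """Return the CIS command to start with for a known situation."""
    key = situation.strip().lower()
    if key not in CIS_STARTING_POINTS:
        # Unknown situations fall back to asking BMAD's help command.
        return "/bmad-help"
    return CIS_STARTING_POINTS[key]


if __name__ == "__main__":
    for label in ("Vague problem", "User-facing", "Not sure yet"):
        print(label, "starts with", recommend_command(label))
```

&lt;p&gt;Treat the table as a starting point. A user-facing feature with a vague problem statement may need both &lt;code&gt;/cis-problem-solve&lt;/code&gt; and &lt;code&gt;/cis-design-thinking&lt;/code&gt;, in that order.&lt;/p&gt;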



&lt;h3&gt;
  
  
  5. Use CIS Help
&lt;/h3&gt;

&lt;p&gt;CIS is integrated with BMAD's help system, so you can describe your situation and ask which technique to start with:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/bmad-help I'm stuck on a problem and don't know where to start

BMAD: Based on your situation, I recommend:
      - /cis-problem-solve for diagnosis
      - /cis-brainstorm once problem is clear
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Series Wrap-Up
&lt;/h2&gt;

&lt;p&gt;Over these four articles, we've covered:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Core BMAD&lt;/strong&gt;: AI as collaborator with 12+ specialized agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Workflows&lt;/strong&gt;: Quick Flow, Full Planning, and Party Mode&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BMad Builder&lt;/strong&gt;: Creating custom agents and modules&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creative Intelligence Suite&lt;/strong&gt;: Innovation and design thinking&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The BMAD Philosophy Revisited
&lt;/h3&gt;

&lt;p&gt;BMAD isn't about AI doing work &lt;em&gt;for&lt;/em&gt; you—it's about AI as a &lt;strong&gt;thinking partner&lt;/strong&gt; that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Brings structure to chaos&lt;/li&gt;
&lt;li&gt;Ensures nothing is forgotten&lt;/li&gt;
&lt;li&gt;Provides expert perspectives on demand&lt;/li&gt;
&lt;li&gt;Maintains context across sessions&lt;/li&gt;
&lt;li&gt;Scales from bug fixes to enterprise systems&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Getting Started
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx bmad-method &lt;span class="nb"&gt;install&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Select the modules you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BMad Method (BMM)&lt;/strong&gt; — Core workflows&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BMad Builder (BMB)&lt;/strong&gt; — Custom agents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Creative Intelligence Suite (CIS)&lt;/strong&gt; — Innovation tools&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Resources
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;💻 &lt;a href="https://github.com/bmad-code-org/BMAD-METHOD" rel="noopener noreferrer"&gt;BMAD-Method GitHub&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🔧 &lt;a href="https://github.com/bmad-code-org/bmad-builder" rel="noopener noreferrer"&gt;BMad Builder&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;💡 &lt;a href="https://github.com/bmad-code-org/bmad-module-creative-intelligence-suite" rel="noopener noreferrer"&gt;Creative Intelligence Suite&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📚 &lt;a href="http://docs.bmad-method.org" rel="noopener noreferrer"&gt;Documentation&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;🎮 &lt;a href="https://discord.gg/gk8jAdXWmj" rel="noopener noreferrer"&gt;Discord Community&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;📺 &lt;a href="https://www.youtube.com/@BMadCode" rel="noopener noreferrer"&gt;YouTube Channel&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;The Creative Intelligence Suite completes the BMAD ecosystem—from the first spark of an idea through implementation and launch. Whether you're solving problems, generating ideas, or crafting narratives, CIS provides the structured creativity that turns good developers into innovative ones.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thanks for following this series! Questions? Ideas for future topics? Drop them in the comments!&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  Series Index
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;a href="https://dev.to/bspann/bmad-method-ai-driven-agile-development-that-actually-works-part-1-core-framework-4pmd-temp-slug-9742197"&gt;BMAD-Method Core: AI-Driven Agile Development That Actually Works&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/bspann/bmad-method-workflows-deep-dive-from-idea-to-production-part-2-55po-temp-slug-2883363"&gt;BMAD-Method Workflows Deep Dive: From Idea to Production&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/bspann/bmad-builder-creating-custom-ai-agents-for-your-domain-part-3-49e0-temp-slug-1665787"&gt;BMad Builder: Creating Custom AI Agents for Your Domain&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Creative Intelligence Suite: Innovation and Design Thinking for Developers (this article)&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>ai</category>
      <category>creativity</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
