<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shamsuddin Ahmed</title>
    <description>The latest articles on DEV Community by Shamsuddin Ahmed (@shamspias).</description>
    <link>https://dev.to/shamspias</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1454053%2F50176f08-83a7-4d25-b777-25899425f500.png</url>
      <title>DEV Community: Shamsuddin Ahmed</title>
      <link>https://dev.to/shamspias</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/shamspias"/>
    <language>en</language>
    <item>
      <title>Green Padlock, Zero Headache: Let’s Encrypt SSL for Self-Hosted Dify</title>
      <dc:creator>Shamsuddin Ahmed</dc:creator>
      <pubDate>Thu, 08 May 2025 18:06:34 +0000</pubDate>
      <link>https://dev.to/shamspias/green-padlock-zero-headache-lets-encrypt-ssl-for-self-hosted-dify-cie</link>
      <guid>https://dev.to/shamspias/green-padlock-zero-headache-lets-encrypt-ssl-for-self-hosted-dify-cie</guid>
      <description>&lt;p&gt;&lt;em&gt;“Your app isn’t &lt;strong&gt;production&lt;/strong&gt; until the padlock turns green.”&lt;/em&gt;&lt;br&gt;
This guide merges the &lt;strong&gt;one-liner speed&lt;/strong&gt; of the quick-start and the &lt;strong&gt;step-by-step clarity&lt;/strong&gt; of the original long-form post.&lt;br&gt;&lt;br&gt;
Follow along and you’ll mint fresh Let’s Encrypt certificates, wire them into Dify’s Nginx, and set-and-forget auto-renewal—all in &lt;strong&gt;~15 minutes&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  0 Why bother?
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;✅ Win&lt;/th&gt;
&lt;th&gt;🚀 Why it matters&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;End-to-end encryption&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Keep chat sessions &amp;amp; API calls private.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Browser trust&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No more red “Not secure” labels.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Free &amp;amp; automated&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Let’s Encrypt certificates are free and valid for 90 days; renewing them needs no credit card and no cron anxiety.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h2&gt;
  
  
  1 Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Domain&lt;/strong&gt; – &lt;code&gt;A&lt;/code&gt;/&lt;code&gt;AAAA&lt;/code&gt; record → your server
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ports&lt;/strong&gt; – 80 &amp;amp; 443 open on firewall/cloud SG
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker&lt;/strong&gt; – v24 + Compose v2 (&lt;code&gt;docker compose version&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dify repo&lt;/strong&gt; – &lt;code&gt;git clone https://github.com/langgenius/dify.git &amp;amp;&amp;amp; cd dify/docker&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  2 Patch &lt;code&gt;.env&lt;/code&gt; (tell Dify who it is)
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# --- SSL filenames Nginx expects&lt;/span&gt;
&lt;span class="nv"&gt;NGINX_SSL_CERT_FILENAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;fullchain.pem
&lt;span class="nv"&gt;NGINX_SSL_CERT_KEY_FILENAME&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;privkey.pem

&lt;span class="c"&gt;# --- Let Certbot answer ACME challenges&lt;/span&gt;
&lt;span class="nv"&gt;NGINX_ENABLE_CERTBOT_CHALLENGE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;

&lt;span class="c"&gt;# --- Your domain + a real e-mail&lt;/span&gt;
&lt;span class="nv"&gt;CERTBOT_DOMAIN&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;dify.example.com
&lt;span class="nv"&gt;CERTBOT_EMAIL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ops@example.com

&lt;span class="c"&gt;# --- Leave this OFF until the cert exists&lt;/span&gt;
&lt;span class="nv"&gt;NGINX_HTTPS_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;false&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Got multiple sub-domains?&lt;/strong&gt;&lt;br&gt;
Add &lt;code&gt;CONSOLE_API_URL&lt;/code&gt;, &lt;code&gt;CONSOLE_WEB_URL&lt;/code&gt;, &lt;code&gt;SERVICE_API_URL&lt;/code&gt;, &lt;code&gt;APP_API_URL&lt;/code&gt;, and &lt;code&gt;APP_WEB_URL&lt;/code&gt;, each with &lt;code&gt;https://…&lt;/code&gt; so Dify’s front-end matches its back-end once HTTPS is live.&lt;/p&gt;
&lt;/blockquote&gt;
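&lt;p&gt;Before launching the stack, it can help to confirm that the keys from the snippet above are actually set. The sketch below is a hypothetical helper, not part of Dify: it parses &lt;code&gt;.env&lt;/code&gt;-style text and lists the required keys that are missing or empty. The function names and the sample text are assumptions made for illustration.&lt;/p&gt;

```python
# Hypothetical pre-flight check for the .env keys used in this guide.
REQUIRED_KEYS = (
    "NGINX_SSL_CERT_FILENAME",
    "NGINX_SSL_CERT_KEY_FILENAME",
    "NGINX_ENABLE_CERTBOT_CHALLENGE",
    "CERTBOT_DOMAIN",
    "CERTBOT_EMAIL",
)


def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def missing_keys(text: str) -> list:
    """Return the required keys that are absent or have an empty value."""
    env = parse_env(text)
    return [key for key in REQUIRED_KEYS if not env.get(key)]


sample = """
# --- sample values, replace with your own
NGINX_SSL_CERT_FILENAME=fullchain.pem
NGINX_SSL_CERT_KEY_FILENAME=privkey.pem
NGINX_ENABLE_CERTBOT_CHALLENGE=true
CERTBOT_DOMAIN=dify.example.com
CERTBOT_EMAIL=
"""

print(missing_keys(sample))  # ['CERTBOT_EMAIL']
```

&lt;p&gt;To check a real file, read it with &lt;code&gt;open(".env").read()&lt;/code&gt; from the &lt;code&gt;dify/docker&lt;/code&gt; directory and pass the text to &lt;code&gt;missing_keys&lt;/code&gt;.&lt;/p&gt;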
&lt;h2&gt;
  
  
  3 (Tidy-up) Prune stray Docker networks
&lt;/h2&gt;

&lt;p&gt;Old bridges sometimes clash with the Certbot profile.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker network prune        &lt;span class="c"&gt;# hit y when asked&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  4 Launch the stack &lt;strong&gt;with&lt;/strong&gt; the Certbot profile
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose &lt;span class="nt"&gt;--profile&lt;/span&gt; certbot up &lt;span class="nt"&gt;--force-recreate&lt;/span&gt; &lt;span class="nt"&gt;-d&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Flag&lt;/th&gt;
&lt;th&gt;Why we need it&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--profile certbot&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Starts &lt;em&gt;nginx&lt;/em&gt; + &lt;em&gt;certbot&lt;/em&gt; side-cars.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--force-recreate&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Recreates the containers (no image rebuild) so nginx mounts the Let’s Encrypt volume.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;-d&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Runs in the background (detached).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  5 Run Certbot inside its container
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; certbot /bin/sh /update-cert.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What just happened?&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Certbot spins a micro webroot on &lt;strong&gt;:80&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;Let’s Encrypt validates &lt;code&gt;dify.example.com&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Keys drop into &lt;code&gt;volumes/certbot/conf/live/dify.example.com/&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Expect the triumphant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Congratulations! Your certificate and chain have been saved at:
  /etc/letsencrypt/live/dify.example.com/fullchain.pem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  6 Flip the HTTPS switch
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# in docker/.env&lt;/span&gt;
&lt;span class="nv"&gt;NGINX_HTTPS_ENABLED&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;true&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  7 Recreate &lt;strong&gt;only&lt;/strong&gt; nginx (fast restart)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker compose &lt;span class="nt"&gt;--profile&lt;/span&gt; certbot up &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;--no-deps&lt;/span&gt; &lt;span class="nt"&gt;--force-recreate&lt;/span&gt; nginx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Nginx boots, finds &lt;code&gt;fullchain.pem&lt;/code&gt; &amp;amp; &lt;code&gt;privkey.pem&lt;/code&gt;, and starts serving &lt;strong&gt;443&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  8 Verify the green padlock
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://dify.example.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You should see:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔒 Green padlock&lt;/li&gt;
&lt;li&gt;Auto-redirect from &lt;code&gt;http://&lt;/code&gt; → &lt;code&gt;https://&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  9 Renewal in two commands (bookmark this)
&lt;/h2&gt;

&lt;p&gt;Let’s Encrypt certs last 90 days. Certbot’s built-in cron renews them, but here’s the manual drill:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 1. Force renew (safe to run anytime)&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-it&lt;/span&gt; certbot /bin/sh /update-cert.sh

&lt;span class="c"&gt;# 2. Hot-reload nginx to pick up the new files&lt;/span&gt;
docker compose &lt;span class="nb"&gt;exec &lt;/span&gt;nginx nginx &lt;span class="nt"&gt;-s&lt;/span&gt; reload
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
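&lt;p&gt;To judge when the drill is due, you can count the days left on the current certificate. This is a minimal sketch, assuming you have already read the expiry string yourself, for example with &lt;code&gt;openssl x509 -enddate -noout -in volumes/certbot/conf/live/dify.example.com/fullchain.pem&lt;/code&gt;. The helper names and the 30-day threshold are illustrative choices, not part of Dify or Certbot.&lt;/p&gt;

```python
import ssl
from datetime import datetime, timezone


def days_left(not_after: str, now: datetime) -> float:
    """Days between `now` and a certificate notAfter string.

    `not_after` uses the OpenSSL text form, e.g. "Aug  6 18:06:34 2025 GMT".
    """
    expires_at = ssl.cert_time_to_seconds(not_after)
    return (expires_at - now.timestamp()) / 86400


def should_renew(not_after: str, now: datetime, threshold_days: float = 30) -> bool:
    """True once the remaining lifetime is at or below the threshold (assumed 30 days)."""
    return threshold_days >= days_left(not_after, now)


# Example values only: a 90-day certificate issued on 8 May 2025.
issued = datetime(2025, 5, 8, 18, 6, 34, tzinfo=timezone.utc)
expiry = "Aug  6 18:06:34 2025 GMT"

print(days_left(expiry, issued))     # 90.0
print(should_renew(expiry, issued))  # False
```

&lt;p&gt;In practice you would pass &lt;code&gt;datetime.now(timezone.utc)&lt;/code&gt; as &lt;code&gt;now&lt;/code&gt; and run the two renewal commands when &lt;code&gt;should_renew&lt;/code&gt; returns &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;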



&lt;h2&gt;
  
  
  10 Troubleshooting cheat-sheet
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;🛠️ Symptom&lt;/th&gt;
&lt;th&gt;💡 Fix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;timeout during connect (likely firewall)&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Port 80 closed or DNS not propagated.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;Permission denied&lt;/code&gt; on &lt;code&gt;/update-cert.sh&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Script lost its +x bit—&lt;code&gt;chmod +x /update-cert.sh&lt;/code&gt; inside the image or rebuild.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Infinite “Loading…” UI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;All five &lt;code&gt;*_WEB_URL&lt;/code&gt; / &lt;code&gt;*_API_URL&lt;/code&gt; env vars &lt;strong&gt;must&lt;/strong&gt; match the new &lt;code&gt;https://…&lt;/code&gt; URL.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;code&gt;bind: address already in use: 0.0.0.0:80&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Another web server running—stop Apache/old Nginx or change Compose ports.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  11 Recap (copy-paste edition)
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# 0  clone repo &amp;amp; cd dify/docker&lt;/span&gt;
&lt;span class="c"&gt;# 1  edit .env  → domain, e-mail, filenames&lt;/span&gt;
&lt;span class="c"&gt;# 2  docker network prune&lt;/span&gt;
&lt;span class="c"&gt;# 3  docker compose --profile certbot up --force-recreate -d&lt;/span&gt;
&lt;span class="c"&gt;# 4  docker compose exec -it certbot /bin/sh /update-cert.sh&lt;/span&gt;
&lt;span class="c"&gt;# 5  set NGINX_HTTPS_ENABLED=true in .env&lt;/span&gt;
&lt;span class="c"&gt;# 6  docker compose --profile certbot up -d --no-deps --force-recreate nginx&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Fifteen minutes, one free cert, &lt;strong&gt;zero recurring fees&lt;/strong&gt;—your Dify instance is officially production-grade.&lt;/p&gt;

&lt;p&gt;Happy shipping! &lt;/p&gt;

</description>
      <category>dify</category>
      <category>letsencrypt</category>
      <category>docker</category>
      <category>devops</category>
    </item>
    <item>
      <title>Homomorphic Encryption (HE) Explained: A Beginner’s Guide to Secure AI on Encrypted Data</title>
      <dc:creator>Shamsuddin Ahmed</dc:creator>
      <pubDate>Thu, 01 May 2025 22:35:44 +0000</pubDate>
      <link>https://dev.to/shamspias/homomorphic-encryption-he-explained-a-beginners-guide-to-secure-ai-on-encrypted-data-2en1</link>
      <guid>https://dev.to/shamspias/homomorphic-encryption-he-explained-a-beginners-guide-to-secure-ai-on-encrypted-data-2en1</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Imagine this:&lt;/strong&gt; You send a message to a server. That message is completely encrypted—nobody can read it. But the server still &lt;em&gt;knows&lt;/em&gt; whether your message is harmful or not. It didn't decrypt it. It didn't peek inside. It just... ran a program &lt;strong&gt;on the encrypted data&lt;/strong&gt;. Sounds like magic? That’s &lt;strong&gt;Homomorphic Encryption (HE)&lt;/strong&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this post, we’ll break down what homomorphic encryption is, why it matters in today’s privacy-first digital world, and how &lt;strong&gt;you&lt;/strong&gt; can start experimenting with it—even if you don’t have a background in cryptography or security.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔐 What Is Homomorphic Encryption?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Homomorphic Encryption (HE)&lt;/strong&gt; is a way to do calculations on data &lt;strong&gt;without decrypting it&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  🤯 Wait, what?
&lt;/h3&gt;

&lt;p&gt;Let’s say you lock a number in a box (&lt;code&gt;5 → 🔒5&lt;/code&gt;) and send it to a server. The server doesn’t unlock the box. But somehow, it adds &lt;code&gt;3&lt;/code&gt; to the &lt;strong&gt;locked&lt;/strong&gt; number and sends back a locked result. You open the box later and get... &lt;code&gt;8&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;That’s the power of HE: you can &lt;strong&gt;process encrypted data&lt;/strong&gt; and only decrypt the result later.&lt;/p&gt;




&lt;h2&gt;
  
  
  💡 Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;In a world where &lt;strong&gt;privacy is everything&lt;/strong&gt;—from end-to-end encrypted messaging to secure medical data—HE enables systems to analyze and process data &lt;strong&gt;without ever seeing it&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use cases:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔍 Detect child exploitation in encrypted messages&lt;/li&gt;
&lt;li&gt;🏥 Analyze encrypted medical records&lt;/li&gt;
&lt;li&gt;📊 Run secure AI models without leaking user data&lt;/li&gt;
&lt;li&gt;💬 Filter toxic messages in encrypted chats&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🧠 How Does HE Work (Simple Version)
&lt;/h2&gt;

&lt;p&gt;All HE systems work with a few basic ideas:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You encrypt a number&lt;/strong&gt;: Let’s say &lt;code&gt;m = 5&lt;/code&gt; becomes &lt;code&gt;Enc(m) = 🔒5&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You perform math on the encrypted number&lt;/strong&gt;: For example, &lt;code&gt;Enc(5) + Enc(3)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You decrypt the result&lt;/strong&gt;: You get &lt;code&gt;5 + 3 = 8&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So &lt;code&gt;Decrypt(Enc(5) + Enc(3)) = 8&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;But here’s the twist: The server &lt;strong&gt;never&lt;/strong&gt; saw the 5 or the 3. Just the encrypted stuff.&lt;/p&gt;
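&lt;p&gt;One way to see this concretely is the Paillier scheme, which is additively homomorphic: multiplying two ciphertexts gives an encryption of the sum of the plaintexts. The sketch below is a toy, with tiny hard-coded primes (61 and 53, picked only for illustration) that are far too small for real use. The function names are hypothetical, and &lt;code&gt;pow(x, -1, n)&lt;/code&gt; needs Python 3.8 or newer.&lt;/p&gt;

```python
import math
import secrets


def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (illustration only)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m: int) -> int:
    """Enc(m) = g^m * r^n mod n^2 for a random r coprime to n."""
    n, g = pub
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)


def decrypt(pub, priv, c: int) -> int:
    """Dec(c) = L(c^lam mod n^2) * mu mod n."""
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def add_encrypted(pub, c1: int, c2: int) -> int:
    """Homomorphic addition: multiply the ciphertexts modulo n^2."""
    n, _ = pub
    return (c1 * c2) % (n * n)


pub, priv = keygen(61, 53)
enc_5 = encrypt(pub, 5)
enc_3 = encrypt(pub, 3)
enc_sum = add_encrypted(pub, enc_5, enc_3)  # the "server" only touches ciphertexts
print(decrypt(pub, priv, enc_sum))  # 8
```

&lt;p&gt;The party calling &lt;code&gt;add_encrypted&lt;/code&gt; needs only the public key, which is the whole point: it never learns 5, 3, or 8.&lt;/p&gt;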




&lt;h2&gt;
  
  
  🍱 Types of Homomorphic Encryption
&lt;/h2&gt;

&lt;p&gt;There are several types of HE:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Type&lt;/th&gt;
&lt;th&gt;What It Can Do&lt;/th&gt;
&lt;th&gt;Example&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Partially HE (PHE)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Only one operation (add &lt;strong&gt;or&lt;/strong&gt; multiply)&lt;/td&gt;
&lt;td&gt;Paillier (add), unpadded “textbook” RSA (multiply)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Somewhat HE (SHE)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A few operations (limited depth)&lt;/td&gt;
&lt;td&gt;Good for simple tasks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fully HE (FHE)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Unlimited operations&lt;/td&gt;
&lt;td&gt;The holy grail—do anything!&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
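&lt;p&gt;The “multiply” entry for RSA can be checked with textbook (unpadded) RSA, where the product of two ciphertexts decrypts to the product of the plaintexts modulo &lt;code&gt;n&lt;/code&gt;. The key below is a toy (n = 3233, e = 17, d = 2753) used purely for illustration, and the function names are hypothetical. Real-world RSA adds padding, which deliberately removes this property.&lt;/p&gt;

```python
# Toy textbook-RSA key: n = 61 * 53, and e * d = 1 modulo lcm(60, 52).
# Illustration only; never use unpadded RSA or keys this small in practice.
N, E, D = 3233, 17, 2753


def rsa_encrypt(m: int) -> int:
    """Textbook RSA encryption: m^e mod n."""
    return pow(m, E, N)


def rsa_decrypt(c: int) -> int:
    """Textbook RSA decryption: c^d mod n."""
    return pow(c, D, N)


a, b = 7, 6
# Multiply the two ciphertexts without decrypting either of them.
product_ciphertext = (rsa_encrypt(a) * rsa_encrypt(b)) % N
print(rsa_decrypt(product_ciphertext))  # 42
```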

&lt;blockquote&gt;
&lt;p&gt;The first FHE scheme was constructed by Craig Gentry in 2009. Since then, newer schemes and libraries have made FHE steadily faster and more usable.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  🤖 HE + Machine Learning = Privacy-Preserving AI
&lt;/h2&gt;

&lt;p&gt;Let’s talk AI. You want to build a model that detects bad messages... but the messages are encrypted. Can HE help?&lt;/p&gt;

&lt;p&gt;Yes.&lt;/p&gt;

&lt;p&gt;Here’s how:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The &lt;strong&gt;user encrypts&lt;/strong&gt; their message.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;server runs a machine learning model&lt;/strong&gt; on the encrypted data.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;output is encrypted&lt;/strong&gt; too.&lt;/li&gt;
&lt;li&gt;Only the user (or a trusted party) decrypts the result.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Nobody sees the actual message. Ever.&lt;/strong&gt; But we still get a result like:&lt;br&gt;&lt;br&gt;
✅ &lt;em&gt;Safe&lt;/em&gt;&lt;br&gt;&lt;br&gt;
🚫 &lt;em&gt;Violating&lt;/em&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  🛠️ How To Get Started with Homomorphic Encryption (In Python)
&lt;/h2&gt;

&lt;p&gt;Let’s get practical. There are several Python libraries that make HE accessible—even for beginners.&lt;/p&gt;
&lt;h3&gt;
  
  
  🧰 Tools You Can Use
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Library&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://github.com/OpenMined/TenSEAL" rel="noopener noreferrer"&gt;TenSEAL&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Easy encrypted ML in Python&lt;/td&gt;
&lt;td&gt;Built on Microsoft SEAL&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://github.com/microsoft/SEAL" rel="noopener noreferrer"&gt;Microsoft SEAL&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hardcore HE with C++ performance&lt;/td&gt;
&lt;td&gt;Very powerful&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://github.com/OpenMined/PySyft" rel="noopener noreferrer"&gt;PySyft&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Federated + encrypted ML&lt;/td&gt;
&lt;td&gt;Works with PyTorch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://github.com/ibarrond/Pyfhel" rel="noopener noreferrer"&gt;Pyfhel&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Lightweight Python HE&lt;/td&gt;
&lt;td&gt;Good for learning&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;&lt;a href="https://github.com/zama-ai/concrete-ml" rel="noopener noreferrer"&gt;Concrete-ML&lt;/a&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;ML → FHE pipeline&lt;/td&gt;
&lt;td&gt;Great for fast demos&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h3&gt;
  
  
  🔢 Simple Example: Encrypted Classification in Python
&lt;/h3&gt;

&lt;p&gt;Let’s say we want to classify a message as &lt;em&gt;safe&lt;/em&gt; or &lt;em&gt;not safe&lt;/em&gt;. We’ve got two features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Suspicious keywords count&lt;/li&gt;
&lt;li&gt;Message length&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We encrypt these features, apply a linear model, and decrypt the result.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;tenseal&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;

&lt;span class="c1"&gt;# Setup encryption context
&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;context&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SCHEME_TYPE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CKKS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;poly_modulus_degree&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;coeff_mod_bit_sizes&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generate_galois_keys&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;global_scale&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;

&lt;span class="c1"&gt;# Public copy with the secret key stripped: safe to hand to whoever encrypts or evaluates
&lt;/span&gt;&lt;span class="n"&gt;public_context&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;public_context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;make_context_public&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="c1"&gt;# User encrypts their features
&lt;/span&gt;&lt;span class="n"&gt;features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;3.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# 3 bad keywords, 50 characters
&lt;/span&gt;&lt;span class="n"&gt;enc_features&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ckks_vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;public_context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;features&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Model: score = 0.5*x1 - 0.3*x2 + 0.1
&lt;/span&gt;&lt;span class="n"&gt;weights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;bias&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.1&lt;/span&gt;

&lt;span class="c1"&gt;# Server evaluates the model on encrypted data
&lt;/span&gt;&lt;span class="n"&gt;enc_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enc_features&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;weights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;enc_score&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;bias&lt;/span&gt;

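&lt;span class="c1"&gt;# Server returns the encrypted score; the key owner decrypts it with the secret key
&lt;/span&gt;&lt;span class="n"&gt;decrypted_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;enc_score&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;decrypt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;secret_key&lt;/span&gt;&lt;span class="p"&gt;())[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;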
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Score:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;decrypted_score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Class:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;violating&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;decrypted_score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;safe&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;🧪 This example shows HE in action: a server can run a model on encrypted data. Only the client sees the result.&lt;/p&gt;
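&lt;p&gt;Because CKKS is an approximate scheme, the decrypted score should land close to the plaintext value rather than match it bit for bit. A plaintext reference, using the same weights, bias, and features as the example, gives you something to compare against. The variable names below are illustrative.&lt;/p&gt;

```python
import math

# Same model and input as the encrypted example, computed in the clear.
features = [3.0, 50.0]   # 3 bad keywords, 50 characters
weights = [0.5, -0.3]
bias = 0.1

# score = 0.5*x1 - 0.3*x2 + 0.1
plain_score = sum(w * x for w, x in zip(weights, features)) + bias
label = "violating" if plain_score > 0 else "safe"

print("Score:", plain_score)  # about -13.4
print("Class:", label)        # safe

# When comparing with a decrypted CKKS score, allow a small tolerance
# instead of testing for exact equality.
assert math.isclose(plain_score, -13.4, abs_tol=1e-9)
```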




&lt;h2&gt;
  
  
  🧭 Where to Go Next?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Try TenSEAL&lt;/strong&gt; – It's great for Python users who want encrypted ML.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read up on CKKS&lt;/strong&gt; – It’s the HE scheme most used in AI.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Think of a use case&lt;/strong&gt; – How could you apply HE in healthcare, finance, or messaging?&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  ⚠️ Real Talk: Challenges of HE
&lt;/h2&gt;

&lt;p&gt;Homomorphic encryption is powerful, but it's not perfect.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Challenge&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;🚀 Slower than plaintext&lt;/td&gt;
&lt;td&gt;But improving fast! Use batching and approximate math&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🤹‍♂️ Only supports + and ×&lt;/td&gt;
&lt;td&gt;Use polynomials to approximate activations&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;🔊 Noise grows with operations&lt;/td&gt;
&lt;td&gt;Choose parameters carefully&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;📦 Data gets large&lt;/td&gt;
&lt;td&gt;One ciphertext can run from tens to hundreds of kilobytes, depending on the parameters&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
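&lt;p&gt;The “use polynomials to approximate activations” row deserves a concrete look. The sketch below compares the sigmoid function with a degree-3 polynomial that needs only additions and multiplications. The coefficients (0.5, 0.197, -0.004) are a commonly quoted fit for inputs roughly in [-5, 5]; treat them, and the helper names, as assumptions rather than the one correct choice.&lt;/p&gt;

```python
import math


def sigmoid(x: float) -> float:
    """Exact sigmoid; needs exp and division, which HE schemes do not offer directly."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_poly(x: float) -> float:
    """Degree-3 stand-in built from + and * only (assumed coefficients)."""
    return 0.5 + 0.197 * x - 0.004 * x * x * x


def max_abs_error(lo: float, hi: float, steps: int = 1000) -> float:
    """Largest gap between the two functions on an evenly spaced grid."""
    points = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    return max(abs(sigmoid(x) - sigmoid_poly(x)) for x in points)


for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    print(x, round(sigmoid(x), 3), round(sigmoid_poly(x), 3))

print("max error on [-5, 5]:", max_abs_error(-5.0, 5.0))
```

&lt;p&gt;Outside the fitted range the polynomial drifts away from the sigmoid quickly, so inputs usually need to be scaled into that range before encryption.&lt;/p&gt;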

&lt;p&gt;Still, it’s absolutely worth learning if you care about &lt;strong&gt;privacy-first machine learning&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✅ Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Homomorphic encryption &lt;strong&gt;lets you compute on secrets&lt;/strong&gt;. That’s huge.&lt;/p&gt;

&lt;p&gt;It means we can have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AI without surveillance&lt;/li&gt;
&lt;li&gt;Insight without exposure&lt;/li&gt;
&lt;li&gt;Safety without spying&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you’re dreaming about building AI that respects user privacy—even on encrypted messages—&lt;strong&gt;this is your starting point&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Let’s build the future where privacy and AI work &lt;strong&gt;together&lt;/strong&gt;, not against each other.&lt;/p&gt;

</description>
      <category>privacy</category>
      <category>encryption</category>
      <category>machinelearning</category>
      <category>homomorphicencryption</category>
    </item>
    <item>
      <title>Retrieval Metrics Demystified: From BM25 Baselines to EM@5 &amp; Answer F1</title>
      <dc:creator>Shamsuddin Ahmed</dc:creator>
      <pubDate>Tue, 29 Apr 2025 10:38:04 +0000</pubDate>
      <link>https://dev.to/shamspias/retrieval-metrics-demystified-from-bm25-baselines-to-em5-answer-f1-ldl</link>
      <guid>https://dev.to/shamspias/retrieval-metrics-demystified-from-bm25-baselines-to-em5-answer-f1-ldl</guid>
      <description>&lt;p&gt;&lt;em&gt;“If a fact falls in a database and nobody retrieves it, does it make a sound?”&lt;/em&gt;&lt;br&gt;&lt;br&gt;
Retrieval‑Augmented Generation (RAG) lives or dies on that first hop—&lt;strong&gt;can the system put the right snippets in front of the language model?&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
In this post, we peel back the buzzwords (&lt;em&gt;BM25&lt;/em&gt;, &lt;em&gt;EM@5&lt;/em&gt;, &lt;em&gt;F1&lt;/em&gt;) and show how to turn them into levers you can actually pull.&lt;/p&gt;


&lt;h2&gt;
  
  
  1. Why bother measuring retrieval separately?
&lt;/h2&gt;

&lt;p&gt;End‑to‑end metrics (BLEU, ROUGE, human ratings) blur two questions together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;em&gt;Did I pull the right passages?&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Did the generator use them well?&lt;/em&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Untangling the knot matters. If you log a 5‑point jump in answer F1, you want to know &lt;strong&gt;where&lt;/strong&gt; the jump came from—better retrieval, a smarter prompt, or a lucky seed? The retrieval metrics below give you that X‑ray.&lt;/p&gt;


&lt;h2&gt;
  
  
  2. BM25—the keyword workhorse
&lt;/h2&gt;

&lt;p&gt;Before transformers, there was the &lt;strong&gt;inverted index&lt;/strong&gt;: a glorified phonebook where every word points to the documents it lives in. BM25 (“Best Match 25”) is the score those phonebooks still use today:&lt;/p&gt;

&lt;p&gt;

&lt;/p&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;
\operatorname{BM25}(q,d)=\sum_{t\in q} \text{IDF}(t)\;\frac{f(t,d)(k_1+1)}{f(t,d)+k_1\left(1-b+b\frac{|d|}{\overline{|d|}}\right)}
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop"&gt;&lt;span class="mord mathrm"&gt;BM25&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;q&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mop op-limits"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;t&lt;/span&gt;&lt;span class="mrel mtight"&gt;∈&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;q&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="mop op-symbol large-op"&gt;∑&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;IDF&lt;/span&gt;&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span 
class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;f&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="minner"&gt;&lt;span class="mopen delimcenter"&gt;&lt;span class="delimsizing size2"&gt;(&lt;/span&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;−&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;b&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;b&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord overline mtight"&gt;&lt;span 
class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;∣&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;d&lt;/span&gt;&lt;span class="mord mtight"&gt;∣&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="overline-line mtight"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mtight"&gt;∣&lt;/span&gt;&lt;span class="mord mathnormal mtight"&gt;d&lt;/span&gt;&lt;span class="mord mtight"&gt;∣&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose delimcenter"&gt;&lt;span class="delimsizing size2"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;f&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord mathnormal"&gt;t&lt;/span&gt;&lt;span class="mpunct"&gt;,&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord 
mathnormal"&gt;d&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;span class="mopen"&gt;(&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1&lt;/span&gt;&lt;span class="mclose"&gt;)&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;ul&gt;
&lt;li&gt;
&lt;em&gt;f(t,d)&lt;/em&gt; = term frequency of &lt;em&gt;t&lt;/em&gt; in document &lt;em&gt;d&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
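&lt;em&gt;|d|&lt;/em&gt; = length of &lt;em&gt;d&lt;/em&gt; in tokens; the overlined &lt;em&gt;|d|&lt;/em&gt; in the formula is the average document length across the corpus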
&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;IDF(t)&lt;/em&gt; = inverse document frequency
&lt;/li&gt;
&lt;li&gt;Default hyper‑parameters: 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;k1≈1.2k_1 \approx 1.2&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;k&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;1.2&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
, 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;b≈0.75b \approx 0.75&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;b&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;≈&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;0.75&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Mental model:&lt;/strong&gt; BM25 is a tug‑of‑war between &lt;em&gt;how often&lt;/em&gt; a query word shows up and &lt;em&gt;how common&lt;/em&gt; that word is across the whole corpus.&lt;/p&gt;
&lt;/blockquote&gt;
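
&lt;p&gt;To make the formula concrete, here is a minimal pure‑Python sketch of BM25 scoring. The whitespace tokenisation, the toy corpus, and the function name &lt;code&gt;bm25_scores&lt;/code&gt; are illustrative assumptions. So is the IDF variant (the smoothed, never‑negative form used by Lucene), since the formula above leaves IDF unspecified.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every tokenised doc in `docs` against the tokenised `query`."""
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs
    # Document frequency: in how many docs does each term occur?
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        # Length normalisation term: k1 * (1 - b + b * |d| / avg|d|)
        norm = k1 * (1 - b + b * len(d) / avg_len)
        score = 0.0
        for t in query:
            f = tf[t]
            if f == 0:
                continue
            # Assumed IDF variant (Lucene-style smoothing)
            idf = math.log(1 + (n_docs - df[t] + 0.5) / (df[t] + 0.5))
            score += idf * f * (k1 + 1) / (f + norm)
        scores.append(score)
    return scores


# Toy corpus, whitespace-tokenised (an assumption for the example)
corpus = [
    "the contract may be terminated with notice".split(),
    "payment is due within thirty days".split(),
    "either party may cancel the contract".split(),
]
scores = bm25_scores("terminated contract".split(), corpus)
print(scores)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The first document matches both query words and scores highest; the second matches neither and scores zero. The third document says “cancel” rather than “terminated”, so BM25 only credits it for “contract”, which is exactly the synonym gap that dense retrieval tries to close.&lt;/p&gt;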

&lt;p&gt;Why keep it around?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt; – an inverted index typically answers a query in milliseconds, even over millions of docs.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transparency&lt;/strong&gt; – devs can still debug with Ctrl‑F.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Baseline gravity&lt;/strong&gt; – if you can’t beat BM25, something’s off.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  3. EM@k—Exact Match at &lt;em&gt;k&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;Imagine playing &lt;em&gt;Where’s Waldo?&lt;/em&gt; but you’re allowed to search the first &lt;em&gt;k&lt;/em&gt; pages instead of the whole book. &lt;strong&gt;EM@k&lt;/strong&gt; asks: &lt;em&gt;“Does any of my top‑k passages contain the gold answer string &lt;strong&gt;exactly&lt;/strong&gt;?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Algorithm for a question set of size 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;NN&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;N&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Retrieve top‑&lt;em&gt;k&lt;/em&gt; passages per question.
&lt;/li&gt;
&lt;li&gt;Mark &lt;strong&gt;hit = 1&lt;/strong&gt; if at least one passage contains the gold answer, otherwise 0.
&lt;/li&gt;
&lt;li&gt;
&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;EM@k=∑i=1NhitiN\text{EM@k} = \frac{\sum_{i=1}^{N} \text{hit}_i}{N}
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;EM@k&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord mathnormal"&gt;N&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mop"&gt;&lt;span class="mop op-symbol small-op"&gt;∑&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;span class="mrel mtight"&gt;=&lt;/span&gt;&lt;span class="mord mtight"&gt;1&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;N&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord text"&gt;&lt;span 
class="mord"&gt;hit&lt;/span&gt;&lt;/span&gt;&lt;span class="msupsub"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="sizing reset-size6 size3 mtight"&gt;&lt;span class="mord mathnormal mtight"&gt;i&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;em&gt;Why the fuss over exact match?&lt;/em&gt;&lt;br&gt;&lt;br&gt;
Because partial overlaps (“2008 financial crash” vs. “the 2008 recession”) are slippery to grade at retrieval time. EM@k stays dumb on purpose—either the string shows up or it doesn’t.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Rule‑of‑thumb:&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
&lt;em&gt;EM@5 ≥ 80%&lt;/em&gt; → retrieval is likely &lt;em&gt;not&lt;/em&gt; your bottleneck.&lt;br&gt;&lt;br&gt;
&lt;em&gt;EM@5 ≤ 60%&lt;/em&gt; → focus on the retriever before prompt‑tuning.&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h2&gt;
  
  
  4. Answer‑level F1—did the generator actually use the context?
&lt;/h2&gt;

&lt;p&gt;Once your passages hit the jackpot, the generator still has to &lt;em&gt;say&lt;/em&gt; the answer. For extractive QA, the go‑to metric is token‑level &lt;strong&gt;F1&lt;/strong&gt;:&lt;/p&gt;


&lt;div class="katex-element"&gt;
  &lt;span class="katex-display"&gt;&lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;F1=2×precision×recallprecision+recall
\text{F1} = \frac{2 \times \text{precision} \times \text{recall}}{\text{precision} + \text{recall}}
&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;F1&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mrel"&gt;=&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mopen nulldelimiter"&gt;&lt;/span&gt;&lt;span class="mfrac"&gt;&lt;span class="vlist-t vlist-t2"&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;precision&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;+&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;recall&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="frac-line"&gt;&lt;/span&gt;&lt;/span&gt;&lt;span&gt;&lt;span class="pstrut"&gt;&lt;/span&gt;&lt;span class="mord"&gt;&lt;span class="mord"&gt;2&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;precision&lt;/span&gt;&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mbin"&gt;×&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;recall&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-s"&gt;​&lt;/span&gt;&lt;/span&gt;&lt;span class="vlist-r"&gt;&lt;span class="vlist"&gt;&lt;span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose nulldelimiter"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/div&gt;


&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Definition&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Precision&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Number of tokens shared by the model answer and the gold answer ÷ number of tokens in the model answer&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recall&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Number of tokens shared by the model answer and the gold answer ÷ number of tokens in the gold answer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;F1 forgives small wording tweaks—&lt;em&gt;“Barack Obama”&lt;/em&gt; vs. &lt;em&gt;“Obama”&lt;/em&gt;—in a way EM cannot.&lt;/p&gt;
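
&lt;p&gt;A minimal sketch of token‑level F1, assuming lowercased whitespace tokens and counting shared tokens as a multiset overlap (the convention popularised by SQuAD‑style evaluation scripts). The function name &lt;code&gt;token_f1&lt;/code&gt; is hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from collections import Counter


def token_f1(prediction, gold):
    """Token-level F1 between a generated answer and the gold answer."""
    pred_tokens = prediction.lower().split()
    gold_tokens = gold.lower().split()
    pred_counts = Counter(pred_tokens)
    gold_counts = Counter(gold_tokens)
    # Shared tokens, counted with multiplicity
    overlap = sum(min(count, gold_counts[tok]) for tok, count in pred_counts.items())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


f1 = token_f1("Obama", "Barack Obama")
print(round(f1, 3))  # 0.667
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Here precision is 1/1 and recall is 1/2, so F1 comes out at 2/3, whereas a strict exact‑match comparison of the two strings would score 0.&lt;/p&gt;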




&lt;h2&gt;
  
  
  5. From BM25 to Dense Retrieval &amp;amp; Reranking
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Stage&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;What changes&lt;/th&gt;
&lt;th&gt;Why you win&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dual‑encoder&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;em&gt;Dense Passage Retriever&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;Index contains 768‑D vectors, not word positions&lt;/td&gt;
&lt;td&gt;Captures synonyms (“terminate” ≈ “cancel”)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cross‑encoder&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;MiniLM, MonoT5…&lt;/td&gt;
&lt;td&gt;Re‑score 
&lt;span class="katex-element"&gt;
  &lt;span class="katex"&gt;&lt;span class="katex-mathml"&gt;[CLS]q  [SEP]  d[\text{CLS}] q\;[SEP]\;d&lt;/span&gt;&lt;span class="katex-html"&gt;&lt;span class="base"&gt;&lt;span class="strut"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;[&lt;/span&gt;&lt;span class="mord text"&gt;&lt;span class="mord"&gt;CLS&lt;/span&gt;&lt;/span&gt;&lt;span class="mclose"&gt;]&lt;/span&gt;&lt;span class="mord mathnormal"&gt;q&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mopen"&gt;[&lt;/span&gt;&lt;span class="mord mathnormal"&gt;SEP&lt;/span&gt;&lt;span class="mclose"&gt;]&lt;/span&gt;&lt;span class="mspace"&gt;&lt;/span&gt;&lt;span class="mord mathnormal"&gt;d&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;
&lt;/span&gt;
 with full token interactions&lt;/td&gt;
&lt;td&gt;Sharp ordering; filters noise&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
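
&lt;p&gt;The two stages in the table can be sketched as a retrieve‑then‑rerank pipeline. Everything numeric below is a made‑up stand‑in: the 3‑D vectors play the role of dual‑encoder embeddings, and &lt;code&gt;toy_cross_scores&lt;/code&gt; plays the role of a cross‑encoder's relevance scores. A real system would obtain both from trained models.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def dense_top_k(query_vec, doc_vecs, k=2):
    """Stage 1: rank doc indices by vector similarity, keep the best k."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


def rerank(candidates, pair_score):
    """Stage 2: reorder the shortlist with a (query, doc) scoring function."""
    return sorted(candidates, key=pair_score, reverse=True)


# Toy embeddings (hypothetical values, 3-D instead of 768-D)
doc_vecs = [[0.9, 0.1, 0.0], [0.0, 1.0, 0.0], [0.8, 0.3, 0.1]]
query_vec = [1.0, 0.2, 0.0]

candidates = dense_top_k(query_vec, doc_vecs, k=2)
print(candidates)  # [0, 2]

# Hypothetical cross-encoder scores for the shortlisted docs
toy_cross_scores = {0: 0.4, 2: 0.7}
final = rerank(candidates, lambda i: toy_cross_scores[i])
print(final)  # [2, 0]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The design point is the split in cost: the cheap vector comparison runs over the whole index, while the expensive pairwise scorer only sees the short candidate list.&lt;/p&gt;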

&lt;p&gt;One contract QA experiment, for instance, logged:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;BM25 → &lt;strong&gt;61% EM@5&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;DPR + Cross‑encoder → &lt;strong&gt;79% EM@5&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Same corpus, same questions—just a richer notion of “relevance”.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. Other retrieval diagnostics you’ll meet in the wild
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;What it asks&lt;/th&gt;
&lt;th&gt;Best when…&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Recall@k&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;em&gt;Any&lt;/em&gt; gold passage in top‑&lt;em&gt;k&lt;/em&gt;?&lt;/td&gt;
&lt;td&gt;Gold labels are full passages, not spans&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MRR&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How early is the &lt;strong&gt;first&lt;/strong&gt; correct hit?&lt;/td&gt;
&lt;td&gt;You care about position 1 above all&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;MAP&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How well are &lt;strong&gt;all&lt;/strong&gt; relevant docs ranked?&lt;/td&gt;
&lt;td&gt;Multiple correct passages per query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;nDCG@k&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;How well is the top‑&lt;em&gt;k&lt;/em&gt; ordered when relevance is graded (e.g. 0–3) and lower ranks are discounted?&lt;/td&gt;
&lt;td&gt;Web search, ad ranking&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
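
&lt;p&gt;Two of these diagnostics fit in a few lines each. The sketch below assumes each query comes with a ranked list of document ids and a set of gold ids; the ids and function names are hypothetical. Recall@k is computed here as the fraction of gold passages found, which reduces to the “any gold passage” reading in the table when each query has a single gold passage.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;def reciprocal_rank(ranked_ids, relevant_ids):
    """1 / rank of the first relevant doc, or 0.0 if none was retrieved."""
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(runs, relevant):
    """Average the reciprocal rank over all queries."""
    total = sum(reciprocal_rank(r, rel) for r, rel in zip(runs, relevant))
    return total / len(runs)


def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the gold docs that appear in the top k."""
    found = len(set(ranked_ids[:k]).intersection(relevant_ids))
    return found / len(relevant_ids)


# Two toy queries with hypothetical doc ids
runs = [["d3", "d1", "d7"], ["d2", "d9", "d4"]]
relevant = [{"d1"}, {"d4", "d5"}]

mrr = mean_reciprocal_rank(runs, relevant)
r_at_3 = recall_at_k(runs[1], relevant[1], k=3)
print(mrr)     # (1/2 + 1/3) / 2, about 0.417
print(r_at_3)  # 0.5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;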




&lt;h2&gt;
  
  
  7. Hands‑on: computing EM@5 in Python
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;em_at_k&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;retrieved&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]],&lt;/span&gt; &lt;span class="n"&gt;gold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;retrieved[i] is the ranked list for question i; gold[i] the gold answer string&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;any&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gold&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;retrieved&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][:&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gold&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pro tip:&lt;/strong&gt; lowercase and strip punctuation from both the gold answer and the passages before matching, to avoid false misses.&lt;/p&gt;
&lt;/blockquote&gt;
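
&lt;p&gt;One way to apply that tip, as a sketch: normalise both strings before the substring check. The helper names &lt;code&gt;normalize&lt;/code&gt; and &lt;code&gt;em_at_k_normalized&lt;/code&gt; are hypothetical, and only ASCII punctuation is stripped.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import string

# Translation table that deletes ASCII punctuation
_PUNCT_TABLE = str.maketrans("", "", string.punctuation)


def normalize(text):
    """Lowercase, drop ASCII punctuation, and collapse whitespace."""
    return " ".join(text.lower().translate(_PUNCT_TABLE).split())


def em_at_k_normalized(retrieved, gold, k=5):
    """EM@k where both the passages and the gold answers are normalised first."""
    hits = sum(
        any(normalize(gold[i]) in normalize(doc) for doc in retrieved[i][:k])
        for i in range(len(gold))
    )
    return hits / len(gold)


retrieved = [
    ["Barack Obama, the 44th U.S. President, was born in Hawaii."],
    ["Paris is in France."],
]
gold = ["barack obama", "Berlin"]
em = em_at_k_normalized(retrieved, gold)
print(em)  # 0.5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Note that this is still a plain substring test, so a very short gold answer can match inside a longer word; matching on token boundaries is a stricter alternative.&lt;/p&gt;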




&lt;h2&gt;
  
  
  8. Cheat‑sheet 🧾
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BM25         – bag‑of‑words baseline; fast, transparent
EM@k         – % questions whose answer text appears in top‑k passages
Answer F1    – token overlap between generated and gold answer
Dense Retr.  – dual‑encoder embeddings; often better recall than BM25 on paraphrased queries
Cross‑encode – reranks with full attention; boosts top‑1 precision
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  9. Try it yourself 🧪
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;FAQ Retriever Bake‑Off&lt;/strong&gt;
Index your company FAQ with BM25 &lt;em&gt;and&lt;/em&gt; DPR; measure EM@5 on a 50‑question test set. Which wins?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompt‑Effect Audit&lt;/strong&gt;
Freeze retrieval; vary only the generation prompt. How much does answer F1 move? Log your findings in a two‑column table.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Metric Mixing Board&lt;/strong&gt;
Build a dashboard that shows EM@1, EM@5, Recall@20, and answer F1 side by side for each experiment run.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  10. Final words
&lt;/h2&gt;

&lt;p&gt;Like good coffee, a RAG system is only as strong as its first extraction. Nail the retrieval metrics, and the language model can do what it does best: explain, summarise, and synthesise, with far less room to hallucinate. Happy hunting, and may your EM curves trend ever upward!&lt;/p&gt;




&lt;h2&gt;
  
  
  11. Live Demo in Colab
&lt;/h2&gt;

&lt;p&gt;I've packed the full retrieval-metrics pipeline—including BM25 retrieval, EM@k scoring, token-level F1, and EM-curve plotting—into a runnable Google Colab notebook. Click below to open, run, and experiment:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://colab.research.google.com/drive/1IzCYnxtvM1fPPrCVW4SMAyo7aYdFWSeX?usp=sharing" rel="noopener noreferrer"&gt;Open the “Retrieval Metrics Demystified” Colab notebook&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feel free to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Fork and modify the corpus or QA set
&lt;/li&gt;
&lt;li&gt;Tune BM25 hyper-parameters (&lt;code&gt;k1&lt;/code&gt;, &lt;code&gt;b&lt;/code&gt;)
&lt;/li&gt;
&lt;li&gt;Swap in a dense retriever or reranker
&lt;/li&gt;
&lt;li&gt;Plot EM@k curves on your own data
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Comments and pull-requests on the notebook are very welcome—let me know what you build!&lt;/p&gt;

</description>
      <category>ragevaluation</category>
      <category>rag</category>
      <category>evaluation</category>
      <category>bm25</category>
    </item>
  </channel>
</rss>
