<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Wesley Smith</title>
    <description>The latest articles on DEV Community by Wesley Smith (@wesleysmyth).</description>
    <link>https://dev.to/wesleysmyth</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2485371%2F94212521-fbf7-4106-8921-cbad09883b9d.jpeg</url>
      <title>DEV Community: Wesley Smith</title>
      <link>https://dev.to/wesleysmyth</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/wesleysmyth"/>
    <language>en</language>
    <item>
      <title>How I Built a Chrome Extension That Summarizes Any Article in 2 Seconds Using AI</title>
      <dc:creator>Wesley Smith</dc:creator>
      <pubDate>Fri, 06 Mar 2026 17:42:15 +0000</pubDate>
      <link>https://dev.to/wesleysmyth/how-i-built-a-chrome-extension-that-summarizes-any-article-in-2-seconds-using-ai-1a8f</link>
      <guid>https://dev.to/wesleysmyth/how-i-built-a-chrome-extension-that-summarizes-any-article-in-2-seconds-using-ai-1a8f</guid>
      <description>&lt;p&gt;You know the workflow. You find a 12-minute article. You want the gist. So you Ctrl+A the whole page, paste it into ChatGPT, type "summarize this," and wait. Ten seconds. Fifteen seconds. Finally, a wall of text comes back that's somehow almost as long as the original article.&lt;/p&gt;

&lt;p&gt;Then you do it again on the next tab. And the next one. And by the fourth article, you've spent more time summarizing than you would have spent just reading.&lt;/p&gt;

&lt;p&gt;I got tired of this loop. Not because the AI part was bad -- GPT does a fine job summarizing -- but because the &lt;em&gt;workflow&lt;/em&gt; was broken. The friction of copy, switch tab, paste, wait, read, switch back... it adds up. So I built a Chrome extension called TLDR that does the whole thing in one click, in about two seconds. Here is what I learned building it.&lt;/p&gt;

&lt;h2&gt;What TLDR Does&lt;/h2&gt;

&lt;p&gt;Click the extension icon. Get a summary. That's it.&lt;/p&gt;

&lt;p&gt;Behind one click, the extension extracts the article text from the page, sends it to an LLM, and renders a summary with key bullet points in a clean popup. It caches results so revisiting an article is instant. And it gives you 36 different summary "styles" -- four tones, three lengths, three focus areas -- so the output actually matches how you think, not how a generic chatbot defaults.&lt;/p&gt;

&lt;h2&gt;Architecture: Four Moving Parts&lt;/h2&gt;

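&lt;p&gt;TLDR is a Manifest V3 Chrome extension with four moving parts. The three that live inside the extension talk to each other through message passing, and the service worker reaches the fourth, Groq's API, over &lt;code&gt;fetch()&lt;/code&gt;:&lt;/p&gt;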

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Content Script] --&amp;gt; extracts article from DOM
       |
       | chrome.tabs.sendMessage
       v
[Popup Script] --&amp;gt; orchestrates the flow, renders UI
       |
       | chrome.runtime.sendMessage
       v
[Service Worker] --&amp;gt; checks cache, calls AI, stores results
       |
       | fetch()
       v
[Groq API] --&amp;gt; LLaMA 3.1 inference
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;strong&gt;content script&lt;/strong&gt; runs on every page and exposes an article extraction function. The &lt;strong&gt;popup&lt;/strong&gt; is the entry point -- when the user clicks the icon, it asks the content script to extract the article, then sends it to the service worker for summarization. The &lt;strong&gt;service worker&lt;/strong&gt; handles caching, settings, and the actual API call. And &lt;strong&gt;Groq's API&lt;/strong&gt; does the inference.&lt;/p&gt;

&lt;p&gt;This separation is not just for cleanliness. Manifest V3 &lt;em&gt;forces&lt;/em&gt; it. Content scripts can access the page's DOM but only a small subset of extension APIs (mainly messaging and storage). Service workers can make API calls but have no DOM at all. The popup bridges the two.&lt;/p&gt;
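
&lt;p&gt;To make the hand-offs concrete, here is a minimal sketch of what the popup side of that flow could look like. It is not the extension's actual code: the &lt;code&gt;EXTRACT_ARTICLE&lt;/code&gt; message type, the payload shapes, and the &lt;code&gt;api&lt;/code&gt; parameter (a stand-in for the global &lt;code&gt;chrome&lt;/code&gt; object, so the function can run outside a browser) are all assumptions made for illustration.&lt;/p&gt;

```javascript
// Hypothetical popup-side orchestration. `api` stands in for the global
// `chrome` object so the flow can be exercised outside a browser.
// EXTRACT_ARTICLE and the payload shapes are assumptions; SUMMARIZE is the
// message type named later in this post.
async function summarizeActiveTab(api = globalThis.chrome) {
  // 1. Find the tab the popup was opened on.
  const [tab] = await api.tabs.query({ active: true, currentWindow: true });
  if (!tab) {
    return { success: false, error: 'no_active_tab' };
  }

  // 2. Ask the content script in that tab for the article.
  //    tabs.sendMessage rejects when nothing in the tab is listening.
  let extracted;
  try {
    extracted = await api.tabs.sendMessage(tab.id, { type: 'EXTRACT_ARTICLE' });
  } catch (err) {
    return { success: false, error: 'no_content_script' };
  }
  if (!extracted || !extracted.success) {
    return { success: false, error: extracted ? extracted.error : 'empty_response' };
  }

  // 3. Hand the article to the service worker, which owns the cache and the
  //    API call. tab.url is only populated with the tabs or activeTab permission.
  return api.runtime.sendMessage({
    type: 'SUMMARIZE',
    url: tab.url,
    article: extracted.article,
  });
}
```

&lt;p&gt;Passing the API object in as a parameter is purely a testing convenience. In the real popup you would call it with no arguments and let it pick up &lt;code&gt;chrome&lt;/code&gt;.&lt;/p&gt;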

&lt;h2&gt;Article Extraction: The Surprisingly Hard Part&lt;/h2&gt;

&lt;p&gt;The first version of this extension used &lt;code&gt;document.body.innerText&lt;/code&gt;. It was terrible. You get nav links, cookie banners, sidebar widgets, comment sections, footer text, ad copy -- basically everything except the actual article.&lt;/p&gt;

&lt;p&gt;Naive DOM parsing fails because modern web pages are mostly chrome and very little content. A typical news article page might have 50,000 characters of HTML, of which maybe 5,000 are the story you want to read.&lt;/p&gt;

&lt;p&gt;The solution is Mozilla's &lt;a href="https://github.com/mozilla/readability" rel="noopener noreferrer"&gt;Readability.js&lt;/a&gt; -- the same library that powers Firefox's Reader View. It uses a scoring algorithm that analyzes DOM nodes by their tag names, class names, content density, and position to identify the most likely "article" element. It is battle-tested on millions of pages.&lt;/p&gt;

&lt;p&gt;The extraction pipeline looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isProbablyReaderable&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@mozilla/readability&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;DOMPurify&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dompurify&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;extractArticle&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Bail early if this page is not article-shaped&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nf"&gt;isProbablyReaderable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;not_article&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Clone the DOM so Readability's mutations don't affect the live page&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;documentClone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cloneNode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Strip noise before Readability even sees it&lt;/span&gt;
  &lt;span class="nx"&gt;documentClone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;querySelectorAll&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;script, style, noscript, iframe&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;el&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;el&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Readability&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;documentClone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;charThreshold&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;keepClasses&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;nbTopCandidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;article&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;reader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="c1"&gt;// ...sanitize with DOMPurify, calculate reading time, return&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things worth noting. First, &lt;code&gt;isProbablyReaderable&lt;/code&gt; is a lightweight pre-check -- if you are on a Google search results page or a login form, it rejects fast without doing the expensive parse. Second, we clone the entire document because Readability mutates the DOM during parsing (it removes elements, restructures nodes). If you pass it the live document, the page breaks. Third, even after Readability extracts the article, we run the title through DOMPurify with &lt;code&gt;ALLOWED_TAGS: []&lt;/code&gt; to strip any injected HTML. You would be surprised what some CMSes put in &lt;code&gt;&amp;lt;title&amp;gt;&lt;/code&gt; tags.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;charThreshold: 100&lt;/code&gt; setting is worth calling out. The default is 500, which causes Readability to reject shorter articles. Lowering it to 100 means we can summarize brief blog posts that the default config would skip.&lt;/p&gt;

&lt;h2&gt;The AI Layer: Why Groq, Not OpenAI&lt;/h2&gt;

&lt;p&gt;The first prototype used OpenAI's API. It worked. It was also slow. A typical summarization call took 8-15 seconds with GPT-3.5 Turbo, and the free tier is... not free. For a browser extension where the entire value proposition is &lt;em&gt;speed&lt;/em&gt;, that was a dealbreaker.&lt;/p&gt;

&lt;p&gt;Groq runs LLaMA 3.1 8B on custom LPU hardware. Same call, roughly two seconds. And the free tier gives you 30 requests per minute with generous daily limits. For a summarization task where you don't need GPT-4-level reasoning -- you need fast, competent text compression -- it is the right tool.&lt;/p&gt;

&lt;p&gt;The API is OpenAI-compatible, so switching came down to pointing at a different endpoint and model name:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;GROQ_API_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://api.groq.com/openai/v1/chat/completions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DEFAULT_MODEL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;llama-3.1-8b-instant&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We request JSON output with &lt;code&gt;response_format: { type: 'json_object' }&lt;/code&gt;, which means every response comes back as parseable JSON, with the &lt;code&gt;summary&lt;/code&gt;, &lt;code&gt;keyPoints&lt;/code&gt;, and &lt;code&gt;tone&lt;/code&gt; fields the prompt asks for. No regex extraction of markdown. No hoping the model wraps its answer the right way. One caveat: JSON mode guarantees valid syntax, not a particular schema, so the field names still depend on the prompt and are worth checking before you render them.&lt;/p&gt;
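
&lt;p&gt;Since the model can still return valid JSON with missing or mistyped fields, a small guard between the API response and the UI is cheap insurance. The sketch below is an illustration rather than the extension's code: the field names follow this post, while &lt;code&gt;parseSummaryResponse&lt;/code&gt; and its error codes are hypothetical.&lt;/p&gt;

```javascript
// Field names follow the post (summary, keyPoints, tone); the function name,
// validation rules, and error codes are illustrative assumptions.
function parseSummaryResponse(rawContent) {
  let data;
  try {
    data = JSON.parse(rawContent);
  } catch (err) {
    return { success: false, error: 'invalid_json' };
  }
  if (data === null || typeof data !== 'object') {
    return { success: false, error: 'invalid_shape' };
  }

  // summary must be a non-empty string.
  const hasSummary =
    typeof data.summary === 'string' ? data.summary.trim().length > 0 : false;
  // keyPoints must be an array containing only strings.
  const hasKeyPoints = Array.isArray(data.keyPoints)
    ? data.keyPoints.every((point) => typeof point === 'string')
    : false;
  if (!hasSummary || !hasKeyPoints) {
    return { success: false, error: 'missing_fields' };
  }

  return {
    success: true,
    summary: data.summary.trim(),
    keyPoints: data.keyPoints,
    // tone is treated as optional here; that is a choice made for this sketch.
    tone: typeof data.tone === 'string' ? data.tone : null,
  };
}
```

&lt;p&gt;The input would be the message content string from the chat completion, and a failed check could trigger a retry or a friendly error in the popup.&lt;/p&gt;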

&lt;h2&gt;36 Summary Styles: The Prompt Engineering&lt;/h2&gt;

&lt;p&gt;Most summarizers give you one output format. TLDR gives you 36 combinations: 4 tones (witty, professional, casual, academic) times 3 lengths (one-liner, brief, detailed) times 3 focus areas (key facts, opinions, implications). Each combination gets a distinct system prompt assembled at runtime.&lt;/p&gt;
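
&lt;p&gt;The count is just the product of the three option lists. As a quick sanity check, here is a sketch that enumerates every combination. The option names come from the paragraph above; the exact key spellings (for example &lt;code&gt;'key-facts'&lt;/code&gt;) are assumptions, not the extension's real identifiers.&lt;/p&gt;

```javascript
// Option names come from the post; the key spellings are assumptions.
const TONES = ['witty', 'professional', 'casual', 'academic'];
const LENGTHS = ['one-liner', 'brief', 'detailed'];
const FOCUSES = ['key-facts', 'opinions', 'implications'];

// Cartesian product of the three option lists.
function allStyleCombinations() {
  return TONES.flatMap((tone) =>
    LENGTHS.flatMap((length) =>
      FOCUSES.map((focus) => ({ tone, length, focus }))
    )
  );
}

console.log(allStyleCombinations().length); // 36
```

&lt;p&gt;A list like this is also handy for driving a test matrix, since each entry has the same shape as a settings object.&lt;/p&gt;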

&lt;p&gt;The prompt builder composes these modularly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;buildSystemPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;settings&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tonePreset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;TONE_PRESETS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tone&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;lengthPreset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;LENGTH_PRESETS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;focusPreset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;FOCUS_PRESETS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;focus&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s2"&gt;`You are TLDR, a brilliant summarizer...

STYLE: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tonePreset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;instruction&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
LENGTH: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;lengthPreset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;instruction&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
FOCUS: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;focusPreset&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;instruction&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;
...`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each preset carries its own instruction text and few-shot examples. The witty preset says "Use wordplay, irony, or unexpected angles." The academic preset says "Acknowledge complexity, use precise terminology." The length presets are &lt;em&gt;aggressively&lt;/em&gt; specific because LLMs tend to under-generate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nx"&gt;brief&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;instruction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Write EXACTLY 2-3 complete sentences totaling 30-40 words. &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;You MUST use AT LEAST 30 words. If your first draft is shorter, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
    &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;expand with context, significance, or relevant details.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;targetWords&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This specificity came from testing. The comment at the top of the prompts file tells the story: "TUNED based on variation test results (34 articles, 283 API calls)." Early versions would ask for "a brief summary" and get back 8 words. You have to spell out minimums, use words like "MUST" and "NEVER," and give concrete sentence counts. In these tests, the model followed constraints it could count far more reliably than vague ones.&lt;/p&gt;
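
&lt;p&gt;Tuning like this needs some way to measure whether an output met the constraint. The following is a rough sketch of such a check for the &lt;code&gt;brief&lt;/code&gt; preset, using the 30-40 word and 2-3 sentence limits from the instruction above. The function names and the naive sentence splitter are assumptions, not the test harness actually used.&lt;/p&gt;

```javascript
// Hypothetical checker for the "brief" preset: 2-3 sentences, 30-40 words.
function countWords(text) {
  const trimmed = text.trim();
  return trimmed === '' ? 0 : trimmed.split(/\s+/).length;
}

function countSentences(text) {
  // Naive: counts runs of ., !, or ? followed by whitespace or end of text.
  // Abbreviations such as "e.g." will be overcounted.
  const matches = text.match(/[.!?]+(\s|$)/g);
  return matches ? matches.length : 0;
}

function checkBrief(
  summary,
  limits = { minWords: 30, maxWords: 40, minSentences: 2, maxSentences: 3 }
) {
  const words = countWords(summary);
  const sentences = countSentences(summary);
  const problems = [];
  if (limits.minWords > words) problems.push('too_short');
  if (words > limits.maxWords) problems.push('too_long');
  if (limits.minSentences > sentences || sentences > limits.maxSentences) {
    problems.push('sentence_count');
  }
  return { ok: problems.length === 0, words, sentences, problems };
}
```

&lt;p&gt;Running a check like this over a batch of outputs gives a pass rate you can compare between prompt revisions.&lt;/p&gt;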

&lt;p&gt;One particularly stubborn problem: opening variety. Early testing showed that the model started 70%+ of summaries with "[Topic] is..." -- "AI is transforming healthcare," "The study is groundbreaking." Every summary opened the same way. The fix was adding explicit anti-patterns to the prompt:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BAD patterns to NEVER use:
- "[Topic] is..." or "[Topic] are..." (boring, every AI does this)
- "This article discusses..." (passive, meta)
- "The key takeaways are..." (robotic, predictable)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Combined with positive opening strategies ("Lead with the most surprising finding," "Start with an action or consequence"), this dramatically improved variety.&lt;/p&gt;
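
&lt;p&gt;A figure like "70%+ of summaries" implies some way of detecting the bad openings. Here is one possible sketch, assuming simple regular expressions are good enough. The patterns paraphrase the anti-pattern list above and are rough heuristics of my own for illustration, so expect false positives on ordinary sentences that happen to put "is" or "are" within the first few words.&lt;/p&gt;

```javascript
// Patterns paraphrase the anti-pattern list in the prompt. These regexes are
// rough heuristics for illustration, not the extension's actual code.
// More specific patterns come first so they win over the generic one.
const BANNED_OPENINGS = [
  { name: 'this_article', pattern: /^this (?:article|post|piece)\b/i },
  { name: 'key_takeaways', pattern: /^the key takeaways? (?:is|are)\b/i },
  // One to four words followed by "is" or "are": "AI is...", "The study is..."
  { name: 'topic_is', pattern: /^(?:\S+\s+){1,4}(?:is|are)\b/i },
];

// Returns the name of the first matching anti-pattern, or null.
function findBannedOpening(summary) {
  const text = summary.trim();
  const hit = BANNED_OPENINGS.find(({ pattern }) => pattern.test(text));
  return hit ? hit.name : null;
}

// Fraction of summaries that open with any banned pattern (0 for no input).
function bannedOpeningRate(summaries) {
  if (summaries.length === 0) return 0;
  const hits = summaries.filter((s) => findBannedOpening(s) !== null).length;
  return hits / summaries.length;
}
```

&lt;p&gt;Tracking &lt;code&gt;bannedOpeningRate&lt;/code&gt; before and after a prompt change turns "this improved variety" into a number you can compare.&lt;/p&gt;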

&lt;h2&gt;Manifest V3: The Service Worker Problem&lt;/h2&gt;

&lt;p&gt;If you have built Chrome extensions before, Manifest V3's service worker model is the biggest architectural change. In MV2, you had a persistent background page. In MV3, the service worker can be terminated at any time when idle.&lt;/p&gt;

&lt;p&gt;This has one critical implication for message handling: you must register your &lt;code&gt;onMessage&lt;/code&gt; listener at the top level of the service worker, synchronously, during initial execution. When Chrome restarts a terminated worker to deliver a message, it expects the listener to exist once the script's first synchronous pass has finished. If you register it inside an async init function or after an &lt;code&gt;await&lt;/code&gt;, the listener may not be in place yet, and the message that woke the worker can be lost.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// MUST be at top level for MV3 -- not inside an async function&lt;/span&gt;
&lt;span class="nx"&gt;chrome&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;runtime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onMessage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addListener&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sendResponse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;handleMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sender&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sendResponse&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;sendResponse&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Keep the message channel open for async response&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



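&lt;p&gt;That &lt;code&gt;return true&lt;/code&gt; is easy to forget and brutal to debug. Without it, the message channel closes before your async handler resolves, and the later &lt;code&gt;sendResponse&lt;/code&gt; call silently does nothing. The sender typically gets &lt;code&gt;undefined&lt;/code&gt; back (or a "message port closed before a response was received" error) instead of a summary, and the popup just sits on its loading state.&lt;/p&gt;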

&lt;p&gt;The message passing architecture uses a typed message system where each message has a &lt;code&gt;type&lt;/code&gt; field (&lt;code&gt;SUMMARIZE&lt;/code&gt;, &lt;code&gt;GET_CACHED_SUMMARY&lt;/code&gt;, &lt;code&gt;SAVE_SETTINGS&lt;/code&gt;, etc.) that gets routed through a switch statement. It is simple, explicit, and easy to trace when debugging.&lt;/p&gt;
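
&lt;p&gt;As a sketch of that routing, assuming each message type maps to one handler function: the three type names are the ones listed above, while the &lt;code&gt;handlers&lt;/code&gt; object, its method names, and the message fields (&lt;code&gt;url&lt;/code&gt;, &lt;code&gt;settings&lt;/code&gt;) are hypothetical and passed in here only so the router can be tested in isolation.&lt;/p&gt;

```javascript
// Message types are the ones named in the post. The `handlers` object, its
// method names, and the message fields are hypothetical stand-ins.
async function handleMessage(message, sender, handlers) {
  switch (message.type) {
    case 'SUMMARIZE':
      return handlers.summarize(message);
    case 'GET_CACHED_SUMMARY':
      return handlers.getCachedSummary(message.url);
    case 'SAVE_SETTINGS':
      return handlers.saveSettings(message.settings);
    default:
      // Reject unknown types so typos surface in the caller's .catch() branch
      // instead of leaving the sender without a response.
      throw new Error('Unknown message type: ' + String(message.type));
  }
}
```

&lt;p&gt;Because the function is &lt;code&gt;async&lt;/code&gt;, the thrown error becomes a rejected promise, which fits the &lt;code&gt;.then(sendResponse).catch(...)&lt;/code&gt; pattern in the listener shown earlier.&lt;/p&gt;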

&lt;h2&gt;Smart Caching&lt;/h2&gt;

&lt;p&gt;Nobody wants to wait two seconds for a summary they already generated. The caching layer uses a hash of the URL as the cache key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;_hashUrl&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;char&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;charCodeAt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;char&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;hash&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Convert to 32-bit integer&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;url_&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hash&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;36&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is a simple multiply-by-31 string hash (the same recurrence as Java's &lt;code&gt;String.hashCode&lt;/code&gt;, and a close cousin of DJB2, which multiplies by 33) -- not cryptographically secure, but fast and collision-resistant enough for 100 cache entries. We use &lt;code&gt;chrome.storage.local&lt;/code&gt; for the cache (per-machine, 10MB quota) and &lt;code&gt;chrome.storage.sync&lt;/code&gt; for settings (synced across devices, 100KB quota).&lt;/p&gt;

&lt;p&gt;Cache entries expire after 24 hours, and when the cache hits 100 entries, the oldest gets evicted. The eviction is simple -- find the entry with the smallest timestamp and delete it. No LRU linked list, no priority queue. For 100 entries, a linear scan is fine.&lt;/p&gt;

&lt;p&gt;When the user hits "Regenerate," we pass &lt;code&gt;forceRefresh: true&lt;/code&gt; which bypasses the cache, generates a fresh summary, and overwrites the cached version.&lt;/p&gt;
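
&lt;p&gt;Put together, the expiry, eviction, and regenerate rules fit in a few lines. The sketch below works under simplifying assumptions: it operates on a plain object standing in for what would be read from and written back to &lt;code&gt;chrome.storage.local&lt;/code&gt;, and the function names and entry shape are hypothetical. The 100-entry cap and 24-hour lifetime are the values described above.&lt;/p&gt;

```javascript
const MAX_ENTRIES = 100;             // cap described in the post
const TTL_MS = 24 * 60 * 60 * 1000;  // 24 hours, as described in the post

// `store` is a plain object standing in for the data kept in
// chrome.storage.local; entries look like { summary, timestamp }.
function readCache(store, key, now, forceRefresh = false) {
  if (forceRefresh) return null;     // "Regenerate" skips the lookup
  const entry = store[key];
  if (!entry) return null;
  if (now - entry.timestamp > TTL_MS) {
    delete store[key];               // expired: drop it and report a miss
    return null;
  }
  return entry.summary;
}

function writeCache(store, key, summary, now) {
  const keys = Object.keys(store);
  const isNewKey = !(key in store);
  if (isNewKey) {
    if (keys.length >= MAX_ENTRIES) {
      // Linear scan for the entry with the smallest timestamp.
      let oldestKey = keys[0];
      for (const k of keys) {
        if (store[oldestKey].timestamp > store[k].timestamp) oldestKey = k;
      }
      delete store[oldestKey];
    }
  }
  // Overwrites any existing entry, which is what a forced refresh needs.
  store[key] = { summary, timestamp: now };
}
```

&lt;p&gt;Taking &lt;code&gt;now&lt;/code&gt; as a parameter instead of calling &lt;code&gt;Date.now()&lt;/code&gt; inside keeps the expiry logic easy to test with fixed timestamps.&lt;/p&gt;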

&lt;h2&gt;What I Learned&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The hard part was extraction, not AI.&lt;/strong&gt; I spent more time debugging Readability edge cases -- paywalled sites, SPAs that load content after DOMContentLoaded, pages with multiple article-like sections -- than I spent on the AI integration. The Groq API just works. Getting clean text out of the wild web is the real challenge.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Users care about speed more than summary quality.&lt;/strong&gt; The jump from 10+ seconds (OpenAI) to ~2 seconds (Groq) changed everything. At 10 seconds, people wonder if it is worth the wait. At 2 seconds, it feels instant, and they use it reflexively. The quality difference between GPT-3.5 and LLaMA 3.1 8B for summarization is marginal; the speed difference is transformative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Groq's free tier is viable for production.&lt;/strong&gt; Thirty requests per minute with no credit card required. Because each user supplies their own API key, that limit applies per user rather than to the extension as a whole. For a browser extension where each user generates maybe 10-20 summaries per day, this is more than sufficient. If you are building a developer tool or personal productivity app, you do not need to spin up your own inference server.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Manifest V3 is an improvement, but the migration pain is real.&lt;/strong&gt; The service worker lifecycle, the restricted API surface, the new permissions model -- they all push you toward better patterns (no persistent background state, explicit permissions, declarative APIs). But the documentation assumes you already know what changed, and debugging a terminated service worker is not fun.&lt;/p&gt;

&lt;h2&gt;Try It&lt;/h2&gt;

&lt;p&gt;TLDR is free and open source.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Chrome Web Store&lt;/strong&gt;: &lt;a href="https://chromewebstore.google.com/detail/tldr-article-summarizer/hmphaahhfmfdebdjedigdmjjnbcdpjkm" rel="noopener noreferrer"&gt;Install TLDR&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GitHub&lt;/strong&gt;: &lt;a href="https://github.com/wesleysmyth/TLDR-extension" rel="noopener noreferrer"&gt;wesleysmyth/TLDR-extension&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You need a free Groq API key (takes 30 seconds to get one at &lt;a href="https://console.groq.com/keys" rel="noopener noreferrer"&gt;console.groq.com&lt;/a&gt;). Paste it into the settings page, and you are summarizing articles.&lt;/p&gt;

&lt;p&gt;If you build something similar, or have questions about Manifest V3 or Readability.js, I would love to hear about it in the comments.&lt;/p&gt;

</description>
      <category>chromeextension</category>
      <category>javascript</category>
      <category>ai</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
