Atul Srivastava

How I Built a Chrome Extension with 2000+ Users: Lessons in Screen Capture, OCR, and Browser APIs

Two years ago, I published Advanced Smart Capture on the Chrome Web Store. Today it has over 2,000 active users. Here's what I learned building a feature-rich Chrome extension from scratch.

What It Does

Advanced Smart Capture is a screenshot and productivity tool:

  • Full-page capture — scrolls and stitches the entire page
  • Region capture — select any area of the screen
  • Element capture — click on any DOM element to capture it
  • OCR text extraction — extract text from any image or screenshot
  • Annotations — draw, highlight, add text, arrows, and shapes
  • PDF export — save annotated screenshots as PDFs
  • Scheduled captures — automate recurring screenshots

The Technical Challenges

1. Full-Page Screenshot Stitching

Chrome's captureVisibleTab API only captures the visible viewport. For full-page screenshots, you need to scroll the page one viewport at a time, capture each viewport, and stitch the results together:

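async function captureFullPage(tab) {
  // executeScript resolves to an array of results (one per frame), so the
  // call has to be awaited before indexing into it
  const [{ result: { scrollHeight, clientHeight } }] =
    await chrome.scripting.executeScript({
      target: { tabId: tab.id },
      func: () => ({
        scrollHeight: document.documentElement.scrollHeight,
        clientHeight: document.documentElement.clientHeight
      })
    });

  const captures = [];
  let scrollY = 0;

  while (scrollY < scrollHeight) {
    // Scroll to position. The browser clamps the last scroll, so the final
    // capture can overlap the previous one and stitchImages must allow for it
    await chrome.scripting.executeScript({
      target: { tabId: tab.id },
      func: (y) => window.scrollTo(0, y),
      args: [scrollY]
    });

    // Wait for rendering. captureVisibleTab is also rate-limited
    // (chrome.tabs.MAX_CAPTURE_VISIBLE_TAB_CALLS_PER_SECOND), so the delay
    // has to keep the loop under that quota
    await new Promise(r => setTimeout(r, 500));

    // Capture visible area
    const dataUrl = await chrome.tabs.captureVisibleTab(
      tab.windowId, { format: 'png' }
    );
    captures.push({ dataUrl, scrollY });

    scrollY += clientHeight;
  }

  return stitchImages(captures, scrollHeight);
}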
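
The capture loop steps by one viewport height, but the browser will not scroll past the bottom of the page, so the last capture usually overlaps the one before it. One way to make that explicit is to compute the real scroll offsets up front. This is a minimal sketch, not the extension's actual code; `planScrollPositions` is a hypothetical helper name.

```javascript
// Hypothetical helper: returns the scroll offsets the page will actually
// reach, with the last one clamped to the bottom of the page.
function planScrollPositions(scrollHeight, clientHeight) {
  if (clientHeight <= 0) {
    throw new RangeError('clientHeight must be positive');
  }
  // Largest offset the browser allows for this page
  const maxScroll = Math.max(0, scrollHeight - clientHeight);
  const positions = [];
  for (let y = 0; y < maxScroll; y += clientHeight) {
    positions.push(y);
  }
  // Final tile sits flush with the bottom and may overlap the previous tile
  positions.push(maxScroll);
  return positions;
}

planScrollPositions(2500, 1000); // [0, 1000, 1500]
```

When stitching, each capture can then be drawn at its offset multiplied by the image-to-CSS pixel ratio. Drawing the tiles in order lets the last one simply paint over the overlapping strip.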

The tricky part is handling:

  • Fixed/sticky headers that appear in every capture
  • Lazy-loaded images that haven't rendered yet
  • Dynamic content that changes between scrolls
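
For the fixed/sticky header problem, one common approach is to hide those elements before capturing the later viewports and restore them afterwards. The sketch below assumes that approach; it is not necessarily what the extension does, and `hideFixedElements` is a hypothetical name. The style lookup is passed in as a parameter so the function can be exercised outside a browser.

```javascript
// Hypothetical helper: hides every fixed or sticky element under `root`
// and returns a function that restores the original inline visibility.
function hideFixedElements(root, getStyle) {
  const hidden = [];
  for (const el of root.querySelectorAll('*')) {
    const position = getStyle(el).position;
    if (position === 'fixed' || position === 'sticky') {
      // Remember the previous inline value so it can be put back
      hidden.push({ el, previous: el.style.visibility });
      el.style.visibility = 'hidden';
    }
  }
  return function restore() {
    for (const { el, previous } of hidden) {
      el.style.visibility = previous;
    }
  };
}
```

In a page context this would be called as `hideFixedElements(document, (el) => getComputedStyle(el))`, typically after the first viewport has been captured so the header still appears once at the top of the stitched image.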

2. OCR with Tesseract.js

For text extraction, I integrated Tesseract.js, a JavaScript OCR library that runs the Tesseract engine compiled to WebAssembly:

import Tesseract from 'tesseract.js';

async function extractText(imageDataUrl) {
  const { data: { text, confidence } } = await Tesseract.recognize(
    imageDataUrl,
    'eng',
    {
      logger: (m) => {
        if (m.status === 'recognizing text') {
          updateProgress(m.progress * 100);
        }
      }
    }
  );

  return { text: text.trim(), confidence };
}

Key optimization: I run OCR in a Web Worker to keep the popup responsive.
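
If OCR runs many times per session, it can also help to create the worker once and reuse it rather than starting a new one per image. The wrapper below is a sketch under that assumption. `lazyWorker` and `factory` are hypothetical names; `factory` stands for whatever creates your OCR worker (for example a call to Tesseract.js's `createWorker`, whose arguments depend on the library version), and the worker is only assumed to expose `recognize` and `terminate`.

```javascript
// Hypothetical wrapper: creates the OCR worker on first use and reuses it.
function lazyWorker(factory) {
  let workerPromise = null;
  return {
    async recognize(image) {
      // Start the worker only when the first image arrives
      if (!workerPromise) workerPromise = factory();
      const worker = await workerPromise;
      return worker.recognize(image);
    },
    async dispose() {
      if (!workerPromise) return;
      const worker = await workerPromise;
      workerPromise = null;
      await worker.terminate();
    }
  };
}
```

Calling `dispose()` when the popup or editor closes releases the worker's memory; the next `recognize()` call would start a fresh one.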

3. Canvas-Based Annotation Editor

The annotation layer uses HTML5 Canvas with a custom tool system:

class AnnotationEditor {
  constructor(canvas, image) {
    this.canvas = canvas;
    this.ctx = canvas.getContext('2d');
    this.tools = {
      pen: new PenTool(this),
      arrow: new ArrowTool(this),
      rectangle: new RectangleTool(this),
      text: new TextTool(this),
      highlight: new HighlightTool(this)
    };
    this.history = []; // For undo/redo
  }

  addAnnotation(annotation) {
    this.history.push(annotation);
    this.redraw();
  }

  undo() {
    if (this.history.length > 0) {
      this.history.pop();
      this.redraw();
    }
  }
}
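
The class above only shows undo. Redo is commonly handled with a second stack that collects undone items and is cleared whenever a new annotation is added. Here is a minimal sketch of that pattern; `AnnotationHistory` is a hypothetical name and not part of the extension's code.

```javascript
// Hypothetical two-stack history: `done` holds applied annotations,
// `undone` holds annotations that can be redone.
class AnnotationHistory {
  constructor() {
    this.done = [];
    this.undone = [];
  }

  add(annotation) {
    this.done.push(annotation);
    // A new action invalidates anything that was waiting to be redone
    this.undone.length = 0;
  }

  undo() {
    if (this.done.length === 0) return false;
    this.undone.push(this.done.pop());
    return true;
  }

  redo() {
    if (this.undone.length === 0) return false;
    this.done.push(this.undone.pop());
    return true;
  }
}
```

An editor would redraw from `done` after each call that returns `true`.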

4. Manifest V3 Migration

Chrome is deprecating Manifest V2. Key changes I had to make:

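  • Background scripts → Service Workers (no persistent in-memory state; the worker can be shut down when idle)
  • chrome.browserAction → chrome.action
  • Content Security Policy changes for Wasm (needed for Tesseract): extension pages must allow 'wasm-unsafe-eval' in script-src

{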
  "manifest_version": 3,
  "permissions": ["activeTab", "scripting", "storage"],
  "background": {
    "service_worker": "background.js",
    "type": "module"
  },
  "action": {
    "default_popup": "popup.html"
  }
}
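
Because a service worker can be stopped between events, anything kept in a module-level variable may be gone the next time it wakes up. A typical workaround is to keep such state in chrome.storage, which the manifest above already requests. The sketch below assumes that approach; `appendCaptureLog` and the `captureLog` key are hypothetical names. The storage area is passed in as a parameter, so in the extension it would be `chrome.storage.local`.

```javascript
// Hypothetical helper: appends an entry to a list kept in extension storage
// instead of in a service-worker variable. `storage` is expected to behave
// like chrome.storage.local (promise-based get and set).
async function appendCaptureLog(storage, entry) {
  // get() resolves to an object that holds only the keys that exist
  const { captureLog = [] } = await storage.get('captureLog');
  const next = [...captureLog, entry];
  await storage.set({ captureLog: next });
  return next.length;
}
```

In the background script this would be called as `await appendCaptureLog(chrome.storage.local, { url: tab.url, at: Date.now() })`.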

Growth Lessons

  1. Solve a real pain point — people need screenshots with annotations daily
  2. Free tier matters — generous free features build your user base
  3. Chrome Web Store SEO — title, description, and screenshots are critical
  4. Respond to reviews — users appreciate when developers engage
  5. Iterate based on feedback — OCR and scheduled captures were user requests

Numbers

  • 2,000+ active users, all from organic growth
  • Listed on Product Hunt — additional visibility
  • 4.5+ star rating on Chrome Web Store

Try It


Building Chrome extensions is one of the fastest ways to ship a product that people use daily. The APIs are powerful, distribution is built-in (Chrome Web Store), and the feedback loop is fast.

Have you built a Chrome extension? What was your biggest challenge? Let me know in the comments!
