As developers, we know the pain of a slow browser extension. The moment a tool designed to help you starts hogging resources, it becomes a liability. That was the challenge we faced with Gaze Guard, our image classification extension. The initial, naive implementation — a simple "scan everything all the time" approach — was functional but created noticeable performance bottlenecks on image-heavy sites.
We decided to go back to the drawing board and completely re-engineer the core scanning logic. The goal was simple: deliver powerful, real-time image classification with near-zero impact on the user's browsing experience. This article breaks down the eight key architectural changes that transformed Gaze Guard from a resource hog into a performance powerhouse.
1. Viewport-Based Scanning: The IntersectionObserver Revolution
The most significant change was moving away from classifying every image on the page immediately, an approach that caused massive, synchronous classification spikes on page load.
The Fix: We adopted a viewport-based scanning strategy using the IntersectionObserver API.
- Lazy Discovery: Images are only processed when they come near the viewport, using a generous `rootMargin`.
- Minimal Workload: As soon as an image is handled (blurred or cleared), it is immediately unobserved. This keeps the observer's workload small and prevents the system from being overwhelmed by hundreds or thousands of off-screen elements.
This change alone eliminated the "big synchronous classification spikes" that plagued the previous version, ensuring a smooth initial page load.
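Here is a minimal sketch of the pattern. The `enqueueForClassification` helper and the 200px margin are illustrative, not Gaze Guard's actual internals:

```javascript
// Observe images and classify them only as they approach the viewport.
const imageObserver = new IntersectionObserver(
  (entries) => {
    for (const entry of entries) {
      if (!entry.isIntersecting) continue;
      // Unobserve immediately so the observer's workload stays small.
      imageObserver.unobserve(entry.target);
      enqueueForClassification(entry.target); // hypothetical queue helper
    }
  },
  { rootMargin: '200px' } // a generous margin: start work just before visibility
);

document.querySelectorAll('img').forEach((img) => imageObserver.observe(img));
```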
2. Event-Driven DOM Updates: Replacing Polling with MutationObserver
The old architecture relied on "aggressive interval scanning" — a fixed timer that constantly woke up to check for new images. This is a classic anti-pattern for performance.
The Fix: We replaced polling with an event-driven approach using MutationObserver.
- Initial Scan: A single, non-recurring `scanImagesOnce()` handles the initial page content.
- Dynamic Content: For sites with infinite scroll or dynamic feeds, a `MutationObserver` watches for added nodes. It only scans the subtrees of the newly added elements.
- Throttling: To handle bursts of DOM changes, the observer's callback is debounced by 100ms, collapsing multiple mutations into a single, efficient scan.
The result is that work only happens when the page actually changes, eliminating the constant, unnecessary overhead of fixed timers.
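A simplified sketch of the debounced observer, where `scanSubtreeForImages` is a stand-in for the real scanning logic:

```javascript
// Watch for dynamically added content and scan only the new subtrees.
let debounceTimer = null;
const pendingNodes = [];

const domObserver = new MutationObserver((mutations) => {
  for (const mutation of mutations) {
    for (const node of mutation.addedNodes) {
      if (node.nodeType === Node.ELEMENT_NODE) pendingNodes.push(node);
    }
  }
  // Debounce: collapse a burst of mutations into a single scan.
  clearTimeout(debounceTimer);
  debounceTimer = setTimeout(() => {
    const nodes = pendingNodes.splice(0);
    nodes.forEach((node) => scanSubtreeForImages(node)); // hypothetical scanner
  }, 100);
});

domObserver.observe(document.body, { childList: true, subtree: true });
```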
3. Batching and Throttling the ML Pipeline
Even with smart discovery, the machine learning classification process is the heaviest part of the workload. Running a large batch of classifications synchronously will inevitably freeze the UI.
The Fix: We introduced batching and cooperative scheduling for the ML pipeline.
- Batch Processing: Images waiting for classification are put into an `analysisQueue`. The `processQueue()` function handles them in small batches (e.g., `BATCH_SIZE = 5`).
- Yielding to the Browser: Crucially, between batches, the process yields to the browser using a tiny `setTimeout(..., 5)` and `tf.nextFrame()`. This allows layout and paint operations to run, keeping the UI responsive and preventing jank.
- Timeouts: Each classification is wrapped in a timeout (`TIMEOUT_MS = 3000`). This prevents "stuck" images (due to slow network or bad CORS) from blocking the entire queue.
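A sketch of what this cooperative scheduling might look like; `classifyImage` stands in for the actual model call, and the constants mirror the values above:

```javascript
const BATCH_SIZE = 5;
const TIMEOUT_MS = 3000;

const analysisQueue = [];
let processing = false;

// Reject if a single classification hangs (slow network, bad CORS, etc.).
function withTimeout(promise, ms) {
  return Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error('classification timed out')), ms)
    ),
  ]);
}

async function processQueue() {
  if (processing) return;
  processing = true;
  while (analysisQueue.length > 0) {
    const batch = analysisQueue.splice(0, BATCH_SIZE);
    await Promise.all(
      batch.map((img) =>
        withTimeout(classifyImage(img), TIMEOUT_MS).catch(() => {
          /* a stuck image must not block the rest of the queue */
        })
      )
    );
    // Yield so the browser can run layout and paint between batches.
    await tf.nextFrame();
    await new Promise((resolve) => setTimeout(resolve, 5));
  }
  processing = false;
}
```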
4. Caching Classification Results to Avoid Rework
Why classify the same image twice? On sites with repeated elements, or during back/forward navigation, re-running the ML model is pure waste.
The Fix: A robust, multi-layered verdict cache.
- In-Memory Cache: `srcVerdicts` provides fast, O(1) lookups for the current session.
- Persistent Cache: `persistentVerdicts` stores results in `chrome.storage.local`, ensuring that if a user navigates away and comes back, or even restarts the browser, known images are instantly recognized.
When an image is spotted, the extension checks the cache first. If a verdict exists, it immediately applies the blur or clears the image, skipping the entire classification pipeline.
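A simplified sketch of the two-layer lookup; the shape of the stored object is an assumption, with only the `chrome.storage.local` usage taken from the description above:

```javascript
// In-memory layer: fast, O(1) lookups for the current session.
const srcVerdicts = new Map();

async function getCachedVerdict(src) {
  if (srcVerdicts.has(src)) return srcVerdicts.get(src);

  // Persistent layer: survives navigation and browser restarts.
  const stored = await chrome.storage.local.get('persistentVerdicts');
  const verdict = stored.persistentVerdicts?.[src];
  if (verdict !== undefined) {
    srcVerdicts.set(src, verdict); // promote to memory for next time
    return verdict;
  }
  return null; // miss: fall through to the full classification pipeline
}
```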
5. Smarter and Cheaper DOM Operations (Micro-Optimizations)
Performance often comes down to avoiding expensive browser operations, particularly those that trigger layout thrashing (reflows).
| Optimization | Old Approach (Expensive) | New Approach (Efficient) | Why it Matters |
|---|---|---|---|
| Image Dimensions | Accessing `element.width`/`height` | Using `naturalWidth`/`naturalHeight` | Accessing `element.width`/`height` can force a synchronous layout/reflow. |
| Background Images | Using `getComputedStyle` | Using only inline `element.style.backgroundImage` | `getComputedStyle` can also trigger reflows, while inline style access is cheap. |
| Irrelevant Content | Scanning all images | Skipping images smaller than 16x16px | Avoids wasting cycles on icons, tracking pixels, and other irrelevant elements. |
| Vector Graphics | Feeding SVGs to the raster classifier | Short-circuiting SVGs as "safe" | Prevents feeding vector graphics into a model designed for raster images, saving classification time. |
| DOM Traversal | `querySelectorAll('*')` deep scan | Targeted `findAllImages()` strategy | The previous deep scan was explicitly marked as "too expensive" and replaced with a targeted search for `<img>` tags and selective iframe handling. |
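Several of these checks can be combined into a single cheap pre-filter. This is a sketch under stated assumptions: the function name and exact checks are illustrative, not Gaze Guard's actual `findAllImages()` internals:

```javascript
// Cheap pre-filter applied before an image is observed or queued.
function shouldSkipImage(img) {
  // naturalWidth/naturalHeight come from the decoded image data,
  // so reading them does not force a synchronous layout.
  if (img.naturalWidth > 0 && img.naturalWidth < 16) return true;
  if (img.naturalHeight > 0 && img.naturalHeight < 16) return true;

  // Short-circuit SVGs as "safe": they are vector graphics and should
  // not be fed to a raster-image classifier.
  const src = img.currentSrc || img.src || '';
  if (src.split('?')[0].toLowerCase().endsWith('.svg')) return true;

  return false;
}
```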
6. More Efficient Model and Backend Usage
The machine learning model itself needs to be managed efficiently.
- Model Caching: The NSFWJS model is loaded once and cached in memory, preventing repeated heavy initializations.
- TensorFlow Backend:
ensureTfReady()is used to enable TensorFlow.js production mode and allow it to select the best available backend (WebGL, WASM, etc.), rather than forcing a slower CPU-based backend. - Clean Cleanup: Dangerous blanket cleanup calls like
tf.disposeVariables()(which could wipe model weights) were removed, ensuring the model stays "warm" and ready for immediate use.
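A sketch of the load-once pattern: the `getModel` wrapper is illustrative, while `tf.enableProdMode()`, `tf.ready()`, and `nsfwjs.load()` are the standard TensorFlow.js/NSFWJS calls:

```javascript
// Assumes tf (TensorFlow.js) and nsfwjs are bundled with the extension.
let modelPromise = null;

function getModel() {
  if (!modelPromise) {
    modelPromise = (async () => {
      tf.enableProdMode(); // skip debug-only sanity checks
      await tf.ready();    // let tf.js pick the best backend (WebGL, WASM, ...)
      return nsfwjs.load(); // loaded once; the instance stays warm in memory
    })();
  }
  return modelPromise; // concurrent callers share the same load
}
```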
7. Background Fetching for Problematic Images
Some images, particularly those with strict CORS policies or special hosting, can fail to load in the content script, leading to repeated, failing attempts.
The Fix: We defer to the background service worker (`background.js`).
For images that fail to load directly, the content script sends a message to the background worker, which fetches the image and returns a data URI. This offloads network handling and retry logic to Chrome's more robust extension architecture, keeping the main content script simpler and less prone to getting stuck.
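A sketch of both halves of that handoff; the message shape and names are illustrative:

```javascript
// --- content script ---
// Ask the background worker to fetch an image that failed to load directly.
async function fetchViaBackground(src) {
  const response = await chrome.runtime.sendMessage({ type: 'FETCH_IMAGE', src });
  return response?.dataUri ?? null;
}

// --- background.js ---
chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
  if (message.type !== 'FETCH_IMAGE') return;
  (async () => {
    try {
      const res = await fetch(message.src);
      const blob = await res.blob();
      // Build a base64 data URI without FileReader (service-worker safe).
      const bytes = new Uint8Array(await blob.arrayBuffer());
      let binary = '';
      for (const b of bytes) binary += String.fromCharCode(b);
      sendResponse({ dataUri: `data:${blob.type};base64,${btoa(binary)}` });
    } catch {
      sendResponse({ dataUri: null });
    }
  })();
  return true; // keep the message channel open for the async response
});
```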
8. Turning Off Heavy Work on Unsupported Contexts
Finally, we prevent the extension from running in contexts where it can't or shouldn't work, such as PDF viewers.
- Robust Detection: A comprehensive `isPdfEnvironment()` check detects PDF viewers based on URLs, content types, and embedded tags.
- Immediate Stop: When a PDF is detected, `stopAll()` is called immediately. This cancels all observers, clears queues, and removes any existing blur markers, eliminating all overhead in a context where image classification is not useful.
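A simplified sketch of both functions, reusing the observer and queue names from the earlier sketches; the real checks are more comprehensive, and the blur-marker attribute is hypothetical:

```javascript
function isPdfEnvironment() {
  if (location.pathname.toLowerCase().endsWith('.pdf')) return true;
  if (document.contentType === 'application/pdf') return true;
  return Boolean(document.querySelector('embed[type="application/pdf"]'));
}

function stopAll() {
  imageObserver.disconnect(); // stop viewport tracking
  domObserver.disconnect();   // stop watching for DOM changes
  analysisQueue.length = 0;   // drop pending classification work
  // Remove existing blur markers ("data-gg-blurred" is a hypothetical attribute).
  document.querySelectorAll('[data-gg-blurred]').forEach((el) => {
    el.removeAttribute('data-gg-blurred');
  });
}

if (isPdfEnvironment()) stopAll();
```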
Conclusion: Performance as a Feature
The performance improvements in Gaze Guard are not just a nice-to-have; they are a core feature. By applying principles of lazy loading, event-driven architecture, cooperative scheduling, and aggressive caching, we've managed to integrate a heavy machine learning workload into the browser without compromising the user experience.
This journey highlights that in extension development, optimizing for the DOM and the browser's event loop is just as critical as optimizing the core algorithm. We hope this deep dive provides valuable insights for anyone building performance-sensitive tools for the web.