Projekta2

Posted on Jul 5

I built a Chrome extension used daily for PR review — 5 decisions I'd make differently

#productivity #chrome #webdev #javascript

I've been reviewing pull requests for most of my career.

At some point the queue got bad enough that I stopped asking "which PR should I review first?" and started asking "why does this keep happening?"

The answer was that I had no system. Just a flat list of open PRs, a vague sense of urgency, and two hours a day I couldn't account for.

So I built a tool. PR Focus Pro — a Chrome extension that triages GitHub PRs with AI summaries and a hybrid risk score. Free tier included; Pro is a $9.50 one-time payment, no subscription. It's been running in production for 6 weeks. I use it every day.

Here are five decisions I made building it, and what I'd do differently if I started today.

Decision 1 — I stored state in module-level variables. The MV3 service worker killed them silently.

The first version of the background script looked like this:

// background.js — v1, the wrong way
let cachedPRs = [];
let lastRefresh = null;

chrome.alarms.onAlarm.addListener(async (alarm) => {
  if (alarm.name === 'refreshPRs') {
    cachedPRs = await fetchPRsGraphQL(token, login);
    lastRefresh = Date.now();
  }
});

This worked perfectly in development. The service worker stayed alive because DevTools was open.

In production, Chrome terminates MV3 service workers after ~30 seconds of inactivity. Every time it restarted, cachedPRs was [] and lastRefresh was null. The popup loaded and showed nothing. Users saw a blank screen and assumed the extension was broken.

The fix is obvious once you've been burned by it: never rely on in-memory state outliving the service worker. Everything that needs to survive a restart goes to chrome.storage.local or IndexedDB:

// background.js — current version
async function refreshAllData() {
  const token = await getActiveToken();
  if (!token) {
    await savePRsToCache([], 'review');
    updateBadge(0);
    return;
  }
  const prs = await fetchPRsGraphQL(token, login);
  await savePRsToCache(prs, 'review'); // persists across service worker restarts
  updateBadge(prs.length);
}

There's a second problem I missed: calling refreshAllData() at the top level of background.js means it runs every time the service worker starts — which in MV3 can be dozens of times per hour. Without a throttle, that's dozens of unintended GitHub API calls.

// Add this before calling refreshAllData() at startup
async function shouldRefresh() {
  const { lastRefresh } = await chrome.storage.local.get('lastRefresh');
  return !lastRefresh || Date.now() - lastRefresh > 3 * 60 * 1000;
}

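// Service workers can't use top-level await, so run the check in a promise chain
shouldRefresh()
  .then(async (stale) => {
    if (!stale) return;
    await chrome.storage.local.set({ lastRefresh: Date.now() });
    await refreshAllData();
  })
  .catch(console.error);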

What I'd do differently: Treat the service worker as stateless from day one. Not as a refactor when users report blank screens — from day one.
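To make "stateless" concrete, here is a minimal sketch of what cache helpers in the spirit of savePRsToCache could look like. This is not the extension's actual code: the key naming, the extra `storage` parameter, and the in-memory stand-in are all assumptions, added so the logic can be exercised outside Chrome.

```javascript
// Sketch only: key names and helper shapes are assumptions.
// `storage` is anything with promise-based get/set, such as chrome.storage.local.
async function savePRsToCache(storage, prs, kind) {
  await storage.set({ [`prs:${kind}`]: { prs, savedAt: Date.now() } });
}

async function loadPRsFromCache(storage, kind) {
  const key = `prs:${kind}`;
  const result = await storage.get(key);
  // A fresh install has no entry yet, so default to an empty list.
  return result[key]?.prs ?? [];
}

// Minimal in-memory stand-in for chrome.storage.local, for tests only.
function createMemoryStorage() {
  const data = new Map();
  return {
    async get(key) {
      return data.has(key) ? { [key]: data.get(key) } : {};
    },
    async set(items) {
      for (const [k, v] of Object.entries(items)) {
        // Copy values so later mutations by the caller don't leak into the store.
        data.set(k, JSON.parse(JSON.stringify(v)));
      }
    },
  };
}
```

In the extension itself the first argument would be chrome.storage.local. The popup then reads through loadPRsFromCache and never depends on whether the service worker happens to be alive.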

Update from the comments on the previous article: Mudassir Khan caught a real bug in the streaming implementation I published — one that's easy to miss because it fails silently. decoder.decode(value) gives you whatever bytes arrived in that read, which doesn't always align with SSE event boundaries. If a chunk ends mid-line, chunk.split('\n') drops the tail and the next read starts without that continuation. The result is occasional token drops that look like the model cutting itself short. The fix is a line buffer that carries incomplete fragments across reads:

// streaming with chunk boundary fix
const reader = response.body.getReader();
const decoder = new TextDecoder();
let buffer = ''; // accumulate incomplete lines across reads

while (true) {
  const { done, value } = await reader.read();
  if (done) break;

  buffer += decoder.decode(value, { stream: true });
  const lines = buffer.split('\n');

  // Keep the last (potentially incomplete) line in the buffer
  buffer = lines.pop();

  for (const line of lines) {
    if (!line.startsWith('data: ')) continue;
    const data = line.slice(6);
    if (data === '[DONE]') continue;

    try {
      const parsed = JSON.parse(data);
      const token = parsed.choices[0]?.delta?.content || '';
      chrome.tabs.sendMessage(tabId, { type: 'AI_TOKEN', token });
    } catch (e) {
      // malformed chunk — skip
    }
  }
}
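The same buffering logic can be pulled into a pure helper, which makes the chunk-boundary case easy to check without a network call. A small sketch follows; createSSEParser is a name I made up for illustration, and the payload shape assumes an OpenAI-compatible streaming response like the one above.

```javascript
// Sketch: line-buffered SSE parsing as a pure function of the text fed in.
function createSSEParser(onToken) {
  let buffer = '';
  return function feed(text) {
    buffer += text;
    const lines = buffer.split('\n');
    // After a complete line the last element is ''; otherwise it is a partial line.
    buffer = lines.pop();
    for (const line of lines) {
      if (!line.startsWith('data: ')) continue;
      const data = line.slice(6).trim();
      if (data === '[DONE]') continue;
      try {
        const token = JSON.parse(data).choices?.[0]?.delta?.content ?? '';
        if (token) onToken(token);
      } catch {
        // malformed JSON payload: skip it
      }
    }
  };
}

const tokens = [];
const feed = createSSEParser((t) => tokens.push(t));
const event = 'data: {"choices":[{"delta":{"content":"Hello"}}]}\n\n';
feed(event.slice(0, 20)); // this read ends mid-line, nothing is emitted yet
feed(event.slice(20));    // the continuation completes the line
console.log(tokens);      // tokens is ['Hello']
```

Without the buffer, the first fragment fails JSON.parse and the second no longer starts with "data: ", so both halves are skipped and the token disappears.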

Separately, Nazar Boyko pointed out that the cleaner long-term fix for the service worker termination problem itself is an offscreen document — it holds the fetch open without the 30-second limit, and relays tokens back via messages the same way. One caveat: only one offscreen document at a time, so concurrent streams need a small queue. Build Log #008 is going to cover this.
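Until that build log is written, here is a rough sketch of the "small queue" part only. It assumes each stream is wrapped in a function that returns a promise resolving when the stream ends; how that function talks to the offscreen document is left out, and createSerialQueue is an illustrative name, not something from the extension.

```javascript
// Sketch: run jobs strictly one at a time, in arrival order.
// `job` is any function returning a promise, for example one that asks the
// offscreen document to start a fetch and resolves when the stream finishes.
function createSerialQueue() {
  let tail = Promise.resolve();
  return function enqueue(job) {
    const run = tail.then(() => job());
    // Swallow rejections on the chain itself so one failed stream
    // does not block the streams queued behind it.
    tail = run.catch(() => {});
    return run; // the caller still sees the job's own result or error
  };
}
```

Each caller awaits the promise returned by enqueue, so a second summary request simply waits for the first stream to finish instead of competing for the single offscreen document.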

Decision 2 — I built AI features before I confirmed anyone wanted them.

The hybrid priority algorithm took two weeks to build:

// priority.js
function computeScore(pr, w) {
  let score = 0;

  if (pr.ciStatus === 'failure') score += w.weightCiFail;
  else if (pr.ciStatus === 'pending') score += w.weightCiFail * 0.5;

  score += pr.age * w.weightAge;

  // AI risk boost: score 0–100 maps to 0–200 extra points
  if (pr.aiRisk?.score != null) {
    score += pr.aiRisk.score * 2;
  }

  return score;
}

The problem: the non-AI version — CI status + age — already solved 80% of the problem for most users. The AI layer is genuinely useful for catching things the deterministic signals miss (a PR that's 2 days old with green CI but touches authentication). But most users configure the extension, see their PRs sorted, and never set up an API key.
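A worked example of that authentication case shows where the AI layer changes the ordering. The weights, the PRs, and the assumption that pr.age is measured in days are all made up for illustration; only the scoring function is the one shown above, repeated so the snippet runs on its own.

```javascript
// Same logic as computeScore above, repeated for a self-contained example.
function computeScore(pr, w) {
  let score = 0;
  if (pr.ciStatus === 'failure') score += w.weightCiFail;
  else if (pr.ciStatus === 'pending') score += w.weightCiFail * 0.5;
  score += pr.age * w.weightAge;
  if (pr.aiRisk?.score != null) score += pr.aiRisk.score * 2;
  return score;
}

// Hypothetical weights and PRs, chosen only to show the arithmetic.
const weights = { weightCiFail: 100, weightAge: 5 };
const stalePR = { title: 'Bump deps', ciStatus: 'success', age: 10 };
const authPR = {
  title: 'Rework session tokens',
  ciStatus: 'success',
  age: 2,
  aiRisk: { score: 85 },
};

const staleScore = computeScore(stalePR, weights);                        // 10 * 5 = 50
const authWithAI = computeScore(authPR, weights);                         // 2 * 5 + 85 * 2 = 180
const authWithoutAI = computeScore({ ...authPR, aiRisk: undefined }, weights); // 2 * 5 = 10
console.log({ staleScore, authWithAI, authWithoutAI });
```

On deterministic signals alone the auth PR sits below the stale dependency bump (10 vs 50). With the risk score it jumps to the top (180 vs 50), which is the one case the non-AI version misses.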

I built AI first because it was more interesting to build. The unsexy features — multi-account support, stale PR notifications, CSV export — are what users actually ask about in support emails.

The architecture at least degrades gracefully:

// ai.js
export async function enrichPRsWithAI(prs, fetchDiff) {
  const [pro, config] = await Promise.all([isPro(), getAIConfig()]);

  if (!pro) {
    console.log('[AI] No PRO license – skipping AI enrichment');
    return prs; // sorting still works without AI
  }
  if (!config.apiKey || !config.enabled) {
    console.log('[AI] AI disabled or no API key – skipping enrichment');
    return prs;
  }
  // ... AI enrichment only runs when it should
}

What I'd do differently: Ship the deterministic version first. Let users tell you they want AI. Build the interesting thing second.

Decision 3 — I almost skipped proper license validation. I'm glad I didn't.

The Gumroad license validation is about 80 lines of code. I almost used a simple boolean flag stored locally that users could flip in DevTools.

What made me build it properly was thinking about the failure mode: what happens when Gumroad's API is down and a paying user opens the extension?

// license.js — the failure mode that matters
export async function verifyLicense(key) {
  try {
    const res = await fetch(GUMROAD_VERIFY_URL, { ... });
    const json = await res.json();

    if (json.success) {
      await chrome.storage.local.set({
        licenseKey: key.trim(),
        licenseValid: true,
        licenseValidatedAt: Date.now(),
      });
      return { success: true };
    }

    // Key invalid according to Gumroad.
    // Do NOT wipe a previously stored valid license —
    // offline users and rate-limited requests look identical to invalid keys.
    return { success: false, error: json.message };

  } catch (err) {
    // Network failure — fall back to cached state
    const { licenseValid } = await chrome.storage.local.get('licenseValid');
    if (licenseValid) return { success: true, offline: true };
    return { success: false, error: 'networkError', offline: true };
  }
}

The 24-hour revalidation window was deliberate: validate on first use, cache for 24 hours, revalidate silently in the background. The extension works on airplanes and in corporate environments with restricted outbound traffic.
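The 24-hour window itself is not visible in the verifyLicense excerpt, so here is a minimal sketch of the decision it implies. The field names come from the stored record above; the function name and the three return values are assumptions for illustration.

```javascript
const REVALIDATE_AFTER_MS = 24 * 60 * 60 * 1000;

// Decides what to do with the cached license record read from storage.
function licenseAction(cached, now = Date.now()) {
  // Nothing trusted yet: the first use has to wait for Gumroad.
  if (!cached?.licenseValid) return 'verify-now';
  const age = now - (cached.licenseValidatedAt ?? 0);
  // Past the window the cached answer is still used, and a silent
  // verifyLicense call refreshes it in the background.
  return age > REVALIDATE_AFTER_MS ? 'revalidate-in-background' : 'use-cache';
}
```

The important property is that neither 'use-cache' nor 'revalidate-in-background' blocks the UI on the network, which is what keeps the extension usable offline.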

What I'd do differently: Build license validation on day one of adding any paid feature. The edge cases — revoked keys, refunds, network failures, offline users — are things you want to think through before you have paying customers, not while handling their support tickets.

Decision 4 — My AI prompts were English-only for the first month.

The extension has had a fully bilingual UX from the start: UI strings, notifications, the options page. But the AI prompts were hardcoded in English:

// ai.js — v1, the wrong way
const system = `You are a senior software engineer. Write a 2-3 sentence summary...`;

A developer using the extension with Chrome set to Spanish would get English AI responses. The fix was one function and a conditional:

// ai.js — current version
function getPromptLanguage() {
  const lang = chrome.i18n.getUILanguage().slice(0, 2);
  return lang === 'es' ? 'es' : 'en';
}

export async function summarizePR(pr, diff = '') {
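  const lang = getPromptLanguage();
  const user = buildUserPrompt(pr, diff); // hypothetical helper that turns pr and diff into the user message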

  const system = lang === 'es'
    ? `Eres un ingeniero de software senior. Dado un pull request de GitHub,
       escribe un resumen de 2-3 oraciones en español sencillo de lo que hace
       este PR y por qué es importante. Sin markdown. Sin viñetas.`
    : `You are a senior software engineer. Given a GitHub pull request,
       write a 2-3 sentence plain-English summary of what this PR does
       and why it matters. No markdown. No bullet points.`;

  return complete(system, user, 150);
}

What I'd do differently: If your UI is bilingual, your AI responses need to be too. Check the non-obvious surfaces — AI prompts aren't visible in the UI, which makes them easy to miss.

Decision 5 — I shipped `<all_urls>` in host_permissions and didn't think hard enough about it.

The manifest currently has this:

{
  "host_permissions": [
    "https://api.github.com/*",
    "https://github.com/*",
    "<all_urls>"
  ]
}

The <all_urls> is there because the AI endpoint is user-configurable — someone might use Azure OpenAI, a self-hosted LLM, or any other OpenAI-compatible endpoint. You can't enumerate all of those in the manifest at build time.

The problem: <all_urls> in an extension that also handles GitHub tokens is a legitimate concern for security-conscious users. It's not a vulnerability: the extension only calls the endpoint you configure. But it looks bad. And in Chrome Web Store (CWS) review, anything that looks bad slows you down.

The cleaner path I'm moving to — declare known providers statically, request runtime permissions for custom endpoints:

// options.js — when user enters a custom endpoint
async function requestCustomEndpointPermission(endpoint) {
  const url = new URL(endpoint);
  const origin = `${url.protocol}//${url.hostname}/*`;

  const granted = await chrome.permissions.request({ origins: [origin] });

  if (granted) {
    await saveAIConfig({ endpoint, ... });
  } else {
    showError('Permission required to connect to this endpoint.');
  }
}

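And the manifest becomes explicit again. One catch: chrome.permissions.request() can only grant origins that the manifest declares as optional, so custom endpoints need an optional_host_permissions entry too (the broad HTTPS pattern here is one way to do it):

{
  "host_permissions": [
    "https://api.github.com/*",
    "https://github.com/*",
    "https://api.openai.com/*",
    "https://api.groq.com/*",
    "https://api.mistral.ai/*",
    "https://api.together.xyz/*",
    "https://api.gumroad.com/*",
    "http://localhost:*/*"
  ],
  "optional_host_permissions": [
    "https://*/*"
  ]
}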
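With providers declared statically, the options page only needs to prompt for endpoints the manifest doesn't already cover. A small sketch of that check, assuming the provider list mirrors the manifest above; both function names are hypothetical.

```javascript
// Hosts assumed to be declared statically in host_permissions.
const STATIC_HOSTS = new Set([
  'api.openai.com',
  'api.groq.com',
  'api.mistral.ai',
  'api.together.xyz',
]);

// Builds the origin pattern to request for a user-supplied endpoint.
function toOriginPattern(endpoint) {
  const url = new URL(endpoint); // throws TypeError on malformed input
  if (url.protocol !== 'https:' && url.protocol !== 'http:') {
    throw new Error(`Unsupported scheme: ${url.protocol}`);
  }
  return `${url.protocol}//${url.hostname}/*`;
}

// True when the endpoint is not covered by the static manifest entries.
function needsRuntimePermission(endpoint) {
  const url = new URL(endpoint);
  // http://localhost:*/* is already declared in the manifest.
  if (url.protocol === 'http:' && url.hostname === 'localhost') return false;
  return !(url.protocol === 'https:' && STATIC_HOSTS.has(url.hostname));
}
```

Only when needsRuntimePermission returns true does the options page need to call chrome.permissions.request, so users on the listed providers never see a permission prompt.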

What I'd do differently: Think about the permission model before shipping, not after. <all_urls> is the nuclear option. It works, but it's harder to defend to the exact users you want using a BYOK developer tool.

What actually moved the needle

Looking back at 6 weeks of production use:

Signal	What I expected	What actually happened
Most-requested features	AI summaries, risk scoring	Multi-account, CSV export, stale notifications
Top conversion trigger	Risk score visualization	AI summary on first PR opened
Top trust signal	Privacy policy page	BYOK answering "does my code go to your server?" with No
Biggest support category	Onboarding confusion	API keys with wrong format or missing scopes
Hardest trade-off	BYOK vs hosted backend	One-time payment vs subscription
Biggest gap I didn't see coming	User errors	AI traceability — when a summary is wrong, users can't tell what prompt, what diff, what response produced it

That last row came from a comment by Raju Dandigam on the previous article, and he's right. When a user says "this AI summary is wrong," I need to answer: what diff was sent, what prompt was used, what did the provider return? Currently I log all three in IndexedDB, but there's no UI to inspect them. The fix I'm building: a rolling debug view of the last 20 AI calls (prompt, diff, raw response) that turns "wrong summary" from a support ticket into a self-serve investigation.
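The rolling part of that debug view is a small amount of logic. A minimal sketch, assuming each record holds the prompt, the diff, and the raw response; the function name and record shape are illustrative, and persistence to IndexedDB is left to the caller.

```javascript
const MAX_ENTRIES = 20;

// Appends one AI call record and drops the oldest entries beyond the cap.
// Returns a new array so the caller decides where and when to persist it.
function appendAICall(log, entry, max = MAX_ENTRIES) {
  const record = { at: Date.now(), ...entry };
  return [...log, record].slice(-max);
}

// Example: after 25 calls only the 20 most recent remain.
let log = [];
for (let i = 0; i < 25; i++) {
  log = appendAICall(log, { prompt: `p${i}`, diff: '', response: '' });
}
console.log(log.length, log[0].prompt); // 20 'p5'
```

Capping at write time keeps the stored data bounded, which matters because diffs can be large.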

The specific error messages reduced API key support tickets by ~60%:

if (response.status === 401)
  return { valid: false, error: 'Invalid key — check you copied it completely, no trailing spaces.' };
if (response.status === 429)
  return { valid: false, error: 'Rate limit — you\'ve hit the free tier ceiling.' };
if (response.status === 403)
  return { valid: false, error: 'Permission denied — this key may not access this model tier.' };

The decision I'm most uncertain about in retrospect: the one-time payment model. It's philosophically right — I don't want to charge monthly for something that doesn't recur. But it means no recurring revenue. Every month starts at zero. I wrote the full reasoning in Build Log #003.

If you're building something similar

PR Review Canvas — free, open-source code review checklist. 51 items, live readiness score, no account required. MIT licensed.

Build Logs — the engineering decisions behind PR Focus, including the ones I got wrong. Not a tutorial series — a record of real choices with real trade-offs.

Summer Review Swap — post a PR, review one in return. Open through July.

One more decision I didn't cover: what to keep closed-source. PR Focus is the paid product, so its code stays private. TabCost Pro and ChainTrace are also private. The open-source layer — PR Review Canvas, Build Logs, the Review Swap — is where I put the methodology, the decisions, and the community infrastructure. That separation has held up better than I expected.

Five decisions in the article.
A sixth one found by a reader in the comments of the last article.
A seventh one I hadn't seen until someone named it for me.

That's probably the honest ratio for any tool in production:
the ones you found yourself, and the ones you needed someone else to find.

Which of these have you hit building your own tools? The service worker state one I hear from almost everyone who's shipped an MV3 extension — curious if the <all_urls> permissions problem is as common.

And if you've shipped something — what's the one decision you'd reverse if you could? Drop it in the comments. I'd genuinely like to know.