Building Replygen started as a personal itch. As a solo SaaS founder, I needed to stay consistently active on LinkedIn, X, and Threads to drive organic distribution — but crafting thoughtful replies at scale was burning time I didn't have. So I built an AI engagement co-pilot that lives in your browser and suggests context-aware replies as you scroll your feed.
This post is a technical walkthrough of the key architecture decisions, the unexpected challenges, and the lessons that only come from shipping a real Chrome extension to production users.
The Core Architecture
The extension follows the standard content script + background service worker pattern:
- Content script: injected into the LinkedIn/X/Threads DOM. Handles UI injection (the suggestion overlay), reads post content via DOM selectors, and sends data to the background worker via chrome.runtime.sendMessage
- Background service worker: handles LLM API calls, manages auth tokens, persists preferences via chrome.storage.sync
- Side panel: settings, tone configuration, usage stats
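The message flow between the content script and the service worker can be sketched roughly like this. It is a minimal sketch rather than Replygen's actual code: the `GENERATE_REPLY` message type, `buildGenerateMessage`, `handleMessage`, and the injected `generateReply` function are all hypothetical names. Only `chrome.runtime.sendMessage` and `chrome.runtime.onMessage` are real extension APIs.

```javascript
// Hypothetical message type shared by the content script and the worker.
const GENERATE_REPLY = 'GENERATE_REPLY';

// Content-script side: package the scraped post into a message.
function buildGenerateMessage(platform, postText) {
  return { type: GENERATE_REPLY, platform, postText };
}

// Background side: route an incoming message to the reply generator.
// `generateReply` is injected so the LLM call can be stubbed out.
async function handleMessage(message, generateReply) {
  if (!message || message.type !== GENERATE_REPLY) {
    return { ok: false, error: 'unknown message type' };
  }
  try {
    const reply = await generateReply(message.platform, message.postText);
    return { ok: true, reply };
  } catch (err) {
    return { ok: false, error: String((err && err.message) || err) };
  }
}

// Worker wiring; only runs inside an extension context.
if (typeof chrome !== 'undefined' && chrome.runtime && chrome.runtime.onMessage) {
  chrome.runtime.onMessage.addListener((message, sender, sendResponse) => {
    // Placeholder generator; a real worker would call the LLM API here.
    const stubGenerator = async () => 'stub reply';
    handleMessage(message, stubGenerator).then(sendResponse);
    return true; // keep the channel open for the async response
  });
}

// In the content script, the call would look like:
// const response = await chrome.runtime.sendMessage(
//   buildGenerateMessage('linkedin', postText)
// );
```

Returning `true` from the listener matters: without it, the message channel closes before the asynchronous reply arrives.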
This is clean in theory. In practice, the DOM layer is where all the pain lives.
The DOM Fragility Problem
LinkedIn, X, and Threads update their frontend constantly. Selectors that worked last week break silently this week — no errors, just a blank suggestion box.
The solution we landed on:
const POST_SELECTORS = {
  linkedin: [
    '.feed-shared-update-v2__description',
    '.feed-shared-text',
    '[data-test-id="main-feed-activity-card"]' // fallback
  ],
  twitter: [
    '[data-testid="tweetText"]',
    '.tweet-text' // legacy fallback
  ],
  threads: [
    '._a9zs',
    '[class*="x1iorvi4"]' // brittle, needs frequent updates
  ]
};
We run a selector health check on extension startup — if the primary selector returns null, it falls through to the next. When all fallbacks fail, the UI gracefully shows "Content unavailable" rather than throwing an error.
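The fall-through described above could look something like the following. This is a sketch under assumptions: `findPostText` and `describePost` are hypothetical names, and unlike the description, this version also skips selectors that match an element with empty text.

```javascript
// Try each selector in order and return the first match with non-empty text.
// `root` is anything with a querySelector method (document or a post element).
function findPostText(root, selectors) {
  for (const selector of selectors) {
    const node = root.querySelector(selector);
    const text = node && node.textContent ? node.textContent.trim() : '';
    if (text) {
      return { text, selector }; // report which selector worked
    }
  }
  return null; // every fallback failed
}

// Hypothetical helper: map a failed lookup to the fallback UI message.
function describePost(root, selectors) {
  const match = findPostText(root, selectors);
  return match ? match.text : 'Content unavailable';
}
```

A call such as `describePost(document, POST_SELECTORS.linkedin)` would then drive the overlay. Returning the selector that matched also makes it easy to log when a primary selector has stopped working.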
Prompt Design: Making Replies Sound Like You
The first version generated generic, professional replies. Users hated it. The fix was a three-layer prompt structure:
Layer 1: Platform context
"You are writing a LinkedIn comment. LinkedIn has a professional, insight-driven tone..."
Layer 2: Post context
"The post you are replying to says: [post_content]
Thread context: [thread_context]"
Layer 3: User tone profile
"The user's writing style: [tone_descriptor]
Examples of their recent comments: [sample_1], [sample_2]"
Constraints:
- Under 280 chars for X replies
- Never start with "Great post!" or similar filler
- Always add a unique perspective, not just agreement text
The tone profile is built from the user's last 15–20 posts, summarized into a descriptor. This one change improved day-7 retention significantly.
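Assembling the three layers is mostly string templating. Here is a minimal sketch, assuming a hypothetical `buildPrompt` function whose field names mirror the bracketed placeholders above. The default constraints are copied from the list in this post, and the platform context is passed in rather than hard-coded.

```javascript
// Constraint lines copied from the post; callers can override them.
const DEFAULT_CONSTRAINTS = [
  'Under 280 chars for X replies',
  'Never start with "Great post!" or similar filler',
  'Always add a unique perspective, not just agreement text',
];

// Join platform, post, and tone layers (plus constraints) into one prompt.
function buildPrompt({
  platformContext,
  postContent,
  threadContext,
  toneDescriptor,
  samples,
  constraints = DEFAULT_CONSTRAINTS,
}) {
  const layers = [
    // Layer 1: platform context
    platformContext,
    // Layer 2: post context
    `The post you are replying to says: ${postContent}\n` +
      `Thread context: ${threadContext}`,
    // Layer 3: user tone profile
    `The user's writing style: ${toneDescriptor}\n` +
      `Examples of their recent comments: ${samples.join(', ')}`,
    // Constraints as a bulleted block
    ['Constraints:', ...constraints.map((c) => `- ${c}`)].join('\n'),
  ];
  return layers.join('\n\n');
}
```

Keeping the static layers (platform context, tone profile) separate from the dynamic post content also makes it easier to cache them independently.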
Handling API Costs at Scale
Two optimizations that cut costs by ~40%:
1. Prompt caching: the platform context and user tone profile are static between requests. Marking these portions for Anthropic's prompt caching means repeat requests read them from the cache at a reduced rate, so you pay the full input-token price only for the dynamic post content.
2. Generation gating: only fire the API call when the user explicitly clicks "Generate Reply." Early versions pre-generated proactively and burned tokens on posts users never engaged with.
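For the caching half, a request body for Anthropic's Messages API might be shaped like this. Treat it as a sketch: `buildCachedRequest` is a hypothetical helper, the `max_tokens` value is an assumption, and the model ID is left as a parameter. The `cache_control: { type: 'ephemeral' }` marker on a system block is the documented way to mark a cacheable prefix.

```javascript
// Build a Messages API body with the static context marked as cacheable.
function buildCachedRequest({ model, platformContext, toneProfile, postContent }) {
  return {
    model,
    max_tokens: 300, // assumed cap; replies are short
    system: [
      { type: 'text', text: platformContext },
      {
        type: 'text',
        text: toneProfile,
        // Cache breakpoint: everything up to and including this block
        // is eligible for reuse on later requests.
        cache_control: { type: 'ephemeral' },
      },
    ],
    // Only this part changes per request, so it stays outside the cache.
    messages: [
      { role: 'user', content: `The post you are replying to says: ${postContent}` },
    ],
  };
}
```

To pair this with generation gating, the function would be called only from the "Generate Reply" click handler, never while the user scrolls. Note that the API enforces a minimum cacheable prompt length, so a very short prefix may not be cached at all.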
Auth Architecture
Users authenticate with Google OAuth for account management and cross-device sync. Platform sessions (LinkedIn, X, Threads) are never stored — the extension operates on whatever browser session is already active. Zero credential storage liability, zero OAuth integration work per platform.
The UX Insight That Changed Everything
The early version had a two-step flow: generate → confirm → post. Collapsing it to a single-click post flow (with a 3-second undo toast) doubled daily active usage in the week after shipping.
Every extra click in a workflow tool is a reason to abandon the habit. Optimize ruthlessly for speed of the happy path.
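One way to implement a single-click flow with an undo window is to hold the post back until the toast expires. The sketch below assumes that design (the alternative would be to post immediately and delete on undo). `postWithUndo` and its `timers` parameter are hypothetical names, with the timer functions injectable so the logic can be tested without waiting.

```javascript
// Wrap the global timers so they are never called with a foreign `this`.
const realTimers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle),
};

// Schedule `post` after `delayMs`, returning a handle that can cancel it.
function postWithUndo(post, delayMs = 3000, timers = realTimers) {
  let state = 'pending';
  const handle = timers.set(() => {
    if (state === 'pending') {
      state = 'posted';
      post();
    }
  }, delayMs);
  return {
    // Returns true if the post was cancelled in time, false otherwise.
    undo() {
      if (state !== 'pending') return false;
      state = 'cancelled';
      timers.clear(handle);
      return true;
    },
    status: () => state,
  };
}
```

The undo toast's button would call `undo()`, and the toast itself can be dismissed as soon as `status()` is no longer `'pending'`.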
If you're building something in the Chrome extension + AI space, happy to compare notes. Try the extension at replygen.app.