Most browser-based AI reading assistants run in the cloud. Every time you ask a question about a page, data such as the page content, your open tabs, and your queries can be sent to a proprietary backend.
I wanted a private, zero-subscription reading companion that ran 100% locally on my machine.
To solve this, I built ContextBridge, a Chrome extension that sits in your sidebar, converts web pages to Markdown, stores them locally, and lets you chat with them offline using local LLMs via Ollama.
In this article, I’ll walk through the architectural choices behind building a local-first extension using Chrome’s Manifest V3, IndexedDB, and Ollama.
The Architecture: Why Local-First in a Browser?
Building a local-first system within a browser sandbox presents unique challenges and benefits:
[ Active Webpage ] ──(Ctrl+Shift+E)──> [ DOM-to-Markdown Extractor ]
                                                     │
                                                     ▼
[ Ollama (localhost) ] <───(Direct API)─── [ Extension Sidebar ] <───> [ IndexedDB (Local Store) ]
No Backend Middleman: The sidebar connects directly to your local Ollama port. Your API keys for cloud fallbacks (like Claude or Gemini) are stored strictly in chrome.storage.local.
IndexedDB over LocalStorage: For storing larger document indexes, chunked text, and metadata, localStorage is too restrictive (roughly 5 MB per origin, with a synchronous API that blocks the UI thread). IndexedDB provides asynchronous, high-capacity structured storage.
No Bundlers, No Bloat: The extension is built in vanilla JS with zero external dependencies, which keeps load times fast.
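To make the extractor step in the diagram concrete, here is a minimal sketch of a DOM-to-Markdown walk. It is not ContextBridge's actual extractor: the function names (`domToMarkdown`, `inlineText`), the handled tags, and the skipped tags are all assumptions chosen for illustration. It only relies on standard DOM node properties (`nodeType`, `tagName`, `childNodes`, `textContent`, `getAttribute`).

```javascript
// Hypothetical sketch, not the extension's real extractor.
// Block-level tags we convert, mapped to their Markdown prefix.
const BLOCK_PREFIX = new Map([
  ['H1', '# '], ['H2', '## '], ['H3', '### '], ['P', ''], ['LI', '- '],
]);
// Tags whose content is never part of the readable page.
const SKIPPED = new Set(['SCRIPT', 'STYLE', 'NAV', 'FOOTER', 'NOSCRIPT']);

// Render the inline content of a node (text, links, emphasis, inline code).
function inlineText(node) {
  if (node.nodeType === 3) return node.textContent.replace(/\s+/g, ' '); // text node
  if (node.nodeType !== 1) return ''; // ignore comments and other node types
  const inner = Array.from(node.childNodes).map(inlineText).join('');
  switch (node.tagName) {
    case 'A': {
      const href = node.getAttribute('href');
      return href ? `[${inner}](${href})` : inner;
    }
    case 'STRONG': case 'B': return `**${inner}**`;
    case 'EM': case 'I': return `*${inner}*`;
    case 'CODE': return `\`${inner}\``;
    default: return inner;
  }
}

// Walk the tree and emit one Markdown block per heading, paragraph, or list item.
function domToMarkdown(root) {
  const blocks = [];
  (function walk(node) {
    if (node.nodeType !== 1 || SKIPPED.has(node.tagName)) return;
    if (BLOCK_PREFIX.has(node.tagName)) {
      const text = inlineText(node).trim();
      if (text) blocks.push(BLOCK_PREFIX.get(node.tagName) + text);
    } else {
      Array.from(node.childNodes).forEach(walk); // descend into containers
    }
  })(root);
  return blocks.join('\n\n');
}
```

In a content script this could be called as `domToMarkdown(document.querySelector('article') ?? document.body)`. The sketch deliberately ignores tables, images, nested lists, and text that sits directly inside container elements, so a real extractor would need more cases.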
💾 Storing the Web: Implementing IndexedDB
We need a clean utility to manage IndexedDB without pulling in large npm wrappers. Here is a simple vanilla JavaScript wrapper that manages our pages:
javascript
class ContextStore {
  constructor() {
    this.dbName = 'ContextBridgeDB';
    this.dbVersion = 1;
    this.db = null;
  }

  // Open (or create) the database; resolves with this store once it is ready.
  async init() {
    return new Promise((resolve, reject) => {
      const request = indexedDB.open(this.dbName, this.dbVersion);
      request.onerror = () => reject(request.error);
      request.onsuccess = () => {
        this.db = request.result;
        resolve(this);
      };
      // Runs on first open or on a version bump: create the store and its indexes.
      request.onupgradeneeded = (event) => {
        const db = event.target.result;
        if (!db.objectStoreNames.contains('pages')) {
          const store = db.createObjectStore('pages', { keyPath: 'url' });
          store.createIndex('domain', 'meta.domain', { unique: false });
          store.createIndex('indexedAt', 'indexedAt', { unique: false });
        }
      };
    });
  }

  // Insert or overwrite a page record, keyed by its `url` property.
  async savePage(pageData) {
    return new Promise((resolve, reject) => {
      if (!this.db) {
        reject(new Error('Call init() before savePage().'));
        return;
      }
      const transaction = this.db.transaction(['pages'], 'readwrite');
      transaction.objectStore('pages').put({
        ...pageData,
        indexedAt: Date.now()
      });
      // Resolve only once the write has actually been committed.
      transaction.oncomplete = () => resolve();
      transaction.onerror = () => reject(transaction.error);
      transaction.onabort = () => reject(transaction.error);
    });
  }
}
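The wrapper above only writes. As a rough sketch of the read side, the helpers below fetch a page by its `url` key and list pages through the `domain` index defined in `onupgradeneeded`. The helper names (`requestToPromise`, `getPage`, `getPagesByDomain`) are my own for illustration and are not taken from the ContextBridge codebase; they take the open `IDBDatabase` handle as an argument.

```javascript
// Wrap a single IDBRequest in a promise that settles with its result or error.
function requestToPromise(request) {
  return new Promise((resolve, reject) => {
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Look up one page by primary key (the 'url' keyPath).
// Resolves to undefined when no record exists for that URL.
function getPage(db, url) {
  const store = db.transaction('pages', 'readonly').objectStore('pages');
  return requestToPromise(store.get(url));
}

// List every stored page whose meta.domain matches, via the 'domain' index.
function getPagesByDomain(db, domain) {
  const store = db.transaction('pages', 'readonly').objectStore('pages');
  return requestToPromise(store.index('domain').getAll(domain));
}
```

With the class above, usage would look roughly like `const store = await new ContextStore().init();` followed by `await getPagesByDomain(store.db, 'example.com')`.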
🤖 Direct Sidebar Integration with Ollama
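Ollama exposes a clean HTTP API locally (by default at http://localhost:11434). The primary challenge when calling localhost from a Chrome extension sidebar is CORS (Cross-Origin Resource Sharing).
Fortunately, in Manifest V3, extension pages (including the side panel) and the background service worker get cross-origin access to any host listed in the extension's host permissions, so the browser does not apply its usual CORS restrictions to those requests. Ollama also validates the request's Origin header on its side; if requests come back with HTTP 403, you may need to allow extension origins through the OLLAMA_ORIGINS environment variable (for example, chrome-extension://*). The manifest.json entries look like this: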
json
{
"permissions": [
"activeTab",
"storage",
"sidePanel"
],
"host_permissions": [
"http://localhost:11434/*"
]
}
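Before sending a chat request, it helps to confirm that Ollama is actually running and to see which models are installed. The sketch below calls Ollama's `/api/tags` endpoint, which lists local models. The function name `listLocalModels` and the injectable `fetchImpl` parameter are assumptions for illustration (the latter exists only so the function can be exercised without a network).

```javascript
// Sketch: check that Ollama is reachable and return the installed model names.
// `fetchImpl` defaults to the global fetch; pass a stub to test offline.
async function listLocalModels(baseUrl = 'http://localhost:11434', fetchImpl = fetch) {
  let response;
  try {
    response = await fetchImpl(`${baseUrl}/api/tags`);
  } catch (err) {
    // fetch rejects on network-level failures, e.g. nothing listening on the port.
    throw new Error(`Ollama is not reachable at ${baseUrl}. Is it running?`);
  }
  if (!response.ok) {
    throw new Error(`Ollama returned HTTP ${response.status}.`);
  }
  // The response has the shape { models: [{ name: 'llama3:latest', ... }, ...] }.
  const { models = [] } = await response.json();
  return models.map((m) => m.name);
}
```

The sidebar could use the returned names to populate a model picker, or show a "start Ollama" hint when the call rejects.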
Here is the fetch logic the sidebar uses to stream responses from local models (like llama3 or mistral), with the active page's content supplied as context:
javascript
// Streams a chat completion from Ollama, yielding text fragments as they arrive.
async function* askLocalOllama(modelName, systemPrompt, userMessage) {
  const response = await fetch('http://localhost:11434/api/chat', {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      model: modelName,
      messages: [
        { role: 'system', content: systemPrompt },
        { role: 'user', content: userMessage }
      ],
      stream: true
    })
  });
  if (!response.ok) {
    throw new Error(`Ollama request failed (HTTP ${response.status}).`);
  }

  // The streamed body is newline-delimited JSON: one object per line.
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  while (true) {
    const { done, value } = await reader.read();
    // On the final pass, flush the decoder so a trailing line is not lost.
    buffer += done ? decoder.decode() : decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = done ? '' : lines.pop(); // Keep partial line in buffer
    for (const line of lines) {
      if (!line.trim()) continue;
      const json = JSON.parse(line);
      if (json.message && json.message.content) {
        yield json.message.content;
      }
    }
    if (done) break;
  }
}
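Because `askLocalOllama` is an async generator, the sidebar can consume it with `for await`. Below is a small sketch of a consumer that accumulates the fragments and reports the text so far after each one. The name `collectStream` and the `onUpdate` callback are hypothetical; the element and variable names in the usage comment are placeholders too.

```javascript
// Consume any async iterable of text fragments.
// Calls onUpdate with the accumulated text after each fragment,
// and resolves with the full text once the stream ends.
async function collectStream(chunks, onUpdate = () => {}) {
  let text = '';
  for await (const chunk of chunks) {
    text += chunk;
    onUpdate(text);
  }
  return text;
}

// Possible usage in the sidebar (placeholder names):
//   const answer = await collectStream(
//     askLocalOllama('llama3', systemPrompt, question),
//     (soFar) => { answerEl.textContent = soFar; }
//   );
```

Keeping the rendering callback separate from the network code means the same generator can feed a chat bubble, a log, or a test without changes.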
Lessons Learned
Keep it Vector-less when possible: For single-page chats, computing and storing vector embeddings in the browser is overkill. Passing the page's chunked text directly in the LLM context window is faster and simpler, as long as the page fits within the model's context limit.
Offline-First is a UX Superpower: The extension opens instantly, with no loading spinners or overlays, which makes it feel polished.
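As an illustration of the vector-less approach, here is one way the page context could be assembled: split the stored Markdown at headings and keep whole sections until a size budget is reached. This is a sketch under my own assumptions, not the extension's actual chunking. The function name `buildSystemPrompt`, the prompt wording, and the 12,000-character default are hypothetical, and character count is only a crude stand-in for tokens.

```javascript
// Sketch: pack heading-delimited Markdown sections into a character budget.
function buildSystemPrompt(markdown, maxChars = 12000) {
  // Split just before each Markdown heading so every section keeps its title.
  const sections = markdown.split(/\n(?=#{1,6} )/);
  const kept = [];
  let used = 0;
  for (const section of sections) {
    if (used + section.length > maxChars) break; // budget exhausted
    kept.push(section);
    used += section.length;
  }
  // If even the first section is too large, fall back to a plain truncation.
  const context = kept.length ? kept.join('\n') : markdown.slice(0, maxChars);
  return `Answer using only the page content below.\n\n${context}`;
}
```

The result can be passed as the `systemPrompt` argument of `askLocalOllama`. Sections are kept in document order, so anything past the budget is simply dropped; that is the trade-off against an embedding-based retriever.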
Check out the full open-source codebase here: github.com/sujalmeena7/ContextBridge
If you enjoyed this, feel free to drop a star on the repo or ask any questions in the comments below!