This is a submission for the GitHub Finish-Up-A-Thon Challenge
I wired edge-context-mode into my own Claude Code setup. Then I stopped using it. Not because it didn't work, but because I didn't understand what I'd built.
Six weeks later, this challenge made me come back. What I found: a tool that was half-finished, a Durable Object secretly lying to me, and when I finally looked at the code honestly — something actually worth finishing.
What I Built
The problem is simple to describe and annoying to live with.
You start a session with Claude. You read a file, run a command, ask a question. Twenty minutes in, the answers get worse. Forty minutes in, it's forgotten the context. An hour in, you're hitting limits and starting over.
The cause: raw output floods the context window. A cat on a 500-line file puts 500 lines in context. npm list adds 200 more. git log adds more. The context fills with output the LLM will never reference again, and the things that actually matter — decisions, architecture, what you chose and why — get pushed out.
edge-context-mode intercepts that:
Normal: cat large-file.ts → 500 lines flood into context
edge-context: ctx_execute(...) → [ctx:ab3f9x] + "12 line(s): interface User..."
The raw output goes to Cloudflare D1 at the edge. The LLM gets a reference token and a 50-word summary. Context stays clean. Sessions stay coherent.
Search is hybrid: D1's FTS5 gives you BM25 keyword matching out of the box. Pair it with vectorize-mcp-worker — another tool I built — and ctx_search runs semantic vector search on top. Same stored data, same reference tokens. The retrieval layer just gets smarter.
That's the design. What was actually shipped in April was a different story.
This directly solves the compaction problem
If you've used Claude Code for a long session, you've probably seen this:
"Anyone else notice that compaction seems to lose more details than normal? It never seemed to matter before, but I'm seeing it frequently now."
That's the same problem, one layer up. When Claude Code hits context limits, it compacts — auto-summarises the conversation to make room. The details that disappear are the exact things that matter: error messages from 30 minutes ago, what was tried and failed, the architectural choice that explains why the code looks the way it does.
They disappear because they were sitting raw in the context window. Compaction summarises them aggressively and the specifics are gone.
edge-context-mode attacks this in two ways. First, prevention: every ctx_execute call keeps raw output out of the context entirely — only a 50-word summary and a reference token go in. Less in context means compaction triggers less often. Second, survival: everything stored via ctx_execute and ctx_annotate lives in D1, outside the context. Compaction can't touch it. After compaction wipes your conversation:
ctx_history(session_id: "myproject-2026-05-22")
→ full chronological list of everything that happened, pulled from D1
ctx_reflect(session_id: "myproject-2026-05-22")
→ "Session has 14 entries over ~47 min. Fixed D1 FK constraint, added
ctx_get tool, updated README, decided 512KB cap on raw_output..."
The session memory didn't compact. Only the conversation did.
The same day I wrote this, this appeared on X:
@ankkala: "There should be an entire new class of LLM dedicated just to compaction." My reply: "Compression without loss is the oldest hard problem in cognition. Summarization fails the same way bad thinking fails — not because the words are wrong, but because the underlying structure was never identified in the first place. A specialized compaction model doesn't fix that. It just obscures where the reasoning broke down."
The argument for a dedicated compaction model assumes the problem is compression efficiency. It isn't. The problem is that the wrong things are in the context in the first place. A better compressor produces confident-sounding summaries of the wrong things. edge-context-mode doesn't make compaction smarter — it reduces what has to be compacted at all.
One honest caveat: this is not LLM-agnostic. edge-context-mode is an MCP server. It works with any client that speaks the Model Context Protocol — Claude Code today, anything else that adopts MCP tomorrow. If you're on ChatGPT, Gemini, or Copilot Chat without MCP support, it won't help you. This is a Claude Code tool first.
Demo
GitHub: https://github.com/dannwaneri/edge-context-mode
Release: https://github.com/dannwaneri/edge-context-mode/releases/tag/v1.0.0
Local setup — 4 commands, no cloud account:
git clone https://github.com/dannwaneri/edge-context-mode
cd edge-context-mode
npm install
npm run migrate:local
npm run local
Register with Claude Code:
claude mcp add edge-context-mode -- node /path/to/edge-context-mode/src/local.ts
What a session looks like now:
# Run a command — only a reference token enters context
ctx_execute("node -e \"require('./package.json').dependencies\"", "check deps")
→ [ctx:kx92ma3b1p]
→ 1 line(s): { hono: '^4.7.11', '@modelcontextprotocol/sdk': '^1.12.0'...
# Need the actual output? Pull it by reference
ctx_get("[ctx:kx92ma3b1p]")
→ ref: [ctx:kx92ma3b1p]
→ summary: 1 line(s): { hono: '^4.7.11'...
→ --- raw output ---
→ { hono: '^4.7.11', '@modelcontextprotocol/sdk': '^1.12.0', ... }
# Save a decision without running a command
ctx_annotate("decided to cap raw_output at 512KB — D1 row limit is ~1MB, leaving headroom")
→ [ctx:mw71nx4d2q]
# Search past context semantically
ctx_search("D1 storage decisions")
→ [ctx:mw71nx4d2q] [score:1.84] annotation: decided to cap raw_output at 512KB...
# Health check
ctx_doctor
→ { "d1": "ok", "execution_mode": "local-stdio", "vectorize_mcp": "configured", "sessions": 6, "entries": 9 }
The Comeback Story
On April 15th, I shipped the initial release. Two commits, deployed to Cloudflare, wired into my own setup. Then I moved on.
Coming back for this challenge, I read the code properly for the first time. Here's what was actually there.
The Durable Object was returning a placeholder.
This is the one I'm most embarrassed about. ExecutorDO.ts — the Cloudflare Durable Object that was supposed to sandbox execution in Workers mode — had this in production:
return {
stdout: `[DO received: ${command} ${args.join(" ")}]`,
exit_code: 0,
timed_out: false,
};
If you deployed to Cloudflare Workers and called ctx_execute, you'd get back a fake success. No output. No error. Just a quietly wrong result. I'd left a comment: "integrate with a Workers AI function or trusted external runner" — and never did it.
The fix wasn't to build the external runner. Cloudflare Workers genuinely cannot spawn subprocesses, and building a remote execution service in two weeks isn't the right call. The honest fix was to say so: replace the silent stub with a clear error that tells you exactly what to run instead.
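A minimal sketch of what such an honest error could look like, keeping the stdout / exit_code / timed_out shape of the stub above. The stderr field, the EXECUTION_UNSUPPORTED code, and the function name are assumptions made for illustration, not the repository's actual identifiers.

```typescript
// Result shape modelled on the stub above; `stderr` is an assumed extra field.
interface ExecResult {
  stdout: string;
  stderr: string;
  exit_code: number;
  timed_out: boolean;
}

// Hypothetical helper: instead of faking success, report that Workers mode
// cannot execute commands and point at the local mode command.
function unsupportedInWorkers(command: string, args: string[]): ExecResult {
  const attempted = [command, ...args].join(" ");
  return {
    stdout: "",
    stderr:
      `EXECUTION_UNSUPPORTED: cannot run "${attempted}" in Workers mode. ` +
      "Cloudflare Workers cannot spawn subprocesses. " +
      "Run the server locally instead: npm run local",
    exit_code: 1, // non-zero so callers never mistake this for success
    timed_out: false,
  };
}
```

The important part is the non-zero exit_code and the empty stdout: a caller that only checks the exit code still sees a failure.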
ctx_get didn't exist.
The entire architecture depends on [ctx:id] references being retrievable. Store a summary, get back a token, pull the original when you need it. That was the design. There was no tool to do the pulling. Every reference was write-only. I'd built half a memory system and hadn't noticed.
Added ctx_get — strips the [ctx:] prefix, queries D1 by ID, checks expiry, returns the summary and raw output. If it's gone: "Entry not found or expired." No crash, no drama.
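The lookup steps described above can be sketched roughly as follows. This is not the repository's code: a Map stands in for the D1 query, and the Entry field names (in particular expires_at as epoch milliseconds) are assumptions.

```typescript
// Assumed entry shape; the real D1 columns may be named differently.
interface Entry {
  id: string;
  summary: string;
  raw_output: string | null; // null for entries stored before raw output was kept
  expires_at: number | null; // assumed: epoch ms, null means "never expires"
}

const NOT_FOUND = "Entry not found or expired.";

// Accept either "[ctx:kx92ma3b1p]" or the bare id "kx92ma3b1p".
function parseRef(ref: string): string | null {
  const t = ref.trim();
  const m = /^\[ctx:([a-z0-9]+)\]$/.exec(t) ?? /^([a-z0-9]+)$/.exec(t);
  return m?.[1] ?? null;
}

// A Map stands in for the D1 lookup so the sketch runs on its own.
function ctxGet(store: Map<string, Entry>, ref: string, now: number = Date.now()): string {
  const id = parseRef(ref);
  if (id === null) return NOT_FOUND;
  const entry = store.get(id);
  if (!entry) return NOT_FOUND;
  if (entry.expires_at !== null && entry.expires_at <= now) return NOT_FOUND;
  return [
    `ref: [ctx:${entry.id}]`,
    `summary: ${entry.summary}`,
    "--- raw output ---",
    entry.raw_output ?? "(raw output not available)",
  ].join("\n");
}
```

Malformed references, missing rows, and expired rows all collapse into the same message, so the caller never has to distinguish a crash from a miss.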
ctx_annotate didn't exist either.
Context only accumulated through ctx_execute — shell commands. You couldn't save why you made a decision. You couldn't annotate an architectural choice. You couldn't store a note without wrapping it in a fake command. The tool only captured what you ran, not what you thought.
Added ctx_annotate — takes text, stores it as type: "annotation", shows up in ctx_search and ctx_history. The session history now reflects intent, not just execution.
Raw output was never stored.
There's a comment in the original schema that says exactly this: "raw output is NEVER stored here — only the summary." Deliberate design. The problem: ctx_get needs something to return. You can't have a retrieval tool with nothing to retrieve.
Migration 0002_raw_output.sql recreates the table with a raw_output TEXT column and an updated CHECK constraint to include annotation as a valid entry type. Full stdout now stored in D1, capped at 512KB. Old entries retain NULL gracefully.
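For the 512KB cap, one detail worth sketching is that the limit applies to bytes, not characters, so a naive string slice can overshoot or cut a multi-byte character in half. The function below is an illustrative assumption about how such a cap could be written, not the project's implementation; the names are hypothetical.

```typescript
const RAW_OUTPUT_MAX_BYTES = 512 * 1024; // 524288 bytes, the cap named above

// Hypothetical helper: truncate to at most maxBytes of UTF-8 without
// splitting a multi-byte character.
function capRawOutput(
  output: string,
  maxBytes: number = RAW_OUTPUT_MAX_BYTES,
): { text: string; truncated: boolean } {
  const bytes = new TextEncoder().encode(output);
  if (bytes.length <= maxBytes) return { text: output, truncated: false };

  // If the first excluded byte is a UTF-8 continuation byte (10xxxxxx),
  // the cut is mid-character: back up to the start of that character.
  let end = maxBytes;
  while (end > 0 && (bytes[end] & 0xc0) === 0x80) end--;

  return { text: new TextDecoder().decode(bytes.subarray(0, end)), truncated: true };
}
```

Returning a truncated flag alongside the text lets the caller record that the stored output is incomplete instead of silently shortening it.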
Setup required three services before anything ran.
The original README listed as prerequisites: Cloudflare account with Workers Paid plan, a D1 database, and a separately deployed vectorize-mcp-worker. Three services, three secrets, before the server would start. Most people would give up at step two.
Local mode now works with zero cloud setup. Four commands. The vectorize-mcp-worker is an optional semantic search upgrade — worth deploying once you're running sessions regularly, but not a requirement to get started.
ctx_search had a silent phrase-matching bug — found it live.
This one I found after v1.0.0 shipped, while testing the tools in a real session. I stored an annotation, then searched for words I knew were in it: "Vectorize optional". No results.
The bug was in ftsPhrase() in store.ts. It wrapped the entire query in double quotes:
// Before — strict phrase search
return `"${q.replace(/"/g, '""')}"`;
// "Vectorize optional" only matches if those two words are adjacent
The annotation text said "Vectorize stays optional". One word between them. No match.
The fix: quote each term individually so FTS5 treats them as implicit AND — all terms must appear in the document, anywhere, not consecutively:
// After — per-term quoting
return q.trim().split(/\s+/).filter(Boolean)
.map(word => `"${word.replace(/"/g, '""')}"`)
.join(" ");
// "Vectorize" "optional" — matches regardless of what's between them
Special-char safety (hyphens, numbers) preserved. Fixed and shipped as a post-v1.0.0 patch the same day.
The migration runner was silently wiping all data on every restart.
This one I found after the article was mostly written, while investigating why ctx_get returned (raw output not available) on every annotation.
The migration runner in local.ts used a try/catch pattern:
try { db.exec(sql); } catch { /* already applied */ }
The assumption: if the SQL fails, the migration was already applied. The problem: migration 0002 starts with DROP TABLE IF EXISTS context_entries. IF EXISTS never throws — so the catch never fires. Every server restart ran 0002 from scratch, dropping the entire context_entries table and recreating it empty. All stored context wiped. Silently.
The fix: a _migrations table that records each .sql file by name. On startup, already-applied files are skipped entirely:
db.exec(`CREATE TABLE IF NOT EXISTS _migrations (name TEXT PRIMARY KEY, applied_at INTEGER)`);
// ...
const already = db.prepare("SELECT 1 FROM _migrations WHERE name = ?").get(f);
if (already) continue;
Data now survives restarts. This is the kind of bug that only shows up when you actually use the thing — not in tests, not in code review, only when you store something, close the terminal, reopen it, and find nothing there.
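The skip-if-recorded logic can be isolated from the database to make the behaviour easy to check. In this sketch a Set stands in for the _migrations table and exec for db.exec; the function and type names are hypothetical.

```typescript
interface Migration {
  name: string; // file name of the .sql migration
  sql: string;
}

// Run each migration at most once, in file-name order.
// `applied` stands in for the _migrations table, `exec` for db.exec.
function runMigrations(
  migrations: Migration[],
  applied: Set<string>,
  exec: (sql: string) => void,
): string[] {
  const ran: string[] = [];
  const ordered = [...migrations].sort((a, b) =>
    a.name < b.name ? -1 : a.name > b.name ? 1 : 0,
  );
  for (const m of ordered) {
    if (applied.has(m.name)) continue; // already recorded: never re-run
    exec(m.sql); // a failure here throws and leaves the name unrecorded
    applied.add(m.name);
    ran.push(m.name);
  }
  return ran;
}
```

Unlike the try/catch version, whether a migration runs depends only on the recorded name, not on whether its SQL happens to throw, so a destructive statement such as DROP TABLE IF EXISTS cannot be replayed on restart.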
My Experience with GitHub Copilot
I want to be honest here because vague Copilot praise is exactly what's eroding trust in challenge submissions right now.
The code work in this finish-up — the migration, the new tools, the server fixes — I did with Claude Code. Copilot was where the project had been failing silently since April: tests.
vitest was in package.json from the initial commit. Zero tests had ever been written. I'd been aware of this the way you're aware of a leak you haven't fixed.
I opened src/tools/executor.ts in VS Code and gave Copilot Chat this prompt:
"Write vitest unit tests for the validateCommand function in this file. Test: a whitelisted command like 'node' passes, an unknown binary like 'rm' is rejected with COMMAND_NOT_ALLOWED, a path traversal attempt with '../etc' is blocked, and a git subcommand not in the allowlist is rejected."
Copilot opened package.json to check the test configuration, created executor.spec.ts (+51 lines), ran npm test --silent to verify, and reported back: "Tests added and run: all 4 passing." The agentic loop — read config, write file, run tests, confirm results — without me prompting each step.
Then I asked it to look at summarise for missing coverage. It came back with eight specific edge cases, among them: empty output, whitespace-only lines, Windows line endings where \r could leak into summaries stored in D1, lines with leading spaces, the SUMMARY_MAX_CHARS boundary, and mixed empty and non-empty lines.
The Windows one stopped me. I'm on Windows. \r\n line endings are something I live with and had completely stopped thinking about. A summary with a trailing \r stored in D1 and returned to the LLM is a subtle, real bug I would not have found on my own.
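To make the \r problem concrete, here is a small sketch of a summariser that splits on any line ending before building the "N line(s): ..." string. It is an assumption about the approach, not the project's summarise function: the SUMMARY_MAX_CHARS value, the choice to count only non-empty lines, and the "(empty output)" text are all made up for the example.

```typescript
const SUMMARY_MAX_CHARS = 200; // assumed value; the real constant may differ

// Hypothetical summariser: "<count> line(s): <first non-empty line>".
function summarise(output: string): string {
  const lines = output
    .split(/\r\n|\r|\n/) // handles Windows, old Mac, and Unix line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0);

  if (lines.length === 0) return "0 line(s): (empty output)";

  const summary = `${lines.length} line(s): ${lines[0]}`;
  return summary.length > SUMMARY_MAX_CHARS
    ? summary.slice(0, SUMMARY_MAX_CHARS - 3) + "..."
    : summary;
}
```

Splitting on \n alone would leave "interface User {\r" as the first line of a Windows file, and that trailing \r would be stored and handed back to the LLM.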
That's what honest Copilot use looks like. It found what I'd been ignoring and made me own it.
What's Next
The question I keep getting: can this work outside Claude Code?
Closer than you'd think. Anything that speaks MCP already works — Cursor, Windsurf, Claude Desktop, and GitHub Copilot in VS Code (MCP server support is on by default). "Claude Code only" was underselling it.
Genuinely universal requires one more step: function-calling adapters for OpenAI and Gemini. The storage layer — D1, FTS5, the reference system — doesn't change at all. Same data, same search, same [ctx:id] tokens. You'd just expose the tools as OpenAI function definitions or Gemini tool declarations instead of MCP tool registrations.
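As a rough sketch of how thin that adapter could be: an MCP tool is described by a name, a description, and a JSON Schema under inputSchema, and an OpenAI function tool wraps the same three things under function.parameters. The ctx_annotate schema below, including the parameter name text, is a guess for illustration and not the server's actual registration.

```typescript
// Simplified view of an MCP tool descriptor.
interface McpTool {
  name: string;
  description?: string;
  inputSchema: Record<string, unknown>; // JSON Schema for the arguments
}

// Shape of an OpenAI function-calling tool definition.
interface OpenAiFunctionTool {
  type: "function";
  function: {
    name: string;
    description: string;
    parameters: Record<string, unknown>;
  };
}

// Hypothetical adapter: the JSON Schema passes through unchanged.
function toOpenAiTool(tool: McpTool): OpenAiFunctionTool {
  return {
    type: "function",
    function: {
      name: tool.name,
      description: tool.description ?? "",
      parameters: tool.inputSchema,
    },
  };
}

// Assumed schema for ctx_annotate; the real parameter names may differ.
const ctxAnnotate: McpTool = {
  name: "ctx_annotate",
  description: "Store a note or decision without running a command",
  inputSchema: {
    type: "object",
    properties: { text: { type: "string" } },
    required: ["text"],
  },
};

const openAiTools: OpenAiFunctionTool[] = [ctxAnnotate].map(toOpenAiTool);
```

Because both sides use JSON Schema for arguments, the mapping is mostly renaming; the real work would be dispatching the model's function calls to the deployed endpoint.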
The Workers HTTP mode is already the bridge. Any LLM with tool/function calling can hit the deployed endpoint directly if you wire up the schema on their side.
The seamless part — where raw output is automatically kept out of context before you even think about it — still requires the client to route through edge-context-mode. MCP does that natively. For non-MCP LLMs you'd call ctx_execute manually instead of running commands directly. Less automatic. Still useful. The memory survives either way.
That's v1.1: OpenAI and Gemini adapters, same core, no storage changes. If you're building on a different stack and want to help, the repo is open.
One more gap worth naming: you can't retroactively capture a session that started before edge-context-mode was running. If you worked for an hour before registering the MCP server, that context lived in the Claude Code conversation window only — edge-context-mode never saw it.
The workaround right now is ctx_annotate — manually summarise what happened before the tool was active. It works but it's manual.
I tested this on the very session in which I'm writing this article. It hit context limits once, compacted, and continued. I opened the .jsonl file, found the compaction summary (stored as a type: "user" entry with isCompactSummary: true), and ran one ctx_annotate call:
ctx_annotate("SESSION IMPORT (finishupathon, 2026-05-22) — edge-context-mode v1.0.0 decisions:
ExecutorDO stub → honest error. ctx_get added. ctx_annotate added. raw_output migration.
Local mode: 4 commands, zero cloud. ftsPhrase() per-term fix.")
→ [ctx:8wo1rh1buy]
One compaction, one call. That context is now in D1. If this session compacts again tomorrow, ctx_search("edge-context-mode v1.0.0") will surface exactly what was decided and why.
The proper fix: Claude Code stores every session as a .jsonl file in ~/.claude/projects/. Full conversation, every tool call, every output. A ctx_import command that reads those files and bulk-loads them into D1 would close the gap completely — retroactive context, searchable, surviving all future compaction. The storage layer already handles it. It just needs a reader for the .jsonl format (compact summaries are the isCompactSummary: true entries) and a bulk insert path. That's v1.2.
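A first step toward that reader might look like the sketch below: parse the file line by line and keep only the entries flagged isCompactSummary: true. The only fields it relies on are the ones observed above; the rest of the .jsonl record layout is not assumed, and the function name is hypothetical.

```typescript
// Hypothetical reader: return the parsed records whose isCompactSummary
// flag is exactly true. Unparseable or blank lines are skipped.
function extractCompactSummaries(jsonl: string): Record<string, unknown>[] {
  const summaries: Record<string, unknown>[] = [];
  for (const line of jsonl.split(/\r?\n/)) {
    if (line.trim() === "") continue;
    let parsed: unknown;
    try {
      parsed = JSON.parse(line);
    } catch {
      continue; // tolerate a partially written last line
    }
    if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) continue;
    const record = parsed as Record<string, unknown>;
    if (record.isCompactSummary === true) summaries.push(record);
  }
  return summaries;
}
```

Each returned record could then be passed through the same path ctx_annotate uses, which keeps the import free of any storage changes.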
Built with TypeScript, Cloudflare Workers, Durable Objects, D1, and a belated appreciation for what I'd actually made.


