I have a bad habit.
I'll spend three hours debugging a nasty Docker networking issue, finally crack it, close the terminal, and then two weeks later hit the exact same problem. I know I solved it before. I remember the frustration. But the commands? Gone. The output that finally made it click? Gone.
I tried shell history. Too much noise. I tried keeping notes. Too much friction — I never remember to write things down while debugging. I tried asking AI assistants, but they don't know what I was actually doing on my machine.
So I built Rewind.
What Rewind does
Rewind is a CLI tool that records your terminal sessions, IDE activity, and AI conversations — then lets you recall and chat with that history using a local LLM via Ollama.
The key word is local. No cloud. No API keys. No subscriptions. Everything — embeddings, ranking, summaries, chat — runs on your machine. A single Go binary backed by SQLite.
$ rewind run docker build -t myapp .
● Recording... [exit 1] 2.3s
$ rewind chat qwen2.5:1.5b
> why did my docker build fail yesterday?
↳ Searching 47 sessions... found 3 relevant
[2h ago] docker build failed: COPY failed, file not found
The build tried to COPY ./dist but the folder didn't exist yet.
Run `npm run build` first, then retry the build.
That's the whole pitch. Your terminal finally has memory.
The constraints I set for myself
I wanted to build this with zero budget and make it run on a potato laptop (mine is a Lenovo with an i7-4765T and 8GB RAM — not exactly a powerhouse).
That shaped every technical decision:
- Go — single static binary, fast startup, easy cross-compilation
- SQLite — embedded, zero infrastructure, WAL mode for performance
- Ollama — run small quantized models locally, no GPU required
- No background daemons — everything is on-demand
How it works under the hood
Recording
When you run rewind run <command>, Rewind forks a child process, captures its stdout and stderr in real time, and writes events to SQLite as they stream in.
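Here's a minimal sketch of that capture loop using only the standard library. It is a simplification rather than Rewind's actual source: the Event type, the runAndCapture name, and the sink callback are made up for illustration, and it reads through plain pipes instead of a pseudo-terminal, so some programs will change their output when run this way.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"io"
	"os/exec"
	"sync"
	"time"
)

// Event is a hypothetical record for one captured line of output.
type Event struct {
	At      time.Time
	Stream  string // "stdout" or "stderr"
	Content string
}

// runAndCapture starts the command, hands each output line to sink as it
// arrives, and returns the child's exit code.
func runAndCapture(sink func(Event), name string, args ...string) (int, error) {
	cmd := exec.Command(name, args...)
	stdout, err := cmd.StdoutPipe()
	if err != nil {
		return -1, err
	}
	stderr, err := cmd.StderrPipe()
	if err != nil {
		return -1, err
	}
	if err := cmd.Start(); err != nil {
		return -1, err
	}

	var mu sync.Mutex // serializes sink calls from the two reader goroutines
	var wg sync.WaitGroup
	pump := func(r io.Reader, stream string) {
		defer wg.Done()
		// Note: bufio.Scanner stops on lines longer than 64 KiB by default.
		sc := bufio.NewScanner(r)
		for sc.Scan() {
			mu.Lock()
			sink(Event{At: time.Now(), Stream: stream, Content: sc.Text()})
			mu.Unlock()
		}
	}
	wg.Add(2)
	go pump(stdout, "stdout")
	go pump(stderr, "stderr")
	wg.Wait() // all reads must finish before Wait, which closes the pipes

	err = cmd.Wait()
	var exitErr *exec.ExitError
	if errors.As(err, &exitErr) {
		return exitErr.ExitCode(), nil // non-zero exit is a result, not a failure
	}
	if err != nil {
		return -1, err
	}
	return 0, nil
}

func main() {
	code, err := runAndCapture(func(e Event) {
		fmt.Printf("[%s] %s\n", e.Stream, e.Content)
	}, "go", "version")
	fmt.Println("exit:", code, "err:", err)
}
```

In a real recorder the sink would clean, redact, and insert each line into SQLite; keeping it a callback makes that pipeline easy to swap out.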
Before storing, two things happen:
Cleaning — ANSI escape sequences, spinner characters, and terminal control codes are stripped. Raw terminal output is surprisingly dirty; storing it verbatim makes recall useless.
Redaction — a pattern-based scanner checks each line for secrets before it hits the database. GitHub PATs, AWS keys, OpenAI tokens, Slack tokens, private keys — 12 patterns total. The last thing you want is your API keys ending up in a searchable local database.
// redact.go — simplified
import "regexp"

// patterns lists the secret formats to scrub before a line is stored.
var patterns = []*regexp.Regexp{
	regexp.MustCompile(`ghp_[A-Za-z0-9]{36}`), // GitHub PAT
	regexp.MustCompile(`AKIA[0-9A-Z]{16}`),    // AWS Access Key
	regexp.MustCompile(`sk-[A-Za-z0-9]{48}`),  // OpenAI key
	// ... 9 more
}

// RedactCommand replaces every match of a known secret pattern with a placeholder.
func RedactCommand(line string) string {
	for _, p := range patterns {
		line = p.ReplaceAllString(line, "[REDACTED]")
	}
	return line
}
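The cleaning step that runs before redaction can be sketched the same way. This is an illustrative version, not Rewind's actual cleaner: cleanLine, ansiPattern, and the spinner character set are assumptions, and the regex only covers the common CSI and OSC escape families.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// ansiPattern matches CSI sequences (ESC [ ... final byte) and OSC sequences
// (ESC ] ... terminated by BEL or ESC \). Other escape families are not handled.
var ansiPattern = regexp.MustCompile(`\x1b\[[0-?]*[ -/]*[@-~]|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)`)

// spinnerRunes is an assumed set of braille spinner frames used by many CLIs.
const spinnerRunes = "⠋⠙⠹⠸⠼⠴⠦⠧⠇⠏"

// cleanLine strips escape sequences, spinner frames, and control characters
// from one line of raw terminal output.
func cleanLine(line string) string {
	line = strings.TrimRight(line, "\r\n")
	line = ansiPattern.ReplaceAllString(line, "")
	// A carriage return redraws the line in place; keep only the final redraw.
	if i := strings.LastIndex(line, "\r"); i >= 0 {
		line = line[i+1:]
	}
	line = strings.Map(func(r rune) rune {
		if strings.ContainsRune(spinnerRunes, r) {
			return -1 // drop spinner frames
		}
		if (r < 0x20 && r != '\t') || r == 0x7f {
			return -1 // drop remaining control characters
		}
		return r
	}, line)
	return strings.TrimSpace(line)
}

func main() {
	fmt.Printf("%q\n", cleanLine("\x1b[32m⠋ Building\x1b[0m"))
	fmt.Printf("%q\n", cleanLine("10%\r50%\r100%\r\n"))
}
```

Handling the carriage return matters for progress bars: without it, every intermediate percentage ends up in the stored line.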
Storage
Everything goes into SQLite with WAL mode enabled and 11 indexes. Sessions and events are stored separately with a foreign key relationship. A single LEFT JOIN query handles loading all sessions with their events — no N+1 problem.
SELECT s.id, s.command, s.title, s.summary, ...
e.timestamp, e.type, e.content
FROM sessions s
LEFT JOIN events e ON e.session_id = s.id
ORDER BY s.started_at DESC, e.id
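A join like this returns one flat row per event, so the caller still has to fold rows back into sessions. Here's a rough sketch of that step, assuming hypothetical Row, Session, and Event types (the real schema has more columns). It groups by session ID through a map rather than relying on rows arriving contiguously, because two sessions with the same started_at could otherwise interleave.

```go
package main

import "fmt"

// Row mirrors one result row of the LEFT JOIN. The event fields are pointers
// because they are NULL for a session that has no events.
type Row struct {
	SessionID int64
	Command   string
	EventID   *int64
	Content   *string
}

// Event and Session are simplified, hypothetical in-memory shapes.
type Event struct {
	ID      int64
	Content string
}

type Session struct {
	ID      int64
	Command string
	Events  []Event
}

// groupRows folds flat join rows into sessions, preserving the order in which
// each session first appears in the result set.
func groupRows(rows []Row) []Session {
	var sessions []Session
	index := make(map[int64]int) // session ID -> position in sessions
	for _, r := range rows {
		pos, seen := index[r.SessionID]
		if !seen {
			pos = len(sessions)
			index[r.SessionID] = pos
			sessions = append(sessions, Session{ID: r.SessionID, Command: r.Command})
		}
		if r.EventID != nil && r.Content != nil {
			sessions[pos].Events = append(sessions[pos].Events,
				Event{ID: *r.EventID, Content: *r.Content})
		}
	}
	return sessions
}

func main() {
	id, content := int64(1), "COPY failed: file not found"
	rows := []Row{
		{SessionID: 7, Command: "docker build -t myapp .", EventID: &id, Content: &content},
		{SessionID: 6, Command: "ls -la"}, // session without events
	}
	for _, s := range groupRows(rows) {
		fmt.Printf("session %d (%s): %d event(s)\n", s.ID, s.Command, len(s.Events))
	}
}
```

If you'd rather depend on ordering, adding s.id as a tie-breaker after s.started_at in the ORDER BY would keep each session's rows together.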
Semantic recall
This is the interesting part. When you run rewind recall "docker networking issue", it:
- Embeds your query using nomic-embed-text via Ollama
- Loads cached embeddings from .rewind/embeddings/ (pre-computed, not re-generated each time)
- Ranks sessions using cosine similarity + recency decay
- Returns the top matches
The recency decay matters more than it sounds. Without it, an old session with a perfect semantic match will beat a recent session that's slightly less similar. In practice, you almost always care more about what happened recently.
// ranking — simplified
score := cosineSimilarity(queryVec, sessionVec)
age := time.Since(session.StartedAt).Hours() / 24 // days
decayedScore := score * math.Exp(-0.1 * age)
Chat with context
rewind chat <model> loads your most relevant sessions and injects them as context before your conversation. The model "knows" what you've been working on without you having to explain it.
The chat engine streams the reply from Ollama's HTTP API, so text starts appearing right away instead of after the whole answer is generated, which matters on slow hardware.
IDE integration
This was the hardest part to architect. I wanted VS Code, JetBrains, and Neovim to all feed data into the same SQLite database without building three completely different integrations.
The solution: a local JSON-RPC server (rewind ide start) that all extensions talk to. Each extension sends events — file opens, saves, git operations, AI suggestions, build/test results — using the same protocol. The server writes them to SQLite and links them to shell sessions via a Bridge layer.
VS Code   ──►┐
JetBrains ──►├──► JSON-RPC server ──► SQLite ──► recall / chat
Neovim    ──►┘         (Go)
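To give a feel for the shared protocol, here's a sketch of how the server side might decode one event. Everything specific in it is an assumption for illustration: JSON-RPC 2.0 framing, the event.record method name, and the IDEEvent fields are not taken from Rewind's actual wire format.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// rpcRequest is the generic JSON-RPC 2.0 request envelope.
type rpcRequest struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      json.RawMessage `json:"id,omitempty"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params"`
}

// IDEEvent is a hypothetical payload shared by all three extensions.
type IDEEvent struct {
	Source  string `json:"source"`  // e.g. "vscode", "jetbrains", "neovim"
	Kind    string `json:"kind"`    // e.g. "file_open", "file_save", "git_op"
	Project string `json:"project"` // project root, used for the opt-in check
	Path    string `json:"path"`
}

// decodeEvent validates the envelope and extracts the event payload.
func decodeEvent(raw []byte) (IDEEvent, error) {
	var req rpcRequest
	if err := json.Unmarshal(raw, &req); err != nil {
		return IDEEvent{}, err
	}
	if req.JSONRPC != "2.0" {
		return IDEEvent{}, errors.New("not a JSON-RPC 2.0 request")
	}
	if req.Method != "event.record" { // hypothetical method name
		return IDEEvent{}, fmt.Errorf("unknown method %q", req.Method)
	}
	var ev IDEEvent
	if err := json.Unmarshal(req.Params, &ev); err != nil {
		return IDEEvent{}, fmt.Errorf("invalid params: %w", err)
	}
	if ev.Source == "" || ev.Kind == "" {
		return IDEEvent{}, errors.New("params must include source and kind")
	}
	return ev, nil
}

func main() {
	msg := []byte(`{"jsonrpc":"2.0","id":1,"method":"event.record",
		"params":{"source":"neovim","kind":"file_save","project":"/path/to/project","path":"main.go"}}`)
	ev, err := decodeEvent(msg)
	fmt.Printf("%+v err=%v\n", ev, err)
}
```

The point of a single envelope is that each extension only has to serialize its editor's events into one shape; storage and linking to shell sessions stay in one place.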
IDE recording is opt-in per-project. Nothing records until you explicitly enable it:
rewind ide permissions vscode on /path/to/project
What I learned building this
Start with the storage layer. I initially had everything in JSON files. Migrating to SQLite mid-project was painful — I had to write a migration tool and keep the old JSON reader alive. If I started over, SQLite from day one.
Embedding cache is critical for performance. The first version re-embedded every session on every recall query. On a slow machine with 47 sessions that meant 47 HTTP calls to Ollama before returning a single result. Caching embeddings to disk made recall go from ~60 seconds to ~2 seconds.
Secret redaction is non-negotiable. I almost shipped without it. A developer's terminal output is full of tokens, keys, and credentials. If you're building anything that stores terminal history, build redaction first.
A single binary is a superpower for adoption. No Docker, no Python venv, no npm install. go build, move the binary, done. For a tool people need to trust enough to let it record their terminal, low-friction installation matters a lot.
Current state
Rewind is in active development. What's working today:
- ✅ Terminal recording with redaction and cleaning
- ✅ SQLite storage with WAL mode
- ✅ Semantic recall via Ollama embeddings
- ✅ Chat with session context
- ✅ Shell hooks for auto-recording (bash/zsh/fish)
- ✅ VS Code, JetBrains, and Neovim extensions
- ✅ Web UI for browsing sessions
- ✅ Export to HTML/Markdown
- ✅ Shell history import
On the roadmap:
- [ ] rewind sync: optional encrypted backup to S3/R2
- [ ] MCP server: expose Rewind memory to Claude Code, Cursor, and other AI tools
- [ ] GitHub Actions integration: record CI runs
Try it
git clone https://github.com/Oridjinnn/Rewind.git
cd Rewind
go build -o rewind ./cmd/rewind
# Pull models
ollama pull qwen2.5:1.5b
ollama pull nomic-embed-text
# Record something
./rewind run ls -la
# Chat with your history
./rewind chat qwen2.5:1.5b
The repo is at github.com/Oridjinnn/Rewind — MIT licensed, contributions welcome.
If you're building something on top of Rewind (a smart terminal, an agent, an IDE plugin), I'd love to hear about it. Drop a comment or open an issue.
Any command. Any session. Any question. Rewind knows.