<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Oridjinnn</title>
    <description>The latest articles on DEV Community by Oridjinnn (@oridjinnn).</description>
    <link>https://dev.to/oridjinnn</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3945128%2F8e9b906f-b0dc-4210-9503-66cb232dcc40.jpeg</url>
      <title>DEV Community: Oridjinnn</title>
      <link>https://dev.to/oridjinnn</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/oridjinnn"/>
    <language>en</language>
    <item>
      <title>How I built Rewind — a local-first AI memory layer for developers that records your terminal sessions and lets you chat with your history using Ollama.</title>
      <dc:creator>Oridjinnn</dc:creator>
      <pubDate>Fri, 22 May 2026 03:40:28 +0000</pubDate>
      <link>https://dev.to/oridjinnn/how-i-built-rewind-a-local-first-ai-memory-layer-for-developers-that-records-your-terminal-41an</link>
      <guid>https://dev.to/oridjinnn/how-i-built-rewind-a-local-first-ai-memory-layer-for-developers-that-records-your-terminal-41an</guid>
      <description>&lt;p&gt;I have a bad habit.&lt;/p&gt;

&lt;p&gt;I'll spend three hours debugging a nasty Docker networking issue, finally crack it, close the terminal, and then two weeks later hit the exact same problem. I know I solved it before. I remember the frustration. But the commands? Gone. The output that finally made it click? Gone.&lt;/p&gt;

&lt;p&gt;I tried shell history. Too much noise. I tried keeping notes. Too much friction — I never remember to write things down &lt;em&gt;while&lt;/em&gt; debugging. I tried asking AI assistants, but they don't know what I was actually doing on my machine.&lt;/p&gt;

&lt;p&gt;So I built Rewind.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Rewind does
&lt;/h2&gt;

&lt;p&gt;Rewind is a CLI tool that records your terminal sessions, IDE activity, and AI conversations — then lets you recall and chat with that history using a local LLM via Ollama.&lt;/p&gt;

&lt;p&gt;The key word is &lt;strong&gt;local&lt;/strong&gt;. No cloud. No API keys. No subscriptions. Everything — embeddings, ranking, summaries, chat — runs on your machine. A single Go binary backed by SQLite.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;rewind run docker build &lt;span class="nt"&gt;-t&lt;/span&gt; myapp &lt;span class="nb"&gt;.&lt;/span&gt;
&lt;span class="go"&gt;● Recording... [exit 1] 2.3s

&lt;/span&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;rewind chat qwen2.5:1.5b
&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;why did my docker build fail yesterday?
&lt;span class="go"&gt;↳ Searching 47 sessions... found 3 relevant

[2h ago] docker build failed: COPY failed, file not found
The build tried to COPY ./dist but the folder didn't exist yet.
Run `npm run build` first, then retry the build.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole pitch. Your terminal finally has memory.&lt;/p&gt;




&lt;h2&gt;
  
  
  The constraints I set for myself
&lt;/h2&gt;

&lt;p&gt;I wanted to build this with zero budget and make it run on a potato laptop (mine is a Lenovo with an i7-4765T and 8GB RAM — not exactly a powerhouse).&lt;/p&gt;

&lt;p&gt;That shaped every technical decision:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Go&lt;/strong&gt; — single static binary, fast startup, easy cross-compilation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite&lt;/strong&gt; — embedded, zero infrastructure, WAL mode for performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; — run small quantized models locally, no GPU required&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No background daemons&lt;/strong&gt; — everything is on-demand&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  How it works under the hood
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Recording
&lt;/h3&gt;

&lt;p&gt;When you run &lt;code&gt;rewind run &amp;lt;command&amp;gt;&lt;/code&gt;, it forks a child process, captures stdout/stderr in real-time, and writes events to SQLite as they stream in.&lt;/p&gt;

&lt;p&gt;Before storing, two things happen:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cleaning&lt;/strong&gt; — ANSI escape sequences, spinner characters, and terminal control codes are stripped. Raw terminal output is surprisingly dirty; storing it verbatim makes recall useless.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redaction&lt;/strong&gt; — a pattern-based scanner checks each line for secrets before it hits the database. GitHub PATs, AWS keys, OpenAI tokens, Slack tokens, private keys — 12 patterns total. The last thing you want is your API keys ending up in a searchable local database.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// redact.go — simplified&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;patterns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Regexp&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`ghp_[A-Za-z0-9]{36}`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;         &lt;span class="c"&gt;// GitHub PAT&lt;/span&gt;
    &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`AKIA[0-9A-Z]{16}`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;             &lt;span class="c"&gt;// AWS Access Key&lt;/span&gt;
    &lt;span class="n"&gt;regexp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MustCompile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;`sk-[A-Za-z0-9]{48}`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;           &lt;span class="c"&gt;// OpenAI key&lt;/span&gt;
    &lt;span class="c"&gt;// ... 9 more&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;RedactCommand&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;patterns&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReplaceAllString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"[REDACTED]"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Storage
&lt;/h3&gt;

&lt;p&gt;Everything goes into SQLite with WAL mode enabled and 11 indexes. Sessions and events are stored separately with a foreign key relationship. A single &lt;code&gt;LEFT JOIN&lt;/code&gt; query handles loading all sessions with their events — no N+1 problem.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;command&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;summary&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;
       &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sessions&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;session_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;started_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Semantic recall
&lt;/h3&gt;

&lt;p&gt;This is the interesting part. When you run &lt;code&gt;rewind recall "docker networking issue"&lt;/code&gt;, it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Embeds your query using &lt;code&gt;nomic-embed-text&lt;/code&gt; via Ollama&lt;/li&gt;
&lt;li&gt;Loads cached embeddings from &lt;code&gt;.rewind/embeddings/&lt;/code&gt; (pre-computed, not re-generated each time)&lt;/li&gt;
&lt;li&gt;Ranks sessions using cosine similarity + recency decay&lt;/li&gt;
&lt;li&gt;Returns the top matches&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The recency decay matters more than it sounds. Without it, an old session with a perfect semantic match will beat a recent session that's slightly less similar. In practice, you almost always care more about what happened recently.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="c"&gt;// ranking — simplified&lt;/span&gt;
&lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;cosineSimilarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;queryVec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sessionVec&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;age&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StartedAt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hours&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="m"&gt;24&lt;/span&gt; &lt;span class="c"&gt;// days&lt;/span&gt;
&lt;span class="n"&gt;decayedScore&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Exp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="m"&gt;0.1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;age&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Chat with context
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;rewind chat &amp;lt;model&amp;gt;&lt;/code&gt; loads your most relevant sessions and injects them as context before your conversation. The model "knows" what you've been working on without you having to explain it.&lt;/p&gt;

&lt;p&gt;The chat engine uses streaming from Ollama's HTTP API — so responses feel responsive even on slow hardware.&lt;/p&gt;

&lt;h3&gt;
  
  
  IDE integration
&lt;/h3&gt;

&lt;p&gt;This was the hardest part to architect. I wanted VS Code, JetBrains, and Neovim to all feed data into the same SQLite database without building three completely different integrations.&lt;/p&gt;

&lt;p&gt;The solution: a local JSON-RPC server (&lt;code&gt;rewind ide start&lt;/code&gt;) that all extensions talk to. Each extension sends events — file opens, saves, git operations, AI suggestions, build/test results — using the same protocol. The server writes them to SQLite and links them to shell sessions via a &lt;code&gt;Bridge&lt;/code&gt; layer.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;VS Code  ──►┐
JetBrains──►├──► JSON-RPC server ──► SQLite ──► recall / chat
Neovim   ──►┘         (Go)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;IDE recording is opt-in per-project. Nothing records until you explicitly enable it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;rewind ide permissions vscode on /path/to/project
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  What I learned building this
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Start with the storage layer.&lt;/strong&gt; I initially had everything in JSON files. Migrating to SQLite mid-project was painful — I had to write a migration tool and keep the old JSON reader alive. If I started over, SQLite from day one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Embedding cache is critical for performance.&lt;/strong&gt; The first version re-embedded every session on every recall query. On a slow machine with 47 sessions that meant 47 HTTP calls to Ollama before returning a single result. Caching embeddings to disk made recall go from ~60 seconds to ~2 seconds.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secret redaction is non-negotiable.&lt;/strong&gt; I almost shipped without it. A developer's terminal output is full of tokens, keys, and credentials. If you're building anything that stores terminal history, build redaction first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Single binary is a superpower for adoption.&lt;/strong&gt; No Docker, no Python venv, no npm install. &lt;code&gt;go build&lt;/code&gt;, move the binary, done. For a tool people need to trust enough to let it record their terminal, low friction installation matters a lot.&lt;/p&gt;




&lt;h2&gt;
  
  
  Current state
&lt;/h2&gt;

&lt;p&gt;Rewind is in active development. What's working today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ Terminal recording with redaction and cleaning&lt;/li&gt;
&lt;li&gt;✅ SQLite storage with WAL mode&lt;/li&gt;
&lt;li&gt;✅ Semantic recall via Ollama embeddings&lt;/li&gt;
&lt;li&gt;✅ Chat with session context&lt;/li&gt;
&lt;li&gt;✅ Shell hooks for auto-recording (bash/zsh/fish)&lt;/li&gt;
&lt;li&gt;✅ VS Code, JetBrains, and Neovim extensions&lt;/li&gt;
&lt;li&gt;✅ Web UI for browsing sessions&lt;/li&gt;
&lt;li&gt;✅ Export to HTML/Markdown&lt;/li&gt;
&lt;li&gt;✅ Shell history import&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;On the roadmap:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;[ ] &lt;code&gt;rewind sync&lt;/code&gt; — optional encrypted backup to S3/R2&lt;/li&gt;
&lt;li&gt;[ ] MCP server — expose Rewind memory to Claude Code, Cursor, and other AI tools&lt;/li&gt;
&lt;li&gt;[ ] GitHub Actions integration — record CI runs&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Oridjinnn/Rewind.git
&lt;span class="nb"&gt;cd &lt;/span&gt;Rewind
go build &lt;span class="nt"&gt;-o&lt;/span&gt; rewind ./cmd/rewind

&lt;span class="c"&gt;# Pull models&lt;/span&gt;
ollama pull qwen2.5:1.5b
ollama pull nomic-embed-text

&lt;span class="c"&gt;# Record something&lt;/span&gt;
./rewind run &lt;span class="nb"&gt;ls&lt;/span&gt; &lt;span class="nt"&gt;-la&lt;/span&gt;

&lt;span class="c"&gt;# Chat with your history&lt;/span&gt;
./rewind chat qwen2.5:1.5b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The repo is at &lt;a href="https://github.com/Oridjinnn/Rewind" rel="noopener noreferrer"&gt;github.com/Oridjinnn/Rewind&lt;/a&gt; — MIT licensed, contributions welcome.&lt;/p&gt;

&lt;p&gt;If you're building something on top of Rewind (a smart terminal, an agent, an IDE plugin), I'd love to hear about it. Drop a comment or open an issue.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Any command. Any session. Any question. Rewind knows.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>go</category>
      <category>devops</category>
    </item>
  </channel>
</rss>
