In Part 1, Part 2, and Part 3, I covered pain points and solutions that work reliably. This article is different—it's about experiments, works-in-progress, and lessons from things that didn't quite pan out.
Not every tool needs to be polished. Some are scaffolding for better ideas. Some solve problems that disappear with faster models. And some teach valuable lessons even when they fail.
The Project Explorer: Solving Yesterday's Problem
The Original Problem
Before Sonnet 4.5, exploring a codebase with Claude was slow. Reading 20 files meant 20 sequential API calls, token limits to manage, and 10+ minutes of setup time.
Workarounds emerged: naming key files with an @ prefix (@README.md, @main.go) so they'd appear first in directory listings, making them easier for Claude to discover. Some users created special "guide" files that aggregated important context.
I built project-ingest (inspired by gitingest.com) to solve this. The tool would output a single markdown document with the project structure, key file contents, and dependency graph. Claude could ingest this in one shot instead of reading files incrementally.
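The core of the idea fits in a short script. Here is a minimal sketch in Python; the extension list, skip list, and output format are illustrative, and the real project-ingest also emits a dependency graph, which this sketch omits:
#!/usr/bin/env python3
"""Minimal project ingester: dump structure + key file contents as one markdown doc."""
import sys
from pathlib import Path

INCLUDE = {".py", ".go", ".md", ".toml", ".yaml"}          # extensions worth ingesting
SKIP_DIRS = {".git", "node_modules", "__pycache__", ".venv"}

def ingest(root: Path) -> str:
    lines = [f"# Project: {root.name}", "", "## Structure", ""]
    files = []
    for path in sorted(root.rglob("*")):
        if any(part in SKIP_DIRS for part in path.parts):
            continue
        rel = path.relative_to(root)
        lines.append("  " * (len(rel.parts) - 1) + f"- {rel.name}")
        if path.is_file() and path.suffix in INCLUDE:
            files.append(path)
    lines += ["", "## Key files", ""]
    for path in files:
        lines.append(f"### {path.relative_to(root)}")
        lines.append(path.read_text(errors="replace"))
        lines.append("")
    return "\n".join(lines)

if __name__ == "__main__":
    print(ingest(Path(sys.argv[1] if len(sys.argv) > 1 else ".")))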
What Changed
Sonnet 4.5 changed the game, though I'm not entirely sure how. Is it just faster at reading files? Does it batch requests differently? Does it handle context more efficiently? Whatever the implementation, the result is clear: it's fast enough that project ingestion overhead feels worse than just reading files directly.
Before (Sonnet 3.5):
- Run project-ingest → 15 seconds
- Claude reads summary → 5 seconds
- Total: 20 seconds
After (Sonnet 4.5):
- Claude reads 20 files directly → 8 seconds
- Total: 8 seconds
The ingester became slower than the problem it solved.
When It's Still Useful
I haven't deleted project-ingest because it remains valuable for:
- Very large codebases (100+ files): Still faster to get a high-level view
- Project snapshots: Capturing codebase state at a point in time
- Documentation generation: Creating an overview for human readers
- Cross-project analysis: Comparing architecture across multiple projects
But for everyday "help me understand this project" tasks? Obsolete.
The Lesson
Build for today's constraints, not tomorrow's. The tool was perfect for its time, but model improvements made it obsolete. That's okay. The investment taught me patterns I applied elsewhere (like how to efficiently traverse project structures).
When a tool becomes unnecessary because the problem disappeared, that's a success, not a failure.
Code Review in Emacs: Closing the Loop
The Review Problem
I'm browsing through a codebase—maybe one I wrote months ago, maybe one Claude just generated, maybe something I'm casually exploring. I spot issues: a function that could be clearer, error handling that's too generic, a repeated pattern that should be abstracted.
The problem: I'm in discovery mode, not fix mode. I don't want to stop and fix each issue immediately. I want to:
- Mark the issue at the exact line while I'm looking at it
- Keep browsing without losing flow
- Later, batch all issues together and have an LLM fix them all at once
This is where the Code Review Logger comes in. It decouples discovery from fixing.
The Emacs Integration
I built an Emacs mode (code-review-logger.el) that tracks review comments in an org-mode file:
;; While reviewing code in Emacs:
;; SPC r c - Log comment at current line
;; SPC r r - Log comment for selected region
;; SPC r o - Open review log
(defun code-review-log-comment (comment)
  "Log a review COMMENT with file/line tracking."
  (interactive "sReview comment: ")
  (let* ((file (buffer-file-name))
         (line (line-number-at-pos)))
    (code-review-format-entry comment file line "TODO")))
This creates entries in ~/code_review.org:
** TODO [[file:~/repos/memento/src/cli.py::127][cli.py:127]]
:PROPERTIES:
:PROJECT: memento
:TIMESTAMP: [2025-09-30 Mon 14:23]
:END:
This error handling is too generic - catch specific exceptions
** TODO [[file:~/repos/memento/src/search.py::89][search.py:89]]
:PROPERTIES:
:PROJECT: memento
:TIMESTAMP: [2025-09-30 Mon 14:25]
:END:
Add caching here - search is called repeatedly with same query
The Workflow
- Review code in Emacs (with syntax highlighting, jump-to-def, all IDE features)
- Mark issues as I find them (SPC r c for quick comment)
- Trigger the automated fix process:
  Read code-review-llm-prompt-template and follow it
- Claude automatically:
  - Reads ~/code_review.org for all TODO items
  - Fixes each issue in the actual code
  - Runs make test after every change
  - Marks items as DONE only when tests pass
  - Provides a summary of what was fixed
The entire workflow is encoded in a memento note (code-review-llm-prompt-template) that Claude reads; an abridged sketch of the note follows the list below. It contains:
- The review format specification
- Priority order (correctness → architecture → security → performance)
- Testing requirements (always run make test, never leave tests failing)
- Guidelines for what makes a good vs. bad review
- The complete fix-and-verify process
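In rough outline (paraphrased here, not the literal template):
Code Review Fix Process
1. Read ~/code_review.org and collect every entry still marked TODO.
2. Work through them in priority order: correctness → architecture → security → performance.
3. For each entry, open the linked file at the given line and apply the fix described in the comment.
4. Run make test after every change; never leave tests failing.
5. Mark an entry DONE only when its fix is in place and tests pass.
6. Finish with a summary of what was fixed and anything deliberately skipped.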
Why This Works
Batch processing is more efficient than interactive fixes:
- Claude sees all issues at once and can plan holistically
- No back-and-forth during fixing
- Tests run after every change (not just at the end)
- Clear audit trail of what was fixed
Emacs integration solves the "review without IDE" problem:
- I'm in my editor with all my tools
- Jump to definitions, search references, check blame
- Clicking org links takes me directly to the code
Structured format means Claude gets precise instructions:
- Exact file paths (clickable org-mode links)
- Exact line numbers
- Context about the issue
- Project name for multi-repo workflows
Current State: Automated Fix Process
The system is fully automated for the fix workflow. When I have pending reviews, I simply say:
Read code-review-llm-prompt-template and follow it
Claude then:
- Reads the standardized prompt from memento
- Processes all TODO items from ~/code_review.org
- Fixes issues, runs tests, marks items DONE
- Never leaves the codebase with failing tests
The key insight: encoding the entire workflow in a memento note makes it repeatable and consistent. I don't need to remember the exact prompt or process—it's all documented and ready to execute.
Future improvements:
- Auto-trigger on commit: Git hook that checks for pending reviews before allowing commits (a sketch follows below)
- Proactive review suggestions: Claude analyzing code during normal sessions and adding items to the review log
- Review metrics: Track what types of issues are most common to improve coding patterns
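The commit-time check is the easiest to prototype: a pre-commit hook that refuses the commit while TODO entries remain in the review log. A minimal sketch in Python (the log path matches my setup; the hook itself is hypothetical, not something I've shipped):
#!/usr/bin/env python3
"""Pre-commit hook: block commits while review TODOs are pending.
Install as .git/hooks/pre-commit and make it executable."""
import re
import sys
from pathlib import Path

REVIEW_LOG = Path.home() / "code_review.org"

def pending_reviews() -> int:
    if not REVIEW_LOG.exists():
        return 0
    # Count org headings still marked TODO, e.g. "** TODO [[file:...]]"
    return len(re.findall(r"^\*+ TODO ", REVIEW_LOG.read_text(), flags=re.M))

if __name__ == "__main__":
    count = pending_reviews()
    if count:
        print(f"{count} pending review item(s) in {REVIEW_LOG}; fix or defer them first.")
        sys.exit(1)  # non-zero exit aborts the commit
    sys.exit(0)
A real version would filter on the :PROPERTIES: PROJECT field so pending reviews in one repo don't block commits in another.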
The Diff Workflow: Bringing Changes Back to Emacs
The Problem
Claude makes changes in the terminal. I want to review them in Emacs. How do I bridge that gap?
The Current Solution
Simple but effective:
# Claude generates changes, I run:
git diff > /tmp/review.diff
# In Emacs:
# Open the diff file
# Use Emacs diff-mode for navigation
# Apply/reject hunks interactively
This works but feels clunky. I'm copying diffs manually, opening files, navigating around.
What I Want
A tighter integration:
- Claude signals "I made changes"
- Emacs automatically shows the diff in a split window
- I review with full IDE context
- I approve/reject specific changes
- Claude sees my feedback and adjusts
This requires:
- MCP server for Emacs communication
- Claude code that can signal "review needed"
- Emacs mode that listens for review requests
- Two-way communication (Claude → Emacs → Claude)
I've prototyped pieces of this but nothing production-ready yet.
The Barrier
Building reliable two-way communication between Claude and Emacs is hard:
- Emacs server needs to be always-on
- Need protocol for structured messages
- Need to handle failures gracefully
- Race conditions when multiple Claudes talk to one Emacs
I'm experimenting with using memento as the message bus:
- Claude writes "review-needed" note
- Emacs polls memento for new reviews
- Emacs writes feedback to memento
- Claude reads feedback
Clunky but doesn't require real-time communication.
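To avoid pinning down memento's API here, the sketch below uses a plain directory of JSON files as the bus; the message shapes are the point, not the storage, and every path and field name is illustrative:
#!/usr/bin/env python3
"""File-based message bus sketch for the Claude <-> Emacs review loop."""
import json
import time
import uuid
from pathlib import Path

BUS = Path.home() / ".review-bus"   # hypothetical location
BUS.mkdir(exist_ok=True)

def request_review(diff_path: str) -> str:
    """Claude side: announce that changes are ready for review."""
    msg_id = uuid.uuid4().hex
    msg = {"id": msg_id, "type": "review-needed", "diff": diff_path, "status": "pending"}
    (BUS / f"{msg_id}.json").write_text(json.dumps(msg))
    return msg_id

def wait_for_feedback(msg_id: str, timeout: float = 600.0) -> dict:
    """Claude side: poll until Emacs has written its verdict."""
    path = BUS / f"{msg_id}.json"
    deadline = time.time() + timeout
    while time.time() < deadline:
        msg = json.loads(path.read_text())
        if msg["status"] in ("approved", "rejected"):
            return msg
        time.sleep(2)   # polling, not push: clunky but simple
    raise TimeoutError("no review feedback received")
The Emacs side would be a timer that scans the same directory, pops up the diff for anything pending, and writes the verdict and comments back into the same file.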
What Didn't Work: Session Auto-Resume
The Idea
When I restart my computer, I lose all tmux sessions. What if Claude could auto-resume?
# Before shutdown, save session state:
tmux-save-sessions # Captures all window/pane layouts
# After restart:
tmux-restore-sessions # Recreates everything
Each session would:
- Restore to the correct directory
- Read the last prompt from history
- Show a summary: "You were working on memento refactoring"
Why It Failed
Context loss is too severe. Even if I restore the directory and prompt, Claude doesn't remember:
- What code was already written
- What decisions were made
- What tests were run
- What bugs were found
I'd need to capture and replay the entire conversation, which means:
- Huge token usage (replaying thousands of tokens)
- Slow startup (processing all that history)
- Potential for Claude to make different decisions on replay
The Lesson
Session continuity requires more than just state restoration. You need:
- Explicit checkpoints (memento notes with "current status")
- Clear handoff documents ("Session ended here, next steps are...")
- Project-specific context (not just conversation history)
Instead of auto-resume, I now use explicit handoff notes:
# Session Checkpoint: 2025-09-30 14:30
## What We Did
- Refactored CLI argument parsing to use argparse
- All tests pass
- Committed changes: git log -1
## What's Next
- [ ] Add JSON output support to all commands
- [ ] Update documentation
- [ ] Add integration tests
## Key Decisions
- Using argparse instead of manual parsing for consistency
- All commands must support --json flag
## Files Modified
- src/cli.py (lines 1-89, 127-145)
- src/parser.py (new file)
Next session reads this note and picks up where we left off. Works better than trying to resume the conversation.
Experiments in Progress
1. MCP Coordination Server
Building an MCP server specifically for coordinating parallel LLM sessions:
# Hypothetical API
coordinator.claim_file("src/parser.py", session="A")
coordinator.add_barrier("refactor-complete", required=["A", "B"])
coordinator.wait_for_barrier("refactor-complete")
coordinator.get_session_status("A") # → "in_progress" | "blocked" | "completed"
This would solve the "stepping on each other" problem when running parallel sessions.
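To make the claim semantics concrete, here is one way claim_file could be implemented, using atomic lock-file creation on a shared filesystem. This is a sketch of the idea, not the actual MCP server, and the lock directory is arbitrary:
import json
import os
import time
from pathlib import Path

LOCK_DIR = Path("/tmp/llm-coordinator")   # hypothetical shared location
LOCK_DIR.mkdir(exist_ok=True)

def claim_file(path: str, session: str) -> bool:
    """Atomically claim a file for one session; False if another session holds it."""
    lock = LOCK_DIR / (path.replace("/", "__") + ".lock")
    try:
        # O_CREAT | O_EXCL makes creation atomic: only one session can win the race.
        fd = os.open(lock, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        owner = json.loads(lock.read_text()).get("session")
        return owner == session   # re-claiming your own file is fine
    with os.fdopen(fd, "w") as f:
        json.dump({"session": session, "claimed_at": time.time()}, f)
    return True

def release_file(path: str, session: str) -> None:
    """Release a claim so other sessions can edit the file."""
    lock = LOCK_DIR / (path.replace("/", "__") + ".lock")
    if lock.exists() and json.loads(lock.read_text()).get("session") == session:
        lock.unlink()
Barriers would work the same way: each session drops a marker file when it reaches the barrier, and wait_for_barrier polls until all required markers exist.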
2. Telemetry Mining
I have months of telemetry data (see Part 2). Now I want to mine it:
# Which prompts lead to longest sessions?
# Which projects have the most rework?
# When do I context-switch most?
# Correlation between session length and memory usage?
The goal: optimize my workflow based on data, not intuition.
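A first pass at the mining doesn't need anything fancy. The sketch below assumes the telemetry lands as JSON-lines session records with start, end, project, and first_prompt fields; those names are illustrative, and the real schema is whatever the Part 2 logger writes:
import json
from collections import Counter
from pathlib import Path

LOG = Path.home() / ".claude-telemetry" / "sessions.jsonl"   # illustrative path

sessions = [json.loads(line) for line in LOG.read_text().splitlines() if line.strip()]

# Which prompts lead to the longest sessions?
longest = sorted(sessions, key=lambda s: s["end"] - s["start"], reverse=True)
for s in longest[:10]:
    print(f'{s["end"] - s["start"]:7.0f}s  {s["first_prompt"][:60]}')

# When do I context-switch most? Count project changes per hour of day (UTC).
switches = Counter()
prev = None
for s in sorted(sessions, key=lambda s: s["start"]):
    if prev is not None and s["project"] != prev:
        switches[int(s["start"] // 3600 % 24)] += 1
    prev = s["project"]
print("Context switches by hour:", dict(sorted(switches.items())))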
3. LLM-Generated Architecture Docs
After a major refactor, can Claude generate architecture documentation automatically?
Read all files in src/. Generate an architecture document explaining:
- Key components and their responsibilities
- Data flow through the system
- API boundaries
- Design decisions and trade-offs
Early experiments are promising. The docs aren't perfect but are good starting points.
Key Learnings
Embrace obsolescence. If a tool becomes unnecessary, that's progress.
Perfect is the enemy of done. The code review logger earns its keep even though the loop isn't automated end to end. Ship it.
Tight integration is hard. Two-way communication between tools (Claude ↔ Emacs) requires careful design.
Explicit beats implicit. Session handoff notes work better than trying to auto-resume from history.
Data reveals patterns. Telemetry showed me I context-switch too often—now I batch similar tasks.
What's Next
Part 5 (final article) covers using Claude as a learning tool: generating flashcards, creating annotated worksheets, and building a spaced-repetition system for technical concepts.
The code review logger is in the memento repo. The project ingester is at ~/bin/project-ingest. The tmux session tools are in my dotfiles. All MIT licensed; use freely.