Rex Zhen

Posted on Jan 25 • Edited on Jan 29

Claude Code Memory Management: Long-Term and Short-Term Memory with Hooks and Skills

#ai #agents #programming

The Challenge: AI Amnesia

When working with AI assistants like Claude Code, you've probably experienced this frustrating pattern:

You start a new session
The AI asks questions you've already answered before
Previous decisions and context are lost
You waste time re-explaining the same background information

This is the AI memory problem. Most AI conversations are stateless: each session starts with a blank slate. Even with large context windows, AI models still face two fundamental memory constraints:

Short-Term Memory (Context Window Limits)

Even with large context windows (100K-200K tokens), a single conversation can exceed these limits when:

Working on complex, multi-hour projects
Reviewing large codebases with many files
Accumulating dozens of tool calls and outputs
Discussing detailed technical specifications

When you hit these limits, you get the dreaded error:

API Error: 400 Input is too long for requested model.

Long-Term Memory (Session Persistence)

Between sessions, the AI has no memory at all. When you:

Close and reopen the CLI
Start a new day's work
Switch between projects

All context from previous conversations is lost. The AI doesn't remember:

Your project structure and architecture
Previous decisions and why they were made
Bugs you've encountered and solved
Code patterns and conventions you established
Your preferences and workflow

The Solution: Hooks, Skills, and Persistent Memory

The solution is a three-tier memory system that mimics human memory:

1. Session Summaries (Long-Term Memory)

Create a session save mechanism that captures conversation history in permanent storage:

#!/bin/bash
# .claude/scripts/save_session.sh
SESSIONS_DIR=".claude/sessions"
mkdir -p "$SESSIONS_DIR"   # make sure the archive directory exists
TIMESTAMP=$(date +"%Y-%m-%d_%H%M")
SESSION_FILE="${SESSIONS_DIR}/session_${TIMESTAMP}.md"

# Save full transcript with timestamp
# (placeholder command: substitute your CLI's actual export method)
claude sessions export > "$SESSION_FILE"

# Update latest session pointer
cp "$SESSION_FILE" "${SESSIONS_DIR}/latest_session.md"

# Generate summary using AI
# (placeholder command: substitute your own summarization step)
claude sessions summarize > "${SESSIONS_DIR}/latest_summary_short.md"

This creates a searchable, timestamped archive of your work inside each project's .claude directory.
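If the save script runs as a hook, the transcript may not need a separate export step. The sketch below assumes the hook receives a JSON object on stdin that includes a transcript_path field, as Claude Code's hooks documentation describes at the time of writing. The function names are hypothetical, and the sed-based extraction only handles simple paths; a real JSON parser such as jq would be more robust.

```shell
# Extract "transcript_path" from the JSON a hook receives on stdin.
# Assumes the path contains no embedded double quotes.
extract_transcript_path() {
  sed -n 's/.*"transcript_path"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p' | head -n 1
}

# Copy the transcript into the archive under a timestamped name.
# $1 = transcript path, $2 = sessions directory (default: .claude/sessions)
archive_transcript() {
  local src="$1" dir="${2:-.claude/sessions}"
  [ -f "$src" ] || { echo "transcript not found: $src" >&2; return 1; }
  mkdir -p "$dir"
  # Keep the source file's extension rather than assuming Markdown.
  local dest="${dir}/session_$(date +"%Y-%m-%d_%H%M").${src##*.}"
  cp "$src" "$dest" && echo "$dest"
}
```

Inside a hook script, the two pieces would combine as archive_transcript "$(extract_transcript_path)", which prints the path of the archived copy.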

2. Automatic Memory Loading (Session Startup)

Use a SessionStart hook to automatically load context when you begin work:

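// .claude/settings.json (hooks schema as documented for Claude Code; check the docs for your version)
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [
          { "type": "command", "command": ".claude/scripts/load_latest_summary.sh" }
        ]
      }
    ]
  }
}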

#!/bin/bash
# .claude/scripts/load_latest_summary.sh
SUMMARY_FILE=".claude/sessions/latest_summary_short.md"

if [ -f "$SUMMARY_FILE" ]; then
  echo "================== Previous Session Context =================="
  cat "$SUMMARY_FILE"
  echo "=============================================================="
else
  echo "No previous session found. Starting fresh."
fi

Now every session starts with a brief recap of where you left off.

3. On-Demand Memory Recall (Skills)

Create custom skills for memory operations:

# .claude/skills/save-session/SKILL.md
---
name: save-session
description: "Saves current conversation transcript and creates summary"
trigger: /save-session | /ss
---

Execute: .claude/scripts/save_session.sh

Then respond: "Session saved to .claude/sessions/session_[timestamp].md"

# .claude/skills/load-previous-summary/SKILL.md
---
name: load-previous-summary
description: Loads previous session summary for context
trigger: /load | /recall
---

Execute: .claude/scripts/load_latest_summary.sh

Then summarize the loaded context for the user.

Now you can use natural commands:

/save-session or /ss - Save current work
/load or /recall - Recall previous context
The AI can also invoke these proactively when needed

Implementation Architecture

Here's the complete memory system architecture:

┌─────────────────────────────────────────────────────────────┐
│                      AI Session (Current)                    │
│  ┌────────────────────────────────────────────────────┐     │
│  │  Active Conversation (Short-Term Memory)          │     │
│  │  - Current task context                            │     │
│  │  - Recent messages and tool calls                  │     │
│  │  - Limited by context window                       │     │
│  └────────────────────────────────────────────────────┘     │
│                           │                                  │
│                           │ Save on exit/demand              │
│                           ▼                                  │
└─────────────────────────────────────────────────────────────┘
                            │
                            │
┌───────────────────────────▼─────────────────────────────────┐
│              Persistent Storage (Long-Term Memory)          │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Session Archive (.claude/sessions/)                 │   │
│  │  - session_2026-01-24_1430.md  (full transcript)    │   │
│  │  - session_2026-01-24_1600.md  (full transcript)    │   │
│  │  - session_2026-01-24_1820.md  (full transcript)    │   │
│  │  - latest_session.md            (most recent full)   │   │
│  │  - latest_summary_short.md      (condensed version)  │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Project Memory (session-notes/)                     │   │
│  │  - vibe67-memory.md  (manually curated notes)       │   │
│  │  - Key decisions and architecture                    │   │
│  │  - Gotchas and learnings                             │   │
│  └──────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘
                            │
                            │ Load on startup
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                   New AI Session (Restored)                  │
│  ┌────────────────────────────────────────────────────┐     │
│  │  Previous Context Loaded                           │     │
│  │  ✓ Project structure understood                    │     │
│  │  ✓ Recent work summarized                          │     │
│  │  ✓ Key decisions recalled                          │     │
│  │  ✓ Ready to continue where you left off            │     │
│  └────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────┘

Real-World Example

Let's see this system in action:

Day 1: Initial Work

You: I'm building a video generator that downloads classical music and creates
     YouTube videos. I need to avoid copyright issues.

Claude: [Works on the problem, creates scanner tool, tests files...]

You: /save-session

Claude: ✓ Session saved to .claude/sessions/session_2026-01-24_1830.md

Day 2: Continuation

[Session starts automatically]

System: ================== Previous Session Context ==================
Working on vibe67 video generator project. Created YouTube-safe audio scanner
to pre-screen MP3 files for copyright risk. Discovered Classicals.de hosts
modern copyrighted performances despite claiming "public domain". Scanner
checks metadata for recording year, copyright statements, and DAW encoders.
Next: Run scanner on Chopin collection and find alternative PD sources.
==================================================================

You: Let's continue with the scanner

Claude: I'll run the YouTube-safe audio scanner on the Chopin collection we
        discussed yesterday. [Continues work seamlessly...]

Mid-Session: Context Overflow Prevention

[After many tool calls and file reads]

Claude: I'm approaching context limits. Let me save current progress.
        [Invokes save-session skill automatically]

        Now I'll load just the summary to continue with a fresh context window.
        [Loads latest_summary_short.md instead of full transcript]

Benefits of This System

1. No More Repeated Questions

The AI remembers your project structure, conventions, and previous decisions.

2. Seamless Multi-Day Projects

Pick up exactly where you left off, days or weeks later.

3. Context Window Management

Automatic summarization prevents "input too long" errors on complex projects.

4. Searchable History

Full transcripts are saved with timestamps, so you can search past sessions for solutions.
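Searching the archive needs nothing more than grep. Here is a minimal sketch, assuming transcripts follow the session_<timestamp>.md naming used in this post; the function name search_sessions is hypothetical.

```shell
# List archived sessions that mention a term (case-insensitive, literal match),
# newest first. Timestamped names sort chronologically, so a reverse sort works.
# $1 = search term, $2 = sessions directory (default: .claude/sessions)
search_sessions() {
  local term="$1" dir="${2:-.claude/sessions}"
  grep -lisF -- "$term" "$dir"/session_*.md | sort -r
}
```

For example, search_sessions "copyright" prints every transcript that mentions copyright, starting with the most recent one.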

5. Learning from History

The AI can reference past mistakes, gotchas, and successful patterns.

6. Automatic and Manual Control

Hooks provide automatic save/load
Skills give you manual control when needed
You decide when to save important milestones

Advanced: Hierarchical Memory

For complex projects, use a tiered memory structure:

.claude/sessions/
  ├── latest_summary_short.md       # 500 tokens - Quick context
  ├── latest_summary.md             # 2000 tokens - Detailed recap
  ├── latest_session.md             # Full transcript
  └── manual_summary_2026-01-24.md  # Hand-crafted context

session-notes/
  └── vibe67-memory.md              # Curated project knowledge

The AI loads different levels based on need:

Quick tasks: Load short summary only (saves tokens)
Continue work: Load detailed summary
Complex debugging: Reference full session transcript
Long-term recall: Search curated project memory
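One way to wire up this tiering is a small lookup function that maps a level name to its file. This is only a sketch: the level names (quick, detailed, full, project) and both function names are hypothetical, and the file names come from the layout shown earlier in this section.

```shell
# Map a memory level to the file that holds it.
# $1 = level, $2 = sessions directory (default: .claude/sessions)
memory_file_for() {
  local level="$1" dir="${2:-.claude/sessions}"
  case "$level" in
    quick)    echo "$dir/latest_summary_short.md" ;;
    detailed) echo "$dir/latest_summary.md" ;;
    full)     echo "$dir/latest_session.md" ;;
    project)  echo "session-notes/vibe67-memory.md" ;;
    *)        echo "unknown level: $level" >&2; return 1 ;;
  esac
}

# Print the requested level, falling back to the short summary
# when the requested file does not exist yet.
load_memory() {
  local file
  file=$(memory_file_for "$1" "$2") || return 1
  [ -f "$file" ] || file=$(memory_file_for quick "$2")
  [ -f "$file" ] && cat "$file"
}
```

The fallback keeps a session from starting empty just because the detailed summary has not been generated yet.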

Implementation Tips

1. Keep Summaries Focused

Don't save everything; extract only the essential context:

Current goals and progress
Key decisions and rationale
Active bugs or blockers
File paths and important locations
Next planned steps
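A fixed skeleton makes it easier to keep summaries to these five items. The sketch below writes an empty template with one heading per item; the function name is hypothetical, and the default file name follows the manual_summary_<date>.md pattern used elsewhere in this post.

```shell
# Write an empty summary skeleton and print its path.
# $1 = output file (default: .claude/sessions/manual_summary_<today>.md)
new_summary_template() {
  local out="${1:-.claude/sessions/manual_summary_$(date +%Y-%m-%d).md}"
  mkdir -p "$(dirname "$out")"
  cat > "$out" <<'EOF'
## Current goals and progress

## Key decisions and rationale

## Active bugs or blockers

## File paths and important locations

## Next planned steps
EOF
  echo "$out"
}
```

You can fill in the sections by hand, or hand the skeleton to the AI and ask it to complete each heading in a sentence or two.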

2. Use Timestamps

Date-based filenames make it easy to find specific sessions:

session_2026-01-24_1430.md  # 2:30 PM session
session_2026-01-24_1820.md  # 6:20 PM session
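Because the timestamp is zero-padded and runs from year down to minute, plain string sorting is also chronological sorting. The two helpers below rely on that; their names are hypothetical.

```shell
# Print the most recent transcript in the archive.
# $1 = sessions directory (default: .claude/sessions)
latest_session() {
  local dir="${1:-.claude/sessions}" f
  for f in "$dir"/session_*.md; do
    if [ -e "$f" ]; then printf '%s\n' "$f"; fi
  done | sort | tail -n 1
}

# Print all transcripts recorded on a given day, oldest first.
# $1 = day as YYYY-MM-DD, $2 = sessions directory
sessions_on() {
  local day="$1" dir="${2:-.claude/sessions}" f
  for f in "$dir"/session_"$day"_*.md; do
    if [ -e "$f" ]; then printf '%s\n' "$f"; fi
  done | sort
}
```

For example, sessions_on 2026-01-24 lists that day's sessions in the order they were saved.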

3. Automatic Hook Configuration

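Set hooks in .claude/settings.json so memory loading is automatic (schema as documented for Claude Code; check the docs for your version):

{
  "hooks": {
    "SessionStart": [
      { "matcher": "startup",
        "hooks": [{ "type": "command", "command": ".claude/scripts/load_latest_summary.sh" }] }
    ],
    "Stop": [
      { "hooks": [{ "type": "command", "command": ".claude/scripts/save_session.sh" }] }
    ]
  }
}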

4. Skill Triggers

Use short, memorable triggers:

/ss → save session
/load → load previous summary
/recall → load previous summary (alias of /load)

5. Compression Strategy

As sessions accumulate, compress older ones:

# Keep full transcripts for 7 days
# After 7 days, keep only summaries
# After 30 days, archive to compressed format
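Here is one possible reading of that policy as a script. It is deliberately conservative and never deletes anything: transcripts older than 7 days are gzip-compressed in place, and compressed transcripts older than 30 days move to an archive/ subdirectory. The function name, the archive/ directory, and this interpretation of the thresholds are all assumptions.

```shell
# Retention pass over the session archive. Age is measured from each file's
# modification time; gzip keeps that timestamp on the compressed file.
# $1 = sessions dir, $2 = days before compressing, $3 = days before archiving
compress_old_sessions() {
  local dir="${1:-.claude/sessions}" compress_days="${2:-7}" archive_days="${3:-30}"
  mkdir -p "$dir/archive"
  # Step 1: gzip old transcripts (summary files are not matched).
  find "$dir" -maxdepth 1 -type f -name 'session_*.md' \
    -mtime +"$compress_days" -exec gzip -f {} \;
  # Step 2: move very old compressed transcripts into archive/.
  find "$dir" -maxdepth 1 -type f -name 'session_*.md.gz' \
    -mtime +"$archive_days" -exec mv {} "$dir/archive/" \;
}
```

Running compress_old_sessions from a daily cron job or at the end of the save script keeps the archive small, and any compressed transcript can still be read back with gzip -dc.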

Handling the Token Budget

Even if a conversation can run indefinitely, each API call is still bounded by the model's context window. The memory system handles this:

SessionStart Hook: Loads compact summary (~500 tokens)
During work: Full context in active window
Before limit: Auto-save and restart with summary
On-demand: /recall reloads the saved summary when needed

This creates the illusion of infinite memory while respecting API constraints.
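To check that a summary really stays near the ~500-token target, a rough estimate is enough. The sketch below uses the common rule of thumb of about 4 characters per token for English text. This is a heuristic and not the model's actual tokenizer, and both function names are hypothetical.

```shell
# Rough token estimate: file size in bytes divided by 4, rounded up.
estimate_tokens() {
  local chars
  chars=$(wc -c < "$1" | tr -d ' ')
  echo $(( (chars + 3) / 4 ))
}

# Succeed only if the file fits the budget.
# $1 = file, $2 = maximum tokens (default: 500)
within_budget() {
  [ "$(estimate_tokens "$1")" -le "${2:-500}" ]
}
```

A load script could call within_budget .claude/sessions/latest_summary_short.md before printing the summary, and print a warning or regenerate a shorter summary when the check fails.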

Code Example: Complete Setup

Here's everything you need:

1. Directory structure:

mkdir -p .claude/{sessions,scripts,skills/{save-session,load-previous-summary}}

2. Save script:

#!/bin/bash
# .claude/scripts/save_session.sh
set -e
SESSIONS_DIR=".claude/sessions"
mkdir -p "$SESSIONS_DIR"

TIMESTAMP=$(date +"%Y-%m-%d_%H%M")
SESSION_FILE="${SESSIONS_DIR}/session_${TIMESTAMP}.md"

echo "Saving session to $SESSION_FILE..."

# Export conversation
# (placeholder command: implement based on your CLI's export method)
claude sessions export > "$SESSION_FILE"

# Update latest pointers
cp "$SESSION_FILE" "${SESSIONS_DIR}/latest_session.md"

# Generate short summary
# (placeholder command: implement your own summarization step)
claude summarize --max-tokens 500 < "$SESSION_FILE" \
  > "${SESSIONS_DIR}/latest_summary_short.md"

echo "Session saved successfully"

3. Load script:

#!/bin/bash
# .claude/scripts/load_latest_summary.sh
SUMMARY_FILE=".claude/sessions/latest_summary_short.md"

if [ -f "$SUMMARY_FILE" ]; then
  echo "================== Previous Session Context =================="
  cat "$SUMMARY_FILE"
  echo "=============================================================="
  exit 0
else
  echo "No previous session found"
  exit 0
fi

4. Hook configuration:

// .claude/settings.json (hooks schema as documented for Claude Code; check the docs for your version)
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [
          { "type": "command", "command": ".claude/scripts/load_latest_summary.sh" }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          { "type": "command", "command": ".claude/scripts/save_session.sh" }
        ]
      }
    ]
  }
}

5. Skills:

# .claude/skills/save-session/SKILL.md
---
name: save-session
description: Immediately saves conversation transcript and summary
trigger: /save-session | /ss
---

Execute this command:
.claude/scripts/save_session.sh

After success, respond:
"✓ Session saved to .claude/sessions/session_[timestamp].md"

Conclusion

The AI memory problem is solvable. It just requires thinking about memory the same way operating systems do:

RAM (Short-term): Active conversation context
Disk (Long-term): Session transcripts and summaries
Cache (Recall): On-demand loading of specific context
Compression: Summarization to manage storage

With hooks for automatic save/load and skills for manual control, you create a persistent memory layer that makes AI assistants truly useful for long-term projects.

Stop re-explaining your project every session. Start working with an AI that remembers.

DEV Community