DEV Community

Rex Zhen

Solving the AI Memory Problem: Long-Term and Short-Term Memory with Hooks and Skills

The Challenge: AI Amnesia

When working with AI assistants like Claude Code, you've probably experienced this frustrating pattern:

  1. You start a new session
  2. The AI asks questions you've already answered before
  3. Previous decisions and context are lost
  4. You waste time re-explaining the same background information

This is the AI memory problem. Most AI conversations are stateless: each session starts with a blank slate. Even with large context windows, AI models face two fundamental memory constraints:

Short-Term Memory (Context Window Limits)

Even with large context windows (100K-200K tokens), a single conversation can exceed these limits when:

  • Working on complex, multi-hour projects
  • Reviewing large codebases with many files
  • Accumulating dozens of tool calls and outputs
  • Discussing detailed technical specifications

When you hit these limits, you get the dreaded error:

API Error: 400 Input is too long for requested model.

Long-Term Memory (Session Persistence)

Between sessions, the AI has no memory at all. When you:

  • Close and reopen the CLI
  • Start a new day's work
  • Switch between projects

All context from previous conversations is lost. The AI doesn't remember:

  • Your project structure and architecture
  • Previous decisions and why they were made
  • Bugs you've encountered and solved
  • Code patterns and conventions you established
  • Your preferences and workflow

The Solution: Hooks, Skills, and Persistent Memory

The solution is a three-tier memory system that mimics human memory:

1. Session Summaries (Long-Term Memory)

Create a session save mechanism that captures conversation history in permanent storage:

#!/bin/bash
# .claude/scripts/save_session.sh
# NOTE: "claude sessions export" and "claude sessions summarize" stand in for
# whatever export and summarization commands your CLI provides.
SESSIONS_DIR=".claude/sessions"
mkdir -p "$SESSIONS_DIR"
TIMESTAMP=$(date +"%Y-%m-%d_%H%M")
SESSION_FILE="${SESSIONS_DIR}/session_${TIMESTAMP}.md"

# Save full transcript with timestamp
claude sessions export > "$SESSION_FILE"

# Update latest session pointer
cp "$SESSION_FILE" "${SESSIONS_DIR}/latest_session.md"

# Generate summary using AI
claude sessions summarize > "${SESSIONS_DIR}/latest_summary_short.md"

This creates a searchable, timestamped archive of your sessions inside the project's .claude/sessions directory.
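
Because the archive is just a directory of Markdown files, ordinary text tools are enough to search it. Here is a minimal sketch, assuming the .claude/sessions layout used above. The name search_sessions is a hypothetical helper, not a Claude Code command.

```shell
#!/bin/bash
# search_sessions: hypothetical helper, not part of any CLI.
# Lists archived session transcripts that mention a term, newest first.
search_sessions() {
  local term="$1"
  local dir="${2:-.claude/sessions}"
  # -l: print matching file names only; -i: ignore case; -F: literal string.
  # The timestamped names sort chronologically, so a reverse sort puts the
  # newest session first.
  grep -liF -- "$term" "$dir"/session_*.md 2>/dev/null | sort -r
}
```

For example, `search_sessions "copyright"` would list every saved session that discussed copyright, most recent first.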

2. Automatic Memory Loading (Session Startup)

Use a SessionStart hook to automatically load context when you begin work:

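// .claude/settings.json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/scripts/load_latest_summary.sh"
          }
        ]
      }
    ]
  }
}

Claude Code reads hooks from .claude/settings.json, and whatever a SessionStart hook prints to stdout is added to the session's context. Check the hooks reference for your version if the schema differs. The hook runs this script:
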
#!/bin/bash
# .claude/scripts/load_latest_summary.sh
SUMMARY_FILE=".claude/sessions/latest_summary_short.md"

if [ -f "$SUMMARY_FILE" ]; then
  echo "================== Previous Session Context =================="
  cat "$SUMMARY_FILE"
  echo "=============================================================="
else
  echo "No previous session found. Starting fresh."
fi

Now every session starts with a brief recap of where you left off.
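
A summary that has quietly grown too large would defeat the purpose of loading it at startup. The sketch below adds a size guard. It assumes a budget expressed in bytes (the 2000-byte default is an assumption, roughly 500 tokens at the common rule of thumb of about 4 characters per token), and print_capped_summary is a hypothetical helper name.

```shell
#!/bin/bash
# print_capped_summary: hypothetical helper that prints at most max_bytes
# of the summary so a bloated file cannot flood the new session's context.
print_capped_summary() {
  local file="$1"
  local max_bytes="${2:-2000}"   # assumed budget; tune for your model
  if [ ! -f "$file" ]; then
    echo "No previous session found. Starting fresh."
    return 0
  fi
  local size
  size=$(wc -c < "$file" | tr -d ' ')
  head -c "$max_bytes" "$file"
  # Tell the reader (and the model) when the summary was cut short.
  if [ "$size" -gt "$max_bytes" ]; then
    printf '\n[summary truncated: %s of %s bytes shown]\n' "$max_bytes" "$size"
  fi
}
```

Calling `print_capped_summary .claude/sessions/latest_summary_short.md` in place of the plain `cat` keeps the startup cost bounded.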

3. On-Demand Memory Recall (Skills)

Create custom skills for memory operations:

# .claude/skills/save-session/SKILL.md
---
name: save-session
description: "Saves current conversation transcript and creates summary"
trigger: /save-session | /ss
---

Execute: .claude/scripts/save_session.sh

Then respond: "Session saved to .claude/sessions/session_[timestamp].md"

A second skill loads the summary on demand:

# .claude/skills/load-previous-summary/SKILL.md
---
name: load-previous-summary
description: Loads previous session summary for context
trigger: /load | /recall
---

Execute: .claude/scripts/load_latest_summary.sh

Then summarize the loaded context for the user.

Now you can use natural commands:

  • /save-session or /ss - Save current work
  • /load or /recall - Recall previous context
  • The AI can also invoke these proactively when needed

Implementation Architecture

Here's the complete memory system architecture:

┌─────────────────────────────────────────────────────────────┐
│                      AI Session (Current)                    │
│  ┌────────────────────────────────────────────────────┐     │
│  │  Active Conversation (Short-Term Memory)          │     │
│  │  - Current task context                            │     │
│  │  - Recent messages and tool calls                  │     │
│  │  - Limited by context window                       │     │
│  └────────────────────────────────────────────────────┘     │
│                           │                                  │
│                           │ Save on exit/demand              │
│                           ▼                                  │
└─────────────────────────────────────────────────────────────┘
                            │
                            │
┌───────────────────────────▼─────────────────────────────────┐
│              Persistent Storage (Long-Term Memory)          │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Session Archive (.claude/sessions/)                 │   │
│  │  - session_2026-01-24_1430.md  (full transcript)    │   │
│  │  - session_2026-01-24_1600.md  (full transcript)    │   │
│  │  - session_2026-01-24_1820.md  (full transcript)    │   │
│  │  - latest_session.md            (most recent full)   │   │
│  │  - latest_summary_short.md      (condensed version)  │   │
│  └──────────────────────────────────────────────────────┘   │
│                                                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Project Memory (session-notes/)                     │   │
│  │  - vibe67-memory.md  (manually curated notes)       │   │
│  │  - Key decisions and architecture                    │   │
│  │  - Gotchas and learnings                             │   │
│  └──────────────────────────────────────────────────────┘   │
└──────────────────────────────────────────────────────────────┘
                            │
                            │ Load on startup
                            ▼
┌─────────────────────────────────────────────────────────────┐
│                   New AI Session (Restored)                  │
│  ┌────────────────────────────────────────────────────┐     │
│  │  Previous Context Loaded                           │     │
│  │  ✓ Project structure understood                    │     │
│  │  ✓ Recent work summarized                          │     │
│  │  ✓ Key decisions recalled                          │     │
│  │  ✓ Ready to continue where you left off            │     │
│  └────────────────────────────────────────────────────┘     │
└─────────────────────────────────────────────────────────────┘

Real-World Example

Let's see this system in action:

Day 1: Initial Work

You: I'm building a video generator that downloads classical music and creates
     YouTube videos. I need to avoid copyright issues.

Claude: [Works on the problem, creates scanner tool, tests files...]

You: /save-session

Claude: ✓ Session saved to .claude/sessions/session_2026-01-24_1830.md

Day 2: Continuation

[Session starts automatically]

System: ================== Previous Session Context ==================
Working on vibe67 video generator project. Created YouTube-safe audio scanner
to pre-screen MP3 files for copyright risk. Discovered Classicals.de hosts
modern copyrighted performances despite claiming "public domain". Scanner
checks metadata for recording year, copyright statements, and DAW encoders.
Next: Run scanner on Chopin collection and find alternative PD sources.
==================================================================

You: Let's continue with the scanner

Claude: I'll run the YouTube-safe audio scanner on the Chopin collection we
        discussed yesterday. [Continues work seamlessly...]

Mid-Session: Context Overflow Prevention

[After many tool calls and file reads]

Claude: I'm approaching context limits. Let me save current progress.
        [Invokes save-session skill automatically]

        Now I'll load just the summary to continue with a fresh context window.
        [Loads latest_summary_short.md instead of full transcript]

Benefits of This System

1. No More Repeated Questions

The AI remembers your project structure, conventions, and previous decisions.

2. Seamless Multi-Day Projects

Pick up exactly where you left off, days or weeks later.

3. Context Window Management

Restarting from a summary instead of the full transcript helps avoid "input too long" errors on complex projects.

4. Searchable History

Full transcripts are saved with timestamps, so you can search past sessions for solutions.

5. Learning from History

The AI can reference past mistakes, gotchas, and successful patterns.

6. Automatic and Manual Control

  • Hooks provide automatic save/load
  • Skills give you manual control when needed
  • You decide when to save important milestones

Advanced: Hierarchical Memory

For complex projects, use a tiered memory structure:

.claude/sessions/
  ├── latest_summary_short.md       # 500 tokens - Quick context
  ├── latest_summary.md             # 2000 tokens - Detailed recap
  ├── latest_session.md             # Full transcript
  └── manual_summary_2026-01-24.md  # Hand-crafted context

session-notes/
  └── vibe67-memory.md              # Curated project knowledge

The AI loads different levels based on need:

  • Quick tasks: Load short summary only (saves tokens)
  • Continue work: Load detailed summary
  • Complex debugging: Reference full session transcript
  • Long-term recall: Search curated project memory
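
The mapping from need to memory tier can be sketched as a small lookup. The level names (quick, continue, debug, project) and both function names are illustrative assumptions. Only the file names come from the layout above.

```shell
#!/bin/bash
# memory_file_for: hypothetical helper mapping a need to a memory tier.
# The level names are illustrative; the paths follow the layout above.
memory_file_for() {
  local level="$1"
  local sessions="${2:-.claude/sessions}"
  local notes="${3:-session-notes}"
  case "$level" in
    quick)    echo "$sessions/latest_summary_short.md" ;;
    continue) echo "$sessions/latest_summary.md" ;;
    debug)    echo "$sessions/latest_session.md" ;;
    project)  echo "$notes/vibe67-memory.md" ;;
    *)        echo "unknown level: $level" >&2; return 1 ;;
  esac
}

# load_memory: print the chosen tier, or report that it does not exist yet.
load_memory() {
  local file
  file=$(memory_file_for "$@") || return 1
  if [ -f "$file" ]; then
    cat "$file"
  else
    echo "No memory file at $file" >&2
    return 1
  fi
}
```

With this in place, `load_memory quick` costs only the short summary, while `load_memory debug` pulls in the full transcript when the extra tokens are worth it.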

Implementation Tips

1. Keep Summaries Focused

Don't save everything. Extract the essential context:

  • Current goals and progress
  • Key decisions and rationale
  • Active bugs or blockers
  • File paths and important locations
  • Next planned steps
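
One way to keep summaries focused is to start from a fixed skeleton with one heading per item in the list above. This is a sketch only. write_summary_template is a hypothetical helper, and the heading wording is an assumption you can adapt.

```shell
#!/bin/bash
# write_summary_template: hypothetical helper that creates a summary
# skeleton with one heading per essential item. It refuses to overwrite
# an existing file so a real summary is never replaced by an empty one.
write_summary_template() {
  local out="${1:-.claude/sessions/manual_summary_$(date +%Y-%m-%d).md}"
  if [ -e "$out" ]; then
    echo "refusing to overwrite $out" >&2
    return 1
  fi
  mkdir -p "$(dirname "$out")"
  cat > "$out" <<EOF
# Session summary ($(date +"%Y-%m-%d %H:%M"))

## Current goals and progress

## Key decisions and rationale

## Active bugs or blockers

## File paths and important locations

## Next planned steps
EOF
}
```

Fill in the skeleton by hand, or ask the assistant to complete each section before you end the session.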

2. Use Timestamps

Date-based filenames make it easy to find specific sessions:

session_2026-01-24_1430.md  # 2:30 PM session
session_2026-01-24_1820.md  # 6:20 PM session
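
A side benefit of this format is that the timestamp is written most-significant field first (year, month, day, hour, minute), so sorting the names alphabetically also sorts them chronologically. The sketch below relies on that. Both function names are hypothetical.

```shell
#!/bin/bash
# latest_session_file: hypothetical helper that prints the newest transcript.
# Shell globs expand in sorted order, and for these names sorted order is
# chronological order, so the last match is the most recent session.
latest_session_file() {
  local dir="${1:-.claude/sessions}" f latest=""
  for f in "$dir"/session_*.md; do
    [ -e "$f" ] && latest="$f"
  done
  [ -n "$latest" ] && printf '%s\n' "$latest"
}

# sessions_on: list every session saved on a given day (YYYY-MM-DD).
sessions_on() {
  local day="$1" dir="${2:-.claude/sessions}" f
  for f in "$dir"/session_"$day"_*.md; do
    [ -e "$f" ] && printf '%s\n' "$f"
  done
  return 0
}
```

For example, `sessions_on 2026-01-24` would list both sessions shown above, in the order they were saved.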

3. Automatic Hook Configuration

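Set hooks in .claude/settings.json so memory loading is automatic. Note that Stop fires each time Claude finishes a response, so the save script runs after every turn rather than only on exit:

{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [{ "type": "command", "command": ".claude/scripts/load_latest_summary.sh" }]
      }
    ],
    "Stop": [
      {
        "hooks": [{ "type": "command", "command": ".claude/scripts/save_session.sh" }]
      }
    ]
  }
}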

4. Skill Triggers

Use short, memorable triggers:

  • /ss → save session
  • /load → load previous summary
  • /recall → alias for /load

5. Compression Strategy

As sessions accumulate, compress older ones:

# Keep full transcripts for 7 days
# After 7 days, keep only summaries
# After 30 days, archive to compressed format
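
The schedule above could be implemented roughly as follows. This sketch makes one assumption the article does not state: that each transcript session_<stamp>.md has a sibling per-session summary named summary_<stamp>.md. A transcript is deleted only when that summary exists, so no session disappears without a trace. prune_sessions is a hypothetical helper name.

```shell
#!/bin/bash
# prune_sessions: hypothetical helper implementing the retention schedule.
# Assumes per-session summaries named summary_<stamp>.md (an assumed
# convention, not something the save script above produces).
prune_sessions() {
  local dir="${1:-.claude/sessions}" f stamp
  # Step 1: transcripts older than 7 days are removed, but only when a
  # matching summary (plain or already compressed) is present.
  find "$dir" -maxdepth 1 -type f -name 'session_*.md' -mtime +7 |
    while IFS= read -r f; do
      stamp=${f##*/session_}
      stamp=${stamp%.md}
      if [ -f "$dir/summary_${stamp}.md" ] || [ -f "$dir/summary_${stamp}.md.gz" ]; then
        rm -- "$f"
      fi
    done
  # Step 2: summaries older than 30 days are gzip-compressed in place.
  find "$dir" -maxdepth 1 -type f -name 'summary_*.md' -mtime +30 \
    -exec gzip -- {} +
}
```

The latest_session.md and latest_summary_short.md pointers do not match either pattern, so they are never touched. Running the function from a daily cron job or from the save script itself would both work.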

Handling the Token Budget

Even if a conversation can continue indefinitely, every API call is still capped by the model's context window. The memory system works within that cap:

  1. SessionStart Hook: Loads compact summary (~500 tokens)
  2. During work: Full context in active window
  3. Before limit: Auto-save and restart with summary
  4. On-demand: /recall loads specific past context when needed

This creates the illusion of infinite memory while respecting API constraints.
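
The "before limit" step needs some way to notice that the limit is near. A rough sketch follows, assuming the transcript is available as a file and using the common rule of thumb of about 4 bytes per token for English text. This is a heuristic, not the model's real tokenizer. The 150000-token default threshold is also an assumption, chosen to sit below a 200K window.

```shell
#!/bin/bash
# estimate_tokens: rough token count for a file at ~4 bytes per token.
# A heuristic only; real tokenizers vary with language and content.
estimate_tokens() {
  local bytes
  bytes=$(wc -c < "$1" | tr -d ' ')
  echo $(( (bytes + 3) / 4 ))   # round up
}

# should_save: succeeds (exit 0) once the estimate reaches the threshold.
# The default of 150000 is an assumed safety margin, not a documented limit.
should_save() {
  local file="$1" limit="${2:-150000}"
  [ "$(estimate_tokens "$file")" -ge "$limit" ]
}
```

A wrapper could then run `should_save "$SESSION_FILE" && .claude/scripts/save_session.sh` to trigger a save before the window fills up.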

Code Example: Complete Setup

Here's everything you need:

1. Directory structure:

mkdir -p .claude/{sessions,scripts,skills/{save-session,load-previous-summary}}

2. Save script:

#!/bin/bash
# .claude/scripts/save_session.sh
set -euo pipefail
SESSIONS_DIR=".claude/sessions"
mkdir -p "$SESSIONS_DIR"

TIMESTAMP=$(date +"%Y-%m-%d_%H%M")
SESSION_FILE="${SESSIONS_DIR}/session_${TIMESTAMP}.md"

echo "Saving session to $SESSION_FILE..."

# Export conversation. "claude sessions export" is a placeholder:
# substitute your CLI's actual export method.
claude sessions export > "$SESSION_FILE"

# Update latest pointers
cp "$SESSION_FILE" "${SESSIONS_DIR}/latest_session.md"

# Generate short summary. "claude summarize" is likewise a placeholder
# for your summarization step; it reads the transcript on stdin.
claude summarize --max-tokens 500 < "$SESSION_FILE" \
  > "${SESSIONS_DIR}/latest_summary_short.md"

echo "Session saved successfully"

3. Load script:

#!/bin/bash
# .claude/scripts/load_latest_summary.sh
SUMMARY_FILE=".claude/sessions/latest_summary_short.md"

if [ -f "$SUMMARY_FILE" ]; then
  echo "================== Previous Session Context =================="
  cat "$SUMMARY_FILE"
  echo "=============================================================="
else
  echo "No previous session found"
fi

# Always exit 0 so a missing summary never blocks session startup
exit 0

Make both scripts executable with chmod +x .claude/scripts/*.sh, otherwise the hooks cannot run them.

4. Hook configuration:

// .claude/settings.json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "startup",
        "hooks": [
          {
            "type": "command",
            "command": ".claude/scripts/load_latest_summary.sh"
          }
        ]
      }
    ],
    "Stop": [
      {
        "hooks": [
          {
            "type": "command",
            "command": ".claude/scripts/save_session.sh"
          }
        ]
      }
    ]
  }
}

5. Skills:

# .claude/skills/save-session/SKILL.md
---
name: save-session
description: Immediately saves conversation transcript and summary
trigger: /save-session | /ss
---

Execute this command:
.claude/scripts/save_session.sh

After success, respond:
"✓ Session saved to .claude/sessions/session_[timestamp].md"

Conclusion

The AI memory problem is solvable. It just requires thinking about memory the way operating systems do:

  • RAM (Short-term): Active conversation context
  • Disk (Long-term): Session transcripts and summaries
  • Cache (Recall): On-demand loading of specific context
  • Compression: Summarization to manage storage

With hooks for automatic save/load and skills for manual control, you create a persistent memory layer that makes AI assistants truly useful for long-term projects.

Stop re-explaining your project every session. Start working with an AI that remembers.


About this article: Written using Claude Code with the exact memory system described above. This session will be automatically saved when you finish, and loaded when you return tomorrow.
