邱敬幃 Pardn Chiu

Cognitive Imperfect Memory System

> [!Note]
> This content is translated by LLM. The original text can be found here.


This project attempts to solve two traditional LLM problems, getting lost in multi-turn conversations and single-conversation length limits, by simulating human dialogue patterns.

Based on research paper LLMs Get Lost In Multi-Turn Conversation

The example provides TUI modes for both the traditional and the new architecture for testing: Quick Jump to TUI Example

Paper Problem Analysis

LLMs Get Lost In Multi-Turn Conversation

Common LLM Issues in Long Conversations

Research on 15 LLMs across 200,000+ conversations shows:

  • Problem: Multi-turn conversation performance drops 39%
  • Cause: LLMs use "complete memory" models instead of human-like selective memory

Four Key Issues

| LLM Problem | Human Behavior |
| --- | --- |
| Premature solutions | Ask clarifying questions first |
| Information hoarding | Forget irrelevant details |
| Linear replay | Maintain a "current state" |
| Verbose spirals | Stay focused |

Filtering Trumps Memory

Cognitive Burden of Perfect Memory

Research shows that exceptional memory does not confer an intelligence advantage. In the classic neuropsychological case, Solomon Shereshevsky could recall arbitrary details from decades earlier, including meaningless number sequences and vocabulary lists, but his "perfect memory" created a cognitive burden: he could not distinguish important from unimportant information, which made abstract thinking and everyday decision-making difficult.

Design Insights for LLMs

Traditional LLMs that use complete-memory models may in effect be simulating this cognitive impairment, demanding ever bigger, more powerful hardware without proportional performance gains.

```
Selective attention > Complete memory
Abstract summarization > Detail preservation
Dynamic adaptation > Fixed replay
```

Real Conversation Process

Continuously Updating Mental Summary

  • Humans don't repeatedly go through entire conversation history in their minds
  • Instead, they maintain a dynamic "current understanding" and update conclusions based on new information
  • Past details fade or disappear, but key conclusions and constraints persist

Keyword-Triggered Recall

  • When someone says "that thing we discussed earlier"
  • We perform fuzzy searches of recent memory for relevant information
  • Only retrieve specific details when triggered by reference keywords

Implementation Plan

Human conversation process → Engineering implementation

```
Mental summary → Small model generates structured summary after each turn
Content recall → Automatic fuzzy search of conversation history for each question (relevance scoring)
New conversation → Latest summary + relevant history fragments + new question
```

Human Conversation Simulation

Simulating imperfect memory: instead of building ever-larger information-retrieval infrastructure, we built a system that processes information the way humans do.

Humans Are Inherently Poor at Complete Memory

  • We forget irrelevant details
  • We remember key decisions
  • We learn from mistakes
  • We have internal measures
  • We maintain current conversation focus
  • We actively associate relevant past content

Combining Machine Advantages

This approach explores combining human cognitive advantages with machine computational advantages:

  • Simulate human mechanisms: Default state uses only structured summaries, avoiding historical information overwhelming current conversation
  • Machine enhancement: Complete conversation records are still preserved; when retrieval is triggered, can provide more precise detail recall than humans

Maintains the natural focus characteristics of human conversation while leveraging machine advantages in precise retrieval. Doesn't use complete history during conversation, only activating detailed retrieval under specific trigger conditions.

Engineering Simulation Focus

```
Exclude unnecessary information: Remove from key summaries
Maintain focus: Use structured summaries, similar to mental rough overviews
Active recall: Automatically retrieve relevant historical content for each question
State updates: Continuous summarization, similar to mental event understanding
```
  • Don't replay complete content but use summaries to simulate human rough overviews
  • Summarize into new overviews to adjust conversation direction, simulating human internal perspectives
  • Actively retrieve relevant history to simulate human associative memory

Implementation

  1. Continuous mental perspective updates → Automatic summary updates (relevant-information retention vs. complete history). After each conversation turn, humans unconsciously update their current conversation summary based on new information and proceed to the next turn with this new perspective.
  2. Active associative memory → Fuzzy search system (automatic memory retrieval). For each new question, the system automatically searches the conversation history for relevant content, simulating how humans actively associate past discussions.
  3. Current state focus → Fixed context structure (structured summaries). The current conversation direction is adjusted dynamically instead of re-reviewing the entire conversation history.
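The three mappings above can be sketched as a per-turn loop. This is a minimal illustration, not the project's actual code: `retrieveRelevant`, `respond`, and `updateSummary` are hypothetical stand-ins for the fuzzy search, the main conversation model, and the summarizer.

```go
package main

import "fmt"

// Turn is one question/response pair in the full record.
type Turn struct {
	Question, Response string
}

// Memory keeps a compact summary plus the complete history,
// which is consulted only through retrieval.
type Memory struct {
	Summary string // structured summary, updated every turn
	History []Turn // complete record, used only via fuzzy search
}

// retrieveRelevant is a placeholder for the fuzzy search step.
func retrieveRelevant(history []Turn, question string) []Turn {
	return history // the real system filters by relevance score
}

// respond is a placeholder for the main conversation model call.
func respond(summary string, relevant []Turn, question string) string {
	return "answer to: " + question
}

// updateSummary is a placeholder for the small summarizer model.
func updateSummary(old, question, response string) string {
	return old + " | " + question
}

// ProcessTurn runs one conversation turn: retrieve, respond, update.
func ProcessTurn(m *Memory, question string) string {
	relevant := retrieveRelevant(m.History, question)
	answer := respond(m.Summary, relevant, question)
	m.History = append(m.History, Turn{question, answer})
	m.Summary = updateSummary(m.Summary, question, answer)
	return answer
}

func main() {
	mem := &Memory{Summary: "start"}
	fmt.Println(ProcessTurn(mem, "build a TUI"))
	fmt.Println(mem.Summary)
}
```

Note that the context sent to the model each turn is summary + retrieved fragments + question, never the full history.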

Comparison

| Cognitive Mode | Human Behavior | Traditional LLM | Simulation Implementation |
| --- | --- | --- | --- |
| Memory management | Selective retention | Perfect recall | Structured forgetting |
| Error learning | Avoid known failures | Repeat mistakes | Excluded-options tracking |
| Focus maintenance | Current-state oriented | Historical drowning | Summary-based context |
| Memory retrieval | Active associative triggering | Passive complete memory | Automatic fuzzy search |

Memory Architecture

LLM "Complete Memory" (Non-human conversation method)

Flowchart

```
graph TB
  T1["Turn 1 Conversation"] --> T1_Store["Store: [Q1] + [R1]"]
  T1_Store --> T2["Turn 2 Conversation"]
  T2 --> T2_Store["Store: [Q1] + [R1] + [Q2] + [R2]"]
  T2_Store --> T3["Turn 3 Conversation"]
  T3 --> T3_Store["Store: [Q1] + [R1] + [Q2] + [R2] + [Q3] + [R3]"]
  T3_Store --> TN["Turn N Conversation"]
  TN --> TN_Store["Store: Complete conversation"]
```
```
Turn 1: [question 1] + [response 1]
Turn 2: [question 1] + [response 1] + [question 2] + [response 2]
Turn 3: [question 1] + [response 1] + [question 2] + [response 2] + [question 3] + [response 3]
...
Turn N: [Complete verbatim conversation record]
```
  • Humans don't completely recall all content
  • Old, irrelevant information interferes with generating the current response; humans exclude irrelevant information
  • There is no mechanism for learning from mistakes; in long conversations, irrelevant information causes interference
  • Linear token growth imposes conversation length limits; humans don't break off a conversation because it has gone on too long
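A quick calculation illustrates the last point: with complete memory, turn n must re-send roughly n turns' worth of tokens, so the total processed over a conversation grows quadratically with its length, while a fixed-size summary plus a bounded set of retrieved fragments keeps the per-turn cost roughly constant. The functions below are a back-of-the-envelope sketch under assumed token counts, not measurements:

```go
package main

import "fmt"

// completeMemoryTokens: if each turn adds roughly tokensPerTurn tokens,
// turn n replays all n turns, so the total over N turns is
// tokensPerTurn * N*(N+1)/2, i.e. quadratic in N.
func completeMemoryTokens(turns, tokensPerTurn int) int {
	total := 0
	for n := 1; n <= turns; n++ {
		total += n * tokensPerTurn // turn n re-sends n turns of history
	}
	return total
}

// summaryMemoryTokens: a fixed-size context (summary + retrieved
// fragments + question) makes each turn cost about the same,
// so the total stays linear in N.
func summaryMemoryTokens(turns, contextPerTurn int) int {
	return turns * contextPerTurn
}

func main() {
	fmt.Println(completeMemoryTokens(50, 200)) // 255000
	fmt.Println(summaryMemoryTokens(50, 600))  // 30000
}
```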

How Humans Actually Converse

Flowchart

```
graph TB
  H_Input["New question input"] --> H_Fuzzy["Fuzzy search history"]
  H_Fuzzy --> H_Components["Context composition"]
  H_Components --> H_Summary["Structured summary"]
  H_Components --> H_Relevant["Relevant history fragments"]
  H_Components --> H_Question["New question"]

  H_Summary --> H_LLM["LLM response"]
  H_Relevant --> H_LLM
  H_Question --> H_LLM

  H_LLM --> H_Response["Generate answer"]
  H_Response --> H_NewSummary["Update structured summary"]
  H_NewSummary --> H_Store["Store to memory"]
```
```
Each turn: [Structured current state] + [Relevant history fragments] + [New question]
```

Conversation Summary Design

```
Core topic of current discussion
Accumulated retention of all confirmed requirements
Accumulated retention of all constraint conditions
Excluded options + reasons
Accumulated retention of all important data, facts, and conclusions
Current topic-related questions to clarify
All important historical discussion points
```
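One possible Go shape for such a summary; the field names below are illustrative assumptions, not the project's actual types:

```go
package main

import "fmt"

// ConversationSummary is a hypothetical struct matching the summary
// design above: one compact, structured state per conversation.
type ConversationSummary struct {
	CoreTopic       string            // core topic of current discussion
	Requirements    []string          // all confirmed requirements, accumulated
	Constraints     []string          // all constraint conditions, accumulated
	ExcludedOptions map[string]string // excluded option -> reason it was ruled out
	KeyFacts        []string          // important data, facts, and conclusions
	OpenQuestions   []string          // topic-related questions still to clarify
	KeyPoints       []string          // important historical discussion points
}

func main() {
	s := ConversationSummary{
		CoreTopic:    "TUI chat client in Go",
		Requirements: []string{"three-panel layout"},
		ExcludedOptions: map[string]string{
			"SQLite storage": "overkill for a prototype",
		},
	}
	fmt.Println(s.CoreTopic)
}
```

Because the struct is bounded (it accumulates conclusions, not transcripts), its token cost stays roughly constant as the conversation grows.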

Fuzzy Retrieval Algorithm

Human memory retrieval is typically triggered by keywords, such as: "what we mentioned earlier..."

This section is designed to calculate high similarity between the latest question and conversation history to provide supplementary reference materials, simulating natural memory trigger mechanisms:

  • Keyword triggering: Immediately associate relevant content upon hearing specific keywords
  • Semantic Similarity: Comprehend content with similar meaning but different wording
  • Time Weight: Recent conversations are more easily recalled

Multi-dimensional Scoring Mechanism

```
Total score = Keyword overlap (40%) + Semantic similarity (40%) + Time weight (20%)
```

Keyword triggering

  • Use Jaccard similarity to calculate vocabulary matching degree
  • Support partial matching and inclusion relationships

Semantic Similarity

  • Simplified cosine similarity, calculating common vocabulary proportion
  • Suitable for Chinese-English mixed text processing

Time Weight

  • Linear decay within 24 hours: recent=1.0, 24 hours ago=0.7
  • Fixed score of 0.7 after 24 hours (suitable for long-term continuous conversations)

Retrieval Control Mechanism

  • Relevance threshold: Default 0.3, filters irrelevant content
  • Result quantity limit: Return maximum 5 most relevant records
  • Keyword extraction: Automatically filter stop words, retain meaningful vocabulary
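The first two controls amount to a filter-then-truncate step over scored history records; `selectRelevant` and `Scored` are hypothetical names for this sketch:

```go
package main

import (
	"fmt"
	"sort"
)

// Scored pairs a history record with its relevance score.
type Scored struct {
	Text  string
	Score float64
}

// selectRelevant applies the retrieval controls described above:
// drop records below the threshold (default 0.3), then keep at most
// `limit` records (default 5), highest score first.
func selectRelevant(records []Scored, threshold float64, limit int) []Scored {
	var kept []Scored
	for _, r := range records {
		if r.Score >= threshold {
			kept = append(kept, r)
		}
	}
	sort.Slice(kept, func(i, j int) bool { return kept[i].Score > kept[j].Score })
	if len(kept) > limit {
		kept = kept[:limit]
	}
	return kept
}

func main() {
	records := []Scored{{"a", 0.9}, {"b", 0.1}, {"c", 0.5}}
	fmt.Println(selectRelevant(records, 0.3, 5)) // [{a 0.9} {c 0.5}]
}
```

The threshold keeps noise out of the context; the cap bounds the context size regardless of conversation length.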

Context Combination Strategy

```
Each turn conversation context = [Structured summary] + [Relevant historical conversation] + [New question]
```

Implemented

  • [x] Structured summary system: Simulate human mental rough summaries
  • [x] State update mechanism: Automatically update cognitive state after each conversation turn (gpt-4o-mini)
  • [x] Error learning system: Avoid repeated mistakes through ExcludedOptions
  • [x] Token efficiency optimization: Fixed transmission of summaries and new content, no longer passing complete message streams
  • [x] Fuzzy retrieval mechanism: Automatically retrieve relevant historical conversations as reference
  • [x] Multi-dimensional scoring algorithm: Comprehensive relevance assessment of keywords+semantics+time
  • [x] Long conversation optimization: Time weight design suitable for continuous conversation scenarios

To Be Implemented

  • [ ] Semantic understanding enhancement: Integrate more precise semantic similarity algorithms
  • [ ] Keyword extraction optimization: More intelligent vocabulary extraction and weight allocation
  • [ ] Dynamic threshold adjustment: Automatically adjust relevance thresholds based on conversation content
  • [ ] Conversation type identification: Optimize memory strategies for different conversation scenarios
  • [ ] Multi-model support: Support more LLM providers (Claude, Gemini, etc.)

TUI Example Usage

Environment Requirements

  • Go 1.20 or higher
  • OpenAI API key

Installation Steps

  1. Clone the project

```
git clone https://github.com/pardnchiu/cim-prototype
cd cim-prototype
```
  2. Configure the API key: create an OPENAI_API_KEY file containing your OpenAI API key:

```
echo "your-openai-api-key-here" > OPENAI_API_KEY
```

Or set an environment variable:

```
export OPENAI_API_KEY="your-openai-api-key-here"
```
  3. Run the program

```
./cimp
./cimp --old # Run traditional memory mode
```

or

```
go run main.go
go run main.go --old # Run traditional memory mode
```

API Key Configuration

The program looks for the OpenAI API key in the following order:

  1. Environment variable OPENAI_API_KEY
  2. OPENAI_API_KEY file in current directory
  3. OPENAI_API_KEY file in executable directory

Instruction File Configuration

INSTRUCTION_CONVERSATION

  • Defines the system instructions for the main conversation model (GPT-4o)
  • Affects the AI assistant's response style and behavior
  • If the file doesn't exist, blank instructions are used

INSTRUCTION_SUMMARY

  • Defines the system instructions for the summary generation model (GPT-4o-mini)
  • Affects the conversation summary's update logic and format
  • If the file doesn't exist, blank instructions are used

Usage

  1. Start the program: After execution, displays three-panel interface

    • Left: Conversation history display
    • Top right: Conversation summary display
    • Bottom right: Question input field
  2. Basic operations:

    • Enter: Submit question
    • Tab: Switch panel focus
    • Ctrl+C: Exit program
  3. Conversation flow:

    • After inputting question, system automatically retrieves relevant historical conversations
    • AI provides answers based on summary and relevant history
    • System automatically updates conversation summary, maintaining memory state (wait for summary update before continuing conversation)

License

This source code project is licensed under the MIT license.


©️ 2025 邱敬幃 Pardn Chiu
