NeuroLink AI
Building Personalized AI: Multi-User Memory and Context Management in TypeScript

In the rapidly evolving landscape of AI applications, personalization and context continuity are paramount. Users expect AI to remember past interactions, understand their preferences, and adapt to their individual needs. This is where robust memory and context management systems become critical.

NeuroLink, the universal AI SDK for TypeScript, provides sophisticated mechanisms to handle both short-term conversational context and long-term, condensed user memory. This article explores why per-user memory is essential, how NeuroLink implements it, multi-user patterns, and important considerations for production deployments.

Why AI Apps Need Per-User Memory

Imagine a customer support AI that forgets every previous interaction, or a personal assistant that asks for your name every time you converse. Without memory, AI applications are stateless and generic, leading to frustrating and inefficient user experiences. Per-user memory addresses this by:

  1. Personalization: Tailoring responses and actions based on individual user preferences, history, and goals.
  2. Context Continuity: Maintaining the flow of conversation and understanding references to past topics, even across sessions.
  3. Efficiency: Avoiding redundant information exchange, as the AI can recall previously established facts.
  4. Adaptability: Enabling AI to learn and adapt to user behavior over time, becoming more useful and intuitive.

NeuroLink tackles this challenge with two complementary memory systems: "Conversation Memory" for short-term chat history and "Memory Engine" for condensed, long-term user context.

NeuroLink's Conversation Memory System

NeuroLink's Conversation Memory focuses on maintaining the immediate context of an ongoing dialogue. It stores recent turns in a session, ensuring the AI can refer to previous messages within the same conversation.

Key features include:

  • Session-based memory: Each conversation session (identified by a sessionId) has its own isolated context.
  • Turn-by-turn persistence: AI remembers messages from both the user and itself within a session.
  • Automatic cleanup: Configurable maxTurnsPerSession and maxSessions prevent memory bloat by automatically removing older data (Least Recently Used eviction for sessions).
  • In-memory storage: Fast and lightweight for active conversations.
  • Universal method support: Works seamlessly with both generate() and stream() methods.

Here's a quick example of how NeuroLink uses sessionId and userId to manage conversation memory:

import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink({
  conversationMemory: { enabled: true },
});

// First message in session for a specific user
const response1 = await neurolink.generate({
  prompt: "My name is Alice and I love reading books",
  context: {
    sessionId: "user-123-chat-session-001", // Unique session ID
    userId: "alice", // User identifier
  },
});

// Follow-up message - AI will remember previous context for Alice
const response2 = await neurolink.generate({
  prompt: "What is my favorite hobby?",
  context: {
    sessionId: "user-123-chat-session-001", // Same session ID
    userId: "alice",
  },
});
// Response: "Based on what you told me, your favorite hobby is reading books!"

This ensures that within the user-123-chat-session-001 session, the AI retains the context of Alice's preferences.
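To make the cleanup behavior concrete, here is a minimal, self-contained sketch of session-scoped turn storage with LRU eviction. It illustrates the maxTurnsPerSession and maxSessions behavior described above; the class and method names are hypothetical and this is not NeuroLink's actual implementation.

```typescript
// Sketch of session-scoped turn storage with LRU eviction.
// Illustrative only — not NeuroLink's internal implementation.
type Turn = { role: "user" | "assistant"; content: string };

class ConversationStore {
  // Map preserves insertion order, which we exploit for LRU eviction.
  private sessions = new Map<string, Turn[]>();
  private maxTurnsPerSession: number;
  private maxSessions: number;

  constructor(maxTurnsPerSession = 20, maxSessions = 100) {
    this.maxTurnsPerSession = maxTurnsPerSession;
    this.maxSessions = maxSessions;
  }

  addTurn(sessionId: string, turn: Turn): void {
    const turns = this.sessions.get(sessionId) ?? [];
    turns.push(turn);
    // Drop the oldest turns once the per-session limit is exceeded.
    while (turns.length > this.maxTurnsPerSession) turns.shift();
    // Re-inserting moves this session to the "most recently used" end.
    this.sessions.delete(sessionId);
    this.sessions.set(sessionId, turns);
    // Evict the least recently used session when over the session limit.
    while (this.sessions.size > this.maxSessions) {
      const oldest = this.sessions.keys().next().value!;
      this.sessions.delete(oldest);
    }
  }

  getTurns(sessionId: string): Turn[] {
    return this.sessions.get(sessionId) ?? [];
  }
}
```

Keeping sessions in a Map and re-inserting on each write is a common trick: the first key in iteration order is always the least recently used session, so eviction is O(1).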

NeuroLink's Memory Engine: Condensed, Persistent User Memory

While Conversation Memory handles immediate dialogue, the NeuroLink Memory Engine (powered by the @juspay/hippocampus SDK) provides long-term, condensed, and persistent memory for individual users. This is crucial for applications where facts about a user need to survive across many different conversations, days, or even months.

Unlike conversation memory, which stores raw turns, the memory engine maintains a condensed summary of durable facts about each user. The condensation is performed by an LLM, keeping the memory concise and relevant even as interactions accumulate.

How it works:

  1. Retrieve: Before an LLM call, memory.get(userId) fetches the user's condensed memory.
  2. Inject: This memory is prepended to the user's prompt as context, enabling the LLM to use it for generating responses.
  3. Store (Background): After the LLM responds, memory.add(userId, content) runs in the background. An LLM condenses the old memory with the new conversation turn, creating an updated summary.
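The retrieve → inject → store cycle above can be sketched with stubs standing in for the LLM calls. Everything here (the `condense` and `generateWithMemory` functions, the naive word-truncation) is illustrative and not the hippocampus API:

```typescript
// Sketch of the memory engine's retrieve → inject → store cycle.
// `condense` and `callLLM` are stand-ins for the real LLM calls.
const memoryStore = new Map<string, string>();

function condense(oldMemory: string, newContent: string, maxWords: number): string {
  // Stand-in for LLM condensation: naively merge and keep the last maxWords words.
  const words = `${oldMemory} ${newContent}`.trim().split(/\s+/);
  return words.slice(-maxWords).join(" ");
}

async function callLLM(prompt: string): Promise<string> {
  return `echo: ${prompt.slice(0, 40)}`; // stub provider
}

async function generateWithMemory(userId: string, prompt: string): Promise<string> {
  // 1. Retrieve the user's condensed memory.
  const memory = memoryStore.get(userId) ?? "";
  // 2. Inject it into the prompt as context.
  const augmented = memory
    ? `Context from previous conversations:\n${memory}\n\nCurrent request: ${prompt}`
    : prompt;
  const response = await callLLM(augmented);
  // 3. Store: condense old memory with the new turn (the real engine
  // does this in the background, off the request path).
  memoryStore.set(userId, condense(memory, `${prompt} ${response}`, 100));
  return response;
}
```

The important design point is step 3 running off the critical path: the user gets the response immediately, and the condensation LLM call happens asynchronously afterwards.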

Redis-Backed Memory for Production

For production environments, the Memory Engine supports several storage backends, including Redis. Redis is a strong choice for distributed, persistent memory: it is fast, supports per-key expiry, and is straightforward to share across application instances.

import { NeuroLink } from "@juspay/neurolink";

const neurolink = new NeuroLink({
  conversationMemory: {
    enabled: true,
    memory: {
      enabled: true,
      storage: {
        type: "redis",
        url: "redis://localhost:6379", // Redis connection URL
      },
      neurolink: {
        provider: "openai", // LLM for condensation
        model: "gpt-4o-mini",
      },
      maxWords: 100, // Keep condensed memory to 100 words
    },
  },
});

// User interacts for the first time
await neurolink.generate({
  input: { text: "I'm a software engineer and I specialize in TypeScript." },
  context: { userId: "user-developer-x" },
});

// Days later, the same user returns
const result = await neurolink.generate({
  input: { text: "What's my primary programming language?" },
  context: { userId: "user-developer-x" },
});
// Result: "Based on your previous interactions, your primary programming language is TypeScript."

In this setup, Redis provides persistence, ensuring that user-developer-x's condensed memory (that they are a TypeScript-specializing software engineer) is available even after application restarts or across multiple instances in a distributed system.

Other storage options include S3 (recommended for maximum durability), SQLite (for development and local use), and custom backends for full flexibility.
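A custom backend boils down to a small get/set/delete contract. The interface below is a hypothetical shape for illustration only; check the @juspay/hippocampus documentation for the actual contract it expects:

```typescript
// Hypothetical shape of a custom memory storage backend.
// The actual interface expected by @juspay/hippocampus may differ.
interface MemoryStorage {
  get(userId: string): Promise<string | null>;
  set(userId: string, memory: string): Promise<void>;
  delete(userId: string): Promise<void>; // e.g. for erasure requests
}

// In-memory reference implementation, useful for tests.
class InMemoryStorage implements MemoryStorage {
  private data = new Map<string, string>();

  async get(userId: string): Promise<string | null> {
    return this.data.get(userId) ?? null;
  }
  async set(userId: string, memory: string): Promise<void> {
    this.data.set(userId, memory);
  }
  async delete(userId: string): Promise<void> {
    this.data.delete(userId);
  }
}
```

Because every operation is keyed by userId and async, the same contract maps cleanly onto Redis, S3, a SQL table, or a privacy-aware wrapper that checks consent before reads and writes.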

Multi-User Patterns: Layered Contexts

One of NeuroLink's powerful features is its support for multi-user memory. This allows AI applications to simultaneously retrieve and store memory for multiple distinct "users" within a single generate() or stream() call. This enables "layered memory," combining different scopes of context, such as:

  • Personal context: The individual user's preferences and history.
  • Organizational policies: Company-wide rules or guidelines.
  • Team-specific knowledge: Information relevant to a user's team or department.

This is achieved by specifying additionalUsers in the memory options:

const result = await neurolink.stream({
  input: { text: "How should I handle sensitive customer data in our new feature?" },
  context: { userId: "dev-mary" }, // Primary user
  memory: {
    additionalUsers: [
      {
        userId: "org-security-policy",
        label: "Security Policy",
        write: false, // Read-only policy
        prompt: `Summarize the key security and compliance rules related to data handling into at most {{MAX_WORDS}} words.
OLD_MEMORY: {{OLD_MEMORY}}
NEW_CONTENT: {{NEW_CONTENT}}
Condensed memory:`,
        maxWords: 150,
      },
      {
        userId: "team-fraud-detection",
        label: "Fraud Team Insights",
        read: true,
        write: true, // Allow AI to update team insights
      },
    ],
  },
});

The AI's prompt will then be augmented with context from all specified memory sources, allowing it to provide a holistic answer:

Context from previous conversations:

[User]
Mary is a backend developer, working on a new payment processing feature.

[Security Policy]
All sensitive customer data must be encrypted at rest and in transit. Adhere to PCI-DSS Level 1.
Data access must be logged and audited. Implement least privilege.

[Fraud Team Insights]
Recent fraud attempts involved social engineering; emphasize strong identity verification steps.

Current user's request: How should I handle sensitive customer data in our new feature?

This layered approach ensures that the AI's responses are not only personalized but also compliant with organizational policies and informed by relevant team knowledge. Each additionalUser can have its own condensation prompt and maxWords, allowing for fine-grained control over how different types of memory are summarized.
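Assembling a layered context like the one above is, at its core, concatenation over labeled memory sources. A self-contained sketch (not NeuroLink's internals; names are illustrative):

```typescript
// Sketch of assembling labeled memory layers into one context block,
// mirroring the augmented prompt shown above.
type MemorySource = { label: string; memory: string };

function buildLayeredPrompt(sources: MemorySource[], request: string): string {
  const sections = sources
    .filter((s) => s.memory.trim().length > 0) // skip empty layers
    .map((s) => `[${s.label}]\n${s.memory}`)
    .join("\n\n");
  return `Context from previous conversations:\n\n${sections}\n\nCurrent user's request: ${request}`;
}
```

Because each layer is independently condensed before assembly, the combined block stays within a predictable token budget: the sum of each layer's maxWords.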

Memory Export and Analytics

While NeuroLink manages memory internally, the underlying @juspay/hippocampus SDK provides mechanisms for memory export. For Redis, you can directly access the stored keys for analytics. For S3, memories are stored as objects, easily accessible for batch processing or auditing.

NeuroLink also exposes getConversationStats() for in-memory conversation stats and the ability to clearConversationSession() or clearAllConversations(). For comprehensive auditing and debugging, the Redis conversation export feature allows full session history to be exported as JSON.

Privacy Considerations: TTL, Cleanup, Data Isolation

When dealing with user memory, privacy and data management are paramount. NeuroLink provides features and patterns to address these:

  • TTL (Time-To-Live): While not directly part of NeuroLink's memory configuration, Redis (a common backend) allows setting TTLs on keys. This can be used to automatically expire user memories after a certain period of inactivity or as per data retention policies.
  • Cleanup: maxTurnsPerSession and maxSessions in Conversation Memory automatically handle cleanup. For the Memory Engine, explicit deletion (onDelete in custom storage) can be implemented. For Redis, this can be done programmatically or via Redis's eviction policies.
  • Data Isolation: Both Conversation Memory and the Memory Engine are designed for per-user isolation, ensuring that one user's data does not bleed into another's. Each userId acts as a distinct key.
  • Custom Storage: The "custom" storage type for the Memory Engine allows you to integrate with existing data privacy frameworks, implement consent management, and enforce data residency requirements.
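The TTL pattern can be illustrated in miniature. With Redis you would typically call `EXPIRE` (or `SET` with `EX`) on each memory key; the sketch below mimics that lazy-expiry behavior in process memory, with an injectable clock so the behavior is testable:

```typescript
// Sketch of TTL-based expiry for user memories, mimicking what
// a per-key Redis EXPIRE would provide. Illustrative only.
class TtlMemoryStore {
  private data = new Map<string, { memory: string; expiresAt: number }>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(userId: string, memory: string): void {
    // Each write refreshes the expiry, so active users are retained.
    this.data.set(userId, { memory, expiresAt: this.now() + this.ttlMs });
  }

  get(userId: string): string | null {
    const entry = this.data.get(userId);
    if (!entry) return null;
    if (this.now() >= entry.expiresAt) {
      // Lazily delete expired entries on access, as Redis does.
      this.data.delete(userId);
      return null;
    }
    return entry.memory;
  }
}
```

Refreshing the TTL on every write means memory expires only after a period of inactivity, which maps naturally onto "retain for N days since last interaction" retention policies.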

By carefully configuring these aspects, developers can build powerful, personalized AI applications while adhering to strict privacy and compliance standards.


NeuroLink — The Universal AI SDK for TypeScript
