DEV Community

RUIPENG CAO

Posted on • Originally published at github.com

Building an AI Companion That Actually Remembers You — Live2D Avatar + Persistent Memory + Affinity System

The Problem

I've been running local LLMs for a while — Ollama, vLLM, the usual suspects. The models keep getting better, but the experience hasn't changed: you type something, it responds, and next time you open the chat, it has no idea who you are.

That's the gap I wanted to close. Not a smarter model — a smarter shell around the model.

What I Built

OpenClaw Live2D is an open-source frontend framework that adds three layers on top of any LLM backend:

1. Live2D Avatar

Instead of a text box, your AI has a face. A Live2D character that:

  • Lip-syncs in real time with the TTS audio output
  • Changes expressions based on conversation emotion (24 expression types)
  • Simulates physics — hair and clothing react naturally
  • Responds to touch — click/tap interactions with different reactions

The rendering pipeline uses PixiJS with the pixi-live2d-display library, running at 60fps.

2. Persistent Memory (EverMemOS)

This is the core differentiator. Powered by EverMemOS, the AI remembers conversations across sessions:

You (Monday): "My cat has been sick lately"
You (Friday): "Had a rough day"
AI: "I'm sorry to hear that. How's your cat doing, by the way? You mentioned she was sick earlier this week."

Memories are persisted over the WebSocket connection and retrieved with semantic search. When something important is remembered, a toast notification appears in the UI. This isn't plain document RAG: conversations are segmented with boundary detection and scored for importance, so what gets stored is actual conversational memory rather than retrieved chunks.
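The UI side of that flow can be sketched in a few lines, assuming a made-up memory-event shape (the real EverMemOS protocol isn't shown in this post):

```typescript
// Hypothetical shape of a memory event arriving over the WebSocket;
// the actual EverMemOS schema may differ.
interface MemoryEvent {
  type: "memory-stored";
  content: string;    // the remembered fact, e.g. "user's cat is sick"
  importance: number; // 0..1 from the importance scorer
}

// Gate UI toasts on the importance score so trivial memories stay silent.
function shouldToast(event: MemoryEvent, threshold = 0.7): boolean {
  return event.importance >= threshold;
}
```

The threshold here is an arbitrary illustration; the point is that importance scoring decides what surfaces to the user, not that everything remembered produces a notification.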

3. Affinity System

A heartbeat-driven relationship tracker with 5 levels:

Stranger (0) → Acquaintance (20) → Friend (40) → Close (60) → Soulmate (80)

The heart in the UI literally beats at a rate proportional to the affinity level (50-120 BPM). As the relationship grows:

  • The AI's tone becomes more casual and open
  • Milestone celebrations pop up when you level up
  • A "Relationship Card" summarizes your bond — exportable as PNG
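Both mappings are simple to sketch. The level thresholds and BPM range come straight from the numbers above; the linear interpolation between them is my assumption:

```typescript
const LEVELS = ["Stranger", "Acquaintance", "Friend", "Close", "Soulmate"] as const;

// Thresholds from the post: 0 / 20 / 40 / 60 / 80
function affinityLevel(points: number): string {
  const idx = Math.min(4, Math.floor(Math.max(0, points) / 20));
  return LEVELS[idx];
}

// Linear map of affinity 0..100 onto the 50-120 BPM heartbeat range
function heartBpm(points: number): number {
  const t = Math.min(100, Math.max(0, points)) / 100;
  return Math.round(50 + t * 70);
}
```

A milestone check then reduces to comparing `affinityLevel` before and after each update.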

It sounds like a gimmick until you experience it. There's something genuinely engaging about watching a number go up because you had a good conversation.

Architecture

┌──────────────────────────────────────────────────┐
│  OpenClaw Live2D Frontend                        │
│                                                  │
│  ┌──────────┐  ┌───────────┐  ┌──────────────┐   │
│  │ Live2D   │  │ Affinity  │  │ Relationship │   │
│  │ Canvas   │  │ System    │  │ Card         │   │
│  └────┬─────┘  └─────┬─────┘  └──────┬───────┘   │
│       │              │               │           │
│  ┌────┴──────────────┴───────────────┴────────┐  │
│  │         WebSocket Handler                  │  │
│  │  (audio, control, affinity, memory, chat)  │  │
│  └────────────────────┬───────────────────────┘  │
│                       │                          │
└───────────────────────┼──────────────────────────┘
                        │ WebSocket
┌───────────────────────┼──────────────────────────┐
│  Open-LLM-VTuber Backend                         │
│  ┌────────────────────┴───────────────────────┐  │
│  │  LLM + TTS + ASR + Live2D Model Server     │  │
│  └────────────────────┬───────────────────────┘  │
│                       │                          │
│  ┌────────────────────┴───────────────────────┐  │
│  │  EverMemOS (Long-term Memory)              │  │
│  └────────────────────────────────────────────┘  │
└──────────────────────────────────────────────────┘

The frontend communicates with the backend entirely over WebSocket. Audio streams, chat messages, affinity updates, and memory events all flow through a single connection with message-type routing.
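A single-connection design like this usually amounts to a small dispatcher keyed on a `type` field. A sketch of that pattern — the message shape is assumed, not the project's actual protocol:

```typescript
type Handler = (payload: unknown) => void;

// Routes incoming WebSocket messages to handlers by their "type" field.
class MessageRouter {
  private handlers = new Map<string, Handler>();

  on(type: string, handler: Handler): this {
    this.handlers.set(type, handler);
    return this; // chainable registration
  }

  // Returns true if a handler consumed the message, false if unrouted.
  dispatch(raw: string): boolean {
    const msg = JSON.parse(raw) as { type: string; payload?: unknown };
    const handler = this.handlers.get(msg.type);
    if (!handler) return false;
    handler(msg.payload);
    return true;
  }
}
```

Wired up as `ws.onmessage = (e) => router.dispatch(e.data)`, this lets audio, chat, affinity, and memory handlers stay decoupled while sharing one socket.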

Tech Stack

Layer           Tech
UI Framework    React 18 + TypeScript
Components      Chakra UI v3
State           Zustand
Rendering       PixiJS + Live2D Cubism SDK
Desktop         Electron
Build           Vite
Animation       Framer Motion
i18n            i18next (CN/EN)
Voice           VAD (Voice Activity Detection)
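For the VAD entry in the stack: the simplest possible gate is frame energy against a threshold. Production VADs are model-based and far more robust, and I don't know which one the project uses, but an energy gate shows the idea:

```typescript
// Toy energy-based voice activity detector: a frame counts as speech
// when its mean squared amplitude exceeds a fixed threshold.
function isSpeech(frame: Float32Array, threshold = 0.01): boolean {
  let energy = 0;
  for (let i = 0; i < frame.length; i++) {
    energy += frame[i] * frame[i];
  }
  return energy / frame.length > threshold;
}
```

In a real pipeline this gate decides when to start and stop streaming microphone audio to the ASR backend.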

Running It Yourself

Web Mode (quickest)

git clone https://github.com/Singularity-Engine/openclaw-live2d.git
cd openclaw-live2d
npm install
cp .env.example .env.local
# Edit .env.local with your backend URL
npm run dev:web

Desktop Mode

npm run dev  # Launches Electron app

Docker

docker build \
  --build-arg VITE_API_URL=http://your-backend:12393 \
  --build-arg VITE_WS_URL=ws://your-backend:12393 \
  -t openclaw-live2d .
docker run -p 3001:3001 openclaw-live2d

Prerequisites

You need a running Open-LLM-VTuber backend. It handles LLM inference, TTS, and ASR. OpenClaw Live2D is the frontend — it adds the memory, affinity, and enhanced avatar experience on top.

Open Core Model

The project is fully open source under the MIT license. You can self-host everything.

For people who don't want to manage infrastructure, we offer a hosted version at sngxai.com with optional billing (Stripe integration).

What's Next

  • ASR integration — hands-free voice conversation
  • Custom model marketplace — use your own Live2D avatar
  • Mobile support — React Native or Capacitor
  • Multi-agent — multiple AI characters in the same scene

Links


If you've been looking for a way to make AI conversations feel less disposable, give it a try. Star the repo if you find it interesting, and PRs are always welcome.
