Biprayan Choudhuri

I Built Yumii — An Open-Source AI Companion

This is a submission for the GitHub Finish-Up-A-Thon Challenge

What I Built

Yumii is an open-source, locally-run AI companion with a Live2D avatar, real-time voice, and six distinct personalities — and she now actually remembers you.

She listens to your voice. She responds through an animated avatar. Her face reacts to what you say. And as of this sprint, she builds a long-term memory of who you are — across every conversation, every session, every restart.

Beyond the core AI experience, Yumii focuses heavily on accessibility and ease of use. The project features a dedicated website, comprehensive documentation, a one-line installation command, and a streamlined CLI onboarding process that helps users get started quickly.

GitHub: https://github.com/CodeNeuron58/Yumi
License: MIT

Demo

Before vs After — at a glance:

| Before | After |
| --- | --- |
| Forgot you on every restart | Persistent memory across sessions |
| One shared conversation, no history | Full session management: create, resume, rename |
| No user facts | Auto-extracts facts from every conversation turn |
| No CLI beyond launch | /memory, /sessions, /resume, /forget commands |
| Wiped on exit | Survives restarts |
| No dedicated website | Full documentation website |
| Basic audio capture | Active VAD: only speaks when you actually do |
| Manual setup | One-line installation |

The Comeback Story

I saw Project Ava and a few other AI companion demos and thought — I could build that.

I enjoy anime. The idea of an anime-style AI that actually talks and reacts to you genuinely excited me. So I built a scrappy version that barely held together. It worked, I was happy, and I dropped it. I had only built it for fun, so I didn't care that much.

Months later I saw the same concept being sold as a subscription product. That bothered me. I'd already built this. A scrappy version, yes, but I had built it. And I thought to myself that I could build it properly and make it free and open source.

So I restarted. Proper CLI. Web interface. Clean architecture. VAD, STT, LLM, TTS — the full real-time voice loop. And one hard rule: it has to run on a CPU. I don't have a GPU. Most people don't. If it needs one, it's not really for everyone.
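To make that voice loop concrete, here is a minimal sketch of the VAD, STT, LLM, TTS chain as four stages joined by asyncio queues. It is an illustration under assumptions, not the code in the repo: plain strings stand in for audio, and the `vad`, `stt`, `llm`, and `tts` functions are hypothetical placeholders for the real models.

```python
import asyncio

_STOP = object()  # sentinel that tells each stage to shut down


async def stage(fn, inbox, outbox):
    """Pull items from inbox, transform them with fn, push results to outbox."""
    while True:
        item = await inbox.get()
        if item is _STOP:
            await outbox.put(_STOP)  # pass the shutdown signal downstream
            return
        result = fn(item)
        if result is not None:  # None means "drop this item" (e.g. silence)
            await outbox.put(result)


# Hypothetical stand-ins for the real models; strings play the role of audio.
def vad(chunk):
    return chunk if chunk.strip() else None  # blank chunk = silence, dropped


def stt(speech):
    return speech.strip().lower()


def llm(text):
    return f"echo: {text}"


def tts(reply):
    return f"<audio:{reply}>"


async def run_pipeline(chunks):
    """Feed chunks through VAD -> STT -> LLM -> TTS and collect the output."""
    fns = [vad, stt, llm, tts]
    queues = [asyncio.Queue() for _ in range(len(fns) + 1)]
    tasks = [
        asyncio.create_task(stage(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(fns)
    ]
    for chunk in chunks:
        await queues[0].put(chunk)
    await queues[0].put(_STOP)

    spoken = []
    while True:
        item = await queues[-1].get()
        if item is _STOP:
            break
        spoken.append(item)
    await asyncio.gather(*tasks)
    return spoken
```

Calling `asyncio.run(run_pipeline(["  Hello ", "   ", "How are you"]))` returns two synthesized replies, because the blank chunk is filtered out at the VAD stage before it reaches the more expensive stages. Each stage only waits on its own queue, which is what lets slow steps overlap on a CPU-only machine.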

That's what you're looking at now. Still alpha, still rough in places, not yet user-tested beyond myself. But it works, it's free, and it runs locally on your machine.

Note on the demo: I'm skipping a video walkthrough because the Live2D model I use locally has redistribution restrictions, so I can't share footage with it publicly. Voice, memory, and everything else work fine as far as I can tell, and Yumii runs without an avatar if you want to try it yourself.

What's next

Making her actually do things.

Right now Yumii talks. What I want is for her to act — search the web, read your emails, help you shop, browse on your behalf. A real assistant, not just a companion.

The plan is to give her tool access via MCP (Model Context Protocol).

Contribute

Yumii is fully open source — MIT license. If you're a developer who builds with LangGraph, vector databases, voice pipelines, or just finds this kind of project interesting — contributions are genuinely welcome.

I'd love to hear what other developers think of the architecture, the approach, or what they'd build on top of it. Open an issue, start a discussion, or just drop a comment here. Every perspective helps.

Small Note

Yumii is still in a very early stage and actively being developed, so expect rough edges here and there.

This is my first time properly sharing the project outside LinkedIn, and I’d genuinely love feedback, ideas, or contributions from fellow developers while building it.

My Experience with GitHub Copilot

Copilot was running throughout this sprint mostly through tab completion — and that's where it genuinely helped.

The repetitive async patterns across the session and memory modules are where it earned its place. Once it saw the structure of the first method, it completed the rest almost entirely — correct SQL, correct await placement, correct error handling. What would've taken an hour took fifteen minutes.

The SQLite schema was another moment. I typed CREATE TABLE sessions ( and it suggested the full schema, including columns I hadn't thought of yet but ended up needing.
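For a sense of what a table like that can look like, here is a minimal sketch using Python's built-in `sqlite3` module. The column names (`id`, `title`, `created_at`, `updated_at`) and the helper functions are illustrative assumptions, not the exact schema in the repo.

```python
import sqlite3
import uuid
from datetime import datetime, timezone

# Hypothetical schema: these columns are illustrative, not the project's own.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sessions (
    id         TEXT PRIMARY KEY,
    title      TEXT NOT NULL,
    created_at TEXT NOT NULL,
    updated_at TEXT NOT NULL
)
"""


def _now():
    """Current UTC time as an ISO 8601 string (sorts chronologically)."""
    return datetime.now(timezone.utc).isoformat()


def open_db(path=":memory:"):
    """Open the database and make sure the sessions table exists."""
    conn = sqlite3.connect(path)
    conn.row_factory = sqlite3.Row  # allows row["title"] style access
    conn.execute(SCHEMA)
    return conn


def create_session(conn, title="New chat"):
    """Insert a new session and return its generated id."""
    session_id = uuid.uuid4().hex
    now = _now()
    conn.execute(
        "INSERT INTO sessions (id, title, created_at, updated_at) "
        "VALUES (?, ?, ?, ?)",
        (session_id, title, now, now),
    )
    conn.commit()
    return session_id


def rename_session(conn, session_id, title):
    """Rename a session; returns False if the id does not exist."""
    cur = conn.execute(
        "UPDATE sessions SET title = ?, updated_at = ? WHERE id = ?",
        (title, _now(), session_id),
    )
    conn.commit()
    return cur.rowcount == 1


def list_sessions(conn):
    """Most recently updated sessions first."""
    return conn.execute(
        "SELECT id, title FROM sessions ORDER BY updated_at DESC"
    ).fetchall()
```

Passing a file path to `open_db` instead of the default `:memory:` is what makes the sessions survive a restart.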

It didn't make any decisions. The architecture, the three-database design, the fire-and-forget fact extraction, the async queue pipeline — all of that was thought through manually. But for translating decisions into correct boilerplate fast, it was genuinely useful.
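As a rough illustration of the fire-and-forget idea, here is a minimal asyncio sketch: the reply is returned right away while fact extraction runs as a background task. The names `extract_facts`, `handle_turn`, and `drain_background` are hypothetical, and a simple regex stands in for the LLM-based extractor.

```python
import asyncio
import re

# Keep strong references so pending tasks are not garbage-collected mid-flight.
_background_tasks = set()


async def extract_facts(user_text, store):
    """Hypothetical stand-in for an LLM-based fact extractor."""
    await asyncio.sleep(0)  # placeholder for the real (slow) model call
    match = re.search(r"my name is (\w+)", user_text, re.IGNORECASE)
    if match:
        store.append(f"name={match.group(1)}")


def _on_done(task):
    """Drop the finished task and log any error instead of raising it."""
    _background_tasks.discard(task)
    if not task.cancelled() and task.exception() is not None:
        print(f"fact extraction failed: {task.exception()!r}")


async def handle_turn(user_text, store):
    """Schedule extraction in the background and reply without waiting."""
    task = asyncio.create_task(extract_facts(user_text, store))
    _background_tasks.add(task)
    task.add_done_callback(_on_done)
    return f"You said: {user_text}"  # placeholder for the real reply


async def drain_background():
    """Wait for pending extractions, for example on shutdown."""
    if _background_tasks:
        await asyncio.gather(*list(_background_tasks), return_exceptions=True)
```

The trade-off in this sketch is that a failed extraction never blocks or breaks the conversation; it only gets logged, so a fact can occasionally be missed.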

I also used Claude for cross-checking integration points — particularly verifying the LangChain tool interface when I added web search. Both tools, different jobs.
