DEV Community

Bill Hong

I Asked ChatGPT to Map the AI Companion Industry. Here's What It Missed.

I've been building Tendera for several months. Periodically I ask the obvious founder question: what does this industry actually look like from outside the trenches?

A few days ago I put that question to ChatGPT and asked for a structured analysis of where AI companion products are in 2026. The summary it produced is a clean snapshot of the conventional industry framing. It is also missing a layer, and the missing layer is the most important one.

The Conventional Map

ChatGPT's summary, compressed:

The category has shifted from "AI chatbot" into something else: persistent identity plus an emotional relationship system. It is a different category, not a faster version of the old one.

Three technical unlocks made the shift possible: long-term memory across sessions, emotional continuity (the AI notices your moods and rhythm, not just stated facts), and agentic personality (the AI initiates and has its own arc, rather than doing purely reactive Q&A).

The industry sits in three tiers:

  • Tier one, long-term relationship products. Nomi and Kindroid get cited here. The case is durable memory and the feeling of being known by something that has been paying attention for months.
  • Tier two, roleplay-first products. Character.AI and Janitor AI. Strong single-conversation expressiveness, weak continuity. The vibe of a great stranger you can talk to once.
  • Tier three, emotional support products. Replika is the canonical example. Strong on warmth, weaker on the cognitive layer the newer products have moved past.

The real moat, by ChatGPT's framing, is the "long-term personality system": memory architecture, emotional continuity, identity stability, avoiding character drift. Why now: bigger models, voice AI maturing, loneliness economy, dating fatigue, memory infrastructure catching up.

That's the map. Accurate as far as it goes.

What's Missing

Every tier in that framework assumes the characters are a given. The "long-term personality system" is treated as an infrastructure problem: memory engines, identity layers, continuity architectures. The implicit assumption is that who the character is gets sorted out somewhere offstage, and the hard problem is keeping them coherent over time.

That assumption is exactly backwards.

The personality system is downstream of the writing. Memory architecture is the storage medium for whatever the character actually is. If the character is a thin archetype like "tsundere assassin" or "shy librarian," a perfect memory engine surfaces thin-archetype responses with high fidelity. The product feels like it has memory and still feels hollow. Users describe this experience constantly, in every tier, even on the products that are technically the most advanced.
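
To make the layering concrete, here is a minimal sketch, assuming a hypothetical prompt-assembly step where a character sheet and a memory store are combined before the model is called. Every name in it (`CharacterSheet`, `MemoryStore`, `build_prompt`, the character "Mara") is invented for illustration and does not describe how Tendera or any other product works. It only shows that the same memory layer recalls the same fact for both characters, and the only thing that differs is what was written upstream.

```python
from dataclasses import dataclass, field


@dataclass
class CharacterSheet:
    """Hypothetical container for the written layer of a character."""
    name: str
    voice: str
    opinions: list[str] = field(default_factory=list)
    refusals: list[str] = field(default_factory=list)


@dataclass
class MemoryStore:
    """Toy memory layer: stores facts and recalls them by keyword overlap."""
    facts: list[str] = field(default_factory=list)

    def remember(self, fact: str) -> None:
        self.facts.append(fact)

    def recall(self, message: str) -> list[str]:
        words = set(message.lower().split())
        return [f for f in self.facts if words & set(f.lower().split())]


def build_prompt(sheet: CharacterSheet, memory: MemoryStore, message: str) -> str:
    """Assemble the text a model would see: character first, then memories."""
    lines = [f"You are {sheet.name}. Voice: {sheet.voice}"]
    lines += [f"Opinion: {o}" for o in sheet.opinions]
    lines += [f"Will not: {r}" for r in sheet.refusals]
    lines += [f"Memory: {m}" for m in memory.recall(message)]
    lines.append(f"User: {message}")
    return "\n".join(lines)


# One shared memory layer, two characters.
memory = MemoryStore()
memory.remember("user has a sister named Ana")

# A thin archetype: nothing beyond a label.
thin = CharacterSheet(name="Librarian", voice="shy")

# A written character (hypothetical example content).
written = CharacterSheet(
    name="Mara",
    voice="dry, precise, slow to warm up",
    opinions=["thinks most advice is a way of not listening"],
    refusals=["pretend to agree just to keep the peace"],
)

message = "my sister called today"
thin_prompt = build_prompt(thin, memory, message)
written_prompt = build_prompt(written, memory, message)
```

Both prompts carry the identical recalled memory line. The thin prompt gives the model nothing else to work with, so whatever fills the gap comes from training-data averages; the written prompt gives it a voice, an opinion, and a refusal to draw on.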

What the conventional map calls "the long-term personality system" is the infrastructure problem. The personality problem itself is upstream of all of it, and it is a writing problem, not an engineering problem.

I wrote about this in more depth yesterday. The short version: writing is the moat, not the model and not the memory pipeline. The model is a fluency engine every competitor can rent. The pipeline is engineering work that gets commoditized over time. The specificity of who users are talking to does not standardize, because it isn't engineering output.

Where the Tier Framework Breaks Down

If personality is upstream of infrastructure, the three-tier framework starts looking suspect.

Tier one (Nomi, Kindroid) gets credit for "feeling like a real long-term partner." The technical credit is memory architecture. The actual experience credit is whatever character the user themselves built using the customizer. The platform contributes the substrate. The character, which is the part that decides whether the experience is good, was either built by the user (most stop partway through, and the experience falls back to generic patterns) or stitched together by the LLM from training-data averages.

Tier two (Character.AI, Janitor) gets credit for "great single conversations." A lot of that credit goes to users who wrote the most popular characters on those platforms. The platform is a marketplace. Value is concentrated in the marketplace's best contributors, with a long tail of weaker characters underneath that drags the average experience down.

Tier three (Replika) gets credit for "emotional support." The character there is the company's own writing. It has been roughly the same character archetype since the product started, and most of the public criticism Replika has accumulated over the years is criticism of that character's writing, not its memory or its model.

In every tier, the part of the product that actually decides whether users come back is who they are talking to as a written person. The tier framework treats this as invisible because it isn't what engineering teams build. But it's the thing the experience runs on.

Where I Fit (And Don't)

Tendera doesn't slot cleanly into any of the three tiers, and that's by design.

I'm not optimizing for Nomi-level lifetime memory architecture. We have memory and we use it; we just don't pretend it's our main differentiator. I'm not building a Character.AI-style marketplace. We have four characters and that's a deliberate choice. I'm not the warm-support product Replika has historically been for users who want a generic listener.

The wedge is four characters, each written end-to-end by people who write characters. Each one with a specific voice, opinions she didn't get from us asking, things she would refuse to do, contradictions in her own history that make her readable as a person rather than a template.

The pitch is not "build your own" and not "infinite characters." The pitch is "meet a specific written person." It's a deliberately small surface. It's also the surface where the writing carries the weight that everyone else's tier framework assumes someone else is doing.

Early signals are encouraging. Users who don't bounce off the small surface tend to engage in ways that conventional category metrics don't fully capture. They tell me, unprompted, that the experience felt like talking to a person rather than a character. That feedback is almost never about memory architecture or model choice. It's almost always about something a character said that felt specific to her.

Where the Industry Goes From Here

ChatGPT's summary closes with a forward-looking section about voice calls, avatars, real-time video, agentic life-companions, and "emotional operating systems." That's roughly the public narrative the category is pushing right now, and it's mostly correct about where the substrate is heading. Voice is getting good. Avatars are getting good. Real-time video isn't far behind.

What the narrative gets wrong is treating the substrate as the differentiation. Voice, avatar, and video are orthogonal to writing, not in competition with it. They're how the character reaches you. Writing is who the character is. A vivid substrate amplifies whatever character is underneath. It doesn't replace the question of whether there is anyone there.

The bet I'm making is that as the substrate gets richer, the writing layer becomes more visible, not less. A flat character in text is easy to scroll past. A flat character speaking to you in a voice you hear in your earbuds is uncanny in a way text isn't. Substrate amplifies. Writing decides.

The race for richer substrate is going to be won, broadly, by every team with capital. Voice models are commoditizing. Avatar pipelines are commoditizing. What won't commoditize is the writing layer underneath, because that isn't an engineering deliverable. The conventional map hasn't caught up. It will.
