How We Turn WhatsApp Chats Into Illustrated Comics (And Why It's Harder Than It Sounds)
When people hear "upload your WhatsApp chat, get a comic back", they usually imagine something like a simple screenshot with speech bubbles slapped on top. The reality of what Chat Comics actually does is considerably weirder — and, we think, considerably more interesting.
This is a behind-the-scenes look at how Chat Comics works: the AI pipeline, the design decisions, the rabbit holes we fell down, and the bits that still surprise us.
The Problem: A Chat Export Is Just a Text File
A WhatsApp chat export looks like this:
01/03/2025, 14:22 - Jamie: did anyone watch that documentary last night
01/03/2025, 14:23 - Sarah: which one
01/03/2025, 14:23 - Jamie: the one about the guy who thought he was a dog
01/03/2025, 14:24 - Marcus: 💀💀💀
01/03/2025, 14:25 - Sarah: JAMIE WHY ARE YOU WATCHING THAT
01/03/2025, 14:31 - Jamie: it was on recommended ok
That's it. Timestamps, names, raw text, the occasional emoji. No tone. No stage directions. No sense of who these people are or what their dynamic is.
Turning that into a visually compelling, narratively coherent illustrated comic requires solving a surprisingly deep stack of problems.
Step 1: Parsing and Cleaning
WhatsApp export formats vary by platform (iOS vs Android), locale, and version. Date formats differ. System messages ("Jamie added Sarah", "Messages are end-to-end encrypted") need stripping. Deleted messages, media attachments, and multi-line messages all need handling gracefully.
Before anything creative happens, the raw export goes through a normalisation pass that produces a clean, structured conversation object: participants, timestamps, message text, and metadata.
Privacy isn't an afterthought — it's baked into the pipeline from step one.
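As a rough illustration of the normalisation pass, here is a minimal Python sketch that parses only the sample format shown earlier. The regular expression, the day-first date assumption, the `Message` class, and the `parse_export` function are all hypothetical names and choices made for this example, not the production parser, which has to cope with many more formats.

```python
import re
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# Matches lines like "01/03/2025, 14:22 - Jamie: text".
# Assumption: this covers only the sample layout; other platforms,
# locales, and versions would need their own patterns.
LINE_RE = re.compile(r"^(\d{2}/\d{2}/\d{4}), (\d{2}:\d{2}) - (.*)$")


@dataclass
class Message:
    timestamp: datetime
    sender: str
    text: str


def parse_export(raw: str) -> list[Message]:
    messages: list[Message] = []
    current: Optional[Message] = None
    for line in raw.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            # No timestamp prefix: continuation of a multi-line message.
            if current is not None:
                current.text += "\n" + line
            continue
        date_part, time_part, rest = match.groups()
        sender, sep, text = rest.partition(": ")
        if not sep:
            # Timestamped line with no "Name: " part is treated as a
            # system message ("Jamie added Sarah") and dropped.
            current = None
            continue
        # Assumption: dates are day-first (01/03/2025 is 1 March).
        timestamp = datetime.strptime(f"{date_part} {time_part}", "%d/%m/%Y %H:%M")
        current = Message(timestamp, sender, text)
        messages.append(current)
    return messages
```

A system message that happens to contain ": " would slip through this heuristic, which is one reason a real parser needs an explicit list of known system strings per locale.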
Step 2: Narrative Extraction with Claude
Here's where things get interesting. We don't just want to illustrate the chat verbatim — we want to turn it into a story.
We pass the cleaned conversation to Claude with a detailed prompt that asks it to:
- Identify the central tension or theme of the conversation
- Establish character archetypes for each participant (the chaos gremlin, the voice of reason, the silent observer, etc.)
- Extract key dramatic beats — the moment things escalated, the punchline, the resolution
- Write comic panel descriptions with narration, dialogue, and scene context
- Assign emotional states to characters at each beat
The mood the user selects (funny, dramatic, roast, horror, romance, detective — there are 12 total) heavily influences this extraction. The same chat about who forgot to buy milk reads very differently as a wholesome slice-of-life versus a psychological horror story.
This is the most creative and unpredictable part of the pipeline. Claude regularly finds angles in chats that are genuinely funnier or more poignant than we'd have anticipated. It's also occasionally completely wrong about who the villain is, which is its own kind of entertainment.
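To make the extraction step more concrete, the sketch below shows one way the prompt and the structured response could fit together. It is an assumption-heavy illustration: the prompt wording, the JSON shape, the `Panel` class, and both function names are invented here, and the actual model call is left out entirely.

```python
import json
from dataclasses import dataclass


@dataclass
class Panel:
    narration: str
    scene: str
    # Each entry is assumed to look like:
    # {"speaker": "Jamie", "line": "...", "emotion": "smug"}
    dialogue: list[dict]


def build_extraction_prompt(transcript: str, mood: str) -> str:
    """Assemble the instructions listed above into a single prompt string."""
    return (
        f"You are adapting a group chat into a {mood} comic.\n"
        "1. Identify the central tension or theme of the conversation.\n"
        "2. Assign a character archetype to each participant.\n"
        "3. Extract the key dramatic beats.\n"
        "4. Write panel descriptions with narration, dialogue, and scene context.\n"
        "5. Assign an emotional state to each character at each beat.\n"
        "Respond with JSON only, using the keys "
        '"theme", "archetypes", and "panels".\n\n'
        f"Chat:\n{transcript}"
    )


def parse_panels(response_text: str) -> list[Panel]:
    """Turn the model's JSON reply into Panel objects, tolerating missing fields."""
    data = json.loads(response_text)
    return [
        Panel(
            narration=p.get("narration", ""),
            scene=p.get("scene", ""),
            dialogue=p.get("dialogue", []),
        )
        for p in data["panels"]
    ]
```

Asking for JSON keeps the later steps deterministic: character design and illustration can read `emotion` and `scene` fields directly instead of re-interpreting free text.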
Step 3: Character Design
Every participant gets a consistent visual identity — an avatar that appears across all panels with appropriate facial expressions.
Character design is generated from the personality signals Claude extracted in step 2, combined with any name-based cues. The goal is caricature, not portraiture: exaggerated features, bold design, immediately readable personality.
Expression variants (happy, shocked, suspicious, smug, horrified, etc.) are generated for each character upfront, so we can pick the right emotional beat for each panel without inconsistency. Keeping a character visually coherent across multiple generated images is one of the harder unsolved problems in this space — our current approach trades some flexibility for consistency.
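The "generate variants upfront, then look them up per panel" idea can be sketched as a small character sheet. Everything here is illustrative: the `CharacterSheet` class, the placeholder file names standing in for generated images, and the `neutral` fallback expression are assumptions rather than details of the real system.

```python
from dataclasses import dataclass, field

# Expression names taken from the list above, plus an assumed "neutral" fallback.
EXPRESSIONS = ("neutral", "happy", "shocked", "suspicious", "smug", "horrified")


@dataclass
class CharacterSheet:
    name: str
    archetype: str
    # Maps expression name to a reference to a pre-generated image.
    variants: dict[str, str] = field(default_factory=dict)


def build_sheet(name: str, archetype: str) -> CharacterSheet:
    """Create every expression variant once, before any panel is drawn."""
    sheet = CharacterSheet(name, archetype)
    for expression in EXPRESSIONS:
        # Placeholder path; a real pipeline would call an image model here.
        sheet.variants[expression] = f"{name.lower()}_{expression}.png"
    return sheet


def pick_variant(sheet: CharacterSheet, emotion: str) -> str:
    """Reuse a pre-generated variant; unknown emotions fall back to neutral."""
    return sheet.variants.get(emotion, sheet.variants["neutral"])
```

The fallback is where the flexibility trade-off shows up: an emotion outside the fixed set gets the nearest available face instead of a freshly generated, possibly off-model one.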
Step 4: Scene and Panel Illustration
Each narrative beat gets a background scene. A kitchen argument. A 3am spiral. A group call where nobody has turned their camera on. A passive-aggressive reaction to being left on read.
Scenes are generated to match the mood and context of the panel, with characters composited in. The visual style stays consistent throughout a comic — we set a style token at the start of generation and it propagates through every image in that run.
Panel layout, speech bubbles, narration boxes, and typography are assembled in the final composition step. This is more layout engineering than AI — making sure text fits, panels breathe, and the reading order feels natural.
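As a small example of the "making sure text fits" part, the sketch below wraps dialogue into a bubble with a fixed line budget using Python's standard `textwrap` module. Measuring by character count is a simplifying assumption; a real layout engine would measure rendered pixel widths for the chosen font. The function name and parameters are hypothetical.

```python
import textwrap


def fit_bubble_text(text: str, max_chars_per_line: int, max_lines: int) -> list[str]:
    """Wrap text for a speech bubble, truncating with "..." if it overflows."""
    return textwrap.wrap(
        text,
        width=max_chars_per_line,
        max_lines=max_lines,
        placeholder="...",
    )
```

When truncation happens often, that is usually a signal to split the line across two panels rather than shrink the font.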
The 12 Moods Problem
Supporting 12 wildly different genre moods with a single pipeline is genuinely strange to build. The same conversation needs to produce:
- Funny: exaggerated reactions, comedic timing, rimshot energy
- Dark: same events, but now there's dread. The ellipsis before a reply is terrifying.
- Detective: dry narration, establishing shots, mystery energy
- Love: soft lighting, meaningful glances, everything is a meet-cute
- Epic: the group chat gets the blockbuster treatment
Each mood has its own prompt modifiers, visual style parameters, and narrative framing instructions. Getting them to feel genuinely distinct — rather than just "same comic with different filter" — took a lot of iteration.
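One way to picture per-mood configuration is a table of profiles that every stage reads from. The sketch below is hypothetical: the `MoodProfile` fields, the three example entries, and their wording are made up for illustration, and it also shows a single style token being applied to every image prompt in a run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MoodProfile:
    prompt_modifier: str   # added to the narrative extraction prompt
    style_token: str       # appended to every image prompt in a run
    narration_voice: str   # framing instruction for narration boxes


# Illustrative values only; three of the twelve moods.
MOODS = {
    "funny": MoodProfile(
        "exaggerate reactions and build to a punchline",
        "bright cartoon style, bold outlines",
        "playful",
    ),
    "dark": MoodProfile(
        "treat every pause and ellipsis as a threat",
        "high-contrast ink, heavy shadows",
        "ominous",
    ),
    "detective": MoodProfile(
        "frame events as a case to be solved",
        "noir palette, rain-slicked establishing shots",
        "dry and world-weary",
    ),
}


def image_prompts(mood: str, scenes: list[str]) -> list[str]:
    """Apply one style token to every scene so the whole run stays consistent."""
    style = MOODS[mood].style_token
    return [f"{scene}, {style}" for scene in scenes]
```

Keeping the three kinds of setting together per mood makes it easier to check that a mood differs in narrative framing and not only in visual style.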
Dark mode remains our personal favourite. There is something deeply unsettling about a mundane chat about weekend plans rendered with an edge of dread.
Privacy Architecture
This is worth being explicit about, because it shaped a lot of decisions.
Chat exports contain real conversations between real people — most of whom never opted into having their messages processed by an AI. That's a serious responsibility.
Our approach:
- No persistent storage of chat content. Uploads are processed in memory and deleted immediately after the comic is generated.
- No training on user data. Chat content is never used to fine-tune or improve models.
- Minimal metadata retention. We keep enough to deliver the comic and handle support requests, nothing more.
- Sign in with Google to get started — your first comic is free.
These constraints occasionally made engineering harder. They were always worth it.
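To give a feel for what "processed in memory and deleted" can mean in code, here is a minimal sketch that never writes the upload to disk and clears the buffer once generation finishes, even if it fails. The function name and the `generate` callback are assumptions for this example. It is not a complete guarantee either: the decoded string still exists until Python's garbage collector reclaims it.

```python
from typing import Callable


def process_in_memory(upload: bytearray, generate: Callable[[str], bytes]) -> bytes:
    """Run the pipeline on an in-memory upload, then zero the buffer."""
    try:
        transcript = upload.decode("utf-8")
        return generate(transcript)
    finally:
        # Overwrite the raw upload whether generation succeeded or raised.
        upload[:] = b"\x00" * len(upload)
```

Putting the cleanup in `finally` means an exception in the middle of the pipeline cannot leave the raw chat sitting in the buffer.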
What Still Surprises Us
After building this, what still stands out is how good AI is at reading group dynamics. It picks up on who defers to whom, who escalates, who defuses. It notices callback humour across a long thread. It understands that the person who sends a single full stop is ominous.
It's also confidently wrong in ways that are genuinely funny. It occasionally assigns villain status to someone who was clearly joking, or reads a chaotic voice-note thread as a deeply philosophical meditation on existence.
Both outcomes make for good comics.
Try It
If you want to see what your group chat looks like as a comic, Chat Comics is free for your first comic. No account needed. Your data's gone the moment your comic is generated.
We'd love to hear what you think — and yes, horror mode is worth trying at least once.