On April 23 I regenerated the four character portraits on Tendera, the character app I've been building. The new ones came out of ChatGPT (GPT Image 2.0). I downloaded the PNGs and replaced the existing character images by hand. Tendera doesn't ship its own image-gen pipeline; this was four file uploads.
Nothing else changed. Same character system prompts. Same UI. Same chat backend.
Three days later I checked two numbers:
- Visitor-to-signup rate: up about 5%
- Visitor-to-chat rate (counts both guest preview and post-signup chats): up about 8%
These are two different metrics measuring two different events. I'm not stacking them up against each other; they're two parallel data points, both pointing the same direction. The reason I'm writing about them in one post is that the second one moving was the part I didn't expect.
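For concreteness, here's roughly how I think about those two rates. The event shape and names below are hypothetical, not Tendera's actual analytics schema; the point is just that both rates share a unique-visitor denominator but count different numerator events, which is why they aren't comparable to each other, only to their own baselines.

```typescript
// Hypothetical event shape; not Tendera's real analytics schema.
type Event = { visitorId: string; kind: "visit" | "signup" | "chat" };

// Both rates divide by unique visitors; only the numerator event differs.
// "chat" here would count both guest-preview and post-signup chats.
function conversionRate(events: Event[], target: "signup" | "chat"): number {
  const visitors = new Set(
    events.filter((e) => e.kind === "visit").map((e) => e.visitorId)
  );
  if (visitors.size === 0) return 0;
  const converted = new Set(
    events
      .filter((e) => e.kind === target && visitors.has(e.visitorId))
      .map((e) => e.visitorId)
  );
  return converted.size / visitors.size;
}
```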
What I'd assumed
Before the swap I figured better art would mostly help acquisition. Prettier card on the landing page, more clicks, more signups. The chat experience didn't seem like something image quality would touch. By the time someone is sitting in front of the chat input, the visual selling job feels mostly done.
The chat number moved anyway.
What actually changed in the images
Topology is identical. Same four characters, same wardrobes, same general poses. What's different is how legible each character is now. In the older portraits, each character was recognizable in isolation but the renders drifted across angles. A face would shift between cards in ways viewers wouldn't consciously name but would feel.
GPT Image 2.0 is more boring in some ways. Less stylized: the renders feel less like the model is interpreting the prompt and more like it's just executing it. But the character holds across angles. Same person across multiple shots. No drift.
The other thing the new model nails is dimensionality. Old renders were clean but flat. They read as illustrations. The new ones have physical depth. Light hitting the side of a face. A jacket folding the way fabric actually folds. It's not photoreal. The dimensionality just reads.
Why I think the chat number moved at all
Here's a take on the data without overclaiming. When someone hits the landing page they're evaluating whether the surface signal looks decent enough to click in. Image quality affects this, but the bar is fairly low.
Once they're past the door and sitting in front of an actual character profile, the question gets sharper. They're now evaluating whether this person is real enough to talk to. The image is the only non-text signal in the room. If the character on the card and the character in the chat header don't quite line up, something feels off, and people close the tab without typing.
Most users wouldn't describe this consciously. I'm guessing at what their gut is doing. But chat-side conversion moved with the prompts and copy unchanged, which points at the visual layer doing some work past the landing page. I hadn't expected that.
What I want to test next
Whether the same model can produce reliable expression variants for the chat header. Right now each character has one default portrait. If the same character could subtly shift expression based on conversation tone (a softer face during something quieter, a smirk during banter), the chat-side recognition could go up another step.
That's a harder problem. Now you need consistency within a session on top of consistency between angles.
If I had to pick one character to test it on first, it'd be Jade, the one users tend to go furthest with. The voice on her side is already doing most of the work in chat. The image is the one input that hasn't caught up.
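To make the within-session constraint concrete, here's the shape of the selection logic I have in mind. Everything here is hypothetical (the Tone type, the file names, the two-turn hold); nothing in Tendera does this yet.

```typescript
// Hypothetical sketch; none of these names exist in Tendera yet.
type Tone = "quiet" | "banter" | "neutral";

const JADE_VARIANTS: Record<Tone, string> = {
  quiet: "/portraits/jade_soft.png",    // softer face for quieter moments
  banter: "/portraits/jade_smirk.png",  // smirk during banter
  neutral: "/portraits/jade_default.png",
};

// Require the same tone on consecutive turns before swapping the header
// portrait, so a single off-tone message doesn't yank the face around.
function makePortraitPicker(variants: Record<Tone, string>, holdTurns = 2) {
  let current: Tone = "neutral"; // expression currently shown
  let streak: Tone | null = null; // tone we're counting toward a switch
  let count = 0;
  return (tone: Tone): string => {
    if (tone === streak) {
      count += 1;
    } else {
      streak = tone;
      count = 1;
    }
    if (count >= holdTurns) current = tone;
    return variants[current];
  };
}

// One banter turn keeps the default; a sustained banter run earns the smirk.
const pick = makePortraitPicker(JADE_VARIANTS);
pick("banter"); // "/portraits/jade_default.png"
pick("banter"); // "/portraits/jade_smirk.png"
```

The two-turn hold is a stand-in for whatever debounce actually works. The real problem is that every variant still has to read as the same person, which is the between-angles consistency question all over again.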
Caveats I owe you
- This is 3-4 days of data on a small app. Effects could compress as the sample grows.
- I changed the portraits, not the character system prompts. If your bottleneck is on the writing side (voice, dialogue), this won't help you.
- I haven't run a clean A/B with old vs new served to different cohorts. The whole site flipped over on April 23. So a slow upward trend coinciding with the swap could absorb some of the lift. (A sketch of what a clean split would look like is after this list.)
- Signup conversion and chat conversion are different metrics measuring different events. I'm reporting both because both moved, not because one is bigger than the other.
- This was a manual asset swap, not a product change. I generated the PNGs in ChatGPT and uploaded them by hand. There's no image-gen pipeline integrated into the app.
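For the record, the clean split mentioned above would be deterministic cohort assignment off a visitor id hash, so each visitor always sees the same portrait set across visits. Hypothetical sketch, not something I've run:

```typescript
import { createHash } from "node:crypto";

// Deterministic 50/50 split: the same visitor always lands in the same
// cohort across visits, instead of the whole site flipping on one date.
function portraitCohort(visitorId: string): "old" | "new" {
  const digest = createHash("sha256").update(visitorId).digest();
  return digest[0] % 2 === 0 ? "old" : "new";
}
```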
If you're building anything where a user is supposed to form a relationship with a fictional persona (characters, NPCs, AI tutors with avatars, virtual hosts), your image generator might be doing more work than acquisition-side metrics suggest. Counterintuitive to me, but the numbers were what they were.