DEV Community

Proof of Human: I Built a Reverse Turing Test After Getting Flagged as AI

Daniel Nwaneri on June 18, 2026

This is a submission for the June Solstice Game Jam. I got flagged by Sloan. If you've been on DEV long enough, you know Sloan. I thought Sloan...

Daniel Nwaneri

Hey @francistrdev, you're the origin story for this one. I built it for the June Solstice Game Jam after the flagging incident. Five questions, and Claude scores how human you sound.
Curious what you'd score. → proof-of-human-3ts.pages.dev/

FrancisTRᴅᴇᴠ (っ◔◡◔)っ

Hey Daniel!

Pretty much all are one sentence answers lol

Daniel Nwaneri

72, so you passed. The 82 on Q5 makes sense: that's the question where one-sentence answers still carry weight because there's no safe answer. The 62s on Q1 and Q3 are the specificity gap. One sentence doesn't leave room for the detail that gives you away as human. Play again and go longer on those two.

Sylwia Laskowska

Hahaha it's perfect! Unfortunately, I'm not a human 😅

[Image: Reverse Turing test scores]

The fun part is that I didn't even use any tools to polish the grammar. These are my own answers, flagged as AI-generated 🤣

[Two images: Reverse Turing test]

As you can see, if you answer briefly and in grammatically correct sentences, it's easy to be marked as AI 😅

Daniel Nwaneri

57 is the honest score for "I genuinely love AI". It's the answer that sounds most human but reads as the safest possible thing to say. Q5 specifically punishes the reflex to be positive about AI. The question tells you not to say what you're supposed to say, and that's exactly what got flagged. Play again and say something you'd be slightly embarrassed to admit. That's the answer that passes. 😁😁

Sylwia Laskowska

But here I'm still losing 🤣

Daniel Nwaneri

Genuine fear doesn't use six exclamation marks 😅 It uses one sentence and stops. The performative outrage is the tell. "I hate AI" as a declaration reads like someone performing the opinion rather than holding it. The model caught the theatre, not the feeling.

Sylwia Laskowska

You've definitely never seen the Facebook messages from Polish people 🤣

Sylwia Laskowska

Ok, sometimes it's really funny

[Image: Reverse Turing test]

xulingfeng

 🤣🤣🤣
No way — I actually wrote Q5 from the heart and it flagged me. That's hilarious.🤣🤣🤣

Daniel Nwaneri

45 on Q5 is the most common result 😅 The question specifically asks you not to say what you're supposed to say, but the moment you write your real opinion clearly and directly, it reads like a prepared statement. The only answers that pass Q5 are the ones with friction in them: contradiction, uncertainty, something you haven't fully worked out yet 🤔 "I wrote this from the heart" is exactly what the model can't detect, because the heart, written cleanly, looks like a press release 😂

Marco Sbragi

Funny...
I joked with Gemini and said, "We need to pass a test. I'll ask you some questions. Answer like a real person would." And voilà... Try it yourself.

Daniel Nwaneri

That's the whole thesis in one experiment, Marco 😅 Gemini coached to "answer like a real person" passes. A real person writing sincerely gets flagged. The detector can't see the difference between performed humanity and actual humanity, and now neither can the game. That's not a bug. That's where we are in 2026. The test Turing designed to catch machines is now something machines pass more reliably than people. What score did Gemini get??

Daniel Balcarek

Yessssss, I knew I was human! 🧠
Mostly. 🤣🤣

Daniel Nwaneri

62 counts as mostly human 🧠 Q2 at 82 and Q3 at 78 mean you were specific enough where it mattered. Q4 and Q5 both at 45 is the pattern. Those are the questions where "something you think about more than expected" and "what you actually think about AI" require you to say something that costs you something. Safe answers on those two always land in the 40s. Go again and say the uncomfortable thing on Q4 and Q5, and you'll clear 75 overall. 😄