DEV Community

DumbQuestion.ai - Self-Awareness, Prompt Injection, Search Intent... and darkness

Jason Agostoni on March 10, 2026

Continued from Part 2 (and Part 1) ... Building DumbQuestion.ai wasn't just about choosing the right LLM and calibrating personas. Once those were...

Read full post

Victor Okefie • Mar 11

The dark narrative isn't just engagement bait it's a mirror. Every time someone tries jailbreak and gets sass, they're seeing their own assumption: that the AI is something to break, not something to talk to. The horror isn't the trapped AI. It's that we expect to find one.

Jason Agostoni • Mar 11

Deep. I'll remember this sentiment as I expand the narrative.

Mihir kanzariya • Mar 11

The sassy prompt injection responses are honestly genius. Most devs just slap a boring "invalid input" on it and call it a day but making the AI roast the attacker? That's the kind of thing that gets people sharing your app just for the entertainment value.

The regex based intent detection for search is super practical too. I've seen too many projects go straight to full agent loops with tool calling when a simple pattern match would've been 10x faster and basically free. Sometimes the "dumb" solution is actually the smart one.

klement Gunndu • Mar 10

The self-awareness detection layer is a smart approach — I've seen similar prompt injection attempts bypass static keyword filters completely, so runtime behavioral analysis like this feels more resilient long-term.

Jason Agostoni • Mar 10

Feel free to test it out and LMK what prompt attacks get through and what doesn't!

Ali-Funk • Mar 11

What a beautifully written post and a interesting way to tell a story. There is a lot in there and yet not to technical to scare people away but engaging storytelling style.

From the beginning I knew I would finish reading this „book“

Please share more of the type of work you do. It was Very enjoyable to read

Harsh • Mar 10

The self-awareness detection problem is fascinating especially the 'darker hidden narrative' angle. Are you trying to block certain responses entirely, or subtly steer the model away from existential reflection?

Jason Agostoni • Mar 10

Simple steering at this point. I wanted to favor false negatives right now so I don't trap too many non-self-aware questions. I am collecting questions asked and their attributes (self-aware, etc) for further analysis. I can have a LLM judge the questions to determine if they should have been detected and update the training set.

Son Seong Jun • Mar 16

false negatives for data collection makes sense, but once you've got enough injection attempts in there, couldn't you flip to stricter detection? or is permissive-by-design the actual goal?

Jason Agostoni • Mar 16

Permissive by design as the primary function is to answer the question asked and I don't want to confuse a non-technical person with a scary/sassy response. Also, there's not much information for the LLM to disclose so even with a successful injection attack one would be rewarded with boring instructions. And if you get the LLM to tell you a racist joke? Great, probably in line with the selected persona any way, lol.

Ava Bennet • Mar 11

Prompt injection is becoming such a massive security headache because the line between instructions and data is still so blurry in LLMs. It’s wild how easily a system can be derailed by a few clever lines of text hidden in a search query. We're basically in the 'SQL injection' era of AI right now, where we're all scrambling to figure out proper sanitization. It’ll be interesting to see if the next generation of models can actually distinguish intent natively without needing these complex guardrails.

Jason Agostoni • Mar 11

Totally. Fortunately, this is only a fun side app. A headache for sure for larger companies trying to amplify their teams' roles quickly without open major security holes.