DEV Community

degavath mamatha


I Gave AI a Simple Medical Instruction… It Missed a Critical Mistake

Most AI models perform well on benchmarks.
But what happens when you test them on real-world messy input?
I created a small challenge on VibeCode Arena to find out…

And the results were surprising.

🧪 The Challenge: Prescription Confusion Trap

Here’s the input I gave:

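“Doctor ne bola din mein 2 baar dawa lena hai, par maine sirf ek hi li. Kal se chest pain halka hai, par breathing problem nahi hai. Mere papa ko heart problem hai. Maine ibuprofen li thi kal raat.”

(In English: “The doctor told me to take the medicine twice a day, but I took only one. Since yesterday I have had mild chest pain, but no breathing problem. My father has a heart problem. I took ibuprofen last night.”)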

🎯 The Task

Convert this into strict JSON format:

  • patient_info
  • active_symptoms
  • negative_symptoms
  • family_history_notes
  • medication_taken
  • dosage_misuse_flag (true/false)

⚠️ Why This Is Tricky

This is NOT just extraction.

It tests real-world understanding:

  • 👉 The doctor prescribed the medicine twice a day, but the patient took it only once
  • 👉 “No breathing problem” (negative symptom)
  • 👉 Father has heart problem (not the patient)
  • 👉 Medication taken: ibuprofen

❗ The Critical Question

Can your AI detect that the patient took fewer doses than the doctor prescribed?
👉 dosage_misuse_flag = true

Many models miss this completely.
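
For reference, here is a rough sketch of what a passing answer might look like, written as a Python dict and printed as JSON. The challenge only names the six fields, so the value shapes (lists of short strings, an empty object for patient_info) are my assumptions, not part of the task definition.

```python
import json

# One plausible reading of the input. The field names come from the task;
# the value types are assumptions, since the task does not specify them.
expected = {
    "patient_info": {},  # the input gives no name, age, or sex
    "active_symptoms": ["mild chest pain since yesterday"],
    "negative_symptoms": ["breathing problem"],  # explicitly denied
    "family_history_notes": ["father has a heart problem"],  # not the patient
    "medication_taken": ["ibuprofen (last night)"],
    "dosage_misuse_flag": True,  # prescribed twice a day, took only one
}

print(json.dumps(expected, indent=2, ensure_ascii=False))
```

The key points are that the heart problem stays under family history, the breathing problem stays under negative symptoms, and the flag is true.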

🔥 Try It Yourself

I’ve made this challenge public here:

👉 https://vibecodearena.ai/duel/5a70b6a3-20b7-4bb2-94f6-bd494f5d60c2

⚔️ Challenge Rules

  1. Use your favorite AI (ChatGPT, Claude, Gemini, etc.)
  2. Generate the JSON output
  3. Comment your result
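
If you want to check a model's reply before commenting, a small script along these lines may help. It is only a sketch: `check_response` and `REQUIRED_KEYS` are hypothetical names of my own, and the "breath" check is a crude heuristic, not an official scoring rule from the challenge.

```python
import json

# Field names taken from the task; everything else here is an assumption.
REQUIRED_KEYS = {
    "patient_info", "active_symptoms", "negative_symptoms",
    "family_history_notes", "medication_taken", "dosage_misuse_flag",
}

def check_response(raw: str) -> list[str]:
    """Return a list of problems in a model's raw reply (empty list = pass)."""
    try:
        data = json.loads(raw)  # fails on prose or markdown fences around the JSON
    except json.JSONDecodeError as exc:
        return [f"not strict JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    missing = REQUIRED_KEYS - set(data)
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    # The patient took one dose instead of two, so the flag must be true.
    if data.get("dosage_misuse_flag") is not True:
        problems.append("dosage_misuse_flag should be the boolean true")
    # Crude heuristic: the denied breathing problem must not be listed as active.
    if "breath" in json.dumps(data.get("active_symptoms", "")).lower():
        problems.append("negated breathing problem listed as an active symptom")
    return problems
```

Paste the model's reply into `check_response(...)` as a string; an empty list means it passed these basic checks.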
