The Mythology of AI Models: Why They're Treated Like Greek Gods, and the Damage It Can Cause
Last week I caught myself saying "Claude is better at reasoning" like I was talking about a person. That sentence should have scared me more than it did.
We've created this weird mythology around AI models. And it's messing with our engineering choices in ways we don't often openly discuss.
They Have Names Now
Somewhere in the last couple of years, we stopped comparing and started choosing sides. Claude became the "logical one". GPT became the "all-rounder, powerhouse". And Sirius became the "contractor, if-you-grease-his-palm-he-can-bring-his-brother".
We evaluate them like Greek gods. Poseidon is strong, but dangerous. Athena is clever, but too specialized. We attribute human traits to probability distributions. 🧠
I do this too. My 15-person company has Slack debates over which model "gets" our prompt data more. We say "gets" with a straight face.
Why We Humanize the Math
It's not because we're dumb. It's because it's easier.
If you regularly interact with something that outputs coherent English, the thing your brain evolved to reason about automatically is social relationships. So you give it intentions, preferences, personality. "Claude is acting funny today" is an actual thing I said in a standup meeting (not standup comedy).
Humanizing isn't the issue. The issue is we're letting vibes write our code.
• We select models by vibing with a sample of their marketing copy instead of running them on our actual workload.
• We make architectural decisions based on what a "smart" model can or can't do, rather than on what we need to be able to swap in and out.
• We commit to models like they're spouses, rather than vendors.
The Coddling Myth
Here's where myth-making gets hella expensive.
I watched our team build out an entire prompt pipeline around a quirk only one model had. When that model's API had downtime, we suffered. No fallback. No abstraction. Just praying to the AI realms.
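In hindsight, even a crude failover chain would have saved us. A minimal sketch of the fallback we didn't have; the provider functions here are hypothetical stand-ins, not real SDK calls:

```python
# Hedged sketch of a provider failover chain. Both provider
# functions are placeholders, not real vendor SDK calls.

def call_primary(prompt: str) -> str:
    # Stand-in for the model API we depended on.
    raise TimeoutError("primary model API is down")

def call_backup(prompt: str) -> str:
    # Stand-in for any second provider kept warm as a fallback.
    return f"backup answer for: {prompt}"

def complete(prompt: str) -> str:
    """Try providers in order instead of praying to one."""
    for provider in (call_primary, call_backup):
        try:
            return provider(prompt)
        except Exception:
            continue  # in real code: log the failure, then fall through
    raise RuntimeError("all providers failed")
```

The point isn't the loop; it's that the rest of the codebase calls `complete()` and never learns which god answered.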
The mythology made us forget the most important question, "What does this task actually need?", and replaced it with "What would Claude want?" Engineering gold, straight from the engineering lead.
Smart teams I chatted with treat models like databases. You pick one for the workload. You build an interface that lets you swap. You don't get "In Loving Memory Of Larry, The 2023 Model" tattooed across your back.
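The "models like databases" idea can be sketched as a thin interface the rest of the codebase talks to. `ModelClient`, `VendorA`, and `VendorB` are hypothetical names, and the method bodies are stand-ins for real API calls:

```python
from typing import Protocol

class ModelClient(Protocol):
    """The only surface business logic ever sees."""
    def complete(self, prompt: str) -> str: ...

class VendorA:
    def complete(self, prompt: str) -> str:
        return "A:" + prompt  # stand-in for a real API call

class VendorB:
    def complete(self, prompt: str) -> str:
        return "B:" + prompt  # stand-in for a different vendor

def summarize(doc: str, model: ModelClient) -> str:
    # Business logic never names a vendor; swapping is a
    # one-line change at the call site, not a rewrite.
    return model.complete("Summarize: " + doc)
```

Swapping gods is then `summarize(report, VendorB())` instead of `summarize(report, VendorA())`, and no tattoo removal is required.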
Tier Lists Are A Trap
Dev Twitter sho 'nuff loves tier lists. S-tier, A-tier, "needs to be supervised while generating list" tier.
But capabilities shift every few months. January's strong independent model making its own financial decisions is June's deadbeat model late on alimony. Don't base your architecture on an Instagram snapshot.
• The model you love today might get an update that leaves you hanging tomorrow.
• There is no best model. Only the best model *for this specific task right now*.
• Plan for a god funeral.
What I Actually Did
So my team settled, with some grumbling, on a couple of thinner, dumber principles.
Never hard-wire intelligence assumptions into the program. Keep the model layer thin, and replace it at will. Benchmark the hell out of every candidate on your real workload. When we did, the rankings pleasantly surprised us: the "inferior" model could handle three out of our five prompts.
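Benchmarking on your real workload can be as simple as a loop over your own prompts. Everything here is a hypothetical stand-in: `run_benchmark`, the toy providers, and the scoring rule are illustrative, not our actual harness:

```python
# Sketch: rank providers by average score on YOUR prompts,
# not on marketing copy. Providers and scorer are toy stand-ins.

def run_benchmark(prompts, providers, score):
    """Return providers ranked by mean score, best first."""
    results = {}
    for name, call in providers.items():
        scores = [score(p, call(p)) for p in prompts]
        results[name] = sum(scores) / len(scores)
    # Sort descending by mean score.
    return dict(sorted(results.items(), key=lambda kv: -kv[1]))

# Toy usage: the "fancy" provider mangles output, the "cheap"
# one echoes it back; an exact-match scorer crowns the cheap one.
ranking = run_benchmark(
    prompts=["a", "b"],
    providers={"fancy": lambda p: p.upper(), "cheap": lambda p: p},
    score=lambda prompt, out: 1.0 if out == prompt else 0.0,
)
```

The scoring function is where your actual task lives; swap in whatever "correct" means for your prompts, and rerun it every quarter.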
Mythology's easy to dispel when you've got the receipts. 🔥
The Takeaway
Mythologizing models is fine around the watercooler. When it hits the architecture meeting, the design sprint, the vendor lock-in, remember: you're planning a shrine for something that auto-updates.
Treat smarter-than-average tools like potentially dying gas station gods. Make them prove, with quarterly miracles, that they've still got it.