An engineer pushes back on a decision. The response: "ChatGPT recommended something else." The tell isn't the recommendation. It's that they reached for an oracle instead of an argument. You're not watching an AI problem. You're watching a culture problem that predates the model by years.
I've been the oracle.
The Oracle Was Always There
Before ChatGPT, it was Stack Overflow. Before Stack Overflow, it was the one senior engineer who had been around longest. Teams find oracles when they don't have frameworks. They fill the decision vacuum with whatever authority is nearest.
ChatGPT is just the most accessible oracle ever built. It's confident, available at 2am, and it never makes you feel stupid for asking. Of course engineers reach for it.
The question is what they're replacing with it.
If your team has no shared framework for architectural decisions, no clear authority structure, no agreed-on method for weighing tradeoffs ... "ChatGPT recommended something else" is just the latest version of "Google says" or "I read a blog post" or the last tech talk someone attended. The source changes. The abdication of thinking stays the same.
That's the culture problem. And it was there long before the first API call.
One flavor of team gets hit hardest. The team that's technically accomplished but has never had to articulate its reasoning under pressure. The team where the senior engineers make the calls and everyone else executes. The team where architecture reviews are about performance, not inquiry. Those teams look fine until the oracle arrives and the emperor's new clothes problem becomes impossible to ignore.
A lot of engineering culture runs on borrowed credibility. Conference talks from FAANG engineers with entirely different constraints. Benchmarks run on workloads nothing like yours. Technical opinions absorbed as gospel without anyone stress-testing the assumptions. Teams learn to cite sources instead of building judgment. ChatGPT is the endpoint of that trajectory ... a perfectly confident authority on everything, available for anything, with zero friction and no accountability for being wrong.
When a culture already runs on borrowed credibility, an always-available oracle doesn't change behavior. It accelerates it.
What Culture Actually Does
I've been the oracle. Decisions on my team ran through me. The projects that worked were the ones I was close to. I read that as signal that I was adding value. It was a sign that I'd built a dependency, not a team. The engineers weren't deferring to me because my judgment was better. They were deferring because I had never built a culture where their judgment was tested.
When ChatGPT arrived, teams like the one I used to run had an obvious replacement oracle. Different interface. Same problem underneath.
Culture is not your team values on a Confluence page. It's not the engineering principles doc nobody reads after onboarding. It's not the Slack channel with memes and the occasional win.
Culture is what happens when nobody's watching. The shared understanding of how decisions get made, how tradeoffs get reasoned through, how an engineer pushes back on a bad idea without making it personal. The invisible scaffolding that holds the architecture together when you're not in the meeting.
The principals' dashboards looked great. Velocity up. Output up. Ask about unplanned work and the room went quiet. Nobody had a number. The dashboards measured what was easy to measure, not what was actually happening inside the codebase.
What was happening in teams where AI tooling ran without guardrails was exactly what you'd expect from a team without shared standards. Multiple HTTP client implementations pulling in different directions. Conflicting error-handling patterns across the same surface. Inconsistent approaches to the same category of problem, sometimes within the same file. Not because engineers were careless. Because there was no shared framework for what "right" looked like, and AI filled that vacuum at ten times the speed of any individual contributor.
The teams that thrived had done the work before AI arrived. Standards weren't a reaction to the new tools. Code patterns documented. Architecture decisions written down with the "why" included, not just the outcome. When you hand those teams an AI coding assistant, it becomes a force multiplier. The output fits the system.
When you hand that same tool to a team with no shared framework, you get a junk drawer with a CI/CD pipeline.
What AI Actually Exposed
AI didn't break your culture. It exposed teams that never built one.
Most leaders will tell you they have a culture. They'll point to the Notion doc. The architecture review calendar. The two-week sprint cadence. That's structure. Structure and culture are not the same thing.
Real culture is whether your engineers can defend a decision in their own words. Not quote what a tool recommended. Not point to a benchmark someone else ran. Construct the argument. Weigh the tradeoffs. Say "here's what I considered, here's what I chose, and here's what I'm watching to know if I was wrong."
If your engineers can do that, AI is a force multiplier.
If they can't, you have a problem that predates the model.
The culture that produces "ChatGPT said so" as a decision argument is the same culture that produces decisions by committee, approval chains that substitute for judgment, and teams that stopped asking "why" because it was never welcomed in the room. AI just gave that culture a name and a quotable moment.
AI didn't create teams that can't think. It created a faster, more accessible oracle that made the problem visible.
Teach Them to Think
Architecture Decision Records written for the engineer who wasn't in the room when the decision was made. Not bullet points justifying the choice after the fact ... actual reasoning. Here's what we considered. Here's what we ruled out and why. Here's what we're watching to know if we got it wrong. That's the document that builds judgment over time. ADRs written that way become the baseline for how engineers on your team actually make decisions, long before any AI model enters the picture. One ADR on session management saved us from a full migration that turned out to be unnecessary.
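The structure described above (considered, ruled out, watching) can be as short as one page. A minimal skeleton might look like this; the section names are one possible shape, not a prescribed format:

```markdown
# ADR-NNN: <decision title>

## Context
What problem forced a decision, and what constraints shaped it.

## Options considered
Every option on the table, including the ones we ruled out and why.

## Decision
What we chose, in our own words, not a tool's.

## Signals we're watching
What would tell us we got this wrong, and what we'd revisit if it fires.
```

The last section is the one most templates skip, and it's the one that builds judgment: it forces the author to name the conditions under which the decision fails.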
Code standards documented as reasoning, not rules. Here's the choice, here's what it costs, here's when you should challenge it. The same standards that let an engineer recognize a bad architectural recommendation from an AI are the ones that let them recognize a bad recommendation from a colleague, or from themselves at 11pm under deadline pressure. You don't build two separate frameworks for human decisions and AI decisions. You build one. You make it cheap to be wrong and expensive to hide it.
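To make that concrete, here is a hypothetical sketch of a standard written as reasoning rather than a rule. The wrapper, names, and error type are invented for illustration; the point is that the choice, its cost, and its challenge conditions live next to the code:

```python
# Standard (hypothetical): every outbound call goes through one wrapper
# that normalizes transport failures into a single domain error type.
# The choice: callers handle one error, so every service call reads the same.
# What it costs: one extra layer, and less per-call flexibility.
# When to challenge it: if a caller genuinely needs to distinguish
# transport failures (e.g. retry on timeout but not on a DNS error).

class UpstreamError(Exception):
    """The one error type callers are expected to handle."""
    def __init__(self, operation, cause):
        super().__init__(f"{operation} failed: {cause}")
        self.operation = operation
        self.cause = cause

def call_upstream(operation, fn, *args, **kwargs):
    """Run an outbound call and normalize any failure into UpstreamError."""
    try:
        return fn(*args, **kwargs)
    except Exception as cause:
        raise UpstreamError(operation, cause) from cause

def fetch_profile(user_id):
    """Example caller: the error plumbing is identical to every other call."""
    def _do_fetch():
        if user_id < 0:
            raise ConnectionError("socket closed")  # stand-in for a real client
        return {"id": user_id}
    return call_upstream("fetch_profile", _do_fetch)
```

Because the "when to challenge it" clause is part of the standard, a review can ask whether this call is the exception, instead of re-litigating error handling from scratch every time.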
Review culture that interrogates "why" before it approves "what." The mid-level engineer who asks "why are we building this?" in a planning meeting is worth more to the team than the senior who executes quietly and ships fast. That question is the signal you hire for. The engineer who asks why is the one building judgment instead of pattern matching. That is what you need when the pattern matcher in the room is an AI model that's confident about everything and accountable for nothing.
Give them room to be wrong. Give them problems worth reasoning through. Give them the feedback loops to know when their judgment is off. Those are the engineers who lead teams within 18 months.
That's on Us
If your engineers can't defend a decision in their own words, the problem isn't the tool they're reaching for. You never built a culture where their judgment was trusted, tested, or developed.
That's a leadership failure.
The fix starts with you. Not with the AI policy document. Not with the acceptable-use guidelines. With you.
With whether you ask "why" or just "when." With whether your reviews create space for engineers to be wrong and learn, or create pressure to be right fast. With whether you've removed yourself as a dependency on good decisions being made, or whether your team still needs you in the room to trust itself.
A team that defers to ChatGPT over its own judgment is a team that's been trained to defer to authority. Before the model arrived, that authority was you. I know it was me.
AI is going to keep getting better. The oracles will keep getting more confident, more accessible, more persuasive. The teams that win won't be the ones with the best AI policies.
They'll be the ones who built the culture where engineers can defend a decision in their own words before the oracle arrived. And the ones willing to reckon, right now, with what the oracle is exposing.
The next time an engineer says "ChatGPT recommended something else" in a review ... don't reach for the policy doc. Ask them to walk you through what they're weighing. What are the tradeoffs? What are they watching?
That's the conversation that builds culture. One exchange at a time, in the room where decisions actually happen.
Getting there isn't a tooling problem. It never was.
That's on us. All of us.
I write daily about engineering leadership at jonoherrington.com.
Top comments (33)
"reached for an oracle instead of an argument" - that framing is exactly right. i've seen the same dynamic play out when AI gets introduced to a team that already had low psychological safety. suddenly everyone's hiding behind "the model says" instead of owning a position. the tool just makes the pre-existing avoidance behavior more visible and more frequent. the teams that use AI well are usually the ones where people were already comfortable being wrong.
Yes. Low safety turns AI into cover.
Once people learn they can hide behind the tool, the model becomes a shield for avoidance instead of a tool for better thinking. One of the fastest signals is hearing what the model said before hearing what the engineer thinks.
That last signal is such a good one. When "the model said" comes before any personal reasoning, it's often a sign the human already exited the conversation.
This is one of the clearest takes I’ve read on AI and engineering culture.
What really hit me is the idea that “the oracle was always there.” Swapping Stack Overflow or a senior dev for ChatGPT doesn’t change the behavior—it just removes the friction and exposes how little independent reasoning was happening in the first place.
The line that stuck with me most: “teams learn to cite sources instead of building judgment.” That feels uncomfortably true, not just in engineering but in how we learn anything today.
Also appreciate the accountability here. It’s easy to blame tools, but this reframes it as a leadership and culture design problem:
AI as a force multiplier vs. a junk drawer generator is such a sharp distinction. Same tool, completely different outcome depending on whether a thinking culture already exists.
Honestly, this feels less like an AI post and more like a blueprint for building real engineering judgment.
Great piece, Jono Herrington.
Thanks! That friction point matters a lot.
Before AI, borrowed thinking at least had some delay built into it. You had to go search, ask around, or wait on someone senior. Now weak reasoning can move at the same speed as strong reasoning. That raises the value of teams that teach people to construct an argument in their own words.
true
"Reasoning, not rules" --epic! A mantra that allows for growth 💯
"The teams that win won't be the ones with the best AI policies. They'll be the ones who built the culture where engineers can defend a decision in their own words before the oracle arrived." That's the line. AI didn't create the dependency. It just made it faster and more quotable.
Exactly. The dangerous part is when teams stop treating reasoning as part of the work and start treating it as a lookup problem. Once that happens, AI is not the cause. It is just the fastest possible delivery mechanism for a habit that was already there.
Spot on, Jono—this hits hard because I've seen it play out in real client teams, especially in data-heavy environments like Databricks migrations or Shopify API integrations.
We consulted for an e-commerce client last year: solid engineers, cranking out Spark jobs and webhook automations. But when a junior pushed back on a Lakehouse schema choice, the lead's retort? "ChatGPT says columnar is always faster." No tradeoff discussion, no mention of their read-heavy BI workloads where row-oriented won out. It exposed the gap: no ADRs documenting why we'd picked Delta Lake patterns before. The "oracle" shortcut killed inquiry.
What flipped it? We ran a quick "reasoning retro": engineers rewrote one decision as "What we weighed (cost, perf, scale), what we ruled out (why Parquet alone failed), what we're monitoring (query latency post-load)." Shared it in their repo. Next review, AI recs became starting points, not stoppers. Judgment leveled up fast.
That is exactly the pattern. Once the recommendation becomes the argument, the team has already given away the hard part. I like the retro move because it puts reasoning back in the repo where other engineers can inherit it instead of borrowing confidence from the loudest source in the room.
This resonates a lot. I’ve seen teams replace one “oracle” with another without realizing the underlying issue never changed.
Before AI, it was the most senior engineer. Then it became blog posts, conference talks, “Google says…”. Now it’s AI. Same pattern—outsourcing judgment instead of building it.
The point about decision frameworks really stands out. In teams where we had clear reasoning documented (why we chose something, what we rejected, what we were watching), AI became genuinely useful. In teams without that, it just amplified inconsistency.
Curious—have you seen effective ways to teach that kind of judgment at scale, especially for mid-level engineers?
Exactly. The oracle changed. The dependency didn’t. What teaches judgment is exposing tradeoffs early … what we chose, what we rejected, and what would make us change course.
Mid-level engineers grow fastest when they are pulled into real decisions, not just handed conclusions.
That makes a lot of sense—especially the idea of exposing tradeoffs early instead of presenting conclusions.
I’ve seen something similar: when engineers are only given decisions after the fact, they optimize for execution. But when they’re involved in the “why” (what we rejected, what might break, what would make us revisit), they start building real judgment.
The hard part I’ve noticed is that it requires slowing down a bit upfront, which many teams resist under delivery pressure.
Have you found ways to introduce that without impacting delivery too much? For example, lightweight ADRs or design reviews?
Yes ... lightweight ADRs and focused design reviews have been the best balance for me.
I wrote up the ADR side of that here in case it is useful. The goal is not more documentation. It is making tradeoffs, rejected options, and revisit conditions visible without turning the process into ceremony.
jonoherrington.com/blog/how-to-wri...
This is a great framing—especially the idea that the goal isn’t more documentation, but making tradeoffs and revisit conditions visible.
That’s something I’ve seen missing in a lot of teams. Decisions get recorded, but the “why” and especially the “what would make us change this later” rarely do.
I like how you’re keeping it lightweight as well. In my experience, once it starts feeling like process or ceremony, people stop engaging with it.
Going to take a closer look at your write-up—curious how you structure those ADRs in practice to keep that balance.
The oracle framing is the sharpest part of this.
Stack Overflow was an oracle. The senior engineer who'd been around longest was an oracle. Conference talks from FAANG engineers with entirely different constraints — oracle. The source kept changing. The dependency never did.
What AI did is make the pattern undeniable. You can't blame ChatGPT for a culture that was already outsourcing judgment. It just removed the friction that was hiding it.
The "junk drawer with a CI/CD pipeline" line is the one I keep coming back to. I've written about the token economy problem in agent systems — the gap between what gets measured and what actually compounds as debt. Same structure here. Teams optimize for output because output is legible. Culture isn't. The dashboard looks clean. The codebase tells the real story.
The question I'd push on: at what point does the oracle become load-bearing? The senior engineer who has all the answers isn't just a dependency — the team has stopped building the muscle to operate without them. When the oracle leaves, or gets replaced by a model, you don't just lose the answers. You find out the reasoning was never in the room.
That's the part most teams don't see coming.
That is the part that gets expensive fast. Once the oracle becomes foundational, the team starts confusing access to answers with actual capability. Everything looks fine right up until the oracle is gone, and then you realize nobody built transfer, only dependency.
That's the line. The oracle doesn't just answer questions, it quietly becomes the reason nobody develops the judgment to answer them without it. The dependency is invisible until the transfer moment exposes it. By then it's too late to build what should have been built the whole time.
I’ve seen this happen.
Same tools, same AI, and completely different outcomes depending on the team.
That says a lot about the culture underneath.
Yep. Same tool, different outcome is usually the giveaway. When one team gets leverage and another gets drift, the variable usually isn’t the model. It’s whether the team already had a shared way to think.
I agree, with one "but": if you don't have personal experience with a tool, you have to rely on someone else's opinion and trust their judgment instead of your own. Unless you want to trial-and-error every possible approach, which rarely fits the timeframe. And with the speed of software development, it is practically impossible to keep up. So you actually have to reach for "oracles" all the time.
Of course, "AI recommended" is not a valid reason by itself. It is a summary that must be followed by the arguments for why the recommendation makes sense to you.
Great diagnosis. Oracle dependency is real and predates LLMs.
One dimension I'd add: interface friction matters. Stack Overflow forced you to read multiple answers, compare them, evaluate context — there was enough friction to accidentally build judgment. ChatGPT removes that friction entirely. One question, one confident answer, no comparison required.
The fix isn't "use AI less" — it's changing the interface. A tool that shows you three conflicting approaches with tradeoffs forces engagement. A tool that gives you one answer enables exactly the abdication you're describing.
I've been building a personal AI agent and found the same pattern at the system level: removing all constraints doesn't make the agent better, it makes it generic. The constraints that force reasoning are the ones that produce quality output.
The interface point is real. Friction used to accidentally train discernment because people had to compare, filter, and translate. A single polished answer skips that workout. The risk is teams start confusing response quality with decision quality.
yeah this feels underrated. one polished answer just feels done. i keep seeing people stop there instead of thinking things through. any way to change that without them just ignoring it?