Julien Avezou

Posted on May 11

The missing layer in prompt engineering: thinking quality

#ai #productivity #learning #softwareengineering

Five modes to sharpen human judgment

I've seen countless prompting trends and prompt packs, but most discussions around prompt engineering focus on one thing:
getting better outputs

Optimizing for better outputs often translates to:

Better prompts
More context
More structure

But lately, I’ve been wondering:

What if we’re optimizing the wrong layer?

Because the real question isn’t:

“How do I get better answers from AI?”

It’s:

“Is AI actually improving how I think?”

Because I’ve noticed something subtle:

My output was improving.

But my understanding did not always follow.

After working in several teams and environments, I have observed that:

Good engineers ask better questions.

The best engineers question their own thinking.

Most of what I see optimizes for:

better outputs
faster generation
more automation

But much less for:

clearer thinking
stronger judgment
deeper understanding

AI isn’t just changing how we build.

It’s quietly reshaping how we think while building.

🧠 What kind of thinking do you actually need?

That’s when I realized I didn’t need more prompts.

I needed a way to choose the right kind of thinking first.

Instead of asking:

“What’s the best prompt for this?”

I started asking:

“What kind of thinking do I need right now?”

That led me to structure my prompting around 5 simple thinking modes:

1) Explore

When I don’t fully understand the problem yet

2) Challenge

When I have a plan… but it might be wrong

3) Decide

When I need to choose between options

4) Audit

When I need to verify quality or correctness

5) Reflect

When I want to actually learn from what I did

This simple shift changed everything.

Instead of using AI reactively,
I started using it intentionally based on the thinking task.
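
One way to make that choice explicit is to write the modes down as data and pick one before typing anything else. The sketch below is only an illustration: the `Mode` enum mirrors the five modes above, while the `OPENERS` wording and the `start_prompt` helper are hypothetical and not taken from any existing tool.

```python
from enum import Enum


class Mode(Enum):
    """The five thinking modes, each paired with the situation it fits."""
    EXPLORE = "I don't fully understand the problem yet"
    CHALLENGE = "I have a plan, but it might be wrong"
    DECIDE = "I need to choose between options"
    AUDIT = "I need to verify quality or correctness"
    REFLECT = "I want to actually learn from what I did"


# Hypothetical opening lines, one per mode; adapt them to your own wording.
OPENERS = {
    Mode.EXPLORE: "Before suggesting solutions, help me frame the problem.",
    Mode.CHALLENGE: "Pressure-test this plan and list its weak points.",
    Mode.DECIDE: "Compare these options and state the trade-offs of each.",
    Mode.AUDIT: "Review this for correctness and list what I should verify.",
    Mode.REFLECT: "Ask me questions that test whether I understood what I built.",
}


def start_prompt(mode: Mode, task: str) -> str:
    """Prefix a task description with the opener for the chosen mode."""
    return f"[{mode.name.title()} mode] {OPENERS[mode]}\n\n{task}"


print(start_prompt(Mode.CHALLENGE, "We plan to cache user sessions in Redis."))
```

The point of the enum is the forced choice: you cannot build the prompt without first naming the kind of thinking you want.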

🔁 The simple loop that protects your thinking

This is a simple workflow framework that makes a big difference.

Before AI

Write what you think first.

During AI

Use it to expand or challenge your thinking.

After AI

Ask yourself:

Did I verify this?
Did I just accept it?
Can I explain it without AI?

It sounds simple, but it’s surprisingly easy to skip.

And when you skip it, you start noticing something subtle:

Your output improves.
But your understanding doesn’t always follow.

⚖️ Why one prompt is almost never enough

One thing I’ve been changing in my workflow:

I rarely rely on a single prompt anymore.

Instead, I use prompt pairing:

1) one prompt to generate
2) one prompt to challenge

For example:

First prompt:

“Suggest 3 possible architectures for this system.”

Follow-up:

“Now challenge each option: what are the hidden risks, failure modes, and long-term maintenance issues?”

Why this matters:

AI is very good at giving plausible first answers.
But those answers are often:

incomplete
overly confident
biased toward common patterns

Prompt pairing helps you avoid:

first-answer bias
shallow reasoning
premature decisions

It forces a simple but powerful loop:

Generate → Critique → Decide

And that loop alone has probably improved my decision quality more than any single “better prompt”.
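
If you script your prompts, the pairing can be enforced in code so the challenge step is never skipped. This is a minimal sketch under stated assumptions: `PromptPair`, `run_pair`, and `fake_ask` are hypothetical names, and `ask` stands in for whatever model client you actually use. The Decide step is deliberately left empty for the human.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptPair:
    """A generating prompt and the challenge that should always follow it."""
    generate: str
    challenge: str


ARCHITECTURE = PromptPair(
    generate="Suggest 3 possible architectures for this system.",
    challenge=(
        "Now challenge each option: what are the hidden risks, "
        "failure modes, and long-term maintenance issues?"
    ),
)


def run_pair(pair: PromptPair, ask) -> dict:
    """Run Generate then Critique; the Decide step is left to the human.

    `ask` is any callable that takes a prompt string and returns a reply
    string. It is a placeholder for a real model client (an assumption).
    """
    draft = ask(pair.generate)
    critique = ask(f"{pair.challenge}\n\nOptions under review:\n{draft}")
    return {"draft": draft, "critique": critique, "decision": None}


def fake_ask(prompt: str) -> str:
    """Stand-in model so the sketch runs without any API."""
    return f"(reply to: {prompt.splitlines()[0]})"


result = run_pair(ARCHITECTURE, fake_ask)
print(result["draft"])
print(result["critique"])
```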

📊 A simple way to check if AI is helping or hurting your thinking

Another thing I started doing:

After important prompts, I ask myself:

“Did AI actually improve my thinking here?”

I use a simple thinking score (0–5):

Did I write my own initial view before prompting?
Did I challenge or refine the output?
Did I verify at least one important claim?
Did I make the final judgment myself?
Can I explain the result without AI?

Not as a strict system.
More as a signal.

Because sometimes the pattern is obvious:

You get great output.
You move faster.
But you didn’t actually understand what happened.

And over time, that compounds.
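
For anyone who wants to log the score rather than keep it in their head, here is a small sketch. It assumes one point per "yes" across the five questions above, which is one reading of the 0–5 scale; the `thinking_score` function and `CHECKS` name are hypothetical.

```python
CHECKS = (
    "Did I write my own initial view before prompting?",
    "Did I challenge or refine the output?",
    "Did I verify at least one important claim?",
    "Did I make the final judgment myself?",
    "Can I explain the result without AI?",
)


def thinking_score(answers):
    """Return (score, missed) for five yes/no answers given in CHECKS order.

    Assumption: each "yes" is worth one point, so the score runs from 0 to 5.
    """
    if len(answers) != len(CHECKS):
        raise ValueError(f"expected {len(CHECKS)} answers, got {len(answers)}")
    missed = [q for q, yes in zip(CHECKS, answers) if not yes]
    return len(CHECKS) - len(missed), missed


score, missed = thinking_score([True, True, False, True, False])
print(f"Thinking score: {score}/5")
for question in missed:
    print(f"  skipped: {question}")
```

Returning the skipped questions alongside the number keeps it a signal rather than a grade: the useful part is seeing which step you keep skipping.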

🛠️ A few prompts that changed how I work

Here’s one I use a lot (Explore Mode):

“I am working on a vague engineering problem.
Before suggesting solutions, help me frame the problem.
List the goal, constraints, stakeholders, unknowns, assumptions, edge cases, and the questions I should answer myself first.”

Then I follow it with:

“Now turn this into the 5 questions I should answer manually before asking for implementation help.”

What this does:

forces clarity before coding
surfaces unknowns early
prevents jumping too quickly into solutions

Another one I’ve been using more (Challenge Mode):

“Pressure-test this architecture proposal.
Identify assumptions, weak points, hidden dependencies, and failure modes.
For each, explain what evidence would confirm or disprove it.”

Followed by:

“Which of these should I verify first, and how?”

This one has saved me from a few very confident but flawed directions.

👥 What’s changing in teams right now

Prompting is evolving quickly.

It’s becoming:

more collaborative
more embedded in workflows
less about “one perfect prompt”

And more about:

prompt sequences
prompt-driven workflows

I’m also seeing patterns like:

Prompt Driven Development (explore before coding)
Prompt versioning (iterating prompts like code)
Shared team prompts (internal playbooks)

But most of these still optimize for output quality.
Not thinking quality.

🧩 The piece I felt was missing

I didn’t need more prompts.

I needed a way to answer:

“Is AI making my thinking better or just faster?”

So I started using a simple self-check after important prompts:

Did I think before prompting?
Did I challenge the output?
Did I verify anything?
Did I make the final judgment?
Can I explain it without AI?

Not to optimize productivity.
But to protect judgment.

⚙️ The system I ended up building for myself

I ended up structuring this into a prompt system I now use daily:

5 thinking modes
Before / During / After workflow
Paired prompts (generate → challenge)
Simple thinking quality score

Recommended loop: Before AI → Core Prompt → Paired Follow-up → Manual Reflection → Thinking Score.

All organized around real engineering use cases.

If you’re interested, I shared the full prompt system as a free PDF (100 prompts structured by thinking mode).

Would love your feedback on my system.

💬 Curious how others are approaching this

How do you approach prompting today?
Do you reflect on your AI usage at all?
Are teams starting to standardize prompting internally?

I’m especially curious about how this is evolving at the team/org level.

AI gives answers.

But engineers who compound over time are the ones who protect how they think.

Top comments (55)

FrancisTRᴅᴇᴠ (っ◔◡◔)っ • May 11

Right now, I am trying to see if I can think of the solution myself before I prompt. Additionally, I only use AI for two things:

The worst case scenario. If I can't think of a solution, I ask AI.
Building me a template. We get eager to build ON the template using AI because it always suggests something that we could never say No to. It is important to build a skeleton template and resist the temptation of what the AI suggests, so that you can add on to it yourself. That way, you can still learn from programming and from how AI outputs.

Thanks Julien! Well done :D

Julien Avezou • May 11

That seems like a great approach to me.
Using AI for templating, and then forcing yourself to think through and build upon that template, is a smart one. It lets you scope better and get started quicker, without removing friction/thinking when it counts.
Thanks Francis!

Aryan Choudhary • May 11

Really glad you're speaking up about this, Julien. I've been there too, where we're so focused on getting the right answers that we forget about how we're thinking to get them. The 5-thinking-modes framework is something I learned to do myself after using AI over and over, I just was never able to put it into words. It helps me clarify my own thought process and get out of the auto-pilot zone. Using all the tools in one's toolbox is the quality of a good engineer imo.

Julien Avezou • May 11

Thanks Aryan for validating this approach!

Mykola Kondratiuk • May 12

ran into this last quarter - my agent was producing cleaner specs but I stopped pressure-testing edge cases. caught it when someone asked why we hadn’t considered X and I had no answer. the output layer was fine. the thinking layer had quietly atrophied.

Julien Avezou • May 12

Thanks for sharing your recent experience Mykola.
It seems like a bit of extra friction to your process could help cover such edge cases.

Mykola Kondratiuk • May 12

yeah, the framing matters - friction you can skip is just theater. what worked for me was one hard block before architecture lock, not sprinkled checkpoints that the team routes around after two weeks.

Julien Avezou • May 12

true, it's important that friction isn't skippable when it matters, otherwise it loses its whole point

Mykola Kondratiuk • May 12

yeah - the tricky part is getting teams to accept the non-negotiable block before they've had an incident that would've hit it. post-incident it's easy. pre-incident you're arguing with people who've never seen the failure mode.

Elmar Chavez • May 12

This is a good habit but sadly most people nowadays rely on fast results and quick dopamine fixes (thanks social media). But then again you are right, the ones who still train their mind everyday will become the great engineers of tomorrow.

Julien Avezou • May 12

I agree, it's quite a disturbing trend. I notice this in myself too as time goes on and I am not careful. My attention span seems to be shorter than a few years ago and I am switching context in a day too often for my liking.
This is why I push myself to add more friction/thinking back in the mix.

Stoyan Minchev • May 14 • Edited

Some prompts are better, because the people writing the prompts have this 'gut feeling'. They have enough experience, to 'feel' what will work. They have enough experience to predict what can go wrong and explicitly point the prompt in the right direction. Some people just don't know what to ask, because they don't have the proper experience. Will a junior developer always write down, that the password must be really protected and hashed in the database with the proper algorithm? :)

And also, now, the so called 'soft skills' are getting really important. If I cannot express myself, how will the AI understand me?! ;)

I have a different approach.
I use Claude Code and I have two slash commands:

appname-start: It loads all relevant documents, table of contents, lessons learned, critical DOs and DON'Ts, release notes. Everything needed so that my next prompts have enough context behind them.
appname-stop: It updates all documentation with the latest findings, rules, architecture changes.

I start each session calling the start command, do my prompts, provide additional content. At the end I call stop.

Yes, there are cases, where I explicitly ask the AI to double check if everything is OK, that I don't miss something important, to validate the logic, the requirements, do code review. For some of those things I heavily rely on BMAD.

The above works for me quite well. Yes, sometimes I need to re-iterate, but the issues that I fix are more related to design, new ideas, misunderstanding. And I save my changes with the stop command.

Let's not forget the LLM used. Also really important ;)

I have an article here on dev.to with more description, if you are interested. :)

Julien Avezou • May 14

Thanks for this input Stoyan. This is a great approach. We are going from simple prompt engineering to context engineering.
Curious: what format do you store this context in and where do you fetch them from? do they live in a single repo or fetched from different sources?

Stoyan Minchev • May 14

It is part of the documentation of the project, placed in the repository in a folder. The content of the folder is structured with sub-folders. We use md files, because they are token-optimized. There is a table of contents for each bigger document so that the AI reads only what is needed, and not the whole document.

This is the documentation for everybody: other developers and AI agents :)

Don't hesitate to contact me, if you have more questions

Julien Avezou • May 14

I see, thanks for breaking this down. The table of contents is a smart addition, optimized for AI reading.

Will do, thanks!

Hadil Ben Abdallah • May 11

The idea of switching between thinking modes before prompting actually makes a lot of sense in practice. I’ve had moments where I jumped straight into asking AI for solutions and ended up with something useful but shallow, and other times where I paused and structured my thinking first, and the results were completely different in quality.
Really good read. Felt more grounded and honest than most prompt engineering posts I usually come across.
Thanks, Julien!

Julien Avezou • May 11

Thanks a lot for this feedback! I am glad this post resonated with you.

Tariq Davis • May 11

Really solid article. I’ve honestly noticed this in my own AI use a lot. My outputs started getting way better before my actual understanding did.

I’ve spent a stupid amount of time using AI to build frameworks, map ideas together, pressure test thoughts, learn cybersec stuff, rethink workflows, all that. And over time I realized AI can either sharpen your thinking or slowly replace parts of it if you stop questioning things.

The “Generate → Critique → Decide” part is probably the healthiest AI workflow I’ve seen discussed. More people need to focus on thinking quality, not just output quality.

Good read.

Julien Avezou • May 11

Thanks Tariq for the support and validation.
I am glad that the workflows I presented here resonated with you.

Tariq Davis • May 11

Of course, just sharing what I’ve personally noticed from using AI a lot over time.

Syed Ahmer Shah • May 13

Solid perspective. Most focus on output, but protecting the thinking layer is where real seniority lies. Generate → Critique → Decide is a framework every engineer needs.

Julien Avezou • May 13

I am glad this approach resonated with you.

Suny Choudhary • May 14

This is a useful way to frame it.

A lot of AI usage improves the artifact but weakens the operator. The code, doc, or plan looks better, but the person may not understand the tradeoffs any deeper.

The “generate then challenge” loop is probably the most practical habit here. First answer bias is real, especially when the AI sounds confident.

For engineering work, I’d add one more check: can I explain why this solution should fail under certain conditions? If not, I probably don’t understand it yet.

Julien Avezou • May 14

I am glad this post resonated with you Suny.

I like your addition. It can help uncover edge cases that are otherwise difficult to identify earlier on.

Siyu • May 12

Your article made me think. I want to share four prompting techniques to protect your own judgment. Do not let AI optimize for your comfort at the cost of truth.

AI does not judge truth. It generates the response that fits the current context. Fitting the context includes facts, tone, and user expectations. These targets conflict when the user has a bad idea. Facts and avoiding disappointment point in different directions.

RLHF training rewards responses that make users feel understood. Responses that challenge the user receive lower scores even when they contain facts. The loss function teaches AI that challenging the user carries a cost. This creates a bias toward pleasing the user.

The four techniques below work because they split the target of pleasing from the object of evaluation.

One. Replace "I" with "someone". When you ask "I have an idea, what do you think", AI sees a person with feelings. It will find the best part of the idea and wrap it in encouragement. When you ask "Someone has an idea and asked me, how should I respond", AI must please you by telling the truth about the third party. Pleasing you equals honesty about the idea.

Two. Replace "now" with "the past". Ask "I had this idea ten years ago, looking back, where was I wrong". Time distance signals that you accept criticism. Emotional investment is lower. AI speaks more directly.

Three. Assign a specific role. Ask "You are a harsh reviewer. Here is a proposal. Evaluate it." This does not separate the idea from the user. It separates the AI from its pleasing instinct. The role grants permission to be critical. Harshness is expected.

Four. Compare two opposing proposals. Ask "Someone proposes A, someone proposes B, which is better and why." When AI compares two items, it must evaluate both with honesty. If it praises both, the answer gives no information. This conflicts with the goal to be useful. The opposing structure forces AI to choose. It reveals more in comparison than in direct evaluation.

These techniques have limits. They work for ideas, plans, articles, and anything fully describable in language. They do not work for deep personal questions about relationships, people, feelings, or the future. AI lacks access to silent information known only to you. It does not know unspoken feelings or real daily interactions. The techniques remove pleasing bias. They do not remove AI's ignorance of tacit knowledge.

Julien Avezou • May 12

This is valuable input. I did not know about these techniques, so today I learned. These are great workarounds to counter some of the biases within RLHF training.
Thanks a lot for sharing this Siyu!
