DEV Community

Guillermo Fernandez

Stop Blaming Your Prompts. It’s the Architecture, Stup1d!


📄 Paper: https://zenodo.org/records/19438943

💻 Code: https://github.com/gfernandf/agent-skills


We keep pretending that better prompts will fix LLM agents.

They won’t.

We’ve built an entire layer of tooling, courses, and “best practices” around prompt engineering — as if the problem were linguistic.

It’s not.

It’s architectural.


The uncomfortable truth

Let’s be honest about what most agent systems are doing today:

  • Take a task
  • Generate a prompt
  • Call the model
  • Hope it “reasons” correctly
  • Repeat

This is not a system.

This is recomputation disguised as intelligence.
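To make that concrete, here is a minimal caricature of the loop above. Every name here is hypothetical, and `call_model` stands in for any real LLM API call:

```python
def call_model(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g. an HTTP request to a model API).
    return f"answer to: {prompt}"

def naive_agent(task: str, max_retries: int = 3) -> str:
    """Rebuilds the prompt from scratch on every run and retries on failure.
    Nothing is learned or reused between attempts."""
    for _ in range(max_retries):
        prompt = f"You are a helpful agent. Task: {task}. Think step by step."
        output = call_model(prompt)   # one stateless forward pass
        if output:                    # "hope it reasoned correctly"
            return output
    raise RuntimeError("agent gave no usable output")
```

Note what is missing: no cache, no intermediate structure, no state that survives the call. Each invocation pays the full cost of reconstructing the reasoning.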


We are replaying cognition, not building it

Every time your agent runs, it:

  • Reconstructs context
  • Rebuilds reasoning
  • Re-derives intermediate steps

There is no reuse of cognition.

No structure.

No persistence.

No abstraction layer.

Just prompts.

We are not building systems. We are replaying thoughts.


Why prompt engineering feels like it works (until it doesn’t)

Prompt engineering gives the illusion of control:

  • Add more instructions
  • Add more examples
  • Add more constraints

And yes — performance improves.

Until it plateaus.

Because all of that lives inside a single forward pass.

No memory of reasoning.

No composability.

No reuse.

It’s like trying to fix software architecture by writing better comments.


The real problem is architectural

The core issue is simple:

We are using LLMs as stateless reasoning engines.

And then compensating for that with increasingly complex prompts.

Instead of:

  • modeling cognition
  • structuring reasoning
  • reusing intermediate steps

We regenerate everything every time.

That doesn’t scale.

Not in cost.

Not in latency.

Not in reliability.


What’s actually missing

What’s missing is not a better prompt.

It’s a runtime layer that:

  • Encodes reusable cognitive steps
  • Separates reasoning into structured components
  • Allows composition instead of regeneration

In other words:

A system that reuses cognition instead of recomputing it.
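As a toy illustration of what "reuse instead of recompute" means operationally (this is an assumption-laden sketch, not any particular framework's API), imagine a runtime that registers named skills and caches their structured results:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SkillRuntime:
    """Toy runtime: skills are named, composable steps whose structured
    results are cached and reused instead of being regenerated each run."""
    skills: dict[str, Callable[[dict], dict]] = field(default_factory=dict)
    cache: dict[tuple, dict] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[dict], dict]) -> None:
        self.skills[name] = fn

    def run(self, name: str, inputs: dict) -> dict:
        key = (name, tuple(sorted(inputs.items())))
        if key in self.cache:                # reuse prior cognition
            return self.cache[key]
        result = self.skills[name](inputs)   # explicit, inspectable execution
        self.cache[key] = result
        return result
```

The second call with the same inputs never touches the model again; the intermediate result is a first-class, reusable artifact rather than text trapped inside a forward pass.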


From prompts to skills

Instead of:

Prompt → Model → Output

You need:

Skill → Execution → Structured Output

Not conceptually. Operationally.

This is exactly what ORCA implements: a runtime layer where “skills” are reusable cognitive units — not prompts.

Defined inputs.

Structured outputs.

Explicit execution.

No recomputation. No guesswork.
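A skill unit with those three properties can be sketched in a few lines. To be clear, this is an illustrative shape, not ORCA's actual API; see the linked repo for the real implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Skill:
    """Illustrative skill unit: declared inputs, one explicit execution
    function, and a fixed set of structured output keys."""
    name: str
    inputs: tuple[str, ...]    # defined inputs
    outputs: tuple[str, ...]   # structured outputs
    execute: Callable[[dict], dict]

    def run(self, payload: dict) -> dict:
        missing = [k for k in self.inputs if k not in payload]
        if missing:
            raise ValueError(f"{self.name}: missing inputs {missing}")
        result = self.execute(payload)                 # explicit execution
        return {k: result[k] for k in self.outputs}    # no free-form text
```

The contract is the point: callers know exactly what goes in and what comes out, so skills can be composed and cached rather than re-prompted.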


Most “agent frameworks” today?

  • prompt orchestration
  • tool wrappers
  • retry loops with nicer formatting

They don’t model cognition.

They orchestrate prompts.

That’s not a runtime.


The shift is not better prompting.

It’s architectural.

From:

  • stateless generation

To:

  • structured, reusable cognition

That’s the gap ORCA is designed to close.


Prompt engineering isn’t useless.

It’s just solving the wrong problem.

We’ve been optimizing the interface instead of the system.

And it shows.


If you’ve pushed prompt engineering far enough, you’ve seen the limit.

The question is:

are you ready to try what will replace it?
