DEV Community

Guillermo Fernandez

Stop Blaming Your Prompts. It’s the Architecture, Stup1d!


📄 Paper: https://zenodo.org/records/19438943

💻 Code: https://github.com/gfernandf/agent-skills


We keep pretending that better prompts will fix LLM agents.

They won’t.

We’ve built an entire layer of tooling, courses, and “best practices” around prompt engineering — as if the problem were linguistic.

It’s not.

It’s architectural.


The uncomfortable truth

Let’s be honest about what most agent systems are doing today:

  • Take a task
  • Generate a prompt
  • Call the model
  • Hope it “reasons” correctly
  • Repeat

This is not a system.

This is recomputation disguised as intelligence.
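To make that concrete, here is a minimal caricature of the loop above. Every name here is hypothetical, and `call_model` stands in for any real LLM API call:

```python
def call_model(prompt: str) -> str:
    # Placeholder for a real LLM call (e.g. an HTTP request to a model API).
    return f"answer to: {prompt}"

def naive_agent(task: str, max_retries: int = 3) -> str:
    """Rebuilds the prompt from scratch on every run and retries on failure.
    Nothing is learned or reused between attempts."""
    for _ in range(max_retries):
        prompt = f"You are a helpful agent. Task: {task}. Think step by step."
        output = call_model(prompt)   # one stateless forward pass
        if output:                    # "hope it reasoned correctly"
            return output
    raise RuntimeError("agent gave no usable output")
```

Note what is missing: no cache, no intermediate structure, no state that survives the call. Each invocation pays the full cost of reconstructing the reasoning.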


We are replaying cognition, not building it

Every time your agent runs, it:

  • Reconstructs context
  • Rebuilds reasoning
  • Re-derives intermediate steps

There is no reuse of cognition.

No structure.

No persistence.

No abstraction layer.

Just prompts.

We are not building systems. We are replaying thoughts.


Why prompt engineering feels like it works (until it doesn’t)

Prompt engineering gives the illusion of control:

  • Add more instructions
  • Add more examples
  • Add more constraints

And yes — performance improves.

Until it plateaus.

Because all of that lives inside a single forward pass.

No memory of reasoning.

No composability.

No reuse.

It’s like trying to fix software architecture by writing better comments.


The real problem is architectural

The core issue is simple:

We are using LLMs as stateless reasoning engines.

And then compensating for that with increasingly complex prompts.

Instead of:

  • modeling cognition
  • structuring reasoning
  • reusing intermediate steps

We regenerate everything every time.

That doesn’t scale.

Not in cost.

Not in latency.

Not in reliability.


What’s actually missing

What’s missing is not a better prompt.

It’s a runtime layer that:

  • Encodes reusable cognitive steps
  • Separates reasoning into structured components
  • Allows composition instead of regeneration

In other words:

A system that reuses cognition instead of recomputing it.
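As a toy illustration of what "reuse instead of recompute" means operationally (this is an assumption-laden sketch, not any particular framework's API), imagine a runtime that registers named skills and caches their structured results:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class SkillRuntime:
    """Toy runtime: skills are named, composable steps whose structured
    results are cached and reused instead of being regenerated each run."""
    skills: dict[str, Callable[[dict], dict]] = field(default_factory=dict)
    cache: dict[tuple, dict] = field(default_factory=dict)

    def register(self, name: str, fn: Callable[[dict], dict]) -> None:
        self.skills[name] = fn

    def run(self, name: str, inputs: dict) -> dict:
        key = (name, tuple(sorted(inputs.items())))
        if key in self.cache:                # reuse prior cognition
            return self.cache[key]
        result = self.skills[name](inputs)   # explicit, inspectable execution
        self.cache[key] = result
        return result
```

The second call with the same inputs never touches the model again; the intermediate result is a first-class, reusable artifact rather than text trapped inside a forward pass.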


From prompts to skills

Instead of:

Prompt → Model → Output

You need:

Skill → Execution → Structured Output

Not conceptually. Operationally.

This is exactly what ORCA implements: a runtime layer where “skills” are reusable cognitive units — not prompts.

Defined inputs.

Structured outputs.

Explicit execution.

No recomputation. No guesswork.
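A skill unit with those three properties can be sketched in a few lines. To be clear, this is an illustrative shape, not ORCA's actual API; see the linked repo for the real implementation:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass(frozen=True)
class Skill:
    """Illustrative skill unit: declared inputs, one explicit execution
    function, and a fixed set of structured output keys."""
    name: str
    inputs: tuple[str, ...]    # defined inputs
    outputs: tuple[str, ...]   # structured outputs
    execute: Callable[[dict], dict]

    def run(self, payload: dict) -> dict:
        missing = [k for k in self.inputs if k not in payload]
        if missing:
            raise ValueError(f"{self.name}: missing inputs {missing}")
        result = self.execute(payload)                 # explicit execution
        return {k: result[k] for k in self.outputs}    # no free-form text
```

The contract is the point: callers know exactly what goes in and what comes out, so skills can be composed and cached rather than re-prompted.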


Most “agent frameworks” today?

  • prompt orchestration
  • tool wrappers
  • retry loops with nicer formatting

They don’t model cognition.

They orchestrate prompts.

That’s not a runtime.


The shift is not better prompting.

It’s architectural.

From:

  • stateless generation

To:

  • structured, reusable cognition

That’s the gap ORCA is designed to close.


Prompt engineering isn’t useless.

It’s just solving the wrong problem.

We’ve been optimizing the interface instead of the system.

And it shows.


If you’ve pushed prompt engineering far enough, you’ve seen the limit.

The question is:

are you ready to try what will replace it?
