<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Guillermo Fernandez</title>
    <description>The latest articles on DEV Community by Guillermo Fernandez (@gfernandf).</description>
    <link>https://dev.to/gfernandf</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3852309%2Fe5a9cc1b-ef75-4815-8799-7d702b087ed6.png</url>
      <title>DEV Community: Guillermo Fernandez</title>
      <link>https://dev.to/gfernandf</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gfernandf"/>
    <language>en</language>
    <item>
      <title>Stop Blaming Your Prompts. It’s the Architecture, Stup1d!</title>
      <dc:creator>Guillermo Fernandez</dc:creator>
      <pubDate>Tue, 14 Apr 2026 18:24:28 +0000</pubDate>
      <link>https://dev.to/gfernandf/stop-blaming-your-prompts-its-the-architecture-stupid-1c5g</link>
      <guid>https://dev.to/gfernandf/stop-blaming-your-prompts-its-the-architecture-stupid-1c5g</guid>
      <description>&lt;h1&gt;
  
  
  Stop Blaming Your Prompt. It’s the Architecture, Stup1d.
&lt;/h1&gt;

&lt;p&gt;📄 Paper: &lt;a href="https://zenodo.org/records/19438943" rel="noopener noreferrer"&gt;https://zenodo.org/records/19438943&lt;/a&gt;&lt;br&gt;&lt;br&gt;
💻 Code: &lt;a href="https://github.com/gfernandf/agent-skills" rel="noopener noreferrer"&gt;https://github.com/gfernandf/agent-skills&lt;/a&gt;  &lt;/p&gt;




&lt;p&gt;We keep pretending that better prompts will fix LLM agents.&lt;/p&gt;

&lt;p&gt;They won’t.&lt;/p&gt;

&lt;p&gt;We’ve built an entire layer of tooling, courses, and “best practices” around prompt engineering — as if the problem were linguistic.&lt;/p&gt;

&lt;p&gt;It’s not.&lt;/p&gt;

&lt;p&gt;It’s architectural.&lt;/p&gt;




&lt;h2&gt;
  
  
  The uncomfortable truth
&lt;/h2&gt;

&lt;p&gt;Let’s be honest about what most agent systems are doing today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Take a task
&lt;/li&gt;
&lt;li&gt;Generate a prompt
&lt;/li&gt;
&lt;li&gt;Call the model
&lt;/li&gt;
&lt;li&gt;Hope it “reasons” correctly
&lt;/li&gt;
&lt;li&gt;Repeat
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a system.&lt;/p&gt;

&lt;p&gt;This is recomputation disguised as intelligence.&lt;/p&gt;




&lt;h2&gt;
  
  
  We are replaying cognition, not building it
&lt;/h2&gt;

&lt;p&gt;Every time your agent runs, it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reconstructs context
&lt;/li&gt;
&lt;li&gt;Rebuilds reasoning
&lt;/li&gt;
&lt;li&gt;Re-derives intermediate steps
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is no reuse of cognition.&lt;/p&gt;

&lt;p&gt;No structure.&lt;br&gt;&lt;br&gt;
No persistence.&lt;br&gt;&lt;br&gt;
No abstraction layer.&lt;/p&gt;

&lt;p&gt;Just prompts.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;We are not building systems. We are replaying thoughts.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Why prompt engineering feels like it works (until it doesn’t)
&lt;/h2&gt;

&lt;p&gt;Prompt engineering gives the illusion of control:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add more instructions
&lt;/li&gt;
&lt;li&gt;Add more examples
&lt;/li&gt;
&lt;li&gt;Add more constraints
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And yes — performance improves.&lt;/p&gt;

&lt;p&gt;Until it plateaus.&lt;/p&gt;

&lt;p&gt;Because all of that lives inside a single forward pass.&lt;/p&gt;

&lt;p&gt;No memory of reasoning.&lt;br&gt;&lt;br&gt;
No composability.&lt;br&gt;&lt;br&gt;
No reuse.&lt;/p&gt;

&lt;p&gt;It’s like trying to fix software architecture by writing better comments.&lt;/p&gt;




&lt;h2&gt;
  
  
  The real problem is architectural
&lt;/h2&gt;

&lt;p&gt;The core issue is simple:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;We are using LLMs as stateless reasoning engines.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And then compensating for that with increasingly complex prompts.&lt;/p&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;modeling cognition
&lt;/li&gt;
&lt;li&gt;structuring reasoning
&lt;/li&gt;
&lt;li&gt;reusing intermediate steps
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We regenerate everything every time.&lt;/p&gt;

&lt;p&gt;That doesn’t scale.&lt;/p&gt;

&lt;p&gt;Not in cost.&lt;br&gt;&lt;br&gt;
Not in latency.&lt;br&gt;&lt;br&gt;
Not in reliability.&lt;/p&gt;




&lt;h2&gt;
  
  
  What’s actually missing
&lt;/h2&gt;

&lt;p&gt;What’s missing is not a better prompt.&lt;/p&gt;

&lt;p&gt;It’s a runtime layer that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Encodes reusable cognitive steps
&lt;/li&gt;
&lt;li&gt;Separates reasoning into structured components
&lt;/li&gt;
&lt;li&gt;Allows composition instead of regeneration
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;A system that reuses cognition instead of recomputing it.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  From prompts to skills
&lt;/h2&gt;

&lt;p&gt;Instead of:&lt;/p&gt;

&lt;p&gt;→ Prompt → Model → Output  &lt;/p&gt;

&lt;p&gt;You need:&lt;/p&gt;

&lt;p&gt;→ Skill → Execution → Structured Output  &lt;/p&gt;

&lt;p&gt;Not conceptually. Operationally.&lt;/p&gt;

&lt;p&gt;This is exactly what ORCA implements: a runtime layer where “skills” are reusable cognitive units — not prompts.&lt;/p&gt;

&lt;p&gt;Defined inputs.&lt;br&gt;&lt;br&gt;
Structured outputs.&lt;br&gt;&lt;br&gt;
Explicit execution.&lt;/p&gt;

&lt;p&gt;No recomputation. No guesswork.&lt;/p&gt;
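&lt;p&gt;As a minimal sketch of the idea (illustrative only; these names are not ORCA’s actual API), a skill can be modeled as a declared contract plus an explicit execution step:&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable

# Illustrative sketch, not ORCA's real API: a "skill" is a declared
# contract (inputs/outputs) plus an explicit execution step, so the
# same cognitive unit can be reused instead of re-prompted each run.
@dataclass
class Skill:
    name: str
    inputs: list[str]
    outputs: list[str]
    run: Callable[[dict], dict]

    def execute(self, payload: dict) -> dict:
        missing = [k for k in self.inputs if k not in payload]
        if missing:
            raise ValueError(f"missing inputs: {missing}")
        result = self.run(payload)
        # Structured output: only the declared fields leave the skill.
        return {k: result[k] for k in self.outputs}

# A deterministic chunking step, reusable across runs.
chunk = Skill(
    name="doc.content.chunk",
    inputs=["content", "max_len"],
    outputs=["chunks"],
    run=lambda p: {"chunks": [p["content"][i:i + p["max_len"]]
                              for i in range(0, len(p["content"]), p["max_len"])]},
)

print(chunk.execute({"content": "abcdefgh", "max_len": 3}))
# {'chunks': ['abc', 'def', 'gh']}
```

&lt;p&gt;The point is not the Python itself: inputs, outputs, and execution are explicit, so the step can be traced, tested, and composed.&lt;/p&gt;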




&lt;p&gt;Most “agent frameworks” today?&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;prompt orchestration
&lt;/li&gt;
&lt;li&gt;tool wrappers
&lt;/li&gt;
&lt;li&gt;retry loops with nicer formatting
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;They don’t model cognition.&lt;/p&gt;

&lt;p&gt;They orchestrate prompts.&lt;/p&gt;

&lt;p&gt;That’s not a runtime.&lt;/p&gt;




&lt;p&gt;The shift is not better prompting.&lt;/p&gt;

&lt;p&gt;It’s architectural.&lt;/p&gt;

&lt;p&gt;From:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;stateless generation
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;structured, reusable cognition
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That’s the gap ORCA is designed to close.&lt;/p&gt;




&lt;p&gt;Prompt engineering isn’t useless.&lt;/p&gt;

&lt;p&gt;It’s just solving the wrong problem.&lt;/p&gt;

&lt;p&gt;We’ve been optimizing the interface instead of the system.&lt;/p&gt;

&lt;p&gt;And it shows.&lt;/p&gt;




&lt;p&gt;If you’ve pushed prompt engineering far enough, you’ve seen the limit.&lt;/p&gt;

&lt;p&gt;The question is:&lt;/p&gt;

&lt;p&gt;are you ready to try what will replace it?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>agents</category>
      <category>llm</category>
    </item>
    <item>
      <title>Why Prompt-Based Agents Don’t Scale (and What We’re Trying Instead)</title>
      <dc:creator>Guillermo Fernandez</dc:creator>
      <pubDate>Wed, 08 Apr 2026 12:08:36 +0000</pubDate>
      <link>https://dev.to/gfernandf/why-prompt-based-agents-dont-scale-and-what-were-trying-instead-3cc4</link>
      <guid>https://dev.to/gfernandf/why-prompt-based-agents-dont-scale-and-what-were-trying-instead-3cc4</guid>
      <description>&lt;h1&gt;
  
  
  Why Prompt-Based Agents Don’t Scale (and What We’re Trying Instead)
&lt;/h1&gt;

&lt;p&gt;Most agent systems today are, at their core, prompt pipelines.&lt;/p&gt;

&lt;p&gt;We chain prompts, add tools, inject memory, and hope that the system behaves consistently. This works surprisingly well for simple cases — but starts to break down as complexity increases.&lt;/p&gt;

&lt;p&gt;After experimenting with different approaches, I’ve been exploring an alternative: introducing a &lt;strong&gt;cognitive runtime layer&lt;/strong&gt; between the agent and the tools.&lt;/p&gt;

&lt;p&gt;I call this approach &lt;strong&gt;ORCA&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem with Prompt Pipelines
&lt;/h2&gt;

&lt;p&gt;In most current designs, a single layer (the prompt) is responsible for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;deciding what to do
&lt;/li&gt;
&lt;li&gt;selecting tools
&lt;/li&gt;
&lt;li&gt;executing actions
&lt;/li&gt;
&lt;li&gt;interpreting results
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This creates a few issues:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;low observability&lt;/strong&gt; — hard to understand what the agent is doing
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;poor composability&lt;/strong&gt; — workflows don’t reuse well
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;fragility&lt;/strong&gt; — small prompt changes can break behavior
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;implicit execution&lt;/strong&gt; — logic is buried in text
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  A Different Approach: A Cognitive Runtime Layer
&lt;/h2&gt;

&lt;p&gt;Instead of encoding everything in prompts, ORCA separates concerns explicitly:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Capabilities
&lt;/h3&gt;

&lt;p&gt;Atomic cognitive operations such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;retrieve&lt;/li&gt;
&lt;li&gt;transform&lt;/li&gt;
&lt;li&gt;evaluate&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are the building blocks of reasoning.&lt;/p&gt;
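&lt;p&gt;To make this concrete, a capability in ORCA is declared as a small YAML contract. This sketch mirrors the doc.content.chunk example from the agent-skills repository:&lt;/p&gt;

```yaml
# Declarative capability contract (adapted from the agent-skills examples):
# typed inputs and outputs, no prompt text anywhere.
id: doc.content.chunk
description: Split a document into semantic chunks
input:
  content: { type: string }
  max_tokens: { type: integer, default: 512 }
output:
  chunks: { type: array, items: { type: string } }
```

&lt;p&gt;The same contract can then be bound to a deterministic Python baseline or to an LLM backend without changing the interface.&lt;/p&gt;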




&lt;h3&gt;
  
  
  2. Skills
&lt;/h3&gt;

&lt;p&gt;Composable workflows built from capabilities.&lt;/p&gt;

&lt;p&gt;Think of them as structured procedures rather than prompt chains.&lt;/p&gt;
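&lt;p&gt;A small example, borrowed from the document.summarize skill in the agent-skills examples: a skill lists its steps and their dependencies, and the runtime derives the execution DAG from them:&lt;/p&gt;

```yaml
# A skill = capabilities wired into a DAG; depends_on encodes the edges.
id: document.summarize
steps:
  - id: chunk
    capability: doc.content.chunk
  - id: summarize
    capability: text.body.summarize
    depends_on: [chunk]
  - id: compile
    capability: text.report.compile
    depends_on: [summarize]
```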




&lt;h3&gt;
  
  
  3. Execution Model
&lt;/h3&gt;

&lt;p&gt;Execution is explicit and structured, not hidden inside prompts.&lt;/p&gt;

&lt;p&gt;This allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;tracing&lt;/li&gt;
&lt;li&gt;validation&lt;/li&gt;
&lt;li&gt;control over intermediate steps&lt;/li&gt;
&lt;/ul&gt;
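&lt;p&gt;A minimal sketch of why explicit execution buys observability (illustrative code, not ORCA’s API): when a runner drives each step, it can record a trace entry and expose every intermediate result:&lt;/p&gt;

```python
# Illustrative sketch, not ORCA's actual API: an explicit runner records
# a trace entry per step, so intermediate reasoning is observable and
# checkable instead of being hidden inside a single prompt.
def run_steps(steps, payload):
    state = dict(payload)
    trace = []
    for name, fn in steps:
        out = fn(state)
        trace.append({"step": name, "output": out})  # observability
        state.update(out)                            # control over intermediates
    return state, trace

steps = [
    ("chunk", lambda s: {"chunks": s["text"].split(". ")}),
    ("count", lambda s: {"n_chunks": len(s["chunks"])}),
]

state, trace = run_steps(steps, {"text": "First part. Second part"})
print([t["step"] for t in trace])  # ['chunk', 'count']
print(state["n_chunks"])          # 2
```

&lt;p&gt;Each entry in the trace can be validated or gated before the next step runs, which is exactly what a prompt pipeline cannot offer.&lt;/p&gt;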




&lt;h3&gt;
  
  
  4. Agent Orchestration
&lt;/h3&gt;

&lt;p&gt;The agent is still responsible for decision-making, but it delegates execution to the runtime layer.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why This Might Matter
&lt;/h2&gt;

&lt;p&gt;The hypothesis behind ORCA is that separating:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cognition
&lt;/li&gt;
&lt;li&gt;execution
&lt;/li&gt;
&lt;li&gt;orchestration
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;can improve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;composability&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;observability&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;control over execution&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, moving from:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;prompt-driven behavior  &lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;structured cognitive execution  &lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  Open Source + Paper
&lt;/h2&gt;

&lt;p&gt;I’ve implemented a first version of this idea:&lt;/p&gt;

&lt;p&gt;👉 GitHub:&lt;br&gt;
&lt;a href="https://github.com/gfernandf/agent-skills" rel="noopener noreferrer"&gt;https://github.com/gfernandf/agent-skills&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And documented the architecture and design principles in a paper:&lt;/p&gt;

&lt;p&gt;👉 Paper (DOI):&lt;br&gt;
&lt;a href="https://doi.org/10.5281/zenodo.19438943" rel="noopener noreferrer"&gt;https://doi.org/10.5281/zenodo.19438943&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The paper is also being submitted to arXiv.&lt;/p&gt;




&lt;h2&gt;
  
  
  Open Questions
&lt;/h2&gt;

&lt;p&gt;This is still exploratory, and I’d be very interested in feedback on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;how granular capabilities should be before overhead dominates
&lt;/li&gt;
&lt;li&gt;whether declarative execution models can realistically replace prompt pipelines
&lt;/li&gt;
&lt;li&gt;where this approach would break in real-world systems
&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Closing Thought
&lt;/h2&gt;

&lt;p&gt;We’ve made huge progress treating LLMs as reasoning engines.&lt;/p&gt;

&lt;p&gt;But most current agent systems still rely on &lt;strong&gt;unstructured execution layers&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;If agents are going to scale, we may need to start treating execution as a first-class concern — not something embedded in prompts.&lt;/p&gt;




&lt;p&gt;Happy to discuss ideas, trade-offs, or real-world use cases.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>architecture</category>
      <category>llm</category>
    </item>
    <item>
      <title>Introducing ORCA: executable skills and capabilities for AI agent workflows</title>
      <dc:creator>Guillermo Fernandez</dc:creator>
      <pubDate>Mon, 30 Mar 2026 20:55:01 +0000</pubDate>
      <link>https://dev.to/gfernandf/introducing-orca-executable-skills-and-capabilities-for-ai-agent-workflows-120o</link>
      <guid>https://dev.to/gfernandf/introducing-orca-executable-skills-and-capabilities-for-ai-agent-workflows-120o</guid>
      <description>&lt;h2&gt;
  
  
  The problem
&lt;/h2&gt;

&lt;p&gt;Most agent frameworks are great at orchestrating LLM calls. But when it comes to deterministic operations — parsing documents, validating schemas, deduplicating records — you end up writing custom glue code every time.&lt;/p&gt;

&lt;p&gt;I wanted a framework where agents could actually &lt;strong&gt;execute&lt;/strong&gt;, not just plan.&lt;/p&gt;

&lt;h2&gt;
  
  
  What is ORCA?
&lt;/h2&gt;

&lt;p&gt;ORCA (Open Runtime for Capable Agents) is an open-source Python framework that treats agent actions as &lt;strong&gt;capabilities&lt;/strong&gt; — declarative YAML contracts that can be wired to any backend: pure Python, OpenAI, or external APIs.&lt;/p&gt;

&lt;p&gt;You compose capabilities into &lt;strong&gt;skills&lt;/strong&gt; (multi-step workflows defined as DAGs), and the runtime handles scheduling, policy enforcement, and state tracking.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's included
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;122 capabilities&lt;/strong&gt; with deterministic Python baselines — no API keys needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;36 ready-to-use skills&lt;/strong&gt; composed from those capabilities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;DAG scheduler&lt;/strong&gt; with policy gates and cognitive state tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaffold wizard&lt;/strong&gt; to build new skills in minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-bindings&lt;/strong&gt; for OpenAI and PythonCall out of the box&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adapters&lt;/strong&gt; for LangChain, CrewAI, and Semantic Kernel&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCP server&lt;/strong&gt; support&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Quick example
&lt;/h2&gt;

&lt;p&gt;A capability is a YAML contract:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;id: doc.content.chunk
description: Split a document into semantic chunks
input:
  content: { type: string }
  max_tokens: { type: integer, default: 512 }
output:
  chunks: { type: array, items: { type: string } }
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A skill composes multiple capabilities as a DAG:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;id: document.summarize
steps:
  - id: chunk
    capability: doc.content.chunk
  - id: summarize
    capability: text.body.summarize
    depends_on: [chunk]
  - id: compile
    capability: text.report.compile
    depends_on: [summarize]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The scheduler resolves dependencies, runs steps in parallel when possible, and enforces policies at each gate.&lt;/p&gt;

&lt;h2&gt;Why not just use LangChain / CrewAI directly?&lt;/h2&gt;

&lt;p&gt;You can — ORCA has adapters for both. The difference is in the abstraction layer:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Traditional frameworks&lt;/th&gt;
&lt;th&gt;ORCA&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Actions&lt;/td&gt;
&lt;td&gt;Code functions&lt;/td&gt;
&lt;td&gt;Declarative YAML capabilities&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Composition&lt;/td&gt;
&lt;td&gt;Imperative chains&lt;/td&gt;
&lt;td&gt;DAG-based skills&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Execution&lt;/td&gt;
&lt;td&gt;LLM-dependent&lt;/td&gt;
&lt;td&gt;Deterministic baselines + LLM fallback&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Portability&lt;/td&gt;
&lt;td&gt;Framework-locked&lt;/td&gt;
&lt;td&gt;Bind to any backend&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;ORCA doesn't replace your orchestration framework — it gives your agents a portable, testable skill layer underneath.&lt;/p&gt;

&lt;h2&gt;Getting started&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;pip install agent-skills
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from agent_skills import Registry

registry = Registry()
result = registry.execute("doc.content.chunk", {
    "content": "Your document text here...",
    "max_tokens": 256
})
print(result["chunks"])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2&gt;Links&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;GitHub: &lt;a href="https://github.com/gfernandf/agent-skills" rel="noopener noreferrer"&gt;github.com/gfernandf/agent-skills&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Documentation: &lt;a href="https://gfernandf.github.io/agent-skills" rel="noopener noreferrer"&gt;gfernandf.github.io/agent-skills&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Registry: &lt;a href="https://github.com/gfernandf/agent-skill-registry" rel="noopener noreferrer"&gt;github.com/gfernandf/agent-skill-registry&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Feedback welcome&lt;/h2&gt;

&lt;p&gt;The project is early — just made the repos public this week. I'd appreciate feedback on the design, missing capabilities, or anything that feels rough.&lt;/p&gt;

&lt;p&gt;Drop a comment here or open an issue on GitHub — happy to discuss.&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>python</category>
      <category>ai</category>
      <category>agents</category>
    </item>
  </channel>
</rss>
