<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Santi Santamaría Medel</title>
    <description>The latest articles on DEV Community by Santi Santamaría Medel (@oldskultxo).</description>
    <link>https://dev.to/oldskultxo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3808761%2Fb2181edd-bc65-4118-98ec-cf4a666e7b83.jpg</url>
      <title>DEV Community: Santi Santamaría Medel</title>
      <link>https://dev.to/oldskultxo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/oldskultxo"/>
    <language>en</language>
    <item>
      <title>Coding agents don’t need more context. They need continuity.</title>
      <dc:creator>Santi Santamaría Medel</dc:creator>
      <pubDate>Sat, 09 May 2026 10:03:45 +0000</pubDate>
      <link>https://dev.to/oldskultxo/coding-agents-dont-need-more-context-they-need-continuity-m07</link>
      <guid>https://dev.to/oldskultxo/coding-agents-dont-need-more-context-they-need-continuity-m07</guid>
      <description>&lt;p&gt;I’ve been working with coding agents for quite a while now.&lt;br&gt;
I’ve been a software engineer for more than 15 years, and at first it was hard for me to accept that the rules of the game had changed forever.&lt;/p&gt;

&lt;p&gt;I’ve stopped thinking of coding agents as autocomplete. In many tasks, they can reason through codebases and produce solid implementations. But one thing still feels missing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What I still haven’t felt is that I’m working side by side with an engineer who knows the repository. Someone familiar with the project’s codebase, its strategies, its typical errors, the commands that should be run and the ones that shouldn’t.&lt;br&gt;
A veteran teammate, not a rookie who has to review the whole repo, starting from the README and the Makefile, before writing a single line of code.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;At first I thought it was all about refining prompts.&lt;/p&gt;

&lt;p&gt;Then I focused on operational memory, skills, MCPs, rules, global instructions, AGENTS.md, CLAUDE.md, and everything I kept reading over and over again in articles and posts.&lt;/p&gt;

&lt;p&gt;I also had a “context” phase. I became obsessed with improving the context my agent was working with.&lt;/p&gt;

&lt;p&gt;And yet I still had the same feeling.&lt;/p&gt;

&lt;p&gt;The more I obsessed over prompts, memory, skills, and context, the more I started to feel that what the agent was missing was &lt;strong&gt;continuity&lt;/strong&gt;.&lt;br&gt;
Something more human. Something closer to what a teammate would ask on their first day at work:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Where were we?
What did we do yesterday?
What hypotheses did we discard?
Which file mattered?
Which test was the right one?
What should I not touch?
Where do I start?
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Since I work intensively in large repositories, I kept hitting a major limitation with Codex (the agent I mainly use): every session started again from the README. It frustrated me to watch it rediscover the repo, try overly broad commands, or attempt to run huge test suites that had nothing to do with the task at hand.&lt;/p&gt;

&lt;p&gt;So I started building a tool focused on operational continuity.&lt;/p&gt;

&lt;p&gt;I called it &lt;strong&gt;AICTX&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In one sentence: &lt;strong&gt;aictx is a repo-local continuity runtime for coding agents&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The idea is that each new session behaves less like an isolated prompt and more like the same repo-native engineer continuing previous work.&lt;/p&gt;

&lt;p&gt;After many iterations, the workflow has consolidated into something like this:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;user prompt
→ agent extracts a narrow task goal
→ aictx resume gives repo-local continuity
→ agent receives an execution contract
→ agent works
→ aictx finalize stores what happened
→ next session starts from continuity, not from zero
→ the user receives feedback about continuity
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
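&lt;p&gt;To make the loop above concrete, here is a minimal Python sketch of the resume/finalize cycle. The &lt;code&gt;.aictx/handoff.json&lt;/code&gt; path, the function names, and the note format are all my own illustrative assumptions, not AICTX’s actual internals:&lt;/p&gt;

```python
import json
from pathlib import Path

# Hypothetical artifact path; the real tool stores its own repo-local files.
STATE = Path(".aictx/handoff.json")

def resume():
    """Load the previous session's handoff, or start fresh."""
    if STATE.exists():
        return json.loads(STATE.read_text())
    return {"session": 0, "notes": []}

def finalize(state, summary):
    """Persist what happened so the next session starts from continuity."""
    state["session"] += 1
    state["notes"].append(summary)
    STATE.parent.mkdir(exist_ok=True)
    STATE.write_text(json.dumps(state, indent=2))

# Session 1: no continuity yet, so resume() starts from zero.
state = resume()
finalize(state, "added BLOCKED status, touched src/taskflow/parser.py")

# Session 2: resume() now starts from the stored handoff, not from zero.
state = resume()
print(state["session"], state["notes"][-1])
```

&lt;p&gt;The point of the sketch is only the shape of the loop: each session ends by writing an auditable artifact, and each new session begins by reading it.&lt;/p&gt;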

&lt;p&gt;AICTX stores and reuses things like work state, handoffs, decisions, failure memory, strategy memory, execution summaries, RepoMap hints, execution contracts, and contract compliance signals.&lt;br&gt;&lt;br&gt;
All of them are auditable artifacts that are easy to inspect at repo level.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjgry8m4c1infcgq4nts.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjjgry8m4c1infcgq4nts.png" alt="Runtime flow diagram" width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the other hand, one of the things I like most about the tool is that I can enable portability and keep the most important continuity artifacts versioned, so I can continue the task on my personal laptop, my work laptop, or anywhere else.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The &lt;strong&gt;execution contract&lt;/strong&gt; part feels especially interesting to me. Instead of giving the agent a vague block of context, AICTX tries to give it an operational route:&lt;/p&gt;
&lt;/blockquote&gt;

&lt;ul&gt;
&lt;li&gt;first_action&lt;/li&gt;
&lt;li&gt;edit_scope&lt;/li&gt;
&lt;li&gt;test_command&lt;/li&gt;
&lt;li&gt;finalize_command&lt;/li&gt;
&lt;li&gt;contract_strength&lt;/li&gt;
&lt;/ul&gt;
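&lt;p&gt;Sketched in Python, an execution contract with those fields might look like this. The field names come from the list above; every value, and the &lt;code&gt;check_edit_allowed&lt;/code&gt; helper, are hypothetical illustrations rather than AICTX’s real schema:&lt;/p&gt;

```python
# Field names are from the article; the values are made up for illustration.
contract = {
    "first_action": "open tests/test_parser.py and re-run the failing case",
    "edit_scope": ["src/taskflow/parser.py", "tests/test_parser.py"],
    "test_command": "pytest tests/test_parser.py -q",
    "finalize_command": "aictx finalize",
    "contract_strength": "strict",  # how binding the route is (assumption)
}

def check_edit_allowed(contract, path):
    """A strict contract could reject edits outside edit_scope."""
    if contract["contract_strength"] == "strict":
        return path in contract["edit_scope"]
    return True

print(check_edit_allowed(contract, "src/taskflow/parser.py"))  # True
print(check_edit_allowed(contract, "README.md"))               # False
```

&lt;p&gt;The appeal of this shape is that the agent receives an operational route, first file to open, allowed edit surface, exact test command, rather than a vague block of context to interpret.&lt;/p&gt;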

&lt;p&gt;I wanted to check whether this actually worked, not just rely on my own impressions while watching the agent work with AICTX.&lt;/p&gt;

&lt;p&gt;One caveat before the test itself: I mainly work with Codex, so the results are most representative of Codex. With that said, I created a small Python demo repo and ran the same two-session task twice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/oldskultxo/aictx-demo-taskflow/tree/with_aictx" rel="noopener noreferrer"&gt;one branch using AICTX&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/oldskultxo/aictx-demo-taskflow/tree/without_aictx" rel="noopener noreferrer"&gt;one branch without AICTX&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The task was intentionally simple: add support for a new &lt;code&gt;BLOCKED&lt;/code&gt; status, and then continue in a second session to validate parser edge cases.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;This is important: the demo is not designed under conditions where AICTX has the maximum possible advantage. The repository is small, the task is simple, and the continuation prompt without AICTX includes enough manual context.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Even so, in the second session a clear difference appeared.&lt;br&gt;&lt;br&gt;
&lt;em&gt;(Note: all demo metrics are available &lt;a href="https://github.com/oldskultxo/aictx-demo-taskflow/tree/main/.demo_metrics" rel="noopener noreferrer"&gt;here&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Session 2
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;with_aictx&lt;/th&gt;
&lt;th&gt;without_aictx&lt;/th&gt;
&lt;th&gt;Difference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Files explored&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;-50.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Files edited&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;-66.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Commands run&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;-46.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tests run&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;-75.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Exploration steps before first edit&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;-60.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to complete&lt;/td&gt;
&lt;td&gt;72s&lt;/td&gt;
&lt;td&gt;119s&lt;/td&gt;
&lt;td&gt;-39.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total tokens&lt;/td&gt;
&lt;td&gt;208,470&lt;/td&gt;
&lt;td&gt;296,157&lt;/td&gt;
&lt;td&gt;-29.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API reference cost&lt;/td&gt;
&lt;td&gt;$0.5983&lt;/td&gt;
&lt;td&gt;$0.8789&lt;/td&gt;
&lt;td&gt;-31.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
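&lt;p&gt;For anyone double-checking the numbers, the Difference column is just the relative change of the &lt;code&gt;with_aictx&lt;/code&gt; value against the &lt;code&gt;without_aictx&lt;/code&gt; baseline:&lt;/p&gt;

```python
# Difference = (with_aictx - without_aictx) / without_aictx, as a percentage.
def diff_pct(with_aictx, without_aictx):
    return round(100 * (with_aictx - without_aictx) / without_aictx, 1)

print(diff_pct(208_470, 296_157))  # total tokens: -29.6
print(diff_pct(72, 119))           # time to complete: -39.5
print(diff_pct(5, 10))             # files explored: -50.0
```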

&lt;p&gt;The most interesting difference for me was not the tokens. It was where the agent started.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;With AICTX:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;first_relevant_file = tests/test_parser.py&lt;br&gt;
    first_edit_file     = tests/test_parser.py&lt;/code&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Without AICTX:&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;first_relevant_file = README.md&lt;br&gt;
    first_edit_file     = src/taskflow/parser.py&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With AICTX, the second session behaved more like an operational continuation.&lt;/strong&gt; &lt;br&gt;
&lt;strong&gt;Without AICTX, it behaved more like a new agent reconstructing the state of the project.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Across both sessions, the savings were more moderate:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;with_aictx&lt;/th&gt;
&lt;th&gt;without_aictx&lt;/th&gt;
&lt;th&gt;Difference&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Files explored&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;-31.6%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Commands run&lt;/td&gt;
&lt;td&gt;19&lt;/td&gt;
&lt;td&gt;26&lt;/td&gt;
&lt;td&gt;-26.9%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tests run&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;-50.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Time to complete&lt;/td&gt;
&lt;td&gt;166s&lt;/td&gt;
&lt;td&gt;222s&lt;/td&gt;
&lt;td&gt;-25.2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Total tokens&lt;/td&gt;
&lt;td&gt;455,965&lt;/td&gt;
&lt;td&gt;492,800&lt;/td&gt;
&lt;td&gt;-7.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API reference cost&lt;/td&gt;
&lt;td&gt;$1.3129&lt;/td&gt;
&lt;td&gt;$1.4591&lt;/td&gt;
&lt;td&gt;-10.0%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;Honest result: AICTX did not magically win at everything.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the first session, it had overhead. There wasn’t much accumulated continuity to reuse yet, so it doesn’t make sense to sell it as a universal token saver.&lt;/p&gt;

&lt;p&gt;There is also another important nuance: the execution without AICTX found and fixed an additional edge case related to UTF-8 BOM input. So I also wouldn’t say that AICTX produced “better code.”&lt;/p&gt;

&lt;p&gt;The honest conclusion would be this:&lt;/p&gt;

&lt;p&gt;AICTX produced a correct, more focused continuation with less repo rediscovery.&lt;br&gt;&lt;br&gt;
The execution without AICTX produced a broader solution, but it needed more exploration, more commands, more tests, and more time.&lt;/p&gt;

&lt;p&gt;For me, this fits the initial hypothesis quite well:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;AICTX is not a magical token saver.&lt;/li&gt;
&lt;li&gt;It has overhead in the first session.&lt;/li&gt;
&lt;li&gt;Its value appears when work continues across sessions.&lt;/li&gt;
&lt;li&gt;The real problem is not just “giving the model more context.”&lt;/li&gt;
&lt;li&gt;The problem is making each agent session feel less like starting from zero.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And I suspect this demo actually understates the real size of the problem. In a large repo, where the previous session left decisions, failed attempts, scope boundaries, correct test commands, and known risks, continuity should matter more.&lt;/p&gt;

&lt;p&gt;I still haven’t fully reached the feeling of continuity I’m looking for, but I’m getting closer. To push that feeling a bit further, AICTX makes the agent give operational-continuity feedback to the user: a startup banner at the beginning of each session and a summary output at the end of each execution.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusvydyo4wh26qapej2ri.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fusvydyo4wh26qapej2ri.png" alt="Feedback example of a demo session" width="800" height="329"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The tool is still alive, and I’m still scaling it while trying to solve my own pains. I’d love to receive feedback: positive things, possible improvements, issues people notice, or even PRs if anyone feels like contributing.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;If anyone wants to try it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://github.com/oldskultxo/aictx" rel="noopener noreferrer"&gt;Github repo&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://pypi.org/project/aictx/?utm_source=chatgpt.com" rel="noopener noreferrer"&gt;Pypi&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;    pipx install aictx
    aictx install
    cd repo_path
    aictx init
    # then just work with your coding agent as usual
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With AICTX, I’m not trying to replace good prompts, skills, or already established memory/context-management tools. I’m simply trying to make operational continuity easier in large code repositories that I iterate on again and again.&lt;/p&gt;

&lt;p&gt;I’d be really happy if it ends up being useful to someone along the way.&lt;/p&gt;

&lt;p&gt;If you try it, I’d love to know whether it improves your workflow, or whether it gets in the way.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>python</category>
      <category>devex</category>
    </item>
    <item>
      <title>I tried writing an interactive novel. I accidentally ended up building a platform.</title>
      <dc:creator>Santi Santamaría Medel</dc:creator>
      <pubDate>Fri, 06 Mar 2026 13:45:26 +0000</pubDate>
      <link>https://dev.to/oldskultxo/i-tried-writing-an-interactive-noveli-accidentally-ended-up-building-a-platform-34of</link>
      <guid>https://dev.to/oldskultxo/i-tried-writing-an-interactive-noveli-accidentally-ended-up-building-a-platform-34of</guid>
      <description>&lt;h4&gt;
  
  
  A few months ago I tried to write an interactive fiction novel. I accidentally ended up building a platform instead.
&lt;/h4&gt;

&lt;p&gt;I started writing, but as the story grew, I quickly realised two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;First, I’m not a great writer — and even less so when it comes to an interactive novel with all its complexity.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Second, the further I got, the harder it became to manage the structure: branches, conditions, narrative state… everything started getting messy pretty quickly.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The tools I tried didn’t really fit what I had in mind, so at some point I opened Visual Studio and tried to solve the problem myself.&lt;/p&gt;

&lt;p&gt;The idea was simple: I wanted to find a way to separate the prose from the logic that drives the story.&lt;/p&gt;

&lt;p&gt;That’s when the real experiment started.&lt;/p&gt;

&lt;p&gt;Since frontend isn’t really my main area, instead of trying to do everything myself I decided to try something different: building the project with AI agents (Codex) as development partners.&lt;/p&gt;

&lt;p&gt;What started as a small experiment quickly got out of hand. I got carried away and ended up building a small platform.&lt;/p&gt;

&lt;h2&gt;
  
  
  Working with Codex as a not-so-small dev team
&lt;/h2&gt;

&lt;p&gt;Working with Codex — and the workflow I gradually developed around it — turned out to be surprisingly effective. Instead of just asking for snippets, I started treating the AI more like a small development team: iterating on architecture, building components, debugging problems together and refining ideas step by step.&lt;/p&gt;

&lt;p&gt;This AI-assisted workflow made it possible to move surprisingly fast across several areas at once: coding, UI design and architectural decisions.&lt;br&gt;
It also became a really interesting learning experience about how to work with AI agents: improving context management, performance and model behaviour.&lt;/p&gt;

&lt;h2&gt;
  
  
  The IEPUB project
&lt;/h2&gt;

&lt;p&gt;The result of that whole process is a small ecosystem called iepub:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a structured format for interactive books&lt;/li&gt;
&lt;li&gt;a reader runtime that interprets that format&lt;/li&gt;
&lt;li&gt;and a visual editor designed for writing interactive fiction&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It had gone completely out of control…&lt;/p&gt;

&lt;p&gt;The editor tries to feel like a normal writing tool — something closer to Google Docs — but designed for interactive storytelling. It allows things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;defining narrative conditions&lt;/li&gt;
&lt;li&gt;attaching variables to sections of the story&lt;/li&gt;
&lt;li&gt;configuring dice rolls or probabilistic events&lt;/li&gt;
&lt;li&gt;creating narrative variants based on both declarative conditions and the reader’s behaviour while reading (really cool!)&lt;/li&gt;
&lt;li&gt;visualising the structure of the story as a graph&lt;/li&gt;
&lt;li&gt;importing and transforming content from the most widespread formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If anyone is curious about the experiment — both the project itself and the AI-assisted development workflow — you can take a look at the article I published on Medium:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://medium.com/@santi.santamaria.medel/interactive-fiction-platform-codex-ai-093358665827" rel="noopener noreferrer"&gt;https://medium.com/@santi.santamaria.medel/interactive-fiction-platform-codex-ai-093358665827&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;And if you just want to explore the project itself, you can do that here:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://iepub.io" rel="noopener noreferrer"&gt;https://iepub.io&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’d also love to hear how others are using AI in their development workflows, and to learn from them! &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The project is alive and keeps evolving, so all feedback is good feedback! &lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>showdev</category>
      <category>sideprojects</category>
      <category>softwaredevelopment</category>
      <category>writing</category>
    </item>
  </channel>
</rss>
