<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dirk Mattig</title>
    <description>The latest articles on DEV Community by Dirk Mattig (@newadventuresinit).</description>
    <link>https://dev.to/newadventuresinit</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3937640%2F60dcd699-87e8-4614-957f-60a6ea2af6ac.png</url>
      <title>DEV Community: Dirk Mattig</title>
      <link>https://dev.to/newadventuresinit</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/newadventuresinit"/>
    <language>en</language>
    <item>
      <title>TaskTrack — A Specify Spec for Agent Task Management</title>
      <dc:creator>Dirk Mattig</dc:creator>
      <pubDate>Fri, 05 Jun 2026 06:45:44 +0000</pubDate>
      <link>https://dev.to/newadventuresinit/tasktrack-a-specify-spec-for-agent-task-management-2bhh</link>
      <guid>https://dev.to/newadventuresinit/tasktrack-a-specify-spec-for-agent-task-management-2bhh</guid>
      <description>&lt;p&gt;It is time to put my proposition made in my &lt;a href="https://dev.to/newadventuresinit/speccing-is-the-new-coding-493g"&gt;previous blog post&lt;/a&gt; to the test. Is it possible to spec an application for execution by an agent without encoding it in source? Let's find out.&lt;/p&gt;

&lt;p&gt;One type of application every knowledge worker is familiar with is task management. Every task has a lifecycle status, dependencies on other tasks, and a history of progress.&lt;/p&gt;

&lt;p&gt;Let's give agents their own.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://github.com/newadventuresinit/tasktrack" rel="noopener noreferrer"&gt;&lt;strong&gt;TaskTrack&lt;/strong&gt;&lt;/a&gt; is a simple but non-trivial task management system variant implemented as a &lt;a href="https://newadventuresinit.github.io/specpack/specify/" rel="noopener noreferrer"&gt;Specify&lt;/a&gt; spec. It goes beyond checkbox-based to-do lists that agents sometimes use internally and mimics the key system features listed above.&lt;/p&gt;

&lt;p&gt;TaskTrack defines two procedures: a "Plan Authoring Run" to create an interconnected set of tasks from requirements and a "Plan Execution Run" to advance a previously authored plan toward completion. One execution run might not always be enough to achieve completion, because TaskTrack allows requesting human feedback and incorporating it during the next execution run. Furthermore, every execution run is divided into "Task Processing Run" sub-procedures to allow for advanced agent context management.&lt;/p&gt;

&lt;p&gt;TaskTrack implements all of this in fewer than 300 lines of text. If the implementation used source code, then, depending on the programming language, this would be enough space to implement only the required file I/O operations (TaskTrack uses files for simplicity, not a database). Natural language can easily become quite bloated, but a stringent, scientific writing style and extensive use of what the Specify standard offers can effectively counter that.&lt;/p&gt;

&lt;p&gt;The official test is, how could it be any other way, the implementation of yet another uninspired Breakout clone. The requirements, the completed TaskTrack plan, and the deliverable are contained in the repository.&lt;/p&gt;

&lt;p&gt;If you want to run the test yourself, the included README file contains the necessary information, including the launch prompts for both the authoring agent and the execution agent. Please note how both launch prompts are structured. They use TaskTrack terminology and point to the relevant files. They do not contain task-related behavioral instructions. The execution agent launch prompt contains agent-specific instructions for mapping agent features to the generic TaskTrack specification. The principles behind the good old manual coding design patterns remain valid even in the agentic era!&lt;/p&gt;

&lt;p&gt;And now, finally, for the test result. In a nutshell: It works!&lt;/p&gt;

&lt;p&gt;The authoring agent created all TaskTrack files as indicated, which is, maybe, less surprising or impressive. More importantly, the execution agent showed deterministic behavior over all 16 tasks and two execution runs. I often hear that deterministic behavior must remain encoded in source due to the inherently random, and hence non-deterministic, nature of LLMs. I cannot confirm this based on the test result. The execution agent followed the step-by-step procedure definition by the book each and every time. Even the defined textual output was created as reliably and repeatably as if it were produced by a &lt;code&gt;print&lt;/code&gt; statement.&lt;/p&gt;

&lt;p&gt;It goes without saying that this single test result does not deliver a general proof of the viability of speccing. It shows it can work; it is possible. Maybe non-deterministic agent behavior is more often than not the result of unspecific instructions rather than randomness in the underlying LLM.&lt;/p&gt;

&lt;p&gt;Having said all this, the test run was far from perfect. It produced several so-called valuable learning experiences.&lt;/p&gt;

&lt;p&gt;The first and most obvious finding is that all but one of the timestamps are incorrect. The authoring agent wrote and executed a Python script to retrieve the current UTC time. All task processing subagents simply invented timestamps. When I later asked the system about this difference in behavior, it gave an interesting answer: Creating a new timestamp is a "single, salient, one-off step... worth a real python/date call." Updating timestamp fields is "a repeated, mechanical step... every task, every run, in fresh subagent contexts," and that "models systematically deprioritize repeated boilerplate."&lt;/p&gt;
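
&lt;p&gt;The script the authoring agent wrote is not reproduced in this post, so the following is only a minimal sketch of what such a "real python/date call" could look like, assuming nothing beyond Python's standard library. The function name &lt;code&gt;current_utc_timestamp&lt;/code&gt; and the choice of an ISO 8601 string with second precision are assumptions made for illustration, not something the TaskTrack specification is known to prescribe.&lt;/p&gt;

```python
from datetime import datetime, timezone


def current_utc_timestamp():
    """Return the current UTC time as an ISO 8601 string (hypothetical helper)."""
    # Ask for UTC explicitly instead of relying on the machine's local time zone.
    now = datetime.now(timezone.utc)
    # Drop microseconds so the value stays short and easy to compare.
    return now.replace(microsecond=0).isoformat()


if __name__ == "__main__":
    print(current_utc_timestamp())
```

&lt;p&gt;A subagent that runs a call like this before every timestamp update reads the value from the system clock rather than inventing it, which is the difference in behavior described above.&lt;/p&gt;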

&lt;p&gt;This is not a TaskTrack issue but rather the result of an ill-equipped agent. And it is at this point that, no matter how hard I try, I cannot stop myself from making the tongue-in-cheek remark that the machines that are feared to first fire and then nuke us apparently have no built-in access to the current time... I will keep this in mind, just in case.&lt;/p&gt;

&lt;p&gt;The second finding is that, as the agent itself remarked when reviewing the test results, task resolutions are not necessarily as brief as mandated by the TaskTrack specification. But then, what is brief? Precisely. This is the kind of hastily written, hand-wavy instruction that is open to interpretation and leads to varying results. Just because we are using natural language now does not mean we are allowed to let our rigor slip.&lt;/p&gt;

&lt;p&gt;Luckily, it is not a major pain point, since it only affects the resolution, not the core processing logic. Still, it is worth fixing in a future publication.&lt;/p&gt;

&lt;p&gt;The third finding is that, strictly speaking, the test run was flawed because these wonderful machines now have memory. Both the authoring and execution agents revealed in their thinking output that they were aware that this was a test. I do not think this flaw invalidates the qualitative test result as such. Still, future test setups will require more care and consideration.&lt;/p&gt;

&lt;p&gt;In the meantime, the TaskTrack specification is live, the license is permissive, and the floor is open. Have a look around, and let me know what you think.&lt;/p&gt;

</description>
      <category>agents</category>
      <category>ai</category>
      <category>productivity</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Speccing Is the New Coding</title>
      <dc:creator>Dirk Mattig</dc:creator>
      <pubDate>Mon, 25 May 2026 10:48:20 +0000</pubDate>
      <link>https://dev.to/newadventuresinit/speccing-is-the-new-coding-493g</link>
      <guid>https://dev.to/newadventuresinit/speccing-is-the-new-coding-493g</guid>
      <description>&lt;p&gt;What do we still need source code for?&lt;/p&gt;

&lt;p&gt;It is an odd question to ask after spending a lifetime writing it, but it is the one that keeps pulling at my sleeve. Let me work backwards to explain why.&lt;/p&gt;

&lt;p&gt;The first computers were one-trick ponies. Their behavior was baked into their wiring — change the task, change the machine. Useful, expensive, inflexible. Then the elegant idea emerged that part of the data a machine processed could also control how it processed the rest, and the stored-program computer was born. Hardware became a stage; software became the play.&lt;/p&gt;

&lt;p&gt;And with software came developers — a new profession whose first and hardest job was, and still is, to &lt;em&gt;understand&lt;/em&gt; a process well enough that they could, in principle, perform it themselves. The encoding into source code was always the second step. We did it because humans, however well they understand a process, cannot match a machine for speed or reliability — and have an inconvenient need for sleep.&lt;/p&gt;

&lt;p&gt;For decades that was the deal. A business owner understood a process; a developer understood the business owner; the source code was the byproduct of that understanding, painstakingly translated through several meetings, languages, frameworks, and rather more meetings on the way to silicon.&lt;/p&gt;

&lt;p&gt;That deal has changed. The entity we now describe processes to &lt;em&gt;is already the machine.&lt;/em&gt; The author and the audience have merged. So the question writes itself: if the agent already understands what we want, why do we still ask it to produce thousands of lines of source code that we, in turn, will mostly never read?&lt;/p&gt;

&lt;p&gt;The short-term answers are perfectly good. Executing compiled code is cheaper and faster than burning tokens. The entire existing body of software — every library, every API, every running system — is encoded in source. That body of work is not going anywhere quickly — not in a year, not in a decade, probably not in two.&lt;/p&gt;

&lt;p&gt;But mid-term, I think the answer changes. Our industry has a stubborn habit of making things cheaper and faster, fast. The obstacles ahead are real, but they are the kind of constraints we have spent decades learning to engineer around. Once the economics flip, the cleanest representation of an application is no longer a tree of source files written for one runtime — it is a single document, in prose, describing the logic and behavior the application is supposed to exhibit. Read directly. Understood directly. Acted on directly.&lt;/p&gt;

&lt;p&gt;That is what I mean by &lt;em&gt;speccing is the new coding.&lt;/em&gt; And it is the reason I have just published &lt;a href="https://newadventuresinit.github.io/specpack/" rel="noopener noreferrer"&gt;&lt;strong&gt;SpecPack&lt;/strong&gt;&lt;/a&gt; — three small reference standards meant as an experimental foundation for that future. None of them are rocket science. I think of them as a bit of housekeeping for a fresh start.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://newadventuresinit.github.io/specpack/minimark/" rel="noopener noreferrer"&gt;&lt;strong&gt;MiniMark&lt;/strong&gt;&lt;/a&gt; takes Markdown and removes its optionality. Humans thrive on optionality — it is how we express our individuality and our taste. Machines do not need it, and tend to find it actively confusing. MiniMark keeps the syntax humans already know and strips the redundant ways of saying the same thing.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://newadventuresinit.github.io/specpack/river/" rel="noopener noreferrer"&gt;&lt;strong&gt;riVer&lt;/strong&gt;&lt;/a&gt; is a versioning scheme for textual content. It assumes a world in which sophisticated version-control systems like git are no longer required, or simply not present in the agentic environment. Once an application is a single document, a long-standing anti-pattern turns out to make sense again: putting the version number &lt;em&gt;inside&lt;/em&gt; the document and hence into the agent's context. What that version should indicate, on the other hand, is a question we get to ask from scratch. Agents do not consult a semantic version to decide whether a change is breaking — they read the spec and find out. What they need is an integer to mark the iteration, a status to mark the lifecycle stage, and a timestamp to place the change in time. riVer gives them exactly that.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://newadventuresinit.github.io/specpack/specify/" rel="noopener noreferrer"&gt;&lt;strong&gt;Specify&lt;/strong&gt;&lt;/a&gt; is the most ambitious of the three: an attempt at &lt;em&gt;American English coding standards&lt;/em&gt; — conventions for expressing programmatic logic and behavior in plain English, precisely enough for an agent to act on.&lt;/p&gt;

&lt;p&gt;So — does this mean source code goes away?&lt;/p&gt;

&lt;p&gt;No. It changes jobs: from the substance of applications to the seam between them. More on that next time.&lt;/p&gt;

&lt;p&gt;In the meantime, the standards are live, the license is permissive, and the floor is open. Have a look around, and let me know what you think.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>softwaredevelopment</category>
    </item>
    <item>
      <title>My New Adventures in IT</title>
      <dc:creator>Dirk Mattig</dc:creator>
      <pubDate>Mon, 25 May 2026 10:20:19 +0000</pubDate>
      <link>https://dev.to/newadventuresinit/my-new-adventures-in-it-4lhb</link>
      <guid>https://dev.to/newadventuresinit/my-new-adventures-in-it-4lhb</guid>
      <description>&lt;p&gt;When a blinking cursor on a screen awaited my input for the very first time, I could not possibly have anticipated that this technology would soon open up a whole new dimension for the world to live in. Having been born about six months after the Beatles split up, I happened to be in the right place at the right time when a bunch of nerds succeeded in bringing computers to the home.&lt;/p&gt;

&lt;p&gt;Now, after more than 40 years of coding as a hobby and over 25 years of software engineering as a profession, once again a blinking cursor (caret, really) awaits my input. Only this time the machine answers in far more elaborate ways than simply stating "ERROR". The more I use AI, the more evident it becomes to me that this is not just another paradigm shift like web, mobile, or cloud were.&lt;/p&gt;

&lt;p&gt;This is a new beginning. A new dimension is opening up.&lt;br&gt;
The new rules are that there will be all new rules.&lt;br&gt;
And that we do not yet know any of these new rules.&lt;/p&gt;

&lt;p&gt;This is the starting point for my ventures into the future of software, work, and business.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>software</category>
    </item>
  </channel>
</rss>
