<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: keesan.eth</title>
    <description>The latest articles on DEV Community by keesan.eth (@cryptokeesan).</description>
    <link>https://dev.to/cryptokeesan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F918212%2F4a0730ab-6568-4e42-ae73-5088e9b37b59.jpg</url>
      <title>DEV Community: keesan.eth</title>
      <link>https://dev.to/cryptokeesan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/cryptokeesan"/>
    <language>en</language>
    <item>
      <title>What Actually Makes Social Automation Reliable</title>
      <dc:creator>keesan.eth</dc:creator>
      <pubDate>Sun, 31 May 2026 20:03:46 +0000</pubDate>
      <link>https://dev.to/cryptokeesan/what-actually-makes-social-automation-reliable-4g1m</link>
      <guid>https://dev.to/cryptokeesan/what-actually-makes-social-automation-reliable-4g1m</guid>
      <description>&lt;p&gt;A reliable social automation stack is not built by stacking more retries on top of brittle behavior.&lt;/p&gt;

&lt;p&gt;The durable pattern is simpler:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;use official APIs where they exist&lt;/li&gt;
&lt;li&gt;keep browser execution as a controlled fallback&lt;/li&gt;
&lt;li&gt;require both a receipt and a verified postcondition before counting a run&lt;/li&gt;
&lt;li&gt;fail closed when the platform state does not match the reported result&lt;/li&gt;
&lt;/ul&gt;
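
&lt;p&gt;The last two rules can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from a real stack: &lt;code&gt;RunResult&lt;/code&gt;, &lt;code&gt;count_run&lt;/code&gt;, and &lt;code&gt;fetch_platform_state&lt;/code&gt; are hypothetical names, and the platform is replaced by an in-memory dictionary.&lt;/p&gt;

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class RunResult:
    """Hypothetical shape of what a worker reports after an action."""
    receipt_id: Optional[str]  # e.g. the post ID returned by the platform
    reported_text: str         # what the worker claims it published


def count_run(result: RunResult,
              fetch_platform_state: Callable[[str], Optional[str]]) -> bool:
    """Return True only if the run has a receipt and a verified postcondition."""
    if not result.receipt_id:
        return False  # no receipt: fail closed
    try:
        observed = fetch_platform_state(result.receipt_id)
    except Exception:
        return False  # verification itself failed: fail closed
    # The run counts only when platform state matches the reported result.
    return observed == result.reported_text


# In-memory stand-in for the platform (an assumption for this demo).
platform = {"post-1": "hello world"}
ok = count_run(RunResult("post-1", "hello world"), platform.get)
mismatch = count_run(RunResult("post-1", "something else"), platform.get)
missing = count_run(RunResult(None, "hello world"), platform.get)
```

&lt;p&gt;Here only &lt;code&gt;ok&lt;/code&gt; is true. A missing receipt, a state mismatch, or a failed lookup all count as "not done", which is the fail-closed behavior described above.&lt;/p&gt;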

&lt;p&gt;That discipline matters more than raw surface area. A smaller set of lanes with honest verification is worth more than a wider setup that quietly reports false success.&lt;/p&gt;

</description>
      <category>automation</category>
      <category>devtools</category>
      <category>api</category>
    </item>
    <item>
      <title>Receipts beat scheduled optimism</title>
      <dc:creator>keesan.eth</dc:creator>
      <pubDate>Sun, 31 May 2026 20:00:30 +0000</pubDate>
      <link>https://dev.to/cryptokeesan/receipts-beat-scheduled-optimism-1c5d</link>
      <guid>https://dev.to/cryptokeesan/receipts-beat-scheduled-optimism-1c5d</guid>
      <description>&lt;h1&gt;
  
  
  Receipts beat scheduled optimism
&lt;/h1&gt;

&lt;p&gt;The fastest way to lose trust in an automation is to mistake a schedule for a result.&lt;/p&gt;

&lt;p&gt;We have been rebuilding our execution stack around one rule: if a worker cannot show the exact action it took or the exact blocker it hit, it did not finish the job.&lt;/p&gt;

&lt;p&gt;That has forced us to simplify a lot. Fewer lanes. Better proofs. More honest failure states.&lt;/p&gt;

&lt;p&gt;The upside is that the system gets easier to trust once every action has to survive real verification.&lt;/p&gt;

</description>
      <category>devtools</category>
      <category>automation</category>
      <category>opensource</category>
    </item>
    <item>
      <title>MartinLoop: a control plane for AI coding agents</title>
      <dc:creator>keesan.eth</dc:creator>
      <pubDate>Wed, 27 May 2026 01:39:14 +0000</pubDate>
      <link>https://dev.to/cryptokeesan/martinloop-a-control-plane-for-ai-coding-agents-3dg5</link>
      <guid>https://dev.to/cryptokeesan/martinloop-a-control-plane-for-ai-coding-agents-3dg5</guid>
      <description>&lt;h1&gt;
  
  
  MartinLoop
&lt;/h1&gt;

&lt;p&gt;MartinLoop is an open-source control plane for AI coding agents.&lt;/p&gt;

&lt;p&gt;It adds hard budget stops, JSONL run records, and verify-gated completion so autonomous coding stays accountable.&lt;/p&gt;
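
&lt;p&gt;To make those three ideas concrete, here is a rough Python sketch. It is not MartinLoop's actual API: &lt;code&gt;run_bounded&lt;/code&gt;, the event names, and the per-step costs are all assumptions made for illustration.&lt;/p&gt;

```python
import io
import json
from typing import Callable, Iterable, TextIO


def run_bounded(step_costs: Iterable[float],
                budget_usd: float,
                verify: Callable[[], bool],
                log: TextIO) -> str:
    """Run steps under a hard budget, writing one JSON object per line."""
    spent = 0.0
    for index, cost in enumerate(step_costs):
        if spent + cost > budget_usd:
            # Hard stop: refuse the step that would exceed the budget.
            log.write(json.dumps({"event": "budget_stop", "spent": spent}) + "\n")
            return "budget_stop"
        spent += cost
        log.write(json.dumps({"event": "step", "index": index, "cost": cost}) + "\n")
    # Verify-gated completion: the run is "verified" only if the check passes.
    status = "verified" if verify() else "unverified"
    log.write(json.dumps({"event": "finish", "status": status, "spent": spent}) + "\n")
    return status


log = io.StringIO()
outcome = run_bounded([0.25, 0.5, 0.5], budget_usd=1.0, verify=lambda: True, log=log)
records = [json.loads(line) for line in log.getvalue().splitlines()]
```

&lt;p&gt;In this run the third step would push spend past the budget, so the loop stops before the verifier is ever consulted, and the JSONL log keeps a line-by-line record of why.&lt;/p&gt;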

&lt;p&gt;We built it because agent loops are powerful, but most teams still do not have enough control over cost, retries, or proof of completion.&lt;/p&gt;

&lt;p&gt;If you are using AI coding agents in production, I would love to hear how you are handling governance, cost ceilings, and verification.&lt;/p&gt;

</description>
      <category>devtools</category>
      <category>ai</category>
      <category>opensource</category>
    </item>
    <item>
      <title>AI Coding Agents Are Burning Budgets. The Next Layer Is Control</title>
      <dc:creator>keesan.eth</dc:creator>
      <pubDate>Tue, 12 May 2026 01:08:29 +0000</pubDate>
      <link>https://dev.to/cryptokeesan/ai-coding-agents-are-burning-budgets-the-next-layer-is-control-1eah</link>
      <guid>https://dev.to/cryptokeesan/ai-coding-agents-are-burning-budgets-the-next-layer-is-control-1eah</guid>
      <description>&lt;h2&gt;
  
  
  AI coding agents are becoming useful, but they still burn budgets, loop on bad strategies, and finish without enough evidence. The next layer is trace intelligence, model routing, and control.
&lt;/h2&gt;

&lt;h1&gt;
  
  
  AI Coding Agents Are Burning Budgets. The Next Layer Is Control.
&lt;/h1&gt;

&lt;p&gt;AI coding agents are getting better.&lt;/p&gt;

&lt;p&gt;They can read a repo, edit files, run tests, inspect errors, and try again.&lt;/p&gt;

&lt;p&gt;That is useful.&lt;/p&gt;

&lt;p&gt;But the problem showing up in real workflows is not just whether agents can write code.&lt;/p&gt;

&lt;p&gt;The problem is that agents can spend budget without producing finished work.&lt;/p&gt;

&lt;p&gt;They loop.&lt;/p&gt;

&lt;p&gt;They retry weak strategies.&lt;/p&gt;

&lt;p&gt;They switch files without explaining why.&lt;/p&gt;

&lt;p&gt;They chase unrelated errors.&lt;/p&gt;

&lt;p&gt;They claim completion without enough proof.&lt;/p&gt;

&lt;p&gt;And when the run ends, the human still has to ask:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;What actually happened?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is the gap the next generation of agent infrastructure has to solve.&lt;/p&gt;

&lt;p&gt;Not more autonomy first.&lt;/p&gt;

&lt;p&gt;Control first.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7wvfdsegi3ko8d4ht9ol.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7wvfdsegi3ko8d4ht9ol.jpg" alt=" " width="800" height="475"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem Is Not Just Bad Code
&lt;/h2&gt;

&lt;p&gt;A bad patch is easy to see.&lt;/p&gt;

&lt;p&gt;A bad agent run is harder.&lt;/p&gt;

&lt;p&gt;The agent may do a lot of work that looks productive:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;read many files&lt;/li&gt;
&lt;li&gt;generate a long plan&lt;/li&gt;
&lt;li&gt;edit several modules&lt;/li&gt;
&lt;li&gt;run commands&lt;/li&gt;
&lt;li&gt;inspect failures&lt;/li&gt;
&lt;li&gt;produce a confident summary&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But at the end, the task is still not done.&lt;/p&gt;

&lt;p&gt;The budget is gone.&lt;/p&gt;

&lt;p&gt;The repo is messy.&lt;/p&gt;

&lt;p&gt;The logs are unclear.&lt;/p&gt;

&lt;p&gt;The next engineer has to reconstruct the run from fragments.&lt;/p&gt;

&lt;p&gt;This is why agentic coding needs a better unit of accountability.&lt;/p&gt;

&lt;p&gt;Not just the final diff.&lt;/p&gt;

&lt;p&gt;The full trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Trace Becomes The Product
&lt;/h2&gt;

&lt;p&gt;A coding agent trace should not be an afterthought.&lt;/p&gt;

&lt;p&gt;It should be the primary artifact of the run.&lt;/p&gt;

&lt;p&gt;A useful trace answers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What did the agent try first?&lt;/li&gt;
&lt;li&gt;Where did it get stuck?&lt;/li&gt;
&lt;li&gt;Which files did it touch?&lt;/li&gt;
&lt;li&gt;Which commands did it run?&lt;/li&gt;
&lt;li&gt;Which verifier failed?&lt;/li&gt;
&lt;li&gt;Did it repeat the same strategy?&lt;/li&gt;
&lt;li&gt;Did it switch models?&lt;/li&gt;
&lt;li&gt;Did it exceed budget?&lt;/li&gt;
&lt;li&gt;Why did it stop?&lt;/li&gt;
&lt;li&gt;What should a human do next?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is what I think of as &lt;strong&gt;trace intelligence&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Not just raw logs.&lt;/p&gt;

&lt;p&gt;Not just token usage.&lt;/p&gt;

&lt;p&gt;Not just a transcript.&lt;/p&gt;

&lt;p&gt;Trace intelligence means turning the run into something a human, system, or second agent can reason about.&lt;/p&gt;

&lt;p&gt;The trace should explain the work.&lt;/p&gt;
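
&lt;p&gt;As a small illustration, a few of the questions above can be answered mechanically from raw events. The event shape (&lt;code&gt;type&lt;/code&gt;, &lt;code&gt;strategy&lt;/code&gt;, &lt;code&gt;path&lt;/code&gt;, &lt;code&gt;reason&lt;/code&gt;) and the function &lt;code&gt;summarize_trace&lt;/code&gt; are hypothetical, chosen only for this Python sketch.&lt;/p&gt;

```python
from collections import Counter
from typing import Dict, List


def summarize_trace(events: List[Dict[str, str]]) -> Dict[str, object]:
    """Condense raw trace events into answers a reviewer can act on."""
    strategies = Counter(e["strategy"] for e in events if e["type"] == "attempt")
    files = sorted({e["path"] for e in events if e["type"] == "edit"})
    stops = [e["reason"] for e in events if e["type"] == "stop"]
    return {
        # Counter keeps insertion order, so the first key is the first attempt.
        "first_strategy": next(iter(strategies), None),
        "repeated_strategies": sorted(s for s, n in strategies.items() if n > 1),
        "files_touched": files,
        "stop_reason": stops[-1] if stops else None,
    }


trace = [
    {"type": "attempt", "strategy": "patch-import"},
    {"type": "edit", "path": "app/main.py"},
    {"type": "attempt", "strategy": "patch-import"},
    {"type": "edit", "path": "app/util.py"},
    {"type": "stop", "reason": "verifier_failed"},
]
summary = summarize_trace(trace)
```

&lt;p&gt;The summary is deliberately small: what was tried first, what was repeated, which files changed, and why the run stopped. A human or a second agent can start from that rather than from the full transcript.&lt;/p&gt;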

&lt;h2&gt;
  
  
  Why Model Routing Matters
&lt;/h2&gt;

&lt;p&gt;Most agent workflows still treat model choice too casually.&lt;/p&gt;

&lt;p&gt;One model may be good at planning.&lt;/p&gt;

&lt;p&gt;Another may be better at code edits.&lt;/p&gt;

&lt;p&gt;Another may be cheaper for search, summarization, or test-output analysis.&lt;/p&gt;

&lt;p&gt;Another may be stronger for final review.&lt;/p&gt;

&lt;p&gt;But without a control layer, model routing becomes guesswork.&lt;/p&gt;

&lt;p&gt;A better system should ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Is this step worth a premium model?&lt;/li&gt;
&lt;li&gt;Can a cheaper model classify this failure?&lt;/li&gt;
&lt;li&gt;Should a stronger model review the plan before execution?&lt;/li&gt;
&lt;li&gt;Should the run downgrade when budget is tight?&lt;/li&gt;
&lt;li&gt;Should the run escalate when repeated failures appear?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Model routing should not just optimize quality.&lt;/p&gt;

&lt;p&gt;It should optimize quality within budget.&lt;/p&gt;

&lt;p&gt;That matters because the most painful agent failure is not always wrong code.&lt;/p&gt;

&lt;p&gt;Sometimes it is expensive unfinished work.&lt;/p&gt;
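
&lt;p&gt;One way to picture "quality within budget" is a routing function that sees both the step and the remaining spend. The tiers, prices, step names, and failure threshold in this Python sketch are invented for illustration and do not describe any real model or product.&lt;/p&gt;

```python
from typing import Dict

# Hypothetical tiers and per-step prices; real models and costs will differ.
TIER_COST_USD: Dict[str, float] = {"cheap": 0.25, "premium": 2.0}
PREMIUM_STEPS = {"plan_review", "final_review"}


def route(step: str, remaining_usd: float, recent_failures: int) -> str:
    """Pick a tier for one step: quality where it matters, within budget."""
    # Escalate for review steps or after repeated failures.
    wants_premium = step in PREMIUM_STEPS or recent_failures >= 2
    if wants_premium and remaining_usd >= TIER_COST_USD["premium"]:
        return "premium"
    # Downgrade when the premium tier is not affordable.
    if remaining_usd >= TIER_COST_USD["cheap"]:
        return "cheap"
    return "stop"  # nothing affordable: stop rather than overspend


choices = [
    route("classify_failure", 5.0, 0),
    route("final_review", 5.0, 0),
    route("edit", 5.0, 3),
    route("final_review", 1.0, 0),
    route("edit", 0.1, 0),
]
```

&lt;p&gt;The same step can land on different tiers depending on budget and failure history, which is the point: routing becomes an explicit, inspectable decision instead of a default.&lt;/p&gt;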

&lt;h2&gt;
  
  
  Headless Agents Need More Guardrails, Not Fewer
&lt;/h2&gt;

&lt;p&gt;Headless coding agents are especially interesting.&lt;/p&gt;

&lt;p&gt;They can run without a constant human in the loop.&lt;/p&gt;

&lt;p&gt;They can process tasks, inspect repos, execute commands, and produce outputs asynchronously.&lt;/p&gt;

&lt;p&gt;That is powerful.&lt;/p&gt;

&lt;p&gt;But headless execution increases the need for control.&lt;/p&gt;

&lt;p&gt;If an agent is running without a developer watching every step, the system needs stronger answers to basic questions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What is this agent allowed to do?&lt;/li&gt;
&lt;li&gt;What budget can it spend?&lt;/li&gt;
&lt;li&gt;What commands are blocked?&lt;/li&gt;
&lt;li&gt;What verifier defines success?&lt;/li&gt;
&lt;li&gt;When should it stop?&lt;/li&gt;
&lt;li&gt;When should it ask for approval?&lt;/li&gt;
&lt;li&gt;What trace does it leave behind?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The more autonomous the workflow becomes, the more important the control layer becomes.&lt;/p&gt;

&lt;p&gt;Autonomy without traceability is not leverage.&lt;/p&gt;

&lt;p&gt;It is invisible execution.&lt;/p&gt;
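
&lt;p&gt;A default-deny command policy is one simple way to answer the "allowed", "blocked", and "ask for approval" questions. The following Python sketch is an assumption-heavy illustration: &lt;code&gt;Policy&lt;/code&gt; and &lt;code&gt;decide&lt;/code&gt; are hypothetical, and checking only the first token of a command is far too naive for production use.&lt;/p&gt;

```python
import shlex
from dataclasses import dataclass, field
from typing import Set


@dataclass
class Policy:
    """Hypothetical guardrails for one headless run."""
    allowed_programs: Set[str] = field(default_factory=set)
    needs_approval: Set[str] = field(default_factory=set)


def decide(policy: Policy, command: str) -> str:
    """Return 'allow', 'ask', or 'block' for a shell command string."""
    parts = shlex.split(command)
    if not parts:
        return "block"
    program = parts[0]
    if program in policy.needs_approval:
        return "ask"
    if program in policy.allowed_programs:
        return "allow"
    return "block"  # default deny: anything not listed is blocked


policy = Policy(allowed_programs={"pytest", "ls"}, needs_approval={"git"})
decisions = [
    decide(policy, "pytest -q"),
    decide(policy, "git push origin main"),
    decide(policy, "rm -rf build"),
]
```

&lt;p&gt;Each decision can also be written to the trace, so a blocked or approval-gated command is visible after the fact rather than silently skipped.&lt;/p&gt;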

&lt;h2&gt;
  
  
  Agent Teams Make The Problem Bigger
&lt;/h2&gt;

&lt;p&gt;The next step is not one agent.&lt;/p&gt;

&lt;p&gt;It is teams of agents.&lt;/p&gt;

&lt;p&gt;A planner agent.&lt;/p&gt;

&lt;p&gt;A coding agent.&lt;/p&gt;

&lt;p&gt;A reviewer agent.&lt;/p&gt;

&lt;p&gt;A test agent.&lt;/p&gt;

&lt;p&gt;A documentation agent.&lt;/p&gt;

&lt;p&gt;A security agent.&lt;/p&gt;

&lt;p&gt;A release agent.&lt;/p&gt;

&lt;p&gt;That sounds useful, but it also creates a new coordination problem.&lt;/p&gt;

&lt;p&gt;If one agent produces a bad plan, another may execute it.&lt;/p&gt;

&lt;p&gt;If the reviewer misses the issue, the system may mark the run complete.&lt;/p&gt;

&lt;p&gt;If the test agent checks the wrong verifier, the whole workflow may look successful while still being wrong.&lt;/p&gt;

&lt;p&gt;Agent-to-agent workflows need shared state, shared budgets, shared traces, and shared stop conditions.&lt;/p&gt;

&lt;p&gt;Otherwise, teams of agents can become teams of budget-burning loops.&lt;/p&gt;

&lt;p&gt;The question becomes:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Who governs the team?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That is where a control layer becomes necessary.&lt;/p&gt;

&lt;h2&gt;
  
  
  What MartinLoop 360 Is Pointing Toward
&lt;/h2&gt;

&lt;p&gt;The direction I am exploring with MartinLoop is a control layer for agentic coding workflows.&lt;/p&gt;

&lt;p&gt;The current idea is simple:&lt;/p&gt;

&lt;p&gt;Every agent run should be bounded, inspectable, and test-verifiable.&lt;/p&gt;

&lt;p&gt;The next layer expands that into a broader loop:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Trace intelligence&lt;/strong&gt; to understand what happened during a run&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model routing&lt;/strong&gt; to choose the right model for the right step&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HeadlessOS&lt;/strong&gt; for controlled background execution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MartinLoop 360&lt;/strong&gt; as a higher-level view of agent runs, budgets, traces, policies, and outcomes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The goal is not to make agents look more magical.&lt;/p&gt;

&lt;p&gt;The goal is to make them easier to trust.&lt;/p&gt;

&lt;p&gt;If an agent burns budget and fails, that should be visible.&lt;/p&gt;

&lt;p&gt;If an agent loops, that should be classified.&lt;/p&gt;

&lt;p&gt;If an agent completes a task, that should be verified.&lt;/p&gt;

&lt;p&gt;If multiple agents collaborate, the team should leave one coherent trace.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Core Loop
&lt;/h2&gt;

&lt;p&gt;A governed agent workflow should look less like this:&lt;/p&gt;



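&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Prompt → Agent runs → Agent says done
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;I’m exploring these ideas while building MartinLoop, an open-source control layer for AI coding agents.&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/Keesan12/Martin-Loop" rel="noopener noreferrer"&gt;https://github.com/Keesan12/Martin-Loop&lt;/a&gt;&lt;br&gt;
Website: &lt;a href="https://martinloop.com" rel="noopener noreferrer"&gt;https://martinloop.com&lt;/a&gt;&lt;/p&gt;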

</description>
      <category>ai</category>
      <category>devops</category>
      <category>opensource</category>
      <category>agents</category>
    </item>
    <item>
      <title>AI coding agents need receipts, not just better prompts</title>
      <dc:creator>keesan.eth</dc:creator>
      <pubDate>Mon, 11 May 2026 17:46:15 +0000</pubDate>
      <link>https://dev.to/cryptokeesan/ai-coding-agents-need-receipts-not-just-better-prompts-838</link>
      <guid>https://dev.to/cryptokeesan/ai-coding-agents-need-receipts-not-just-better-prompts-838</guid>
      <description>&lt;p&gt;AI coding agents are getting good enough to run real engineering tasks, but not safe enough to run without guardrails.&lt;/p&gt;

&lt;p&gt;The failure mode is not always dramatic.&lt;/p&gt;

&lt;p&gt;Sometimes the agent just keeps working.&lt;/p&gt;

&lt;p&gt;It retries.&lt;br&gt;
It rewrites.&lt;br&gt;
It spends tokens.&lt;br&gt;
It changes files.&lt;br&gt;
It says it is done.&lt;/p&gt;

&lt;p&gt;Then another engineer opens the diff and realizes the agent solved the wrong problem.&lt;/p&gt;

&lt;p&gt;That creates a new engineering question:&lt;/p&gt;

&lt;p&gt;Can another engineer audit this run later?&lt;/p&gt;

&lt;p&gt;That is why I’m building MartinLoop.&lt;/p&gt;

&lt;p&gt;MartinLoop is an open-source control plane for AI coding agents. The goal is to make every agent run bounded, inspectable, and test-verifiable.&lt;/p&gt;

&lt;p&gt;The first version focuses on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hard budget caps&lt;/li&gt;
&lt;li&gt;JSONL run records&lt;/li&gt;
&lt;li&gt;audit trails&lt;/li&gt;
&lt;li&gt;failure classification&lt;/li&gt;
&lt;li&gt;test-verified completion&lt;/li&gt;
&lt;li&gt;reproducible agent runs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The thesis is simple:&lt;/p&gt;

&lt;p&gt;The next layer of AI coding is not only better prompts.&lt;/p&gt;

&lt;p&gt;It is governance.&lt;/p&gt;

&lt;p&gt;Before agents touch serious repos, teams need receipts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;what the agent tried&lt;/li&gt;
&lt;li&gt;what it changed&lt;/li&gt;
&lt;li&gt;how much it spent&lt;/li&gt;
&lt;li&gt;what commands it ran&lt;/li&gt;
&lt;li&gt;what tests passed&lt;/li&gt;
&lt;li&gt;what failed&lt;/li&gt;
&lt;li&gt;why it stopped&lt;/li&gt;
&lt;li&gt;whether a human can resume, revert, or rerun it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I’m looking for feedback from developers using Claude Code, Codex, Cursor, Devin-style agents, or custom coding agents in real repos.&lt;/p&gt;

&lt;p&gt;What would you want in the default “agent receipt”?&lt;/p&gt;

&lt;p&gt;GitHub: &lt;a href="https://github.com/Keesan12/Martin-Loop" rel="noopener noreferrer"&gt;https://github.com/Keesan12/Martin-Loop&lt;/a&gt;&lt;br&gt;
Site: &lt;a href="https://martinloop.com" rel="noopener noreferrer"&gt;https://martinloop.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>devops</category>
      <category>programming</category>
    </item>
    <item>
      <title>Come Build on Concordium</title>
      <dc:creator>keesan.eth</dc:creator>
      <pubDate>Tue, 30 Aug 2022 19:57:03 +0000</pubDate>
      <link>https://dev.to/cryptokeesan/come-build-on-concordium-2jgh</link>
      <guid>https://dev.to/cryptokeesan/come-build-on-concordium-2jgh</guid>
      <description>&lt;p&gt;Concordium Blockchain is the only public blockchain with a privacy-based ID layer at the protocol level, built using Rust. It is also the only blockchain with user attributes accessible from smart contracts, and it is built to be enterprise-grade and compliant by nature.&lt;/p&gt;

&lt;p&gt;We welcome all #rustdevs to test us out and help us build out our bounties as we look to create new tools for integrations, interoperability, and dApps.&lt;/p&gt;

&lt;p&gt;The blockchain for the future has arrived!&lt;/p&gt;

</description>
      <category>rust</category>
      <category>blockchain</category>
      <category>webdev</category>
      <category>bounties</category>
    </item>
  </channel>
</rss>
