<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Osama Ghazal</title>
    <description>The latest articles on DEV Community by Osama Ghazal (@osama_ghazal_96).</description>
    <link>https://dev.to/osama_ghazal_96</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3995126%2F54e464aa-ac88-4803-937a-02ca89a110c9.png</url>
      <title>DEV Community: Osama Ghazal</title>
      <link>https://dev.to/osama_ghazal_96</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/osama_ghazal_96"/>
    <language>en</language>
    <item>
      <title>The Core of a Coding Agent Is 128 Lines of Python. So I Built One From Scratch.</title>
      <dc:creator>Osama Ghazal</dc:creator>
      <pubDate>Sun, 21 Jun 2026 12:04:38 +0000</pubDate>
      <link>https://dev.to/osama_ghazal_96/the-core-of-a-coding-agent-is-128-lines-of-python-so-i-built-one-from-scratch-1og9</link>
      <guid>https://dev.to/osama_ghazal_96/the-core-of-a-coding-agent-is-128-lines-of-python-so-i-built-one-from-scratch-1og9</guid>
      <description>&lt;p&gt;&lt;strong&gt;128 lines of Python.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's the entire core of a coding agent — the loop that powers tools like Claude Code and Cursor. I didn't believe it either, so I built one from scratch. Then I pointed it at a failing test, and it read the file, ran the test, saw the traceback, fixed the code, and re-ran it — choosing every step itself. No one hard-coded that.&lt;/p&gt;

&lt;p&gt;It's open source (MIT), with a phased roadmap you can follow:&lt;/p&gt;

&lt;p&gt;👉 &lt;a href="https://github.com/osama96gh/coding-agent-from-scratch" rel="noopener noreferrer"&gt;github.com/osama96gh/coding-agent-from-scratch&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Why build one instead of reading one
&lt;/h2&gt;

&lt;p&gt;I use coding agents every day. As an AI engineer, I think they're the breakout use case for LLMs right now. But using something and understanding it are different things.&lt;/p&gt;

&lt;p&gt;Reading a production agent's source to learn the core is a trap — the essential logic is buried under prompt caching, retries, telemetry, and elaborate scaffolding. You can't see the engine for the bodywork.&lt;/p&gt;

&lt;p&gt;So I built just the engine. No optimizations. Just the essence.&lt;/p&gt;

&lt;h2&gt;
  
  
  The "huh, that small?" numbers
&lt;/h2&gt;

&lt;p&gt;These surprised me enough that I re-counted:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Piece&lt;/th&gt;
&lt;th&gt;Size&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Entire REPL + agent loop + permission gate (&lt;code&gt;main.py&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;128 lines&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;The system prompt that steers all behavior (&lt;code&gt;prompts.py&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;19 lines&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tools — read, list, grep, edit, write, run_bash&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;6 files&lt;/strong&gt;, smallest is 35&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Whole project&lt;/strong&gt;, incl. 2 swappable providers + streaming&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1,300 lines&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The thing that &lt;em&gt;feels&lt;/em&gt; like magic — an agent autonomously reading files, running your tests, fixing the failure, re-running — comes out of about a hundred lines of orchestration. The intelligence lives in the model. Your job is plumbing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The whole trick: the agent loop
&lt;/h2&gt;

&lt;p&gt;Strip away the streaming, the permission gate, and the UI, and the heartbeat of the whole thing is this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;user_input&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# keep going until the model stops asking for tools
&lt;/span&gt;    &lt;span class="n"&gt;turn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tools&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;TOOL_SCHEMAS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;SYSTEM_PROMPT&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;turn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;to_message&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;        &lt;span class="c1"&gt;# plain text → the model is done
&lt;/span&gt;        &lt;span class="k"&gt;break&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;turn&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tool_calls&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="c1"&gt;# otherwise, run each tool it asked for…
&lt;/span&gt;        &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;run_tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;conversation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tool&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="c1"&gt;# …then loop, so the model sees the results and decides what's next
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;That's it. That's the agent.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You type a request in plain English.&lt;/li&gt;
&lt;li&gt;Send the conversation + the list of tools to the LLM.&lt;/li&gt;
&lt;li&gt;The model replies with either &lt;strong&gt;text&lt;/strong&gt; (talk to you) or a &lt;strong&gt;tool call&lt;/strong&gt; ("read &lt;code&gt;main.py&lt;/code&gt;", "run &lt;code&gt;pytest&lt;/code&gt;").&lt;/li&gt;
&lt;li&gt;If it's a tool call: run it, append the result to the conversation, &lt;strong&gt;loop back to step 2&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;If it's text: show it, wait for your next turn.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The model decides &lt;em&gt;which&lt;/em&gt; tool and &lt;em&gt;in what order&lt;/em&gt;; the loop just keeps turning until the model stops asking.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;An agent is just an LLM, a loop, and some tools.&lt;/strong&gt; Everything else in this repo is refinement on top of those three.&lt;/p&gt;

&lt;p&gt;This is also where "it can debug itself" comes from — for free. When the shell tool feeds &lt;strong&gt;exit codes and stderr&lt;/strong&gt; back into the conversation, the model sees the failure on the next turn and proposes a fix. Nobody wrote &lt;code&gt;if tests fail, edit the code&lt;/code&gt;. It falls out of the loop.&lt;/p&gt;
&lt;h2&gt;
  
  
  The six tools
&lt;/h2&gt;

&lt;p&gt;One file each: &lt;code&gt;read_file&lt;/code&gt;, &lt;code&gt;list_files&lt;/code&gt;, &lt;code&gt;grep&lt;/code&gt;, &lt;code&gt;edit_file&lt;/code&gt;, &lt;code&gt;write_file&lt;/code&gt;, &lt;code&gt;run_bash&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Each is just a function plus a JSON schema describing its arguments — and that schema is all the model needs to know the tool exists and how to call it. "Tool calling" sounds advanced; it's really "here's a function signature, fill in the arguments."&lt;/p&gt;

&lt;p&gt;&lt;code&gt;run_bash&lt;/code&gt; alone is almost a superpower — with a shell you can stand in for most of the others — which is exactly why an agent needs a &lt;strong&gt;permission gate&lt;/strong&gt;.&lt;/p&gt;
&lt;h2&gt;
  
  
  The parts that make it feel real
&lt;/h2&gt;

&lt;p&gt;These refinements sit on top of the core, and they're where most of the line count goes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;System prompt (19 lines).&lt;/strong&gt; Tiny, but it's the steering wheel: &lt;em&gt;you're a coding agent, prefer tools over guessing, work step by step.&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Permission gate.&lt;/strong&gt; Before anything risky (writes, shell commands), it asks — and the decision reads the &lt;em&gt;arguments&lt;/em&gt;, not just the tool name, so &lt;code&gt;git status&lt;/code&gt; runs unprompted while &lt;code&gt;git push&lt;/code&gt; still stops to ask. The difference between an assistant and &lt;code&gt;rm -rf&lt;/code&gt; roulette.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context management.&lt;/strong&gt; LLMs are stateless — every turn resends the whole conversation, which gets expensive fast. The fixes: lean on the provider's cached prefix, and summarize old turns yourself.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pluggable providers.&lt;/strong&gt; A thin interface makes OpenAI and Gemini interchangeable — one env var to switch — and keeps anything provider-specific out of the loop.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Streaming + usage reporting.&lt;/strong&gt; See tokens as they generate; know what each turn cost.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2&gt;
  
  
  The part that surprised me: it just &lt;em&gt;feels&lt;/em&gt; real
&lt;/h2&gt;

&lt;p&gt;That failing-test run from the top? I never scripted it. The model chose to read, run, diagnose, fix, and re-run entirely on its own — the same shape of behavior I pay for in Claude Code every day, out of ~128 lines I could read in a single sitting.&lt;/p&gt;

&lt;p&gt;The gap between "toy" and "real" is smaller than the hype suggests. The production polish — caching, retries, sandboxing, a thousand handled edge cases — is genuine, hard engineering. But the &lt;em&gt;core&lt;/em&gt; that makes an agent an agent is within any engineer's reach in an afternoon.&lt;/p&gt;
&lt;h2&gt;
  
  
  Build it yourself, one phase at a time
&lt;/h2&gt;

&lt;p&gt;The repo is a &lt;strong&gt;phased roadmap&lt;/strong&gt; — each phase runs on its own and teaches one concept, so you always have a working agent:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The bare chat loop (no tools)&lt;/li&gt;
&lt;li&gt;Tool infrastructure + &lt;code&gt;read_file&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Read-only exploration (&lt;code&gt;list_files&lt;/code&gt;, &lt;code&gt;grep&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Write tools (&lt;code&gt;edit_file&lt;/code&gt;, &lt;code&gt;write_file&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;The shell tool (&lt;code&gt;run_bash&lt;/code&gt;) — where it gets powerful (and dangerous)&lt;/li&gt;
&lt;li&gt;System prompt + UX polish&lt;/li&gt;
&lt;li&gt;Safety &amp;amp; permissions&lt;/li&gt;
&lt;li&gt;Context management for long sessions&lt;/li&gt;
&lt;/ol&gt;


&lt;div class="ltag-github-readme-tag"&gt;
  &lt;div class="readme-overview"&gt;
    &lt;h2&gt;
      &lt;img src="https://assets.dev.to/assets/github-logo-5a155e1f9a670af7944dd5e12375bc76ed542ea80224905ecaf878b9157cdefc.svg" alt="GitHub logo"&gt;
      &lt;a href="https://github.com/osama96gh" rel="noopener noreferrer"&gt;
        osama96gh
      &lt;/a&gt; / &lt;a href="https://github.com/osama96gh/coding-agent-from-scratch" rel="noopener noreferrer"&gt;
        coding-agent-from-scratch
      &lt;/a&gt;
    &lt;/h2&gt;
    &lt;h3&gt;
      Educational Python coding agent built from scratch to explain agent loops, tool calling, code editing, bash execution, permissions, context management, and OpenAI/Gemini providers.
    &lt;/h3&gt;
  &lt;/div&gt;
  &lt;div class="ltag-github-body"&gt;
    
&lt;div id="readme" class="md"&gt;&lt;div class="markdown-heading"&gt;
&lt;h1 class="heading-element"&gt;Building a Coding Agent from Scratch&lt;/h1&gt;
&lt;/div&gt;

&lt;p&gt;A learning project: build a simple but real &lt;strong&gt;coding agent&lt;/strong&gt; (think a tiny Claude Code / Cursor / Codex), step by step, from nothing — to understand how complex AI agents are actually structured under the hood.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The one-sentence mental model:&lt;/strong&gt; &lt;em&gt;An agent is just an LLM, a loop, and some tools.&lt;/em&gt; Everything else is refinement. (&lt;a href="https://ampcode.com/notes/how-to-build-an-agent" rel="nofollow noopener noreferrer"&gt;source&lt;/a&gt;)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;Project description&lt;/h2&gt;
&lt;/div&gt;

&lt;p&gt;This repository is an educational, from-scratch Python implementation of a terminal coding agent. It shows the core mechanics behind modern AI coding tools: a model-driven agent loop, tool calling, file exploration, targeted code edits, shell command execution, permission checks, streaming responses, usage reporting, context compaction, and pluggable OpenAI/Gemini providers.&lt;/p&gt;

&lt;p&gt;It is meant to be read, modified, and learned from. It is not a production coding agent, but a small reference implementation for understanding how production coding agents are structured under the hood.&lt;/p&gt;

&lt;div class="markdown-heading"&gt;
&lt;h2 class="heading-element"&gt;What&lt;/h2&gt;…&lt;/div&gt;&lt;/div&gt;
  &lt;/div&gt;
  &lt;div class="gh-btn-container"&gt;&lt;a class="gh-btn" href="https://github.com/osama96gh/coding-agent-from-scratch" rel="noopener noreferrer"&gt;View on GitHub&lt;/a&gt;&lt;/div&gt;
&lt;/div&gt;



&lt;p&gt;Build it, break it, extend it (a new tool, a web UI, a third provider) — and tell me how it goes. The fastest way to stop an AI tool from feeling like magic is to build a small one yourself.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>llm</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
