<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Devanshu Biswas</title>
    <description>The latest articles on DEV Community by Devanshu Biswas (@dev48v).</description>
    <link>https://dev.to/dev48v</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3929385%2F75a3696c-143d-4252-ba59-6ed4083ca827.jpg</url>
      <title>DEV Community: Devanshu Biswas</title>
      <link>https://dev.to/dev48v</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dev48v"/>
    <language>en</language>
    <item>
      <title>I Built a Custom Tool Server for Claude in 250 Lines of TypeScript</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Mon, 01 Jun 2026 13:58:11 +0000</pubDate>
      <link>https://dev.to/dev48v/i-built-a-custom-tool-server-for-claude-in-250-lines-of-typescript-3nph</link>
      <guid>https://dev.to/dev48v/i-built-a-custom-tool-server-for-claude-in-250-lines-of-typescript-3nph</guid>
      <description>&lt;p&gt;Open Claude Desktop. Ask it: &lt;em&gt;"What's on the Wikipedia article about WebAssembly?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Without any plugin, it'll guess from training data. With the server I'm about to show you, it'll call a tool named &lt;code&gt;wiki_extract&lt;/code&gt;, fetch the real article live, and read it to you. &lt;strong&gt;Same client. Same model. Different tools.&lt;/strong&gt; That switchover is the whole point of the Model Context Protocol.&lt;/p&gt;

&lt;p&gt;MCP, released by Anthropic in late 2024, is the open standard that finally answers &lt;em&gt;"how do I give an LLM real-world tools?"&lt;/em&gt; Before MCP, every AI app rolled its own plugin system. After MCP, you write a server once and &lt;strong&gt;every&lt;/strong&gt; compatible client speaks to it the same way — Claude Desktop, Cursor, Continue.dev, Zed, custom agents you build with the Anthropic SDK.&lt;/p&gt;

&lt;p&gt;Today I'm going to walk you through building one. The whole server is 250 lines. The whole thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The mental model
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────┐   JSON-RPC 2.0   ┌──────────────────┐   HTTPS   ┌──────────────┐
│ Claude Desktop  │ ◄─── stdio ────► │ your MCP server  │ ◄───────► │  Wikipedia   │
│   (client)      │                  │    (Node.js)     │           │   (REST API) │
└─────────────────┘                  └──────────────────┘           └──────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things happen in this diagram:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The client spawns your server&lt;/strong&gt; as a child process when the app launches.&lt;/li&gt;
&lt;li&gt;
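&lt;strong&gt;They talk JSON-RPC 2.0 over stdio&lt;/strong&gt;: the server reads requests from stdin and writes responses to stdout. The message format is the same JSON-RPC 2.0 that the Language Server Protocol has used since 2016 (yes, the protocol behind every "Go to definition" feature in VS Code). Only the framing differs: MCP's stdio transport sends one JSON message per line, while LSP prefixes each message with a &lt;code&gt;Content-Length&lt;/code&gt; header.&lt;/li&gt;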
&lt;li&gt;
&lt;strong&gt;The model decides when to call tools.&lt;/strong&gt; Your server provides the tools, the model picks which to call and with what arguments, the client routes the calls, your server runs them and returns results.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's the whole protocol.&lt;/p&gt;
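
&lt;p&gt;To make the exchange concrete, here is a rough sketch of what one &lt;code&gt;tools/call&lt;/code&gt; round trip might look like on the wire. The &lt;code&gt;id&lt;/code&gt;, the arguments, and the result text are made-up illustration values, and a real session starts with an &lt;code&gt;initialize&lt;/code&gt; handshake that is omitted here.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical payloads: the id, arguments, and result text are invented for illustration.
const request = {
  jsonrpc: '2.0',
  id: 1,
  method: 'tools/call',
  params: { name: 'wiki_search', arguments: { query: 'WebAssembly', limit: 3 } }
}

const response = {
  jsonrpc: '2.0',
  id: request.id, // a response echoes the id of the request it answers
  result: {
    content: [{ type: 'text', text: 'WebAssembly - https://en.wikipedia.org/wiki/WebAssembly' }]
  }
}

// On the stdio transport each message travels as a single line of JSON.
const requestLine = JSON.stringify(request) + '\n'
const responseLine = JSON.stringify(response) + '\n'

// Printed to stderr so that stdout would stay free for real protocol traffic.
console.error(requestLine + responseLine)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The SDK builds and parses these messages for you; the sketch only shows the shape the client and server agree on.&lt;/p&gt;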

&lt;h2&gt;
  
  
  Step 1: scaffold
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;mkdir &lt;/span&gt;mcp-from-zero &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;cd &lt;/span&gt;mcp-from-zero
npm init &lt;span class="nt"&gt;-y&lt;/span&gt;
npm &lt;span class="nb"&gt;install&lt;/span&gt; @modelcontextprotocol/sdk zod
npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-D&lt;/span&gt; typescript @types/node tsx
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;@modelcontextprotocol/sdk&lt;/code&gt; is the official MCP TypeScript SDK. It does &lt;strong&gt;all&lt;/strong&gt; the JSON-RPC framing for you: you write tool handlers, the SDK speaks the wire format.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: minimal server with stdio transport
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;McpServer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@modelcontextprotocol/sdk/server/mcp.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;StdioServerTransport&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@modelcontextprotocol/sdk/server/stdio.js&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;McpServer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mcp-from-zero&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;0.1.0&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;transport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;StdioServerTransport&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transport&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ready on stdio&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;// NOTE: stderr, not stdout!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's a fully working MCP server. It does nothing useful yet — no tools — but a client can &lt;code&gt;initialize&lt;/code&gt; against it and read capabilities. Nine lines.&lt;/p&gt;

&lt;p&gt;⚠️ &lt;strong&gt;One trap to avoid:&lt;/strong&gt; never log to stdout. On the stdio transport, &lt;strong&gt;the client reads JSON-RPC messages off your server's stdout&lt;/strong&gt;. Any stray &lt;code&gt;console.log&lt;/code&gt; corrupts that stream, and the client may drop the connection without a useful error. Use &lt;code&gt;console.error&lt;/code&gt; (which goes to stderr) for &lt;strong&gt;everything&lt;/strong&gt;.&lt;/p&gt;
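
&lt;p&gt;One way to make that rule hard to break is to route all diagnostics through a small stderr-only helper. This is a minimal sketch; the names &lt;code&gt;formatLog&lt;/code&gt; and &lt;code&gt;log&lt;/code&gt; are illustrative and not part of the SDK.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hypothetical helper names; nothing here comes from the MCP SDK.
function formatLog(level: string, message: string): string {
  return '[' + level + '] ' + message
}

function log(level: string, message: string): void {
  // console.error writes to stderr, so the JSON-RPC stream on stdout is untouched.
  console.error(formatLog(level, message))
}

log('info', 'ready on stdio')
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;With a helper like this in place, a search for &lt;code&gt;console.log&lt;/code&gt; in the codebase should come back empty.&lt;/p&gt;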

&lt;h2&gt;
  
  
  Step 3: add a tool
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;zod&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wiki_search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Search Wikipedia for article titles matching a query.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;string&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;describe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Search query, e.g. "WebAssembly".&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;z&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;number&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;int&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hits&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;searchWikipedia&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;hits&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; — &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;h&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Five interesting things here:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;server.tool(name, description, schema, handler)&lt;/code&gt;&lt;/strong&gt; — that's the entire API. The SDK wires up &lt;code&gt;tools/list&lt;/code&gt; and &lt;code&gt;tools/call&lt;/code&gt; JSON-RPC handlers automatically.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zod schema → JSON Schema.&lt;/strong&gt; The SDK converts your Zod definitions into JSON Schema so MCP clients get parameter docs + validation for free. The model sees the same schema the client sees — it knows what arguments to send.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The &lt;code&gt;.describe()&lt;/code&gt; calls matter a LOT.&lt;/strong&gt; The model reads them as natural language. Better descriptions = better tool selection. "Search query, e.g. 'WebAssembly'" beats "string."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The handler returns content blocks.&lt;/strong&gt; Each block has a &lt;code&gt;type&lt;/code&gt; (text, image, resource_link) and a payload. The model reads them as part of its context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Async out of the box.&lt;/strong&gt; Tools can hit databases, call APIs, or run subprocesses. The SDK awaits the handler and serialises the result.&lt;/li&gt;
&lt;/ol&gt;
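
&lt;p&gt;For a sense of what the model receives, the &lt;code&gt;wiki_search&lt;/code&gt; parameters above correspond to JSON Schema roughly like the following. This is a hand-written approximation, not actual SDK output; the exact fields can differ between SDK and Zod versions.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Hand-written approximation of the input schema advertised for wiki_search.
// Not generated by the SDK; real output may carry additional fields.
const wikiSearchInputSchema = {
  type: 'object',
  properties: {
    query: { type: 'string', minLength: 1, description: 'Search query, e.g. "WebAssembly".' },
    limit: { type: 'integer', minimum: 1, maximum: 20, default: 8 }
  },
  required: ['query'] // limit has a default, so callers may omit it
}

console.error(JSON.stringify(wikiSearchInputSchema, null, 2))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;description&lt;/code&gt; on &lt;code&gt;query&lt;/code&gt; is the text from &lt;code&gt;.describe()&lt;/code&gt;, which is why that call carries so much weight.&lt;/p&gt;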

&lt;h2&gt;
  
  
  Step 4: the Wikipedia client
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;searchWikipedia&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;https://en.wikipedia.org/w/api.php&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;searchParams&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;action&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;opensearch&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;searchParams&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;format&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;searchParams&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;search&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;searchParams&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;limit&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User-Agent&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mcp-from-zero/0.1 (contact: you@example.com)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`HTTP &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[,&lt;/span&gt; &lt;span class="nx"&gt;titles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;snippets&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]]&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;titles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;title&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;snippets&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;urls&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="p"&gt;}))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wikipedia's REST and Action APIs are free: no key, no account. They ask one thing: send a descriptive &lt;code&gt;User-Agent&lt;/code&gt; with contact information (a URL or an email address, as in the code above) so they can throttle abusers without blocking polite traffic.&lt;/p&gt;
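
&lt;p&gt;The destructuring in &lt;code&gt;searchWikipedia&lt;/code&gt; relies on the &lt;code&gt;opensearch&lt;/code&gt; action returning a four-element array of query, titles, descriptions, and URLs. The sketch below isolates that zipping step as a pure function so it can be checked without a network call. The helper name &lt;code&gt;toHits&lt;/code&gt; and the sample payload are hypothetical.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Shape of an opensearch response: [query, titles, descriptions, urls].
type OpenSearchPayload = [string, string[], string[], string[]]

// Hypothetical helper: zips the three parallel arrays into one object per hit.
function toHits(payload: OpenSearchPayload) {
  const titles = payload[1]
  const snippets = payload[2]
  const urls = payload[3]
  return titles.map(function (title, i) {
    return { title: title, snippet: snippets[i], url: urls[i] }
  })
}

// Made-up sample payload in the same shape, for illustration only.
const sample: OpenSearchPayload = [
  'webassembly',
  ['WebAssembly'],
  [''],
  ['https://en.wikipedia.org/wiki/WebAssembly']
]

console.error(toHits(sample))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Keeping the parsing separate from the &lt;code&gt;fetch&lt;/code&gt; call also makes it easier to notice if the response shape ever changes.&lt;/p&gt;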

&lt;h2&gt;
  
  
  Step 5: resources
&lt;/h2&gt;

&lt;p&gt;Tools are things the &lt;strong&gt;model&lt;/strong&gt; calls. Resources are things the &lt;strong&gt;client&lt;/strong&gt; pulls on its own.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;server&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wiki-trending&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wiki://trending&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Trending Wikipedia articles&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;mimeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="na"&gt;mimeType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;text/plain&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getTrendingArticles&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Resources are great for context the client might want to inject &lt;strong&gt;without spending model tokens&lt;/strong&gt; on a tool call first — daily reports, glossaries, config files, recent activity. The model only sees them if the client decides to put them in the conversation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 6: hook it into Claude Desktop
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mcp-from-zero"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"node"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"/path/to/mcp-from-zero/dist/index.js"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Drop that into Claude Desktop's &lt;code&gt;claude_desktop_config.json&lt;/code&gt; (macOS: &lt;code&gt;~/Library/Application Support/Claude/&lt;/code&gt;, Windows: &lt;code&gt;%APPDATA%\Claude\&lt;/code&gt;). The &lt;code&gt;args&lt;/code&gt; path assumes the TypeScript has already been compiled to &lt;code&gt;dist/index.js&lt;/code&gt;. Restart the app. The tool icon at the bottom right now shows your three tools.&lt;/p&gt;

&lt;p&gt;Try: &lt;em&gt;"Use the wiki_extract tool to summarise the article on entropy."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Claude picks the tool, sends the request to your server, your server hits Wikipedia, returns the text, Claude reads it, writes a summary. The whole loop happens inside the chat.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why MCP matters
&lt;/h2&gt;

&lt;p&gt;If you've been watching the AI-agents wave the last twelve months thinking &lt;em&gt;"every demo has a different tool API,"&lt;/em&gt; MCP is the answer.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One server fits every client.&lt;/strong&gt; Build mcp-from-zero once, it works in Claude Desktop, Cursor, Continue.dev, Zed, and any custom agent that imports the Anthropic SDK.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One spec across vendors.&lt;/strong&gt; Microsoft and OpenAI have both publicly committed to supporting MCP. The major IDE makers shipped it within months.&lt;/li&gt;
&lt;li&gt;
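&lt;strong&gt;Capability negotiation up front.&lt;/strong&gt; During the &lt;code&gt;initialize&lt;/code&gt; handshake, clients declare what they support (sampling, roots) and servers declare what they offer (tools, resources, prompts, logging), all before any model call.&lt;/li&gt;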
&lt;/ul&gt;

&lt;p&gt;This is the USB-C moment for AI integrations. One cable, every device. The reason MCP went from "neat protocol" to "everywhere" in six months is that the &lt;strong&gt;distribution model&lt;/strong&gt; finally fits: build once, ship to every chat / IDE / agent simultaneously.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this changes for you
&lt;/h2&gt;

&lt;p&gt;Five years ago you'd write a Slack bot. Three years ago you'd write a ChatGPT plugin (deprecated 2024). Last year you'd write a Custom GPT (locked to OpenAI). Today you write an MCP server, push it to npm, and any AI client running on any user's machine can install it.&lt;/p&gt;

&lt;p&gt;The code for this demo is on &lt;a href="https://github.com/dev48v/mcp-from-zero" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, with eight step-by-step commits you can follow. Clone it, hook it into Claude Desktop in two minutes, and you've now extended your AI client with a tool you wrote yourself. That's the whole point.&lt;/p&gt;

&lt;p&gt;Welcome to MCP.&lt;/p&gt;




&lt;p&gt;🔗 Code: &lt;a href="https://github.com/dev48v/mcp-from-zero" rel="noopener noreferrer"&gt;github.com/dev48v/mcp-from-zero&lt;/a&gt;&lt;br&gt;
📚 Series: &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;TechFromZero&lt;/a&gt; — a new technology every day, all free, all open source.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>anthropic</category>
      <category>typescript</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I Compiled Rust to WebAssembly and Made My JavaScript 6x Faster</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Mon, 25 May 2026 05:21:24 +0000</pubDate>
      <link>https://dev.to/dev48v/i-compiled-rust-to-webassembly-and-made-my-javascript-6x-faster-3ekf</link>
      <guid>https://dev.to/dev48v/i-compiled-rust-to-webassembly-and-made-my-javascript-6x-faster-3ekf</guid>
      <description>&lt;p&gt;Click the &lt;strong&gt;Gaussian blur&lt;/strong&gt; button on &lt;a href="https://wasm-from-zero.vercel.app" rel="noopener noreferrer"&gt;wasm-from-zero.vercel.app&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;WASM: 38 ms. JS: 182 ms. &lt;strong&gt;Speedup: 4.8×.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That gap is the whole point of WebAssembly. Same algorithm. Same image. Same browser. But one version ran a hot inner loop compiled from Rust through wasm-bindgen to a 12 KB binary, and the other version ran the same loop as JavaScript. The browser optimised both. Wasm still won by nearly 5×.&lt;/p&gt;

&lt;p&gt;This is the trick that powers Figma, Photoshop on the Web, Google Earth, AutoCAD Web, and a lot of the "wait, that runs in a browser tab?" demos you've seen the last three years. It used to take a six-month port of a C++ engine to ship. Today it takes a Cargo.toml, one &lt;code&gt;wasm-pack build&lt;/code&gt;, and an &lt;code&gt;import&lt;/code&gt; statement.&lt;/p&gt;

&lt;h2&gt;
  
  
  The mental model
&lt;/h2&gt;

&lt;p&gt;WebAssembly is a &lt;strong&gt;binary format that runs in every modern browser&lt;/strong&gt;. It's not a programming language; it's an assembly-like target you compile to, the same way you compile to x86_64 or ARM. The languages that target wasm today: Rust, C, C++, Go, Zig, AssemblyScript, Swift, Kotlin, Dart, .NET. Many of them get there through an LLVM backend.&lt;/p&gt;

&lt;p&gt;The browser doesn't run your wasm under an interpreter. It JIT-compiles it to actual native machine code, then runs it at near-native speed. The wasm engine in V8 (and SpiderMonkey, JavaScriptCore) is the same engine that runs your JavaScript — it just gets to skip the type-inference + bailout dance JS needs, because wasm is already statically typed.&lt;/p&gt;

&lt;p&gt;For pure compute hot loops, that means: &lt;strong&gt;no boxed numbers, no hidden classes, no deopts, no GC pauses&lt;/strong&gt;. Just tight code that does the work the source said to do.&lt;/p&gt;
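
&lt;p&gt;As a point of reference, the JavaScript side of a comparison like this is usually a plain loop over a typed array. The sketch below is written in TypeScript with a hypothetical function name; it is not the demo's actual code, and it assumes RGBA bytes with the common 0.299 / 0.587 / 0.114 luma weights.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// Plain TypeScript grayscale over RGBA bytes (hypothetical baseline, not the demo's code).
// Any trailing bytes that do not form a full 4-byte pixel are ignored.
function grayscaleJs(pixels: Uint8ClampedArray): void {
  const end = pixels.length - (pixels.length % 4)
  for (let i = 0; i !== end; i += 4) {
    const luma = 0.299 * pixels[i] + 0.587 * pixels[i + 1] + 0.114 * pixels[i + 2]
    pixels[i] = luma     // Uint8ClampedArray rounds and clamps on write
    pixels[i + 1] = luma
    pixels[i + 2] = luma
    // pixels[i + 3] is alpha and is left untouched
  }
}

// Two sample pixels: opaque white and half-transparent black.
const px = new Uint8ClampedArray([255, 255, 255, 255, 0, 0, 0, 128])
grayscaleJs(px)
console.error(Array.from(px))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;A JS engine can optimise a loop like this well, but it has to discover the types at run time; in wasm they are fixed at compile time.&lt;/p&gt;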

&lt;h2&gt;
  
  
  Step 1: write the kernel in Rust
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;use&lt;/span&gt; &lt;span class="nn"&gt;wasm_bindgen&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nn"&gt;prelude&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;#[wasm_bindgen]&lt;/span&gt;
&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;grayscale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;mut&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;pixels&lt;/span&gt;&lt;span class="nf"&gt;.chunks_exact_mut&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;f32&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;luma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.299&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.587&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.114&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;u8&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;luma&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;#[wasm_bindgen]&lt;/code&gt; attribute is the magic. It generates the glue code that lets JavaScript pass a &lt;code&gt;Uint8Array&lt;/code&gt; to a Rust function taking &lt;code&gt;&amp;amp;mut [u8]&lt;/code&gt; and see Rust's writes in that same array afterwards. No serialisation and no JSON round-trip: the glue copies the raw bytes into wasm linear memory before the call and copies them back when it returns.&lt;/p&gt;

&lt;p&gt;Why does that matter? Because the canvas pixel buffer is already a flat byte array, a &lt;code&gt;Uint8ClampedArray&lt;/code&gt; of RGBA values. The wasm function sees the same layout the canvas uses, so the only cost at the boundary is a pair of bulk byte copies, which is usually small next to the per-pixel arithmetic.&lt;/p&gt;
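&lt;p&gt;As far as I can tell from the generated glue, a &lt;code&gt;&amp;amp;mut [u8]&lt;/code&gt; argument is copied into wasm linear memory before the call and copied back afterwards. Here is a rough TypeScript model of that round trip. Everything in it is a stand-in: &lt;code&gt;linearMemory&lt;/code&gt;, &lt;code&gt;callWithBytes&lt;/code&gt; and &lt;code&gt;invertKernel&lt;/code&gt; are hypothetical names, and a plain &lt;code&gt;Uint8Array&lt;/code&gt; plays the role of the module's memory. The real code lives in the generated &lt;code&gt;wasm_from_zero.js&lt;/code&gt;.&lt;/p&gt;

```typescript
// Stand-in for the module's linear memory (one 64 KiB wasm page).
const linearMemory = new Uint8Array(64 * 1024)

// Hypothetical kernel: works on a region of linear memory, like compiled Rust would.
function invertKernel(ptr: number, len: number): void {
  for (let i = ptr; ptr + len > i; i++) {
    linearMemory[i] = 255 - linearMemory[i]
  }
}

// Hypothetical glue: copy in, call, copy back out.
function callWithBytes(
  bytes: Uint8Array,
  kernel: (ptr: number, len: number) => void
): void {
  const ptr = 0 // a real allocator would choose this address
  if (bytes.length > linearMemory.length) {
    throw new RangeError('buffer too large for this sketch')
  }
  linearMemory.set(bytes, ptr)                               // JS to wasm copy
  kernel(ptr, bytes.length)                                  // synchronous call
  bytes.set(linearMemory.subarray(ptr, ptr + bytes.length))  // wasm to JS copy
}

const sample = new Uint8Array([0, 10, 250, 255])
callWithBytes(sample, invertKernel)
console.log(Array.from(sample)) // [255, 245, 5, 0]
```

&lt;p&gt;Both copies are bulk byte moves, so they tend to be cheap next to the per-pixel work the kernel does.&lt;/p&gt;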

&lt;h2&gt;
  
  
  Step 2: compile to wasm
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;docker run &lt;span class="nt"&gt;--rm&lt;/span&gt; &lt;span class="nt"&gt;-v&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$PWD&lt;/span&gt;&lt;span class="s2"&gt;/rust:/work"&lt;/span&gt; &lt;span class="nt"&gt;-w&lt;/span&gt; /work rust:1-slim &lt;span class="se"&gt;\&lt;/span&gt;
    bash &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'cargo install wasm-pack --locked &amp;amp;&amp;amp; wasm-pack build --target web'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output is a &lt;code&gt;pkg/&lt;/code&gt; folder with three files that matter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;wasm_from_zero_bg.wasm&lt;/code&gt; — the actual binary, ~10 KB for the four filters in this demo&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wasm_from_zero.js&lt;/code&gt; — wasm-bindgen-generated ES module wrapper&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wasm_from_zero.d.ts&lt;/code&gt; — TypeScript definitions, autogenerated from the Rust signatures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You import it like any other module. No bundler plugin needed (Vite handles the &lt;code&gt;.wasm&lt;/code&gt; URL resolution natively):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;init&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;grayscale&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;invert&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sepia&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;blur&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./wasm/wasm_from_zero&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;init&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;           &lt;span class="c1"&gt;// one-time: fetch + instantiate the .wasm&lt;/span&gt;
&lt;span class="nf"&gt;grayscale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;// direct function call from here on&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That &lt;code&gt;await init()&lt;/code&gt; is the only async work. After it resolves, every filter call is a plain synchronous function call, with no promise, message passing or serialisation step in between.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 3: feed it the canvas
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;2d&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getImageData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;h&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;// Canvas hands us Uint8ClampedArray; wasm-bindgen's &amp;amp;mut [u8] wants&lt;/span&gt;
&lt;span class="c1"&gt;// Uint8Array. Same memory, different TS type — wrap a view, zero copy.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pixels&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nf"&gt;grayscale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;              &lt;span class="c1"&gt;// wasm mutates in place&lt;/span&gt;

&lt;span class="nx"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;putImageData&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;// paint result&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



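&lt;p&gt;That's the whole bridge. Read the pixels off the canvas, hand them to wasm, write the result back. The &lt;code&gt;Uint8Array&lt;/code&gt; view and the canvas's &lt;code&gt;ImageData&lt;/code&gt; &lt;strong&gt;share the same bytes&lt;/strong&gt;, so once the wasm call has written its result into the view, &lt;code&gt;putImageData&lt;/code&gt; can paint it directly. From the caller's side the function call is just "go process this 4 MB buffer."&lt;/p&gt;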

&lt;h2&gt;
  
  
  Step 4: race it against JavaScript
&lt;/h2&gt;

&lt;p&gt;The killer demo is to run the exact same algorithm in plain JS and time both. I ported every kernel from Rust to JS line-for-line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;grayscaleJs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Uint8ClampedArray&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;luma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.299&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.587&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mf"&gt;0.114&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;luma&lt;/span&gt;
    &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;luma&lt;/span&gt;
    &lt;span class="nx"&gt;pixels&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;luma&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same &lt;code&gt;0.299&lt;/code&gt;, &lt;code&gt;0.587&lt;/code&gt;, &lt;code&gt;0.114&lt;/code&gt;. Same loop shape. Same memory access pattern. The JS engine has every chance to optimise this, and it does, decently. But it can't escape JS number semantics: &lt;code&gt;0.299 * pixels[i]&lt;/code&gt; is specified as a float64 multiplication, even though the result is stored into a &lt;code&gt;Uint8ClampedArray&lt;/code&gt;, and the optimised code still carries type guards that can deopt. The Rust version converts each &lt;code&gt;u8&lt;/code&gt; to &lt;code&gt;f32&lt;/code&gt; once and does the arithmetic in single precision, with every type fixed at compile time.&lt;/p&gt;
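&lt;p&gt;One detail worth checking when comparing the two outputs: they do not round the same way. A small sketch of the difference, using only standard JavaScript. The &lt;code&gt;rustAsU8&lt;/code&gt; helper is a hypothetical emulation of Rust's saturating &lt;code&gt;as u8&lt;/code&gt; cast and is not part of the demo:&lt;/p&gt;

```typescript
// Hypothetical emulation of Rust's `f32 as u8`: truncate toward zero, saturate to 0..=255.
function rustAsU8(value: number): number {
  if (Number.isNaN(value)) return 0
  return Math.min(255, Math.max(0, Math.trunc(value)))
}

// Uint8ClampedArray rounds to the nearest integer and clamps to 0..=255.
function jsClamped(value: number): number {
  const cell = new Uint8ClampedArray(1)
  cell[0] = value
  return cell[0]
}

console.log(rustAsU8(127.6), jsClamped(127.6)) // 127 128
console.log(rustAsU8(300), jsClamped(300))     // 255 255
console.log(rustAsU8(-4.2), jsClamped(-4.2))   // 0 0

// The f32 weights are not bit-identical to the f64 literals JS uses.
console.log(Math.fround(0.299) === 0.299)      // false
```

&lt;p&gt;So individual channels can differ by one between the wasm and JS results even though the algorithm is the same.&lt;/p&gt;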

&lt;p&gt;On my laptop:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Filter&lt;/th&gt;
&lt;th&gt;WASM&lt;/th&gt;
&lt;th&gt;JS&lt;/th&gt;
&lt;th&gt;Speedup&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;invert&lt;/td&gt;
&lt;td&gt;6 ms&lt;/td&gt;
&lt;td&gt;9 ms&lt;/td&gt;
&lt;td&gt;1.5×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;grayscale&lt;/td&gt;
&lt;td&gt;7 ms&lt;/td&gt;
&lt;td&gt;14 ms&lt;/td&gt;
&lt;td&gt;2×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sepia&lt;/td&gt;
&lt;td&gt;9 ms&lt;/td&gt;
&lt;td&gt;22 ms&lt;/td&gt;
&lt;td&gt;2.4×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;blur (3×3)&lt;/td&gt;
&lt;td&gt;38 ms&lt;/td&gt;
&lt;td&gt;182 ms&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;4.8×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
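&lt;p&gt;These are single-machine numbers, so treat them as a rough guide. For anyone reproducing them, a minimal timing helper might look like the sketch below. &lt;code&gt;medianMs&lt;/code&gt; is a hypothetical name and this is not necessarily the harness the demo uses; it takes the median over several runs so that one slow run does not skew the result.&lt;/p&gt;

```typescript
// Hypothetical helper: run `work` several times and return the median duration in ms.
function medianMs(work: () => void, runs = 9): number {
  const samples: number[] = []
  for (let run = 0; runs > run; run++) {
    const start = performance.now()
    work()
    samples.push(performance.now() - start)
  }
  samples.sort((a, b) => a - b)
  return samples[Math.floor(samples.length / 2)]
}

// Example workload: invert a 1 MP RGBA buffer in plain JS.
const frame = new Uint8ClampedArray(1000 * 1000 * 4)
const elapsed = medianMs(() => {
  for (let i = 0; frame.length > i; i += 4) {
    frame[i] = 255 - frame[i]
    frame[i + 1] = 255 - frame[i + 1]
    frame[i + 2] = 255 - frame[i + 2]
  }
})
console.log(`invert (JS): ${elapsed.toFixed(2)} ms`)
```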

&lt;p&gt;The pattern: the bigger the per-pixel arithmetic, the wider wasm's lead. &lt;code&gt;invert&lt;/code&gt; is &lt;code&gt;255 - v&lt;/code&gt; × 3 channels, so cheap that JS almost catches up. The 3×3 Gaussian blur reads 9 neighbour pixels × 3 channels per output pixel — 27 million multiply-adds on a 1 MP image. That's exactly the shape where wasm pulls ahead.&lt;/p&gt;
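&lt;p&gt;The arithmetic behind that figure, written out as a back-of-envelope count that ignores edge pixels and leaves alpha untouched:&lt;/p&gt;

```typescript
const imgWidth = 1000
const imgHeight = 1000
const kernelTaps = 3 * 3   // 3x3 neighbourhood
const colourChannels = 3   // R, G, B

const multiplyAdds = imgWidth * imgHeight * kernelTaps * colourChannels
console.log(multiplyAdds)  // 27000000

// invert, by comparison, is one subtraction per colour channel
const invertOps = imgWidth * imgHeight * colourChannels
console.log(multiplyAdds / invertOps) // 9
```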

&lt;h2&gt;
  
  
  What this changes
&lt;/h2&gt;

&lt;p&gt;Five years ago, you wrote your image filters in JS and either accepted the slowness or stood up a Python service. Today, you stick the hot loop in a Rust crate, run one build, and ship a 10 KB binary alongside your bundle.&lt;/p&gt;

&lt;p&gt;The interesting thing isn't the speedup — it's the &lt;strong&gt;distribution model&lt;/strong&gt;. You're not asking the user to install anything. You're not running a server. The binary just rides along with your JS bundle. The browser caches it. It runs sandboxed by the same security model as JS. It works on every modern browser including mobile.&lt;/p&gt;

&lt;p&gt;If you've been holding off on "real" graphics, audio, video, ML, crypto, parsing work in the browser because "it'd be too slow in JS" — that excuse is gone.&lt;/p&gt;

&lt;p&gt;The code for this demo is on &lt;a href="https://github.com/dev48v/wasm-from-zero" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, with eight step-by-step commits you can follow. The live version is at &lt;a href="https://wasm-from-zero.vercel.app" rel="noopener noreferrer"&gt;wasm-from-zero.vercel.app&lt;/a&gt;. Open it. Click Gaussian blur. Watch the speedup card light up.&lt;/p&gt;

&lt;p&gt;That's WebAssembly.&lt;/p&gt;




&lt;p&gt;🔗 Code: &lt;a href="https://github.com/dev48v/wasm-from-zero" rel="noopener noreferrer"&gt;github.com/dev48v/wasm-from-zero&lt;/a&gt;&lt;br&gt;
🌐 Live demo: &lt;a href="https://wasm-from-zero.vercel.app" rel="noopener noreferrer"&gt;wasm-from-zero.vercel.app&lt;/a&gt;&lt;br&gt;
📚 Series: &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;TechFromZero&lt;/a&gt; — a new technology every day, all free, all open source.&lt;/p&gt;

</description>
      <category>rust</category>
      <category>webassembly</category>
      <category>javascript</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I Built a Text-to-Image Search Engine That Runs Entirely in the Browser</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Sat, 23 May 2026 21:51:08 +0000</pubDate>
      <link>https://dev.to/dev48v/i-built-a-text-to-image-search-engine-that-runs-entirely-in-the-browser-55n</link>
      <guid>https://dev.to/dev48v/i-built-a-text-to-image-search-engine-that-runs-entirely-in-the-browser-55n</guid>
      <description>&lt;p&gt;Type "a corgi on grass." Out of 24 photos in the gallery, the corgi rises to the top. Score: 0.31.&lt;/p&gt;

&lt;p&gt;Type "something to eat." A bowl of strawberries, a plate of pasta, and a wood-fired pizza take the medal positions. Score range: 0.25 – 0.27.&lt;/p&gt;

&lt;p&gt;No server. No API key. No image got uploaded anywhere. The whole pipeline — a 150 MB neural network and 24 image embeddings — lives in a tab in your browser.&lt;/p&gt;

&lt;p&gt;This is &lt;strong&gt;CLIP&lt;/strong&gt;, the model that quietly powers a huge slice of modern computer vision. And in 2026, you can ship it on Vercel for free.&lt;/p&gt;

&lt;h2&gt;
  
  
  The idea behind CLIP
&lt;/h2&gt;

&lt;p&gt;OpenAI released CLIP in 2021 with one beautifully simple idea: &lt;strong&gt;train one model to put text and images into the same vector space.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's it. That's the whole trick.&lt;/p&gt;

&lt;p&gt;CLIP has two encoders. The text encoder turns "a corgi puppy" into a 512-dimensional vector. The vision encoder turns a photo of a corgi into a 512-dimensional vector. They were trained on 400 million (caption, image) pairs scraped from the web so that paired text and images &lt;strong&gt;end up near each other in that vector space&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Once that's true, you can do search the way a database does it. The distance between the vector for "a corgi puppy" and the vector for an actual corgi photo is small. The distance between "a corgi puppy" and a photo of an astronaut is large. Sort by distance. Done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"a corgi puppy"  ─▶  text encoder  ─▶  [0.04, -0.12, 0.07, ...]  ─┐
                                                                   ├─▶  cosine sim
[image bytes]    ─▶  vision encoder ─▶  [0.05, -0.10, 0.08, ...]  ─┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The math at the end is a single for-loop of multiply-adds. For unit-length vectors, the dot product is the cosine similarity:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;cosineSim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Float32Array&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Float32Array&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;dot&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;dot&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;dot&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. That's the search engine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;If you understand "embed everything into the same vector space and compare with a dot product," you understand the heart of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pinterest's visual search.&lt;/strong&gt; "Find me more like this."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stable Diffusion's text conditioning.&lt;/strong&gt; "Generate this prompt" is "find a region of vector space the model has learned to produce."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dataset deduplication.&lt;/strong&gt; "Which of these 50 million images are near-duplicates?" Cluster by vector distance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero-shot classification.&lt;/strong&gt; "Is this a cat, a dog, or a goat?" Encode the three labels, encode the image, take the closest.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content moderation at scale.&lt;/strong&gt; "Is this image semantically similar to the policy violations we've labelled?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every one of these has been a multi-million-dollar engineering problem for some company in the last five years. The core trick is what we're about to build in 200 lines of TypeScript.&lt;/p&gt;
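&lt;p&gt;To illustrate the zero-shot classification bullet, here is a sketch with made-up 3-d vectors standing in for real 512-d CLIP embeddings. The function names and the toy numbers are hypothetical; only the shape of the computation (a dot product against each label embedding, then pick the largest) is the point.&lt;/p&gt;

```typescript
// Dot product of two equal-length unit vectors, which equals their cosine similarity.
function dot(a: number[], b: number[]): number {
  return a.reduce((sum, value, i) => sum + value * b[i], 0)
}

// Scale a vector to unit length.
function unit(v: number[]): number[] {
  const norm = Math.hypot(...v)
  return v.map(value => value / norm)
}

// Hypothetical helper: return the label whose embedding is closest to the image.
function classify(image: number[], candidates: { name: string; vec: number[] }[]): string {
  let best = candidates[0]
  for (const candidate of candidates) {
    if (dot(image, candidate.vec) > dot(image, best.vec)) best = candidate
  }
  return best.name
}

// Toy embeddings, NOT real CLIP output.
const labelVecs = [
  { name: 'cat', vec: unit([0.9, 0.1, 0.0]) },
  { name: 'dog', vec: unit([0.1, 0.9, 0.1]) },
  { name: 'goat', vec: unit([0.0, 0.2, 0.9]) },
]
const photoVec = unit([0.2, 0.8, 0.2])

console.log(classify(photoVec, labelVecs)) // dog
```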

&lt;h2&gt;
  
  
  Step 1: load CLIP into the browser
&lt;/h2&gt;

&lt;p&gt;The thing that used to require a Python server with a GPU now runs in a tab. The library doing this magic is &lt;a href="https://huggingface.co/docs/transformers.js" rel="noopener noreferrer"&gt;Transformers.js&lt;/a&gt; — Hugging Face's port of their Python &lt;code&gt;transformers&lt;/code&gt; library to ONNX Runtime in JavaScript.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;RawImage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;CLIPTextModelWithProjection&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;CLIPVisionModelWithProjection&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@xenova/transformers&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;MODEL_ID&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Xenova/clip-vit-base-patch32&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;textModel&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;visionModel&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
  &lt;span class="nx"&gt;AutoTokenizer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;AutoProcessor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;CLIPTextModelWithProjection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="nx"&gt;CLIPVisionModelWithProjection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;from_pretrained&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;MODEL_ID&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;First visit: ~150 MB of ONNX weights stream from the Hugging Face CDN into your browser's Cache API. Every visit after that: a few hundred milliseconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: encode a phrase
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;encodeText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;tokenizer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;padding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;truncation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;text_embeds&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;textModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;l2normalise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text_embeds&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;// 512-d Float32Array&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;text_embeds&lt;/code&gt; field is the projected vector — the one that lives in the shared space. The un-projected hidden state is the wrong vector to compare against.&lt;/p&gt;

&lt;p&gt;We L2-normalise (divide by length) so cosine similarity reduces to a dot product. This is the kind of small detail tutorials tend to skip, but it matters: without normalisation, the raw dot product also scales with each vector's length, so the ranking drifts toward "which embedding has the largest norm" rather than "which image matches your query."&lt;/p&gt;
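&lt;p&gt;The snippets here call &lt;code&gt;l2normalise&lt;/code&gt; without showing it. A minimal version might look like this. It is a sketch that assumes the input is a flat &lt;code&gt;Float32Array&lt;/code&gt; holding a single embedding, and it is not necessarily the demo's exact helper:&lt;/p&gt;

```typescript
// Scale a vector to unit length so that a dot product equals cosine similarity.
function l2normalise(vec: Float32Array): Float32Array {
  const sumSquares = vec.reduce((sum, value) => sum + value * value, 0)
  const norm = Math.sqrt(sumSquares)
  if (norm === 0) return vec.slice() // avoid dividing by zero
  return vec.map(value => value / norm)
}

const rawVec = new Float32Array([3, 4])
const unitVec = l2normalise(rawVec)
console.log(Array.from(unitVec).map(v => v.toFixed(1))) // ['0.6', '0.8']
```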

&lt;h2&gt;
  
  
  Step 3: encode an image
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;encodeImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;image&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;RawImage&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;inputs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;processor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;image&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;image_embeds&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;visionModel&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;l2normalise&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;image_embeds&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The processor handles resize → centre-crop → normalisation with CLIP's specific mean/std values. The vision model is a &lt;strong&gt;Vision Transformer&lt;/strong&gt;: it cuts the 224×224 image into a 7×7 grid of 32×32 px patches (49 in total), treats each patch as a token, runs them through a transformer encoder (the same family of architecture as GPT), and projects the [CLS] token down to 512-d.&lt;/p&gt;
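&lt;p&gt;The token count falls straight out of those numbers. A quick check, using only the image and patch sizes quoted above:&lt;/p&gt;

```typescript
const imageSize = 224 // input resolution in px
const patchSize = 32  // the "32" in clip-vit-base-patch32

const patchesPerSide = imageSize / patchSize        // 7
const patchTokens = patchesPerSide * patchesPerSide // 49
const sequenceLength = patchTokens + 1              // 50, including the [CLS] token

console.log(patchesPerSide, patchTokens, sequenceLength) // 7 49 50
```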

&lt;p&gt;Same 512-d. Same space as the text. That's the whole magic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 4: rank
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queryVec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;encodeText&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;a corgi on grass&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;imageVecs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;images&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;encodeImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;img&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ranked&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;imageVecs&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;images&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="na"&gt;score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;cosineSim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryVec&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}))&lt;/span&gt;
  &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the search engine. Eight lines.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 5: cache so it stays fast
&lt;/h2&gt;

&lt;p&gt;The model weights cache in the Cache API automatically. But re-encoding 24 images on every visit is ~5 seconds of WASM work for no reason — the vectors don't change. Stash them in IndexedDB keyed by image id + model id:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;STORE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;embeddings&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;putCached&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;imageId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Float32Array&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;openDb&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;STORE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;readwrite&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;objectStore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;STORE&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;vec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;modelId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;::&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;imageId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's 24 vectors × 512 floats × 4 bytes = 48 KB for the whole gallery. Warm reloads now feel instant.&lt;/p&gt;

&lt;h2&gt;
  
  
  What this changes
&lt;/h2&gt;

&lt;p&gt;Five years ago, "text-to-image search" was a paper. Two years ago, it was a Python server with a GPU and an SDK. Today, it's a Vercel deploy.&lt;/p&gt;

&lt;p&gt;The line between "real AI engineering" and "a weekend project" keeps moving. Not because the models got smaller — CLIP is still 150 MB. The browser got bigger. WebAssembly. ONNX Runtime Web. IndexedDB. Cache API. The runtime stack ate everything the old Python service used to do.&lt;/p&gt;

&lt;p&gt;If you're a beginner reading this thinking "AI is too hard, I'd never build that": you just read the whole thing. The code is on &lt;a href="https://github.com/dev48v/clip-from-zero" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;, every commit walks you through one concept, and there's a live demo at &lt;a href="https://clip-from-zero.vercel.app" rel="noopener noreferrer"&gt;clip-from-zero.vercel.app&lt;/a&gt;. Clone it. Open &lt;code&gt;clip.ts&lt;/code&gt;. Read the four functions. That's CLIP.&lt;/p&gt;

&lt;p&gt;The next time you hear someone talking about "embeddings" or "vector search" or "RAG" or "multimodal" — you know what they mean. Numbers in a 512-d space. Cosine similarity. A dot product.&lt;/p&gt;

&lt;p&gt;That's it.&lt;/p&gt;




&lt;p&gt;🔗 Code: &lt;a href="https://github.com/dev48v/clip-from-zero" rel="noopener noreferrer"&gt;github.com/dev48v/clip-from-zero&lt;/a&gt;&lt;br&gt;
🌐 Live demo: &lt;a href="https://clip-from-zero.vercel.app" rel="noopener noreferrer"&gt;clip-from-zero.vercel.app&lt;/a&gt;&lt;br&gt;
📚 Series: &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;TechFromZero&lt;/a&gt; — a new technology every day, all free, all open source.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>webdev</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I Built a 3-Agent AI Research Crew in 250 Lines of Python (LangGraph + Free Gemini)</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Thu, 21 May 2026 10:04:31 +0000</pubDate>
      <link>https://dev.to/dev48v/i-built-a-3-agent-ai-research-crew-in-250-lines-of-python-langgraph-free-gemini-2l38</link>
      <guid>https://dev.to/dev48v/i-built-a-3-agent-ai-research-crew-in-250-lines-of-python-langgraph-free-gemini-2l38</guid>
      <description>&lt;p&gt;You've seen the demos. "Look, our AI hired a research team and wrote you a 50-page report while you brushed your teeth!" Cool. Now show me the code.&lt;/p&gt;

&lt;p&gt;It turns out the code is small. Embarrassingly small. The whole pattern — the thing every multi-agent framework on the market sells you a SaaS license for — is &lt;strong&gt;three functions piped through a typed dict&lt;/strong&gt;. Once you see it, you can't unsee it.&lt;/p&gt;

&lt;p&gt;So today we're going to build it. From scratch. In about 250 lines of Python. The crew has three specialists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;Researcher&lt;/strong&gt; who actually hits the web for real facts.&lt;/li&gt;
&lt;li&gt;A &lt;strong&gt;Writer&lt;/strong&gt; who turns those facts into a draft briefing.&lt;/li&gt;
&lt;li&gt;An &lt;strong&gt;Editor&lt;/strong&gt; who polishes the draft into something you'd send to a colleague.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;You give it a topic. It hands you back a markdown report. And — this is the part the YouTubers skip — the entire stack is &lt;strong&gt;free&lt;/strong&gt;. No OpenAI, no Anthropic, no Tavily, no credit card. Just one free Google API key.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://langgraph-from-zero.vercel.app" rel="noopener noreferrer"&gt;Live demo&lt;/a&gt; — click a starter, watch the three stages turn green. &lt;a href="https://github.com/dev48v/langgraph-from-zero" rel="noopener noreferrer"&gt;Source on GitHub&lt;/a&gt;, step-by-step commits.&lt;/p&gt;

&lt;h2&gt;
  
  
  The lesson under the hype
&lt;/h2&gt;

&lt;p&gt;If you remember nothing else, remember this: &lt;strong&gt;a multi-agent system is a state machine where each transition is an LLM call.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's it. That's the whole field.&lt;/p&gt;

&lt;p&gt;You define a shared blob of state — let's call it &lt;code&gt;CrewState&lt;/code&gt;. The graph has nodes. Each node is just a function that reads the state, does some work (often by calling an LLM), and returns a partial dict. The framework merges that dict back into the state and decides which node runs next.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;       ┌────────────┐    ┌──────────┐    ┌────────┐
TOPIC ─▶│ Researcher │───▶│  Writer  │───▶│ Editor │───▶ REPORT
       │  (DDG API) │    │ (Gemini) │    │(Gemini)│
       └────────────┘    └──────────┘    └────────┘
                  shared CrewState flows through all three
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When LangGraph's authors built v0.1, they were essentially asking: &lt;em&gt;what if we took LangChain's "Chain" abstraction but let it have cycles, branches, and persistent state?&lt;/em&gt; That's the whole pitch. The graph &lt;strong&gt;is&lt;/strong&gt; the agent.&lt;/p&gt;

&lt;h2&gt;
  
  
  The state object: where everyone meets
&lt;/h2&gt;

&lt;p&gt;LangGraph is opinionated about one thing: every node has to agree on the shape of the data flowing through. So we type it. Once.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;typing&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;TypedDict&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Fact&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;snippet&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CrewState&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;TypedDict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;facts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;Fact&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;draft&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;final_report&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
    &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;total=False&lt;/code&gt; is the magic. It means &lt;strong&gt;every field is optional&lt;/strong&gt;. The graph starts with only &lt;code&gt;topic&lt;/code&gt; set. As each node finishes, it adds its key (&lt;code&gt;facts&lt;/code&gt;, then &lt;code&gt;draft&lt;/code&gt;, then &lt;code&gt;final_report&lt;/code&gt;). LangGraph merges the partial dict back into the state before the next node runs.&lt;/p&gt;

&lt;p&gt;Notice what's missing: any reference to LLMs, prompts, or agents. The state is data, not behaviour. Keeping it that way is what makes the whole pattern testable — you can swap any agent for a stub that just returns &lt;code&gt;{"draft": "lorem ipsum"}&lt;/code&gt; and the graph still runs.&lt;/p&gt;
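&lt;p&gt;To make the merge step concrete, here is a minimal sketch that imitates it without LangGraph installed. The &lt;code&gt;run_linear&lt;/code&gt; helper and the two stub nodes are hypothetical names invented for illustration. The plain dict update stands in for what LangGraph does with keys that have no custom reducer: the returned value overwrites the old one.&lt;/p&gt;

```python
from typing import Callable, TypedDict


class CrewState(TypedDict, total=False):
    topic: str
    draft: str
    final_report: str


# Hypothetical stubs: each node reads the state and returns only the keys it owns.
def stub_writer(state: CrewState) -> dict:
    return {"draft": f"lorem ipsum about {state['topic']}"}


def stub_editor(state: CrewState) -> dict:
    return {"final_report": state["draft"].upper()}


def run_linear(state: CrewState, nodes: list[Callable[[CrewState], dict]]) -> dict:
    """Run nodes in order, merging each partial dict into a copy of the state."""
    current = dict(state)
    for node in nodes:
        current.update(node(current))  # later values overwrite earlier ones
    return current


result = run_linear({"topic": "CRISPR"}, [stub_writer, stub_editor])
print(sorted(result))
```

&lt;p&gt;The run starts with only &lt;code&gt;topic&lt;/code&gt; and ends with all three keys, and neither stub knows anything about the other.&lt;/p&gt;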

&lt;h2&gt;
  
  
  Agent 1: The Researcher (or, why your LLM needs to leave the house)
&lt;/h2&gt;

&lt;p&gt;The single biggest mistake beginners make with LLM apps is asking the model to remember things. Don't. The model's job is to &lt;strong&gt;transform&lt;/strong&gt; inputs into outputs. The inputs come from your code, calling real APIs.&lt;/p&gt;

&lt;p&gt;That's why the Researcher does the search &lt;strong&gt;first&lt;/strong&gt;, then hands the results to the LLM. Not the other way around.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;duckduckgo_search&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DDGS&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;researcher_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;CrewState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DDGS&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ddgs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ddgs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;region&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wt-wt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;facts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
        &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;href&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;
    &lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;facts&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



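&lt;p&gt;That's the whole researcher: about a dozen lines and no LLM call. The &lt;code&gt;duckduckgo_search&lt;/code&gt; package is an unofficial client for DuckDuckGo's search results. It is free and needs no key or signup, which makes it perfect for a tutorial. It can be rate-limited, so for production, swap to Tavily or Serper; the pattern doesn't change.&lt;/p&gt;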
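&lt;p&gt;The node above assumes the search always succeeds. One way to make it testable offline, and to put the &lt;code&gt;error&lt;/code&gt; key from &lt;code&gt;CrewState&lt;/code&gt; to use, is to pass the search function in. This is a sketch under assumptions, not the repository's code: &lt;code&gt;make_researcher&lt;/code&gt;, &lt;code&gt;fake_search&lt;/code&gt; and &lt;code&gt;broken_search&lt;/code&gt; are hypothetical names, and the result keys (&lt;code&gt;href&lt;/code&gt;, &lt;code&gt;title&lt;/code&gt;, &lt;code&gt;body&lt;/code&gt;) mirror the ones used above.&lt;/p&gt;

```python
from typing import Callable

SearchFn = Callable[[str], list[dict]]


def make_researcher(search: SearchFn) -> Callable[[dict], dict]:
    """Build a researcher node around any function that maps a query to raw results."""

    def researcher_node(state: dict) -> dict:
        topic = state.get("topic", "").strip()
        if not topic:
            return {"facts": [], "error": "empty topic"}
        try:
            raw = search(topic)
        except Exception as exc:  # network errors, rate limits, and so on
            return {"facts": [], "error": f"search failed: {exc}"}
        facts = [
            {"source": r.get("href", ""), "title": r.get("title", ""), "snippet": r.get("body", "")}
            for r in raw
        ]
        return {"facts": facts}

    return researcher_node


# Hypothetical offline stand-ins for the real search call.
def fake_search(query: str) -> list[dict]:
    return [{"href": "https://example.org/a", "title": f"About {query}", "body": "A short snippet."}]


def broken_search(query: str) -> list[dict]:
    raise TimeoutError("no network")


ok = make_researcher(fake_search)({"topic": "  CRISPR "})
failed = make_researcher(broken_search)({"topic": "CRISPR"})
```

&lt;p&gt;In the real app the function you pass in would wrap the &lt;code&gt;ddgs.text(...)&lt;/code&gt; call shown earlier. A downstream node can then check for &lt;code&gt;error&lt;/code&gt; before spending an LLM call on an empty fact list.&lt;/p&gt;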

&lt;p&gt;Why grounded search instead of asking Gemini "what do you know about X?"&lt;/p&gt;

&lt;p&gt;Because &lt;strong&gt;Gemini will lie about URLs&lt;/strong&gt;. Any LLM asked to cite sources from memory can invent URLs that look real but 404. Don't ask the model for facts. Ask it to summarise facts you got from a tool.&lt;/p&gt;
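&lt;p&gt;A cheap guard follows from this: since every legitimate link comes from the search step, any URL in the model's output that isn't among the fact sources is suspect. Here is a rough sketch of that check. &lt;code&gt;ungrounded_urls&lt;/code&gt; is a hypothetical helper, not part of the project, and the regular expression is deliberately simple.&lt;/p&gt;

```python
import re

# Matches http(s) URLs up to whitespace or a closing bracket or parenthesis.
URL_PATTERN = re.compile(r"https?://[^\s)\]]+")


def ungrounded_urls(text: str, facts: list[dict]) -> list[str]:
    """Return URLs that appear in text but not among the fact sources."""
    allowed = {f["source"].rstrip("/") for f in facts}
    found = [u.rstrip(".,;/") for u in URL_PATTERN.findall(text)]
    return [u for u in found if u not in allowed]


facts = [{"source": "https://example.org/crispr", "title": "CRISPR", "snippet": "..."}]
draft = "See [this](https://example.org/crispr) and https://example.org/made-up."
suspicious = ungrounded_urls(draft, facts)
```

&lt;p&gt;With the sample draft, only the made-up link ends up in &lt;code&gt;suspicious&lt;/code&gt;. What to do with it (strip it, retry, flag it) is a product decision.&lt;/p&gt;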

&lt;h2&gt;
  
  
  Agents 2 + 3: Why two? Why not one big prompt?
&lt;/h2&gt;

&lt;p&gt;Here's the part that feels wasteful at first. We have research notes. We want a polished report. Surely one prompt — &lt;em&gt;"Write a polished briefing about $TOPIC using these facts"&lt;/em&gt; — can do it?&lt;/p&gt;

&lt;p&gt;Try it. In my experience the output is reliably mediocre.&lt;/p&gt;

&lt;p&gt;The reason is &lt;strong&gt;divided attention&lt;/strong&gt;. Asking a model to simultaneously generate prose AND fix its own structural problems means it does both badly. So we split:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;strong&gt;Writer&lt;/strong&gt; has one job: produce a 400-500 word draft. Hedges allowed. Markdown sloppy. Just get words on the page.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;Editor&lt;/strong&gt; has one job: read the draft, fix the structure, prepend a title, cut the flab, ship the final.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is how human newsrooms work. Reporters file rough drafts; copy editors polish them. Both roles exist because both add value. Multi-agent design is just the same idea applied to LLM calls.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
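&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langchain_core.messages&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SystemMessage&lt;/span&gt;

&lt;span class="c1"&gt;# `llm` is assumed to be the Gemini chat model created elsewhere in the app (setup not shown here).&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;writer_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;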
    &lt;span class="n"&gt;facts_block&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;- &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;snippet&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;source&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;)&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;f&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;facts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a technical writer. 400-500 words, plain markdown...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Topic: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Facts:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;facts_block&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Write the draft.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;editor_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
        &lt;span class="nc"&gt;SystemMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a senior editor. Polish, don&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;t rewrite. Add an H1 title...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="nc"&gt;HumanMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;DRAFT TO POLISH:&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;draft&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;System message = how to behave. Human message = what to do. Mixing them produces worse output. This is the single most underrated trick in prompt engineering.&lt;/p&gt;
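&lt;p&gt;The nodes above only need two things from the model: an &lt;code&gt;invoke(messages)&lt;/code&gt; method and a &lt;code&gt;.content&lt;/code&gt; attribute on the result. That makes them easy to exercise without an API key. The sketch below is one way to do it, with assumptions spelled out: &lt;code&gt;FakeLLM&lt;/code&gt;, &lt;code&gt;FakeResponse&lt;/code&gt; and &lt;code&gt;make_editor_node&lt;/code&gt; are hypothetical names, and plain &lt;code&gt;(role, text)&lt;/code&gt; tuples replace &lt;code&gt;SystemMessage&lt;/code&gt; and &lt;code&gt;HumanMessage&lt;/code&gt; so the example has no dependencies.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class FakeResponse:
    content: str


class FakeLLM:
    """Hypothetical stand-in: records the messages and returns a canned reply."""

    def __init__(self, reply: str):
        self.reply = reply
        self.calls = []

    def invoke(self, messages):
        self.calls.append(messages)
        return FakeResponse(content=self.reply)


def make_editor_node(llm):
    """Build an editor node around any object with an invoke() method."""

    def editor_node(state: dict) -> dict:
        response = llm.invoke([
            ("system", "You are a senior editor. Polish, don't rewrite."),  # how to behave
            ("human", f"DRAFT TO POLISH:\n\n{state['draft']}"),  # what to do
        ])
        return {"final_report": response.content.strip()}

    return editor_node


fake = FakeLLM("  # Title\n\nPolished text.  ")
update = make_editor_node(fake)({"draft": "rough text"})
```

&lt;p&gt;Passing the model in, instead of reading a module-level &lt;code&gt;llm&lt;/code&gt;, is a design choice of this sketch. It lets a test assert on both the returned update and the exact messages the node sent.&lt;/p&gt;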

&lt;h2&gt;
  
  
  Wiring it: the actual LangGraph part
&lt;/h2&gt;

&lt;p&gt;After all that setup, the LangGraph code is comically short:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;langgraph.graph&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;StateGraph&lt;/span&gt;

&lt;span class="n"&gt;g&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;StateGraph&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CrewState&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;researcher_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;writer_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_node&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;editor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;editor_node&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;START&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;researcher&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;writer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;editor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add_edge&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;editor&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;END&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;crew&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;invoke&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;How CRISPR works&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;final_report&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole thing. Eight lines define the agent crew. &lt;code&gt;compile()&lt;/code&gt; turns the declarative graph into an executor that runs your nodes in order, merging state between them. Once compiled, the same &lt;code&gt;crew&lt;/code&gt; object handles every request. With no checkpointer attached, it keeps nothing between invocations, because the state lives in the dict you pass in.&lt;/p&gt;

&lt;h2&gt;
  
  
  Streaming: because waiting 8 seconds for a spinner feels broken
&lt;/h2&gt;

&lt;p&gt;The sync version works, but the UX is sad. The user clicks a button. A spinner appears. Eight to ten seconds pass. The whole report appears at once.&lt;/p&gt;

&lt;p&gt;LangGraph has a &lt;code&gt;stream()&lt;/code&gt; method that yields after each node finishes. Same total latency, dramatically better feel:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stream_crew&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# `yield` is only valid inside a function, so the loop lives in a generator (the name is illustrative).&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;update&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;crew&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stream&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;topic&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="n"&gt;stream_mode&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;updates&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;node&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;node_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;data&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;partial&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;stream_mode="updates"&lt;/code&gt; emits only the diff returned by the node that ran (cheaper than &lt;code&gt;"values"&lt;/code&gt;, which emits the whole state). Pipe that through FastAPI's Server-Sent Events and the React frontend can render researcher facts as soon as they arrive — while the writer is still drafting. The Render free tier sleeps after 15 minutes, so first hit takes ~30 seconds to wake up; subsequent hits feel instant.&lt;/p&gt;
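
&lt;p&gt;To make the "only the diff" point concrete, here is a minimal TypeScript sketch of what the frontend might do with each streamed event. The event shape mirrors the &lt;code&gt;{"node": ..., "data": ...}&lt;/code&gt; dict yielded above. The &lt;code&gt;CrewState&lt;/code&gt; type, the &lt;code&gt;applyUpdate&lt;/code&gt; helper and the &lt;code&gt;facts&lt;/code&gt; and &lt;code&gt;draft&lt;/code&gt; keys are assumptions made up for illustration, not code from the repo.&lt;/p&gt;

```typescript
// Hypothetical shapes: the real state keys live in the Python graph.
type CrewState = { [key: string]: unknown };
type StreamEvent = { node: string; data: CrewState };

// In "updates" mode each event carries only the keys the node changed,
// so the client rebuilds the full state by merging diffs in arrival order.
// This is a shallow, overwrite-style merge; keys that need appending
// would require their own merge logic.
function applyUpdate(state: CrewState, event: StreamEvent): CrewState {
  return { ...state, ...event.data };
}

// Made-up events standing in for what the server would stream.
const events: StreamEvent[] = [
  { node: "researcher", data: { facts: ["fact A", "fact B"] } },
  { node: "writer", data: { draft: "First draft" } },
];

let crewState: CrewState = { topic: "WebAssembly" };
for (const streamed of events) {
  crewState = applyUpdate(crewState, streamed);
  console.log(`after ${streamed.node}:`, Object.keys(crewState).join(", "));
}
```

&lt;p&gt;With &lt;code&gt;"values"&lt;/code&gt; mode the merge step would be unnecessary, because every event would already be the whole state. The trade-off is a larger payload per event.&lt;/p&gt;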

&lt;h2&gt;
  
  
  Why this is THE concept to understand before everything else
&lt;/h2&gt;

&lt;p&gt;Every "wow look at this" AI demo from the next six months is going to be a variation on this exact pattern. Replace the Researcher with a code-aware agent and you get Cursor's compose mode. Replace the Editor with a fact-checker and you get Perplexity's pipeline. Replace the linear edges with conditional edges and a feedback loop and you get autonomous agent loops like AutoGPT.&lt;/p&gt;

&lt;p&gt;The pattern doesn't change. &lt;strong&gt;State + nodes + edges.&lt;/strong&gt; Three functions, three arrows, one dict. Everything else is marketing.&lt;/p&gt;

&lt;p&gt;Once you see it, you can read any "agentic" framework's docs in five minutes flat. CrewAI, AutoGen, LangGraph, Pydantic AI — they all collapse to the same shape. The differences are syntactic sugar around the same core idea.&lt;/p&gt;
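
&lt;p&gt;As a rough illustration of "state + nodes + edges", here is a toy linear graph runner in TypeScript. It is not LangGraph's implementation and none of these names come from the repo; it only shows the shape that the frameworks share, under the assumption that every node returns just the keys it changed.&lt;/p&gt;

```typescript
// A toy linear graph runner: NOT LangGraph's implementation, just the shape.
type GraphState = { [key: string]: string };
type NodeFn = (state: GraphState) => GraphState; // returns only the keys it changed

// Nodes: plain functions from state to a partial update.
const graphNodes: { [id: string]: NodeFn } = {
  researcher: (s) => ({ facts: `facts about ${s.topic}` }),
  writer: (s) => ({ draft: `draft using ${s.facts}` }),
  editor: (s) => ({ report: `polished ${s.draft}` }),
};

// Edges: each node id maps to its successor; "END" stops the run.
const graphEdges: { [id: string]: string } = {
  START: "researcher",
  researcher: "writer",
  writer: "editor",
  editor: "END",
};

function runGraph(initial: GraphState): GraphState {
  let state = { ...initial };
  let current = graphEdges["START"];
  while (current !== "END") {
    state = { ...state, ...graphNodes[current](state) }; // merge the node's diff
    current = graphEdges[current];
  }
  return state;
}

console.log(runGraph({ topic: "WebAssembly" }).report);
```

&lt;p&gt;Making &lt;code&gt;graphEdges&lt;/code&gt; return a different successor depending on the state is all it would take to turn this into a conditional graph with a feedback loop.&lt;/p&gt;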

&lt;h2&gt;
  
  
  What to try next
&lt;/h2&gt;

&lt;p&gt;Clone the &lt;a href="https://github.com/dev48v/langgraph-from-zero" rel="noopener noreferrer"&gt;repo&lt;/a&gt; and run it locally. It takes about three commands. Then break it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Add a fourth agent — a fact-checker that flags suspicious claims in the editor's output and sends it back for revision. (Hint: &lt;code&gt;add_conditional_edges&lt;/code&gt;.)&lt;/li&gt;
&lt;li&gt;Swap Gemini for local Llama via Ollama. Same &lt;code&gt;.invoke()&lt;/code&gt; interface, zero network cost.&lt;/li&gt;
&lt;li&gt;Replace DuckDuckGo with a vector search over your own documents. Now you've built RAG.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The same 250 lines, with three different tweaks, give you three completely different products. &lt;strong&gt;That&lt;/strong&gt; is why understanding multi-agent orchestration matters. The hard part isn't the agents — it's seeing the pattern clearly enough to recognise it everywhere.&lt;/p&gt;

&lt;p&gt;This is Day 37 of &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;50 days of from-zero builds&lt;/a&gt;. A new technology every day. The full live demo is at &lt;a href="https://langgraph-from-zero.vercel.app" rel="noopener noreferrer"&gt;langgraph-from-zero.vercel.app&lt;/a&gt;; the source is on &lt;a href="https://github.com/dev48v/langgraph-from-zero" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;. Beginners welcome — every commit teaches one concept.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>python</category>
      <category>langgraph</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I Built a 3D Solar System in 300 Lines of React (No Game Engine)</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Tue, 19 May 2026 06:50:30 +0000</pubDate>
      <link>https://dev.to/dev48v/i-built-a-3d-solar-system-in-300-lines-of-react-no-game-engine-52b2</link>
      <guid>https://dev.to/dev48v/i-built-a-3d-solar-system-in-300-lines-of-react-no-game-engine-52b2</guid>
      <description>&lt;p&gt;Pull up a browser. Drag your mouse. Watch eight planets orbit the Sun, axes tilted, Saturn's rings catching the light.&lt;/p&gt;

&lt;p&gt;That's not a game engine. That's not Unity. That's 300 lines of React.&lt;/p&gt;

&lt;p&gt;If your mental model of "3D programming" is "scary C++ matrices and a 600-page OpenGL textbook," you're a decade out of date. WebGL has shipped in every browser since 2014. Three.js wraps the boring math. React Three Fiber lets you write the scene as &lt;strong&gt;components&lt;/strong&gt;, the same way you write HTML. The whole pipeline is &lt;code&gt;&amp;lt;mesh&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;sphereGeometry&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;meshStandardMaterial&amp;gt;&lt;/code&gt; — three tags, you've made a planet.&lt;/p&gt;

&lt;p&gt;Today I'll show you the whole thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core insight: a scene is a tree
&lt;/h2&gt;

&lt;p&gt;Every 3D scene — every Pixar movie, every video game, every product configurator — is the same shape: &lt;strong&gt;a tree of objects&lt;/strong&gt;, where each node has a position, a rotation, a scale, and zero or more children. That's it. The Sun is the root. Earth is a child positioned 9 units to the right. The Moon is a child of Earth, positioned 1 unit further right. Rotate Earth and the Moon comes along for the ride, because it's a child.&lt;/p&gt;

&lt;p&gt;In Three.js you build this tree imperatively:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nb"&gt;sun&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Mesh&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;geom&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;mat&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;earth&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;THREE&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Mesh&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;geom2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;mat2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;earth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nb"&gt;sun&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;earth&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;scene&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sun&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In React Three Fiber, the tree IS your component tree:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;           &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* sun */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;sphereGeometry&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;meshBasicMaterial&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"yellow"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt; &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;    &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* earth, child of sun */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;sphereGeometry&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;meshStandardMaterial&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"blue"&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the whole conceptual leap. Once you see "the React tree is the Three.js scene graph," the rest is naming things.&lt;/p&gt;
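
&lt;p&gt;The parent-child rule can be sketched without any 3D library. The snippet below is a simplified, translation-only model (real Three.js nodes also carry rotation and scale); &lt;code&gt;SceneNode&lt;/code&gt; and &lt;code&gt;worldPositions&lt;/code&gt; are hypothetical names used only for this example.&lt;/p&gt;

```typescript
// Translation-only scene graph: real Three.js nodes also carry rotation and scale.
type Vec3 = [number, number, number];
type SceneNode = { label: string; position: Vec3; children: SceneNode[] };

const moon: SceneNode = { label: "moon", position: [1, 0, 0], children: [] };
const earth: SceneNode = { label: "earth", position: [9, 0, 0], children: [moon] };
const sun: SceneNode = { label: "sun", position: [0, 0, 0], children: [earth] };

// World position = parent's world position + the node's local position.
function worldPositions(
  node: SceneNode,
  parent: Vec3 = [0, 0, 0],
): { [label: string]: Vec3 } {
  const world: Vec3 = [
    parent[0] + node.position[0],
    parent[1] + node.position[1],
    parent[2] + node.position[2],
  ];
  let result: { [label: string]: Vec3 } = { [node.label]: world };
  for (const child of node.children) {
    result = { ...result, ...worldPositions(child, world) };
  }
  return result;
}

console.log(worldPositions(sun)); // the moon ends up at x = 10
```

&lt;p&gt;Change &lt;code&gt;earth.position&lt;/code&gt; and the moon's world position changes with it, even though the moon's own local position never does.&lt;/p&gt;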

&lt;h2&gt;
  
  
  The trick that makes orbits cheap
&lt;/h2&gt;

&lt;p&gt;Naïve orbit code looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;useFrame&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;angle&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;speed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;earth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;x&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;angle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;earth&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;position&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;z&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sin&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;angle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;9&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That works, but you're doing two trig calls per planet per frame in your own JavaScript, and tracking an angle by hand for every body. 60 fps × 8 planets × 2 calls = 960 sin/cos per second.&lt;/p&gt;

&lt;p&gt;There's a better way. Put the planet inside a &lt;strong&gt;pivot group&lt;/strong&gt; at the origin. Place the planet at &lt;code&gt;(distance, 0, 0)&lt;/code&gt;. Rotate the &lt;strong&gt;group&lt;/strong&gt;, not the planet.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;group&lt;/span&gt; &lt;span class="na"&gt;ref&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;orbitRef&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;                    &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* this group spins → orbit */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt; &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;      &lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="cm"&gt;/* planet stays put in local space */&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;sphereGeometry&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;radius&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
    &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;meshStandardMaterial&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;color&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;group&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="nf"&gt;useFrame&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;orbitRef&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;current&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rotation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;y&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;speed&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;dt&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// ONE addition&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



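&lt;p&gt;Now your own code does one addition per planet per frame and no trig. Three.js folds the rotation into the group's matrix during the scene-graph update it already runs every frame (that part is still JavaScript, not C++), and the GPU applies the resulting matrix to the vertices. The math still happens; it just happens inside the library instead of in your component.&lt;/p&gt;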

&lt;p&gt;Same trick for axial rotation: a &lt;strong&gt;child group&lt;/strong&gt; inside the planet rotates on its own Y axis. Tilt the wrapper group on the X axis and Uranus is suddenly tipped 98° like real Uranus. The whole solar system is six nested groups doing addition.&lt;/p&gt;
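
&lt;p&gt;To see why spinning the group traces the same circle as the hand-written trig, here is the Y-axis rotation written out in plain TypeScript. This is a sketch of the underlying math, not Three.js source code; &lt;code&gt;rotateY&lt;/code&gt; is a hypothetical helper, and the distance of 9 is borrowed from the Earth example above.&lt;/p&gt;

```typescript
// Rotate a local-space point about the Y axis (right-handed convention).
function rotateY(
  point: [number, number, number],
  theta: number,
): [number, number, number] {
  const [x, y, z] = point;
  const c = Math.cos(theta);
  const s = Math.sin(theta);
  return [x * c + z * s, y, -x * s + z * c];
}

const orbitDistance = 9;
const orbitAngle = Math.PI / 3;

// Pivot-group view: the planet stays at (distance, 0, 0) in local space;
// only the group's rotation angle changes.
const viaGroup = rotateY([orbitDistance, 0, 0], orbitAngle);

// Hand-written view from the naive version.
const viaTrig = [
  Math.cos(orbitAngle) * orbitDistance,
  0,
  Math.sin(orbitAngle) * orbitDistance,
];

// Same x, mirrored z: the same circle, travelled in the opposite direction.
console.log(viaGroup, viaTrig);
```

&lt;p&gt;The two versions differ only in which way the planet travels round the circle, which you can flip by negating &lt;code&gt;speed&lt;/code&gt;.&lt;/p&gt;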

&lt;h2&gt;
  
  
  Lighting: three lines, instantly 3D
&lt;/h2&gt;

&lt;p&gt;If you skip lighting, every planet looks flat — like a coloured paper disc. Add one &lt;code&gt;&amp;lt;pointLight&amp;gt;&lt;/code&gt; at the Sun's position and use &lt;code&gt;meshStandardMaterial&lt;/code&gt; for the planets:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;pointLight&lt;/span&gt; &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;intensity&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="mf"&gt;2.5&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;distance&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;

&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;sphereGeometry&lt;/span&gt; &lt;span class="na"&gt;args&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;meshStandardMaterial&lt;/span&gt; &lt;span class="na"&gt;color&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"blue"&lt;/span&gt; &lt;span class="na"&gt;roughness&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="mf"&gt;0.7&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;mesh&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;meshStandardMaterial&lt;/code&gt; is physically-based — it reads the light, bounces it off the surface based on &lt;code&gt;roughness&lt;/code&gt; and &lt;code&gt;metalness&lt;/code&gt;, and shades the half facing the light bright while the half facing away goes dark. Three lines. Instant 3D.&lt;/p&gt;

&lt;p&gt;Pro tip: don't use &lt;code&gt;meshStandardMaterial&lt;/code&gt; for the Sun itself. The Sun emits light, it doesn't receive it. Use &lt;code&gt;meshBasicMaterial&lt;/code&gt;, which ignores all lights and shows the colour you set, flat. Otherwise you'll have a yellow sphere with a dark side, which looks wrong.&lt;/p&gt;
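
&lt;p&gt;The "bright side, dark side" effect comes down to one dot product. The sketch below computes the classic Lambert diffuse term, which is a deliberate simplification: &lt;code&gt;meshStandardMaterial&lt;/code&gt; uses a fuller physically-based model, and the function names here are made up for the example.&lt;/p&gt;

```typescript
type Vec = [number, number, number];

function normalize(v: Vec): Vec {
  const len = Math.hypot(v[0], v[1], v[2]);
  return [v[0] / len, v[1] / len, v[2] / len];
}

function dot(a: Vec, b: Vec): number {
  return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

// Lambert diffuse term: 1 when the surface faces the light, 0 when it faces away.
function diffuse(surfacePoint: Vec, normal: Vec, lightPos: Vec): number {
  const toLight = normalize([
    lightPos[0] - surfacePoint[0],
    lightPos[1] - surfacePoint[1],
    lightPos[2] - surfacePoint[2],
  ]);
  return Math.max(0, dot(normalize(normal), toLight));
}

// A planet of radius 0.5 centred at x = 9, with the light at the origin (the Sun).
const litSide = diffuse([8.5, 0, 0], [-1, 0, 0], [0, 0, 0]);
const darkSide = diffuse([9.5, 0, 0], [1, 0, 0], [0, 0, 0]);
console.log(litSide, darkSide); // 1 0
```

&lt;p&gt;A material that ignores lights, like &lt;code&gt;meshBasicMaterial&lt;/code&gt;, skips this term entirely and shows the flat colour everywhere.&lt;/p&gt;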

&lt;h2&gt;
  
  
  OrbitControls: 80% of the polish for free
&lt;/h2&gt;

&lt;p&gt;Drei (the R3F helper library) ships an &lt;code&gt;&amp;lt;OrbitControls /&amp;gt;&lt;/code&gt; component. Drop it in your &lt;code&gt;&amp;lt;Canvas&amp;gt;&lt;/code&gt; and you get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Drag to rotate the camera around the scene&lt;/li&gt;
&lt;li&gt;Scroll to zoom&lt;/li&gt;
&lt;li&gt;Pinch to zoom on touch screens&lt;/li&gt;
&lt;li&gt;One-finger drag to rotate on touch screens&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;OrbitControls&lt;/span&gt; &lt;span class="na"&gt;enablePan&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;minDistance&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="na"&gt;maxDistance&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="mi"&gt;80&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt; &lt;span class="p"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One line, and all of "drag to look around" is done. This is the kind of thing that takes a junior developer two weeks in raw WebGL and 30 seconds in R3F. Use the helpers.&lt;/p&gt;

&lt;h2&gt;
  
  
  HTML overlays beat in-canvas UI
&lt;/h2&gt;

&lt;p&gt;The temptation when you're new to 3D is to put every UI element inside the 3D scene — billboards, sprites, text geometry. Don't. &lt;strong&gt;Mount your &lt;code&gt;&amp;lt;Canvas&amp;gt;&lt;/code&gt; full-bleed and stack regular HTML on top with &lt;code&gt;position: absolute&lt;/code&gt;.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"shell"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;header&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"hero"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;header&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;Canvas&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nc"&gt;Canvas&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;aside&lt;/span&gt; &lt;span class="na"&gt;className&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="s"&gt;"info-panel"&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;aside&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;div&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The info panel that slides in when you click a planet is just a styled &lt;code&gt;&amp;lt;aside&amp;gt;&lt;/code&gt;. The speed slider is &lt;code&gt;&amp;lt;input type="range"&amp;gt;&lt;/code&gt;. Your CSS skills transfer 1:1. The 3D part stays focused on 3D.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned actually building this
&lt;/h2&gt;

&lt;p&gt;Real takeaways from an afternoon of this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Three.js is huge but the surface you need is small.&lt;/strong&gt; The full Three.js bundle is ~600KB. You will use maybe 12 of its 400+ classes: &lt;code&gt;Scene&lt;/code&gt;, &lt;code&gt;Mesh&lt;/code&gt;, &lt;code&gt;SphereGeometry&lt;/code&gt;, &lt;code&gt;MeshStandardMaterial&lt;/code&gt;, &lt;code&gt;PointLight&lt;/code&gt;, &lt;code&gt;PerspectiveCamera&lt;/code&gt;, plus the &lt;code&gt;OrbitControls&lt;/code&gt; add-on. That's most of it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Real scale is the enemy.&lt;/strong&gt; The Sun is 109× the radius of Earth. Neptune orbits ~30× further than Earth. If you use real ratios, the Sun fills the screen and Neptune is a single pixel. Cheat the visuals. Show real numbers in the info panel.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. &lt;code&gt;useFrame&lt;/code&gt; runs every frame, so don't allocate.&lt;/strong&gt; That callback fires once per rendered frame: 60 times a second on a typical laptop, more on a high-refresh display. If you &lt;code&gt;new Vector3()&lt;/code&gt; inside it, you're creating garbage on every one of those frames. Either mutate refs you already have, or hoist allocations outside.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. &lt;code&gt;delta&lt;/code&gt; is your friend.&lt;/strong&gt; R3F's &lt;code&gt;useFrame((_, delta) =&amp;gt; ...)&lt;/code&gt; gives you seconds since last frame. Multiply your speed by &lt;code&gt;delta&lt;/code&gt; and your animation runs the same on a 60Hz laptop and a 144Hz gaming monitor. Without &lt;code&gt;delta&lt;/code&gt;, your planets orbit 2.4× faster on the 144Hz display than on the 60Hz one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. &lt;code&gt;dpr={[1, 2]}&lt;/code&gt; is the mobile performance switch.&lt;/strong&gt; Many phones report a device pixel ratio of 3, so an uncapped canvas renders 9 pixels for every CSS pixel and tanks the FPS. Capping at 2× cuts that to 4 pixels, which is 2.25× less shading work, and looks nearly identical to the eye.&lt;/p&gt;
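
&lt;p&gt;Takeaway 4 is easy to check with a small simulation. The sketch below runs one second of animation at two refresh rates, with and without scaling by &lt;code&gt;delta&lt;/code&gt;. The &lt;code&gt;simulate&lt;/code&gt; function is a stand-in written for this example, not part of R3F, and it assumes perfectly even frame timing.&lt;/p&gt;

```typescript
// Simulate one second of animation at a given refresh rate.
function simulate(hz: number, speed: number, useDelta: boolean): number {
  const delta = 1 / hz; // seconds per frame, the value useFrame would pass in
  let rotation = 0;
  for (let frame = 0; hz > frame; frame++) {
    rotation += useDelta ? speed * delta : speed;
  }
  return rotation;
}

const radiansPerSecond = 0.5;

// Scaled by delta: both displays end up at roughly the same angle.
console.log(simulate(60, radiansPerSecond, true), simulate(144, radiansPerSecond, true));

// Not scaled: the total depends on how many frames were drawn.
console.log(simulate(60, radiansPerSecond, false), simulate(144, radiansPerSecond, false)); // 30 72
```

&lt;p&gt;The unscaled totals differ by a factor of 144 / 60 = 2.4, which is exactly the speed-up a viewer on the faster monitor would see.&lt;/p&gt;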

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;3D in the browser used to be a specialty — game studios, big agencies, NASA visualizations. It's not specialty anymore. Product configurators, real estate walkthroughs, data visualizations, NFT galleries, classroom physics demos — every web product is starting to have a 3D moment.&lt;/p&gt;

&lt;p&gt;R3F is the lever that makes 3D approachable for people who already write React. You don't have to learn imperative scene-graph plumbing. You already know how trees of components work — you're doing 3D, you just have a different leaf type.&lt;/p&gt;

&lt;p&gt;So go play. Open the live demo, click each planet, scroll out and look at the layout from the side. Then clone the repo and change a number. Make the Sun blue. Add a moon to Earth — wrap an Earth-sized sphere in an outer group, position the moon &lt;code&gt;(1.2, 0, 0)&lt;/code&gt;, and watch it follow. That's the entire mental model. You'll be making your own scenes within an hour.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it / fork it
&lt;/h2&gt;

&lt;p&gt;🌐 Live: &lt;a href="https://threejs-from-zero.vercel.app" rel="noopener noreferrer"&gt;https://threejs-from-zero.vercel.app&lt;/a&gt;&lt;br&gt;
🐙 Code: &lt;a href="https://github.com/dev48v/threejs-from-zero" rel="noopener noreferrer"&gt;https://github.com/dev48v/threejs-from-zero&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is Day 36 of TechFromZero, a 50-day series where I build one tech from scratch every day with step-by-step commits you can read like a textbook. Yesterday was a voice AI tutor (Web Speech → Gemini → TTS). Tomorrow we're building a multi-agent AI orchestrator whose agents argue with each other.&lt;/p&gt;

&lt;p&gt;🌐 See all days: &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;https://dev48v.infy.uk/techfromzero.php&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Talk to you tomorrow.&lt;/p&gt;

</description>
      <category>javascript</category>
      <category>react</category>
      <category>threejs</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I Built a Voice AI Tutor in 200 Lines of Code (and Zero Backend)</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Mon, 18 May 2026 20:11:46 +0000</pubDate>
      <link>https://dev.to/dev48v/i-built-a-voice-ai-tutor-in-200-lines-of-code-and-zero-backend-7fe</link>
      <guid>https://dev.to/dev48v/i-built-a-voice-ai-tutor-in-200-lines-of-code-and-zero-backend-7fe</guid>
      <description>&lt;p&gt;Open Siri. Ask it a question. Listen to the reply.&lt;/p&gt;

&lt;p&gt;That whole experience — the magic that powers Alexa, ChatGPT voice mode, every car assistant, every drive-through screen — is &lt;strong&gt;three steps glued together&lt;/strong&gt;.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Turn microphone audio into text.&lt;/li&gt;
&lt;li&gt;Send the text to a brain.&lt;/li&gt;
&lt;li&gt;Turn the brain's reply back into audio.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. The whole industry of voice AI is variations on those three boxes. Different brains, different microphones, different voices, but the shape is identical.&lt;/p&gt;

&lt;p&gt;Today I'm going to build the whole thing in your browser. No server. No install. No API key except a single free one. Open the tab, click the mic, talk to an AI. Total code: about 200 lines.&lt;/p&gt;

&lt;p&gt;The pattern is the actual lesson. Once you see it, you can replace any box with a fancier one — Whisper for transcription, ElevenLabs for voices, your own fine-tuned model in the middle — and the architecture doesn't change.&lt;/p&gt;

&lt;h2&gt;
  
  
  The three Lego bricks
&lt;/h2&gt;

&lt;p&gt;Let me name them with the boring acronyms so you can search for them later:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;STT — Speech-to-Text.&lt;/strong&gt; Microphone audio → string of words. The expensive option is OpenAI Whisper (best accuracy, costs a fraction of a cent per minute). The free option, which I'm using here, is the &lt;strong&gt;Web Speech API&lt;/strong&gt;, which has shipped in Chrome since 2013. You give it a microphone permission and it gives you back text. Zero key and no upload code to write: Chrome sends your audio to Google's recognizer behind the scenes for you. It's slightly less accurate than Whisper, especially on accents, but for a learning demo the difference doesn't matter.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LLM — the brain.&lt;/strong&gt; This is the part everyone gets excited about. You hand a string to a Large Language Model and it hands a string back. ChatGPT, Claude, Gemini — they all expose the same shape: send a list of messages, get a message back. I'm using &lt;strong&gt;Gemini 2.5 Flash&lt;/strong&gt; because Google gives it away free at 15 requests per minute. Beginners shouldn't have to wave a credit card to learn how this works.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TTS — Text-to-Speech.&lt;/strong&gt; String → audio you can play. The fancy option is ElevenLabs, whose voices are so good they sound uncanny. The free, zero-key option is &lt;code&gt;window.speechSynthesis&lt;/code&gt;, which has shipped in every major browser since 2014. It sounds robotic, but it's instant and it costs nothing.&lt;/p&gt;

&lt;p&gt;Notice the pattern: every brick has an expensive flavor and a free flavor. The interfaces are identical. You can swap one for the other without changing the architecture. &lt;strong&gt;That's why this is worth learning.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Wiring the loop
&lt;/h2&gt;

&lt;p&gt;Here's the entire pipeline in pseudocode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="n"&gt;wants&lt;/span&gt; &lt;span class="n"&gt;to&lt;/span&gt; &lt;span class="n"&gt;talk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;listening&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;STT&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;listen&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;        &lt;span class="c1"&gt;# mic open until silence
&lt;/span&gt;    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;thinking&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;LLM&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ask&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;      &lt;span class="c1"&gt;# 1-2 seconds typically
&lt;/span&gt;    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;speaking&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;TTS&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;say&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;             &lt;span class="c1"&gt;# plays through speakers
&lt;/span&gt;    &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;idle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The state machine matters more than you'd think. If the user clicks the mic while the assistant is still talking, you need to cancel the playback. If they click while the LLM is still thinking, you need to keep them out. UIs get confusing fast when you have four states and one button. I'll show you the React version in a minute.&lt;/p&gt;
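
&lt;p&gt;One way to keep that single button honest is to route every click through one pure transition function. Here's a minimal sketch of the idea, not the code from the repo: the &lt;code&gt;Action&lt;/code&gt; names and the &lt;code&gt;onMicClick&lt;/code&gt; function are hypothetical, chosen for illustration.&lt;/p&gt;

```typescript
// Hypothetical sketch: action names are illustrative, not taken from the repo.
type Phase = "idle" | "listening" | "thinking" | "speaking";
type Action = "startListening" | "stopListening" | "cancelSpeech" | "ignore";

// Decide what a mic click should do in each phase.
function onMicClick(phase: Phase): { next: Phase; action: Action } {
  switch (phase) {
    case "idle":
      return { next: "listening", action: "startListening" };
    case "listening":
      return { next: "idle", action: "stopListening" };
    case "speaking":
      // Clicking while the assistant talks cancels playback.
      return { next: "idle", action: "cancelSpeech" };
    case "thinking":
      // The LLM request is in flight, so the click is ignored.
      return { next: "thinking", action: "ignore" };
  }
}
```

&lt;p&gt;The click handler then only has to run the returned action and store &lt;code&gt;next&lt;/code&gt; as the new phase, so there is exactly one place to look when the button misbehaves.&lt;/p&gt;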

&lt;h2&gt;
  
  
  The STT brick
&lt;/h2&gt;

&lt;p&gt;The browser ships a class called &lt;code&gt;SpeechRecognition&lt;/code&gt; (exposed under the prefixed name &lt;code&gt;webkitSpeechRecognition&lt;/code&gt; in Chrome and Safari). The API is event-based, not promise-based, which is a little annoying, but the pattern is straightforward:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
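&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Chrome and Safari expose the constructor as webkitSpeechRecognition, so fall back to it&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;Recognition&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;SpeechRecognition&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;window&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;any&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;webkitSpeechRecognition&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rec&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Recognition&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;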
&lt;span class="nx"&gt;rec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lang&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;en-US&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;rec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;continuous&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;       &lt;span class="c1"&gt;// keep mic open across pauses&lt;/span&gt;
&lt;span class="nx"&gt;rec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;interimResults&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;   &lt;span class="c1"&gt;// stream partials while user talks&lt;/span&gt;

&lt;span class="nx"&gt;rec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onresult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resultIndex&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isFinal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nf"&gt;onFinal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="nf"&gt;onPartial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;transcript&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;rec&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two things to notice. First, &lt;strong&gt;&lt;code&gt;interimResults&lt;/code&gt;&lt;/strong&gt; is a gift. It streams text while the user is still talking, so you can show "you're saying..." in real time. It feels alive instead of laggy. Second, &lt;strong&gt;&lt;code&gt;resultIndex&lt;/code&gt;&lt;/strong&gt; lets you only walk new results since the last fire — the browser keeps the whole session's results in the &lt;code&gt;results&lt;/code&gt; array, but you usually only care about what's new.&lt;/p&gt;
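
&lt;p&gt;To make the &lt;code&gt;resultIndex&lt;/code&gt; walk concrete, here's a small sketch that pulls the loop out into a pure function you can test without a microphone. The &lt;code&gt;ResultEventLike&lt;/code&gt; types and the &lt;code&gt;splitTranscripts&lt;/code&gt; name are assumptions for illustration; they only mirror the fields the handler above reads.&lt;/p&gt;

```typescript
// Hypothetical structural types mirroring the fields read from a recognition result event.
interface AltLike { transcript: string }
interface ResultLike { isFinal: boolean; 0: AltLike }
interface ResultListLike { length: number; [index: number]: ResultLike }
interface ResultEventLike { resultIndex: number; results: ResultListLike }

// Walk only the results that are new since the last event and split them by finality.
function splitTranscripts(e: ResultEventLike): { finals: string[]; partial: string } {
  const finals: string[] = [];
  let partial = "";
  for (let i = e.resultIndex; e.results.length > i; i++) {
    const r = e.results[i];
    if (!r) continue;
    if (r.isFinal) finals.push(r[0].transcript);
    else partial += r[0].transcript; // interim text, shown live and replaced on the next event
  }
  return { finals, partial };
}
```

&lt;p&gt;Results before &lt;code&gt;resultIndex&lt;/code&gt; are skipped entirely, which is what keeps an already-handled sentence from being sent to the LLM twice.&lt;/p&gt;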

&lt;h2&gt;
  
  
  The LLM brick
&lt;/h2&gt;

&lt;p&gt;Google's SDK makes this almost embarrassingly short:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;GoogleGenerativeAI&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@google/generative-ai&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;GoogleGenerativeAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;ai&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getGenerativeModel&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;gemini-2.5-flash&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;systemInstruction&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Reply in 1-3 short sentences. No markdown.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;generationConfig&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;maxOutputTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startChat&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;history&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userText&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two design choices worth calling out.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;System prompt.&lt;/strong&gt; I tell the model to reply in one to three short sentences, with no markdown. Why? Because the TTS will read every word. If Gemini writes a Wikipedia paragraph, your user is going to sit through 90 seconds of robot voice waiting for the next chance to talk. Voice AIs need to be terser than text AIs. This is a real lesson: half of building voice products is wrestling the model down to a sentence or two.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;maxOutputTokens.&lt;/strong&gt; A hard ceiling of 200 tokens. Even if the model decides to ignore the system prompt and ramble, generation stops there, possibly mid-sentence. Belt and suspenders.&lt;/p&gt;

&lt;h2&gt;
  
  
  The TTS brick
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;u&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;SpeechSynthesisUtterance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;lang&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;en-US&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;voice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bestVoiceFor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;en-US&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;speechSynthesis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cancel&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;   &lt;span class="c1"&gt;// kill anything currently playing&lt;/span&gt;
&lt;span class="nx"&gt;speechSynthesis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;speak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;u&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The one gotcha: &lt;code&gt;speechSynthesis.getVoices()&lt;/code&gt; often returns an empty array the first time you call it. Voices load asynchronously, and Chrome fires a &lt;code&gt;voiceschanged&lt;/code&gt; event when they're ready. So I wrap voice-loading in a one-shot promise that callers can await. Otherwise your first reply plays in the browser's default voice instead of the nice Google one.&lt;/p&gt;
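
&lt;p&gt;A promise wrapper along those lines might look like the sketch below. It is an illustration under assumptions rather than the repo's code: &lt;code&gt;SynthLike&lt;/code&gt;, &lt;code&gt;VoiceLike&lt;/code&gt; and &lt;code&gt;loadVoices&lt;/code&gt; are hypothetical names, and the synth object is passed in as a parameter (in the browser the intent is to pass &lt;code&gt;speechSynthesis&lt;/code&gt;) so the logic can be exercised with a fake.&lt;/p&gt;

```typescript
// Hypothetical minimal shapes covering only what this sketch needs.
interface VoiceLike { name: string; lang: string }
interface SynthLike {
  getVoices(): VoiceLike[];
  addEventListener(type: string, listener: () => void): void;
  removeEventListener(type: string, listener: () => void): void;
}

// Resolve once with a non-empty voice list, waiting for "voiceschanged" if needed.
function loadVoices(synth: SynthLike) {
  return new Promise((resolve: (voices: VoiceLike[]) => void) => {
    const ready = synth.getVoices();
    if (ready.length > 0) {
      resolve(ready);
      return;
    }
    const onChange = () => {
      const voices = synth.getVoices();
      if (voices.length === 0) return; // still loading, keep waiting
      synth.removeEventListener("voiceschanged", onChange);
      resolve(voices);
    };
    synth.addEventListener("voiceschanged", onChange);
  });
}
```

&lt;p&gt;Callers await the result before picking a voice. Note that this version never rejects, so on a browser with no voices at all it would wait forever; a timeout is a reasonable addition.&lt;/p&gt;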

&lt;h2&gt;
  
  
  Wiring it in React
&lt;/h2&gt;

&lt;p&gt;The whole React component is a state machine over &lt;code&gt;phase: "idle" | "listening" | "thinking" | "speaking"&lt;/code&gt; and a list of messages.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;phase&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setPhase&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Phase&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;idle&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;Message&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;([]);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;startListening&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;setPhase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;listening&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;stt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;onFinal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;stt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userMsg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;curr&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;curr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userMsg&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
      &lt;span class="nf"&gt;setPhase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;thinking&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;askGemini&lt;/span&gt;&lt;span class="p"&gt;([...&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userMsg&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;setMessages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;curr&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;curr&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;model&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;reply&lt;/span&gt; &lt;span class="p"&gt;}]);&lt;/span&gt;
      &lt;span class="nf"&gt;setPhase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;speaking&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nf"&gt;speak&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reply&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;onEnd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setPhase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;idle&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The mic button changes label based on phase. Click it during &lt;code&gt;idle&lt;/code&gt; to start listening, click it during &lt;code&gt;listening&lt;/code&gt;/&lt;code&gt;speaking&lt;/code&gt; to stop. The transcript renders as a list of bubbles. That's the whole UI.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned actually building this
&lt;/h2&gt;

&lt;p&gt;A few real takeaways from spending an afternoon on this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Browser TTS quality is better than you remember.&lt;/strong&gt; The Google voices on Chrome are genuinely fine. They were embarrassing in 2015. They're not embarrassing now. For a learning demo, ElevenLabs is overkill.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. The pipeline is the lesson, not the tools.&lt;/strong&gt; When a recruiter says "build a voice agent," they don't mean "use these three specific libraries." They mean "wire mic, brain, and speaker together with a state machine that doesn't get confused." Once you can do that, you can swap parts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Voice changes how you prompt.&lt;/strong&gt; A system prompt that's great for ChatGPT (gives bulleted lists, uses headings) is terrible for voice. The TTS reads "asterisk asterisk" out loud. Tell the model "no markdown, no lists, one paragraph" or live with the consequences.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. State machines beat booleans.&lt;/strong&gt; I started with &lt;code&gt;isListening&lt;/code&gt; + &lt;code&gt;isThinking&lt;/code&gt; + &lt;code&gt;isSpeaking&lt;/code&gt; booleans. Within five minutes I had bugs where two were true at once. A single &lt;code&gt;phase&lt;/code&gt; enum makes the impossible states actually impossible. Reach for this earlier than you think.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Free tiers are enough to learn on.&lt;/strong&gt; Gemini's free tier covers ~14,000 requests per day. You will not run out while learning. Don't let "what API should I pay for" stop you from starting.&lt;/p&gt;
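
&lt;p&gt;For takeaway 3, a cheap safety net is to scrub leftover markdown before the text reaches the TTS. This is a rough sketch, not something from the repo: &lt;code&gt;stripMarkdown&lt;/code&gt; is a hypothetical helper, and its regexes only cover headings, list markers, emphasis and inline code.&lt;/p&gt;

```typescript
// Hypothetical helper: removes common markdown markers so the TTS does not read them aloud.
function stripMarkdown(text: string): string {
  return text
    .replace(/^\s{0,3}#{1,6}\s+/gm, "")      // heading markers
    .replace(/^\s*(?:[-*+]|\d+\.)\s+/gm, "") // bullet and numbered list markers
    .replace(/(\*\*|__)(.+?)\1/g, "$2")      // bold
    .replace(/(\*|_)(.+?)\1/g, "$2")         // italic
    .replace(/`([^`]*)`/g, "$1")             // inline code
    .replace(/\s+/g, " ")                    // collapse whitespace and newlines
    .trim();
}
```

&lt;p&gt;It will also eat the underscores in identifiers that contain two or more of them, which is acceptable for spoken output but not for text you display, so keep the original string for the transcript bubbles.&lt;/p&gt;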

&lt;h2&gt;
  
  
  Why this matters
&lt;/h2&gt;

&lt;p&gt;Every "AI agent" startup right now is some variation of these three boxes plus a loop. Voice tutors, customer service bots, drive-throughs, in-car assistants, accessibility tools. Once you can wire the three bricks, you can build any of them. The hard part is taste — which brain, which voice, which prompt, which moment to interrupt. That's the next ten years of product work, and it's all built on top of the architecture you can spin up in a single afternoon.&lt;/p&gt;

&lt;p&gt;So go spin it up. Open the repo. Read the commits one at a time. The first commit is an empty React shell. The seventh commit is the entire app. Each commit is one concept. You'll get more out of reading the seven small steps than reading one huge final file.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it / fork it
&lt;/h2&gt;

&lt;p&gt;🌐 Live: &lt;a href="https://voice-from-zero.vercel.app" rel="noopener noreferrer"&gt;https://voice-from-zero.vercel.app&lt;/a&gt;&lt;br&gt;
🐙 Code: &lt;a href="https://github.com/dev48v/voice-from-zero" rel="noopener noreferrer"&gt;https://github.com/dev48v/voice-from-zero&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This is Day 35 of TechFromZero — a 50-day series where I build one tech from scratch every day with step-by-step commits you can read like a textbook. Yesterday was Stable Diffusion. Tomorrow is 3D in the browser with Three.js.&lt;/p&gt;

&lt;p&gt;If you're learning AI and want a low-stakes way to actually ship something — clone the repo, change the model, change the voice, change the system prompt, and you'll have an entirely different demo by lunch. Make it a French tutor. Make it a Dungeon Master. Make it a meditation guide. The Legos snap together however you want.&lt;/p&gt;

&lt;p&gt;🌐 See all days: &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;https://dev48v.infy.uk/techfromzero.php&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Talk to you tomorrow.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>javascript</category>
      <category>react</category>
      <category>beginners</category>
    </item>
    <item>
      <title>I built a Stable Diffusion playground in 200 lines and zero API keys. Here's how.</title>
      <dc:creator>Devanshu Biswas</dc:creator>
      <pubDate>Fri, 15 May 2026 21:01:36 +0000</pubDate>
      <link>https://dev.to/dev48v/i-built-a-stable-diffusion-playground-in-200-lines-and-zero-api-keys-heres-how-40c0</link>
      <guid>https://dev.to/dev48v/i-built-a-stable-diffusion-playground-in-200-lines-and-zero-api-keys-heres-how-40c0</guid>
      <description>&lt;p&gt;The first time I generated an AI image, I expected the worst.&lt;/p&gt;

&lt;p&gt;A signup page. Email verification. A "free trial" with a credit card on file. A Python SDK that wouldn't install. CUDA. A Hugging Face account.&lt;/p&gt;

&lt;p&gt;None of that happened.&lt;/p&gt;

&lt;p&gt;I typed a URL into my browser, pressed Enter, and waited five seconds. An image of an astronaut riding a horse on Mars appeared. I had not given anyone an email address. I had not installed anything.&lt;/p&gt;

&lt;p&gt;I built Day 34 of &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;TechFromZero&lt;/a&gt; around that API: a free, zero-auth gateway called &lt;a href="https://pollinations.ai" rel="noopener noreferrer"&gt;Pollinations.ai&lt;/a&gt; that hosts Stable Diffusion, FLUX, and a handful of other models behind a beautifully boring URL pattern.&lt;/p&gt;

&lt;p&gt;If you've been putting off learning generative AI because the setup looked like Day 1 of a Computer Science PhD, this is your skip-the-cutscene button.&lt;/p&gt;

&lt;h2&gt;
  
  
  The whole API, on one line
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://image.pollinations.ai/prompt/an+astronaut+cat?model=flux
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Hit that URL. You get an image. That is the entire integration.&lt;/p&gt;

&lt;p&gt;There is no JSON to parse. There is no SDK to install. There is no token to manage. The image flows from Pollinations' CDN straight into your browser's &lt;code&gt;&amp;lt;img&amp;gt;&lt;/code&gt; tag, and your browser doesn't know or care that an AI generated it. As far as it's concerned, it loaded a picture.&lt;/p&gt;

&lt;p&gt;Compared to the way the same task usually feels — &lt;em&gt;npm install some-sdk; set ANTHROPIC_API_KEY; add billing; wait for the cold start&lt;/em&gt; — this is approximately 100% less ceremony.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I built
&lt;/h2&gt;

&lt;p&gt;A single-page React app. You type a prompt, click Generate, and an image appears. The app remembers your last 50 generations in &lt;code&gt;localStorage&lt;/code&gt; (metadata only — the images live on Pollinations' CDN), lets you favorite the ones you like, lock the seed for reproducible variations, and download any image as a PNG.&lt;/p&gt;
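
&lt;p&gt;The history piece is roughly this shape. Treat it as a sketch under assumptions, not the repo's code: the &lt;code&gt;Generation&lt;/code&gt; fields, the &lt;code&gt;generations&lt;/code&gt; storage key and the function names are all hypothetical. The storage object is passed in so the same functions work with &lt;code&gt;localStorage&lt;/code&gt; in the browser and with a fake in tests.&lt;/p&gt;

```typescript
// Hypothetical record shape: only metadata is stored, never the image bytes.
interface Generation { prompt: string; seed: number; model: string; url: string }

// Minimal storage surface; the browser's localStorage has these two methods.
interface StorageLike {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
}

const HISTORY_KEY = "generations"; // assumed key name
const HISTORY_LIMIT = 50;

// Read the saved history, falling back to an empty list on missing or corrupt data.
function loadHistory(store: StorageLike): Generation[] {
  const raw = store.getItem(HISTORY_KEY);
  if (raw === null) return [];
  try {
    const parsed = JSON.parse(raw);
    return Array.isArray(parsed) ? parsed : [];
  } catch {
    return [];
  }
}

// Put the newest entry first and drop anything beyond the limit.
function saveGeneration(store: StorageLike, entry: Generation): Generation[] {
  const next = [entry, ...loadHistory(store)].slice(0, HISTORY_LIMIT);
  store.setItem(HISTORY_KEY, JSON.stringify(next));
  return next;
}
```

&lt;p&gt;Because each record is a few short strings and a number, 50 of them stay far below the storage quota.&lt;/p&gt;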

&lt;p&gt;I deliberately kept the whole thing one React component (~200 lines). Splitting it across five files and a Redux store would have added zero clarity for the reader and a lot of noise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tech:&lt;/strong&gt; Vite + React 19 + TypeScript + &lt;code&gt;localStorage&lt;/code&gt;. &lt;strong&gt;Backend:&lt;/strong&gt; none. &lt;strong&gt;API keys:&lt;/strong&gt; zero. &lt;strong&gt;Vercel deploy:&lt;/strong&gt; static SPA, ~10 seconds.&lt;/p&gt;

&lt;p&gt;📸 Try it: &lt;a href="https://stable-diffusion-from-zero.vercel.app" rel="noopener noreferrer"&gt;stable-diffusion-from-zero.vercel.app&lt;/a&gt;&lt;br&gt;
🧑‍💻 Code: &lt;a href="https://github.com/dev48v/stable-diffusion-from-zero" rel="noopener noreferrer"&gt;github.com/dev48v/stable-diffusion-from-zero&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The four parameters that matter
&lt;/h2&gt;

&lt;p&gt;Pollinations supports a fistful of query params. Four of them are the ones a beginner cares about:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;model&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Pick the generator. The options you'll actually want:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;flux&lt;/code&gt; — fast, photorealistic, great default.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;flux-anime&lt;/code&gt; — illustration / stylised art.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sdxl&lt;/code&gt; — Stable Diffusion XL, the classic. Strong on composition.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sd3&lt;/code&gt; — Stable Diffusion 3. Specifically good at rendering legible text inside images. Yes, the AI can write the word "BAKERY" on a shop sign now.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;dalle3&lt;/code&gt; — OpenAI's DALL·E 3 routed through Pollinations. Surreal, concept-art-friendly.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;seed&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Same prompt + same seed (with the same model and size) = the same image. This is the most important parameter beginners ignore.&lt;/p&gt;

&lt;p&gt;Why? Because once you find a composition you like, you want to &lt;em&gt;iterate&lt;/em&gt; — slightly different lighting, slightly different angle, but the same person. Lock the seed, change the prompt by one word at a time, and you can see what a single word does to the model's mental picture.&lt;/p&gt;

&lt;p&gt;In the playground, every generation rolls a new random seed automatically. If you find one you like, copy the number, paste it back into the seed input, and lock it for the next try.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;width&lt;/code&gt; + &lt;code&gt;height&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Pollinations accepts almost any pixel size, but the model was trained on 1024×1024-ish inputs, so it's best at square (1024×1024) and gentle aspect ratios (768×1152 portrait, 1152×768 landscape). Push past 2048 in either direction and you'll start to see weird artifacts — extra fingers, fractal hair, that AI-image uncanny-valley vibe.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;code&gt;enhance&lt;/code&gt;
&lt;/h3&gt;

&lt;p&gt;Easily my favorite. With &lt;code&gt;enhance=true&lt;/code&gt;, Pollinations runs your prompt through a small LLM &lt;em&gt;first&lt;/em&gt;, expanding it into a more descriptive version before handing it to the image model.&lt;/p&gt;

&lt;p&gt;So when you type &lt;strong&gt;"astronaut cat"&lt;/strong&gt;, the LLM rewrites that to something like:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"A highly detailed astronaut cat floating in zero gravity, photorealistic, cinematic lighting, 4k resolution, intricate space suit details, distant Earth in the background"&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;…and &lt;em&gt;then&lt;/em&gt; the image model renders the expanded version. Your three-word prompt looks like it was written by someone who's been making Midjourney art for two years.&lt;/p&gt;
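
&lt;p&gt;Putting the four parameters together, a URL builder could look like the sketch below. The &lt;code&gt;buildImageUrl&lt;/code&gt; name and the &lt;code&gt;ImageOptions&lt;/code&gt; shape are hypothetical, and percent-encoding the prompt with &lt;code&gt;encodeURIComponent&lt;/code&gt; is an assumption; the query parameter names are the ones described above.&lt;/p&gt;

```typescript
// Hypothetical option bag covering the parameters discussed above.
interface ImageOptions {
  model?: string;
  seed?: number;
  width?: number;
  height?: number;
  enhance?: boolean;
}

// Build an image URL following the pattern shown earlier; only options that are set are sent.
function buildImageUrl(prompt: string, opts: ImageOptions = {}): string {
  const query = new URLSearchParams();
  if (opts.model !== undefined) query.set("model", opts.model);
  if (opts.seed !== undefined) query.set("seed", String(opts.seed));
  if (opts.width !== undefined) query.set("width", String(opts.width));
  if (opts.height !== undefined) query.set("height", String(opts.height));
  if (opts.enhance !== undefined) query.set("enhance", String(opts.enhance));

  // Assumption: the prompt is percent-encoded into the path segment.
  const base = "https://image.pollinations.ai/prompt/" + encodeURIComponent(prompt);
  const qs = query.toString();
  return qs === "" ? base : base + "?" + qs;
}
```

&lt;p&gt;The returned string goes straight into an image tag's &lt;code&gt;src&lt;/code&gt;. Keeping &lt;code&gt;seed&lt;/code&gt; fixed while editing one word of the prompt gives you the side-by-side comparison described in the seed section.&lt;/p&gt;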

&lt;h2&gt;
  
  
  Things I learned (and that beginners should know)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Generated images aren't free in the legal sense.&lt;/strong&gt; Pollinations' terms permit personal use; many model licences (FLUX in particular) restrict commercial use. If you're shipping a product, read the licence of the specific model you're using before you sell anything you generated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The seed parameter is reproducibility, not randomness.&lt;/strong&gt; I used to think &lt;code&gt;seed=42&lt;/code&gt; was "the 42nd image". It's not. It's the random-number-generator's starting state. Different prompts with &lt;code&gt;seed=42&lt;/code&gt; produce wildly different images. The seed only "matches" when the prompt is also identical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Aspect ratio affects content.&lt;/strong&gt; Ask for "a portrait of a librarian" at 1024×1024 and you'll get the librarian centered with bookshelves around them. Ask for the same thing at 1152×768 and you'll get a wider shot — the librarian &lt;em&gt;plus&lt;/em&gt; a reading nook. The model uses the canvas shape as a hint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;enhance&lt;/code&gt; makes lazy prompts good.&lt;/strong&gt; Three words become twenty, and the output looks noticeably better. For most users this is the right default, which is why my playground turns it on automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to extend this
&lt;/h2&gt;

&lt;p&gt;A few good next steps if you want to keep going:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Inpainting&lt;/strong&gt; — Pollinations supports a &lt;code&gt;mask=&lt;/code&gt; parameter for editing parts of an existing image. Add a brush tool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image-to-image&lt;/strong&gt; — pass an existing image URL as a starting point. Useful for stylistic transfer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch generation&lt;/strong&gt; — fire off a 4×4 grid of variations on the same prompt with different seeds. Pollinations is happy to serve in parallel.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Self-host&lt;/strong&gt; — Pollinations is open source. You can run the inference stack yourself on a GPU and avoid even the implicit dependency on someone else's free infrastructure.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What's next in the series
&lt;/h2&gt;

&lt;p&gt;This is Day 34 of TechFromZero — one new technology every day, built from scratch with detailed commits. Day 35 picks up the AI thread with &lt;strong&gt;voice AI&lt;/strong&gt; (Whisper + Gemini + ElevenLabs).&lt;/p&gt;

&lt;p&gt;If you're learning AI from a beginner background and want a curriculum that's actually fun, &lt;a href="https://dev48v.infy.uk/techfromzero.php" rel="noopener noreferrer"&gt;follow along at dev48v.infy.uk/techfromzero&lt;/a&gt;. Each day stands alone — start anywhere.&lt;/p&gt;

&lt;p&gt;Generated AI images used to require a CS degree, a GPU, and a credit card. Now they require a browser tab. The barrier to creating with AI is gone. The only barrier left is curiosity.&lt;/p&gt;

&lt;p&gt;Try a weird prompt. See what happens.&lt;/p&gt;

</description>
      <category>stablediffusion</category>
      <category>ai</category>
      <category>beginners</category>
      <category>javascript</category>
    </item>
  </channel>
</rss>
