<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Code with Gem</title>
    <description>The latest articles on DEV Community by Code with Gem (@gemrey13).</description>
    <link>https://dev.to/gemrey13</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2431053%2F78072a15-34a3-44a0-8954-bbcbb2d837f0.jpeg</url>
      <title>DEV Community: Code with Gem</title>
      <link>https://dev.to/gemrey13</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/gemrey13"/>
    <language>en</language>
    <item>
      <title>Building War-Machine: My First Local AI Bridge with Ollama &amp; Node.js 🚀🦾</title>
      <dc:creator>Code with Gem</dc:creator>
      <pubDate>Thu, 26 Mar 2026 10:36:41 +0000</pubDate>
      <link>https://dev.to/gemrey13/building-war-machine-my-first-local-ai-bridge-with-ollama-nodejs-3615</link>
      <guid>https://dev.to/gemrey13/building-war-machine-my-first-local-ai-bridge-with-ollama-nodejs-3615</guid>
      <description>&lt;p&gt;I finally did it. I built my first local AI integration, and I named him &lt;strong&gt;War-Machine&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;As a personal project, I wanted to see if I could make a local LLM feel as fast as a cloud API on a mid-range laptop (i5-1235U). Here is the breakdown of how I made it happen.&lt;/p&gt;

&lt;p&gt;🛠️ &lt;strong&gt;The Tech Stack&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;Engine&lt;/strong&gt;: Ollama (Llama 3.2 3B)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Backend&lt;/strong&gt;: Node.js (ES Modules) + Express 5&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hardware&lt;/strong&gt;: Intel i5-1235U | 16GB RAM&lt;/p&gt;
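&lt;p&gt;To make the "bridge" concrete, here is roughly what the Node side sends to Ollama's default local endpoint. The helper name and model tag are illustrative assumptions, not lifted from the repo:&lt;/p&gt;

```javascript
// Builds the request the bridge forwards to a local Ollama instance.
// Endpoint and port are Ollama's documented defaults; the helper name
// and model tag are illustrative, not taken from the War-Machine repo.
function buildOllamaRequest(prompt) {
  return {
    url: "http://127.0.0.1:11434/api/generate", // IPv4 directly, not localhost
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model: "llama3.2:3b", // the 3B model the post runs on CPU
        prompt,
        stream: true,         // token-by-token chunks instead of one blob
      }),
    },
  };
}
```

&lt;p&gt;Pass the &lt;code&gt;url&lt;/code&gt; and &lt;code&gt;options&lt;/code&gt; straight into &lt;code&gt;fetch()&lt;/code&gt; and you have the whole round trip.&lt;/p&gt;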

&lt;p&gt;⚡ &lt;strong&gt;Key Optimizations&lt;/strong&gt;&lt;br&gt;
Most beginners struggle with local AI being "slow." Here are the two things that changed the game for &lt;strong&gt;War-Machine&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Direct IPv4 Binding&lt;/strong&gt;: Don't use &lt;code&gt;localhost&lt;/code&gt; on Windows; use &lt;code&gt;127.0.0.1&lt;/code&gt;. On Windows, &lt;code&gt;localhost&lt;/code&gt; often resolves to IPv6 (&lt;code&gt;::1&lt;/code&gt;) first, and if the server only listens on IPv4, every request waits out the failed IPv6 attempt before falling back. Hitting &lt;code&gt;127.0.0.1&lt;/code&gt; directly skips that ~2-second stall.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Chunked Streaming&lt;/strong&gt;: By streaming the response, the user starts reading in &amp;lt; 2 seconds, even if the full message takes 8 seconds to finish.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
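&lt;p&gt;The streaming side boils down to a tiny NDJSON parser: with &lt;code&gt;stream: true&lt;/code&gt;, Ollama's &lt;code&gt;/api/generate&lt;/code&gt; emits one JSON object per line, each carrying a &lt;code&gt;response&lt;/code&gt; fragment. A minimal sketch (the function name is mine, not from the repo):&lt;/p&gt;

```javascript
// Parses a chunk of Ollama's streaming NDJSON output (one JSON object
// per line) and returns the concatenated "response" text from it.
function extractStreamText(ndjsonChunk) {
  let text = "";
  for (const line of ndjsonChunk.split("\n")) {
    if (line.trim() === "") continue;      // skip blank trailing lines
    const obj = JSON.parse(line);
    if (obj.response) text += obj.response; // /api/generate puts tokens here
  }
  return text;
}
```

&lt;p&gt;In the real server you would call this on every chunk and &lt;code&gt;res.write()&lt;/code&gt; the text straight to the client, which is what puts the first tokens on screen in under 2 seconds.&lt;/p&gt;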

&lt;p&gt;🛡️ &lt;strong&gt;The Persona&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;War-Machine&lt;/strong&gt; is configured via a custom &lt;code&gt;Modelfile&lt;/code&gt; to be a witty, tactical assistant. It makes debugging much more entertaining when your AI talks back like a drill sergeant.&lt;/p&gt;
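&lt;p&gt;For anyone who hasn't written one, a &lt;code&gt;Modelfile&lt;/code&gt; along these lines would produce that kind of persona (the actual system prompt lives in the repo; this one is just an illustration):&lt;/p&gt;

```
FROM llama3.2:3b
PARAMETER temperature 0.7
SYSTEM """You are War-Machine: a witty, tactical coding assistant.
Answer like a drill sergeant: short, direct, and a little impatient."""
```

&lt;p&gt;Then &lt;code&gt;ollama create war-machine -f Modelfile&lt;/code&gt; registers it locally.&lt;/p&gt;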

&lt;p&gt;I've open-sourced the project for anyone else looking to jump into local AI without a dedicated GPU.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repo:&lt;/strong&gt; &lt;a href="https://github.com/gemrey13/War-Machine-AI" rel="noopener noreferrer"&gt;gemrey13/War-Machine-AI&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ollama</category>
      <category>ai</category>
      <category>javascript</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
