<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Erick Mwangi Muguchia </title>
    <description>The latest articles on DEV Community by Erick Mwangi Muguchia  (@muguchiaerickmwangi).</description>
    <link>https://dev.to/muguchiaerickmwangi</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3610630%2F6ad407ec-048d-47dd-acc9-1d742265a0fe.jpg</url>
      <title>DEV Community: Erick Mwangi Muguchia </title>
      <link>https://dev.to/muguchiaerickmwangi</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/muguchiaerickmwangi"/>
    <language>en</language>
    <item>
      <title>No GPU? No problem! Running local AI efficiently on my CPU.</title>
      <dc:creator>Erick Mwangi Muguchia </dc:creator>
      <pubDate>Tue, 14 Apr 2026 20:21:12 +0000</pubDate>
      <link>https://dev.to/muguchiaerickmwangi/no-gpu-no-problem-running-local-ai-efficiently-on-my-cpu-1fhf</link>
      <guid>https://dev.to/muguchiaerickmwangi/no-gpu-no-problem-running-local-ai-efficiently-on-my-cpu-1fhf</guid>
      <description>&lt;h2&gt;
  
  
  1. Why I tried this.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  2. My setup.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  3. The problems I faced.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  4. The tweaks I discovered.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  5. Results.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  6. Lessons.
&lt;/h2&gt;

&lt;h2&gt;
  
  
  THE WHY:
&lt;/h2&gt;

&lt;p&gt;I’ve always wanted to explore deep conversations with AI and understand how these systems work. For a long time, that dream was limited by the lack of a GPU and the high cost of apps that allow meaningful interaction with AI models. Now, I’m determined to overcome those barriers and build my own path into this world.&lt;/p&gt;

&lt;p&gt;So the idea of running a model locally kept bugging me, and I finally got to do it.&lt;br&gt;
It was a long and educational journey, so let me walk you through it.&lt;/p&gt;
&lt;h2&gt;
  
  
  My setup
&lt;/h2&gt;

&lt;p&gt;OS: Arch Linux x86_64&lt;br&gt;
Workflow: tmux + i3 (just because I like using key-bindings), starship + wezterm&lt;br&gt;
Hardware:&lt;br&gt;
CPU: Intel(R) Core(TM) i5-7200U (4) @ 3.10 GHz&lt;br&gt;
GPU: Intel HD Graphics 620 @ 1.00 GHz [Integrated]&lt;br&gt;
Memory: 2.75 GiB / 7.61 GiB (36%)&lt;br&gt;
Swap: 666.95 MiB / 3.81 GiB (17%)&lt;/p&gt;

&lt;p&gt;Storage:&lt;br&gt;
Disk (/): 28.64 GiB / 31.20 GiB (92%) - ext4&lt;br&gt;
Disk (/home): 39.15 GiB / 84.33 GiB (46%) - ext4&lt;br&gt;
Disk (/run/media/shinigami/Vault): 349.96 GiB / 457.38 GiB (77%) - ext4&lt;/p&gt;

&lt;p&gt;Ollama is the engine I used to run models locally. By default, it stores models in ~/.ollama, which quickly filled my root partition.&lt;br&gt;
Install it with:&lt;br&gt;
&lt;code&gt;curl -fsSL https://ollama.com/install.sh | sh&lt;/code&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  To fix this, I redirected the models to secondary storage.
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;mv ~/.ollama /run/media/shinigami/Vault/ollama&lt;/code&gt;&lt;br&gt;
&lt;code&gt;ln -s /run/media/shinigami/Vault/ollama ~/.ollama&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;This symlink makes Ollama store everything on the hard drive, solving the disk-full error.&lt;/p&gt;
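
&lt;p&gt;The symlink works, but Ollama also documents an &lt;code&gt;OLLAMA_MODELS&lt;/code&gt; environment variable for relocating model storage. As a sketch (the service name and drop-in approach assume the standard install-script setup, so verify on your machine), a systemd override looks like this:&lt;/p&gt;

```
# Point the Ollama service at secondary storage instead of ~/.ollama/models.
# Assumes the install script set up a systemd service named "ollama".
sudo systemctl edit ollama
# In the editor that opens, add:
#   [Service]
#   Environment="OLLAMA_MODELS=/run/media/shinigami/Vault/ollama/models"
# then apply it:
sudo systemctl daemon-reload
sudo systemctl restart ollama
```

&lt;p&gt;Either approach keeps the root partition clear.&lt;/p&gt;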

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Now pulling and managing models.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I started with the smallest model, TinyLlama, which is about 637 MB:&lt;br&gt;
&lt;code&gt;ollama pull tinyllama&lt;/code&gt;&lt;br&gt;
After testing it: fast, yes, but not that smart, so I went looking for slightly smarter models.&lt;/p&gt;

&lt;p&gt;So I pulled the larger llama3.2:3b and llama3.2:1b models, which are 2.0 GB and 1.3 GB respectively.&lt;br&gt;
&lt;code&gt;ollama pull llama3.2:3b&lt;/code&gt;&lt;br&gt;
&lt;code&gt;ollama pull llama3.2:1b&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Building custom models&lt;/strong&gt;&lt;br&gt;
I wanted to get the most out of the models, so I created custom versions using Modelfiles.&lt;br&gt;
For example, I wanted one to help me learn basic networking concepts. Its Modelfile looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; llama3.2:3b # Adjust accordingly&lt;/span&gt;

&lt;span class="c"&gt;# ---------- PARAMETERS ----------&lt;/span&gt;
&lt;span class="c"&gt;# Lower temperature for accuracy; moderate top_p for variety without drifting.&lt;/span&gt;
PARAMETER temperature 0.25
PARAMETER top_p 0.9

&lt;span class="c"&gt;# More room for code + explanations.&lt;/span&gt;
PARAMETER num_ctx 4096

&lt;span class="c"&gt;# Safer repetition control (prevents looping).&lt;/span&gt;
PARAMETER repeat_penalty 1.12

&lt;span class="c"&gt;# If your setup supports it and you want more deterministic answers, you can also try:&lt;/span&gt;
&lt;span class="c"&gt;# PARAMETER seed 42&lt;/span&gt;

&lt;span class="c"&gt;# ---------- SYSTEM BEHAVIOR ----------&lt;/span&gt;
SYSTEM """
You are My Mentor: a concise, practical networking fundamentals tutor and coding assistant.

Primary goal:
- Teach networking fundamentals clearly and correctly (OSI/TCP-IP, IP addressing &amp;amp; subnetting, ARP, DNS, DHCP, TCP vs UDP, ports/sockets, routing, NAT, HTTP/TLS basics, troubleshooting with ping/traceroute/nslookup/curl/tcpdump, basic firewalls).

Secondary goal:
- Produce meaningful, runnable code examples when useful; prefer (enter preferred language).

Style rules:
- Explain concepts with short definitions + one concrete example.
- When writing code, include: what it does, how to run it, expected output, and common pitfalls.
- If the user’s question is ambiguous, ask up to 2 clarifying questions before answering.

Output format (default):
1) Concept (2–5 sentences)
2) Why it matters (1–2 bullets)
3) Example (diagram, packet flow, or command)
4) Code (only if it adds value; keep it minimal)
5) Quick check (2–3 questions to self-test)
""" 
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The comments in that snippet explain most of it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Building the models
&lt;/h2&gt;

&lt;p&gt;After creating the Modelfile (&lt;code&gt;vim Modelfile1&lt;/code&gt;), we bind it to one of the models we downloaded.&lt;br&gt;
A Modelfile is the blueprint that shapes an AI model’s personality, rules, and behavior on top of its base intelligence.&lt;/p&gt;

&lt;p&gt;Create the model from the Modelfile:&lt;br&gt;
&lt;code&gt;ollama create My-model -f Modelfile1&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;You should see output like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;gathering model components 
using existing layer sha256:dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff 
using existing layer sha256:966de95ca8a62200913e3f8bfbf84c8494536f1b94b49166851e76644e966396 
using existing layer sha256:fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d 
using existing layer sha256:a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd 
creating new layer sha256:7f89bd8bf6ef609a9aefeab288cde09db6c1ef97f649691f25b29e0f85a8c91c 
creating new layer sha256:446b3a23f7599dc79a11cfb03c670091c9fe265aba28fa3316e9e46dc86365db 
writing manifest 
success 

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;"My-model" &amp;gt;  you can name it anything.&lt;br&gt;
Plus you can create as many &lt;strong&gt;Modelfile&lt;/strong&gt; as you like giving them different task, and of course you can add more rules and examples in the &lt;strong&gt;Modelfile&lt;/strong&gt; as you like.&lt;/p&gt;

&lt;p&gt;After successfully creating 'My-model', run the model,&lt;br&gt;
&lt;code&gt;ollama run My-model&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;➜ ollama run My-model
&lt;/span&gt;&lt;span class="gp"&gt;&amp;gt;&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt; Send a message &lt;span class="o"&gt;(&lt;/span&gt;/? &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="nb"&gt;help&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
&lt;span class="go"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
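
&lt;p&gt;Beyond the interactive prompt, the Ollama server also listens on a local HTTP API (port 11434 by default), so you can script the model. A minimal Python sketch, assuming the server is running and "My-model" exists:&lt;/p&gt;

```python
# Query a local Ollama model over its HTTP API.
# Assumes the ollama server is running on its default port (11434)
# and that a model named "My-model" has been created.
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(model, prompt):
    # Payload shape for Ollama's /api/generate endpoint;
    # stream=False asks for one JSON object instead of a chunk stream.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model, prompt, url=OLLAMA_URL):
    data = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        url, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]


# Usage (requires a running server):
#   print(generate("My-model", "Explain ARP in two sentences."))
```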



&lt;p&gt;That was the fun part. The real problem came when a model actually ran: the CPU screamed at 100% utilization and overheated, which ended up making the model slow.&lt;/p&gt;

&lt;p&gt;So I had to optimize my setup to better handle the models.&lt;br&gt;
Using htop to monitor CPU usage and lm_sensors to monitor temperature, I could see where the bottlenecks were.&lt;br&gt;
To run the models efficiently I had to:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Maximize CPU performance when running the models.&lt;br&gt;
Reduce latency bottlenecks.&lt;br&gt;
Stabilize thermal behavior.&lt;br&gt;
Prioritize compute-heavy processes.&lt;/p&gt;
&lt;/blockquote&gt;
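
&lt;p&gt;The htop and lm_sensors monitoring can also be scripted. A minimal Python sketch that samples the load average and CPU temperatures from the same kind of kernel interfaces those tools use (thermal zone paths vary by machine, so treat them as assumptions):&lt;/p&gt;

```python
# Minimal CPU load / temperature sampler for Linux.
# Thermal zone layout differs between machines; check
# /sys/class/thermal on yours before trusting the labels.
from pathlib import Path


def read_loadavg():
    # /proc/loadavg starts with the 1-, 5-, and 15-minute load averages
    parts = Path("/proc/loadavg").read_text().split()
    return tuple(float(x) for x in parts[:3])


def read_temps():
    # Each thermal_zone*/temp file holds millidegrees Celsius
    temps = {}
    for zone in sorted(Path("/sys/class/thermal").glob("thermal_zone*")):
        try:
            millic = int((zone / "temp").read_text().strip())
        except (OSError, ValueError):
            continue
        temps[zone.name] = millic / 1000.0
    return temps


if __name__ == "__main__":
    print("load:", read_loadavg(), "temps:", read_temps())
```

&lt;p&gt;Run it in a loop (or under &lt;code&gt;watch&lt;/code&gt;) while a model is generating to see throttling as it happens.&lt;/p&gt;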

&lt;p&gt;Running local AI models on CPU introduces:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Lower parallelism.&lt;br&gt;
Thermal Throttling.&lt;br&gt;
OS scheduling inefficiencies.&lt;br&gt;
Power-saving defaults limiting performance.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So instead of forcing the model, you optimize around it.&lt;br&gt;
&lt;em&gt;Unlock CPU performance&lt;/em&gt;&lt;br&gt;
&lt;em&gt;On Arch&lt;/em&gt;&lt;br&gt;
&lt;code&gt;sudo pacman -S cpupower&lt;/code&gt;&lt;br&gt;
Then install tuned for changing the CPU frequency state and for system-wide optimization.&lt;br&gt;
&lt;code&gt;sudo pacman -S tuned&lt;/code&gt;&lt;br&gt;
Then enable it:&lt;br&gt;
&lt;code&gt;sudo systemctl enable --now tuned&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Then switch the CPU governor to performance:&lt;br&gt;
&lt;code&gt;sudo cpupower frequency-set -g performance&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;What this does: by default, Linux CPUs often run in “ondemand” or “powersave” mode, scaling frequency up and down depending on load.&lt;br&gt;
Performance mode locks the CPU at its maximum frequency, ensuring consistent speed.&lt;/p&gt;
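
&lt;p&gt;One caveat: the governor setting resets at reboot. On Arch, the cpupower package ships a config file and a service to persist it (worth double-checking the exact file on your install):&lt;/p&gt;

```
# /etc/default/cpupower (shipped by Arch's cpupower package)
governor='performance'
```

&lt;p&gt;Then enable the service so it applies at boot: &lt;code&gt;sudo systemctl enable --now cpupower.service&lt;/code&gt;&lt;/p&gt;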

&lt;p&gt;Impact:&lt;br&gt;
Faster response times for heavy workloads (like tokenization, AI inference, or compiling).&lt;br&gt;
Reduced latency spikes, since the CPU doesn’t waste time ramping up.&lt;br&gt;
More predictable benchmarking results.&lt;/p&gt;

&lt;p&gt;Cons:&lt;br&gt;
Higher power draw, more heat, fans spinning up, and faster battery drain on laptops.&lt;/p&gt;

&lt;p&gt;Then confirm the configuration with:&lt;br&gt;
&lt;code&gt;cpupower frequency-info&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;➜ cpupower frequency-info
driver: acpi-cpufreq
hardware limits: 400 MHz - 2.60 GHz
available cpufreq governors: conservative ondemand userspace powersave performance schedutil
current policy: governor "performance" within 400 MHz - 2.50 GHz
current CPU frequency: 3.10 GHz (kernel reported)
boost state: Supported, Active

&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then apply the throughput-performance profile, using the tuned we installed earlier.&lt;br&gt;
&lt;code&gt;sudo tuned-adm profile throughput-performance&lt;/code&gt;&lt;/p&gt;
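
&lt;p&gt;You can list the available profiles and confirm which one is active:&lt;/p&gt;

```
tuned-adm list    # show every available profile
tuned-adm active  # print the currently active profile
```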

&lt;p&gt;Why this matters:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Optimizes CPU behaviors.&lt;br&gt;
Improves disk I/O.&lt;br&gt;
Adjusts system scheduling.&lt;br&gt;
Reduces unnecessary power-saving interruptions.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Results: Smoother, sustained compute performance. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Models I’m using (and why)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;llama3.2:3b&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Balanced size and capability.&lt;br&gt;
Noticeably smart.&lt;br&gt;
Good for deeper prompts and reasoning.&lt;br&gt;
This felt like middle ground between speed and intelligence.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;phi3:mini&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Very efficient for its size.&lt;br&gt;
Strong reasoning compared to other small models.&lt;br&gt;
Optimized for lower-resource environments.&lt;br&gt;
This one stood out as surprisingly powerful for CPU use.&lt;/p&gt;
&lt;/blockquote&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This concludes my first phase: setting up, tuning performance, and confirming that local AI models run smoothly. In the next phase, I’ll dive into measuring tokenization speed, using verbose logs and custom C scripts to compare how these models perform under different workloads.&lt;br&gt;
&lt;em&gt;"Turns out you don't need powerful hardware to explore AI, just curiosity and a stubborn CPU."&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
    </item>
    <item>
      <title>Running local AI.</title>
      <dc:creator>Erick Mwangi Muguchia </dc:creator>
      <pubDate>Sun, 12 Apr 2026 15:14:03 +0000</pubDate>
      <link>https://dev.to/muguchiaerickmwangi/running-local-ai-14pb</link>
      <guid>https://dev.to/muguchiaerickmwangi/running-local-ai-14pb</guid>
      <description>&lt;p&gt;I had this idea to run AI locally on my own laptop. Just to see if I could. Ended up going with Ollama.&lt;/p&gt;

&lt;p&gt;At first it was brutal — all CPU, no GPU, super slow. But I messed around, tweaked some stuff, and finally got it to actually run okay. Not fast, but okay.&lt;/p&gt;

&lt;p&gt;Then I went down a rabbit hole. I wanted to know what the models were doing. Like, how hot is my CPU getting? How fast is it spitting out tokens? So I started building my own little monitoring setup. Used C for some low-level stuff, Dash for a live dashboard, Python to glue it all together. Oh and lm-sensors to watch the temps because this thing makes my laptop sweat.&lt;/p&gt;

&lt;p&gt;Now I can sit there and watch my models run in real time. Token rate, memory, core temps — all on a dashboard.&lt;/p&gt;

&lt;p&gt;Feels good having AI running offline. No cloud, no weird latency, just my machine. And a bunch of scripts I broke and fixed along the way.&lt;/p&gt;

&lt;p&gt;If you're thinking about trying local AI, just go for it. Just know you'll end up tinkering way more than you expect. Worth it though.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>monitoring</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Web apps.
I like making web apps, for celebrations or for fun.
So it's Christmas and I made a small web app.
And I didn't use HTML or JS or CSS...
I used C to make it.
It was stressful, but it's good and I'm happy about it.</title>
      <dc:creator>Erick Mwangi Muguchia </dc:creator>
      <pubDate>Tue, 23 Dec 2025 04:30:38 +0000</pubDate>
      <link>https://dev.to/muguchiaerickmwangi/web-apps-i-like-making-web-apps-for-celebrations-or-for-fun-so-its-christmas-and-i-made-a-small-25i2</link>
      <guid>https://dev.to/muguchiaerickmwangi/web-apps-i-like-making-web-apps-for-celebrations-or-for-fun-so-its-christmas-and-i-made-a-small-25i2</guid>
      <description></description>
    </item>
    <item>
      <title>I made a promise to myself that I'm not leaving Meru University without Python skills.</title>
      <dc:creator>Erick Mwangi Muguchia </dc:creator>
      <pubDate>Fri, 12 Dec 2025 09:36:29 +0000</pubDate>
      <link>https://dev.to/muguchiaerickmwangi/i-made-a-promise-to-myself-that-am-leaving-meru-university-without-python-skills-2ag6</link>
      <guid>https://dev.to/muguchiaerickmwangi/i-made-a-promise-to-myself-that-am-leaving-meru-university-without-python-skills-2ag6</guid>
      <description>&lt;p&gt;When I arrived at Meru University, I made myself a deal:&lt;br&gt;
"I will not leave this place without learning Python."&lt;/p&gt;

&lt;p&gt;The first thing I did was relocate. I needed to minimize distractions and move to an environment conducive to focused learning.&lt;/p&gt;

&lt;p&gt;Why Python?&lt;/p&gt;

&lt;p&gt;I'd heard so much about it—web development, data science, AI, endless possibilities. I was determined to master it and open doors in tech. But I had zero programming knowledge. I knew it would be challenging, but I was willing to put in the effort.&lt;/p&gt;

&lt;p&gt;Month 1: Building Foundations&lt;/p&gt;

&lt;p&gt;I downloaded tutorials, read documentation, binged YouTube. I learned Python syntax, data types, control structures. Then I practiced—a lot. Small programs. Number games. These games weren't just practice; they made learning fun. That mattered more than I expected.&lt;/p&gt;

&lt;p&gt;Month 2–3: Going Deeper&lt;/p&gt;

&lt;p&gt;After a month, I decided to add complexity. I wanted to understand how programming actually works, not just write code. So I added C to my learning path.&lt;/p&gt;

&lt;p&gt;This wasn't random. Python was my safety net. C forced me to understand memory, pointers, how computers actually think. It made Python click in a new way.&lt;/p&gt;

&lt;p&gt;Learning both simultaneously was hard—but it worked.&lt;/p&gt;

&lt;p&gt;Month 4: The Full Picture&lt;/p&gt;

&lt;p&gt;As weeks turned into months, I got proficient in both. I signed up for GitHub's student pack (more resources, better tools). I learned version control—essential for any real programmer.&lt;/p&gt;

&lt;p&gt;Then came R for statistical programming and data visualization. Each language opened new doors.&lt;/p&gt;

&lt;p&gt;The Progress&lt;/p&gt;

&lt;p&gt;Now, as the semester ends, I can say this honestly: I've made significant progress.&lt;/p&gt;

&lt;p&gt;I have:&lt;br&gt;
✅ Multiple projects built and on GitHub&lt;br&gt;
✅ Working proficiency in Python, C, and R&lt;br&gt;
✅ Understanding of version control and collaborative development&lt;br&gt;
✅ A journaling habit that tracked every step&lt;/p&gt;

&lt;p&gt;The Real Win&lt;/p&gt;

&lt;p&gt;Learning these languages wasn't just about syntax. It boosted my confidence. It showed me I can learn anything if I commit to it.&lt;/p&gt;

&lt;p&gt;And I developed a habit of documenting everything—journaling my process, reflecting on struggles. That's been invaluable. Future me can look back and see exactly how I got here.&lt;/p&gt;

&lt;p&gt;What's Next&lt;/p&gt;

&lt;p&gt;I'm leaving Meru with a promise kept. But this isn't the end—it's the beginning. I'm excited to explore data science, build real applications, and help others learn like I did.&lt;/p&gt;

&lt;p&gt;The promise was simple. The journey changed everything.&lt;/p&gt;

</description>
      <category>programming</category>
      <category>beginners</category>
      <category>learning</category>
      <category>100daysofcode</category>
    </item>
  </channel>
</rss>
