<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Stephen Sebastian</title>
    <description>The latest articles on DEV Community by Stephen Sebastian (@stephen_sebastian_c85ea2b).</description>
    <link>https://dev.to/stephen_sebastian_c85ea2b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3936117%2Faaa3e3a0-dba3-4fe6-8b19-b08e05277a74.png</url>
      <title>DEV Community: Stephen Sebastian</title>
      <link>https://dev.to/stephen_sebastian_c85ea2b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/stephen_sebastian_c85ea2b"/>
    <language>en</language>
    <item>
      <title>I Stopped Chunking My Logs. Then Gemma 4's 128K Context Found What I'd Missed for Weeks</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Sun, 17 May 2026 14:35:14 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/i-stopped-chunking-my-logs-then-gemma-4s-128k-context-found-what-id-missed-for-weeks-1m58</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/i-stopped-chunking-my-logs-then-gemma-4s-128k-context-found-what-id-missed-for-weeks-1m58</guid>
      <description>&lt;p&gt;&lt;em&gt;Last month, I had a server log that wouldn’t give up its secret.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Every Tuesday at 3:14 AM, a background job crashed. The same error appeared: &lt;code&gt;context deadline exceeded&lt;/code&gt;. No stack. No breadcrumb. Just that smug, silent timeout.&lt;br&gt;
I pulled twelve months of logs, 3.2 million lines in total, and filtered them down to the entries that mattered: roughly 115,000 tokens of pure, messy reality.&lt;br&gt;
My old local LLM capped out at 8K tokens. So, I did what everyone does: I split the file into chunks—January, February, March, and so on.&lt;br&gt;
Each chunk alone looked innocent. February ran fine. March ran fine. Then April showed the timeout, but with no explanation for why. The cause lived in January—a tiny config change that only became critical when combined with a different change in March.&lt;br&gt;
Chunking ruined the connection between them. I was holding two pieces of a broken plate, trying to see how they ever fit together.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The core lesson&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Chunking ruins understanding. You can give a model a fragment, but it has no memory of what came before. It's like trying to solve a murder mystery after someone ripped out every third page of the case file.&lt;br&gt;
Then I learned about Gemma 4's 128K token context window.&lt;br&gt;
Numbers like that seem abstract until you put them into perspective. 128K tokens is roughly 96,000 words of English, about the length of a full novel, in one prompt. For my log file, that 115,000-token monster would fit with room to spare.&lt;br&gt;
No chunking. No slicing. Just the whole story, start to finish, fed to the model in one go.&lt;/p&gt;
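&lt;p&gt;Before assuming a file fits, it's worth a quick estimate. A common rule of thumb is roughly four characters of English text per token (an assumption; real tokenizer counts vary by model). A minimal JavaScript sketch:&lt;/p&gt;

```javascript
// Rough token estimate: ~4 characters per token (ballpark heuristic only;
// actual counts depend on the model's tokenizer).
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Will the whole document fit in a given context window?
function fitsInWindow(text, windowTokens) {
  return windowTokens >= estimateTokens(text);
}
```

&lt;p&gt;A 460,000-character log lands near 115K tokens, comfortably inside 128K; a file twice that size would still need splitting or filtering.&lt;/p&gt;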

&lt;p&gt;&lt;strong&gt;Local setup—exactly what I ran&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I use Ollama because it's one command and done.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;
ollama pull gemma4:26b

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  📊 Debugging Efficiency Analysis
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Parameter&lt;/th&gt;
&lt;th&gt;Traditional 8K Model (Chunked Approach)&lt;/th&gt;
&lt;th&gt;Gemma 4 26B MoE (128K Context Window)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Data Ingestion&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;12 Separate monthly segments&lt;/td&gt;
&lt;td&gt;Single 115,000-token complete file&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Analysis Time&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;3+ Hours of manual cross-referencing&lt;/td&gt;
&lt;td&gt;~50 Seconds (Automated execution)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Correlation Range&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Limited to isolated 30-day boundaries&lt;/td&gt;
&lt;td&gt;365-Day full system timeline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Root Cause Detection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Missed&lt;/strong&gt; (Isolated chunks hid the drift)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Found&lt;/strong&gt; (Linked Jan pool drop with Mar backoff)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That’s the Mixture-of-Experts variant. Active parameters per token are around 4B to 8B, but the model's total size is 26B. It runs smoothly on my M1 MacBook Pro with 16GB RAM. (If you have more memory, &lt;code&gt;gemma4:31b&lt;/code&gt; is the dense version—smarter but also hungrier.)&lt;/p&gt;

&lt;p&gt;Once downloaded, feeding the log file was simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cat &lt;/span&gt;twelve_months_of_hell.log | ollama run gemma4:26b &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="s2"&gt;"Trace the root cause of the recurring timeout. Connect events across the entire timeline."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model paused for about 50 seconds. Then it provided an answer.&lt;/p&gt;

&lt;p&gt;Not with a guess. With a &lt;em&gt;timeline&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;"At line 1,450 (January 12), the connection pool size dropped from 50 to 20. At line 58,200 (March 28), the retry backoff changed from exponential to linear. Neither change caused a problem on its own. But by June, traffic doubled. The smaller pool couldn't keep up, and linear retries worsened the backlog. The timeout first appeared on June 3 at line 112,400."&lt;br&gt;
I had spent three days manually connecting those two changes. Gemma 4 did it in one prompt because it saw both ends of the story at once.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The real insight isn’t about length&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We talk about context windows like they're just bigger buckets. More storage. But that's not what 128K actually offers.&lt;br&gt;
It provides &lt;strong&gt;temporal coherence&lt;/strong&gt;.&lt;br&gt;
When a model reads a document from beginning to end without artificial breaks, it maintains the sequence. Not just "word A near word B," but "event X happened, then Y, then Z, and because of that, W failed."&lt;br&gt;
That's cause and effect across distance. Chunking destroys it. The model sees fragments, each isolated from the others.&lt;br&gt;
Gemma 4 sees the whole arc. A variable's first appearance, its slow change, its final break—all visible at once.&lt;/p&gt;
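&lt;p&gt;Here's the failure mode in miniature. This toy sketch (invented events mirroring the January, March, and June entries from my log) shows why no monthly chunk can ever explain the failure, while the full timeline can:&lt;/p&gt;

```javascript
// Toy model: two config changes and the failure they eventually produce.
const events = [
  { line: 1450, month: "Jan", kind: "config", msg: "pool size 50 to 20" },
  { line: 58200, month: "Mar", kind: "config", msg: "retry backoff to linear" },
  { line: 112400, month: "Jun", kind: "failure", msg: "context deadline exceeded" },
];

// A window can "explain" the failure only if it sees both a failure and
// at least one config change at the same time.
function canExplainFailure(visible) {
  const sawConfig = visible.some(function (e) { return e.kind === "config"; });
  const sawFailure = visible.some(function (e) { return e.kind === "failure"; });
  return [sawConfig, sawFailure].every(Boolean);
}

// Chunked: each month analyzed in isolation.
const byMonth = {};
events.forEach(function (e) {
  byMonth[e.month] = (byMonth[e.month] || []).concat([e]);
});
const chunkedVerdicts = Object.values(byMonth).map(canExplainFailure);

// Full context: every event visible at once.
const fullVerdict = canExplainFailure(events);
```

&lt;p&gt;Every chunked verdict comes back false; only the full-context pass can link cause and effect.&lt;/p&gt;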

&lt;p&gt;&lt;strong&gt;Tying this back to something real (my contest project)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This exact issue with logs pushed me to build &lt;strong&gt;LedgerGuard&lt;/strong&gt;—my submission for the Gemma 4 Challenge.&lt;/p&gt;

&lt;p&gt;I realised that financial ledgers face the same problem. A small business owner's CSV with a full year of transactions is just a log file with dollar signs. Drifting expense categories, slowly changing deduction rules, a questionable transaction in January that only makes sense when you see the context from November.&lt;/p&gt;

&lt;p&gt;So, I wrapped Gemma 4's 128K window into a local-first desktop app. You can drag in your CSV or tax PDF. The model scans everything—no cloud, no data leaving your machine. It flags anomalies, suggests deductions, and even runs "what-if" simulations, like "What if I reclassified this $5,000 as marketing instead of office supplies?"&lt;br&gt;
The same temporal coherence that found my config drift now finds tax errors across 12 months of ledgers. All offline. All private.&lt;/p&gt;
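&lt;p&gt;To make that concrete, here's a hedged sketch of the kind of anomaly pass I'm describing. This is illustrative, not LedgerGuard's actual code, and the transaction data below is invented: a simple z-score flags amounts that sit far outside the usual spread.&lt;/p&gt;

```javascript
// Flag transactions whose amount deviates sharply from the mean.
// (Illustrative sketch; a real pass would work per category and per month.)
function flagAnomalies(transactions, zThreshold) {
  const amounts = transactions.map(function (t) { return t.amount; });
  const mean = amounts.reduce(function (a, b) { return a + b; }, 0) / amounts.length;
  const variance = amounts.reduce(function (acc, a) {
    return acc + (a - mean) * (a - mean);
  }, 0) / amounts.length;
  const std = Math.sqrt(variance);
  return transactions.filter(function (t) {
    // Suspicious when the deviation exceeds the chosen z-score threshold
    return Math.abs(t.amount - mean) / std > zThreshold;
  });
}
```

&lt;p&gt;The model does far more than a z-score, of course, but the principle is the same: the outlier only looks like an outlier when the whole year is visible at once.&lt;/p&gt;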

&lt;h3&gt;
  
  
  🔄 LedgerGuard Offline Data Pipeline
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ Raw CSV / Bank PDF Ledger ] 
              │
              ▼ (Native File Dropzone Parsing)
[ Client-Side Browser Memory ] 
              │
              ▼ (Encrypted Local Storage)
[ Local IndexedDB Cache Database ] 
              │
              ▼ (Stateless Local Port Bridge)
[ Node.js Server Environment (server.ts) ] 
              │
              ▼ (Secure Handshake via @google/genai SDK)
[ Local Gemma 4 Engine (Ollama Runtime Sandbox) ] 
              │
              ▼ (Forensic Token Analysis)
[ 🎯 Structured JSON Compliance Audit Matrix Output ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;When to use 128K (and when to skip it)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here’s my honest take after a month of experimenting.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use 128K when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're analyzing logs, financial records, legal contracts, or long conversations&lt;/li&gt;
&lt;li&gt;The order of events is more crucial than any single event&lt;/li&gt;
&lt;li&gt;You're tired of playing "connect the dots" across chunk boundaries&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  🧠 Local Deployment Matrix
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model Tier&lt;/th&gt;
&lt;th&gt;Active Parameters&lt;/th&gt;
&lt;th&gt;System Memory Req.&lt;/th&gt;
&lt;th&gt;Ideal Operational Workload&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 2B (Lite)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~2B Parameters&lt;/td&gt;
&lt;td&gt;4GB+ RAM&lt;/td&gt;
&lt;td&gt;Ultra-mobile devices, isolated edge scripts, quick standalone functions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 26B (MoE)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;4B–8B Active per token&lt;/td&gt;
&lt;td&gt;16GB+ RAM&lt;/td&gt;
&lt;td&gt;Multi-step interactive applications, deep forensic document log analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Gemma 4 31B (Dense)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;31B Parameters&lt;/td&gt;
&lt;td&gt;32GB+ RAM&lt;/td&gt;
&lt;td&gt;High-end hardware deployments, precise deterministic financial audit compilation&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Skip it when:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You're summarizing a single email or writing a function&lt;/li&gt;
&lt;li&gt;You need responses in under a second (128K takes 30–90 seconds on consumer hardware)&lt;/li&gt;
&lt;li&gt;The document is naturally short—don’t drive a truck to buy milk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Final thought&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That timeout bug is fixed now. The fix was simple—revert the retry backoff change. But finding it without a 128K context window would have taken another week of manual correlation.&lt;/p&gt;

&lt;p&gt;Gemma 4 didn't just give me a bigger bucket. It gave me back the timeline. Sometimes, that’s all you need to see the story that was hiding in plain sight.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdswsbe2qbx8kytx2ksko.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fdswsbe2qbx8kytx2ksko.png" alt=" " width="800" height="520"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article is part of my submission for the Gemma 4 challenge. The accompanying project, LedgerGuard, is on GitHub &lt;a href="https://github.com/hypecuts619-source/Ledger-Guard" rel="noopener noreferrer"&gt;https://github.com/hypecuts619-source/Ledger-Guard&lt;/a&gt;. All code, all prompts, and all log files used in this post are open source—no cloud required. Thanks Gemma 4.&lt;/p&gt;

</description>
      <category>devchallenge</category>
      <category>gemmachallenge</category>
      <category>gemma</category>
      <category>ai</category>
    </item>
    <item>
      <title>Building a Zero-Framework, Local-First PWA to Combat Web Bloat</title>
      <dc:creator>Stephen Sebastian</dc:creator>
      <pubDate>Sun, 17 May 2026 11:00:34 +0000</pubDate>
      <link>https://dev.to/stephen_sebastian_c85ea2b/building-a-zero-framework-local-first-pwa-to-combat-web-bloat-1cc</link>
      <guid>https://dev.to/stephen_sebastian_c85ea2b/building-a-zero-framework-local-first-pwa-to-combat-web-bloat-1cc</guid>
      <description>&lt;p&gt;Hey dev.to community! 👋&lt;/p&gt;

&lt;p&gt;I want to share an indie engineering project I’ve been building over the last few weeks: &lt;strong&gt;&lt;a href="https://quickconvertunits.com" rel="noopener noreferrer"&gt;QuickConvertUnits&lt;/a&gt;&lt;/strong&gt;. &lt;/p&gt;

&lt;p&gt;It’s a fast, ad-free, local-first unit conversion utility designed to solve a personal frustration I had with the modern web ecosystem. &lt;/p&gt;

&lt;p&gt;Whenever I needed to quickly convert cooking metrics on my phone or work out land areas (like square feet to acres) on-site, the top search results always led me to ancient calculator directories that completely choked my mobile browser. They forced megabytes of intrusive tracking pixels, pop-up scripts, and full-screen layout shifts onto the viewport.&lt;/p&gt;

&lt;p&gt;I wanted to see how clean, lightweight, and fast a modern web utility could be if we stripped away all the bloat and focused strictly on modern client-side capabilities.&lt;/p&gt;

&lt;p&gt;Here is a deep dive into the engineering decisions and code patterns behind the platform.&lt;/p&gt;




&lt;h3&gt;
  
  
  🛠️ The Core Technical Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Zero Framework Architecture:&lt;/strong&gt; Built entirely using native HTML5, semantic CSS, and pure vanilla JavaScript. No React re-renders, no heavy framework hydration lag.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Progressive Web App (PWA):&lt;/strong&gt; Engineered with custom service workers. After a user loads a page once, the core math engines are cached locally, and the app runs 100% offline.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strictly Privacy-First:&lt;/strong&gt; No backend servers, no cloud storage dependencies, and no remote data parsing. The application processes data instantly inside the local client layer.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  💡 Interesting Code Patterns Implemented
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1. Zero-Latency Reactive Calculation
&lt;/h4&gt;

&lt;p&gt;Instead of forcing users to click an old-school "Submit" or "Calculate" button, the conversion logic runs reactively on the browser's native &lt;code&gt;input&lt;/code&gt; event.&lt;/p&gt;

&lt;p&gt;Here is a simplified blueprint of the native listener strategy used to handle dynamic structural conversions without unnecessary DOM thrashing:&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;const inputElement = document.getElementById('source-value');
const outputElement = document.getElementById('target-output');

// Recompute the conversion on every keystroke
function handleInstantConversion() {
    const rawValue = parseFloat(inputElement.value);

    if (isNaN(rawValue)) {
        outputElement.value = '';
        return;
    }

    const conversionFactor = 0.03527396; // e.g., grams to ounces
    const processedResult = rawValue * conversionFactor;

    // Normalize precision to avoid floating-point display noise
    outputElement.value = Number(processedResult.toFixed(4));
}

inputElement.addEventListener('input', handleInstantConversion);
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h4&gt;
  2. Multi-Language i18n Without Heavy Bundles
&lt;/h4&gt;

&lt;p&gt;To make the application globally accessible, I wanted comprehensive multi-language support (English, Italian, French, German, Chinese, Arabic, and more).&lt;/p&gt;

&lt;p&gt;Instead of pulling down a massive internationalization library via npm, I kept the bundle lean with a native JSON lookup object keyed on the language path variables in the window state. This let me serve localized interfaces instantly while keeping the initial JS payload near zero.&lt;/p&gt;

&lt;h3&gt;
  🚀 What I Learned
&lt;/h3&gt;

&lt;p&gt;Building this reminded me how powerful native web APIs have become. We often reach for frameworks out of habit, even when building simple, high-performance daily utilities. Relying entirely on native web standards kept the mobile performance score near perfect while offering offline capabilities that rival native mobile apps.&lt;/p&gt;

&lt;h3&gt;
  💬 I'd Love Your Feedback!
&lt;/h3&gt;

&lt;p&gt;The project is live here: &lt;a href="https://quickconvertunits.com" rel="noopener noreferrer"&gt;quickconvertunits.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As developers, how do you handle floating-point precision errors when doing deep decimal conversions in pure client-side JavaScript? I'd love to know what strategies you use to keep results accurate without bringing in heavy math utility libraries.&lt;/p&gt;

&lt;p&gt;Let me know your thoughts on the performance and layout, or if there are any conversion modules you think I should add next!&lt;/p&gt;

&lt;p&gt;&lt;img src="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/g408hv9biz6qo30h1zai.png" alt=" " width="800"&gt;&lt;/p&gt;
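&lt;p&gt;Going back to the i18n section above: the JSON-lookup pattern can be sketched like this. The keys, strings, and fallback rule here are illustrative, not the site's actual translation table:&lt;/p&gt;

```javascript
// Tiny i18n lookup: one plain object instead of an npm library.
const translations = {
  en: { title: "Unit Converter", from: "From", to: "To" },
  it: { title: "Convertitore di unità", from: "Da", to: "A" },
  fr: { title: "Convertisseur d'unités", from: "De", to: "Vers" },
};

// Resolve a label for the current language, falling back to English when
// a language or key is missing, and finally to the key itself.
function t(lang, key) {
  const table = translations[lang] || translations.en;
  return table[key] || translations.en[key] || key;
}
```

&lt;p&gt;Because the whole table ships as a few kilobytes of static JSON, switching languages is a synchronous lookup with no extra network round trip.&lt;/p&gt;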

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>beginners</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
