<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: speed engineer</title>
    <description>The latest articles on DEV Community by speed engineer (@speed_engineer).</description>
    <link>https://dev.to/speed_engineer</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3844864%2F78a68c07-7a26-44f8-a98d-84d4d29fa7ef.png</url>
      <title>DEV Community: speed engineer</title>
      <link>https://dev.to/speed_engineer</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/speed_engineer"/>
    <language>en</language>
    <item>
      <title>Sunday Notes: Find Where Before You Theorize Why</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Sun, 14 Jun 2026 03:47:56 +0000</pubDate>
      <link>https://dev.to/speed_engineer/sunday-notes-find-where-before-you-theorize-why-57o2</link>
      <guid>https://dev.to/speed_engineer/sunday-notes-find-where-before-you-theorize-why-57o2</guid>
      <description>&lt;p&gt;This week I wrote about two things that have nothing to do with each other.&lt;/p&gt;

&lt;p&gt;On Saturday it was a debugging story: a service was losing 30% of its UDP packets, I spent the better part of a day convinced a switch was dying, and the network turned out to be completely innocent — my own host was accepting the datagrams and then dropping them because a socket buffer kept filling during bursts. One kernel counter (&lt;code&gt;netstat -su&lt;/code&gt;, "receive buffer errors") would have told me that in about ten seconds.&lt;/p&gt;

&lt;p&gt;Earlier in the week it was product stuff: why teams keep losing their best work, why a freelancer's Friday disappears into "what was I even doing on Tuesday," why a marketing team rewrites the same prompt every quarter.&lt;/p&gt;

&lt;p&gt;It wasn't until I sat down to write this recap that I noticed they're the same lesson.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shared mistake
&lt;/h2&gt;

&lt;p&gt;In the packet story, the expensive move was theorizing about &lt;em&gt;why&lt;/em&gt; before establishing &lt;em&gt;where&lt;/em&gt;. "The network is dropping packets" is a theory about cause. It sent me to the wrong team, the wrong dashboards, the wrong week. The cheap move — the one I skipped — was localizing first: are the packets dying on the wire, or after they reach my box? Two completely different buildings, identical symptom.&lt;/p&gt;

&lt;p&gt;Product decisions have the exact same failure mode. "Users churn because we're missing feature X" is a theory about why. It's also, usually, the most expensive possible thing to act on first, because building X takes a quarter and the theory might be wrong. The cheap move is to localize: &lt;em&gt;where&lt;/em&gt; in the week, the workflow, or the funnel does the value actually leak out?&lt;/p&gt;

&lt;p&gt;When you force yourself to find &lt;em&gt;where&lt;/em&gt; first, the answer is often boring and small — and boring and small is good news, because boring and small is cheap to fix.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why I keep coming back to this
&lt;/h2&gt;

&lt;p&gt;Both of the things I work on are, underneath the marketing, just instruments for making "where" visible.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://fillthetimesheet.com" rel="noopener noreferrer"&gt;FillTheTimesheet&lt;/a&gt; exists because "I undercharged this month" is a why-theory; the useful version is &lt;em&gt;where&lt;/em&gt; the billable hours actually went, captured while they happened instead of reconstructed on Friday.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://promptship.co" rel="noopener noreferrer"&gt;PromptShip&lt;/a&gt; is the same shape pointed at a different problem: "our team is bad at AI" is a why-theory; "the prompt that worked is sitting in one person's chat history and nobody else can find it" is a &lt;em&gt;where&lt;/em&gt;. One is an identity crisis, the other is a Tuesday-afternoon fix.&lt;/p&gt;

&lt;p&gt;Neither is glamorous. Both are counters you check before you theorize.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Sunday takeaway
&lt;/h2&gt;

&lt;p&gt;Find where before you theorize why. It works on packets, it works on churn, and it works on most arguments that have gone in circles for more than ten minutes. Locate the problem in space before you start explaining it — the explanation is usually cheaper, and more often correct, once you know where you're standing.&lt;/p&gt;

&lt;p&gt;The full UDP debugging story, counters and buffer math included, is on Medium if that's your kind of weekend reading: &lt;a href="https://medium.com/@speed_enginner/networking-for-developers-i-lost-30-of-udp-packets-the-debugging-story-f755f5680b35" rel="noopener noreferrer"&gt;I Lost 30% of My UDP Packets — The Debugging Story&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;What's the last thing you assumed the cause of — before you actually measured where it was happening?&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>saas</category>
      <category>startup</category>
      <category>debugging</category>
    </item>
    <item>
      <title>I Lost 30% of My UDP Packets — and the Network Was Innocent</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Sat, 13 Jun 2026 03:54:53 +0000</pubDate>
      <link>https://dev.to/speed_engineer/i-lost-30-of-my-udp-packets-and-the-network-was-innocent-5die</link>
      <guid>https://dev.to/speed_engineer/i-lost-30-of-my-udp-packets-and-the-network-was-innocent-5die</guid>
      <description>&lt;p&gt;A receiver pulling a UDP feed was missing roughly 30% of its messages. No errors, no exceptions, no stack traces — just gaps in the sequence numbers. The first suspect is always the network: a flaky switch, a saturated link, a tired NIC.&lt;/p&gt;

&lt;p&gt;The network was innocent. The packets were being dropped &lt;em&gt;on the receiving host&lt;/em&gt;, after they'd already arrived. Here's how to tell the difference, and why it matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why UDP makes this sneaky
&lt;/h2&gt;

&lt;p&gt;UDP has no retransmission and no backpressure. When a datagram is lost, nobody is notified — not the sender, not the receiver. The packet simply isn't there.&lt;/p&gt;

&lt;p&gt;That means two completely different failures look identical from the application's point of view:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The network dropped the packet &lt;em&gt;before&lt;/em&gt; it reached your machine.&lt;/li&gt;
&lt;li&gt;Your own host accepted the packet and then threw it away &lt;em&gt;after&lt;/em&gt; it arrived.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The application sees the same thing in both cases: a missing sequence number. But the fix is in a different building depending on which one it is.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where the packets actually go
&lt;/h2&gt;

&lt;p&gt;The receive path is: NIC → kernel socket receive buffer → your &lt;code&gt;recv()&lt;/code&gt; call. The kernel parks incoming datagrams in a per-socket buffer until your code reads them. If your code doesn't drain that buffer fast enough, it fills, and the kernel drops the overflow. Crucially, &lt;strong&gt;the kernel counts those drops.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;On Linux:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Per-protocol summary — look for "receive buffer errors"&lt;/span&gt;
netstat &lt;span class="nt"&gt;-su&lt;/span&gt;

&lt;span class="c"&gt;# Or straight from the kernel counters&lt;/span&gt;
&lt;span class="nb"&gt;cat&lt;/span&gt; /proc/net/snmp | &lt;span class="nb"&gt;grep&lt;/span&gt; &lt;span class="nt"&gt;-A1&lt;/span&gt; Udp
&lt;span class="c"&gt;#   InDatagrams  ... InErrors  RcvbufErrors ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If &lt;code&gt;RcvbufErrors&lt;/code&gt; is climbing, the network did its job and &lt;em&gt;your host&lt;/em&gt; discarded the datagrams. That single counter collapses a week of "is it the switch?" into about ten seconds of certainty.&lt;/p&gt;

&lt;h2&gt;
  
  
  The actual cause
&lt;/h2&gt;

&lt;p&gt;In this case the socket receive buffer was sitting at the default (~208 KB). The sender burst faster than a single receive thread could call &lt;code&gt;recv()&lt;/code&gt;. Average throughput looked fine on every dashboard — but the bursts filled the buffer in milliseconds, and everything past the brim was dropped. The metric that mattered wasn't mean throughput; it was peak burst versus drain rate.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix, in order of leverage
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Drain faster.&lt;/strong&gt; The receive loop was parsing &lt;em&gt;and&lt;/em&gt; doing a database write inline. Anything that isn't "copy bytes out of the socket" belongs off the hot path: &lt;code&gt;recv()&lt;/code&gt; → hand the buffer to a queue → immediately loop back to &lt;code&gt;recv()&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Raise the buffer.&lt;/strong&gt; Bump &lt;code&gt;SO_RCVBUF&lt;/code&gt;, and raise &lt;code&gt;net.core.rmem_max&lt;/code&gt; so the kernel actually honors the request. A bigger buffer doesn't fix a slow consumer — it absorbs bursts so a fast-enough consumer never falls behind. You usually need both this and #1.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch your syscalls.&lt;/strong&gt; &lt;code&gt;recvmmsg()&lt;/code&gt; pulls many datagrams per system call, which cuts per-packet overhead when volume is high.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Spread the load.&lt;/strong&gt; If one core genuinely can't keep up, &lt;code&gt;SO_REUSEPORT&lt;/code&gt; lets multiple threads share the same port with separate buffers.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;"Packet loss" is a &lt;em&gt;location&lt;/em&gt;, not a cause. Find out &lt;strong&gt;where&lt;/strong&gt; before you theorize about &lt;strong&gt;why&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;With UDP, silent drops are the default — the protocol won't tell you, so the kernel counters have to.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;RcvbufErrors&lt;/code&gt; is the first thing to check. It almost always points at a receive buffer that's too small or a consumer that's too slow.&lt;/li&gt;
&lt;li&gt;A bigger buffer absorbs bursts; a faster drain prevents them. You usually want both.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full debugging story — the live-feed before/after, the buffer math, and the exact counters I watched while tuning it — is on Medium:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://medium.com/@speed_enginner/networking-for-developers-i-lost-30-of-udp-packets-the-debugging-story-f755f5680b35" rel="noopener noreferrer"&gt;Networking for Developers: I Lost 30% of UDP Packets — The Debugging Story&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;I write more like this on Medium as **The Speed Engineer&lt;/em&gt;* — performance engineering, debugging stories, and the lower-level systems work that doesn't fit in a tweet.*&lt;/p&gt;

</description>
      <category>networking</category>
      <category>debugging</category>
      <category>programming</category>
      <category>performance</category>
    </item>
    <item>
      <title>We Built Two Products Around the Same Boring Insight: Valuable Work Leaks</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Fri, 12 Jun 2026 03:53:43 +0000</pubDate>
      <link>https://dev.to/speed_engineer/we-built-two-products-around-the-same-boring-insight-valuable-work-leaks-5e8f</link>
      <guid>https://dev.to/speed_engineer/we-built-two-products-around-the-same-boring-insight-valuable-work-leaks-5e8f</guid>
      <description>&lt;h2&gt;
  
  
  The pattern we kept ignoring
&lt;/h2&gt;

&lt;p&gt;When you build software for a living, you start noticing the same problem wearing different costumes. The costume changes; the body underneath is always the same: &lt;strong&gt;valuable work gets created, and then quietly leaks away before anyone captures it.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;We have now built two products around that one observation. Here is the story — and the lesson I wish we had internalized three years earlier.&lt;/p&gt;

&lt;h2&gt;
  
  
  Leak #1: billable time
&lt;/h2&gt;

&lt;p&gt;Our first product started as an internal tool. We were a small consulting shop, and at the end of every month we reconstructed our hours from memory, calendar invites, and guilt.&lt;/p&gt;

&lt;p&gt;The math was brutal. If each person under-reports just 20 minutes a day — a quick call here, a "real fast" review there — that is roughly 7 hours a month per person walking out the door unbilled. For a five-person team billing $100/hr, you are lighting about $3,500 on fire. Every month.&lt;/p&gt;

&lt;p&gt;The fix was not a better spreadsheet. It was removing the memory step entirely: tracking time as it happens instead of reconstructing it later. That tool became &lt;strong&gt;FillTheTimesheet&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Leak #2: your best prompts
&lt;/h2&gt;

&lt;p&gt;Fast-forward to the AI era. We watched our own team — marketing, sales, support, not engineers — get genuinely good at writing prompts for ChatGPT and Claude. Someone would craft a prompt that turned a 40-minute task into a 4-minute one...&lt;/p&gt;

&lt;p&gt;...and then paste it into a Slack thread, where it died.&lt;/p&gt;

&lt;p&gt;Two weeks later, three other people would reinvent a worse version of the same prompt. The institutional knowledge existed for exactly one session, then leaked away.&lt;/p&gt;

&lt;p&gt;Same body, new costume. So we built &lt;strong&gt;PromptShip&lt;/strong&gt; — a shared prompt library so a team's best prompts get captured once and reused by everyone, instead of getting lost in chat history.&lt;/p&gt;

&lt;h2&gt;
  
  
  The lesson
&lt;/h2&gt;

&lt;p&gt;Here is the part I would tattoo on past-me: &lt;strong&gt;the most valuable problems are the ones so mundane that everyone has quietly accepted them.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Nobody files a feature request for "I keep forgetting my hours" or "our good prompts disappear." They absorb the loss as a cost of doing business. That acceptance is exactly where the opportunity hides.&lt;/p&gt;

&lt;p&gt;If you are looking for something to build — or just something to fix in your own workflow — do not hunt for exotic problems. Look for the leaks everyone has stopped noticing: work that gets created but never captured, knowledge that lives in one person's head, value that is generated and then immediately discarded.&lt;/p&gt;

&lt;p&gt;Plug one of those and you will rarely have to explain why it matters.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Valuable work leaks constantly: unbilled time, lost knowledge, discarded outputs.&lt;/li&gt;
&lt;li&gt;The best problems are mundane enough that people have stopped complaining about them.&lt;/li&gt;
&lt;li&gt;Capture-at-the-moment beats reconstruct-it-later, every time.&lt;/li&gt;
&lt;li&gt;You do not need a novel problem. You need an unaddressed one.&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>startup</category>
      <category>saas</category>
      <category>productivity</category>
      <category>career</category>
    </item>
    <item>
      <title>Networking for Developers: TCP vs UDP (When Each Protocol Kills Your App)</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Thu, 11 Jun 2026 13:26:43 +0000</pubDate>
      <link>https://dev.to/speed_engineer/networking-for-developers-tcp-vs-udp-when-each-protocol-kills-your-app-1kek</link>
      <guid>https://dev.to/speed_engineer/networking-for-developers-tcp-vs-udp-when-each-protocol-kills-your-app-1kek</guid>
      <description>&lt;p&gt;Your video call stutters. Your game lags. You picked the wrong protocol — and now you’re debugging packets at mid night. &lt;/p&gt;




&lt;h3&gt;
  
  
  Networking for Developers: TCP vs UDP (When Each Protocol Kills Your App)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Your video call stutters. Your game lags. You picked the wrong protocol — and now you’re debugging packets at mid night.&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3fx5byfgc3d3ondmg3bj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3fx5byfgc3d3ondmg3bj.png" width="800" height="734"&gt;&lt;/a&gt; Choosing between TCP and UDP isn’t academic — it’s the difference between your app working and your users complaining. Pick wrong and you’ll trace symptoms for days before finding the real cause.&lt;/p&gt;

&lt;p&gt;Our P99 latency hit 5 seconds randomly. Three days of packet tracing led me to a single dropped packet.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Not corrupted. Not delayed. Just gone.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I’d been running a real-time sensor network. Temperature readings every 100ms. Life-or-death? No. But the client paid for sub-second response times. We were violating SLA hourly.&lt;/p&gt;

&lt;p&gt;The system used TCP. Reliable delivery, ordered packets — textbook choice for anything important, right?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Wrong.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Debugging Session That Changed Everything
&lt;/h3&gt;

&lt;p&gt;So I checked the obvious first. Network saturation? No. Server CPU? Fine. Memory leaks? Clean. I spent two days looking at application code before someone suggested I actually look at the &lt;em&gt;network&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;I fired up tcpdump on the sensor gateway:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tcpdump -i eth0 -n 'tcp port 8080' -w capture.pcap
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Watched it for an hour. Then opened Wireshark.&lt;/p&gt;

&lt;p&gt;That’s when I saw it. One packet dropped. TCP’s retransmission timer kicked in. 200ms wait. Retry. Another drop. Exponential backoff. 400ms. 800ms. 1600ms. By the time the packet finally made it through, we’d blown past 5 seconds.&lt;/p&gt;

&lt;p&gt;Five seconds of latency because of one 512-byte packet.&lt;/p&gt;

&lt;h3&gt;
  
  
  I Assumed TCP Was Reliable — Then Packet Loss Taught Me Different
&lt;/h3&gt;

&lt;p&gt;TCP is reliable in that it &lt;em&gt;eventually&lt;/em&gt; delivers your data. But reliable doesn’t mean fast. It doesn’t even mean predictable.&lt;/p&gt;

&lt;p&gt;Networks fail.&lt;/p&gt;

&lt;p&gt;When a packet drops on TCP, the entire connection stalls. TCP guarantees ordering. So if packet #47 disappears, packet #48 through #500 just… wait. They’re already at the receiver. Sitting in a buffer. Unusable. This is head-of-line blocking.&lt;/p&gt;

&lt;p&gt;UDP doesn’t care. Packet #47 vanishes? Packet #48 gets delivered anyway. No waiting. No retries. No guarantees.&lt;/p&gt;

&lt;p&gt;I had temperature sensors. If reading #47 was lost, reading #48 was still useful. More useful than waiting 5 seconds for stale data.&lt;/p&gt;

&lt;p&gt;This matters for revenue. Our client was aggregating sensor data for HVAC optimization. Five-second-old temperature readings meant HVAC systems reacting to old conditions. Wasted energy. Real dollars.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Foundation: What These Protocols Actually Do
&lt;/h3&gt;

&lt;p&gt;TCP establishes connections. Three-way handshake: SYN, SYN-ACK, ACK. Overhead before you send a single byte of application data. Every packet gets acknowledged. Missing ACK? Retransmit. Receiver buffers out-of-order packets and delivers them in sequence to your application.&lt;/p&gt;

&lt;p&gt;Flow control prevents fast senders from overwhelming slow receivers. Congestion control backs off when the network is saturated. TCP is a state machine with 11 different states. It’s complex because it’s trying to make an unreliable network look reliable.&lt;/p&gt;

&lt;p&gt;UDP is simpler. You call &lt;code&gt;sendto()&lt;/code&gt;. Packet goes on the wire. That's it. No connection. No state. No acknowledgments. No retries. No guarantees about ordering or delivery.&lt;/p&gt;

&lt;p&gt;Here’s what UDP looks like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import socket  

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)  
sock.sendto(b"temperature:23.5", ("10.0.1.50", 8080))
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Four lines. Fire and forget. If the network drops it, you’ll never know.&lt;/p&gt;

&lt;p&gt;TCP needs more ceremony:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import socket  

sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)  
sock.connect(("10.0.1.50", 8080))  
sock.sendall(b"temperature:23.5")  
sock.close()
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Connection setup and teardown. More syscalls. More network round-trips. Higher latency even when nothing goes wrong.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8r8x841pk5v4hcmrrar.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fi8r8x841pk5v4hcmrrar.png" width="800" height="734"&gt;&lt;/a&gt;TCP’s handshake and acknowledgment overhead seems reasonable — until you’re sending hundreds of small messages per second and every round-trip adds milliseconds. UDP skips all of it.&lt;/p&gt;

&lt;h3&gt;
  
  
  When Different Protocols Make Sense
&lt;/h3&gt;

&lt;p&gt;I rebuilt the sensor system with UDP. Latency dropped to 50ms P99. Problem solved.&lt;/p&gt;

&lt;p&gt;Except not every problem wants UDP.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use TCP when&lt;/strong&gt; : data loss is unacceptable. HTTP requests — you can’t skip part of an HTML page. File transfers — corrupted files are useless. Database queries — missing rows break application logic. Email delivery — partial messages don’t work. API calls where you need confirmation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use UDP when&lt;/strong&gt; : timeliness beats completeness. Video streaming — one dropped frame is invisible, buffering to retry is noticeable. Gaming — 200ms-old player positions are worthless, waiting for retransmits is worse. DNS queries — if the response doesn’t arrive, just ask again. Metrics collection — one missing data point doesn’t invalidate the trend. VoIP — humans tolerate brief audio dropouts better than lag.&lt;/p&gt;

&lt;p&gt;Speaking of DNS, that’s often your first failure point. DNS uses UDP for speed. Queries timeout after 2 seconds and retry. If DNS is slow, everything feels slow — even TCP connections stall at hostname resolution.&lt;/p&gt;

&lt;p&gt;Actually, most people don’t realize DNS is UDP by default. It falls back to TCP only for responses over 512 bytes (before EDNS). This design choice prioritizes speed for the 99% case.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Moment I Actually Saw Network Behavior
&lt;/h3&gt;

&lt;p&gt;Back to my debugging session.&lt;/p&gt;

&lt;p&gt;I set up continuous packet capture. Left it running overnight. Next morning I filtered for retransmissions:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tcp.analysis.retransmission
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Thousands of entries. Not random though. They clustered around 2 AM. Same time every night.&lt;/p&gt;

&lt;p&gt;So I checked the infrastructure logs. Backup jobs. Every night at 2 AM, backup traffic saturated the 1Gbps link. TCP saw congestion, slowed down, retransmitted dropped packets. My sensor traffic got caught in it.&lt;/p&gt;

&lt;p&gt;With UDP, the backup traffic still saturated the link. Some sensor packets still dropped. But the surviving packets arrived immediately. No cascading retransmission delays.&lt;/p&gt;

&lt;p&gt;I’m still not sure why the network team scheduled backups during production hours. Politics, probably.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Gotcha: Timeouts and Buffer Sizes
&lt;/h3&gt;

&lt;p&gt;Here’s my real mistake: I set my read timeout too short. Wasted a week tuning it.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sock.settimeout(0.5)  # Don't do this blindly
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Half a second seemed reasonable. But under congestion, legitimate packets took 800ms to arrive. My timeout fired. Connection closed. Data lost.&lt;/p&gt;

&lt;p&gt;I bumped it to 2 seconds. Helped sometimes. Made the head-of-line blocking worse other times. Longer timeouts meant the application waited longer when packets actually were lost.&lt;/p&gt;

&lt;p&gt;With UDP, timeouts work differently. You’re not waiting for the protocol to retry. You’re just waiting for data. If nothing arrives, your application decides what to do. Send a new request? Use stale data? Your choice.&lt;/p&gt;

&lt;p&gt;Buffer sizes matter too. TCP receive buffers hide latency problems until they overflow. Then you get tail latency spikes. UDP has smaller buffers because there’s no reordering queue. Lost packets don’t consume buffer space.&lt;/p&gt;

&lt;h3&gt;
  
  
  Diagnosis Cascades Through Layers
&lt;/h3&gt;

&lt;p&gt;Here’s how the debugging actually went. Timeouts are symptoms. I saw timeouts in application logs. Traced them to slow responses. Speaking of symptoms, high CPU often masks network issues — if your app is busy retrying, CPU looks busy, but you’re not doing useful work.&lt;/p&gt;

&lt;p&gt;Slow responses came from retransmissions. Retransmissions came from packet loss. Packet loss came from link saturation. Link saturation came from backup jobs.&lt;/p&gt;

&lt;p&gt;Each layer revealed the next. This is how network debugging works. You start at the application layer (HTTP 500s, timeouts) and work down through TCP (retransmissions, connection resets) to IP (routing, fragmentation) to the physical layer (link saturation, bit errors).&lt;/p&gt;

&lt;p&gt;Why connection pooling matters: every TCP connection has setup cost. If you’re making hundreds of requests per second, connection setup becomes the bottleneck. Pools amortize that cost. But they also hide problems — a bad connection stays in the pool, serving corrupt data until health checks remove it.&lt;/p&gt;

&lt;p&gt;UDP doesn’t have connection pools. No connections to pool. Each packet is independent. Lower complexity, but you lose connection-level metrics and circuit breaking.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Middle Ground: QUIC
&lt;/h3&gt;

&lt;p&gt;I mentioned the sensor network earlier. We eventually migrated to QUIC.&lt;/p&gt;

&lt;p&gt;QUIC runs on UDP but adds reliability features. Selective acknowledgments — only retransmit lost packets, not everything after them. Connection migration — your phone switches from WiFi to cellular, connection survives. Reduced handshake latency — combines TCP’s three-way handshake and TLS setup into one round-trip.&lt;/p&gt;

&lt;p&gt;HTTP/3 uses QUIC. Google built it to fix TCP’s head-of-line blocking for web traffic. A slow-loading image doesn’t block JavaScript anymore.&lt;/p&gt;

&lt;p&gt;QUIC isn’t perfect. It’s complex. Debugging is harder — encrypted from the start, so tcpdump shows less. NAT traversal can be tricky. CPU overhead is higher than plain UDP.&lt;/p&gt;

&lt;p&gt;But for applications that need reliability &lt;em&gt;and&lt;/em&gt; low latency, it’s worth considering.&lt;/p&gt;

&lt;p&gt;[Image Prompt: A three-tier timeline showing network evolution: TCP (1981), UDP (1980), and QUIC (2012) with arrows indicating “reliability” vs “speed” tradeoffs. Show how QUIC attempts to combine both.]&lt;/p&gt;

&lt;p&gt;Caption: QUIC isn’t replacing TCP everywhere, but it’s solving real problems for specific use cases. When you need both reliability and speed, the protocol layer matters more than you think.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Means For Your Next Project
&lt;/h3&gt;

&lt;p&gt;Don’t cargo-cult protocol choices. “Everyone uses TCP” isn’t engineering reasoning.&lt;/p&gt;

&lt;p&gt;Ask: what happens when packets drop? If you need every byte in order, TCP is right. If recent data beats complete data, consider UDP.&lt;/p&gt;

&lt;p&gt;Test under realistic conditions. Packet loss isn’t theoretical. Saturate your network in staging. Drop packets with &lt;code&gt;tc&lt;/code&gt; on Linux:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tc qdisc add dev eth0 root netem loss 1%
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;One percent packet loss. Watch how your application behaves. TCP might be fine. Or you might see latency spike to seconds.&lt;/p&gt;

&lt;p&gt;Measure what matters. Throughput? Latency? P99? P999? Different protocols optimize for different metrics. UDP gives better P99 latency. TCP gives better worst-case reliability.&lt;/p&gt;

&lt;p&gt;My sensor network runs on QUIC now. P99 latency is 45ms. Packet loss doesn’t cascade anymore. We still lose packets — that’s networks — but the system degrades gracefully.&lt;/p&gt;

&lt;p&gt;Tomorrow: I debugged a UDP packet loss nightmare. Turns out application-level acknowledgments are harder than they look. More on that soon.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sources consulted&lt;/strong&gt; : RFC 793 (TCP), RFC 768 (UDP), RFC 9000 (QUIC), Linux kernel TCP implementation docs, Cloudflare’s blog on QUIC deployment&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Enjoyed the read? Let’s stay connected!&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🚀 Follow &lt;strong&gt;The Speed Engineer&lt;/strong&gt; for more Rust, Go and high-performance engineering stories.&lt;/li&gt;
&lt;li&gt;💡 Like this article? Follow for daily speed-engineering benchmarks and tactics.&lt;/li&gt;
&lt;li&gt;⚡ Stay ahead in Rust and Go — follow for a fresh article every morning &amp;amp; night.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your support means the world and helps me create more content you’ll love. ❤️&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Your team doesn't have a prompt problem. It has a blank-box problem.</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Thu, 11 Jun 2026 03:45:56 +0000</pubDate>
      <link>https://dev.to/speed_engineer/your-team-doesnt-have-a-prompt-problem-it-has-a-blank-box-problem-1he9</link>
      <guid>https://dev.to/speed_engineer/your-team-doesnt-have-a-prompt-problem-it-has-a-blank-box-problem-1he9</guid>
      <description>&lt;p&gt;Most advice about getting more out of AI is aimed at one person getting better at writing prompts. Take a course, learn the "formula," practice every day. That's fine if you're an enthusiast. It's a bad deal for a team of busy people who just want to finish a task.&lt;/p&gt;

&lt;p&gt;Here's what actually happens on most teams.&lt;/p&gt;

&lt;h2&gt;
  
  
  The blank box
&lt;/h2&gt;

&lt;p&gt;Someone opens ChatGPT to do a real task — turn a long customer email into three clear bullet points, draft a job posting, rewrite a paragraph for a newsletter. They get an empty box and a blinking cursor. They don't know what "good" looks like for this task, so they either type something vague ("make this better"), get something vague back, or they freeze and close the tab.&lt;/p&gt;

&lt;p&gt;The conclusion they walk away with is "AI isn't that useful for my work." The real problem is that they started from zero, with no idea what a strong prompt for that task even looks like.&lt;/p&gt;

&lt;p&gt;Meanwhile, someone three desks over wrote a great prompt for that exact task last month. It works every time. It just lives in their personal chat history, where no one else will ever see it.&lt;/p&gt;

&lt;h2&gt;
  
  
  For a team, prompting is a distribution problem
&lt;/h2&gt;

&lt;p&gt;For one person, writing better prompts is a skill you build over time. For a team, the thing holding you back usually isn't skill — it's distribution. The good prompts already exist. They're trapped in one person's account.&lt;/p&gt;

&lt;p&gt;So the goal isn't to turn everyone into a "prompt engineer." It's to make sure the person facing the blank box right now can start from a proven prompt instead of from nothing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The habit that fixes it
&lt;/h2&gt;

&lt;p&gt;You don't need a tool to start. You need four small habits:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Notice when a prompt works.&lt;/strong&gt; The moment you get output you'd actually use, that prompt just became worth keeping. Most people throw it away by closing the tab.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Label it by the task, not the topic.&lt;/strong&gt; "Summarize a support thread into themes" beats "AI stuff." People look for prompts by the job they're trying to do, so name it that way.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start from the closest one and adapt 20%.&lt;/strong&gt; The skill that actually scales on a team isn't writing prompts from scratch — it's recognizing "this is 80% of what I need" and changing the rest. Starting from a known-good prompt also teaches people what good looks like faster than any course.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Keep them where people already are.&lt;/strong&gt; A prompt nobody can find is the same as no prompt. A pinned doc or a shared channel beats ten private chat histories.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Where the doc breaks down
&lt;/h2&gt;

&lt;p&gt;For a week, a shared doc or a Slack canvas is enough. Then it scatters. Prompts get pasted into DMs, the doc goes stale, nobody updates it, and everyone quietly drifts back to the blank box.&lt;/p&gt;

&lt;p&gt;That's the gap we built &lt;a href="https://promptship.co" rel="noopener noreferrer"&gt;PromptShip&lt;/a&gt; to fill — a shared prompt library for non-technical teams (it works with ChatGPT, Claude, and Gemini) so the person facing the blank box starts from a teammate's proven prompt in a couple of clicks instead of from scratch. But the habit matters more than the tool: even a pinned list of your team's ten best prompts beats starting from zero every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;For most non-technical teams, the bottleneck isn't prompt-writing skill — it's the blank box, and starting every task from nothing.&lt;/li&gt;
&lt;li&gt;The good prompts already exist; they're stuck in one person's chat history.&lt;/li&gt;
&lt;li&gt;Notice what works, label it by the task, start from the closest one and adapt, and keep them somewhere everyone can reach.&lt;/li&gt;
&lt;li&gt;Past a handful of prompts and people, a shared library does this automatically — but even a pinned doc beats starting from scratch.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;What's the one prompt on your team that everyone should have, but only one person actually does?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>chatgpt</category>
      <category>teamwork</category>
    </item>
    <item>
      <title>Your Oldest Client Is Probably Your Lowest-Paying One</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Wed, 10 Jun 2026 03:52:02 +0000</pubDate>
      <link>https://dev.to/speed_engineer/your-oldest-client-is-probably-your-lowest-paying-one-3bm1</link>
      <guid>https://dev.to/speed_engineer/your-oldest-client-is-probably-your-lowest-paying-one-3bm1</guid>
      <description>&lt;p&gt;Most freelancers can name their highest-paying client instantly. Ask them which client pays the &lt;em&gt;least per hour&lt;/em&gt; and they go quiet — because the answer is usually the one they'd never suspect: the loyal, long-running retainer that's felt "safe" for years.&lt;/p&gt;

&lt;h2&gt;
  
  
  The client you never re-priced
&lt;/h2&gt;

&lt;p&gt;Here's how it happens. You sign a retainer at $2,000/month for what you both estimate is about 20 hours of work. That's $100/hour. Good deal, everyone's happy.&lt;/p&gt;

&lt;p&gt;Then time passes. The relationship gets comfortable. And comfort is exactly where scope creep lives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Can you also just look at this real quick?"&lt;/li&gt;
&lt;li&gt;"While you're in there, could you update the other page too?"&lt;/li&gt;
&lt;li&gt;"We added a new tool — can you handle that going forward?"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each request is small. Each one is easy to say yes to, because you like them and the relationship is good. None of them comes with a fee change.&lt;/p&gt;

&lt;p&gt;Eighteen months later, that 20-hour retainer is quietly a 38-hour retainer. Same $2,000. Your effective rate just fell from $100/hour to about $53 — and nobody decided that on purpose. You gave your biggest client an enormous raise, and you never saw the memo, because you sent it to yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a flat fee hides the leak
&lt;/h2&gt;

&lt;p&gt;Hourly work has a built-in alarm: more hours, bigger invoice, and you notice. A flat fee removes that alarm completely. The invoice is identical whether the month took 20 hours or 40. The number on the invoice stops telling you anything about the number that actually matters — your real hourly rate.&lt;/p&gt;

&lt;p&gt;So the erosion is invisible by design. There's no line item for "scope that crept in since last year." The work feels normal because it grew one reasonable favor at a time. And the client isn't being shady — they genuinely don't know how long things take you. Only you can know that, and only if you're measuring.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: track hours &lt;em&gt;against the flat fee&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;You don't need to switch the client to hourly. You just need to know your effective rate, which means logging time even on work you're not billing hourly.&lt;/p&gt;

&lt;p&gt;The drill:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;For 30 days, track every hour you spend on each retainer&lt;/strong&gt;, tagged to that client. Don't change anything else.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At month's end, divide the flat fee by the hours.&lt;/strong&gt; That's your real rate for that client.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compare it to your target rate.&lt;/strong&gt; If the retainer has drifted 30–40% below your other work, that's not a loyal client — that's a subsidy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-scope or re-price.&lt;/strong&gt; "Here's what the retainer originally covered, here's what it covers now — let's right-size it" is a normal, professional conversation. The data makes it un-awkward.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Most people who run this are stunned by which client comes out lowest. It's almost never the demanding one. It's the easy one you've had forever.&lt;/p&gt;

&lt;h2&gt;
  
  
  How FillTheTimesheet fits in
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://fillthetimesheet.com" rel="noopener noreferrer"&gt;FillTheTimesheet&lt;/a&gt; partly for this: it lets you log time against a client even when the engagement is flat-fee, then shows effective hourly rate per client so the erosion can't hide. But you can do the whole audit with a spreadsheet and a timer. The tool just keeps the number in front of you every month instead of once, by accident, when you finally wonder why the "safe" client never feels worth it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;A flat fee removes the feedback loop that hourly billing gives you — scope can grow without the invoice ever changing.&lt;/li&gt;
&lt;li&gt;Your oldest, friendliest retainer is the most likely to have eroded, because comfort is where scope creep lives.&lt;/li&gt;
&lt;li&gt;Track hours against flat-fee work for 30 days and compute effective rate per client.&lt;/li&gt;
&lt;li&gt;If a retainer has drifted well below your target rate, re-scope or re-price — the data turns an awkward ask into an obvious one.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Run it for one month. The client who comes out at the bottom of that list is the conversation you've been avoiding.&lt;/p&gt;

</description>
      <category>freelance</category>
      <category>productivity</category>
      <category>business</category>
      <category>timetracking</category>
    </item>
    <item>
      <title>Prompt drift: why the AI prompt that worked last month quietly stopped working</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Tue, 09 Jun 2026 10:24:36 +0000</pubDate>
      <link>https://dev.to/speed_engineer/prompt-drift-why-the-ai-prompt-that-worked-last-month-quietly-stopped-working-289l</link>
      <guid>https://dev.to/speed_engineer/prompt-drift-why-the-ai-prompt-that-worked-last-month-quietly-stopped-working-289l</guid>
      <description>&lt;p&gt;If you share AI prompts with your team, you've probably hit this without having a name for it:&lt;/p&gt;

&lt;p&gt;A prompt that produced great output a month ago now produces mediocre output. Same prompt, same model, worse results. Nobody changed the model. So what happened?&lt;/p&gt;

&lt;p&gt;Usually, the prompt changed — a little at a time, by people trying to help.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompt drift
&lt;/h2&gt;

&lt;p&gt;Here's the typical sequence. Someone writes a genuinely good prompt — say, one that turns messy meeting notes into a clean summary. It works. They paste it into a shared doc so the team can reuse it.&lt;/p&gt;

&lt;p&gt;Then it starts drifting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A teammate adds "keep it under 100 words" because their summaries ran long.&lt;/li&gt;
&lt;li&gt;Someone else adds "use bullet points" for their own use case.&lt;/li&gt;
&lt;li&gt;A third person rewords a line to fix a one-off problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each edit made sense for the person making it. But they all edited the &lt;em&gt;same shared copy, in place&lt;/em&gt;, with no record of what changed or why. Three weeks later the prompt is a patchwork of everyone's special cases, and the original — the one that actually worked — is gone. You can't roll back to it, because nobody saved it.&lt;/p&gt;

&lt;p&gt;That's prompt drift: the slow degradation of a shared prompt that no single person broke.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why a shared doc makes it worse
&lt;/h2&gt;

&lt;p&gt;A Google Doc or Notion page feels like the obvious home for team prompts. It's better than nothing, but it has the exact property that causes drift: &lt;strong&gt;one editable copy, no versions, no rollback.&lt;/strong&gt; The moment two people have different needs for the same prompt, one overwrites the other, and there's no known-good version to return to.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: treat prompts like recipes, not messages
&lt;/h2&gt;

&lt;p&gt;A chat message is disposable. A recipe is something you keep, refine, and can always cook again. Treat your reusable prompts as recipes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Name it.&lt;/strong&gt; "Meeting-notes → summary" beats "that summary prompt Dana shared." A shared name means everyone is talking about the same thing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Version it.&lt;/strong&gt; Every time you change a prompt, save it as a new version instead of overwriting. Keep a one-line note — &lt;em&gt;what changed and why&lt;/em&gt; ("Jun 9 — added word limit for newsletter use").&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Save the output that worked.&lt;/strong&gt; Store one example of the result the prompt produced when everyone agreed it was good. That's your reference point: when quality drops, you compare against it instead of arguing from memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fork instead of overwrite.&lt;/strong&gt; When your use case differs, copy the prompt to a new version — don't edit the shared one. The newsletter team and the support team can each keep a variant without stepping on each other.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Roll back fearlessly.&lt;/strong&gt; When a prompt gets worse, don't debug it — restore the last version that worked, then re-apply changes one at a time until you find the one that hurt.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of this requires special tooling. You can do it in a doc with manual version headers, and for a handful of prompts that's fine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it breaks down
&lt;/h2&gt;

&lt;p&gt;The doc approach falls apart somewhere around 15–20 prompts and three or more people. Manual version headers get skipped, nobody saves the "good" output, and you're back to drift. That's the point where a purpose-built shared prompt library earns its keep: it keeps every prompt named, versioned, and roll-back-able by default, so the discipline happens automatically instead of relying on everyone remembering.&lt;/p&gt;

&lt;p&gt;That's the gap we built &lt;a href="https://promptship.co" rel="noopener noreferrer"&gt;PromptShip&lt;/a&gt; to fill — a shared prompt library for teams (works with ChatGPT, Claude, and Gemini) where every prompt has version history, so you can always get back to the version that worked. But the habit matters more than the tool: even if you never adopt a library, versioning your prompts will save you the next time one mysteriously stops working.&lt;/p&gt;

&lt;h2&gt;
  
  
  Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Shared prompts degrade over time through well-meaning in-place edits — &lt;em&gt;prompt drift&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;A single editable copy (the typical shared doc) is what enables it.&lt;/li&gt;
&lt;li&gt;Name your prompts, version them, save the output that worked, fork instead of overwrite, and keep the ability to roll back.&lt;/li&gt;
&lt;li&gt;Past ~15 prompts and a few people, a versioned prompt library does this automatically.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;How does your team keep track of the prompts that actually work?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>chatgpt</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>The 6-Minute Tasks That Quietly Cost Freelancers a Full Day of Pay Each Week</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Mon, 08 Jun 2026 03:45:20 +0000</pubDate>
      <link>https://dev.to/speed_engineer/the-6-minute-tasks-that-quietly-cost-freelancers-a-full-day-of-pay-each-week-5hlj</link>
      <guid>https://dev.to/speed_engineer/the-6-minute-tasks-that-quietly-cost-freelancers-a-full-day-of-pay-each-week-5hlj</guid>
      <description>&lt;p&gt;Ask a freelancer how many hours they billed last week and you'll get a confident number. Ask how many hours they actually &lt;em&gt;worked&lt;/em&gt; and the number gets fuzzy. The gap between those two numbers is almost always made of small tasks — and it's bigger than you think.&lt;/p&gt;

&lt;h2&gt;
  
  
  The tasks you never log
&lt;/h2&gt;

&lt;p&gt;You log the two-hour design session. You log the afternoon you spent on the client's API integration. What you don't log:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;the 6-minute reply to "quick question about the invoice"&lt;/li&gt;
&lt;li&gt;the 4-minute Slack clarification before you could start real work&lt;/li&gt;
&lt;li&gt;the 9 minutes hunting for the brand assets they swore they'd sent&lt;/li&gt;
&lt;li&gt;the 12-minute call that "wasn't really billable"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each one feels too small to bother with. Individually, it is. The problem is that there are fifteen of them a day.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one-week audit
&lt;/h2&gt;

&lt;p&gt;Here's the experiment. For one week, log &lt;em&gt;everything&lt;/em&gt; — including every task under ten minutes. Don't judge it, don't decide whether it's billable, just capture it. Put a tally next to anything that feels "too small to bill."&lt;/p&gt;

&lt;p&gt;At the end of the week, total that column.&lt;/p&gt;

&lt;p&gt;Most people I've talked into doing this land somewhere between four and six hours. That's a half to a full working day, every week, of real client work that left no trace on an invoice. Over a year it's the single biggest leak in a freelance P&amp;amp;L, and almost nobody measures it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why it leaks
&lt;/h2&gt;

&lt;p&gt;It isn't laziness. It's three structural things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The unit feels wrong.&lt;/strong&gt; A six-minute task measured against a one-hour mental "minimum billable unit" feels un-loggable, so it vanishes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The switching cost hides it.&lt;/strong&gt; The task interrupted something else, so you bucket it under that something else — or under nothing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You feel awkward billing for it.&lt;/strong&gt; "I'm not going to charge them for a four-minute email" is a sentence that, repeated 300 times a year, is a meaningful pay cut.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  What to do with the number
&lt;/h2&gt;

&lt;p&gt;Once you can see the leak, you have options. Batch micro-tasks into a single daily "client comms" line and bill it honestly. Set a real minimum billable unit (most agencies use 15 minutes) and apply it consistently instead of silently eating anything smaller. Or build the small stuff into your rate from the start, so you're not deciding case-by-case at 5pm whether a task "counts."&lt;/p&gt;

&lt;p&gt;The point isn't to nickel-and-dime your clients. It's to stop nickel-and-diming &lt;em&gt;yourself&lt;/em&gt; by accident.&lt;/p&gt;

&lt;h2&gt;
  
  
  How FillTheTimesheet fits in
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://fillthetimesheet.com" rel="noopener noreferrer"&gt;FillTheTimesheet&lt;/a&gt; partly because of this exact leak — it lets you drop in micro-entries as fast as you can type them, then totals the "too small to bill" pile for you at the end of the week, so the number is staring you in the face instead of hiding. But the audit works with a notebook. The tool just makes the weekly total automatic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The gap between hours worked and hours billed is mostly sub-10-minute tasks.&lt;/li&gt;
&lt;li&gt;Run a one-week audit: log everything, tally the "too small to bill" items, total it.&lt;/li&gt;
&lt;li&gt;Expect 4–6 hours. That's up to a full day of unbilled work a week.&lt;/li&gt;
&lt;li&gt;Fix it structurally: batch, set a minimum unit, or price it in.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Run the audit for one week. The number alone will change how you bill.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>freelance</category>
      <category>timetracking</category>
      <category>saas</category>
    </item>
    <item>
      <title>Why Your AI Prompts Work for You But Fail for Your Teammates (And the 4-Block Format That Fixes It)</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Tue, 02 Jun 2026 04:30:42 +0000</pubDate>
      <link>https://dev.to/speed_engineer/why-your-ai-prompts-work-for-you-but-fail-for-your-teammates-and-the-4-block-format-that-fixes-it-1kjh</link>
      <guid>https://dev.to/speed_engineer/why-your-ai-prompts-work-for-you-but-fail-for-your-teammates-and-the-4-block-format-that-fixes-it-1kjh</guid>
      <description>&lt;h1&gt;
  
  
  Why Your AI Prompts Work for You But Fail for Your Teammates (And the 4-Block Format That Fixes It)
&lt;/h1&gt;

&lt;p&gt;You write a prompt for ChatGPT that drafts a perfect outreach email. You paste it into Slack. A teammate tries it. The output is mediocre.&lt;/p&gt;

&lt;p&gt;You're not crazy. The prompt didn't get worse. It was never portable to begin with.&lt;/p&gt;

&lt;p&gt;This is one of the most under-discussed reasons team AI workflows fail: prompts that depend on &lt;strong&gt;context only the author has in their head&lt;/strong&gt;. Once you see the pattern, you can fix any prompt in about 10 minutes.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why "Just Copy My Prompt" Doesn't Work
&lt;/h2&gt;

&lt;p&gt;A working prompt is actually three things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The text you typed&lt;/li&gt;
&lt;li&gt;The context in the chat history above it&lt;/li&gt;
&lt;li&gt;The mental model you have of what "good" looks like&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you share the prompt, only piece #1 travels. Pieces #2 and #3 stay with you. Your teammate gets the recipe without the pantry — and the output reflects that.&lt;/p&gt;

&lt;p&gt;The fix isn't a better model. It's a prompt format that &lt;strong&gt;forces all three pieces into the prompt itself&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 4-Block Format
&lt;/h2&gt;

&lt;p&gt;Every reusable team prompt should have four labeled blocks:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ROLE]
You are a B2B SaaS sales rep writing to a CFO at a mid-market company.

[CONTEXT]
- Our product: PromptShip — a shared prompt library for non-technical teams
- Their pain: AI prompts scattered across Slack, no team standard
- Their company: 50-500 employees, recently funded
- Tone: direct, no jargon, short paragraphs

[TASK]
Write a 4-sentence cold email that opens with a specific observation about their team, references a real pain, and asks for a 15-minute call.

[OUTPUT FORMAT]
- Subject line (max 8 words)
- Body (max 4 sentences)
- One follow-up subject line variant
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Four blocks. Always in this order. No exceptions.&lt;/p&gt;

&lt;p&gt;The magic isn't the structure itself — it's that each block forces you to write down something you'd normally leave implicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Refactoring a Real Prompt
&lt;/h2&gt;

&lt;p&gt;Here's a prompt one of our users shared. It worked great for her, but every teammate who tried it got bland output:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Write a follow-up email for the demo I had yesterday with the marketing director."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;What's wrong: "the demo I had" assumes the AI knows what the demo covered. "Marketing director" without industry context is too generic. No tone guidance, no length constraint.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After (in 4-Block format):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ROLE]
You are a customer success rep at a B2B SaaS company.

[CONTEXT]
- Yesterday's demo: 30 min walkthrough of a prompt library tool
- Prospect: Marketing Director at a 200-person fintech company
- Their stated pain: team uses AI but no shared prompts, work gets duplicated
- They asked for: a way to share their copywriting prompts with the social team
- Our pricing: $15/mo for 10 seats

[TASK]
Write a follow-up email that thanks them, reinforces the one specific thing they said they cared about, and proposes a 15-min call with their social team lead.

[OUTPUT FORMAT]
- Subject line (max 8 words, no "Following up")
- Body (3 short paragraphs)
- Clear next-step CTA
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Same task. The "after" version works for anyone on the team because all the implicit context is now explicit.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Compounds Across a Team
&lt;/h2&gt;

&lt;p&gt;Once your team writes prompts in this format, three things happen:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Onboarding gets faster.&lt;/strong&gt; A new hire can run a senior teammate's prompt and get a senior-quality output, because the prompt carries its own context.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prompts become editable assets.&lt;/strong&gt; When the model changes or your product evolves, anyone can update the [CONTEXT] block without reverse-engineering the original author's intent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You can actually share prompts.&lt;/strong&gt; Not just paste them and hope. &lt;em&gt;Share&lt;/em&gt; them — with the confidence that they'll work for the next person.&lt;/p&gt;

&lt;h2&gt;
  
  
  How We Use This at PromptShip
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://promptship.co" rel="noopener noreferrer"&gt;PromptShip&lt;/a&gt; is a shared prompt library for teams using ChatGPT, Claude, or Gemini. The 4-Block format is baked into how we recommend customers structure prompts — every saved prompt has fields for role, context, task, and output, so teammates can copy a prompt with one click and trust that it'll work.&lt;/p&gt;

&lt;p&gt;The free plan (200 prompts, 1 user) is enough to test this format across a few of your team's top prompts before rolling it out.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Prompts that work for you but fail teammates aren't bad prompts — they're prompts missing implicit context&lt;/li&gt;
&lt;li&gt;The 4-Block format (Role / Context / Task / Output Format) makes prompts portable&lt;/li&gt;
&lt;li&gt;Refactoring takes ~10 min per prompt and pays back in every future use&lt;/li&gt;
&lt;li&gt;Shared prompt libraries only work if the prompts inside them are written to be shared&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;What's the most-copied prompt on your team right now? Try refactoring it into 4 blocks and see if the output improves. I'd love to hear what changed.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>teamwork</category>
      <category>saas</category>
    </item>
    <item>
      <title>The 60-Second Time-Tracking Habit That Killed My Friday "Reconstruction Ritual"</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Mon, 01 Jun 2026 03:41:31 +0000</pubDate>
      <link>https://dev.to/speed_engineer/the-60-second-time-tracking-habit-that-killed-my-friday-reconstruction-ritual-4c17</link>
      <guid>https://dev.to/speed_engineer/the-60-second-time-tracking-habit-that-killed-my-friday-reconstruction-ritual-4c17</guid>
      <description>&lt;h2&gt;
  
  
  The 60-Second Time-Tracking Habit That Killed My Friday "Reconstruction Ritual"
&lt;/h2&gt;

&lt;p&gt;For about three years, my Fridays ended the same way: 45 minutes of squinting at calendar entries, Slack threads, and Git history trying to remember what I worked on Tuesday afternoon.&lt;/p&gt;

&lt;p&gt;That's not time tracking. That's archaeology.&lt;/p&gt;

&lt;p&gt;If you bill by the hour — or you just want an honest weekly review — the issue isn't laziness. It's that you're trying to remember the past instead of capturing the present.&lt;/p&gt;

&lt;p&gt;Here's the habit that fixed it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Context-Switch Stamp
&lt;/h2&gt;

&lt;p&gt;Every time you switch tasks — even briefly — write one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;14:32 → client-A onboarding flow → wrapping for Slack DM
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Three fragments: timestamp, what you were doing, why you stopped. No categories, no project codes, no tool. A sticky note works. So does the back of your hand.&lt;/p&gt;

&lt;p&gt;The trick is that you're &lt;em&gt;already&lt;/em&gt; context-switching. You're already pausing. The 30 seconds to write a stamp is invisible. The 45 minutes on Friday is not.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this beats "tracking sessions"
&lt;/h2&gt;

&lt;p&gt;Most time-tracking advice says: start a timer when you begin, stop it when you finish. Reality:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You forget to start it&lt;/li&gt;
&lt;li&gt;You forget to stop it&lt;/li&gt;
&lt;li&gt;You're in three projects at once&lt;/li&gt;
&lt;li&gt;A "session" is a fiction; you got interrupted four times&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The context-switch stamp inverts the problem. You don't need to remember to start anything. The next interruption &lt;em&gt;is&lt;/em&gt; the trigger. Every stamp tells you when the previous block ended. The boundaries draw themselves.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you do with the stamps
&lt;/h2&gt;

&lt;p&gt;At the end of the day (not the week — the day), spend two minutes turning stamps into intervals:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;09:14 – 10:47 → Client A — onboarding flow (1h 33m)
10:47 – 11:02 → email triage
11:02 – 12:38 → Client B — invoicing bug (1h 36m)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you have a real timesheet. Not a reconstruction. A recording.&lt;/p&gt;

&lt;p&gt;When Friday comes around, your week is already there.&lt;/p&gt;

&lt;h2&gt;
  
  
  How FillTheTimesheet fits in
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://fillthetimesheet.com" rel="noopener noreferrer"&gt;FillTheTimesheet&lt;/a&gt; because I got tired of the daily two-minute reconciliation, too. You paste your raw stamps in and it groups them into client/project blocks, flags overlaps, and exports the result to whatever billing format your agency uses.&lt;/p&gt;

&lt;p&gt;But honestly — the habit matters more than the tool. If you take one thing from this post, take the stamp. You can do it in a text file. You can do it in Notes. You can do it in an SMS to yourself. The reconciliation step is fast no matter how you batch it; what's slow is reconstructing from memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Stop tracking sessions. Track &lt;em&gt;transitions&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Three fragments per stamp: timestamp, what, why-stopping&lt;/li&gt;
&lt;li&gt;Reconcile daily, not weekly — two minutes vs. forty-five&lt;/li&gt;
&lt;li&gt;The tool is optional. The habit is not.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Try it for one day. If your Friday next week is shorter, you'll never go back.&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>freelance</category>
      <category>timetracking</category>
      <category>saas</category>
    </item>
    <item>
      <title>Sunday Notes: The Workflows Your Team Runs But Never Writes Down</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Sun, 31 May 2026 06:56:59 +0000</pubDate>
      <link>https://dev.to/speed_engineer/sunday-notes-the-workflows-your-team-runs-but-never-writes-down-4gjm</link>
      <guid>https://dev.to/speed_engineer/sunday-notes-the-workflows-your-team-runs-but-never-writes-down-4gjm</guid>
      <description>&lt;h2&gt;
  
  
  Sunday Notes: The Workflows Your Team Runs But Never Writes Down
&lt;/h2&gt;

&lt;p&gt;Every team I've talked to this week has a few workflows that live nowhere.&lt;/p&gt;

&lt;p&gt;The Friday-afternoon check-in someone reconstructs from chat history. The ChatGPT prompt that "really works" that one person on the team quietly rewrites every other Tuesday. The invoice-reconciliation pass that eats 40 minutes because half the week never got logged.&lt;/p&gt;

&lt;p&gt;This week I had three separate conversations land on the same thing: &lt;strong&gt;the most expensive workflows on a team are the ones that aren't captured anywhere.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Two patterns, same shape
&lt;/h2&gt;

&lt;p&gt;The conversations were in different worlds — freelancers and agencies on one side, marketing/sales/support teams on the other — but the underlying pattern was identical.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern 1: time that vanishes.&lt;/strong&gt;&lt;br&gt;
Freelancers and small agencies kept describing the same loop. Work happens. Tracking gets postponed. Friday becomes a reconstruction exercise. The invoice goes out for less than the week actually was.&lt;/p&gt;

&lt;p&gt;The instinct is "I need more discipline." The fix is usually different: remove the moment where tracking becomes a decision. That's the entire bet behind &lt;a href="https://fillthetimesheet.com" rel="noopener noreferrer"&gt;FillTheTimesheet&lt;/a&gt; — automated tracking so the timesheet mostly fills itself and Friday afternoon stops being detective work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern 2: prompts that vanish.&lt;/strong&gt;&lt;br&gt;
Marketing, sales, HR, and support teams kept describing a different version of the same loop. Someone writes a prompt that works beautifully. They use it for three weeks. It disappears into chat history. Three months later, someone else on the team writes a very similar prompt from scratch, often worse than original.&lt;/p&gt;

&lt;p&gt;That's the loop &lt;a href="https://promptship.co" rel="noopener noreferrer"&gt;PromptShip&lt;/a&gt; was built for — a shared library of prompts your team can actually find again. One-click into ChatGPT, Claude, or Gemini. No "wait, what was the prompt Sarah used for that thing?"&lt;/p&gt;

&lt;h2&gt;
  
  
  What I actually wanted to say this week
&lt;/h2&gt;

&lt;p&gt;Different markets. Different buyers. Same underlying problem.&lt;/p&gt;

&lt;p&gt;Valuable work happens. The artifact never gets captured. The cost shows up later as the same work being done again somewhere else by someone else.&lt;/p&gt;

&lt;p&gt;The teams I see compounding fastest aren't necessarily working harder or hiring better. They've just lowered the friction of &lt;em&gt;capture&lt;/em&gt; — the moment between "this worked" and "this is now reusable."&lt;/p&gt;

&lt;p&gt;That's almost never a glamorous feature. It's also almost always the thing that turns one-month users into one-year users.&lt;/p&gt;

&lt;h2&gt;
  
  
  A question for you
&lt;/h2&gt;

&lt;p&gt;What's one piece of "invisible work" on your team right now? Something useful that's happening, but living entirely in someone's head or someone's chat history?&lt;/p&gt;

&lt;p&gt;Genuinely curious which loops other people are stuck in.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Sunday notes from The Speed Engineer. More long-form writing on &lt;a href="https://medium.com/@speed_enginner" rel="noopener noreferrer"&gt;Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>productivity</category>
      <category>teamwork</category>
      <category>ai</category>
      <category>saas</category>
    </item>
    <item>
      <title>Checksum Everything: A Practical Habit That Catches Data Corruption Before It Hurts</title>
      <dc:creator>speed engineer</dc:creator>
      <pubDate>Sat, 30 May 2026 06:07:39 +0000</pubDate>
      <link>https://dev.to/speed_engineer/checksum-everything-a-practical-habit-that-catches-data-corruption-before-it-hurts-937</link>
      <guid>https://dev.to/speed_engineer/checksum-everything-a-practical-habit-that-catches-data-corruption-before-it-hurts-937</guid>
      <description>&lt;h2&gt;
  
  
  Why I Started Checksumming Everything
&lt;/h2&gt;

&lt;p&gt;A few months back, I shipped a system that quietly truncated 0.4% of records during a Kafka rebalance. No errors, no alarms — just a few thousand objects with subtly wrong payloads downstream. By the time anyone noticed, the damage was already in the database.&lt;/p&gt;

&lt;p&gt;That bug taught me a habit I now apply everywhere: &lt;strong&gt;checksum at every boundary&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem: Silent Data Corruption
&lt;/h2&gt;

&lt;p&gt;Network errors are obvious. They throw. They reconnect. They give you something to log.&lt;/p&gt;

&lt;p&gt;Silent corruption is the dangerous cousin. A flipped bit in a memory-mapped file. A truncated payload from a misconfigured proxy. A serializer that drops the last field of a struct because producer and consumer disagree on the schema.&lt;/p&gt;

&lt;p&gt;None of these raise an exception. Your code happily processes the wrong bytes and you only find out weeks later when a customer asks why their invoice is short.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix Is Embarrassingly Simple
&lt;/h2&gt;

&lt;p&gt;Add a checksum field to every message you pass across a process or network boundary.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;envelope&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sort_keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checksum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;blake2b&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digest_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;sort_keys&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;encode&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;hashlib&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;blake2b&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;digest_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;16&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;hexdigest&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;expected&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checksum&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;checksum mismatch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. BLAKE2b is fast (several GB/s on modern x86), the digest is 16 bytes, and you've turned a class of invisible bugs into loud, immediate failures.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where to Apply It
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Between services&lt;/strong&gt;: every queue message, every HTTP body, every gRPC payload.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Across disk boundaries&lt;/strong&gt;: every file you write and re-read, every cache entry.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Across language boundaries&lt;/strong&gt;: every JSON you serialize and parse on the other side.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Across time&lt;/strong&gt;: every record you persist for replay later.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The rule of thumb I follow: if data crosses a boundary I don't control, it gets a checksum.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Checksums Are NOT For
&lt;/h2&gt;

&lt;p&gt;Checksums don't authenticate (use HMAC for that). They don't protect against malicious tampering. They don't fix your schema problems.&lt;/p&gt;

&lt;p&gt;What they do is catch the dumb, mechanical bugs — the ones that would otherwise take you a week to track down because they don't surface as errors.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Operational Payoff
&lt;/h2&gt;

&lt;p&gt;Once you start checksumming, two things change:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You catch corruption at the boundary that introduced it, not three systems downstream.&lt;/li&gt;
&lt;li&gt;Your error messages stop being mysteries. "Checksum mismatch on inbound queue message" beats "user reports invoice off by $14" every time.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  A Note on Billing-Critical Systems
&lt;/h2&gt;

&lt;p&gt;This habit matters most in systems where data quietly becomes money. Time-tracked hours, invoices, usage events — anything that hits a customer's bill. We use this pattern in &lt;a href="https://fillthetimesheet.com" rel="noopener noreferrer"&gt;FillTheTimesheet&lt;/a&gt; for exactly this reason: a corrupted timesheet entry isn't a UI bug, it's a wrong invoice. Checksums between the tracker and the billing pipeline have caught issues we never would have noticed otherwise.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Silent corruption is the dangerous class of bug — it doesn't throw.&lt;/li&gt;
&lt;li&gt;A 16-byte checksum at every boundary catches it cheaply.&lt;/li&gt;
&lt;li&gt;BLAKE2b or xxHash are both fast enough that you'll never notice the cost.&lt;/li&gt;
&lt;li&gt;Checksum at the boundary that introduces corruption, not downstream.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This is a companion to the deeper write-up on Medium: &lt;a href="https://medium.com/@speed_enginner/checksum-everything-corruption-caught-before-catastrophe-5cace12122fa" rel="noopener noreferrer"&gt;"Checksum Everything: Corruption Caught Before Catastrophe"&lt;/a&gt; by The Speed Engineer.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>programming</category>
      <category>tutorial</category>
      <category>softwareengineering</category>
      <category>reliability</category>
    </item>
  </channel>
</rss>
