<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Budi Widhiyanto</title>
    <description>The latest articles on DEV Community by Budi Widhiyanto (@budiwidhiyanto).</description>
    <link>https://dev.to/budiwidhiyanto</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F178123%2Fe07b4355-179c-44ca-8bc9-ee89ec2c17de.png</url>
      <title>DEV Community: Budi Widhiyanto</title>
      <link>https://dev.to/budiwidhiyanto</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/budiwidhiyanto"/>
    <language>en</language>
    <item>
      <title>National Vaccine Appointment &amp; Administration System</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Sat, 28 Feb 2026 09:30:45 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/national-vaccine-appointment-administration-system-303o</link>
      <guid>https://dev.to/budiwidhiyanto/national-vaccine-appointment-administration-system-303o</guid>
      <description>&lt;h2&gt;
  
  
  🌱 How It Started
&lt;/h2&gt;

&lt;p&gt;A few years ago, I had a system design interview. The interviewer gave me this scenario:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"Design a national vaccine appointment booking system. Millions of citizens need to register and book slots. Clinics must administer the doses. The government needs audit logs and fraud prevention."&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;My first thought was simple: just let people book a slot, check the stock, and confirm. I drew a basic flow on the whiteboard and felt pretty good about it. Then the interviewer started asking harder questions.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"What if two people try to book the last slot at the same time?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"What if the clinic runs out of doses after the booking is already confirmed?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;"How do you undo things if eligibility check fails in the middle?"&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;I didn't have good answers. I had designed only for the happy path.&lt;/p&gt;

&lt;p&gt;That interview stuck in my mind. Months later, I was doing research on &lt;a href="https://dev.to/budiwidhiyanto/designing-an-internet-credit-purchase-system-1175"&gt;inventory reservation patterns for an internet credit purchase system&lt;/a&gt;, and I realized the same ideas could have helped me in that interview. So I went back to the problem and redesigned it. This is what I came up with.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⚡ My Initial (Naïve) Solution
&lt;/h2&gt;

&lt;p&gt;Here's what I proposed during the interview:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fowllg4g1kpyz9799mo0e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fowllg4g1kpyz9799mo0e.png" alt="Initial (Naïve) Solution" width="791" height="861"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Simple, right? But the problems come fast:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Race conditions&lt;/strong&gt;: Two people click "Book" at the same time for the last slot. Both get confirmed. Now one citizen has no seat.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stock mismatch&lt;/strong&gt;: The slot is confirmed, but the clinic runs out of vaccine doses between booking day and appointment day.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Late eligibility failure&lt;/strong&gt;: The system confirms the appointment first, then finds out the citizen doesn't meet the age or insurance requirements. Now you need to undo everything, but stock is already allocated.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No rollback&lt;/strong&gt;: If something fails in the middle, there's no way to release the slot or dose back to the pool.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are the same problems I found later when designing the &lt;a href="https://dev.to/budiwidhiyanto/designing-an-internet-credit-purchase-system-1175"&gt;internet credit purchase system&lt;/a&gt;: the happy path is not enough when you deal with limited resources and many users at the same time.&lt;/p&gt;




&lt;h2&gt;
  
  
  🔍 Rethinking the Flow
&lt;/h2&gt;

&lt;p&gt;The main idea, which I learned from inventory reservation strategies in e-commerce, is: &lt;strong&gt;don't confirm anything until everything is verified&lt;/strong&gt;. Use a multi-stage process: temporary hold first, then verify, then confirm. If anything fails, roll back.&lt;/p&gt;

&lt;p&gt;It's like buying concert tickets. When you select a seat, it's held for you while you pay. If you don't finish in time, the seat goes back. Same concept here.&lt;/p&gt;

&lt;p&gt;Here's the full flow of the improved design:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6kfgwg4akc1spqzq2pg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft6kfgwg4akc1spqzq2pg.png" alt="improved flow design" width="666" height="1259"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🧩 The Improved Design
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Reserve First (Temporary Hold)
&lt;/h3&gt;

&lt;p&gt;When a citizen selects a clinic, time slot, and vaccine type, the system does not confirm right away. Instead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It creates a &lt;strong&gt;temporary reservation in Redis&lt;/strong&gt; with a TTL (time-to-live), for example 5 minutes.&lt;/li&gt;
&lt;li&gt;Appointment status is set to &lt;code&gt;PENDING&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Slot capacity and vaccine dose count are decreased &lt;em&gt;temporarily&lt;/em&gt;, so other users will see less availability.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why Redis?&lt;/strong&gt; Because we need something fast and temporary. A relational database could work too, but you would need a separate scheduled job to clean up expired reservations. Redis handles expiry automatically with TTL: when 5 minutes pass, the key just disappears. One caveat: the TTL only removes the hold key, so the temporarily decreased slot and dose counters still have to be restored when a hold expires. For a system that handles millions of bookings during a national vaccine campaign, keeping these short-lived holds in a fast in-memory store instead of the main database is important.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do we handle race conditions in Redis?&lt;/strong&gt; We use the Redis &lt;code&gt;DECR&lt;/code&gt; command on the slot counter. It is atomic, meaning that if two requests come at the same time, Redis processes them one by one. A plain &lt;code&gt;DECR&lt;/code&gt; will go below zero, though, so a negative result must be treated as "fully booked" and the unit added back. For extra safety, you can use a Lua script to make the check-and-decrement happen in one step.&lt;/p&gt;
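
&lt;p&gt;To make the check-and-decrement idea concrete, here is a minimal TypeScript sketch. It is an in-memory stand-in, not Redis client code: the &lt;code&gt;HoldStore&lt;/code&gt; class, its method names, and the slot and citizen IDs are all hypothetical. Because each method runs synchronously, the capacity check and the decrement cannot be interleaved by another request, which is the property a Lua script would give you inside Redis. Expired holds are swept explicitly so that their units return to the pool.&lt;/p&gt;

```typescript
// In-memory stand-in for the Redis slot counter and hold keys (illustration only).
interface Hold {
  slotId: string;
  expiresAt: number; // epoch milliseconds
}

class HoldStore {
  private available: { [slotId: string]: number } = {};
  private holds: { [citizenId: string]: Hold } = {};

  // Default TTL of 5 minutes, matching the example in the text.
  constructor(private readonly ttlMs: number = 5 * 60 * 1000) {}

  setCapacity(slotId: string, capacity: number): void {
    this.available[slotId] = capacity;
  }

  // Check and decrement in one step; returns false when the slot is full
  // or the citizen already holds a reservation.
  reserve(slotId: string, citizenId: string, now: number): boolean {
    this.releaseExpired(now);
    const left = this.available[slotId] ?? 0;
    const hasCapacity = left > 0;
    const alreadyHolding = this.holds[citizenId] !== undefined;
    if (!hasCapacity || alreadyHolding) return false;
    this.available[slotId] = left - 1;
    this.holds[citizenId] = { slotId, expiresAt: now + this.ttlMs };
    return true;
  }

  // Give the unit back, for example when an eligibility check fails.
  release(citizenId: string): void {
    const hold = this.holds[citizenId];
    if (hold === undefined) return;
    this.available[hold.slotId] = (this.available[hold.slotId] ?? 0) + 1;
    delete this.holds[citizenId];
  }

  // Return the units of every hold whose TTL has passed.
  releaseExpired(now: number): void {
    for (const citizenId of Object.keys(this.holds)) {
      const hold = this.holds[citizenId];
      if (hold !== undefined) {
        if (now >= hold.expiresAt) this.release(citizenId);
      }
    }
  }

  remaining(slotId: string): number {
    return this.available[slotId] ?? 0;
  }
}
```

&lt;p&gt;With a capacity of one, a second &lt;code&gt;reserve&lt;/code&gt; call for the same slot returns &lt;code&gt;false&lt;/code&gt; until the first hold is released or its five minutes run out.&lt;/p&gt;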

&lt;h3&gt;
  
  
  2. Eligibility Verification
&lt;/h3&gt;

&lt;p&gt;While the slot is held, the system runs eligibility checks:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Age requirement (e.g., some vaccines only for 60+).&lt;/li&gt;
&lt;li&gt;Insurance verification through external API.&lt;/li&gt;
&lt;li&gt;Medical history (allergies, previous doses).&lt;/li&gt;
&lt;li&gt;Geographic check (is this citizen in the right region?).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If any check fails, the reservation is released: the Redis key is deleted and the slot goes back to the pool. The citizen gets a clear message explaining &lt;em&gt;why&lt;/em&gt; they are not eligible, not just "something went wrong."&lt;/p&gt;
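
&lt;p&gt;As a rough illustration of "explain why", the local rules can return a reason together with the verdict. This is only a sketch: the &lt;code&gt;Citizen&lt;/code&gt; and &lt;code&gt;VaccineRule&lt;/code&gt; shapes, their field names, and the messages are assumptions, and the insurance check is left out because it depends on an external API.&lt;/p&gt;

```typescript
// All field names and rule values below are hypothetical, for illustration only.
interface Citizen {
  age: number;
  region: string;
  allergies: string[];
  dosesReceived: number;
}

interface VaccineRule {
  minAge: number;
  regions: string[];
  allergens: string[];
  maxDoses: number;
}

type Eligibility = { eligible: true } | { eligible: false; reason: string };

// Runs the local checks in order and stops at the first failure,
// so the citizen always gets one specific reason.
function checkEligibility(citizen: Citizen, rule: VaccineRule): Eligibility {
  if (rule.minAge > citizen.age) {
    return { eligible: false, reason: `This vaccine is only available from age ${rule.minAge}.` };
  }
  if (rule.regions.indexOf(citizen.region) === -1) {
    return { eligible: false, reason: `This clinic does not serve region ${citizen.region}.` };
  }
  const conflicts = citizen.allergies.filter((a) => rule.allergens.indexOf(a) !== -1);
  if (conflicts.length > 0) {
    return { eligible: false, reason: `Recorded allergy to ${conflicts.join(", ")}.` };
  }
  if (citizen.dosesReceived >= rule.maxDoses) {
    return { eligible: false, reason: "All doses of this vaccine have already been received." };
  }
  return { eligible: true };
}
```

&lt;p&gt;The caller would release the temporary hold whenever &lt;code&gt;eligible&lt;/code&gt; is &lt;code&gt;false&lt;/code&gt; and show &lt;code&gt;reason&lt;/code&gt; to the citizen.&lt;/p&gt;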

&lt;h3&gt;
  
  
  3. Confirm Appointment
&lt;/h3&gt;

&lt;p&gt;If all checks pass:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Slot capacity and vaccine stock are decreased &lt;strong&gt;permanently&lt;/strong&gt; in the main database.&lt;/li&gt;
&lt;li&gt;Appointment status changes from &lt;code&gt;PENDING&lt;/code&gt; to &lt;code&gt;CONFIRMED&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Redis reservation is cleared (not needed anymore).&lt;/li&gt;
&lt;li&gt;Confirmation is sent to the citizen (SMS, email, or push notification).&lt;/li&gt;
&lt;/ul&gt;

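&lt;p&gt;This is the commit point. Before this step, everything can be undone simply by dropping the hold; after it, a booking is reversed only through an explicit cancellation or no-show flow.&lt;/p&gt;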

&lt;h3&gt;
  
  
  4. Administration (Vaccination Day)
&lt;/h3&gt;

&lt;p&gt;When the citizen arrives at the clinic:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clinic staff scans the citizen's &lt;strong&gt;QR code&lt;/strong&gt;. The QR code contains the appointment ID and a verification hash. The hash is generated on the server using appointment ID + citizen ID + a secret key, so it cannot be faked.&lt;/li&gt;
&lt;li&gt;System verifies the QR code against the appointment record.&lt;/li&gt;
&lt;li&gt;Staff records the &lt;strong&gt;vaccine batch number&lt;/strong&gt; and time of administration.&lt;/li&gt;
&lt;li&gt;Appointment status changes to &lt;code&gt;ADMINISTERED&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;An event is sent to other systems: analytics, government reporting, audit logs.&lt;/li&gt;
&lt;/ul&gt;
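
&lt;p&gt;One way the verification hash could be built is with an HMAC. The text above only says the hash combines appointment ID, citizen ID, and a secret key, so HMAC-SHA256, the &lt;code&gt;appointmentId.hash&lt;/code&gt; payload layout, and every name below are my assumptions. The sketch uses Node's built-in &lt;code&gt;crypto&lt;/code&gt; module and assumes appointment IDs contain no dots.&lt;/p&gt;

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Hypothetical secret; a real deployment would load it from a secret manager.
const QR_SECRET = "demo-secret-do-not-use";

// Keyed hash over appointment ID and citizen ID.
function signAppointment(appointmentId: string, citizenId: string, secret: string = QR_SECRET): string {
  return createHmac("sha256", secret).update(`${appointmentId}:${citizenId}`).digest("hex");
}

// The string encoded into the QR code: appointment ID plus its hash.
function buildQrPayload(appointmentId: string, citizenId: string): string {
  return `${appointmentId}.${signAppointment(appointmentId, citizenId)}`;
}

// Recomputes the hash from the appointment record and compares in constant time.
function verifyQrPayload(payload: string, citizenIdOnRecord: string): boolean {
  const parts = payload.split(".");
  if (parts.length !== 2) return false;
  const appointmentId = parts[0] ?? "";
  const providedHash = parts[1] ?? "";
  const expected = Buffer.from(signAppointment(appointmentId, citizenIdOnRecord), "hex");
  const provided = Buffer.from(providedHash, "hex");
  if (provided.length !== expected.length) return false;
  return timingSafeEqual(provided, expected);
}
```

&lt;p&gt;The clinic dashboard would look up the appointment by ID, pass the citizen ID stored on that record to &lt;code&gt;verifyQrPayload&lt;/code&gt;, and refuse the scan when it returns &lt;code&gt;false&lt;/code&gt;.&lt;/p&gt;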

&lt;h3&gt;
  
  
  5. Failure &amp;amp; Rollback Scenarios
&lt;/h3&gt;

&lt;p&gt;This is the part I completely missed in my interview. Here's how each failure is handled:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr3cjkoeg7zlwheyxb56x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fr3cjkoeg7zlwheyxb56x.png" alt="Failure and Rollback" width="800" height="427"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No-show&lt;/strong&gt;: A scheduled job checks for &lt;code&gt;CONFIRMED&lt;/code&gt; appointments that have passed their time window. The status becomes &lt;code&gt;NO_SHOW&lt;/code&gt;, and the stock is released back to the pool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Citizen cancels&lt;/strong&gt;: They can cancel through the portal. Stock is released right away.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clinic cancels a slot&lt;/strong&gt; (e.g., not enough staff): All affected appointments are flagged. Citizens get notified and can rebook with priority.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External API is down&lt;/strong&gt; (e.g., insurance service): The system uses a &lt;strong&gt;circuit breaker&lt;/strong&gt; pattern. After several failures in a row, the system stops calling that API temporarily. Meanwhile, the booking is either queued for retry (with increasing wait time between retries) or allowed provisionally with a flag for manual review later. The important thing is: one broken dependency should not block the whole flow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis goes down&lt;/strong&gt;: The system falls back to database-level reservations with a cleanup job. It's slower, but the booking still works.&lt;/li&gt;
&lt;/ul&gt;
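
&lt;p&gt;The circuit breaker and the increasing retry delay can be sketched in a few lines. This is a simplified illustration, not a production library: the class shape, the threshold of 3 failures, the 30-second cooldown, and the backoff numbers are all assumed values.&lt;/p&gt;

```typescript
type BreakerState = "CLOSED" | "OPEN" | "HALF_OPEN";

class CircuitBreaker {
  private state: BreakerState = "CLOSED";
  private consecutiveFailures = 0;
  private openedAt = 0;

  constructor(
    private readonly failureThreshold: number = 3, // assumed value
    private readonly cooldownMs: number = 30000,   // assumed value
    private readonly now: () => number = () => Date.now(),
  ) {}

  // Returns true when a call to the external API may be attempted.
  canRequest(): boolean {
    if (this.state !== "OPEN") return true;
    if (this.now() - this.openedAt >= this.cooldownMs) {
      this.state = "HALF_OPEN"; // cooldown is over, allow probe requests
      return true;
    }
    return false;
  }

  // A successful call closes the breaker and resets the failure count.
  recordSuccess(): void {
    this.consecutiveFailures = 0;
    this.state = "CLOSED";
  }

  // Opens the breaker after enough failures in a row, or when a probe fails.
  recordFailure(): void {
    this.consecutiveFailures += 1;
    if (this.state === "HALF_OPEN" || this.consecutiveFailures >= this.failureThreshold) {
      this.state = "OPEN";
      this.openedAt = this.now();
    }
  }
}

// Exponential backoff for queued retries: 1s, 2s, 4s, ... capped at maxMs.
function retryDelayMs(attempt: number, baseMs: number = 1000, maxMs: number = 60000): number {
  return Math.min(baseMs * Math.pow(2, attempt), maxMs);
}
```

&lt;p&gt;The eligibility step would ask &lt;code&gt;canRequest()&lt;/code&gt; before calling the insurance API. When it returns &lt;code&gt;false&lt;/code&gt;, the booking goes to the retry queue or is allowed provisionally, as described above.&lt;/p&gt;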




&lt;h2&gt;
  
  
  🏗️ System Components
&lt;/h2&gt;

&lt;p&gt;Here's the high-level architecture:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4m9qco7wxsv7x65f6gap.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4m9qco7wxsv7x65f6gap.png" alt="high level architecture" width="800" height="268"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: Booking portal for citizens + Dashboard for clinic staff.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway&lt;/strong&gt;: Authentication, rate limiting (very important during mass booking), and routing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Core Services&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auth Service&lt;/strong&gt;: Login, national ID verification.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Patient Service&lt;/strong&gt;: Medical records, vaccination history.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clinic Service&lt;/strong&gt;: Slot management, staff schedules, capacity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inventory Service&lt;/strong&gt;: Vaccine stock per clinic, batch tracking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Appointment Service&lt;/strong&gt;: The main service. Manages reservations, confirmations, and status changes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eligibility Service&lt;/strong&gt;: Rules engine + external API calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notification Service&lt;/strong&gt;: SMS, email, push. Retries if delivery fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit Service&lt;/strong&gt;: Append-only logs for every status change. Required for government compliance.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Data Layer&lt;/strong&gt;: PostgreSQL for permanent data, Redis for temporary reservations and caching.&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Async Messaging&lt;/strong&gt;: Kafka for events such as &lt;code&gt;AppointmentReserved&lt;/code&gt;, &lt;code&gt;AppointmentConfirmed&lt;/code&gt;, &lt;code&gt;AppointmentAdministered&lt;/code&gt;, and &lt;code&gt;AppointmentCancelled&lt;/code&gt;. This keeps services decoupled and makes the system auditable by default.&lt;/li&gt;

&lt;/ul&gt;
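
&lt;p&gt;The status changes that the Appointment Service manages can be written down as an explicit transition table, with every accepted change appended to a log. This is a sketch under assumptions: &lt;code&gt;PENDING&lt;/code&gt;, &lt;code&gt;CONFIRMED&lt;/code&gt;, &lt;code&gt;ADMINISTERED&lt;/code&gt;, and &lt;code&gt;NO_SHOW&lt;/code&gt; come from the design above, while &lt;code&gt;CANCELLED&lt;/code&gt;, &lt;code&gt;EXPIRED&lt;/code&gt;, the allowed transitions, and the in-memory &lt;code&gt;auditLog&lt;/code&gt; array are hypothetical stand-ins.&lt;/p&gt;

```typescript
// CANCELLED and EXPIRED are assumed names; the other statuses come from the design.
type Status = "PENDING" | "CONFIRMED" | "ADMINISTERED" | "NO_SHOW" | "CANCELLED" | "EXPIRED";

// Which status changes are legal from each status.
const TRANSITIONS: { [from in Status]: readonly Status[] } = {
  PENDING: ["CONFIRMED", "CANCELLED", "EXPIRED"],
  CONFIRMED: ["ADMINISTERED", "NO_SHOW", "CANCELLED"],
  ADMINISTERED: [],
  NO_SHOW: [],
  CANCELLED: [],
  EXPIRED: [],
};

interface AuditEntry {
  appointmentId: string;
  from: Status;
  to: Status;
  at: string; // ISO 8601 timestamp
}

// Stand-in for the append-only audit store: entries are only ever pushed.
const auditLog: AuditEntry[] = [];

// Applies a status change if it is legal and records it; throws otherwise.
function transition(appointmentId: string, from: Status, to: Status, at: Date = new Date()): Status {
  if (TRANSITIONS[from].indexOf(to) === -1) {
    throw new Error(`Illegal transition ${from} -> ${to}`);
  }
  auditLog.push({ appointmentId, from, to, at: at.toISOString() });
  return to;
}
```

&lt;p&gt;Rejecting illegal changes in one place means a bug elsewhere cannot, for example, move an &lt;code&gt;ADMINISTERED&lt;/code&gt; appointment back to &lt;code&gt;PENDING&lt;/code&gt; without leaving a trace.&lt;/p&gt;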




&lt;h2&gt;
  
  
  🎯 What I Would Do Differently Now
&lt;/h2&gt;

&lt;p&gt;Looking back at that interview, the biggest thing I missed was not about technology; it was about &lt;strong&gt;mindset&lt;/strong&gt;. I jumped to the happy path because it felt complete. But the interviewer was not testing if I can design a booking form. They were testing if I can think about what happens when things go wrong.&lt;/p&gt;

&lt;p&gt;Here's what I learned from this experience:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Start with failure scenarios&lt;/strong&gt;, not the happy path. Ask yourself "what can go wrong at each step?" before finalizing any design.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Temporary reservation is a pattern, not a hack&lt;/strong&gt;. Whether it's concert tickets, flash sales, or vaccine slots, if you have limited stock and many users, you need a hold-then-confirm flow.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't be vague about rollbacks&lt;/strong&gt;. "We'll handle errors" is not a design. Be specific about what happens to the data, the stock, and the user when something fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;External services will go down&lt;/strong&gt;. Always have a plan for when the insurance API or notification service is not available. Circuit breakers and retry queues are not optional; they are necessary.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you're preparing for system design interviews, I recommend studying inventory reservation patterns. My earlier post on &lt;a href="https://dev.to/budiwidhiyanto/designing-an-internet-credit-purchase-system-1175"&gt;designing an internet credit purchase system&lt;/a&gt; covers these patterns with more detail and code examples. The core idea (reserve first, verify, then commit) appears in many systems once you start looking.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Thanks for reading. If you faced similar interview questions or have ideas to improve this design, I would like to hear about it in the comments.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>architecture</category>
      <category>interview</category>
      <category>career</category>
    </item>
    <item>
      <title>Data Fetching Patterns Every Developer Should Know (And When to Actually Use Them)</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Sat, 28 Feb 2026 08:56:54 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/data-fetching-patterns-every-developer-should-know-and-when-to-actually-use-them-16j3</link>
      <guid>https://dev.to/budiwidhiyanto/data-fetching-patterns-every-developer-should-know-and-when-to-actually-use-them-16j3</guid>
      <description>&lt;p&gt;About a year ago, I was working on a payment app. Solid architecture, clean API design, decent frontend on paper, everything looked good. But a few months after launch, the ratings started tanking. Users were complaining about slow loads, failed transactions, and the whole thing falling apart on spotty connections.&lt;/p&gt;

&lt;p&gt;I spent three months debugging those performance issues, and the fix wasn't some clever algorithm or a server upgrade. It was rethinking how we fetched data. That's it. Same features, same infrastructure, same design, just smarter data fetching patterns. The app went from 3.2 stars to 4.7, and transaction volume jumped 30% within two months.&lt;/p&gt;

&lt;p&gt;That experience a year ago changed how I think about data flow end-to-end. Most apps don't have a "feature" problem; they have a "how we get data to the screen" problem. And the difference between a mediocre app and a great one often comes down to picking the right data fetching pattern for the right situation.&lt;/p&gt;

&lt;p&gt;Here's everything I learned and wish I'd known sooner.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Basics: Request-Response
&lt;/h2&gt;

&lt;p&gt;This is where everyone starts, and for good reason. You ask the server for something, you wait, you get it back. It's the foundation of HTTP, and it handles the majority of use cases just fine.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fetchUser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/api/users/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
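  &lt;span class="c1"&gt;// fetch() resolves even on 4xx/5xx responses, so check the status explicitly&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Request failed: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;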
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Think of it like ordering at a counter: you place your order, you wait, you get your food. Simple and predictable.&lt;/p&gt;

&lt;p&gt;This works great for standard CRUD operations: loading a user profile, submitting a form, fetching account details on page load. Where it falls apart is when you start chaining multiple requests together. If your page needs data from five endpoints, each one takes 300ms, and each request waits for the previous one, your user is staring at a spinner for 1.5 seconds. That adds up fast.&lt;/p&gt;
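
&lt;p&gt;The arithmetic is easy to reproduce with simulated endpoints. In this sketch, &lt;code&gt;fakeEndpoint&lt;/code&gt; is a hypothetical stand-in for a real request, and the parallel version only applies when the requests do not depend on each other's results.&lt;/p&gt;

```typescript
const delay = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Simulated endpoint: resolves with its own name after `ms` milliseconds.
async function fakeEndpoint(name: string, ms: number) {
  await delay(ms);
  return name;
}

// One request after another: total time is roughly the sum of all latencies.
async function loadSequential(names: string[], ms: number) {
  const results: string[] = [];
  for (const name of names) {
    results.push(await fakeEndpoint(name, ms));
  }
  return results;
}

// All requests started at once: total time is roughly the slowest single request.
async function loadParallel(names: string[], ms: number) {
  return Promise.all(names.map((name) => fakeEndpoint(name, ms)));
}

// Measures how long a loader takes, in milliseconds.
async function elapsedMs(load: () => unknown) {
  const start = Date.now();
  await load();
  return Date.now() - start;
}
```

&lt;p&gt;With five names and 300ms each, &lt;code&gt;elapsedMs(() =&gt; loadSequential(names, 300))&lt;/code&gt; should land near 1.5 seconds, while the &lt;code&gt;loadParallel&lt;/code&gt; version should land near 300ms.&lt;/p&gt;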

&lt;p&gt;The key is recognizing when request-response &lt;em&gt;stops&lt;/em&gt; being enough, which brings us to everything else.&lt;/p&gt;




&lt;h2&gt;
  
  
  Polling: The "Are We There Yet?" Approach
&lt;/h2&gt;

&lt;p&gt;Polling is exactly what it sounds like. You ask the server for updates on a regular interval. Every 5 seconds, every 30 seconds, whatever makes sense for your use case.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pollForUpdates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;intervalId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/updates&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="nf"&gt;updateUI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Polling failed:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Don't forget cleanup&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;intervalId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I've seen polling get a bad reputation, and honestly, sometimes it deserves it. Naive polling hammers your server with requests even when nothing has changed. On mobile, it eats battery life. And you'll always have that gap between intervals where an update sits unseen until the next request.&lt;/p&gt;

&lt;p&gt;But here's the thing: polling is dead simple to implement, works everywhere, and for many use cases (dashboards refreshing every 30 seconds, checking job status on a build pipeline, order tracking) it's perfectly fine. Not everything needs to be real-time. Sometimes "close enough" is the right engineering decision.&lt;/p&gt;

&lt;p&gt;The smarter version is &lt;strong&gt;long polling&lt;/strong&gt;, where the server holds the connection open until it actually has something to send back. It's a nice middle ground before committing to WebSockets.&lt;/p&gt;




&lt;h2&gt;
  
  
  WebSockets: When You Need Actual Real-Time
&lt;/h2&gt;

&lt;p&gt;WebSockets maintain a persistent, two-way connection between the client and server. Unlike polling, neither side has to ask: data flows in both directions whenever either side has something to say.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;socket&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;WebSocket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;wss://example.com/socket&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onopen&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Connected&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onmessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;updateUI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;socket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onclose&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// You'll want reconnection logic here, because connections drop&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Disconnected&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is what powers chat apps, multiplayer games, collaborative editors like Google Docs, and trading platforms where milliseconds matter. If your users need to see changes the moment they happen, and especially if they need to send data back frequently, WebSockets are the right call.&lt;/p&gt;

&lt;p&gt;The tradeoff is complexity. You need to handle reconnections (connections &lt;em&gt;will&lt;/em&gt; drop). You need to think about scaling, because every connected user holds an open connection on your server. You need to deal with authentication differently than with regular HTTP. It's not hard, but it's more surface area than a simple fetch call.&lt;/p&gt;

&lt;p&gt;My rule of thumb: if you're polling more than once every 5 seconds, it's probably time to consider WebSockets.&lt;/p&gt;




&lt;h2&gt;
  
  
  Server-Sent Events: Real-Time's Simpler Cousin
&lt;/h2&gt;

&lt;p&gt;SSE is the pattern I wish more developers knew about. It's a one-way channel: the server pushes updates to the client over a long-lived HTTP connection. No polling, no WebSocket complexity.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;eventSource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;EventSource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/stream&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;eventSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onmessage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;updateUI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nx"&gt;eventSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onerror&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
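  &lt;span class="c1"&gt;// Note: close() also turns off the browser's automatic reconnection,&lt;/span&gt;
  &lt;span class="c1"&gt;// so call it only when you want to stop listening for good&lt;/span&gt;
  &lt;span class="nx"&gt;eventSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;close&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;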
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;See how much simpler that is compared to WebSockets? And you get automatic reconnection for free: the browser handles it.&lt;/p&gt;

&lt;p&gt;SSE is perfect for notifications, live sports scores, progress bars for long-running tasks (think file processing or deployment pipelines), newsfeeds, and anything where the server is doing the talking and the client is just listening.&lt;/p&gt;

&lt;p&gt;The limitation is right there in the name: &lt;em&gt;server-sent&lt;/em&gt;. If your client needs to send data back frequently, SSE isn't enough. But for a surprising number of "real-time" features, one-way is all you need.&lt;/p&gt;




&lt;h2&gt;
  
  
  Caching: Making Your App Feel Instant
&lt;/h2&gt;

&lt;p&gt;Caching is less of a fetching pattern and more of a fetching &lt;em&gt;strategy&lt;/em&gt; that you layer on top of other patterns. The idea is simple: store data you've already fetched so you don't have to fetch it again.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useQuery&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react-query&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useQuery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;staleTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;cacheTime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Libraries like React Query and SWR have made caching dramatically easier. They handle stale-while-revalidate (show cached data immediately, then refresh in the background), cache invalidation, and deduplication of simultaneous requests.&lt;/p&gt;
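
&lt;p&gt;To make the &lt;code&gt;staleTime&lt;/code&gt; idea concrete, here is a minimal sketch of the freshness check behind stale-while-revalidate. The names &lt;code&gt;writeCache&lt;/code&gt; and &lt;code&gt;readCache&lt;/code&gt; are hypothetical, and this is not how React Query or SWR work internally; it only illustrates "serve what you have, refetch if it is old". The current time is passed in explicitly so the behavior is easy to follow.&lt;/p&gt;

```typescript
// Hypothetical helpers illustrating a staleTime check (an assumption, not a library API).
type CacheEntry = { value: string; fetchedAt: number };
type CacheStore = { [key: string]: CacheEntry };

// Store a value together with the time it was fetched.
function writeCache(store: CacheStore, key: string, value: string, nowMs: number): void {
  store[key] = { value, fetchedAt: nowMs };
}

// Return the cached value (if any) and whether a background refetch is due.
function readCache(store: CacheStore, key: string, staleTimeMs: number, nowMs: number) {
  const entry = store[key];
  if (entry === undefined) {
    // Nothing cached yet: the caller has to fetch before it can render.
    return { value: undefined, shouldRefetch: true };
  }
  const ageMs = nowMs - entry.fetchedAt;
  // Stale data is still returned so the UI can render immediately.
  return { value: entry.value, shouldRefetch: ageMs > staleTimeMs };
}

const FIVE_MINUTES = 5 * 60 * 1000;
const store: CacheStore = {};
writeCache(store, 'user', 'Ada', 0);

const fresh = readCache(store, 'user', FIVE_MINUTES, 60 * 1000);      // 1 minute old
const stale = readCache(store, 'user', FIVE_MINUTES, 10 * 60 * 1000); // 10 minutes old
console.log(fresh.shouldRefetch, stale.shouldRefetch);
```

&lt;p&gt;Both reads return the cached value; only the second one asks for a refetch. That split between "what to show now" and "whether to refresh" is the core of the pattern.&lt;/p&gt;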

&lt;p&gt;The impact is hard to overstate. When a user navigates to a page they've already visited and the content appears &lt;em&gt;immediately&lt;/em&gt; while a background refresh happens silently, that's the kind of thing that makes an app feel native-quality.&lt;/p&gt;

&lt;p&gt;The classic challenge is cache invalidation (there's a reason Phil Karlton called it one of the two hard things in computer science). You have to decide: how long is cached data acceptable? What events should invalidate the cache? What happens when two tabs have different cached versions? These are solvable problems, but they require deliberate thinking.&lt;/p&gt;




&lt;h2&gt;
  
  
  Lazy Loading: Don't Fetch What You Don't Need Yet
&lt;/h2&gt;

&lt;p&gt;The fastest network request is the one you never make. Lazy loading defers fetching until the user actually needs the data, typically triggered by scrolling, clicking a tab, or navigating to a new section.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;loadMore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isNearBottom&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`/api/items?page=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;nextPage&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newItems&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nf"&gt;setItems&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;[...&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;newItems&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="nf"&gt;setNextPage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;prev&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="nb"&gt;window&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;scroll&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;loadMore&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You see this everywhere: infinite scroll on social feeds, images loading as you scroll past them, tabs that only fetch their content when clicked. It makes initial page loads fast because you're only loading what's visible.&lt;/p&gt;

&lt;p&gt;The gotchas are UX-related. Infinite scroll can make it impossible for users to reach the footer. Loading new content can cause layout shifts that make users lose their place. And for accessibility, you need to make sure screen readers can navigate lazy-loaded content properly.&lt;/p&gt;

&lt;p&gt;For very large lists (thousands of items), pair lazy loading with virtualization: only render the DOM elements that are visible in the viewport. Libraries like &lt;code&gt;react-window&lt;/code&gt; or &lt;code&gt;@tanstack/react-virtual&lt;/code&gt; make this manageable.&lt;/p&gt;
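
&lt;p&gt;The arithmetic behind virtualization is small enough to sketch. The helper below is hypothetical (it is not taken from either library) and assumes every row has the same fixed height; it works out which rows intersect the viewport, plus a few extra "overscan" rows so fast scrolling doesn't show blank space.&lt;/p&gt;

```typescript
// Hypothetical helper: which rows of a fixed-height list should be rendered?
// Assumes all rows share the same height (variable heights need a measured offset table).
function visibleRange(
  scrollTop: number,
  viewportHeight: number,
  rowHeight: number,
  totalRows: number,
  overscan: number
) {
  // Index of the first row that intersects the top of the viewport.
  const firstVisible = Math.floor(scrollTop / rowHeight);
  // Index of the last row that intersects the bottom of the viewport.
  const lastVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;

  return {
    first: Math.max(0, firstVisible - overscan),
    last: Math.min(totalRows - 1, lastVisible + overscan),
  };
}

// 10,000 rows of 50px each, a 600px viewport scrolled down by 1000px.
const range = visibleRange(1000, 600, 50, 10000, 2);
// range is { first: 18, last: 33 }: 16 rows rendered instead of 10,000.
console.log(range);
```

&lt;p&gt;Everything outside that range is replaced by empty space of the right height, so the scrollbar still behaves as if the full list were there.&lt;/p&gt;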




&lt;h2&gt;
  
  
  Background Sync: Building for the Real World
&lt;/h2&gt;

&lt;p&gt;This one is close to my heart because it solved the biggest pain point in that payment app I worked on last year. Background sync lets users take actions (send a message, submit a form, record a transaction) even when they're offline. The operations get queued and processed automatically when connectivity returns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Service Worker&lt;/span&gt;
&lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sync&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tag&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sync-transactions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;waitUntil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;processQueuedTransactions&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Application code&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;recordTransaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;saveToLocalQueue&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Show the transaction in the UI immediately&lt;/span&gt;
  &lt;span class="nf"&gt;updateUIOptimistically&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;serviceWorker&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;registration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;serviceWorker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;registration&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sync&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;register&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sync-transactions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern is essential for mobile apps used in areas with unreliable connections: field service apps, delivery tracking, healthcare in rural areas, anything where you can't assume a stable connection.&lt;/p&gt;

&lt;p&gt;The complexity lives in conflict resolution. What happens if two offline users edit the same record? What if the server rejects a queued operation? You need clear strategies for these cases, and they're not always straightforward. But the user experience improvement is massive. Going from "you can't do anything without internet" to "everything just works, and syncs when it can" is a night-and-day difference.&lt;/p&gt;
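
&lt;p&gt;One way to handle the "server rejects a queued operation" case is to treat rejection and loss of connectivity differently when draining the queue. The sketch below is an assumption about what a function like &lt;code&gt;processQueuedTransactions&lt;/code&gt; might do, not the actual implementation from the app. To keep it short, &lt;code&gt;send&lt;/code&gt; is a synchronous stand-in for a network call, and &lt;code&gt;drainQueue&lt;/code&gt;, &lt;code&gt;QueuedTx&lt;/code&gt;, and &lt;code&gt;SendResult&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```typescript
// Hypothetical types: a queued operation and the possible outcomes of sending it.
type QueuedTx = { id: string; amount: number };
type SendResult = 'ok' | 'rejected' | 'offline';

// Drain the queue in order. Rejected items are set aside for the user to review;
// once the connection drops, everything left stays queued for the next sync.
function drainQueue(queue: QueuedTx[], send: (tx: QueuedTx) => SendResult) {
  const sent: QueuedTx[] = [];
  const rejected: QueuedTx[] = [];
  const remaining: QueuedTx[] = [];
  let offline = false;

  for (const tx of queue) {
    if (offline) {
      remaining.push(tx); // do not attempt further sends while offline
      continue;
    }
    const result = send(tx);
    if (result === 'ok') {
      sent.push(tx);
    } else if (result === 'rejected') {
      rejected.push(tx); // retrying will not help; surface it in the UI instead
    } else {
      offline = true;
      remaining.push(tx); // keep for the next sync event
    }
  }
  return { sent, rejected, remaining };
}

// Stand-in for the network: 'b' is rejected, and the connection drops at 'c'.
function fakeSend(tx: QueuedTx): SendResult {
  if (tx.id === 'b') return 'rejected';
  if (tx.id === 'c') return 'offline';
  return 'ok';
}

const summary = drainQueue(
  [{ id: 'a', amount: 10 }, { id: 'b', amount: 20 }, { id: 'c', amount: 30 }, { id: 'd', amount: 40 }],
  fakeSend
);
console.log(summary);
```

&lt;p&gt;The useful property is that a rejected transaction never blocks the ones behind it, while a dropped connection preserves ordering for the next attempt. Whatever you do with the rejected list (roll back the optimistic UI, prompt the user) is a product decision.&lt;/p&gt;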




&lt;h2&gt;
  
  
  Batch Fetching: One Trip Instead of Ten
&lt;/h2&gt;

&lt;p&gt;If your page makes 8 separate API calls to render, something is probably wrong. Batch fetching combines multiple requests into a single network call.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Instead of this:&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/users/1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;posts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/users/1/posts&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;notifications&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/users/1/notifications&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Do this:&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;dashboard&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/dashboard?userId=1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// Returns user, posts, and notifications in one response&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The savings come from reducing HTTP overhead: connection setup, headers, TLS handshakes. On mobile networks with high latency, the difference between one request and ten is very noticeable.&lt;/p&gt;

&lt;p&gt;The downside is coupling. When you batch things together, you can't cache or invalidate them independently. If the notification data changes every 30 seconds but user profile data changes once a month, batching them means either over-fetching profile data or under-fetching notifications. You have to think about which data actually belongs together.&lt;/p&gt;




&lt;h2&gt;
  
  
  GraphQL: Ask for Exactly What You Need
&lt;/h2&gt;

&lt;p&gt;GraphQL flips the traditional REST model. Instead of the server deciding what data each endpoint returns, the client specifies exactly what it needs.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`
  query GetUser($id: ID!) {
    user(id: $id) {
      name
      email
      posts(last: 5) {
        title
        preview
      }
    }
  }
`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/graphql&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;POST&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;variables&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;123&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With REST, a mobile app and a desktop app hitting the same &lt;code&gt;/api/user&lt;/code&gt; endpoint get the same response, even if the mobile app only needs the name and avatar while the desktop app needs the full profile. GraphQL eliminates that mismatch. Each client asks for exactly what it needs.&lt;/p&gt;

&lt;p&gt;This matters most when you have multiple clients with different data requirements, deeply nested data relationships, or you're tired of creating one-off REST endpoints for every new screen design.&lt;/p&gt;

&lt;p&gt;The investment is real, though. You need a GraphQL server, a schema, resolvers, and your team needs to learn a new paradigm. Caching is trickier than REST because everything goes through a single endpoint. And poorly written queries can cause serious performance issues on the backend (the N+1 query problem is very real with GraphQL).&lt;/p&gt;
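
&lt;p&gt;The N+1 shape is easy to see with a counter. This sketch uses a hypothetical in-memory &lt;code&gt;postsTable&lt;/code&gt; and a simulated &lt;code&gt;queryPostsByAuthors&lt;/code&gt; call rather than a real GraphQL server or database. It compares a resolver-style approach that queries once per user with a batched approach that queries once for all users, which is roughly the idea behind DataLoader-style batching.&lt;/p&gt;

```typescript
// Hypothetical in-memory data standing in for a posts table.
type Post = { authorId: number; title: string };
const postsTable: Post[] = [
  { authorId: 1, title: 'Polling basics' },
  { authorId: 2, title: 'SSE in practice' },
  { authorId: 1, title: 'Cache invalidation' },
  { authorId: 3, title: 'BFF trade-offs' },
];

// Counts simulated database round trips.
let queryCount = 0;

// Simulated database call: one round trip per invocation, however many ids it covers.
function queryPostsByAuthors(authorIds: number[]): Post[] {
  queryCount += 1;
  return postsTable.filter((post) => authorIds.indexOf(post.authorId) !== -1);
}

// Naive resolver shape: one query per user (the "N" in N+1).
function postsPerUserNaive(userIds: number[]): Post[][] {
  return userIds.map((id) => queryPostsByAuthors([id]));
}

// Batched shape: one query for every user, then group the rows in memory.
function postsPerUserBatched(userIds: number[]): Post[][] {
  const allPosts = queryPostsByAuthors(userIds);
  return userIds.map((id) => allPosts.filter((post) => post.authorId === id));
}

const naiveResult = postsPerUserNaive([1, 2, 3]);
const naiveQueries = queryCount;

queryCount = 0;
const batchedResult = postsPerUserBatched([1, 2, 3]);
const batchedQueries = queryCount;

console.log(naiveQueries, batchedQueries);
```

&lt;p&gt;Both functions return the same posts, but the naive version's query count grows with the number of users while the batched version stays at one.&lt;/p&gt;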

&lt;p&gt;For smaller apps with a single client, REST with good API design is usually simpler and sufficient.&lt;/p&gt;




&lt;h2&gt;
  
  
  Federated Fetching: Unifying Microservices
&lt;/h2&gt;

&lt;p&gt;In microservice architectures, the data a single page needs might live across five different services. Federated fetching, usually through a BFF (Backend-For-Frontend) layer or API gateway, aggregates that data so the client makes one clean request.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// BFF endpoint&lt;/span&gt;
&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/api/dashboard/:userId&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;activity&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`http://user-service/users/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`http://account-service/accounts?userId=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`http://activity-service/recent?userId=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;accounts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="na"&gt;recentActivity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;activity&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()).&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The BFF pattern is a lifesaver in complex systems. Instead of the frontend knowing about every microservice and making separate calls to each, it talks to one unified API that handles the orchestration. The frontend stays clean, and you can tailor responses to what each client actually needs.&lt;/p&gt;

&lt;p&gt;The downside is obvious: you're adding another service to build, deploy, and maintain. And if your BFF goes down, everything goes down. It's a pattern that makes sense at a certain scale, but it's overkill for smaller applications.&lt;/p&gt;




&lt;h2&gt;
  
  
  Combining Patterns: Where It Gets Interesting
&lt;/h2&gt;

&lt;p&gt;No real application uses just one pattern. The interesting decisions happen when you combine them. Here's what that looks like in practice:&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;messaging app&lt;/strong&gt; might use WebSockets for incoming messages, background sync for sending messages in poor connectivity, caching for conversation history, and lazy loading for scrolling through older messages.&lt;/p&gt;

&lt;p&gt;An &lt;strong&gt;e-commerce app&lt;/strong&gt; might use request-response for search, caching for product pages, SSE for inventory availability, and batch fetching for the cart summary.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;trading platform&lt;/strong&gt; might use WebSockets for live prices, polling as a fallback, GraphQL for portfolio data, and caching for historical charts.&lt;/p&gt;

&lt;p&gt;The point is to match each data need to the pattern that best serves it. Not every piece of data on a screen has the same freshness requirements, the same access patterns, or the same tolerance for latency.&lt;/p&gt;




&lt;h2&gt;
  
  
  Quick Reference
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Pattern&lt;/th&gt;
&lt;th&gt;Use When&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;th&gt;Offline Support&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Request-Response&lt;/td&gt;
&lt;td&gt;Standard CRUD, simple pages&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Polling&lt;/td&gt;
&lt;td&gt;Periodic updates, status checks&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebSockets&lt;/td&gt;
&lt;td&gt;Two-way real-time (chat, collaboration)&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server-Sent Events&lt;/td&gt;
&lt;td&gt;One-way real-time (notifications, feeds)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Caching&lt;/td&gt;
&lt;td&gt;Repeated data access, speed matters&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;td&gt;Partial&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lazy Loading&lt;/td&gt;
&lt;td&gt;Large lists, heavy initial loads&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Background Sync&lt;/td&gt;
&lt;td&gt;Offline-first, unreliable connections&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Batch Fetching&lt;/td&gt;
&lt;td&gt;Multiple related data needs&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;GraphQL&lt;/td&gt;
&lt;td&gt;Complex/varied data requirements&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Federated Fetching&lt;/td&gt;
&lt;td&gt;Microservices, unified APIs&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  Final Thoughts
&lt;/h2&gt;

&lt;p&gt;A year ago, when that payment app went from 3.2 stars to 4.7, we didn't add a single new feature. We just changed &lt;em&gt;how&lt;/em&gt; existing features got their data. Caching made it feel instant. Background sync made it work offline. WebSockets made payments confirm in real-time. Batch fetching cut load times by 80%.&lt;/p&gt;

&lt;p&gt;Looking back, that project taught me something I keep coming back to: users don't care about your architecture. They care that things are fast, reliable, and don't waste their time. Data fetching patterns, on both the backend and the frontend, are how you deliver on that promise.&lt;/p&gt;

&lt;p&gt;Pick the right pattern for each situation. Combine them thoughtfully. And when your app ratings start climbing, you'll know why.&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>architecture</category>
      <category>fintech</category>
      <category>discuss</category>
    </item>
    <item>
      <title>How a "Simple" QR Code Generator Ate All My RAM: A Tale of 50,000 QR Codes</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Tue, 24 Feb 2026 02:43:04 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/how-a-simple-qr-code-generator-ate-all-my-ram-a-tale-of-50000-qr-codes-1nkg</link>
      <guid>https://dev.to/budiwidhiyanto/how-a-simple-qr-code-generator-ate-all-my-ram-a-tale-of-50000-qr-codes-1nkg</guid>
      <description>&lt;p&gt;&lt;em&gt;Sometimes the simplest tasks can become the biggest headaches. Here's how I learned that data size matters more than code complexity.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Innocent Beginning
&lt;/h2&gt;

&lt;p&gt;It started with a straightforward request: generate 50,000 unique QR codes for a project. "How hard could it be?" I thought. Python has excellent libraries for this. A quick script, a PDF output, done by lunch.&lt;/p&gt;

&lt;p&gt;I was wrong. Very wrong.&lt;/p&gt;

&lt;p&gt;What I didn't anticipate was that my "simple" script would consume every byte of RAM on my machine, freeze my computer, and teach me an important lesson about thinking at scale.&lt;/p&gt;

&lt;p&gt;Let me walk you through what happened, how I fixed it, and what you can learn from my mistakes.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Original Approach: Looks Good on Paper
&lt;/h2&gt;

&lt;p&gt;Here's the approach I initially took. Generate all the QR codes first, cache them in memory, then write them to a PDF. It sounds logical, right? Pre-compute everything, then assemble the final output.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_unique_ids&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Pre-generate ALL QR codes in parallel for "speed"
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pre-generating &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; QR codes in parallel...&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;num_workers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;cpu_count&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Split IDs into batches for parallel processing
&lt;/span&gt;    &lt;span class="n"&gt;batch_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_workers&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;batches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="c1"&gt;# Generate QR codes in parallel using multiprocessing
&lt;/span&gt;    &lt;span class="n"&gt;qr_cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;Pool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_workers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;imap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;generate_qr_batch&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;batches&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batches&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generating QR codes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Store ALL images in memory
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;batch_result&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;img_bytes&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;batch_result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;io&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;BytesIO&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img_bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;qr_cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ImageReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# NOW create the PDF using cached images
&lt;/span&gt;    &lt;span class="c1"&gt;# ... PDF generation code ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I was proud of this code. Multiprocessing! Parallel execution! Batch processing! All the buzzwords that make you feel like a "real" programmer.&lt;/p&gt;

&lt;p&gt;Then I ran it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Disaster Unfolds
&lt;/h2&gt;

&lt;p&gt;The script started running. Progress bars moved. CPU usage spiked to 100% across all cores. "Excellent," I thought, "parallel processing doing its thing."&lt;/p&gt;

&lt;p&gt;Then I noticed my system getting sluggish. Browser tabs stopped responding. My IDE froze. I opened the system monitor and watched in horror as my RAM usage climbed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2 GB...&lt;/li&gt;
&lt;li&gt;4 GB...&lt;/li&gt;
&lt;li&gt;8 GB...&lt;/li&gt;
&lt;li&gt;12 GB...&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;My laptop has 16 GB of RAM. The script was devouring it all. Before I could react, the OOM (Out of Memory) killer struck. Process terminated. No PDF. Just a frozen computer and a lesson learned the hard way.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding the Problem
&lt;/h2&gt;

&lt;p&gt;After my system recovered, I sat down to analyze what went wrong. Let me break down the math:&lt;/p&gt;

&lt;p&gt;Each QR code image:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Resolution: 400 × 400 pixels&lt;/li&gt;
&lt;li&gt;Format: PNG in memory&lt;/li&gt;
&lt;li&gt;Approximate size: 15-30 KB per image (compressed)&lt;/li&gt;
&lt;li&gt;But in memory as a PIL Image object: ~500 KB - 1 MB&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Scale it up:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50,000 QR codes × ~500 KB = ~25 GB of RAM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Even with the compressed PNG byte representation, we're looking at:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;50,000 × 20 KB = ~1 GB just for the image bytes&lt;/li&gt;
&lt;li&gt;Plus the ImageReader objects&lt;/li&gt;
&lt;li&gt;Plus the BytesIO buffers&lt;/li&gt;
&lt;li&gt;Plus Python's memory overhead&lt;/li&gt;
&lt;li&gt;Plus multiprocessing duplicating data across workers&lt;/li&gt;
&lt;/ul&gt;
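
&lt;p&gt;The back-of-the-envelope numbers above can be reproduced in a few lines of Python. This is only a sketch: it assumes each decoded image is held as uncompressed RGB (3 bytes per pixel), which is one way to arrive at a figure close to 500 KB, and it takes 20 KB as the midpoint for a compressed PNG. The helper name &lt;code&gt;raw_image_bytes&lt;/code&gt; is made up for this example.&lt;/p&gt;

```python
# Rough memory estimate for 50,000 QR codes at 400 x 400 pixels.
# Assumption: decoded images are uncompressed RGB, 3 bytes per pixel.


def raw_image_bytes(width, height, bytes_per_pixel=3):
    """Size of the uncompressed pixel data for one image, in bytes."""
    return width * height * bytes_per_pixel


TOTAL_CODES = 50_000
PNG_BYTES = 20 * 1000  # assumed midpoint of the 15-30 KB range

per_image = raw_image_bytes(400, 400)    # 480,000 bytes, close to 500 KB
decoded_total = per_image * TOTAL_CODES  # every decoded image held at once
png_total = PNG_BYTES * TOTAL_CODES      # compressed bytes only

print(f"one decoded image: {per_image / 1000:.0f} KB")
print(f"all decoded images: {decoded_total / 1e9:.1f} GB")
print(f"all PNG bytes: {png_total / 1e9:.1f} GB")
```

&lt;p&gt;Under these assumptions the decoded total lands near 24 GB, in line with the ~25 GB estimate, while the compressed bytes alone account for about 1 GB.&lt;/p&gt;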

&lt;p&gt;The actual memory consumption was somewhere between 2 and 4 GB, far more than should be acceptable for such a "simple" task.&lt;/p&gt;

&lt;p&gt;The fundamental flaw in my approach was this: I was optimizing for speed when I should have been optimizing for resource consumption.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Fix: Think Like a Stream, Not a Lake
&lt;/h2&gt;

&lt;p&gt;The solution was embarrassingly simple once I understood the problem. Instead of loading all 50,000 QR codes into memory at once (a "lake" of data), I needed to process them as a stream—one page at a time.&lt;/p&gt;

&lt;p&gt;Here's the key insight: A PDF with 50,000 QR codes has about 1,667 pages (30 QR codes per page). I only need to hold 30 QR codes in memory at any given time—the ones for the current page.&lt;/p&gt;

&lt;p&gt;Here's the refactored approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;generate_pdf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generate_unique_ids&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;total_pages&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;PER_PAGE&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;PER_PAGE&lt;/span&gt;

    &lt;span class="c1"&gt;# Create PDF canvas
&lt;/span&gt;    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;canvas&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Canvas&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;pagesize&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;A4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Process ONE PAGE at a time
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page_start&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PER_PAGE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generating PDF pages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;page_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;page_start&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;page_start&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;PER_PAGE&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Generate QR codes ONLY for this page
&lt;/span&gt;        &lt;span class="n"&gt;page_qr_cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;uid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;page_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;img&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;make_qr_image&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;page_qr_cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;img_to_reader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;img&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Draw this page
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_ids&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="c1"&gt;# ... draw QR code to PDF ...
&lt;/span&gt;            &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;drawImage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page_qr_cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;qr_x&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qr_y&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...)&lt;/span&gt;

        &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;showPage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="c1"&gt;# CRITICAL: Clear the cache after each page!
&lt;/span&gt;        &lt;span class="n"&gt;page_qr_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;save&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key changes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Generate per-page: Only create QR codes for the 30 items on the current page&lt;/li&gt;
&lt;li&gt;Clear after use: Explicitly clear the page cache after each page is written&lt;/li&gt;
&lt;li&gt;No multiprocessing overhead: Removed the parallel processing that was duplicating data&lt;/li&gt;
&lt;/ol&gt;
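
&lt;p&gt;To see why this keeps memory flat, here is a stripped-down sketch of the same loop with the PDF and QR libraries replaced by a stand-in. &lt;code&gt;fake_render&lt;/code&gt; and &lt;code&gt;stream_pages&lt;/code&gt; are hypothetical names used only for illustration; the point is that the cache never holds more than one page of images.&lt;/p&gt;

```python
PER_PAGE = 30


def fake_render(uid):
    """Stand-in for QR rendering: returns a small bytes payload."""
    return f"qr:{uid}".encode()


def stream_pages(ids, per_page=PER_PAGE):
    """Render one page at a time; return (page_count, largest_cache_size)."""
    pages = 0
    largest_cache = 0
    for page_start in range(0, len(ids), per_page):
        page_ids = ids[page_start:page_start + per_page]
        # Only this page's images exist at this point.
        page_cache = {uid: fake_render(uid) for uid in page_ids}
        largest_cache = max(largest_cache, len(page_cache))
        # ... a real script would draw page_cache onto the PDF page here ...
        pages += 1
        page_cache.clear()  # drop the images before the next page
    return pages, largest_cache


pages, largest_cache = stream_pages(list(range(50_000)))
print(pages, largest_cache)
```

&lt;p&gt;For 50,000 IDs this reports 1,667 pages and a cache that never grows past 30 entries, no matter how large the total becomes.&lt;/p&gt;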

&lt;h2&gt;
  
  
  The Trade-off: Speed vs. Safety
&lt;/h2&gt;

&lt;p&gt;Let's be honest about the trade-offs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Original (Parallel)&lt;/th&gt;
&lt;th&gt;Optimized (Per-Page)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Memory Usage&lt;/td&gt;
&lt;td&gt;2-4 GB&lt;/td&gt;
&lt;td&gt;50-100 MB&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Speed&lt;/td&gt;
&lt;td&gt;Faster (theoretically)&lt;/td&gt;
&lt;td&gt;Slower&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Stability&lt;/td&gt;
&lt;td&gt;Crashes on large datasets&lt;/td&gt;
&lt;td&gt;Stable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Limited by RAM&lt;/td&gt;
&lt;td&gt;Limited by disk space&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Yes, the optimized version is slower. Without parallel processing, we're generating QR codes sequentially. For 50,000 codes, the execution time went from "crash before completion" to "about 30-45 minutes of stable execution."&lt;/p&gt;

&lt;p&gt;But here's the thing: a slow script that completes is infinitely faster than a fast script that crashes.&lt;/p&gt;

&lt;p&gt;I ran the optimized version overnight. When I woke up, both PDF files (100,000 QR codes total) were sitting there, ready to use. My computer was fine. No crashes. No freezing. Just steady, predictable progress.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons Learned
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Data Size Changes Everything
&lt;/h3&gt;

&lt;p&gt;A script that works perfectly for 100 items might explode at 10,000 items. Always ask yourself: "What happens when this scales 10x? 100x? 1000x?"&lt;/p&gt;

&lt;p&gt;In my case, the script probably worked fine during testing with small batches. It was only at production scale that the memory issue became catastrophic.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Memory is Not Infinite
&lt;/h3&gt;

&lt;p&gt;This sounds obvious, but it's easy to forget when you're writing code. Every object you create lives somewhere in memory. When you're dealing with images, those objects can be surprisingly large.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# This innocent-looking line...
&lt;/span&gt;&lt;span class="n"&gt;qr_cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;uid&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;ImageReader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# ...executed 50,000 times becomes a memory bomb
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Parallel ≠ Better
&lt;/h3&gt;

&lt;p&gt;Parallel processing is great for CPU-bound tasks where you have enough memory to support multiple workers. But when each worker is creating large objects, parallelism can actually make things worse by multiplying memory usage.&lt;/p&gt;

&lt;p&gt;Sometimes, a simple sequential loop is the right answer.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. Clear Your References
&lt;/h3&gt;

&lt;p&gt;Python's garbage collector is good, but it's not magic. As long as a dictionary or list holds references to large objects, that memory can't be freed; it is only reclaimed once those references are removed or go out of scope.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# This single line saved gigabytes of RAM
&lt;/span&gt;&lt;span class="n"&gt;page_qr_cache&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;clear&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
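
&lt;p&gt;One way to convince yourself of this is with a weak reference, which lets you observe an object without keeping it alive. The sketch below assumes CPython, where reference counting frees an object as soon as its last strong reference goes away; &lt;code&gt;FakeImage&lt;/code&gt; is a placeholder class, not part of any library.&lt;/p&gt;

```python
import weakref


class FakeImage:
    """Placeholder for a large image object."""

    def __init__(self, size):
        self.data = bytearray(size)


cache = {}
image = FakeImage(1_000_000)
cache["uid-1"] = image

watcher = weakref.ref(image)  # observes the object without keeping it alive
del image                     # the dictionary still holds a strong reference

alive_while_cached = watcher() is not None
cache.clear()                 # last strong reference removed
alive_after_clear = watcher() is not None

print(alive_while_cached, alive_after_clear)
```

&lt;p&gt;The object survives for as long as the dictionary refers to it and is released once the dictionary is cleared. Other Python implementations may delay the release until their collector runs.&lt;/p&gt;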



&lt;h3&gt;
  
  
  5. Progress Bars Are Your Friend
&lt;/h3&gt;

&lt;p&gt;When you're running long-running tasks, always add progress bars. The &lt;code&gt;tqdm&lt;/code&gt; library makes this trivially easy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page_start&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;tqdm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PER_PAGE&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;desc&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Generating PDF pages&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# ... your code ...
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not only does this give you feedback on how long the task will take, but it also helps you identify when something is wrong. If the progress bar stalls, you know there's a problem.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Bigger Picture: Thinking About Resources
&lt;/h2&gt;

&lt;p&gt;This experience changed how I approach coding problems. Now, before I write any code that deals with data at scale, I ask myself three questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What's the memory footprint per item?&lt;/li&gt;
&lt;li&gt;How many items will I process?&lt;/li&gt;
&lt;li&gt;Can I process items one at a time instead of all at once?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is especially important in scenarios like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Image processing: Images are memory-hungry&lt;/li&gt;
&lt;li&gt;Data pipelines: Processing large CSV/JSON files&lt;/li&gt;
&lt;li&gt;API responses: Paginating through thousands of records&lt;/li&gt;
&lt;li&gt;File operations: Reading/writing large files&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The pattern is always the same: stream when you can, batch when you must, and never load everything into memory unless you absolutely have to.&lt;/p&gt;
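
&lt;p&gt;As a small illustration of the streaming half of that rule, the sketch below sums one column of a CSV file while holding only a single row in memory. The column names and the &lt;code&gt;total_amount&lt;/code&gt; helper are invented for the example, and &lt;code&gt;io.StringIO&lt;/code&gt; stands in for a large file on disk.&lt;/p&gt;

```python
import csv
import io


def total_amount(lines):
    """Sum the 'amount' column while holding only one row at a time."""
    reader = csv.DictReader(lines)
    total = 0
    for row in reader:
        total += int(row["amount"])
    return total


# io.StringIO stands in for a large file opened with open(path, newline="").
sample = io.StringIO("item,amount\nA,10\nB,25\nC,7\n")
print(total_amount(sample))
```

&lt;p&gt;Because the reader pulls rows lazily from the file object, the same function should behave the same way on a three-line file and on a multi-gigabyte one.&lt;/p&gt;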

&lt;h2&gt;
  
  
  Practical Tips for Your Own Projects
&lt;/h2&gt;

&lt;p&gt;If you're working on a similar task—generating large numbers of images, processing big datasets, or handling any kind of bulk operation—here are some practical tips:&lt;/p&gt;

&lt;h3&gt;
  
  
  Use Generators Instead of Lists
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Bad: Creates a list of 50,000 items in memory
&lt;/span&gt;&lt;span class="n"&gt;ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;generate_id&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50000&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

&lt;span class="c1"&gt;# Better: Generates one at a time
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;id_generator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="nf"&gt;generate_id&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
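
&lt;p&gt;A generator cannot be sliced the way a list can, so per-page processing needs a slightly different grouping step. The sketch below shows one way to do it with &lt;code&gt;itertools.islice&lt;/code&gt;. The &lt;code&gt;pages_of&lt;/code&gt; helper and the ID format are assumptions made for this example, not part of the original script.&lt;/p&gt;

```python
from itertools import islice


def id_generator(count):
    """Yield placeholder IDs lazily (stand-in for generate_id())."""
    for n in range(count):
        yield f"ID-{n:05d}"


def pages_of(iterable, per_page):
    """Group any iterable into lists of at most per_page items."""
    iterator = iter(iterable)
    while True:
        page = list(islice(iterator, per_page))
        if not page:
            return
        yield page


page_sizes = [len(page) for page in pages_of(id_generator(70), 30)]
print(page_sizes)
```

&lt;p&gt;Only one page of IDs exists at a time. On Python 3.12 and later, &lt;code&gt;itertools.batched&lt;/code&gt; provides similar grouping out of the box (it yields tuples rather than lists).&lt;/p&gt;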



&lt;h3&gt;
  
  
  Process in Chunks
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;gc&lt;/span&gt;

&lt;span class="c1"&gt;# Instead of processing all at once
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;huge_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Process in manageable chunks
&lt;/span&gt;&lt;span class="n"&gt;chunk_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;huge_list&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;chunk&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;huge_list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;chunk_size&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c1"&gt;# Clean up after each chunk
&lt;/span&gt;    &lt;span class="n"&gt;gc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;collect&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# Force garbage collection if needed
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Monitor Your Memory Usage
&lt;/h3&gt;

&lt;p&gt;Add memory monitoring to long-running scripts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;psutil&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_memory_usage&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;process&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;psutil&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getpid&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;memory_info&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="n"&gt;rss&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;  &lt;span class="c1"&gt;# MB
&lt;/span&gt;
&lt;span class="c1"&gt;# In your loop
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;item&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="nf"&gt;process&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; items, Memory: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;get_memory_usage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; MB&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
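
&lt;p&gt;If installing &lt;code&gt;psutil&lt;/code&gt; is not an option, the standard library's &lt;code&gt;tracemalloc&lt;/code&gt; module can report the peak size of Python-level allocations. The sketch below compares a build-everything approach with a one-at-a-time approach. &lt;code&gt;peak_bytes&lt;/code&gt;, &lt;code&gt;build_all&lt;/code&gt;, and &lt;code&gt;build_streaming&lt;/code&gt; are illustrative names, and the 1,000-byte payloads are stand-ins for real images. Keep in mind that &lt;code&gt;tracemalloc&lt;/code&gt; only sees memory allocated through Python's allocators, so its numbers will differ from the RSS that &lt;code&gt;psutil&lt;/code&gt; reports.&lt;/p&gt;

```python
import tracemalloc


def peak_bytes(func, *args):
    """Run func(*args) and return the peak traced allocation in bytes."""
    tracemalloc.start()
    try:
        func(*args)
        _current, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak


def build_all(count):
    """Keep every payload alive at once (the 'lake')."""
    payloads = [bytes(1000) for _ in range(count)]
    return len(payloads)


def build_streaming(count):
    """Create and discard one payload at a time (the 'stream')."""
    handled = 0
    for _ in range(count):
        payload = bytes(1000)
        handled += len(payload) // 1000
    return handled


lake_peak = peak_bytes(build_all, 2000)
stream_peak = peak_bytes(build_streaming, 2000)
print(f"lake: {lake_peak} bytes, stream: {stream_peak} bytes")
```

&lt;p&gt;The exact figures depend on the Python version, but the list-based version should peak at roughly the combined size of all payloads while the streaming version stays near the size of one.&lt;/p&gt;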



&lt;h3&gt;
  
  
  Set Memory Limits
&lt;/h3&gt;

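&lt;p&gt;For critical scripts on Unix-like systems, you can use the standard library's &lt;code&gt;resource&lt;/code&gt; module (not available on Windows) to set memory limits and prevent runaway consumption:&lt;br&gt;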
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt;

&lt;span class="c1"&gt;# Cap the address space at 1 GB (soft limit); leave the hard limit unlimited
&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setrlimit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RLIMIT_AS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RLIM_INFINITY&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;My "simple" QR code generator turned into a valuable lesson about resource management. The original code was clever—parallel processing, batch operations, caching. But clever code that doesn't work is worse than simple code that does.&lt;/p&gt;

&lt;p&gt;The final version generates 100,000 QR codes across two PDF files. It takes about an hour to run. It uses less than 100 MB of RAM. And most importantly, it completes successfully every single time.&lt;/p&gt;

&lt;p&gt;Sometimes the best optimization isn't making your code faster—it's making it actually work.&lt;/p&gt;

&lt;p&gt;The next time you're writing code that processes data at scale, remember: think about memory first, speed second. A slow script that completes is infinitely more valuable than a fast script that crashes.&lt;/p&gt;




&lt;p&gt;TL;DR: I tried to generate 50,000 QR codes by loading them all into memory at once. My computer ran out of RAM and crashed. The fix was simple: generate QR codes one page at a time (30 at a time instead of 50,000). It's slower, but it works. Always consider memory usage when working with data at scale.&lt;/p&gt;

</description>
      <category>python</category>
      <category>performance</category>
      <category>optimization</category>
    </item>
    <item>
      <title>Building a FHIR Patient Deduplication System: A Journey from Chaos to Performance</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Sat, 15 Nov 2025 01:32:38 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/building-a-fhir-patient-deduplication-system-a-journey-from-chaos-to-performance-4h65</link>
      <guid>https://dev.to/budiwidhiyanto/building-a-fhir-patient-deduplication-system-a-journey-from-chaos-to-performance-4h65</guid>
      <description>&lt;p&gt;I'm working on a national project to collect health data from legacy systems in two pilot districts in Indonesia. The goal is to create interoperability between different healthcare systems so we can make better healthcare decisions based on complete patient data. It's an important project, and it's been a challenging one.&lt;/p&gt;

&lt;p&gt;One of the biggest challenges has been patient deduplication. We're collecting data from multiple legacy systems, and each one has its own way of storing patient information. When we convert all this data to FHIR R4 format, we end up with duplicate patient records—the same person appearing multiple times in our system because they exist in multiple source systems.&lt;/p&gt;

&lt;p&gt;This is the story of how I built a patient deduplication system that processes thousands of records in minutes instead of hours, and the lessons I learned from approaches that didn't work. If you're working with FHIR data from multiple sources or dealing with patient deduplication in healthcare systems, I hope my experience helps you.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Beginning: Starting with a Partner's Approach
&lt;/h2&gt;

&lt;p&gt;We're working with a technology partner on this national interoperability project. They had already built systems for patient matching in their own applications, and they shared their approach with us. Their method seemed reasonable: when creating a new patient, first search for existing patients using gender and birthdate filters, then apply fuzzy matching on the patient's name. If you find a good match, use that patient ID. Otherwise, create a new patient. If gender or birthdate is missing, fall back to using the NIK (Indonesian National Identity Number) for exact matching.&lt;/p&gt;

&lt;p&gt;I implemented their approach in our FHIR converter. Here's what it looked like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;getPatientIdWithFuzzyLogic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;internal_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;birthdate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parent_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Strategy 1: Use gender and birthdate if available
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;birthdate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Get patients matching gender and birthdate
&lt;/span&gt;        &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gender&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;birthdate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;birthdate&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_patients_with_params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Apply fuzzy matching on names
&lt;/span&gt;            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fuzz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;token_sort_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;FUZZY_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

    &lt;span class="c1"&gt;# Strategy 2: Fall back to NIK if gender/birthdate didn't work
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;nik_patients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_patients_with_params&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

    &lt;span class="c1"&gt;# No match found, create new patient
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The logic seemed solid. Search by demographics first, verify with name matching, fall back to NIK if needed. I deployed it and started converting patient data from our legacy systems.&lt;/p&gt;

&lt;p&gt;That's when the problems started appearing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Failed Solution: Why the Partner's Method Didn't Work
&lt;/h2&gt;

&lt;p&gt;The partner's method worked well in their own internal systems, but it didn't work for our interoperability project. I discovered two fundamental problems that made their approach unsuitable for our needs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 1: Missing Data in Legacy Systems&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The partner's method relied heavily on having gender and birthdate for every patient. But we had data quality issues. Many patient records were missing gender or birthdate fields. When that happened, the search by demographics would fail, and we'd fall back to NIK matching. But if NIK was also missing or inconsistent, we'd create a duplicate patient.&lt;/p&gt;

&lt;p&gt;I started seeing duplicate patients in our FHIR server. The same person would appear multiple times because the legacy data from different sources had different levels of completeness. One source might have gender and birthdate, another might only have NIK, and a third might have partial information. The fuzzy matching couldn't handle this inconsistency reliably.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 2: Pagination Limits&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The bigger problem was the FHIR API's pagination limit. Our FHIR server returns a maximum of 100 records per search request. When I searched for patients by gender and birthdate, I'd get the first 100 results. If there were more than 100 patients matching those criteria (which is common, since many people share the same gender and birthdate), I'd need to paginate through all the results to find the right patient.&lt;/p&gt;

&lt;p&gt;But the partner's code didn't handle pagination. It only looked at the first page of results. If the patient I was looking for was on page 2 or page 3, the search would miss them, and the converter would create a duplicate.&lt;/p&gt;

&lt;p&gt;I could have fixed the pagination issue by implementing proper page-through logic, but that would make every patient search much slower, since a single existence check could require several API calls. For batch conversion of thousands of patients, this would be too slow.&lt;/p&gt;
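
&lt;p&gt;For reference, page-through logic of that kind could look roughly like the sketch below. It is not the partner's code or the production converter: &lt;code&gt;fetch_bundle&lt;/code&gt; is a hypothetical callable that performs one GET and returns the parsed Bundle as a dict, and the page names are made up. The one FHIR-specific assumption is that the server advertises further pages through a Bundle &lt;code&gt;link&lt;/code&gt; entry whose relation is &lt;code&gt;next&lt;/code&gt;.&lt;/p&gt;

```python
def iter_search_results(first_url, fetch_bundle, max_pages=50):
    """Yield every resource from a paged FHIR search, one request per page.

    fetch_bundle is a hypothetical callable: it takes a URL and returns
    the parsed Bundle as a dict.
    """
    url = first_url
    pages_read = 0
    while url is not None and pages_read != max_pages:
        bundle = fetch_bundle(url)
        pages_read += 1
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        # Follow the "next" link if the server provided one.
        url = None
        for link in bundle.get("link", []):
            if link.get("relation") == "next":
                url = link.get("url")
                break


# In-memory stand-in for a server that splits results over two pages.
FAKE_PAGES = {
    "page-1": {
        "entry": [{"resource": {"id": "p1"}}, {"resource": {"id": "p2"}}],
        "link": [{"relation": "next", "url": "page-2"}],
    },
    "page-2": {
        "entry": [{"resource": {"id": "p3"}}],
        "link": [],
    },
}

requested = []


def fake_fetch(url):
    requested.append(url)
    return FAKE_PAGES[url]


found_ids = [p["id"] for p in iter_search_results("page-1", fake_fetch)]
print(found_ids)   # ['p1', 'p2', 'p3']
print(requested)   # ['page-1', 'page-2']
```

&lt;p&gt;Each extra page is another round trip to the server, which is the cost that made this option unattractive for bulk conversion.&lt;/p&gt;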

&lt;p&gt;&lt;strong&gt;The Real Problem&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The partner's method was built for their internal systems, where they controlled the data quality and had different constraints. Our situation was different. We were collecting data from multiple independent legacy systems, each with its own data quality issues, and we needed to process it efficiently at scale.&lt;/p&gt;

&lt;p&gt;I needed a different approach—one that worked with the data we actually had, not the data we wished we had.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Finding a Better Way&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I went back to analyze what reliable data we did have. The answer was NIK—the Indonesian National Identity Number. Almost every patient in our system had a NIK, and it was consistent across different legacy systems. It's a 16-digit number, always formatted the same way, and it uniquely identifies a person.&lt;/p&gt;

&lt;p&gt;Why was I treating NIK as a fallback? It should be the primary method. NIK is more reliable than gender or birthdate for identifying patients in Indonesia. Gender and birthdate can be missing or inconsistent, but NIK is designed to be unique.&lt;/p&gt;
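
&lt;p&gt;If NIK becomes the primary key for matching, it is worth normalising it before every search. The helper below is a hypothetical sketch, not part of the original converter. It assumes the only rule worth enforcing is the 16-digit length mentioned above, and that some exports may carry stray spaces, dots, or dashes. The sample NIK is invented.&lt;/p&gt;

```python
def normalize_nik(raw):
    """Return a 16-digit NIK string, or None if the value is unusable."""
    if raw is None:
        return None
    # Keep ASCII digits only (assumption: separators carry no meaning).
    digits = "".join(ch for ch in str(raw) if ch in "0123456789")
    if len(digits) != 16:
        return None
    return digits


print(normalize_nik("3201 0115 0990 0001"))  # 3201011509900001
print(normalize_nik("12345"))                # None
```

&lt;p&gt;Returning &lt;code&gt;None&lt;/code&gt; for anything that is not 16 digits lets the caller skip the NIK strategy and go straight to the demographic fallback.&lt;/p&gt;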

&lt;h2&gt;
  
  
  Building the Reference System: NIK-First Strategy
&lt;/h2&gt;

&lt;p&gt;I redesigned the patient matching system to use NIK as the primary identifier, with demographic matching as a fallback only when necessary. Here's the new approach:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;getPatientIdByNIK&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get patient ID by NIK with caching&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nik_cache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;nik_cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;patients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_patients_with_params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;patient_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;patients&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;nik_cache&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt;  &lt;span class="c1"&gt;# Cache for future lookups
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;getPatientIdWithFuzzyLogic&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;birthdate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;parent_name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# Strategy 1: Try NIK exact match first (most reliable)
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;nik_patients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_patients_with_params&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# Single match - verify with name fuzzy matching
&lt;/span&gt;            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;patient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="n"&gt;patient_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_full_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;new_patient_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;

                &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fuzz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;token_sort_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;new_patient_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;FUZZY_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

            &lt;span class="c1"&gt;# Multiple NIK matches - use fuzzy matching to find best
&lt;/span&gt;            &lt;span class="k"&gt;elif&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;best_match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
                &lt;span class="n"&gt;best_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nik_patients&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;patient_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_full_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fuzz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;token_sort_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_name&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;best_score&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;best_score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
                        &lt;span class="n"&gt;best_match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;best_score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;FUZZY_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;best_match&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

    &lt;span class="c1"&gt;# Strategy 2: Fall back to demographic matching if NIK fails
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;birthdate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;gender&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;gender&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;birthdate&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;birthdate&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;candidates&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_patients_with_params&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;candidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fuzz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;token_sort_ratio&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;get_full_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;FUZZY_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;

    &lt;span class="c1"&gt;# No match found
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This new system inverts the partner's approach. Instead of searching by demographics first and falling back to NIK, I search by NIK first and fall back to demographics. This solves both problems:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Missing data&lt;/strong&gt;: NIK is more consistently available than gender/birthdate in our legacy systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pagination&lt;/strong&gt;: Searching by NIK returns far fewer results (usually just one), so pagination isn't an issue&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I also added a caching mechanism with &lt;code&gt;getPatientIdByNIK&lt;/code&gt;. When converting thousands of patient records, many of them might be the same person (repeat visits, multiple encounters, etc.). By caching the NIK-to-patient-ID mapping, I avoid making redundant API calls for patients I've already looked up.&lt;/p&gt;

&lt;p&gt;The fuzzy matching on names is still there as a safety check. Even when I find a patient by NIK, I verify that the name matches using fuzzy string comparison. This catches cases where NIK might have been entered incorrectly or where there might be data quality issues.&lt;/p&gt;
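
&lt;p&gt;To illustrate why a token-sort comparison suits names, here is a rough stand-in built only on the standard library. It is not the &lt;code&gt;fuzz.token_sort_ratio&lt;/code&gt; implementation, and its scores will not match that library exactly. The function name &lt;code&gt;token_sort_similarity&lt;/code&gt;, the sample names, and the threshold of 85 are illustrative assumptions; the real &lt;code&gt;FUZZY_THRESHOLD&lt;/code&gt; is not stated here.&lt;/p&gt;

```python
from difflib import SequenceMatcher


def token_sort_similarity(a, b):
    """Score 0-100 after lowercasing and sorting the words of each name."""
    def prepare(text):
        return " ".join(sorted(text.lower().split()))
    matcher = SequenceMatcher(None, prepare(a), prepare(b))
    return round(100 * matcher.ratio())


EXAMPLE_THRESHOLD = 85  # illustrative value only

# Same words in a different order and case: treated as identical.
same_words = token_sort_similarity("Budi Santoso", "santoso budi")
# Unrelated names: the score stays far from the threshold.
different = token_sort_similarity("Budi Santoso", "Siti Rahmawati")

print(same_words)  # 100
```

&lt;p&gt;Sorting the tokens first means that a record storing "family name, given name" still matches one storing the same words the other way round.&lt;/p&gt;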

&lt;p&gt;&lt;strong&gt;Testing the New System&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When I tested the new NIK-first system with real data from our legacy systems, it worked much better. The converter found existing patients reliably, even when demographic data was missing. The pagination problem disappeared because a NIK search usually returns a single record, far below the 100-record page limit. And the caching made bulk conversion much faster.&lt;/p&gt;

&lt;p&gt;I watched the logs during a test run converting 1,000 patient records: "Found patient by NIK... Found patient by NIK... Created new patient (no NIK match)... Found patient by NIK (cached)..." The system was working.&lt;/p&gt;

&lt;p&gt;But I still had a problem. Before implementing this fix, the old system had already created duplicate patients in our FHIR server. Some NIKs had 2, 3, or even 10 duplicate patient records. While the new reference system prevented future duplicates, I needed to clean up the existing ones.&lt;/p&gt;

&lt;p&gt;I needed a deduplication process.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Deduplication Challenge: Two Versions
&lt;/h2&gt;

&lt;p&gt;Building a system to deduplicate existing patients was a different challenge entirely. With the reference system, I was preventing new duplicates—a relatively simple task of checking before creating. With deduplication, I needed to find all existing duplicates, choose which one should be the "master," and then update potentially thousands of medical records to point to that master instead of the duplicates.&lt;/p&gt;

&lt;p&gt;This was going to touch a lot of data. I needed to be careful.&lt;/p&gt;

&lt;h3&gt;
  
  
  Version 1: Sequential Processing - The Safe, Slow Way
&lt;/h3&gt;

&lt;p&gt;For my first implementation, I chose the safest possible approach: sequential processing. I would handle one NIK at a time, processing each step completely before moving to the next. No parallelization, no batch operations, just simple, linear execution.&lt;/p&gt;

&lt;p&gt;The algorithm was straightforward:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find all patient IDs with the same NIK&lt;/li&gt;
&lt;li&gt;Select which one should be the master (I chose the most recently updated)&lt;/li&gt;
&lt;li&gt;Find all resources (observations, encounters, etc.) referencing the duplicate patient IDs&lt;/li&gt;
&lt;li&gt;Update each resource to reference the master patient ID instead&lt;/li&gt;
&lt;li&gt;Mark the duplicate patients as inactive&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;I wrote it as a simple loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;nik_list&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# Find duplicate patients
&lt;/span&gt;    &lt;span class="n"&gt;patient_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;find_patients_by_nik&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;continue&lt;/span&gt;  &lt;span class="c1"&gt;# No duplicates, skip
&lt;/span&gt;
    &lt;span class="c1"&gt;# Select master patient (most recently updated)
&lt;/span&gt;    &lt;span class="n"&gt;master_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;select_master_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;duplicate_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patient_ids&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="c1"&gt;# Find all resources referencing duplicates
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;resource_type&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;RESOURCE_TYPES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;duplicate_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;fetch_resources&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c1"&gt;# Update each resource
&lt;/span&gt;            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="nf"&gt;update_patient_reference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nf"&gt;put_resource&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Mark duplicates inactive
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;dup_id&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;duplicate_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;mark_patient_inactive&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;dup_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
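
&lt;p&gt;The "most recently updated" rule from step 2 can be sketched as follows. This is not the real &lt;code&gt;select_master_patient&lt;/code&gt;, which receives patient IDs. The hypothetical &lt;code&gt;pick_most_recent&lt;/code&gt; below assumes the Patient resources have already been fetched as dicts and that each carries a &lt;code&gt;meta.lastUpdated&lt;/code&gt; timestamp.&lt;/p&gt;

```python
from datetime import datetime


def parse_instant(value):
    """Parse a FHIR instant; a trailing Z is rewritten for older Pythons."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


def pick_most_recent(patients):
    """Return the id of the Patient with the latest meta.lastUpdated."""
    newest = max(patients, key=lambda p: parse_instant(p["meta"]["lastUpdated"]))
    return newest["id"]


patients = [
    {"id": "a", "meta": {"lastUpdated": "2024-03-01T08:00:00+07:00"}},
    {"id": "b", "meta": {"lastUpdated": "2024-03-01T02:30:00Z"}},
    {"id": "c", "meta": {"lastUpdated": "2023-12-31T23:59:59+07:00"}},
]
master_id = pick_most_recent(patients)
print(master_id)  # b
```

&lt;p&gt;Parsing the timestamps matters: compared as plain strings, patient "a" would win, even though 08:00 at +07:00 is earlier than 02:30 UTC.&lt;/p&gt;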



&lt;p&gt;I tested it with a single NIK first. It worked. I checked the data afterward—all the medical records now pointed to the master patient, the duplicates were marked inactive with a &lt;code&gt;replaced-by&lt;/code&gt; link to the master. Perfect.&lt;/p&gt;
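
&lt;p&gt;The inactive marking can be pictured as a small change to the duplicate's Patient resource. The function below is a sketch under assumptions, not the real &lt;code&gt;mark_patient_inactive&lt;/code&gt;, which also has to send the update to the server. &lt;code&gt;build_inactive_patient&lt;/code&gt; is a hypothetical name and only prepares the updated resource. The &lt;code&gt;active&lt;/code&gt; flag and the &lt;code&gt;link&lt;/code&gt; element with type &lt;code&gt;replaced-by&lt;/code&gt; are standard FHIR Patient fields.&lt;/p&gt;

```python
import copy


def build_inactive_patient(duplicate, master_id):
    """Return a copy of the duplicate Patient, deactivated and linked to the master."""
    updated = copy.deepcopy(duplicate)
    updated["active"] = False
    link = {
        "other": {"reference": "Patient/" + master_id},
        "type": "replaced-by",
    }
    existing = updated.setdefault("link", [])
    if link not in existing:  # keep the operation safe to repeat
        existing.append(link)
    return updated


duplicate = {"resourceType": "Patient", "id": "dup-1", "active": True}
result = build_inactive_patient(duplicate, "master-1")
print(result["active"], result["link"][0]["type"])  # False replaced-by
```

&lt;p&gt;Working on a copy leaves the fetched resource untouched, and the duplicate-link check means a rerun after a failed batch does not add the same link twice.&lt;/p&gt;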

&lt;p&gt;Then I tried it with ten NIKs. It worked, but it took 15 minutes. Okay, that's not great, but acceptable for a cleanup operation, right?&lt;/p&gt;

&lt;p&gt;Then I ran it on our actual list: 68 NIKs with known duplicates. I started the script, watched the logs for a few minutes, then went to get coffee. When I came back 30 minutes later, it had processed 3 NIKs. I did the math. 68 NIKs at 10 minutes each... over 11 hours.&lt;/p&gt;
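
&lt;p&gt;The back-of-the-envelope estimate is simple enough to write down. This snippet only restates the numbers above (3 NIKs in 30 minutes, 68 NIKs in total); &lt;code&gt;estimate_hours&lt;/code&gt; is a throwaway helper for illustration.&lt;/p&gt;

```python
def estimate_hours(nik_count, minutes_per_nik):
    """Projected wall-clock hours for a strictly sequential run."""
    return nik_count * minutes_per_nik / 60


minutes_per_nik = 30 / 3  # observed: 3 NIKs in 30 minutes
projected = estimate_hours(68, minutes_per_nik)
print(round(projected, 1))  # 11.3
```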

&lt;p&gt;I let it run overnight. The next morning, it had finished successfully. All the duplicates were cleaned up. But 11 hours was not acceptable. We had hundreds more NIKs to process in other datasets. At this rate, a full deduplication would take days, maybe weeks. And during that time, the script would be constantly hammering our FHIR server with API calls.&lt;/p&gt;

&lt;p&gt;The problem was obvious: I was making way too many individual API calls. For each patient ID, I was searching for observations one at a time, then encounters one at a time, then medications, procedures, diagnostic reports—the list went on. And FHIR has a lot of resource types that can reference patients. Even though I filtered it down to the most common ones, I was still checking 27 different resource types. For each duplicate patient. Sequentially.&lt;/p&gt;

&lt;p&gt;If a single NIK had 3 duplicate patients and each patient had 20 observations, that's 60 individual GET requests just for observations, plus 60 individual PUT requests to update them. Multiply that by all the other resource types, and you're talking about hundreds of API calls per NIK. No wonder it was slow.&lt;/p&gt;

&lt;p&gt;I watched the script run for a while, looking at the logs. The server was responding quickly—each API call only took a few hundred milliseconds. But I was only making one call at a time. The network latency, the sequential execution, it all added up. I was wasting so much time just waiting.&lt;/p&gt;

&lt;p&gt;That's when I remembered something. When I built the original FHIR converter, I had faced a similar problem. Converting thousands of patient records one at a time was slow. I had solved it by using batch operations and parallel processing. I could apply the same techniques here.&lt;/p&gt;

&lt;h3&gt;
  
  
  Version 2: Batch Processing &amp;amp; Parallelization - The Fast Way
&lt;/h3&gt;

&lt;p&gt;The key insight was this: most of the steps in deduplication don't depend on each other. When I'm fetching observations for a patient, I don't need to wait for the encounters to be fetched first. When I'm updating resources, I don't need to update them one at a time—I can batch them together.&lt;/p&gt;

&lt;p&gt;I redesigned the system with two major optimizations: parallel resource fetching and batch updates.&lt;/p&gt;

&lt;p&gt;For parallel fetching, I used Python's &lt;code&gt;ThreadPoolExecutor&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;concurrent.futures&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;as_completed&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_all_references&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Find ALL resources that reference duplicate patient IDs (parallel)&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Only search for duplicates, not master
&lt;/span&gt;    &lt;span class="n"&gt;search_patient_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patient_ids&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;all_references&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="c1"&gt;# Use ThreadPoolExecutor for parallel fetching
&lt;/span&gt;    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;ThreadPoolExecutor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Submit all resource type fetches in parallel
&lt;/span&gt;        &lt;span class="n"&gt;future_to_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;executor&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_fetch_resources_for_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;search_patient_ids&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="n"&gt;resource_type&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;resource_type&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;PATIENT_REFERENCING_RESOURCES&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;# Collect results as they complete
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;as_completed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;future_to_type&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;resource_type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future_to_type&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;future&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;result&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;all_references&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error fetching &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;all_references&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Instead of fetching observations, then encounters, then medications sequentially, I now launch all those searches in parallel. Five worker threads (configurable via &lt;code&gt;MAX_WORKERS&lt;/code&gt;) simultaneously fetch different resource types. This alone cut the fetching time by about 80%.&lt;/p&gt;
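&lt;p&gt;As a rough sanity check on that figure, here is a back-of-the-envelope estimate. It assumes every resource-type search takes about the same time, which is a simplification, and it uses the 27 resource types from the test run described later in this post. The function name &lt;code&gt;estimated_reduction&lt;/code&gt; is made up for illustration.&lt;/p&gt;

```python
import math


def estimated_reduction(num_searches, workers):
    """Fraction of wall-clock time saved versus running the searches one by one.

    Assumes each search takes the same time, so the pool finishes in
    ceil(num_searches / workers) rounds instead of num_searches rounds.
    """
    rounds = math.ceil(num_searches / workers)
    return 1 - rounds / num_searches


# 27 searches on 5 workers finish in 6 rounds instead of 27
print(round(estimated_reduction(27, 5), 3))  # 0.778
```

&lt;p&gt;Under that assumption the saving is about 78%, in line with the roughly 80% reported above. Real searches vary in size, so treat this as an estimate only.&lt;/p&gt;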

&lt;p&gt;But the real performance gain came from batch updates. FHIR supports bundle operations: instead of sending one resource update at a time, you can send hundreds of updates in a single API call. I implemented this using FHIR batch bundles:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_create_batch_bundle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resources_to_update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Create a FHIR batch bundle for updating multiple resources&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;bundle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resourceType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bundle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;batch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;resources_to_update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;bundle&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;method&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;PUT&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt;
        &lt;span class="p"&gt;})&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;bundle&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_all_references&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;all_references&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;duplicate_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Update all references using batch bundles&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;resources_to_update&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;

    &lt;span class="c1"&gt;# Collect all resources that need updating
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_references&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;resources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_patient_references&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;duplicate_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;resources_to_update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c1"&gt;# Split into batches of self.batch_size resources
&lt;/span&gt;    &lt;span class="n"&gt;num_batches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resources_to_update&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_batches&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;batch_resources&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resources_to_update&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;:(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

        &lt;span class="c1"&gt;# Create and execute batch bundle
&lt;/span&gt;        &lt;span class="n"&gt;bundle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_create_batch_bundle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch_resources&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;result_bundle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_execute_batch_bundle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bundle&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Check results
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result_bundle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;entry&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
            &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resources_updated&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, instead of 60 separate PUT requests to update 60 observations, I send one request with a bundle containing all 60 updates. The FHIR server processes the entries on its side and returns a response bundle with one result per entry.&lt;/p&gt;
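&lt;p&gt;The slicing and result-checking steps above can be sketched in isolation. This is a minimal, standalone version: &lt;code&gt;chunked&lt;/code&gt; and &lt;code&gt;tally_responses&lt;/code&gt; are hypothetical helper names, not part of the script above, and the sample response bundle is invented for the example. Unlike the loop above, it also keeps the failed entries so they can be logged or retried.&lt;/p&gt;

```python
def chunked(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def tally_responses(result_bundle):
    """Return (success_count, failures) for a batch-response bundle.

    An entry counts as a success when its response status starts with "2".
    Failures are returned as (entry_index, status) pairs.
    """
    ok = 0
    failed = []
    for index, entry in enumerate(result_bundle.get("entry", [])):
        status = entry.get("response", {}).get("status", "")
        if status.startswith("2"):
            ok += 1
        else:
            failed.append((index, status))
    return ok, failed


# Invented sample data, only to show the shapes involved
print(list(chunked(list(range(5)), 2)))  # [[0, 1], [2, 3], [4]]

sample_response = {
    "resourceType": "Bundle",
    "type": "batch-response",
    "entry": [
        {"response": {"status": "200 OK"}},
        {"response": {"status": "422 Unprocessable Entity"}},
        {"response": {"status": "201 Created"}},
    ],
}
print(tally_responses(sample_response))  # (2, [(1, '422 Unprocessable Entity')])
```

&lt;p&gt;Because the entry index is kept, a failed status can be matched back to the resource at the same position in the request bundle.&lt;/p&gt;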

&lt;p&gt;I also added a simple optimization: fetch resources only for the duplicate patients, not the master. Resources already pointing to the master don't need to be fetched or updated. This check cut the amount of data I needed to process roughly in half:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Only search for duplicates, not master (master resources don't need updating)
&lt;/span&gt;&lt;span class="n"&gt;search_patient_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patient_ids&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;master_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I ran the new version on the same 68 NIKs that had taken 11 hours before. This time, I watched the progress in real time. The parallel fetching worked beautifully: I could see the 27 resource types being queried concurrently across the worker pool. The batch updates were fast too, with bundles of 100 resources updated in seconds.&lt;/p&gt;

&lt;p&gt;Twenty-three minutes later, it was done. The same operation that took 11 hours now took 23 minutes. That's roughly 29 times faster (660 minutes down to 23).&lt;/p&gt;

&lt;p&gt;I ran it again on a larger dataset just to be sure, and the speedup held. The system was consistently fast. The optimization worked.&lt;/p&gt;

&lt;h2&gt;
  
  
  Technical Deep Dives: Solving Real Problems
&lt;/h2&gt;

&lt;p&gt;While the main architecture was solid, I ran into several challenges that required specific solutions. Let me share three that taught me the most.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 1: Pagination and URL Handling
&lt;/h3&gt;

&lt;p&gt;One issue that cost me two hours of debugging was pagination. When fetching resources, FHIR servers often return results in pages. You get the first 100 results, plus a "next" link to get the next page. Simple enough, right?&lt;/p&gt;

&lt;p&gt;Except the "next" link returned by our FHIR server had a trailing slash before the query parameters: &lt;code&gt;/Patient/?_count=100&amp;amp;_page_token=abc&lt;/code&gt;. When I tried to fetch that URL, I got 404 errors. The server expected &lt;code&gt;/Patient?_count=100&lt;/code&gt; (no slash before the question mark).&lt;/p&gt;

&lt;p&gt;I spent way too long staring at logs before I noticed that subtle difference. Once I saw it, the fix was simple:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_get_next_link&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bundle&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Get next page URL from bundle&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;bundle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;link&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;relation&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;next&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;next_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;link&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;next_url&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/fhir/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;next_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;path_and_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;next_url&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/fhir/&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                &lt;span class="c1"&gt;# Remove trailing slash before query string
&lt;/span&gt;                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;path_and_query&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;path_and_query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;path_and_query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;/?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;?&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;path_and_query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This taught me to always validate assumptions about external APIs. Just because something looks like a standard URL doesn't mean it will work exactly as you expect.&lt;/p&gt;
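&lt;p&gt;A stricter variant of the same fix, as a sketch: parse the URL and strip the trailing slash from the path only, so a "/?" that happens to appear inside the query string is left alone. The helper name &lt;code&gt;strip_slash_before_query&lt;/code&gt; and the example host are assumptions for illustration; &lt;code&gt;urlsplit&lt;/code&gt; and &lt;code&gt;urlunsplit&lt;/code&gt; come from Python's standard library.&lt;/p&gt;

```python
from urllib.parse import urlsplit, urlunsplit


def strip_slash_before_query(url):
    """Remove trailing slashes from the path when a query string follows."""
    parts = urlsplit(url)
    path = parts.path
    if parts.query and path.endswith("/"):
        path = path.rstrip("/")
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


# The slash between the path and the query is removed
print(strip_slash_before_query("https://fhir.example.org/fhir/Patient/?_count=100"))
# https://fhir.example.org/fhir/Patient?_count=100

# A "/?" inside the query itself is not touched
print(strip_slash_before_query("https://fhir.example.org/fhir/Observation/?note=a/?b"))
# https://fhir.example.org/fhir/Observation?note=a/?b
```

&lt;p&gt;URLs without a query string pass through unchanged, so the helper is safe to apply to every "next" link.&lt;/p&gt;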

&lt;h3&gt;
  
  
  Challenge 2: Choosing the Right Master Patient
&lt;/h3&gt;

&lt;p&gt;Initially, I just picked the most recently updated patient as the master. Simple logic: the newest record is probably the most complete. But then I realized this created a problem. If I ran the deduplication twice, I might choose a different master the second time (if one of the duplicates had been updated in between). This would cause unnecessary churn—moving all those resource references back and forth.&lt;/p&gt;

&lt;p&gt;The solution was to implement stability: once a patient has been designated as master, it should stay the master. I did this using FHIR meta tags. When a patient becomes the master, I tag it with a "golden resource" tag that points to itself:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_has_self_referencing_golden_tag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Check if patient has golden resource tag pointing to itself&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;patient_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

    &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;tag&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;http://terminology.kemkes.go.id/sp-replaced-by&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;code&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;patient_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;select_master_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;List&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Select which patient should be the master

    Priority:
    1. Active patient with golden resource tag (existing master)
    2. Most recently updated active patient
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;patients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nf"&gt;fetch_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;pid&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;active_patients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patients&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;active&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="c1"&gt;# Check for existing master first
&lt;/span&gt;    &lt;span class="n"&gt;existing_masters&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;active_patients&lt;/span&gt;
                       &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_has_self_referencing_golden_tag&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;existing_masters&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existing_masters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lastUpdated&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;active_patients&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                  &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;meta&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;lastUpdated&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when I run deduplication, it first checks if one of the patients is already marked as master. If so, use that one. If not, pick the most recently updated and mark it as master. This ensures consistency across multiple runs.&lt;/p&gt;

&lt;h3&gt;
  
  
  Challenge 3: Handling Batch Failures Gracefully
&lt;/h3&gt;

&lt;p&gt;When I first implemented batch updates, I used FHIR transaction bundles (type: "transaction"). These are atomic: either every update succeeds or none is applied. That seemed safe, but it had a major problem: a single bad resource caused the whole bundle to be rejected, so none of the other updates went through.&lt;/p&gt;

&lt;p&gt;During testing, I had a batch of 100 observations to update. One of them had a validation issue (a missing required field from old data). The entire batch failed, and I had to figure out which one was problematic. This was frustrating and slow.&lt;/p&gt;

&lt;p&gt;The solution was to switch to batch bundles (type: "batch") instead of transaction bundles. With batch bundles, each operation in the bundle succeeds or fails independently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;bundle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;resourceType&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bundle&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;type&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;batch&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;# Independent operations, not atomic
&lt;/span&gt;    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;entry&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[...]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
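&lt;p&gt;For illustration, here is a minimal sketch of how the elided entry list might be filled in. It assumes each update is a PUT against the resource's own URL and that resources arrive as (resource_type, resource) pairs; the helper name &lt;code&gt;build_batch_bundle&lt;/code&gt; and the sample IDs are hypothetical, not taken from the real script.&lt;/p&gt;

```python
def build_batch_bundle(resources):
    """Wrap (resource_type, resource) pairs in a FHIR batch bundle of PUT updates."""
    entries = []
    for resource_type, resource in resources:
        entries.append({
            "resource": resource,
            # Each entry carries its own request, so the server can
            # accept or reject it independently of the others.
            "request": {
                "method": "PUT",
                "url": f"{resource_type}/{resource['id']}",
            },
        })
    return {"resourceType": "Bundle", "type": "batch", "entry": entries}


# Hypothetical sample data, for illustration only
observations = [
    ("Observation", {"resourceType": "Observation", "id": "obs-1"}),
    ("Observation", {"resourceType": "Observation", "id": "obs-2"}),
]
bundle = build_batch_bundle(observations)
print(bundle["entry"][0]["request"]["url"])
```

&lt;p&gt;Keeping the input list in the same order as the bundle entries matters, because the response entries come back in request order and can then be matched by index.&lt;/p&gt;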



&lt;p&gt;Now if one resource in a batch of 100 fails, the other 99 still get updated successfully. I log the failure and track it in my stats, but it doesn't block the rest of the operation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result_bundle&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;entry&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])):&lt;/span&gt;
    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# Success (2xx status code)
&lt;/span&gt;        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resources_updated&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;batch_successes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Log failure but continue
&lt;/span&gt;        &lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;resource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;batch_resources&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;error_msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to update &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error_msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;batch_failures&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This makes the system much more robust. Even with messy real-world data, the deduplication completes successfully for the vast majority of resources, and I have a clear log of anything that failed.&lt;/p&gt;
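&lt;p&gt;The status code alone does not always say why an entry failed. In a FHIR batch response, a failed entry may also carry an OperationOutcome under &lt;code&gt;response.outcome&lt;/code&gt;. The sketch below shows one way to pull a readable reason out of it; the function name &lt;code&gt;describe_failure&lt;/code&gt; and the sample diagnostics text are made up for illustration, and whether your server actually returns an outcome is something to verify.&lt;/p&gt;

```python
def describe_failure(entry):
    """Return a short reason string for one failed batch response entry."""
    response = entry.get("response", {})
    outcome = response.get("outcome", {})
    issues = outcome.get("issue", [])
    # Prefer the free-text diagnostics; fall back to the issue code
    details = [
        issue.get("diagnostics", issue.get("code", "unknown"))
        for issue in issues
    ]
    reason = "; ".join(details) if details else "no OperationOutcome returned"
    return f"{response.get('status', 'unknown status')}: {reason}"


# Hypothetical failed entry, shaped like a batch response entry
failed_entry = {
    "response": {
        "status": "422 Unprocessable Entity",
        "outcome": {
            "resourceType": "OperationOutcome",
            "issue": [{
                "severity": "error",
                "code": "required",
                "diagnostics": "Observation.status: minimum required = 1",
            }],
        },
    }
}
print(describe_failure(failed_entry))
```

&lt;p&gt;Appending this reason to the logged error message makes it easier to see which field in the old data needs fixing.&lt;/p&gt;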

&lt;h2&gt;
  
  
  Making It Production-Ready: The API Layer
&lt;/h2&gt;

&lt;p&gt;The command-line script worked great for batch deduplication, but for ongoing operations, I needed something more accessible. I built a FastAPI wrapper that exposes the deduplication functionality as a REST API.&lt;/p&gt;

&lt;p&gt;The API has two main endpoints: one for a single NIK and one for a batch of NIKs. The single-NIK endpoint looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="nd"&gt;@app.post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;/deduplicate&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;deduplicate_single_nik&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;SingleNIKRequest&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Deduplicate patients for a single NIK&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;deduplicator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;FHIRPatientDeduplicator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;FHIR_BASE_URL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;nik_system&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;NIK_SYSTEM&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;API_KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;max_workers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;MAX_WORKERS&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;patient_ids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;deduplicator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;find_patients_by_nik&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;deduplicator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;deduplicate_by_nik&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;patient_ids&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;delete_duplicates&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;delete_duplicates&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;duration&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;DeduplicationResponse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;success&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;resources_found&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;deduplicator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resources_found&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;resources_updated&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;deduplicator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stats&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resources_updated&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;duration_seconds&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;duration&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;timestamp&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;utcnow&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;isoformat&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I added timing information so we can track how long each deduplication takes. This is useful for monitoring and capacity planning. I also added a batch endpoint that processes multiple NIKs in sequence, with per-NIK timing and summary statistics.&lt;/p&gt;
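&lt;p&gt;The batch endpoint is not shown here, so the following is only a rough sketch of the idea: process NIKs one at a time, time each one, and build a summary at the end. It is framework-free on purpose; &lt;code&gt;process_niks_sequentially&lt;/code&gt;, &lt;code&gt;fake_dedupe&lt;/code&gt;, and the response field names are assumptions, not the real API.&lt;/p&gt;

```python
import time


def process_niks_sequentially(niks, dedupe_one):
    """Run dedupe_one(nik) for each NIK in order, timing each call."""
    results = []
    batch_start = time.perf_counter()
    for nik in niks:
        start = time.perf_counter()
        try:
            updated = dedupe_one(nik)
            outcome = {"nik": nik, "success": True, "resources_updated": updated}
        except Exception as exc:
            # One bad NIK should not stop the rest of the batch
            outcome = {"nik": nik, "success": False, "error": str(exc)}
        outcome["duration_seconds"] = round(time.perf_counter() - start, 2)
        results.append(outcome)
    succeeded = sum(1 for r in results if r["success"])
    summary = {
        "total": len(results),
        "succeeded": succeeded,
        "failed": len(results) - succeeded,
        "duration_seconds": round(time.perf_counter() - batch_start, 2),
    }
    return {"summary": summary, "results": results}


def fake_dedupe(nik):
    """Stand-in for the real deduplication call; returns a resource count."""
    if not nik.isdigit():
        raise ValueError(f"invalid NIK: {nik}")
    return 3


report = process_niks_sequentially(["3201010101010001", "not-a-nik"], fake_dedupe)
print(report["summary"])
```

&lt;p&gt;Catching the exception per NIK follows the same principle as the batch bundles: record the failure, keep going, and report it in the summary.&lt;/p&gt;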

&lt;p&gt;The API is deployed on Google Cloud Run, which handles scaling automatically. If we need to process a large batch of NIKs, we can send them to the batch endpoint and it processes them sequentially (to maintain data integrity) while still being fast thanks to the parallelization and batch updates happening under the hood.&lt;/p&gt;

&lt;p&gt;The API also makes it easy for other teams to integrate deduplication into their workflows. They can call the endpoint whenever they import new data, and any duplicates get cleaned up automatically.&lt;/p&gt;

&lt;h2&gt;
  
  
  Reflection &amp;amp; Lessons Learned
&lt;/h2&gt;

&lt;p&gt;Looking back on this project, I'm proud of what I built, but I'm also very aware of what I did wrong and what I'd do differently next time.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Went Well
&lt;/h3&gt;

&lt;p&gt;The NIK-based reference system is simple and reliable. By choosing the right unique identifier from the start, I avoided all the complexity of demographic matching. The system hasn't created a single duplicate patient since I deployed it.&lt;/p&gt;

&lt;p&gt;The optimization from sequential to batch/parallel processing was a huge win. Going from 11 hours to 23 minutes isn't just about speed—it's about practicality. At 11 hours, running deduplication was something you'd do rarely, maybe once a month, as a special operation. At 23 minutes, it's something you can run weekly, or even daily if needed. That changes how useful the tool is.&lt;/p&gt;

&lt;p&gt;The architectural decisions around resilience—using batch bundles instead of transactions, tracking errors but continuing, logging everything—have proven their value. The system handles real-world messy data gracefully. It doesn't fail catastrophically because one record has a problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  What I'd Do Differently
&lt;/h3&gt;

&lt;p&gt;I should have thought about deduplication from day one. If I had implemented the NIK check in the original converter, I wouldn't have created thousands of duplicates that needed cleaning up. This is a classic example of a small amount of foresight preventing a large amount of pain later.&lt;/p&gt;

&lt;p&gt;I wasted a month trying to adapt the partner's solution. I should have analyzed our specific problem more carefully first. Their demographic matching system was sophisticated and well-built, but it was solving a different problem than ours. Understanding the problem deeply before jumping to solutions would have saved a lot of time.&lt;/p&gt;

&lt;p&gt;I should have built the parallel/batch version first, or at least earlier. Building the slow version taught me more than just thinking about the problem would have, but if I had started with "how do I make this fast?" instead of "how do I make this work?", I would have reached the good solution sooner.&lt;/p&gt;

&lt;h3&gt;
  
  
  Technical Learnings
&lt;/h3&gt;

&lt;p&gt;Batch operations are powerful. Reducing API calls from hundreds to dozens makes a massive difference. Whenever you're doing lots of similar operations, look for a way to batch them.&lt;/p&gt;

&lt;p&gt;Parallelization works best when operations are independent. Fetching different resource types in parallel is perfect because they don't depend on each other. But I couldn't parallelize the actual deduplication of different NIKs because they might reference the same resources. Understanding these dependencies is crucial.&lt;/p&gt;
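&lt;p&gt;As a small sketch of the independent case, the snippet below fetches several resource types concurrently with a thread pool from the standard library. The &lt;code&gt;fake_fetch&lt;/code&gt; function stands in for a real HTTP search and is purely hypothetical, as are the function name &lt;code&gt;fetch_all_types&lt;/code&gt; and the default worker count.&lt;/p&gt;

```python
from concurrent.futures import ThreadPoolExecutor


def fetch_all_types(resource_types, fetch_one, max_workers=4):
    """Fetch each resource type concurrently; return {resource_type: resources}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with resource_types
        results = list(pool.map(fetch_one, resource_types))
    return dict(zip(resource_types, results))


def fake_fetch(resource_type):
    """Stand-in for an HTTP search against the FHIR server."""
    return [f"{resource_type}/example-1", f"{resource_type}/example-2"]


found = fetch_all_types(["Observation", "Encounter", "Condition"], fake_fetch)
print(found["Encounter"])
```

&lt;p&gt;This works because no fetch reads or writes anything another fetch depends on. The per-NIK deduplication step does not have that property, which is why it stays sequential.&lt;/p&gt;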

&lt;p&gt;The FHIR standard is well-designed but implementations vary. Features like batch bundles, search parameters, and pagination work slightly differently on different servers. Always test against your actual FHIR server, not just against the spec.&lt;/p&gt;

&lt;p&gt;Real-world data is messy. Invalid formats, missing fields, duplicate identifiers—they're all going to happen. Build your system to handle errors gracefully rather than assuming perfect data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Future Improvements
&lt;/h3&gt;

&lt;p&gt;If I were to continue improving this system, here's what I'd add:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More sophisticated master selection.&lt;/strong&gt; Currently, when no patient is already marked as master, I fall back to "most recently updated". But other factors could matter: which patient has the most complete data, which one has the most recent medical records, which one was verified most recently. A scoring system could help.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Automated detection of new duplicates.&lt;/strong&gt; Right now someone has to identify that duplicates exist and call the API. I could build a background job that periodically scans for NIKs with multiple active patients and flags them for review or automatic deduplication.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Intelligent merging of patient demographic data.&lt;/strong&gt; When deduplicating patients, I currently just pick one master patient and mark the others inactive. But sometimes the duplicate records have complementary information—one might have a phone number, another might have an address. I could merge the best available data from all duplicates into the master patient record before marking duplicates inactive. This would ensure no valuable information is lost during deduplication.&lt;/p&gt;
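&lt;p&gt;To make the merging idea concrete, here is a rough sketch under simple assumptions: only fill fields the master is missing, never overwrite existing data, and prefer the most recently updated duplicate when several have a value. The field list, the function name &lt;code&gt;merge_missing_fields&lt;/code&gt;, and the sample patients are illustrative; none of this exists in the current system.&lt;/p&gt;

```python
import copy

# Illustrative choice of Patient fields worth carrying over
MERGE_FIELDS = ("telecom", "address", "birthDate", "gender")


def merge_missing_fields(master, duplicates, fields=MERGE_FIELDS):
    """Return a copy of master with empty fields filled from the duplicates."""
    merged = copy.deepcopy(master)
    # Prefer the most recently updated duplicate when several have a value
    ordered = sorted(
        duplicates,
        key=lambda p: p.get("meta", {}).get("lastUpdated", ""),
        reverse=True,
    )
    for field in fields:
        if merged.get(field):
            continue  # never overwrite data the master already has
        for patient in ordered:
            if patient.get(field):
                merged[field] = copy.deepcopy(patient[field])
                break
    return merged


# Hypothetical patients, for illustration only
master = {"id": "p1", "telecom": [{"system": "phone", "value": "0812-0000-0000"}]}
duplicates = [
    {"id": "p2", "meta": {"lastUpdated": "2024-01-01T00:00:00Z"},
     "address": [{"city": "Bandung"}]},
    {"id": "p3", "meta": {"lastUpdated": "2024-06-01T00:00:00Z"},
     "address": [{"city": "Jakarta"}],
     "telecom": [{"system": "phone", "value": "0813-1111-1111"}]},
]
merged = merge_missing_fields(master, duplicates)
print(merged["address"])
```

&lt;p&gt;Returning a copy instead of mutating the master keeps the original record available for an audit trail or a dry run. A real version would also need a rule for conflicting values, which this sketch sidesteps by leaving the master's data alone.&lt;/p&gt;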

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Building this patient deduplication system taught me that good software engineering isn't just about making things work—it's about making them work well, reliably, and efficiently. It's about thinking ahead, but also about being willing to rework things when your first approach doesn't scale.&lt;/p&gt;

&lt;p&gt;I made mistakes. I spent time on solutions that didn't fit my problem. I built a slow version first when I could have built a fast one. But each of those mistakes taught me something valuable. Now I know to identify the right unique identifier before building a system around it. I know to batch operations whenever possible. I know to design for resilience, not just for the happy path.&lt;/p&gt;

&lt;p&gt;Most importantly, I learned that performance optimization isn't just about making things faster—it's about making them useful. A tool that takes 11 hours to run gets used rarely. A tool that takes 23 minutes gets used regularly. Speed enables usefulness.&lt;/p&gt;

&lt;p&gt;If you're building something similar—whether it's deduplication, data migration, or any kind of batch processing—I hope my journey helps you avoid some of the wrong turns I took. Think about deduplication early. Choose the right unique identifier. Build for resilience. Batch and parallelize when you can. And don't be afraid to throw away your first version if it doesn't scale.&lt;/p&gt;

&lt;p&gt;The code is running in production now, quietly cleaning up duplicate patient records every week. It works. It's fast. And most importantly, it helps make sure that when a healthcare provider looks up a patient's medical history, they see the complete picture. That's what matters.&lt;/p&gt;

</description>
      <category>softwaredevelopment</category>
      <category>architecture</category>
      <category>performance</category>
      <category>learning</category>
    </item>
    <item>
      <title>Relearning Microservices with a Weekend Mini eCommerce Build</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Wed, 24 Sep 2025 03:24:21 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/relearning-microservices-with-a-weekend-mini-ecommerce-build-3dpi</link>
      <guid>https://dev.to/budiwidhiyanto/relearning-microservices-with-a-weekend-mini-ecommerce-build-3dpi</guid>
      <description>&lt;p&gt;One rainy weekend I decided to refresh my microservices skills by building a small eCommerce platform from scratch. I wanted a playground close enough to real work to surface the classic challenges (clear boundaries, stable APIs, reliable deployments) without growing into a long project. This article is my field journal from that sprint: what I built, why I made certain choices, and how the code in this repo supports every decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture at a Glance
&lt;/h2&gt;

&lt;p&gt;Saturday morning started with a blank page and four simple boxes. I knew the weekend would stay calm only if every box owned one clear job and followed the same rules. The result is a Node.js monorepo with four deployable workspaces that live together but stay independent:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User Service handles registration, login, and profile lookups so the rest of the stack never has to guess who is calling.&lt;/li&gt;
&lt;li&gt;Product Service manages the catalog and keeps price data clean.&lt;/li&gt;
&lt;li&gt;Order Service turns carts into history by connecting users and products.&lt;/li&gt;
&lt;li&gt;API Gateway sits on the edge and hides the backend layout from clients.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each service gets its own Postgres database and REST API. To avoid copying the same setup again and again, every service depends on &lt;code&gt;@mini/shared&lt;/code&gt; for logging, HTTP helpers, error classes, and configuration tools. From there the workflow stays simple on purpose: &lt;code&gt;npm run compose:up&lt;/code&gt; brings the stack online with this Compose file driving the topology:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# docker-compose.yml&lt;/span&gt;
&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;user-service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run dev --workspace services/user&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3001:3001"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user-db&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;product-service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run dev --workspace services/product&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3002:3002"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;product-db&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;order-service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run dev --workspace services/order&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;3003:3003"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;order-db&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;user-service&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;product-service&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

  &lt;span class="na"&gt;api-gateway&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;command&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;npm run dev --workspace gateway&lt;/span&gt;
    &lt;span class="na"&gt;ports&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;8080:8080"&lt;/span&gt;
    &lt;span class="na"&gt;depends_on&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;user-service&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;product-service&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;order-service&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;

&lt;span class="na"&gt;volumes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;user-db-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;product-db-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;order-db-data&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The manifests in &lt;code&gt;k8s/&lt;/code&gt; reproduce the same shape inside a Kubernetes cluster when I want to push things a little harder.&lt;/p&gt;

&lt;h2&gt;
  
  
  Shared Platform Capabilities
&lt;/h2&gt;

&lt;p&gt;By midday I noticed the same pattern, service after service. Each one wanted identical Express plumbing, the same error classes, and the same &lt;code&gt;.env&lt;/code&gt; routine. Rather than repeat myself, I moved those cross-cutting pieces into &lt;code&gt;@mini/shared&lt;/code&gt; so the rest of the weekend could focus on business rules instead of setup.&lt;/p&gt;

&lt;p&gt;The shared HTTP helper keeps every edge consistent by centralising the Express setup, wiring in JSON parsing, health checks, and error handling so every service exposes the same behaviour:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// shared/src/http.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createApp&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;routes&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;serviceName is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;express&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;disable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;x-powered-by&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;express&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/healthz&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;uptime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uptime&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;routes&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;routes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NotFoundError&lt;/span&gt;&lt;span class="p"&gt;()));&lt;/span&gt;

  &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nx"&gt;AppError&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;AppError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Internal Server Error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;?.(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;request failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;message&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;app&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Error classes stay in one place, so every service can throw meaningful errors and map domain problems to HTTP status codes without duplicating boilerplate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// shared/src/errors.js&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;AppError&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Validation failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;details&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;validation_error&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;details&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt; &lt;span class="kd"&gt;extends&lt;/span&gt; &lt;span class="nc"&gt;AppError&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;constructor&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Unauthorized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;super&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;401&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;code&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unauthorized&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
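&lt;p&gt;To make the mapping concrete, here is a minimal sketch of how a thrown error becomes an HTTP response. The &lt;code&gt;AppError&lt;/code&gt; base class below is an assumption inferred from how the subclasses call &lt;code&gt;super&lt;/code&gt;; its default &lt;code&gt;code&lt;/code&gt; value and the &lt;code&gt;toHttpResponse&lt;/code&gt; helper are hypothetical and exist only for illustration:&lt;/p&gt;

```javascript
// Assumed shape of the base class: a message plus an options bag.
// The defaults (500 / 'internal_error') are guesses, not taken from the article.
class AppError extends Error {
  constructor(message, { status = 500, code = 'internal_error', details } = {}) {
    super(message);
    this.name = this.constructor.name;
    this.status = status;
    this.code = code;
    this.details = details;
  }
}

// Same subclass pattern as shared/src/errors.js.
class ValidationError extends AppError {
  constructor(message = 'Validation failed', details) {
    super(message, { status: 400, code: 'validation_error', details });
  }
}

// Hypothetical helper that mirrors the shared error handler:
// known errors keep their status, anything else collapses to a generic 500.
function toHttpResponse(err) {
  const error = err instanceof AppError ? err : new AppError('Internal Server Error');
  return {
    status: error.status,
    body: { error: { code: error.code, message: error.message } },
  };
}

const known = toHttpResponse(new ValidationError('slotId is required'));
const unknown = toHttpResponse(new TypeError('cannot read property of undefined'));
console.log(known.status, known.body.error.code);
console.log(unknown.status, unknown.body.error.message);
```

&lt;p&gt;The point of the fallback branch is that unexpected errors never leak their internal message to the caller; only errors that were deliberately thrown as &lt;code&gt;AppError&lt;/code&gt; subclasses expose their text.&lt;/p&gt;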



&lt;p&gt;Configuration loading is just as centralised, which means each service validates its environment variables before it starts and applies optional parsers or defaults in one predictable location:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// shared/src/env.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getConfig&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;required&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;!!&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;required&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fallback&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;fallback&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;fallback&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nf"&gt;fallback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fallback&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;required&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Missing required environment variable &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="nx"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;key&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nf"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;acc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;{});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
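&lt;p&gt;A service might call it roughly like this. The sketch restates &lt;code&gt;getConfig&lt;/code&gt; in condensed form so it runs on its own, and the &lt;code&gt;DEMO_*&lt;/code&gt; variable names are placeholders I made up for the example, not the real configuration keys:&lt;/p&gt;

```javascript
// Condensed restatement of getConfig from shared/src/env.js (same behaviour,
// written as a loop) so this snippet is self-contained.
function getConfig(schema) {
  const isMissing = (value) => value === undefined || value === '';
  const config = {};

  for (const [key, options] of Object.entries(schema)) {
    const { required, default: fallback, parser } = options || {};
    let value = process.env[key];

    // Fall back to the default (or call it, if it is a function).
    if (isMissing(value)) {
      if (fallback !== undefined) {
        value = typeof fallback === 'function' ? fallback() : fallback;
      }
    }
    // Still nothing: fail fast when the variable is required.
    if (isMissing(value)) {
      if (required) {
        throw new Error(`Missing required environment variable ${key}`);
      }
    }

    const shouldParse = typeof parser === 'function' ? value !== undefined : false;
    config[key] = shouldParse ? parser(value) : value;
  }
  return config;
}

// Placeholder environment, set here only so the demo is deterministic.
delete process.env.DEMO_PORT;
delete process.env.DEMO_ADMIN;
process.env.DEMO_JWT_SECRET = 'example-only-secret';

const config = getConfig({
  DEMO_PORT: { default: '3000', parser: Number }, // default is parsed too
  DEMO_JWT_SECRET: { required: true },            // throws if unset or empty
  DEMO_ADMIN: { default: () => 'admin' },         // lazy default
});

console.log(config);
```

&lt;p&gt;Because the parser also runs on the default, &lt;code&gt;DEMO_PORT&lt;/code&gt; comes back as a number whether it was set in the environment or not, so callers never have to convert it again.&lt;/p&gt;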



&lt;p&gt;Lastly, the shared logger stamps every log line with the service name, which makes cross-service debugging feel like reading a conversation instead of a jumble of anonymous messages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// shared/src/logger.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;prefix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;serviceName&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s2"&gt;`[&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;serviceName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;]`&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[app]&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;log&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;warn&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prefix&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;meta&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
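&lt;p&gt;In use, two services logging side by side would look something like the sketch below. The logger is restated so the snippet runs on its own; the service names and messages are placeholders, not taken from the real services:&lt;/p&gt;

```javascript
// Restated from shared/src/logger.js so the snippet is self-contained.
function createLogger(serviceName) {
  const prefix = serviceName ? `[${serviceName}]` : '[app]';
  const base = { info: console.log, error: console.error, warn: console.warn };

  return {
    info: (msg, meta) => base.info(prefix, msg, meta || ''),
    warn: (msg, meta) => base.warn(prefix, msg, meta || ''),
    error: (msg, meta) => base.error(prefix, msg, meta || ''),
  };
}

// Placeholder service names for illustration only.
const bookingLog = createLogger('booking');
const inventoryLog = createLogger('inventory');

bookingLog.info('reservation requested', { slotId: 'demo-slot' });
inventoryLog.warn('stock running low', { remaining: 2 });
bookingLog.error('reservation failed');
```

&lt;p&gt;Every line starts with the bracketed service name, so interleaved output from several containers can still be read in order.&lt;/p&gt;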



&lt;p&gt;After that refactor, each service file felt lighter. The interesting code stayed in front, and new features no longer meant reworking the foundations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Service Deep Dive
&lt;/h2&gt;

&lt;h3&gt;
  
  
  User Service: Reestablishing Identity Basics
&lt;/h3&gt;

&lt;p&gt;The first feature I added was identity. Past projects taught me that most bugs look like security bugs when the caller is unknown, so &lt;code&gt;registerUser&lt;/code&gt; validates the input, rejects duplicate usernames, hashes the password, stores the new user, and issues a JWT in one short flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/user/src/service.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;registerUser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;password&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;username and password are required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;findByUsername&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;username already taken&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;passwordHash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;hashPassword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;password&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createUser&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;passwordHash&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;issueToken&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
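&lt;p&gt;The &lt;code&gt;hashPassword&lt;/code&gt; helper is not shown above, so the following is only one way it could look: a sketch built on Node's built-in &lt;code&gt;scrypt&lt;/code&gt;, with a random salt stored next to the hash. The storage format and the &lt;code&gt;verifyPassword&lt;/code&gt; counterpart are my assumptions; a real service might just as well use bcrypt or argon2:&lt;/p&gt;

```javascript
const { randomBytes, scrypt, timingSafeEqual } = require('node:crypto');
const { promisify } = require('node:util');

const scryptAsync = promisify(scrypt);
const KEY_LENGTH = 64; // bytes of derived key; an arbitrary but common choice

// Returns "saltHex:hashHex" so the salt travels with the hash in one column.
async function hashPassword(password) {
  const salt = randomBytes(16);
  const derived = await scryptAsync(password, salt, KEY_LENGTH);
  return `${salt.toString('hex')}:${derived.toString('hex')}`;
}

// Hypothetical counterpart for login: re-derive with the stored salt and
// compare in constant time.
async function verifyPassword(password, stored) {
  const [saltHex, hashHex] = String(stored).split(':');
  if (!saltHex || !hashHex) {
    return false;
  }
  const expected = Buffer.from(hashHex, 'hex');
  const derived = await scryptAsync(password, Buffer.from(saltHex, 'hex'), expected.length);
  return timingSafeEqual(derived, expected);
}

hashPassword('correct horse battery staple')
  .then((stored) => verifyPassword('correct horse battery staple', stored))
  .then((matches) => console.log('password matches:', matches));
```

&lt;p&gt;Whatever algorithm sits behind it, keeping the helper asynchronous matters: password hashing is deliberately slow, and a synchronous call would block every other request on the same process.&lt;/p&gt;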



&lt;p&gt;Startup logic seeds an admin account from environment variables because I have locked myself out of dashboards before. The database initializer keeps that safety net in place: it creates the table if needed and, when no admin row exists and an admin password is configured, inserts one the moment the service boots:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/user/src/db.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;initDb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;customPool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getPool&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;customPool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`
    CREATE TABLE IF NOT EXISTS users (
      id TEXT PRIMARY KEY,
      username TEXT UNIQUE NOT NULL,
      password_hash TEXT NOT NULL,
      role TEXT NOT NULL DEFAULT 'user'
    );
  `&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;customPool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SELECT id FROM users WHERE username = $1 LIMIT 1&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ADMIN_USERNAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ADMIN_PASSWORD&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;passwordHash&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;hashPassword&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ADMIN_PASSWORD&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;customPool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;INSERT INTO users (id, username, password_hash, role) VALUES ($1, $2, $3, $4)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ADMIN_USERNAME&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;passwordHash&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;admin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Authentication sits in a small middleware that checks Bearer tokens and attaches the decoded data to the request. The cryptography helpers stay in their own module so the rest of the code can trust &lt;code&gt;req.user&lt;/code&gt; without drama, and so future changes to signing logic happen in one place:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/user/src/auth-middleware.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;authRequired&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;header&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;authorization&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[,&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;header&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt; &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Missing bearer token&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;verifyToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;username&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;username&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Invalid token&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/user/src/security.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;issueToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;JWT_SECRET&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;expiresIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;1h&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;verifyToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;JWT_SECRET&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Product Service: Guarding the Catalog
&lt;/h3&gt;

&lt;p&gt;With identity stable, I moved to the catalog. Public routes need to be friendly but safe, so the service rejects non-numeric &lt;code&gt;limit&lt;/code&gt; and &lt;code&gt;offset&lt;/code&gt; values before they reach the query layer, which avoids wasting a database call on malformed input:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/product/src/service.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;fetchProducts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nf"&gt;isNaN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;limit must be numeric&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nf"&gt;isNaN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;offset must be numeric&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;listProducts&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;limit&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;offset&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;offset&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
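
&lt;p&gt;The checks above only reject non-numeric input; a negative or very large &lt;code&gt;limit&lt;/code&gt; still passes through. A minimal sketch of a stricter variant could look like the following. The helper name &lt;code&gt;normalizePagination&lt;/code&gt;, the default of 20, and the cap of 100 are assumptions for illustration, not values from the project:&lt;/p&gt;

```javascript
// Hypothetical helper, not part of the original service. The default and
// the cap below are assumed values chosen only for illustration.
const DEFAULT_LIMIT = 20;
const MAX_LIMIT = 100;

// True when value is an integer that is at least min.
// Number.isInteger already rejects NaN, Infinity, and fractions.
function isIntegerAtLeast(value, min) {
  return Number.isInteger(value) ? value >= min : false;
}

function normalizePagination(query = {}) {
  const limit = query.limit === undefined ? DEFAULT_LIMIT : Number(query.limit);
  const offset = query.offset === undefined ? 0 : Number(query.offset);

  if (!isIntegerAtLeast(limit, 1)) {
    throw new RangeError('limit must be a positive integer');
  }
  if (!isIntegerAtLeast(offset, 0)) {
    throw new RangeError('offset must be a non-negative integer');
  }

  // Clamp instead of rejecting so oversized requests still succeed.
  return { limit: Math.min(limit, MAX_LIMIT), offset };
}
```

&lt;p&gt;Clamping rather than rejecting is a design choice: a client that asks for too many rows still gets a valid page, and the database never sees an unbounded &lt;code&gt;LIMIT&lt;/code&gt;.&lt;/p&gt;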



&lt;p&gt;Admin routes are stricter: the price parser stops invalid or negative numbers before they reach the database, and the admin middleware keeps write actions behind a trusted role so change control stays tight:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/product/src/service.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;parsePrice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isNaN&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;price must be a non-negative number&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createProductRecord&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;price&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;name and description are required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parsedPrice&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;parsePrice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;price&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parsedPrice&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;price is required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;createProduct&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;description&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;price&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;parsedPrice&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/product/src/admin-middleware.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;adminOnly&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;_res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Auth required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;admin&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Admin access required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
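
&lt;p&gt;The middleware above raises the same error type for a missing user and for a non-admin user. If the two cases need different HTTP status codes (401 for "not authenticated", 403 for "authenticated but not allowed"), a variant could look like this sketch. The error classes here, including &lt;code&gt;ForbiddenError&lt;/code&gt; and the &lt;code&gt;status&lt;/code&gt; property, are hypothetical stand-ins; the project may define its own:&lt;/p&gt;

```javascript
// Hypothetical error classes for illustration; the real code base may
// already provide equivalents with a different shape.
class UnauthorizedError extends Error {
  constructor(message) {
    super(message);
    this.status = 401; // caller has not proven who they are
  }
}

class ForbiddenError extends Error {
  constructor(message) {
    super(message);
    this.status = 403; // caller is known but lacks permission
  }
}

function adminOnlyStrict(req, _res, next) {
  if (!req.user) {
    return next(new UnauthorizedError('Auth required'));
  }
  if (req.user.role !== 'admin') {
    return next(new ForbiddenError('Admin access required'));
  }
  return next();
}
```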



&lt;p&gt;Each product receives a UUID when it is created and is stored in Postgres. Generating the ID in the service keeps every product ID unique across environments and migrations, which should make later integrations easier if this prototype grows into something larger.&lt;/p&gt;

&lt;h3&gt;
  
  
  Order Service: Cross-Service Collaboration
&lt;/h3&gt;

&lt;p&gt;Orders were the most satisfying part because they make the services work together and force the boundaries to prove themselves. The service function checks that both &lt;code&gt;userId&lt;/code&gt; and &lt;code&gt;productId&lt;/code&gt; are present and then calls the product service to confirm the product exists before recording the order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/order/src/service.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;recordOrder&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;userId and productId are required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;product&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetchProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;product not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;createOrder&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;randomUUID&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That remote call lives in a small client that normalizes URLs, treats 404s as “not found,” and wraps other errors in a validation message so downstream consumers receive clean, human-readable results:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/order/src/clients/product-client.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;fetchProduct&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;base&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PRODUCT_SERVICE_URL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PRODUCT_SERVICE_URL&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PRODUCT_SERVICE_URL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;base&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/products/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ValidationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;product lookup failed&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
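
&lt;p&gt;The client above has no timeout, so a slow product service can stall order creation, and a network failure surfaces as a raw &lt;code&gt;fetch&lt;/code&gt; rejection. A minimal sketch of one way to bound the call is shown below. The function name, the options object, the default URL, and the 2000 ms budget are all assumptions for illustration; &lt;code&gt;AbortSignal.timeout&lt;/code&gt; is a standard API available in recent Node.js releases:&lt;/p&gt;

```javascript
// Sketch only: the options object and its defaults are assumptions, not
// the signature of the original client.
async function fetchProductWithTimeout(productId, options = {}) {
  const {
    baseUrl = 'http://product-service:3000', // hypothetical address
    timeoutMs = 2000,                         // hypothetical time budget
    fetchImpl = fetch,                        // injectable so tests can stub it
  } = options;

  // Strip trailing slashes and escape the id before building the URL.
  const url = `${baseUrl.replace(/\/+$/, '')}/products/${encodeURIComponent(productId)}`;

  let res;
  try {
    // The signal aborts the request once timeoutMs has elapsed.
    res = await fetchImpl(url, { signal: AbortSignal.timeout(timeoutMs) });
  } catch (cause) {
    // Network failures and timeouts both land here.
    throw new Error('product service unavailable', { cause });
  }

  if (res.status === 404) return null;
  if (!res.ok) throw new Error(`product lookup failed with status ${res.status}`);

  const body = await res.json();
  return body.product;
}
```

&lt;p&gt;Passing &lt;code&gt;fetchImpl&lt;/code&gt; in makes the client testable without a running product service, and keeping the original error as &lt;code&gt;cause&lt;/code&gt; preserves the detail for logs while callers see one stable message.&lt;/p&gt;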



&lt;p&gt;The repository stays lean by saving only foreign keys. If the catalog changes later, the order history still reads well, and the service can rebuild richer views by fetching user and product details when needed, which keeps the storage footprint small and the coupling loose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/order/src/repository.js&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;createOrder&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nx"&gt;pool&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;getPool&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;pool&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;INSERT INTO orders (id, user_id, product_id) VALUES ($1, $2, $3)&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;mapOrder&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;product_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
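
&lt;p&gt;The function above fills &lt;code&gt;created_at&lt;/code&gt; with the application clock rather than the value Postgres stored. If the column has a database default, the insert can return the stored row instead. This is a sketch under that assumption: the &lt;code&gt;DEFAULT now()&lt;/code&gt; column, the function name, and the camelCase result shape are not taken from the original schema or &lt;code&gt;mapOrder&lt;/code&gt;:&lt;/p&gt;

```javascript
// Sketch only: assumes orders.created_at has a column default such as
// DEFAULT now(). That schema detail is not shown in the original.
async function createOrderReturning({ id, userId, productId }, pool) {
  const result = await pool.query(
    `INSERT INTO orders (id, user_id, product_id)
     VALUES ($1, $2, $3)
     RETURNING id, user_id, product_id, created_at`,
    [id, userId, productId],
  );
  const row = result.rows[0];

  // Map snake_case columns to a camelCase object (assumed caller shape).
  return {
    id: row.id,
    userId: row.user_id,
    productId: row.product_id,
    createdAt: row.created_at,
  };
}
```

&lt;p&gt;With &lt;code&gt;RETURNING&lt;/code&gt;, the timestamp in the response matches the row exactly, which matters when several service instances have slightly different clocks.&lt;/p&gt;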



&lt;h2&gt;
  
  
  API Gateway and Service-to-Service Communication
&lt;/h2&gt;

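&lt;p&gt;From the start I wanted one door for clients. The gateway connects everything, and the &lt;code&gt;proxyTo&lt;/code&gt; helper does the heavy lifting: it takes an incoming request, rebuilds the destination URL, drops the incoming &lt;code&gt;host&lt;/code&gt; header, and relays the downstream status and body back to the caller. The response is read fully into memory and returned as JSON when it parses, or as plain text otherwise:&lt;br&gt;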
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// gateway/src/index.js&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;proxyTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;normalizedBase&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;endsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;baseUrl&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;URL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;originalUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;normalizedBase&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/`&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
      &lt;span class="k"&gt;delete&lt;/span&gt; &lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;host&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;init&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;

      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;GET&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HEAD&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;init&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="nx"&gt;init&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;content-type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;

      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;targetUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;init&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
      &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;parsed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;{}&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;parsed&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;_err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The routes mount each downstream service under a clean prefix, which keeps the public API steady even if I move services around inside the cluster and makes documentation easier for anyone consuming the gateway:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// gateway/src/index.js&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;proxyTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;serviceConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userServiceUrl&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/products&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;proxyTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;serviceConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;productServiceUrl&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/orders&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;proxyTo&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;serviceConfig&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;orderServiceUrl&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the system, the order service calls the product service through the same HTTP endpoints. The approach is intentionally simple because it matches what many teams already run. Right now those calls trust the network and do not add extra authentication, so improving that handshake is near the top of my hardening list. When I explore rate limiting or service discovery, the gateway will be the natural place to add them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Configuration, Security, and Secrets Management
&lt;/h2&gt;

&lt;p&gt;One personal rule for the project was simple: avoid “works on my machine” bugs. Every service reads configuration through &lt;code&gt;env.getConfig&lt;/code&gt;, which applies defaults, checks required values, and handles small type conversions before the app even starts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// services/product/src/config.js&lt;/span&gt;
&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loadEnv&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;__dirname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;..&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;.env&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getConfig&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;PORT&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3002&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;parser&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Number&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;postgres://product_service:password@localhost:5434/product_db&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;JWT_SECRET&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;default&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;devsecret&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When the stack runs in Kubernetes, the JWT secret comes from a Kubernetes Secret instead of shipping inside the image, so it can be rotated without rebuilding the image (pods pick up the new value on their next restart):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# k8s/secret.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Secret&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;jwt-secret&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mini-ecommerce&lt;/span&gt;
&lt;span class="na"&gt;stringData&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;devsecret&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The user service issues tokens with that secret, the other services verify them locally, and role checks—like the admin filter in the product service—use the decoded payload to make decisions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Local Development Workflow
&lt;/h2&gt;

&lt;p&gt;Weekend hacking works only if the feedback loop stays short, so Docker Compose became the main control room:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Install dependencies once with &lt;code&gt;npm install&lt;/code&gt; so every workspace shares the same &lt;code&gt;node_modules&lt;/code&gt; tree.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;npm run compose:up&lt;/code&gt; to launch the three services, the gateway, and their Postgres companions (using the compose file shown above) and let Docker wire the local network for you.&lt;/li&gt;
&lt;li&gt;Send every request through &lt;code&gt;http://localhost:8080&lt;/code&gt; so the gateway path stays well traveled and the API surface mirrors production traffic.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Right now the services run with plain &lt;code&gt;node&lt;/code&gt; processes, so I still restart them by hand when code changes. Hot reloaders are on the to-do list, but even without them the shared package keeps logs and errors consistent. Docker volumes remember the seeded catalog and test users between runs, so I can experiment, restart, and keep moving without rebuilding the database every time.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploying to Kubernetes
&lt;/h2&gt;

&lt;p&gt;By Sunday afternoon, curiosity won. I wanted to watch the system run inside a cluster, so the manifests in &lt;code&gt;k8s/&lt;/code&gt; mirror the Compose layout almost line for line.&lt;/p&gt;

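&lt;p&gt;The user service deployment is representative of the pattern (trimmed to the interesting parts; the &lt;code&gt;selector&lt;/code&gt; and pod labels that a Deployment requires are omitted):&lt;br&gt;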
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# k8s/user-service.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user-service&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;replicas&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;user-service&lt;/span&gt;
          &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mini-ecommerce-user:latest&lt;/span&gt;
          &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;JWT_SECRET&lt;/span&gt;
              &lt;span class="na"&gt;valueFrom&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;secretKeyRef&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;jwt-secret&lt;/span&gt;
                  &lt;span class="na"&gt;key&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;value&lt;/span&gt;
          &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/healthz&lt;/span&gt;
              &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3001&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



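&lt;p&gt;The gateway pairs a deployment with an ingress so there is one public entry point. The ingress routes to an &lt;code&gt;api-gateway&lt;/code&gt; Service on port 80, which this excerpt does not show:&lt;br&gt;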
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# k8s/gateway.yaml&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;apps/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Deployment&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-gateway&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;template&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-gateway&lt;/span&gt;
          &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;mini-ecommerce-gateway:latest&lt;/span&gt;
          &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
            &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;USER_SERVICE_URL&lt;/span&gt;
              &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;http://user-service&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;networking.k8s.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Ingress&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-gateway&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;rules&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;http&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;paths&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/&lt;/span&gt;
            &lt;span class="na"&gt;pathType&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Prefix&lt;/span&gt;
            &lt;span class="na"&gt;backend&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
              &lt;span class="na"&gt;service&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api-gateway&lt;/span&gt;
                &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
                  &lt;span class="na"&gt;number&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;80&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Dedicated Postgres deployments keep data siloed per service, honoring the “database per service” mantra without any shared state leaks.&lt;/p&gt;

&lt;p&gt;With images tagged—think &lt;code&gt;mini-ecommerce-user:latest&lt;/code&gt;—a &lt;code&gt;kubectl apply -f k8s/&lt;/code&gt; sets up the same architecture I run locally. Rolling updates and restarts behave the way I expect, which makes this repo a comfortable sandbox for practicing cluster operations. Secrets ship with &lt;code&gt;kubectl apply -f k8s/secret.yaml&lt;/code&gt;, and the workload manifests read them as environment variables; config maps follow the same pattern for plain settings.&lt;/p&gt;

&lt;h2&gt;
  
  
  Observability, Testing, and Next Experiments
&lt;/h2&gt;

&lt;p&gt;I kept observability light but friendly. The logger shown earlier prefixes every line with a service name, so one &lt;code&gt;tail -f&lt;/code&gt; gives a clear picture of who is talking. Tests live next to the code inside each service’s &lt;code&gt;__tests__&lt;/code&gt; folder; they mix unit checks with small integration cases so I can change a function and still trust the boundaries, and they double as documentation because they show how the modules are meant to collaborate.&lt;/p&gt;

&lt;p&gt;There is still plenty to explore. A message broker for order events, circuit breakers inside the product client, and rate limiting at the gateway are already on the list. The current setup leaves room for those ideas without tearing up the base.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Relearned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Clear domain boundaries keep ownership simple and give every rule a home.&lt;/li&gt;
&lt;li&gt;A small shared toolkit (&lt;code&gt;@mini/shared&lt;/code&gt;) stops the team—future me included—from rebuilding the same helpers.&lt;/li&gt;
&lt;li&gt;The API gateway protects client URLs while backend services evolve in private.&lt;/li&gt;
&lt;li&gt;Matching the local Compose setup inside Kubernetes lowers the stress when promoting changes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The weekend build reminded me that microservices are less about counting repositories and more about choosing clear boundaries. Steady ownership, honest contracts, and repeatable operations beat shiny patterns every time. Now that this mini eCommerce system lives in the toolbox, I can reopen the code and the lessons whenever I need a quick refresher.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>microservices</category>
      <category>node</category>
      <category>learning</category>
    </item>
    <item>
      <title>Scaling Healthcare Data Processing: Multi-Environment FHIR Patient Updates with Smart Batch Processing</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Tue, 23 Sep 2025 05:36:58 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/scaling-healthcare-data-processing-multi-environment-fhir-patient-updates-with-smart-batch-b3f</link>
      <guid>https://dev.to/budiwidhiyanto/scaling-healthcare-data-processing-multi-environment-fhir-patient-updates-with-smart-batch-b3f</guid>
      <description>&lt;p&gt;The request sounded simple: &lt;em&gt;“Can we keep patient phone numbers up to date?”&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;At first we thought it was a quick operations chore. Then we traced the real data flow and saw the mess underneath. Phone numbers rolled in from WhatsApp, hospital front desks, and survey tools, each with its own format. Patients jumped between facilities, so their trails were often broken. The operations team lived in Google Sheets, and every region guarded its own FHIR server with different credentials, limits, and quirks.&lt;/p&gt;

&lt;p&gt;Our first fix was a tiny script that looped through one row at a time. On a test file it worked fine, but once we aimed it at 10,000 rows the run dragged on for hours, chewed through hundreds of megabytes of memory, and could crash if a single record looked wrong.&lt;/p&gt;

&lt;p&gt;This article is the story of how that fragile script became a production-ready workflow. The same 10,000-row load now finishes in about 10–12 minutes per region, using only 256 MiB of memory and 0.5 vCPU. More importantly, it stays steady, survives bad data, and operations teams are happy to run it every day.&lt;/p&gt;




&lt;h3&gt;
  
  
  From One Region to a Platform
&lt;/h3&gt;

&lt;p&gt;The moment a second region joined the queue, the to-do list grew fast. We suddenly needed to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Serve multiple regions, such as Purbalingga and Lombok Barat, at the same time without stepping on each other.&lt;/li&gt;
&lt;li&gt;Keep every environment on its own FHIR endpoint, spreadsheet, and credential set, with no mixing, ever.&lt;/li&gt;
&lt;li&gt;Give operators live feedback with status updates, readable logs, and a safe way to restart when things went wrong.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;At that point the one-off script had nowhere to grow. We needed a real architecture that could respect those boundaries.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Challenge: Scale, Isolation, and Real-World Limits
&lt;/h3&gt;

&lt;p&gt;Running two regions side by side exposed the real limits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tens of thousands of records every day.&lt;/li&gt;
&lt;li&gt;Updates that had to finish in under 15 minutes.&lt;/li&gt;
&lt;li&gt;Tight resource limits (very small memory and CPU).&lt;/li&gt;
&lt;li&gt;Zero tolerance for mixing environments.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We also needed more resilience: a single broken record could not freeze the run, and the FHIR servers deserved a gentle pace so they never tipped into overload.&lt;/p&gt;
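&lt;p&gt;To make the 15-minute target concrete, here is a rough back-of-the-envelope calculation. It assumes a 10,000-row run like the one described earlier and strictly sequential processing; the helper name &lt;code&gt;per_record_budget_ms&lt;/code&gt; is only illustrative and not part of the project.&lt;/p&gt;

```python
def per_record_budget_ms(record_count: int, window_minutes: float) -> float:
    """Average time available per record if the whole run must fit the window."""
    if not record_count > 0:
        raise ValueError("record_count must be positive")
    return window_minutes * 60 * 1000 / record_count


# 10,000 rows inside a 15-minute window
print(per_record_budget_ms(10_000, 15))  # 90.0
```

&lt;p&gt;Roughly 90 ms per record has to cover both the patient lookup and the phone update, so any per-row overhead adds up quickly.&lt;/p&gt;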

&lt;h4&gt;
  
  
  Why the Simple Loop Fails
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Inefficient sequential approach
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;all_records&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;patient&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;find_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="nf"&gt;update_patient_phone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;phone_number&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This loop looked fine in early tests. In production it fell apart: every row opened new network calls, memory crept upward, progress stayed invisible, and one exception could stop the whole job. The simplicity that made it quick to write is exactly what hurt us later.&lt;/p&gt;




&lt;h3&gt;
  
  
  The Solution: Smart Batching Across Multiple Environments
&lt;/h3&gt;

&lt;p&gt;The turning point came when we stopped thinking about “a script that updates phones” and started thinking about “a pipeline that needs to stay healthy.” Stability, visibility, and consistency became the main goals.&lt;/p&gt;

&lt;p&gt;Once we named those needs, the design almost wrote itself: batch the work, reuse connections, pace the requests, and send every result back into the spreadsheets everyone already trusted. On top of that, make sure each batch leaves a clear log trail so operators can watch the system move.&lt;/p&gt;
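&lt;p&gt;As one way to picture "pace the requests", here is a minimal sketch of a pacer that enforces a minimum gap between calls. The &lt;code&gt;Pacer&lt;/code&gt; class, its parameters, and the 50 ms interval are assumptions made for illustration, not the project's actual code; the clock and sleep functions are injectable so the behavior can be checked without real waiting.&lt;/p&gt;

```python
import time
from typing import Callable


class Pacer:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(self, min_interval: float,
                 clock: Callable[[], float] = time.monotonic,
                 sleep: Callable[[float], None] = time.sleep) -> None:
        self.min_interval = min_interval
        self._clock = clock
        self._sleep = sleep
        self._last_call = None  # no request has been sent yet

    def wait(self) -> float:
        """Block until the next request is allowed; return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last_call is not None:
            remaining = self.min_interval - (now - self._last_call)
            if remaining > 0:
                delay = remaining
                self._sleep(delay)
        # Record when this request is actually released.
        self._last_call = now + delay
        return delay


pacer = Pacer(min_interval=0.05)  # hypothetical pace: about 20 requests per second
for record in ["row-1", "row-2", "row-3"]:
    pacer.wait()
    # the FHIR request for `record` would be sent here
```

&lt;p&gt;Calling &lt;code&gt;wait()&lt;/code&gt; before each request keeps the pace steady no matter how fast the loop itself runs, and a slow response naturally shortens the next pause.&lt;/p&gt;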




&lt;h3&gt;
  
  
  Architecture Overview
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│  Google Sheets   │    │   Flask Web      │    │    FHIR Server  │
│   (per region)   │───▶│   Application    │───▶│    (per region) │
│                  │    │                  │    │                 │
│ • Regional rows  │    │ • Batch engine   │    │ • Patient query │
│ • Status column  │    │ • Memory hygiene │    │ • Phone update  │
│ • Daily feeds    │    │ • Safe retries   │    │ • Rate limiting │
└──────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │
         └───────────────┬───────┘
                         │
                ┌────────────────────┐
                │  Config &amp;amp; Secrets  │
                │  (per environment) │
                │  JSON + env vars   │
                └────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice, each region runs end to end on its own. Operators manage rows in a regional Google Sheet, the Flask app reads that sheet and processes batches, and every update goes to the matching FHIR server. The config and secrets layer supplies the right credentials and URLs per run, so requests stay isolated and nothing leaks across environments.&lt;/p&gt;
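&lt;p&gt;The isolation described above can be sketched as a loader that builds one immutable config object per run. This is a hedged illustration only: the JSON layout, the &lt;code&gt;FHIR_TOKEN_*&lt;/code&gt; variable naming, the &lt;code&gt;example.org&lt;/code&gt; URLs, and the names &lt;code&gt;RegionConfig&lt;/code&gt; and &lt;code&gt;load_region_config&lt;/code&gt; are all assumptions, not the project's real settings.&lt;/p&gt;

```python
import json
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class RegionConfig:
    """Everything one run needs; frozen so it cannot be changed mid-run."""
    region: str
    fhir_base_url: str
    sheet_id: str
    fhir_token: str


def load_region_config(region: str, raw_json: str,
                       environ: Mapping[str, str] = os.environ) -> RegionConfig:
    """Build the config for exactly one region from JSON settings plus an env secret."""
    settings = json.loads(raw_json)
    if region not in settings:
        raise KeyError(f"unknown region: {region}")
    token_var = f"FHIR_TOKEN_{region.upper()}"  # hypothetical naming scheme
    token = environ.get(token_var)
    if not token:
        # Fail fast instead of falling back to another region's credentials.
        raise RuntimeError(f"missing secret: {token_var}")
    entry = settings[region]
    return RegionConfig(region, entry["fhir_base_url"], entry["sheet_id"], token)


# Made-up settings and token, for illustration only.
SETTINGS = json.dumps({
    "purbalingga": {"fhir_base_url": "https://fhir.example.org/purbalingga", "sheet_id": "sheet-pbg"},
    "lombok_barat": {"fhir_base_url": "https://fhir.example.org/lombok-barat", "sheet_id": "sheet-lbr"},
})
config = load_region_config("purbalingga", SETTINGS, {"FHIR_TOKEN_PURBALINGGA": "example-token"})
print(config.fhir_base_url)  # https://fhir.example.org/purbalingga
```

&lt;p&gt;Because the loader refuses to start without the region's own secret and returns a frozen object, a run can only ever talk to the endpoint and sheet it was started for.&lt;/p&gt;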




&lt;h3&gt;
  
  
  The Batch Engine (Built for Production)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;process_records&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;start_time&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total_records&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;successful_updates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;failed_updates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;patients_not_found&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;processing_time_minutes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;batches_processed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setup_session&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;batch_size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Starting batch processing with batch size: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="n"&gt;batch_size&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="n"&gt;batch_num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;total_batches&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;//&lt;/span&gt; &lt;span class="n"&gt;batch_size&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=== Processing batch &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;batch_num&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;total_batches&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; records) ===&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;row_index&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;row_index&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="n"&gt;identifier&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="n"&gt;phone&lt;/span&gt;       &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;phone_number&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
                    &lt;span class="n"&gt;record_num&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;record_num&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;=== PROGRESS: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record_num&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; processed ===&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="n"&gt;patients&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;find_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fhir_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;patients&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;patients_not_found&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                        &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Patient &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; not found&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="nf"&gt;update_worksheet_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;worksheet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="k"&gt;continue&lt;/span&gt;

                    &lt;span class="n"&gt;all_success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
                    &lt;span class="n"&gt;failed_msgs&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                    &lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

                    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;patients&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;update_patient_phone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                            &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fhir_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;
                        &lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;success_count&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                        &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                            &lt;span class="n"&gt;all_success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
                            &lt;span class="n"&gt;failed_msgs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Patient &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;patient&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;all_success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;successful_updates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;success_count&lt;/span&gt;
                        &lt;span class="nf"&gt;update_worksheet_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;worksheet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
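                        &lt;span class="c1"&gt;# Count the patients that were updated even though the row failed overall
&lt;/span&gt;                        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;successful_updates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="n"&gt;success_count&lt;/span&gt;
                        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;failed_updates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;success_count&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;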
                        &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;failed_msgs&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patients&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="nf"&gt;update_worksheet_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;worksheet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                    &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# be kind to downstreams
&lt;/span&gt;
                &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;failed_updates&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
                    &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Error processing &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;unknown&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;errors&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;❌ &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;continue&lt;/span&gt;

            &lt;span class="c1"&gt;# Keep memory flat
&lt;/span&gt;            &lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;batches_processed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;processing_time_minutes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;time&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start_time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Processing completed in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;processing_time_minutes&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; minutes&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;error&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;message&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;Processing failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Fatal error&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The batch engine became the system’s heart. Inside the loop you can see how it slices the sheet into blocks of 100 rows, logs the batch number, and keeps track of every success or failure. The short pause after each record and each batch slows the request rate so the FHIR servers never get hammered, and dropping each batch once it is done keeps memory flat. Even when a row misbehaves, the error handler records it and the loop keeps going, which means operators still see steady progress instead of a half-finished run.&lt;/p&gt;
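
&lt;p&gt;To make the slicing arithmetic concrete, here is a minimal sketch of the same idea as a standalone generator. It assumes the outer loop advances in steps of &lt;code&gt;batch_size&lt;/code&gt;; the name &lt;code&gt;iter_batches&lt;/code&gt; and the fake rows are made up for illustration and are not part of the original script.&lt;/p&gt;

```python
def iter_batches(records, batch_size=100):
    """Yield (batch_num, total_batches, batch) for consecutive slices of records."""
    # Ceiling division, the same formula the loop logs: (n - 1) // size + 1
    total_batches = (len(records) - 1) // batch_size + 1
    for start in range(0, len(records), batch_size):
        batch = records[start:start + batch_size]
        yield start // batch_size + 1, total_batches, batch


# Hypothetical input: 250 fake rows standing in for the sheet
rows = [{"row_index": n + 2, "identifier": f"ID-{n}"} for n in range(250)]
for batch_num, total_batches, batch in iter_batches(rows):
    print(f"batch {batch_num}/{total_batches}: {len(batch)} records")
```

&lt;p&gt;With 250 rows this gives two full batches and a final batch of 50, which is the shape the batch log lines describe.&lt;/p&gt;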




&lt;h3&gt;
  
  
  Practices That Made the System Work
&lt;/h3&gt;

&lt;h4&gt;
  
  
  1) Batch Size That Fits Reality
&lt;/h4&gt;

&lt;p&gt;When we tried tiny batches, the system spent more time setting up than doing real work. When we went too big, the process grabbed extra memory and slowed everything down. After a few trial runs, 100 records felt balanced: quick to process, light on resources, and easy to monitor in the logs and in the sheet.&lt;/p&gt;




&lt;h4&gt;
  
  
  2) Connection Pooling and Safe Retries
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;setup_session&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;retry&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;total&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;backoff_factor&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;status_forcelist&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;429&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;502&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;504&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;adapter&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;HTTPAdapter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;retry&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;adapter&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Pooling the HTTP session kept us from opening a fresh connection for every row, which trimmed latency and CPU spikes. The retry helper then waited a little longer after each failure, so short network hiccups cleared on their own instead of breaking the run. With those two pieces in place, the pipeline finished sooner and recovered smoothly from the usual internet noise.&lt;/p&gt;
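
&lt;p&gt;The "waited a little longer after each failure" behaviour is exponential backoff. The sketch below only illustrates the shape of such a schedule for &lt;code&gt;total=3&lt;/code&gt; and &lt;code&gt;backoff_factor=1&lt;/code&gt;; the helper &lt;code&gt;backoff_delays&lt;/code&gt; is hypothetical, and the exact waits that urllib3's &lt;code&gt;Retry&lt;/code&gt; applies depend on the installed version, so treat the numbers as an approximation rather than the library's schedule.&lt;/p&gt;

```python
def backoff_delays(total=3, backoff_factor=1.0, max_delay=120.0):
    """Return an illustrative exponential wait (in seconds) before each retry."""
    # Each retry doubles the previous wait, capped at max_delay
    return [min(max_delay, backoff_factor * 2 ** attempt) for attempt in range(total)]


print(backoff_delays())  # [1.0, 2.0, 4.0]
```

&lt;p&gt;One thing worth checking in a setup like this: &lt;code&gt;Retry&lt;/code&gt; only retries the HTTP methods listed in its &lt;code&gt;allowed_methods&lt;/code&gt; set, and the default set leaves out POST and PATCH, so whether a failed phone update is retried depends on the verb &lt;code&gt;update_patient_phone&lt;/code&gt; uses.&lt;/p&gt;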




&lt;h4&gt;
  
  
  3) Explicit Memory Hygiene
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# keep memory flat
&lt;/span&gt;&lt;span class="k"&gt;del&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;
&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



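&lt;p&gt;Deleting each batch as soon as it is finished drops the reference to that slice before the next one is built, so memory stayed flat across the run; the brief pause that follows is there for pacing. No creeping leaks, no surprises during long runs.&lt;/p&gt;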




&lt;h4&gt;
  
  
  4) Pacing to Protect Servers
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# ~10 ops/sec
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A short delay between requests prevented FHIR servers from being overwhelmed. Paradoxically, slowing down slightly made the whole system finish faster, because retries and throttling were reduced.&lt;/p&gt;
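
&lt;p&gt;A fixed &lt;code&gt;time.sleep(0.1)&lt;/code&gt; adds the pause on top of however long the request itself took, so the real rate ends up somewhat below 10 operations per second. If you want to hold a target interval instead, one option is a small pacer that sleeps only for the time still remaining. This is a sketch, not what the original script does; the &lt;code&gt;Pacer&lt;/code&gt; class and its parameters are hypothetical.&lt;/p&gt;

```python
import time


class Pacer:
    """Enforce a minimum gap between consecutive calls to wait()."""

    def __init__(self, min_interval=0.1, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = min_interval
        self._clock = clock      # injectable so the pacer can be tested without real time
        self._sleep = sleep
        self._last = None        # time of the previous call, None before the first one

    def wait(self):
        """Sleep only for the part of the interval that has not already elapsed."""
        if self._last is not None:
            elapsed = self._clock() - self._last
            remaining = max(0.0, self.min_interval - elapsed)
            if remaining:
                self._sleep(remaining)
        self._last = self._clock()


pacer = Pacer(min_interval=0.1)
for _ in range(3):
    pacer.wait()  # a real run would issue one FHIR request after each wait
```

&lt;p&gt;The first call returns immediately; later calls wait only if the previous request finished in less than the interval.&lt;/p&gt;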




&lt;h3&gt;
  
  
  Features That Build Trust
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Real-Time Status in Sheets
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;update_worksheet_status&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;worksheet&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;error_message&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Update the status_update_phone_number column (Column G)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;status_column&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;  &lt;span class="c1"&gt;# Column G (1-indexed)
&lt;/span&gt;        &lt;span class="n"&gt;value&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;success&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="nf"&gt;else &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;error_message&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error_message&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;failed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;worksheet&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update_cell&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status_column&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Row &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; (G): &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;value&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Failed to update status for row &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row_index&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Operators never asked for a new dashboard; they just wanted their sheet to tell them what happened. This helper does exactly that, dropping “success” or “failed” (with the reason) straight into Column G so they can watch the run move row by row.&lt;/p&gt;

&lt;h4&gt;
  
  
  Structured Batch Logs and Alerts
&lt;/h4&gt;

&lt;p&gt;Every batch writes a compact log entry with the batch number, record count, and any failures. Those logs land in Cloud Logging, and a small alerting rule pings the on-call channel when a batch reports failures. If a row fails, the operator spots it in the sheet and can jump straight to the matching log line because the correlation ID is right there in the message.&lt;/p&gt;
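
&lt;p&gt;As a rough sketch of what such a batch entry could look like, the helper below emits one JSON line per batch. The field names, the &lt;code&gt;log_batch&lt;/code&gt; function, and the correlation ID format are assumptions made for illustration; the original log format is not shown in this article. It also assumes the job runs somewhere that ingests JSON lines written to standard output.&lt;/p&gt;

```python
import json


def log_batch(run_id, batch_num, total_batches, record_count, failures):
    """Print one structured log line for a finished batch and return it."""
    entry = {
        "severity": "ERROR" if failures else "INFO",
        "message": f"batch {batch_num}/{total_batches} done",
        # Hypothetical correlation ID: run identifier plus batch number
        "correlation_id": f"{run_id}-b{batch_num}",
        "record_count": record_count,
        "failed": len(failures),
        "failures": failures[:10],  # cap the detail so one bad batch cannot flood the log
    }
    line = json.dumps(entry)
    print(line)
    return line


log_batch("run42", 3, 10, 100, ["Patient ID-7 not found"])
```

&lt;p&gt;Writing the same correlation ID into the sheet's status cell would be one way to let an operator search the logs for the exact batch.&lt;/p&gt;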




&lt;h4&gt;
  
  
  Handle Multiple Patients per Identifier
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;find_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fhir_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;fhir_url&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/Patient&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;identifier&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;total&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;resource&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;entry&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Some identifiers pointed to more than one Patient record. Rather than pretend the duplicates did not exist, the system updates each match so that all copies stay aligned, even when the source data is messy.&lt;/p&gt;
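
&lt;p&gt;One caveat with &lt;code&gt;find_patient&lt;/code&gt;: FHIR makes &lt;code&gt;Bundle.total&lt;/code&gt; optional, so a server may return matches without it, and a search Bundle can also carry non-Patient entries such as an OperationOutcome. A slightly more defensive variant reads the &lt;code&gt;entry&lt;/code&gt; list directly. This is a sketch under those assumptions; &lt;code&gt;patients_from_bundle&lt;/code&gt; is a hypothetical helper, not part of the original code.&lt;/p&gt;

```python
def patients_from_bundle(bundle):
    """Return the Patient resources in a FHIR search Bundle without relying on `total`."""
    return [
        entry["resource"]
        for entry in bundle.get("entry", [])
        if entry.get("resource", {}).get("resourceType") == "Patient"
    ]


# Hypothetical search result: two duplicate patients plus a server notice
sample = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "a"}},
        {"resource": {"resourceType": "Patient", "id": "b"}},
        {"resource": {"resourceType": "OperationOutcome"}},
    ],
}
print([p["id"] for p in patients_from_bundle(sample)])  # ['a', 'b']
```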




&lt;h4&gt;
  
  
  Phone Number Cleaning
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;clean_phone_number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Sheets can turn big numbers into scientific notation
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;e+&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;e-&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;lower&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;ValueError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Could not parse scientific notation: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# If multiple numbers, take the first
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;,&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c1"&gt;# Keep only digits and plus
&lt;/span&gt;    &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;r&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[^\d+]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Indonesian heuristic: restore missing leading zero
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;+&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startswith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;8&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;11&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
        &lt;span class="n"&gt;phone&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;0&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;
        &lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Added missing leading zero: &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;'"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Basic sanity check
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;phone&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;+&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;''&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;phone&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Phone numbers are notoriously messy. Sheets loves to turn big numbers into scientific notation, people paste in two numbers separated by commas, and in Indonesia a missing leading zero can point to the wrong person. The cleaner walks through each of those cases so the final value is something we can safely send to FHIR.&lt;/p&gt;
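&lt;p&gt;To make the scientific-notation case concrete, here is a small worked example that repeats just two steps of the cleaner on a made-up value. One caveat worth stating: this only recovers the number if the spreadsheet kept all the digits; digits that were already rounded away cannot be restored.&lt;/p&gt;

```python
# A made-up phone number as it might arrive after a spreadsheet reformats it.
raw = "8.12345678901E+11"

# Same conversion the cleaner uses: parse as float, print with no decimals.
digits = f"{float(raw):.0f}"

# Same heuristic the cleaner uses to restore the dropped leading zero.
if digits.startswith("8") and len(digits) in (11, 12):
    digits = "0" + digits

print(digits)  # 0812345678901
```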




&lt;h3&gt;
  
  
  Multi-Environment Deployment: One Region = One Tenant
&lt;/h3&gt;

&lt;h4&gt;
  
  
  JSON Config as Source of Truth
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"environment-name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"project_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-gcp-project-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"service_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-cloud-run-service-name"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"asia-southeast2"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"platform"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"managed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"memory"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"256Mi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cpu"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0.5"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timeout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"900"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"max_instances"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"min_instances"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"0"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"concurrency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"credential_file"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"service-account-credentials.json"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"env_vars"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"FHIR_SERVER_URL"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://your-fhir-server.com/fhir"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"FHIR_API_TOKEN"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-api-token"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"SPREADSHEET_ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-google-sheets-id"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"WORKSHEET_ID"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"your-worksheet-name"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each environment (Purbalingga, Lombok Barat, and friends) gets its own JSON file. The application code stays the same, while the config file names the project, credentials, and spreadsheet for that region. That simple split keeps the runs isolated, makes audits easy, and lets us roll back a region without touching the others.&lt;/p&gt;
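&lt;p&gt;As an illustration of how one such file could drive a deployment, the sketch below turns a config into a &lt;code&gt;gcloud run deploy&lt;/code&gt; argument list. This is an assumption about the tooling, not the actual deployment script: &lt;code&gt;build_deploy_command&lt;/code&gt;, the inline config values, and the &lt;code&gt;--source .&lt;/code&gt; choice are all hypothetical. The command is only built and printed, never executed.&lt;/p&gt;

```python
import json


def build_deploy_command(config):
    """Turn one environment config into a gcloud argument list."""
    # Note: this simple join assumes no env var value contains a comma.
    env_vars = ",".join(f"{k}={v}" for k, v in config["env_vars"].items())
    return [
        "gcloud", "run", "deploy", config["service_name"],
        "--source", ".",  # assumption: deploy from the current directory
        "--project", config["project_id"],
        "--region", config["region"],
        "--platform", config["platform"],
        "--memory", config["memory"],
        "--cpu", config["cpu"],
        "--timeout", config["timeout"],
        "--max-instances", config["max_instances"],
        "--min-instances", config["min_instances"],
        "--concurrency", config["concurrency"],
        "--set-env-vars", env_vars,
    ]


# Inline stand-in for one per-region JSON file (placeholder values only).
CONFIG_TEXT = """
{
  "service_name": "demo-service",
  "project_id": "demo-project",
  "region": "asia-southeast2",
  "platform": "managed",
  "memory": "256Mi",
  "cpu": "0.5",
  "timeout": "900",
  "max_instances": "1",
  "min_instances": "0",
  "concurrency": "1",
  "env_vars": {
    "FHIR_SERVER_URL": "https://example.invalid/fhir",
    "SPREADSHEET_ID": "sheet-123"
  }
}
"""

config = json.loads(CONFIG_TEXT)
command = build_deploy_command(config)
print(" ".join(command))
```

&lt;p&gt;Because the function only reads the config, switching regions means pointing it at a different file; nothing in the code changes.&lt;/p&gt;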




&lt;h4&gt;
  
  
  Cloud Run Profile Per Environment
&lt;/h4&gt;

&lt;p&gt;Each region deploys to its own Cloud Run service with a lean profile: 256Mi memory, 0.5 vCPU, and a single instance. It keeps costs low, keeps performance predictable, and matches the steady pace we designed for.&lt;/p&gt;

&lt;h3&gt;
  
  
  Lessons Learned
&lt;/h3&gt;

&lt;p&gt;When we stepped back after the first few successful runs, a handful of habits stood out.&lt;/p&gt;

&lt;p&gt;Regional boundaries matter. Keeping configs and credentials per region meant every incident stayed where it started. If Lombok Barat hit a problem, Purbalingga kept running without even noticing.&lt;/p&gt;

&lt;p&gt;100-record batches are the sweet spot. That size is big enough to move quickly but small enough to avoid memory spikes. It also lines up nicely with the logging we added, so operators can read progress in plain language.&lt;/p&gt;
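&lt;p&gt;The batching itself needs very little code. A minimal sketch, assuming the rows are already loaded into a list; &lt;code&gt;batches&lt;/code&gt; is a hypothetical helper name and the integers stand in for spreadsheet rows:&lt;/p&gt;

```python
def batches(rows, size=100):
    """Yield consecutive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


rows = list(range(250))  # stand-in for spreadsheet rows

# One plain-language progress line per batch.
for number, batch in enumerate(batches(rows), start=1):
    print(f"Batch {number}: processing {len(batch)} rows")

sizes = [len(b) for b in batches(rows)]
print(sizes)  # [100, 100, 50]
```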

&lt;p&gt;The spreadsheet is still the source of truth. By writing results directly into Column G, we gave operators instant trust. They did not need to learn a new tool; their everyday sheet became the dashboard.&lt;/p&gt;

&lt;p&gt;Polite clients make for calm servers. Gentle pacing and retries with backoff handled the usual internet noise. Instead of chasing flaky errors, we saw quiet logs and smooth throughput.&lt;/p&gt;
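&lt;p&gt;Retries with backoff can be sketched in a few lines. The helper name, the retry count, and the delays below are illustrative assumptions rather than the values used in production, and the sleep function is injectable so the demo finishes instantly:&lt;/p&gt;

```python
import time


def call_with_backoff(operation, retries=3, base_delay=1.0, sleep=time.sleep):
    """Run operation(); on failure wait base_delay * 2**attempt, then retry."""
    for attempt in range(retries + 1):
        try:
            return operation()
        except Exception:
            # A real client would catch only transient errors (timeouts, 429, 5xx).
            if attempt == retries:
                raise  # out of retries: surface the last error
            sleep(base_delay * 2 ** attempt)


# Demo: an operation that fails twice, then succeeds on the third call.
attempts = {"count": 0}


def flaky():
    attempts["count"] += 1
    if attempts["count"] != 3:
        raise ConnectionError("temporary network error")
    return "ok"


waits = []  # record the requested delays instead of actually sleeping
result = call_with_backoff(flaky, sleep=waits.append)
print(result, waits)  # ok [1.0, 2.0]
```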

&lt;p&gt;Clean data upfront saves pain later. Fixing phone numbers at the edge kept downstream systems clean. Once we did that, support tickets about wrong contacts dropped sharply.&lt;/p&gt;




&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;The journey took us from a fragile script to a workflow the team can trust. We didn’t introduce exotic technology; we simply leaned on good habits: clear boundaries, careful batching, shared connections, steady pacing, and honest visibility.&lt;/p&gt;

&lt;p&gt;Today the 10,000-row jobs finish in about 12 minutes per environment. Memory stays flat. Operators watch the spreadsheet fill with results while alerts stay quiet.&lt;/p&gt;

&lt;p&gt;For us, that’s what healthcare data scaling looks like: not only faster runs, but calmer shifts, clearer feedback, and an architecture that can keep growing with the organization.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>softwareengineering</category>
      <category>performance</category>
      <category>healthinformatics</category>
    </item>
    <item>
      <title>Enhancing FHIR EpisodeOfCare Resources: Improving Interoperability and Performance</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Wed, 03 Sep 2025 11:19:03 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/enhancing-fhir-episodeofcare-resources-improving-interoperability-and-performance-4cnd</link>
      <guid>https://dev.to/budiwidhiyanto/enhancing-fhir-episodeofcare-resources-improving-interoperability-and-performance-4cnd</guid>
      <description>&lt;p&gt;Healthcare runs on data, but too often, that data doesn't feel like it was designed for the people who use it. Doctors and administrators regularly face cryptic codes when what they really need is a name, an identifier, or a simple confirmation of which hospital is in charge.&lt;/p&gt;

&lt;p&gt;This was exactly the challenge we faced with FHIR EpisodeOfCare resources. By default, they tracked patient care episodes using reference codes, but those references weren't very useful in practice. They were difficult to query, slowed down workflows, and created friction in sharing data across organizations.&lt;/p&gt;

&lt;p&gt;We set out to fix that by enhancing EpisodeOfCare resources with richer references: adding patient identifiers, display names, and organization names directly into the resource. The reason was simple: more detailed references make queries easier, reduce ambiguity, and dramatically improve interoperability, especially in federated healthcare environments where multiple hospitals, clinics, and systems exchange data.&lt;/p&gt;

&lt;p&gt;But there was another challenge: scale. It wasn't just about a handful of resources. We had to handle more than 10,000 EpisodeOfCare records. Enhancing that many records required a reliable, scalable, and cost-effective approach.&lt;/p&gt;

&lt;p&gt;The solution came through batch processing, powered by Google Cloud Healthcare API, a Python-based enhancement engine, and serverless deployment on Cloud Run. The results were remarkable: thousands of resources enhanced in minutes, at low cost, with 100% success. Most importantly, the data became more useful for both people and systems.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Introduction: The Interoperability Challenge
&lt;/h2&gt;

&lt;p&gt;In today's healthcare environment, data lives everywhere. Patients might receive care at multiple facilities across different regions, each with its own information system. These systems often don't agree on how to represent data.&lt;/p&gt;

&lt;p&gt;FHIR was designed as a bridge, providing a standard way to share healthcare information. But standards leave room for interpretation, and sometimes that room creates gaps.&lt;/p&gt;

&lt;p&gt;The EpisodeOfCare resource is a good example. It records the period during which a patient is under care, but the way it references patients and organizations often leaves out the details that make data useful in practice.&lt;/p&gt;

&lt;p&gt;Here's what a typical EpisodeOfCare resource looked like before enhancement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"patient"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Patient/f0c29c2a-1f20-4b9f-b49c-8d4ad487a6e8"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"managingOrganization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Organization/100007732"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the surface, it seems fine. But in practice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The patient's name is missing.&lt;/li&gt;
&lt;li&gt;Their identifier, such as a national ID (NIK), is missing.&lt;/li&gt;
&lt;li&gt;The organization's name is missing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This makes queries harder. If you want to filter EpisodeOfCare resources by patient name, you can't do it directly. If you want to find all patients managed by a specific hospital, you have to look up organization codes separately. In federated systems, where data needs to be shared seamlessly across multiple institutions, this lack of clarity makes interoperability fragile.&lt;/p&gt;

&lt;p&gt;The result: more work for healthcare staff, more room for errors, and less effective use of valuable data.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Our Approach: Adding the Missing Pieces
&lt;/h2&gt;

&lt;p&gt;We set out to make EpisodeOfCare resources more useful, more searchable, and more interoperable.&lt;/p&gt;

&lt;p&gt;The key was enhancing the references so that they carried not just codes, but context.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Patient identifiers (NIKs): By embedding identifiers directly into EpisodeOfCare, we made it easy to query by ID and match patients across systems.&lt;/li&gt;
&lt;li&gt;Patient display names: Adding names meant healthcare workers could instantly see who the record referred to, and systems could query resources by name.&lt;/li&gt;
&lt;li&gt;Organization display names: Embedding the managing organization's actual name improved both usability and data federation across institutions.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's what the same resource looked like after enhancement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"patient"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Patient/f0c29c2a-1f20-4b9f-b49c-8d4ad487a6e8"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"display"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Dr. YULIANI SUSANTI"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"identifier"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"https://fhir.kemkes.go.id/id/nik"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"3271234567890123"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"managingOrganization"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"reference"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Organization/100007732"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"display"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"RSUD PURBALINGGA"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now, queries become much simpler. Need to find all patients managed by RSUD PURBALINGGA? You can query directly on the display field. Need to check for duplicate NIKs? The identifier is already there.&lt;/p&gt;
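&lt;p&gt;As a rough illustration of both checks, the sketch below works on EpisodeOfCare resources that have already been fetched into memory. Server-side search on these fields depends on the FHIR server, so the filtering here is done client-side. The function names and sample data are hypothetical.&lt;/p&gt;

```python
from collections import defaultdict


def episodes_for_organization(episodes, org_name):
    """Select episodes whose managingOrganization.display equals org_name."""
    return [
        e for e in episodes
        if e.get("managingOrganization", {}).get("display") == org_name
    ]


def patients_sharing_a_nik(episodes):
    """Map each NIK to the patient references carrying it, keeping only
    NIKs that appear on more than one distinct patient reference."""
    refs_by_nik = defaultdict(set)
    for e in episodes:
        patient = e.get("patient", {})
        nik = patient.get("identifier", {}).get("value")
        if nik:
            refs_by_nik[nik].add(patient.get("reference", ""))
    return {nik: sorted(refs) for nik, refs in refs_by_nik.items() if len(refs) > 1}


# Made-up sample data in the enhanced shape shown above.
episodes = [
    {"patient": {"reference": "Patient/p1", "identifier": {"value": "3271234567890123"}},
     "managingOrganization": {"display": "RSUD PURBALINGGA"}},
    {"patient": {"reference": "Patient/p2", "identifier": {"value": "3271234567890123"}},
     "managingOrganization": {"display": "RSUD PURBALINGGA"}},
    {"patient": {"reference": "Patient/p3", "identifier": {"value": "3271000000000001"}},
     "managingOrganization": {"display": "ANOTHER HOSPITAL"}},
]

at_rsud = episodes_for_organization(episodes, "RSUD PURBALINGGA")
shared = patients_sharing_a_nik(episodes)
print(len(at_rsud), shared)
```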

&lt;p&gt;By enriching references, we didn't just improve usability; we made the data easier to search, more consistent across systems, and more reliable in federated environments.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. The Big Challenge: Scaling Up
&lt;/h2&gt;

&lt;p&gt;Enhancing one EpisodeOfCare resource was easy. Enhancing over 10,000 resources was a serious technical challenge.&lt;/p&gt;

&lt;p&gt;Processing each record individually would have been too slow and too fragile. We risked hitting API rate limits, overloading memory, and wasting time. In a federated environment, where performance and reliability are critical, this was unacceptable.&lt;/p&gt;

&lt;p&gt;The solution was batch processing. Instead of treating each record as a separate job, we processed them in groups. This allowed us to control throughput, recover gracefully from errors, and optimize performance.&lt;/p&gt;

&lt;p&gt;With smart pagination and configurable parameters, the system could handle datasets of any size. Whether it was 1,000 records or 10,000, the engine processed them efficiently.&lt;/p&gt;
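&lt;p&gt;Pagination in FHIR search usually means following the &lt;code&gt;next&lt;/code&gt; link of each result Bundle until none is left. The sketch below shows that loop with the HTTP call injected as &lt;code&gt;fetch_page&lt;/code&gt;, a hypothetical callable, so it can run against fake pages. The page size is controlled with the standard &lt;code&gt;_count&lt;/code&gt; search parameter; everything else here is an illustrative assumption.&lt;/p&gt;

```python
def iter_resources(first_url, fetch_page):
    """Yield every resource from a paged FHIR search, following 'next' links."""
    url = first_url
    while url:
        bundle = fetch_page(url)
        for entry in bundle.get("entry", []):
            yield entry["resource"]
        # A searchset Bundle advertises the following page as a link
        # whose relation is "next"; stop when no such link exists.
        url = next(
            (link["url"] for link in bundle.get("link", [])
             if link.get("relation") == "next"),
            None,
        )


# Fake server: two pages keyed by URL (made-up data).
pages = {
    "EpisodeOfCare?_count=2": {
        "entry": [{"resource": {"id": "e1"}}, {"resource": {"id": "e2"}}],
        "link": [{"relation": "next", "url": "page-2"}],
    },
    "page-2": {
        "entry": [{"resource": {"id": "e3"}}],
        "link": [{"relation": "self", "url": "page-2"}],
    },
}

ids = [r["id"] for r in iter_resources("EpisodeOfCare?_count=2", pages.__getitem__)]
print(ids)  # ['e1', 'e2', 'e3']
```

&lt;p&gt;Because the function is a generator, only one page is held in memory at a time, which fits the batch-oriented design described above.&lt;/p&gt;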

&lt;p&gt;In practice, this meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Thousands of resources processed in under half an hour.&lt;/li&gt;
&lt;li&gt;Hundreds of API calls per minute without hitting limits.&lt;/li&gt;
&lt;li&gt;Full runs costing less than a dollar, even at large scale.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The leap from hundreds of records to more than 10,000 proved that the system wasn't just functional; it was scalable.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Inside the Engine
&lt;/h2&gt;

&lt;p&gt;At the heart of the system was a Python-based enhancement engine. Its job was to fetch EpisodeOfCare resources, enrich their references, and save them back in enhanced form.&lt;/p&gt;

&lt;p&gt;The logic for enhancing patients, for example, looked like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;enhance_patient_reference&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_ref&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;patient_resource&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Add NIK identifier and display name to patient reference&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;enhanced_ref&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;patient_ref&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;nik&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_nik_from_patient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient_resource&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;enhanced_ref&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;identifier&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;nik&lt;/span&gt;

    &lt;span class="n"&gt;display_name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extract_patient_display_name&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;patient_resource&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;display_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;enhanced_ref&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;display&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;display_name&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;enhanced_ref&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This small piece of code hides a lot of sophistication. It knows how to handle multiple NIK formats, build names from different components, and gracefully skip enhancements if data is missing.&lt;/p&gt;
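&lt;p&gt;The two helper methods are not shown in this article, so the following is only a guess at their general shape, written as standalone functions. It assumes FHIR R4 Patient structures, matches the NIK by the system URI from the enhanced example, and makes no attempt to cover the multiple NIK formats mentioned above.&lt;/p&gt;

```python
NIK_SYSTEM = "https://fhir.kemkes.go.id/id/nik"  # system URI from the enhanced example


def extract_nik_from_patient(patient):
    """Return the first NIK identifier as a {system, value} dict, or None."""
    for identifier in patient.get("identifier", []):
        value = str(identifier.get("value", "")).strip()
        if identifier.get("system") == NIK_SYSTEM and value:
            return {"system": NIK_SYSTEM, "value": value}
    return None


def extract_patient_display_name(patient):
    """Build a display name from the first HumanName entry, or return None."""
    names = patient.get("name", [])
    if not names:
        return None
    name = names[0]
    if name.get("text"):
        return name["text"]  # prefer the pre-formatted name when present
    # Concatenation creates a new list, so the input is not modified.
    parts = name.get("prefix", []) + name.get("given", [])
    if name.get("family"):
        parts.append(name["family"])
    return " ".join(parts) or None


# Made-up Patient resource for illustration.
patient = {
    "identifier": [{"system": NIK_SYSTEM, "value": " 3271234567890123 "}],
    "name": [{"prefix": ["Dr."], "given": ["YULIANI"], "family": "SUSANTI"}],
}

nik = extract_nik_from_patient(patient)
display = extract_patient_display_name(patient)
print(nik["value"], display)
```

&lt;p&gt;Both helpers return &lt;code&gt;None&lt;/code&gt; when data is missing, which is what lets the calling method skip an enhancement instead of failing.&lt;/p&gt;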

&lt;p&gt;But beyond the logic, what mattered most was resilience. Logs tracked every step. Failed calls retried automatically. Errors didn't crash the process. They were logged, and the batch continued. This was essential for handling more than 10,000 records without losing progress.&lt;/p&gt;
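&lt;p&gt;The "log it and keep going" behaviour can be sketched as a loop that isolates each record in its own &lt;code&gt;try&lt;/code&gt; block. The function names and the sample enhancement below are hypothetical; the point is only that one bad record does not stop the batch.&lt;/p&gt;

```python
import logging

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("enhancer")


def process_batch(records, enhance):
    """Enhance each record; log failures and continue with the rest."""
    succeeded, failed = [], []
    for record in records:
        try:
            succeeded.append(enhance(record))
        except Exception:
            # logger.exception records the traceback alongside the message.
            logger.exception("Enhancement failed for %s", record.get("id"))
            failed.append(record.get("id"))
    logger.info("Batch done: %d ok, %d failed", len(succeeded), len(failed))
    return succeeded, failed


def add_display(record):
    """Toy enhancement: raises KeyError when the patient reference is missing."""
    enhanced = dict(record)
    enhanced["patient"] = {**record["patient"], "display": "EXAMPLE NAME"}
    return enhanced


records = [
    {"id": "e1", "patient": {"reference": "Patient/p1"}},
    {"id": "e2"},  # broken record: no patient reference
    {"id": "e3", "patient": {"reference": "Patient/p3"}},
]

ok, bad = process_batch(records, add_display)
print(len(ok), bad)
```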

&lt;p&gt;And of course, everything was designed with security and compliance in mind. Data was encrypted in transit and at rest, authentication was handled securely, and every enhancement left an audit trail.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. Deployment in the Real World
&lt;/h2&gt;

&lt;p&gt;We chose Google Cloud Run to deploy the service. This gave us a serverless platform that scaled automatically and required almost no infrastructure management.&lt;/p&gt;

&lt;p&gt;Each region had its own deployment, with its own configuration. For example, Purbalingga and Lombok Barat each had dedicated endpoints, tuned for their workloads.&lt;/p&gt;

&lt;p&gt;We added features that made operations smooth:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Health-check endpoints to confirm the system was running.&lt;/li&gt;
&lt;li&gt;Monitoring dashboards to track progress.&lt;/li&gt;
&lt;li&gt;Dry-run mode to test enhancements safely.&lt;/li&gt;
&lt;li&gt;Automated deployment scripts for fast updates.&lt;/li&gt;
&lt;/ul&gt;
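&lt;p&gt;Dry-run mode is the easiest of these to illustrate. A minimal sketch, assuming the enhancement and the write are passed in as callables; the names and the default of &lt;code&gt;dry_run=True&lt;/code&gt; are illustrative choices, not a description of the deployed service:&lt;/p&gt;

```python
def run_enhancement(resources, enhance, write, dry_run=True):
    """Compute enhancements; call write() only when dry_run is False.

    Returns the ids of resources that would change (or did change).
    """
    changed = []
    for resource in resources:
        enhanced = enhance(resource)
        if enhanced == resource:
            continue  # nothing to change for this resource
        changed.append(enhanced["id"])
        if not dry_run:
            write(enhanced)
    return changed


def add_org_name(resource):
    """Toy enhancement: fill in a missing organization display name."""
    org = resource["managingOrganization"]
    if "display" in org:
        return resource
    return {**resource, "managingOrganization": {**org, "display": "RSUD PURBALINGGA"}}


resources = [
    {"id": "e1", "managingOrganization": {"reference": "Organization/100007732"}},
    {"id": "e2", "managingOrganization": {"reference": "Organization/100007732",
                                          "display": "RSUD PURBALINGGA"}},
]

written = []
preview = run_enhancement(resources, add_org_name, written.append)  # dry run
applied = run_enhancement(resources, add_org_name, written.append, dry_run=False)
print(preview, applied, len(written))
```

&lt;p&gt;Defaulting to a dry run means a forgotten flag produces a report instead of a write, which is the safer failure mode for production data.&lt;/p&gt;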

&lt;p&gt;The result was a production-ready system that worked not just in theory but in the complex, messy reality of healthcare IT.&lt;/p&gt;




&lt;h2&gt;
  
  
  6. The Results
&lt;/h2&gt;

&lt;p&gt;The numbers tell a powerful story:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Over 10,000 EpisodeOfCare resources enhanced&lt;/li&gt;
&lt;li&gt;100% success rate&lt;/li&gt;
&lt;li&gt;Zero data loss&lt;/li&gt;
&lt;li&gt;293 API calls per minute sustained&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But behind the numbers were the real wins:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Doctors could query patients by name or ID without extra clicks.&lt;/li&gt;
&lt;li&gt;Administrators no longer had to translate organization codes manually.&lt;/li&gt;
&lt;li&gt;Federated systems could exchange EpisodeOfCare data more reliably, because references carried meaningful context.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Healthcare workers reported that tasks which once took minutes now took seconds. And across organizations, interoperability became less of a headache.&lt;/p&gt;




&lt;h2&gt;
  
  
  7. Looking Ahead
&lt;/h2&gt;

&lt;p&gt;As we reflect on the project, one theme stands out above all others: the importance of the federated healthcare ecosystem.&lt;/p&gt;

&lt;p&gt;Healthcare is rarely contained within a single system. Patients move between clinics, hospitals, insurance providers, and government programs. Each of these organizations may run its own IT infrastructure, but the real challenge, and the real opportunity, comes when they need to share data with one another.&lt;/p&gt;

&lt;p&gt;This is where enhanced EpisodeOfCare resources make a difference. By embedding identifiers, display names, and organization names directly into the references, we remove friction. Instead of systems struggling to reconcile codes or run complicated joins, they can exchange data that already carries meaningful context.&lt;/p&gt;

&lt;p&gt;In a federated setup, this means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Queries become easier across institutions, because identifiers and names are consistent and visible.&lt;/li&gt;
&lt;li&gt;Data exchange becomes more reliable, because the meaning of each reference is clearer.&lt;/li&gt;
&lt;li&gt;Patient journeys can be tracked more smoothly, even when care happens across multiple organizations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The future we see isn't about adding more technology for its own sake. It's about making the data we already have work better in a federated world, where collaboration between systems is the norm, not the exception.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Conclusion: Data That Works Better
&lt;/h2&gt;

&lt;p&gt;At the start, our challenge was simple: EpisodeOfCare resources weren't very helpful to humans. They carried references, but not enough detail.&lt;/p&gt;

&lt;p&gt;By enhancing those references, we solved three problems at once: we made queries easier, improved day-to-day usability, and strengthened interoperability across federated systems.&lt;/p&gt;

&lt;p&gt;Scaling to more than 10,000 records proved that the solution wasn't just a prototype; it was production-ready. And deploying it on the cloud made it secure, cost-efficient, and easy to operate.&lt;/p&gt;

&lt;p&gt;In the end, the lesson is clear: good healthcare data is about more than accuracy. It's about accessibility, usability, and interoperability. When data works better, healthcare works better.&lt;/p&gt;

&lt;p&gt;This is one step toward a future where healthcare systems truly work together, seamlessly and intelligently. And if there's one takeaway from our journey, it's that improving interoperability doesn't always require massive overhauls. Sometimes, it just takes the right enhancements, applied at the right scale.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>performance</category>
      <category>softwareengineering</category>
      <category>healthinformatics</category>
    </item>
    <item>
      <title>Caching Strategies Across Application Layers: Building Faster, More Scalable Products</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Sun, 23 Mar 2025 06:34:21 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/caching-strategies-across-application-layers-building-faster-more-scalable-products-h08</link>
      <guid>https://dev.to/budiwidhiyanto/caching-strategies-across-application-layers-building-faster-more-scalable-products-h08</guid>
      <description>&lt;p&gt;Sarah’s phone buzzed at 2:43 AM. Half-asleep, she answered. On the other end, the on-call engineer sounded stressed:  &lt;/p&gt;

&lt;p&gt;&lt;em&gt;"The database load just spiked. The app is crawling, and users are reporting timeouts everywhere."&lt;/em&gt;  &lt;/p&gt;

&lt;p&gt;As the product lead for a fast-growing SaaS platform with over 200,000 daily active users, Sarah knew that every minute of slowdown meant frustrated customers—and possibly some of them leaving for good.  &lt;/p&gt;

&lt;p&gt;The cause? A small feature update had unintentionally bypassed key caching layers, sending every request straight to the database. What should have been a routine release turned into an emergency, taking hours to fix and forcing a partial rollback.  &lt;/p&gt;

&lt;p&gt;These kinds of issues happen more often than we’d like to admit. In the rush to ship features, caching can feel like an afterthought—something only engineers worry about. But in reality, caching affects all of us, from product managers thinking about user experience to developers balancing performance and reliability.  &lt;/p&gt;

&lt;p&gt;Caching matters for every kind of application, from mobile apps to web platforms and APIs. When implemented correctly, it reduces unnecessary work, prevents delays, and improves user experience. But when we overlook it, the result can be stale data, inconsistent behavior, or even system outages like the one Sarah’s team faced.  &lt;/p&gt;

&lt;p&gt;In this article, we’ll walk through different types of caching, from browser caches that speed up loading times to database caches that reduce repeated queries. Along the way, we’ll share real-world examples, practical strategies, and common mistakes we’ve encountered.  &lt;/p&gt;

&lt;p&gt;By the end, we should have a clearer understanding of how caching fits into product development, not just as a technical detail but as a tool we can all use to build faster, more reliable applications.&lt;/p&gt;

&lt;p&gt;Now that we understand why caching is crucial for performance and scalability, let's start by exploring the first layer of caching: the browser cache. This is often the fastest and easiest way to improve load times for end users.&lt;/p&gt;

&lt;h2&gt;
  
  
  Browser Caching: The First Line of Defense
&lt;/h2&gt;

&lt;p&gt;Picture this: we're sipping our morning coffee, opening our favorite news app, and it loads instantly. Not because we have the world's fastest internet connection, but because our browser remembers what it downloaded yesterday. That's browser caching at work—the quiet, behind-the-scenes optimization that makes the web feel fast.  &lt;/p&gt;

&lt;p&gt;When we visit a website for the first time, our browser doesn’t just display the page and forget about it. It stores key resources—the JavaScript that makes buttons work, the CSS that styles the layout, and the images that make the page visually engaging. The next time we visit, instead of downloading everything again, the browser retrieves these cached files in a fraction of the time.  &lt;/p&gt;

&lt;p&gt;HTTP headers dictate what gets cached and for how long, serving as caching instructions from the server:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cache-Control&lt;/strong&gt;: Defines whether a resource may be cached and how long it stays fresh (for example, &lt;code&gt;max-age=3600&lt;/code&gt;) before the browser checks for a new version.
&lt;/li&gt;
&lt;li&gt;
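&lt;strong&gt;ETag&lt;/strong&gt;: Works like a version fingerprint. The browser sends it back on the next request, and if the file hasn’t changed, the server answers &lt;code&gt;304 Not Modified&lt;/code&gt; and the download is skipped.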
&lt;/li&gt;
&lt;li&gt;
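&lt;strong&gt;Expires&lt;/strong&gt;: Sets a specific expiration date for content. It is ignored when &lt;code&gt;Cache-Control&lt;/code&gt; specifies &lt;code&gt;max-age&lt;/code&gt;.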
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Last-Modified&lt;/strong&gt;: Lets the browser ask, “Has this changed since the last time I checked?” and reloads only if necessary.
&lt;/li&gt;
&lt;/ul&gt;
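&lt;p&gt;To make the interplay of these headers concrete, here is a minimal Python sketch of a server that sets them and answers a conditional request. The helper names &lt;code&gt;build_cache_headers&lt;/code&gt; and &lt;code&gt;respond&lt;/code&gt;, the hash-based ETag, and the one-hour default are our own assumptions for illustration, not part of any particular framework.&lt;/p&gt;

```python
import hashlib
from email.utils import formatdate


def build_cache_headers(body: bytes, max_age_seconds: int, last_modified_epoch: float) -> dict:
    """Return caching headers for a response body (hypothetical helper)."""
    # Derive the ETag from the content so it changes whenever the bytes change.
    etag = '"' + hashlib.sha256(body).hexdigest()[:16] + '"'
    return {
        "Cache-Control": f"public, max-age={max_age_seconds}",
        "ETag": etag,
        # HTTP dates are always expressed in GMT.
        "Last-Modified": formatdate(last_modified_epoch, usegmt=True),
    }


def respond(body: bytes, request_headers: dict, max_age_seconds: int = 3600,
            last_modified_epoch: float = 0.0) -> tuple:
    """Return (status, headers, body); answer 304 when the client's copy is current."""
    headers = build_cache_headers(body, max_age_seconds, last_modified_epoch)
    if request_headers.get("If-None-Match") == headers["ETag"]:
        # The browser already holds this exact version, so send no body.
        return 304, headers, b""
    return 200, headers, body
```

&lt;p&gt;On a first visit the browser receives status 200 with the full body. Once the copy is no longer fresh, it sends the stored ETag back in &lt;code&gt;If-None-Match&lt;/code&gt; and gets an empty 304 response if nothing changed.&lt;/p&gt;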

&lt;p&gt;To implement browser caching effectively, developers often use tools like Webpack (for asset bundling and versioning), Workbox.js (for managing service worker caching), and browser DevTools (Chrome, Firefox, Safari) to inspect cache behavior. Performance audit tools like Google Lighthouse help measure caching effectiveness alongside other optimizations.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Implementation Strategies
&lt;/h3&gt;

&lt;p&gt;Here are some strategies we can use to make browser caching more effective:  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Version-stamping assets&lt;/strong&gt;. Instead of naming a file &lt;code&gt;main.js&lt;/code&gt;, we can use &lt;code&gt;main.d41ef2c.js&lt;/code&gt;. The unique fingerprint tells the browser when to fetch a new version, preventing users from seeing outdated files.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Service workers for offline caching&lt;/strong&gt;. Service workers store and serve critical assets even when there's no internet connection, ensuring smooth offline experiences.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Balancing memory and disk cache&lt;/strong&gt;. Browsers store frequently used assets in fast memory cache, while less critical ones go to disk cache. Structuring assets properly helps optimize performance.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Preloading and prefetching&lt;/strong&gt;. Good websites anticipate what users need next. Using &lt;code&gt;&amp;lt;link rel="preload"&amp;gt;&lt;/code&gt;, we can tell the browser to fetch key assets for the current page early. With &lt;code&gt;&amp;lt;link rel="prefetch"&amp;gt;&lt;/code&gt;, we can load resources before users even request them, for example prefetching images for the next page they’re likely to visit.  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
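&lt;p&gt;Version-stamping is normally handled by a bundler, but the idea fits in a few lines. The sketch below assumes a SHA-256 content hash truncated to eight characters. The function name &lt;code&gt;fingerprint_name&lt;/code&gt; is hypothetical and does not come from any build tool.&lt;/p&gt;

```python
import hashlib
from pathlib import PurePosixPath


def fingerprint_name(filename: str, content: bytes, length: int = 8) -> str:
    """Insert a short content hash before the file extension.

    Identical content always yields the same name, so browsers can cache the
    file for a long time; any change to the bytes produces a new name.
    """
    digest = hashlib.sha256(content).hexdigest()[:length]
    path = PurePosixPath(filename)
    return str(path.with_name(f"{path.stem}.{digest}{path.suffix}"))
```

&lt;p&gt;Because the name changes only when the content does, such files can be served with a very long &lt;code&gt;max-age&lt;/code&gt; without the risk of users being stuck on an old version.&lt;/p&gt;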

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;When browser caching is well-implemented, it does more than just speed up loading times—it improves the overall user experience and optimizes infrastructure efficiency.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Faster load times&lt;/strong&gt;. Returning visitors don’t have to re-download assets they already have, making pages feel instantly responsive. A content-heavy news site, for example, saw noticeable improvements when implementing proper browser caching, as frequent readers no longer had to reload the same images and scripts every time they visited.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lower bounce rates&lt;/strong&gt;. Users expect websites to load quickly, and slow performance often leads to frustration and abandonment. An e-commerce company that optimized its caching strategy found that customers stayed on their product pages longer, leading to better engagement and higher conversion rates.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Improved session continuity&lt;/strong&gt;. For applications that rely on frequent interactions, caching can make navigation smoother. A media streaming platform optimized caching for its homepage and video thumbnails, ensuring that users could browse without unnecessary delays when switching between content.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reduced data usage&lt;/strong&gt;. For mobile users, caching reduces the need to re-download resources, which is particularly valuable for those on limited data plans or in regions with slower network connections. A mobile app improved its usability in areas with spotty internet access by caching key interface elements, allowing users to continue browsing seamlessly even with temporary network disruptions.  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Pitfalls
&lt;/h3&gt;

&lt;p&gt;While caching improves performance, it comes with challenges that need careful handling:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Over-aggressive caching&lt;/strong&gt;. Setting cache durations too long for dynamic content can cause users to see outdated information. An e-commerce site once cached its inventory page for six months, leading customers to try purchasing out-of-stock items. To prevent this, it's important to set appropriate expiration times and ensure cache invalidation mechanisms are in place for frequently changing data.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Not caching enough&lt;/strong&gt;. Some assets, like logos or icons, rarely change but are often fetched repeatedly due to improper caching rules. Without caching, users waste bandwidth downloading the same files on every visit. Identifying truly static assets and assigning them long cache durations helps optimize performance without risking outdated content.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache invalidation issues&lt;/strong&gt;. Even after a successful deployment, users may still see outdated versions of a website because the cache was not properly refreshed. This often happens when file names remain unchanged after an update. Using versioned filenames like &lt;code&gt;main.abc123.js&lt;/code&gt; ensures browsers fetch the latest files while still benefiting from caching.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security and privacy risks&lt;/strong&gt;. Without proper controls, caching sensitive data can lead to privacy breaches. A banking app once cached account summaries incorrectly, momentarily exposing one user’s balance to another. To prevent such risks, sensitive content should be marked as non-cacheable using headers like &lt;code&gt;Cache-Control: no-store&lt;/code&gt;, ensuring it is never stored or served from cache.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;While browser caching speeds up individual page loads, it has limitations. Users across the globe may experience delays when fetching content from a single server. A visitor in New York might enjoy instant access, while someone in Tokyo faces sluggish load times. This is where Content Delivery Networks (CDNs) step in, delivering content closer to users for a seamless experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  CDN Caching: Bringing Content Closer to Users
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;Imagine we’re running a global coffee chain. Instead of brewing all our coffee in Jakarta and shipping it worldwide, which would result in cold, stale coffee, we build local shops in every city. That’s essentially what a Content Delivery Network (CDN) does for digital content.  &lt;/p&gt;

&lt;p&gt;CDNs maintain a vast network of servers—often called edge nodes—strategically positioned around the world. When Marco in Milan or Priya in Pune wants to see our website’s hero image, they don’t have to wait for data to travel from a server in Virginia. Instead, they receive a copy from a nearby edge server, making everything feel much faster no matter where they are.  &lt;/p&gt;

&lt;p&gt;Modern CDNs go beyond static file caching. Many now support caching API responses, running small programs at the edge using tools like Cloudflare Workers or Lambda@Edge, and even protecting against malicious traffic. It’s like having a combination of a local warehouse, a smart assistant, and a security guard in every city where our users live.  &lt;/p&gt;

&lt;p&gt;One of the biggest advantages of CDNs is geographical awareness. If our app suddenly gains popularity in South Korea, the CDN automatically ensures those users get the same fast experience as someone near our headquarters.  &lt;/p&gt;

&lt;p&gt;Different teams select CDNs based on their needs. For example, Cloudflare is widely used for its security and DDoS protection, Fastly is known for real-time cache purging and low latency, AWS CloudFront integrates seamlessly with other AWS services, and Akamai offers a massive global network, making it a preferred choice for large enterprises.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Implementation Strategies
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cache rules configuration&lt;/strong&gt;. Setting different caching policies for different types of content ensures a balance between freshness and performance. For example, a news organization might set up cache rules as follows:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Breaking news pages cached for 5 minutes.
&lt;/li&gt;
&lt;li&gt;Weekly feature articles for 24 hours.
&lt;/li&gt;
&lt;li&gt;Archived content for 30 days.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache keys&lt;/strong&gt;. Cache keys determine what makes content unique. Should two versions of the same product page—one with &lt;code&gt;?ref=email&lt;/code&gt; and one without—be cached separately or treated as the same? One e-commerce company unknowingly created thousands of duplicate caches because they weren’t handling session IDs properly in URLs. A small change to their cache key strategy significantly reduced their CDN costs.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Dynamic content acceleration&lt;/strong&gt;. Even frequently changing content can benefit from short-lived caching. A financial services app caches personalized portfolio summaries for just 30 seconds, eliminating unnecessary database hits while keeping data fresh enough for users.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Edge functions and workers&lt;/strong&gt;. Some CDNs allow small programs to run at the edge to dynamically modify responses. A gaming company used edge functions to insert region-specific tournament details into a single cached page, avoiding the need to generate separate pages for each region.  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
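&lt;p&gt;Cache rules and cache keys are usually set in the CDN's own configuration, but the logic behind them can be sketched in plain Python. Everything named here is an assumption for illustration: the path prefixes, the default TTL, and the list of ignored parameters are hypothetical, and real CDNs express the same ideas in their own rule syntax.&lt;/p&gt;

```python
from urllib.parse import parse_qsl, urlencode, urlsplit

# Hypothetical TTL rules (path prefix, seconds), mirroring the news example above.
TTL_RULES = [
    ("/breaking/", 5 * 60),
    ("/features/", 24 * 60 * 60),
    ("/archive/", 30 * 24 * 60 * 60),
]
DEFAULT_TTL = 60  # assumed fallback for paths that match no rule

# Hypothetical parameters that do not change the response and would only fragment the cache.
IGNORED_PARAMS = {"ref", "utm_source", "utm_medium", "utm_campaign", "sessionid"}


def ttl_for(path: str) -> int:
    """Return the TTL of the first rule whose prefix matches the path."""
    for prefix, ttl in TTL_RULES:
        if path.startswith(prefix):
            return ttl
    return DEFAULT_TTL


def cache_key(url: str) -> str:
    """Normalize a URL so that equivalent requests share one cache entry."""
    parts = urlsplit(url)
    kept = sorted(
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name.lower() not in IGNORED_PARAMS
    )
    query = urlencode(kept)
    return parts.netloc.lower() + parts.path + ("?" + query if query else "")
```

&lt;p&gt;With this normalization, a product page requested with and without &lt;code&gt;?ref=email&lt;/code&gt; maps to the same key, and parameter order no longer creates duplicates.&lt;/p&gt;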

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;CDN caching improves performance in ways that browser caching alone cannot. Here’s how:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Faster page load times&lt;/strong&gt;. By reducing the distance between users and content, CDNs significantly decrease delays, especially for users far from the origin server.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Global consistency&lt;/strong&gt;. Users across different regions experience similar performance, whether they’re in Brazil, Japan, or the U.S.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced load on origin servers&lt;/strong&gt;. CDNs absorb the bulk of traffic, reducing direct requests to backend infrastructure and preventing overload during high-traffic events. A retail company that experienced traffic spikes on Black Friday relied on a CDN to handle the surge without increasing server capacity.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized bandwidth costs&lt;/strong&gt;. CDNs apply compression and delivery optimizations, reducing data transfer costs. A video streaming startup switched to a CDN with better video compression, cutting delivery expenses while improving streaming quality.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Pitfalls
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Overly complex cache configurations&lt;/strong&gt;. Some teams implement overly complex caching rules, making them difficult to modify. One engineering manager put it this way: “Our CDN config has become our legacy code.” Keeping rules simple and well-documented makes ongoing maintenance easier.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache coherency issues&lt;/strong&gt;. Keeping content synchronized across different regions isn’t always straightforward. A company launching a new product found that European users saw the update two hours before U.S. users due to inconsistent cache invalidation. This led to confusion, support tickets, and customer complaints on social media.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mismanaged CDN costs&lt;/strong&gt;. CDN pricing models vary—some charge primarily for bandwidth, while others focus on request volume. A streaming service attempted to reduce bandwidth costs but overlooked the fact that their CDN charged mostly for requests, causing their costs to rise instead of fall. Understanding pricing structures is crucial to avoiding unexpected expenses.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security gaps at the CDN level&lt;/strong&gt;. Security measures applied at the origin server don’t automatically carry over to the CDN. A financial services company carefully configured security headers on its main servers but forgot to apply them at the CDN level, leaving key vulnerabilities exposed. Ensuring that security policies are consistently enforced across all layers helps prevent such oversights.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;CDNs are fantastic for delivering static assets like images, CSS, and JavaScript files quickly, but what about dynamic data? API calls, such as product listings, user dashboards, or flight availability, often require fresh data. If every request hits the backend, it can slow down the entire system. API Gateway caching steps in as a solution, reducing redundant requests and improving API response times.&lt;/p&gt;

&lt;h2&gt;
  
  
  API Gateway Caching: The Request Filter
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;Imagine we’re running a popular restaurant where the maître d' intercepts repeat orders before they reach the kitchen. "The table in the corner wants another plate of the special pasta? I already know exactly how the chef prepares that—no need to bother the kitchen again." That’s essentially what API Gateway caching does for our applications.  &lt;/p&gt;

&lt;p&gt;Acting as a middle layer between client applications and backend services, API Gateway caching reduces redundant processing by storing commonly requested API responses. While CDNs are optimized for static assets like images and scripts, API Gateways are designed to cache structured data, such as JSON or XML responses, helping to offload repeated database queries and reduce API latency.  &lt;/p&gt;

&lt;p&gt;A travel booking platform initially had uncached API calls taking 600-800ms to return flight results. With API Gateway caching enabled, identical searches took just 40ms, significantly improving responsiveness.  &lt;/p&gt;

&lt;p&gt;Many teams use Amazon API Gateway for cloud-native applications, while Kong is a popular choice for self-hosted and Kubernetes environments. For enterprise-scale API management, solutions like Apigee (Google Cloud) are widely used. If we’re already using NGINX, its built-in proxy cache with very short TTLs (a technique often called microcaching) offers a lightweight alternative. The best choice depends on factors like infrastructure, compliance needs, and scale.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Key Features of API Gateway Caching
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Full response caching&lt;/strong&gt;. Unlike some caching layers that store fragments, API Gateways typically cache entire API responses. A financial services app was making thousands of identical market data queries per minute; API Gateway caching reduced its backend load by 94%.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Security and authentication handling&lt;/strong&gt;. API Gateways can authenticate requests before checking the cache, ensuring unauthorized users don’t access cached responses they shouldn’t see. This is especially critical for applications handling sensitive data.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache key customization&lt;/strong&gt;. We can define which parts of a request—headers, query parameters, path segments, or even body elements—should determine cache uniqueness. A media streaming service I advised improved cache efficiency by including device type in cache keys but excluding session identifiers, dramatically reducing redundant caching.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Granular TTL control&lt;/strong&gt;. Different API endpoints have different freshness needs. A banking app cached account history for 30 minutes and transaction statuses for 60 seconds, and never cached current balances.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rate limiting and quota management&lt;/strong&gt;. Even when serving cached responses, API Gateways can enforce rate limits, helping prevent traffic spikes from overwhelming backend services.  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
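&lt;p&gt;Cache key customization and granular TTLs can be illustrated with a small sketch. The endpoint paths, the &lt;code&gt;X-Device-Type&lt;/code&gt; header, and the &lt;code&gt;session_id&lt;/code&gt; parameter are hypothetical names chosen for the example. Actual gateways expose these choices through their own configuration rather than code like this.&lt;/p&gt;

```python
from typing import Mapping, Optional

# Hypothetical per-endpoint TTLs in seconds; None means "never cache".
ENDPOINT_TTLS = {
    "/accounts/history": 30 * 60,
    "/transactions/status": 60,
    "/accounts/balance": None,
}


def ttl_for_endpoint(path: str) -> Optional[int]:
    """Return the TTL for an endpoint, or None when it must not be cached."""
    return ENDPOINT_TTLS.get(path)


def gateway_cache_key(path: str, query: Mapping[str, str], headers: Mapping[str, str],
                      user_id: Optional[str] = None) -> str:
    """Build a key from only the request parts that change the response."""
    device = headers.get("X-Device-Type", "unknown")  # hypothetical header name
    # Session identifiers are left out on purpose so they do not fragment the cache.
    params = ";".join(
        f"{name}={value}" for name, value in sorted(query.items()) if name != "session_id"
    )
    # User-specific responses get a per-user scope so they are never shared.
    scope = f"user:{user_id}" if user_id else "shared"
    return "|".join([path, params, device, scope])
```

&lt;p&gt;Two requests that differ only in their session identifier produce the same key, while a different device type or user produces a separate entry.&lt;/p&gt;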

&lt;h3&gt;
  
  
  Implementation Strategies
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cache per endpoint&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Each API has different requirements: some are read-heavy, others update frequently. One product catalog API cached:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Category listings for 30 minutes
&lt;/li&gt;
&lt;li&gt;Product details for 5 minutes
&lt;/li&gt;
&lt;li&gt;Inventory status for 30 seconds
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cache segmentation&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Sometimes, the same API endpoint needs different caching rules depending on user type. A B2B platform cached pricing API responses for:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Anonymous users → Cached for 1 hour
&lt;/li&gt;
&lt;li&gt;Authenticated partners → Cached for 5 minutes to reflect negotiated pricing updates
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Selective caching&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Not all HTTP methods should be cached.  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GET requests are typically safe to cache.
&lt;/li&gt;
&lt;li&gt;POST, PUT, and DELETE modify data and should bypass the cache.
One team mistakenly cached POST requests, leading to orders not appearing in customer history for 15 minutes.
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Using Vary headers&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Many applications deliver different responses based on content type, language, or device. The &lt;code&gt;Vary&lt;/code&gt; response header tells caches which request headers (such as &lt;code&gt;Accept-Language&lt;/code&gt;) distinguish one variant from another, which prevents both wrong-variant responses and unnecessary duplicate entries. A global e-commerce site doubled cache efficiency by properly implementing &lt;code&gt;Vary&lt;/code&gt; headers for different languages.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache bypass options&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Some users need real-time data. We can implement a query parameter like &lt;code&gt;?fresh=true&lt;/code&gt; to allow users to bypass the cache when necessary. One investment platform added a “Refresh” button to ensure users saw real-time financial reports while still benefiting from caching during normal browsing.  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
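&lt;p&gt;Selective caching and cache bypass can be combined in one small in-memory sketch. The &lt;code&gt;GatewayCache&lt;/code&gt; class below is purely illustrative and is not the API of any real gateway. It keys entries by path only, which is a simplification, and it assumes that a write to a path should invalidate the cached read for that same path.&lt;/p&gt;

```python
import time
from typing import Callable, Dict, Tuple

CACHEABLE_METHODS = {"GET"}


class GatewayCache:
    """Tiny in-memory response cache (illustrative only, not a real gateway API)."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # path -> (expiry time, cached response)
        self._entries: Dict[str, Tuple[float, str]] = {}

    def handle(self, method: str, path: str, query: Dict[str, str],
               backend: Callable[[], str]) -> str:
        # Writes and explicit ?fresh=true requests always go to the backend.
        bypass = method not in CACHEABLE_METHODS or query.get("fresh") == "true"
        now = self.clock()
        if not bypass:
            entry = self._entries.get(path)
            if entry is not None and entry[0] > now:
                return entry[1]
        response = backend()
        if method in CACHEABLE_METHODS:
            # Store fresh GET results, including forced refreshes, for later readers.
            self._entries[path] = (now + self.ttl_seconds, response)
        else:
            # A write drops the cached read so new data shows up immediately.
            self._entries.pop(path, None)
        return response
```

&lt;p&gt;Dropping the entry on a write is what would have spared the team above from orders that stayed invisible for 15 minutes.&lt;/p&gt;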

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;API Gateway caching improves application performance in multiple ways:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Faster API responses&lt;/strong&gt;. Caching API calls can reduce response times by 70-95%, making applications feel much snappier.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lower backend load&lt;/strong&gt;. By serving cached responses, API Gateways can reduce redundant API calls by over 80%, easing database strain and improving scalability. A social media analytics platform reduced database queries from 3 million to 400,000 per hour just by enabling API caching.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consistent performance&lt;/strong&gt;. One e-commerce platform didn’t just improve average API response times; it also eliminated unpredictable latency spikes during peak traffic hours. As their CTO put it, "Users notice inconsistency more than they notice raw speed."  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Improved API availability&lt;/strong&gt;. When a payment service had database slowdowns, its API Gateway continued serving cached responses, preventing an outage. The team estimated that caching bought them 30 minutes of breathing room before backend fixes took effect.  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Pitfalls
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Over-aggressive caching&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
Caching sensitive, user-specific data can lead to serious security issues. One financial app briefly showed User A’s balance to User B due to improper cache key settings. Include user identifiers in the cache key whenever a response is user-specific.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inconsistent user experience&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
When caching is misconfigured, some users see fresh data while others get stale responses. A document editing platform cached document statuses too aggressively, causing team members to see outdated content for several minutes, even after refreshing.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache poisoning&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
If an error response or incorrect data gets cached, it can spread to multiple users. A healthcare app cached incomplete patient records because of a database migration issue, turning a 30-second problem into a 15-minute one.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hidden bugs&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
When caching is working well, we may not notice backend failures immediately. One team discovered that their API was throwing errors 20% of the time, but the cache had masked the problem for weeks. Regular cache bypass testing helps detect hidden failures.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache stampede&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
When a frequently accessed item expires, multiple clients may request fresh data at the same time. This sudden spike can overload the backend, causing unexpected performance issues. A sports statistics platform saw their database crash during a major game because their player stats cache expired just as a star player scored.&lt;br&gt;&lt;br&gt;
&lt;strong&gt;Solution:&lt;/strong&gt; Use staggered expirations and background refreshes to avoid traffic spikes when cache entries expire.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
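&lt;p&gt;One way to soften a cache stampede is sketched below. It staggers expirations with random jitter and, instead of a background refresh, uses a lock so that only one caller reloads an expired entry while the others wait for the result. The class name &lt;code&gt;SingleFlightCache&lt;/code&gt;, the 10% jitter, and the single global lock are assumptions made to keep the example short.&lt;/p&gt;

```python
import random
import threading
import time
from typing import Callable, Dict, Tuple


def jittered_ttl(base_seconds: float, spread: float = 0.1, rng=random) -> float:
    """Return base_seconds shifted randomly by up to +/- spread (10% by default)."""
    return base_seconds * (1 + rng.uniform(-spread, spread))


class SingleFlightCache:
    """In-memory cache where only one thread at a time reloads a missing entry."""

    def __init__(self, base_ttl: float):
        self.base_ttl = base_ttl
        self._lock = threading.Lock()
        # key -> (expiry time, value)
        self._entries: Dict[str, Tuple[float, object]] = {}

    def get(self, key: str, loader: Callable[[], object]) -> object:
        entry = self._entries.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry[1]
        # One global lock keeps the sketch simple; per-key locks would scale better.
        with self._lock:
            # Re-check: another thread may have refreshed the entry while we waited.
            entry = self._entries.get(key)
            if entry is not None and entry[0] > time.monotonic():
                return entry[1]
            value = loader()
            expiry = time.monotonic() + jittered_ttl(self.base_ttl)
            self._entries[key] = (expiry, value)
            return value
```

&lt;p&gt;With the jitter, entries created at the same moment expire at slightly different times, and with the lock, a burst of simultaneous misses for a popular key results in a single backend call.&lt;/p&gt;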

&lt;p&gt;API Gateway caching optimizes API calls, but what if we need to cache frequently accessed data within the application itself? Imagine a dashboard that displays the same metrics for hundreds of users—fetching this data from the database every time would be inefficient. Instead, application caching allows us to store frequently used data in memory, significantly improving performance and reducing backend strain.&lt;/p&gt;

&lt;h2&gt;
  
  
  Application Layer Caching: The Middle Tier
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;Remember the last time we asked a friend the same question twice in five minutes? They probably gave us a look that said, "I just told you that." With application layer caching, we avoid recalculating or fetching data we’ve already seen, making our systems much more efficient.  &lt;/p&gt;

&lt;p&gt;Sitting between the application and the database, application caching acts as a short-term memory store. It holds frequently accessed data in fast, in-memory storage like Redis or Memcached. Think of them as digital scratch pads that can often be read far faster than a query against a disk-backed database.  &lt;/p&gt;

&lt;p&gt;Many teams rely on Redis for its versatility and support for different data structures. Managed services like AWS ElastiCache and Azure Cache for Redis simplify operations, while language-specific libraries like Caffeine (Java) and node-cache (Node.js) provide efficient caching within specific tech stacks.  &lt;/p&gt;

&lt;p&gt;Unlike browser and CDN caching, which primarily handle static assets, application caching deals with dynamic data. It stores information that changes frequently but doesn’t need to be recalculated or retrieved every time, such as:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User profile details
&lt;/li&gt;
&lt;li&gt;Product inventories
&lt;/li&gt;
&lt;li&gt;Complex API responses
&lt;/li&gt;
&lt;li&gt;Results of computationally expensive tasks
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A team optimizing a recommendation engine once spent three days refining an algorithm to generate product suggestions. They later discovered that caching those recommendations for a few minutes provided an even greater performance boost, and it took just three hours to implement. "We were trying to build a faster car," they noted, "when we really just needed to stop making unnecessary trips."  &lt;/p&gt;

&lt;h3&gt;
  
  
  Common Caching Patterns
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data caching.&lt;/strong&gt; Storing database records or API responses in memory to reduce repetitive database queries. By keeping frequently accessed data readily available, this approach reduces database load and improves response times while maintaining relatively fresh data.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Computation caching.&lt;/strong&gt; Storing the results of expensive calculations so they don’t have to be recomputed repeatedly. For example, a financial services platform calculating risk assessment scores for users might cache the results for a short period. Instead of recalculating the same data for every request, the system retrieves it instantly from cache, significantly improving response times and reducing the load on computing resources.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Session caching.&lt;/strong&gt; Storing user session data in memory for quick access. This ensures that applications can efficiently maintain user authentication, preferences, and shopping cart data across multiple requests or page reloads without frequent database lookups.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Rate limiting.&lt;/strong&gt; Using a cache to track and limit API requests from individual users. This helps prevent accidental or intentional overload of the system by enforcing request thresholds while reducing unnecessary processing.  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
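
&lt;p&gt;As a rough illustration of the rate-limiting pattern, here is a minimal fixed-window limiter that keeps its counters in an in-process dictionary. The class name, the &lt;code&gt;limit&lt;/code&gt; and &lt;code&gt;window_seconds&lt;/code&gt; parameters, and the injectable &lt;code&gt;clock&lt;/code&gt; are assumptions made for this sketch; a production system would more likely keep the counters in a shared cache such as Redis.&lt;/p&gt;

```python
import time


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per user in each `window_seconds` window."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._counters = {}  # user_id -> (window_start, request_count)

    def allow(self, user_id):
        now = self.clock()
        window_start, count = self._counters.get(user_id, (now, 0))
        if now - window_start >= self.window_seconds:
            # The previous window is over: start a fresh one.
            window_start, count = now, 0
        if count >= self.limit:
            return False  # over the threshold: reject without doing any real work
        self._counters[user_id] = (window_start, count + 1)
        return True
```

&lt;p&gt;Calling &lt;code&gt;allow(user_id)&lt;/code&gt; at the top of a request handler is enough to reject excess traffic before it reaches the database.&lt;/p&gt;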

&lt;h3&gt;
  
  
  Implementation Strategies
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache-aside (lazy loading).&lt;/strong&gt; Before making a request to the database, the application first checks the cache. If the data isn’t there, it fetches the information, stores it in the cache, and serves it to the user. This pattern is widely used because it keeps cache management simple. A login API implementation reduced response time from 600ms to 40ms by caching user authentication data this way.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Read-through caching.&lt;/strong&gt; The cache itself is responsible for retrieving missing data. If the requested data isn’t in the cache, it automatically fetches it from the source. This simplifies application logic but requires a more sophisticated caching layer.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write-through caching.&lt;/strong&gt; Every time data is updated, it’s written to both the cache and the backend database simultaneously. This ensures the cache is always in sync with the latest data but adds some latency to write operations. A ticket-booking platform used this approach to ensure seat availability information was always accurate.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Write-behind caching.&lt;/strong&gt; Data is first written to the cache and then asynchronously updated in the backend. This improves write performance but carries some risk if the system fails before syncing with the database.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Time-based expiration.&lt;/strong&gt; Different types of data have different expiration needs. An e-commerce site implemented a layered approach:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Product descriptions cached for 1 week
&lt;/li&gt;
&lt;li&gt;Inventory levels cached for 5 minutes
&lt;/li&gt;
&lt;li&gt;Flash sale prices cached for 30 seconds
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
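
&lt;p&gt;The cache-aside and time-based expiration strategies above can be sketched together in a few lines. This is an illustrative in-memory version, not a specific library’s API: &lt;code&gt;TTLCache&lt;/code&gt;, &lt;code&gt;get_with_cache_aside&lt;/code&gt;, and the &lt;code&gt;load_from_db&lt;/code&gt; callback are hypothetical names, and the TTL values simply mirror the e-commerce example.&lt;/p&gt;

```python
import time

# TTLs in seconds, mirroring the layered example above.
TTL_SECONDS = {
    "product_description": 7 * 24 * 3600,  # 1 week
    "inventory": 5 * 60,                   # 5 minutes
    "flash_sale_price": 30,                # 30 seconds
}


class TTLCache:
    """A tiny in-memory cache whose entries expire after a per-key TTL."""

    def __init__(self, clock=time.monotonic):
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get(self, key):
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._entries[key]  # expired: treat it as a miss
            return None
        return value

    def set(self, key, value, ttl_seconds):
        self._entries[key] = (self.clock() + ttl_seconds, value)


def get_with_cache_aside(cache, key, load_from_db, ttl_seconds):
    """Cache-aside: check the cache first, fall back to the database on a miss."""
    value = cache.get(key)
    if value is None:
        value = load_from_db(key)  # the slow path, taken only on a miss
        cache.set(key, value, ttl_seconds)
    return value
```

&lt;p&gt;One simplification to be aware of: this sketch cannot tell a cached &lt;code&gt;None&lt;/code&gt; apart from a miss, so values that may legitimately be empty need a sentinel.&lt;/p&gt;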

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;Application caching improves both user experience and infrastructure efficiency:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Faster user interactions.&lt;/strong&gt; Caching frequently accessed data can reduce API response times by 50-95%, making applications feel more responsive. A mobile app reduced its average API response time from 300ms to just 35ms after implementing application caching.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lower infrastructure costs.&lt;/strong&gt; Optimized caching reduces CPU and memory usage, allowing teams to handle more traffic with fewer resources. A B2B platform reduced its server count by 60% while handling more requests.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reduced database load.&lt;/strong&gt; Caching minimizes expensive database queries, keeping systems stable under heavy traffic. An analytics dashboard lowered database CPU utilization from 85% to 30%, eliminating timeouts during peak hours.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Better scalability without extra cost.&lt;/strong&gt; Caching allows systems to handle traffic spikes without requiring a massive increase in infrastructure. As one CTO put it, "Before caching, each new marketing campaign meant an emergency infrastructure meeting. Now we just watch the metrics and smile, knowing the system will handle it."  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Pitfalls
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache invalidation challenges.&lt;/strong&gt; Knowing when to refresh or discard cached data is surprisingly complex. Some engineering teams have created cache invalidation diagrams that look more like abstract art than structured designs.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Stale data issues.&lt;/strong&gt; If cache invalidation isn’t handled correctly, users may see outdated information. A marketplace app once displayed "In Stock" labels for products that had already sold out, frustrating customers and increasing support tickets.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache penetration.&lt;/strong&gt; If non-existent data is frequently requested, it can bypass the cache and overload the database. A system experiencing slowdowns due to bots requesting random product IDs mitigated the issue by implementing a "negative result cache" to remember which IDs didn’t exist.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache avalanche.&lt;/strong&gt; If many cached items expire at the same moment, the sudden surge of database queries can cause system failure. A social platform crashed during a product launch when all of its promotional content caches expired together, triggering thousands of database queries at once. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Local vs. distributed caching challenges.&lt;/strong&gt; Local in-memory caches work well for small applications, but as traffic grows, a distributed caching system becomes essential. A startup struggling with inconsistent user experiences found that switching to Redis as a centralized cache immediately resolved the issue.  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
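
&lt;p&gt;Two of these pitfalls have small, well-known mitigations that fit in one sketch. The "negative result cache" from the penetration example stores a sentinel for IDs that do not exist. Randomizing each TTL slightly (often called jitter) is a common defense against cache avalanche; it is not something the teams above are reported to have used. All names here (&lt;code&gt;MISSING&lt;/code&gt;, &lt;code&gt;jittered_ttl&lt;/code&gt;, &lt;code&gt;lookup_product&lt;/code&gt;, &lt;code&gt;load_from_db&lt;/code&gt;) are hypothetical.&lt;/p&gt;

```python
import random

MISSING = object()  # sentinel: "we checked the database and this ID does not exist"


def jittered_ttl(base_seconds, spread=0.1, rng=random):
    """Vary the TTL by up to +/- `spread` so entries do not all expire at once."""
    return base_seconds * (1 + rng.uniform(-spread, spread))


def lookup_product(cache, product_id, load_from_db, now, base_ttl=300):
    """Return the product, or None when the ID does not exist.

    `cache` is a plain dict mapping product_id -> (expires_at, value).
    """
    entry = cache.get(product_id)
    if entry is not None and entry[0] > now:
        value = entry[1]
        return None if value is MISSING else value  # negative hits skip the database

    product = load_from_db(product_id)  # assumed to return None for unknown IDs
    stored = MISSING if product is None else product
    cache[product_id] = (now + jittered_ttl(base_ttl), stored)
    return product
```

&lt;p&gt;Negative entries usually deserve a shorter TTL than real data, so that a newly created ID becomes visible quickly.&lt;/p&gt;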

&lt;p&gt;Application caching keeps frequently accessed data readily available, but what happens when the database itself becomes the bottleneck? Every database query consumes CPU and memory, leading to slow response times under heavy load. Database caching steps in as the final layer of defense, storing precomputed results and frequently queried data in memory to keep things running smoothly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Database Caching: The Foundation Layer
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How It Works
&lt;/h3&gt;

&lt;p&gt;Imagine walking into a library where the librarian already has our favorite books set aside because "we always ask for these." That’s similar to how database caching works—it anticipates frequently accessed data and keeps it readily available, making our applications run more efficiently.  &lt;/p&gt;

&lt;p&gt;Database caching is the deepest, most fundamental layer of caching in our application stack. Unlike other caching layers that store pre-processed responses, database caching optimizes how data is retrieved, reducing redundant work and improving performance.  &lt;/p&gt;

&lt;p&gt;Modern database systems do much more than just store data—they actively optimize how information is accessed and processed. They incorporate multiple caching mechanisms, much like a master chess player thinking several moves ahead:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Buffer cache.&lt;/strong&gt; Stores frequently accessed disk pages directly in memory, reducing the need to fetch data from disk. This is like keeping our most-read pages in a high-speed binder instead of going back to the filing cabinet every time.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Query cache.&lt;/strong&gt; Remembers answers to frequently asked questions. If multiple users request "How many users signed up last Tuesday?" the database can return a cached result instead of recalculating it. MySQL once shipped a built-in query cache, but it was deprecated in 5.7 and removed in 8.0 because it scaled poorly under concurrent writes. PostgreSQL has no built-in result cache; prepared statements there reuse query plans rather than results, so repeated results are usually cached with materialized views or in the application layer.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Execution plan cache.&lt;/strong&gt; Before executing a query, the database determines the most efficient way to fetch the data. This planning process can be expensive, but caching the execution plan avoids recalculating it every time. It’s like a delivery driver optimizing a route once and reusing it for similar trips.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Materialized views.&lt;/strong&gt; Store pre-computed query results as tables, eliminating redundant calculations and speeding up queries. Instead of recalculating "total sales by region, by month" each time someone loads a dashboard, a materialized view keeps this report pre-generated.  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These caching mechanisms vary by database. PostgreSQL caches pages in its shared buffers, reuses plans through prepared statements, and offers extensions such as pgfincore for inspecting the operating system’s page cache, while MySQL/MariaDB optimize performance through the InnoDB buffer pool. SQL Server provides buffer pool extensions and execution plan caching.  &lt;/p&gt;

&lt;p&gt;For high-read scenarios, connection poolers like PgBouncer and Amazon RDS Proxy help manage database connections efficiently, while materialized views (available in PostgreSQL and Oracle, and as indexed views in SQL Server) provide powerful query caching capabilities.  &lt;/p&gt;

&lt;h3&gt;
  
  
  Implementation Strategies
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Index optimization.&lt;/strong&gt; Indexes act like pre-sorted lookup tables, allowing databases to quickly locate specific data. A well-designed index can turn a slow, full-table scan into a lightning-fast lookup. One team reduced a product search query from 30 seconds to 25 milliseconds simply by adding the right index.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Query optimization.&lt;/strong&gt; Small changes in how queries are written can significantly impact performance. Queries that leverage cached execution plans often run much faster than those that force full table scans. Two engineering teams wrote nearly identical queries—one was 50 times faster because it aligned with the database’s caching mechanisms.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Connection pooling.&lt;/strong&gt; Establishing database connections is expensive, involving authentication, state setup, and memory allocation. Connection pooling maintains pre-established connections that applications can reuse, preserving cached execution plans and reducing response times. An e-commerce platform reduced checkout page load time by 300ms simply by implementing connection pooling.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Read replicas.&lt;/strong&gt; For read-heavy workloads, having multiple read-only copies of the database can reduce the load on the primary database. News websites, for example, use read replicas to handle peak morning traffic without slowing down content updates.  &lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
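
&lt;p&gt;To see the effect of index optimization without setting up a database server, the following sketch uses Python’s built-in &lt;code&gt;sqlite3&lt;/code&gt; module and SQLite’s &lt;code&gt;EXPLAIN QUERY PLAN&lt;/code&gt;. The &lt;code&gt;products&lt;/code&gt; table, the index name, and the &lt;code&gt;query_plan&lt;/code&gt; helper are made up for illustration; other databases expose the same idea through their own &lt;code&gt;EXPLAIN&lt;/code&gt; output.&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, sku TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO products (sku, name) VALUES (?, ?)",
    [("sku-%d" % i, "Product %d" % i) for i in range(1000)],
)


def query_plan(connection, sql, params=()):
    """Return SQLite's human-readable plan steps for a query."""
    rows = connection.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description


lookup = "SELECT name FROM products WHERE sku = ?"

# Without an index on sku, SQLite has to scan the whole table.
plan_before = query_plan(conn, lookup, ("sku-42",))

# With the index in place, the same query becomes an index search.
conn.execute("CREATE INDEX idx_products_sku ON products (sku)")
plan_after = query_plan(conn, lookup, ("sku-42",))
```

&lt;p&gt;Printing &lt;code&gt;plan_before&lt;/code&gt; and &lt;code&gt;plan_after&lt;/code&gt; shows the plan change from a table scan to a search that names &lt;code&gt;idx_products_sku&lt;/code&gt;; the exact wording varies between SQLite versions.&lt;/p&gt;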

&lt;h3&gt;
  
  
  Why This Matters
&lt;/h3&gt;

&lt;p&gt;Database caching improves deep system-level performance, directly impacting user experience and scalability:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Faster queries.&lt;/strong&gt; Caching reduces query execution time dramatically. In some cases, complex analytics queries that originally took 30 seconds can be reduced to milliseconds, improving application responsiveness.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Lower database load.&lt;/strong&gt; Caching minimizes redundant computations and disk access, reducing CPU and I/O usage. A healthcare platform lowered database CPU utilization from 95% to 30% just by optimizing buffer cache settings.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;More concurrent users.&lt;/strong&gt; Optimized database caching enables applications to support more simultaneous users. One team improved their system capacity from 5,000 to 20,000 concurrent users without upgrading infrastructure.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Reduced infrastructure costs.&lt;/strong&gt; Efficient caching delays or eliminates the need for expensive database upgrades. A CTO once noted, "We were about to spend $10,000 a month on a database upgrade. After optimizing our caching strategy, we achieved better performance on our existing hardware and postponed that expense for 18 months."  &lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Common Pitfalls
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cache size allocation.&lt;/strong&gt; Allocating the right amount of memory for database caching is critical. Too little memory reduces caching benefits, while too much can starve other processes or cause the system to swap to disk, leading to performance degradation.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Query plan instability.&lt;/strong&gt; Cached execution plans are optimized based on current data distribution. As data changes, a previously efficient query plan can suddenly become inefficient, leading to performance issues. A retail company saw checkout times slow dramatically when their database chose a suboptimal execution plan due to shifting customer behavior.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Over-indexing.&lt;/strong&gt; While indexes speed up reads, they slow down writes because every update requires index maintenance. One team once had &lt;strong&gt;25 indexes&lt;/strong&gt; on a single table—so many that inserts and updates were spending more time updating indexes than modifying the actual data.  &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Isolation level mismatches.&lt;/strong&gt; Database transactions follow isolation rules that control how changes are visible to concurrent requests. A financial services app showed inconsistent account balances because its caching strategy conflicted with its chosen isolation level.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We’ve explored each caching layer independently, but the real magic happens when they work together. A robust caching strategy isn't just about optimizing one layer—it’s about ensuring browser, CDN, API, application, and database caching function cohesively to prevent bottlenecks. Let’s explore how to align these layers for maximum efficiency and scalability.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrated Caching Strategy: Putting It All Together
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Layered Caching Architecture
&lt;/h3&gt;

&lt;p&gt;An effective caching strategy leverages multiple layers, with each layer optimized for specific types of data and access patterns:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Browser Cache.&lt;/strong&gt; Stores static assets like images, stylesheets, and scripts directly on the user’s device, reducing load times for repeat visits.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CDN.&lt;/strong&gt; Distributes cached content across globally distributed edge servers, ensuring fast delivery regardless of location.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API Gateway Cache.&lt;/strong&gt; Speeds up API response times by caching frequently requested data before it reaches backend services.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application Cache.&lt;/strong&gt; Reduces redundant computations by caching frequently accessed data and computed results in memory.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database Cache.&lt;/strong&gt; Optimizes query execution by storing precomputed results, reducing database load and improving scalability.
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cache Coherency Across Layers
&lt;/h3&gt;

&lt;p&gt;One of the most challenging aspects of multi-layer caching is maintaining consistency across layers. Strategies include:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cache Invalidation Chains.&lt;/strong&gt; Ensures that when data is updated in one layer, all dependent caches are invalidated, preventing stale responses.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTL Hierarchies.&lt;/strong&gt; Outer caching layers (e.g., browser and CDN) expire cached content more quickly than the layers closer to the data, because a copy held in a user’s browser cannot be purged on demand. This balances freshness and efficiency.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-Based Invalidation.&lt;/strong&gt; Uses pub/sub messaging (e.g., Redis Pub/Sub, Kafka) to notify caching layers when data changes, improving consistency.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Versioned Cache Keys.&lt;/strong&gt; Embeds data versions into cache keys, ensuring clients retrieve the latest content without requiring manual cache clearing.
&lt;/li&gt;
&lt;/ul&gt;
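
&lt;p&gt;Versioned cache keys are the easiest of these strategies to show in code. In this sketch the key format and the &lt;code&gt;version&lt;/code&gt; field on the record are assumptions; any value that changes on every write (a counter, an &lt;code&gt;updated_at&lt;/code&gt; timestamp, a content hash) would serve the same purpose.&lt;/p&gt;

```python
def versioned_key(entity, entity_id, version):
    """Build a cache key that changes whenever the underlying record changes."""
    return "%s:%s:v%d" % (entity, entity_id, version)


cache = {}
product = {"id": 7, "version": 1, "price": 100}

# Cache the price under a key that includes the record's current version.
cache[versioned_key("product", product["id"], product["version"])] = product["price"]

# An update bumps the version, so readers compute a new key and miss the old entry.
product = {"id": 7, "version": 2, "price": 120}
new_key = versioned_key("product", product["id"], product["version"])
stale_hit = cache.get(new_key)  # None: the old price is never served under the new key
```

&lt;p&gt;The old entry is never deleted explicitly; it simply stops being requested and ages out through its TTL or the eviction policy.&lt;/p&gt;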

&lt;h3&gt;
  
  
  Monitoring and Optimization
&lt;/h3&gt;

&lt;p&gt;A successful caching implementation requires ongoing monitoring and refinement:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cache Hit Ratio.&lt;/strong&gt; Measures how often data is served from the cache instead of being fetched from the database. A higher ratio means fewer expensive API or database calls. Regular monitoring helps fine-tune caching strategies.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Size and Eviction.&lt;/strong&gt; Ensuring caches aren’t overfilled prevents excessive evictions, which can lead to performance drops.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response Time Distribution.&lt;/strong&gt; Comparing cached vs. non-cached response times highlights areas where caching can be improved.
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost-Benefit Analysis.&lt;/strong&gt; Balances the savings from caching with the risk of serving stale data, ensuring an optimal caching strategy.&lt;/li&gt;
&lt;/ul&gt;
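
&lt;p&gt;The cache hit ratio is simply hits divided by total lookups. A minimal counter, assuming the application records each lookup itself (the &lt;code&gt;CacheStats&lt;/code&gt; class is hypothetical; most cache servers also report these numbers directly), might look like this:&lt;/p&gt;

```python
class CacheStats:
    """Count cache lookups and report the share that were served from cache."""

    def __init__(self):
        self.hits = 0
        self.misses = 0

    def record(self, hit):
        if hit:
            self.hits += 1
        else:
            self.misses += 1

    def hit_ratio(self):
        total = self.hits + self.misses
        # Report 0.0 before any lookups instead of dividing by zero.
        return self.hits / total if total else 0.0
```

&lt;p&gt;Tracking the ratio per key prefix or per endpoint, rather than one global number, makes it easier to spot which data is not benefiting from the cache.&lt;/p&gt;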

&lt;p&gt;Caching isn’t just about improving performance—it’s a core architectural decision that directly impacts user experience and scalability. A well-designed caching strategy ensures applications remain fast, efficient, and resilient under heavy load. Now, let’s summarize the key takeaways and best practices to implement caching successfully.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion: Caching as a Product Strategy
&lt;/h2&gt;

&lt;p&gt;As explored throughout this article, caching isn’t merely a technical optimization—it’s a fundamental product strategy that impacts everything from user experience to operational costs. When implemented thoughtfully across all application layers, caching provides a competitive edge through superior performance, lower infrastructure costs, and improved scalability.  &lt;/p&gt;

&lt;p&gt;The most successful product teams treat caching as a core architectural decision rather than an afterthought. They recognize that different layers require different caching approaches and design their systems to maximize the strengths of each layer.  &lt;/p&gt;

&lt;p&gt;A well-designed caching strategy goes beyond speed—it enhances user experience, optimizes infrastructure costs, and ensures applications scale efficiently. By treating caching as an integral part of system architecture from the start, products can be built to handle growth and traffic surges while providing a seamless experience for users, no matter where they are.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Mastering The Art of Pull Requests: A Developer's Guide to Smooth Code Reviews</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Mon, 24 Feb 2025 20:03:53 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/the-art-of-pull-requests-a-developers-guide-to-smooth-code-reviews-38bk</link>
      <guid>https://dev.to/budiwidhiyanto/the-art-of-pull-requests-a-developers-guide-to-smooth-code-reviews-38bk</guid>
      <description>&lt;p&gt;In the fast-paced world of software development, version control tools like Git have become essential for keeping projects organized and collaborative. As developers, we often work in parallel, creating branches for new features, fixing bugs, and making updates. But when it’s time to bring those changes back into the main codebase, pull requests (PRs) are the bridge between isolated development and team collaboration.&lt;/p&gt;

&lt;p&gt;We’ve all faced the frustration of a PR getting rejected or sent back for revisions. It’s easy to spend hours on a feature, only to realize our implementation doesn’t align with the team’s vision. Over time, however, we learn that effective PRs aren’t just about the code itself. They’re about how we communicate our changes.&lt;/p&gt;

&lt;p&gt;In this article, we’ll walk through how to craft pull requests that make the review process smoother, faster, and more efficient for everyone involved.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Crafting the Perfect Pull Request&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;1. Start with an Informative Title&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;First impressions matter, even in the world of code. The title of our pull request is our chance to immediately convey the essence of the change. Think of it as the headline of an article; it should be short, clear, and informative. Avoid titles like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"Updated code"&lt;/li&gt;
&lt;li&gt;"Fixed bug"&lt;/li&gt;
&lt;li&gt;"Changes from yesterday"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Instead, use more descriptive titles, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;feat: Add OAuth2 authentication for API endpoints&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;fix: Resolve race condition in user session handling&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;refactor: Optimize database query performance in user search&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A well-crafted title helps our reviewers understand the change without opening the PR. It sets the tone for the whole review process.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;2. Set the Context (Don’t Assume Everyone Knows the Problem)&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Before diving into code, it’s important to provide context. Not everyone may be familiar with the specific problem we’re solving, and it’s essential to explain the &lt;em&gt;why&lt;/em&gt; behind the changes. A solid PR description makes the review easier and faster. Here’s a template for a well-structured PR description:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Problem&lt;/span&gt;
The current user authentication system doesn't support social login, causing friction during user onboarding. We're seeing a 40% drop-off rate at the registration step.
Related ticket: AUTH-123

&lt;span class="gu"&gt;## Solution&lt;/span&gt;
Implemented OAuth2 authentication flow with Google:
&lt;span class="p"&gt;-&lt;/span&gt; Added OAuth2 middleware for handling Google authentication
&lt;span class="p"&gt;-&lt;/span&gt; Created new user profile mapping logic
&lt;span class="p"&gt;-&lt;/span&gt; Implemented session management for social login

&lt;span class="gu"&gt;## Technical Details&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; Uses passport-google-oauth20 for authentication
&lt;span class="p"&gt;-&lt;/span&gt; Added new database fields: googleId, socialProfile
&lt;span class="p"&gt;-&lt;/span&gt; Modified user model to support multiple auth methods

&lt;span class="gu"&gt;## Testing&lt;/span&gt;
&lt;span class="p"&gt;1.&lt;/span&gt; Click "Login with Google" button
&lt;span class="p"&gt;2.&lt;/span&gt; Authorize test application
&lt;span class="p"&gt;3.&lt;/span&gt; Verify successful redirect to dashboard
&lt;span class="p"&gt;4.&lt;/span&gt; Check user profile contains Google data

&lt;span class="gu"&gt;## Configuration&lt;/span&gt;
New environment variables required:
&lt;span class="p"&gt;-&lt;/span&gt; GOOGLE_CLIENT_ID
&lt;span class="p"&gt;-&lt;/span&gt; GOOGLE_CLIENT_SECRET
&lt;span class="p"&gt;-&lt;/span&gt; OAUTH_CALLBACK_URL
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This structure ensures that our reviewers understand the issue, the approach we’ve taken, and how to validate the solution. A PR without context can slow things down significantly, so always make sure to include enough information for our team to understand our changes.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;3. Keep the PR Focused&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;We’ve all been there: tempted to tackle multiple issues in one pull request. But this often leads to oversized PRs that can overwhelm reviewers. Instead, we should try to break our work into smaller, more focused PRs. Each PR should address a specific task. For example, if we’re building a user management system, we can break it down into smaller tasks like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;First PR: &lt;code&gt;feat: Add basic user model and migration&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Second PR: &lt;code&gt;feat: Implement user authentication endpoints&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Third PR: &lt;code&gt;feat: Add user profile management UI&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Fourth PR: &lt;code&gt;feat: Integrate email verification system&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each PR should focus on a single feature or bug fix, which makes the review process easier for everyone. This leads to faster reviews and fewer reworks.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;4. Commit Messages: Keep It Clean&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;A good commit message does more than explain the code. It helps everyone understand why the change was made and how it fits into the bigger picture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Commit Messages Matter:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;They provide context: A well-written commit message explains the &lt;em&gt;why&lt;/em&gt; behind a change.&lt;/li&gt;
&lt;li&gt;They improve collaboration: Future developers can trace the history of the project and easily understand the purpose of each commit.&lt;/li&gt;
&lt;li&gt;They save time: Clear commit messages reduce the need for follow-up questions and prevent back-and-forth during the review process.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Examples of Poor Commit Messages:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;"Fixed stuff"&lt;br&gt;&lt;br&gt;
&lt;em&gt;Why it’s bad&lt;/em&gt;: This is vague and doesn’t specify what was fixed or why. Did we fix a bug, improve performance, or refactor code? It’s unclear.&lt;br&gt;&lt;br&gt;
&lt;em&gt;Better version&lt;/em&gt;: &lt;code&gt;fix(auth): resolve user login bug caused by expired tokens&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"Updated files"&lt;br&gt;&lt;br&gt;
&lt;em&gt;Why it’s bad&lt;/em&gt;: This message doesn’t tell the reviewer what was changed or why the update was necessary.&lt;br&gt;&lt;br&gt;
&lt;em&gt;Better version&lt;/em&gt;: &lt;code&gt;chore: update dependencies to fix security vulnerabilities&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;"Work in progress"&lt;br&gt;&lt;br&gt;
&lt;em&gt;Why it’s bad&lt;/em&gt;: This doesn’t describe any meaningful change and suggests the code is incomplete. It also makes the review process harder, as the reviewer doesn’t know if they’re looking at a finished feature or just a draft.&lt;br&gt;&lt;br&gt;
&lt;em&gt;Better version&lt;/em&gt;: &lt;code&gt;feat(api): add user authentication endpoints&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
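
&lt;p&gt;A format like the "better versions" above can also be checked automatically, for example from a commit hook. The sketch below is a simplified check, not the full Conventional Commits grammar: the list of allowed types and the function name are assumptions chosen for illustration.&lt;/p&gt;

```python
import re

# type, optional (scope), optional "!", then ": " and a non-empty description
HEADER_PATTERN = re.compile(
    r"^(feat|fix|chore|refactor|docs|test)(\([a-z0-9-]+\))?!?: \S.*$"
)


def is_valid_commit_header(header):
    """Return True when the first line of a commit message follows the format."""
    return HEADER_PATTERN.match(header) is not None
```

&lt;p&gt;With this check, &lt;code&gt;fix(auth): resolve user login bug caused by expired tokens&lt;/code&gt; passes, while "Fixed stuff" and "Work in progress" are rejected.&lt;/p&gt;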

&lt;h4&gt;
  
  
  &lt;strong&gt;5. Review-Readiness Checklist: Why It’s Crucial&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Before submitting a PR, we should use a review-readiness checklist to ensure our code is in top shape. Here’s why having a checklist is important:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Saves reviewers’ time: It minimizes the chances of reviewers asking for basic fixes, allowing them to focus on the logic of the code.&lt;/li&gt;
&lt;li&gt;Improves consistency: A checklist ensures that all pull requests meet the same standard, making the review process smoother for everyone.&lt;/li&gt;
&lt;li&gt;Reduces back-and-forth: By double-checking our code and tests, we avoid multiple rounds of revisions.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Example checklist:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight markdown"&gt;&lt;code&gt;&lt;span class="gu"&gt;## Pre-Submission Checklist:&lt;/span&gt;
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Code follows project style guide
&lt;span class="p"&gt;-&lt;/span&gt; [ ] All tests pass (&lt;span class="sb"&gt;`npm run test`&lt;/span&gt;)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Lint checks pass (&lt;span class="sb"&gt;`npm run lint`&lt;/span&gt;)
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Documentation updated
&lt;span class="p"&gt;-&lt;/span&gt; [ ] No sensitive data in commits
&lt;span class="p"&gt;-&lt;/span&gt; [ ] Branch is up to date with &lt;span class="sb"&gt;`main`&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  &lt;strong&gt;Overcoming Common Pull Request Challenges&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Challenge 1: Pull Requests with Too Many Changes&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;We’ve faced the temptation to submit a massive PR that includes multiple features. However, this often leads to confusion and long review times. Here’s how we solved it: instead of trying to work on everything at once, we broke the task into smaller, more manageable PRs, starting with the database schema changes, then the API modifications, and finally the frontend adjustments. This made it easier for reviewers to focus on one thing at a time. &lt;/p&gt;

&lt;p&gt;Lesson Learned: Keep the PRs focused on a single aspect of the project.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Challenge 2: Lack of Context in PR Descriptions&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Early in my career, I submitted PRs with minimal descriptions, assuming everyone knew what was happening. This led to confusion, questions, and delays in the review process. To fix this, I started adding detailed descriptions for each PR, explaining what the change was, why it was necessary, and how to test it. It made the review process much faster and more efficient.&lt;/p&gt;

&lt;p&gt;Lesson Learned: Always provide context in our PR descriptions. Clear explanations can save us time and reduce the need for back-and-forth.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Challenge 3: Inconsistent Commit Messages&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Our team once struggled with inconsistent commit messages, which made it difficult to track changes. Some messages were too vague, while others were overly detailed. To resolve this, we created a standardized commit message format (e.g., &lt;code&gt;feat:&lt;/code&gt;, &lt;code&gt;fix:&lt;/code&gt;, &lt;code&gt;chore:&lt;/code&gt;) and made sure everyone followed it. This made the project history much more readable and improved collaboration.&lt;/p&gt;

&lt;p&gt;Lesson Learned: Use a consistent commit message format. It helps everyone understand changes quickly.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Conclusion: Pull Requests as a Collaborative Opportunity&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Crafting effective pull requests is a skill that improves with practice. By following these guidelines, we can make the review process more efficient while maintaining a cleaner, more maintainable codebase. Each PR is a learning opportunity. Every review comment, suggestion, or question helps us grow as developers and improve our coding practices.&lt;/p&gt;

&lt;p&gt;The goal isn’t just to get our code merged; it’s to build a high-quality codebase that’s easy for everyone to understand and maintain. By putting extra care into our PRs, we’re investing in the success of our project and our growth as developers.&lt;/p&gt;

&lt;p&gt;Read more:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://www.conventionalcommits.org/en/v1.0.0/" rel="noopener noreferrer"&gt;https://www.conventionalcommits.org/en/v1.0.0/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>softwareengineering</category>
      <category>softwaredevelopment</category>
      <category>github</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Feature Toggles: A Simple Way to Manage Access to Premium Features</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Tue, 11 Feb 2025 22:48:05 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/feature-toggles-a-simple-way-to-manage-access-to-premium-features-52j6</link>
      <guid>https://dev.to/budiwidhiyanto/feature-toggles-a-simple-way-to-manage-access-to-premium-features-52j6</guid>
      <description>&lt;p&gt;Imagine building an application with both free and premium features, and wanting to roll out a new feature for premium users without disrupting the experience for everyone else. The challenge is ensuring only the right users get access while keeping things smooth. For example, a streaming service like Netflix offers exclusive content or features to premium users, such as higher streaming quality or early access to new shows. They need a way to release these features gradually to the right audience and monitor performance before full rollout.&lt;/p&gt;

&lt;p&gt;This is where feature toggles come in. They allow us to control which features are visible to different user groups, like premium subscribers. By enabling or disabling specific features based on user access levels, we can release new features gradually to the right audience, as in the streaming example above. This approach helps us monitor the feature’s performance, gather feedback, and confirm everything runs smoothly before making it available to all premium users, without affecting non-premium ones.&lt;/p&gt;

&lt;h2&gt;
  
  
  Understanding Feature Toggles: More Than Just On/Off Switches
&lt;/h2&gt;

&lt;p&gt;So, what is a feature toggle? A feature toggle, also known as a feature flag, is a technique used in software development to enable or disable specific features without deploying new code. It allows us to control the visibility and availability of certain features in an application based on specific conditions, such as user roles or subscription levels. &lt;/p&gt;

&lt;p&gt;It’s much more than a simple if/else statement. Feature toggles provide us with a powerful strategy for managing application behavior. Think of it as a control panel for features, where switches can be flipped on and off based on the needs of different users. This gives us the flexibility to gradually release features, test them, or provide exclusive content to certain user groups—without disrupting the experience for everyone else.&lt;/p&gt;

&lt;h3&gt;
  
  
  When Is This Useful?
&lt;/h3&gt;

&lt;p&gt;Here are some scenarios where feature toggles really shine:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Gradually rolling out a new UI to ensure it doesn’t break anything&lt;/li&gt;
&lt;li&gt;Running A/B tests to figure out if a blue button performs better than a red one&lt;/li&gt;
&lt;li&gt;Controlling access to premium features (we’ll dive deeper into this one!)&lt;/li&gt;
&lt;li&gt;Testing features in different environments without changing the code&lt;/li&gt;
&lt;li&gt;Giving specific users access to beta-test features while others remain unaware&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  System Architecture: How It All Fits Together
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwmy3n31dvm66iny15rh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjwmy3n31dvm66iny15rh.png" alt="Image description" width="800" height="561"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Let’s take a closer look at how a feature toggle system works. Breaking it down into components helps make it clearer.&lt;/p&gt;

&lt;h3&gt;
  
  
  The System Components
&lt;/h3&gt;

&lt;p&gt;The feature toggle system consists of several key parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Client Browser: This is where the user interacts with the application (e.g., web or mobile browser).&lt;/li&gt;
&lt;li&gt;Frontend App: The client browser communicates with the frontend application, where the user interface is managed.&lt;/li&gt;
&lt;li&gt;Backend API: The frontend app sends requests to the backend API, which handles the business logic and makes decisions about feature availability.&lt;/li&gt;
&lt;li&gt;Toggle Service: The toggle service evaluates whether a specific feature should be enabled or disabled based on user access levels and feature configurations.&lt;/li&gt;
&lt;li&gt;User Service: This service checks user-related information, like their subscription level or role, to determine their access rights.&lt;/li&gt;
&lt;li&gt;Toggle DB: The database where feature configurations are stored, determining whether a feature is on or off for specific user groups.&lt;/li&gt;
&lt;li&gt;User DB: The database that stores user information, such as their subscription level or status.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Understanding the Flow
&lt;/h3&gt;

&lt;p&gt;Consider the flowchart that illustrates how the system handles feature access. Here’s what happens step-by-step:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The frontend app requests information about feature access from the backend API.&lt;/li&gt;
&lt;li&gt;The backend API then checks the user’s subscription level (whether they are a premium or free user).&lt;/li&gt;
&lt;li&gt;If the user is a premium subscriber, the toggle service checks if the requested feature is enabled for their user group by looking at the Toggle DB.

&lt;ul&gt;
&lt;li&gt;If the feature is enabled, the feature is shown to the user.&lt;/li&gt;
&lt;li&gt;If the feature is disabled, the user is shown an upgrade prompt.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;If the user is on the free tier, the backend skips the feature toggle check and directly shows the upgrade prompt.&lt;/li&gt;
&lt;/ol&gt;
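
&lt;p&gt;The steps above can be sketched as one small decision function. This is a minimal illustration rather than the backend shown later in this article: &lt;code&gt;resolveFeatureView&lt;/code&gt;, the &lt;code&gt;'PREMIUM'&lt;/code&gt; level name, and the in-memory &lt;code&gt;Map&lt;/code&gt; standing in for the Toggle DB are all assumptions made for the example.&lt;/p&gt;

```javascript
// Hypothetical helper that mirrors the four steps listed above.
// `user` and `toggles` are assumed in-memory stand-ins for the User DB and Toggle DB.
function resolveFeatureView(user, featureName, toggles) {
  // Step 4: free-tier users never reach the toggle check.
  if (user.subscriptionLevel !== 'PREMIUM') {
    return 'UPGRADE_PROMPT';
  }

  // Step 3: premium users see the feature only when its toggle is switched on.
  const key = featureName + ':' + user.subscriptionLevel;
  return toggles.get(key) === true ? 'FEATURE' : 'UPGRADE_PROMPT';
}
```

&lt;p&gt;A premium user with an enabled toggle gets &lt;code&gt;'FEATURE'&lt;/code&gt;; every other combination, including a toggle that has no entry at all, falls back to &lt;code&gt;'UPGRADE_PROMPT'&lt;/code&gt;, so a missing configuration never exposes a feature by accident.&lt;/p&gt;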

&lt;h2&gt;
  
  
  Premium Feature Access Control: A Real-World Example
&lt;/h2&gt;

&lt;p&gt;Imagine building a SaaS platform with both free and premium features. Just like streaming services such as Netflix, where premium subscribers get access to exclusive content like higher streaming quality or early releases, we need to ensure that only paying users can access these features. The challenge is offering these benefits without affecting the experience of free-tier users.&lt;/p&gt;

&lt;p&gt;A streaming service such as Netflix, for example, can use feature toggles to gradually release premium features or exclusive content to a select group of premium users. This lets the team test the feature, gather feedback, and monitor performance before rolling it out to all paying users, which keeps the user experience smooth while maintaining the value of the subscription plans.&lt;/p&gt;

&lt;p&gt;To implement this in our own projects, let’s take a look at how we can set up the necessary data models and backend logic to manage access control based on subscription levels, using feature toggles to control feature availability.&lt;/p&gt;

&lt;h3&gt;
  
  
  Setting Up the Data Models
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;ToggleConfig&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;subscriptionLevel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;enabled&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;lastModified&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;modifiedBy&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Getters and setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ToggleConfig&lt;/code&gt; stores the configuration for each feature. The &lt;code&gt;featureName&lt;/code&gt; identifies the feature, &lt;code&gt;subscriptionLevel&lt;/code&gt; defines which users can access it, and &lt;code&gt;enabled&lt;/code&gt; indicates whether it is active for that level. The &lt;code&gt;lastModified&lt;/code&gt; and &lt;code&gt;modifiedBy&lt;/code&gt; fields help track changes made to the feature.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;subscriptionLevel&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;subscriptionStart&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;subscriptionEnd&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Getters and setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;User&lt;/code&gt; stores information about a user, such as their &lt;code&gt;subscriptionLevel&lt;/code&gt;, which is essential for checking feature access.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FeatureAccess&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;hasAccess&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt; &lt;span class="n"&gt;lastChecked&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// Getters and setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;FeatureAccess&lt;/code&gt; records whether a user has access to a particular feature, including the user ID, feature name, access status, and the date it was checked.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Backend Magic
&lt;/h3&gt;

&lt;p&gt;Here’s where it gets interesting. Let’s look at how we check if a user can access a feature:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FeatureToggleService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ToggleRepository&lt;/span&gt; &lt;span class="n"&gt;toggleRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;UserRepository&lt;/span&gt; &lt;span class="n"&gt;userRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;FeatureAccessRepository&lt;/span&gt; &lt;span class="n"&gt;accessRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;isFeatureEnabledForUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;// Get user subscription level&lt;/span&gt;
        &lt;span class="nc"&gt;User&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;userRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findById&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;orElseThrow&lt;/span&gt;&lt;span class="o"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UserNotFoundException&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;

        &lt;span class="c1"&gt;// Check feature toggle configuration&lt;/span&gt;
        &lt;span class="nc"&gt;Optional&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;ToggleConfig&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;toggleConfig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;toggleRepository&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByFeatureNameAndSubscriptionLevel&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getSubscriptionLevel&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;

        &lt;span class="c1"&gt;// Record access check&lt;/span&gt;
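        &lt;span class="nc"&gt;FeatureAccess&lt;/span&gt; &lt;span class="n"&gt;access&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FeatureAccess&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="c1"&gt;// The String @Id is not generated automatically, so assign one before saving&lt;/span&gt;
        &lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;java&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;util&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;UUID&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;randomUUID&lt;/span&gt;&lt;span class="o"&gt;().&lt;/span&gt;&lt;span class="na"&gt;toString&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;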
        &lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setUserId&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setFeatureName&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setHasAccess&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;toggleConfig&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;map&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nl"&gt;ToggleConfig:&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="n"&gt;isEnabled&lt;/span&gt;&lt;span class="o"&gt;).&lt;/span&gt;&lt;span class="na"&gt;orElse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;));&lt;/span&gt;
        &lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setLastChecked&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="o"&gt;());&lt;/span&gt;
        &lt;span class="n"&gt;accessRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;access&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isHasAccess&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This service first retrieves the user’s subscription level by querying the &lt;code&gt;UserRepository&lt;/code&gt;. Then it checks if the feature is enabled for that subscription by querying the &lt;code&gt;ToggleRepository&lt;/code&gt;. After checking, the result is saved in the &lt;code&gt;FeatureAccess&lt;/code&gt; repository, and the access status is returned.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="nd"&gt;@RestController&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;FeatureToggleController&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;FeatureToggleService&lt;/span&gt; &lt;span class="n"&gt;featureToggleService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="nd"&gt;@GetMapping&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/api/features/{featureName}/access"&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;FeatureAccessResponse&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;checkFeatureAccess&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
        &lt;span class="nd"&gt;@PathVariable&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
        &lt;span class="nd"&gt;@RequestParam&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;userId&lt;/span&gt;
    &lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;hasAccess&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;featureToggleService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isFeatureEnabledForUser&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;userId&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="nc"&gt;FeatureAccessResponse&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;FeatureAccessResponse&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;featureName&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;hasAccess&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;hasAccess&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="s"&gt;"Feature available"&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s"&gt;"Please upgrade to access this feature"&lt;/span&gt;
        &lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;ResponseEntity&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ok&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This controller exposes an API endpoint that checks whether a user has access to a specific feature. It calls the &lt;code&gt;FeatureToggleService&lt;/code&gt; to perform the necessary checks and returns the result.&lt;/p&gt;
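
&lt;p&gt;To make the contract with the frontend concrete, here is a sketch of the JSON body this endpoint might return for a user without access. Only &lt;code&gt;hasAccess&lt;/code&gt; is confirmed by the frontend code in this article; the other field names assume that &lt;code&gt;FeatureAccessResponse&lt;/code&gt; serializes its three constructor arguments as &lt;code&gt;featureName&lt;/code&gt;, &lt;code&gt;hasAccess&lt;/code&gt;, and &lt;code&gt;message&lt;/code&gt;, and the feature name &lt;code&gt;hd-streaming&lt;/code&gt; is made up for the example.&lt;/p&gt;

```javascript
// Assumed response body for GET /api/features/hd-streaming/access?userId=42
// (field names other than hasAccess are illustrative, not taken from the source).
const exampleResponse = {
  featureName: 'hd-streaming',
  hasAccess: false,
  message: 'Please upgrade to access this feature',
};
```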

&lt;h3&gt;
  
  
  Making It Look Good: The Frontend
&lt;/h3&gt;

&lt;p&gt;On the frontend, the idea is to either show the premium feature or an upgrade prompt. Here’s how this is handled in React:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;React&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useFeatureToggle&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./hooks/useFeatureToggle&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PremiumFeature&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;featureName&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useFeatureToggle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;featureName&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;LoadingSpinner&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ErrorMessage&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="sr"&gt;/&amp;gt;&lt;/span&gt;&lt;span class="err"&gt;;
&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;feature-container&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;isEnabled&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;premium-feature&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;h2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Premium&lt;/span&gt; &lt;span class="nx"&gt;Feature&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/h2&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;PremiumContent&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;div&lt;/span&gt; &lt;span class="nx"&gt;className&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;upgrade-prompt&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;h2&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;Upgrade&lt;/span&gt; &lt;span class="nx"&gt;Required&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/h2&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="nx"&gt;This&lt;/span&gt; &lt;span class="nx"&gt;feature&lt;/span&gt; &lt;span class="nx"&gt;is&lt;/span&gt; &lt;span class="nx"&gt;available&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="nx"&gt;premium&lt;/span&gt; &lt;span class="nx"&gt;subscribers&lt;/span&gt; &lt;span class="nx"&gt;only&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/p&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;          &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;UpgradeButton&lt;/span&gt; &lt;span class="o"&gt;/&amp;gt;&lt;/span&gt;
        &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;      &lt;span class="p"&gt;)}&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="sr"&gt;/div&lt;/span&gt;&lt;span class="err"&gt;&amp;gt;
&lt;/span&gt;  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This component checks the feature status using the &lt;code&gt;useFeatureToggle&lt;/code&gt; hook. If the feature is enabled, it displays the premium content; if not, it shows an upgrade prompt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
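&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// hooks/useFeatureToggle.js: custom hook for feature toggle checks&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;useState&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;react&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;useFeatureToggle&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;featureName&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;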
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;setState&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;useState&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;checkFeatureAccess&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
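          &lt;span class="s2"&gt;`/api/features/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;encodeURIComponent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;featureName&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;/access?userId=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nf"&gt;encodeURIComponent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="c1"&gt;// fetch only rejects on network failures, so surface HTTP errors explicitly&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Feature access request failed: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;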

        &lt;span class="nf"&gt;setState&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;hasAccess&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;setState&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
          &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;isLoading&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;
        &lt;span class="p"&gt;});&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="nf"&gt;checkFeatureAccess&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;featureName&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;state&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The custom hook makes an API call to check the feature's availability for the given user and feature name. It manages the state for loading, error, and the feature's enabled status.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making It Work in the Real World: Best Practices
&lt;/h2&gt;

&lt;p&gt;Now that we’ve seen how everything works, here are some additional techniques to make the feature toggle system robust:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Performance Matters
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Caching toggle states is crucial since checking the database for every request can be slow.&lt;/li&gt;
&lt;li&gt;Keep the toggle logic efficient and simple to avoid unnecessary delays.&lt;/li&gt;
&lt;li&gt;Use background jobs for logging to keep the user experience smooth.&lt;/li&gt;
&lt;/ul&gt;
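
&lt;p&gt;As a rough illustration of the caching point above, here is a minimal Java sketch that keeps one toggle's state in memory for a fixed time-to-live and only calls the slower lookup again once that time has passed. The class name &lt;code&gt;CachedToggle&lt;/code&gt;, the &lt;code&gt;loader&lt;/code&gt; callback, and the TTL value are assumptions made for this example, not part of the system described in this article.&lt;/p&gt;

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: caches a single toggle's state for a fixed time-to-live.
class CachedToggle {

    private final BooleanSupplier loader; // assumed slow lookup, e.g. a database query
    private final long ttlMillis;         // how long a loaded value stays valid
    private boolean loaded;               // false until the first lookup has happened
    private boolean value;                // last value returned by the loader
    private long expiresAt;               // time (in millis) at which the value goes stale

    CachedToggle(BooleanSupplier loader, long ttlMillis) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
    }

    // Returns the cached state, reloading it only when nothing is cached yet
    // or the cached value has reached its expiry time.
    synchronized boolean isEnabled(long nowMillis) {
        if (!loaded || nowMillis >= expiresAt) {
            value = loader.getAsBoolean();
            expiresAt = nowMillis + ttlMillis;
            loaded = true;
        }
        return value;
    }
}
```

&lt;p&gt;Passing the current time in as a parameter keeps the sketch easy to test. A real service would more likely read the clock itself, and the trade-off is that a toggle change can take up to one TTL to become visible.&lt;/p&gt;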

&lt;h3&gt;
  
  
  2. Keep It Secure
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Always validate user permissions to prevent unauthorized access.&lt;/li&gt;
&lt;li&gt;Encrypt sensitive toggle configurations for security.&lt;/li&gt;
&lt;li&gt;Use audit logs to track changes made to feature configurations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Stay Organized
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Regularly monitor which toggles are in use.&lt;/li&gt;
&lt;li&gt;Clean up unused toggles to keep things tidy.&lt;/li&gt;
&lt;li&gt;Document everything to maintain clarity.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  4. Think About Scale
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Design the system to scale efficiently for a large user base and numerous toggles.&lt;/li&gt;
&lt;li&gt;Cache toggle states close to the application so that database load does not grow with every additional user or request.&lt;/li&gt;
&lt;li&gt;Consider using a distributed configuration system for global reach.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Feature toggles really feel like a superpower in the development world. They help us:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Gradually release features and test them safely.&lt;/li&gt;
&lt;li&gt;Experiment with new ideas without risking the entire application.&lt;/li&gt;
&lt;li&gt;Provide tailored experiences for different user groups.&lt;/li&gt;
&lt;li&gt;Do all of this without deploying new code each time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Starting with a single feature toggle is a great way to get familiar with this concept. As the comfort level grows, expanding to more complex scenarios—like controlling premium feature access—becomes easier.&lt;/p&gt;

&lt;p&gt;The best part is that once feature toggles are in place, releasing new features becomes far less stressful. Instead of hoping for the best, we have full control over who sees what and when.&lt;/p&gt;

&lt;p&gt;Thanks for reading! I’d love to hear your thoughts and experiences with implementing feature toggles. Feel free to share your comments or any insights you have from using them in your own projects.&lt;/p&gt;

</description>
      <category>softwareengineering</category>
      <category>architecture</category>
      <category>systemdesign</category>
      <category>java</category>
    </item>
    <item>
      <title>Ensuring Reliable Payment Systems with Idempotency</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Sat, 01 Feb 2025 17:39:49 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/ensuring-reliable-payment-systems-with-idempotency-2d0l</link>
      <guid>https://dev.to/budiwidhiyanto/ensuring-reliable-payment-systems-with-idempotency-2d0l</guid>
      <description>&lt;p&gt;Making payments online should be seamless. But when something goes wrong—whether it's a slow connection or a double-click—the last thing we want is for our customers to get charged twice. This is where idempotency comes in. It’s about making sure that repeated actions (like payment requests) don’t cause unintended side effects, such as multiple charges for the same thing.&lt;/p&gt;

&lt;p&gt;Let's walk through how idempotency works and why it’s crucial for creating a smooth and reliable payment experience.&lt;/p&gt;




&lt;h3&gt;
  
  
  What is Idempotency?
&lt;/h3&gt;

&lt;p&gt;In simple terms, idempotency means that performing an operation several times has the same effect as performing it once. For example, if a customer tries to pay for something twice (either by accident or due to a network hiccup), they should only be charged once.&lt;/p&gt;

&lt;p&gt;Think about it like this: You’re ordering coffee online, and the payment request is sent, but then the page freezes up. You try again, but you don’t want to end up with two cups of coffee and a double charge, right? Idempotency ensures that doesn’t happen.&lt;/p&gt;




&lt;h3&gt;
  
  
  Why Is Idempotency So Important?
&lt;/h3&gt;

&lt;p&gt;Reliable payment systems build trust with customers. If customers worry that their payments could be duplicated or lost due to errors, they’re less likely to use the service again.&lt;/p&gt;

&lt;p&gt;Here’s why idempotency is essential in payment systems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Prevents Duplicate Charges: Sometimes, due to network failures or timeouts, the same payment request is sent more than once. Idempotency ensures that even if the request is retried, only one payment is processed.&lt;/li&gt;
&lt;li&gt;Increases Customer Confidence: When customers know they won’t get charged twice for the same order, they’re more likely to trust our system and keep using it.&lt;/li&gt;
&lt;li&gt;Protects Business: If we accidentally charge a customer twice, it could result in refund requests, complaints, or even legal issues. Idempotency helps us avoid that.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  How Does Idempotency Work?
&lt;/h3&gt;

&lt;p&gt;Now that we know why idempotency is important, let’s take a look at how it actually works in a payment system. The process revolves around a unique idempotency key, which helps the system recognize duplicate requests.&lt;/p&gt;

&lt;h4&gt;
  
  
  Here’s the basic process:
&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;The user makes a payment request, providing a unique idempotency key.&lt;/li&gt;
&lt;li&gt;The system checks if it has seen this key before:

&lt;ul&gt;
&lt;li&gt;If the key exists, the system simply returns the previously processed result (no duplicate charge).&lt;/li&gt;
&lt;li&gt;If the key doesn’t exist, the system processes the payment and saves the result, along with the key.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;The payment is processed and stored, and the system ensures the same response is sent if the user retries the request.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here’s a flowchart that visualizes this process:&lt;/p&gt;




&lt;h3&gt;
  
  
  Flowchart for Managing Idempotency Calls
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvxb52ghsl7wclbn9osb0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvxb52ghsl7wclbn9osb0.png" alt="Flowchart for Managing Idempotency Calls" width="800" height="1384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Explanation of the Flowchart:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Client Request: The client initiates a payment request, passing an idempotency key.&lt;/li&gt;
&lt;li&gt;Check for Idempotency Key: The server checks whether the request has already been processed.&lt;/li&gt;
&lt;li&gt;If Key Exists: The server returns the cached response.&lt;/li&gt;
&lt;li&gt;If Key Does Not Exist: The server processes the payment.&lt;/li&gt;
&lt;li&gt;Payment Success: Based on the payment result, the status is saved as "SUCCESS" or "FAILED".&lt;/li&gt;
&lt;li&gt;Return Payment Response: The server ensures the same response is returned on retries.&lt;/li&gt;
&lt;/ol&gt;




&lt;h3&gt;
  
  
  How to Implement Idempotency in a Payment System
&lt;/h3&gt;

&lt;p&gt;Now, let's dive into how we can implement idempotency in our own payment system. It’s not as complicated as it might sound, and it can really improve the reliability of our system.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 1: Generate a Unique Idempotency Key
&lt;/h4&gt;

&lt;p&gt;When a user initiates a payment, generate a unique idempotency key for that transaction. This key will serve as an identifier to track the transaction, ensuring that any duplicate requests can be detected.&lt;/p&gt;
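
&lt;p&gt;A minimal client-side sketch of this step, assuming a random UUID is unique enough to serve as the key. The detail that matters is that the key is created once per payment and reused on every retry, because a fresh key per retry would make each retry look like a new payment. &lt;code&gt;RetryingPaymentClient&lt;/code&gt;, &lt;code&gt;PaymentCall&lt;/code&gt;, and &lt;code&gt;payWithRetry&lt;/code&gt; are hypothetical names, and &lt;code&gt;PaymentCall&lt;/code&gt; stands in for whatever HTTP call the client really makes.&lt;/p&gt;

```java
import java.io.IOException;
import java.util.UUID;

// Hypothetical client-side sketch: one key per payment, reused on every retry.
class RetryingPaymentClient {

    // Stand-in for the real HTTP call to the payment API (assumed interface).
    interface PaymentCall {
        String send(String idempotencyKey) throws IOException;
    }

    // Creates a new idempotency key; a random UUID is one common choice.
    static String newKey() {
        return UUID.randomUUID().toString();
    }

    // Sends the payment up to maxAttempts times, always with the same key,
    // so the server can recognize a retry as a duplicate of the first request.
    static String payWithRetry(PaymentCall call, int maxAttempts) {
        String key = newKey();
        IOException last = null;
        for (int remaining = maxAttempts; remaining > 0; remaining--) {
            try {
                return call.send(key);
            } catch (IOException e) {
                // For example a timeout: the charge may or may not have gone through,
                // which is exactly why the retry must carry the same key.
                last = e;
            }
        }
        throw new IllegalStateException("Payment failed after " + maxAttempts + " attempts", last);
    }
}
```

&lt;p&gt;How the key travels to the server is a design choice. Some payment APIs accept it in a request header, others as a field in the request body.&lt;/p&gt;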

&lt;h4&gt;
  
  
  Step 2: Check for the Idempotency Key
&lt;/h4&gt;

&lt;p&gt;Before processing the payment, the system should check if the idempotency key has been used before. If it has, return the previously cached response. If not, proceed with the payment.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 3: Store the Payment Result
&lt;/h4&gt;

&lt;p&gt;After the payment is processed, save the result in a database and cache it. This way, if the payment request comes in again (due to a retry), the system will recognize the key and avoid processing the payment again.&lt;/p&gt;

&lt;p&gt;Here’s how we can implement this in code.&lt;/p&gt;




&lt;h3&gt;
  
  
  Code for Implementing Idempotency
&lt;/h3&gt;

&lt;p&gt;Let’s look at a practical example using Java, a common backend language. We’ll simulate a simple payment service that uses an idempotency key to prevent duplicate payments.&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Database Schema for Payments
&lt;/h4&gt;

&lt;p&gt;We’ll need a table to store payment records. This table includes the &lt;code&gt;idempotency_key&lt;/code&gt;, the payment amount, currency, and the payment status (whether it was successful or failed).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;idempotency_key&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="nb"&gt;DECIMAL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  2. Payment Service with Idempotency Check
&lt;/h4&gt;

&lt;p&gt;Here’s how we can implement the logic in Java to check and handle idempotency.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.beans.factory.annotation.Autowired&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.transaction.annotation.Transactional&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.math.BigDecimal&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PaymentService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Autowired&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;PaymentRepository&lt;/span&gt; &lt;span class="n"&gt;paymentRepository&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Database repository for payment records&lt;/span&gt;

    &lt;span class="nd"&gt;@Autowired&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;CacheService&lt;/span&gt; &lt;span class="n"&gt;cacheService&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Cache service (could be Redis or in-memory cache)&lt;/span&gt;

    &lt;span class="nd"&gt;@Transactional&lt;/span&gt;
    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="nf"&gt;processPayment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;PaymentException&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

        &lt;span class="c1"&gt;// Check if the request has already been processed by looking into cache first&lt;/span&gt;
        &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="n"&gt;cachedPayment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cacheService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Check cache for processed payment&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cachedPayment&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// If cached payment exists, return it (cached response)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cachedPayment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// If no cached response, check the database (fallback)&lt;/span&gt;
        &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="n"&gt;existingPayment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;paymentRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;findByIdempotencyKey&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;existingPayment&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Save to cache for faster future access&lt;/span&gt;
            &lt;span class="n"&gt;cacheService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;existingPayment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; 
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;existingPayment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Process the payment as it is a new request&lt;/span&gt;
        &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="n"&gt;newPayment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Payment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"PROCESSING"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;paymentRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newPayment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;// Simulate payment processing (normally we'd call a payment gateway here)&lt;/span&gt;
        &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="n"&gt;paymentSuccessful&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;simulatePaymentProcessing&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;paymentSuccessful&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;newPayment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setStatus&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"SUCCESS"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;newPayment&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;setStatus&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"FAILED"&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="c1"&gt;// Save payment status to database and cache&lt;/span&gt;
        &lt;span class="n"&gt;paymentRepository&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;save&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;newPayment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="n"&gt;cacheService&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;newPayment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Cache the result for future requests&lt;/span&gt;

        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;newPayment&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;simulatePaymentProcessing&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;compareTo&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;BigDecimal&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;ZERO&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;PaymentService Class&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This service handles payment requests and uses the idempotency key to recognize requests it has already processed.
&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;processPayment&lt;/code&gt; method first checks the cache for the idempotency key, then the database. If either contains the payment, it returns that stored payment. If neither does, it processes the payment and saves the result to both the database and the cache.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;CacheService Class&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;CacheService&lt;/code&gt; manages an in-memory cache using a &lt;code&gt;ConcurrentHashMap&lt;/code&gt;. This is a simple way to store processed payments temporarily, speeding up future requests with the same idempotency key. In production systems, we would likely use a distributed cache like Redis.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;Payment Entity&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The &lt;code&gt;Payment&lt;/code&gt; class represents a payment record with fields for the idempotency key, amount, currency, payment status, and the timestamp of creation. This model is used to store the payment details in the database.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Cache Service Implementation
&lt;/h4&gt;

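&lt;p&gt;In this example, we’re using a simple in-memory cache. It never evicts entries and is not shared between application instances, so treat it as a placeholder for a real cache.&lt;br&gt;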
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;org.springframework.stereotype.Service&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.util.concurrent.ConcurrentHashMap&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Service&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CacheService&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="kd"&gt;final&lt;/span&gt; &lt;span class="nc"&gt;ConcurrentHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Payment&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ConcurrentHashMap&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&amp;gt;();&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;get&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Fetch from cache&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="n"&gt;payment&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;put&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payment&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Store in cache&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;void&lt;/span&gt; &lt;span class="nf"&gt;remove&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;remove&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// Remove from cache&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  4. Payment Entity (with Idempotency Key)
&lt;/h4&gt;

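&lt;p&gt;This is the payment entity that represents a payment record. For brevity, it uses the idempotency key itself as the identifier and omits the generated &lt;code&gt;id&lt;/code&gt; column from the table above.&lt;br&gt;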
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;javax.persistence.Entity&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;javax.persistence.Id&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.math.BigDecimal&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;java.time.LocalDateTime&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

&lt;span class="nd"&gt;@Entity&lt;/span&gt;
&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Payment&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;

    &lt;span class="nd"&gt;@Id&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// Unique key for each payment request&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// SUCCESS / FAILED / PROCESSING&lt;/span&gt;
    &lt;span class="kd"&gt;private&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt; &lt;span class="n"&gt;createdAt&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nf"&gt;Payment&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;idempotencyKey&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;BigDecimal&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;idempotencyKey&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;idempot&lt;/span&gt;

&lt;span class="n"&gt;encyKey&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;amount&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;amount&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currency&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;createdAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;LocalDateTime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;now&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Getters and setters&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Final Thoughts
&lt;/h3&gt;

&lt;p&gt;Idempotency is a small but essential detail in payment systems that ensures a smoother, error-free experience for our customers. By implementing an idempotency key to track each transaction, we can make sure that our users aren’t accidentally charged twice, no matter what happens during the payment process.&lt;/p&gt;

&lt;p&gt;I'd love to hear your thoughts on this! Have you had to handle idempotency in your own systems before? Or maybe you've run into any challenges with payment retries? Feel free to share your experiences or leave a comment below!&lt;/p&gt;

</description>
      <category>java</category>
      <category>systemdesign</category>
      <category>softwaredevelopment</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Designing an Internet Credit Purchase System</title>
      <dc:creator>Budi Widhiyanto</dc:creator>
      <pubDate>Wed, 15 Jan 2025 08:58:11 +0000</pubDate>
      <link>https://dev.to/budiwidhiyanto/designing-an-internet-credit-purchase-system-1175</link>
      <guid>https://dev.to/budiwidhiyanto/designing-an-internet-credit-purchase-system-1175</guid>
      <description>&lt;p&gt;During one of the technical interviews I faced, I was asked to design an e-commerce system that allows users to purchase internet credits from third-party providers.&lt;/p&gt;

&lt;p&gt;Confidently, I proposed a straightforward solution: display available packages, let users select one, process payments via an external gateway, and interact with the provider to deliver the credits. However, when asked about failure scenarios—like the provider running out of stock after a user completes payment—I realized my design lacked the resilience to handle such issues effectively.&lt;/p&gt;

&lt;p&gt;A few weeks ago, I researched &lt;a href="https://dev.to/budiwidhiyanto/designing-a-scalable-backend-for-flash-sales-4g9o"&gt;flash sale systems and inventory reservation patterns&lt;/a&gt;. Flash sales combine high demand with limited stock, so they need mechanisms that keep the system stable and manage customer expectations. One concept I came across was the temporary inventory reservation, which helps prevent overselling during peak times.&lt;/p&gt;

&lt;p&gt;This research reminded me of my interview experience. I recognized that applying these inventory reservation strategies could have addressed the shortcomings in my initial design. By incorporating temporary holds on inventory during the checkout process, the system could effectively handle scenarios where the provider's stock is depleted.&lt;/p&gt;

&lt;p&gt;In this write-up, I share what I learned and propose a refined approach to designing an internet credit purchase system. By integrating inventory reservation strategies, we can build a platform that handles failure scenarios, such as a provider running out of stock after payment, while still giving users a smooth experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Key Design Considerations
&lt;/h3&gt;

&lt;p&gt;When designing an internet credit purchasing system, there are a few key factors to consider to ensure a seamless, secure, and enjoyable user experience. Let’s break them down:&lt;/p&gt;

&lt;h4&gt;
  
  
  1.1 Quota Management
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Real-time Quota Verification: The system should instantly check if the internet credit packages are in stock, so users don’t accidentally select unavailable options.&lt;/li&gt;
&lt;li&gt;Temporary Quota Reservation: Add a mechanism to hold a selected package for a short period, giving users enough time to complete their purchase without the risk of losing the item.&lt;/li&gt;
&lt;li&gt;Handling Quota Conflicts: Develop strategies to manage situations where multiple users try to buy the same package at the same time, ensuring fair allocation.&lt;/li&gt;
&lt;li&gt;Cache Management for Package Information: Keep cache data accurate and up-to-date so users always see the right details and availability.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  1.2 Payment Processing
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Secure Payment Handling: Implement strong security measures to protect users’ payment details during transactions.&lt;/li&gt;
&lt;li&gt;Escrow System for Payment Protection: Use an escrow service to hold funds until the credits are delivered, keeping both buyers and providers safe.&lt;/li&gt;
&lt;li&gt;Payment Gateway Integration: Make sure the system connects smoothly with reliable payment gateways to ensure hassle-free transactions.&lt;/li&gt;
&lt;li&gt;Refund Mechanisms: Create clear and user-friendly processes for issuing refunds in case of failed payments or cancellations.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  1.3 Provider Integration
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;System Availability: Partner with providers who have reliable systems to ensure purchases are processed without disruptions.&lt;/li&gt;
&lt;li&gt;API Reliability: Work with providers offering stable, well-documented APIs for seamless integration.&lt;/li&gt;
&lt;li&gt;Service Activation Verification: Include checks to confirm that purchased credits are activated properly and promptly.&lt;/li&gt;
&lt;li&gt;Error Handling and Retries: Implement protocols to catch and resolve errors quickly, with retry mechanisms for any failed processes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  1.4 Transaction Safety
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Money Flow Control: Ensure funds are only released after transactions are completed successfully.&lt;/li&gt;
&lt;li&gt;Transaction Consistency: Keep accurate and consistent records of all transactions to prevent errors.&lt;/li&gt;
&lt;li&gt;Rollback Mechanisms: Have a plan to revert transactions if something goes wrong, protecting both users and the system.&lt;/li&gt;
&lt;li&gt;Audit Trail: Maintain detailed logs to help monitor and troubleshoot any issues effectively.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  1.5 User Experience
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Clear Error Messages: Provide users with understandable and informative error messages to guide them through any issues encountered.&lt;/li&gt;
&lt;li&gt;Transaction Status Visibility: Allow users to easily track the status of their purchases in real-time, enhancing transparency.&lt;/li&gt;
&lt;li&gt;Quick Package Loading: Optimize the system to load available packages swiftly, reducing waiting times for users.&lt;/li&gt;
&lt;li&gt;Real-time Updates: Keep users informed of any changes or updates to their transactions or available packages promptly.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By taking these considerations into account, we can design an internet credit purchasing system that is efficient, secure, and user-friendly, leading to higher user satisfaction and trust.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. System Design and Flow
&lt;/h3&gt;

&lt;p&gt;Building on the foundational considerations outlined above, the next step is translating these principles into a robust and effective system design. By carefully mapping out the interactions between various components, we can ensure that the system not only meets functional requirements but also provides a seamless user experience while maintaining reliability and scalability.&lt;/p&gt;

&lt;p&gt;In this section, we will delve into the system’s architecture and flow, showcasing how the core functionalities—like quota management, payment processing, and service activation—are implemented cohesively. The aim is to highlight how each design choice contributes to addressing potential challenges and delivering a dependable e-commerce credit purchasing platform.&lt;/p&gt;

&lt;p&gt;Let’s start with an overview of the system’s flow, visualized through a flowchart, to illustrate how users interact with the system from start to finish.&lt;/p&gt;

&lt;h4&gt;
  
  
  2.1 Flowchart Diagram
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvo7j4xxhrz11cyjjngey.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvo7j4xxhrz11cyjjngey.png" alt="Flowchart Internet Credit Purchase System" width="800" height="1980"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The system's flow is divided into six phases for clarity:&lt;/p&gt;

&lt;h5&gt;
  
  
  Package Selection Phase
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;First, the user visits the package selection page, where the app fetches package data from a cache. This data includes available packages and their cached quota information, which is then displayed to the user.&lt;/li&gt;
&lt;li&gt;The user picks a package and clicks "Buy."&lt;/li&gt;
&lt;li&gt;If the cached quota for that package isn’t available, the app shows a "Not Available" message and takes the user back to the selection page. Otherwise, the flow moves on to purchase initiation.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Purchase Initiation
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Next, the system attempts to reserve the quota for the chosen package.&lt;/li&gt;
&lt;li&gt;If the reservation fails, the user sees an error message and is redirected back to the selection page.&lt;/li&gt;
&lt;li&gt;If the reservation is successful, the user moves forward to the payment page.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Payment Phase
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;At this stage, the user starts the payment process and gets redirected to a third-party payment gateway.&lt;/li&gt;
&lt;li&gt;The app waits for a response (callback) from the payment gateway to confirm the payment status.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Payment Processing
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;The app checks the payment gateway's callback to validate the payment:

&lt;ul&gt;
&lt;li&gt;For invalid callbacks, the system logs the issue and halts further steps.&lt;/li&gt;
&lt;li&gt;For valid callbacks:

&lt;ul&gt;
&lt;li&gt;If the payment fails: The system releases the reserved quota and informs the user about the issue.&lt;/li&gt;
&lt;li&gt;If the payment succeeds: The system verifies the payment, holds the funds in escrow, and creates a new order.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;/ul&gt;

&lt;h5&gt;
  
  
  Service Activation
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Once the payment is successful, the system asks the provider to activate the service.

&lt;ul&gt;
&lt;li&gt;If the activation fails: The escrow funds are refunded to the customer, and they’re notified about the failure.&lt;/li&gt;
&lt;li&gt;If the activation succeeds: The system verifies the activation. 

&lt;ul&gt;
&lt;li&gt;If the verification fails, the customer gets a refund. &lt;/li&gt;
&lt;li&gt;If the verification succeeds, the escrow funds are released to the provider, and the customer receives a notification.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;/ul&gt;
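&lt;p&gt;The activation branch above comes down to one decision about the escrowed funds. The snippet below is a minimal sketch of that decision, not the service code from the implementation section. &lt;code&gt;EscrowAction&lt;/code&gt; and &lt;code&gt;EscrowSettlement&lt;/code&gt; are hypothetical names introduced only for illustration.&lt;/p&gt;

```java
// Hypothetical names used for illustration only.
enum EscrowAction { RELEASE_TO_PROVIDER, REFUND_TO_CUSTOMER }

final class EscrowSettlement {
    private EscrowSettlement() { }

    // Funds go to the provider only when activation succeeded and was then verified.
    static EscrowAction settle(boolean activationSucceeded, boolean verificationSucceeded) {
        if (activationSucceeded) {
            if (verificationSucceeded) {
                return EscrowAction.RELEASE_TO_PROVIDER;
            }
        }
        // Every other combination returns the money to the customer.
        return EscrowAction.REFUND_TO_CUSTOMER;
    }
}
```

&lt;p&gt;Keeping this rule in one small function makes it easy to test every combination before any real money moves.&lt;/p&gt;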

&lt;h5&gt;
  
  
  Background Processes
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Periodic Cache Updates: Package data cache is updated regularly.&lt;/li&gt;
&lt;li&gt;Real-time Quota Updates: Quota changes are communicated via WebSocket connections.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This flow ensures a smooth, reliable experience for users, while also managing resources and potential errors effectively.&lt;/p&gt;

&lt;h4&gt;
  
  
  2.2 Sequence Diagram
&lt;/h4&gt;

&lt;p&gt;The sequence diagram below helps to illustrate the interaction between different roles and components.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6rnp4o462oqktq775vq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj6rnp4o462oqktq775vq.png" alt="Sequence Diagram Internet Credit Purchase System" width="800" height="1433"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The sequence diagram follows the same six phases, this time showing which component handles each step:&lt;/p&gt;

&lt;h4&gt;
  
  
  Package Selection Phase
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The customer starts by visiting the package selection page.&lt;/li&gt;
&lt;li&gt;The frontend retrieves package data from the cache and displays all the available packages, along with their cached quota information, to the customer.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Purchase Initiation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Once the customer selects a package and clicks "Buy," the frontend sends a purchase request to the backend.&lt;/li&gt;
&lt;li&gt;The backend checks with the provider to see if the selected package’s quota is still available in real time.&lt;/li&gt;
&lt;li&gt;If the quota is available, the backend reserves it temporarily with the provider for 15 minutes.&lt;/li&gt;
&lt;li&gt;The backend then sends a reservation confirmation to the frontend, and the customer is redirected to the payment page.&lt;/li&gt;
&lt;/ul&gt;
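&lt;p&gt;To make the 15-minute reservation concrete, here is a minimal sketch of a time-limited hold. It assumes the hold is tracked as an expiry timestamp. &lt;code&gt;QuotaHold&lt;/code&gt; and its methods are hypothetical names, and a real system would also persist the hold and release it with the provider.&lt;/p&gt;

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical sketch: a temporary quota hold that lapses after a fixed window.
final class QuotaHold {
    // The 15-minute window comes from the purchase initiation step described above.
    static final Duration HOLD_WINDOW = Duration.ofMinutes(15);

    private final String packageId;
    private final Instant expiresAt;

    private QuotaHold(String packageId, Instant expiresAt) {
        this.packageId = packageId;
        this.expiresAt = expiresAt;
    }

    // Creates a hold that starts at reservedAt and expires HOLD_WINDOW later.
    static QuotaHold reserve(String packageId, Instant reservedAt) {
        return new QuotaHold(packageId, reservedAt.plus(HOLD_WINDOW));
    }

    // A hold is usable only strictly before its expiry instant.
    boolean isActiveAt(Instant now) {
        return now.isBefore(expiresAt);
    }

    String getPackageId() {
        return packageId;
    }

    Instant getExpiresAt() {
        return expiresAt;
    }
}
```

&lt;p&gt;Passing the current time in as a parameter, rather than reading the clock inside the class, keeps the expiry rule easy to test.&lt;/p&gt;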

&lt;h4&gt;
  
  
  Payment Phase
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The customer proceeds to the payment page and submits their payment details.&lt;/li&gt;
&lt;li&gt;The frontend sends this information to the backend, which initializes a payment session with the payment gateway.&lt;/li&gt;
&lt;li&gt;Once the payment session is ready, the backend shares the session details with the frontend.&lt;/li&gt;
&lt;li&gt;The frontend redirects the customer to the payment gateway to complete the payment.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Payment Processing
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;At the payment gateway, the customer enters their payment information and completes the payment process.&lt;/li&gt;
&lt;li&gt;The payment gateway notifies the backend of the payment status through a callback:

&lt;ul&gt;
&lt;li&gt;If the payment is successful:

&lt;ul&gt;
&lt;li&gt;The backend verifies the payment status with the payment gateway.&lt;/li&gt;
&lt;li&gt;The payment is held in escrow, and the backend confirms the escrow hold.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;If the payment fails:

&lt;ul&gt;
&lt;li&gt;The backend releases the reserved quota with the provider and updates the payment status to reflect the failure.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;If no callback is received within the expected time:

&lt;ul&gt;
&lt;li&gt;The backend periodically polls the payment gateway to check the payment status. If the payment fails or times out, the backend releases the reserved quota.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;/ul&gt;
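&lt;p&gt;The three branches above can be sketched as a small mapping from payment outcome to the next step. This is an illustrative sketch only. &lt;code&gt;PaymentOutcome&lt;/code&gt;, &lt;code&gt;NextStep&lt;/code&gt;, and &lt;code&gt;PaymentFlow&lt;/code&gt; are hypothetical names that do not appear in the implementation section.&lt;/p&gt;

```java
// Hypothetical names used for illustration only.
enum PaymentOutcome { SUCCEEDED, FAILED, NO_CALLBACK }

enum NextStep { VERIFY_AND_HOLD_IN_ESCROW, RELEASE_QUOTA, POLL_GATEWAY }

final class PaymentFlow {
    private PaymentFlow() { }

    // Maps what the backend knows about the payment to the action it takes next.
    static NextStep nextStep(PaymentOutcome outcome) {
        switch (outcome) {
            case SUCCEEDED:
                return NextStep.VERIFY_AND_HOLD_IN_ESCROW;
            case FAILED:
                return NextStep.RELEASE_QUOTA;
            case NO_CALLBACK:
                // Polling eventually resolves to SUCCEEDED or FAILED (a timeout counts as FAILED).
                return NextStep.POLL_GATEWAY;
            default:
                throw new IllegalArgumentException("Unknown outcome: " + outcome);
        }
    }
}
```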

&lt;h4&gt;
  
  
  Service Activation
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;If the payment is successful, the backend requests the provider to activate the service.&lt;/li&gt;
&lt;li&gt;The provider responds with the activation status, and the backend verifies the activation:

&lt;ul&gt;
&lt;li&gt;If the activation is successful:

&lt;ul&gt;
&lt;li&gt;The backend releases the payment from escrow to the provider.&lt;/li&gt;
&lt;li&gt;A success notification is sent to the customer, letting them know the service is ready.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;If the activation fails:

&lt;ul&gt;
&lt;li&gt;The backend refunds the escrowed funds to the customer.&lt;/li&gt;
&lt;li&gt;A failure notification is sent to inform the customer about the issue.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;/li&gt;

&lt;/ul&gt;

&lt;h4&gt;
  
  
  Background Processes
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The provider sends real-time updates about package quotas to the backend via WebSocket connections.&lt;/li&gt;
&lt;li&gt;These updates keep the cache close to the provider's actual quota, so customers see current package availability when browsing.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3. Technical Implementation
&lt;/h3&gt;

&lt;p&gt;Now that we’ve outlined the system’s flow and interactions, it’s time to dive into how it all comes together in code. This section breaks down the implementation step by step, showing how the design is translated into working parts that handle everything from managing orders to interacting with providers and payment systems.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Domain Models
@Getter @Setter
@Entity
public class Package {
    @Id
    private String id;
    private String name;
    private BigDecimal price;
    private BigDecimal providerCost;
    private String description;
    private boolean active;
}

@Getter @Setter
@Entity
public class Order {
    @Id
    private String id;
    private String customerId;
    private String packageId;
    private String reservationId;
    private String paymentId;
    private String escrowId;
    private OrderStatus status;
    private BigDecimal amount;
    private BigDecimal providerCost;
    private LocalDateTime createdAt;
    private LocalDateTime updatedAt;
}

@Getter @Setter
@Entity
public class QuotaReservation {
    @Id
    private String id;
    private String packageId;
    private LocalDateTime expiresAt;
    private ReservationStatus status;
}

// Enums
public enum OrderStatus {
    CREATED, RESERVED, PAYMENT_PENDING, PAYMENT_COMPLETED, 
    IN_ESCROW, ACTIVATING, ACTIVATION_FAILED, COMPLETED, REFUNDED
}

public enum ReservationStatus {
    ACTIVE, EXPIRED, USED, CANCELLED
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s what these classes do:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Package&lt;/strong&gt;: This is where we define the internet credit packages that users can purchase. It keeps track of details like the package ID, name, price, provider cost, description, and whether the package is active or not.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Order&lt;/strong&gt;: Think of this as a record of user purchases. It includes information such as the order ID, customer ID, the selected package ID, and related details like the reservation ID, payment ID, escrow ID, order status, payment amount, provider cost, and timestamps.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;QuotaReservation&lt;/strong&gt;: This handles temporary reservations for package quotas. It logs the reservation ID, the package it’s tied to, when it expires, and its current status (like active or expired).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OrderStatus Enum&lt;/strong&gt;: This enum maps out all the possible stages an order can go through, from &lt;code&gt;CREATED&lt;/code&gt; and &lt;code&gt;RESERVED&lt;/code&gt; to &lt;code&gt;PAYMENT_PENDING&lt;/code&gt;, &lt;code&gt;COMPLETED&lt;/code&gt;, or even &lt;code&gt;REFUNDED&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ReservationStatus Enum&lt;/strong&gt;: Similarly, this enum tracks the state of a quota reservation, whether it’s &lt;code&gt;ACTIVE&lt;/code&gt;, &lt;code&gt;EXPIRED&lt;/code&gt;, &lt;code&gt;USED&lt;/code&gt;, or &lt;code&gt;CANCELLED&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Together, these classes and enums build the backbone for managing packages, orders, and quota reservations in the system. It’s a simple yet structured approach to handle e-commerce functionality effectively.&lt;br&gt;
&lt;/p&gt;
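&lt;p&gt;The article lists the order statuses but not which moves between them are legal. The sketch below shows one way to guard transitions. The allowed moves are my assumption, based on the happy path plus the refund branch of the flow, and &lt;code&gt;OrderTransitions&lt;/code&gt; is a hypothetical helper. The enum is repeated here only so the snippet compiles on its own.&lt;/p&gt;

```java
// Same values as the OrderStatus enum above, repeated so the sketch is self-contained.
enum OrderStatus {
    CREATED, RESERVED, PAYMENT_PENDING, PAYMENT_COMPLETED,
    IN_ESCROW, ACTIVATING, ACTIVATION_FAILED, COMPLETED, REFUNDED
}

// Hypothetical helper: the transition table is an assumption, not taken from the article.
final class OrderTransitions {
    private OrderTransitions() { }

    // Returns true when an order may move from one status to the other.
    static boolean canMove(OrderStatus from, OrderStatus to) {
        switch (from) {
            case CREATED:
                return to == OrderStatus.RESERVED;
            case RESERVED:
                return to == OrderStatus.PAYMENT_PENDING;
            case PAYMENT_PENDING:
                return to == OrderStatus.PAYMENT_COMPLETED;
            case PAYMENT_COMPLETED:
                return to == OrderStatus.IN_ESCROW;
            case IN_ESCROW:
                return to == OrderStatus.ACTIVATING;
            case ACTIVATING:
                return to == OrderStatus.COMPLETED || to == OrderStatus.ACTIVATION_FAILED;
            case ACTIVATION_FAILED:
                return to == OrderStatus.REFUNDED;
            default:
                // COMPLETED and REFUNDED are treated as terminal states.
                return false;
        }
    }
}
```

&lt;p&gt;A check like this, run before every status update, rejects out-of-order changes such as a late callback trying to reopen a refunded order.&lt;/p&gt;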

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Request/Response DTOs
@Getter @Setter
public class OrderRequest {
    private String customerId;
    private String packageId;
    private BigDecimal amount;
}

@Getter @Setter
public class PaymentCallback {
    private String orderId;
    private String paymentId;
    private String status;
    private BigDecimal amount;
    private LocalDateTime timestamp;
}

@Getter @Setter
public class QuotaResponse {
    private String packageId;
    private boolean available;
    private Integer remainingQuota;
    private LocalDateTime timestamp;
}

@Getter @Setter
public class ReservationResponse {
    private String id;
    private String packageId;
    private LocalDateTime expiresAt;
    private ReservationStatus status;
}

@Getter @Setter
public class ActivationResponse {
    private String orderId;
    private boolean success;
    private String activationId;
    private String errorCode;
    private String errorMessage;
}

@Getter @Setter
public class VerificationResponse {
    private String orderId;
    private String activationId;
    private boolean success;
    private String status;
    private LocalDateTime activatedAt;
}

@Getter @Setter
public class PaymentRequest {
    private String orderId;
    private BigDecimal amount;
    private String currency;
    private String customerId;
    private String returnUrl;
    private String callbackUrl;
}

@Getter @Setter
public class PaymentSession {
    private String sessionId;
    private String paymentUrl;
    private LocalDateTime expiresAt;
    private String status;
}

@Getter @Setter
public class EscrowResponse {
    private String id;
    private String paymentId;
    private BigDecimal amount;
    private String status;
    private LocalDateTime createdAt;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let’s break it down:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;OrderRequest&lt;/strong&gt;: This is all about the data needed to create a new order. It includes the customer ID, the package they want to buy, and the total amount they’ll pay. &lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;PaymentCallback&lt;/strong&gt;: Think of this as a notification from the payment gateway. After a payment attempt, it provides details like the order ID, payment ID, status (success or failure), the amount paid, and when the payment happened.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;QuotaResponse&lt;/strong&gt;: This one’s about checking availability. It tells us if a package is available, how much quota is left, and when the information was last updated.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ReservationResponse&lt;/strong&gt;: Once a package is reserved, this gives you all the details: the reservation ID, the associated package, when the reservation expires, and its current status (like active or expired).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;ActivationResponse&lt;/strong&gt;: This tells us how the service activation went. It reports whether activation succeeded, along with an activation ID and, if something went wrong, an error code and message.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;VerificationResponse&lt;/strong&gt;: After activation, we verify if everything went smoothly. This includes the order ID, activation ID, success status, and the time it was activated.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;PaymentRequest&lt;/strong&gt;: Before starting the payment process, this DTO collects the necessary details: the order ID, the amount to be paid, the currency, the customer ID, and the return and callback URLs.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;PaymentSession&lt;/strong&gt;: This is what gets created when the payment process kicks off. It includes the session ID, the payment URL (where the user goes to pay), when it expires, and the session status.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;EscrowResponse&lt;/strong&gt;: If the funds are held in escrow, this tells us all about it—like the escrow ID, payment ID, the amount held, status, and when it was created.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All of these classes define the building blocks for communication between different parts of the system—whether it’s requests going out or responses coming back. They ensure everyone (and everything) is on the same page.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Cache Service
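@Service
@Slf4j
@RequiredArgsConstructor // Lombok: generates the constructor Spring uses to inject the final fields
public class PackageCacheService {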
    private final Cache&amp;lt;String, Package&amp;gt; packageCache;
    private final ProviderClient providerClient;

    @Scheduled(fixedRate = 300000) // 5 minutes
    public void updateCache() {
        try {
            List&amp;lt;Package&amp;gt; packages = providerClient.getAllPackages();
            packages.forEach(pkg -&amp;gt; 
                packageCache.put(pkg.getId(), pkg));
        } catch (Exception e) {
            log.error("Failed to update package cache", e);
        }
    }

    public Package getPackage(String id) {
        return packageCache.get(id);
    }

    public void updatePackageQuota(QuotaUpdate update) {
        Package pkg = packageCache.get(update.getPackageId());
        if (pkg != null) {
            // Update quota information
            packageCache.put(update.getPackageId(), pkg);
        }
    }
}

// Provider Integration
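@Service
@RequiredArgsConstructor // Lombok: generates the constructor Spring uses to inject the final fields
public class ProviderClient {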
    private final WebClient webClient;
    private final RetryTemplate retryTemplate;

    public QuotaResponse checkQuota(String packageId) {
        return retryTemplate.execute(context -&amp;gt; 
            webClient.get()
                    .uri("/packages/{id}/quota", packageId)
                    .retrieve()
                    .bodyToMono(QuotaResponse.class)
                    .block()
        );
    }

    public ReservationResponse reserveQuota(String packageId) {
        return webClient.post()
                .uri("/packages/{id}/reserve", packageId)
                .retrieve()
                .bodyToMono(ReservationResponse.class)
                .block();
    }

    public ActivationResponse activateService(String orderId) {
        return webClient.post()
                .uri("/orders/{id}/activate", orderId)
                .retrieve()
                .bodyToMono(ActivationResponse.class)
                .block();
    }

    public VerificationResponse verifyActivation(String orderId) {
        return webClient.get()
                .uri("/orders/{id}/verify", orderId)
                .retrieve()
                .bodyToMono(VerificationResponse.class)
                .block();
    }

    public List&amp;lt;Package&amp;gt; getAllPackages() {
        return webClient.get()
                .uri("/packages")
                .retrieve()
                .bodyToFlux(Package.class)
                .collectList()
                .block();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Cache Service
&lt;/h5&gt;

&lt;h6&gt;
  
  
  1. Purpose:
&lt;/h6&gt;

&lt;p&gt;This service takes care of a local cache that stores package data. The goal is to make the system faster and reduce unnecessary calls to the provider's API.&lt;/p&gt;

&lt;h6&gt;
  
  
  2. Key Features:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;updateCache()&lt;/code&gt;: This method refreshes the local cache every 5 minutes by fetching all package data from the provider. If the fetch fails, the error is logged and the existing entries are kept.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getPackage()&lt;/code&gt;: This method retrieves package info from the cache using its ID.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;updatePackageQuota()&lt;/code&gt;: When quota details change, this method updates the cached entry for that package. In the listing above, the actual field update is left as a placeholder comment.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Provider Integration
&lt;/h5&gt;

&lt;h6&gt;
  
  
  1. Purpose:
&lt;/h6&gt;

&lt;p&gt;This service handles communication with the provider's API. It manages tasks like checking quotas, reserving packages, activating services, and verifying those activations.&lt;/p&gt;

&lt;h6&gt;
  
  
  2. Key Features:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;checkQuota()&lt;/code&gt;: This method checks if a package has enough quota available by calling the provider's API.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;reserveQuota()&lt;/code&gt;: It reserves a package's quota for a customer by sending a request to the provider.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;activateService()&lt;/code&gt;: When it's time to activate a service for an order, this method handles the request to the provider.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;verifyActivation()&lt;/code&gt;: After activation, this method confirms whether everything was successful.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;getAllPackages()&lt;/code&gt;: This method retrieves all available packages from the provider, which is useful for updating the cache or displaying package options to users.&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  3. Retry Mechanism:
&lt;/h6&gt;

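&lt;p&gt;In the listing above, &lt;code&gt;checkQuota()&lt;/code&gt; is wrapped in Spring Retry's &lt;code&gt;RetryTemplate&lt;/code&gt;, so temporary failures when reading quota are retried automatically. The other calls are sent once. Retrying state-changing calls such as &lt;code&gt;reserveQuota()&lt;/code&gt; or &lt;code&gt;activateService()&lt;/code&gt; is only safe if the provider treats repeated requests as idempotent.&lt;/p&gt;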
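&lt;p&gt;For readers who have not used Spring Retry, the idea behind a retry wrapper can be sketched with a plain loop. This is a hand-rolled illustration, not the &lt;code&gt;RetryTemplate&lt;/code&gt; API. &lt;code&gt;ProviderCall&lt;/code&gt; and &lt;code&gt;SimpleRetry&lt;/code&gt; are hypothetical names, and a production version would usually add a delay between attempts.&lt;/p&gt;

```java
// Hypothetical functional interface standing in for a call to the provider.
interface ProviderCall {
    String execute();
}

// Hand-rolled retry loop for illustration; not Spring's RetryTemplate.
final class SimpleRetry {
    private SimpleRetry() { }

    // Runs the call until it succeeds or maxAttempts is reached.
    // At least one attempt is always made; the last failure is rethrown.
    static String callWithRetry(ProviderCall call, int maxAttempts) {
        int attempt = 1;
        while (true) {
            try {
                return call.execute();
            } catch (RuntimeException e) {
                if (attempt >= maxAttempts) {
                    throw e;
                }
                attempt++;
            }
        }
    }
}
```

&lt;p&gt;This pattern suits read-only calls like a quota check. For calls that change state, a retry can repeat the side effect unless the receiving API is idempotent.&lt;/p&gt;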

&lt;p&gt;By combining these features, this code ensures the system efficiently manages package data while maintaining smooth and dependable communication with the provider's API.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Payment Gateway Integration
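@Service
@RequiredArgsConstructor // Lombok: generates the constructor Spring uses to inject the final field
public class PaymentGatewayClient {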
    private final WebClient webClient;

    public PaymentSession initializePayment(PaymentRequest request) {
        return webClient.post()
                .uri("/payments/initialize")
                .bodyValue(request) // bodyValue(...) sends a plain object as the request body
                .retrieve()
                .bodyToMono(PaymentSession.class)
                .block();
    }

    public EscrowResponse holdInEscrow(String paymentId) {
        return webClient.post()
                .uri("/payments/{id}/escrow", paymentId)
                .retrieve()
                .bodyToMono(EscrowResponse.class)
                .block();
    }

    public void releaseToProvider(String escrowId) {
        webClient.post()
                .uri("/escrow/{id}/release", escrowId)
                .retrieve()
                .bodyToMono(Void.class)
                .block();
    }

    public void refundToCustomer(String escrowId) {
        webClient.post()
                .uri("/escrow/{id}/refund", escrowId)
                .retrieve()
                .bodyToMono(Void.class)
                .block();
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Payment Gateway Integration
&lt;/h5&gt;

&lt;p&gt;This class plays a key role in managing how the system interacts with the payment gateway to handle financial transactions smoothly and securely.&lt;/p&gt;

&lt;h6&gt;
  
  
  1. &lt;code&gt;initializePayment(PaymentRequest request)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Think of this as starting the payment process. It sends a request to the payment gateway with all the payment details.&lt;/li&gt;
&lt;li&gt;It returns a &lt;code&gt;PaymentSession&lt;/code&gt; object, which includes information like the payment URL and session status.&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  2. &lt;code&gt;holdInEscrow(String paymentId)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;This method secures the payment in an escrow account using the given payment ID.&lt;/li&gt;
&lt;li&gt;It provides an &lt;code&gt;EscrowResponse&lt;/code&gt; object that contains all the details about the escrowed funds.&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  3. &lt;code&gt;releaseToProvider(String escrowId)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;After the service is successfully activated, this method releases the funds from escrow to the service provider.&lt;/li&gt;
&lt;li&gt;The escrow ID is used to identify and release the correct funds.&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  4. &lt;code&gt;refundToCustomer(String escrowId)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;If something goes wrong, such as a failed service activation, this method refunds the escrowed funds back to the customer.&lt;/li&gt;
&lt;li&gt;It uses the escrow ID to process the refund.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Key Features:
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;The class uses &lt;code&gt;WebClient&lt;/code&gt; to send HTTP requests to the payment gateway's REST API, ensuring seamless integration.&lt;/li&gt;
&lt;li&gt;It handles critical operations like starting payments, managing escrow, and processing refunds.&lt;/li&gt;
&lt;li&gt;All methods call &lt;code&gt;.block()&lt;/code&gt;, so each request completes before the next step runs. This keeps the flow easy to follow, at the cost of holding a thread while waiting for the gateway.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This class is a crucial piece of the puzzle when it comes to managing secure and efficient financial transactions in the system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Notification DTOs
@Getter @Setter
public class EmailNotification {
    private String to;
    private String subject;
    private String templateId;
    private Map&amp;lt;String, Object&amp;gt; templateData;
}

@Getter @Setter
public class SmsNotification {
    private String phoneNumber;
    private String templateId;
    private Map&amp;lt;String, Object&amp;gt; templateData;
}

// Notification Service
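@Service
@Slf4j
@RequiredArgsConstructor // Lombok generates the constructor Spring uses to inject the final fields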
public class NotificationService {
    private final EmailService emailService;
    private final SmsService smsService;
    private final QuotaWebSocketHandler webSocketHandler;

    public void sendSuccessNotification(Order order) {
        try {
            EmailNotification email = buildSuccessEmail(order);
            emailService.sendEmail(email);

            SmsNotification sms = buildSuccessSms(order);
            smsService.sendSms(sms);

            webSocketHandler.sendOrderUpdate(order);
        } catch (Exception e) {
            log.error("Failed to send success notification for order: {}", order.getId(), e);
        }
    }

    public void sendFailureNotification(Order order) {
        try {
            EmailNotification email = buildFailureEmail(order);
            emailService.sendEmail(email);

            SmsNotification sms = buildFailureSms(order);
            smsService.sendSms(sms);

            webSocketHandler.sendOrderUpdate(order);
        } catch (Exception e) {
            log.error("Failed to send failure notification for order: {}", order.getId(), e);
        }
    }

    private EmailNotification buildSuccessEmail(Order order) {
        EmailNotification notification = new EmailNotification();
        notification.setSubject("Order Completed Successfully");
        notification.setTemplateId("ORDER_SUCCESS");
        notification.setTemplateData(Map.of(
            "orderId", order.getId(),
            "packageId", order.getPackageId(),
            "amount", order.getAmount()
        ));
        return notification;
    }

    private SmsNotification buildSuccessSms(Order order) {
        SmsNotification notification = new SmsNotification();
        notification.setTemplateId("ORDER_SUCCESS_SMS");
        notification.setTemplateData(Map.of(
            "orderId", order.getId()
        ));
        return notification;
    }

    private EmailNotification buildFailureEmail(Order order) {
        EmailNotification notification = new EmailNotification();
        notification.setSubject("Order Processing Failed");
        notification.setTemplateId("ORDER_FAILURE");
        notification.setTemplateData(Map.of(
            "orderId", order.getId(),
            "packageId", order.getPackageId()
        ));
        return notification;
    }

    private SmsNotification buildFailureSms(Order order) {
        SmsNotification notification = new SmsNotification();
        notification.setTemplateId("ORDER_FAILURE_SMS");
        notification.setTemplateData(Map.of(
            "orderId", order.getId()
        ));
        return notification;
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  Notification DTOs
&lt;/h5&gt;

&lt;h6&gt;
  
  
  1. EmailNotification:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Think of this as a blueprint for sending email notifications. It includes:

&lt;ul&gt;
&lt;li&gt;The recipient's email (&lt;code&gt;to&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;The subject of the email.&lt;/li&gt;
&lt;li&gt;A template ID to determine the format.&lt;/li&gt;
&lt;li&gt;Dynamic data (&lt;code&gt;templateData&lt;/code&gt;) to personalize the content.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h6&gt;
  
  
  2. SmsNotification:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Similar to the email notification but tailored for SMS. It includes:

&lt;ul&gt;
&lt;li&gt;The recipient's phone number (&lt;code&gt;phoneNumber&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;A template ID for the message format.&lt;/li&gt;
&lt;li&gt;Dynamic data (&lt;code&gt;templateData&lt;/code&gt;) for personalization.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;
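
&lt;p&gt;To make the role of &lt;code&gt;templateData&lt;/code&gt; concrete, here is a minimal sketch of how a template could be filled in. The article does not show the rendering step, so the &lt;code&gt;{{key}}&lt;/code&gt; placeholder syntax, the template text, and the &lt;code&gt;TemplateDemo&lt;/code&gt; class are all assumptions made for illustration, not the actual email service.&lt;/p&gt;

```java
import java.util.Map;

// Hypothetical sketch: the article never shows how templates are rendered.
final class TemplateDemo {
    // Assumed body for the ORDER_SUCCESS template id.
    static final String ORDER_SUCCESS =
        "Order {{orderId}} for package {{packageId}} is complete. Amount: {{amount}}";

    // Builds templateData the way buildSuccessEmail does, then fills the placeholders.
    static String renderSuccessEmail(String orderId, String packageId, long amount) {
        var templateData = Map.of(
            "orderId", orderId,
            "packageId", packageId,
            "amount", amount
        );
        String text = ORDER_SUCCESS;
        for (var entry : templateData.entrySet()) {
            // Replace each {{key}} with the matching value from the map
            text = text.replace("{{" + entry.getKey() + "}}", String.valueOf(entry.getValue()));
        }
        return text;
    }

    public static void main(String[] args) {
        System.out.println(renderSuccessEmail("ORD-1", "PKG-9", 50000));
    }
}
```

&lt;p&gt;A real email or SMS provider would usually perform this substitution on its side, keyed by the &lt;code&gt;templateId&lt;/code&gt;. The sketch only shows why the DTO carries a map of values rather than a finished message.&lt;/p&gt;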

&lt;h5&gt;
  
  
  Notification Service
&lt;/h5&gt;

&lt;p&gt;This service handles all the notifications sent to users about their order status. Here's how it works:&lt;/p&gt;

&lt;h6&gt;
  
  
  1. &lt;code&gt;sendSuccessNotification(Order order)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;This method handles sending success notifications. It uses:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;buildSuccessEmail&lt;/code&gt; to create an email notification.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;buildSuccessSms&lt;/code&gt; to create an SMS notification.&lt;/li&gt;
&lt;li&gt;It also sends real-time updates through WebSocket using &lt;code&gt;QuotaWebSocketHandler&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h6&gt;
  
  
  2. &lt;code&gt;sendFailureNotification(Order order)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;This one takes care of failure notifications. It uses:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;buildFailureEmail&lt;/code&gt; for email messages.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;buildFailureSms&lt;/code&gt; for SMS messages.&lt;/li&gt;
&lt;li&gt;Like success notifications, it also sends WebSocket updates.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h6&gt;
  
  
  3. Helper Methods:
&lt;/h6&gt;

&lt;ul&gt;
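&lt;li&gt;
&lt;code&gt;buildSuccessEmail&lt;/code&gt; and &lt;code&gt;buildFailureEmail&lt;/code&gt;: These methods create email notifications based on whether the order succeeded or failed. They use templates and the order's details.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;buildSuccessSms&lt;/code&gt; and &lt;code&gt;buildFailureSms&lt;/code&gt;: These do the same for SMS notifications.&lt;/li&gt;
&lt;li&gt;None of the four builders in the snippet sets the recipient (&lt;code&gt;to&lt;/code&gt; or &lt;code&gt;phoneNumber&lt;/code&gt;), so a real implementation has to fill it in from the customer's contact details before sending.&lt;/li&gt;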
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Additional Features:
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;WebSocket Updates: Keeps the front-end updated in real time using &lt;code&gt;QuotaWebSocketHandler&lt;/code&gt;.&lt;/li&gt;
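&lt;li&gt;Error Logging: Exceptions are caught and logged instead of being rethrown, so a notification problem never fails the order itself. Because all three channels share one &lt;code&gt;try&lt;/code&gt; block, a failure in the email step also skips the SMS and WebSocket steps.&lt;/li&gt;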
&lt;/ul&gt;
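
&lt;p&gt;One possible refinement, shown here as a sketch under stated assumptions, is to give each channel its own &lt;code&gt;try/catch&lt;/code&gt; so that a failed email does not block the SMS or the WebSocket update. The &lt;code&gt;NotificationChannel&lt;/code&gt; interface and the &lt;code&gt;IsolatedNotifier&lt;/code&gt; class are hypothetical stand-ins for the email, SMS, and WebSocket services, not classes from this system.&lt;/p&gt;

```java
// Hypothetical sketch of per-channel failure isolation.
final class IsolatedNotifier {
    // Tries every channel in turn; a failure in one is logged and the loop continues.
    // Returns the number of channels that delivered successfully.
    static int sendToAll(String orderId, NotificationChannel... channels) {
        int delivered = 0;
        for (NotificationChannel channel : channels) {
            try {
                channel.send(orderId);
                delivered++;
            } catch (Exception e) {
                System.err.println("Channel failed for order " + orderId + ": " + e.getMessage());
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        NotificationChannel email = id -> { throw new IllegalStateException("SMTP down"); };
        NotificationChannel sms = id -> System.out.println("SMS sent for " + id);
        NotificationChannel webSocket = id -> System.out.println("WebSocket update sent for " + id);
        System.out.println(sendToAll("ORD-1", email, sms, webSocket) + " of 3 channels delivered");
    }
}

// Assumed abstraction over one delivery channel (email, SMS, or WebSocket).
interface NotificationChannel {
    void send(String orderId) throws Exception;
}
```

&lt;p&gt;In the demo, the email channel throws, yet the SMS and WebSocket channels still run. Whether that trade-off is right depends on how strongly the channels should be coupled.&lt;/p&gt;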

&lt;p&gt;This service ensures that users are always in the loop about their orders, whether it's through email, SMS, or real-time updates.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// WebSocket Related Classes
@Getter @Setter
public class QuotaUpdate {
    private String packageId;
    private Integer availableQuota;
    private LocalDateTime timestamp;
}

// WebSocket Configuration
@Configuration
@EnableWebSocket
public class WebSocketConfig implements WebSocketConfigurer {
    @Override
    public void registerWebSocketHandlers(WebSocketHandlerRegistry registry) {
        registry.addHandler(quotaWebSocketHandler(), "/ws/quota")
               .setAllowedOrigins("*");
    }

    @Bean
    public WebSocketHandler quotaWebSocketHandler() {
        return new QuotaWebSocketHandler();
    }
}

@Component
@Slf4j
public class QuotaWebSocketHandler extends TextWebSocketHandler {
    private final PackageCacheService cacheService;
    private final ObjectMapper objectMapper;
    // The JDK has no ConcurrentHashSet; ConcurrentHashMap.newKeySet() returns a thread-safe Set
    private final Set&amp;lt;WebSocketSession&amp;gt; sessions = ConcurrentHashMap.newKeySet();

    @Override
    public void afterConnectionEstablished(WebSocketSession session) {
        sessions.add(session);
    }

    @Override
    public void afterConnectionClosed(WebSocketSession session, CloseStatus status) {
        sessions.remove(session);
    }

    @Override
    protected void handleTextMessage(
            WebSocketSession session, 
            TextMessage message) throws IOException {
        QuotaUpdate update = objectMapper.readValue(message.getPayload(), 
                                                  QuotaUpdate.class);
        cacheService.updatePackageQuota(update);
    }

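    public void sendOrderUpdate(Order order) {
        // writeValueAsString throws the checked JsonProcessingException, so handle it here
        final TextMessage message;
        try {
            message = new TextMessage(objectMapper.writeValueAsString(order));
        } catch (JsonProcessingException e) {
            log.error("Failed to serialize order update", e);
            return;
        }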
        sessions.forEach(session -&amp;gt; {
            try {
                if (session.isOpen()) {
                    session.sendMessage(message);
                }
            } catch (IOException e) {
                log.error("Failed to send order update to session", e);
            }
        });
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h5&gt;
  
  
  QuotaUpdate Class
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Think of this class as a simple messenger for quota updates. It carries three key pieces of information:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;packageId&lt;/code&gt;: The ID of the package being updated.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;availableQuota&lt;/code&gt;: How much quota is left for this package.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;timestamp&lt;/code&gt;: When the update was made.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h5&gt;
  
  
  WebSocket Configuration
&lt;/h5&gt;

&lt;h6&gt;
  
  
  1. WebSocketConfig:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;This is the setup that makes WebSocket communication possible.&lt;/li&gt;
&lt;li&gt;It registers a handler (&lt;code&gt;quotaWebSocketHandler&lt;/code&gt;) to listen for WebSocket connections at &lt;code&gt;/ws/quota&lt;/code&gt;.&lt;/li&gt;
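&lt;li&gt;It also allows connections from any origin by calling &lt;code&gt;setAllowedOrigins("*")&lt;/code&gt;. That is convenient for a demo, but a production setup should list the trusted origins explicitly.&lt;/li&gt;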
&lt;/ul&gt;

&lt;h6&gt;
  
  
  2. &lt;code&gt;quotaWebSocketHandler()&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
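&lt;li&gt;This defines the WebSocket handler bean that will manage incoming messages and connections. As written, &lt;code&gt;new QuotaWebSocketHandler()&lt;/code&gt; would not compile, because the handler's final fields have no matching constructor. Since the class is already a &lt;code&gt;@Component&lt;/code&gt;, a real project would inject the Spring-managed instance into the configuration instead.&lt;/li&gt;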
&lt;/ul&gt;

&lt;h5&gt;
  
  
  QuotaWebSocketHandler
&lt;/h5&gt;

&lt;p&gt;This is where all the WebSocket magic happens! It manages real-time updates between the server and clients.&lt;/p&gt;

&lt;h6&gt;
  
  
  1. Fields:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;PackageCacheService&lt;/code&gt;: Helps update the local cache whenever a quota update comes in.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;ObjectMapper&lt;/code&gt;: Handles the conversion of JSON payloads to Java objects and vice versa.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;sessions&lt;/code&gt;: Keeps track of all the active WebSocket sessions (clients currently connected).&lt;/li&gt;
&lt;/ul&gt;

&lt;h6&gt;
  
  
  2. Methods:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;afterConnectionEstablished(WebSocketSession session)&lt;/code&gt;:

&lt;ul&gt;
&lt;li&gt;Adds a new client session to the active list as soon as they connect.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;afterConnectionClosed(WebSocketSession session, CloseStatus status)&lt;/code&gt;:

&lt;ul&gt;
&lt;li&gt;Removes the client session when they disconnect.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;
&lt;code&gt;handleTextMessage(WebSocketSession session, TextMessage message)&lt;/code&gt;:

&lt;ul&gt;
&lt;li&gt;Processes incoming messages.&lt;/li&gt;
&lt;li&gt;Converts the received JSON into a &lt;code&gt;QuotaUpdate&lt;/code&gt; object and updates the local cache.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;h6&gt;
  
  
  3. &lt;code&gt;sendOrderUpdate(Order order)&lt;/code&gt;:
&lt;/h6&gt;

&lt;ul&gt;
&lt;li&gt;Sends real-time updates about order changes to all connected clients.&lt;/li&gt;
&lt;li&gt;Converts the &lt;code&gt;Order&lt;/code&gt; object to JSON and sends it as a message to active WebSocket sessions.&lt;/li&gt;
&lt;li&gt;Makes sure only open connections receive updates.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Key Features of the Code:
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Real-time Updates:

&lt;ul&gt;
&lt;li&gt;Pushes order updates to every connected client and accepts incoming quota updates, which refresh the local cache.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Thread-Safe Management:

&lt;ul&gt;
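&lt;li&gt;Uses a concurrent set to track connected clients, so sessions can be added and removed safely from multiple threads. The JDK has no &lt;code&gt;ConcurrentHashSet&lt;/code&gt; class; &lt;code&gt;ConcurrentHashMap.newKeySet()&lt;/code&gt; is the standard way to get one.&lt;/li&gt;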
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Error Handling:

&lt;ul&gt;
&lt;li&gt;Logs errors when there’s an issue sending messages, making it easier to troubleshoot.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This setup ensures smooth and instant communication between the backend and the front-end, so users always have up-to-date information on quota availability and order statuses.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Exception Classes
public class QuotaNotAvailableException extends RuntimeException {
    public QuotaNotAvailableException() {
        super("Package quota is not available");
    }
}

public class OrderNotFoundException extends RuntimeException {
    public OrderNotFoundException(String orderId) {
        super("Order not found: " + orderId);
    }
}

public class PaymentVerificationException extends RuntimeException {
    public PaymentVerificationException(String message) {
        super(message);
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here’s a breakdown of these custom exception classes and how they’re used to handle specific error scenarios in the system:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;QuotaNotAvailableException&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This exception is triggered when a user tries to purchase a package, but the quota for that package is already gone.&lt;/li&gt;
&lt;li&gt;It comes with a straightforward default message: "Package quota is not available," so both developers and users get a clear understanding of the issue.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;OrderNotFoundException&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This one kicks in when the system can’t find an order based on the provided &lt;code&gt;orderId&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;It includes a detailed error message like, "Order not found: [orderId]," making it easy to pinpoint exactly which order is missing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;PaymentVerificationException&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If there’s an issue verifying a payment—maybe the amounts don’t match, or the payment status is unclear—this exception gets thrown.&lt;/li&gt;
&lt;li&gt;It allows you to pass in a custom message, adding flexibility and context for diagnosing payment issues.&lt;/li&gt;
&lt;/ul&gt;
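
&lt;p&gt;One detail to watch when amounts are compared: if the amount fields are &lt;code&gt;BigDecimal&lt;/code&gt; (an assumption, since the article does not show their type), &lt;code&gt;equals&lt;/code&gt; also compares scale, so &lt;code&gt;100.00&lt;/code&gt; and &lt;code&gt;100.0&lt;/code&gt; count as different values. Here is a minimal sketch of a scale-insensitive check, using a hypothetical &lt;code&gt;AmountCheck&lt;/code&gt; helper:&lt;/p&gt;

```java
import java.math.BigDecimal;

// Assumption: amounts are BigDecimal; the article does not show the field type.
final class AmountCheck {
    // compareTo ignores scale, so 100.0 and 100.00 count as the same amount.
    static boolean sameAmount(BigDecimal expected, BigDecimal paid) {
        return expected.compareTo(paid) == 0;
    }

    public static void main(String[] args) {
        BigDecimal orderAmount = new BigDecimal("100.00");
        BigDecimal callbackAmount = new BigDecimal("100.0");
        System.out.println(orderAmount.equals(callbackAmount));      // false: scales differ
        System.out.println(sameAmount(orderAmount, callbackAmount)); // true
    }
}
```

&lt;p&gt;If the amounts are stored as integers in the smallest currency unit instead, a plain equality check is enough and this concern does not apply.&lt;/p&gt;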

&lt;p&gt;By using these exceptions, the system handles errors in a clean and predictable way. They make debugging more efficient for developers and let the API layer turn each failure into a specific, meaningful response for the client.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Order Service
@Service
@Transactional
@Slf4j
public class OrderService {
    private final OrderRepository orderRepository;
    private final ProviderClient providerClient;
    private final PaymentGatewayClient paymentGatewayClient;
    private final NotificationService notificationService;
    private final PackageCacheService packageCacheService;

    @Autowired
    public OrderService(OrderRepository orderRepository,
                       ProviderClient providerClient,
                       PaymentGatewayClient paymentGatewayClient,
                       NotificationService notificationService,
                       PackageCacheService packageCacheService) {
        this.orderRepository = orderRepository;
        this.providerClient = providerClient;
        this.paymentGatewayClient = paymentGatewayClient;
        this.notificationService = notificationService;
        this.packageCacheService = packageCacheService;
    }

    public Order createOrder(OrderRequest request) {
        log.info("Creating new order for package: {}", request.getPackageId());

        // Check quota
        QuotaResponse quota = providerClient.checkQuota(request.getPackageId());
        if (!quota.isAvailable()) {
            log.warn("Quota not available for package: {}", request.getPackageId());
            throw new QuotaNotAvailableException();
        }

        // Get package details
        Package pkg = packageCacheService.getPackage(request.getPackageId());

        // Reserve quota
        ReservationResponse reservation = providerClient.reserveQuota(request.getPackageId());

        // Create order
        Order order = new Order();
        order.setId(UUID.randomUUID().toString());
        order.setCustomerId(request.getCustomerId());
        order.setPackageId(request.getPackageId());
        order.setReservationId(reservation.getId());
        order.setAmount(pkg.getPrice());
        order.setProviderCost(pkg.getProviderCost());
        order.setStatus(OrderStatus.RESERVED);
        order.setCreatedAt(LocalDateTime.now());
        order.setUpdatedAt(LocalDateTime.now());

        Order savedOrder = orderRepository.save(order);
        log.info("Order created successfully: {}", savedOrder.getId());

        return savedOrder;
    }

    public Order processPayment(String orderId, PaymentCallback callback) {
        log.info("Processing payment for order: {}", orderId);

        Order order = orderRepository.findById(orderId)
                .orElseThrow(() -&amp;gt; new OrderNotFoundException(orderId));

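        // Verify the amount outside the try block so PaymentVerificationException
        // reaches the controller (422) instead of being swallowed by the catch below
        if (!order.getAmount().equals(callback.getAmount())) {
            throw new PaymentVerificationException("Payment amount mismatch");
        }

        try {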

            // Update order with payment details
            order.setPaymentId(callback.getPaymentId());
            order.setStatus(OrderStatus.PAYMENT_COMPLETED);
            order.setUpdatedAt(LocalDateTime.now());
            orderRepository.save(order);

            // Hold payment in escrow
            log.info("Holding payment in escrow for order: {}", orderId);
            EscrowResponse escrow = paymentGatewayClient.holdInEscrow(callback.getPaymentId());
            order.setEscrowId(escrow.getId());
            order.setStatus(OrderStatus.IN_ESCROW);
            orderRepository.save(order);

            // Activate service
            log.info("Initiating service activation for order: {}", orderId);
            order.setStatus(OrderStatus.ACTIVATING);
            orderRepository.save(order);

            ActivationResponse activation = providerClient.activateService(orderId);
            if (activation.isSuccess()) {
                verifyActivation(order);
            } else {
                handleActivationFailure(order);
            }

        } catch (Exception e) {
            log.error("Error processing payment for order: {}", orderId, e);
            handleActivationFailure(order);
        }

        return order;
    }

    private void verifyActivation(Order order) {
        log.info("Verifying activation for order: {}", order.getId());
        int attempts = 0;
        boolean activated = false;

        while (attempts &amp;lt; 3 &amp;amp;&amp;amp; !activated) {
            try {
                VerificationResponse verification = 
                    providerClient.verifyActivation(order.getId());

                if (verification.isSuccess()) {
                    activated = true;
                    completeOrder(order);
                }
            } catch (Exception e) {
                log.error("Verification attempt {} failed for order: {}", 
                         attempts + 1, order.getId(), e);
            }

            attempts++;
            if (!activated &amp;amp;&amp;amp; attempts &amp;lt; 3) {
                try {
                    Thread.sleep(2000); // Wait before next attempt
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    break;
                }
            }
        }

        if (!activated) {
            handleActivationFailure(order);
        }
    }

    private void completeOrder(Order order) {
        log.info("Completing order: {}", order.getId());
        try {
            paymentGatewayClient.releaseToProvider(order.getEscrowId());
            order.setStatus(OrderStatus.COMPLETED);
            order.setUpdatedAt(LocalDateTime.now());
            orderRepository.save(order);
            notificationService.sendSuccessNotification(order);
            log.info("Order completed successfully: {}", order.getId());
        } catch (Exception e) {
            log.error("Error completing order: {}", order.getId(), e);
            handleActivationFailure(order);
        }
    }

    private void handleActivationFailure(Order order) {
        log.warn("Handling activation failure for order: {}", order.getId());
        try {
            paymentGatewayClient.refundToCustomer(order.getEscrowId());
            order.setStatus(OrderStatus.REFUNDED);
            order.setUpdatedAt(LocalDateTime.now());
            orderRepository.save(order);
            notificationService.sendFailureNotification(order);
            log.info("Order refunded successfully: {}", order.getId());
        } catch (Exception e) {
            log.error("Error processing refund for order: {}", order.getId(), e);
        }
    }

    public Order getOrder(String orderId) {
        return orderRepository.findById(orderId)
                .orElseThrow(() -&amp;gt; new OrderNotFoundException(orderId));
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;OrderService&lt;/code&gt; class handles the heavy lifting when it comes to managing orders. Let’s break down how it works:&lt;/p&gt;

&lt;h5&gt;
  
  
  Key Responsibilities
&lt;/h5&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;createOrder(OrderRequest request)&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This method is all about creating a new order. It checks if the package is available, grabs the details, reserves the quota, and saves the order to the database with an initial status of &lt;code&gt;RESERVED&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;processPayment(String orderId, PaymentCallback callback)&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Here, the payment gets handled. The system verifies the payment details, updates the order, puts the payment in escrow, and starts the service activation process. If escrow or activation fails, it calls &lt;code&gt;handleActivationFailure&lt;/code&gt; to refund the customer.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;verifyActivation(Order order)&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This method double-checks that the service activation went through. It tries up to three times, pausing two seconds between attempts. If verification still fails, it hands over to &lt;code&gt;handleActivationFailure&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;completeOrder(Order order)&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Once everything checks out, this method finalizes the order. It releases the escrow funds to the provider, updates the status, and notifies the user about the success.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;handleActivationFailure(Order order)&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If activation fails, this method ensures the customer gets a refund and a notification about what went wrong.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;getOrder(String orderId)&lt;/code&gt;&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This straightforward method retrieves an order by its ID. If the order doesn’t exist, it throws a specific exception.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
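
&lt;p&gt;The retry behaviour of &lt;code&gt;verifyActivation&lt;/code&gt; can be isolated into a small, testable helper. The following is a minimal sketch, not the article's implementation: &lt;code&gt;ActivationRetry&lt;/code&gt; and &lt;code&gt;verifyWithRetry&lt;/code&gt; are hypothetical names, and the provider call is replaced by a plain &lt;code&gt;BooleanSupplier&lt;/code&gt;.&lt;/p&gt;

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper that mirrors the loop in verifyActivation.
final class ActivationRetry {
    // Runs `check` up to maxAttempts times, sleeping delayMillis between attempts.
    // Returns true as soon as one check succeeds; an exception counts as a failed attempt.
    static boolean verifyWithRetry(BooleanSupplier check, int maxAttempts, long delayMillis) {
        for (int attempt = 1; maxAttempts >= attempt; attempt++) {
            try {
                if (check.getAsBoolean()) {
                    return true;
                }
            } catch (RuntimeException e) {
                System.err.println("Verification attempt " + attempt + " failed: " + e.getMessage());
            }
            if (attempt == maxAttempts) {
                break; // no point sleeping after the final attempt
            }
            try {
                Thread.sleep(delayMillis);
            } catch (InterruptedException ie) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Stand-in for the provider check: succeeds only on the third call.
        boolean activated = verifyWithRetry(() -> ++calls[0] == 3, 3, 10);
        System.out.println("activated=" + activated + " after " + calls[0] + " calls");
    }
}
```

&lt;p&gt;With the article's settings this would be called with three attempts and a 2000 ms delay. Keeping the loop in its own helper also makes it easier to move the waiting out of the request thread later.&lt;/p&gt;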

&lt;h5&gt;
  
  
  Why It Works
&lt;/h5&gt;

&lt;ul&gt;
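&lt;li&gt;&lt;code&gt;@Transactional&lt;/code&gt; keeps the database writes within a call atomic. It cannot undo calls to the payment gateway or the provider, which is why failures are compensated with an explicit refund rather than a rollback.&lt;/li&gt;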
&lt;li&gt;With clear error handling and retries, it’s robust enough to handle real-world hiccups.&lt;/li&gt;
&lt;li&gt;Notifications tell users the final outcome, whether the order completed or was refunded.&lt;/li&gt;
&lt;/ul&gt;
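
&lt;p&gt;The status values used in &lt;code&gt;OrderService&lt;/code&gt; form a small state machine. As a rough sketch, the allowed moves could be written down explicitly. The &lt;code&gt;canMoveTo&lt;/code&gt; method and the transition table are my reading of the code above, not part of the original design.&lt;/p&gt;

```java
// Demo class comes first so the file can be launched directly.
final class OrderStatusDemo {
    public static void main(String[] args) {
        System.out.println(OrderStatus.RESERVED.canMoveTo(OrderStatus.PAYMENT_COMPLETED)); // true
        System.out.println(OrderStatus.COMPLETED.canMoveTo(OrderStatus.REFUNDED));         // false
    }
}

// Status names are taken from OrderService; the transition rules are an assumption.
enum OrderStatus {
    RESERVED, PAYMENT_COMPLETED, IN_ESCROW, ACTIVATING, COMPLETED, REFUNDED;

    // Returns true if an order in this status may move to `next`.
    boolean canMoveTo(OrderStatus next) {
        return switch (this) {
            case RESERVED -> next == PAYMENT_COMPLETED;
            case PAYMENT_COMPLETED -> next == IN_ESCROW || next == REFUNDED;
            case IN_ESCROW -> next == ACTIVATING || next == REFUNDED;
            case ACTIVATING -> next == COMPLETED || next == REFUNDED;
            case COMPLETED, REFUNDED -> false; // terminal states
        };
    }
}
```

&lt;p&gt;A guard like this would, for example, reject a second payment callback for an order that is already &lt;code&gt;COMPLETED&lt;/code&gt;, and it would flag a refund attempted after the funds were released.&lt;/p&gt;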

&lt;p&gt;This service is the backbone of the order management process, tying everything together for a seamless user experience.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;// Order Controller
@RestController
@RequestMapping("/api/orders")
@Slf4j
public class OrderController {
    private final OrderService orderService;

    @Autowired
    public OrderController(OrderService orderService) {
        this.orderService = orderService;
    }

    @PostMapping
    public ResponseEntity&amp;lt;Order&amp;gt; createOrder(@Valid @RequestBody OrderRequest request) {
        log.info("Received order creation request for package: {}", request.getPackageId());
        try {
            Order order = orderService.createOrder(request);
            return ResponseEntity.ok(order);
        } catch (QuotaNotAvailableException e) {
            log.warn("Quota not available for package: {}", request.getPackageId());
            return ResponseEntity.status(HttpStatus.CONFLICT).build();
        } catch (Exception e) {
            log.error("Error creating order", e);
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
        }
    }

    @PostMapping("/callback")
    public ResponseEntity&amp;lt;Void&amp;gt; handlePaymentCallback(
            @Valid @RequestBody PaymentCallback callback) {
        log.info("Received payment callback for order: {}", callback.getOrderId());
        try {
            orderService.processPayment(callback.getOrderId(), callback);
            return ResponseEntity.ok().build();
        } catch (OrderNotFoundException e) {
            log.warn("Order not found: {}", callback.getOrderId());
            return ResponseEntity.notFound().build();
        } catch (PaymentVerificationException e) {
            log.warn("Payment verification failed for order: {}", callback.getOrderId());
            return ResponseEntity.status(HttpStatus.UNPROCESSABLE_ENTITY).build();
        } catch (Exception e) {
            log.error("Error processing payment callback", e);
            return ResponseEntity.status(HttpStatus.INTERNAL_SERVER_ERROR).build();
        }
    }

    @GetMapping("/{orderId}")
    public ResponseEntity&amp;lt;Order&amp;gt; getOrder(@PathVariable String orderId) {
        log.info("Retrieving order: {}", orderId);
        try {
            Order order = orderService.getOrder(orderId);
            return ResponseEntity.ok(order);
        } catch (OrderNotFoundException e) {
            log.warn("Order not found: {}", orderId);
            return ResponseEntity.notFound().build();
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;OrderController&lt;/code&gt; class takes care of the REST API endpoints that manage orders in the system. Think of it as the bridge between the client making requests and the backend services doing the heavy lifting.&lt;/p&gt;

&lt;h5&gt;
  
  
  Key Endpoints
&lt;/h5&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;POST /api/orders&lt;/code&gt; (createOrder)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This endpoint handles creating a new order. &lt;/li&gt;
&lt;li&gt;Here's what happens:

&lt;ul&gt;
&lt;li&gt;It takes in an &lt;code&gt;OrderRequest&lt;/code&gt; from the client.&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;OrderService.createOrder&lt;/code&gt; to process the request and create the order.&lt;/li&gt;
&lt;li&gt;Sends back:

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;200 OK&lt;/code&gt; response with the newly created order if all goes well.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;409 Conflict&lt;/code&gt; if the package quota is unavailable.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;500 Internal Server Error&lt;/code&gt; for any unexpected issues.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;POST /api/orders/callback&lt;/code&gt; (handlePaymentCallback)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This one processes payment updates sent by the payment gateway.&lt;/li&gt;
&lt;li&gt;Here's the flow:

&lt;ul&gt;
&lt;li&gt;It receives a &lt;code&gt;PaymentCallback&lt;/code&gt; with all the payment details.&lt;/li&gt;
&lt;li&gt;Calls &lt;code&gt;OrderService.processPayment&lt;/code&gt; to handle the payment and update the order status.&lt;/li&gt;
&lt;li&gt;The possible responses are:

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;200 OK&lt;/code&gt; if the payment is successfully handled.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;404 Not Found&lt;/code&gt; if the order ID provided doesn’t exist.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;422 Unprocessable Entity&lt;/code&gt; if there’s a mismatch in payment verification.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;500 Internal Server Error&lt;/code&gt; for anything unexpected.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;GET /api/orders/{orderId}&lt;/code&gt; (getOrder)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;This endpoint fetches the details of a specific order by its ID.&lt;/li&gt;
&lt;li&gt;Here's how it works:

&lt;ul&gt;
&lt;li&gt;It calls &lt;code&gt;OrderService.getOrder&lt;/code&gt; to retrieve the order.&lt;/li&gt;
&lt;li&gt;Returns:

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;200 OK&lt;/code&gt; response with the order details if found.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;404 Not Found&lt;/code&gt; if the order ID doesn’t match any records.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
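
&lt;p&gt;The status codes listed above follow one simple rule: each known exception gets its own code, and everything else becomes a 500. The sketch below restates that rule as a single function. &lt;code&gt;StatusMapper&lt;/code&gt; is a hypothetical name, and the three exception classes are redeclared as empty stand-ins only so the example compiles on its own.&lt;/p&gt;

```java
// Hypothetical helper that mirrors the catch blocks in OrderController.
final class StatusMapper {
    // Known failures get a specific HTTP status; anything else is a server error.
    static int toHttpStatus(Exception e) {
        if (e instanceof QuotaNotAvailableException) return 409;   // Conflict
        if (e instanceof OrderNotFoundException) return 404;       // Not Found
        if (e instanceof PaymentVerificationException) return 422; // Unprocessable Entity
        return 500;                                                // Internal Server Error
    }

    public static void main(String[] args) {
        System.out.println(toHttpStatus(new OrderNotFoundException()));
        System.out.println(toHttpStatus(new IllegalStateException()));
    }
}

// Empty stand-ins for the article's exception classes.
class QuotaNotAvailableException extends RuntimeException {}
class OrderNotFoundException extends RuntimeException {}
class PaymentVerificationException extends RuntimeException {}
```

&lt;p&gt;In a Spring application the same mapping is often centralized in a class annotated with &lt;code&gt;@RestControllerAdvice&lt;/code&gt;, which would remove the repeated &lt;code&gt;try/catch&lt;/code&gt; blocks from the controller methods.&lt;/p&gt;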

&lt;h5&gt;
  
  
  Features
&lt;/h5&gt;

&lt;ul&gt;
&lt;li&gt;Separation of Concerns: The &lt;code&gt;OrderController&lt;/code&gt; delegates all business logic to the &lt;code&gt;OrderService&lt;/code&gt;, keeping things clean and focused.&lt;/li&gt;
&lt;li&gt;Validation: Request payloads are validated using the &lt;code&gt;@Valid&lt;/code&gt; annotation to ensure the data coming in meets expectations.&lt;/li&gt;
&lt;li&gt;Error Handling:

&lt;ul&gt;
&lt;li&gt;Provides specific and helpful responses for common issues, like unavailable quotas or missing orders.&lt;/li&gt;
&lt;li&gt;Logs any issues to make debugging easier.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Logging: Tracks key events like incoming requests, errors, and order details for better visibility.&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;This controller ensures that the client and backend communicate seamlessly, making order management as smooth as possible.&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion
&lt;/h3&gt;

&lt;p&gt;This research documentation lays out the foundation for designing an e-commerce credit sales system, tackling important challenges like quota management, payment processing, and service activation. While this design covers the basics, there’s always room to make things better!&lt;/p&gt;

&lt;p&gt;Here are a few ideas to improve this design:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;strong&gt;event-driven architecture&lt;/strong&gt; to make the system more flexible and scalable.
&lt;/li&gt;
&lt;li&gt;Add &lt;strong&gt;message queue-based processing&lt;/strong&gt; to handle lots of transactions smoothly.
&lt;/li&gt;
&lt;li&gt;Explore &lt;strong&gt;advanced caching strategies&lt;/strong&gt; to speed things up and reduce dependency on external APIs.
&lt;/li&gt;
&lt;li&gt;Consider &lt;strong&gt;distributed system patterns&lt;/strong&gt; for easier scaling and better reliability.
&lt;/li&gt;
&lt;li&gt;Implement &lt;strong&gt;circuit breakers&lt;/strong&gt; to handle third-party service hiccups gracefully.
&lt;/li&gt;
&lt;li&gt;Set up &lt;strong&gt;monitoring and alerts&lt;/strong&gt; to catch issues early and fix them quickly.
&lt;/li&gt;
&lt;li&gt;Strengthen &lt;strong&gt;security measures&lt;/strong&gt; to protect users and their data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks so much for reading! I hope this documentation has been useful and provides clarity for anyone exploring similar challenges. This design isn’t perfect, so if you have any thoughts or suggestions, I’d love to hear them.&lt;/p&gt;

&lt;p&gt;Resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://medium.com/@unaware_harry/a-deep-dive-into-clean-architecture-and-solid-principles-dcdcec5db48a" rel="noopener noreferrer"&gt;A Deep Dive into Clean Architecture and SOLID Principles&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://medium.com/design-microservices-architecture-with-patterns/design-e-commerce-applications-with-microservices-architecture-c69e7f8222e7" rel="noopener noreferrer"&gt;Design E-Commerce Applications with Microservices Architecture&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://dev.to/budiwidhiyanto/designing-a-scalable-backend-for-flash-sales-4g9o"&gt;Designing a Scalable Backend for Flash Sales&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://en.wikipedia.org/wiki/Escrow" rel="noopener noreferrer"&gt;Escrow on Wikipedia&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>webdev</category>
      <category>systemdesign</category>
      <category>java</category>
      <category>ecommerce</category>
    </item>
  </channel>
</rss>
