<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Rani Hirani</title>
    <description>The latest articles on DEV Community by Rani Hirani (@codewithrani).</description>
    <link>https://dev.to/codewithrani</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3795396%2F3fb710c5-a41d-451e-85e1-748174df21c3.png</url>
      <title>DEV Community: Rani Hirani</title>
      <link>https://dev.to/codewithrani</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/codewithrani"/>
    <language>en</language>
    <item>
      <title>From Query Tuning to Cache Versioning: Lessons from a Production Endpoint</title>
      <dc:creator>Rani Hirani</dc:creator>
      <pubDate>Thu, 26 Feb 2026 20:09:04 +0000</pubDate>
      <link>https://dev.to/codewithrani/from-query-tuning-to-cache-versioning-lessons-from-a-production-endpoint-4968</link>
      <guid>https://dev.to/codewithrani/from-query-tuning-to-cache-versioning-lessons-from-a-production-endpoint-4968</guid>
      <description>&lt;p&gt;This week, I tried optimizing a production endpoint.&lt;/p&gt;

&lt;p&gt;My first instinct was predictable: rewrite the query.&lt;/p&gt;

&lt;p&gt;Because when performance drops, we instinctively blame the database. Performance issue equals database issue. Right?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Wrong.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;The Query Wasn’t the Problem&lt;/h2&gt;

&lt;p&gt;I started by reviewing everything carefully: the CTE usage, the ordering logic, the count behavior, and the execution plan. I then tried multiple rewrites to see if something subtle was being missed.&lt;/p&gt;

&lt;p&gt;I tested variations. I compared execution times. I re-evaluated the logic to ensure nothing unnecessary was happening.&lt;/p&gt;

&lt;p&gt;The result?&lt;/p&gt;

&lt;p&gt;&lt;u&gt;No consistent improvement.&lt;/u&gt;&lt;/p&gt;

&lt;p&gt;And that’s when it became clear: the query wasn’t inefficient. It was already aligned with the business requirements. It was returning exactly what it was supposed to return — Jira boards, their sub-boards, and their sprints — in a structurally correct way.&lt;/p&gt;

&lt;p&gt;Trying to squeeze more performance out of it wasn’t optimization anymore. It was forcing change where none was needed.&lt;/p&gt;

&lt;p&gt;The problem was somewhere else.&lt;/p&gt;

&lt;h2&gt;Strategic Caching: Not Just @Cacheable&lt;/h2&gt;

&lt;p&gt;So we shifted focus to caching.&lt;/p&gt;

&lt;p&gt;But caching is not just slapping @Cacheable on a method and calling it a day. It forces you to think about boundaries and trade-offs.&lt;/p&gt;

&lt;p&gt;We had to answer three questions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;What exactly are we caching?&lt;/li&gt;
&lt;li&gt;For how long?&lt;/li&gt;
&lt;li&gt;What risks does stale data introduce?&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Initially, we considered a TTL of five hours, since the database sync only runs every twenty-four hours. On paper, that seemed reasonable.&lt;/p&gt;

&lt;p&gt;But once we evaluated the risks more carefully, the picture changed.&lt;/p&gt;

&lt;p&gt;Permissions can change.&lt;br&gt;
Board structures can change.&lt;br&gt;
Serving stale Jira board and sprint data could affect user experience and correctness.&lt;/p&gt;

&lt;p&gt;So we reduced the TTL to thirty minutes.&lt;/p&gt;

&lt;p&gt;It was a deliberate trade-off: freshness over aggressive caching.&lt;/p&gt;

&lt;p&gt;Because performance means nothing if correctness degrades.&lt;/p&gt;
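
&lt;p&gt;The freshness rule behind a thirty-minute TTL can be sketched in plain Java. This is only an illustration: &lt;code&gt;CachedEntry&lt;/code&gt;, its fields, and the &lt;code&gt;TTL&lt;/code&gt; constant are hypothetical names, and in a Spring and Redis setup the expiry would normally be configured on the cache itself rather than checked by hand.&lt;/p&gt;

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical cache entry: a serialized payload plus the moment it was stored.
record CachedEntry(String payload, Instant storedAt) {

    // The thirty-minute TTL from the text; the constant name is illustrative.
    static final Duration TTL = Duration.ofMinutes(30);

    // An entry is fresh while "now" is still strictly before storedAt + TTL.
    boolean isFresh(Instant now) {
        return now.isBefore(storedAt.plus(TTL));
    }
}
```

&lt;p&gt;With this rule, an entry stored at 20:00 is still served at 20:29 and is treated as expired from 20:30 onward, so a permission or board change can stay hidden for at most thirty minutes.&lt;/p&gt;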

&lt;h2&gt;The Stale Cache Reality&lt;/h2&gt;

&lt;p&gt;After introducing caching, we later made changes to the DTO structure.&lt;/p&gt;

&lt;p&gt;That’s when things started behaving inconsistently.&lt;/p&gt;

&lt;p&gt;The endpoint wasn’t broken — it was unpredictable.&lt;/p&gt;

&lt;p&gt;The reason was simple but painful: Redis was still serving old serialized payloads. The cache contained outdated object structures that no longer matched the current DTO shape.&lt;/p&gt;

&lt;p&gt;We weren’t facing a logic bug. We were facing stale cache data.&lt;/p&gt;

&lt;p&gt;The solution was explicit cache versioning.&lt;/p&gt;

&lt;p&gt;We introduced versioned cache names like:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;jiraBoardKeysCacheV6&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;jiraSubBoardsByOrgCacheV7&lt;/code&gt;&lt;/p&gt;

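&lt;p&gt;Instead of hoping stale entries would disappear, we forced hard invalidation: bumping the version makes the application read from a new cache name, so payloads stored under the old name are never deserialized again.&lt;/p&gt;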

&lt;p&gt;It was a reminder that cache invalidation isn’t an academic concept. It’s a production concern.&lt;/p&gt;
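
&lt;p&gt;One way to keep versioned names manageable is to define them in a single place. The sketch below is an assumption about how that could look, not the project’s actual code: the &lt;code&gt;CacheNames&lt;/code&gt; class and its constants are hypothetical, and only the resulting names come from the examples above.&lt;/p&gt;

```java
// Hypothetical holder for cache names; each name carries an explicit version.
final class CacheNames {

    private CacheNames() {
        // constants only, not meant to be instantiated
    }

    // Bump a version whenever the shape of the cached DTO changes.
    static final int BOARD_KEYS_VERSION = 6;
    static final int SUB_BOARDS_BY_ORG_VERSION = 7;

    // Concatenating constants keeps these compile-time constants,
    // so they remain usable as annotation attribute values.
    static final String BOARD_KEYS = "jiraBoardKeysCacheV" + BOARD_KEYS_VERSION;
    static final String SUB_BOARDS_BY_ORG = "jiraSubBoardsByOrgCacheV" + SUB_BOARDS_BY_ORG_VERSION;
}
```

&lt;p&gt;A caching annotation could then reference &lt;code&gt;CacheNames.BOARD_KEYS&lt;/code&gt; instead of a string literal, which turns a DTO change into a one-line version bump.&lt;/p&gt;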

&lt;h2&gt;The Architectural Mistake We Didn’t See at First&lt;/h2&gt;

&lt;p&gt;During this process, we uncovered a deeper issue.&lt;/p&gt;

&lt;p&gt;We were caching JPA entities directly.&lt;/p&gt;

&lt;p&gt;That decision quietly introduced risk:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lazy-loading proxies leaking into serialization.&lt;/li&gt;
&lt;li&gt;Tight coupling between persistence and API layers.&lt;/li&gt;
&lt;li&gt;Serialization inconsistencies.&lt;/li&gt;
&lt;li&gt;Difficulty evolving response structures.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The cache had effectively become an extension of the database layer.&lt;/p&gt;

&lt;p&gt;So we introduced proper boundaries:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;BoardDTO&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;code&gt;SubBoardDTO&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Each has a static &lt;code&gt;fromEntity(...)&lt;/code&gt; mapping method that copies values out of the entity.&lt;/p&gt;

&lt;p&gt;DTOs became the contract between layers. They became the cache boundary.&lt;/p&gt;
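
&lt;p&gt;A minimal sketch of such a mapping is shown here. The fields are assumptions, since the real ones are not listed in this post, and &lt;code&gt;BoardEntity&lt;/code&gt; is a plain stand-in for the JPA entity so the example runs without any framework.&lt;/p&gt;

```java
import java.util.Objects;

// Hypothetical stand-in for the JPA entity; the real fields are not shown in the post.
final class BoardEntity {
    private final Long id;
    private final String name;

    BoardEntity(Long id, String name) {
        this.id = id;
        this.name = name;
    }

    Long getId() {
        return id;
    }

    String getName() {
        return name;
    }
}

// Plain, immutable DTO: this is the object that gets cached and returned by the API.
record BoardDTO(Long id, String name) {

    // Copies simple values only, so no persistence proxy travels into the cache.
    static BoardDTO fromEntity(BoardEntity entity) {
        Objects.requireNonNull(entity, "entity must not be null");
        return new BoardDTO(entity.getId(), entity.getName());
    }
}
```

&lt;p&gt;Because the DTO holds only copied values, the entity can change without silently changing what is stored in the cache.&lt;/p&gt;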

&lt;p&gt;Now the separation was clear:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database layer ≠ API layer ≠ Cache layer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And that separation stabilized the system.&lt;/p&gt;

&lt;h2&gt;The Null Pointer Edge Case&lt;/h2&gt;

&lt;p&gt;Just when things seemed stable, a runtime error appeared:&lt;/p&gt;

&lt;p&gt;“element cannot be mapped to a null key”&lt;/p&gt;

&lt;p&gt;The root cause was subtle. We were grouping sub-boards by &lt;code&gt;boardId&lt;/code&gt;, but some &lt;code&gt;boardId&lt;/code&gt; values were null. The grouping operation failed because null keys were not handled.&lt;/p&gt;

&lt;p&gt;The fix required defensive handling:&lt;/p&gt;

&lt;p&gt;Filtering null keys before grouping.&lt;br&gt;
Adding null checks inside DTO mapping.&lt;/p&gt;
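
&lt;p&gt;The filtering step might look like the sketch below. The error message quoted above is the one &lt;code&gt;Collectors.groupingBy&lt;/code&gt; raises when its classifier returns null. Everything else is an assumption for illustration: the &lt;code&gt;SubBoardDTO&lt;/code&gt; fields, the &lt;code&gt;SubBoardGrouping&lt;/code&gt; class, and the choice to return the grouped map as a string, which is there only to keep the example short.&lt;/p&gt;

```java
import java.util.Objects;
import java.util.TreeMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical DTO; the field names are assumptions for illustration.
record SubBoardDTO(String boardId, String name) {}

final class SubBoardGrouping {

    private SubBoardGrouping() {
        // static helper only
    }

    // Groups sub-board names by boardId, skipping entries without a key.
    // Without the second filter, groupingBy throws a NullPointerException
    // with the message "element cannot be mapped to a null key".
    static String namesByBoard(SubBoardDTO... subBoards) {
        var grouped = Stream.of(subBoards)
                .filter(Objects::nonNull)
                .filter(subBoard -> subBoard.boardId() != null)
                .collect(Collectors.groupingBy(
                        SubBoardDTO::boardId,
                        TreeMap::new, // sorted keys give a stable string form
                        Collectors.mapping(SubBoardDTO::name, Collectors.toList())));
        return grouped.toString();
    }
}
```

&lt;p&gt;Sub-boards without a &lt;code&gt;boardId&lt;/code&gt; are dropped here. Whether they should instead be logged or collected under a placeholder key is a product decision the sketch does not make.&lt;/p&gt;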

&lt;p&gt;After that, we eliminated:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Null pointer risks.&lt;/li&gt;
&lt;li&gt;Null sub-board responses.&lt;/li&gt;
&lt;li&gt;Fragile response shapes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It reinforced something simple: &lt;strong&gt;optimistic assumptions don’t survive production.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;What This Was Really About&lt;/h2&gt;

&lt;p&gt;At the beginning, this looked like a query optimization problem.&lt;/p&gt;

&lt;p&gt;It wasn’t.&lt;/p&gt;

&lt;p&gt;It was:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A data-shape problem.&lt;/li&gt;
&lt;li&gt;A cache-boundary problem.&lt;/li&gt;
&lt;li&gt;A stale data problem.&lt;/li&gt;
&lt;li&gt;An edge-case handling problem.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The database wasn’t slow.&lt;br&gt;
The architecture was fragile.&lt;/p&gt;

&lt;p&gt;And once we strengthened the boundaries, defined cache strategy properly, and handled edge cases deliberately, the endpoint stabilized.&lt;/p&gt;

&lt;p&gt;System design is rarely about dramatic rewrites.&lt;br&gt;
Most of the time, it’s about understanding where your assumptions break — and fixing the seams between layers.&lt;/p&gt;

&lt;p&gt;And more often than not, that’s where the real performance work lives.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
