<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: GraphPilot</title>
    <description>The latest articles on DEV Community by GraphPilot (@graphpilot).</description>
    <link>https://dev.to/graphpilot</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Forganization%2Fprofile_image%2F13686%2F8ed734e5-a20c-458d-989a-b5f000ce93e3.png</url>
      <title>DEV Community: GraphPilot</title>
      <link>https://dev.to/graphpilot</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/graphpilot"/>
    <language>en</language>
    <item>
      <title>Why you can't just put a CDN in front of GraphQL</title>
      <dc:creator>Kay Schecker</dc:creator>
      <pubDate>Tue, 16 Jun 2026 16:01:19 +0000</pubDate>
      <link>https://dev.to/graphpilot/why-you-cant-just-put-a-cdn-in-front-of-graphql-1b34</link>
      <guid>https://dev.to/graphpilot/why-you-cant-just-put-a-cdn-in-front-of-graphql-1b34</guid>
      <description>&lt;p&gt;It's one of the most obvious ideas the moment a GraphQL API starts buckling under load: "Can't we just put a CDN in front of it?" The question comes up on almost every team scaling GraphQL, whether in architecture reviews, in Stack Overflow threads, or in Slack. It sounds reasonable: CDNs made REST fast and cheap to scale, so surely the same trick works one layer up. Then you try it, and within an hour you understand why GraphQL caching is its own product category instead of a config flag. This post explains why.&lt;/p&gt;

&lt;p&gt;The short version: GraphQL breaks almost every assumption that HTTP caching is built on. And for good reason: the flexibility that breaks those assumptions is what makes GraphQL so powerful in the first place. Some of the resulting problems you can solve in an afternoon. One of them, invalidation, is genuinely hard, and it's where most home-grown attempts quietly fall apart. But it is solvable with the right tooling.&lt;/p&gt;

&lt;h2&gt;
  
  
  The first challenge: one endpoint, one verb, everything in the body
&lt;/h2&gt;

&lt;p&gt;HTTP caching is keyed on the request. A CDN looks at the method and the URL, maybe a header or two, and decides whether it has seen this exact thing before. &lt;code&gt;GET /users/1&lt;/code&gt; is trivially cacheable: the URL &lt;em&gt;is&lt;/em&gt; the cache key, and &lt;code&gt;Cache-Control&lt;/code&gt; tells the CDN how long to keep it.&lt;/p&gt;

&lt;p&gt;GraphQL throws that model out. You have a single endpoint, usually &lt;code&gt;/graphql&lt;/code&gt;, and almost everything goes through &lt;code&gt;POST&lt;/code&gt; with the actual query sitting in the request body:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;POST /graphql
Content-Type: application/json

{
  "query": "
    {
      user(id: 1) {
        name
        posts {
          title
        }
      }
    }
  "
}
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To a stock CDN, every request to your API looks identical: same method, same URL. Most CDNs won't cache a &lt;code&gt;POST&lt;/code&gt; response by default, and even if you force it, the CDN has no idea that one body asks for a user's name and the next triggers a 50-table aggregation. It can't tell two requests apart, so it can't safely cache either. The thing that made REST cacheable, namely a meaningful URL, is gone.&lt;/p&gt;

&lt;p&gt;So before you can cache anything, you have to teach the cache to look &lt;em&gt;inside&lt;/em&gt; the request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Making GraphQL cacheable at all (the easy 80%)
&lt;/h2&gt;

&lt;p&gt;This part is mostly solved, and if you only need read caching you can get a long way:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Normalize the query into a stable key.&lt;/strong&gt; Two requests that ask for the same data can look different on the wire: different whitespace, reordered fields, variables inline versus separated. You parse the query, canonicalize it (sort fields, extract variables, strip noise), and hash the result. Now logically identical requests collapse to the same cache key.&lt;/p&gt;
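
&lt;p&gt;A minimal sketch of that idea, under stated assumptions: it only normalizes at the token level (whitespace, commas, variable order) and does not sort fields, which would need a real GraphQL parser and an AST. It also assumes the query contains no block strings or &lt;code&gt;#&lt;/code&gt; comments. The name &lt;code&gt;cache_key&lt;/code&gt; is made up for illustration.&lt;/p&gt;

```python
import hashlib
import json
import re

# One token is: a string literal (kept verbatim), a "..." spread, a name,
# a number, or any other single character. Whitespace and commas are
# insignificant in GraphQL, so they are skipped between tokens.
TOKEN = re.compile(r'"(?:\\.|[^"\\])*"|\.\.\.|[A-Za-z_][A-Za-z0-9_]*|-?\d+(?:\.\d+)?|[^\s,]')


def cache_key(query, variables=None):
    """Hash a token-normalized query together with canonically ordered variables."""
    canonical_query = " ".join(TOKEN.findall(query))
    # sort_keys makes {"a": 1, "b": 2} and {"b": 2, "a": 1} serialize identically.
    canonical_vars = json.dumps(variables or {}, sort_keys=True, separators=(",", ":"))
    payload = canonical_query + "\n" + canonical_vars
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

&lt;p&gt;With this, &lt;code&gt;{ user(id: 1) { name } }&lt;/code&gt; and &lt;code&gt;{user(id:1){name}}&lt;/code&gt; hash to the same key, while a different variable value produces a different one.&lt;/p&gt;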

&lt;p&gt;&lt;strong&gt;Get it past the CDN layer.&lt;/strong&gt; Because bodies and &lt;code&gt;POST&lt;/code&gt; are awkward to cache, you lean on &lt;code&gt;GET&lt;/code&gt; requests plus &lt;a href="https://graphpilot.io/#optimize" rel="noopener noreferrer"&gt;&lt;em&gt;persisted queries&lt;/em&gt;&lt;/a&gt;: the client sends a hash of a known query instead of the full text, and that hash becomes part of a cacheable URL. Automatic Persisted Queries (APQ) are the common flavor.&lt;/p&gt;
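
&lt;p&gt;As a rough sketch of what the client side of this can look like: the function below builds a &lt;code&gt;GET&lt;/code&gt; URL that carries only a SHA-256 hash of the query. The &lt;code&gt;extensions&lt;/code&gt; shape follows the convention commonly used for APQ, but treat the exact parameter names as an assumption to check against your server. The endpoint and the name &lt;code&gt;persisted_query_url&lt;/code&gt; are hypothetical.&lt;/p&gt;

```python
import hashlib
import json
from urllib.parse import urlencode


def persisted_query_url(endpoint, query, variables=None):
    """Build a GET URL that identifies the query by its SHA-256 hash only."""
    sha = hashlib.sha256(query.encode("utf-8")).hexdigest()
    extensions = {"persistedQuery": {"version": 1, "sha256Hash": sha}}
    params = {
        # Compact, key-sorted JSON so the same request always yields the same URL.
        "extensions": json.dumps(extensions, sort_keys=True, separators=(",", ":")),
        "variables": json.dumps(variables or {}, sort_keys=True, separators=(",", ":")),
    }
    # Sorting the parameters keeps the URL, and therefore the CDN cache key, stable.
    return endpoint + "?" + urlencode(sorted(params.items()))
```

&lt;p&gt;In the automatic flavor, a server that does not recognize the hash typically answers with a "not found" error, and the client retries once with the full query text so the server can register it.&lt;/p&gt;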

&lt;p&gt;&lt;strong&gt;Assign TTLs from the schema.&lt;/strong&gt; Not all data ages the same way. A product catalog can be stale for minutes; an account balance cannot. You annotate types and fields with a max age and let the cache apply per-type, per-field TTLs instead of one blunt number for the whole response.&lt;/p&gt;
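
&lt;p&gt;If the cache stores whole responses, one common way to apply per-type max ages is to let the most restrictive type in the response decide. The sketch below assumes responses include &lt;code&gt;__typename&lt;/code&gt;, and the table &lt;code&gt;TYPE_MAX_AGE&lt;/code&gt; with its values is invented for illustration; in practice it would be derived from schema annotations.&lt;/p&gt;

```python
# Hypothetical per-type max ages in seconds, as they might be declared next to the schema.
TYPE_MAX_AGE = {"Product": 300, "Review": 60, "Account": 0}
DEFAULT_MAX_AGE = 0  # assumption: types without a declaration are treated as uncacheable


def response_max_age(data, type_max_age=TYPE_MAX_AGE):
    """Return the smallest max age among all typed objects in a response."""
    ages = []

    def walk(value):
        if isinstance(value, dict):
            typename = value.get("__typename")
            if typename is not None:
                ages.append(type_max_age.get(typename, DEFAULT_MAX_AGE))
            for child in value.values():
                walk(child)
        elif isinstance(value, list):
            for item in value:
                walk(item)

    walk(data)
    # A response with no typed objects falls back to the conservative default.
    return min(ages) if ages else DEFAULT_MAX_AGE
```

&lt;p&gt;A product page that embeds reviews would get 60 seconds here, and anything touching &lt;code&gt;Account&lt;/code&gt; would not be cached at all.&lt;/p&gt;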

&lt;p&gt;Do those three things and you have a working read cache. This is the part vendors demo and the part you can build yourself. It is &lt;em&gt;not&lt;/em&gt; the part that hurts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The hard part: invalidation
&lt;/h2&gt;

&lt;p&gt;TTL-only caching forces an unpleasant trade-off. Short TTLs mean low hit rates; you're barely caching. Long TTLs mean you serve stale data, and "the dashboard showed the old number for five minutes" is the kind of bug that can erode trust fast. What you actually want is to cache aggressively &lt;em&gt;and&lt;/em&gt; drop a cached response the instant the underlying data changes.&lt;/p&gt;

&lt;p&gt;Here's where GraphQL's flexibility turns against you. In REST, a write maps cleanly to a resource: &lt;code&gt;PUT /users/1&lt;/code&gt; invalidates the cache entry for &lt;code&gt;/users/1&lt;/code&gt;. There's a URL to purge. In GraphQL there's no URL, and worse, a single mutation can invalidate &lt;strong&gt;fragments of many unrelated cached responses&lt;/strong&gt;. Change one user's name and you've potentially made stale every cached query that embedded that user: a profile page, a comment list, a search result, an admin table, each cached under a different key.&lt;/p&gt;

&lt;p&gt;You can't purge by URL because there isn't one. So you invalidate by &lt;em&gt;what's inside&lt;/em&gt; the responses instead.&lt;/p&gt;

&lt;p&gt;The standard approach is &lt;strong&gt;surrogate keys&lt;/strong&gt; (also called cache tags). When you cache a response, you walk it and record every entity it contains, typically &lt;code&gt;__typename&lt;/code&gt; + &lt;code&gt;id&lt;/code&gt;, e.g. &lt;code&gt;User:1&lt;/code&gt;, &lt;code&gt;Post:42&lt;/code&gt;. Those become tags attached to the cached entry. When a mutation comes through, you figure out which entities it touched and purge every cached entry tagged with them. &lt;code&gt;User:1&lt;/code&gt; changes → drop everything tagged &lt;code&gt;User:1&lt;/code&gt;, regardless of which query produced it.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wkbbm3kdb5c0zc0t8jh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5wkbbm3kdb5c0zc0t8jh.png" alt="Surrogate-key invalidation: one mutation names User:1; the proxy purges every cached response tagged with it" width="800" height="299"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Conceptually clean. The trouble is in everything the clean version ignores.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where it gets nasty
&lt;/h2&gt;

&lt;p&gt;This is the part nobody puts in the marketing copy:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Lists and pagination.&lt;/strong&gt; A mutation &lt;em&gt;creates&lt;/em&gt; a new post. The new entity has an ID that doesn't appear in any cached response yet, so tag-based purging can't find the lists it should now appear in. "Invalidate the entity" doesn't help when the problem is "a collection should have grown." You end up needing coarser, list-level invalidation, which claws back some of the hit rate you fought for.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mutations that don't return the changed entity.&lt;/strong&gt; Your tagging logic learns which entities to purge by inspecting payloads. A mutation that returns just &lt;code&gt;{ success: true }&lt;/code&gt; tells you nothing about what it changed. Now you're guessing, or you're forcing schema conventions on every mutation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Derived and aggregate fields.&lt;/strong&gt; &lt;code&gt;commentCount&lt;/code&gt;, computed totals, or "is this in stock" depend on entities that aren't directly named in the response. The thing that changed and the field that's now wrong don't share an ID.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Authorization and personalization.&lt;/strong&gt; The same query returns different data per viewer. Cache it naively and you leak one user's data to another. That's a security bug, not a performance one. Cache it correctly and your key space explodes by user, gutting your hit rate. Deciding what's safely shared versus per-viewer is a design problem with no universal answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partial responses and errors.&lt;/strong&gt; GraphQL can return &lt;code&gt;data&lt;/code&gt; &lt;em&gt;and&lt;/em&gt; &lt;code&gt;errors&lt;/code&gt; in the same &lt;code&gt;200&lt;/code&gt; response. Do you cache a partial result? A response that succeeded for three fields and failed for one? There's no HTTP status to lean on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema changes.&lt;/strong&gt; Deploy a new schema and yesterday's cached responses may no longer match today's shape. Your cache has to know when its assumptions expired.&lt;/li&gt;
&lt;/ul&gt;
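
&lt;p&gt;For the first two items, one possible shape of the workaround is sketched below: tag cached lists with a coarse per-type tag, and declare per mutation which tags it invalidates, based on its variables rather than its payload. Every name here (&lt;code&gt;MUTATION_RULES&lt;/code&gt;, &lt;code&gt;createPost&lt;/code&gt;, &lt;code&gt;renameUser&lt;/code&gt;, the &lt;code&gt;:LIST&lt;/code&gt; suffix) is an assumption for illustration. Note that an empty list carries no &lt;code&gt;__typename&lt;/code&gt;, so a real system would have to derive list types from the schema instead.&lt;/p&gt;

```python
# Hypothetical declarations: which tags each mutation invalidates, computed
# from the mutation's variables instead of its returned payload.
MUTATION_RULES = {
    "createPost": lambda v: ["Post:LIST", f'User:{v["authorId"]}'],
    "renameUser": lambda v: [f'User:{v["id"]}'],
}


def tags_to_purge(mutation_name, variables):
    """Return the tags to purge for a mutation, or ["*"] when nothing is declared."""
    rule = MUTATION_RULES.get(mutation_name)
    if rule is None:
        # Nothing is known about this mutation, so the only safe answer is "everything".
        return ["*"]
    return rule(variables)


def list_tags(value, tags=None):
    """Collect a coarse "Type:LIST" tag for every list of typed objects in a response."""
    if tags is None:
        tags = set()
    if isinstance(value, list):
        for item in value:
            if isinstance(item, dict) and "__typename" in item:
                tags.add(item["__typename"] + ":LIST")
            list_tags(item, tags)
    elif isinstance(value, dict):
        for child in value.values():
            list_tags(child, tags)
    return tags
```

&lt;p&gt;The trade-off is visible in the code: &lt;code&gt;Post:LIST&lt;/code&gt; drops every cached list of posts on each create, and an undeclared mutation falls back to purging everything.&lt;/p&gt;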

&lt;p&gt;The good news: none of these is insurmountable. But together they're why "we'll just add caching to our GraphQL gateway" turns into a quarter of work.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the edge makes it harder, and why it's still worth it
&lt;/h2&gt;

&lt;p&gt;Everything above is true even with a single, central cache. Push it to the edge, meaning many points of presence close to users, and you inherit a distributed-systems problem on top of the GraphQL problem.&lt;/p&gt;

&lt;p&gt;Now your cache state is spread across many locations. A purge isn't an instant local delete; it has to &lt;em&gt;propagate&lt;/em&gt;, and during that window different users can get different answers. A write in one region races against a read in another. The index that maps entities to cached responses has to be consistent enough to be trustworthy and fast enough to consult on the hot path, because at the edge you can't burn real compute per request without giving back the latency you came for.&lt;/p&gt;

&lt;p&gt;So why bother? Because when it works, you move read traffic off your origin entirely and serve it from near the user, and your database stops being the thing that falls over at peak. The payoff is real. It's just that the gap between "naive TTL cache" and "correct, invalidating, edge-distributed cache" is exactly where the engineering lives.&lt;/p&gt;

&lt;h2&gt;
  
  
  What good looks like
&lt;/h2&gt;

&lt;p&gt;Whether you build this or buy it, these are the properties worth insisting on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://graphpilot.io/#schema-driven" rel="noopener noreferrer"&gt;Schema-driven cache configuration&lt;/a&gt;.&lt;/strong&gt; Cacheability, TTLs, and scoping declared alongside the schema, so reviewable, versioned, and diffable, not scattered across ad-hoc rules.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://graphpilot.io/#edge-cache" rel="noopener noreferrer"&gt;Entity-level surrogate keys with a real purge API&lt;/a&gt;.&lt;/strong&gt; Tagging plus a way to say "drop everything touching &lt;code&gt;User:1&lt;/code&gt;" on demand, not just on a timer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://graphpilot.io/#security" rel="noopener noreferrer"&gt;Explicit auth scoping&lt;/a&gt;.&lt;/strong&gt; A deliberate decision about what's shared versus per-viewer, enforced by default. Personalized data should never be cacheable by accident.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;a href="https://graphpilot.io/#observability" rel="noopener noreferrer"&gt;Observability&lt;/a&gt;.&lt;/strong&gt; Hit rate, staleness, and purge lag, visible. You cannot tune an invalidation strategy you can't measure, and "it feels faster" is not a metric.&lt;/li&gt;
&lt;/ul&gt;
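
&lt;p&gt;To make "enforced by default" concrete, here is a small sketch of scope-aware cache keys. It assumes each root field is declared either shared or per-viewer and that anything undeclared is treated as per-viewer; &lt;code&gt;FIELD_SCOPE&lt;/code&gt;, its entries, and &lt;code&gt;scoped_cache_key&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import hashlib

# Hypothetical scope declarations per root field. Anything not listed is PRIVATE.
FIELD_SCOPE = {"products": "PUBLIC", "categories": "PUBLIC", "me": "PRIVATE"}


def scoped_cache_key(query_hash, root_fields, viewer_id):
    """Return a cache key, or None when the response must not be cached."""
    scopes = {FIELD_SCOPE.get(field, "PRIVATE") for field in root_fields}
    if scopes == {"PUBLIC"}:
        # Every root field is explicitly shared: one entry serves all viewers.
        return "public:" + query_hash
    if viewer_id is None:
        # Per-viewer data without a known viewer is never cached.
        return None
    # Hash the viewer id so raw identifiers do not end up in cache keys.
    viewer = hashlib.sha256(str(viewer_id).encode("utf-8")).hexdigest()[:16]
    return "viewer:" + viewer + ":" + query_hash
```

&lt;p&gt;The design choice is that sharing has to be opted into: a query mixing one public and one undeclared field ends up with a per-viewer key, which costs hit rate but cannot leak data across users.&lt;/p&gt;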

&lt;h2&gt;
  
  
  Closing
&lt;/h2&gt;

&lt;p&gt;GraphQL caching is easy to start and hard to finish. The cacheable-key part is a weekend; correct invalidation across personalized, paginated, edge-distributed responses is why this becomes its own product category and not a feature you flip on in passing. If you've shipped a GraphQL API and watched your origin strain under read traffic, you've felt the pull toward the edge, and probably the pain of doing it right.&lt;/p&gt;

&lt;p&gt;That's exactly the problem we're building &lt;strong&gt;GraphPilot&lt;/strong&gt; to solve. It's not open yet. I'd genuinely rather hear which of the edge cases above is hurting &lt;em&gt;you&lt;/em&gt; than guess, so drop a comment below: which one bites hardest in your setup? And if you run GraphQL in production and want early access, you can &lt;a href="https://graphpilot.io/register" rel="noopener noreferrer"&gt;join the waitlist&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>graphql</category>
      <category>cdn</category>
    </item>
  </channel>
</rss>
