DEV Community: Maciej Łopalewski

Reciprocal Rank Fusion on free Elasticsearch: licensing, workarounds, and the OpenSearch alternative

Maciej Łopalewski — Wed, 24 Jun 2026 10:59:52 +0000

Reciprocal Rank Fusion (RRF) on Elasticsearch is gated to the Enterprise tier. On the Basic/free tier, querying with the rrf retriever returns a 403:

{
  "error": {
    "root_cause": [
      {
        "type": "security_exception",
        "reason": "current license is non-compliant for [Reciprocal Rank Fusion (RRF)]",
        "license.expired.feature": "Reciprocal Rank Fusion (RRF)"
      }
    ],
    "type": "security_exception",
    "reason": "current license is non-compliant for [Reciprocal Rank Fusion (RRF)]",
    "license.expired.feature": "Reciprocal Rank Fusion (RRF)"
  },
  "status": 403
}

This has held since RRF first appeared as a technical preview in 8.8 (May 2023), through the 8.16 GA release (November 2024), and into the 9.x line. The linear retriever added in 8.18 / 9.0 is gated the same way, also at Enterprise. Three practical workarounds exist: implement RRF yourself in application code, combine BM25 and vector results with a normalized linear combination, or move to OpenSearch - which ships RRF, score normalization (including z-score), and late-interaction reranking by a field under the Apache 2.0 license.

What RRF actually does

Reciprocal Rank Fusion combines multiple ranked result sets into a single ranking based on document positions rather than raw scores. The formula is:

score(d) = Σ 1 / (k + rank_i(d))

where rank_i(d) is the document's position in the i-th result list and k is a smoothing constant. The original paper from Cormack, Clarke, and Büttcher (SIGIR 2009) used k=60 and noted the choice was not critical. Their Table 1 shows MAP barely moves across k ∈ [10, 100] - only at the extremes (k=0 or k=500) does the score drop meaningfully. Elastic and OpenSearch both default the RRF rank constant to 60, following the original paper's convention.

The reason RRF matters for hybrid search: BM25 scores are unbounded and query/corpus-dependent, while vector similarity scores live on a different scale and distribution depending on the similarity function and embedding model. Naively summing them lets one method dominate purely from scale. RRF discards the scores entirely and works only with ranks, which sidesteps the normalization problem with no tuning required.
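To make the scale problem concrete, here is a small sketch. The document IDs and scores are invented for illustration, not measured from any index. It shows that a naive sum simply reproduces the BM25 order, while the rank view gives the vector leg an equal voice.

```python
# Hypothetical scores for the same three documents from two retrievers.
bm25_scores = {"doc_a": 14.2, "doc_b": 9.7, "doc_c": 3.1}      # unbounded BM25
cosine_scores = {"doc_a": 0.61, "doc_b": 0.64, "doc_c": 0.93}  # cosine similarity

def rank_by(scores: dict[str, float]) -> list[str]:
    """Return document IDs ordered from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)

# Naive sum: the vector leg is too small in magnitude to change the BM25 order.
naive = {d: bm25_scores[d] + cosine_scores[d] for d in bm25_scores}
naive_order = rank_by(naive)

# Rank view: each retriever contributes positions (1, 2, 3), not magnitudes.
bm25_ranks = {d: r for r, d in enumerate(rank_by(bm25_scores), start=1)}
vector_ranks = {d: r for r, d in enumerate(rank_by(cosine_scores), start=1)}

print(naive_order)   # same order as BM25 alone
print(vector_ranks)  # doc_c is the vector leg's top hit
```

In this toy data the vector retriever strongly prefers doc_c, yet the summed score still ranks it last. Working on ranks removes that scale dependence.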

What Basic actually gets you, and what it doesn't

The Elastic subscriptions matrix splits hybrid-search-related features across three tiers, two of them paid. Knowing what's behind each wall is more useful than the headline "RRF is paid":

Feature	Basic / free	Platinum	Enterprise
Vector search (kNN)	✓	✓	✓
Standard, kNN, pinned, rescorer retrievers	✓	✓	✓
Similarity functions for vector fields	✓	✓	✓
Synonym management	✓	✓	✓
ELSER (learned sparse encoder)	—	✓	✓
Elastic Rerank	—	✓	✓
Inference API	—	✓	✓
RRF for hybrid search	—	—	✓
Linear, rule, text similarity re-ranker retrievers	—	—	✓
Rank Vectors (for MaxSim)	—	—	✓
DiskBBQ	—	—	✓
Indexing vectors with GPUs	—	—	✓
Query Rules	—	—	✓
Learning to Rank	—	—	✓

Source: https://www.elastic.co/subscriptions

The pattern is clear. Basic gives you the building blocks for client-side hybrid search - kNN runs, BM25 runs, both can be queried separately. Platinum unlocks Elastic-managed inference and ELSER. Enterprise is where the actual modern hybrid search features live: rank fusion, the linear and reranking retrievers, late interaction, GPU vector indexing, learning to rank.

If you're on Basic and you want hybrid search, you're either reimplementing pieces of Enterprise in application code, switching engines, or starting a trial.

The licensing timeline

The gating has been continuous and has expanded, not relaxed. The August 2024 license change that re-added AGPLv3 to Elasticsearch made the source code open source again but did not change which features the default ELv2 distribution gates behind paid tiers.

Version	Date	What changed
8.4	August 2022	First hybrid search support
8.8	May 2023	RRF added as technical preview, gated
8.14	June 2024	Retrievers framework introduced
Pre-8.16	August 2024	Elastic announced AGPLv3 as an additional source-code license option; feature-tier gating in the default distribution unchanged
8.16	November 2024	RRF and retrievers reach GA, gated to Enterprise
8.18 / 9.0	April 2025	Linear retriever added, also gated to Enterprise
9.0.1 Basic	June 2025	Linear retriever still throws license error in production
Late 2025	September 2025	Per-retriever weights added to RRF retriever (still Enterprise-gated)

Self-managed deployments can start a 30-day trial that gives access to all subscription features, including Enterprise-tier features, for evaluation. Elastic Cloud trials are 14 days. Elastic also publishes a trial extension form that grants one additional 30-day extension on request.

Workaround 1: Implement RRF in your application

Run BM25 and kNN as two separate queries against Elasticsearch Basic, then fuse the result lists in application code. The fusion logic is roughly seven lines of Python:

from collections import defaultdict

def rrf_fusion(rankings: list[list[str]], k: int = 60) -> list[tuple[str, float]]:
    """Combine multiple ranked lists of document IDs using Reciprocal Rank Fusion."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

Calling it with a BM25 result list and a kNN result list returns the fused ranking:

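# es, idx, match_query, and knn_query are assumed to be defined elsewhere.
# Elasticsearch returns only 10 hits when size is omitted, so set it explicitly.
# Keep the k inside knn_query at least as large as the window.
window = 50  # illustrative per-leg candidate window
bm25_ids = [hit["_id"] for hit in es.search(index=idx, query=match_query, size=window)["hits"]["hits"]]
knn_ids = [hit["_id"] for hit in es.search(index=idx, knn=knn_query, size=window)["hits"]["hits"]]

fused = rrf_fusion([bm25_ids, knn_ids])
top_10_ids = [doc_id for doc_id, _ in fused[:10]]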

The trade-offs are clear. You pay two round trips instead of one, you lose retrievers-specific features like inner_hits and the unified pagination model, and you have to rehydrate the document _source after fusion - typically with an mget call against the top IDs. In return, you get RRF ranking on the Basic tier with no licensing exposure. In many RAG-style systems embedding generation and kNN latency dominate, but the actual breakdown depends on cache state, candidate window size, and deployment topology - measure for your workload before assuming the extra round trip is free.
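The rehydration step could look roughly like the sketch below. It assumes the 8.x Python client, where mget accepts index and ids as keyword arguments; the helper name rehydrate is hypothetical, and the function takes the client as a parameter so it can be exercised without a live cluster.

```python
from typing import Any

def rehydrate(client: Any, index: str, doc_ids: list[str]) -> list[dict]:
    """Fetch _source for fused IDs in one mget call, keeping the fused order."""
    if not doc_ids:
        return []
    # Assumption: 8.x-style client call with index= and ids= keyword arguments.
    resp = client.mget(index=index, ids=doc_ids)
    found = {d["_id"]: d.get("_source", {}) for d in resp["docs"] if d.get("found")}
    # Re-apply the fused order and drop IDs deleted between search and fetch.
    return [{"_id": i, "_source": found[i]} for i in doc_ids if i in found]
```

Re-applying the fused order explicitly keeps the ranking independent of whatever order the response arrives in, and filtering on found avoids surfacing documents that were deleted between the two searches and the fetch.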

Workaround 2: Linear combination with manual normalization

The Basic tier still allows two separate queries fused with a weighted sum, as long as the math happens in your application rather than through the gated linear retriever. Min-max normalize both score sets per query - using the min and max from each result list as the range - then combine with a weight α:

from collections import defaultdict

def minmax_normalize(score: float, lo: float, hi: float) -> float:
    rng = hi - lo
    return 1.0 if rng == 0 else (score - lo) / rng

def linear_fusion(bm25_hits: list[dict], knn_hits: list[dict], alpha: float = 0.5):
    scores: dict[str, float] = defaultdict(float)

    if bm25_hits:
        s = [h["_score"] for h in bm25_hits]
        lo, hi = min(s), max(s)
        for h in bm25_hits:
            scores[h["_id"]] += alpha * minmax_normalize(h["_score"], lo, hi)

    if knn_hits:
        s = [h["_score"] for h in knn_hits]
        lo, hi = min(s), max(s)
        for h in knn_hits:
            scores[h["_id"]] += (1 - alpha) * minmax_normalize(h["_score"], lo, hi)

    return sorted(scores.items(), key=lambda x: x[1], reverse=True)

When you have labeled query data, calibrated linear can beat RRF. Elastic's research on ELSER + BM25 across BEIR reports that around 40 annotated queries are enough for linear combination to start outperforming RRF, and with 300 calibration queries the optimized linear combination achieved a 6% NDCG@10 improvement over ELSER alone - compared to RRF's 1.4% improvement over the same baseline. Without calibration data, RRF is the safer default - it requires no tuning and is far less sensitive to score distribution mismatches.
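Calibration here can be as simple as a grid search over α on the labeled queries. The sketch below is one hedged way to do it: it uses mean reciprocal rank as a stand-in metric (the Elastic research cited above reports NDCG@10), and the data shape, a list of (bm25_scores, knn_scores, relevant_ids) tuples, is an assumption made for the example.

```python
from statistics import mean

Scores = dict[str, float]

def minmax(scores: Scores) -> Scores:
    """Scale one result list's scores into [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    return {d: 1.0 if hi == lo else (s - lo) / (hi - lo) for d, s in scores.items()}

def fuse(bm25: Scores, knn: Scores, alpha: float) -> list[str]:
    """Weighted sum of normalized scores; alpha is the BM25 weight."""
    b, v = minmax(bm25), minmax(knn)
    combined = {d: alpha * b.get(d, 0.0) + (1 - alpha) * v.get(d, 0.0)
                for d in b.keys() | v.keys()}
    # Break ties by document ID so the ordering is deterministic.
    return sorted(combined, key=lambda d: (-combined[d], d))

def reciprocal_rank(ranked: list[str], relevant: set[str]) -> float:
    """1 / position of the first relevant document, or 0 if none is present."""
    for pos, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / pos
    return 0.0

def calibrate_alpha(labeled: list[tuple[Scores, Scores, set[str]]],
                    steps: int = 11) -> tuple[float, float]:
    """Grid-search alpha over [0, 1] and return (best_alpha, best_mrr)."""
    best_alpha, best_mrr = 0.0, -1.0
    for i in range(steps):
        alpha = i / (steps - 1)
        mrr = mean(reciprocal_rank(fuse(b, v, alpha), rel) for b, v, rel in labeled)
        if mrr > best_mrr:
            best_alpha, best_mrr = alpha, mrr
    return best_alpha, best_mrr
```

With only a handful of labeled queries the chosen α will be noisy, so it is worth holding out part of the labeled set to check that the tuned weight still beats RRF on queries it was not fitted to.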

Workaround 3: Switch to OpenSearch

OpenSearch ships hybrid search and RRF under Apache 2.0 with no feature-tier licensing. The native hybrid query and normalization-processor (min_max, L2) landed in 2.10 (September 25, 2023). Native RRF landed in 2.19 (February 11, 2025). The 3.x line has continued to invest heavily in vector and hybrid search since then.

Version	Date	Hybrid search additions
2.10	September 25, 2023	First `hybrid` query, `normalization-processor` (min_max, L2)
2.19	February 11, 2025	Native RRF (`score-ranker-processor`), pagination support, `hybrid_score_explanation`
3.0	May 6, 2025	z-score normalization, lower bound for min-max, inner hits in hybrid, GPU acceleration for vector index builds (experimental)
3.1	June 24, 2025	GPU acceleration for vector index builds GA, hybrid query performance improvements (up to 65% latency reduction)
3.3	October 14, 2025	Up to 20% faster hybrid for lexical subqueries, `lateInteractionScore` for reranking by a field using externally hosted ColBERT/ColPali models

The trend continued after 3.3: OpenSearch 3.4, released in December 2025, added further vector-search investment such as k-NN memory-optimized search warmup, native FP16 vector scoring, and JDK 25 support. OpenSearch 3.5 followed in February 2026, and 3.6.0 - the project's first long-term support release - was published April 7, 2026. Refer to the OpenSearch version history and downloads page for the current state.

The score-ranker-processor documentation also shows that custom subquery weights are supported via the parameters.weights array on RRF - useful when you want to weight the BM25 and vector legs differently rather than treat them equally. Elasticsearch added per-retriever weights to its RRF retriever in late 2025 as well, but because the RRF retriever itself is Enterprise-gated, this doesn't change the Basic-tier workaround story.
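As a rough sketch of what that configuration looks like, the helpers below build the two request bodies as Python dictionaries. The field names body and embedding, the helper names, and the exact nesting of weights under combination.parameters are assumptions based on the OpenSearch documentation as I read it; check them against the version you run.

```python
import json
from typing import Optional

def rrf_pipeline(rank_constant: int = 60,
                 weights: Optional[list[float]] = None) -> dict:
    """Body for PUT /_search/pipeline/<name> using the score-ranker-processor."""
    combination: dict = {"technique": "rrf", "rank_constant": rank_constant}
    if weights is not None:
        # Assumed layout: one weight per subquery, in subquery order.
        combination["parameters"] = {"weights": weights}
    return {
        "description": "RRF fusion for hybrid queries",
        "phase_results_processors": [
            {"score-ranker-processor": {"combination": combination}}
        ],
    }

def hybrid_query(text: str, vector: list[float], k: int = 10) -> dict:
    """Body for GET /<index>/_search?search_pipeline=<name>."""
    return {
        "query": {
            "hybrid": {
                "queries": [
                    {"match": {"body": {"query": text}}},               # lexical leg
                    {"knn": {"embedding": {"vector": vector, "k": k}}},  # vector leg
                ]
            }
        }
    }

print(json.dumps(rrf_pipeline(weights=[0.7, 0.3]), indent=2))
```

The fusion runs inside the cluster, so the application sends one request and gets back an already-fused ranking, which removes the second round trip and the rehydration step from Workaround 1.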

The mapping to Elasticsearch's paid tiers is striking. Several broadly comparable capabilities that are Enterprise-only in Elasticsearch's built-in implementation are open-source in OpenSearch:

Capability	Elasticsearch tier	OpenSearch license
RRF rank fusion	Enterprise	Apache 2.0 (since 2.19)
Built-in weighted score fusion	Enterprise (`linear` retriever)	Apache 2.0 score-based hybrid normalization/combination via search pipelines (since 2.10)
Multiple normalization methods (min-max, L2, z-score)	Enterprise	Apache 2.0 (z-score in 3.0)
Late-interaction reranking by a field (ColBERT/ColPali workflows)	Enterprise (Rank Vectors)	Apache 2.0 (`lateInteractionScore` in 3.3)
GPU acceleration for vector index builds	Enterprise	Apache 2.0 (3.0 experimental, 3.1 GA)
Learning to Rank	Enterprise	Apache 2.0 (LTR plugin)

If the project is greenfield and does not depend on Elastic-specific features such as ELSER, Elastic Rerank, ES|QL, or the managed Inference API integrations, OpenSearch is the cleanest path. The migration cost from an existing Elasticsearch deployment is non-trivial - index format compatibility, client library differences, and Kibana versus OpenSearch Dashboards differences all add up - but the functional gap between the free tiers has generally widened for hybrid and vector-search use cases.

When paying for Enterprise makes sense

The trial-then-pay path is reasonable when the team needs text_similarity_reranker for semantic reranking with hosted models, the calibrated linear retriever, MaxSim with rank vectors for ColBERT-style retrieval, GPU vector indexing for billion-scale corpora, learning-to-rank pipelines, or production support contracts. None of this comes for free, and reimplementing it is more expensive than the license for organizations with substantial search infrastructure. For teams whose only blocker is RRF specifically, the workarounds above usually win on cost and are straightforward to implement for simple two-leg hybrid retrieval.

What most teams actually need

Most teams that hit the RRF license error don't actually need RRF specifically - they need hybrid search to work. Manual linear combination with normalized scores is often sufficient for a first production hybrid-search implementation on any Elasticsearch version. The core RRF scoring loop shown earlier is fewer than ten lines of code on top of that - production deployments will additionally want pagination, deduplication, an mget rehydration step, tie-breaking, and observability around the fused ranking. And for projects starting fresh in 2026, the open-source answer keeps getting better with every OpenSearch release.

Originally published at https://u11d.com on June 3, 2026.

AWS CloudFront Cache Policies: Complete Guide

Maciej Łopalewski — Wed, 20 May 2026 11:30:24 +0000

A CloudFront cache policy controls two things: the cache key (which combination of URL, headers, cookies, and query strings makes a request unique) and the TTL (how long CloudFront keeps an object at the edge before re-checking the origin). Those two settings together determine your cache hit ratio.

The CloudFront console currently shows fifteen managed cache policies. Five are broadly useful on a typical self-managed distribution: CachingOptimized, CachingOptimizedForUncompressedObjects, CachingDisabled, UseOriginCacheControlHeaders, and UseOriginCacheControlHeaders-QueryStrings. One (Elemental-MediaPackage) is for AWS Elemental MediaPackage video origins. The remaining nine are Amplify-related — a standalone Amplify policy for Amplify origins plus eight Amplify-* policies that Amplify Hosting attaches to its own distributions automatically. You can also build a custom policy when none of the managed ones fit the shape of your traffic.

This is a follow-up to my earlier post on CloudFront's three policy types, going one level deeper on just cache policies.

What a cache policy controls

A cache policy has three categories of settings: policy information (name and description, just metadata), TTL settings (Minimum, Maximum, Default), and cache key settings (which headers, cookies, and query strings to include). Everything that affects caching behavior lives in the latter two.

The cache key is the unique identifier CloudFront uses to look up an object at an edge location. If two viewer requests produce the same cache key, the second one is a cache hit. If they differ — say, one request includes a ?ref=twitter query string and the other does not, and your policy includes query strings in the cache key — they get treated as separate objects, even when the response body is identical. Cache key shape is the single biggest lever for hit ratio.

The TTL settings work alongside Cache-Control and Expires headers from your origin to determine how long a cached object stays valid at the edge. They behave subtly differently from each other; more on that next.

How Minimum, Maximum, and Default TTL actually work

The three TTL settings are not redundant. Each one applies in a different scenario, and getting them confused is one of the more common ways to either over-cache stale content or hammer your origin unnecessarily.

Default TTL is used when your origin sends no Cache-Control or Expires header at all. CloudFront falls back to this value, subject to the Minimum TTL floor — if Minimum TTL is greater than Default TTL, CloudFront caches for at least the Minimum TTL. The default for the AWS managed CachingOptimized policy is 86,400 seconds (24 hours).

Maximum TTL caps the TTL when your origin does send Cache-Control or Expires. If your origin says Cache-Control: max-age=5184000 (60 days) and Maximum TTL is 31,536,000 (365 days), CloudFront honors the origin's 60 days. If your origin says max-age=63072000 (730 days), CloudFront caps it at 365 days. This setting only matters when you want to override an origin that's claiming overly aggressive cache durations.

One nuance worth knowing: if your origin sends both max-age and s-maxage, CloudFront uses s-maxage for its own caching decisions and lets browsers use max-age. This is how you get different cache durations at the edge versus the browser without writing two policies.

Minimum TTL is the floor. CloudFront keeps the object for at least this long, no matter what. This one comes with a sharp edge: if Minimum TTL is greater than 0, CloudFront ignores Cache-Control: no-cache, no-store, and private directives from your origin. The object gets cached anyway, for at least the Minimum TTL duration. Both CachingOptimized and CachingOptimizedForUncompressedObjects have Minimum TTL of 1 second, and the standalone Amplify policy has Minimum TTL of 2 seconds — meaning you cannot reliably stop caching with origin headers alone if you are using them.

If you set all three TTLs to 0, caching is effectively disabled — which is exactly what CachingDisabled does.
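The rules above can be condensed into a small function. This is a simplified model for reasoning about the settings, not CloudFront's actual implementation: it ignores the Expires header and treats no-cache, no-store, and private as a single flag. The function name and parameters are made up for the example.

```python
from typing import Optional

def effective_ttl(min_ttl: int, default_ttl: int, max_ttl: int, *,
                  max_age: Optional[int] = None,
                  s_maxage: Optional[int] = None,
                  no_cache: bool = False) -> int:
    """Approximate edge TTL in seconds; 0 means the object is not cached."""
    if no_cache:
        # no-cache / no-store / private are honored only when Minimum TTL is 0.
        return min_ttl
    # s-maxage takes precedence over max-age for CloudFront's own caching.
    origin_ttl = s_maxage if s_maxage is not None else max_age
    if origin_ttl is None:
        # No caching header from the origin: Default TTL, floored by Minimum TTL.
        return max(default_ttl, min_ttl)
    # Origin-provided duration, clamped between Minimum and Maximum TTL.
    return max(min_ttl, min(origin_ttl, max_ttl))

# CachingOptimized: 1s / 24h / 365d
print(effective_ttl(1, 86_400, 31_536_000, max_age=5_184_000))   # 60 days honored
print(effective_ttl(1, 86_400, 31_536_000, max_age=63_072_000))  # capped at 365 days
print(effective_ttl(1, 86_400, 31_536_000, no_cache=True))       # cached for 1 second
```

Running a policy's three values through a function like this before deploying is a quick way to catch the Minimum TTL surprise described above.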

Cache key settings: headers, cookies, query strings, and compression

Each of the three viewer-data sources (headers, cookies, query strings) can be configured independently. Cookies and query strings have four modes: include none, include all, include specifically named ones, or include all-except-named-ones. Headers are the exception — they support only "none" or a specific list. There is no "all headers" option, because that would be unbounded and would fragment the cache catastrophically.

When you include a header, cookie, or query string in the cache key, CloudFront also automatically forwards it to the origin on cache misses. The cache key and the origin request are coupled by default; the only way to forward something to the origin without affecting the cache key is to add it to a separate origin request policy. This is the most common confusion with cache policies and the reason CloudFront split them apart in the first place.

A subtle but important detail: you list header, cookie, and query string names in the policy, but the cache key is built from each name together with its value. Specifying session_id in the cache key means every distinct value of session_id produces a different cache key, so if every visitor has a unique session ID, no two visitors ever share a cached object. This is why "include all cookies" rarely makes sense for general traffic.
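A toy model makes the fragmentation visible. The sketch below treats the cache key as the path plus the name=value pairs of whichever cookies the policy lists; the real key has more components, and the cookie names session_id and auth_tier are hypothetical.

```python
def cache_key(path: str, cookies: dict[str, str], keyed_cookies: set[str]) -> tuple:
    """Simplified cache key: the path plus name=value pairs for keyed cookies only."""
    kept = tuple(sorted((n, v) for n, v in cookies.items() if n in keyed_cookies))
    return (path, kept)

# Three visitors requesting the same page.
visitors = [
    {"session_id": "u1", "auth_tier": "free"},
    {"session_id": "u2", "auth_tier": "free"},
    {"session_id": "u3", "auth_tier": "premium"},
]

per_session = {cache_key("/pricing", c, {"session_id"}) for c in visitors}
per_tier = {cache_key("/pricing", c, {"auth_tier"}) for c in visitors}
no_cookies = {cache_key("/pricing", c, set()) for c in visitors}

print(len(per_session), len(per_tier), len(no_cookies))  # 3 2 1
```

Keying on a per-user value yields one cached object per visitor, keying on a coarse bucket yields one per bucket, and keying on no cookies yields a single shared object.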

The compression settings (EnableAcceptEncodingGzip, EnableAcceptEncodingBrotli) tell CloudFront to normalize the Accept-Encoding header before adding it to the cache key. With both enabled, the cache key sees one of br,gzip, gzip, or br (depending on what the viewer supports), or no Accept-Encoding at all when the viewer supports neither — in that last case CloudFront sends Accept-Encoding: identity to the origin instead. Without normalization, every browser variation of Accept-Encoding: gzip, deflate, br, zstd would produce a distinct cache key. Enable this when your origin returns compressed responses or when CloudFront edge compression is on. Leave it off otherwise.
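The normalization can be approximated in a few lines. This sketch is an illustration of the behavior described above rather than CloudFront's code: it ignores q-values (so an encoding listed with q=0 would be counted as supported), and the function name is made up.

```python
from typing import Optional

def normalize_accept_encoding(header: Optional[str],
                              gzip_enabled: bool = True,
                              brotli_enabled: bool = True) -> Optional[str]:
    """Collapse a viewer Accept-Encoding header to the value used in the cache key."""
    if not header:
        return None
    # Drop parameters such as ";q=0.8" and compare case-insensitively.
    offered = {part.split(";")[0].strip().lower() for part in header.split(",")}
    has_br = brotli_enabled and "br" in offered
    has_gzip = gzip_enabled and "gzip" in offered
    if has_br and has_gzip:
        return "br,gzip"
    if has_br:
        return "br"
    if has_gzip:
        return "gzip"
    return None  # neither supported: no Accept-Encoding in the cache key

print(normalize_accept_encoding("gzip, deflate, br, zstd"))  # br,gzip
print(normalize_accept_encoding("gzip, deflate"))            # gzip
print(normalize_accept_encoding("identity"))                 # None
```

Every browser-specific header string collapses to one of at most four values, which is what keeps the compressed variants of an object from multiplying in the cache.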

One gotcha: if you enable Gzip or Brotli in the cache policy, do not also include Accept-Encoding in an origin request policy attached to the same behavior. CloudFront handles that header itself when compression is enabled, and adding it to the origin request policy has no effect.

The AWS managed cache policies

This is the comparison table for the policies you'll actually attach to a self-managed distribution by hand. The Amplify Hosting policies (Amplify-* and Amplify-*-V2) are intentionally excluded — see the Amplify section below for why.

Policy	TTL (min/default/max)	Cookies	Query strings	Headers in cache key	Compression
`CachingOptimized`	1s / 24h / 365d	None	None	None	Gzip + Brotli
`CachingOptimizedForUncompressedObjects`	1s / 24h / 365d	None	None	None	Off
`CachingDisabled`	0 / 0 / 0	None	None	None	Off
`UseOriginCacheControlHeaders`	0 / 0 / 365d	All	None	Host, Origin, method overrides	Gzip + Brotli
`UseOriginCacheControlHeaders-QueryStrings`	0 / 0 / 365d	All	All	Host, Origin, method overrides	Gzip + Brotli
`Elemental-MediaPackage`	0 / 24h / 365d	None	aws.manifestfilter, start, end, m	Origin	Gzip

CachingOptimized

ID: 658327ea-f89d-4fab-a63d-7e88639e58f6

The default choice for static content — S3 buckets, image assets, JS and CSS bundles, anything that does not change based on who is asking. The cache key is just the requested object plus the normalized Accept-Encoding. No headers, no cookies, no query strings. This produces the highest possible cache hit ratio for static content.

Watch the Minimum TTL of 1 second: even with Cache-Control: no-store on your origin, CloudFront holds the object for at least one second. That is usually fine, but if you are using this policy on something where origin-side cache busting needs to take effect immediately, swap to UseOriginCacheControlHeaders or CachingDisabled.

CachingOptimizedForUncompressedObjects

ID: b2884449-e4de-46a7-ac36-70bc7f1ddd6d

Identical to CachingOptimized except compression is off. Use this when your origin does not return Gzip or Brotli (raw binary files, video segments, pre-compressed media formats like MP4 or WebP) and you are not using CloudFront edge compression. With compression off, the Accept-Encoding header is excluded from the cache key entirely, which keeps things simple.

If you cannot decide between this and CachingOptimized, default to CachingOptimized. The compression setting only causes problems when your origin produces objects that do not benefit from compression — and even then, it is a minor inefficiency rather than a correctness bug.

CachingDisabled

ID: 4135ea2d-6df8-44a3-9df3-4b5a84be39ad

All TTLs at 0. Nothing in the cache key. Every request goes straight through to the origin. Use this for API behaviors, dynamic GET endpoints that should never be cached, real-time data, WebSockets, and anywhere caching would be incorrect rather than just suboptimal. (POST, PUT, and DELETE aren't cached by CloudFront in the first place — only GET, HEAD, and optionally OPTIONS are — so attaching CachingDisabled to those methods is more about being explicit than functionally necessary.)

UseOriginCacheControlHeaders, UseOriginCacheControlHeaders-QueryStrings, and Elemental-MediaPackage also have Minimum TTL of 0, so they too will respect Cache-Control: no-store from your origin. The difference is that CachingDisabled never caches anything regardless of origin headers, while the others cache when the origin tells them to.

UseOriginCacheControlHeaders

ID: 83da9c7e-98b4-4e11-a168-04f0df8e2c65

Defers caching duration to your origin's Cache-Control and Expires headers. If the origin says max-age=600, CloudFront caches for 10 minutes. If the origin says no-store, CloudFront does not cache at all (because Minimum TTL is 0).

This is the right choice for CMS-backed sites, mixed static-and-dynamic apps, and anywhere your application code already knows what should be cached and for how long. WordPress, Drupal, server-rendered Next.js, and most traditional web apps fit here.

The cache key includes all cookies plus Host, Origin, and three method-override headers (X-HTTP-Method-Override, X-HTTP-Method, X-Method-Override). The cookie inclusion is significant: if your application sets per-user session cookies on every response, you will fragment the cache per user. Either avoid setting cookies on cacheable responses, or stick with CachingOptimized for paths that should be shared across users.

UseOriginCacheControlHeaders-QueryStrings

ID: 4cc15a8a-d715-48a4-82b8-cc0b614638fe

Same as UseOriginCacheControlHeaders but with all query strings included in the cache key. Use this when your origin returns different responses based on query string values — search results, filtered listings, paginated content — and you want CloudFront to cache each variant separately.

The trade-off is cache fragmentation. URLs that include UTM parameters, click IDs, or other tracking junk get cached independently, which both lowers your hit ratio and bloats CloudFront's storage of your content. If you can strip those server-side or with a CloudFront Function before this policy applies, do.
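The stripping logic itself is short. The sketch below shows it in Python for consistency with the rest of this feed; a real CloudFront Function would be written in JavaScript, and the list of tracking parameters here is an illustrative assumption you would tailor to your own traffic.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

TRACKING_PREFIXES = ("utm_",)                 # assumed prefixes to drop
TRACKING_NAMES = {"fbclid", "gclid", "ref"}   # assumed exact names to drop

def strip_tracking(url: str) -> str:
    """Remove tracking parameters and sort the rest to stabilize the cache key."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in TRACKING_NAMES and not name.startswith(TRACKING_PREFIXES)
    ]
    kept.sort()  # ?a=1&b=2 and ?b=2&a=1 then map to the same key
    return urlunsplit(parts._replace(query=urlencode(kept)))

print(strip_tracking("https://example.com/search?q=shoes&utm_source=x&page=2&fbclid=abc"))
# https://example.com/search?page=2&q=shoes
```

Sorting the surviving parameters is an extra design choice: it merges URLs that differ only in parameter order, but skip it if your origin is sensitive to the order in which parameters arrive.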

Amplify (and the eight related Amplify policies)

ID: 2e54312d-136d-493c-8eb9-b001f22f67d2

Two distinct things share the Amplify name. The standalone Amplify policy is documented as a regular CloudFront managed policy designed for use with an Amplify web app origin — Min TTL 2 seconds, Max TTL 600 seconds (10 minutes), Default TTL 2 seconds, with Authorization, CloudFront-Viewer-Country, and Host in the cache key plus all cookies and all query strings. AWS doesn't warn against using it; it's just narrowly tuned for Amplify-shaped workloads. The 2-second Minimum TTL means even no-store responses get cached briefly, which is rarely what you want outside of that specific architecture.

The eight Amplify-* policies are something else entirely: Amplify-Default, Amplify-DefaultNoCookies, Amplify-ImageOptimization, Amplify-StaticContent, plus a -V2 variant of each. These are managed by Amplify Hosting itself — Amplify attaches them to the distributions it provisions and resets them on every deploy. AWS explicitly says "we don't recommend that you use these policies for your distributions." If you need similar cache key shapes for a non-Amplify app, copy the settings into a custom policy.

The V2 variants appear to be tied to the August 2024 Amplify Hosting caching overhaul, which raised default static asset cache duration from 2 seconds to 1 year and Maximum TTL from 10 minutes to 1 year. AWS hasn't documented the V2 policies in the public CloudFront developer guide as of this writing — they're visible in the console but the documentation only covers the four originals.

Elemental-MediaPackage

ID: 08627262-05a9-4f76-9ded-b50ca2e3a84f

For AWS Elemental MediaPackage origins specifically — HLS and DASH video streaming. The cache key includes the four query string parameters that MediaPackage actually uses for manifest filtering and time-shifted playback (aws.manifestfilter, start, end, m) plus the Origin header for CORS. Other query strings are excluded, which keeps the cache key tight even when player libraries append cache-busting noise.

If you are not using MediaPackage, do not use this policy. If you are, use it — it is tuned for the specific request shape MediaPackage produces.

How to choose a cache policy

Match your origin and content type to the policy that is already tuned for it.

Use case	Recommended policy
S3 static site, image bucket, asset CDN	`CachingOptimized`
Video and already-compressed media (MP4, WebP), pre-compressed binaries	`CachingOptimizedForUncompressedObjects`
API endpoints, dynamic GET responses, real-time data	`CachingDisabled`
WordPress, Drupal, server-rendered apps with origin Cache-Control	`UseOriginCacheControlHeaders`
Same as above, but query strings affect the response	`UseOriginCacheControlHeaders-QueryStrings`
AWS Elemental MediaPackage HLS/DASH origin	`Elemental-MediaPackage`
Custom requirements that do not match any of the above	Custom cache policy

For a single distribution serving multiple content types, you typically attach different policies to different cache behaviors. A common setup: CachingOptimized on the default behavior (static assets), CachingDisabled on /api/*, UseOriginCacheControlHeaders on /blog/* if blog pages set their own Cache-Control.
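That setup can be modeled as an ordered list of path patterns with a fallback. The sketch below is only a way to reason about which policy a path would get: it uses fnmatchcase as an approximation of CloudFront's wildcard matching (first matching behavior wins, the default behavior is the fallback), and the path patterns are the hypothetical ones from the paragraph above. The policy IDs are the ones listed earlier in this post.

```python
from fnmatch import fnmatchcase

POLICY_IDS = {
    "CachingOptimized": "658327ea-f89d-4fab-a63d-7e88639e58f6",
    "CachingDisabled": "4135ea2d-6df8-44a3-9df3-4b5a84be39ad",
    "UseOriginCacheControlHeaders": "83da9c7e-98b4-4e11-a168-04f0df8e2c65",
}

# Ordered cache behaviors: the first matching pattern wins.
BEHAVIORS = [
    ("/api/*", "CachingDisabled"),
    ("/blog/*", "UseOriginCacheControlHeaders"),
]
DEFAULT_POLICY = "CachingOptimized"  # default behavior, used when nothing matches

def policy_for(path: str) -> str:
    """Return the managed policy name that would apply to a request path."""
    for pattern, policy in BEHAVIORS:
        if fnmatchcase(path, pattern):  # case-sensitive wildcard match
            return policy
    return DEFAULT_POLICY

for p in ("/api/users", "/blog/post-1", "/assets/app.js"):
    print(p, policy_for(p), POLICY_IDS[policy_for(p)])
```

Because matching stops at the first hit, more specific patterns need to come before broader ones in the behavior list.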

When to create a custom cache policy

Reach for a custom policy when you need to include something specific in the cache key that the managed policies do not cover. The most common reasons are device-type segmentation using CloudFront-Is-Mobile-Viewer (serving different markup to phones versus desktops), country-based variation using CloudFront-Viewer-Country (geo-targeted content where edge handling beats origin-side detection), language negotiation via Accept-Language, and tier-based content where a coarse-grained auth bucket determines which response to serve.

Here is a Terraform example for a custom policy that varies on viewer country and a coarse auth-tier cookie:

resource "aws_cloudfront_cache_policy" "country_and_tier" {
  name        = "country-and-tier"
  default_ttl = 3600       # 1 hour when origin sends no Cache-Control
  max_ttl     = 86400      # cap origin Cache-Control at 24 hours
  min_ttl     = 0          # respect origin no-store

  parameters_in_cache_key_and_forwarded_to_origin {
    # normalize Accept-Encoding so cache key collapses across browsers
    enable_accept_encoding_gzip   = true
    enable_accept_encoding_brotli = true

    headers_config {
      header_behavior = "whitelist"
      headers {
        items = ["CloudFront-Viewer-Country"]  # geo-target at the edge
      }
    }

    cookies_config {
      cookie_behavior = "whitelist"
      cookies {
        # coarse bucket like anonymous/free/premium — NOT a per-user session ID
        items = ["auth_tier"]
      }
    }

    query_strings_config {
      query_string_behavior = "none"
    }
  }
}

The cookie name matters here. auth_tier with values like anonymous, free, and premium produces three cache variants per country, which is reasonable. A per-user session token in the same slot would produce one cache variant per user, which is the cache-fragmenting pattern this article warned against earlier.

A few things to know when going custom. CloudFront-generated headers like CloudFront-Viewer-Country are not sent to the origin by default — you need to either include them in the cache key (which automatically forwards them to the origin) or add them to an origin request policy if you want them at the origin without varying the cache. The cache compression toggle should match what your origin returns; turning it on when your origin does not compress can fragment the cache without benefit. And keep Minimum TTL at 0 unless you specifically want to override origin no-store directives, because the surprise-caching behavior burns a lot of debugging hours when you forget about it.

Common pitfalls

Cache hit ratio tanks after a policy change. Almost always means the cache key got too specific. Check whether the new policy includes cookies or query strings the previous one did not, and re-check whether your origin sets per-user cookies on responses that should be shared across users.

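Origin keeps getting hit despite a long Default TTL. Default TTL only applies when the origin sends no Cache-Control or Expires header. If your origin sets Cache-Control: max-age=60, CloudFront uses 60 seconds, not the policy's Default TTL. Maximum TTL cannot help here, because it only caps origin-provided durations. Either lengthen max-age (or add s-maxage) at the origin, or raise Minimum TTL to enforce a floor, keeping in mind that a Minimum TTL above 0 also overrides no-cache, no-store, and private.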

no-cache from origin is being ignored. Several managed policies (CachingOptimized, CachingOptimizedForUncompressedObjects, and Amplify) have a Minimum TTL greater than 0, which overrides origin no-cache directives. If you need origin-side cache busting to take effect immediately, switch to CachingDisabled or build a custom policy with Minimum TTL = 0.

Forwarding CloudFront-Viewer-Country to origin does not work. For CloudFront-generated headers like CloudFront-Viewer-Country, make sure the header is explicitly enabled via a cache policy or an origin request policy — they aren't sent to your origin by default. If it's in the cache key, it's also forwarded to the origin automatically. If you only need it at the origin and don't want to vary the cache, put it in an origin request policy instead. (Note: a few CloudFront-* headers, including CloudFront-Viewer-Address, CloudFront-Viewer-ASN, and the TLS-related ones, can only be added via an origin request policy and not in a cache policy.)

Authorization header is not reaching the origin. CloudFront removes it by default. To forward it, either include Authorization in the cache key with a cache policy, or use an origin request policy that forwards all viewer headers (the AWS managed AllViewer policy, shown as Managed-AllViewer, does this). You cannot forward only Authorization via an origin request policy. Be careful: putting Authorization in the cache key creates per-token cache variants and is usually inappropriate for broadly shared cacheable content.

What's next

If you came here looking for "which cache policy should I use," the short answer is CachingOptimized for static content, CachingDisabled for APIs, and UseOriginCacheControlHeaders for content where your origin can express its own caching intent. Everything else is tuning at the margins.

The next post in this series goes deep on origin request policies — where the separation between cache key and origin forwarding really earns its keep — followed by response headers policies, which can replace a lot of security middleware with a single CloudFront config.

AWS CloudFront Explained: How Cache, Origin, and Response Policies Supercharge Your CDN

Maciej Łopalewski — Wed, 21 Jan 2026 09:00:00 +0000

If you have configured Amazon CloudFront in the past, you might remember wrestling with "Cache Behaviors" - a monolithic setting where caching logic, origin forwarding, and header manipulation were all jumbled together.

Those days are over.

Modern CloudFront architecture uses a modular Policy System. This approach decouples caching (what is stored) from origin requests (what is sent to the backend) and response headers (security/CORS).

For DevOps engineers and cloud architects, understanding these three policy types is the key to building performant, secure, and scalable content delivery networks. This guide breaks down the ecosystem of CloudFront Managed Policies and helps you choose the right tools for the job.

What is CloudFront?

Before diving into policies, let’s ground ourselves in the basics. Amazon CloudFront is a global Content Delivery Network (CDN). Its primary job is to sit between your users and your infrastructure (the "Origin" - like an S3 bucket or a load balancer in front of EC2 instances). It helps in three ways:

Latency: It serves content from "Edge Locations" physically closer to the user.
Security: It terminates TLS connections at the edge and helps absorb common DDoS attacks (AWS Shield Standard protection is included).
Scale: It absorbs traffic spikes so your backend doesn't crash.

The Policy Trio: How They Work

In the modern CloudFront request flow, three distinct policies interact to process a user's request. Understanding the distinction between them is critical for avoiding common pitfalls like "cache misses" or CORS errors.

1. Cache Policy

Where it sits: At the very front of the flow.
What it does: It determines the Cache Key.

When a user requests content, CloudFront uses this policy to decide if it already has a copy. It defines which headers, cookies, or query strings make a request "unique."

Strict policies (few values in the cache key) = Higher cache hit ratio.
Loose policies (many headers, cookies, or query strings in the cache key) = Lower cache hit ratio (more load on origin).
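
To make the hit-ratio effect concrete, here is a minimal sketch of a cache key built from only the values a policy allows. It is a simplified model (headers work the same way as cookies and are left out), and `cache_key`, `key_cookies`, and `key_query` are illustrative names rather than CloudFront identifiers.

```python
from typing import Dict, FrozenSet, Optional, Tuple


def cache_key(
    path: str,
    cookies: Optional[Dict[str, str]] = None,
    query: Optional[Dict[str, str]] = None,
    key_cookies: FrozenSet[str] = frozenset(),
    key_query: FrozenSet[str] = frozenset(),
) -> Tuple:
    """Build a cache key from the path plus only the allowed values."""
    cookies = cookies or {}
    query = query or {}
    return (
        path,
        tuple(sorted((n, v) for n, v in cookies.items() if n in key_cookies)),
        tuple(sorted((n, v) for n, v in query.items() if n in key_query)),
    )


alice = {"session": "a1"}
bob = {"session": "b2"}

# Strict policy: both users map to the same cached object.
print(cache_key("/logo.png", alice) == cache_key("/logo.png", bob))  # True

# Loose policy: the session cookie splits the cache per user.
loose = frozenset({"session"})
print(
    cache_key("/logo.png", alice, key_cookies=loose)
    == cache_key("/logo.png", bob, key_cookies=loose)
)  # False
```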

2. Origin Request Policy

Where it sits: Between CloudFront and your Backend (Origin).
What it does: It determines what data is forwarded to the backend during a cache miss.

This is the most misunderstood policy. It allows you to send data (like user-specific cookies) to your backend without including that data in the Cache Key. This keeps your cache efficiency high while still giving your application the data it needs to process logic.
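
The split between "what is cached on" and "what the origin sees" can be sketched as a simple union: values in the cache key are forwarded automatically, and the origin request policy adds further values without widening the key. The function name and arguments below are illustrative assumptions, not an AWS API.

```python
from typing import Dict, Iterable


def cookies_sent_to_origin(
    request_cookies: Dict[str, str],
    cache_policy_cookies: Iterable[str],
    origin_request_policy_cookies: Iterable[str],
) -> Dict[str, str]:
    """Cookies the origin receives on a cache miss (simplified model)."""
    # Cache-key cookies are always forwarded; the origin request policy
    # can add more cookies without making them part of the cache key.
    allowed = set(cache_policy_cookies) | set(origin_request_policy_cookies)
    return {n: v for n, v in request_cookies.items() if n in allowed}


sent = cookies_sent_to_origin(
    {"session": "a1", "theme": "dark"},
    cache_policy_cookies=[],                    # nothing varies the cache
    origin_request_policy_cookies=["session"],  # but the backend sees the session
)
print(sent)  # {'session': 'a1'}
```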

3. Response Headers Policy

Where it sits: On the way back to the user.
What it does: It injects specific HTTP headers into the response.

Regardless of what your backend sends, this policy ensures the browser receives the correct Security (HSTS, XSS protection) and CORS headers.
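
How a policy header interacts with a header the origin already sent depends on the per-header "origin override" setting. The sketch below models that rule with a hypothetical policy; the header values and override flags are examples, not the settings of any specific managed policy.

```python
from typing import Dict, Tuple


def apply_response_headers_policy(
    origin_headers: Dict[str, str],
    policy: Dict[str, Tuple[str, bool]],
) -> Dict[str, str]:
    """policy maps header name -> (value, override_origin)."""
    result = dict(origin_headers)
    present = {name.lower() for name in origin_headers}
    for name, (value, override) in policy.items():
        if name.lower() in present and not override:
            continue  # keep the origin's own value
        # Remove any differently-cased origin copy, then set the policy value.
        result = {k: v for k, v in result.items() if k.lower() != name.lower()}
        result[name] = value
    return result


origin = {"Content-Type": "text/html", "Strict-Transport-Security": "max-age=60"}
policy = {
    "X-Content-Type-Options": ("nosniff", True),
    "Strict-Transport-Security": ("max-age=31536000", False),
}
print(apply_response_headers_policy(origin, policy))
```

With override disabled, the origin's shorter HSTS value survives; switching the flag to `True` would replace it with the policy value.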

Top Managed Policies: A Cheat Sheet

AWS maintains a library of "Managed Policies" that cover the most common use cases. Using these is a best practice - they are maintained and updated by AWS, so there is nothing for you to version or patch.

Here are the most essential managed policies for each category.

A. Managed Cache Policies

Control the Cache Key and TTL (Time To Live).

CachingOptimized

Best For: S3 Buckets, Static Websites, Images/Assets.

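How it works: It keeps the cache key minimal: no cookies, no query strings, and no headers apart from the normalized Accept-Encoding header (Gzip and Brotli compression are enabled). Content is cached based on the URL path and the compression variant.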
Why choose it: This provides the highest possible cache hit ratio. If your content doesn't change based on who is viewing it, use this.

CachingDisabled

Best For: Dynamic APIs, WebSockets, Real-time data.

How it works: It sets the Time-To-Live (TTL) to 0. Every request bypasses the cache and goes straight to the origin.
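Why choose it: Essential for endpoints where data changes every second. CloudFront never caches responses to write methods such as POST or PUT, whatever the policy, but this policy also keeps GET and HEAD responses on the same paths out of the cache.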

UseOriginCacheControlHeaders

Best For: CMS (WordPress/Drupal), Hybrid Apps.

How it works: It defers the decision to your server. It looks for Cache-Control headers sent by your backend to decide how long to store the file.
Why choose it: Perfect if you have a mix of static and dynamic content and want your application code, rather than CloudFront configuration, to control cache duration.

B. Managed Origin Request Policies

Control what the backend sees (without breaking the cache).

AllViewer

Best For: Legacy Applications, Complex Dynamic Apps.

How it works: Forwards everything - every header, every cookie, every query string - to the origin.
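Why choose it: If your application relies on specific, obscure headers or client-side cookies to function, this ensures nothing is stripped out. Warning: This also forwards the viewer's Host header, which origins such as S3 or API Gateway typically reject; the AllViewerExceptHostHeader managed policy exists for that case.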

CORS-S3Origin

Best For: S3 Buckets serving assets to other domains.

How it works: Specifically whitelists the headers S3 requires to process CORS checks (Origin, Access-Control-Request-Method, etc.).
Why choose it: S3 handles CORS differently than a standard web server. Standard forwarding often fails with S3; this policy fixes those specific issues instantly.

UserAgentRefererHeaders

Best For: Analytics, Hotlink Protection.

How it works: It specifically forwards the User-Agent and Referer headers while stripping others.
Why choose it: If your backend needs to block requests from specific sites (hotlinking) or serve different content to mobile vs. desktop devices, but doesn't need full cookie visibility.

C. Managed Response Headers Policies

Control browser security and access.

SecurityHeadersPolicy

Best For: Everything. (Seriously, use this everywhere).

How it works: Automatically injects industry-standard security headers like:
- Strict-Transport-Security (HSTS)
- X-Frame-Options: SAMEORIGIN (limits framing to your own origin, which mitigates clickjacking)
- X-Content-Type-Options: nosniff
Why choose it: It instantly hardens your application against common web attacks without requiring code changes on your server.

CORS-with-preflight-and-SecurityHeadersPolicy

Best For: Single Page Apps (React, Vue, Angular) calling APIs.

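How it works: Combines the security headers above with a permissive CORS configuration that allows all origins. The CORS headers are also added to responses to the OPTIONS preflight requests that browsers send before cross-origin API calls; your origin still has to answer those OPTIONS requests.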
Why choose it: It solves the dreaded "CORS Error" in browser consoles for modern web applications.

SimpleCORS

Best For: Public, read-only data feeds.

How it works: Adds Access-Control-Allow-Origin: * to the response.
Why choose it: If you are hosting public data (like a weather feed or public JSON file) that you want anyone to be able to use on their website, this is the quickest way to enable it.

Common CloudFront Misconfigurations and How Managed Policies Fix Them

Even experienced DevOps teams run into the same CloudFront issues over and over. Almost all of them trace back to legacy cache behaviors or overly customized settings. Here’s how CloudFront Managed Policies solve the most common problems.

1. “My cache hit ratio is terrible.”

Cause: Your Cache Key is too loose - it includes unnecessary headers, cookies, or query strings.
Symptom: Every request is seen as "unique," forcing a constant stream of cache misses.
The Fix: Use the CachingOptimized managed policy. It strips almost everything from the Cache Key, restoring high hit ratios - perfect for static assets, SPAs, and S3 origins.

2. “CloudFront keeps forwarding too many headers to my origin.”

Cause: Legacy behaviors often forward all headers by default.
Impact: Increased origin load, slower responses, and potential "Request Header Too Large" errors on the backend.
The Fix: Switch to an Origin Request Policy like UserAgentRefererHeaders or CORS-S3Origin. This ensures you forward only what your backend actually needs to function.

3. “I’m still getting CORS errors in the browser.”

Cause: Missing or inconsistent Access-Control-* headers.
The Fix: Apply the CORS-with-preflight-and-SecurityHeadersPolicy response policy. It adds the required CORS headers at the edge, including on responses to OPTIONS preflight requests, even if your backend forgets them.

4. “S3 CORS works on localhost, but not in CloudFront.”

Cause: S3 requires specific headers to process CORS. If CloudFront strips them, S3 treats the request as standard and omits the CORS response headers.
The Fix: Use the CORS-S3Origin Origin Request Policy. This explicitly forwards Origin, Access-Control-Request-Method, and Access-Control-Request-Headers so S3 knows to respond correctly.

5. “My dynamic API is being cached when it shouldn’t be.”

Cause: Your API path (/api/*) is falling through to a default behavior that has caching enabled.
The Fix: Create a specific behavior for your API path and attach CachingDisabled. This guarantees every request bypasses the edge and reaches your application.
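
The "falling through to the default behavior" problem is easier to see as code. This is a minimal sketch assuming behaviors are checked in order and the first matching path pattern wins. It uses Python's `fnmatchcase` as an approximation of CloudFront's `*` and `?` wildcards (fnmatch also accepts `[...]` sets, which is not part of this model), and `pick_cache_policy` is an illustrative name.

```python
from fnmatch import fnmatchcase
from typing import List, Tuple


def pick_cache_policy(
    path: str,
    behaviors: List[Tuple[str, str]],
    default_policy: str,
) -> str:
    """Return the cache policy of the first behavior whose pattern matches."""
    for pattern, policy in behaviors:
        if fnmatchcase(path, pattern):
            return policy
    # No specific behavior matched: the default behavior handles the request.
    return default_policy


behaviors = [("/api/*", "CachingDisabled")]
print(pick_cache_policy("/api/orders/42", behaviors, "CachingOptimized"))  # CachingDisabled
print(pick_cache_policy("/index.html", behaviors, "CachingOptimized"))     # CachingOptimized

# Without the /api/* behavior, the API request is cached by the default.
print(pick_cache_policy("/api/orders/42", [], "CachingOptimized"))         # CachingOptimized
```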

Summary

Moving to Managed Policies allows you to operate with "Intent-Based Configuration." Instead of tweaking individual settings, you select a policy that matches your architectural intent.

Intent	Cache	Origin Request	Response
Static Website	`CachingOptimized`	`None` (or `CORS-S3Origin`)	`SecurityHeadersPolicy`
Dynamic API	`CachingDisabled`	`AllViewer`	`SimpleCORS`
Modern Web App (SPA)	`CachingOptimized` (for assets)	`None`	`CORS-with-preflight`

Architecting Dagster at Scale: Navigating the Challenges of 50+ Code Locations on Kubernetes

Maciej Łopalewski — Mon, 20 Oct 2025 07:58:08 +0000

As your organization embraces Dagster for data orchestration, your projects will inevitably grow. What starts with a single, manageable code location can quickly expand into dozens, each owned by a different team or serving a distinct data domain. While this modularity is one of Dagster’s strengths, it introduces significant architectural challenges, especially when deploying on Kubernetes.

Managing 50, 100, or even more code locations is not just a matter of adding more entries to a YAML file. It has profound implications for resource consumption, deployment speed, and overall maintainability.

This article dives deep into the trade-offs of managing Dagster at scale on Kubernetes. We’ll explore different deployment models, discuss performance and observability, and share real-world patterns (and anti-patterns) to help you design a data platform that is both powerful and efficient.

First, What Are Dagster Code Locations?

In Dagster, a code location is a collection of Dagster definitions (like assets, jobs, schedules, and sensors) that are loaded in a single Python environment. Think of it as a self-contained package of data pipelines. By isolating code into distinct locations, you achieve several benefits:

Fault Tolerance: An error in one code location (e.g., a missing Python dependency) won’t prevent other code locations from loading.
Independent Deployments: Team A can update their pipelines without forcing Team B to redeploy.
Dependency Management: Each code location can have its own requirements.txt or pyproject.toml, avoiding conflicts between teams that need different library versions.

Dagster’s central components, like the webserver and the daemon, communicate with these code locations via a gRPC API to fetch definitions and launch runs. This separation is key to its scalability.

The Hidden Costs of Managing 50+ Code Locations

When you deploy Dagster on Kubernetes using the official Helm chart, the standard approach is to use user code deployments. This feature creates a dedicated Kubernetes Deployment and Service for each code location you define.

A typical Dagster architecture on Kubernetes, where each code location runs in its own pod.

This model works perfectly for a handful of locations. But as you scale past 50, you start to feel the pain points:

Resource Overhead: Each code location pod consumes resources just by running. A baseline Python process, the gRPC server, and health checks require a certain amount of CPU and memory. While a single pod might only need 100MB of RAM, 50 of them instantly consume 5GB - and that’s before they even load your code. This idle resource consumption can become a significant cost.
Deployment Bottlenecks: If you need to update a shared library or a base Docker image used by all code locations, you trigger a massive rollout. Kubernetes must terminate 50+ old pods and schedule 50+ new ones. In a resource-constrained cluster, this can lead to long deployment times, "Pod unschedulable" events, and service degradation.
The "Launchpad" Problem: It's crucial to remember that these code location pods do not run your data pipelines. Their primary role is to serve metadata to the webserver and provide the necessary code to the Dagster daemon, which then launches another pod (the "run pod" or "job pod") to actually execute the pipeline. This means your infrastructure must support both the standing army of code location pods and the transient pods for active runs, further compounding resource pressure.
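
The idle-overhead arithmetic above is worth running against your own numbers. A minimal sketch, assuming the article's illustrative figure of 100MB per idle pod; `idle_memory_gb` is a hypothetical helper, not part of Dagster.

```python
def idle_memory_gb(code_locations: int, mb_per_pod: float = 100) -> float:
    """Memory held by idle code location pods, in GB (1 GB = 1000 MB)."""
    return code_locations * mb_per_pod / 1000


for count in (10, 50, 100):
    print(f"{count} code locations -> {idle_memory_gb(count)} GB idle")
```

Replacing `mb_per_pod` with the measured idle footprint of your own pods gives a quick baseline before any pipeline code or run pods are counted.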

Kubernetes Deployment Models: Trade-Offs and Strategies

Given the challenges, let's analyze the two primary architectural models for deploying Dagster code locations on Kubernetes.

Model 1: The Standard "Pod per Code Location"

This is the default and recommended approach using user-code-deployments in the Dagster Helm chart.

How it works: You define each code location in your values.yaml file, and Helm creates a separate Kubernetes Deployment for each.

# values.yaml
# In the official Dagster Helm chart this block is keyed `dagster-user-deployments`.
dagster-user-deployments:
  enabled: true
  deployments:
    - name: "sales-analytics"
      image:
        repository: "my-registry/sales-analytics"
        tag: "0.2.1"
      # ... resources, env vars, etc.
    - name: "marketing-etl"
      image:
        repository: "my-registry/marketing-etl"
        tag: "1.5.0"
      # ...
    # ... 50 more entries

Pros:

Full Isolation: The best model for fault tolerance and dependency management.
Clear Ownership: Easy to map a code location pod to a specific team or project.
Granular Updates: An update to the sales-analytics image only triggers a rollout for that single deployment.

Cons:

High Resource Overhead: The primary driver of idle resource consumption.
Slow Global Deployments: Updating all locations at once is slow and resource-intensive.
Cluster Limits: Can strain clusters that have a low limit on the total number of pods.

Model 2: The Monolithic "Single Pod" Approach (A Workaround)

For teams struggling with the overhead of the standard model, an alternative is to consolidate all code locations into a single process. This is not officially recommended as it moves away from Dagster's core isolation principles, but it can be a pragmatic solution in specific, resource-constrained scenarios.

How it works: You can "hack" the official Helm chart to run all your code locations within the main Dagster webserver and daemon pods. This involves building a single, monolithic Docker image containing the code for all pipelines and providing a workspace.yaml that loads them from the local filesystem.

# In your monolithic Dockerfile
COPY ./pipelines/sales_analytics /opt/dagster/app/sales_analytics
COPY ./pipelines/marketing_etl /opt/dagster/app/marketing_etl
# ...

# workspace.yaml loaded into the webserver/daemon
load_from:
  - python_module:
      module_name: sales_analytics.definitions
      working_directory: /opt/dagster/app/sales_analytics
  - python_module:
      module_name: marketing_etl.definitions
      working_directory: /opt/dagster/app/marketing_etl
  # ... all other locations

You would disable the user code deployments in the chart (set `enabled: false` on that block) and ensure this workspace file is used by the main Dagster pods.

Pros:

Minimal Resource Footprint: Dramatically reduces the number of standing pods, saving significant idle resources.
Fast Deployments: An update involves rolling out just a few pods (webserver, daemon), which is much faster than 50+.

Cons:

No Fault Tolerance: A single broken dependency or syntax error in one code location can bring down the entire system.
Dependency Hell: All teams must agree on a single, shared set of Python dependencies.
Massive Pods: The webserver and daemon pods become huge, potentially requiring very large and expensive Kubernetes nodes to run.
Coupled Deployments: Any change requires rebuilding and redeploying the entire monolithic image.

Strategies for Maintainability and Scaling

Instead of choosing one extreme, the best strategy often lies in intelligent application of the standard model.

Use a Separate Image Per Code Location: Avoid using a single base image for all your code locations. While it seems efficient, it creates tight coupling. Instead, build and version a Docker image for each code location independently. This ensures that only the code locations that have actually changed will be redeployed during an update.
Aggressively Monitor Resources: Use tools like Prometheus and Grafana to monitor the CPU and memory usage of your code location pods. Are they constantly sitting at 5% of their requested resources? You are likely overprovisioning. Adjust their resources.requests in your Helm chart to free up capacity for run pods.
Optimize Deployment Times: Keep your Docker images lean. A smaller image pulls faster, leading to quicker pod startup times. Use multi-stage builds and avoid including unnecessary build-time dependencies in your final image.

Real-World Patterns and Anti-Patterns

Theory is one thing, but production issues are the best teacher. Here are some patterns to emulate and anti-patterns to avoid.

Anti-Pattern: The Heavyweight Code Location

A common mistake is to load large models or initialize expensive clients at the module level of your Dagster code. Remember: everything you import and define globally in your code location gets loaded into memory the moment the pod starts.

Real-world example:
A team was using the libpostal library for address parsing. Simply adding import postal to their asset definitions caused the memory footprint of their code location pod to jump by 2GB. When several other teams copied this pattern, the cluster's memory usage skyrocketed, causing widespread performance issues.

# assets/address_parsing.py

from postal.parser import parse_address # <-- This import loads a large model into memory!

from dagster import asset

@asset
def parsed_addresses(raw_addresses):
    # This asset's code location pod now holds a 2GB model in memory,
    # even when the asset is not running.
    return [parse_address(addr) for addr in raw_addresses]

The Fix: There are two great ways to solve this problem.

Lazy Loading: The simplest fix is to lazily import or load expensive resources inside your asset or op functions. This ensures the resource is only loaded into memory in the short-lived run pod, not the long-running code location pod.

# A better approach
from dagster import asset

@asset
def parsed_addresses(raw_addresses):
    from postal.parser import parse_address # <-- Import inside the function

    return [parse_address(addr) for addr in raw_addresses]

Externalize as a Microservice: For an even more robust and scalable solution, you can externalize the heavy dependency entirely. You can deploy libpostal as a microservice (e.g., using a wrapper like libpostal-rest) to have more control over its resources. This centralizes the resource-intensive component into a single, dedicated instance that you can manage and scale independently, serving all your Dagster pipelines via a simple network call.
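
As a rough sketch of the microservice variant, the asset body would make an HTTP call instead of importing the parser. Everything specific here is an assumption: the base URL `http://libpostal.internal:8080`, the `/parser` path, and the `{"query": ...}` payload are placeholders to adapt to whatever wrapper you deploy. The transport is injectable so the function can be exercised without a running service.

```python
import json
from typing import Any, Callable, Dict, List
from urllib import request


def http_post_json(url: str, payload: Dict[str, Any], timeout: float = 5.0) -> Any:
    """POST a JSON payload and decode the JSON response (stdlib only)."""
    req = request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def parse_addresses_remote(
    raw_addresses: List[str],
    base_url: str = "http://libpostal.internal:8080",  # hypothetical service URL
    post: Callable[[str, Dict[str, Any]], Any] = http_post_json,
) -> List[Any]:
    """Parse each address via the external service instead of a local model."""
    # The /parser path and {"query": ...} body are assumed, not confirmed.
    return [post(f"{base_url}/parser", {"query": addr}) for addr in raw_addresses]
```

Calling `parse_addresses_remote(raw_addresses)` from inside the asset keeps both the code location pod and the run pod free of the 2GB model; only the dedicated service holds it in memory.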

Pattern: Domain-Driven Consolidation

If you have many small, related code locations owned by the same team, consider consolidating them. Instead of having sales-team-daily, sales-team-weekly, and sales-team-hourly, merge them into a single sales-team code location. This reduces pod sprawl without creating a true monolith.

Conclusion: When to Split and When to Consolidate

Choosing the right architecture is a balancing act. Here's a simple heuristic:

Stick with the "Pod per Code Location" model as your default. The isolation and maintainability benefits are immense and align with Dagster's core design. Use the strategies outlined above to mitigate the resource overhead.
Consolidate code locations that are owned by the same team, share the same dependencies, and are deployed together. This is a pragmatic way to reduce pod count.
Only consider the "Monolithic" model as a last resort. If you are in a highly resource-constrained environment and suffering from cripplingly slow rollouts due to pod churn, it can be a temporary lifeline. But be fully aware of the trade-offs in stability and dependency management you are making.

Boost Your Medusa E-Commerce Development: Streamlined Local Setup Guide

Maciej Łopalewski — Fri, 27 Jun 2025 05:27:38 +0000

Medusa is a powerful open-source headless commerce engine that provides a flexible and robust foundation for building e-commerce applications. Getting started with a new framework often involves setting up various prerequisites and understanding configuration nuances. While the official documentation is excellent, having a pre-configured starting point can significantly accelerate the local development process.

This article will guide you through setting up Medusa locally, focusing on common requirements and helpful configurations. We'll also introduce a specific resource, the medusa-starter repository, designed to simplify these initial steps and provide valuable examples for future deployment.

The Standard Medusa Installation Path

The official Medusa documentation offers comprehensive guides for getting started. You can find the detailed installation instructions covering prerequisites like Node.js, PostgreSQL, and Redis on the Medusa Installation guide. It's highly recommended to familiarize yourself with these official steps.

Simplifying Prerequisites with `medusa-starter`

One of the initial hurdles in setting up a local development environment for Medusa is provisioning the necessary database (PostgreSQL) and caching layer (Redis). The medusa-starter repository addresses this by providing a simple Docker Compose file specifically for these services.

Within the repository, you'll find a compose.db.yaml file. This file allows you to spin up ready-to-use PostgreSQL and Redis instances with a single command:

docker-compose -f compose.db.yaml up -d

This command brings up the necessary services in detached mode (-d), allowing you to quickly get your database and cache dependencies running without manual installation and configuration. Based on the configuration in compose.db.yaml, this will set up:

PostgreSQL: Available on localhost:5432. It will be configured with the user, password, and database name specified within the compose file or corresponding environment variables, ready for your Medusa instance to connect.
Redis: Available on localhost:6379. For this local development setup, authentication is typically not required, allowing for straightforward connection.

The medusa-starter repository includes environment variables tailored to connect to these services, making integration straightforward once they are running.
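
Before starting Medusa, it can help to confirm that both containers are actually accepting connections. This is a small optional sketch using only the Python standard library. It checks that the TCP ports are open on localhost, not that credentials are valid, and `port_open` is an illustrative helper rather than part of medusa-starter.

```python
import socket


def port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    for name, port in (("PostgreSQL", 5432), ("Redis", 6379)):
        state = "up" if port_open("localhost", port) else "down"
        print(f"{name} on localhost:{port}: {state}")
```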

Understanding `NODE_ENV` and Its Implications

A critical aspect of configuring your Medusa application, both locally and in production, is the NODE_ENV environment variable. This variable significantly influences Medusa's behavior.

NODE_ENV=development: This is the standard setting for local development. In this mode, Medusa often provides more detailed logging and error messages, and certain security constraints are relaxed to facilitate rapid iteration.
NODE_ENV=production: This setting is intended for production deployments. Medusa enables stricter security measures and optimizes for performance. A key behavior change in production mode is the requirement for a secure connection (HTTPS/TLS) and a custom domain to access the Medusa Admin panel. This is because Medusa uses secure cookies for authentication, which browsers will only send over a secure connection to a specific domain, not localhost.

A `NODE_ENV` Workaround for Local Docker (Without TLS)

If you are trying to run your Medusa application locally within a Docker container without setting up TLS and a custom domain, using NODE_ENV=production will prevent you from logging into the admin panel.

As a workaround for this specific scenario (local Docker testing without TLS), you can set NODE_ENV to a different value, such as CI. While this allows you to bypass the secure cookie requirement locally, it is important to understand that this is a workaround and not a recommended practice for actual production deployments. For production, always use NODE_ENV=production and ensure proper TLS setup.

Beyond Local Development: Production Examples in `medusa-starter`

The medusa-starter repository isn't just for getting started locally. It also provides valuable examples to help you transition towards production deployments:

medusa-config.ts Example: The repository includes an example medusa-config.ts file that demonstrates how to configure modules and settings suitable for a production environment, often integrating with the Docker Compose database setup.
Example Dockerfiles: You'll find example Dockerfiles that show you how to package your Medusa application into a Docker image, a common step for cloud deployments.
Example GitHub Actions Workflow: The repository includes a basic GitHub Actions workflow example that you can adapt for your own projects. This workflow demonstrates a basic Continuous Integration (CI) pipeline, which is crucial for automating testing and building your application.

Conclusion

Setting up a development environment can sometimes feel like the first significant hurdle. The medusa-starter repository (v2) aims to lower that barrier by providing pre-configured examples for essential services via Docker Compose. By understanding the role of NODE_ENV and leveraging the provided configuration and Docker examples, you can not only streamline your local development workflow but also gain a head start on preparing your Medusa application for production deployment. Explore the repository, adapt the examples to your needs, and happy coding!
