<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Sumedh Bala</title>
    <description>The latest articles on DEV Community by Sumedh Bala (@sumedhbala).</description>
    <link>https://dev.to/sumedhbala</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3580553%2F1432be4b-7147-4437-8eb9-b66d129f92e2.jpg</url>
      <title>DEV Community: Sumedh Bala</title>
      <link>https://dev.to/sumedhbala</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/sumedhbala"/>
    <language>en</language>
    <item>
      <title>Claude Code Costs, Act IV — The mistakes catalogue &amp; cheat sheet</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Fri, 26 Jun 2026 06:00:29 +0000</pubDate>
      <link>https://dev.to/sumedhbala/claude-code-costs-act-iv-the-mistakes-catalogue-cheat-sheet-34bo</link>
      <guid>https://dev.to/sumedhbala/claude-code-costs-act-iv-the-mistakes-catalogue-cheat-sheet-34bo</guid>
      <description>&lt;p&gt;This act consolidates everything into two reference artifacts: a catalogue of the mistakes (each with its symptom, cost, and fix) for revision, and a one-page cheat sheet.&lt;/p&gt;

&lt;h2&gt;
  
  
  The mistakes catalogue
&lt;/h2&gt;

&lt;p&gt;Ordered roughly by how much money they cost and how often they happen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Routing per turn instead of per conversation.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; a "cost-saving" router that bounces models mid-conversation. &lt;em&gt;Cost:&lt;/em&gt; a cold prefix write on the first switch + a catch-up write on every re-entry; for Sonnet, a net &lt;em&gt;loss&lt;/em&gt; (+2.1% measured). &lt;em&gt;Fix:&lt;/em&gt; route sticky — per conversation or per sub-agent. Where you switch matters more than whether.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Inserting a proxy that re-serializes JSON.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; a logging/compression/routing proxy that parses and re-emits the request. &lt;em&gt;Cost:&lt;/em&gt; even a logically-identical re-encode changes the bytes (separators, escaping, key order, number formatting) and drops cache hit rate to ~0 for &lt;em&gt;all&lt;/em&gt; traffic. &lt;em&gt;Fix:&lt;/em&gt; forward original bytes verbatim; mutate only by surgical byte-fragment replacement on the &lt;code&gt;messages&lt;/code&gt; tail.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Leaving the tool/MCP surface unstable.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; tool count grows across early turns (async MCP connectors), or dynamic/runtime tool discovery is enabled. &lt;em&gt;Cost:&lt;/em&gt; byte-0 churn → full cold rebuild (2× write) every affected turn. &lt;em&gt;Fix:&lt;/em&gt; &lt;code&gt;--strict-mcp-config&lt;/code&gt; and a frozen startup tool set; a fixed dispatcher tool instead of runtime tool mutation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Volatile tokens early in the prompt.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; a timestamp, UUID, or request-ID near the top of the system prompt; unsorted &lt;code&gt;json.dumps&lt;/code&gt;. &lt;em&gt;Cost:&lt;/em&gt; every request has a unique prefix → ~0% hit rate, silently. &lt;em&gt;Fix:&lt;/em&gt; move all volatile content &lt;em&gt;after&lt;/em&gt; the last breakpoint; sort keys deterministically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Stripping or editing thinking blocks in a proxy.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; a proxy that "cleans up" or removes thinking blocks. &lt;em&gt;Cost:&lt;/em&gt; editing → 400 (integrity); removing → 200 but a position-dependent re-key (removing an &lt;em&gt;early&lt;/em&gt; block lost 8,104 cache tokens vs 440 for a late one). &lt;em&gt;Fix:&lt;/em&gt; never mutate thinking blocks; carry the request body as Claude Code assembled it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Trusting the displayed session cost as your invoice.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; using &lt;code&gt;/cost&lt;/code&gt; or the status line as the source of truth. &lt;em&gt;Cost:&lt;/em&gt; it prices 1-hour writes at the 5-minute 1.25× rate and reads ~10% low ($1.77 vs $1.97 on a real session). &lt;em&gt;Fix:&lt;/em&gt; use the Anthropic Console for absolute billing; use the displayed figure only for in-session relative comparison.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;7. Caching a prefix you'll read fewer than 3 times.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; aggressive caching in a short-lived custom harness. &lt;em&gt;Cost:&lt;/em&gt; a wasted write is 2× — double what no caching would have cost. &lt;em&gt;Fix:&lt;/em&gt; cache only when you'll re-read the prefix ≥3 times (the break-even); let interactive sessions cache by default.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;8. Running an aggressive input compressor over a cached prefix.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; LLMLingua/LLM-based compression applied to the conversational prefix. &lt;em&gt;Cost:&lt;/em&gt; rewriting cached bytes flips reads (0.1×) to writes (2×); accuracy degrades past ~20×. &lt;em&gt;Fix:&lt;/em&gt; use input compressors only where there's no cache to lose (one-shot, RAG assembly) or only on the fresh tail; prefer a cache-aware tool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;9. Agentic bursts overflowing the 20-block lookback.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; a single turn emits ~10+ parallel tool calls (≈21+ tool-call/result blocks). &lt;em&gt;Cost:&lt;/em&gt; the next turn can't re-link the cache and cold-writes the gap. &lt;em&gt;Fix:&lt;/em&gt; in Claude Code, bound the burst; behind a proxy, a stateless breakpoint grid (stride ~18, 4 markers) rescues up to ~74 blocks/turn.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;10. A cache breakpoint below the minimum prefix.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; marking a sub-1,024-token prefix for caching. &lt;em&gt;Cost:&lt;/em&gt; the marker silently does nothing (&lt;code&gt;cache_creation: 0&lt;/code&gt;); you think caching is on. &lt;em&gt;Fix:&lt;/em&gt; confirm non-zero &lt;code&gt;cache_creation&lt;/code&gt; then non-zero &lt;code&gt;cache_read&lt;/code&gt; on the following request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;11. Assuming HTTP 200 means the cache survived.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; judging a proxy change by "it still works." &lt;em&gt;Cost:&lt;/em&gt; 200 means "valid request," not "cache preserved" — you can silently convert cheap reads into expensive writes. &lt;em&gt;Fix:&lt;/em&gt; validate every prefix-touching change by watching &lt;code&gt;cache_read&lt;/code&gt; on the &lt;em&gt;next&lt;/em&gt; request, not just the status code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;12. Switching Haiku → Sonnet mid-conversation.&lt;/strong&gt; &lt;em&gt;Symptom:&lt;/em&gt; escalating an in-flight Haiku conversation to Sonnet. &lt;em&gt;Cost:&lt;/em&gt; stacks three things — full cache miss (model-scoped), Sonnet now keeps-and-bills thinking Haiku had been ignoring, and any prefix-leanness benefit evaporates. &lt;em&gt;Fix:&lt;/em&gt; decide the tier at conversation start; if you must escalate, accept it as a fresh cold start, not a cheap continuation.&lt;/p&gt;

&lt;h2&gt;
  
  
  One-page cheat sheet
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The four mental models&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Amnesiac contractor&lt;/strong&gt; — the server stores nothing; the request is the only state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prefix-match cache&lt;/strong&gt; — caching is a strict byte-prefix; a change at byte N invalidates everything ≥ N.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model-scoped key&lt;/strong&gt; — a cache entry belongs to one model; another model can't read it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encrypted envelope&lt;/strong&gt; — thinking rides back sealed in the &lt;code&gt;signature&lt;/code&gt;; you carry it, can't open it, can't edit it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The numbers&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cache &lt;strong&gt;read = 0.1×&lt;/strong&gt; input; cache &lt;strong&gt;write = 2×&lt;/strong&gt; (1-hour TTL, what Claude Code sends); &lt;strong&gt;output = 5×&lt;/strong&gt; input, never cached.&lt;/li&gt;
&lt;li&gt;Caching breaks even from the &lt;strong&gt;3rd&lt;/strong&gt; read (at 2×). A wasted write costs &lt;strong&gt;2×&lt;/strong&gt; (double no-cache).&lt;/li&gt;
&lt;li&gt;A cache hit is &lt;strong&gt;~10× cheaper&lt;/strong&gt; than cold processing.&lt;/li&gt;
&lt;li&gt;Tokenizer factor: Opus 4.8 = 1.0; Haiku 4.5 = 0.775 (measured); Sonnet 4.6 ≈ 0.775 (assumed).&lt;/li&gt;
&lt;li&gt;Claude Code: &lt;strong&gt;3 breakpoints, all 1-hour TTL&lt;/strong&gt; (tools+identity, full system, sliding tail). Displayed &lt;code&gt;total_cost_usd&lt;/code&gt; under-reports (prices writes at 1.25×).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The render order&lt;/strong&gt; — &lt;code&gt;tools → system → messages&lt;/code&gt;. Stable first, volatile last.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The two safest cost levers&lt;/strong&gt; (never touch the prefix):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Output compression&lt;/strong&gt; (Caveman) — attacks the priciest line, which is never cached.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sticky routing&lt;/strong&gt; — per conversation/sub-agent: all-Sonnet ≈ −53%, all-Haiku ≈ −85% vs Opus.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The reflex to build:&lt;/strong&gt; before any change that touches tools, system, or message &lt;em&gt;history&lt;/em&gt;, ask &lt;em&gt;"does this change a byte inside the cached prefix?"&lt;/em&gt; If yes, you're paying a 2× rewrite — only do it on purpose.&lt;/p&gt;

&lt;h2&gt;
  
  
  Synthesis — the whole guide in twelve lines
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code is a stateless API client.&lt;/strong&gt; Every turn is one complete &lt;code&gt;/v1/messages&lt;/code&gt; request carrying tools + system + the whole conversation; the server stores nothing and the client re-sends everything (including thinking) each turn. The request is the only state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching is a prefix match.&lt;/strong&gt; A byte change at N kills caches at ≥ N; order tools → system → messages; stable first, volatile last.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Breakpoints are content-addressed cut points (max 4);&lt;/strong&gt; read the longest matching prefix, reprocess only the delta; backward re-link is bounded by 20 blocks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A new turn adds and reads — it doesn't invalidate.&lt;/strong&gt; Reads refresh the TTL; the 2× write premium is paid only on the delta; the big prefix is written once.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code uses 3 breakpoints at 1-hour TTL&lt;/strong&gt; — tools+identity, full system, sliding tail. The date is a system-reminder after the system breakpoints; the billing header's token is excluded from the cache key; its displayed cost under-reports by pricing 1-hour writes at 1.25×.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Switching models is not "always cold":&lt;/strong&gt; first call cold, later same-model calls warm-read + catch-up-write. Caches are model-scoped, the system prompt differs by two lines, and across the Opus/Sonnet/Haiku family thinking blocks replay across models — rendered and billed, not dropped, even when encrypted; only Fable/Mythos drop them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thinking is an encrypted envelope.&lt;/strong&gt; The &lt;code&gt;signature&lt;/code&gt; carries the full reasoning, decrypted server-side; you can't read it, can't edit it (400), but must carry it for continuity. Billed as output at generation, input on replay; &lt;code&gt;display&lt;/code&gt; is visibility-only.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Thinking-strip is a model-class trait, neutralized by Claude Code.&lt;/strong&gt; Default Haiku/older strip thinking (free, out of the cache key); Opus 4.5+/Sonnet 4.6+ keep it. Claude Code forces &lt;code&gt;keep:"all"&lt;/code&gt; everywhere, so the strip never bites in real usage — and a proxy that &lt;em&gt;removes&lt;/em&gt; blocks pays a position-dependent re-key.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic MCP tools&lt;/strong&gt; (runtime &lt;code&gt;list_changed&lt;/code&gt; or async connector load) churn byte 0 → full rebuild. Keep the tool set stable; use a fixed dispatcher. Stabilize the tool surface first.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output dominates generation-heavy turns&lt;/strong&gt; — the only regime where a cold cross-model bounce can still win (output &amp;gt; ~2,500 tokens).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Route per conversation, not per turn.&lt;/strong&gt; At 2× writes, per-request 20%-Sonnet &lt;em&gt;loses&lt;/em&gt; 2.1% while sticky all-Sonnet saves 53% and all-Haiku 85%. Where you switch matters more than whether.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost-reduction tooling clusters by cost line.&lt;/strong&gt; Routers attack the model tier (pay off only sticky); compressors attack input/cache (must leave the cached prefix byte-stable); output compressors attack the priciest, cache-neutral line. The safest wins — output compression and sticky routing — are the two that never touch the cached prefix.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  References &amp;amp; caveats
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Costs are computed from measured token counts at Anthropic list prices (per 1M, as of 2026-06-15): Opus 4.8 $5/$25, Sonnet 4.6 $3/$15, Haiku 4.5 $1/$5 in/out; cache read 0.1×, cache write 2× (1-hour TTL). The tokenizer factor 0.775 is measured for Haiku and assumed for Sonnet. Sonnet routing figures are modeled, not run. Claude Code's displayed &lt;code&gt;total_cost_usd&lt;/code&gt; uses a 1.25× write multiplier and reads lower than these documented-rate figures. Prices and behaviors are version-specific (Claude Code 2.1.150, mid-2026) — re-verify for your own setup.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anthropic documentation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Extended thinking (signatures, encryption, display, retention, multi-turn/tool-use): &lt;code&gt;platform.claude.com/docs/en/build-with-claude/extended-thinking&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Adaptive thinking &amp;amp; effort: &lt;code&gt;platform.claude.com/docs/en/build-with-claude/adaptive-thinking&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Prompt caching: &lt;code&gt;platform.claude.com/docs/en/build-with-claude/prompt-caching&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Pricing: &lt;code&gt;claude.com/pricing&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cost-reduction tooling&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Routers: RouteLLM &lt;code&gt;lm-sys/RouteLLM&lt;/code&gt; · vLLM Semantic Router &lt;code&gt;vllm-project/semantic-router&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Input/cache compressors: Headroom &lt;code&gt;chopratejas/headroom&lt;/code&gt; · LLMLingua &lt;code&gt;microsoft/LLMLingua&lt;/code&gt; · RTK &lt;code&gt;rtk-ai/rtk&lt;/code&gt; · lean-ctx &lt;code&gt;yvgude/lean-ctx&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Output compressor: Caveman &lt;code&gt;JuliusBrussee/caveman&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;MCP servers referenced: &lt;code&gt;github/github-mcp-server&lt;/code&gt; (v1.0.5; dynamic-toolsets removed in v1.1.0, PR #2512 — for tech-debt, not cache) · &lt;code&gt;docker/mcp-gateway&lt;/code&gt; (v0.43.0 / &lt;code&gt;4833d8c&lt;/code&gt;; &lt;code&gt;dynamic-tools&lt;/code&gt; default-on)&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>claude</category>
      <category>programming</category>
    </item>
    <item>
      <title>Claude Code Costs, Act III — The ecosystem of options for spending less</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Fri, 26 Jun 2026 06:00:10 +0000</pubDate>
      <link>https://dev.to/sumedhbala/claude-code-costs-act-iii-the-ecosystem-of-options-for-spending-less-33pc</link>
      <guid>https://dev.to/sumedhbala/claude-code-costs-act-iii-the-ecosystem-of-options-for-spending-less-33pc</guid>
      <description>&lt;p&gt;There is a whole open-source ecosystem aimed at cutting LLM cost. The trick to evaluating any of it is to ask &lt;strong&gt;which of the three cost lines it attacks&lt;/strong&gt; — cached input (0.1×), uncached/written input (1×/2×), or output (the priciest, never cached) — and then whether it does so &lt;strong&gt;without forfeiting the model-scoped prompt cache.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That last clause is the recurring catch, and it's the throughline of this entire guide:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Anything that rewrites the request prefix per turn forfeits the model-scoped prompt cache.&lt;/strong&gt; A tool nets real savings only if it (a) preserves the cached prefix byte-for-byte, or (b) attacks a cost line the cache doesn't cover — namely &lt;strong&gt;output.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;We'll walk the ecosystem in cost-line order: input/cache compressors, then the output compressor, then the model routers, then what routing actually costs inside Claude Code.&lt;/p&gt;

&lt;h2&gt;
  
  
  Input / cache-line compressors
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Headroom — compressing the input / cache-write line
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Headroom&lt;/strong&gt; (&lt;code&gt;headroomlabs-ai/headroom&lt;/code&gt;, formerly &lt;code&gt;chopratejas/headroom&lt;/code&gt;; Apache-2.0) is a local-first compression &lt;strong&gt;library + proxy + MCP&lt;/strong&gt; for coding agents, and the most deeply-engineered tool in this category. (It's also brand-new and fast-moving — created 2026-01, dozens of commits a day — so treat specifics as a moving target.) Three things make it notable:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Content-type-aware routing&lt;/strong&gt; — it picks a compressor per content type (table below) rather than compressing blindly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reversibility&lt;/strong&gt; — it stores the originals, and the model retrieves the full text on demand, so compression doesn't lose information.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A &lt;code&gt;CacheAligner&lt;/code&gt;&lt;/strong&gt; — it pulls volatile content (dates, IDs) &lt;em&gt;out&lt;/em&gt; of the prefix to keep it stable across calls (one half of composing compression with the cache; see below).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Self-reported results: 47–92% token savings with accuracy held (GSM8K ±0, SQuAD 97% at 19% compression, BFCL tools 97% at 32%) &lt;strong&gt;[vendor self-reported]&lt;/strong&gt;. Independent measurement is far lower: the one rigorous third-party benchmark found &lt;strong&gt;~10% on-wire savings&lt;/strong&gt;, matching Headroom's &lt;em&gt;own&lt;/em&gt; published fleet median of &lt;strong&gt;4.8%&lt;/strong&gt; — an order of magnitude below the headline. &lt;strong&gt;[independent benchmark, 2026-06]&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Headroom's content-type routing &lt;strong&gt;[med]&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Content&lt;/th&gt;
&lt;th&gt;Compressor&lt;/th&gt;
&lt;th&gt;Savings&lt;/th&gt;
&lt;th&gt;Preserves&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;JSON arrays&lt;/td&gt;
&lt;td&gt;SmartCrusher&lt;/td&gt;
&lt;td&gt;70–90%&lt;/td&gt;
&lt;td&gt;keys, UUIDs, booleans&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Search results&lt;/td&gt;
&lt;td&gt;SearchCompressor&lt;/td&gt;
&lt;td&gt;80–95%&lt;/td&gt;
&lt;td&gt;matched lines&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Build/test logs&lt;/td&gt;
&lt;td&gt;LogCompressor&lt;/td&gt;
&lt;td&gt;85–95%&lt;/td&gt;
&lt;td&gt;errors, stack traces&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Source code&lt;/td&gt;
&lt;td&gt;CodeCompressor (AST)&lt;/td&gt;
&lt;td&gt;40–70%&lt;/td&gt;
&lt;td&gt;signatures, imports&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Diffs / HTML / text&lt;/td&gt;
&lt;td&gt;Diff/HTML/Text&lt;/td&gt;
&lt;td&gt;50–80%&lt;/td&gt;
&lt;td&gt;high-entropy tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Images&lt;/td&gt;
&lt;td&gt;trained router (MiniLM/SigLIP)&lt;/td&gt;
&lt;td&gt;40–90%&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;em&gt;The image path routes through a **trained MiniLM/SigLIP classifier&lt;/em&gt;* (&lt;code&gt;chopratejas/technique-router&lt;/code&gt;); HTML is handled by &lt;strong&gt;extraction&lt;/strong&gt; (trafilatura — it drops irrelevant blocks, not tokens), not a lossy crusher.*&lt;/p&gt;

&lt;h4&gt;
  
  
  Two surfaces — where it compresses
&lt;/h4&gt;

&lt;p&gt;Headroom works on two surfaces, and they stand in opposite relationships to the cache — so it's worth separating them (the table's ratios apply to only one of them). They're two &lt;em&gt;insertion points&lt;/em&gt; for the &lt;strong&gt;same&lt;/strong&gt; compressors, picked by how you deploy Headroom — not two passes that both run. &lt;strong&gt;Integrate&lt;/strong&gt; it (MCP/SDK/hooks) and tool outputs shrink at the source; just &lt;strong&gt;proxy&lt;/strong&gt; it (&lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt;) and they're caught later, on the wire. You normally get one, not both.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The tool-result surface — via the MCP server, SDK, or hooks (this is what the table measures).&lt;/strong&gt; When a tool produces a big blob — a 60 KB search result, a noisy build log, a giant JSON array — Headroom compresses &lt;em&gt;that blob before it is ever appended to the transcript&lt;/em&gt;. The agent keeps the original and re-sends only the shrunk form, so the model sees the small version and the prompt cache forms around it. Safe by construction: the bytes shrink &lt;em&gt;before&lt;/em&gt; any cache exists, so there's no cached prefix to bust. This is where Headroom's own docs scope the headline — &lt;em&gt;"70–90% **on tool outputs&lt;/em&gt;&lt;em&gt;."&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The proxy surface — the &lt;code&gt;/v1/messages&lt;/code&gt; body, when you point &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; at Headroom.&lt;/strong&gt; Here the design is more delicate, and it's the catch-point &lt;em&gt;when there was no source-level integration&lt;/em&gt;. Headroom parses each request and finds compressible content in the &lt;strong&gt;live tail&lt;/strong&gt; (the newest messages that aren't cached yet — typically a large tool result that nothing shrank at the source, e.g. a &lt;code&gt;Bash&lt;/code&gt;, &lt;code&gt;WebFetch&lt;/code&gt;, or MCP output), runs it through the same compressors, and swaps in the shrunk form plus a retrievable marker (CCR — see the machinery section). The one thing it is built &lt;em&gt;not&lt;/em&gt; to do is rewrite the &lt;strong&gt;frozen, already-cached prefix&lt;/strong&gt;: those bytes have to stay byte-identical or the 0.1× cache discount is lost. So the intent is simple — &lt;em&gt;shrink the fresh tail, never the cached prefix.&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  When the proxy surface actually pays off — and why that's rarely Claude Code
&lt;/h4&gt;

&lt;p&gt;The proxy path pays off only when there genuinely &lt;em&gt;is&lt;/em&gt; a large, compressible blob sitting in the live tail. That's common for raw-API clients and other agents that dump big tool outputs straight into the request — point them at Headroom and it trims the tail on the way out. &lt;strong&gt;A real Claude Code session is the case where it pays off least&lt;/strong&gt; — measured live, it compressed &lt;strong&gt;0%&lt;/strong&gt; of the conversation — for two structural reasons:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code freezes its tool results into the cached prefix.&lt;/strong&gt; CC places its &lt;code&gt;cache_control&lt;/code&gt; breakpoints over recent activity, so by the time a tool result could be a compression candidate it's already inside the frozen prefix — and Headroom, by design, won't touch frozen bytes (doing so would bust the cache, as the next section shows). The block is forwarded byte-identical.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude Code already offloads big outputs itself.&lt;/strong&gt; CC 2.1.150 saves any large tool output to a disk file and sends only a ~2 KB preview into the body (its &lt;code&gt;&amp;lt;persisted-output&amp;gt;&lt;/code&gt; mechanism), so there's rarely a big blob left in the conversation for a proxy to compress at all.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;So behind Claude Code the conversation body is forwarded effectively unchanged, and Headroom's real savings come from the tool-result surface above: route verbose tool output through its MCP server / hooks and you get the table's ratios; point &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; at it and expect little from the &lt;em&gt;conversation&lt;/em&gt; itself. (A separate standalone Rust proxy in the same repo goes further — it defaults to true byte-faithful passthrough, &lt;code&gt;compression_mode = Off&lt;/code&gt; — but the &lt;code&gt;headroom&lt;/code&gt; CLI never launches it.) The modest &lt;strong&gt;~10% / 4.8%&lt;/strong&gt; independent numbers above fit this picture: most of the real ratio lives on tool outputs, and the cache-busting moves that would chase more were deliberately fenced off — because, as the next section shows, getting proxy compression wrong costs far more than it saves. &lt;strong&gt;[measured, 2026-06]&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Composing compression with the cache
&lt;/h4&gt;

&lt;p&gt;Compression and caching are two separate discounts on the token bill, and the question is whether they &lt;em&gt;stack&lt;/em&gt; or &lt;em&gt;cancel&lt;/em&gt;. Caching is a 90%-off coupon on bytes the server has already seen — a cached prefix is billed at &lt;strong&gt;0.1×&lt;/strong&gt; (a "cache read") instead of full price. Compression just means &lt;em&gt;send fewer bytes&lt;/em&gt;. Whether they &lt;strong&gt;compose&lt;/strong&gt; (both apply) or &lt;strong&gt;fight&lt;/strong&gt; (one voids the other) depends entirely on &lt;em&gt;which&lt;/em&gt; bytes you compress:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;What you do&lt;/th&gt;
&lt;th&gt;Cost of a 100K-token cached prefix&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Fight&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;compress the whole prefix 100K → 20K, rewriting the cached bytes&lt;/td&gt;
&lt;td&gt;those 20K are now &lt;em&gt;new&lt;/em&gt; bytes → cache &lt;strong&gt;miss&lt;/strong&gt; → billed as a 2× write ≈ 40K-equiv. You shrank the text but &lt;strong&gt;voided the coupon&lt;/strong&gt;, and the bill went &lt;em&gt;up&lt;/em&gt; (it was 10K-equiv as a read).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Compose&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;leave the cached span byte-identical, compress only the &lt;em&gt;fresh tail&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;the 100K stays a 0.1× read ≈ 10K-equiv &lt;strong&gt;and&lt;/strong&gt; the new tail is smaller — both discounts apply.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So "composing compression with the cache" means arranging things so compression shrinks tokens &lt;em&gt;and&lt;/em&gt; the cache discount survives. It takes two separate things, and only one of them is the &lt;code&gt;CacheAligner&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A stable prefix&lt;/strong&gt; — no volatile content (dates, IDs) drifting the cached bytes each call. This is the &lt;code&gt;CacheAligner&lt;/code&gt;'s entire job: it moves volatile content to the message tail. &lt;strong&gt;Claude Code already does exactly this&lt;/strong&gt; — the date sits after the system breakpoints and the billing-header token is excluded from the cache key (Act I) — so the &lt;code&gt;CacheAligner&lt;/code&gt; adds little behind CC. It earns its keep for &lt;em&gt;hand-rolled&lt;/em&gt; API clients that would otherwise leave a timestamp or session ID at the top of the system prompt. (In current Headroom the &lt;code&gt;CacheAligner&lt;/code&gt; actually ships &lt;strong&gt;disabled by default and reduced to a detector&lt;/strong&gt; — its rewrite path was removed after it was found to &lt;em&gt;mutate the cache hot zone&lt;/em&gt;, the very thing it was meant to protect.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache-safe compression&lt;/strong&gt; — leave the cached span byte-identical and shrink only the fresh tail; rewrite cached bytes and you trade a 0.1× read for a 2× write (the Act II flip). This is a &lt;em&gt;different&lt;/em&gt; mechanism from the &lt;code&gt;CacheAligner&lt;/code&gt;, and behind Claude Code it's the half that actually matters: get it wrong and the compressor costs more than it saves — usually more than running no compressor at all, since a multi-turn CC session's cached prefix dominates the bill. It's exactly the bug list below.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  The cache-killer trap — serialization drift
&lt;/h4&gt;

&lt;p&gt;The cache is a &lt;strong&gt;byte-exact longest-prefix match&lt;/strong&gt; (Act I): the API extends a cached entry only while the new request's leading bytes are &lt;em&gt;identical&lt;/em&gt;, and re-processes everything from the first differing byte onward. Headroom runs as a local proxy &lt;em&gt;in that path&lt;/em&gt;, between Claude Code and the API — and the hazard of sitting there is that it can change those bytes &lt;strong&gt;without changing the meaning&lt;/strong&gt;, which is enough to miss.&lt;/p&gt;

&lt;p&gt;To see how, follow one request through it. There are three nodes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[A] Claude Code  ──raw bytes──▶  [B] Headroom (proxy)  ──re-emitted bytes──▶  [C] Anthropic API
   (the client)                  (logs / compresses)                          (server + cache)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;A — Claude Code builds the request.&lt;/strong&gt; It serializes the whole request to JSON bytes (&lt;code&gt;model&lt;/code&gt;, &lt;code&gt;tools&lt;/code&gt;, &lt;code&gt;system&lt;/code&gt;, &lt;code&gt;messages&lt;/code&gt;), writing &lt;em&gt;compact&lt;/em&gt; JSON — no space after &lt;code&gt;:&lt;/code&gt; or &lt;code&gt;,&lt;/code&gt; — and raw UTF-8 for non-ASCII. A &lt;code&gt;content&lt;/code&gt; field reaches Headroom as these literal bytes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;"café"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;B — a naive proxy re-emits the request.&lt;/strong&gt; To compress a tool output (or merely to log the body), the natural code parses the incoming bytes into an object, edits its one field, and serializes them back — and a stock encoder is not Claude Code's. Python's default &lt;code&gt;json.dumps()&lt;/code&gt; re-emits the &lt;em&gt;same object&lt;/em&gt; as:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"role"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"user"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"content"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"caf&lt;/span&gt;&lt;span class="se"&gt;\u&lt;/span&gt;&lt;span class="s2"&gt;00e9"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two unrequested changes: a space after every &lt;code&gt;:&lt;/code&gt; and &lt;code&gt;,&lt;/code&gt;, and &lt;code&gt;é&lt;/code&gt; becomes the 6-character escape &lt;code&gt;\u00e9&lt;/code&gt;. Same object, two valid byte encodings — the stock encoder just picked a different one than Claude Code did. Headroom calls this &lt;strong&gt;serialization drift&lt;/strong&gt;, and its &lt;a href="https://github.com/headroomlabs-ai/headroom/blob/main/REALIGNMENT/01-bug-list.md" rel="noopener noreferrer"&gt;&lt;code&gt;REALIGNMENT/01-bug-list.md&lt;/code&gt;&lt;/a&gt; is a 72-item P0–P6 &lt;strong&gt;self-audit&lt;/strong&gt; of every way it can occur — each entry matching a real Anthropic prefix-cache rule. (Headroom has since fixed this in the default path — see &lt;em&gt;What Headroom actually ships&lt;/em&gt; below.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;C — Anthropic looks up the cache.&lt;/strong&gt; Get this step wrong and the proxy flips from saving money to &lt;em&gt;costing&lt;/em&gt; it — a net expense, not a saving. The cache keys on the bytes the server &lt;em&gt;receives&lt;/em&gt; — Headroom's output — and never sees Claude Code's originals at all. So each turn the server matches this request's prefix against the bytes &lt;strong&gt;Headroom itself sent, and the server cached, on the previous turn&lt;/strong&gt; — not against anything Claude Code produced. If Headroom re-serializes &lt;em&gt;byte-identically every turn&lt;/em&gt;, those match and you still hit; the cache simply keys on Headroom's encoding instead of Claude Code's. The bill blows up only when Headroom's output is &lt;strong&gt;not byte-stable from one turn to the next&lt;/strong&gt; — and naive re-serialization makes that the default: &lt;code&gt;json.dumps()&lt;/code&gt; without &lt;code&gt;sort_keys&lt;/code&gt;, &lt;code&gt;set&lt;/code&gt;/&lt;code&gt;dict&lt;/code&gt; iteration order, drifting float formatting, or byte-copying some messages while re-serializing others. Any of those changes a byte near the &lt;strong&gt;top&lt;/strong&gt; of the request, which re-keys the whole prefix after it → a cold write at 2× instead of a &lt;code&gt;cache_read&lt;/code&gt; at 0.1×. (You also pay a one-time cold rebuild the moment Headroom is dropped in front of a cache Claude Code had already warmed with compact bytes — its spaced output matches none of those entries.)&lt;/p&gt;

&lt;p&gt;What that costs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Proxied traffic collapses toward 100% cold writes (0% hit)&lt;/strong&gt; — and not just the message that was touched: the drifted byte sits near the top, so it re-keys everything after it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;10–20× the price&lt;/strong&gt; for the prefix on every turn (0.1× read → 1–2× write), for content the server already had cached. A proxy meant to save money can cost more than it saves.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It is silent&lt;/strong&gt; — the request is valid, so HTTP 200 and correct responses, no error. The only evidence is usage: &lt;code&gt;cache_read&lt;/code&gt; stuck near 0 and &lt;code&gt;cache_creation&lt;/code&gt; high every turn. You find it in the bill, not a stack trace.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each row is a way a re-serializing proxy's output drifts across turns — and each is a real entry in Headroom's own 72-item audit, so you can look it up. The &lt;strong&gt;Headroom bug&lt;/strong&gt; column gives the audit ID (P0 = its &lt;em&gt;"cache-killer smoking guns — every customer affected"&lt;/em&gt; class, the label from when the audit was written):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Drift cause&lt;/th&gt;
&lt;th&gt;What the proxy did&lt;/th&gt;
&lt;th&gt;What changed in the bytes&lt;/th&gt;
&lt;th&gt;Headroom bug&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Separator / escaping drift&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;re-encoded with default &lt;code&gt;json.dumps()&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;,&lt;/code&gt;→&lt;code&gt;,&lt;/code&gt;, &lt;code&gt;:&lt;/code&gt;→&lt;code&gt;:&lt;/code&gt;, raw UTF-8 → &lt;code&gt;\uXXXX&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;P0-2&lt;/strong&gt; (also &lt;strong&gt;P0-1&lt;/strong&gt; — system prompt &lt;code&gt;.strip()&lt;/code&gt; + memory append)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Key-order drift&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;json.dumps()&lt;/code&gt; without &lt;code&gt;sort_keys=True&lt;/code&gt;, or iterated a &lt;code&gt;set&lt;/code&gt;/&lt;code&gt;dict&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;object keys come out in a different order request-to-request&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;P3-28&lt;/strong&gt; (tool array), &lt;strong&gt;P3-29&lt;/strong&gt; (schema keys)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Numeric-precision drift&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;round-tripped a number through a generic JSON value&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;1.0&lt;/code&gt;→&lt;code&gt;1&lt;/code&gt;; integers above 2⁵³ lose low digits&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;P0-5&lt;/strong&gt; (also &lt;strong&gt;P4-46&lt;/strong&gt; — missing &lt;code&gt;serde_json&lt;/code&gt; &lt;code&gt;arbitrary_precision&lt;/code&gt;/&lt;code&gt;raw_value&lt;/code&gt;)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Re-emitting unchanged history&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;re-serialized retained messages during compression instead of byte-copying them&lt;/td&gt;
&lt;td&gt;even "untouched" history shifts byte-for-byte&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;P1-13&lt;/strong&gt; (also &lt;strong&gt;P0-3&lt;/strong&gt; ignored &lt;code&gt;cache_control&lt;/code&gt;; &lt;strong&gt;P0-4&lt;/strong&gt; ICM dropped hot-zone messages)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;All four are the silent invalidators from Act I, now seen from the proxy author's chair. They collapse to &lt;strong&gt;one rule: forward the original request bytes verbatim.&lt;/strong&gt; If you must mutate, do surgical byte-fragment replacement on the &lt;code&gt;messages&lt;/code&gt; tail only — never parse-and-re-emit the whole envelope — and confirm &lt;code&gt;cache_read&lt;/code&gt; is unchanged on the next request. A compressor that ignores this loses far more to cold rewrites than it ever saves in tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What Headroom actually ships (v0.27.0).&lt;/strong&gt; It now does exactly what the ✅ Fix below prescribes — which is why the live measurement earlier showed it forwarding Claude Code's bytes unchanged. Its forwarder defaults to &lt;strong&gt;byte-faithful&lt;/strong&gt;: an unmutated body is forwarded &lt;em&gt;verbatim&lt;/em&gt; (the original bytes), and a mutated one is re-serialized &lt;strong&gt;once&lt;/strong&gt;, canonically, to &lt;strong&gt;compact UTF-8 that matches Claude Code's own encoding&lt;/strong&gt; (no added spaces; non-ASCII stays raw, not escaped) — so the headline drift (the &lt;code&gt;café&lt;/code&gt; case, audit item &lt;strong&gt;P0-2&lt;/strong&gt;) is gone by default. The naive &lt;code&gt;json.dumps&lt;/code&gt; path survives only as an explicit emergency rollback (&lt;code&gt;HEADROOM_PROXY_PYTHON_FORWARDER_MODE=legacy_json_kwarg&lt;/code&gt;), pinned by a regression test (&lt;code&gt;test_proxy_byte_faithful_forwarding.py&lt;/code&gt;). So read the 72-item audit as a catalog of &lt;em&gt;ways a re-serializing proxy bites itself&lt;/em&gt;: the worst (P0-2) is fixed by the byte-faithful default; the rest are assigned fixes and worked through release by release. It's the cleanest proof the hazard is real and worth engineering against.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — adding a "compression" or "logging" proxy that re-serializes JSON.&lt;/strong&gt; A re-encode stays cache-safe only if it reproduces the exact same bytes on &lt;em&gt;every&lt;/em&gt; turn — and naive code never does (separators, escaping, key order, float formatting, or mixing byte-copied and re-serialized messages all drift). In practice the hit rate falls to ~0 for &lt;em&gt;all&lt;/em&gt; proxied traffic. This is the most expensive proxy mistake there is.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Forward the original request bytes verbatim. Mutate only by surgical byte-fragment replacement on the &lt;code&gt;messages&lt;/code&gt; tail, never by parse-and-re-emit of the whole envelope. Verify with &lt;code&gt;cache_read&lt;/code&gt; after the proxy is inserted.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  Two more ways a proxy quietly costs you
&lt;/h4&gt;

&lt;p&gt;Independent of compression, just from pointing &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; at a custom host:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool schemas get inlined.&lt;/strong&gt; Claude Code can't assume a custom endpoint supports tool-search deferral, so it &lt;strong&gt;eagerly materializes every tool schema into the context window&lt;/strong&gt; — the proxy you added to save tokens silently adds thousands. Fix: &lt;code&gt;ENABLE_TOOL_SEARCH=true&lt;/code&gt; (or &lt;code&gt;headroom wrap claude&lt;/code&gt;, which sets it). [Headroom troubleshooting docs + issue #753]&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The 1M-context beta header gets dropped.&lt;/strong&gt; Behind a custom base URL, Claude Code can drop the &lt;code&gt;context-1m-2025-08-07&lt;/code&gt; beta and &lt;strong&gt;silently cap you at 200K context&lt;/strong&gt;; the workaround is the &lt;code&gt;[1m]&lt;/code&gt; model suffix (e.g. &lt;code&gt;claude-opus-4-8[1m]&lt;/code&gt;). [issue #1158]&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  &lt;code&gt;ReadLifecycleManager&lt;/code&gt; — collapsing stale and superseded reads
&lt;/h4&gt;

&lt;p&gt;The biggest single source of waste in a coding session is &lt;strong&gt;old file reads piling up in replayed history&lt;/strong&gt;, and this is the transform built to attack it. It's worth a closer look — partly because it clears up a common misconception.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A block's bytes never change — its &lt;em&gt;relevance&lt;/em&gt; does.&lt;/strong&gt; A &lt;code&gt;tool_result&lt;/code&gt; is immutable once produced: the 400 lines of &lt;code&gt;worker.py&lt;/code&gt; you read at step 1 are those bytes forever. What &lt;code&gt;ReadLifecycleManager&lt;/code&gt; reacts to is a &lt;em&gt;later&lt;/em&gt; message that makes the earlier read obsolete:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;STALE&lt;/strong&gt; — a later &lt;code&gt;Edit&lt;/code&gt;/&lt;code&gt;Write&lt;/code&gt; modified the same file &lt;em&gt;after&lt;/em&gt; it was read, so the old read no longer matches what's on disk.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SUPERSEDED&lt;/strong&gt; — a later &lt;code&gt;Read&lt;/code&gt; of the same file, so the earlier read is redundant; only the newest copy matters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So it isn't "a block's content changed." It's "block #3 became dead weight because block #8 came along" — the manager rewrites the &lt;em&gt;old&lt;/em&gt; block based on &lt;em&gt;newer&lt;/em&gt; blocks further down the history. (Its own notes put roughly two-thirds of Read-output bytes going stale and another eighth superseded — about three-quarters dead weight that's still re-sent every turn.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this happens constantly: Claude sends many tool calls per turn.&lt;/strong&gt; One thing you type is &lt;em&gt;not&lt;/em&gt; one API request. Claude Code runs an agentic loop — call a tool, read the result, call another — often 10–50 round trips before the final answer, and &lt;strong&gt;each &lt;code&gt;tool_use → tool_result&lt;/code&gt; round trip is a separate &lt;code&gt;POST /v1/messages&lt;/code&gt; that replays the entire conversation so far&lt;/strong&gt;. By request #20 the message array holds all 19 prior reads, edits, and bash outputs stacked up — which is exactly where stale and superseded reads accumulate. (When the sections above said "the live tail," this loop is what fills it.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Worked example.&lt;/strong&gt; You type one prompt:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"Rename &lt;code&gt;process()&lt;/code&gt; to &lt;code&gt;handle()&lt;/code&gt; in &lt;code&gt;src/worker.py&lt;/code&gt; and fix all the callers."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Claude's loop — each row is a separate API request that replays everything above it:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Step&lt;/th&gt;
&lt;th&gt;Tool call&lt;/th&gt;
&lt;th&gt;What lands in history&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Read src/worker.py&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;block A&lt;/strong&gt;: full 400-line file (~1,500 tok)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Grep "process\("&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;12 call sites&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Edit src/worker.py&lt;/code&gt; (the def)&lt;/td&gt;
&lt;td&gt;edit confirmation — file on disk now changed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Edit src/worker.py&lt;/code&gt; (a caller)&lt;/td&gt;
&lt;td&gt;another edit&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;Read src/worker.py&lt;/code&gt; (re-read to verify)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;block E&lt;/strong&gt;: updated 400-line file (~1,500 tok)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;&lt;code&gt;Bash: pytest&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;test output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;… final answer …&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;By the request for step 6, the replayed history contains &lt;strong&gt;both block A and block E&lt;/strong&gt; — the same file, read twice. Block A is now &lt;strong&gt;STALE&lt;/strong&gt; (the file was edited at steps 3–4, &lt;em&gt;after&lt;/em&gt; it was read) — and also &lt;strong&gt;SUPERSEDED&lt;/strong&gt; (re-read at step 5). Either way it's dead weight, since block E is the current truth. The manager collapses block A to a one-liner via the &lt;strong&gt;stale&lt;/strong&gt; path and stores the original for retrieval:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[Read content stale: src/worker.py was modified after this read — re-read the file for current content. Retrieve original: hash=ab12…]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;~1,500 tokens → ~20 — and because block A would otherwise ride along in &lt;em&gt;every&lt;/em&gt; remaining request of the loop (steps 6, 7, …), you don't save it once, you save it on every remaining round trip.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why "stale," not "superseded"?&lt;/strong&gt; Headroom ships &lt;strong&gt;stale&lt;/strong&gt; collapse on by default but &lt;strong&gt;superseded&lt;/strong&gt; collapse &lt;em&gt;off&lt;/em&gt; — collapsing a pure re-read (one with no intervening edit) is riskier, because the earlier copy usually sits in the cached prefix and rewriting it would bust the cache. So a re-read &lt;em&gt;alone&lt;/em&gt; is left in place by default; it's the &lt;em&gt;edit&lt;/em&gt; that makes block A provably safe to drop. (One more clever lever shipped deliberately conservative for cache safety — see the pattern below.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When the collapse actually fires (the cache tie-in).&lt;/strong&gt; It can only rewrite block A while A is both obsolete &lt;em&gt;and&lt;/em&gt; still in the mutable tail. Because the cached prefix grows as the loop runs, two regimes appear — and this is why so little of it shows up behind Claude Code:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;A is still in the live tail&lt;/strong&gt; → collapsed on the spot.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A is already in the frozen prefix&lt;/strong&gt; → skipped, to protect the cache (rewriting it would trade a 0.1× read for a 2× write). It's reclaimed only once that part of the cache turns over — and since Claude Code rides a 1-hour TTL that refreshes on every read (Act I), in practice that means a &lt;em&gt;new session&lt;/em&gt; or a long idle gap, not mid-loop.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the canonical trigger is the everyday &lt;strong&gt;read → edit → re-read&lt;/strong&gt; loop Claude runs constantly — but, like every proxy-surface lever here, behind Claude Code it's gated by the same frozen-prefix reality from the disclaimer above, so it mostly fires across turns rather than within one.&lt;/p&gt;

&lt;h4&gt;
  
  
  Inside the proxy — the rest of the machinery
&lt;/h4&gt;

&lt;p&gt;You don't need any of this to use Headroom, but it explains what you're running and where the cost levers really are. Most of it is built for cases &lt;em&gt;other&lt;/em&gt; than a live Claude Code conversation (one-shot calls, MCP tool output, other agents) — which is exactly why so little of it fires on the wire behind CC.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;proxy&lt;/code&gt; vs &lt;code&gt;wrap&lt;/code&gt; — same compression, different wiring.&lt;/strong&gt; Both run the same server with the same compression defaults. &lt;code&gt;headroom proxy&lt;/code&gt; is just the compression layer — &lt;em&gt;you&lt;/em&gt; point Claude Code at it. &lt;code&gt;headroom wrap claude&lt;/code&gt; adds the integration: it sets &lt;code&gt;ENABLE_TOOL_SEARCH=true&lt;/code&gt;, writes &lt;code&gt;.claude/settings.local.json&lt;/code&gt;, downloads and wires &lt;strong&gt;RTK&lt;/strong&gt;, adds a project header, and gives you a clean &lt;code&gt;unwrap&lt;/code&gt;. A common misconception is that "&lt;code&gt;wrap&lt;/code&gt; compresses harder." It doesn't — &lt;code&gt;wrap&lt;/code&gt; is convenience + RTK; the compression aggressiveness is identical, and the aggressive &lt;code&gt;agent-90&lt;/code&gt; savings profile is &lt;strong&gt;opt-in in both&lt;/strong&gt;. So the headline "60–95%" is a ceiling you choose, not out-of-the-box CC behavior.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Subscription vs PAYG — what "saving" even means.&lt;/strong&gt; Headroom classifies the client by &lt;strong&gt;User-Agent&lt;/strong&gt;: Claude Code (&lt;code&gt;claude-code/&lt;/code&gt;, &lt;code&gt;claude-cli/&lt;/code&gt;) is treated as a &lt;strong&gt;subscription&lt;/strong&gt; and gets the &lt;em&gt;conservative&lt;/em&gt; policy; an API/PAYG client gets the aggressive one. This matters because on a &lt;strong&gt;subscription (Pro/Max)&lt;/strong&gt; the bill is flat — token savings don't reduce a per-token charge, they buy &lt;strong&gt;context headroom&lt;/strong&gt; (fewer auto-compactions, longer sessions) and help you &lt;strong&gt;stay under the 5-hour / weekly caps&lt;/strong&gt;. On &lt;strong&gt;PAYG / Bedrock / Vertex&lt;/strong&gt;, the same savings are &lt;strong&gt;real dollars&lt;/strong&gt;. Same tool, two different definitions of "win."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The break-even math — &lt;code&gt;net_mutation_gain&lt;/code&gt;.&lt;/strong&gt; Headroom has a formula for &lt;em&gt;when it's worth rewriting cached content&lt;/em&gt;: &lt;code&gt;gain = ΔT·(w + r·(R−1)) − P_alive·(w−r)·(S+ΔT)&lt;/code&gt;, mutate only if &lt;code&gt;gain &amp;gt; 0&lt;/code&gt; (ΔT = tokens removed, S = cached suffix below the edit, R = expected remaining reads, P_alive = chance the cache is still warm, w = 1.25 write, r = 0.1 read). The intuition is the whole Act II flip in one line: &lt;strong&gt;editing a message re-writes the entire cached suffix beneath it&lt;/strong&gt;, so shaving 2K off a message sitting on a 50K cached suffix needs ~&lt;strong&gt;288&lt;/strong&gt; future reads to break even (never worth it), while shaving 50K &lt;em&gt;above&lt;/em&gt; a 10K suffix breaks even in ~&lt;strong&gt;2.3&lt;/strong&gt; reads (worth it now). Crucially, this gate ships &lt;strong&gt;off&lt;/strong&gt; by default and — together with a frozen-prefix guard — is &lt;em&gt;why&lt;/em&gt; Headroom leaves your cached prefix alone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A real gap: the 1-hour-TTL write price.&lt;/strong&gt; That formula hardcodes the write multiplier at &lt;strong&gt;&lt;code&gt;1.25&lt;/code&gt;&lt;/strong&gt; — the &lt;em&gt;5-minute&lt;/em&gt; TTL price. There is &lt;strong&gt;no&lt;/strong&gt; branch that reads a request's &lt;code&gt;cache_control.ttl&lt;/code&gt; and switches it to &lt;strong&gt;&lt;code&gt;2.0&lt;/code&gt;&lt;/strong&gt; for a 1-hour cache. If a session uses the 1h tier, the model &lt;strong&gt;understates&lt;/strong&gt; the re-write penalty, biasing any cache-break decision toward "too eager." Mostly latent today (the gate is off), but a genuine correctness gap worth knowing if you ever enable aggressive mode.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What's protected from compression.&lt;/strong&gt; Exact-output tools (&lt;code&gt;Bash&lt;/code&gt;, &lt;code&gt;WebFetch&lt;/code&gt;) are kept &lt;strong&gt;verbatim&lt;/strong&gt; — lossy compression there would corrupt meaning — and the structural tools (&lt;code&gt;Read/Glob/Grep/Write/Edit&lt;/code&gt;) are excluded from blind crushing. Only "safe" content types (JSON arrays, logs, search results) get lossy treatment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reversibility (CCR), concretely.&lt;/strong&gt; When a compressor drops content it caches the original locally, keyed by a SHA-256 hash (24 hex chars), and leaves a marker like &lt;code&gt;[201 items compressed to 20. Retrieve more: hash=fa88…]&lt;/code&gt;. If the model needs the detail it calls the &lt;strong&gt;&lt;code&gt;headroom_retrieve&lt;/code&gt;&lt;/strong&gt; MCP tool (BM25-filterable). Default TTL ~30 min; the retrieve/compress endpoints are &lt;strong&gt;loopback-only&lt;/strong&gt;, so dropped data can't be reached off-box. This is why "same answers, fraction of the tokens" is defensible — the data is offloaded, not gone.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Levers that cut turns/output, not just bytes.&lt;/strong&gt; &lt;code&gt;headroom learn&lt;/code&gt; mines &lt;code&gt;~/.claude/projects/*.jsonl&lt;/code&gt; for recurring error→recovery patterns (wrong path → right path) and writes a correction into &lt;strong&gt;&lt;code&gt;CLAUDE.local.md&lt;/code&gt;&lt;/strong&gt; so the next session gets it right the first time — fewer wasted turns. A &lt;strong&gt;verbosity shaper&lt;/strong&gt; appends a deterministic, idempotent terseness instruction (the same &lt;em&gt;output&lt;/em&gt; line Caveman attacks, done at the proxy). &lt;strong&gt;RTK&lt;/strong&gt; (&lt;code&gt;rtk-ai/rtk&lt;/code&gt;) filters/compacts verbose shell-command output at the source — &lt;code&gt;wrap&lt;/code&gt;-only, installed as a Claude Code PreToolUse hook, downloaded on demand (not bundled; details under &lt;em&gt;Other compressors&lt;/em&gt; below). And a cross-agent shared-context store dedups the handoff payload when work moves Claude → Codex → Gemini.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How to check it's actually saving (don't trust, verify).&lt;/strong&gt; The proxy logs a &lt;code&gt;PERF&lt;/code&gt; line per request (&lt;code&gt;tokens_before / tokens_after / tokens_saved&lt;/code&gt; + cache metrics); the analyzer prices saved tokens at the &lt;strong&gt;cache-read&lt;/strong&gt; rate when the cache was hit, so it won't overstate savings on a cached prefix. &lt;code&gt;headroom perf&lt;/code&gt; / &lt;code&gt;headroom gain&lt;/code&gt; report measured % against the target. That the accounting is itself cache-aware is the clearest sign the tool respects the one rule that governs this whole act.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Other input/cache &amp;amp; hosted compressors
&lt;/h3&gt;

&lt;p&gt;Same cost line as Headroom (input / cache-write), shipped as point tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Microsoft LLMLingua&lt;/strong&gt; (&lt;code&gt;microsoft/LLMLingua&lt;/code&gt;) — prompt compression up to 20× at ~1.5pt GSM8K loss; &lt;strong&gt;LongLLMLingua&lt;/strong&gt; improves RAG retrieval up to 21.4% using ¼ the tokens. ⚠️ Degrades past ~20–25× and &lt;strong&gt;busts the prefix cache unless the prefix is re-stabilized&lt;/strong&gt; afterward. Best for one-shot / RAG context, &lt;em&gt;not&lt;/em&gt; a cached conversational prefix. &lt;strong&gt;[high]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RTK&lt;/strong&gt; ("Rust Token Killer", &lt;code&gt;rtk-ai/rtk&lt;/code&gt;) — a standalone Rust CLI that &lt;strong&gt;wraps a command and filters/compacts its output&lt;/strong&gt; before it reaches the model (&lt;code&gt;rtk ls&lt;/code&gt; / &lt;code&gt;git&lt;/code&gt; / &lt;code&gt;test&lt;/code&gt; / &lt;code&gt;diff&lt;/code&gt; / &lt;code&gt;grep&lt;/code&gt; / &lt;code&gt;pytest&lt;/code&gt; …, 50+ subcommands). Savings are real but uneven — measured &lt;strong&gt;~60%&lt;/strong&gt; on &lt;code&gt;ls&lt;/code&gt;, &lt;strong&gt;~85%&lt;/strong&gt; bytes on &lt;code&gt;git log&lt;/code&gt; (one line per commit), but &lt;strong&gt;0%&lt;/strong&gt; when no filter matches (it passes through). Headroom downloads it on demand (pinned &lt;code&gt;v0.42.4&lt;/code&gt;, over TLS, &lt;strong&gt;no checksum&lt;/strong&gt;) and installs it &lt;strong&gt;&lt;code&gt;wrap&lt;/code&gt;-only&lt;/strong&gt; as a Claude Code &lt;strong&gt;PreToolUse hook&lt;/strong&gt; (&lt;code&gt;rtk init&lt;/code&gt;), so it sits &lt;em&gt;outside&lt;/em&gt; the &lt;code&gt;/v1/messages&lt;/code&gt; proxy path — it shrinks tool output &lt;strong&gt;at the source&lt;/strong&gt;, the productized version of the preprocessing-hook technique. (It's the same &lt;code&gt;rtk&lt;/code&gt; you'd install standalone; savings are tracked via &lt;code&gt;rtk gain&lt;/code&gt;, and it only auto-applies once the hook is installed.) Good when tool outputs are the bloat. &lt;strong&gt;[med — savings partly measured 2026-06]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;lean-ctx&lt;/strong&gt; (&lt;code&gt;yvgude/lean-ctx&lt;/code&gt;) — a CLI/MCP context trimmer. &lt;strong&gt;[med]&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Hosted &amp;amp; provider-native (text leaves your box; non-reversible):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OpenAI Compaction&lt;/strong&gt; — provider-native conversation-history summarization. &lt;strong&gt;[med]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compresr.ai&lt;/strong&gt;, &lt;strong&gt;The Token Company&lt;/strong&gt; — hosted compression APIs. &lt;strong&gt;[low]&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — running an aggressive input compressor over a cached conversational prefix.&lt;/strong&gt; Past ~20× compression accuracy degrades, and any rewrite of already-cached bytes flips reads to writes. You can spend more on rebuilds than you save on tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Use input compressors where there &lt;em&gt;is&lt;/em&gt; no cache to lose (one-shot calls, RAG context assembly) or only on the fresh tail. On a cached prefix, prefer a cache-aware approach: compress only the uncached tail and keep the cached span byte-identical (the discipline Headroom's frozen-prefix handling is built around).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Output-line compressor
&lt;/h2&gt;

&lt;p&gt;Output is the most expensive class and is &lt;strong&gt;never cached at generation&lt;/strong&gt;, so compressing it never &lt;em&gt;busts&lt;/em&gt; the prefix cache — and it actually helps it: because each response is cache-written as input on the next turn (see &lt;em&gt;What the cache stores&lt;/em&gt;), shrinking the output also shrinks every downstream cache-write and cache-read it would have become. So the win compounds — 5× saved on the output now, plus smaller writes/reads on every later turn — with no risk to the cached prefix. This makes it, alongside sticky routing, one of the two safest levers in the entire ecosystem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Caveman — compressing the priciest line
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Caveman&lt;/strong&gt; (&lt;code&gt;JuliusBrussee/caveman&lt;/code&gt;, &lt;code&gt;dlepold/caveman-distillate&lt;/code&gt;, and forks) is a Claude Code &lt;strong&gt;skill&lt;/strong&gt; that instructs the model to answer in telegraphic, low-grammar prose — &lt;em&gt;"why use many token when few token do trick"&lt;/em&gt;: short sentences, no filler, dropped articles. Community-reported &lt;strong&gt;~65% output-token reduction&lt;/strong&gt;, corroborated across multiple independent repos (repo-benchmarked at ~65%, range 22–87% across 10 prompts). Best for internal/agent-facing output where terseness is fine; the trade-off is human readability. &lt;strong&gt;[community-reported]&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Caveman's savings come from &lt;em&gt;prompt injection&lt;/em&gt; — the same context-shaping that governs caching — and its central engineering problem (keeping an instruction alive as context compression prunes it) is the flip side of the prefix-stability story from Acts I–II. Because the tool is a master class in &lt;em&gt;how a Claude Code plugin actually works&lt;/em&gt;, the full anatomy is worth studying.&lt;/p&gt;

&lt;h4&gt;
  
  
  Caveman anatomy — what it actually is
&lt;/h4&gt;

&lt;p&gt;Caveman is &lt;strong&gt;not a model or a service&lt;/strong&gt; — it's a prompt-injection + distribution system. One idea: inject a compression ruleset into a coding agent's context so it writes terse "caveman-style" prose (drop articles, filler, hedging; keep every piece of technical substance), cutting ~65% of output tokens at full accuracy.&lt;/p&gt;

&lt;p&gt;The repo is two codebases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The behavior&lt;/strong&gt; — ~400 lines of Markdown prompt. This is the actual IP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The delivery machinery&lt;/strong&gt; — ~5,800 lines of Node that detects 30+ installed agents and wires the behavior into each one's incompatible config.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The npm package is &lt;code&gt;caveman-installer&lt;/code&gt;; the shipped artifact is the &lt;strong&gt;installer&lt;/strong&gt;, not a model. Maintainers edit only the source-of-truth files (&lt;code&gt;skills/*/SKILL.md&lt;/code&gt;, &lt;code&gt;agents/cavecrew-*.md&lt;/code&gt;, &lt;code&gt;src/rules/*.md&lt;/code&gt;); CI mirrors them into the plugin distribution and rebuilds the release ZIP.&lt;/p&gt;

&lt;h4&gt;
  
  
  The behavior layer (the prompt)
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;skills/caveman/SKILL.md&lt;/code&gt; is the single source of truth for behavior:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The frontmatter description doubles as the trigger surface&lt;/strong&gt; — phrases like "talk like caveman", "less tokens", "be brief" are listed so semantic skill-matching auto-activates the skill.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persistence&lt;/strong&gt; fights the real failure mode: models drift back to verbose after several turns or after context compression prunes the instruction. The rules repeatedly assert &lt;em&gt;"ACTIVE EVERY RESPONSE."&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Six intensity levels:&lt;/strong&gt; &lt;code&gt;lite&lt;/code&gt;, &lt;code&gt;full&lt;/code&gt; (default), &lt;code&gt;ultra&lt;/code&gt;, plus three classical-Chinese &lt;code&gt;wenyan-*&lt;/code&gt; variants, each with worked examples.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-clarity safety valve:&lt;/strong&gt; it drops back to normal prose for security warnings, irreversible-action confirmations, and any case where dropping articles/conjunctions would make a multi-step instruction ambiguous — then resumes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Language preservation:&lt;/strong&gt; compress the &lt;em&gt;style&lt;/em&gt;, not the &lt;em&gt;language&lt;/em&gt; (Portuguese in → Portuguese caveman out); code, API names, commit-type keywords, and error strings stay verbatim.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;cavecrew&lt;/strong&gt; is a separate play: three subagent presets (&lt;code&gt;cavecrew-investigator/builder/reviewer&lt;/code&gt;) that emit caveman output. The win here is &lt;strong&gt;main-context longevity, not output cost&lt;/strong&gt; — a subagent's result is injected verbatim into the main thread, so a 2k-token prose result costs 2k of main-context budget; the caveman version returns ~700. Each has a strict, greppable output contract (e.g. investigator: &lt;code&gt;path:line — symbol — note&lt;/code&gt;).&lt;/p&gt;

&lt;h4&gt;
  
  
  The runtime layer — three stateless hooks, one flag file
&lt;/h4&gt;

&lt;p&gt;Claude Code spawns a fresh process per hook event and keeps no plugin code resident, so the hooks share no memory — they coordinate through one ~12-byte file, &lt;code&gt;$CLAUDE_CONFIG_DIR/.caveman-active&lt;/code&gt; (contents = the active mode string):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; SessionStart            UserPromptSubmit          Statusline
 caveman-activate.js     caveman-mode-tracker.js   caveman-statusline.sh/.ps1
       │                        │                         │
       └──── writes mode ───►  .caveman-active  ◄── reads ┘ (read-only)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SessionStart → &lt;code&gt;caveman-activate.js&lt;/code&gt;&lt;/strong&gt; (fires once): resolves + persists the mode; injects the ruleset as hidden system context (SessionStart stdout &lt;em&gt;is&lt;/em&gt; system context the model sees but the user doesn't); reads &lt;code&gt;SKILL.md&lt;/code&gt; at runtime and filters the 6-level table down to the active level only (saves injected tokens, avoids confusing the model with inactive levels); and appends a statusline nudge if none is configured (how a plugin, which can't write &lt;code&gt;settings.json&lt;/code&gt;, bootstraps its badge).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;UserPromptSubmit → &lt;code&gt;caveman-mode-tracker.js&lt;/code&gt;&lt;/strong&gt; (every message): natural-language activation (below); &lt;code&gt;/caveman-stats&lt;/code&gt; interception (blocks the prompt and returns stats via &lt;code&gt;{decision:"block"}&lt;/code&gt; — a zero-token slash command); slash-command parsing; NL deactivation (unlink the flag); and &lt;strong&gt;per-turn reinforcement&lt;/strong&gt; — a one-line re-anchor every turn, because SessionStart's full ruleset gets pruned by context compression and competing plugins inject their own style each turn.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Statusline → &lt;code&gt;caveman-statusline.sh/.ps1&lt;/code&gt;&lt;/strong&gt; (read-only, frequent): never shells out to node; reads the flag (≤64 bytes), lowercases, strips anything outside &lt;code&gt;[a-z0-9-]&lt;/code&gt;, whitelist-matches a known mode, and renders &lt;code&gt;[CAVEMAN:MODE]&lt;/code&gt; plus a savings suffix.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The security spine — &lt;code&gt;caveman-config.js&lt;/code&gt;.&lt;/strong&gt; All hook filesystem I/O funnels through one module because the flag path is predictable and user-writable — the canonical local-attacker setup. &lt;code&gt;safeWriteFlag&lt;/code&gt; uses &lt;code&gt;O_NOFOLLOW&lt;/code&gt; + atomic temp-write + rename + &lt;code&gt;0600&lt;/code&gt;, and refuses if the flag itself is a symlink (the clobber vector) while still tolerating a legitimately symlinked parent dir (resolve, then verify uid ownership). &lt;code&gt;readFlag&lt;/code&gt; is symmetric (symlink-refuse, 64-byte cap, &lt;code&gt;VALID_MODES&lt;/code&gt; whitelist → returns null on any anomaly). Without this, an attacker could symlink the flag to &lt;code&gt;~/.ssh/id_rsa&lt;/code&gt;, or fill it with ANSI/OSC escapes the statusline would render on every keystroke; the whitelist guarantees the only renderable outputs are the ~11 known mode strings. &lt;code&gt;src/hooks/package.json&lt;/code&gt; pins &lt;code&gt;{"type":"commonjs"}&lt;/code&gt; so the hooks load even when an ancestor &lt;code&gt;package.json&lt;/code&gt; declares &lt;code&gt;"type":"module"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;"Off" is represented by &lt;em&gt;file absence&lt;/em&gt;: deactivation unlinks the flag, and readers treat "missing" and "unreadable" identically to inactive — failing safe toward off, never toward leaking content.&lt;/p&gt;

&lt;h4&gt;
  
  
  Natural-language activation
&lt;/h4&gt;

&lt;p&gt;The matcher in &lt;code&gt;caveman-mode-tracker.js&lt;/code&gt; has three OR'd branches: &lt;strong&gt;(A)&lt;/strong&gt; verb-before-noun (&lt;code&gt;activate|enable|turn on|start|talk like … caveman&lt;/code&gt;), &lt;strong&gt;(B)&lt;/strong&gt; noun-before-keyword (&lt;code&gt;caveman … mode|activate|enable|…&lt;/code&gt;), and &lt;strong&gt;(C)&lt;/strong&gt; brevity requests with no "caveman" at all (&lt;code&gt;less tokens|fewer tokens|be brief|be terse|shorter answers&lt;/code&gt;). A negative guard (&lt;code&gt;!/stop|disable|turn off|deactivate/&lt;/code&gt;) is load-bearing: "turn off caveman mode" matches Branch B, but the guard suppresses the write so a deactivation request can't accidentally re-arm the mode.&lt;/p&gt;

&lt;p&gt;Two deliberate asymmetries: activation &lt;strong&gt;never carries a level&lt;/strong&gt; (it always resolves the default — "talk like caveman ultra" does not give ultra; in an ultra-pinned repo, "be brief" silently activates ultra), and it &lt;strong&gt;respects &lt;code&gt;off&lt;/code&gt; as a configured default&lt;/strong&gt; (even "talk like caveman" writes nothing if the configured default is off — configured intent beats an ad-hoc phrase). Firing order is fixed — NL activation → stats → slash → NL deactivation → reinforcement — and because deactivation runs last and unconditionally deletes, it &lt;strong&gt;always wins on conflict&lt;/strong&gt; (fail-safe toward off).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Verified gap:&lt;/strong&gt; the opencode port (&lt;code&gt;src/plugins/opencode/plugin.js&lt;/code&gt;) reimplements the logic but has &lt;strong&gt;no Branch C&lt;/strong&gt; — "less tokens" / "be brief" do not activate caveman under opencode, a genuine inconsistency with the Claude Code hook and the README promise. A clear fix-or-document candidate.&lt;/p&gt;

&lt;h4&gt;
  
  
  Why a SessionStart hook, not &lt;code&gt;--append-system-prompt&lt;/code&gt;?
&lt;/h4&gt;

&lt;p&gt;This is a useful design lesson about Claude Code's injection paths. &lt;code&gt;--append-system-prompt&lt;/code&gt; fails Caveman's constraints; a SessionStart hook satisfies them:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A plugin cannot set it.&lt;/strong&gt; It's a CLI/SDK launch flag; &lt;code&gt;plugin.json&lt;/code&gt; has no &lt;code&gt;systemPrompt&lt;/code&gt;/&lt;code&gt;appendSystemPrompt&lt;/code&gt; field (the keys are &lt;code&gt;name&lt;/code&gt;/&lt;code&gt;description&lt;/code&gt;/&lt;code&gt;author&lt;/code&gt;/&lt;code&gt;hooks&lt;/code&gt;). SessionStart stdout-as-context is the only plugin-declarable always-on injection path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The content is dynamic.&lt;/strong&gt; &lt;code&gt;caveman-activate.js&lt;/code&gt; computes the injection each session (mode resolution, &lt;code&gt;SKILL.md&lt;/code&gt; read, level filtering, conditional statusline nudge); a static launch string can't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Always-on, zero per-launch friction.&lt;/strong&gt; &lt;code&gt;--append-system-prompt&lt;/code&gt; is per-invocation; a SessionStart hook fires forever after one install.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Mid-session switching needs a hook anyway.&lt;/strong&gt; A launch-time string is frozen for the process; Caveman already needs UserPromptSubmit for switching + reinforcement, so SessionStart is the natural sibling on one flag-file mechanism.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single source of truth.&lt;/strong&gt; The hook reads &lt;code&gt;SKILL.md&lt;/code&gt; live; a static append would drift.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The honest counterpoint: &lt;code&gt;--append-system-prompt&lt;/code&gt; sits &lt;em&gt;higher&lt;/em&gt; in precedence (it's appended to the real system prompt; SessionStart stdout is a lower-precedence system reminder), so a static append would be stickier in principle. Caveman gives that up and compensates with per-turn reinforcement — which arguably survives context compression &lt;em&gt;better&lt;/em&gt; because it's continuously re-injected. Net: the SessionStart hook wins not as a better injection primitive in the abstract, but as the only one that is plugin-distributable, dynamic, persistent, and switchable.&lt;/p&gt;

&lt;h4&gt;
  
  
  Cross-cutting lessons from Caveman
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Deliberately lopsided&lt;/strong&gt; — trivial core behavior, heavily over-engineered distribution. That's the &lt;em&gt;correct&lt;/em&gt; trade: the hard problem isn't making the model terse (one paragraph does that), it's reliably injecting that paragraph into 30 agents with incompatible configs, idempotently, on every OS, without clobbering user config or opening a symlink attack, and keeping it loaded as context compresses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Issue-driven design&lt;/strong&gt; — nearly every non-obvious line traces to a numbered GitHub issue (Windows quoting, hook double-firing, orphaned hooks, opencode hook keys), not speculation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Honest measurement&lt;/strong&gt; — the eval harness compares skill-output vs &lt;em&gt;plain-terse&lt;/em&gt; output (not vs baseline, which would conflate the skill with generic terseness), and stats report savings only for the benchmarked &lt;code&gt;full&lt;/code&gt; mode.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Model routers — attacking the model-tier line
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;model router&lt;/strong&gt; looks at each turn and sends it to a different model based on how hard the turn looks — easy turns to a cheap model, hard turns to a strong one — to lower the bill.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;RouteLLM&lt;/strong&gt; (UC Berkeley LMSYS, &lt;code&gt;lm-sys/RouteLLM&lt;/code&gt;) — a trained router; reports ~85% cost reduction at ~95% of GPT-4 quality on MT Bench (45% on MMLU, 35% on GSM8K). The figures are GPT-4-era and benchmark-specific. &lt;strong&gt;[high]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;vLLM Semantic Router&lt;/strong&gt; (&lt;code&gt;vllm-project/semantic-router&lt;/code&gt;) — routes across local/private/frontier models by cost, latency, privacy, and safety. &lt;strong&gt;[high]&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The idea is sound for stateless, one-shot calls. Inside a &lt;em&gt;stateful&lt;/em&gt; coding agent like Claude Code, switching models &lt;strong&gt;mid-conversation&lt;/strong&gt; ("dynamic" or per-request routing) has three costs that don't show up on the per-token price tag — you only see them when you measure:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You lose the cache.&lt;/strong&gt; The prompt cache is &lt;em&gt;model-scoped&lt;/em&gt; (Act II) — Sonnet can't read Opus's cached prefix. The first turn you bounce to a cheaper model, that model has no warm cache and re-reads the entire conversation from scratch. On a long session the prefix dominates the bill, so that one cold read can wipe out everything the cheaper model saved you.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The reasoning is carried and billed — but you can't verify the new model &lt;em&gt;uses&lt;/em&gt; it.&lt;/strong&gt; Earlier turns contain &lt;em&gt;thinking blocks&lt;/em&gt; the original model produced. Across the Opus/Sonnet/Haiku family those blocks &lt;em&gt;do&lt;/em&gt; cross the switch — re-rendered into the new model's prompt and billed as input, so you &lt;strong&gt;pay&lt;/strong&gt; to carry them. What you &lt;strong&gt;can't&lt;/strong&gt; see, because they're encrypted, is whether the target model actually reused that reasoning or quietly ignored it — you pay for the carry with no guarantee of the benefit. The only symptom of a miss is the new model doing a worse job.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coming back can miss too.&lt;/strong&gt; A cache breakpoint only looks back ~20 content blocks (Act I). If you route &lt;em&gt;away&lt;/em&gt; from your main model and stay away for more than ~20 blocks before returning, the main model's cached prefix has aged out of that window — so even coming home is a cold miss, not the warm hit you expected.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Because of #2, judge routing by outcomes, not by the token price.&lt;/strong&gt; A router can make every turn cheaper and still be a &lt;em&gt;net loss&lt;/em&gt; if the weaker reasoning makes you take more human turns to get the same result — three cheap turns plus your time to re-steer the agent can cost more than one good expensive turn. So &lt;strong&gt;put evals in place&lt;/strong&gt; (task success rate, turns-to-done, human-correction rate) and watch them the moment you enable routing. If the human experience gets worse, the token savings are a mirage.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — routing per request inside one conversation.&lt;/strong&gt; It forfeits the model-scoped cache, makes you carry-and-bill the original model's (encrypted, unverifiable) reasoning on the new model, and can miss the cache even on the return trip. Measured below: bouncing 20% of turns to Sonnet actually &lt;em&gt;loses&lt;/em&gt; money.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Route &lt;strong&gt;sticky&lt;/strong&gt; (per conversation or per sub-agent) so each model reads its own warm cache, or use routers only for stateless single-shot traffic where there's no cache to lose. Either way, gate it behind evals so a cheaper bill can't hide a worse experience.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  How routing actually behaves inside Claude Code
&lt;/h2&gt;

&lt;p&gt;The routers above assume you can swap models freely. Inside Claude Code you can't: the request body is &lt;strong&gt;co-designed for one target model&lt;/strong&gt;, and routing it elsewhere fails in escalating ways before you even reach the cache economics.&lt;/p&gt;

&lt;h3&gt;
  
  
  The parameter-failure ladder &lt;strong&gt;[measured / doc-confirmed]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A naive proxy that rewrites only the &lt;code&gt;model&lt;/code&gt; field hits a wall fast. req 0001 → Sonnet (200); req 0002 → rerouted to Haiku → &lt;strong&gt;&lt;code&gt;400: "adaptive thinking is not supported on this model"&lt;/code&gt;&lt;/strong&gt; → Claude Code aborts. Claude Code sends &lt;code&gt;thinking:{type:"adaptive"}&lt;/code&gt; (valid for Sonnet); Haiku 4.5 rejects it.&lt;/p&gt;

&lt;p&gt;Three parameters must be normalized to route to Haiku, each surfacing as a distinct 400:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;output_config.effort&lt;/code&gt;&lt;/strong&gt; — Haiku 4.5 rejects effort; drop it. (Relatedly, &lt;code&gt;xhigh&lt;/code&gt; is Opus-4.7+-only: replaying an Opus request to Sonnet 4.6 gives &lt;code&gt;400: "does not support effort level 'xhigh'. Supported: high, low, max, medium."&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;thinking:{type:"adaptive"}&lt;/code&gt;&lt;/strong&gt; → must become &lt;code&gt;{type:"enabled","budget_tokens":N}&lt;/code&gt;. You &lt;em&gt;can't&lt;/em&gt; set &lt;code&gt;{type:"disabled"}&lt;/code&gt;, because Claude Code also sends &lt;code&gt;context_management:{clear_thinking_20251015}&lt;/code&gt;, which returns &lt;code&gt;400: "clear_thinking_20251015 strategy requires thinking to be enabled or adaptive."&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;budget_tokens&lt;/code&gt;&lt;/strong&gt; must be ≥1024 and &amp;lt; &lt;code&gt;max_tokens&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;With those fixed (a "smart" proxy), both models complete: feature built, all 87 tests pass, every request 200; rerouted-to-Haiku requests carrying Sonnet thinking blocks all succeeded. The lesson: &lt;strong&gt;the request body is interlocked&lt;/strong&gt; — model name, thinking parameter, effort, and context-management strategy are co-designed. Naive per-request rewriting fails loudly (400), and even when fixed, forfeits the model-scoped cache.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Claude Code already routes correctly — and you should leave it alone.&lt;/strong&gt; It calls Haiku &lt;em&gt;itself&lt;/em&gt; for background/auxiliary work while the main loop stays on your primary model (observed as &lt;code&gt;orig_model=haiku&lt;/code&gt; requests the proxy never touched). The right answer for almost everyone is to &lt;strong&gt;let the built-in routing do its job and add no router at all.&lt;/strong&gt; Don't insert your own routing &lt;strong&gt;unless you really know what you're doing&lt;/strong&gt; — you'd have to reproduce the interlocked parameter normalization above &lt;em&gt;and&lt;/em&gt; still eat the model-scoped cache penalty, which usually costs more than it saves. If you have a genuine reason to route, route &lt;strong&gt;sticky&lt;/strong&gt; (per task or per sub-agent), &lt;strong&gt;never mid-conversation&lt;/strong&gt; — and treat that as an advanced move you measure yourself, not a default.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The rate card and the per-turn cost formula
&lt;/h3&gt;

&lt;p&gt;Per 1M tokens; read 0.1×, write 2× (1-hour TTL):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;component&lt;/th&gt;
&lt;th&gt;Opus 4.8&lt;/th&gt;
&lt;th&gt;Sonnet 4.6&lt;/th&gt;
&lt;th&gt;Haiku 4.5&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;cache read&lt;/td&gt;
&lt;td&gt;$0.50&lt;/td&gt;
&lt;td&gt;$0.30&lt;/td&gt;
&lt;td&gt;$0.10&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;cache write (2×)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$10.00&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$6.00&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;$2.00&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;output&lt;/td&gt;
&lt;td&gt;$25.00&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;(base input)&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The per-turn cost is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;turn $ = read × read_rate + write × write_rate + output × output_rate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With rates baked in (output ×0.775 tokenizer factor for Haiku/Sonnet):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Opus $   = read × 0.00000050 + write × 0.00001000 + output × 0.000025
Sonnet $ = read × 0.00000030 + write × 0.00000600 + (output × 0.775) × 0.000015
Haiku $  = read × 0.00000010 + write × 0.00000200 + (output × 0.775) × 0.000005
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  The two regimes, and break-even
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sticky / per-conversation routing&lt;/strong&gt; → the cheaper model reads its &lt;em&gt;own&lt;/em&gt; warm cache → a flat rate multiple of Opus → &lt;strong&gt;clean savings.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-request bouncing&lt;/strong&gt; → the 1st call cold-writes the full prefix; later calls catch-up-write the gap. Write-heavy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Break-even (per turn):&lt;/strong&gt; a cold bounce only saves when the turn's &lt;em&gt;output&lt;/em&gt; is large relative to the prefix — empirically around &lt;strong&gt;2,500 output tokens.&lt;/strong&gt; Below that, the cold prefix write dwarfs whatever you saved on the cheaper output.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Three measured experiments
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Experiment A — tiny task (linked-list fix, 6 turns, Opus 4.8).&lt;/strong&gt; Prefix ~41K, all warm. All-Opus: &lt;strong&gt;$0.199&lt;/strong&gt; (~$0.033/turn). A 10%-Haiku per-request bounce: &lt;strong&gt;+16%&lt;/strong&gt; (the cold write dwarfs the tiny outputs). A 10%-Haiku &lt;em&gt;sticky&lt;/em&gt;: &lt;strong&gt;−8%.&lt;/strong&gt; Break-even output ≈ 2,500 tokens. (Claude Code self-reported $0.182 — its 1.25× write estimate.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experiment B — feature build (16 Opus turns, MCP stabilized).&lt;/strong&gt; Usage: &lt;code&gt;cache_read&lt;/code&gt; 1,067,378 · &lt;code&gt;cache_write&lt;/code&gt; 53,449 · &lt;code&gt;output&lt;/code&gt; 35,579 · &lt;code&gt;input&lt;/code&gt; 2,502. Acceptance 27/27, 24 unit tests green.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Turn&lt;/th&gt;
&lt;th&gt;read&lt;/th&gt;
&lt;th&gt;write&lt;/th&gt;
&lt;th&gt;output&lt;/th&gt;
&lt;th&gt;Opus $&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;29,874&lt;/td&gt;
&lt;td&gt;11,301&lt;/td&gt;
&lt;td&gt;168&lt;/td&gt;
&lt;td&gt;0.1321&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;41,175&lt;/td&gt;
&lt;td&gt;1,544&lt;/td&gt;
&lt;td&gt;13,466&lt;/td&gt;
&lt;td&gt;0.3850 ✱&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;42,719&lt;/td&gt;
&lt;td&gt;16,024&lt;/td&gt;
&lt;td&gt;127&lt;/td&gt;
&lt;td&gt;0.1848&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;58,743&lt;/td&gt;
&lt;td&gt;517&lt;/td&gt;
&lt;td&gt;5,802&lt;/td&gt;
&lt;td&gt;0.1796&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;59,260&lt;/td&gt;
&lt;td&gt;5,858&lt;/td&gt;
&lt;td&gt;8,347&lt;/td&gt;
&lt;td&gt;0.2969&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;65,118&lt;/td&gt;
&lt;td&gt;8,611&lt;/td&gt;
&lt;td&gt;446&lt;/td&gt;
&lt;td&gt;0.1298&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;73,729&lt;/td&gt;
&lt;td&gt;561&lt;/td&gt;
&lt;td&gt;194&lt;/td&gt;
&lt;td&gt;0.0473&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;74,290&lt;/td&gt;
&lt;td&gt;283&lt;/td&gt;
&lt;td&gt;500&lt;/td&gt;
&lt;td&gt;0.0525&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;74,573&lt;/td&gt;
&lt;td&gt;591&lt;/td&gt;
&lt;td&gt;316&lt;/td&gt;
&lt;td&gt;0.0511&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;75,164&lt;/td&gt;
&lt;td&gt;405&lt;/td&gt;
&lt;td&gt;466&lt;/td&gt;
&lt;td&gt;0.0533&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;75,569&lt;/td&gt;
&lt;td&gt;562&lt;/td&gt;
&lt;td&gt;96&lt;/td&gt;
&lt;td&gt;0.0458&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;76,131&lt;/td&gt;
&lt;td&gt;477&lt;/td&gt;
&lt;td&gt;3,232&lt;/td&gt;
&lt;td&gt;0.1236&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;76,608&lt;/td&gt;
&lt;td&gt;3,468&lt;/td&gt;
&lt;td&gt;123&lt;/td&gt;
&lt;td&gt;0.0761&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;14&lt;/td&gt;
&lt;td&gt;80,076&lt;/td&gt;
&lt;td&gt;1,269&lt;/td&gt;
&lt;td&gt;1,574&lt;/td&gt;
&lt;td&gt;0.0921&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;81,345&lt;/td&gt;
&lt;td&gt;1,659&lt;/td&gt;
&lt;td&gt;227&lt;/td&gt;
&lt;td&gt;0.0629&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;83,004&lt;/td&gt;
&lt;td&gt;319&lt;/td&gt;
&lt;td&gt;495&lt;/td&gt;
&lt;td&gt;0.0571&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1.970&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;✱ Turn 2 includes a one-off uncached-input term (2,472 × $5/M = $0.0124). The cross-check is instructive: the documented 2× rate totals &lt;strong&gt;$1.970&lt;/strong&gt;, while Claude Code &lt;em&gt;displayed&lt;/em&gt; &lt;strong&gt;$1.77&lt;/strong&gt; (its 1.25× under-count). The $0.20 gap = 53,449 × ($10−$6.25)/M ≈ $0.20 — exactly the extra write cost. This is the displayed-cost gotcha made concrete.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Experiment D — interleaved 25-turn routing (Opus / Haiku / Sonnet).&lt;/strong&gt; 20% of turns (3, 9, 14, 19, 23) routed to a cheaper model in a single session; turn 3 is the first bounce (cold), the rest are catch-up writes. One table answers the question directly — what a 25-turn session costs when 20% of its turns go to a different model, versus staying put or going fully sticky:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Routing strategy&lt;/th&gt;
&lt;th&gt;Session cost&lt;/th&gt;
&lt;th&gt;vs all-Opus&lt;/th&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;All-Opus (no routing)&lt;/td&gt;
&lt;td&gt;$2.574&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20% of turns → Haiku, per-request&lt;/td&gt;
&lt;td&gt;$2.278&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;−11.5%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;modest win&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;20% of turns → Sonnet, per-request&lt;/td&gt;
&lt;td&gt;$2.628&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+2.1%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;❌ &lt;em&gt;loses money&lt;/em&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All-Haiku, &lt;strong&gt;sticky&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;~$0.41&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;−85%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;biggest win&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;All-Sonnet, &lt;strong&gt;sticky&lt;/strong&gt;
&lt;/td&gt;
&lt;td&gt;~$1.21&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;−53%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;big win&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Read it slowly — it's the punchline of the whole guide. Routing 20% of turns to &lt;strong&gt;Sonnet per-request actually loses money&lt;/strong&gt; (+2.1%): Sonnet's write rate is 3× Haiku's, so on the cold and catch-up turns the parallel-prefix write costs more than the cheaper output saves (turn 3 alone runs +$0.09 versus just staying on Opus). Yet the &lt;em&gt;same&lt;/em&gt; model run &lt;strong&gt;sticky saves 53%.&lt;/strong&gt; That's the lesson: &lt;strong&gt;where you switch matters more than whether&lt;/strong&gt; — bouncing per-request can cost you; keeping one cheaper model for the whole conversation is the real win.&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing a tool: the decision matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool / class&lt;/th&gt;
&lt;th&gt;Attacks&lt;/th&gt;
&lt;th&gt;Net win when…&lt;/th&gt;
&lt;th&gt;Watch-out&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RouteLLM, vLLM Semantic Router&lt;/td&gt;
&lt;td&gt;model tier&lt;/td&gt;
&lt;td&gt;routing is sticky per conversation/sub-agent&lt;/td&gt;
&lt;td&gt;per-request bouncing forfeits the model-scoped cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Headroom&lt;/td&gt;
&lt;td&gt;input + cache&lt;/td&gt;
&lt;td&gt;the prefix is stabilized so cache reads survive&lt;/td&gt;
&lt;td&gt;retrieval round-trips cost some tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LLMLingua&lt;/td&gt;
&lt;td&gt;input&lt;/td&gt;
&lt;td&gt;one-shot / RAG context, not a cached prefix&lt;/td&gt;
&lt;td&gt;busts the cache; degrades past ~20×&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;RTK, lean-ctx&lt;/td&gt;
&lt;td&gt;tool-output input&lt;/td&gt;
&lt;td&gt;tool outputs are verbose (logs, installers, &lt;code&gt;git&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Caveman&lt;/td&gt;
&lt;td&gt;output&lt;/td&gt;
&lt;td&gt;output volume dominates and terse prose is acceptable&lt;/td&gt;
&lt;td&gt;human readability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;OpenAI Compaction, Compresr, Token Co.&lt;/td&gt;
&lt;td&gt;history / input&lt;/td&gt;
&lt;td&gt;provider or hosted flows&lt;/td&gt;
&lt;td&gt;hosted ones send data off-box; non-reversible&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The single most reliable lever is protecting the cached prefix&lt;/strong&gt; (Act I). Compression and routing help only insofar as they don't undo that. &lt;strong&gt;Output compression (Caveman) and sticky routing are the two that never touch the prefix&lt;/strong&gt; — which is why they're the safest wins in the entire ecosystem.&lt;/p&gt;
&lt;/blockquote&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>claude</category>
      <category>programming</category>
    </item>
    <item>
      <title>Claude Code Costs, Act II — Where the big hidden costs are</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Fri, 26 Jun 2026 05:59:40 +0000</pubDate>
      <link>https://dev.to/sumedhbala/claude-code-costs-act-ii-where-the-big-hidden-costs-are-4gf1</link>
      <guid>https://dev.to/sumedhbala/claude-code-costs-act-ii-where-the-big-hidden-costs-are-4gf1</guid>
      <description>&lt;p&gt;&lt;iframe class="speakerdeck-iframe ltag_speakerdeck" src="https://speakerdeck.com/player/f6e53458057d4c488e336e6284e530a4"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📊 New here? Scroll through the slides above for a quick visual overview of this act — then dive into the details below.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A single-model session that stays well-cached is cheap. The biggest swing in a multi-model bill comes from one move — &lt;strong&gt;switching models&lt;/strong&gt; — because the prompt cache belongs to a single model. The instant you switch, the cache you already paid for is thrown away.&lt;/p&gt;

&lt;p&gt;Whether that &lt;em&gt;helps or hurts&lt;/em&gt; comes down to &lt;strong&gt;how&lt;/strong&gt; you switch:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sticky&lt;/strong&gt; — keep a whole conversation (or a sub-agent) on one model. The cheaper model builds and reuses its &lt;em&gt;own&lt;/em&gt; warm cache, so the bill goes &lt;strong&gt;down&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-turn bouncing&lt;/strong&gt; — flip between models mid-conversation. It's &lt;em&gt;not&lt;/em&gt; a full cold rebuild each time: only the &lt;strong&gt;first&lt;/strong&gt; call to each model is fully cold, and returning to one you've already used re-reads that model's &lt;em&gt;own&lt;/em&gt; warm prefix and writes only the &lt;em&gt;catch-up&lt;/em&gt; since you left (so long as you're back within the ~20-block / 1-hour reuse window — see &lt;em&gt;first call cold, then warm catch-up&lt;/em&gt; below). But you're now keeping &lt;strong&gt;two&lt;/strong&gt; caches warm and paying a catch-up write on every flip, so at the 2× write rate it can still cost &lt;strong&gt;more&lt;/strong&gt; than never switching.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Our 25-turn run shows both ends: bouncing 20% of turns to Sonnet &lt;em&gt;lost&lt;/em&gt; ~2%, while keeping the whole run on Sonnet &lt;em&gt;saved&lt;/em&gt; ~53% (on Haiku, ~85%) — see Act III. So the goal isn't to avoid switching; it's to switch the right way. The costly version is the one that sneaks in by accident: a router that picks a model per request, or a manual swap partway through a conversation.&lt;/p&gt;

&lt;p&gt;What about the model's prior &lt;strong&gt;reasoning&lt;/strong&gt; — its thinking blocks? It's tempting to count that as a second switching cost, but mechanically it's &lt;em&gt;just context&lt;/em&gt;: the client re-sends it every turn, switch or not (Mental model 1). A switch only changes what the &lt;em&gt;new&lt;/em&gt; model does with it — it either &lt;strong&gt;re-bills&lt;/strong&gt; the reasoning as input (a cost), or &lt;strong&gt;strips&lt;/strong&gt; it before the model sees it (a behavioral change).&lt;/p&gt;

&lt;p&gt;The strip case is the one to watch, and it's about &lt;em&gt;behavior&lt;/em&gt;, not money. Reasoning travels as an opaque, encrypted signature: you can't read it or edit it — you can only carry it whole or lose it. If a switch drops it, the model continues &lt;em&gt;without&lt;/em&gt; its earlier chain of thought, and may no longer behave the way the previous turns set it up to.&lt;/p&gt;

&lt;p&gt;So this part covers the cache cost of a switch first, then what happens to reasoning across one.&lt;/p&gt;

&lt;h2&gt;
  
  
  Switching models: the model-scoped cache
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Mental model 3: the model-scoped key
&lt;/h3&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A cache entry belongs to exactly one model. Another model cannot read it.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is the rule that makes "just route the easy turns to a cheaper model" so often backfire. There's really &lt;strong&gt;one fundamental reason&lt;/strong&gt; a switch can't reuse the cache you already paid for — proven below: the cache key is model-scoped. And even once you accept that, a switch costs &lt;em&gt;more&lt;/em&gt; than a clean cold start would, because the &lt;strong&gt;token counts shift between models&lt;/strong&gt; — a separate effect that compounds the bill, covered after.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why the cache can't carry: caches are model-scoped &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It falls straight out of &lt;em&gt;What the cache stores&lt;/em&gt; (Mental model 2). A cache hit reuses the &lt;strong&gt;key/value vectors&lt;/strong&gt; a model computed for the prefix — and those vectors are produced by that model's &lt;em&gt;own weights&lt;/em&gt;. Run the identical tokens through a &lt;em&gt;different&lt;/em&gt; model and you get different queries, keys, and values, a different attention computation, and therefore a different KV state. So a cached entry is meaningful only to the exact model that produced it: hand Sonnet's KV cache to Haiku and it's noise. &lt;strong&gt;That's why the cache can't carry across a switch&lt;/strong&gt; — not a policy choice but a consequence of attention itself; the saved state simply isn't the state the new model would have computed.&lt;/p&gt;

&lt;p&gt;The clean proof — a byte-identical 63K-token prompt, varying &lt;em&gt;only&lt;/em&gt; the model:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;call&lt;/th&gt;
&lt;th&gt;model&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_read&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_creation&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;sonnet #1&lt;/td&gt;
&lt;td&gt;sonnet-4-6&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;63,422&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sonnet #2&lt;/td&gt;
&lt;td&gt;sonnet-4-6&lt;/td&gt;
&lt;td&gt;63,422&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;haiku #1&lt;/td&gt;
&lt;td&gt;haiku-4-5&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;64,031&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;haiku #2&lt;/td&gt;
&lt;td&gt;haiku-4-5&lt;/td&gt;
&lt;td&gt;64,031&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Sonnet's second call reads its own warm entry. Haiku's &lt;em&gt;first&lt;/em&gt; call — same bytes — reads &lt;strong&gt;0&lt;/strong&gt; and cold-writes its own copy. The two models cannot share an entry.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The live consequence — duplicated cache writes in a mixed session&lt;/strong&gt; (a real 50/50 Sonnet/Haiku session, per-model totals) &lt;strong&gt;[measured]&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;model&lt;/th&gt;
&lt;th&gt;requests&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_creation&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_read&lt;/code&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;sonnet&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;27,374&lt;/td&gt;
&lt;td&gt;536,662&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;haiku&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;57,739&lt;/td&gt;
&lt;td&gt;321,744&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Haiku cold-wrote a &lt;strong&gt;57,739-token duplicate&lt;/strong&gt; of the shared prefix it could never read from Sonnet — about &lt;strong&gt;85,113 total &lt;code&gt;cache_creation&lt;/code&gt; tokens of pure duplication&lt;/strong&gt; a single-model session would never pay. At the 2× write rate you pay the write premium &lt;em&gt;twice&lt;/em&gt; and lose the 0.1× read discount on the rerouted slice. &lt;strong&gt;The savings from a cheaper model are eaten by cache duplication.&lt;/strong&gt; This is the central caveat of the whole guide.&lt;/p&gt;

&lt;p&gt;The bytes happen to differ too — diffing the system prompt across models, two lines change (the model-name line and the knowledge-cutoff line, e.g. &lt;code&gt;Opus 4.7 / January 2026&lt;/code&gt; → &lt;code&gt;Haiku 4.5 / February 2025&lt;/code&gt;). But that's moot for caching: the model-scoped key already settled it. No amount of byte-matching would let Haiku read Sonnet's KV state — so don't think of the differing text as a &lt;em&gt;second&lt;/em&gt; cause; it's the same wall.&lt;/p&gt;

&lt;h3&gt;
  
  
  And the cost shifts too: the tokenizers differ &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This one isn't about the cache at all — it's a separate cost effect that rides along with every switch. Token counts shift between models, so even "the same text" bills as a different number of tokens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Opus 4.8 = &lt;strong&gt;1.0&lt;/strong&gt; (reference)&lt;/li&gt;
&lt;li&gt;Haiku 4.5 = &lt;strong&gt;0.775&lt;/strong&gt; (measured: 64,316 vs 83,004 tokens for the same content)&lt;/li&gt;
&lt;li&gt;Sonnet 4.6 = &lt;strong&gt;0.775&lt;/strong&gt; (assumed — same pre-4.7 tokenizer family, unmeasured)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Anthropic publishes no official ratio (use &lt;code&gt;count_tokens&lt;/code&gt; per model); it documents the 4.7/4.8/Fable tokenizer as ~1×–1.35× an older one, putting older models at roughly 0.74–1.0× of Opus.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mental model refinement: first call cold, then warm catch-up
&lt;/h3&gt;

&lt;p&gt;The good news is that a switch is &lt;strong&gt;not cold forever:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Only the &lt;strong&gt;first&lt;/strong&gt; call to a given model is fully cold. Each &lt;em&gt;subsequent&lt;/em&gt; call to that model is &lt;strong&gt;partially warm&lt;/strong&gt;: it reads that model's own cache and cold-writes only the catch-up diff — the content added by intervening turns on the &lt;em&gt;other&lt;/em&gt; model. Cost = one cold start + a recurring catch-up write per re-entry, &lt;strong&gt;not&lt;/strong&gt; a cold start every call.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two bounds apply: the TTL (1 hour, refreshed on each read of that model's entry) and the 20-block lookback (~7–10 turns). Beyond the lookback the &lt;em&gt;tail&lt;/em&gt; can't re-link, but the front breakpoints (tools+system) still hit — so you re-read the system prefix warm and only cold-write the message history. At the 2× write rate, both the cold start and the catch-up writes hurt twice as much, which is exactly what flips routing economics in Act III.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — routing per turn to save money.&lt;/strong&gt; Bouncing between models mid-conversation pays a cold prefix write on the first switch &lt;em&gt;plus&lt;/em&gt; a catch-up write on every re-entry. For a higher-write-rate model (Sonnet), this can cost &lt;em&gt;more&lt;/em&gt; than just staying on Opus.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Route &lt;strong&gt;sticky&lt;/strong&gt;: pick a model per &lt;em&gt;conversation&lt;/em&gt; or per &lt;em&gt;sub-agent&lt;/em&gt; and stay there. (Act III quantifies it: 20%-Sonnet per-turn bouncing &lt;em&gt;loses&lt;/em&gt; 2.1%, while all-Sonnet sticky &lt;em&gt;saves&lt;/em&gt; 53%.)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;What to do:&lt;/strong&gt; Default to treating the model as a &lt;em&gt;per-conversation&lt;/em&gt; decision, not a per-turn one — and if you use multiple tiers, isolate them into separate sub-agents/conversations so each keeps its own warm cache. Plenty of commercial routers and gateways will route per request for you automatically, and they &lt;em&gt;can&lt;/em&gt; save money — but the win is workload-dependent, and (as the numbers above show) per-request routing can quietly cost &lt;em&gt;more&lt;/em&gt;, especially for a higher-write-rate model. So don't switch one on blindly. Adopt it only once you can see the evidence on &lt;strong&gt;your&lt;/strong&gt; traffic: measure &lt;code&gt;cache_read&lt;/code&gt; vs &lt;code&gt;cache_creation&lt;/code&gt; and your actual billed cost, understand &lt;em&gt;why&lt;/em&gt; the cache is model-scoped, and confirm a real net saving before you rely on it. And weigh more than cost: a cheaper tier can carry an older knowledge cutoff (e.g. Haiku 4.5's is ~11 months behind Opus's) — a &lt;em&gt;behavioral&lt;/em&gt; difference that routing-for-price quietly inherits.&lt;/p&gt;

&lt;h2&gt;
  
  
  Thinking blocks across a model switch
&lt;/h2&gt;

&lt;p&gt;Act I established that the client re-sends everything each turn, including the model's prior reasoning — the &lt;strong&gt;thinking blocks.&lt;/strong&gt; Carrying them is ordinary context, but a switch forces a choice with two kinds of consequence: they're either re-rendered into the target model's prompt and &lt;strong&gt;billed as input&lt;/strong&gt; (a cost), or &lt;strong&gt;stripped&lt;/strong&gt; before they get there (a &lt;em&gt;behavioral&lt;/em&gt; change — the new model loses the prior chain of thought), depending on the target model's class. To weigh either, you first need to know what a thinking block &lt;em&gt;is&lt;/em&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  First: what model "thinking" is
&lt;/h3&gt;

&lt;p&gt;If you haven't worked with reasoning models, start here; if you have, skip to Mental model 4.&lt;/p&gt;

&lt;p&gt;A reasoning model doesn't answer immediately. Given a hard prompt, it first generates a run of intermediate tokens — working the problem out step by step — and only then writes its reply. That working-out is the model's &lt;strong&gt;thinking&lt;/strong&gt; (also called &lt;em&gt;reasoning&lt;/em&gt;, &lt;em&gt;extended thinking&lt;/em&gt;, or &lt;em&gt;chain of thought&lt;/em&gt;): a scratch pad, the "let me work through this" pass a person makes before answering, except the model does it by emitting tokens.&lt;/p&gt;

&lt;p&gt;Why it does this: spending tokens on reasoning before answering measurably improves accuracy on anything multi-step — math, code, planning — where one wrong early step dooms the result. It's &lt;em&gt;test-time compute&lt;/em&gt; — trade tokens (and latency, and money) for a better answer. Modern models use &lt;strong&gt;adaptive thinking&lt;/strong&gt;: the model decides per request whether a problem is worth thinking about and how hard, so a trivial lookup gets none and a hard puzzle gets thousands of tokens.&lt;/p&gt;

&lt;p&gt;In the response, thinking isn't blended into the answer. The reply is a list of typed &lt;strong&gt;content blocks&lt;/strong&gt;, and thinking is its own block type, emitted &lt;em&gt;before&lt;/em&gt; the &lt;code&gt;text&lt;/code&gt; answer — the wire keeps the model's private working-out separate from the words meant for the user. That separate block is what gets carried back each turn, and what a model switch has to make a decision about. The next question is what's inside it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Mental model 4: the encrypted envelope
&lt;/h3&gt;

&lt;p&gt;A thinking block is not readable text you're carrying around. It's a sealed envelope.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;The model seals its reasoning into an encrypted &lt;strong&gt;&lt;code&gt;signature&lt;/code&gt;&lt;/strong&gt; and hands you an envelope you can't open. You carry it back each turn (stateless — &lt;em&gt;you&lt;/em&gt; hold it, not the server). The server has the key: it decrypts the signature to reconstruct the reasoning for the model. The result is &lt;strong&gt;private&lt;/strong&gt; (you can't read it), &lt;strong&gt;stateless&lt;/strong&gt; (the content rides in your request), and &lt;strong&gt;continuous&lt;/strong&gt; (the server reconstructs it each turn).&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;It wasn't always sealed.&lt;/strong&gt; The first generation of extended thinking handed the chain of thought back as plain, readable text — you got the model's working-out verbatim and could log it, diff it, even hand-edit it before resending. That openness is gone. Current Claude 4.x returns reasoning only in protected form: a &lt;em&gt;summary&lt;/em&gt; written by a separate model, or nothing but the encrypted signature. The motive is &lt;strong&gt;anti-distillation&lt;/strong&gt; — a raw chain of thought is exactly the training signal a competitor needs to clone the reasoning into their own model, so the readable text was replaced by a &lt;code&gt;signature&lt;/code&gt; you can carry but not inspect. (&lt;code&gt;summarized&lt;/code&gt; is the protected form, not a peek behind it — see &lt;em&gt;Why you can't just read the chain of thought&lt;/em&gt; below.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What you can still control — and what you can't.&lt;/strong&gt; A few knobs shape the envelope; none of them open it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Whether the model thinks, and how hard.&lt;/strong&gt; The &lt;code&gt;thinking&lt;/code&gt; parameter (&lt;code&gt;adaptive&lt;/code&gt;, or &lt;code&gt;enabled&lt;/code&gt; with a &lt;code&gt;budget_tokens&lt;/code&gt; ceiling) plus the effort setting decide whether a block is produced and how long the reasoning runs. More reasoning → a bigger signature (the depth table below shows ~45× across difficulty).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;How much of it you see.&lt;/strong&gt; &lt;code&gt;display&lt;/code&gt; has exactly two values — &lt;code&gt;"summarized"&lt;/code&gt; (a paraphrase) and &lt;code&gt;"omitted"&lt;/code&gt; (empty text, signature only) — and the default flips by model (newer models default to &lt;code&gt;omitted&lt;/code&gt;). Neither returns the raw reasoning, and &lt;code&gt;display&lt;/code&gt; is visibility-only: it &lt;strong&gt;does not change billing&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;What you can't touch.&lt;/strong&gt; You cannot read the raw thinking, edit it, reorder it, or author your own block. The sequence is integrity-checked — any modification is a hard &lt;code&gt;400&lt;/code&gt;. You take the envelope whole, or not at all.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Carry-over: required in some places, impossible in others.&lt;/strong&gt; Because the content is sealed &lt;em&gt;and&lt;/em&gt; integrity-locked, several moves that were fair game when thinking was open text are now off the table:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You can't &lt;em&gt;edit&lt;/em&gt; the reasoning across a switch — though the block itself still travels.&lt;/strong&gt; When thinking was readable you could take one model's chain of thought, trim or translate it, and feed it to another. Now it's a sealed, integrity-locked envelope: on a switch the previous blocks are re-rendered into the new model's prompt and billed as-is (across the Opus/Sonnet/Haiku family) — carried whole, never translated, trimmed, or merged — or, for the Fable/Mythos family, dropped. Encryption seals the &lt;em&gt;contents&lt;/em&gt; from you; it does &lt;strong&gt;not&lt;/strong&gt; stop the block from crossing the boundary and billing. (The replay matrix below measures which, and what it costs.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You can't curate it to save tokens.&lt;/strong&gt; Tempting as it is to prune a long chain before resending, editing the block is rejected outright, and &lt;em&gt;removing&lt;/em&gt; it — though accepted as a &lt;code&gt;200&lt;/code&gt; — silently breaks continuity and can convert cheap cache reads into cold writes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You sometimes &lt;em&gt;must&lt;/em&gt; carry it.&lt;/strong&gt; During tool use, the thinking blocks between tool calls are part of the integrity-checked sequence: drop them and the model loses the thread it built across the tool loop. Here carry-over isn't an optimization you opt into — it's required for the model to keep behaving the way the earlier turns set it up to.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The measured detail behind each of these — what an omitted block contains, how signature size tracks reasoning depth, how it's billed, and what survives a switch — follows.&lt;/p&gt;

&lt;h3&gt;
  
  
  What an omitted-display block actually contains &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Opus 4.7 with &lt;code&gt;display:"omitted"&lt;/code&gt; emits: &lt;code&gt;{ "type":"thinking", "thinking":"" (empty), "signature": "&amp;lt;360–732 chars&amp;gt;" }&lt;/code&gt;. Nothing else. The readable thinking text is empty; the signature is the payload.&lt;/p&gt;

&lt;h3&gt;
  
  
  The signature is the encrypted full thinking — not a tag &lt;strong&gt;[doc-confirmed]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Verbatim from Anthropic's extended-thinking documentation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"The &lt;code&gt;signature&lt;/code&gt; field still carries the encrypted full thinking for multi-turn continuity."&lt;/em&gt;&lt;br&gt;
&lt;em&gt;"The server decrypts the &lt;code&gt;signature&lt;/code&gt; to reconstruct the original thinking for prompt construction."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;It also enforces &lt;strong&gt;integrity&lt;/strong&gt; — blocks may not be edited or reordered:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"the entire sequence of consecutive &lt;code&gt;thinking&lt;/code&gt; blocks must match the outputs generated by the model… you can't rearrange or modify the sequence of these blocks."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Modifying a block returns &lt;code&gt;400 invalid_request_error&lt;/code&gt; (&lt;em&gt;"&lt;code&gt;thinking&lt;/code&gt; … blocks in the latest assistant message cannot be modified"&lt;/em&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  Signature length scales with reasoning depth &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;em&gt;What this shows:&lt;/em&gt; a thinking block isn't a fixed-size tag — it grows with how hard the model actually thought. Same model and settings (Sonnet 4.6, forced thinking, &lt;code&gt;display:"summarized"&lt;/code&gt;), five prompts from trivial to hard. The column that matters is the last one, signature size:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;prompt&lt;/th&gt;
&lt;th&gt;&lt;code&gt;output_tokens&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;summary text (chars)&lt;/th&gt;
&lt;th&gt;signature (chars)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;trivial&lt;/td&gt;
&lt;td&gt;20&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;276&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;easy&lt;/td&gt;
&lt;td&gt;35&lt;/td&gt;
&lt;td&gt;45&lt;/td&gt;
&lt;td&gt;332&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;medium&lt;/td&gt;
&lt;td&gt;444&lt;/td&gt;
&lt;td&gt;34&lt;/td&gt;
&lt;td&gt;320&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;hard (12-coin puzzle)&lt;/td&gt;
&lt;td&gt;5,595&lt;/td&gt;
&lt;td&gt;3,352&lt;/td&gt;
&lt;td&gt;12,524&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;very_hard&lt;/td&gt;
&lt;td&gt;64&lt;/td&gt;
&lt;td&gt;129&lt;/td&gt;
&lt;td&gt;448&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Read down the signature column: the hard puzzle's signature is &lt;strong&gt;~45×&lt;/strong&gt; the trivial one, and &lt;strong&gt;larger than the visible summary&lt;/strong&gt; (12,524 vs 3,352 chars). So the signature carries the &lt;em&gt;full&lt;/em&gt; thinking; the summary is just a condensation. (Adaptive thinking decides per prompt whether to think at all, which is why the jump tracks &lt;em&gt;actual reasoning&lt;/em&gt; — note &lt;code&gt;very_hard&lt;/code&gt; happened to reason little — not the difficulty &lt;em&gt;label&lt;/em&gt;.)&lt;/p&gt;

&lt;h3&gt;
  
  
  How thinking is billed &lt;strong&gt;[measured + docs]&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;At generation:&lt;/strong&gt; thinking is billed as &lt;strong&gt;output&lt;/strong&gt; tokens — even when &lt;code&gt;display:"omitted"&lt;/code&gt; and you never see the text.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On replay:&lt;/strong&gt; resent thinking bills as &lt;strong&gt;input&lt;/strong&gt; tokens. Sonnet visible-text blocks run ≈ 3,085 tokens each; Opus omitted blocks were small (~85–155 tokens) &lt;em&gt;only because those turns reasoned little&lt;/em&gt; — a &lt;strong&gt;deep&lt;/strong&gt; omitted block is large (a 12-coin puzzle ran &lt;strong&gt;+1,522&lt;/strong&gt; on replay with zero visible text). Replay cost tracks reasoning depth (signature length), not the empty &lt;code&gt;display&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;display&lt;/code&gt; does not change billing.&lt;/strong&gt; Thinking bills the same under &lt;code&gt;omitted&lt;/code&gt;, &lt;code&gt;summarized&lt;/code&gt;, or full.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why you can't just read the chain of thought &lt;strong&gt;[docs]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;There are only &lt;strong&gt;two&lt;/strong&gt; allowed &lt;code&gt;display&lt;/code&gt; values, and neither exposes the raw reasoning:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"summarized"&lt;/code&gt; — a &lt;em&gt;separate model's&lt;/em&gt; paraphrase. &lt;em&gt;"Summarization is processed by a different model than the one you target… The thinking model does not see the summarized output."&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"omitted"&lt;/code&gt; — empty text; the signature carries the encrypted full thinking.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;There is no value that returns verbatim chain of thought (&lt;em&gt;"In rare cases where you need access to full thinking output for Claude 4 models, contact Anthropic sales."&lt;/em&gt;). So switching to &lt;code&gt;summarized&lt;/code&gt; does &lt;strong&gt;not&lt;/strong&gt; bypass anti-distillation — &lt;code&gt;summarized&lt;/code&gt; &lt;em&gt;is&lt;/em&gt; the protected form.&lt;/p&gt;

&lt;h3&gt;
  
  
  Defaults by model &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Which models hide the thinking text out of the box: &lt;strong&gt;newer models default to &lt;code&gt;omitted&lt;/code&gt;&lt;/strong&gt; (signature only — Opus 4.7, Opus 4.8, Fable 5, Mythos 5, Mythos Preview), while &lt;strong&gt;older ones default to &lt;code&gt;summarized&lt;/code&gt;&lt;/strong&gt; (you also get the paraphrase — Opus 4.6, Sonnet 4.6, earlier Claude 4).&lt;/p&gt;

&lt;h3&gt;
  
  
  The cross-model replay matrix &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The key cost question for switching: when the next turn goes to a &lt;em&gt;different&lt;/em&gt; model, does it re-read the previous model's thinking blocks (and bill them as input), or silently drop them? The API never tells you which — so the test is to send the same request &lt;strong&gt;twice&lt;/strong&gt;, once with the blocks kept and once with them removed, and diff the prompt-token count. Costs more with them kept → rendered and billed. Identical → dropped.&lt;/p&gt;

&lt;p&gt;One concrete run, 3 Sonnet blocks replayed to Haiku: &lt;strong&gt;kept = 73,252&lt;/strong&gt; tokens vs &lt;strong&gt;removed = 63,997&lt;/strong&gt; — a 9,255-token gap (~3,085 per block), so they were rendered into Haiku's prompt and billed. (A naive model-only swap &lt;code&gt;400&lt;/code&gt;s first — Haiku rejects Sonnet's &lt;code&gt;adaptive&lt;/code&gt; thinking param — so the params have to be fixed before the keep-vs-removed comparison is even valid.)&lt;/p&gt;

&lt;p&gt;Run that same keep-minus-removed diff across every Opus/Sonnet/Haiku pairing and the verdict is uniform — &lt;strong&gt;always rendered, never dropped&lt;/strong&gt;. Each row is one source model's blocks replayed to one target; the number is the extra tokens billed when you keep them (positive = billed):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;source blocks&lt;/th&gt;
&lt;th&gt;→ target&lt;/th&gt;
&lt;th&gt;keep − removed&lt;/th&gt;
&lt;th&gt;verdict&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet (visible text)&lt;/td&gt;
&lt;td&gt;Opus 4.7&lt;/td&gt;
&lt;td&gt;+10,950&lt;/td&gt;
&lt;td&gt;rendered &amp;amp; billed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet&lt;/td&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;+9,253&lt;/td&gt;
&lt;td&gt;rendered &amp;amp; billed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet&lt;/td&gt;
&lt;td&gt;Sonnet (control)&lt;/td&gt;
&lt;td&gt;+9,077&lt;/td&gt;
&lt;td&gt;rendered &amp;amp; billed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus (omitted/empty text)&lt;/td&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;+125&lt;/td&gt;
&lt;td&gt;rendered &amp;amp; billed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus&lt;/td&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;+308&lt;/td&gt;
&lt;td&gt;rendered &amp;amp; billed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus&lt;/td&gt;
&lt;td&gt;Opus (control)&lt;/td&gt;
&lt;td&gt;+170&lt;/td&gt;
&lt;td&gt;rendered &amp;amp; billed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The Sonnet rows are large (~3,085 per block of real visible reasoning); the Opus rows are tiny (+125 to +308) — but that gap is &lt;strong&gt;block size, not display&lt;/strong&gt;: those captured Opus blocks were &lt;em&gt;shallow&lt;/em&gt; (short signatures), not cheap &lt;em&gt;because&lt;/em&gt; their text is empty. Measured on a deep prompt &lt;strong&gt;[2026-06-25]&lt;/strong&gt;: an Opus 4.8 omitted block — &lt;strong&gt;zero&lt;/strong&gt; readable text, 4,300-char signature — cost &lt;strong&gt;+1,522 tokens&lt;/strong&gt; to replay (pure signature), and the same prompt on summarized Sonnet cost +6,106. The replay bill tracks serialized block size, dominated by the signature, not whether you can read it. Every number above is positive — nothing was dropped.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The Fable/Mythos exception [docs]:&lt;/strong&gt; the Fable/Mythos family's thinking blocks are &lt;em&gt;dropped (unbilled)&lt;/em&gt; when replayed to a different model. Not reproducible here (those models 404 on the test host), but the contrast is documented: the entire Opus/Sonnet/Haiku family replays freely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key distinction:&lt;/strong&gt; &lt;em&gt;omitting&lt;/em&gt; the thinking text (Opus 4.7/4.8 + Fable) is &lt;strong&gt;not&lt;/strong&gt; the same as &lt;em&gt;dropping blocks cross-model&lt;/em&gt; (Fable/Mythos only). Encryption is &lt;em&gt;orthogonal&lt;/em&gt; to transfer: a sealed, empty-text Opus block still replays across models — confirmed live by resuming an Opus session on Sonnet, where the sealed Opus block was carried verbatim, accepted, and billed. Caveat: &lt;em&gt;carried + accepted + billed&lt;/em&gt; is what's measured — it does &lt;strong&gt;not&lt;/strong&gt; prove the target model semantically reuses another model's reasoning; only that the block crosses the boundary and costs you.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The documented strip rule, and Claude Code's override
&lt;/h3&gt;

&lt;p&gt;There is a documented rule about when previous thinking blocks are &lt;em&gt;stripped&lt;/em&gt; from context &lt;strong&gt;[docs]&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;"When a non-tool-result user block is included: on Opus 4.5+ and Sonnet 4.6+, previous thinking blocks are kept; on earlier Opus/Sonnet models and all Haiku models, all previous thinking blocks are ignored and stripped from context."&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;So by default: &lt;strong&gt;stripped&lt;/strong&gt; on all Haiku + earlier Opus/Sonnet; &lt;strong&gt;kept&lt;/strong&gt; on Opus 4.5+ and Sonnet 4.6+. The trigger is a normal (non-tool-result) user turn; inside tool loops, thinking is kept either way.&lt;/p&gt;

&lt;p&gt;Does that strip disturb the cache? Only if the model counts thinking as part of its cached prefix in the first place — and that differs by model. &lt;em&gt;How to read the next table:&lt;/em&gt; the same warm cache, measured with thinking &lt;strong&gt;kept&lt;/strong&gt; vs forcibly &lt;strong&gt;stripped&lt;/strong&gt;, on each model. Identical rows = thinking was never in that model's cache key (stripping is free); a &lt;code&gt;cache_read&lt;/code&gt; that collapses on STRIP = it was in the key (stripping re-keys it) &lt;strong&gt;[measured]&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;model&lt;/th&gt;
&lt;th&gt;variant&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_read&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_create&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;reading&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;KEEP&lt;/td&gt;
&lt;td&gt;64,202&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;rows identical → thinking isn't in Haiku's cache…&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;STRIP&lt;/td&gt;
&lt;td&gt;64,202&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;…so stripping it costs nothing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;KEEP&lt;/td&gt;
&lt;td&gt;72,499&lt;/td&gt;
&lt;td&gt;15&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;cache_read&lt;/code&gt; collapses on STRIP → thinking &lt;em&gt;is&lt;/em&gt; in Sonnet's cache…&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;STRIP&lt;/td&gt;
&lt;td&gt;54,080&lt;/td&gt;
&lt;td&gt;9,342&lt;/td&gt;
&lt;td&gt;…so stripping re-keys ~9.3K tokens (read drops ~18K)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Haiku strips thinking &lt;em&gt;before&lt;/em&gt; it reaches the cache or the bill, so a strip is free. Sonnet keeps it in the prefix, so removing it cold-rewrites that slice — on a keep-model, stripping thinking &lt;em&gt;hurts&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But inside real Claude Code, neither default bites&lt;/strong&gt; — because Claude Code always sends one field &lt;strong&gt;[measured]&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="nl"&gt;"context_management"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"edits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"clear_thinking_20251015"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"keep"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"all"&lt;/span&gt;&lt;span class="p"&gt;}]}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;keep:"all"&lt;/code&gt; overrides Haiku's default strip and forces &lt;em&gt;every&lt;/em&gt; model to retain thinking:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;default API&lt;/th&gt;
&lt;th&gt;inside Claude Code (&lt;code&gt;keep:"all"&lt;/code&gt;)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;strips (excluded from cache + billing)&lt;/td&gt;
&lt;td&gt;keeps (in cache + billed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;keeps&lt;/td&gt;
&lt;td&gt;keeps&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So in genuine Claude Code usage every model keeps thinking; the Haiku-strip behavior only appears in raw API usage that omits &lt;code&gt;keep:"all"&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Switching Haiku → Sonnet stacks three costs &lt;strong&gt;[measured + inferred]&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Full cache miss&lt;/strong&gt; — model-scoped caches; Sonnet cold-writes its whole prefix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sonnet keeps and bills the thinking&lt;/strong&gt; that default-Haiku had been ignoring — work that was "free" on Haiku now costs input tokens on Sonnet's cold write.&lt;/li&gt;
&lt;li&gt;Any prefix-leanness benefit from Haiku's strip evaporates the moment Sonnet serves a turn.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Editing thinking blocks in a proxy: position-dependent cache cost &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If you run a proxy that rewrites or strips thinking blocks, the cost depends sharply on &lt;em&gt;where&lt;/em&gt; the touched block sits — because everything before the edit stays cached and everything from the edit onward must be rewritten. &lt;em&gt;What this shows:&lt;/em&gt; on a keep-model, warm the cache, then remove a single thinking block from a 13-message conversation and watch how much warm &lt;code&gt;cache_read&lt;/code&gt; survives, depending on the removed block's position:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;variant&lt;/th&gt;
&lt;th&gt;block removed at&lt;/th&gt;
&lt;th&gt;&lt;code&gt;cache_read&lt;/code&gt;&lt;/th&gt;
&lt;th&gt;cache lost vs baseline&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;baseline&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;82,065&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;strip last&lt;/td&gt;
&lt;td&gt;msg 11 of 13&lt;/td&gt;
&lt;td&gt;81,625&lt;/td&gt;
&lt;td&gt;440&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;strip first&lt;/td&gt;
&lt;td&gt;msg 1 of 13&lt;/td&gt;
&lt;td&gt;73,961&lt;/td&gt;
&lt;td&gt;8,104&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Same total prompt size either way — the loss is purely cheap &lt;code&gt;cache_read&lt;/code&gt; turning into expensive &lt;code&gt;cache_creation&lt;/code&gt;. Touch the &lt;strong&gt;last&lt;/strong&gt; block and you lose almost nothing (440); touch the &lt;strong&gt;first&lt;/strong&gt; and you re-key 8,104 tokens. The earlier the edit, the bigger the bill (~18× here).&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — assuming HTTP 200 means your cache survived.&lt;/strong&gt; A 200 means "valid request," &lt;em&gt;not&lt;/em&gt; "cache preserved." A proxy that removes a thinking block is accepted (200) but can silently convert tens of thousands of cheap cache reads into expensive cold writes. (Editing a block's content or signature is outright rejected — 400, integrity — but &lt;em&gt;removing&lt;/em&gt; is tolerated and quietly costly.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Never mutate the prefix in a proxy. If you must touch messages, do surgical byte-fragment replacement on the &lt;em&gt;tail&lt;/em&gt; only and confirm &lt;code&gt;cache_read&lt;/code&gt; is unchanged on the next request.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;What to do (end of Act II):&lt;/strong&gt; Switching models and carrying thinking are the two transitions that blow up a bill. Both reward the same discipline: pick a model per conversation and leave the request body — including its thinking blocks and its interlocked thinking/effort/context-management parameters — exactly as Claude Code assembled it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>claude</category>
      <category>programming</category>
    </item>
    <item>
      <title>Claude Code Costs, Act I — How the billing actually works</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Fri, 26 Jun 2026 05:58:50 +0000</pubDate>
      <link>https://dev.to/sumedhbala/claude-code-costs-act-i-how-the-billing-actually-works-25kn</link>
      <guid>https://dev.to/sumedhbala/claude-code-costs-act-i-how-the-billing-actually-works-25kn</guid>
      <description>&lt;p&gt;&lt;iframe class="speakerdeck-iframe ltag_speakerdeck" src="https://speakerdeck.com/player/2cbe7e1ffcb347e094895db28703c0c6"&gt;
&lt;/iframe&gt;
&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📊 New here? Scroll through the slides above for a quick visual overview of this act — then dive into the details below.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💬 &lt;strong&gt;Spot something unclear or wrong?&lt;/strong&gt; This guide aims to be precise. If any part could be explained better, or you think a claim is off, please leave a comment — I read every one and fix what's broken.&lt;/p&gt;

&lt;p&gt;👋 If you want more on &lt;strong&gt;AI-native engineering&lt;/strong&gt;, &lt;a href="https://www.linkedin.com/in/sumedhbala/" rel="noopener noreferrer"&gt;follow me on LinkedIn&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;em&gt;How billing actually works, the ecosystem of options to spend less, and the mistakes that quietly cost you money — grounded in measurement, not folklore.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;This guide teaches the cost model of &lt;a href="https://claude.com/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; from first principles. It is written for engineers who want to &lt;strong&gt;predict&lt;/strong&gt; their bill, &lt;strong&gt;lower&lt;/strong&gt; it deliberately, and &lt;strong&gt;recognize&lt;/strong&gt; the anti-patterns before they hit production. Every non-obvious claim carries a provenance tag — &lt;strong&gt;[measured]&lt;/strong&gt; (observed on the wire), &lt;strong&gt;[docs]&lt;/strong&gt; (from Anthropic's published documentation), or &lt;strong&gt;[doc-confirmed]&lt;/strong&gt; (measured &lt;em&gt;and&lt;/em&gt; matching the docs) — and the table or experiment that supports it.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The best way to use this guide:&lt;/strong&gt; keep a Claude session open on the side as you read. When something isn't clear, paste it in and ask Claude to explain it or to dig deeper. Better still, don't take the numbers on faith — ask Claude to &lt;strong&gt;reproduce the experiments&lt;/strong&gt; on your own setup, walk you through what the results mean, and &lt;strong&gt;show you the raw request/response logs&lt;/strong&gt; behind each table. The provenance tags exist so you can verify everything here yourself; treat this guide as a starting point for that conversation, not the last word.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What you'll be able to do after reading this
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Explain, in one sentence, why a Claude Code conversation gets more expensive with every turn — and why that is not a bug.&lt;/li&gt;
&lt;li&gt;Look at a proposed change (a new MCP server, a model switch, a "compression" proxy) and predict whether it helps or hurts your bill, &lt;em&gt;before&lt;/em&gt; you ship it.&lt;/li&gt;
&lt;li&gt;Read a &lt;code&gt;usage&lt;/code&gt; block and tell the difference between a warm session and one that is silently rebuilding its cache every turn.&lt;/li&gt;
&lt;li&gt;Choose the right cost-reduction tool for the cost line you actually want to attack — and avoid the popular tools that make things worse.&lt;/li&gt;
&lt;li&gt;Recognize the dozen most expensive mistakes and apply the fix for each.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  How to read this
&lt;/h2&gt;

&lt;p&gt;The guide is built around &lt;strong&gt;four mental models&lt;/strong&gt;. Learn them and the rest is derivable:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The amnesiac contractor&lt;/strong&gt; — the server remembers nothing; the request is the only state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The prefix-match cache&lt;/strong&gt; — caching is a strict byte-prefix match; one early byte change invalidates everything after it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The model-scoped key&lt;/strong&gt; — a cache entry belongs to exactly one model; another model can't read it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The encrypted envelope&lt;/strong&gt; — the model's reasoning rides back to you sealed; you carry it but can't open it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The guide is in four acts mapping to what you're trying to do: &lt;strong&gt;Act I&lt;/strong&gt; — how billing works; &lt;strong&gt;Act II&lt;/strong&gt; — where the big hidden costs are (model switching and thinking blocks); &lt;strong&gt;Act III&lt;/strong&gt; — the ecosystem of options for spending less; &lt;strong&gt;Act IV&lt;/strong&gt; — the consolidated mistakes catalogue and a one-page cheat sheet.&lt;/p&gt;

&lt;h2&gt;
  
  
  A note on how this was measured
&lt;/h2&gt;

&lt;p&gt;None of this relies on internal access. Claude Code honors the &lt;code&gt;ANTHROPIC_BASE_URL&lt;/code&gt; environment variable, so it can be pointed at a plain-HTTP local reverse proxy that logs each &lt;code&gt;/v1/messages&lt;/code&gt; request body and forwards it to &lt;code&gt;https://api.anthropic.com&lt;/code&gt;. There is no TLS interception: the OAuth &lt;code&gt;Authorization: Bearer&lt;/code&gt; token and &lt;code&gt;anthropic-beta&lt;/code&gt; headers pass through untouched; the proxy rewrites only the JSON &lt;code&gt;model&lt;/code&gt; field when an experiment calls for it. The proxy parses the streamed (SSE) response for the &lt;code&gt;usage&lt;/code&gt; block and counts &lt;code&gt;cache_control&lt;/code&gt; markers. A replay harness re-sends real captured requests verbatim — genuine headers — under single-variable variations, so each finding isolates one cause.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The honest constraint throughout:&lt;/strong&gt; the API exposes no field that announces "thinking was dropped" or "your cache was busted." Every conclusion here is &lt;em&gt;inferred&lt;/em&gt; from HTTP status codes, token-count deltas between near-identical requests, and inspection of the response. Where a claim is inferred rather than directly reported, it says so.&lt;/p&gt;

&lt;p&gt;Measurements were taken against Claude Code 2.1.150 (OAuth auth, mid-2026) on Sonnet 4.6, Opus 4.7/4.8, and Haiku 4.5. Fable 5 and the Mythos family were gated (HTTP 404) on the test host, so claims about them are documentation-only. Prices and exact behaviors are &lt;strong&gt;version-specific&lt;/strong&gt; — re-verify them for your own models and client version.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Billing-rate caveat, stated once and used throughout.&lt;/strong&gt; Claude Code's self-reported &lt;code&gt;total_cost_usd&lt;/code&gt; (what &lt;code&gt;/cost&lt;/code&gt; and the status line show) can &lt;strong&gt;under-report&lt;/strong&gt; cache-write cost: it prices the 1-hour-TTL writes it sends at the 5-minute &lt;strong&gt;1.25×&lt;/strong&gt; rate. Anthropic's published rate for a 1-hour-TTL write is &lt;strong&gt;2×&lt;/strong&gt; base input — the rate this guide uses everywhere. So expect your displayed session cost to read &lt;em&gt;lower&lt;/em&gt; than the figures here. Treat the Anthropic Console as authoritative for actual billing. This gap is itself one of the mistakes in Act IV.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;The entire cost model falls out of one architectural fact. Get this fact and the three cost buckets, and you can already predict most of your bill.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one fact: Claude Code is a stateless API client
&lt;/h2&gt;

&lt;p&gt;Claude Code is a &lt;em&gt;client&lt;/em&gt; of Anthropic's &lt;code&gt;/v1/messages&lt;/code&gt; HTTP API. It holds no privileged channel to the model and no special server-side session. Every turn is &lt;strong&gt;one complete HTTP request&lt;/strong&gt; that carries the entire context the model will see: the tool definitions, the system prompt, and every message in the conversation so far. The server generates a response and then &lt;strong&gt;forgets everything&lt;/strong&gt; — no session, no memory, no "conversation" object. The next turn re-sends the whole thing again, one turn longer.&lt;/p&gt;

&lt;p&gt;That single fact — &lt;strong&gt;the request is the only state&lt;/strong&gt; — drives every cost lever in this guide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Because the client re-sends the full conversation each turn, &lt;strong&gt;cost grows with the conversation&lt;/strong&gt;: you pay to re-process the entire history on every turn.&lt;/li&gt;
&lt;li&gt;Because the server is stateless, the &lt;em&gt;only&lt;/em&gt; way to avoid re-processing that history is the &lt;strong&gt;prompt cache&lt;/strong&gt; (the rest of Act I).&lt;/li&gt;
&lt;li&gt;Because the request is co-designed for one target model, &lt;strong&gt;switching models mid-conversation throws the cache away&lt;/strong&gt; and re-bills accumulated reasoning (Act II).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Mental model 1: the amnesiac contractor
&lt;/h3&gt;

&lt;p&gt;Picture a brilliant contractor with no long-term memory. Every time you consult them, you must hand over a complete dossier: the tools they're allowed to use, their standing instructions, and the entire history of the project so far. They read it, give you excellent advice, and then forget the entire engagement the instant you leave. Next time, you bring the same dossier plus today's new page.&lt;/p&gt;

&lt;p&gt;That is exactly the relationship between Claude Code and the API. Everything expensive about Claude Code follows from "the dossier gets re-read, in full, every single time."&lt;/p&gt;

&lt;h3&gt;
  
  
  The proof: the server recalls nothing &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Two requests, same session design, isolating whether the server remembers a fact across turns:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;request sent&lt;/th&gt;
&lt;th&gt;model's answer&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;"my secret is 42" + acknowledgement + "what's my secret?" (the fact is in the payload)&lt;/td&gt;
&lt;td&gt;"42"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;only "what's my secret?" (the prior turn omitted from the payload)&lt;/td&gt;
&lt;td&gt;"I don't have a secret number stored in memory…"&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The model knows only what the &lt;em&gt;current request&lt;/em&gt; carries. When the prior turn is omitted from the request body, the secret is gone — there is no session to recall it from. The "memory" you experience in a chat is an illusion maintained entirely by the client re-sending history.&lt;/p&gt;

&lt;h3&gt;
  
  
  Thinking is physically resent, too &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;It isn't just visible messages. A captured Opus continuation shows message &lt;code&gt;[17]&lt;/code&gt; as &lt;code&gt;['thinking(sig=yes, len=0)', 'text', 'tool_use(Grep)']&lt;/code&gt;, followed by &lt;code&gt;[18]&lt;/code&gt; as a &lt;code&gt;tool_result&lt;/code&gt;. The thinking block is &lt;em&gt;in the request body&lt;/em&gt; — the client is sending the model's own prior reasoning back to it. (What those blocks contain, how they're billed, and what happens to them on a model switch is the whole of Act II's second half. For now: they are part of the dossier, and parts of the dossier cost money.)&lt;/p&gt;

&lt;p&gt;A subtlety worth internalizing early: &lt;strong&gt;resending thinking is not strictly mandatory in this configuration.&lt;/strong&gt; Replaying the captured continuation with thinking blocks (a) kept, (b) stripped from the last assistant turn, and (c) stripped everywhere all returned HTTP 200 &lt;strong&gt;[measured]&lt;/strong&gt;. The server doesn't error and doesn't substitute a remembered copy — because it has none. Resending thinking is how the &lt;em&gt;client&lt;/em&gt; gives the model reasoning continuity; it is not how the server maintains state. The server maintains no state.&lt;/p&gt;

&lt;h2&gt;
  
  
  The render order: stable content first, volatile content last
&lt;/h2&gt;

&lt;p&gt;Every request is assembled in this order:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;tools  →  system  →  messages
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Tool definitions sit at byte 0, the system prompt next, and the conversation last. This isn't cosmetic. It is the single most important &lt;em&gt;layout&lt;/em&gt; decision for cost, and it follows one rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Stable content first, volatile content last.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;You'll see why this ordering is load-bearing the moment we get to caching: the cache is a &lt;em&gt;prefix&lt;/em&gt; match, so the things that change least must come first, or they'll keep getting invalidated by the things that change most.&lt;/p&gt;

&lt;p&gt;One more piece of anatomy, because it decides cache behavior later. The &lt;code&gt;messages&lt;/code&gt; tier is an ordered list of &lt;strong&gt;messages&lt;/strong&gt; — conversation turns, each with a &lt;code&gt;role&lt;/code&gt; and some content — and each message's content is itself an ordered list of typed &lt;strong&gt;content blocks&lt;/strong&gt;: a text block, a &lt;code&gt;tool_use&lt;/code&gt; block (the model calling a tool), a &lt;code&gt;tool_result&lt;/code&gt; block (your answer to it), a thinking block, an image. &lt;strong&gt;One message can hold many blocks&lt;/strong&gt; — an assistant turn that fires ten parallel tools is a &lt;em&gt;single&lt;/em&gt; message but twenty-plus blocks. Hold onto the message-vs-block distinction: the cache's backward re-link counts &lt;em&gt;blocks&lt;/em&gt;, not messages, which is exactly what a big tool burst trips (see &lt;em&gt;Agentic tool bursts overflow the 20-block lookback&lt;/em&gt;).&lt;/p&gt;

&lt;h2&gt;
  
  
  The three things you pay for
&lt;/h2&gt;

&lt;p&gt;Every request's &lt;code&gt;usage&lt;/code&gt; block breaks input into three buckets, plus output. Learn what each one &lt;em&gt;costs relative to base input&lt;/em&gt;, because that ratio is the whole game:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Bucket (&lt;code&gt;usage.*&lt;/code&gt;)&lt;/th&gt;
&lt;th&gt;Meaning&lt;/th&gt;
&lt;th&gt;Price vs. base input&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cache_read_input_tokens&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;served from an existing cache entry&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;cache_creation_input_tokens&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;written to the cache this request&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;2×&lt;/strong&gt; (1-hour TTL)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;input_tokens&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;processed uncached, not cached&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;1×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;output_tokens&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;generated tokens&lt;/td&gt;
&lt;td&gt;the model's output rate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two facts to burn in:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A cache hit is ~10× cheaper than processing the same tokens cold&lt;/strong&gt; (0.1× vs 1×). This is the prize.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output is the most expensive class&lt;/strong&gt; — it's billed at &lt;strong&gt;5× the input price&lt;/strong&gt; on every model, and it is &lt;strong&gt;never cached.&lt;/strong&gt; Tokens the model &lt;em&gt;writes&lt;/em&gt; dominate generation-heavy turns.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The rate card (Anthropic published list prices, per 1M tokens, as of 2026-06-15)
&lt;/h3&gt;

&lt;p&gt;Re-verify at &lt;code&gt;claude.com/pricing&lt;/code&gt;. Cache read = 0.1× input; 1-hour cache write = 2× input; output = 5× input.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Input (1×)&lt;/th&gt;
&lt;th&gt;Cache read (0.1×)&lt;/th&gt;
&lt;th&gt;Cache write, 1h (2×)&lt;/th&gt;
&lt;th&gt;Output (5×)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fable 5&lt;/td&gt;
&lt;td&gt;$10.00&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;td&gt;$20.00&lt;/td&gt;
&lt;td&gt;$50.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Opus 4.8&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;td&gt;$0.50&lt;/td&gt;
&lt;td&gt;$10.00&lt;/td&gt;
&lt;td&gt;$25.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sonnet 4.6&lt;/td&gt;
&lt;td&gt;$3.00&lt;/td&gt;
&lt;td&gt;$0.30&lt;/td&gt;
&lt;td&gt;$6.00&lt;/td&gt;
&lt;td&gt;$15.00&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Haiku 4.5&lt;/td&gt;
&lt;td&gt;$1.00&lt;/td&gt;
&lt;td&gt;$0.10&lt;/td&gt;
&lt;td&gt;$2.00&lt;/td&gt;
&lt;td&gt;$5.00&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The ratios are clean and worth memorizing: &lt;strong&gt;Sonnet = 0.6× Opus, Haiku = 0.2× Opus, Sonnet = 3× Haiku.&lt;/strong&gt; They'll matter when we compare routing options.&lt;/p&gt;

&lt;h3&gt;
  
  
  Break-even: when does writing to the cache pay off?
&lt;/h3&gt;

&lt;p&gt;Caching is a trade: you pay a one-time &lt;strong&gt;expensive write&lt;/strong&gt; (2× the input price) so that later requests get &lt;strong&gt;cheap reads&lt;/strong&gt; (0.1×). Whether that trade wins depends on how many times you'll re-read the cached prefix.&lt;/p&gt;

&lt;p&gt;Process the same prefix over &lt;code&gt;N&lt;/code&gt; requests, two ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No cache:&lt;/strong&gt; every request pays full input price → total &lt;strong&gt;&lt;code&gt;N × 1×&lt;/code&gt;&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With cache:&lt;/strong&gt; the first request &lt;em&gt;writes&lt;/em&gt; (2×); each later request &lt;em&gt;reads&lt;/em&gt; (0.1×) → total &lt;strong&gt;&lt;code&gt;2× + (N−1) × 0.1×&lt;/code&gt;&lt;/strong&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Caching comes out ahead once &lt;code&gt;2 + 0.1(N−1) &amp;lt; N&lt;/code&gt; — i.e. &lt;strong&gt;from the 3rd request onward.&lt;/strong&gt; Worked out:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Over 3 requests: cache costs &lt;code&gt;2 + 0.1 + 0.1 = 2.2×&lt;/code&gt; vs &lt;code&gt;3×&lt;/code&gt; uncached. ✅ cache wins.&lt;/li&gt;
&lt;li&gt;Over 10 requests: &lt;code&gt;2.9×&lt;/code&gt; vs &lt;code&gt;10×&lt;/code&gt;. ✅ cache wins comfortably.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why "3rd request" and not "2nd"?&lt;/strong&gt; The TTL sets the write price, and this is exactly where Claude Code differs from the raw API. By default Anthropic uses a &lt;strong&gt;5-minute&lt;/strong&gt; cache TTL, where a write costs only &lt;strong&gt;1.25×&lt;/strong&gt; — so caching breaks even one request sooner, on the &lt;strong&gt;2nd&lt;/strong&gt;. But the Claude Code client &lt;strong&gt;overrides that default and requests the 1-hour TTL&lt;/strong&gt; on every breakpoint (measured later in this act), where a write costs &lt;strong&gt;2×&lt;/strong&gt; — pushing break-even to the &lt;strong&gt;3rd&lt;/strong&gt; request. This guide uses 2× / 1-hour throughout because that's what Claude Code actually sends.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — paying for a cache you never read back.&lt;/strong&gt; A &lt;em&gt;wasted write&lt;/em&gt; (content cached but never read again) costs &lt;strong&gt;2×&lt;/strong&gt; — &lt;em&gt;double&lt;/em&gt; what you'd have paid had you never cached it. Caching is not free insurance; it's a bet that you'll re-read the prefix at least three times. A short, one-shot interaction that ends after one or two turns can be &lt;em&gt;cheaper&lt;/em&gt; without caching.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Let Claude Code's defaults stand for interactive sessions (they will be re-read many times). Only worry about this in custom harnesses that cache aggressively but terminate early.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;One more corollary you'll lean on constantly: &lt;strong&gt;every read also refreshes the TTL.&lt;/strong&gt; A continuously-reused prefix never expires.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to do (Act I so far):&lt;/strong&gt; Internalize that cost = (re-processing history) + (output you generate). The cache is the only tool against the first term, and it's a strict prefix match — which is the next mental model.&lt;/p&gt;

&lt;h2&gt;
  
  
  Mental model 2: the prefix-match cache
&lt;/h2&gt;

&lt;p&gt;This is the model that, once you have it, makes cache behavior obvious instead of mysterious — and it starts one level down, in how the model reads your prompt at all.&lt;/p&gt;

&lt;h3&gt;
  
  
  What attention actually computes
&lt;/h3&gt;

&lt;p&gt;The model itself — a &lt;strong&gt;transformer&lt;/strong&gt; (the neural-network architecture every modern LLM, Claude included, is built on) — reads your prompt token by token, left to right. A &lt;strong&gt;token&lt;/strong&gt; is roughly a word or a word-piece. For every token the model computes three vectors — a &lt;strong&gt;query&lt;/strong&gt;, a &lt;strong&gt;key&lt;/strong&gt;, and a &lt;strong&gt;value&lt;/strong&gt; — and each token &lt;em&gt;attends&lt;/em&gt; to all the tokens before it: its result is a blend of those earlier tokens' values, weighted by how well its query matches their keys.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;In plain terms:&lt;/strong&gt; to handle each word, the model glances back over everything before it and weighs which earlier words matter — the way you look back to figure out what "it" refers to in &lt;em&gt;"I poured water into the glass until it was full."&lt;/em&gt; It does that for every token, against every earlier token at once.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two facts fall straight out of this, and together they explain the entire cache:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;The keys and values for the prefix are the expensive artifact.&lt;/strong&gt; "Processing the input" is mostly computing, for every token, its key/value vectors and its attention over all earlier tokens. The longer the prefix, the more of that work — and because the client re-sends the whole conversation each turn (Mental model 1), a naive server would redo all of it on every turn.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attention is causal: token N depends only on tokens 0…N.&lt;/strong&gt; A token's key/value never depend on anything that comes &lt;em&gt;after&lt;/em&gt; it, so the computed state for a stable prefix is identical no matter what you later append.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What the cache stores, and what it saves
&lt;/h3&gt;

&lt;p&gt;The prompt cache stores exactly those computed &lt;strong&gt;key/value vectors — the "KV cache" — for the prefix tokens&lt;/strong&gt;, keyed to the exact tokens that produced them. On a hit, the server &lt;strong&gt;loads that saved attention state instead of recomputing it&lt;/strong&gt; and starts real work only at the first uncached token. Every pricing rule in this guide follows from that one move:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;strong&gt;cache read is ~10× cheaper than a cold pass&lt;/strong&gt; (0.1× vs 1×): you pay to &lt;em&gt;load&lt;/em&gt; precomputed attention state, not to recompute it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Only a contiguous prefix from token 0 can be cached&lt;/strong&gt; — token N's state depends on every token before it, so cacheable spans grow from the start, never from the middle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Output is never cached at generation:&lt;/strong&gt; each output token's KV is computed once while decoding and discarded — billed purely as &lt;code&gt;output_tokens&lt;/code&gt;, absent from &lt;em&gt;that&lt;/em&gt; turn's &lt;code&gt;cache_creation&lt;/code&gt;. The output &lt;em&gt;text&lt;/em&gt;, though, doesn't vanish: next turn it's re-sent in the history and cache-written as input &lt;strong&gt;exactly once&lt;/strong&gt; (its token count shows up in that turn's &lt;code&gt;cache_creation&lt;/code&gt;), then read warm thereafter. So a generated span is paid once as output, once more as a single cache write next turn, then cheap reads — there's no cached output-KV to reuse, only the re-encoded text. (Measured: a 440-token turn-1 output had &lt;code&gt;cache_creation&lt;/code&gt; 0 that turn, then appeared as ~450 of turn-2's &lt;code&gt;cache_creation&lt;/code&gt;, then rode along in turn-3's warm read. That's why output is its own, uncacheable-at-generation cost line back in &lt;em&gt;The three things you pay for&lt;/em&gt;.) &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;And the headline invariant is just causality restated:&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Prompt caching is a strict prefix match.&lt;/strong&gt; A cache entry is keyed on the exact tokens from position 0 up to a cut point. &lt;strong&gt;Change one token at position N and every cached state at position ≥ N is invalid&lt;/strong&gt; — because each of those later key/value vectors was computed &lt;em&gt;attending to&lt;/em&gt; the token you changed.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The cache is therefore &lt;strong&gt;content-addressed&lt;/strong&gt;, not position-addressed: the API re-hashes the leading tokens each request and looks the hash up. Identical leading bytes → hit. This is why the render order (&lt;code&gt;tools → system → messages&lt;/code&gt;) is load-bearing — put the bytes that never change at the front, and the model &lt;strong&gt;reloads&lt;/strong&gt; the prefix's attention state instead of recomputing it.&lt;/p&gt;

&lt;p&gt;The cut point itself is a &lt;strong&gt;cache breakpoint&lt;/strong&gt; — a &lt;code&gt;cache_control&lt;/code&gt; marker on a block, meaning "cache everything from the start up to here." In Claude Code the client places these for you; the rest of this section is what they do.&lt;/p&gt;

&lt;h3&gt;
  
  
  The sliding window and delta-only writes
&lt;/h3&gt;

&lt;p&gt;Here's the elegant part. The &lt;em&gt;last&lt;/em&gt; breakpoint slides forward to the newest turn on each request (Claude Code does this automatically; on the raw API you do it yourself). Then three things happen together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;No invalidation.&lt;/strong&gt; A new turn &lt;em&gt;reads&lt;/em&gt; the unchanged earlier entries and &lt;em&gt;writes&lt;/em&gt; a longer one. Cache entries are immutable; stale ones simply age out, they aren't purged.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You pay the 2× write premium only on the delta&lt;/strong&gt; — the new tokens since the last boundary. The big system prompt is written exactly &lt;strong&gt;once.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The TTL refreshes on every read,&lt;/strong&gt; so a continuously-used prefix never expires.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;write cost per turn ≈ (new tokens since last boundary) × 2× rate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The exception is the whole story of cache-busting: &lt;strong&gt;a byte change &lt;em&gt;inside&lt;/em&gt; the prefix&lt;/strong&gt; misses at that point, and the next write must span from the change forward — now 2× as costly to rebuild as the read it replaced.&lt;/p&gt;

&lt;p&gt;The slide also has a &lt;strong&gt;reach limit&lt;/strong&gt;: a breakpoint only re-links to a cache entry within &lt;strong&gt;~20 content blocks&lt;/strong&gt; of it. So a single turn that appends more than ~20 blocks — a large agentic tool burst — breaks the chain and forces a cold rewrite even when nothing else changed. That failure mode, and the proxy-side fix, are in &lt;em&gt;Agentic tool bursts overflow the 20-block lookback&lt;/em&gt; (under &lt;em&gt;What busts the cache&lt;/em&gt;).&lt;/p&gt;

&lt;h3&gt;
  
  
  The invalidation hierarchy
&lt;/h3&gt;

&lt;p&gt;Memorize which changes survive (✅) and which invalidate (❌) each tier:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Change&lt;/th&gt;
&lt;th&gt;Tools&lt;/th&gt;
&lt;th&gt;System&lt;/th&gt;
&lt;th&gt;Messages&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool definitions (add/remove/reorder)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Model switch&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;speed&lt;/code&gt; / web-search / citations toggle&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;System prompt content&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;code&gt;tool_choice&lt;/code&gt; / images / thinking toggle&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Message content&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Source: &lt;a href="https://platform.claude.com/docs/en/build-with-claude/prompt-caching" rel="noopener noreferrer"&gt;Anthropic — Prompt caching&lt;/a&gt; (cache-tiers / what-invalidates section).&lt;/p&gt;

&lt;p&gt;The two rows that force a &lt;strong&gt;full rebuild&lt;/strong&gt; — touching every tier — are the expensive ones: &lt;strong&gt;tool-definition changes&lt;/strong&gt; and &lt;strong&gt;model switches.&lt;/strong&gt; Because both sit at or before byte 0 of what's cached, they re-key everything, and at the 2× write rate that rebuild is twice as painful as a read.&lt;/p&gt;

&lt;h3&gt;
  
  
  Gotchas — in Claude Code
&lt;/h3&gt;

&lt;p&gt;Claude Code places the cache breakpoints for you — three of them, 1-hour TTL (proven in &lt;em&gt;Claude Code's actual caching&lt;/em&gt;) — so you never write &lt;code&gt;cache_control&lt;/code&gt; yourself. Your cache lever isn't &lt;em&gt;placing&lt;/em&gt; breakpoints; it's &lt;strong&gt;not disturbing the prefix the client already cached.&lt;/strong&gt; The ways a real session loses it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tool-set churn at byte 0.&lt;/strong&gt; Adding or removing an MCP server — or a server that mutates its tools at runtime (dynamic toolsets), or async connectors that register after startup — changes the tool definitions at byte 0 and re-keys the entire prefix. Stabilize the tool surface first; full treatment in &lt;em&gt;What busts the cache&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Switching models.&lt;/strong&gt; The cache is model-scoped, so any model change is a full cold rebuild — the whole of Act II.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long agentic tool bursts.&lt;/strong&gt; A single turn that emits more than ~20 tool-call/result blocks overflows the API's &lt;strong&gt;20-block lookback window&lt;/strong&gt;, so the next turn can't re-link to the prior entry and cold-writes the gap. You can't insert your own breakpoints to repair this from inside Claude Code, so the mitigation is to keep individual tool bursts bounded — full treatment, with the exact re-link rule and a proxy-side fix, in &lt;em&gt;What busts the cache&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Editing CLAUDE.md (or a memory file it imports) — free while a session runs; you pay a one-time re-key only on the &lt;em&gt;next&lt;/em&gt; start or resume, and only if the file actually changed.&lt;/strong&gt; "This" means the project-context layer Claude Code assembles for you: your &lt;strong&gt;CLAUDE.md files&lt;/strong&gt; (enterprise, user, project), the &lt;strong&gt;rules and memory files they &lt;code&gt;@import&lt;/code&gt;&lt;/strong&gt;, and the &lt;strong&gt;auto-memory store&lt;/strong&gt;. (It is &lt;em&gt;not&lt;/em&gt; your &lt;strong&gt;system prompt&lt;/strong&gt; — &lt;code&gt;--append-system-prompt&lt;/code&gt;, output styles, and the like sit in the &lt;strong&gt;system tier&lt;/strong&gt; at the front of the prefix, a separate and costlier case under &lt;em&gt;Tool-set churn&lt;/em&gt; / &lt;em&gt;What busts the cache&lt;/em&gt; — and &lt;em&gt;not&lt;/em&gt; the message you type, which is the live turn.) Claude Code reads this layer from disk &lt;strong&gt;once, at session start&lt;/strong&gt;, into a single &lt;code&gt;&amp;lt;system-reminder&amp;gt;&lt;/code&gt; in &lt;code&gt;messages[0]&lt;/code&gt; (the first user turn), placed &lt;em&gt;after&lt;/em&gt; the system breakpoints — so editing it can never disturb the expensive tools+system prefix (it is never the worst class). The process then reuses that start-of-session snapshot for its whole life, so the only question is &lt;strong&gt;when the snapshot is rebuilt from disk&lt;/strong&gt;:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;While running → free.&lt;/strong&gt; The live process never rewrites its &lt;code&gt;messages[0]&lt;/code&gt; snapshot, so the warm prefix is safe no matter who edits the file. Headless &lt;code&gt;-p&lt;/code&gt; ignores the edit until restart. Interactive surfaces it cheaply &lt;em&gt;at the tail&lt;/em&gt;: an &lt;strong&gt;out-of-band&lt;/strong&gt; edit (you, a linter, another process) appends a one-shot "CLAUDE.md was modified…" reminder (~155 tokens — the same append-don't-mutate trick as the date); an edit &lt;strong&gt;Claude makes itself&lt;/strong&gt; rides in as its Edit tool result, with no extra reminder. Either way &lt;code&gt;messages[0]&lt;/code&gt; stays byte-identical.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On &lt;code&gt;--continue&lt;/code&gt; or a fresh start → the bill lands, but only if the file changed.&lt;/strong&gt; A new process rebuilds &lt;code&gt;messages[0]&lt;/code&gt; by re-reading CLAUDE.md (and its &lt;code&gt;@import&lt;/code&gt;s) from disk. &lt;strong&gt;If the content changed&lt;/strong&gt; (whoever changed it), &lt;code&gt;cache_read&lt;/code&gt; collapses to the system tier and the &lt;strong&gt;whole message tier&lt;/strong&gt; cold-writes — measured &lt;code&gt;cache_read&lt;/code&gt; 30,216→27,975, &lt;code&gt;cache_creation&lt;/code&gt; 24→2,265 vs an unedited control; an &lt;code&gt;@import&lt;/code&gt;ed file re-keys identically (45→2,301). &lt;strong&gt;If it didn't change&lt;/strong&gt;, the resume is fully warm — &lt;code&gt;messages[0]&lt;/code&gt; comes back byte-identical and only the sliding tail re-keys (~17–21 tokens). So the re-key is the &lt;strong&gt;edit's&lt;/strong&gt; cost, not resume's. (A fresh session pays the same cold write but has no warm tier to lose.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Don't confuse this with system-prompt drift.&lt;/strong&gt; Any cross-process resume can &lt;em&gt;separately&lt;/em&gt; eat a bigger, occasional miss: the regenerated system prompt differs by ~6–17 chars and cold-writes the &lt;em&gt;entire&lt;/em&gt; prefix. That's process nondeterminism, independent of any CLAUDE.md edit — tell them apart by &lt;code&gt;system[2]&lt;/code&gt;'s hash (a CLAUDE.md re-key leaves &lt;code&gt;system[2]&lt;/code&gt; identical and flips only &lt;code&gt;messages[0]&lt;/code&gt;). &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adding a skill vs. a plugin/MCP server — opposite ends of the cost scale.&lt;/strong&gt; The split comes from &lt;em&gt;where each lands in the prefix&lt;/em&gt;: a skill's name+description goes into &lt;code&gt;messages[0]&lt;/code&gt; (same tier as CLAUDE.md, after the system breakpoints); a plugin/MCP server's &lt;strong&gt;tools&lt;/strong&gt; go to &lt;strong&gt;byte 0&lt;/strong&gt;, ahead of everything cached. That placement is the whole story:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Markdown skill → cheap, and &lt;code&gt;--continue&lt;/code&gt;-invisible.&lt;/strong&gt; Added mid-session, the interactive client shows it immediately as a one-shot tail &lt;strong&gt;append&lt;/strong&gt; (the full skills list is prepended to the current turn, ~1,016 tokens once); headless &lt;code&gt;-p&lt;/code&gt; ignores it. It's baked into the canonical &lt;code&gt;messages[0]&lt;/code&gt; only at the next &lt;strong&gt;fresh&lt;/strong&gt; start. Notably, &lt;strong&gt;&lt;code&gt;--continue&lt;/code&gt; does &lt;em&gt;not&lt;/em&gt; re-scan skills&lt;/strong&gt; — a resumed session never sees a newly-added skill and never re-keys for it. (This is the resume asymmetry from the bullet above: &lt;code&gt;--continue&lt;/code&gt; re-reads CLAUDE.md but &lt;em&gt;replays&lt;/em&gt; the stale skills list.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Plugin/MCP server → the expensive one.&lt;/strong&gt; Its tools sit at &lt;strong&gt;byte 0&lt;/strong&gt;, so changing the connected tool set re-keys the &lt;em&gt;entire&lt;/em&gt; prefix — system &lt;strong&gt;and&lt;/strong&gt; message tiers (measured: one added tool collapsed &lt;code&gt;cache_read&lt;/code&gt; 29,424→0 and cold-wrote all 29,505 tokens). A connected server's own tool change (e.g. a dynamic &lt;code&gt;tools/list_changed&lt;/code&gt;) applies &lt;strong&gt;live&lt;/strong&gt;; adding a brand-new server via config is &lt;strong&gt;restart-gated&lt;/strong&gt;. &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The full catalogue of what busts the prefix — silent invalidators, dynamic MCP tools, injected date/git/system-reminders — is the next section.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Note — extra gotchas if you use the Claude API directly.&lt;/strong&gt; Everything above assumes the Claude Code client, which manages caching for you. If you assemble &lt;code&gt;/v1/messages&lt;/code&gt; requests yourself, you also own the things Claude Code quietly handles:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You must add &lt;code&gt;cache_control&lt;/code&gt; yourself.&lt;/strong&gt; With no breakpoint, &lt;em&gt;nothing&lt;/em&gt; is cached. There's a &lt;strong&gt;maximum of 4&lt;/strong&gt; per request; place them at stability boundaries — &lt;code&gt;[tools + system]&lt;/code&gt; (frozen) / &lt;code&gt;[project context / CLAUDE.md]&lt;/code&gt; (per task) / &lt;code&gt;[conversation]&lt;/code&gt; (the sliding tail) — and the API reads the longest matching prefix, reprocessing only what follows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimum cacheable prefix ~1,024–4,096 tokens&lt;/strong&gt; (model-dependent). Below it the marker silently no-ops: &lt;code&gt;cache_creation_input_tokens&lt;/code&gt; comes back &lt;code&gt;0&lt;/code&gt; and nothing is cached, even though it looks enabled.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The default TTL is 5 minutes&lt;/strong&gt; (write ≈ 1.25× input); you must explicitly request the &lt;strong&gt;1-hour&lt;/strong&gt; TTL (write ≈ 2×) that Claude Code always sends. Choose by how long you'll keep reusing the prefix.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ordering is on you:&lt;/strong&gt; emit &lt;code&gt;tools → system → messages&lt;/code&gt;, stable content first, volatile last.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Serialization drift busts the cache silently:&lt;/strong&gt; re-serializing JSON with different key order, separators, or escaping — or interpolating a timestamp, UUID, or request-ID early in the prompt — changes the bytes even when the meaning doesn't, and the KV states no longer match. Keep the cached prefix byte-identical and push volatile tokens after the last breakpoint.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;You slide the tail breakpoint yourself&lt;/strong&gt; to the newest turn each request to get delta-only writes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Verify any of it with the &lt;code&gt;usage&lt;/code&gt; block: a non-zero &lt;code&gt;cache_creation_input_tokens&lt;/code&gt; on the first call, then a non-zero &lt;code&gt;cache_read_input_tokens&lt;/code&gt; on the next. If both stay near zero across requests that share a prefix, caching isn't engaging.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  What busts the cache
&lt;/h2&gt;

&lt;p&gt;This is where understanding turns directly into money. Everything below changes the prefix bytes, so the next request re-keys and &lt;strong&gt;rewrites at 2×&lt;/strong&gt; instead of reading at 0.1×. These are the ways it happens &lt;strong&gt;in a real Claude Code session&lt;/strong&gt;; a closing &lt;strong&gt;FYI&lt;/strong&gt; covers hazards that only apply if you drive the raw API or bolt your own content onto Claude Code.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dynamic MCP tools
&lt;/h3&gt;

&lt;p&gt;MCP servers can change their advertised tools at runtime by sending &lt;code&gt;notifications/tools/list_changed&lt;/code&gt;. When they do, the new tool definition lands at &lt;strong&gt;byte 0&lt;/strong&gt; (tools are first in the render order) → the entire prefix re-keys → a full cold rebuild, now at 2× the write rate.&lt;/p&gt;

&lt;p&gt;It helps to keep three distinct things straight, because only the first one lives at byte 0:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Thing&lt;/th&gt;
&lt;th&gt;Lives in&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Tool &lt;em&gt;definitions&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;tools&lt;/code&gt; param (byte 0) — changing these is catastrophic&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool &lt;em&gt;calls&lt;/em&gt; (&lt;code&gt;tool_use&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;assistant turn content (messages) — late, cheap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tool &lt;em&gt;results&lt;/em&gt; (&lt;code&gt;tool_result&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;user turn content (messages) — late, cheap&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The async-connector blow-up, caught live &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;In one real session the tool count grew &lt;strong&gt;30 → 55 → 85&lt;/strong&gt; across consecutive turns with &lt;em&gt;no model switch&lt;/em&gt;, busting the cache on every turn. The cause: four claude.ai MCP connectors (Zoom, Atlassian Rovo, Microsoft 365 [unauthenticated], Slack) — &lt;strong&gt;remote servers connecting asynchronously at startup.&lt;/strong&gt; Each request snapshots whatever tools have registered &lt;em&gt;so far&lt;/em&gt;, so the byte-0 tool list kept growing as connectors came online mid-session. Pinning the config with &lt;code&gt;--strict-mcp-config&lt;/code&gt; froze it at 30.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — optimizing anything before the tool surface is stable.&lt;/strong&gt; If tools are still registering across your first few turns, every "optimization" you measure is noise on top of a cold rebuild.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Stabilize the tool/MCP surface &lt;em&gt;first&lt;/em&gt;. Use &lt;code&gt;--strict-mcp-config&lt;/code&gt; (or a frozen, pinned server set) so byte 0 is constant from turn 1. At a 2× write rate, every cold turn is doubly expensive — this is the highest-leverage fix in the guide.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Injected context Claude Code handles for you: date, git, system-reminders
&lt;/h3&gt;

&lt;p&gt;These are the things people &lt;em&gt;expect&lt;/em&gt; to bust the cache — but Claude Code places them so they mostly don't:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Date&lt;/strong&gt; — lives in a system-reminder &lt;em&gt;after&lt;/em&gt; the system breakpoints, and a mid-session midnight rollover is &lt;strong&gt;effectively free&lt;/strong&gt;: the harness doesn't rewrite the date in &lt;code&gt;messages[0]&lt;/code&gt;, it &lt;em&gt;appends&lt;/em&gt; a &lt;code&gt;"The date has changed"&lt;/code&gt; reminder to the newest turn (measured: &lt;code&gt;cache_read&lt;/code&gt; held ~30K, only ~58 tokens written). It re-keys nothing but the sliding tail. &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Git status&lt;/strong&gt; — &lt;strong&gt;safe&lt;/strong&gt;: Claude Code snapshots it &lt;strong&gt;once at session start.&lt;/strong&gt; (Re-running &lt;code&gt;git status&lt;/code&gt; every turn and injecting it early would bust constantly — a cautionary design lesson.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;&amp;lt;system-reminder&amp;gt;&lt;/code&gt; blocks&lt;/strong&gt; — &lt;strong&gt;cheap&lt;/strong&gt; when appended to the &lt;em&gt;newest&lt;/em&gt; turn; only expensive if per-turn-varying content is placed early, re-stamped onto an already-cached turn, or stripped on replay (each changes a cached turn's bytes).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Case study: the GitHub MCP server's dynamic toolsets &lt;strong&gt;[measured against source]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This case is worth studying because the most popular official platform MCP server made exactly this mistake — and then &lt;em&gt;deleted the feature.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Verified against &lt;code&gt;github/github-mcp-server&lt;/code&gt; (Go, MIT) at tag &lt;strong&gt;&lt;code&gt;v1.0.5&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;1d17d33&lt;/code&gt;): the README's "Dynamic Tool Discovery," &lt;code&gt;pkg/github/dynamic_tools.go&lt;/code&gt;, and the bundled MCP Go SDK (&lt;code&gt;modelcontextprotocol/go-sdk v1.6.1&lt;/code&gt;).&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Mode&lt;/th&gt;
&lt;th&gt;Tools&lt;/th&gt;
&lt;th&gt;Mutates byte 0 at runtime?&lt;/th&gt;
&lt;th&gt;Cache-safe?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;default (no flags)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;default&lt;/code&gt; toolset, fixed at startup (~50 tools)&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--toolsets repos,issues,…&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;explicit groups, fixed at startup&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;--dynamic-toolsets&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3 meta-tools; starts ~empty, grows on demand&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Yes&lt;/strong&gt; (&lt;code&gt;enable_toolset&lt;/code&gt; → &lt;code&gt;AddTool&lt;/code&gt; → &lt;code&gt;tools/list_changed&lt;/code&gt;)&lt;/td&gt;
&lt;td&gt;❌&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The default and explicit &lt;code&gt;--toolsets&lt;/code&gt; paths resolve the tool set &lt;strong&gt;once at startup&lt;/strong&gt; (&lt;code&gt;ResolvedEnabledToolsets&lt;/code&gt;) and never touch it again — a frozen prefix, cache-safe. The &lt;code&gt;--dynamic-toolsets&lt;/code&gt; mode (off by default; env &lt;code&gt;GITHUB_DYNAMIC_TOOLSETS&lt;/code&gt;) instead starts nearly empty and exposes three meta-tools — &lt;code&gt;list_available_toolsets&lt;/code&gt;, &lt;code&gt;get_toolset_tools&lt;/code&gt;, &lt;code&gt;enable_toolset&lt;/code&gt; — so the model turns groups on as it needs them.&lt;/p&gt;

&lt;p&gt;That convenience is a cache-buster. When the model calls &lt;strong&gt;&lt;code&gt;enable_toolset&lt;/code&gt;&lt;/strong&gt;, the handler registers the group's tools against the &lt;strong&gt;live, already-connected&lt;/strong&gt; server (&lt;code&gt;EnableToolset&lt;/code&gt; → &lt;code&gt;RegisterFunc&lt;/code&gt; → &lt;code&gt;s.AddTool&lt;/code&gt;), and &lt;code&gt;AddTool&lt;/code&gt; fires &lt;code&gt;notifications/tools/list_changed&lt;/code&gt;. New tool definitions land at byte 0 → the whole prompt prefix re-keys → a full cold rebuild (2× write) on the next turn, paid again on &lt;strong&gt;every&lt;/strong&gt; &lt;code&gt;enable_toolset&lt;/code&gt; call.&lt;/p&gt;

&lt;p&gt;An epilogue, read honestly: &lt;strong&gt;GitHub removed the dynamic-toolsets feature entirely&lt;/strong&gt; in v1.1.0 (PR #2512, commit &lt;code&gt;0f0506d&lt;/code&gt;, 2026-05-20) — absent from every release since, including v1.3.0. But the maintainers do &lt;strong&gt;not&lt;/strong&gt; cite caching or token cost. Their stated reasons are &lt;em&gt;tech-debt cleanup&lt;/em&gt; — the dynamic path "carried real complexity: a separate config flag plumbed through stdio + http configs, a parallel registration path, four inventory methods … three meta-tools" — and that it was "a path no longer in active use," superseded by &lt;strong&gt;client-side&lt;/strong&gt; progressive discovery (the removal author explicitly names "Anthropic's Tool Search Tool, OpenAI's equivalent, 'Code Mode' patterns"). So this is &lt;em&gt;consistent with&lt;/em&gt; the frozen-prefix principle, &lt;strong&gt;not proof&lt;/strong&gt; that cache cost drove the deletion. The cache-busting cost itself is documented first-party — by &lt;strong&gt;Anthropic&lt;/strong&gt;, not GitHub: &lt;a href="https://platform.claude.com/docs/en/build-with-claude/prompt-caching" rel="noopener noreferrer"&gt;Prompt caching&lt;/a&gt; states "Modifying tool definitions … invalidates the entire cache." And tellingly, the replacement paradigm — tool search — is &lt;em&gt;cache-preserving by design&lt;/em&gt;: it &lt;strong&gt;appends&lt;/strong&gt; tool schemas inline rather than swapping the prefix, so "the prefix is untouched, so prompt caching is preserved" (&lt;a href="https://platform.claude.com/docs/en/agents-and-tools/tool-use/tool-search-tool" rel="noopener noreferrer"&gt;tool search docs&lt;/a&gt;). &lt;strong&gt;[removal + stated reasons verified via PR #2512; the cache motive is our analysis, not GitHub's]&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Live companion — Docker MCP Gateway, the same pattern shipped &lt;em&gt;default-on&lt;/em&gt;. [measured against source]&lt;/strong&gt; Where GitHub &lt;em&gt;removed&lt;/em&gt; its dynamic path, &lt;strong&gt;Docker MCP Gateway&lt;/strong&gt; (&lt;code&gt;docker/mcp-gateway&lt;/code&gt;, Go, MIT, ~1.5k★) ships it on by default. Verified at tag &lt;strong&gt;&lt;code&gt;v0.43.0&lt;/code&gt;&lt;/strong&gt; (&lt;code&gt;4833d8c&lt;/code&gt;): the &lt;code&gt;dynamic-tools&lt;/code&gt; feature is default-enabled (&lt;code&gt;cmd/docker-mcp/commands/feature.go&lt;/code&gt; → &lt;code&gt;defaultEnabledFeatures{"dynamic-tools": true}&lt;/code&gt;; the launch blog confirms the meta-tools are "available to your agent by default"), exposing &lt;code&gt;mcp-find&lt;/code&gt; / &lt;code&gt;mcp-add&lt;/code&gt; / &lt;code&gt;mcp-remove&lt;/code&gt; / &lt;code&gt;mcp-config-set&lt;/code&gt; / &lt;code&gt;code-mode&lt;/code&gt; that add and remove tools on the &lt;strong&gt;live, already-connected&lt;/strong&gt; server mid-session. The mutation hits byte 0 through the &lt;em&gt;identical sink&lt;/em&gt; as the GitHub case: &lt;code&gt;pkg/gateway/reload.go&lt;/code&gt; (&lt;code&gt;reloadConfiguration&lt;/code&gt;) calls &lt;code&gt;mcpServer.RemoveTools(…)&lt;/code&gt; + &lt;code&gt;AddTool(…)&lt;/code&gt;, which in the MCP Go SDK (&lt;code&gt;modelcontextprotocol/go-sdk v1.4.1&lt;/code&gt;) bottom out in &lt;code&gt;changeAndNotify(notificationToolListChanged, …)&lt;/code&gt; → &lt;code&gt;notifications/tools/list_changed&lt;/code&gt; — and Docker's own &lt;code&gt;TestIntegrationToolListChangeNotifications&lt;/code&gt; asserts it fires on a live add. So every &lt;code&gt;mcp-add&lt;/code&gt; / &lt;code&gt;mcp-remove&lt;/code&gt; cold-rebuilds the whole prefix at the 2× write rate, on each call.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Motive, stated honestly:&lt;/strong&gt; Docker argues the feature on &lt;strong&gt;raw token volume&lt;/strong&gt;, never caching — its slide deck cites "335 tools =&amp;gt; 209K tokens of tool description / request … \$1.25/1M tokens" (&lt;code&gt;examples/tool_registrations/embeddings.md&lt;/code&gt;) and the &lt;a href="https://www.docker.com/blog/dynamic-mcps-stop-hardcoding-your-agents-world/" rel="noopener noreferrer"&gt;launch blog&lt;/a&gt; warns the context window can "accumulate hundreds of thousands of tokens of nothing but tool definition." The word "cache" appears nowhere in its blog or docs; the token-volume cost is Docker's claim, the &lt;strong&gt;prompt-cache&lt;/strong&gt; consequence (byte-0 definitions cold-rewrite the cached prefix) is ours. &lt;strong&gt;[runtime mutation + &lt;code&gt;list_changed&lt;/code&gt; verified at &lt;code&gt;v0.43.0&lt;/code&gt;/&lt;code&gt;4833d8c&lt;/code&gt;; cache framing is ours]&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — enabling dynamic/runtime tool discovery for the convenience.&lt;/strong&gt; Every "enable this toolset on demand" call cold-rebuilds your entire prompt prefix.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Use a &lt;strong&gt;frozen startup tool set&lt;/strong&gt; (default or explicit &lt;code&gt;--toolsets&lt;/code&gt;). If you genuinely need many tools, prefer a single fixed &lt;em&gt;dispatcher&lt;/em&gt; tool (select the operation via arguments) or tool-search that &lt;em&gt;appends&lt;/em&gt; schemas to the tail rather than mutating byte 0.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Agentic tool bursts overflow the 20-block lookback &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;This buster is different from the ones above: it fires from a decision the &lt;strong&gt;model&lt;/strong&gt; makes — a big parallel tool fan-out — not from anything you configured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The mechanism, in plain terms.&lt;/strong&gt; Each turn, the API tries to reuse the cache by matching the prefix at a breakpoint. If that exact match misses, it walks &lt;em&gt;backward&lt;/em&gt; looking for a still-cached chunk to extend — but only about &lt;strong&gt;20 content blocks&lt;/strong&gt;. Find a cached chunk inside that window → the whole prefix is served warm. Find nothing → the API gives up and re-charges the &lt;em&gt;entire&lt;/em&gt; prefix at the 2× cold-write rate. So when the previous turn appended more than ~20 blocks, last turn's cached chunk is now sitting too far back to reach, and the next turn pays full freight.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why tool bursts trip it so easily.&lt;/strong&gt; Recall a message holds many blocks, and this window counts &lt;strong&gt;blocks, not messages.&lt;/strong&gt; Each parallel tool call is &lt;strong&gt;two&lt;/strong&gt; blocks — the &lt;code&gt;tool_use&lt;/code&gt; and its &lt;code&gt;tool_result&lt;/code&gt; — so a turn with just &lt;strong&gt;~10 parallel tools is already ~20 blocks&lt;/strong&gt;, right at the edge. The threshold is razor-sharp: 19 added blocks still re-links, 20 misses (and 20 blocks crammed into a &lt;em&gt;single&lt;/em&gt; message still misses — it genuinely counts blocks). Measured on &lt;code&gt;claude-opus-4-8&lt;/code&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;one turn's fan-out&lt;/th&gt;
&lt;th&gt;blocks added&lt;/th&gt;
&lt;th&gt;next turn&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;28 parallel tools&lt;/td&gt;
&lt;td&gt;57&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;MISS&lt;/strong&gt; — entire prefix re-charged cold (28,149 written, 0 read)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5 parallel tools&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;HIT&lt;/strong&gt; — prefix served warm (25,672 read, 452 written)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Same setup; only the burst size differs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Two facts that make a fix possible.&lt;/strong&gt; (1) Re-linking needs just &lt;strong&gt;one breakpoint within ~20 blocks of the last cached chunk&lt;/strong&gt; — how far the &lt;em&gt;end&lt;/em&gt; of the conversation is doesn't matter, and you don't need an unbroken chain of markers. (2) The cache entry isn't deleted when a turn overflows; it lives for the full hour. You only need a marker close enough to &lt;em&gt;point back at it&lt;/em&gt; on the very next request — so the repair is needed only on the oversized turn itself, not forever after.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;You can't do this inside Claude Code.&lt;/strong&gt; The client places the breakpoints for you — and spends only &lt;strong&gt;3 of the 4&lt;/strong&gt; the API allows (see &lt;em&gt;The 3-breakpoint finding&lt;/em&gt; below; the 4th slot sits unused) — with no way to add or move one. A feature request to expose breakpoint placement in &lt;code&gt;settings.json&lt;/code&gt; was filed and &lt;strong&gt;closed as &lt;em&gt;not planned&lt;/em&gt;&lt;/strong&gt; (&lt;a href="https://github.com/anthropics/claude-code/issues/58103" rel="noopener noreferrer"&gt;anthropics/claude-code#58103&lt;/a&gt;). So from inside the client the only lever is to &lt;strong&gt;keep bursts bounded&lt;/strong&gt; — fewer parallel tools per turn.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Behind a proxy you could &lt;em&gt;potentially&lt;/em&gt; fix it.&lt;/strong&gt; A proxy sitting between Claude Code and the API sees the &lt;em&gt;entire&lt;/em&gt; conversation on every request — the API is stateless, so the full history is resent each turn — which means the proxy can rewrite the &lt;code&gt;cache_control&lt;/code&gt; markers before forwarding, with no memory of prior calls. And because Claude Code uses only 3 markers, a proxy has the whole budget of &lt;strong&gt;4&lt;/strong&gt; to work with. Place them as a sliding grid — at the tail and every ~18 blocks back (tail, −18, −36, −54) — and whatever the burst size, one marker still lands within ~20 blocks of the last cached chunk, re-links the full prefix, and re-charges only the genuinely new blocks:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;turn&lt;/th&gt;
&lt;th&gt;naive (Claude Code's 1 tail marker)&lt;/th&gt;
&lt;th&gt;proxy grid (4 markers, stride 18)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;+57-block burst&lt;/td&gt;
&lt;td&gt;read 0 · write 16,879 ✗&lt;/td&gt;
&lt;td&gt;read 14,794 · write 2,085 ✓&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;+31-block burst&lt;/td&gt;
&lt;td&gt;read 0 · write 18,011 ✗&lt;/td&gt;
&lt;td&gt;read 16,888 · write 1,123 ✓&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two limits on the proxy fix: four markers at stride 18 rescue a single turn of up to &lt;strong&gt;~74 new blocks&lt;/strong&gt; — past that, even the grid can't reach the prior chunk. And the proxy must forward the prefix &lt;strong&gt;byte-for-byte&lt;/strong&gt;; re-sorting the tools or re-serializing the JSON busts the cache on its own, markers or not.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — letting an agent fan out 10+ parallel tool calls in one turn on a cached prefix.&lt;/strong&gt; ~20 blocks overflows the lookback, and the next turn cold-rewrites the whole prefix at 2×.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — In Claude Code you can't place your own markers, so bound the burst (fewer parallel calls per turn) — that fixes the common case for free. The proxy grid (4 markers, stride ~18, rescues up to ~74 new blocks) is a &lt;strong&gt;last resort, not a default&lt;/strong&gt;: a burst big enough to overflow the window is rare, and a request-rewriting proxy now &lt;em&gt;owns your cache correctness&lt;/em&gt; — it must forward the prefix byte-for-byte and is still capped at the ~74-block ceiling. Don't stand one up — or buy a "cache-optimization" layer that does — without understanding every constraint above; for a problem this infrequent, the added complexity and failure surface rarely pay for themselves.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  FYI — not Claude Code: silent invalidators in raw-API or custom setups
&lt;/h3&gt;

&lt;p&gt;The classic prompt-cache killers below &lt;strong&gt;don't come from Claude Code&lt;/strong&gt; — it keeps volatile content out of the cached prefix (the date sits after the system breakpoints, the per-request billing-header token is excluded from the cache key, git is snapshotted once). They bite only when you build the prefix &lt;strong&gt;yourself&lt;/strong&gt;: calling &lt;code&gt;/v1/messages&lt;/code&gt; directly, or bolting content onto Claude Code via &lt;code&gt;--append-system-prompt&lt;/code&gt; / &lt;code&gt;--system-prompt&lt;/code&gt;, a SessionStart hook or proxy, or a custom MCP server whose tool &lt;em&gt;descriptions&lt;/em&gt; embed per-request data (those land at byte 0). They're &lt;em&gt;silent&lt;/em&gt; because the meaning looks unchanged — only the bytes moved:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;datetime.now()&lt;/code&gt;, UUIDs, or request-IDs interpolated &lt;strong&gt;early&lt;/strong&gt; in the prompt.&lt;/li&gt;
&lt;li&gt;Unsorted &lt;code&gt;json.dumps&lt;/code&gt; (key order drifts between requests).&lt;/li&gt;
&lt;li&gt;Per-user data or conditional sections in the system prompt; per-user tool sets.&lt;/li&gt;
&lt;/ul&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Keep per-request tokens &lt;em&gt;after&lt;/em&gt; the last breakpoint (the message tier), exactly where Claude Code puts them. Verify with &lt;code&gt;cache_read_input_tokens ≈ 0&lt;/code&gt; across requests that &lt;em&gt;should&lt;/em&gt; share a prefix; if reads are zero, diff the rendered bytes of two consecutive requests.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  Claude Code's &lt;em&gt;actual&lt;/em&gt; caching, empirically
&lt;/h2&gt;

&lt;p&gt;Now we ground the abstract rules in what Claude Code 2.1.150 actually sends on the wire. (This layout is an implementation detail and is version-specific — re-verify per release.)&lt;/p&gt;

&lt;h3&gt;
  
  
  The 3-breakpoint finding &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Captured via raw request logging (&lt;code&gt;OTEL_LOG_RAW_API_BODIES=1&lt;/code&gt;): &lt;strong&gt;3 &lt;code&gt;cache_control&lt;/code&gt; breakpoints, all with &lt;code&gt;ttl:"1h"&lt;/code&gt;&lt;/strong&gt; — the 4th available budget slot is unused.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;TOOLS: 30 definitions             ← byte 0, no marker of their own
SYSTEM:
  system[0] len=85   billing/version header (per-request cch token)
  system[1] len=62   "You are a Claude agent…"        ◀ BREAKPOINT 1 (1h)  — covers tools + identity
  system[2] len=26887 "You are an interactive agent…" ◀ BREAKPOINT 2 (1h)  — full system prompt
MESSAGES:
  messages[0].content[0]  &amp;lt;system-reminder&amp;gt; skills list
  messages[0].content[1]  &amp;lt;system-reminder&amp;gt; context + DATE
  messages[0].content[2]  user prompt (turn 1)
  …turns…
  last user message                                    ◀ BREAKPOINT 3 (1h) — slides each turn
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Note that &lt;strong&gt;tools fold into Breakpoint 1&lt;/strong&gt; — there is no dedicated tool breakpoint. A multi-turn (&lt;code&gt;-c&lt;/code&gt;) capture shows the tail breakpoint (BP3) sliding forward each turn while the two system breakpoints stay put. This is the textbook sliding-window pattern from earlier: frozen system tier written once, sliding tail writing only the delta.&lt;/p&gt;

&lt;h3&gt;
  
  
  The billing-header gotcha &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;code&gt;system[0]&lt;/code&gt; contains a &lt;code&gt;cch=&lt;/code&gt; token that &lt;strong&gt;changes on every request&lt;/strong&gt; — yet warm reads still hit. How? The whole of &lt;code&gt;system[0]&lt;/code&gt; is &lt;strong&gt;special-cased out of the cache key&lt;/strong&gt;, so it never busts Breakpoint 1. The block is the billing/version header — &lt;code&gt;x-anthropic-billing-header: cc_version=…; cc_entrypoint=…; cch=…;&lt;/code&gt; — and the exclusion covers the &lt;strong&gt;entire block, not just the volatile &lt;code&gt;cch=&lt;/code&gt; token&lt;/strong&gt;: changing the version string or the entrypoint is ignored for caching too. (Measured: mutating a non-&lt;code&gt;cch&lt;/code&gt; byte of &lt;code&gt;system[0]&lt;/code&gt; across a warm turn still reads the full prefix; the &lt;em&gt;same&lt;/em&gt; one-byte change in &lt;code&gt;system[1]&lt;/code&gt; instead cold-rewrites — so the cache key is live, &lt;code&gt;system[0]&lt;/code&gt; is simply outside it.) It's a deliberate exception — a per-request-varying span sitting &lt;em&gt;inside&lt;/em&gt; the cached prefix that, by the prefix-match rule, &lt;em&gt;should&lt;/em&gt; invalidate everything after it, but doesn't. Don't chase it as a phantom cache-buster.&lt;/p&gt;

&lt;h3&gt;
  
  
  Date placement &lt;strong&gt;[measured]&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The date lives in a &lt;code&gt;&amp;lt;system-reminder&amp;gt;&lt;/code&gt; in the &lt;strong&gt;first user message&lt;/strong&gt; — &lt;em&gt;after&lt;/em&gt; both system breakpoints. A mid-session midnight rollover is cheaper than you'd expect: it &lt;strong&gt;does not rewrite &lt;code&gt;messages[0]&lt;/code&gt;&lt;/strong&gt; at all. Verified by holding one session alive across Pacific midnight and capturing the wire — turn 1 (23:48, 06-16) and turn 2 (00:01, 06-17, same session):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;turn&lt;/th&gt;
&lt;th&gt;
&lt;code&gt;messages[0]&lt;/code&gt; date&lt;/th&gt;
&lt;th&gt;rollover notice&lt;/th&gt;
&lt;th&gt;usage&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1 (before)&lt;/td&gt;
&lt;td&gt;&lt;code&gt;2026-06-16&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;read 21,812 · write 8,305&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2 (after)&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;still &lt;code&gt;2026-06-16&lt;/code&gt;&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;"The date has changed. Today's date is now 2026-06-17"&lt;/code&gt; appended to the &lt;strong&gt;new user turn&lt;/strong&gt; (msg tail)&lt;/td&gt;
&lt;td&gt;read 30,117 · &lt;strong&gt;write 58&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So the harness leaves the original date in &lt;code&gt;messages[0]&lt;/code&gt; untouched and &lt;strong&gt;appends a &lt;code&gt;"date has changed"&lt;/code&gt; reminder to the sliding tail&lt;/strong&gt; — re-keying only the cheapest tier (≈58 tokens written, the whole prefix still read warm). It doesn't touch the system tier &lt;em&gt;or&lt;/em&gt; &lt;code&gt;messages[0]&lt;/code&gt;. Git status is snapshotted once at session start (frozen). Transcripts don't persist the &lt;code&gt;cache_control&lt;/code&gt; markers — capture the wire request, not the transcript. &lt;strong&gt;[measured]&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;⚠️ Mistake — trusting the displayed session cost as your true bill.&lt;/strong&gt; Claude Code's &lt;code&gt;total_cost_usd&lt;/code&gt; prices its 1-hour writes at the 5-minute &lt;strong&gt;1.25×&lt;/strong&gt; rate, so it reads &lt;em&gt;lower&lt;/em&gt; than your actual invoice.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;✅ Fix&lt;/strong&gt; — Use the displayed cost for &lt;em&gt;relative&lt;/em&gt; comparisons within a session, but treat the &lt;strong&gt;Anthropic Console&lt;/strong&gt; as authoritative for absolute billing. (Act IV quantifies the gap on a real 16-turn session: $1.77 displayed vs $1.97 actual.)&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;What to do (end of Act I):&lt;/strong&gt; Keep the tool/MCP surface frozen from turn 1; keep volatile tokens (dates, IDs) &lt;em&gt;after&lt;/em&gt; the last breakpoint; verify caching with the &lt;code&gt;usage&lt;/code&gt; block, not faith; and don't trust the displayed cost as your invoice. With that, the steady-state cost of a single-model session is mostly cheap reads plus the output you generate. The expensive surprises come from the two things in Act II.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>claude</category>
      <category>programming</category>
    </item>
    <item>
      <title>Hotel Booking: Schema Design Comparison</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Wed, 19 Nov 2025 20:07:02 +0000</pubDate>
      <link>https://dev.to/sumedhbala/hotel-booking-schema-design-comparison-g3h</link>
      <guid>https://dev.to/sumedhbala/hotel-booking-schema-design-comparison-g3h</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;p&gt;Hotel booking systems face a fundamental design decision: how to model availability and reservations in the database. This document compares two primary approaches:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Schema 1: Row-Per-Day for Fungible Room Types&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One row per (hotel_id, room_type_id, date)&lt;/li&gt;
&lt;li&gt;Rooms of the same type are fungible (interchangeable)&lt;/li&gt;
&lt;li&gt;Uses B-tree indexes and optimistic locking&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Schema 2: GiST with Individual Rooms&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;One row per physical room&lt;/li&gt;
&lt;li&gt;Rooms are non-fungible (each room tracked separately)&lt;/li&gt;
&lt;li&gt;Uses GiST indexes and exclusion constraints to prevent overlapping bookings (see Section 3.2 for detailed explanation of how GiST works)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both approaches solve the same problem (preventing double-booking) but with different trade-offs in complexity, performance, and flexibility.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisite Reading (Shared Building Blocks)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This document extends the event-ticketing series by &lt;a class="mentioned-user" href="https://dev.to/sumedhbala"&gt;@sumedhbala&lt;/a&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/designing-a-large-scale-ticketing-system-2h1c"&gt;Designing a Large-Scale Ticketing System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-2-event-discovery-32o5"&gt;Part 2 – Event Discovery&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-3-seat-management-nn2"&gt;Part 3 – Seat Management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h"&gt;Part 4 – Payments &amp;amp; Ticket Issuance&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We reuse the same foundation: PostgreSQL as system of record, transactional outbox + Debezium CDC into Kafka, Elasticsearch for discovery, Redis for low-latency availability, and Stripe-style payment flows. Familiarity with those components is assumed so we can focus on hotel-specific schema and availability trade-offs here.&lt;/p&gt;
&lt;h3&gt;
  
  
  Key Technical Topics Covered (Interview Highlights)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GiST indexes + exclusion constraints&lt;/strong&gt; (Section 3): Prevent overlapping range bookings with PostgreSQL GiST/exclusion constraints.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimistic vs. pessimistic locking&lt;/strong&gt; (Sections 2.4 &amp;amp; 3.6): Version columns + &lt;code&gt;FOR UPDATE SKIP LOCKED&lt;/code&gt; to avoid overbooking.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fungible vs. non-fungible inventory modeling&lt;/strong&gt; (Sections 2 &amp;amp; 3): When to use row-per-day counts vs. per-room range bookings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-room-type pricing&lt;/strong&gt; (Section 2.1): &lt;code&gt;reservation_room_nights&lt;/code&gt; to support multiple room types in one reservation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid design&lt;/strong&gt; (Section 2.6): Schema 1 inventory + GiST room assignments to avoid room switching after check-in.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling patterns&lt;/strong&gt; (Section 7): Redis/search caches, read replicas, sharding by &lt;code&gt;hotel_id&lt;/code&gt;, regional deployments, streaming events.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  2. Schema 1: Row-Per-Day for Fungible Room Types
&lt;/h2&gt;
&lt;h3&gt;
  
  
  2.1 Schema Design
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Room types (fungible - Room 101 and 102 of same type are interchangeable)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;room_types&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;type_name&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- e.g., "Deluxe", "Suite"&lt;/span&gt;
    &lt;span class="n"&gt;max_capacity&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amenities&lt;/span&gt; &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Inventory: One row per room type per night&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Specific night date&lt;/span&gt;
    &lt;span class="n"&gt;total_rooms&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;reserved_rooms&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;version&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- For optimistic locking&lt;/span&gt;

    &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;room_types&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

    &lt;span class="c1"&gt;-- Database constraint ensures: total_rooms = available_rooms + reserved_rooms&lt;/span&gt;
    &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;chk_inventory_balance&lt;/span&gt; 
        &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;total_rooms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;reserved_rooms&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

    &lt;span class="c1"&gt;-- Additional constraints for data integrity&lt;/span&gt;
    &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;chk_inventory_non_negative&lt;/span&gt; 
        &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;reserved_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;total_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- B-tree indexes for fast lookups&lt;/span&gt;
&lt;span class="c1"&gt;-- Note: The PRIMARY KEY already creates an index on (hotel_id, room_type_id, date),&lt;/span&gt;
&lt;span class="c1"&gt;-- so idx_inventory_lookup is technically redundant but kept for clarity/explicit naming.&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_inventory_lookup&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Partial index: Only includes rows where available_rooms &amp;gt; 0&lt;/span&gt;
&lt;span class="c1"&gt;-- This is smaller and faster for availability queries (the most common operation).&lt;/span&gt;
&lt;span class="c1"&gt;-- PostgreSQL will automatically use this index when querying for available rooms.&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_inventory_available&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Reservations: Header table (one reservation can have multiple room types)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;check_in_date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;check_out_date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_guests&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_out'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cancelled'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'expired'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;total_amount_minor_units&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;hotels&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Reservation rooms: One row per room type in the reservation&lt;/span&gt;
&lt;span class="c1"&gt;-- Allows booking multiple room types in a single reservation&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;reservation_rooms&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;reservation_room_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Needed for composite foreign key&lt;/span&gt;
    &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;num_rooms&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Number of rooms of this type in the reservation&lt;/span&gt;

    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;room_types&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;CONSTRAINT&lt;/span&gt; &lt;span class="n"&gt;chk_num_rooms_positive&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- One row per room type per reservation&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Reservation room nights: One row per room type per night for pricing&lt;/span&gt;
&lt;span class="c1"&gt;-- This allows different room types in the same reservation to have different nightly rates&lt;/span&gt;
&lt;span class="c1"&gt;-- (e.g., Deluxe room at $150/night, Suite at $300/night)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;reservation_room_nights&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;reservation_room_night_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;reservation_room_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;night_date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;nightly_rate_minor_units&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Rate per room of this type for this night&lt;/span&gt;

    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_room_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;reservation_rooms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_room_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_room_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;night_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  2.2 Key Characteristics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fungible Inventory&lt;/strong&gt;: Room 101 and Room 102 of type "Deluxe" are treated as identical&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Room Types&lt;/strong&gt;: A single reservation can include multiple room types (e.g., 2 Deluxe + 1 Suite) via the &lt;code&gt;reservation_rooms&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Row-Per-Day&lt;/strong&gt;: Each night gets its own inventory row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Count-Based&lt;/strong&gt;: Tracks &lt;code&gt;available_rooms&lt;/code&gt; count, not specific room assignments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;B-tree Indexes&lt;/strong&gt;: Standard indexes for point queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimistic Locking&lt;/strong&gt;: Uses &lt;code&gt;version&lt;/code&gt; column to detect concurrent modifications&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  2.3 Availability Check Query
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check availability for June 15-17 (3 nights) for 1 room&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'deluxe_001'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-16'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- Check for at least 1 room&lt;/span&gt;

&lt;span class="c1"&gt;-- Check availability for multiple rooms (e.g., 3 rooms)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;version&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;inventory&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'deluxe_001'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-16'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;-- Check for at least 3 rooms&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  2.4 Booking Creation (With Optimistic Locking)
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Option 1: Single Room Type (Simplified)
&lt;/h4&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema1_single_type&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Create a booking for one or more rooms of the same type.

    Args:
        num_rooms: Number of rooms to book (default: 1)
        num_guests: Total number of guests across all rooms
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Use the multi-type function with single room type
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; 
        &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;num_guests&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h4&gt;
  
  
  Option 2: Multiple Room Types (Full Implementation)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Example Usage&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Book 1 room of single type
&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotel_123&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deluxe_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-15&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-17&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Book multiple rooms of same type
&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotel_123&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deluxe_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-15&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-17&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Book multiple room types in one reservation
# Example: 2 Deluxe rooms + 1 Suite for a family
&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotel_123&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deluxe_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;suite_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-15&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-17&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2.5 Pros and Cons
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;br&gt;
✅ &lt;strong&gt;Simple and intuitive&lt;/strong&gt;: Easy to understand and maintain&lt;br&gt;
✅ &lt;strong&gt;Efficient point queries&lt;/strong&gt;: Direct date lookups are O(log n) with B-tree index&lt;br&gt;
✅ &lt;strong&gt;Easy updates&lt;/strong&gt;: Update individual nights independently&lt;br&gt;
✅ &lt;strong&gt;Supports per-night pricing&lt;/strong&gt;: Each row can have different pricing/rates&lt;br&gt;
✅ &lt;strong&gt;Handles overlapping bookings&lt;/strong&gt;: User books June 15-17, another books June 16-18&lt;br&gt;
✅ &lt;strong&gt;Flexible room assignment&lt;/strong&gt;: Any available room of the type can be assigned&lt;br&gt;
✅ &lt;strong&gt;Works with standard indexes&lt;/strong&gt;: No special index types needed&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;br&gt;
❌ &lt;strong&gt;More rows&lt;/strong&gt;: 365-day window = 365 rows per room type (scales linearly with booking window)&lt;br&gt;
❌ &lt;strong&gt;Optimistic locking complexity&lt;/strong&gt;: Requires retry logic on conflicts&lt;br&gt;
❌ &lt;strong&gt;No specific room assignment at booking&lt;/strong&gt;: Don't know which exact room guest will get until check-in&lt;br&gt;
❌ &lt;strong&gt;Count-based tracking&lt;/strong&gt;: Must update all three fields together (total, available, reserved) - enforced by database CHECK constraint&lt;/p&gt;
&lt;h3&gt;
  
  
  2.6 Room Assignment Guarantee After Check-In
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Question&lt;/strong&gt;: Can Schema 1 guarantee that guests won't have to switch rooms once they've checked in?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Answer&lt;/strong&gt;: &lt;strong&gt;Schema 1 alone cannot guarantee this&lt;/strong&gt; because it only tracks room types, not specific room assignments. However, you can add a &lt;strong&gt;room assignment table&lt;/strong&gt; that locks in the specific room at check-in time.&lt;/p&gt;
&lt;h4&gt;
  
  
  Option 1: Add Room Assignment at Check-In (Recommended)
&lt;/h4&gt;

&lt;p&gt;Add a table to track specific room assignments after check-in:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Physical rooms table (needed for room assignments)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;rooms&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- e.g., "101", "102"&lt;/span&gt;
    &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;room_types&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Room assignments: Lock in specific room at check-in&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;room_assignments&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;assignment_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="n"&gt;DATERANGE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- [check_in_date, check_out_date)&lt;/span&gt;
    &lt;span class="n"&gt;assigned_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="k"&gt;DELETE&lt;/span&gt; &lt;span class="k"&gt;CASCADE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;rooms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;

    &lt;span class="c1"&gt;-- GiST exclusion constraint prevents overlapping assignments for same room&lt;/span&gt;
    &lt;span class="c1"&gt;-- GiST is explained in section 3.3&lt;/span&gt;
    &lt;span class="n"&gt;EXCLUDE&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How This Works&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;At Booking Time&lt;/strong&gt;: Reservation is created with multiple room types via &lt;code&gt;reservation_rooms&lt;/code&gt; table (no specific rooms assigned yet)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;At Check-In&lt;/strong&gt;: Hotel staff assigns specific rooms (one &lt;code&gt;room_assignments&lt;/code&gt; entry per room) matching the room types and counts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GiST Constraint&lt;/strong&gt;: Prevents any other reservation from being assigned to the same room for overlapping dates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Guarantee&lt;/strong&gt;: Once assigned, each room cannot be assigned to anyone else during the guest's stay&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Check-In Flow&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_in_guest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_assignments&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Assign specific rooms to a reservation at check-in.

    Args:
        reservation_id: The reservation ID
        room_assignments: Dict mapping room_type_id to list of room numbers
                         Example: {&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deluxe_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: [&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;101&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;102&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;], &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;suite_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;: [&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;201&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;]}
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;reservation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_reservation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;confirmed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InvalidStateError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Reservation must be confirmed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Get reservation_rooms to validate assignments
&lt;/span&gt;    &lt;span class="n"&gt;reservation_rooms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT room_type_id, num_rooms
        FROM reservation_rooms
        WHERE reservation_id = ?
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Validate room assignments match reservation
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;res_room&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;reservation_rooms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;assigned_rooms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;room_assignments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;res_room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assigned_rooms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;res_room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InvalidRoomCountError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Room type &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;res_room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: Reservation requires &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;res_room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rooms, &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;but &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;assigned_rooms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; provided&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Assign each room (GiST constraint prevents overlaps)
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_numbers&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;room_assignments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;room_numbers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
                    INSERT INTO room_assignments 
                    (reservation_id, hotel_id, room_number, stay_dates)
                    VALUES (?, ?, ?, daterange(?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;))
                &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
                     &lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="c1"&gt;# Update reservation status
&lt;/span&gt;        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
            UPDATE reservations
            SET status = &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;checked_in&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, checked_in_at = NOW()
            WHERE reservation_id = ?
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ExclusionViolation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# One or more rooms already assigned to another guest for these dates
&lt;/span&gt;        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RoomAlreadyAssignedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;One or more rooms are not available&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Multiple Room Types Example&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Reservation for 2 Deluxe + 1 Suite
&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;hotel_123&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="p"&gt;[(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deluxe_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;suite_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-15&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;2024-06-17&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# At check-in, assign specific rooms for each type
&lt;/span&gt;&lt;span class="nf"&gt;check_in_guest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;deluxe_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;101&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;102&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;  &lt;span class="c1"&gt;# 2 Deluxe rooms
&lt;/span&gt;    &lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;suite_001&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;201&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;           &lt;span class="c1"&gt;# 1 Suite
&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="c1"&gt;# Creates 3 room_assignments entries total
# GiST constraint ensures no overlaps for any of the 3 rooms
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Benefits&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Guarantees no room switching&lt;/strong&gt;: Once assigned, GiST constraint prevents conflicts&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Flexibility before check-in&lt;/strong&gt;: Can reassign rooms if needed before guest arrives&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Best of both worlds&lt;/strong&gt;: Fungible inventory for booking, specific assignment at check-in&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Maintenance handling&lt;/strong&gt;: If room needs maintenance, can reassign before check-in&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Trade-offs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Requires additional &lt;code&gt;rooms&lt;/code&gt; and &lt;code&gt;room_assignments&lt;/code&gt; tables&lt;/li&gt;
&lt;li&gt;More complex than pure Schema 1, but simpler than full Schema 2&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Option 2: Pure Schema 1 (No Guarantee)
&lt;/h4&gt;

&lt;p&gt;If you don't add room assignments, Schema 1 &lt;strong&gt;cannot guarantee&lt;/strong&gt; that guests won't switch rooms because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No specific room is tracked&lt;/li&gt;
&lt;li&gt;Hotel could theoretically reassign rooms (though this would be bad practice)&lt;/li&gt;
&lt;li&gt;No database constraint prevents it&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When This Is Acceptable&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standard hotels where room switching is rare and handled operationally&lt;/li&gt;
&lt;li&gt;Hotels that prioritize flexibility over guarantees&lt;/li&gt;
&lt;li&gt;Systems where room assignment happens outside the booking system&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Recommendation
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;For production systems, add room assignments at check-in&lt;/strong&gt; (Option 1) to:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Guarantee no room switching after check-in&lt;/li&gt;
&lt;li&gt;Maintain flexibility before check-in&lt;/li&gt;
&lt;li&gt;Provide audit trail of room assignments&lt;/li&gt;
&lt;li&gt;Enable room-level tracking for maintenance&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This hybrid approach gives you the benefits of Schema 1 (fungible inventory, simple booking) with the guarantees of Schema 2 (specific room assignment, no conflicts).&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Schema 2: GiST with Individual Rooms
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 Tables: Reused from Schema 1 vs. New Tables
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Tables Reused from Schema 1:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;code&gt;room_types&lt;/code&gt; - Room type metadata (Deluxe, Suite, etc.)&lt;/li&gt;
&lt;li&gt;✅ &lt;code&gt;reservations&lt;/code&gt; - Reservation header table (check-in/out dates, guest info, status)&lt;/li&gt;
&lt;li&gt;✅ &lt;code&gt;reservation_room_nights&lt;/code&gt; - Pricing per room type per night (optional, for pricing calculations)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Tables NOT Used from Schema 1:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;❌ &lt;code&gt;inventory&lt;/code&gt; - Not needed (availability calculated from &lt;code&gt;room_bookings&lt;/code&gt; overlaps)&lt;/li&gt;
&lt;li&gt;❌ &lt;code&gt;reservation_rooms&lt;/code&gt; - Not needed (bookings are per-room, not per-room-type)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;New Tables for Schema 2:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🆕 &lt;code&gt;rooms&lt;/code&gt; - Physical room inventory (hotel_id, room_number, room_type_id)&lt;/li&gt;
&lt;li&gt;🆕 &lt;code&gt;room_bookings&lt;/code&gt; - Individual room bookings with date ranges (uses GiST index with partial exclusion constraint to prevent overlaps for pending/confirmed/checked_in)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Difference:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Schema 1: Tracks availability by room type (fungible: "2 Deluxe rooms available")&lt;/li&gt;
&lt;li&gt;Schema 2: Tracks availability by specific room (non-fungible: "Room 101 available, Room 102 booked")&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3.2 Schema Design
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Physical rooms (non-fungible - each room tracked separately)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;rooms&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- e.g., "101", "102", "Suite-A"&lt;/span&gt;
    &lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Links to room_types for metadata&lt;/span&gt;
    &lt;span class="n"&gt;floor_number&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;room_types&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Room bookings: One row per booking with date range&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;booking_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Specific physical room&lt;/span&gt;
    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="n"&gt;DATERANGE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- [check_in_date, check_out_date)&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_in'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_out'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'cancelled'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;

    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;rooms&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="k"&gt;FOREIGN&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- GiST index for fast range queries and overlap detection&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_room_bookings_dates&lt;/span&gt; 
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stay_dates&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Exclusion constraint prevents overlapping bookings for same room&lt;/span&gt;
&lt;span class="c1"&gt;-- Option 1: Partial exclusion constraint (RECOMMENDED - most idiomatic)&lt;/span&gt;
&lt;span class="c1"&gt;-- This constraint ONLY applies to "active" bookings (pending, confirmed, checked_in)&lt;/span&gt;
&lt;span class="c1"&gt;-- Cancelled/checked_out bookings don't block new bookings&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt;
&lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="n"&gt;EXCLUDE&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_in'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

&lt;span class="c1"&gt;-- Option 2: Use trigger (more flexible, but less idiomatic)&lt;/span&gt;
&lt;span class="c1"&gt;-- Triggers are more flexible because you can add custom logic beyond just checking overlaps.&lt;/span&gt;
&lt;span class="c1"&gt;-- For example, you could log failed booking attempts, allow VIP users to override conflicts,&lt;/span&gt;
&lt;span class="c1"&gt;-- or enforce additional business rules like "no bookings within 1 hour of each other".&lt;/span&gt;
&lt;span class="c1"&gt;-- However, we don't need this flexibility here - the simple overlap check is sufficient.&lt;/span&gt;
&lt;span class="c1"&gt;-- Note: We prevent overlaps for 'pending', 'confirmed', and 'checked_in' to avoid&lt;/span&gt;
&lt;span class="c1"&gt;-- double-booking scenarios where multiple pending reservations could later be confirmed&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;check_room_overlap&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stay_dates&lt;/span&gt;  &lt;span class="c1"&gt;-- Overlap operator&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_in'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;booking_id&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;booking_id&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
        &lt;span class="n"&gt;RAISE&lt;/span&gt; &lt;span class="n"&gt;EXCEPTION&lt;/span&gt; &lt;span class="s1"&gt;'Room already booked for these dates'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;trigger_check_overlap&lt;/span&gt;
&lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt;
&lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt;
&lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_in'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; 
&lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;check_room_overlap&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.3 How GiST Works
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;GiST (Generalized Search Tree)&lt;/strong&gt; is a PostgreSQL index type optimized for complex data types like ranges, geometric data, and full-text search. Unlike B-tree indexes (which work with simple comparisons like &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;=&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;), GiST indexes support spatial and range operations.&lt;/p&gt;

&lt;h4&gt;
  
  
  What is GiST?
&lt;/h4&gt;

&lt;p&gt;GiST is an &lt;strong&gt;extensible indexing framework&lt;/strong&gt; that allows you to define custom index methods for complex data types. For date ranges, PostgreSQL provides built-in GiST support that understands range operations.&lt;/p&gt;

&lt;h4&gt;
  
  
  How GiST Indexes Range Data
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;1. Range Representation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- A date range is stored as: [start_date, end_date)&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[]'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- Includes both endpoints&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[)'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- Standard: includes start, excludes end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;2. GiST Index Structure:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GiST builds a tree where each node contains &lt;strong&gt;bounding boxes&lt;/strong&gt; (the smallest range that contains all child ranges)&lt;/li&gt;
&lt;li&gt;Leaf nodes contain actual date ranges&lt;/li&gt;
&lt;li&gt;Internal nodes contain bounding boxes of their children&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example GiST Tree Structure:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                    Root Node
              [2024-01-01, 2024-12-31]  (bounding box)
                    /              \
         Node A                    Node B
    [2024-01-01, 2024-06-30]  [2024-07-01, 2024-12-31]
         /        \                  /        \
    Leaf 1      Leaf 2          Leaf 3      Leaf 4
[2024-06-15,  [2024-06-20,   [2024-07-10,  [2024-08-01,
 2024-06-17]   2024-06-22]    2024-07-15]   2024-08-05]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Range Operations GiST Supports
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;1. Overlap Operator (&lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check if two ranges overlap&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-16'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-18'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: TRUE (they overlap on June 16)&lt;/span&gt;

&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-18'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-20'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: FALSE (no overlap)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;How GiST Evaluates Overlap:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start at root node&lt;/li&gt;
&lt;li&gt;Check if query range overlaps with node's bounding box&lt;/li&gt;
&lt;li&gt;If yes, descend into that branch&lt;/li&gt;
&lt;li&gt;If no, skip entire branch (pruning)&lt;/li&gt;
&lt;li&gt;Continue until leaf nodes&lt;/li&gt;
&lt;li&gt;Check actual ranges for overlap&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;2. Contains Operator (&lt;code&gt;@&amp;gt;&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check if range contains a date&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-16'&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nb"&gt;date&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: TRUE&lt;/span&gt;

&lt;span class="c1"&gt;-- Check if range contains another range&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-20'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: TRUE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;3. Contained By Operator (&lt;code&gt;&amp;lt;@&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check if range is contained by another range&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;@&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-10'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-20'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: TRUE&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;4. Adjacent Operator (&lt;code&gt;-|-&lt;/code&gt;):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check if ranges are adjacent (touch but don't overlap)&lt;/span&gt;
&lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-|-&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-19'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;-- Returns: TRUE (they touch at June 17)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  How Exclusion Constraints Use GiST
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Exclusion Constraint Syntax:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;EXCLUDE&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;-- Equality operator for hotel_id&lt;/span&gt;
    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;-- Equality operator for room_number&lt;/span&gt;
    &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;      &lt;span class="c1"&gt;-- Overlap operator for date ranges&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What This Means:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For the same &lt;code&gt;(hotel_id, room_number)&lt;/code&gt; combination&lt;/li&gt;
&lt;li&gt;Prevent any two rows where &lt;code&gt;stay_dates&lt;/code&gt; overlap (&lt;code&gt;&amp;amp;&amp;amp;&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;The constraint is enforced at the &lt;strong&gt;database level&lt;/strong&gt; (before insert/update)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How It Works Internally:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;When inserting a new booking, PostgreSQL checks the GiST index&lt;/li&gt;
&lt;li&gt;Finds all existing bookings for the same &lt;code&gt;(hotel_id, room_number)&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;For each existing booking, checks if &lt;code&gt;stay_dates &amp;amp;&amp;amp; new_stay_dates&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;If any overlap is found, the insert &lt;strong&gt;fails&lt;/strong&gt; with an exclusion violation error&lt;/li&gt;
&lt;li&gt;This happens &lt;strong&gt;atomically&lt;/strong&gt; - no race conditions possible&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Booking 1: Room 101, June 15-17&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'booking_1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'101'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'reservation_1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
 &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[]'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- SUCCESS&lt;/span&gt;

&lt;span class="c1"&gt;-- Booking 2: Room 101, June 16-18 (OVERLAPS with Booking 1!)&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt;
&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'booking_2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'101'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'reservation_2'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
 &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-16'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-18'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[]'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- ERROR: conflicting key value violates exclusion constraint "room_bookings_excl"&lt;/span&gt;
&lt;span class="c1"&gt;-- The GiST index found that Booking 1's range [2024-06-15, 2024-06-17] &lt;/span&gt;
&lt;span class="c1"&gt;-- overlaps with Booking 2's range [2024-06-16, 2024-06-18]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Why GiST is Efficient for Range Queries
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;1. Pruning:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GiST can skip entire branches of the tree if their bounding boxes don't overlap&lt;/li&gt;
&lt;li&gt;Example: If query range is &lt;code&gt;[2024-06-15, 2024-06-17]&lt;/code&gt; and a node's bounding box is &lt;code&gt;[2024-12-01, 2024-12-31]&lt;/code&gt;, that entire branch is skipped&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Index-Only Scans:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For overlap checks, GiST can often answer queries using only the index (without accessing table data)&lt;/li&gt;
&lt;li&gt;This is much faster than scanning all rows&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Spatial Locality:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ranges that are close in time are stored near each other in the index&lt;/li&gt;
&lt;li&gt;Queries for nearby dates are very efficient&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Performance Comparison: GiST vs B-tree for Ranges
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;B-tree with Date Ranges (Inefficient):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Without GiST, you'd need to check every booking&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'101'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;check_in_date&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;check_in_date&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;check_out_date&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Must scan many rows, complex WHERE clause&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;GiST with Date Ranges (Efficient):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- With GiST, single overlap operator&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'101'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[]'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Uses GiST index, very fast, simple query&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  GiST Index Maintenance
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Insert Performance:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inserting a new range requires updating the GiST tree&lt;/li&gt;
&lt;li&gt;May need to split nodes if bounding boxes become too large&lt;/li&gt;
&lt;li&gt;Generally O(log n) but can be slower than B-tree for simple inserts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Update Performance:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Updating a range may require rebalancing the tree&lt;/li&gt;
&lt;li&gt;More expensive than B-tree updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Query Performance:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Overlap queries: &lt;strong&gt;Excellent&lt;/strong&gt; (O(log n) with pruning)&lt;/li&gt;
&lt;li&gt;Point queries: Good (but B-tree is better)&lt;/li&gt;
&lt;li&gt;Range containment: Excellent&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3.4 Key Characteristics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Non-Fungible Inventory&lt;/strong&gt;: Each physical room (Room 101, Room 102) is tracked separately&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Date Ranges&lt;/strong&gt;: Uses &lt;code&gt;DATERANGE&lt;/code&gt; to store booking periods&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GiST Indexes&lt;/strong&gt;: Optimized for range queries and overlap detection (see section 3.3 for details)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database-Enforced Overlaps&lt;/strong&gt;: Exclusion constraints or triggers prevent double-booking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Specific Room Assignment&lt;/strong&gt;: Know exactly which room guest will get&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3.5 Availability Check Query
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Find available rooms of a specific type for June 15-17&lt;/span&gt;
&lt;span class="c1"&gt;-- Exclude rooms with pending, confirmed, or checked_in bookings&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;rooms&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'hotel_123'&lt;/span&gt; 
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_type_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'deluxe_001'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;room_bookings&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;
      &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;
        &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt;
        &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;stay_dates&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="n"&gt;daterange&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'2024-06-15'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'2024-06-17'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'[]'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;-- Overlap check&lt;/span&gt;
        &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'checked_in'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
  &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3.6 Booking Creation
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Single Room Booking
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema2_single&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Create a booking for a single room.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="c1"&gt;# Step 1: Find an available room of this type
&lt;/span&gt;    &lt;span class="c1"&gt;# Exclude rooms with pending, confirmed, or checked_in bookings for these dates
&lt;/span&gt;    &lt;span class="n"&gt;available_room&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT r.room_number
        FROM rooms r
        WHERE r.hotel_id = ? AND r.room_type_id = ?
          AND NOT EXISTS (
              SELECT 1 FROM room_bookings b
              WHERE b.hotel_id = r.hotel_id
                AND b.room_number = r.room_number
                AND b.stay_dates &amp;amp;&amp;amp; daterange(?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
                AND b.status IN (&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;confirmed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;checked_in&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)  -- Include pending
          )
        LIMIT 1
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;available_room&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InsufficientAvailabilityError&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;room_number&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;available_room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 2: Create reservation
&lt;/span&gt;    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        INSERT INTO reservations 
        (hotel_id, check_in_date, check_out_date, num_guests, status, total_amount_minor_units)
        VALUES (?, ?, ?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending_payment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, 0)
        RETURNING reservation_id
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Step 3: Create room booking (exclusion constraint prevents overlap)
&lt;/span&gt;    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
            INSERT INTO room_bookings 
            (hotel_id, room_number, reservation_id, stay_dates, status)
            VALUES (?, ?, ?, daterange(?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;), &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ExclusionViolation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Another booking took this room - retry with different room
&lt;/span&gt;        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;rollback&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema2_single&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Multiple Rooms Booking (Complex - Requires Retry Logic)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;create_booking_schema2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Create a booking for multiple rooms of the same type.

    CRITICAL: This is significantly more complex than Schema 1&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;s atomic decrement.
    We must find and book N different rooms atomically, which requires retry logic
    if any room becomes unavailable during the transaction.

    Args:
        num_rooms: Number of rooms to book (e.g., 3 Deluxe rooms)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;max_retries&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
                &lt;span class="c1"&gt;# Step 1: Find N available rooms of this type
&lt;/span&gt;                &lt;span class="c1"&gt;# Use FOR UPDATE SKIP LOCKED to lock available rooms atomically
&lt;/span&gt;                &lt;span class="c1"&gt;# 
&lt;/span&gt;                &lt;span class="c1"&gt;# FOR UPDATE SKIP LOCKED Explanation (VERY USEFUL FOR INTERVIEWS):
&lt;/span&gt;                &lt;span class="c1"&gt;# This PostgreSQL feature is crucial for preventing race conditions in concurrent systems.
&lt;/span&gt;                &lt;span class="c1"&gt;# 
&lt;/span&gt;                &lt;span class="c1"&gt;# How it works:
&lt;/span&gt;                &lt;span class="c1"&gt;# 1. FOR UPDATE: Locks the selected rows for this transaction (prevents other transactions
&lt;/span&gt;                &lt;span class="c1"&gt;#    from modifying them until we commit/rollback)
&lt;/span&gt;                &lt;span class="c1"&gt;# 2. SKIP LOCKED: If a row is already locked by another transaction, skip it and continue
&lt;/span&gt;                &lt;span class="c1"&gt;#    to the next available row (instead of waiting/blocking)
&lt;/span&gt;                &lt;span class="c1"&gt;# 
&lt;/span&gt;                &lt;span class="c1"&gt;# Why it's useful:
&lt;/span&gt;                &lt;span class="c1"&gt;# - Prevents "lost update" problems: Multiple transactions can't grab the same room
&lt;/span&gt;                &lt;span class="c1"&gt;# - High concurrency: Transactions don't block each other - they just skip locked rows
&lt;/span&gt;                &lt;span class="c1"&gt;# - Perfect for work queues, booking systems, inventory allocation
&lt;/span&gt;                &lt;span class="c1"&gt;# - Common interview question: "How do you prevent two workers from processing the same job?"
&lt;/span&gt;                &lt;span class="c1"&gt;# 
&lt;/span&gt;                &lt;span class="c1"&gt;# Example scenario:
&lt;/span&gt;                &lt;span class="c1"&gt;#   T1: SELECT ... FOR UPDATE SKIP LOCKED LIMIT 3 → locks rooms 101, 102, 103
&lt;/span&gt;                &lt;span class="c1"&gt;#   T2: SELECT ... FOR UPDATE SKIP LOCKED LIMIT 3 → skips 101,102,103, locks 104, 105, 106
&lt;/span&gt;                &lt;span class="c1"&gt;#   Both transactions proceed without blocking each other!
&lt;/span&gt;                &lt;span class="c1"&gt;# 
&lt;/span&gt;                &lt;span class="c1"&gt;# Alternative (FOR UPDATE without SKIP LOCKED):
&lt;/span&gt;                &lt;span class="c1"&gt;#   T1: SELECT ... FOR UPDATE LIMIT 3 → locks rooms 101, 102, 103
&lt;/span&gt;                &lt;span class="c1"&gt;#   T2: SELECT ... FOR UPDATE LIMIT 3 → WAITS for T1 to commit (blocks, reduces concurrency)
&lt;/span&gt;                &lt;span class="n"&gt;available_rooms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
                    SELECT r.room_number
                    FROM rooms r
                    WHERE r.hotel_id = ? AND r.room_type_id = ?
                      AND NOT EXISTS (
                          SELECT 1 FROM room_bookings b
                          WHERE b.hotel_id = r.hotel_id
                            AND b.room_number = r.room_number
                            AND b.stay_dates &amp;amp;&amp;amp; daterange(?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
                            AND b.status IN (&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;confirmed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;checked_in&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
                      )
                    ORDER BY r.room_number
                    LIMIT ?
                    FOR UPDATE SKIP LOCKED  -- Lock rows, skip if already locked
                &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;available_rooms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InsufficientAvailabilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Only &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;available_rooms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rooms available, need &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                    &lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="c1"&gt;# Step 2: Create reservation header
&lt;/span&gt;                &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
                    INSERT INTO reservations 
                    (hotel_id, check_in_date, check_out_date, num_guests, status, total_amount_minor_units)
                    VALUES (?, ?, ?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending_payment&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, 0)
                    RETURNING reservation_id
                &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_guests&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="c1"&gt;# Step 3: Create room bookings for all N rooms
&lt;/span&gt;                &lt;span class="c1"&gt;# The exclusion constraint will prevent overlaps if another transaction
&lt;/span&gt;                &lt;span class="c1"&gt;# grabbed a room between our SELECT and INSERT
&lt;/span&gt;                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;room&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;available_rooms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
                            INSERT INTO room_bookings 
                            (hotel_id, room_number, reservation_id, stay_dates, status)
                            VALUES (?, ?, ?, daterange(?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;), &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
                        &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ExclusionViolation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="c1"&gt;# Room was taken by another transaction - abort entire booking
&lt;/span&gt;                        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ConcurrentModificationError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Room &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;room&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room_number&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; was booked by another transaction&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                        &lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;

        &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ConcurrentModificationError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ExclusionViolation&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;max_retries&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InsufficientAvailabilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Could not book &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rooms after &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;max_retries&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; attempts&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c1"&gt;# Exponential backoff before retry
&lt;/span&gt;            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;

    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InsufficientAvailabilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Booking failed after retries&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Challenges with Multi-Room Bookings in Schema 2&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Concurrency Risk&lt;/strong&gt;: Even with &lt;code&gt;FOR UPDATE SKIP LOCKED&lt;/code&gt;, there's a small window between SELECT and INSERT where another transaction could grab a room. Important: &lt;code&gt;FOR UPDATE&lt;/code&gt; does NOT prevent other transactions from reading the rows - it only prevents them from:

&lt;ul&gt;
&lt;li&gt;Taking a &lt;code&gt;FOR UPDATE&lt;/code&gt; lock on the same rows (they'd wait or skip with &lt;code&gt;SKIP LOCKED&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Modifying the rows (UPDATE/DELETE)&lt;/li&gt;
&lt;li&gt;Other transactions can still read the rows with regular SELECT (no lock required)&lt;/li&gt;
&lt;li&gt;This means another transaction could check availability (read) and insert into &lt;code&gt;room_bookings&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The exclusion constraint on &lt;code&gt;room_bookings&lt;/code&gt; is the final protection (will raise &lt;code&gt;ExclusionViolation&lt;/code&gt; if overlap detected)&lt;/li&gt;
&lt;li&gt;This is why we catch &lt;code&gt;ExclusionViolation&lt;/code&gt; and retry the entire booking&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;All-or-Nothing for Same Room Type&lt;/strong&gt;: If booking 3 rooms of the same type and we can only find 2 available, the entire booking fails. This is different from Schema 1, where booking 3 rooms of the same type is a single atomic &lt;code&gt;UPDATE inventory SET available_rooms = available_rooms - 3&lt;/code&gt; operation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Retry Complexity&lt;/strong&gt;: Requires retry logic with exponential backoff&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: &lt;code&gt;FOR UPDATE SKIP LOCKED&lt;/code&gt; helps prevent conflicts but adds locking overhead (see detailed explanation above)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Comparison with Schema 1&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Schema 1 (same room type)&lt;/strong&gt;: &lt;code&gt;UPDATE inventory SET available_rooms = available_rooms - 3&lt;/code&gt; (single atomic operation for N rooms of same type)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema 1 (multiple room types)&lt;/strong&gt;: Also all-or-nothing - if any room type lacks availability, entire booking fails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema 2&lt;/strong&gt;: Must find N different physical rooms, lock them, and insert N rows (multiple operations, higher failure risk even for same room type)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3.7 Pros and Cons
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;br&gt;
✅ &lt;strong&gt;Automatic overlap prevention&lt;/strong&gt;: Database enforces no double-booking at constraint level (for single rooms)&lt;br&gt;
✅ &lt;strong&gt;Exact room assignment&lt;/strong&gt;: Know exactly which room guest will get (Room 101, not "any Deluxe room")&lt;br&gt;
✅ &lt;strong&gt;Room-level tracking&lt;/strong&gt;: Can track maintenance, room-specific issues, preferences&lt;br&gt;
✅ &lt;strong&gt;Efficient range queries&lt;/strong&gt;: GiST index optimized for date range operations&lt;br&gt;
✅ &lt;strong&gt;No optimistic locking needed&lt;/strong&gt;: Exclusion constraint handles single-room concurrency&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;:&lt;br&gt;
❌ &lt;strong&gt;Multi-room booking complexity&lt;/strong&gt;: Booking N rooms requires finding and locking N different rooms, with retry logic if any room becomes unavailable. Much more complex than Schema 1's atomic &lt;code&gt;UPDATE inventory SET available_rooms = available_rooms - N&lt;/code&gt;&lt;br&gt;
❌ &lt;strong&gt;Concurrency challenges for multi-room&lt;/strong&gt;: Between finding available rooms and inserting bookings, another transaction may grab one, requiring full retry&lt;br&gt;
❌ &lt;strong&gt;Less flexible&lt;/strong&gt;: Can't reassign rooms easily (guest booked Room 101, but it needs maintenance)&lt;br&gt;
❌ &lt;strong&gt;More complex queries&lt;/strong&gt;: Need to find available rooms by checking non-overlapping ranges&lt;br&gt;
❌ &lt;strong&gt;Room assignment logic&lt;/strong&gt;: Must decide which available room to assign (first available? best view?)&lt;br&gt;
❌ &lt;strong&gt;More dynamic booking rows&lt;/strong&gt;: For a booking of 3 rooms, creates 3 rows in &lt;code&gt;room_bookings&lt;/code&gt; (vs. Schema 1's 1 row in &lt;code&gt;reservation_rooms&lt;/code&gt;)&lt;br&gt;
❌ &lt;strong&gt;GiST index overhead&lt;/strong&gt;: Larger index size, more complex query planning&lt;br&gt;
❌ &lt;strong&gt;Not fungible&lt;/strong&gt;: Room 101 and 102 are different, even if same type (like event seats)&lt;/p&gt;


&lt;h2&gt;
  
  
  4. Comparative Analysis
&lt;/h2&gt;
&lt;h3&gt;
  
  
  4.1 Schema Structure
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Schema 1: Row-Per-Day (Fungible)&lt;/th&gt;
&lt;th&gt;Schema 2: GiST (Individual Rooms)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inventory Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fungible (Room 101 = Room 102)&lt;/td&gt;
&lt;td&gt;Non-fungible (Room 101 ≠ Room 102)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inventory Table&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;inventory(hotel_id, room_type_id, date)&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;rooms(hotel_id, room_number)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Booking Table&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;reservations&lt;/code&gt; + &lt;code&gt;reservation_rooms(room_type_id, num_rooms)&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;&lt;code&gt;room_bookings(room_number, stay_dates)&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Date Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One row per night (DATE)&lt;/td&gt;
&lt;td&gt;Date range per booking (DATERANGE)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Index Type&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;B-tree&lt;/td&gt;
&lt;td&gt;GiST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rows per Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1 reservation + N nights&lt;/td&gt;
&lt;td&gt;1 room_booking (with date range)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  4.2 Query Performance
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Schema 1&lt;/th&gt;
&lt;th&gt;Schema 2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Check Availability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;SELECT ... WHERE date IN (...)&lt;/code&gt; - O(log n) per date&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;SELECT ... WHERE NOT EXISTS (overlap check)&lt;/code&gt; - O(log n) with GiST&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Point Query&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Excellent (direct date lookup)&lt;/td&gt;
&lt;td&gt;Good (range overlap check)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Range Query&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Good (multiple point queries)&lt;/td&gt;
&lt;td&gt;Excellent (single range query)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Find Available Room&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Count-based (simple)&lt;/td&gt;
&lt;td&gt;Must query all rooms, check overlaps (more complex)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  4.3 Concurrency Control
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Schema 1&lt;/th&gt;
&lt;th&gt;Schema 2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Method&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optimistic locking (version numbers)&lt;/td&gt;
&lt;td&gt;Database constraints (GiST exclusion)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Conflict Detection&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Check version in WHERE clause&lt;/td&gt;
&lt;td&gt;Constraint violation on overlap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Retry Logic&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Required (on version mismatch)&lt;/td&gt;
&lt;td&gt;Required for multi-room (must find N rooms atomically)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Deadlocks&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (no long-held locks)&lt;/td&gt;
&lt;td&gt;Possible (if using FOR UPDATE SKIP LOCKED)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Medium (retry logic for version)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;High for multi-room&lt;/strong&gt; (find N rooms, lock, insert N rows, handle failures)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Single Room&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Atomic &lt;code&gt;UPDATE ... SET available_rooms = available_rooms - 1&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;Simple: exclusion constraint handles it&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Room&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Atomic &lt;code&gt;UPDATE ... SET available_rooms = available_rooms - N&lt;/code&gt;
&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Complex&lt;/strong&gt;: Must find N different rooms, lock them, insert N rows&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  4.4 Update Operations
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Schema 1&lt;/th&gt;
&lt;th&gt;Schema 2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Create Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;UPDATE inventory (decrement count)&lt;/td&gt;
&lt;td&gt;INSERT room_booking (with date range)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cancel Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;UPDATE inventory (increment count)&lt;/td&gt;
&lt;td&gt;DELETE room_booking or update status&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Modify Dates&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;UPDATE multiple inventory rows&lt;/td&gt;
&lt;td&gt;DELETE + INSERT new date range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Room Reassignment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No change needed (fungible)&lt;/td&gt;
&lt;td&gt;DELETE + INSERT (different room)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  4.5 Storage and Scalability
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Important Distinction&lt;/strong&gt;: We must separate &lt;strong&gt;static inventory rows&lt;/strong&gt; from &lt;strong&gt;dynamic booking rows&lt;/strong&gt;.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Schema 1&lt;/th&gt;
&lt;th&gt;Schema 2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static Inventory Rows&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;365 rows per room type per hotel (scales with booking window)&lt;/td&gt;
&lt;td&gt;1 row per physical room (fixed)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Note on "365"&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;This represents a 1-year (365-day) booking window (common in hotels). The number scales linearly: 90-day window = 90 rows, 365-day window = 365 rows. This is configurable based on business needs.&lt;/td&gt;
&lt;td&gt;Fixed regardless of booking window&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rows per Single-Room Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1 reservation + 1 reservation_rooms + N reservation_room_nights&lt;/td&gt;
&lt;td&gt;1 reservation + 1 room_booking&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rows per Multi-Room Booking&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1 reservation + M reservation_rooms + (M × N) reservation_room_nights&lt;br&gt;(M = room types, N = nights)&lt;/td&gt;
&lt;td&gt;1 reservation + &lt;strong&gt;N room_bookings&lt;/strong&gt; (one per room)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Index Size&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;~2MB (B-tree, 100 hotels × 3 types × 365 days)&lt;/td&gt;
&lt;td&gt;~200KB (GiST, but more complex)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Scalability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Static rows scale with booking window&lt;/td&gt;
&lt;td&gt;Static rows fixed; dynamic rows scale with bookings&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Example Calculation&lt;/strong&gt; (100 hotels, 3 room types, 10 rooms per type, 365-day window):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Schema 1 Static&lt;/strong&gt;: 100 × 3 × 365 = &lt;strong&gt;109,500 inventory rows&lt;/strong&gt; (large, but simple queries)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema 2 Static&lt;/strong&gt;: 100 × 3 × 10 = &lt;strong&gt;3,000 room rows&lt;/strong&gt; (smaller, fixed)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why 365 days?&lt;/strong&gt; This is a typical booking window for hotels (1 year ahead). The number is configurable:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;90-day window&lt;/strong&gt;: 90 rows per room type (shorter horizon, less storage)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;180-day window&lt;/strong&gt;: 180 rows per room type (6 months ahead)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;365-day window&lt;/strong&gt;: 365 rows per room type (1 year ahead, most common)&lt;/li&gt;
&lt;li&gt;The number scales linearly: &lt;code&gt;rows = booking_window_days&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example: Booking 3 Deluxe Rooms for 2 Nights&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Schema 1 Dynamic&lt;/strong&gt;: 1 reservation + 1 reservation_rooms + 2 reservation_room_nights = &lt;strong&gt;4 rows&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema 2 Dynamic&lt;/strong&gt;: 1 reservation + &lt;strong&gt;3 room_bookings&lt;/strong&gt; = &lt;strong&gt;4 rows&lt;/strong&gt; (same count, but different structure)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example: Booking 2 Deluxe + 1 Suite for 2 Nights&lt;/strong&gt; (multiple room types):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Schema 1 Dynamic&lt;/strong&gt;: 1 reservation + 2 reservation_rooms + (2 × 2) reservation_room_nights = &lt;strong&gt;7 rows&lt;/strong&gt;

&lt;ul&gt;
&lt;li&gt;1 reservation header&lt;/li&gt;
&lt;li&gt;2 reservation_rooms (one for Deluxe, one for Suite)&lt;/li&gt;
&lt;li&gt;4 reservation_room_nights (2 nights × 2 room types, each with potentially different rates)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema 2 Dynamic&lt;/strong&gt;: 1 reservation + &lt;strong&gt;3 room_bookings&lt;/strong&gt; = &lt;strong&gt;4 rows&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Insight&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Schema 1 has &lt;strong&gt;more static rows&lt;/strong&gt; (inventory table) but &lt;strong&gt;fewer dynamic rows per multi-room booking&lt;/strong&gt; (1 reservation_rooms row regardless of N)&lt;/li&gt;
&lt;li&gt;Schema 2 has &lt;strong&gt;fewer static rows&lt;/strong&gt; (rooms table) but &lt;strong&gt;more dynamic rows per multi-room booking&lt;/strong&gt; (N room_bookings rows for N rooms)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  4.6 Flexibility and Use Cases
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Requirement&lt;/th&gt;
&lt;th&gt;Schema 1&lt;/th&gt;
&lt;th&gt;Schema 2&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Room Reassignment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Easy (any room of type works)&lt;/td&gt;
&lt;td&gt;Hard (must change room_number)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Maintenance Scheduling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Flexible (remove from inventory)&lt;/td&gt;
&lt;td&gt;Must track per-room&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Room Preferences&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Not supported&lt;/td&gt;
&lt;td&gt;Supported (per-room metadata)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Exact Room Assignment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;No (assigned at check-in)&lt;/td&gt;
&lt;td&gt;Yes (assigned at booking)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Boutique Hotels&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Less suitable&lt;/td&gt;
&lt;td&gt;More suitable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Standard Hotels&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;More suitable&lt;/td&gt;
&lt;td&gt;Less suitable&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;


&lt;h2&gt;
  
  
  5. Detailed Code Examples
&lt;/h2&gt;
&lt;h3&gt;
  
  
  5.1 Availability Check: Schema 1
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_availability_schema1&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Check if enough rooms are available for all nights.

    Args:
        num_rooms: Number of rooms needed (default: 1)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;nights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_nights&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# [2024-06-15, 2024-06-16]
&lt;/span&gt;
    &lt;span class="n"&gt;availability&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT date, available_rooms
        FROM inventory
        WHERE hotel_id = ? 
          AND room_type_id = ?
          AND date IN (?)
          AND available_rooms &amp;gt;= ?
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;nights&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Check if all nights have enough availability
&lt;/span&gt;    &lt;span class="n"&gt;available_nights&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;row&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;availability&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;available_nights&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;missing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;nights&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;available_nights&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InsufficientAvailabilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Not enough rooms available on: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;. Need &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rooms.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  5.2 Availability Check: Schema 2
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_availability_schema2&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Check if enough rooms are available for the date range.

    Args:
        num_rooms: Number of rooms needed (default: 1)
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;available_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
        SELECT COUNT(*) as count
        FROM rooms r
        WHERE r.hotel_id = ? 
          AND r.room_type_id = ?
          AND NOT EXISTS (
              SELECT 1 FROM room_bookings b
              WHERE b.hotel_id = r.hotel_id
                AND b.room_number = r.room_number
                AND b.stay_dates &amp;amp;&amp;amp; daterange(?, ?, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;[]&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)
                AND b.status IN (&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;pending&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;confirmed&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;checked_in&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;)  -- Include pending
          )
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;hotel_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;room_type_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_in_date&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;check_out_date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;available_count&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;InsufficientAvailabilityError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Only &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;available_count&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;count&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; rooms available, need &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;num_rooms&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;h3&gt;
  
  
  5.3 Booking Creation: Schema 2 (Full Example)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: See Section 3.6 for the complete implementation with both single-room and multi-room booking examples. The key difference is that Schema 2 requires significantly more complex logic for multi-room bookings compared to Schema 1's atomic &lt;code&gt;UPDATE inventory SET available_rooms = available_rooms - N&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Points&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single room booking: Relatively straightforward with exclusion constraint&lt;/li&gt;
&lt;li&gt;Multi-room booking: Requires finding and locking N rooms, with retry logic if any room becomes unavailable&lt;/li&gt;
&lt;li&gt;See Section 3.6 for the full &lt;code&gt;create_booking_schema2()&lt;/code&gt; implementation with &lt;code&gt;num_rooms&lt;/code&gt; parameter&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  6. When to Use Each Schema
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Use Schema 1 (Row-Per-Day, Fungible) When:
&lt;/h3&gt;

&lt;p&gt;✅ &lt;strong&gt;Standard hotel operations&lt;/strong&gt;: Rooms of the same type are truly interchangeable&lt;br&gt;
✅ &lt;strong&gt;Flexibility needed&lt;/strong&gt;: Want to reassign rooms easily (maintenance, upgrades)&lt;br&gt;
✅ &lt;strong&gt;Simplicity priority&lt;/strong&gt;: Prefer straightforward queries and standard indexes&lt;br&gt;
✅ &lt;strong&gt;High booking volume&lt;/strong&gt;: Need to handle many concurrent bookings efficiently&lt;br&gt;
✅ &lt;strong&gt;Per-night pricing&lt;/strong&gt;: Different rates per night (weekend vs weekday)&lt;br&gt;
✅ &lt;strong&gt;Most hotels&lt;/strong&gt;: Matches how most hotels actually operate&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Use Cases&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Large chain hotels (Marriott, Hilton)&lt;/li&gt;
&lt;li&gt;Standard hotel rooms (not unique suites)&lt;/li&gt;
&lt;li&gt;High-volume booking systems&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Use Schema 2 (GiST, Individual Rooms) When:
&lt;/h3&gt;

&lt;p&gt;✅ &lt;strong&gt;Unique rooms&lt;/strong&gt;: Rooms are non-fungible (boutique hotels, unique suites)&lt;br&gt;
✅ &lt;strong&gt;Exact room assignment&lt;/strong&gt;: Need to know which specific room guest will get&lt;br&gt;
✅ &lt;strong&gt;Room-level tracking&lt;/strong&gt;: Need to track maintenance, preferences per room&lt;br&gt;
✅ &lt;strong&gt;Database-enforced integrity&lt;/strong&gt;: Want strongest guarantees against double-booking&lt;br&gt;
✅ &lt;strong&gt;Event venues&lt;/strong&gt;: Similar to fixed-seat event booking (non-fungible)&lt;br&gt;
✅ &lt;strong&gt;Lower booking volume&lt;/strong&gt;: Can handle more complex queries&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Use Cases&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Boutique hotels with unique rooms&lt;/li&gt;
&lt;li&gt;Luxury resorts with distinct suites&lt;/li&gt;
&lt;li&gt;Event venues (similar pattern)&lt;/li&gt;
&lt;li&gt;Hotels where guests request specific rooms&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Interview Tidbit: When to Use GiST Indexes
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Use GiST when a row-per-entry approach is not scalable or not possible.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Classic Example: Calendar Meeting Booking&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine building a calendar system where users can book meetings. You need to check for time slot overlaps (e.g., "Is 2:00 PM - 3:00 PM available?").&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;❌ Row-Per-Minute Approach (Not Scalable):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Bad: One row per minute for every possible time slot&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;time_slots&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="nb"&gt;date&lt;/span&gt; &lt;span class="nb"&gt;DATE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="k"&gt;minute&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- 0 to 1439 (minutes in a day)&lt;/span&gt;
    &lt;span class="n"&gt;is_booked&lt;/span&gt; &lt;span class="nb"&gt;BOOLEAN&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- Problem: 1,440 rows per day × 365 days = 525,600 rows per year per resource&lt;/span&gt;
&lt;span class="c1"&gt;-- For 1000 resources: 525 million rows per year!&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;✅ GiST Range Approach (Scalable):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Good: One row per booking with date range&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;meetings&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;meeting_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;resource_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;meeting_time&lt;/span&gt; &lt;span class="n"&gt;TSRANGE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Time range: [start_time, end_time)&lt;/span&gt;
    &lt;span class="n"&gt;EXCLUDE&lt;/span&gt; &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="n"&gt;GIST&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;resource_id&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;meeting_time&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- One row per actual booking - scales with usage, not time granularity&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Insight&lt;/strong&gt;: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Row-per-entry&lt;/strong&gt;: Scales with time granularity (minutes, seconds) × duration × resources → exponential growth&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;GiST ranges&lt;/strong&gt;: Scales with actual bookings → linear growth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Other Use Cases for GiST:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resource booking&lt;/strong&gt;: Conference rooms, equipment, vehicles&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IP address ranges&lt;/strong&gt;: Network allocation, geolocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geographic data&lt;/strong&gt;: Spatial queries, map boundaries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-series overlaps&lt;/strong&gt;: Scheduling, availability windows&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  7. Recommendation
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;For most hotel booking systems, Schema 1 (Row-Per-Day, Fungible) is recommended&lt;/strong&gt; because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Matches real-world operations&lt;/strong&gt;: Most hotels treat rooms of the same type as interchangeable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simpler queries&lt;/strong&gt;: Direct date lookups are easier to understand and optimize&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better performance&lt;/strong&gt;: B-tree indexes are well-understood and perform excellently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;More flexible&lt;/strong&gt;: Can reassign rooms without database changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Proven pattern&lt;/strong&gt;: Widely used in production systems&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Use Schema 2 (GiST, Individual Rooms) only if&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You have unique, non-fungible rooms&lt;/li&gt;
&lt;li&gt;You need exact room assignment at booking time&lt;/li&gt;
&lt;li&gt;You're building an event venue system (where this design excels)&lt;/li&gt;
&lt;li&gt;Room-level tracking is critical&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: If you need Schema 1's fungible inventory but also want to guarantee no room switching after check-in, see Section 2.6 for a hybrid approach that adds a room assignment table with GiST constraints.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scaling Beyond a Single Database
&lt;/h3&gt;

&lt;p&gt;If a single database instance cannot sustain throughput/latency requirements:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Vertical scale + read replicas&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Keep &lt;code&gt;inventory&lt;/code&gt;, &lt;code&gt;reservations&lt;/code&gt;, &lt;code&gt;reservation_rooms&lt;/code&gt;, and &lt;code&gt;reservation_room_nights&lt;/code&gt; on the primary for strong consistency.
&lt;/li&gt;
&lt;li&gt;Serve availability/search queries from read replicas (or a Redis cache) to offload the primary.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Logical partitioning (sharding)&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Shard by &lt;code&gt;hotel_id&lt;/code&gt; (most queries are scoped to a hotel).
&lt;/li&gt;
&lt;li&gt;Each shard owns its own &lt;code&gt;inventory&lt;/code&gt;, &lt;code&gt;reservations&lt;/code&gt;, &lt;code&gt;room_assignments&lt;/code&gt;, etc.
&lt;/li&gt;
&lt;li&gt;Global services (payments, user accounts) remain shared; booking services route to the correct shard via a metadata service.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Regional replication&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Deploy per-region databases (still sharded by hotel) with asynchronous replication for cross-region disaster recovery.
&lt;/li&gt;
&lt;li&gt;Keep writes region-local; replicate summary events to a central analytics store.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Streaming + caches&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Publish inventory delta events to Kafka/PubSub.
&lt;/li&gt;
&lt;li&gt;Materialize availability views in Redis/Elastic for search so that the OLTP database handles only booking-critical writes.
&lt;/li&gt;
&lt;li&gt;Ensure cache invalidation by listening to the same event stream.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Background reconciliation&lt;/strong&gt;:  &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Periodically reconcile per-shard inventory totals against master room counts (or via nightly jobs) to catch drift.
&lt;/li&gt;
&lt;li&gt;Use idempotent booking operations so retries across shards are safe.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These patterns let you grow from a single-node deployment to multi-shard, multi-region topologies without rewriting the core schemas.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Decision Factor&lt;/th&gt;
&lt;th&gt;Winner&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Simplicity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 1 (Row-Per-Day)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Flexibility&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 1 (Row-Per-Day)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Database-Enforced Integrity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 2 (GiST)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Exact Room Assignment&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 2 (GiST)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Query Performance&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tie (depends on query pattern)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Concurrency Control (Single Room)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 2 (GiST) - simpler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Concurrency Control (Multi-Room)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 1 (atomic decrement) - much simpler&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Static Rows&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 2 (fewer static rows)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Dynamic Rows (Multi-Room)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 1 (1 row per booking vs N rows)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Multi-Room Booking Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 1 (atomic decrement vs find/lock N rooms)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Most Hotels&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema 1 (Row-Per-Day)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Final Recommendation&lt;/strong&gt;: Start with &lt;strong&gt;Schema 1 (Row-Per-Day, Fungible)&lt;/strong&gt; for most hotel booking systems. Consider &lt;strong&gt;Schema 2 (GiST, Individual Rooms)&lt;/strong&gt; only if you have specific requirements for exact room assignment or unique, non-fungible rooms.&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>booking</category>
      <category>postgres</category>
      <category>indexes</category>
    </item>
    <item>
      <title>Hotel Search System Design</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Wed, 19 Nov 2025 20:05:19 +0000</pubDate>
      <link>https://dev.to/sumedhbala/hotel-search-system-design-5ea4</link>
      <guid>https://dev.to/sumedhbala/hotel-search-system-design-5ea4</guid>
      <description>&lt;h2&gt;
  
  
  1. Introduction
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Key Technical Topics Covered (Interview Highlights)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hybrid OLAP/OLTP architecture&lt;/strong&gt; (Sections 1 &amp;amp; 2): Elasticsearch for multi-dimensional search + Redis for real-time availability (and why neither alone is sufficient).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transactional Outbox + CDC&lt;/strong&gt; (Section 2.2): Keeping Elasticsearch in sync with the primary DB without dual writes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch modeling&lt;/strong&gt; (Sections 3 &amp;amp; 4): Denormalized hotel/room documents, geo fields, amenity facets, script scoring, and index aliasing for zero-downtime reindex.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis availability design&lt;/strong&gt; (Section 6): Per-night bitmaps / sorted sets, multi-night intersections, TTL/eviction strategies, and consistency with bookings.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query pipeline&lt;/strong&gt; (Section 7): Coordinated ES + Redis flow with fallbacks when availability or index data diverges.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advanced filtering &amp;amp; ranking&lt;/strong&gt; (Section 5): Amenity faceting, price buckets, popularity boosts, personalization hooks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scaling patterns&lt;/strong&gt; (Sections 2 &amp;amp; 7): Multi-region clusters, hot/warm node tiers, cache layers, snapshot/alias-based reindexing. &lt;em&gt;(Cross-region replication—e.g., keeping Redis regional while asynchronously syncing aggregates or using multi-region Elasticsearch—follows the same patterns but is out of scope for this doc.)&lt;/em&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Prerequisite Reading (Shared Components)&lt;/strong&gt;&lt;br&gt;&lt;br&gt;
This hotel search design builds on the same platform described in &lt;a class="mentioned-user" href="https://dev.to/sumedhbala"&gt;@sumedhbala&lt;/a&gt;’s ticketing series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/designing-a-large-scale-ticketing-system-2h1c"&gt;Designing a Large-Scale Ticketing System&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-2-event-discovery-32o5"&gt;Part 2 – Event Discovery&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-3-seat-management-nn2"&gt;Part 3 – Seat Management&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h"&gt;Part 4 – Payments &amp;amp; Ticket Issuance&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;From those posts we inherit the core stack: PostgreSQL + transactional outbox, Debezium/Kafka pipelines, Elasticsearch clusters, Redis caches, and Stripe-style payment flows. This document focuses on how those building blocks are reconfigured for hotel-specific search, availability, and ranking concerns.&lt;/p&gt;
&lt;h3&gt;
  
  
  Hotel Search Challenges
&lt;/h3&gt;

&lt;p&gt;Hotel search and booking introduce complexities that differ sharply from event search. Two fundamental differences from event search drive the architectural complexity:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Availability Integration in Search:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Availability Validation&lt;/strong&gt;: Availability must be checked during the search phase, not after results are displayed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Date Range Queries&lt;/strong&gt;: Users search for stays spanning multiple nights (check-in to check-out), requiring multi-date availability validation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Recurring Availability&lt;/strong&gt;: Same rooms available daily, unlike one-time events&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Updates&lt;/strong&gt;: Availability changes constantly as bookings are made, requiring sub-millisecond response times&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Extensive Filter Requirements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multiple Amenity Filters&lt;/strong&gt;: Users expect to filter by numerous amenities (pool, spa, beachfront, gym, restaurant, concierge, parking, pet-friendly, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Room Type Complexity&lt;/strong&gt;: Multiple room categories (Standard, Deluxe, Suite) with different capacities and room-level amenities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complex Filter Combinations&lt;/strong&gt;: Location, amenities, price range, room type, guest count, and date combinations all working together&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Additional challenges include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Fungible Inventory&lt;/strong&gt;: Room 101 and Room 102 of the same type are interchangeable&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic Pricing&lt;/strong&gt;: Prices vary by date, season, demand, and length of stay&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Key Differences from Event Search
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Event Search&lt;/th&gt;
&lt;th&gt;Hotel Search&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Inventory Model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fixed seats, non-fungible&lt;/td&gt;
&lt;td&gt;Room types, fungible inventory&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Availability&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One-time events&lt;/td&gt;
&lt;td&gt;Recurring daily availability&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Date Handling&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Single event date&lt;/td&gt;
&lt;td&gt;Date ranges (check-in to check-out)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pricing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Dynamic per seat/section (demand-based)&lt;/td&gt;
&lt;td&gt;Dynamic per room type and date (seasonal, demand-based)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search Complexity&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple (venue, date, category)&lt;/td&gt;
&lt;td&gt;Complex (location, dates, room type, multiple amenity filters: pool, spa, beachfront, gym, etc.)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Availability in Search&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Checked after displaying events&lt;/td&gt;
&lt;td&gt;Must be part of search criteria&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
&lt;h3&gt;
  
  
  Hybrid Architecture Overview: Mandatory OLAP/OLTP Separation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Terminology refresher&lt;/strong&gt;: &lt;em&gt;OLAP (Online Analytical Processing)&lt;/em&gt; handles read-heavy, multi-dimensional queries (e.g., Elasticsearch), while &lt;em&gt;OLTP (Online Transaction Processing)&lt;/em&gt; powers write-heavy, transactional workloads (e.g., Redis + primary DB). The rest of this document uses these terms frequently.&lt;/p&gt;

&lt;p&gt;Hotel search systems face two fundamentally contradictory workloads that cannot be solved by a single system:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. OLAP Workload (Analytical Search):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Complex, read-intensive, multi-dimensional queries&lt;/li&gt;
&lt;li&gt;Example: "Find all 4-star hotels in New York with pool and free breakfast for 2 adults and 2 children from 2024-10-26 to 2024-10-30, sorted by review score"&lt;/li&gt;
&lt;li&gt;Optimized for: High-speed reads of large datasets, complex filtering, full-text search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technology&lt;/strong&gt;: Elasticsearch (built on Apache Lucene, designed for OLAP)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. OLTP Workload (Transactional Availability):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-frequency, simple, isolated transactions&lt;/li&gt;
&lt;li&gt;Example: "Decrement room count for Hotel 123, room type 'King', on date 2024-10-26"&lt;/li&gt;
&lt;li&gt;Optimized for: High-throughput, low-latency updates, absolute data integrity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Technology&lt;/strong&gt;: Redis (in-memory database designed for OLTP)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Architectural Mandate:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The dual-system architecture (Elasticsearch + Redis) is &lt;strong&gt;not optional&lt;/strong&gt;—it is &lt;strong&gt;mandatory&lt;/strong&gt; because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;OLTP optimizes for write efficiency&lt;/strong&gt; (small, fast, atomic updates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;OLAP optimizes for read efficiency&lt;/strong&gt; (complex, large-scale, multi-dimensional queries)&lt;/li&gt;
&lt;li&gt;These optimization goals are &lt;strong&gt;directly contradictory&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Attempting to use Elasticsearch for OLTP availability updates would fail catastrophically due to its "update penalty" (see Section 3.2). Conversely, using a relational database for OLAP search queries lacks the ability to efficiently perform complex, multi-dimensional, and full-text search.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: The combination of OLAP (Elasticsearch) and OLTP (Redis) systems is not merely the "best" solution—it is the &lt;strong&gt;only viable one&lt;/strong&gt; for a business requiring both robust transaction processing and powerful data analysis.&lt;/p&gt;
&lt;h2&gt;
  
  
  2. Baseline Functional Solution
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Services
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search API Service&lt;/strong&gt; – Handles user queries, coordinates Elasticsearch and Redis operations, returns ranked results with availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Service&lt;/strong&gt; – Manages Redis availability data, handles booking updates, ensures data consistency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hotel Management Service&lt;/strong&gt; – Source of truth for hotel metadata (writes to primary database)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Data synchronization from the primary database to Elasticsearch is handled by an event-driven pipeline (see Section 2.2: Data Synchronization Architecture)&lt;/p&gt;
&lt;h3&gt;
  
  
  Data Stores
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Cluster&lt;/strong&gt; – Search index for hotels, room types, amenities, location data (includes built-in query caching)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Cluster&lt;/strong&gt; (or AWS ElastiCache) – Real-time availability data exclusively&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary Database&lt;/strong&gt; (PostgreSQL/MySQL) – Hotel metadata, room configurations, pricing rules&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Redis is used exclusively for availability data. Search result caching is handled by Elasticsearch's built-in query cache or application-level caching mechanisms. (This is explained in detail in Section 6.)&lt;/p&gt;
&lt;h3&gt;
  
  
  High-Level Architecture
&lt;/h3&gt;

&lt;p&gt;The system follows a three-tier approach:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Search Tier&lt;/strong&gt;: Elasticsearch handles complex queries (location, amenities, text search)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Tier&lt;/strong&gt;: Redis provides real-time availability validation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Tier&lt;/strong&gt;: Primary database maintains canonical hotel information&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Soundbite for interviews&lt;/strong&gt;: "Hotel search requires a hybrid architecture because Elasticsearch excels at complex filtering but struggles with real-time availability updates, while Redis provides sub-millisecond availability checks but lacks sophisticated search capabilities."&lt;/p&gt;
&lt;h3&gt;
  
  
  Data Synchronization Architecture: Transactional Outbox + Change Data Capture (CDC)
&lt;/h3&gt;

&lt;p&gt;At a glance, data flows &lt;strong&gt;DB → Outbox → CDC/Debezium → Kafka → Elasticsearch&lt;/strong&gt;. The subsections below break this down so readers can skim the summary and dive deeper only if needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Dual-Write Anti-Pattern Problem:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A naive approach would use an "Indexing Service" that attempts to write to both the database and Elasticsearch synchronously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Write A: Update PostgreSQL (hotel name change)
Write B: Update Elasticsearch (hotel name change)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is a &lt;strong&gt;well-known anti-pattern&lt;/strong&gt; because it cannot guarantee atomic execution across two independent systems. The risk of data inconsistency is not a risk—it is a &lt;strong&gt;guarantee&lt;/strong&gt; over time and at scale.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure Scenarios:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Partial Failure (Stale Search Index)&lt;/strong&gt;: Database update succeeds, Elasticsearch update fails → Search index permanently stale&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partial Failure (Phantom Data)&lt;/strong&gt;: Elasticsearch update succeeds, database update fails → Hotel searchable but doesn't exist in system of record&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Race Conditions&lt;/strong&gt;: Concurrent updates arrive out of order → Permanent data corruption&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;The Solution: Transactional Outbox Pattern + CDC&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The robust solution decouples writes using an asynchronous, event-driven architecture:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Transactional Outbox Pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create an &lt;code&gt;Outbox_Events&lt;/code&gt; table in the primary PostgreSQL database&lt;/li&gt;
&lt;li&gt;When updating hotel data, perform &lt;strong&gt;one atomic database transaction&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;  &lt;span class="k"&gt;BEGIN&lt;/span&gt; &lt;span class="n"&gt;TRANSACTION&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;-- Business data write&lt;/span&gt;
  &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;Hotels&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'New Name'&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="c1"&gt;-- Event write (atomic within same transaction)&lt;/span&gt;
  &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;Outbox_Events&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'{"id": 123, "name": "New Name"}'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;This guarantees atomicity: if UPDATE fails, INSERT is rolled back; if INSERT fails, UPDATE is rolled back&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;2. Change Data Capture (CDC):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use a log-based CDC tool (e.g., Debezium) that reads PostgreSQL's Write-Ahead Log (WAL)&lt;/li&gt;
&lt;li&gt;Debezium sees the committed event in &lt;code&gt;Outbox_Events&lt;/code&gt; and publishes it to Apache Kafka&lt;/li&gt;
&lt;li&gt;This is non-intrusive and low-overhead (reads transaction log, doesn't query database)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;3. Event-Driven Pipeline:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PostgreSQL (System of Record)
  ↓ (WAL)
Debezium (CDC Source Connector)
  ↓ (Kafka message)
Apache Kafka (Event Bus - provides resilience and buffering)
  ↓ (Kafka message)
Kafka Connect (Elasticsearch Sink Connector)
  ↓ (index operation)
Elasticsearch (Search Index)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Integrity&lt;/strong&gt;: Only 100% committed data is ever published&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resilience&lt;/strong&gt;: If Elasticsearch is down, events buffer in Kafka with no data loss&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Decoupling&lt;/strong&gt;: Database and Elasticsearch are fully independent&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Each component can scale independently&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eventual Consistency&lt;/strong&gt;: Tunable lag (typically seconds to low minutes) is the correct trade-off for high-performance distributed systems&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Configuration Requirements:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt;: Enable logical replication (&lt;code&gt;wal_level = logical&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debezium&lt;/strong&gt;: Configure to read only from &lt;code&gt;Outbox_Events&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Kafka&lt;/strong&gt;: Long retention period (e.g., 7 days) for disaster recovery&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Sink&lt;/strong&gt;: Configure for idempotency (&lt;code&gt;pk.mode: record_key&lt;/code&gt;) and delete handling (&lt;code&gt;behavior.on.null.values: delete&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dead Letter Queue (DLQ)&lt;/strong&gt;: Route failed messages to DLQ to prevent pipeline halting&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: The Transactional Outbox + CDC pattern eliminates the dual-write anti-pattern, providing a production-grade, resilient architecture for data synchronization that guarantees data integrity and system availability.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. Elasticsearch Document Structure
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Availability Data Structure Options
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Quick overview&lt;/strong&gt;: We evaluate four availability strategies:  &lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Keep all availability in the primary database + Redis (Elasticsearch remains metadata-only).
&lt;/li&gt;
&lt;li&gt;Embed detailed availability inside Elasticsearch documents.
&lt;/li&gt;
&lt;li&gt;Hybrid flag in Elasticsearch plus full detail in Redis.
&lt;/li&gt;
&lt;li&gt;Maintain a separate availability-focused Elasticsearch index.
Use this summary to choose which option details to read.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Availability Only in Primary Database and Redis&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch&lt;/strong&gt;: No availability data stored&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis&lt;/strong&gt;: Real-time availability for all date ranges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary Database&lt;/strong&gt;: Source of truth for availability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single Source of Truth&lt;/strong&gt;: No data synchronization complexity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Always Accurate&lt;/strong&gt;: No stale availability data in Elasticsearch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simpler Architecture&lt;/strong&gt;: Fewer moving parts, less to maintain&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Guarantees&lt;/strong&gt;: Redis provides sub-millisecond updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Post-Filtering Required&lt;/strong&gt;: Must filter Elasticsearch results with Redis availability check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two-Step Process&lt;/strong&gt;: Elasticsearch search → Redis availability check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Slightly Higher Latency&lt;/strong&gt;: Additional Redis lookup after search&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Nested Availability in Elasticsearch&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Availability data embedded in room type documents&lt;/li&gt;
&lt;li&gt;Good for simple queries, limited scalability for date ranges&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Filtering in Search Phase&lt;/strong&gt;: Can filter by availability during Elasticsearch query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single Query&lt;/strong&gt;: All filtering happens in one place&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Synchronization&lt;/strong&gt;: Must keep Elasticsearch in sync with DB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stale Data Risk&lt;/strong&gt;: Availability in Elasticsearch may be outdated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Complexity&lt;/strong&gt;: Every booking requires Elasticsearch update&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Index Size&lt;/strong&gt;: Large index with date-based availability fields&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option 3: Hybrid Approach&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch&lt;/strong&gt;: Summary availability flag (e.g., "has_availability_next_30_days: true")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis&lt;/strong&gt;: Detailed real-time availability for specific dates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Primary Database&lt;/strong&gt;: Source of truth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Initial Filtering&lt;/strong&gt;: Elasticsearch can filter out hotels with no availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Accuracy&lt;/strong&gt;: Redis provides precise availability for date ranges&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Load&lt;/strong&gt;: Fewer hotels passed to Redis for availability check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best of Both Worlds&lt;/strong&gt;: Search performance + real-time accuracy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dual Maintenance&lt;/strong&gt;: Must update both Elasticsearch summary and Redis details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Complexity&lt;/strong&gt;: More components to manage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option 4: Separate Availability Index in Elasticsearch&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dedicated index for availability data&lt;/li&gt;
&lt;li&gt;Better for complex date range queries, requires joins&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Complex Queries&lt;/strong&gt;: Can handle sophisticated date range queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Separation of Concerns&lt;/strong&gt;: Availability data separate from hotel metadata&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Join Overhead&lt;/strong&gt;: Requires joining availability index with hotel index&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synchronization Complexity&lt;/strong&gt;: Multiple indexes to keep in sync&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Joins are slower than single-index queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommendation: Option 1 (Redis/DB Only) is Recommended&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Option 1 is the Default Choice:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Availability changes frequently&lt;/strong&gt; - Every booking updates availability in real-time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis is already optimized&lt;/strong&gt; - Sub-millisecond lookups, perfect for real-time data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplicity wins&lt;/strong&gt; - No sync complexity, no stale data risk&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance is acceptable&lt;/strong&gt; - Parallel execution (Elasticsearch + Redis) provides good performance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single source of truth&lt;/strong&gt; - Database is the authority, Redis is a performance cache (see Section 6 for database authority details)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Elasticsearch Cannot Handle OLTP Availability Updates: The "Update Penalty"
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The Fundamental Problem: Elasticsearch's Immutable Segment Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Elasticsearch is built on Apache Lucene, which stores data in immutable segments on disk. This design has a critical consequence: &lt;strong&gt;Elasticsearch does not support in-place updates or deletes&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How "Updates" Actually Work in Elasticsearch:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When a document is "updated" (e.g., changing &lt;code&gt;num_available_rooms&lt;/code&gt; from 10 to 9), Elasticsearch performs two operations:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Soft Delete&lt;/strong&gt;: Marks the old document (with 10 rooms) as "deleted"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Re-index&lt;/strong&gt;: Indexes a brand new document (with 9 rooms) into a new, small segment&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This means a simple transactional counter change becomes a &lt;strong&gt;full document re-index&lt;/strong&gt;, which is CPU-intensive and requires the entire document to be processed and analyzed again.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Compounding Cost: Segment Merging&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;High-frequency updates create thousands of tiny segments and massive numbers of soft-deleted documents. This is catastrophic for search performance because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Elasticsearch must open and query every tiny segment&lt;/li&gt;
&lt;li&gt;CPU cycles are wasted filtering out soft-deleted documents&lt;/li&gt;
&lt;li&gt;To combat this, Lucene runs a background &lt;strong&gt;segment merging&lt;/strong&gt; process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Segment Merging Overhead:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reads small segments&lt;/li&gt;
&lt;li&gt;Filters out soft-deleted documents&lt;/li&gt;
&lt;li&gt;Merges remaining "live" documents into new, larger segments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extremely resource-intensive&lt;/strong&gt;: Consumes substantial CPU, disk I/O, and memory&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The "Update Penalty" Impact:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If Elasticsearch were used for the OLTP availability workload:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;High-frequency updates (thousands per minute) would trigger constant, aggressive segment merging&lt;/li&gt;
&lt;li&gt;Intense background CPU and I/O activity would compete directly with primary OLAP search queries&lt;/li&gt;
&lt;li&gt;Result: &lt;strong&gt;Increased query latency and system instability&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is why using Elasticsearch for frequently updated, mutable data is a well-known architectural anti-pattern. The OLTP availability workload must be handled by a system designed for high-throughput, low-latency updates—Redis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;When to Consider Option 3 (Hybrid with Summary) - Edge Cases Only:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Very high query volume&lt;/strong&gt; - When you need to reduce Redis load by pre-filtering (rare)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large candidate sets&lt;/strong&gt; - When Elasticsearch returns 10,000+ hotels and you want to reduce Redis checks (unusual)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geographic filtering&lt;/strong&gt; - When initial geographic filter yields too many results (can be handled with better Elasticsearch queries)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: For most hotel search systems, Option 1 (Redis/DB Only) is sufficient and preferred due to its simplicity and real-time accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architecture Pattern (Option 1 - Recommended):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch&lt;/strong&gt;: Filters by location, amenities, room types, price (no availability)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis&lt;/strong&gt;: Validates availability for filtered hotels (date range check)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result&lt;/strong&gt;: Only hotels with availability are returned&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Performance Impact:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch search&lt;/strong&gt;: 25ms (filters by location, amenities, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis availability check&lt;/strong&gt;: 10ms (checks availability for filtered hotels)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total&lt;/strong&gt;: 35ms with parallel execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: For most hotel search systems, keeping availability only in Redis and the primary database is the recommended approach. It provides simplicity, real-time accuracy, and acceptable performance. Only use Elasticsearch for availability if you have very high query volumes and need to reduce Redis load through pre-filtering.&lt;/p&gt;

&lt;h3&gt;
  
  
  Hotel Document Schema
&lt;/h3&gt;

&lt;p&gt;Hotel documents in Elasticsearch use nested structures to represent the complex relationships between hotels and their room types:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Recommended Approach (Option 1 - Redis/DB Only):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hotel Document:
├── Basic Information (hotel_id, name, location, rating)
├── Amenities (pool, spa, gym, restaurant)
├── Location Data (geo_point, address, city, country)
└── Room Types (nested array)
    ├── Room Type 1 (room_type_id: "deluxe_001", type: "deluxe", capacity: 2, base_price: 299, amenities: ["wifi", "tv"])
    ├── Room Type 2 (room_type_id: "suite_garden_001", type: "suite", view: "garden", capacity: 4, base_price: 599, amenities: ["wifi", "tv", "balcony"])
    └── Room Type 3 (room_type_id: "suite_ocean_001", type: "suite", view: "ocean", capacity: 4, base_price: 799, amenities: ["wifi", "tv", "balcony", "ocean_view"])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Room types can have variations (e.g., "suite with garden view" vs "suite with ocean view"). Each variation has a unique &lt;code&gt;room_type_id&lt;/code&gt; used for availability tracking in Redis. (See "Important: Room Type ID in All Keys" in Section 6 for detailed explanation.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: No availability data in Elasticsearch. Availability is checked in Redis after Elasticsearch filtering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Alternative Approach (Option 3 - Hybrid with Summary):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hotel Document:
├── Basic Information (hotel_id, name, location, rating)
├── Amenities (pool, spa, gym, restaurant)
├── Location Data (geo_point, address, city, country)
└── Room Types (nested array)
    ├── Room Type 1 (room_type_id: "deluxe_001", type: "deluxe", capacity: 2, base_price: 299, amenities: ["wifi", "tv"])
    ├── Room Type 2 (room_type_id: "suite_garden_001", type: "suite", view: "garden", capacity: 4, base_price: 599, amenities: ["wifi", "tv", "balcony"])
    ├── Room Type 3 (room_type_id: "suite_ocean_001", type: "suite", view: "ocean", capacity: 4, base_price: 799, amenities: ["wifi", "tv", "balcony", "ocean_view"])
    └── Availability Summary
        ├── has_availability_next_30_days: true/false
        ├── price_range: {min: 199, max: 599}
        └── last_updated: timestamp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What is Availability Summary? (Only for Option 3 - Hybrid Approach)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If using the hybrid approach, the Availability Summary contains aggregated availability information used for initial filtering:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fields:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;has_availability_next_30_days&lt;/strong&gt;: Boolean flag indicating if hotel has any availability in the next 30 days&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Purpose: Quickly filter out hotels with no availability&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;true&lt;/code&gt; means hotel has at least one room available in next 30 days&lt;/li&gt;
&lt;li&gt;Updated: Periodically (e.g., every 15 minutes or when significant availability changes)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;price_range&lt;/strong&gt;: Minimum and maximum prices across all available rooms&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Purpose: Pre-filter by price range before detailed Redis check&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;{min: 199, max: 599}&lt;/code&gt; means cheapest room is $199, most expensive is $599&lt;/li&gt;
&lt;li&gt;Updated: When room prices change significantly&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;last_updated&lt;/strong&gt;: Timestamp of when summary was last refreshed&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Purpose: Track data freshness, implement cache invalidation&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;2024-06-15T10:30:00Z&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Important Notes:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Not Real-Time&lt;/strong&gt;: Summary is updated periodically, not on every booking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coarse Filtering&lt;/strong&gt;: Used to eliminate hotels with no availability, not for final availability check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Still Required&lt;/strong&gt;: Detailed availability check still happens in Redis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trade-off&lt;/strong&gt;: Summary reduces Redis load but adds sync complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When Availability Summary is Updated:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Periodic refresh (every 15-30 minutes)&lt;/li&gt;
&lt;li&gt;When availability drops to zero (no rooms available)&lt;/li&gt;
&lt;li&gt;When availability becomes available after being zero&lt;/li&gt;
&lt;li&gt;Manual refresh triggered by significant booking events&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Understanding Nested Structures (Beginner-Friendly)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is a Nested Structure?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Imagine you have a hotel document. Without nested structures, you might store room types like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hotel: "Luxury Hotel"
Room Types: ["deluxe", "suite", "standard"]
Room Prices: [299, 599, 199]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Problem&lt;/strong&gt;: If you search for hotels with "deluxe" rooms under $300, Elasticsearch might match this hotel incorrectly. Why? Because it sees "deluxe" (✓) and "$199" (✓) in the arrays, but doesn't know that "$199" belongs to "standard", not "deluxe". It treats all array items as independent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution - Nested Structures:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;With nested structures, each room type (including variations) is stored as a separate "mini-document" inside the hotel document:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hotel: "Luxury Hotel"
Room Types (nested):
  - Room Type 1: {room_type_id: "deluxe_001", type: "deluxe", price: 299, capacity: 2}
  - Room Type 2: {room_type_id: "suite_garden_001", type: "suite", view: "garden", price: 599, capacity: 4}
  - Room Type 3: {room_type_id: "suite_ocean_001", type: "suite", view: "ocean", price: 799, capacity: 4}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now when you search for "deluxe rooms under $300", Elasticsearch checks each room type as a complete unit. It finds "deluxe" with price "$299" as a matched pair, which is correct.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Room Type Variations:&lt;/strong&gt;&lt;br&gt;
The same room type category (e.g., "suite") can have multiple variations with different amenities or views:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"suite with garden view" (room_type_id: "suite_garden_001")&lt;/li&gt;
&lt;li&gt;"suite with ocean view" (room_type_id: "suite_ocean_001")&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each variation has its own unique &lt;code&gt;room_type_id&lt;/code&gt; used for availability tracking in Redis, allowing separate availability management for each variation. (See Section 6 for how &lt;code&gt;room_type_id&lt;/code&gt; is used in Redis keys.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-World Analogy:&lt;/strong&gt;&lt;br&gt;
Think of a hotel document as a filing cabinet. Without nesting, all room information is in one big drawer mixed together. With nesting, each room type has its own folder in the drawer, keeping related information together.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How Does It Help?&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Accurate Filtering&lt;/strong&gt;: When filtering by "deluxe rooms under $300", you get hotels that actually have deluxe rooms at that price, not hotels that have a deluxe room AND a cheaper room (but unrelated).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Independent Queries&lt;/strong&gt;: You can ask "show me hotels where the suite has ocean view but the standard room doesn't" - nested structures allow you to query each room type independently.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Better Performance&lt;/strong&gt;: Elasticsearch can optimize queries better because it knows which fields belong together, reducing false matches.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Why Nested Model Over Parent-Child: Optimizing for OLAP Reads
&lt;/h3&gt;

&lt;p&gt;Elasticsearch offers two options for modeling one-to-many relationships (hotel → room types): &lt;strong&gt;Nested&lt;/strong&gt; and &lt;strong&gt;Parent-Child&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Nested Model (Selected):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hotel and room types stored as a &lt;strong&gt;single document&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Parent (hotel) and children (rooms) are &lt;strong&gt;co-located in the same Lucene block&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Significantly faster queries, low memory overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: High update cost—updating any field forces re-indexing of the entire document (hotel + all rooms)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Parent-Child Model (Rejected):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hotel (parent) and rooms (children) indexed as &lt;strong&gt;separate documents&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pros&lt;/strong&gt;: Low update cost—updating one child only re-indexes that child&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: Slower queries due to join overhead, higher memory overhead (requires in-memory "join list")&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The Decision: OLTP Workload Removed = No Compromise Needed&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Nested vs Parent-Child debate is a microcosm of the entire OLAP/OLTP problem:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nested model&lt;/strong&gt; optimizes for reads (OLAP)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parent-Child model&lt;/strong&gt; compromises on read performance to gain update efficiency (a step towards OLTP)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Because the high-frequency OLTP workload (availability) has been completely removed and placed in Redis&lt;/strong&gt;, the system is no longer forced to compromise. The correct choice is to &lt;strong&gt;optimize the search index for its primary, read-heavy workload&lt;/strong&gt;. Therefore, the &lt;strong&gt;Nested model is used&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rationale:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Infrequent updates to static hotel data (handled by asynchronous CDC pipeline) will incur the higher re-indexing cost&lt;/li&gt;
&lt;li&gt;This is a worthwhile trade-off for maximizing query performance for all users&lt;/li&gt;
&lt;li&gt;The update penalty only affects static data changes (hotel name, amenities), not high-frequency availability updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Index Refresh Configuration:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For this OLAP index, &lt;code&gt;index.refresh_interval&lt;/code&gt; can be set to a high value (e.g., 30s or 60s) or disabled during bulk loads&lt;/li&gt;
&lt;li&gt;A high-frequency OLTP workload would require a low interval (e.g., default 1s), triggering constant, costly refreshes&lt;/li&gt;
&lt;li&gt;With OLTP removed, the index can be tuned for maximum indexing performance and stability&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: By moving the OLTP availability workload to Redis, Elasticsearch can be fully optimized for OLAP reads using the Nested model, without compromising on update performance for the primary search workload.&lt;/p&gt;
&lt;h3&gt;
  
  
  Field Mapping Strategy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;What is Field Mapping?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Field mapping tells Elasticsearch how to store and index each field in your documents. Different field types enable different query capabilities and performance characteristics. Choosing the right field type is crucial for search performance and accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Field Type Breakdown:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;hotel_id&lt;/strong&gt;: keyword (exact matching, aggregations)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: Hotel IDs are unique identifiers that need exact matching only&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: "Find hotel with ID 'hotel_123'" or "Count hotels by ID"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Very fast exact lookups, no text analysis overhead&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;name&lt;/strong&gt;: text with keyword subfield (full-text search + exact matching)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: Hotel names need both fuzzy text search AND exact matching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;text field&lt;/strong&gt;: Handles typos, partial matches, relevance scoring (e.g., "Luxury" matches "Luxurious")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;keyword subfield&lt;/strong&gt;: Exact matching for autocomplete, sorting, or filtering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Text search finds "Luxury Downtown Hotel" even if user types "Luxry Downtown", while keyword enables exact filtering&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;location&lt;/strong&gt;: geo_point (geospatial queries, distance calculations)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: Geographic coordinates need special handling for distance queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: "Find hotels within 5km of coordinates [34.0522, -118.2437]"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Optimized spatial indexing for fast distance calculations&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;amenities&lt;/strong&gt;: keyword array (filtering, faceted search)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: Amenities are standardized discrete values (e.g., ["pool", "spa", "gym"]) that need exact matching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Filter by "pool AND spa" - must match exact amenity values, not partial text matches&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Why not text&lt;/strong&gt;: &lt;/li&gt;
&lt;li&gt;Text analysis would tokenize values, breaking exact matching needed for filtering&lt;/li&gt;
&lt;li&gt;Amenities are categorical data (not free-form text), so they should be matched exactly&lt;/li&gt;
&lt;li&gt;Example: If "pool table" appears in hotel description, &lt;code&gt;text&lt;/code&gt; type might match "pool" filter incorrectly in the wrong field (pool table = billiards, not swimming pool)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Fast exact matching for multiple amenities simultaneously&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;room_types&lt;/strong&gt;: nested object (complex room type queries)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: Room types have their own attributes (price, capacity, amenities) that need independent filtering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: "Find hotels with deluxe rooms under $300" - must check room type AND price together&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Enables filtering within nested documents without false matches&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;rating&lt;/strong&gt;: float (range queries, sorting)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Why&lt;/strong&gt;: Ratings are numeric values that need range queries and sorting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: "Find hotels with rating &amp;gt;= 4.0" or "Sort by rating descending"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Optimized for numeric comparisons and sorting operations&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Right field type = Accurate results + Fast queries&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: Each field type is optimized for specific query patterns. Text fields handle fuzzy matching, keyword fields handle exact matching, geo_point handles distance calculations, and nested objects handle complex relationships. Choosing the right type ensures both query accuracy and performance.&lt;/p&gt;
&lt;h2&gt;
  
  
  4. Elasticsearch Index Design
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Index Mapping Configuration
&lt;/h3&gt;

&lt;p&gt;The hotel index uses a carefully designed mapping to optimize for different query patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text Fields&lt;/strong&gt;: Standard analyzer for full-text search with stemming and stop words

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stemming&lt;/strong&gt;: Reduces words to their root form (e.g., "swimming" → "swim", "pools" → "pool") so that variations of the same word match&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stop Words&lt;/strong&gt;: Removes common words like "the", "and", "is" that don't add search value&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keyword Fields&lt;/strong&gt;: Exact matching for filters and aggregations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geo Fields&lt;/strong&gt;: Geo-point mapping for distance-based queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nested Fields&lt;/strong&gt;: Specialized mapping for parent-child relationships&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Nested Document Configuration
&lt;/h3&gt;

&lt;p&gt;From an index design perspective, nested documents provide:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-level Filtering&lt;/strong&gt;: Enables filtering by both hotel amenities (pool, spa) and room-level amenities (wifi, minibar) simultaneously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Nested queries are optimized for parent-child relationships with efficient BitSet operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Flexibility&lt;/strong&gt;: Easy to add new room types or room attributes without schema changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: The index design balances search performance with query flexibility, using appropriate field types for different use cases. For the recommended approach (Option 1), availability data is not stored in Elasticsearch - it's managed entirely in Redis and the primary database for real-time accuracy.&lt;/p&gt;
&lt;h2&gt;
  
  
  5. Query Patterns and Optimization
&lt;/h2&gt;
&lt;h3&gt;
  
  
  Filter-First Execution Strategy
&lt;/h3&gt;

&lt;p&gt;Elasticsearch optimizes query performance by executing filters before expensive scoring operations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Execution Order:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Filter Context&lt;/strong&gt;: Execute all filters, create BitSets (fast, cached)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Context&lt;/strong&gt;: Execute text search, create BitSets (fast, cached)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BitSet Intersection&lt;/strong&gt;: Combine filters with AND operations (very fast)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring Phase&lt;/strong&gt;: Apply BM25 scoring only to filtered results (expensive but small set)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Performance Impact:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Without Filter-First&lt;/strong&gt;: Score 50,000 documents, then filter to 1,000&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;With Filter-First&lt;/strong&gt;: Filter to 1,000 documents, then score only those&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance Improvement&lt;/strong&gt;: 25-50x faster execution&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example: Search for "Luxury hotels with pool and spa in Los Angeles"&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query Components:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Text search: "Luxury" (query context - needs scoring)&lt;/li&gt;
&lt;li&gt;Location filter: "Los Angeles" (filter context - no scoring)&lt;/li&gt;
&lt;li&gt;Amenity filter: "pool AND spa" (filter context - no scoring)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Without Filter-First (Inefficient):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Text search finds 50,000 hotels with "luxury" in name/description&lt;/li&gt;
&lt;li&gt;BM25 scoring calculated for all 50,000 hotels (expensive!)&lt;/li&gt;
&lt;li&gt;Field boosting applied to all 50,000 hotels&lt;/li&gt;
&lt;li&gt;Results ranked: Hotel_A (score 0.95), Hotel_B (score 0.92), ...&lt;/li&gt;
&lt;li&gt;Location filter applied: 50,000 → 2,000 hotels in Los Angeles&lt;/li&gt;
&lt;li&gt;Amenity filter applied: 2,000 → 800 hotels with pool AND spa&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total cost&lt;/strong&gt;: 50,000 scoring operations + ranking + filtering&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;With Filter-First (Optimized):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Location filter: Create BitSet A of hotels in "Los Angeles" → 2,000 hotels (fast, cached)&lt;/li&gt;
&lt;li&gt;Amenity filter: Create BitSet B of hotels with "pool AND spa" → 5,000 hotels (fast, cached)&lt;/li&gt;
&lt;li&gt;Text search: Create BitSet C of hotels containing "luxury" → 50,000 hotels (fast - from inverted index)&lt;/li&gt;
&lt;li&gt;BitSet intersection: A ∩ B ∩ C = 600 hotels (very fast - bitwise AND operation)&lt;/li&gt;
&lt;li&gt;BM25 scoring: Calculate scores for only 600 hotels (expensive but small set)&lt;/li&gt;
&lt;li&gt;Field boosting: Apply to only 600 hotels&lt;/li&gt;
&lt;li&gt;Results ranked: Final 600 hotels with scores&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total cost&lt;/strong&gt;: 600 scoring operations (25-50x fewer than without filter-first)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Text search BitSet creation&lt;/strong&gt;: Still processes ALL documents (50,000) from inverted index - this is fast&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filter BitSet creation&lt;/strong&gt;: Processes ALL documents (fast, uses doc values)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BitSet intersection&lt;/strong&gt;: Very fast - just bitwise AND operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring optimization&lt;/strong&gt;: Only scores the 600 hotels in the intersection, not all 50,000&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The magic&lt;/strong&gt;: BitSet operations are fast even on large sets, but scoring is expensive - so we minimize scoring by intersecting BitSets first&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why This Works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Filters are fast&lt;/strong&gt;: BitSet operations are O(1) per document&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring is expensive&lt;/strong&gt;: BM25 calculation is O(term_frequency) per document&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduce scoring set&lt;/strong&gt;: Filter first to minimize expensive operations
&lt;strong&gt;Interview Takeaway: BitSet Caching and Precomputation&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Caching Strategy&lt;/strong&gt;: BitSets are cached for repeated queries (85-90% cache hit rate for popular filters)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Precomputation Opportunity&lt;/strong&gt;: Amenity-based BitSets can be precomputed since amenities don't change frequently

&lt;ul&gt;
&lt;li&gt;Common filters like "pool", "spa", "gym" can be precomputed and stored in memory&lt;/li&gt;
&lt;li&gt;Only updated when hotel amenities are added/removed (rare event)&lt;/li&gt;
&lt;li&gt;Provides instant filter execution without recalculating BitSets on every query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Insight&lt;/strong&gt;: Identify filters that change infrequently (amenities, location) vs. frequently (availability, price) - precompute the stable ones&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Real-World Impact:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Query latency&lt;/strong&gt;: 200ms → 30ms (6.7x faster)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CPU usage&lt;/strong&gt;: 50,000 scoring operations → 600 operations (83x reduction)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt;: Cached BitSets reused for popular queries (85-90% cache hit rate)&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Text Search with Multi-Match
&lt;/h3&gt;

&lt;p&gt;Multi-match queries handle complex text search across multiple fields:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Field Boosting&lt;/strong&gt;: Hotel name gets higher weight than description&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Types&lt;/strong&gt;: best_fields, most_fields, cross_fields for different matching strategies&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fuzzy Matching&lt;/strong&gt;: Automatic typo tolerance and synonym expansion&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phrase Matching&lt;/strong&gt;: Exact phrase matching with proximity scoring&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  Nested Queries for Room Type Filtering
&lt;/h3&gt;

&lt;p&gt;Nested queries enable complex filtering within room type documents:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Room Type Filtering&lt;/strong&gt;: Find hotels with specific room types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Capacity Filtering&lt;/strong&gt;: Filter by guest count requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amenity Filtering&lt;/strong&gt;: Room-level amenities (wifi, minibar, ocean_view)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price Range Filtering&lt;/strong&gt;: Filter by room type pricing&lt;/li&gt;
&lt;/ul&gt;
&lt;h3&gt;
  
  
  How Filtering Works (Inverted Index vs Filtering)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Understanding the Difference:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Text Search (Uses Inverted Index):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inverted Index&lt;/strong&gt;: Maps each word → list of documents containing that word&lt;/li&gt;
&lt;li&gt;Example: "deluxe" → [Hotel_A, Hotel_B, Hotel_C]&lt;/li&gt;
&lt;li&gt;Used for: Finding documents that contain specific words&lt;/li&gt;
&lt;li&gt;Fast for: "Find hotels with 'deluxe' in the name"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Filtering (Uses Doc Values):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Doc Values&lt;/strong&gt;: Columnar storage format - stores all values for a field together&lt;/li&gt;
&lt;li&gt;Example: All room prices stored in one column: [299, 599, 199, 450, ...]&lt;/li&gt;
&lt;li&gt;Used for: Exact matching, range queries, aggregations&lt;/li&gt;
&lt;li&gt;Fast for: "Find hotels with deluxe rooms under $300"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why Different Data Structures?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Inverted indexes are optimized for text search (finding which documents contain terms), but they're inefficient for filtering operations like numeric comparisons or exact matches. Doc values store data in a columnar format that's optimized for filtering, sorting, and aggregations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparison to Columnar Databases:&lt;/strong&gt;&lt;br&gt;
Elasticsearch's doc values use a similar storage strategy to columnar databases (e.g., Amazon Redshift, Google BigQuery, ClickHouse):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Columnar Storage&lt;/strong&gt;: All values for a field are stored together in a column (e.g., all prices: [299, 599, 199, 450])&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized for Analytics&lt;/strong&gt;: Fast filtering, sorting, and aggregations on numeric/categorical fields&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Difference&lt;/strong&gt;: Columnar databases store ALL data in columnar format, while Elasticsearch uses a hybrid approach:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Inverted indexes&lt;/strong&gt; (row-oriented) for text search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Doc values&lt;/strong&gt; (columnar) for filtering/aggregations&lt;/li&gt;
&lt;li&gt;This hybrid approach gives Elasticsearch both fast text search AND fast filtering in one system&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example: "Deluxe Rooms Under $300" Query&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Let's trace through how Elasticsearch handles this query:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Understanding the Data Structure&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Hotel Document:
  hotel_id: "hotel_123"
  name: "Luxury Hotel"
  room_types (nested):
    - {type: "deluxe", price: 299, capacity: 2}
    - {type: "suite", price: 599, capacity: 4}
    - {type: "standard", price: 199, capacity: 2}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Pricing Reality (Interview Tip)&lt;/strong&gt;: In production the &lt;code&gt;price&lt;/code&gt; stored in Elasticsearch is usually a summary (&lt;code&gt;min_price_next_30_days&lt;/code&gt;, &lt;code&gt;max_price&lt;/code&gt;, or a bucketed price range). This lets Elasticsearch filter aggressively without forcing constant reindexing. The precise per-date price (and availability) comes from Redis/the pricing service during the availability step, ensuring tomorrow’s rate (e.g., $400) is accurate even if Elasticsearch still lists $300 as the minimum. The rule of thumb: &lt;strong&gt;use Elasticsearch for coarse filtering, Redis for rapidly-changing truth.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Nested Query Execution (Using BitSet Operations)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Access Nested Documents&lt;/strong&gt;: Elasticsearch treats each nested room type as a separate "mini-document"&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Room Type 1: {type: "deluxe", price: 299}&lt;/li&gt;
&lt;li&gt;Room Type 2: {type: "suite", price: 599}&lt;/li&gt;
&lt;li&gt;Room Type 3: {type: "standard", price: 199}&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Filter by Type (BitSet Creation)&lt;/strong&gt;: Using doc values, create BitSet of nested documents with type = "deluxe"&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Room Type 1: type = "deluxe" ✓ → BitSet bit 1 = true&lt;/li&gt;
&lt;li&gt;Room Type 2: type = "suite" ✗ → BitSet bit 2 = false&lt;/li&gt;
&lt;li&gt;Room Type 3: type = "standard" ✗ → BitSet bit 3 = false&lt;/li&gt;
&lt;li&gt;Result: BitSet A = &lt;a href="https://dev.toonly%20Room%20Type%201%20matches"&gt;1, 0, 0&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Filter by Price (BitSet Creation)&lt;/strong&gt;: Using doc values, create BitSet of nested documents with price &amp;lt; 300&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Room Type 1: price = 299 &amp;lt; 300 ✓ → BitSet bit 1 = true&lt;/li&gt;
&lt;li&gt;Room Type 2: price = 599 &amp;lt; 300 ✗ → BitSet bit 2 = false&lt;/li&gt;
&lt;li&gt;Room Type 3: price = 199 &amp;lt; 300 ✓ → BitSet bit 3 = true&lt;/li&gt;
&lt;li&gt;Result: BitSet B = &lt;a href="https://dev.toRoom%20Type%201%20and%203%20match"&gt;1, 0, 1&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;BitSet Intersection&lt;/strong&gt;: A ∩ B = [1, 0, 0] ∩ [1, 0, 1] = [1, 0, 0]&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Only Room Type 1 matches both conditions (bitwise AND operation)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Return Parent Document&lt;/strong&gt;: If any nested document matches, the parent hotel document is included&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Result: Hotel_123 is returned&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Key Insight:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Doc Values&lt;/strong&gt;: Used to create BitSets for filtering operations (fast, cached)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BitSet Operations&lt;/strong&gt;: Filtering by type and price uses BitSet intersection (bitwise AND)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nested Documents&lt;/strong&gt;: Each room type is stored separately, allowing independent BitSet creation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: BitSet operations are O(1) per document, much faster than scoring all documents&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Combination&lt;/strong&gt;: The query can use both - text search for hotel names AND BitSet filtering for room types&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Performance Comparison:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Operation&lt;/th&gt;
&lt;th&gt;Data Structure&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Text search ("deluxe" in name)&lt;/td&gt;
&lt;td&gt;Inverted Index&lt;/td&gt;
&lt;td&gt;Very Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filter (price &amp;lt; 300)&lt;/td&gt;
&lt;td&gt;Doc Values&lt;/td&gt;
&lt;td&gt;Very Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Filter (type = "deluxe")&lt;/td&gt;
&lt;td&gt;Doc Values&lt;/td&gt;
&lt;td&gt;Very Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Nested filter (deluxe AND price &amp;lt; 300)&lt;/td&gt;
&lt;td&gt;Doc Values + Nested&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inverted indexes are great for "find documents containing words"&lt;/li&gt;
&lt;li&gt;Doc values are great for "find documents matching criteria"&lt;/li&gt;
&lt;li&gt;Nested structures allow filtering on related data (room types) independently&lt;/li&gt;
&lt;li&gt;Combining both gives you powerful search + accurate filtering&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Query Execution Order and Performance
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Optimal Query Structure (Option 1 - Recommended):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Geographic Filters&lt;/strong&gt;: city, radius (most selective)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amenity Filters&lt;/strong&gt;: pool, spa, gym (moderately selective)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Room Type Filters&lt;/strong&gt;: nested queries (selective)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Search&lt;/strong&gt;: multi-match queries (expensive)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scoring&lt;/strong&gt;: BM25 calculation (expensive but small set)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: Date range validation is handled by Redis after Elasticsearch filtering, not during the Elasticsearch query phase. This aligns with Option 1 (Redis/DB Only) where availability is not stored in Elasticsearch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: Filter-first execution with BitSet caching provides 25-50x performance improvement by reducing expensive scoring operations to only the documents that pass all filters. Detailed BitSet mechanics and caching strategies are explained in the "Filter-First Execution Strategy" section above.&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Redis Availability Architecture
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Architecture Note&lt;/strong&gt;: Redis (or AWS ElastiCache) is used exclusively for availability data. Search result caching and other data caching are handled by Elasticsearch's built-in caching mechanisms or application-level caching.&lt;/p&gt;

&lt;h3&gt;
  
  
  Redis Data Structures for Availability
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Important: Room Type ID in All Keys&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;All Redis availability keys include &lt;code&gt;room_type_id&lt;/code&gt; because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Hotels have multiple room types and variations (e.g., "suite with garden view" vs "suite with ocean view") with separate availability&lt;/li&gt;
&lt;li&gt;Each room type variation has a unique &lt;code&gt;room_type_id&lt;/code&gt; (e.g., "suite_garden_001", "suite_ocean_001")&lt;/li&gt;
&lt;li&gt;Elasticsearch filters by room type and returns &lt;code&gt;room_type_id&lt;/code&gt;, which is used to build Redis keys&lt;/li&gt;
&lt;li&gt;This allows separate availability tracking for each variation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Redis uses multiple data structures optimized for availability patterns:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Simple Key-Value Pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Key Format&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id:check_in:check_out&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Value&lt;/strong&gt;: "1" (at least one room available) or number of available rooms (e.g., "5" for 5 rooms available) or null (unavailable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Simple availability checks (not recommended due to update complexity)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(1) lookup time&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How the Key Gets Created:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Booking Event&lt;/strong&gt;: When a booking is made or cancelled for a specific room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Room Type ID from Elasticsearch&lt;/strong&gt;: Elasticsearch nested query filters hotels by room type and returns &lt;code&gt;room_type_id&lt;/code&gt; (e.g., "suite_garden_001", "suite_ocean_001")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Generation&lt;/strong&gt;: Combine hotel_id, room_type_id, check_in date, and check_out date

&lt;ul&gt;
&lt;li&gt;Example: &lt;code&gt;availability:hotel_123:suite_garden_001:2024-06-15:2024-06-17&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Check&lt;/strong&gt;: Query database for availability of that room type variation across all dates in range&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Creation&lt;/strong&gt;: If all dates are available for that room type variation, set key with value "1" (at least one available) or the number of available rooms (e.g., "5" for 5 rooms)

&lt;ul&gt;
&lt;li&gt;Redis command: &lt;code&gt;SET availability:hotel_123:suite_garden_001:2024-06-15:2024-06-17 "1" EX 14400&lt;/code&gt; (4 hour TTL) - for at least one room&lt;/li&gt;
&lt;li&gt;Redis command: &lt;code&gt;SET availability:hotel_123:suite_garden_001:2024-06-15:2024-06-17 "5" EX 14400&lt;/code&gt; (4 hour TTL) - for 5 available rooms&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Key Deletion&lt;/strong&gt;: If any date becomes unavailable for that room type variation, delete the key

&lt;ul&gt;
&lt;li&gt;Redis command: &lt;code&gt;DEL availability:hotel_123:suite_garden_001:2024-06-15:2024-06-17&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Trigger&lt;/strong&gt;: Keys are created/updated/deleted when:

&lt;ul&gt;
&lt;li&gt;Bookings are confirmed for specific room type variations&lt;/li&gt;
&lt;li&gt;Bookings are cancelled for specific room type variations&lt;/li&gt;
&lt;li&gt;New availability is added for specific room type variations&lt;/li&gt;
&lt;li&gt;Periodic sync from database&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Challenge: Finding Which Keys to Modify - Combinatorial Explosion&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Problem:&lt;/strong&gt;&lt;br&gt;
When a booking is made for a specific date and room type variation (e.g., 2024-06-16, suite with ocean view), you need to invalidate all keys that include that date for that room type variation. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Combinatorial Explosion Example:&lt;/strong&gt;&lt;br&gt;
For a 30-day booking window, there are (30 × 31) / 2 = 465 possible check-in/check-out combinations. A single booking on one day (e.g., June 16th) would require finding and invalidating every single one of these 465 keys that contains "June 16th". This creates a combinatorial explosion of keys that must be tracked and updated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Example Keys to Invalidate:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;availability:hotel_123:suite_ocean_001:2024-06-15:2024-06-17&lt;/code&gt; (includes 2024-06-16)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;availability:hotel_123:suite_ocean_001:2024-06-16:2024-06-18&lt;/code&gt; (includes 2024-06-16)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;availability:hotel_123:suite_ocean_001:2024-06-14:2024-06-17&lt;/code&gt; (includes 2024-06-16)&lt;/li&gt;
&lt;li&gt;... and 462 more keys for a 30-day window&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But Redis doesn't provide an efficient way to find all keys matching a date pattern.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Solution Options:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Option 1: Pattern Matching (Inefficient)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use &lt;code&gt;KEYS availability:hotel_123:*&lt;/code&gt; to find all keys for a hotel&lt;/li&gt;
&lt;li&gt;Filter keys that include the affected date&lt;/li&gt;
&lt;li&gt;Delete matching keys&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Problem&lt;/strong&gt;: &lt;code&gt;KEYS&lt;/code&gt; command is O(N) and blocks Redis, not suitable for production&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option 2: Secondary Index (Recommended)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Maintain a separate index tracking which keys exist for each date and room type variation&lt;/li&gt;
&lt;li&gt;Use a hash: &lt;code&gt;availability_index:hotel_123:suite_ocean_001:2024-06-16&lt;/code&gt; → Set of all keys containing this date for this room type variation&lt;/li&gt;
&lt;li&gt;When booking affects 2024-06-16 for suite with ocean view:

&lt;ol&gt;
&lt;li&gt;Get all keys from index: &lt;code&gt;SMEMBERS availability_index:hotel_123:suite_ocean_001:2024-06-16&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Delete all those keys: &lt;code&gt;DEL key1 key2 key3...&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Remove index entry: &lt;code&gt;DEL availability_index:hotel_123:suite_ocean_001:2024-06-16&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When creating a key&lt;/strong&gt;: Add key to index for each date in range and room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(M) where M = number of keys containing that date for that room type variation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Option 3: Hash Structure (Better Alternative) - Solves Combinatorial Explosion&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use hash structure instead of simple keys (see Hash Structure section)&lt;/li&gt;
&lt;li&gt;Key format: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Hash fields: Individual dates (2024-06-15, 2024-06-16, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Solves the Problem&lt;/strong&gt;: Instead of 465 keys for a 30-day window, you have only 30 hash fields (one per day)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;When booking affects 2024-06-16 for suite with garden view&lt;/strong&gt;: 

&lt;ul&gt;
&lt;li&gt;Simply update/delete the hash field: &lt;code&gt;HDEL availability:hotel_123:suite_garden_001 2024-06-16&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;No need to find which keys to modify - directly update the specific date field&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(1) - directly access the date field for specific room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Linear growth (30 fields for 30 days) vs. quadratic growth (465 keys for 30 days)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Recommendation:&lt;/strong&gt;&lt;br&gt;
For simple availability checks, &lt;strong&gt;Hash Structure is preferred&lt;/strong&gt; because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No need to track which keys to invalidate&lt;/li&gt;
&lt;li&gt;Direct date-based access (O(1))&lt;/li&gt;
&lt;li&gt;Easier to update individual dates&lt;/li&gt;
&lt;li&gt;More efficient for date range queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hash Structure:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Key Format&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Fields&lt;/strong&gt;: Individual dates representing nights (2024-06-15, 2024-06-16, 2024-06-17, etc.)

&lt;ul&gt;
&lt;li&gt;Each date field represents availability for that night&lt;/li&gt;
&lt;li&gt;Example: Field "2024-06-15" = availability for the night of June 15th&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Values&lt;/strong&gt;: JSON object with &lt;code&gt;room_count&lt;/code&gt; and &lt;code&gt;last_updated&lt;/code&gt; (price comes from Elasticsearch/database, not Redis)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Detailed availability per room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(1) field access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Examples:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;availability:hotel_123:deluxe_001&lt;/code&gt; (for deluxe rooms)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;availability:hotel_123:suite_garden_001&lt;/code&gt; (for suite with garden view)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;availability:hotel_123:suite_ocean_001&lt;/code&gt; (for suite with ocean view)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Advantage: Direct Date-Based Updates&lt;/strong&gt;&lt;br&gt;
Unlike the simple key-value pattern, hash structure allows direct updates to specific dates without needing to find which keys to modify. When a booking affects 2024-06-16 for a specific room type variation (e.g., suite with garden view), you directly update that hash field - no pattern matching or secondary indexes needed.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How the Hash Gets Created:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initial Creation&lt;/strong&gt;: When hotel availability data is loaded into Redis for a specific room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Key&lt;/strong&gt;: Create hash with key &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Example: &lt;code&gt;availability:hotel_123:suite_garden_001&lt;/code&gt; (for suite with garden view)&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;availability:hotel_123:suite_ocean_001&lt;/code&gt; (for suite with ocean view)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Field Population&lt;/strong&gt;: For each date with availability, add a field:

&lt;ul&gt;
&lt;li&gt;Redis command: &lt;code&gt;HSET availability:hotel_123:suite_garden_001 2024-06-15 '{"room_count":5,"last_updated":"2024-06-15T10:30:00Z"}'&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Redis command: &lt;code&gt;HSET availability:hotel_123:suite_garden_001 2024-06-16 '{"room_count":3,"last_updated":"2024-06-15T10:30:00Z"}'&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field Updates&lt;/strong&gt;: When availability changes for that room type variation:

&lt;ul&gt;
&lt;li&gt;Update specific date field: &lt;code&gt;HSET availability:hotel_123:suite_garden_001 2024-06-15 '{"room_count":4,"last_updated":"2024-06-15T14:20:00Z"}'&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field Deletion&lt;/strong&gt;: When date becomes unavailable for that room type variation:

&lt;ul&gt;
&lt;li&gt;Remove field: &lt;code&gt;HDEL availability:hotel_123:suite_garden_001 2024-06-17&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTL&lt;/strong&gt;: Set expiration on entire hash: &lt;code&gt;EXPIRE availability:hotel_123:suite_garden_001 14400&lt;/code&gt; (4 hours)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Triggers&lt;/strong&gt;: Hash is updated when:

&lt;ul&gt;
&lt;li&gt;Bookings change availability for specific room type variations and dates&lt;/li&gt;
&lt;li&gt;Periodic batch sync from database (availability data only)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h3&gt;
  
  
  Atomic Operations for OLTP Transactions: Redis Hash Design
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;The OLTP Transaction Pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The entire booking transaction (decrementing room count) is a &lt;strong&gt;single, atomic Redis command&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HINCRBY availability:hotel_123:suite_garden_001 2024-06-15 -1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why This Works for OLTP:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;O(1) Performance&lt;/strong&gt;: Constant time operation, designed for high-throughput scenarios&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic&lt;/strong&gt;: The increment/decrement operation is atomic—no race conditions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Single Operation&lt;/strong&gt;: No multi-step process, no re-indexing, no segment merging&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conceptual Opposite of Elasticsearch&lt;/strong&gt;: This is the exact opposite of Elasticsearch's multi-stage, resource-intensive re-indexing "update"&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Hash Structure Design:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redis Key&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt; (e.g., &lt;code&gt;availability:hotel_123:suite_garden_001&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Fields&lt;/strong&gt;: &lt;code&gt;{date}&lt;/code&gt; (e.g., &lt;code&gt;2024-06-15&lt;/code&gt;, &lt;code&gt;2024-06-16&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Values&lt;/strong&gt;: Integer representing room count (e.g., &lt;code&gt;10&lt;/code&gt;, &lt;code&gt;5&lt;/code&gt;, &lt;code&gt;0&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Alternative: JSON Value with Room Count&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The document shows JSON values with &lt;code&gt;room_count&lt;/code&gt; and &lt;code&gt;last_updated&lt;/code&gt;. For pure OLTP counter operations, you can also use simple integer values:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Simple Integer&lt;/strong&gt;: &lt;code&gt;HSET availability:hotel_123:suite_garden_001 2024-06-15 10&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic Decrement&lt;/strong&gt;: &lt;code&gt;HINCRBY availability:hotel_123:suite_garden_001 2024-06-15 -1&lt;/code&gt; → Result: &lt;code&gt;9&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic Increment&lt;/strong&gt;: &lt;code&gt;HINCRBY availability:hotel_123:suite_garden_001 2024-06-15 1&lt;/code&gt; → Result: &lt;code&gt;10&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Date Range Check (OLAP Query Pattern):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Checking availability for a 5-night stay is a single, efficient command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;HMGET availability:hotel_123:suite_garden_001 2024-06-15 2024-06-16 2024-06-17 2024-06-18 2024-06-19
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Performance&lt;/strong&gt;: O(N) where N = number of fields requested (i.e., number of nights), &lt;strong&gt;not&lt;/strong&gt; the total number of documents in the database. This is extremely fast.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: Redis Hash structure with atomic HINCRBY operations provides the exact OLTP capabilities needed for high-frequency availability updates—simple, fast, atomic, and designed for this exact workload.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Date Range Validation with Hash Structure (Room Type Variation Aware):&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical: Hotel Bookings are for Nights&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Guest checks in on check-in date (stays that night)&lt;/li&gt;
&lt;li&gt;Guest checks out on check-out date (stays the previous night, leaves on check-out date)&lt;/li&gt;
&lt;li&gt;Room is occupied on check-in date and (check-out - 1 day), but available again on check-out date&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Date range to check&lt;/strong&gt;: &lt;a href="https://dev.toinclusive"&gt;check_in_date, check_out_date - 1 day&lt;/a&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Room Type ID from Elasticsearch&lt;/strong&gt;: Elasticsearch nested query filters hotels by room type and returns &lt;code&gt;room_type_id&lt;/code&gt; (e.g., "suite_garden_001", "suite_ocean_001")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate date range&lt;/strong&gt;: Dates from check-in to (check-out - 1 day) inclusive&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hash Key&lt;/strong&gt;: Use &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt; (e.g., &lt;code&gt;availability:hotel_123:suite_garden_001&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch check&lt;/strong&gt;: &lt;code&gt;HMGET availability:hotel_id:room_type_id date1 date2...&lt;/code&gt; (all nights in range)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validation&lt;/strong&gt;: If all dates return non-null values with room_count &amp;gt; 0, room type variation is available for entire stay&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(N) where N = number of nights (check-out - check-in), not total available dates&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: Check-in 2024-06-15, check-out 2024-06-17 → Check dates [2024-06-15, 2024-06-16]&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Sorted Set Pattern:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Key Format&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id:dates&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Members&lt;/strong&gt;: date strings (2024-06-15, 2024-06-16)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scores&lt;/strong&gt;: availability status (1 = available, 0 = unavailable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Date range queries with ranking per room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(log N) for range queries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Examples:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;availability:hotel_123:suite_garden_001:dates&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;availability:hotel_123:suite_ocean_001:dates&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How the Sorted Set Gets Created:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Initial Creation&lt;/strong&gt;: When hotel availability data is loaded for a specific room type variation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sorted Set Key&lt;/strong&gt;: Create sorted set with key &lt;code&gt;availability:hotel_id:room_type_id:dates&lt;/code&gt;

&lt;ul&gt;
&lt;li&gt;Example: &lt;code&gt;availability:hotel_123:suite_garden_001:dates&lt;/code&gt; (for suite with garden view)&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;availability:hotel_123:suite_ocean_001:dates&lt;/code&gt; (for suite with ocean view)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Member Addition&lt;/strong&gt;: For each available date for that room type variation, add as member with score 1:

&lt;ul&gt;
&lt;li&gt;Redis command: &lt;code&gt;ZADD availability:hotel_123:suite_garden_001:dates 1 "2024-06-15"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Redis command: &lt;code&gt;ZADD availability:hotel_123:suite_garden_001:dates 1 "2024-06-16"&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Redis command: &lt;code&gt;ZADD availability:hotel_123:suite_garden_001:dates 1 "2024-06-17"&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Member Updates&lt;/strong&gt;: When availability changes for that room type variation:

&lt;ul&gt;
&lt;li&gt;Mark available: &lt;code&gt;ZADD availability:hotel_123:suite_garden_001:dates 1 "2024-06-17"&lt;/code&gt; (add date with score 1)&lt;/li&gt;
&lt;li&gt;Mark unavailable: &lt;code&gt;ZREM availability:hotel_123:suite_garden_001:dates "2024-06-17"&lt;/code&gt; (remove date from set)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clean Set&lt;/strong&gt;: Only available dates are stored in the sorted set (no "unavailable" members with score 0)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Range Queries&lt;/strong&gt;: Query all available dates for that room type variation:

&lt;ul&gt;
&lt;li&gt;Redis command: &lt;code&gt;ZRANGE availability:hotel_123:suite_garden_001:dates 0 -1&lt;/code&gt; (get all available dates, sorted)&lt;/li&gt;
&lt;li&gt;Filter dates in desired range (e.g., June 2024)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TTL&lt;/strong&gt;: Set expiration: &lt;code&gt;EXPIRE availability:hotel_123:suite_garden_001:dates 14400&lt;/code&gt; (4 hours)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Triggers&lt;/strong&gt;: Sorted set is updated when:

&lt;ul&gt;
&lt;li&gt;Bookings change availability&lt;/li&gt;
&lt;li&gt;New availability periods are added&lt;/li&gt;
&lt;li&gt;Periodic sync from database&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Hash Structure vs Sorted Set: When to Use Each
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hash Structure Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;O(1) field access&lt;/strong&gt;: Direct access to any date&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct updates&lt;/strong&gt;: Update specific dates without pattern matching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple date range validation&lt;/strong&gt;: Check all dates in range using HMGET&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient storage&lt;/strong&gt;: Only store dates that have availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better for&lt;/strong&gt;: Checking if specific dates are available, updating individual dates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Use Case:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query: "Is hotel_123 available for check-in 2024-06-15 to check-out 2024-06-17?"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Date range to check&lt;/strong&gt;: &lt;a href="https://dev.tosee%20Section%206%20for%20explanation%20of%20nights%20vs%20days"&gt;2024-06-15, 2024-06-16&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Hash: &lt;code&gt;HMGET availability:hotel_123:suite_garden_001 2024-06-15 2024-06-16&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Check if all dates return non-null values&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(N) where N = number of nights (check-out - check-in)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Sorted Set Advantages:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Range queries&lt;/strong&gt;: Efficiently find all available dates in a date range&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sorted order&lt;/strong&gt;: Dates are automatically sorted, making range queries fast&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bulk operations&lt;/strong&gt;: Get all available dates in a range with one query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better for&lt;/strong&gt;: Finding all available dates in a future period, date range discovery&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Use Case:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query: "Find all available dates for hotel_123 in June 2024"&lt;/li&gt;
&lt;li&gt;Sorted Set: &lt;code&gt;ZRANGE availability:hotel_123:suite_garden_001:dates 0 -1&lt;/code&gt; (get all available dates, sorted)&lt;/li&gt;
&lt;li&gt;Filter dates in June 2024 range&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: O(log N + M) where N = total dates, M = dates in range&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Advantage&lt;/strong&gt;: Clean set with only available dates, no need to filter out "unavailable" members&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Comparison Table:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Hash Structure&lt;/th&gt;
&lt;th&gt;Sorted Set&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Single Date Check&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;O(log N)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Date Range Check&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;O(N) - check each date&lt;/td&gt;
&lt;td&gt;O(log N + M) - efficient range query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Update Single Date&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;O(1) - direct update&lt;/td&gt;
&lt;td&gt;O(log N) - need to find member&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Find All Available Dates&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;O(N) - scan all fields&lt;/td&gt;
&lt;td&gt;O(log N + M) - efficient range query&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory Efficiency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Efficient (only available dates)&lt;/td&gt;
&lt;td&gt;Efficient (only available dates)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Use Case&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Check specific date ranges&lt;/td&gt;
&lt;td&gt;Discover available dates in periods&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Recommendation for Hotel Search:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Hash Structure&lt;/strong&gt; for the recommended approach because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Most queries check specific date ranges (check-in to check-out)&lt;/li&gt;
&lt;li&gt;Need to verify all dates in range are available&lt;/li&gt;
&lt;li&gt;Hash structure provides O(1) access to specific dates&lt;/li&gt;
&lt;li&gt;Simpler to update when bookings change&lt;/li&gt;
&lt;li&gt;More intuitive for date range validation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use Sorted Set&lt;/strong&gt; if you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;To find all available dates in a future period (e.g., "show me all available dates in July")&lt;/li&gt;
&lt;li&gt;Efficient range queries for date discovery&lt;/li&gt;
&lt;li&gt;Automatic sorting of dates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;For Most Hotel Search Systems&lt;/strong&gt;: Hash structure is the better choice because the primary use case is checking if a specific date range is available, not discovering all available dates.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Naming Conventions
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Redis Key Patterns (Availability Only):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hash Structure (Recommended)&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt; (hash fields are dates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simple Key-Value&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id:check_in:check_out&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sorted Set&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id:dates&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  TTL Strategies for Availability Data
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Time-to-Live Configuration (Redis/ElastiCache):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Availability Keys&lt;/strong&gt;: 1-4 hours (matches booking timeout, ensures fresh availability data)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Atomic Operations for Concurrent Booking
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Important: Database is the Final Authority&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The primary database is the source of truth for all availability data. Redis is a fast access layer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database&lt;/strong&gt;: Authoritative data, transactional consistency, ACID guarantees&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis&lt;/strong&gt;: Performance optimization layer, real-time access, eventual consistency with database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synchronization&lt;/strong&gt;: Redis must be updated to reflect database state, not the other way around&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Conflict Resolution&lt;/strong&gt;: If Redis and database disagree, database state is correct&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Race Condition Prevention:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;SET with NX&lt;/strong&gt;: Only set if key doesn't exist (prevent concurrent updates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WATCH/MULTI/EXEC&lt;/strong&gt;: Transactional updates for complex operations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lua Scripts&lt;/strong&gt;: Atomic multi-step operations&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Booking Flow (Database as Authority):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check Redis&lt;/strong&gt;: Fast availability check in Redis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reserve in Database&lt;/strong&gt;: Create booking record in database (authoritative)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Redis&lt;/strong&gt;: Update Redis to reflect database state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If Database Fails&lt;/strong&gt;: Booking is not confirmed, Redis remains unchanged&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If Redis Fails&lt;/strong&gt;: Database state is correct, Redis will be synced later&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Real-Time Update Patterns
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Update Flow: Database → Redis (One-Way Sync)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;All updates originate from the database. Redis is updated to reflect database state:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update Strategies:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Immediate Updates&lt;/strong&gt;: After database transaction commits, immediately update Redis to reflect new availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Updates&lt;/strong&gt;: Periodic sync from database to Redis to catch any missed updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event-Driven&lt;/strong&gt;: Database triggers or change data capture (CDC) events update Redis in real-time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fallback Sync&lt;/strong&gt;: Periodic reconciliation with database to ensure Redis matches database state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Update Order (Critical):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Database Transaction&lt;/strong&gt;: Make changes in database first (authoritative)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction Commit&lt;/strong&gt;: Wait for database commit to succeed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Update&lt;/strong&gt;: Update Redis to match database state&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;If Redis Update Fails&lt;/strong&gt;: Database state is still correct, Redis will be synced later&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Why This Matters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Data Integrity&lt;/strong&gt;: Database transactions ensure consistency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis is Cache&lt;/strong&gt;: Redis can be rebuilt from database if needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Data Loss&lt;/strong&gt;: If Redis fails, database has all the data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Eventual Consistency&lt;/strong&gt;: Redis may be temporarily out of sync, but database is always correct&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: Redis provides sub-millisecond availability checks through optimized data structures and atomic operations, enabling real-time booking systems that would be impossible with database-only approaches. The primary database remains the final authority (see "Important: Database is the Final Authority" above) - all updates flow from database to Redis, ensuring data integrity and consistency.&lt;/p&gt;

&lt;h2&gt;
  
  
  7. Hybrid Search Execution Flow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Sequential vs Parallel Execution
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Sequential Execution (Simple but Slower):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Elasticsearch search (25ms) → Get candidate hotels&lt;/li&gt;
&lt;li&gt;Redis availability check (10ms) → Filter by availability&lt;/li&gt;
&lt;li&gt;Total: 35ms&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Parallel Execution (Complex but Faster):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start Elasticsearch search (25ms) and Redis availability check (15ms) simultaneously&lt;/li&gt;
&lt;li&gt;Wait for both to complete (max of both = 25ms)&lt;/li&gt;
&lt;li&gt;Filter Elasticsearch results by Redis availability&lt;/li&gt;
&lt;li&gt;Total: 25ms (28% improvement)&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Elasticsearch Candidate Generation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Search Phase:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Geographic Filtering&lt;/strong&gt;: City, radius, location-based filtering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amenity Filtering&lt;/strong&gt;: Pool, spa, gym, restaurant availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Search&lt;/strong&gt;: Hotel name, description, location text matching&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Room Type Filtering&lt;/strong&gt;: Nested queries for room type requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price Range Filtering&lt;/strong&gt;: Room type pricing constraints&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result Set:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Candidate Hotels&lt;/strong&gt;: 100-1000 hotels matching search criteria&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hotel Metadata&lt;/strong&gt;: Name, location, amenities, room types&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search Scores&lt;/strong&gt;: Relevance ranking for result ordering&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Redis Availability Validation
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;How Elasticsearch and Redis Work Together:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Elasticsearch Filters by Room Type&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User search includes room type filter (e.g., "suite with ocean view")&lt;/li&gt;
&lt;li&gt;Elasticsearch nested query filters hotels that have matching room type variations (e.g., suite with ocean view)&lt;/li&gt;
&lt;li&gt;Elasticsearch returns: List of hotels with matching room types (hotel metadata + room type info including &lt;code&gt;room_type_id&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result&lt;/strong&gt;: We know which hotels have matching room types and which &lt;code&gt;room_type_id&lt;/code&gt; to check in Redis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Redis Checks Availability for Specific Room Type Variation&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;For each hotel from Elasticsearch, we get the &lt;code&gt;room_type_id&lt;/code&gt; (e.g., "suite_ocean_001", "deluxe_001")&lt;/li&gt;
&lt;li&gt;Use hash key: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt; (e.g., &lt;code&gt;availability:hotel_123:suite_ocean_001&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;Check availability for the specific room type variation across date range&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Availability Check Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Extract Room Type ID&lt;/strong&gt;: From Elasticsearch results, get &lt;code&gt;room_type_id&lt;/code&gt; (e.g., "suite_ocean_001", "deluxe_001")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate Date List&lt;/strong&gt;: Create list of dates from check-in to (check-out - 1 day) inclusive

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Critical&lt;/strong&gt;: Hotel bookings are for nights, not days (see Section 6 for detailed explanation)&lt;/li&gt;
&lt;li&gt;Dates to check: [check_in_date, check_out_date - 1 day]&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Build Hash Key&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HMGET Check&lt;/strong&gt;: &lt;code&gt;HMGET availability:hotel_id:room_type_id date1 date2...&lt;/code&gt; (all nights in range)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Validate Results&lt;/strong&gt;: Check if all dates have non-null values with room_count &amp;gt; 0&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Example Flow:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Query: "Suite with ocean view in Los Angeles, check-in: 2024-06-15, check-out: 2024-06-17"

Elasticsearch:
  - Filters: city=Los Angeles, room_type="suite", view="ocean"
  - Returns: [hotel_123 (has suite_ocean_001), hotel_456 (has suite_ocean_001), hotel_789 (has suite_ocean_001)]
  - Each result includes: hotel_id, room_type_id="suite_ocean_001"

Redis Availability Check:
  - Dates to check: [2024-06-15, 2024-06-16] (see Section 6 for explanation of date range logic)
  - For hotel_123: HMGET availability:hotel_123:suite_ocean_001 "2024-06-15" "2024-06-16"
  - For hotel_456: HMGET availability:hotel_456:suite_ocean_001 "2024-06-15" "2024-06-16"
  - For hotel_789: HMGET availability:hotel_789:suite_ocean_001 "2024-06-15" "2024-06-16"

Results:
  - hotel_123: [value, value] → Available (both nights have suite with ocean view)
  - hotel_456: [value, null] → Unavailable (2024-06-16 has no suite with ocean view available)
  - hotel_789: [value, value] → Available
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Capacity Validation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If user needs multiple rooms (e.g., 2 suites with ocean view):

&lt;ul&gt;
&lt;li&gt;Check &lt;code&gt;room_count&lt;/code&gt; in hash values: &lt;code&gt;{"room_count": 5, ...}&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Ensure &lt;code&gt;room_count &amp;gt;= required_rooms&lt;/code&gt; for all dates in range&lt;/li&gt;
&lt;li&gt;Example: If user needs 2 rooms, but room_count is 1 → Unavailable&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Price Validation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Price comes from Elasticsearch room type data (Redis is exclusively for availability - see Section 6)&lt;/li&gt;
&lt;li&gt;Check price from Elasticsearch results against user's budget&lt;/li&gt;
&lt;li&gt;Ensure price is within budget for the selected room type variation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Filtering Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Hash Key Pattern&lt;/strong&gt;: &lt;code&gt;availability:hotel_id:room_type_id&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Operations&lt;/strong&gt;: Use Redis pipeline to batch multiple HMGET commands&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Early Exit&lt;/strong&gt;: Stop checking if any date returns null (hotel unavailable)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result Filtering&lt;/strong&gt;: Remove hotels where room type is unavailable for any date in range&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Result Merging and Ranking Strategy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Merging Process:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Results&lt;/strong&gt;: Ranked by relevance score&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Availability&lt;/strong&gt;: Binary available/unavailable status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Intersection&lt;/strong&gt;: Keep only available hotels from Elasticsearch results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Final Ranking&lt;/strong&gt;: Maintain Elasticsearch relevance order for available hotels&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Ranking Factors:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search Relevance&lt;/strong&gt;: BM25 score from Elasticsearch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Status&lt;/strong&gt;: Available hotels ranked higher&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price Competitiveness&lt;/strong&gt;: Lower prices get slight boost&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;User Preferences&lt;/strong&gt;: Historical booking patterns, loyalty status&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Interview Tip: Streaming Results with Pagination&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once Elasticsearch returns ranked results, you don't need to wait for all availability checks before sending results to the UI. Instead, use a streaming/pagination approach:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Streaming Optimization:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Returns&lt;/strong&gt;: Ranked list of hotels (e.g., 500 hotels)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Processing&lt;/strong&gt;: Process hotels in batches of 10-20&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Check&lt;/strong&gt;: Check availability for first batch (e.g., first 10 hotels)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Send to UI&lt;/strong&gt;: Immediately send available hotels from first batch to UI&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Continue Processing&lt;/strong&gt;: While UI displays first page, check availability for next batch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pagination&lt;/strong&gt;: As user scrolls, send next batch of available hotels&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Benefits:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Time to First Result (TTFR)&lt;/strong&gt;: User sees results faster (e.g., 30ms instead of 200ms)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better UX&lt;/strong&gt;: Progressive loading feels faster than waiting for all results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reduced Latency&lt;/strong&gt;: Don't check availability for hotels user might never see&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Efficient Resource Usage&lt;/strong&gt;: Only process what's needed for current page&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Flow:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Elasticsearch: Returns 500 ranked hotels (25ms)
↓
Batch 1: Check availability for hotels 1-10 (5ms) → Send to UI
↓
Batch 2: Check availability for hotels 11-20 (5ms) → Cache for next page
↓
Batch 3: Check availability for hotels 21-30 (5ms) → Cache for next page
...
User scrolls → Load cached batch 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Insight&lt;/strong&gt;: Since results are already ranked by Elasticsearch, you can stream them in order. The UI only needs the first 10-20 results immediately, so there's no need to wait for all 500 hotels to be checked.&lt;/p&gt;

&lt;h3&gt;
  
  
  Performance Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Execution Pattern&lt;/th&gt;
&lt;th&gt;Elasticsearch&lt;/th&gt;
&lt;th&gt;Redis&lt;/th&gt;
&lt;th&gt;Total Time&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Sequential&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;25ms&lt;/td&gt;
&lt;td&gt;10ms&lt;/td&gt;
&lt;td&gt;35ms&lt;/td&gt;
&lt;td&gt;Simple&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Parallel&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;25ms&lt;/td&gt;
&lt;td&gt;15ms&lt;/td&gt;
&lt;td&gt;25ms&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Batch Redis&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;25ms&lt;/td&gt;
&lt;td&gt;5ms&lt;/td&gt;
&lt;td&gt;30ms&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cached Results&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;5ms&lt;/td&gt;
&lt;td&gt;2ms&lt;/td&gt;
&lt;td&gt;7ms&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: Parallel execution provides 28% performance improvement while batch Redis operations can reduce availability check time by 50%, making the hybrid approach significantly faster than sequential processing.&lt;/p&gt;

&lt;h2&gt;
  
  
  8. APIs and Query Flow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Search API Endpoint Structure
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Primary Endpoint&lt;/strong&gt;: GET /api/v1/hotels/search&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query Parameters:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;location&lt;/strong&gt; (required): City name or coordinates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;check_in&lt;/strong&gt; (required): Check-in date (YYYY-MM-DD)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;check_out&lt;/strong&gt; (required): Check-out date (YYYY-MM-DD)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;guests&lt;/strong&gt; (optional): Number of guests (default: 2)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;room_type&lt;/strong&gt; (optional): Specific room type (deluxe, suite)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;amenities&lt;/strong&gt; (optional): Array of amenities (pool, spa, gym)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;price_min/max&lt;/strong&gt; (optional): Price range constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;radius&lt;/strong&gt; (optional): Search radius in kilometers (default: 30)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;sort&lt;/strong&gt; (optional): Sort order (relevance, price, rating, distance)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Step-by-Step Query Execution Flow
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Phase 1: Query Processing (5ms)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Input Validation&lt;/strong&gt;: Validate dates, location, parameters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Normalization&lt;/strong&gt;: Standardize location names, date formats&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parameter Extraction&lt;/strong&gt;: Extract search criteria and filters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Check&lt;/strong&gt;: Check for cached results&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Phase 2: Elasticsearch Search (25ms)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Geographic Filter&lt;/strong&gt;: Filter by location and radius&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Amenity Filter&lt;/strong&gt;: Filter by requested amenities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Room Type Filter&lt;/strong&gt;: Nested queries for room type requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text Search&lt;/strong&gt;: Multi-match queries for hotel names and descriptions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price Filter&lt;/strong&gt;: Range queries for price constraints&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Result Ranking&lt;/strong&gt;: BM25 scoring and relevance ranking&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Phase 3: Redis Availability Check (10ms)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Hotel ID Extraction&lt;/strong&gt;: Get hotel IDs from Elasticsearch results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Keys&lt;/strong&gt;: Generate Redis keys for date range&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch Check&lt;/strong&gt;: MGET operation for multiple hotels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Filtering&lt;/strong&gt;: Remove unavailable hotels&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Price Validation&lt;/strong&gt;: Verify pricing within constraints&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Phase 4: Result Assembly (5ms)&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Result Merging&lt;/strong&gt;: Combine Elasticsearch and Redis results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Final Ranking&lt;/strong&gt;: Apply availability and pricing factors&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Response Formatting&lt;/strong&gt;: Structure response with hotel details&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Storage&lt;/strong&gt;: Store results for future queries&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Response Format with Availability and Pricing
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Response Structure:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hotels"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hotel_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"hotel_123"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Luxury Downtown Hotel"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"location"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Los Angeles"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"coordinates"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;34.0522&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;-118.2437&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"123 Main St"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"rating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"amenities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"pool"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"spa"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"gym"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"room_types"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"deluxe"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"capacity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"price_per_night"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;299&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"available"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"rooms_left"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"total_price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;598&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"search_score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_results"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;150&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"search_time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;35&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"cache_hit"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Error Handling and Fallback Strategies
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Error Scenarios:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Unavailable&lt;/strong&gt;: Fallback to database with reduced functionality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Unavailable&lt;/strong&gt;: Skip availability filtering, show all results&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Timeout Errors&lt;/strong&gt;: Return partial results with warning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Invalid Parameters&lt;/strong&gt;: Return error with parameter validation details&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Fallback Strategies:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database Fallback&lt;/strong&gt;: Use PostgreSQL for basic search when Elasticsearch fails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cached Results&lt;/strong&gt;: Serve cached results during outages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Degraded Mode&lt;/strong&gt;: Reduce functionality but maintain basic search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful Degradation&lt;/strong&gt;: Show maintenance message with trending hotels&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Monitoring and Alerting:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency Monitoring&lt;/strong&gt;: Alert if response time &amp;gt; 100ms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Rate Monitoring&lt;/strong&gt;: Alert if error rate &amp;gt; 5%&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Performance&lt;/strong&gt;: Monitor cache hit rates and memory usage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability Monitoring&lt;/strong&gt;: Track Elasticsearch and Redis health&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: The API design balances comprehensive search capabilities with performance optimization, using parallel execution and intelligent caching to maintain sub-100ms response times while providing rich search functionality and robust error handling.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Soundbite for interviews&lt;/strong&gt;: "Hotel search requires a hybrid Elasticsearch + Redis architecture because no single system can efficiently handle both complex search/filtering and real-time availability updates. The key is using Elasticsearch for what it does best (search and filtering) and Redis for what it does best (real-time data), with parallel execution improving time to first results."&lt;/p&gt;

&lt;h2&gt;
  
  
  Next
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://dev.to/sumedhbala/hotel-booking-schema-design-comparison-g3h"&gt;Hotel Booking Schema Design&lt;/a&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>elasticsearch</category>
      <category>redis</category>
      <category>bitmaps</category>
    </item>
    <item>
      <title>Part 4: Payments and Ticket Issuance</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Thu, 23 Oct 2025 17:51:10 +0000</pubDate>
      <link>https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h</link>
      <guid>https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h</guid>
      <description>&lt;h2&gt;
  
  
  Idempotency, Consistency, and Race-Safe Design
&lt;/h2&gt;

&lt;p&gt;This section focuses on the critical final steps of the user journey: securely processing payments and reliably issuing digital tickets. These operations demand high integrity, fault tolerance, and consistency to prevent double-charging, lost tickets, or overselling — while also protecting sensitive financial data and preventing race conditions.&lt;/p&gt;

&lt;p&gt;Today, we're going to architect a production-grade solution using a modern, event-driven approach. Our design is built on three key principles:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Microservice Architecture:&lt;/strong&gt; We'll separate our system into distinct services, each with one job.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Saga Pattern:&lt;/strong&gt; We'll manage the complex, multi-step purchase process as a single, coordinated workflow.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Transactional Outbox:&lt;/strong&gt; We'll use this pattern as the "atomic glue" that guarantees our services stay in sync, even if one of them fails.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Saga Pattern
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Saga Pattern&lt;/strong&gt; is a design pattern for managing distributed transactions across multiple services. Instead of using traditional ACID transactions (which don't work well across service boundaries), sagas break down complex business processes into a sequence of local transactions, each with a corresponding compensating action.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each step in the saga is a local transaction within a single service&lt;/li&gt;
&lt;li&gt;If any step fails, the saga executes compensating transactions to undo previous steps&lt;/li&gt;
&lt;li&gt;This ensures eventual consistency across the entire system&lt;/li&gt;
&lt;li&gt;Perfect for complex workflows like: Reserve Seats → Process Payment → Generate Tickets → Send Notifications&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Transactional Outbox Pattern
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Transactional Outbox Pattern&lt;/strong&gt; solves the dual-write problem in distributed systems. When you need to update your database AND publish an event (like sending a notification), you can't do both atomically across service boundaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it's useful:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Atomicity&lt;/strong&gt;: Write to database and outbox table in the same transaction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability&lt;/strong&gt;: Guarantees events are eventually published, even if the service fails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: Ensures database state and event publishing stay in sync&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resilience&lt;/strong&gt;: Handles network failures, service restarts, and partial failures gracefully&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Application writes business data AND event data to the same database transaction&lt;/li&gt;
&lt;li&gt;A separate process (outbox processor) reads from the outbox table&lt;/li&gt;
&lt;li&gt;Publishes events to message queues (Kafka, RabbitMQ, etc.)&lt;/li&gt;
&lt;li&gt;Marks events as processed to prevent duplicates&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The central theme is &lt;strong&gt;separation of concerns&lt;/strong&gt;. Let's see how it works.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Architecture: Separating Our Concerns
&lt;/h2&gt;

&lt;p&gt;We'll decompose our system into three primary services. The key is to define what each service is—and is not—responsible for.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Order Management Service:&lt;/strong&gt; This is our "smart" orchestrator. It acts as the brain of the operation, managing the business workflow (the Saga). Its only concern is the &lt;em&gt;state&lt;/em&gt; of an order (e.g., &lt;code&gt;pending&lt;/code&gt;, &lt;code&gt;paid&lt;/code&gt;, &lt;code&gt;tickets_issued&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Payment Processing Service:&lt;/strong&gt; This is a "dumb," single-purpose service. Its &lt;em&gt;only&lt;/em&gt; concern is interacting with a payment gateway (like Stripe) and reporting success or failure. It knows nothing about seats or tickets.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ticket Issuance Service:&lt;/strong&gt; This is another "dumb" service. Its &lt;em&gt;only&lt;/em&gt; concern is managing inventory (seats, sections) and generating ticket records. It knows nothing about credit cards or payment intents.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This separation is what allows us to scale and maintain the system. We can update our payment provider without ever touching the ticket generation code. If the ticket service is slow, it won't block new payments from being accepted.&lt;/p&gt;

&lt;p&gt;But how do these separate services talk to each other reliably?&lt;/p&gt;




&lt;h2&gt;
  
  
  The Glue: A Saga for Distributed Transactions
&lt;/h2&gt;

&lt;p&gt;You can't use a single database &lt;code&gt;COMMIT&lt;/code&gt; across three different services. So, how do you guarantee that a successful payment &lt;em&gt;always&lt;/em&gt; results in a ticket?&lt;/p&gt;

&lt;p&gt;This is where the &lt;strong&gt;Saga&lt;/strong&gt; and &lt;strong&gt;Transactional Outbox&lt;/strong&gt; patterns come in.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;The Saga:&lt;/strong&gt; Our "Order Management Service" will manage the purchase as a step-by-step process.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Step 1:&lt;/strong&gt; Tell the &lt;code&gt;Payment Service&lt;/code&gt; to process a payment.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Step 2 (if success):&lt;/strong&gt; Tell the &lt;code&gt;Ticket Issuance Service&lt;/code&gt; to generate tickets.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Step 3 (if fail):&lt;/strong&gt; Tell the &lt;code&gt;Ticket Issuance Service&lt;/code&gt; (or Reservation Service) to release the held seats.&lt;/li&gt;
&lt;/ol&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;p&gt;&lt;strong&gt;The Transactional Outbox:&lt;/strong&gt; This is the &lt;em&gt;mechanism&lt;/em&gt; that makes the Saga reliable. When the &lt;code&gt;Payment Service&lt;/code&gt; confirms a payment, it can't just &lt;em&gt;hope&lt;/em&gt; to send a message to the &lt;code&gt;Ticket Issuance Service&lt;/code&gt;. What if it crashes right after it saves the payment to its own database? The payment would be successful, but the ticket would never be issued.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;Outbox Pattern&lt;/strong&gt; solves this. In a single, atomic database transaction, the &lt;code&gt;Payment Service&lt;/code&gt; does two things:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Updates the &lt;code&gt;payments&lt;/code&gt; table to &lt;code&gt;status = 'successful'&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt; Inserts a &lt;em&gt;message&lt;/em&gt; into a &lt;code&gt;transactional_outbox&lt;/code&gt; table in its &lt;em&gt;own database&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Because this is one transaction, it's 100% atomic. It's impossible for the payment to be marked "successful" without the "issue tickets" message being created. A separate background process (a "message relay") then reliably publishes this message from the outbox to the rest of the system.&lt;/p&gt;


&lt;/li&gt;

&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;How the Message Relay Works:&lt;/strong&gt;&lt;br&gt;
The message relay is a critical infrastructure component that bridges the gap between database transactions and message queues. It has two common implementation patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Polling Pattern&lt;/strong&gt;: A simple service that polls the &lt;code&gt;transactional_outbox&lt;/code&gt; table every few seconds for &lt;code&gt;status = 'pending'&lt;/code&gt; events, publishes them to Kafka/RabbitMQ, then marks them as &lt;code&gt;status = 'published'&lt;/code&gt;. This is reliable but has higher latency (2-5 seconds).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Change Data Capture (CDC) Pattern&lt;/strong&gt;: A more advanced approach using tools like Debezium that tail the database's transaction log (WAL) and instantly convert table inserts into Kafka messages. This offers lower latency (milliseconds) but requires more sophisticated infrastructure.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Both patterns guarantee that every outbox event is eventually published, even if the service fails mid-process.&lt;/p&gt;

&lt;p&gt;This is the ultimate separation of concerns. The &lt;code&gt;Payment Service&lt;/code&gt;'s only job is to update its own state and "leave a memo" in its outbox. It doesn't know or care &lt;em&gt;who&lt;/em&gt; reads that memo, allowing our services to be beautifully decoupled.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Blueprint: Our Production-Grade Database Schema
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create ENUM types for reusability&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="n"&gt;order_payment_status_enum&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;               &lt;span class="c1"&gt;-- Order created, awaiting payment&lt;/span&gt;
    &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Client-side /confirm received, awaiting webhook&lt;/span&gt;
    &lt;span class="s1"&gt;'paid'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                  &lt;span class="c1"&gt;-- Authoritative webhook received&lt;/span&gt;
    &lt;span class="s1"&gt;'failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                &lt;span class="c1"&gt;-- Payment failed&lt;/span&gt;
    &lt;span class="s1"&gt;'refunded'&lt;/span&gt;               &lt;span class="c1"&gt;-- Payment refunded&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="n"&gt;payment_transaction_status_enum&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'initiated'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;             &lt;span class="c1"&gt;-- Payment intent created&lt;/span&gt;
    &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- Client-side /confirm received&lt;/span&gt;
    &lt;span class="s1"&gt;'successful'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;            &lt;span class="c1"&gt;-- Authoritative webhook received&lt;/span&gt;
    &lt;span class="s1"&gt;'failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                &lt;span class="c1"&gt;-- Payment failed&lt;/span&gt;
    &lt;span class="s1"&gt;'refunded'&lt;/span&gt;               &lt;span class="c1"&gt;-- Payment refunded&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="n"&gt;ticket_status_enum&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'issued'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'failed'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="n"&gt;ticket_scan_status_enum&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'scanned'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'canceled'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- New ENUM for outbox processing&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="n"&gt;outbox_status_enum&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'sent'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Orders table with constraints&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="n"&gt;order_payment_status_enum&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;ticket_status&lt;/span&gt; &lt;span class="n"&gt;ticket_status_enum&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Payments table with transaction tracking&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;payment_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;-- NO CASCADE DELETE&lt;/span&gt;
    &lt;span class="n"&gt;payment_gateway_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;payment_method_token&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;transaction_reference&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;amount_minor_units&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="nb"&gt;CHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;char_length&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="n"&gt;payment_transaction_status_enum&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'initiated'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;idempotency_key&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Tickets table with seat/section exclusivity&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;ticket_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;-- NO CASCADE DELETE&lt;/span&gt;
    &lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;owner_user_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;qr_code_url&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="n"&gt;ticket_scan_status_enum&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;issued_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="k"&gt;UNIQUE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;-- prevent duplicate seat assignment within same order&lt;/span&gt;
    &lt;span class="k"&gt;CHECK&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Critical constraint: prevent double-booking across orders for ACTIVE tickets&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;UNIQUE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_tickets_event_seat_unique_active&lt;/span&gt;
&lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'scanned'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- NEW: Transactional Outbox Table&lt;/span&gt;
&lt;span class="c1"&gt;-- This table guarantees atomic event publishing&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;transactional_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="n"&gt;aggregate_type&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;aggregate_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;topic&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;255&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;payload&lt;/span&gt; &lt;span class="n"&gt;JSONB&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="n"&gt;outbox_status_enum&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="nb"&gt;TIME&lt;/span&gt; &lt;span class="k"&gt;ZONE&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Index for the message relay to efficiently find pending events&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_outbox_pending_status&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;transactional_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;


&lt;span class="c1"&gt;-- Indexes for performance&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_reservation_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_payments_order_id&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_tickets_event_status&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_tickets_owner&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;owner_user_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The API &amp;amp; Workflow: A Step-by-Step Breakdown
&lt;/h2&gt;

&lt;p&gt;Here is the complete user flow, showing how each API call and background process works within our separated architecture.&lt;/p&gt;
&lt;h3&gt;
  
  
  Payment Processing APIs
&lt;/h3&gt;
&lt;h4&gt;
  
  
  1. Initiate Payment
&lt;/h4&gt;

&lt;p&gt;This is the first step. The client asks the &lt;code&gt;Payment Service&lt;/code&gt; to create a payment intent. &lt;strong&gt;No money is charged yet&lt;/strong&gt; - this only creates the payment intent with Stripe.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: &lt;code&gt;POST /payments&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Request Body&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"order_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"order_789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"payment_method_token"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pm_123456789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"idempotency_key"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"idem_987654321"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"payment_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pay_123456789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"initiated"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"payment_intent_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pi_stripe_abc123"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Associated SQL Operations&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. Check for idempotency&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;payment_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; 
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; 
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;idempotency_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'idem_987654321'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. Get server-side price (NEVER trust the client)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount_minor_units&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;currency&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 3. [Application calls Stripe API with the server-side amount]&lt;/span&gt;

&lt;span class="c1"&gt;-- 4. Create the payment record&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;payment_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;amount_minor_units&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;idempotency_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'pay_123456789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;total_amount_minor_units&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;-- ✅ Server-validated amount&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                  &lt;span class="c1"&gt;-- ✅ Server-validated currency&lt;/span&gt;
    &lt;span class="s1"&gt;'initiated'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'idem_987654321'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; 
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;--    ⚠️  NOTE: This creates the PaymentIntent but does NOT charge the user yet.&lt;br&gt;
--    The actual payment happens on the client-side when the user confirms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Complete Payment Flow:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;POST /payments&lt;/code&gt;&lt;/strong&gt; (this endpoint): Creates PaymentIntent, returns &lt;code&gt;client_secret&lt;/code&gt; to browser&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client-Side&lt;/strong&gt;: Browser uses &lt;code&gt;client_secret&lt;/code&gt; with Stripe.js to show payment form&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client-Side&lt;/strong&gt;: User clicks "Pay" → &lt;code&gt;stripe.confirmCardPayment()&lt;/code&gt; is called → &lt;strong&gt;This is when money moves!&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client-Side&lt;/strong&gt;: On success, browser calls &lt;code&gt;POST /payments/{id}/confirm&lt;/code&gt; (optimistic UX)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Server-Side&lt;/strong&gt;: Stripe sends &lt;code&gt;POST /webhooks/payments&lt;/code&gt; (authoritative confirmation)&lt;/li&gt;
&lt;/ol&gt;


&lt;h4&gt;
  
  
  2. Payment Webhook (The Asynchronous Source of Truth)
&lt;/h4&gt;

&lt;p&gt;This is the &lt;strong&gt;most important&lt;/strong&gt; endpoint. It's called by Stripe's servers, not the user. It is the &lt;em&gt;authoritative&lt;/em&gt; record of a transaction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: &lt;code&gt;POST /webhooks/payments&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: &lt;code&gt;X-Webhook-Signature: &amp;lt;HMAC_SHA256&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hardened Implementation Logic:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Verify Signature:&lt;/strong&gt; Cryptographically verify the signature.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Respond Immediately:&lt;/strong&gt; Return a &lt;code&gt;200 OK&lt;/code&gt; to Stripe to acknowledge receipt.&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Queue for Processing:&lt;/strong&gt; Place the event on an internal queue.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Asynchronous Handler SQL (Processing the event from the internal queue):&lt;/strong&gt;&lt;br&gt;
This logic demonstrates our separation of concerns perfectly. The &lt;code&gt;Payment Service&lt;/code&gt;'s &lt;em&gt;only&lt;/em&gt; job is to update its own tables and (atomically) create an &lt;code&gt;outbox&lt;/code&gt; event. It has no knowledge of &lt;em&gt;how&lt;/em&gt; to issue a ticket.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- CASE 1: Payment Successful (e.g., 'payment_intent.succeeded')&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 1: Update the payment record idempotently.&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'successful'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;transaction_reference&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'txn_987654321'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- from webhook&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'initiated'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 2: Update the master order status.&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'paid'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 3: Update reservation status to confirmed&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;confirmed_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 4: Create the outbox event to trigger Ticket Issuance&lt;/span&gt;
    &lt;span class="c1"&gt;-- This is ATOMIC with the reservation update.&lt;/span&gt;
    &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;transactional_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;aggregate_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;aggregate_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;'order'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'order.paid'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'{ ... webhook payload ... }'&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;


&lt;span class="c1"&gt;-- CASE 2: Payment Failed (e.g., 'payment_intent.payment_failed')&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 1: Update payment to failed&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'initiated'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 2: Update order to failed&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 3: Update reservation status to canceled&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'canceled'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;canceled_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 4: Create outbox event to trigger reservation release&lt;/span&gt;
    &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;transactional_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;aggregate_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;aggregate_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;'order'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pi_stripe_abc123'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="s1"&gt;'order.failed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'{ ... failure details ... }'&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h4&gt;
  
  
  3. Confirm Payment (The Optimistic UX Endpoint)
&lt;/h4&gt;

&lt;p&gt;This endpoint is &lt;em&gt;only&lt;/em&gt; for user experience. It's called by the user's browser to give them an &lt;em&gt;instant&lt;/em&gt; success screen, without waiting for the (slower) webhook.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: &lt;code&gt;POST /payments/{payment_id}/confirm&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"payment_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pay_123456789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"confirmed_optimistic"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"message"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Payment confirmed. Your tickets will be issued shortly."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Associated SQL Operations&lt;/strong&gt;:&lt;br&gt;
This transaction is deliberately fast and lightweight. It &lt;em&gt;only&lt;/em&gt; updates the status to &lt;code&gt;confirmed_optimistic&lt;/code&gt;. It &lt;strong&gt;does not&lt;/strong&gt; trigger ticket issuance.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 1: Mark the payment as optimistically confirmed&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pay_123456789'&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'initiated'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 2: Mark the order as optimistically confirmed&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'confirmed_optimistic'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;payment_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pay_123456789'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  The Saga Continues: Asynchronous Ticket Issuance
&lt;/h3&gt;

&lt;p&gt;This is &lt;strong&gt;not an API&lt;/strong&gt;. This is where our separation of concerns pays off. The &lt;code&gt;Ticket Issuance Service&lt;/code&gt; has &lt;em&gt;one&lt;/em&gt; job: listen for &lt;code&gt;order.paid&lt;/code&gt; events and make tickets.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trigger:&lt;/strong&gt; Consuming an &lt;code&gt;order.paid&lt;/code&gt; event from the message bus (which was reliably published by the &lt;code&gt;Payment Service&lt;/code&gt;'s outbox).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Associated SQL Operations (Deadlock-Proofed)&lt;/strong&gt;:&lt;br&gt;
This is the big, complex transaction. By moving it to a background worker, we prevent it from ever blocking our payment flow. If it fails, it can safely retry without losing data.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 1: IDEMPOTENCY CHECK - Prevent duplicate ticket generation&lt;/span&gt;
    &lt;span class="c1"&gt;-- This makes the entire operation idempotent against message re-delivery&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ticket_status&lt;/span&gt; 
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;-- Lock the order row&lt;/span&gt;

    &lt;span class="c1"&gt;-- [Application Logic]&lt;/span&gt;
    &lt;span class="c1"&gt;-- IF ticket_status is already 'issued', COMMIT immediately and ACK the message.&lt;/span&gt;
    &lt;span class="c1"&gt;-- The order is already processed - no need to generate tickets again.&lt;/span&gt;

    &lt;span class="c1"&gt;-- IF ticket_status is 'pending', proceed with the rest of the logic...&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 2: Lock the parent reservation&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 3: CRITICAL: Lock seats in a consistent order&lt;/span&gt;
    &lt;span class="c1"&gt;-- This (ORDER BY s.seat_id) is the key to preventing deadlocks&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; 
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; 
        &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'reserved'&lt;/span&gt;
    &lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt;  &lt;span class="c1"&gt;-- This deterministic ordering is vital&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 4: Get reservation details for ticket generation&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; 
        &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rsg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rsg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
    &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
    &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_ga&lt;/span&gt; &lt;span class="n"&gt;rsg&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rsg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 5: Generate individual tickets (example)&lt;/span&gt;
    &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;ticket_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;event_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="n"&gt;section_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;qr_code_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;issued_at&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; 
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'TKT-ABC123DEF456'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'event_456'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'A-101'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'...'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'TKT-JKL345MNO678'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'event_456'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'GA1'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'...'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 6: Atomically update seats from 'reserved' to 'sold'&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'sold'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; 
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; 
        &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'reserved'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 7: Atomically update reservation to 'completed'&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'completed'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'confirmed'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 8: Update order ticket status&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;ticket_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'issued'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Step 9: Create outbox event to notify user via email&lt;/span&gt;
    &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;transactional_outbox&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;aggregate_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;aggregate_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;topic&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="s1"&gt;'order'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'tickets.issued'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="s1"&gt;'{ ... ticket URLs and delivery info ... }'&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Supporting Operations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Ticket Status Update (Scan)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: &lt;code&gt;POST /tickets/{ticket_id}/scan&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Associated SQL Operations&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="c1"&gt;-- Lock the specific ticket row to prevent double-scans&lt;/span&gt;
    &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;ticket_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;
    &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; 
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ticket_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'TKT-ABC123DEF456'&lt;/span&gt;
    &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;-- Check status and update (if active)&lt;/span&gt;
    &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt; 
    &lt;span class="k"&gt;SET&lt;/span&gt; 
        &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'scanned'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;scan_count&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;scan_count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;scanned_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;ticket_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'TKT-ABC123DEF456'&lt;/span&gt;
      &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'active'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h4&gt;
  
  
  Event Cancellation Refunds (Bulk Refunds)
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: &lt;code&gt;POST /refunds/event-cancellation&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Associated SQL Operations&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- This query is CRITICAL: It selects the `transaction_reference`&lt;/span&gt;
&lt;span class="c1"&gt;-- which is the *actual charge ID* needed by the gateway to process a refund.&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payment_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction_reference&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;amount_minor_units&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'event_456'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'successful'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;transaction_reference&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;(The application then loops over these results, calls the gateway for each refund, and updates the &lt;code&gt;payments&lt;/code&gt; and &lt;code&gt;orders&lt;/code&gt; tables to &lt;code&gt;status = 'refunded'&lt;/code&gt;.)&lt;/em&gt;&lt;/p&gt;




&lt;h4&gt;
  
  
  Verify Anonymous Refund Request
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: &lt;code&gt;POST /refunds/verify&lt;/code&gt;&lt;br&gt;
&lt;strong&gt;Associated SQL Operations&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. Validate order exists and is eligible for refund&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; 
    &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payment_status&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;payments&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;payment_status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'paid'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'successful'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. [Application Logic: Check rate limits]&lt;/span&gt;

&lt;span class="c1"&gt;-- 3. Generate and store verification code&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;refund_verifications&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;verification_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;contact_method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;contact_value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;verification_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
    &lt;span class="n"&gt;expires_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s1"&gt;'ver_001'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'order_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'email'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s1"&gt;'user@example.com'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'123456'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- Securely generated OTP&lt;/span&gt;
    &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'15 minutes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  Our Design's Superpowers
&lt;/h2&gt;

&lt;p&gt;By strictly separating concerns and using the Saga + Transactional Outbox patterns, our system is now:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resilient:&lt;/strong&gt; If the &lt;code&gt;Ticket Issuance Service&lt;/code&gt; fails, payments are still safely accepted. The &lt;code&gt;order.paid&lt;/code&gt; events will simply wait in the outbox to be processed when the service recovers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistent:&lt;/strong&gt; We have an atomic guarantee that a committed payment &lt;em&gt;will always&lt;/em&gt; result in an event to generate a ticket. It's impossible to charge a customer and forget to issue their ticket.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalable:&lt;/strong&gt; We can add 100 more &lt;code&gt;Ticket Issuance&lt;/code&gt; workers to handle a massive on-sale event without ever needing to touch the &lt;code&gt;Payment Service&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintainable:&lt;/strong&gt; Our services are small and focused. Our &lt;code&gt;Payment Service&lt;/code&gt; team can work on compliance and gateways, while the &lt;code&gt;Ticket Issuance&lt;/code&gt; team can optimize for inventory and concurrency. They don't need to be in the same meetings.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Further Improving Consistency: The Reconciliation Service
&lt;/h2&gt;

&lt;p&gt;While our Saga + Transactional Outbox patterns provide strong consistency guarantees, production systems can benefit from an additional layer of validation. A &lt;strong&gt;Reconciliation Service&lt;/strong&gt; can further enhance data consistency across all system components by continuously monitoring and validating that all financial transactions, seat allocations, and ticket issuances are properly synchronized.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why it's valuable as an afterthought:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Financial Accuracy&lt;/strong&gt;: Ensures payment amounts match between our system and payment gateways&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inventory Consistency&lt;/strong&gt;: Verifies seat allocations match between reservations and actual seat status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ticket Integrity&lt;/strong&gt;: Confirms all issued tickets correspond to valid, paid orders&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit Compliance&lt;/strong&gt;: Provides complete audit trails for regulatory requirements&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discrepancy Detection&lt;/strong&gt;: Identifies and flags inconsistencies before they impact customers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What it reconciles:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Payment Gateway vs Database&lt;/strong&gt;: Stripe/PayPal transaction records vs our payment records&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reservation vs Seat Status&lt;/strong&gt;: Reserved seats in our system vs actual seat availability&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Order vs Ticket Count&lt;/strong&gt;: Number of tickets issued vs order quantities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revenue vs Transactions&lt;/strong&gt;: Total revenue calculations vs individual payment sums&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Refund Consistency&lt;/strong&gt;: Refund amounts vs original payment amounts&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How it works:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Scheduled Reconciliation&lt;/strong&gt;: Runs every 15 minutes to check recent transactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Monitoring&lt;/strong&gt;: Continuously validates critical operations (payments, ticket generation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-system Validation&lt;/strong&gt;: Compares data across payment gateways, databases, and external systems&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automated Correction&lt;/strong&gt;: Fixes minor discrepancies automatically (e.g., status updates)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Alert Generation&lt;/strong&gt;: Flags major discrepancies for manual investigation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit Reporting&lt;/strong&gt;: Generates comprehensive reports for compliance and analysis&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This reconciliation service acts as a safety net, providing an additional layer of confidence that our distributed system maintains data integrity even under extreme load and complex failure scenarios.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://gemini.google.com/share/15d3d82059b5" rel="noopener noreferrer"&gt;Review by Google Gemini&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>idempotency</category>
      <category>payments</category>
    </item>
    <item>
      <title>Part 3: Seat Management</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Thu, 23 Oct 2025 17:48:52 +0000</pubDate>
      <link>https://dev.to/sumedhbala/part-3-seat-management-nn2</link>
      <guid>https://dev.to/sumedhbala/part-3-seat-management-nn2</guid>
      <description>&lt;h2&gt;
  
  
  Reservations, Locking, Availability &amp;amp; Queuing
&lt;/h2&gt;

&lt;p&gt;Seat management ensures users can reliably reserve seats without conflicts, overselling, or double-booking. It handles reserved vs General Admission seats, tracks reservations until payment, manages concurrency, and provides real-time availability updates. Queuing prevents overload during high-demand events.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Baseline Functional Solution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Service Responsibilities &amp;amp; API Mapping
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Service&lt;/th&gt;
&lt;th&gt;Primary Responsibilities&lt;/th&gt;
&lt;th&gt;API Endpoints&lt;/th&gt;
&lt;th&gt;Data Sources&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Seat Management Service&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Seat reservations, locking, cancellations, expired cleanup&lt;/td&gt;
&lt;td&gt;POST /events/{event_id}/reservations&lt;/td&gt;
&lt;td&gt;Primary Database, Redis Cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Queue Management Service&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;User queuing, position tracking, admission control&lt;/td&gt;
&lt;td&gt;POST /events/{event_id}/queue&lt;br&gt;GET /events/{event_id}/queue/status&lt;/td&gt;
&lt;td&gt;Redis Queue, Database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Availability Aggregation Service&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Section counts, real-time availability tracking&lt;/td&gt;
&lt;td&gt;GET /events/{event_id}/sections&lt;/td&gt;
&lt;td&gt;Message Bus, Redis Cache&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-time Notification Service&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SSE connections for live updates&lt;/td&gt;
&lt;td&gt;SSE /events/{event_id}/sections/updates&lt;/td&gt;
&lt;td&gt;Redis Pub/Sub, Message Bus&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Databases &amp;amp; Caches
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primary Database&lt;/strong&gt; (PostgreSQL / MySQL / DynamoDB): Stores canonical seat and reservation state.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reservation Store&lt;/strong&gt; (DynamoDB / Redis / PostgreSQL): Tracks in-progress reservations. Only stores queue_token for anonymous users; logged-in users are identified via JWT.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache&lt;/strong&gt; (Redis / Memcached): Hot lookups for section counts, seat maps, and user queue positions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message Bus&lt;/strong&gt; (Kafka / Pulsar / Kinesis / Pub/Sub): Streams reservation events for real-time updates and availability aggregation.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queueing Infrastructure&lt;/strong&gt;: Queued requests for high-demand events. Distributed coordination ensures unique positions and queue ordering.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Real-Time Updates Infrastructure&lt;/strong&gt;: In-memory counters maintain section-level seats_remaining. Fanout layer scales push notifications to many clients.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Example Database Schema
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The following are simplified schema overviews for understanding the basic structure.&lt;/p&gt;

&lt;h4&gt;
  
  
  Seats Table
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"A-101"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;row&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
&lt;span class="n"&gt;seat_number&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt;
&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;available&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reserved&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Reservations Table (Header)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;queue_token&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;only&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;anonymous&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;logged&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pending_payment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;confirmed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;canceled&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expired&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;total_amount_minor_units&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;integer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;amount&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="n"&gt;smallest&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="n"&gt;unit&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;avoid&lt;/span&gt; &lt;span class="n"&gt;floating&lt;/span&gt; &lt;span class="n"&gt;point&lt;/span&gt; &lt;span class="nb"&gt;precision&lt;/span&gt; &lt;span class="n"&gt;issues&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;currency&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"USD"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;payment_intent_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;
&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;
&lt;span class="n"&gt;confirmed_at&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;nullable&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Reservation_Seats Table (Reserved Seats)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;reservation_seat_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Reservation_GA Table (General Admission)
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;reservation_ga_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt; &lt;span class="k"&gt;to&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;quantity&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;integer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;number&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="n"&gt;GA&lt;/span&gt; &lt;span class="n"&gt;tickets&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Sections Table
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"section_A"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;foreign&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;string&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"104"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;capacity&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;integer&lt;/span&gt;
&lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;integer&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;updated&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;real&lt;/span&gt; &lt;span class="nb"&gt;time&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Queue Table
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;queue_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;FK&lt;/span&gt;
&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="k"&gt;nullable&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;logged&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="k"&gt;in&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;queue_token&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="k"&gt;nullable&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;anonymous&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;signed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;waiting&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ready&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expired&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;completed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;last_updated_at&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;timestamps&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Reservation Flow
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Enter Queue (High-Demand Events)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Anonymous users receive a queue_token.&lt;/li&gt;
&lt;li&gt;Logged-in users are identified via JWT; no token issued.&lt;/li&gt;
&lt;li&gt;If the queue is empty, users are immediately ready to reserve seats.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Browse Section Availability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Anyone can fetch aggregated section counts before joining the queue.&lt;/li&gt;
&lt;li&gt;Individual reserved seat details require ready status (queue_token or JWT).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Reserve Seats / General Admission Slots
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Must be ready in the queue.&lt;/li&gt;
&lt;li&gt;Backend validates queue readiness via queue_token (anonymous) or JWT/account ID (logged-in).&lt;/li&gt;
&lt;li&gt;Reserved seats have TTL until payment is completed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Real-Time Updates
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Users receive live updates on section availability via Server-Sent Events (SSE).&lt;/li&gt;
&lt;li&gt;Anonymous users subscribe with queue_token.&lt;/li&gt;
&lt;li&gt;Logged-in users subscribe with JWT.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  3. APIs (in order of user flow)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Enter Queue
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: POST /events/{event_id}/queue&lt;br&gt;
&lt;strong&gt;Request Body&lt;/strong&gt;: {}&lt;br&gt;
&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"queue_token"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"&amp;lt;signed_token&amp;gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"estimated_wait_time"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"position_in_queue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ready"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Notes&lt;/strong&gt;: Logged-in users receive no token; identity tracked via JWT.&lt;/p&gt;

&lt;h3&gt;
  
  
  Queue Status
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: GET /events/{event_id}/queue/status&lt;br&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: Authorization: Bearer  (anonymous) OR Authorization: Bearer  (logged-in)&lt;br&gt;
&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"position_in_queue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;123&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"waiting"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Notes&lt;/strong&gt;: All users must check queue status before reserving seats during high-demand events.&lt;/p&gt;

&lt;h3&gt;
  
  
  Get Section Availability
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: GET /events/{event_id}/sections&lt;br&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: Optional&lt;br&gt;
&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"sections"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"section_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"total_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"available_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"section_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"B"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"total_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;800&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"available_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"section_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GA1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"General Admission"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"available_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Notes&lt;/strong&gt;: Anyone can fetch counts even before joining the queue. Reserved seat details are not returned.&lt;/p&gt;

&lt;h3&gt;
  
  
  Get Seats in a Section (Reserved Seating Only, Paginated)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: GET /events/{event_id}/sections/{section_id}/seats&lt;br&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: Authorization: Bearer  (anonymous, ready) OR Authorization: Bearer  (logged-in, ready)&lt;br&gt;
&lt;strong&gt;Query Params&lt;/strong&gt;: page=1, page_size=50&lt;br&gt;
&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"section_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"total_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"page"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"page_size"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"seat_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A-101"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"row"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"available"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"seat_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A-102"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"row"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"reserved"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"seat_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A-103"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"row"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"available"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Reserve Seats / General Admission Slots
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: POST /events/{event_id}/reservations&lt;br&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: Authorization: Bearer  (anonymous) OR Authorization: Bearer  (logged-in)&lt;br&gt;
&lt;strong&gt;Request Body&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"A-101"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"A-103"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ga_section"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GA1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ga_quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Database Operations&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create Reservation Header&lt;/strong&gt;: Insert into &lt;code&gt;reservations&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Link Reserved Seats&lt;/strong&gt;: Insert into &lt;code&gt;reservation_seats&lt;/code&gt; table for each seat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Link GA Quantities&lt;/strong&gt;: Insert into &lt;code&gt;reservation_ga&lt;/code&gt; table for GA sections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Seat Status&lt;/strong&gt;: Mark seats as 'reserved' in &lt;code&gt;seats&lt;/code&gt; table&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Section Counts&lt;/strong&gt;: Decrement &lt;code&gt;seats_remaining&lt;/code&gt; in &lt;code&gt;sections&lt;/code&gt; table&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Response&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reservation_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"res_789"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"reserved_until"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-09-02T12:35:00Z"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"A-101"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="s2"&gt;"A-103"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ga_section"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"GA1"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"ga_quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"pending_payment"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Notes&lt;/strong&gt;: Backend validates queue readiness. TTL ensures release if payment isn't completed.&lt;/p&gt;

&lt;h3&gt;
  
  
  Real-Time Section Updates (Server-Sent Events)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Why Server-Sent Events (SSE) for Seat Notifications:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;SSE is the optimal choice for seat availability notifications because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One-Way Communication&lt;/strong&gt;: Seat updates flow server → client only (no client → server needed)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simpler Implementation&lt;/strong&gt;: Native browser support with automatic reconnection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better Performance&lt;/strong&gt;: Lower memory overhead (~2KB vs ~8KB per WebSocket connection)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Higher Scalability&lt;/strong&gt;: Can handle 50K+ connections per pod vs 10K for WebSockets&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Built-in Resilience&lt;/strong&gt;: Automatic reconnection and error handling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Perfect Use Case Match&lt;/strong&gt;: Ideal for push notifications like seat availability changes&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  WebSocket vs SSE: When to Use Each
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Why WebSocket Has Lower Latency (1ms vs 8ms):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Persistent Connection&lt;/strong&gt;: No HTTP handshake overhead per message&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Binary Protocol&lt;/strong&gt;: More efficient than HTTP text-based SSE&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bidirectional&lt;/strong&gt;: Client can send acknowledgments, reducing server-side queuing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct Connection&lt;/strong&gt;: No intermediate services (SNS/SQS) adding latency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use WebSocket When:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bidirectional Communication&lt;/strong&gt;: Chat, real-time collaboration, gaming&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low Latency Critical&lt;/strong&gt;: Real-time trading, live sports updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interactive Features&lt;/strong&gt;: User can send commands/responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Custom Protocols&lt;/strong&gt;: Need binary data or custom message formats&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Use SSE When:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One-Way Communication&lt;/strong&gt;: Seat availability updates, notifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP Infrastructure&lt;/strong&gt;: Leverage existing HTTP caching, CDNs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browser Compatibility&lt;/strong&gt;: Better support for automatic reconnection&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simpler Implementation&lt;/strong&gt;: No need for connection state management&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: /events/{event_id}/sections/updates&lt;br&gt;
&lt;strong&gt;Headers&lt;/strong&gt;: Authorization: Bearer  (anonymous, ready) OR Authorization: Bearer  (logged-in, ready)&lt;br&gt;
&lt;strong&gt;Message&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"section_id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"A"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"available_seats"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;120&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"timestamp"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2025-09-02T12:30:00Z"&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Notes&lt;/strong&gt;: General Admission counts and reserved seat holds are pushed in real time. Updates are throttled to reduce load.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Params Are in the Path vs. Request Body
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Event ID, Section ID&lt;/strong&gt; → Path Params: Represent resources in a hierarchy (/events/{event_id}/sections/{section_id}/seats). REST convention prefers path params when accessing a specific resource.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seats, GA Section, GA Quantity&lt;/strong&gt; → Request Body: Represent actions or state changes (reserving seats). Describe what the client wants to do, not the resource being fetched.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue Token / JWT&lt;/strong&gt; → Headers: Authentication and authorization tokens always go in headers, not bodies or paths, to keep APIs clean, stateless, and cacheable.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This separation keeps the API design RESTful, predictable, and consistent with industry practices.&lt;/p&gt;

&lt;h3&gt;
  
  
  Key Takeaways
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Anonymous users are tracked via queue_token.&lt;/li&gt;
&lt;li&gt;Logged-in users are tracked via JWT/account ID; no token issued.&lt;/li&gt;
&lt;li&gt;Queue status is required for all users during high-demand events before reserving seats.&lt;/li&gt;
&lt;li&gt;Pagination improves performance for large sections.&lt;/li&gt;
&lt;li&gt;Real-time updates keep section availability accurate.&lt;/li&gt;
&lt;li&gt;Unauthorized users (no JWT or queue_token) cannot reserve seats.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Virtual Waiting Room: A Deep Dive into High-Scale Queueing
&lt;/h2&gt;

&lt;p&gt;This deep dive focuses on the Queue Management Service that gracefully manages demand spikes for seat reservations. We'll cover the full system design: its runtime architecture, the exact Redis data structures and operations, a modern approach to consistency and recovery, and a head-to-head comparison of two primary Redis queue implementations.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem: Surviving a Demand Spike
&lt;/h3&gt;

&lt;p&gt;Imagine a major concert ticket sale. Millions of users hit your site simultaneously. Without an admission control system, your backend services—for seat reservation and payment—would be instantly overwhelmed. This leads to a chaotic user experience: slow, unresponsive pages, frustrating timeouts, and a flood of angry customer service calls. This is a critical business problem, costing millions in lost sales and brand trust. The solution isn't to build a bigger backend overnight; it's to create a virtual waiting room that gracefully manages the flood of users and ensures an orderly process.&lt;/p&gt;

&lt;h3&gt;
  
  
  Goals
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Protect downstream seat/reservation systems from overload (admission control).&lt;/li&gt;
&lt;li&gt;Maintain predictable, mostly FIFO ordering across millions of waiting users.&lt;/li&gt;
&lt;li&gt;Provide "where am I?" and "how long left?" with low latency.&lt;/li&gt;
&lt;li&gt;Allow anonymous and logged-in users (queue_token vs JWT).&lt;/li&gt;
&lt;li&gt;Recover from cache failures quickly without losing order.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Non-Goals (handled elsewhere)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Payments and final ticket issuance.&lt;/li&gt;
&lt;li&gt;Full bot mitigation (you will still rate-limit and verify).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  High-Level Architecture
&lt;/h3&gt;

&lt;p&gt;The system's core is a hybrid, two-part architecture: a durable database for long-term state and a blazing-fast Redis cache for real-time operations.&lt;/p&gt;

&lt;h4&gt;
  
  
  Write Path (Join Queue)
&lt;/h4&gt;

&lt;p&gt;The API handles a dual-write process for speed and durability:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A user joins the queue by hitting an API endpoint. The API authenticates them (logged-in via JWT, anonymous gets a signed queue_token).&lt;/li&gt;
&lt;li&gt;The API performs two writes in parallel: it persists a minimal user entry into the Queue DB (our single source of truth) and enqueues the user into Redis (our hot path for ordering).&lt;/li&gt;
&lt;li&gt;The API instantly returns the user's initial position and ETA, derived from the monotonic sequence number (join_seq) received from the Redis write. &lt;em&gt;We'll discuss the O(1) position calculation algorithm in detail in the "Technical Deep Dive" section below.&lt;/em&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Admission (Making People Ready)
&lt;/h4&gt;

&lt;p&gt;A Gatekeeper worker steadily admits users from the head of the queue at a controlled rate (e.g., N users/sec). It marks them "ready" in the DB and emits an event to the message bus for observability and client notification.&lt;/p&gt;

&lt;h4&gt;
  
  
  Read Path (Position, ETA)
&lt;/h4&gt;

&lt;p&gt;In steady state, all reads are served from Redis, providing low-latency updates. If Redis is degraded, the system falls back to a degraded mode (details in section 5).&lt;/p&gt;

&lt;h4&gt;
  
  
  State Transitions
&lt;/h4&gt;

&lt;p&gt;waiting → ready → (reserve) → done/expired. A user who doesn't reserve in time can re-queue subject to system policy.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Model (DB + Redis)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Relational DB (minimal, durable)
&lt;/h4&gt;

&lt;p&gt;This is our rock-solid, durable layer. We store only the essential, long-lived data.&lt;/p&gt;

&lt;h4&gt;
  
  
  Redis (Reconstructible Cache)
&lt;/h4&gt;

&lt;p&gt;This is our high-performance, live-ordering layer. We'll explore two primary implementations.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A) Sorted Set (ZSET) per event&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Key: q:{event_id}:z&lt;/li&gt;
&lt;li&gt;Member: user_key (e.g., u:{user_id})&lt;/li&gt;
&lt;li&gt;Score: join_seq (monotonic sequence number)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;B) List + Hash (list as deque, hash for metadata)&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Key: q:{event_id}:list (RPUSH/LPOP)&lt;/li&gt;
&lt;li&gt;Hash: q:{event_id}:meta:{user_key} → field: join_seq (monotonic sequence number)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Consistency, Failure &amp;amp; Recovery (Event-Sourced Approach)
&lt;/h3&gt;

&lt;p&gt;To build a truly resilient system, we use an event-sourced model where the database is the single source of truth and Redis acts as a reconstructible, high-speed cache. This approach eliminates race conditions and manual intervention.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;DB-First Writes&lt;/strong&gt;: All changes, like a user joining or being admitted, are first written to the database by the API.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimized Recovery (The Snapshot + Stream Model)&lt;/strong&gt;: Our primary cache, Redis, is highly available but can still fail. To ensure a fast and accurate rebuild, we combine periodic snapshots with a continuous event stream.

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Periodic Snapshots&lt;/strong&gt;: The system periodically takes a snapshot of the Redis queue's live ordering state. This is a quick baseline backup of the essential user data and their order. These snapshots are stored in durable storage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rapid Rebuild&lt;/strong&gt;: If Redis crashes, a dedicated recovery worker performs two steps: It loads the latest durable snapshot into Redis. It then queries the event log table to replay only the events that occurred since that snapshot was taken. This is a much smaller number of events, allowing for a fast and reliable catch-up.&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;How CDC Event Replay Works&lt;/strong&gt;: CDC maintains a complete log of all database changes with timestamps. When Redis needs to be rebuilt, the recovery worker queries the CDC log for all changes that occurred after the last snapshot timestamp. This approach provides complete coverage since CDC captures every database change automatically, ensuring no events are missed during recovery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Redis Design Choices &amp;amp; Time Complexity
&lt;/h3&gt;

&lt;p&gt;A fundamental challenge in high-scale queueing is guaranteeing predictable behavior. A simple timestamp isn't reliable because millions of users could join the queue at the exact same millisecond, leading to a race condition and a random, non-deterministic order. Additionally, system clocks can sometimes go backwards due to clock synchronization issues (NTP adjustments, server reboots, etc.), which would make timestamp-based ordering even more confusing. To solve this, we use a monotonically increasing sequence number (join_seq) generated by Redis's atomic INCR command. This ensures that every single user gets a unique, sequential ticket, making the queue's internal logic perfectly deterministic.&lt;/p&gt;

&lt;p&gt;This simple design choice unlocks a massive performance benefit: O(1) position lookup. While native Redis commands for position finding are slow (O(logN) for Sorted Sets and O(N) for Lists), we can bypass them entirely. We simply track the sequence number of the queue's head and subtract it from the user's join_seq to get their exact position in constant time. This is the "Ah-Ha!" moment that makes our system so fast and responsive for millions of users checking their status.&lt;/p&gt;

&lt;h4&gt;
  
  
  A) Sorted Set (ZSET)
&lt;/h4&gt;

&lt;p&gt;A Redis Sorted Set is a unique data structure that combines a hash table and a skip list to balance performance. A skip list is a probabilistic data structure that offers logarithmic time complexity (O(logN)) for searches, insertions, and deletions, much like a balanced binary search tree, but is generally simpler to implement.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enqueue (ZADD)&lt;/strong&gt;: O(logN)

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Explanation&lt;/strong&gt;: When you add a new user to the Sorted Set using the ZADD command, Redis needs to place it in the correct, sorted position. A Redis Sorted Set is a hybrid data structure that uses a skip list to maintain order. Finding the correct spot in a skip list to insert an element requires traversing a subset of the elements, which is a logarithmic time operation, hence O(logN).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Dequeue (Batch ZPOPMIN)&lt;/strong&gt;: O(M⋅logN)

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Explanation&lt;/strong&gt;: The ZPOPMIN command removes the element with the lowest score (the head of your queue). The complexity is O(logN) for a single removal. When the operation is done in a batch to remove M users, Redis has to perform this O(logN) removal process M times. This multiplies the complexity, resulting in O(M⋅logN).&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;

&lt;strong&gt;Get User's Position&lt;/strong&gt;: O(1) (bypasses the native ZRANK command).&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why ZRANK Has O(logN) Complexity&lt;/strong&gt;: Redis Sorted Sets use skip lists where each node maintains &lt;strong&gt;counts&lt;/strong&gt; - the number of elements between itself and the next node at each level. To find a user's rank:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Start at the header&lt;/strong&gt;: Begin at the top level of the skip list&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Traverse levels&lt;/strong&gt;: For each level, follow forward pointers while the next node's score &amp;lt; target score&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Accumulate counts&lt;/strong&gt;: Add the count stored at each node to the running rank total&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level descent&lt;/strong&gt;: Move down one level and repeat until reaching the bottom level&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Count Maintenance&lt;/strong&gt;: When inserting/deleting elements, Redis must update counts at all affected levels. Each insertion requires updating counts at log(N) levels on average, and each deletion requires similar updates to maintain count accuracy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why O(logN)&lt;/strong&gt;: Even with counts, ZRANK must traverse log(N) levels, performing count accumulation at each level. For 1M users: ~20 level traversals × count operations = O(logN) complexity.&lt;/p&gt;

&lt;h4&gt;
  
  
  B) List + Hash
&lt;/h4&gt;

&lt;p&gt;This design separates the two primary functions of the queue: a List for ordered processing and a Hash for user metadata. A List is a doubly linked list, optimized for fast additions and removals from the head and tail. A Hash provides O(1) access to a user's metadata.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Enqueue (RPUSH)&lt;/strong&gt;: O(1)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dequeue (LPOP)&lt;/strong&gt;: O(1)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Get User's Position&lt;/strong&gt;: O(1) (bypasses the native LINDEX command).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why LINDEX Has O(N) Complexity&lt;/strong&gt;: The native LINDEX command finds an element at a specific index within a Redis List. A Redis List is implemented as a doubly linked list. To find the position of a specific user, Redis has no choice but to start from the beginning of the list and traverse each element one by one until it finds the matching user ID. This means the time it takes is directly proportional to the number of elements in the list, making it a linear time operation, or O(N). For a large queue with millions of users, this would be prohibitively slow and would not scale.&lt;/p&gt;

&lt;h3&gt;
  
  
  How We Achieve O(1)
&lt;/h3&gt;

&lt;p&gt;We achieve a constant time position lookup by bypassing native Redis commands entirely and using simple application-level logic. This is possible because our queue design is based on a monotonically increasing sequence number (join_seq).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Monotonic Counter&lt;/strong&gt;: The key to this approach is that every user is assigned a unique, sequential number upon joining the queue. This number (join_seq) serves as their unchanging ticket number.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tracking the Head&lt;/strong&gt;: The system maintains a global counter for the "head" of the queue, representing the join_seq of the user currently being processed or the last user admitted.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The Simple Math&lt;/strong&gt;: A user's position is simply the difference between their join_seq and the current head of the queue. This is a single subtraction operation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;: If the queue head is at join_seq = 1,000,000, and a user has join_seq = 1,000,150, their position is 1,000,150 − 1,000,000 = 150. This calculation is a single step that takes constant time, regardless of whether there are 10 users or 10 million in the queue. This mathematical approach transforms a potentially slow operation into an instant one.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Con of O(1) Application-Level Logic
&lt;/h3&gt;

&lt;p&gt;While powerful, relying on application-level logic for O(1) lookups creates a risk of race conditions. A user's position could be calculated incorrectly if the head_seq counter changes mid-request.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How to Prevent Race Conditions&lt;/strong&gt;: The most reliable method is to use a Redis Lua script. The script atomically fetches the user's join_seq and the head_seq counter. Redis guarantees that the entire script runs as a single, indivisible operation, eliminating any risk of a race condition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Lua Script for Atomic Position Lookup&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight lua"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Atomic position calculation to prevent race conditions&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;user_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;head_seq_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;queue_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;KEYS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;-- Get user's join sequence number&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;user_seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'ZSCORE'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;queue_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;user_seq&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;"User not in queue"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;-- Get current head sequence&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;head_seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'GET'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;head_seq_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;head_seq&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt;
    &lt;span class="n"&gt;head_seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;-- Calculate position atomically&lt;/span&gt;
&lt;span class="kd"&gt;local&lt;/span&gt; &lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_seq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;tonumber&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;head_seq&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;position&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;position&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;user_seq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;head_seq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;head_seq&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Race Condition Prevention&lt;/strong&gt;: The Lua script ensures atomicity by fetching both user_seq and head_seq in a single Redis operation, preventing the race condition where head_seq changes between the two reads.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note&lt;/strong&gt;: The specific implementation of the Lua script will vary slightly based on the chosen Redis data structure. For the List + Hash model, the join_seq is fetched from the Hash, while for the Sorted Set model, it is retrieved as the score of the user's member in the set.&lt;/p&gt;

&lt;h3&gt;
  
  
  What This Gives Your Product
&lt;/h3&gt;

&lt;p&gt;This design delivers a robust, scalable, and resilient queue that transforms a chaotic demand spike into a predictable user experience. By separating concerns—the DB for durable identity, Redis for real-time ordering, and an event bus for reliable communication—we've built a system that provides a smooth user experience, ensures business continuity, and is easy to maintain.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Grand Unification of Ticketing: Seat Management Done Right
&lt;/h2&gt;

&lt;p&gt;The art of building a reliable ticketing platform lies in one core principle: preventing chaos. When millions of fans are clamoring for a limited number of tickets, your system must be a bastion of order. This deep dive into the Seat Management Service reveals the crucial mechanisms—database transactions and locking—that ensure every reservation is a clean, atomic operation, preventing the nightmare of overselling and double-booking.&lt;/p&gt;

&lt;p&gt;We'll build this system using three key tables: Seats to manage unique tickets, Sections for overall availability, and Reservations to handle temporary holds.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Database as Your Security Guard 🛡️
&lt;/h3&gt;

&lt;p&gt;At the heart of our strategy is the database, specifically PostgreSQL. Its powerful transactional capabilities allow us to treat a series of operations as a single, indivisible unit. The entire reservation process either succeeds completely, or it fails and is entirely rolled back, leaving no trace behind. This is the ACID principle in action, guaranteeing atomicity and consistency.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Atomic Reservation Flow
&lt;/h3&gt;

&lt;p&gt;When a user requests seats, the backend initiates a carefully choreographed sequence of database operations.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 1: Start the Transaction
&lt;/h4&gt;

&lt;p&gt;The very first action is to begin a transaction. Think of this as putting a "Do Not Disturb" sign on the data you're about to work on.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;BEGIN&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 2: Check &amp;amp; Lock for a Flawless Hold
&lt;/h4&gt;

&lt;p&gt;This is where we prevent overselling. The process differs slightly depending on the type of seating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For Reserved Seating: Pessimistic Locking 🔒&lt;/strong&gt;&lt;br&gt;
When a user selects a unique seat (e.g., "Seat A101"), the system immediately places a lock on that exact seat row in the database. This is a pessimistic lock, so named because we're being pessimistic and assuming another user might want the same seat. It guarantees that no other user can even read or attempt to modify that seat's status until our transaction is complete. The other user is forced to wait, preventing a conflict from ever occurring.&lt;/p&gt;

&lt;p&gt;This approach is perfect for reserved seats because they are unique and non-fungible. If the query returns fewer seats than requested, it means some were already taken. The transaction is immediately rolled back, and the user receives an error.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Select the requested seat and lock it for update&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'A101'&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why not Optimistic Locking for Reserved Seats?&lt;/strong&gt; While optimistic locking can offer higher concurrency, it's a poor fit for unique, non-fungible items like reserved seats. The approach involves checking for conflicts at the final moment of the transaction. A second user could start a reservation on the same seat, only to have their entire transaction rejected at the very end when the system detects the conflict. This creates a frustrating and unpredictable user experience, as a "seat not available" message delivered late in the process is far worse than an immediate one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparing the Costs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cost of a Pessimistic Lock&lt;/strong&gt;: The cost is a single, brief wait at the beginning of the process. The application sends one SELECT...FOR UPDATE query to the database, which handles all locking and waiting internally. This is a predictable, easy-to-manage cost.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cost of a Failed Optimistic Lock&lt;/strong&gt;: This cost is deceptive. While it avoids an up-front wait, it introduces the high cost of a complete re-run when a conflict occurs. The application must perform an entire sequence of operations—fetching data, running business logic, and attempting the final update—only for the update to fail. The application then has to discard all the work and restart the entire process, leading to redundant CPU cycles and wasted database round trips.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The essential difference is that a pessimistic lock's cost is a single, brief, and transparent wait. A failed optimistic lock's cost is the wasteful re-execution of a full and complex reservation attempt.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For General Admission (GA): Optimistic Locking 🟢&lt;/strong&gt;&lt;br&gt;
For GA, we must avoid a bottleneck. Many users will try to reserve GA slots at the same time, all hitting the same Sections table row. A pessimistic lock would serialize these requests, forcing them to wait in a single line, which defeats the purpose of the queuing system.&lt;/p&gt;

&lt;p&gt;Instead, we use optimistic locking, which assumes conflicts are rare. We don't lock the data upfront. We rely on a single, atomic SQL statement to perform a compare-and-swap (CAS) operation, checking and updating the value simultaneously. This approach is highly performant and scalable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Conflicts Are Rare in GA Reservations:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Conflicts occur when &lt;code&gt;seats_remaining&lt;/code&gt; drops to 0 or below the requested quantity between the time a user checks availability and attempts to reserve. This is rare because:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Large Capacity Buffers&lt;/strong&gt;: GA sections typically have thousands of seats (2,000-10,000+), making it unlikely for the last few seats to be contested simultaneously.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Queue-Controlled Access&lt;/strong&gt;: The queuing system already limits concurrent users. Only users who have been "admitted" from the queue can attempt reservations, reducing simultaneous access.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Time Windows&lt;/strong&gt;: Users have limited time (10-15 minutes) to complete reservations, and most successful reservations happen within the first few minutes of admission.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Natural Distribution&lt;/strong&gt;: User behavior naturally spreads out reservation attempts - some users are faster at selecting, others take time to decide.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Batch Processing&lt;/strong&gt;: The system can process multiple small reservations (1-4 tickets) simultaneously without conflict, as the remaining capacity buffer is usually large enough.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;When Conflicts Do Occur&lt;/strong&gt;: Conflicts become more likely only when &lt;code&gt;seats_remaining&lt;/code&gt; approaches very low numbers (e.g., &amp;lt; 10 seats remaining), but by then most users have already completed their reservations, and the queue system has already controlled the flow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why Conflicts Are NOT Rare for Reserved Seats&lt;/strong&gt;: Unlike GA sections with thousands of seats, reserved seating creates high conflict scenarios. Popular seats (front row, center sections, VIP areas) are unique and non-fungible - only one person can have seat A-101. When thousands of users simultaneously target the same premium seats, conflicts are inevitable. This is why reserved seating requires pessimistic locking to prevent overselling and ensure data consistency, while GA's large capacity buffers make optimistic locking viable.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;ga_quantity&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'section_GA1'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;ga_quantity&lt;/span&gt;
&lt;span class="n"&gt;RETURNING&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This single statement is executed atomically. The WHERE clause acts as our check, ensuring the database only performs the UPDATE if the seats_remaining count is sufficient at that exact moment. If another transaction has already updated the count, the WHERE condition will fail, and the UPDATE will affect zero rows.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why this is better than Pessimistic Locking for GA&lt;/strong&gt;: This approach avoids the single-row bottleneck. The database does not need to serialize all reservation requests; it only needs to check for a conflict at the moment of the update. This allows for massive parallelism, making the system highly scalable for high-demand GA events.&lt;/p&gt;

&lt;h4&gt;
  
  
  Performance Analysis: Locking Strategies Comparison
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Pessimistic Locking Performance (Reserved Seats)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Row Lookup&lt;/strong&gt;: O(log N) - B-tree index lookup by seat_id&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lock Acquisition&lt;/strong&gt;: O(1) - Single row lock after finding row&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lock Duration&lt;/strong&gt;: O(1) - Constant time for transaction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Contention Impact&lt;/strong&gt;: O(N) - Linear with concurrent users&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best Case&lt;/strong&gt;: 10-20ms (no contention)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Worst Case&lt;/strong&gt;: 5-10 seconds (high contention)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Average Case&lt;/strong&gt;: 50-100ms (moderate contention)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;CAS-Based Performance (General Admission)&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Row Lookup&lt;/strong&gt;: O(log N) - B-tree index lookup by section_id&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Update Operation&lt;/strong&gt;: O(1) - Single row update with condition check&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No Retries&lt;/strong&gt;: O(1) - Either succeeds or fails atomically&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Success Case&lt;/strong&gt;: 20-30ms (atomic update succeeds)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Failure Case&lt;/strong&gt;: 20-30ms (atomic update fails due to insufficient seats)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;High Concurrency&lt;/strong&gt;: O(log N) per transaction regardless of load&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Locking Strategy Decision Matrix&lt;/strong&gt;:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Factor&lt;/th&gt;
&lt;th&gt;Pessimistic&lt;/th&gt;
&lt;th&gt;CAS&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Consistency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Concurrency&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency (Low Load)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Latency (High Load)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Slow&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Memory Usage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Implementation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Simple&lt;/td&gt;
&lt;td&gt;Complex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Best For&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Reserved Seats&lt;/td&gt;
&lt;td&gt;General Admission&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h4&gt;
  
  
  Step 3: Update Counts
&lt;/h4&gt;

&lt;p&gt;With the seats or GA section securely locked, we can now confidently update their availability.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- For reserved seating, update the seat status to 'reserved' (with availability check)&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'reserved'&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'A101'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'available'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Also update the count in the sections table&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt; &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'section_A'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This step ensures the section-level availability information is always accurate for other users browsing the event.&lt;/p&gt;

&lt;h4&gt;
  
  
  Step 4: Create the Reservation Record
&lt;/h4&gt;

&lt;p&gt;A new record is inserted into the Reservations table. This record is the master holding information, linking the user to their selected seats or GA quantity. Crucially, we set a Time-to-Live (TTL) using the expires_at timestamp. This is an essential safety net. It ensures that if a user doesn't complete the payment, the reservation will automatically expire and release the seats.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create reservation header&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;expires_at&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;total_amount_minor_units&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'res_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'user_123'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'10 minutes'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;8950&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'USD'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;

&lt;span class="c1"&gt;-- Link reserved seat to reservation&lt;/span&gt;
&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;reservation_seat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;reservation_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;seat_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'rs_001'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'res_789'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'A-101'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;());&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Step 5: The Final Commitment
&lt;/h4&gt;

&lt;p&gt;If all preceding steps succeed, we execute the COMMIT command. This makes all changes permanent. At this point, the seats are officially marked as reserved (or the GA count is reduced), and the locks are released. If any step failed, the ROLLBACK command is issued, and the database returns to its state before the transaction began.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- All changes are made permanent and locks are released&lt;/span&gt;
&lt;span class="k"&gt;COMMIT&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- If a failure occurs, all changes are undone&lt;/span&gt;
&lt;span class="c1"&gt;-- ROLLBACK;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Expiring Abandoned Reservations
&lt;/h3&gt;

&lt;p&gt;No matter how robust your system, some users will abandon their carts. To prevent these "seat leaks," a separate background process, or worker, periodically scans the Reservations table for records where expires_at is in the past. When an expired reservation is found, the worker executes a new transaction to return the seats to the available pool. For reserved seats, it updates their status to available. For GA, it adds the ga_quantity back to the seats_remaining in the Sections table. This mechanism ensures that inventory is never held indefinitely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⚠️ Race Condition Considerations&lt;/strong&gt;: The cleanup process must handle concurrent operations safely. Multiple cleanup jobs, new reservations being created during cleanup, and concurrent seat releases can all cause data inconsistency if not properly managed.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Cleanup log table for monitoring and debugging&lt;/span&gt;
&lt;span class="c1"&gt;-- This table tracks every cleanup operation so we can monitor the system's health&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;cleanup_log&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;log_id&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="n"&gt;AUTO_INCREMENT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;           &lt;span class="c1"&gt;-- Unique identifier for each log entry&lt;/span&gt;
    &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;                          &lt;span class="c1"&gt;-- Which section was cleaned up&lt;/span&gt;
    &lt;span class="n"&gt;expired_ga_count&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                            &lt;span class="c1"&gt;-- How many GA seats were released&lt;/span&gt;
    &lt;span class="n"&gt;expired_reserved_count&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                      &lt;span class="c1"&gt;-- How many reserved seats were released&lt;/span&gt;
    &lt;span class="n"&gt;affected_reservations&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                       &lt;span class="c1"&gt;-- How many reservations were marked as expired&lt;/span&gt;
    &lt;span class="n"&gt;cleaned_at&lt;/span&gt; &lt;span class="nb"&gt;TIMESTAMP&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="k"&gt;CURRENT_TIMESTAMP&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;-- When the cleanup happened&lt;/span&gt;
    &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_cleaned_at&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cleaned_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;               &lt;span class="c1"&gt;-- Index for efficient queries by time&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- PostgreSQL version (more robust with better locking)&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;PROCEDURE&lt;/span&gt; &lt;span class="n"&gt;cleanup_expired_reservations&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p_section_id&lt;/span&gt; &lt;span class="nb"&gt;VARCHAR&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;
&lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;DECLARE&lt;/span&gt;
    &lt;span class="n"&gt;v_expired_ga_count&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;v_expired_reserved_count&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="n"&gt;v_affected_rows&lt;/span&gt; &lt;span class="nb"&gt;INT&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
    &lt;span class="c1"&gt;-- Start a transaction block (automatically handled in procedures in PG &amp;gt;=11)&lt;/span&gt;
    &lt;span class="c1"&gt;-- We still use EXCEPTION handling for rollback on any error.&lt;/span&gt;

    &lt;span class="k"&gt;BEGIN&lt;/span&gt;
        &lt;span class="c1"&gt;-- Lock the section row to prevent concurrent cleanups&lt;/span&gt;
        &lt;span class="n"&gt;PERFORM&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt; &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;-- Lock all affected reservations early to prevent concurrent modifications&lt;/span&gt;
        &lt;span class="n"&gt;PERFORM&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
        &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_ga&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;
        &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;-- Count expired GA reservations for this section&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SUM&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;v_expired_ga_count&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
        &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_ga&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;-- Count expired reserved seat reservations for this section&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;v_expired_reserved_count&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
        &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;-- Restore seats_remaining count&lt;/span&gt;
        &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;v_expired_ga_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="n"&gt;v_expired_reserved_count&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
            &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;sections&lt;/span&gt;
            &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;seats_remaining&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;v_expired_ga_count&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;v_expired_reserved_count&lt;/span&gt;
            &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;-- Re-release reserved seats for this section only&lt;/span&gt;
        &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;
        &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'available'&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
        &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;-- Mark expired reservations (both GA and reserved) as expired for this section&lt;/span&gt;
        &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
        &lt;span class="k"&gt;SET&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'expired'&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending_payment'&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservation_ga&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
              &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt;
              &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; 
                  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; 
                  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
              &lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;-- Count affected rows (optional: count only those for this section)&lt;/span&gt;
        &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;v_affected_rows&lt;/span&gt;
        &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservations&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
        &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'expired'&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expires_at&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
          &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
              &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservation_ga&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;rga&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
              &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt;
              &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;reservation_id&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;reservation_seats&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt; 
                  &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;seats&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;rs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seat_id&lt;/span&gt; 
                  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;section_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p_section_id&lt;/span&gt;
              &lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="p"&gt;);&lt;/span&gt;

        &lt;span class="c1"&gt;-- Log cleanup operation&lt;/span&gt;
        &lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;cleanup_log&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;section_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;expired_ga_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;expired_reserved_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;affected_reservations&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;cleaned_at&lt;/span&gt;
        &lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;VALUES&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="n"&gt;p_section_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;v_expired_ga_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;v_expired_reserved_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;v_affected_rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
            &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="n"&gt;EXCEPTION&lt;/span&gt;
        &lt;span class="k"&gt;WHEN&lt;/span&gt; &lt;span class="n"&gt;OTHERS&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
            &lt;span class="n"&gt;RAISE&lt;/span&gt; &lt;span class="n"&gt;NOTICE&lt;/span&gt; &lt;span class="s1"&gt;'Error during cleanup: %'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;SQLERRM&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;ROLLBACK&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="n"&gt;RAISE&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key Race Condition Protections&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Row-Level Locking&lt;/strong&gt;: &lt;code&gt;FOR UPDATE&lt;/code&gt; prevents concurrent section modifications&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Atomic Operations&lt;/strong&gt;: All updates in a single transaction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Automatic rollback on any failure&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit Trail&lt;/strong&gt;: Preserves expired reservations for compliance&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Monitoring&lt;/strong&gt;: Cleanup log tracks operations for debugging&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Anatomy of a Hyper-Scale Real-Time System: A Production-Ready Deep Dive
&lt;/h2&gt;

&lt;p&gt;A detailed technical guide to designing a resilient, high-performance, and scalable notification platform.&lt;/p&gt;

&lt;p&gt;In the high-stakes world of live event ticketing, a real-time notification system is a critical business tool, not a luxury. It directly drives revenue by preventing lost sales and builds brand trust by providing a transparent user experience.&lt;/p&gt;

&lt;p&gt;A static "seats available" number leads to frustration and cart abandonment when users attempt to reserve seats that are already gone. A real-time system provides an accurate, live view of inventory, which eliminates this negative experience. Additionally, seeing seat availability change dynamically creates a sense of urgency that encourages faster purchasing decisions.&lt;/p&gt;

&lt;p&gt;Building a real-time system capable of serving millions of concurrent users requires a series of deliberate architectural choices that prioritize decoupling, resilience, and performance. This article provides a deep dive into a production-ready architecture for a large-scale ticketing notification system. We will define each component's role, trace the data flow in detail, and analyze the system's resilience to failure.&lt;/p&gt;

&lt;h3&gt;
  
  
  Part 1: Foundational Components
&lt;/h3&gt;

&lt;p&gt;This architecture is composed of several specialized components, each responsible for a distinct part of the data lifecycle.&lt;/p&gt;

&lt;h4&gt;
  
  
  Event Producers &amp;amp; The Durable Log (Amazon MSK)
&lt;/h4&gt;

&lt;p&gt;The system is event-driven. State changes are captured automatically using Change Data Capture (CDC) from the database, ensuring complete decoupling between high-frequency reservation operations and event publishing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How CDC Works in Practice:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Database Changes&lt;/strong&gt;: When the Seat Management Service commits a reservation transaction (e.g., seat status changed from 'available' to 'reserved'), the database records the change&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Automatic Capture&lt;/strong&gt;: CDC automatically captures all database changes at the transaction log level, including:

&lt;ul&gt;
&lt;li&gt;Table modifications (seats, reservations, sections)&lt;/li&gt;
&lt;li&gt;Before/after values for each change&lt;/li&gt;
&lt;li&gt;Transaction metadata and timestamps&lt;/li&gt;
&lt;li&gt;User context from the application&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event Transformation&lt;/strong&gt;: A CDC processor transforms raw database changes into business events:

&lt;ul&gt;
&lt;li&gt;Raw UPDATE → 'seat_reserved' event&lt;/li&gt;
&lt;li&gt;Raw INSERT → 'reservation_created' event&lt;/li&gt;
&lt;li&gt;Raw DELETE → 'reservation_expired' event&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MSK Publishing&lt;/strong&gt;: Transformed events are published to appropriate Kafka topics (e.g., 'seat-events', 'reservation-events')&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero Performance Impact&lt;/strong&gt;: Reservation operations have no latency impact since CDC runs asynchronously&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Downstream Processing&lt;/strong&gt;: Other services (Availability Aggregation, Notifications) consume these events to update their state&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;MSK serves as the system's durable log and provides several critical functions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Decoupling&lt;/strong&gt;: It decouples producers from consumers, allowing them to operate and scale independently.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Durability &amp;amp; Replayability&lt;/strong&gt;: Events are durably stored, allowing them to be re-processed in case of a consumer-side error.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backpressure Management&lt;/strong&gt;: The bus acts as a large buffer, absorbing traffic spikes.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The Processing Engine (Apache Flink)
&lt;/h4&gt;

&lt;p&gt;The Availability Aggregation Service is implemented as a stateful stream processing application using Apache Flink. It consumes the raw event stream from MSK and transforms it into clean, aggregated availability counts.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Stateful Processing&lt;/strong&gt;: Flink uses a keyBy operation on the section_id to ensure all events for a single section are processed sequentially by the same task.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fault Tolerance&lt;/strong&gt;: Flink is configured to take periodic checkpoints of its state to a durable store like Amazon S3, guaranteeing data accuracy through failures.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Windowed Aggregation &amp;amp; Throttling&lt;/strong&gt;: Instead of emitting an update for every single event, Flink can operate on windows of time. For example, it can collect all events for a specific section over a 2-second window, perform a single aggregation at the end of that window, and then emit a single, consolidated update. This micro-batching is crucial for throttling the update stream, reducing the load on the entire downstream notification system and preventing the end user's UI from flickering with an excessive number of rapid updates.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The Distribution Fleet (Amazon EKS)
&lt;/h4&gt;

&lt;p&gt;The Real-time Notification Service is a fleet of containerized applications deployed on Amazon EKS (Elastic Kubernetes Service). These long-running pods are the "SSE hosts."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Persistent Connections&lt;/strong&gt;: Each pod is capable of maintaining thousands of persistent Server-Sent Events (SSE) connections with clients.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local State&lt;/strong&gt;: Each pod maintains two private, in-memory hash tables to manage its local connections, enabling extremely fast lookups.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Horizontal Scalability&lt;/strong&gt;: The fleet runs as a Kubernetes Deployment, allowing it to be scaled horizontally.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  The Connection Manager (Application Load Balancer)
&lt;/h4&gt;

&lt;p&gt;An Application Load Balancer (ALB) sits in front of the EKS fleet, terminating TLS, performing health checks, and distributing incoming SSE connection requests to the EKS pods using a "Least Connections" algorithm.&lt;/p&gt;

&lt;h4&gt;
  
  
  The Routing Directory (Amazon ElastiCache for Redis)
&lt;/h4&gt;

&lt;p&gt;A central, low-latency in-memory database acts as our real-time routing table, or "Director."&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Function&lt;/strong&gt;: Its primary job is to maintain a live map between a logical topic (e.g., section_id) and the physical identifiers of the EKS pods that are currently serving clients interested in that topic.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Implementation&lt;/strong&gt;: This is implemented using Amazon ElastiCache for Redis, configured for high availability. The data structure is a simple Redis Set: section:104:subscribers -&amp;gt; { "pod-a-ip", "pod-c-ip" }.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Part 2: The Core Architecture — A Deep Dive into the Data Flow
&lt;/h3&gt;

&lt;p&gt;The architecture is designed for low-latency, targeted message delivery.&lt;/p&gt;

&lt;h4&gt;
  
  
  Phase 1: Connection &amp;amp; State Registration
&lt;/h4&gt;

&lt;p&gt;This phase establishes the client connection and builds the two-layer routing map.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A client initiates an SSE connection request to the ALB. The ALB selects a target pod, pod-C, and forwards the request.&lt;/li&gt;
&lt;li&gt;The application inside pod-C establishes the SSE connection and assigns it a unique internal ID.&lt;/li&gt;
&lt;li&gt;The client sends a subscription message (e.g., for section-104).&lt;/li&gt;
&lt;li&gt;pod-C updates its two local, in-memory hash tables:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Forward Map&lt;/strong&gt; (connectionId -&amp;gt; sections): Used for efficient cleanup when the client disconnects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reverse Map&lt;/strong&gt; (section -&amp;gt; connectionIds): Used for O(1) lookups during message delivery.&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;pod-C then updates the Redis Director, adding its own unique, addressable identifier to the Redis Set for section-104.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   redis.SADD("section:104:subscribers", "pod-c-ip")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  Phase 2: Real-Time Message Delivery
&lt;/h4&gt;

&lt;p&gt;This phase traces an event from processing to final delivery.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;The Flink application publishes a processed update for section-104 to an Amazon SNS Topic.&lt;/li&gt;
&lt;li&gt;SNS invokes a subscribed AWS Lambda function (the "Router Lambda").&lt;/li&gt;
&lt;li&gt;The Lambda function queries the Redis Director to get the set of pod identifiers subscribed to section-104.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;   redis.SMEMBERS("section:104:subscribers") -&amp;gt; returns { "pod-c-ip", "pod-f-ip" }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;The Lambda performs a direct, service-to-service push to the target pods via an internal API endpoint on each pod.&lt;/li&gt;
&lt;li&gt;The target pod, pod-C, receives this internal API call.&lt;/li&gt;
&lt;li&gt;pod-C performs a highly efficient O(1) lookup in its local in-memory reverse_map to get the list of specific client connectionIds.&lt;/li&gt;
&lt;li&gt;The pod's application iterates through this list and writes the message into the open SSE streams for each connection.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  Message Delivery Optimization Analysis
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Fanout Pattern Performance&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Direct Broadcast (Naive Approach)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1 message → 100K users = 100K individual WebSocket writes
Time Complexity: O(N) where N = number of subscribers
Memory Complexity: O(N) for connection management
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;SNS Fanout (Optimized Approach)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1 message → SNS → 100K Lambda invocations → 100K WebSocket writes
Time Complexity: O(1) for publish + O(N) for delivery
Memory Complexity: O(1) for SNS + O(N) for Lambda memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Batching Optimization&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;100 messages → Batch → 1K WebSocket writes
Time Complexity: O(N/K) where K = batch size
Memory Complexity: O(N/K) for batched connections
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Message Delivery Guarantees&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;At-Most-Once Delivery&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Implementation&lt;/strong&gt;: Fire-and-forget with no acknowledgments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Non-critical updates (seat availability changes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Highest throughput, lowest latency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trade-off&lt;/strong&gt;: Some messages may be lost&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;At-Least-Once Delivery&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Implementation&lt;/strong&gt;: Retry mechanism with acknowledgments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Critical updates (payment confirmations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Lower throughput, higher latency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trade-off&lt;/strong&gt;: Some messages may be duplicated&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Exactly-Once Delivery&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Implementation&lt;/strong&gt;: Idempotent processing with deduplication&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use Case&lt;/strong&gt;: Financial transactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Lowest throughput, highest latency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trade-off&lt;/strong&gt;: Highest complexity, highest reliability&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Part 3: Bulletproofing the System — A Deep Dive into Resilience
&lt;/h3&gt;

&lt;p&gt;A production-ready architecture must be designed explicitly for failure.&lt;/p&gt;

&lt;h4&gt;
  
  
  Resilience of the EKS Fleet
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pod Failure&lt;/strong&gt;: If a pod crashes, Kubernetes's ReplicaSet immediately launches a replacement. The ALB's health check will have already stopped routing traffic to the failed pod. Clients that were connected to it will trigger their automatic reconnect logic and will be seamlessly routed to a healthy pod by the ALB.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Node Failure&lt;/strong&gt;: The EKS worker nodes are managed by an Auto Scaling Group. If an EC2 instance fails, the ASG will terminate it and launch a replacement. Kubernetes will then automatically reschedule the affected pods onto the remaining healthy nodes in the cluster.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  Resilience of the Director (Redis Cluster)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primary Defense (High Availability)&lt;/strong&gt;: The Director is deployed as an Amazon ElastiCache for Redis cluster with Multi-AZ and automatic failover enabled. If the primary Redis node fails, ElastiCache automatically promotes a replica.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disaster Recovery (Graceful Degradation)&lt;/strong&gt;: In a total service failure, the Router Lambda can be programmed with a fallback. Upon detecting that Redis is unavailable, it could revert to a less efficient broadcast model or log the failure and wait for recovery.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Performance Benchmarks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Queue Management Performance Comparison
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Maximum QPS &amp;amp; Memory at 10,000 Concurrent Users:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;P95 Latency&lt;/th&gt;
&lt;th&gt;Max QPS&lt;/th&gt;
&lt;th&gt;Memory (10K users)&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Queue&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Redis ZSET&lt;/td&gt;
&lt;td&gt;5ms&lt;/td&gt;
&lt;td&gt;25,000&lt;/td&gt;
&lt;td&gt;400MB&lt;/td&gt;
&lt;td&gt;O(logN)&lt;/td&gt;
&lt;td&gt;Skip list implementation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Queue&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Redis List+Hash&lt;/td&gt;
&lt;td&gt;2ms&lt;/td&gt;
&lt;td&gt;50,000&lt;/td&gt;
&lt;td&gt;320MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Linked list + hash table&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Queue&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Database Queue&lt;/td&gt;
&lt;td&gt;50ms&lt;/td&gt;
&lt;td&gt;2,000&lt;/td&gt;
&lt;td&gt;1GB&lt;/td&gt;
&lt;td&gt;O(N)&lt;/td&gt;
&lt;td&gt;PostgreSQL-based&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Queue&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;In-Memory Array&lt;/td&gt;
&lt;td&gt;1ms&lt;/td&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;200MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Single-threaded only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reservation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pessimistic Lock&lt;/td&gt;
&lt;td&gt;30ms&lt;/td&gt;
&lt;td&gt;2,000&lt;/td&gt;
&lt;td&gt;800MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Strong consistency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reservation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Optimistic Lock&lt;/td&gt;
&lt;td&gt;15ms&lt;/td&gt;
&lt;td&gt;8,000&lt;/td&gt;
&lt;td&gt;600MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;High concurrency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Reservation&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Database Transaction&lt;/td&gt;
&lt;td&gt;40ms&lt;/td&gt;
&lt;td&gt;1,500&lt;/td&gt;
&lt;td&gt;500MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;ACID guarantees&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Notification&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Redis Pub/Sub&lt;/td&gt;
&lt;td&gt;2ms&lt;/td&gt;
&lt;td&gt;30,000&lt;/td&gt;
&lt;td&gt;200MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Fire-and-forget&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Notification&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;WebSocket Direct&lt;/td&gt;
&lt;td&gt;1ms&lt;/td&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;800MB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;Bidirectional, persistent connection&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Notification&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;SSE + SNS&lt;/td&gt;
&lt;td&gt;8ms&lt;/td&gt;
&lt;td&gt;15,000&lt;/td&gt;
&lt;td&gt;1.2GB&lt;/td&gt;
&lt;td&gt;O(1)&lt;/td&gt;
&lt;td&gt;AWS managed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Memory Usage Calculation Breakdown
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Performance at 10,000 Concurrent Users:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Queue Methods:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redis ZSET (400MB)&lt;/strong&gt;: 10K users × 32 bytes (user_id + score + skip list pointers) + 25% Redis overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis List+Hash (320MB)&lt;/strong&gt;: 10K users × 40 bytes (list + hash entries) + Redis metadata&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database Queue (1GB)&lt;/strong&gt;: 10K users × 100 bytes (PostgreSQL row) + connection pools + indexes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;In-Memory Array (200MB)&lt;/strong&gt;: 10K users × 8 bytes (integer user_id) + JVM overhead + GC buffers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reservation Methods:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pessimistic Lock (800MB)&lt;/strong&gt;: 10K concurrent locks × 1KB (lock metadata) + database buffers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Optimistic Lock (600MB)&lt;/strong&gt;: No lock overhead, just version numbers + database buffers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database Transaction (500MB)&lt;/strong&gt;: 10K transactions × 50 bytes (transaction state) + database buffers&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Notification Methods:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redis Pub/Sub (200MB)&lt;/strong&gt;: 10K QPS × 100 bytes (message overhead) + connection buffers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;WebSocket Direct (800MB)&lt;/strong&gt;: 10K connections × 8KB (connection state) + WebSocket server overhead&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SSE + SNS (1.2GB)&lt;/strong&gt;: 10K connections × 2KB (SSE state) + AWS Lambda + SNS message processing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Assumptions&lt;/strong&gt;: 10K concurrent users baseline, Redis overhead 20-25%, database connection pools, network buffers, AWS service overhead. Actual usage varies by implementation details and configuration.&lt;/p&gt;

&lt;h3&gt;
  
  
  Production Metrics
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;High-Demand Event Benchmarks&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Peak Queue Length&lt;/strong&gt;: 2M+ users waiting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Admission Rate&lt;/strong&gt;: 1,000 users/second (controlled)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reservation Success Rate&lt;/strong&gt;: 95%+ (5% timeout/abandonment)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notification Delivery&lt;/strong&gt;: 99.9% within 2 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System Recovery&lt;/strong&gt;: &amp;lt;30 seconds from Redis failure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Resource Utilization&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redis Cluster&lt;/strong&gt;: 3 nodes, 16GB RAM each, 50% CPU utilization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Database&lt;/strong&gt;: 8 cores, 32GB RAM, 70% CPU utilization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Notification Fleet&lt;/strong&gt;: 20 pods, 4GB RAM each, 60% CPU utilization&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Monitoring &amp;amp; Observability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Key Metrics
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Queue Management Metrics&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Queue Length&lt;/strong&gt;: Current number of users waiting&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Average Wait Time&lt;/strong&gt;: Mean time from join to admission&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Admission Rate&lt;/strong&gt;: Users admitted per second&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue Position Accuracy&lt;/strong&gt;: Consistency of position calculations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Redis Memory Usage&lt;/strong&gt;: Memory consumption by queue data structures&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reservation System Metrics&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reservation Success Rate&lt;/strong&gt;: Percentage of successful reservations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lock Contention&lt;/strong&gt;: Number of lock conflicts per second&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction Duration&lt;/strong&gt;: Average time for reservation transactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Abandonment Rate&lt;/strong&gt;: Percentage of users who don't complete payment&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seat Release Rate&lt;/strong&gt;: Expired reservations released per minute&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Real-Time Notification Metrics&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Message Delivery Rate&lt;/strong&gt;: Notifications delivered per second&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delivery Latency&lt;/strong&gt;: Time from event to user notification&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection Health&lt;/strong&gt;: Active WebSocket/SSE connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message Loss Rate&lt;/strong&gt;: Failed deliveries due to connection drops&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fanout Efficiency&lt;/strong&gt;: Messages per SNS publish&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Alerting Thresholds
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Critical Alerts&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Queue length &amp;gt; 1M users&lt;/li&gt;
&lt;li&gt;Reservation failure rate &amp;gt; 10%&lt;/li&gt;
&lt;li&gt;Notification delivery latency &amp;gt; 5 seconds&lt;/li&gt;
&lt;li&gt;Redis memory usage &amp;gt; 80%&lt;/li&gt;
&lt;li&gt;Database connection pool exhaustion&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Warning Alerts&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Queue length &amp;gt; 500K users&lt;/li&gt;
&lt;li&gt;Reservation failure rate &amp;gt; 5%&lt;/li&gt;
&lt;li&gt;Notification delivery latency &amp;gt; 2 seconds&lt;/li&gt;
&lt;li&gt;Redis memory usage &amp;gt; 60%&lt;/li&gt;
&lt;li&gt;Average wait time &amp;gt; 15 minutes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Channels&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Critical&lt;/strong&gt;: PagerDuty (immediate escalation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Warning&lt;/strong&gt;: Slack (team notification)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Info&lt;/strong&gt;: Email (daily reports)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Logging Strategy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Queue System Logs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User join/leave events with timestamps&lt;/li&gt;
&lt;li&gt;Position changes and ETA updates&lt;/li&gt;
&lt;li&gt;Redis operation performance&lt;/li&gt;
&lt;li&gt;Queue admission decisions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reservation System Logs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Lock acquisition/release events&lt;/li&gt;
&lt;li&gt;Transaction success/failure with reasons&lt;/li&gt;
&lt;li&gt;Seat status changes&lt;/li&gt;
&lt;li&gt;Payment timeout events&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Notification System Logs&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Message publish events&lt;/li&gt;
&lt;li&gt;Delivery confirmations&lt;/li&gt;
&lt;li&gt;Connection establishment/teardown&lt;/li&gt;
&lt;li&gt;Fanout performance metrics&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Disaster Recovery
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Failure Scenarios
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Queue System Failures&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Redis Cluster Failure&lt;/strong&gt;: 15-30 second recovery with automatic failover&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue Order Corruption&lt;/strong&gt;: 2-5 minute recovery using database reconstruction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Position Calculation Errors&lt;/strong&gt;: Immediate detection via monitoring, &amp;lt;1 minute fix&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Admission Rate Overload&lt;/strong&gt;: Automatic throttling, 30 second stabilization&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reservation System Failures&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database Connection Loss&lt;/strong&gt;: 10-20 second recovery with connection pooling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lock Deadlock&lt;/strong&gt;: Automatic detection and resolution in &amp;lt;5 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction Rollback&lt;/strong&gt;: Immediate cleanup, no data corruption&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seat Status Inconsistency&lt;/strong&gt;: Background reconciliation process&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Notification System Failures&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;WebSocket Connection Drops&lt;/strong&gt;: Automatic reconnection in &amp;lt;3 seconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SNS Service Outage&lt;/strong&gt;: Fallback to direct database polling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lambda Function Timeout&lt;/strong&gt;: Automatic retry with exponential backoff&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message Bus Failure&lt;/strong&gt;: Graceful degradation to polling mode&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Recovery Procedures
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Queue System Recovery&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Redis Failover&lt;/strong&gt;: Automatic promotion of replica to primary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue Reconstruction&lt;/strong&gt;: Rebuild from database using join_seq ordering&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Position Recalculation&lt;/strong&gt;: Update all user positions using head_seq tracking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Admission Resume&lt;/strong&gt;: Restart controlled admission at safe rate&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Reservation System Recovery&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Connection Pool Reset&lt;/strong&gt;: Clear failed connections and establish new ones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lock Cleanup&lt;/strong&gt;: Release any orphaned locks from failed transactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seat Status Audit&lt;/strong&gt;: Verify all seat statuses match reservation records&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction Log Replay&lt;/strong&gt;: Replay any committed transactions that weren't reflected&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Notification System Recovery&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Connection Re-establishment&lt;/strong&gt;: Reconnect all dropped WebSocket/SSE connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Message Replay&lt;/strong&gt;: Replay missed notifications from event log&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fanout Restart&lt;/strong&gt;: Resume SNS-based message distribution&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Client Notification&lt;/strong&gt;: Notify users of temporary service interruption&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Backup Strategy
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Queue State Backup&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Real-time Replication&lt;/strong&gt;: Continuous Redis replication to standby cluster&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event Log Backup&lt;/strong&gt;: All queue events stored in durable database (primary recovery mechanism)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Queue Reconstruction&lt;/strong&gt;: Rebuild queue from database events if Redis fails&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RTO/RPO&lt;/strong&gt;: 30 second recovery time, 5 second data loss maximum&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Reservation Data Backup&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Database Replication&lt;/strong&gt;: Real-time replication to standby database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transaction Log Backup&lt;/strong&gt;: Continuous backup of all reservation transactions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Seat Map Backup&lt;/strong&gt;: Daily backup of seat configuration and status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RTO/RPO&lt;/strong&gt;: 15 second recovery time, 1 second data loss maximum&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Notification System Backup&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Message Queue Backup&lt;/strong&gt;: SNS topic replication across regions&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Connection State Backup&lt;/strong&gt;: Periodic backup of active connections&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event Stream Backup&lt;/strong&gt;: All notification events stored in durable log&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RTO/RPO&lt;/strong&gt;: 45 second recovery time, 10 second data loss maximum&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h"&gt;&lt;strong&gt;Part 4&lt;/strong&gt;: Payments and ticket issuance (idempotency and consistency)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gemini.google.com/share/15d3d82059b5" rel="noopener noreferrer"&gt;Review by Google Gemini&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>database</category>
      <category>redis</category>
      <category>sse</category>
    </item>
    <item>
      <title>Part 2: Event Discovery</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Thu, 23 Oct 2025 17:46:01 +0000</pubDate>
      <link>https://dev.to/sumedhbala/part-2-event-discovery-32o5</link>
      <guid>https://dev.to/sumedhbala/part-2-event-discovery-32o5</guid>
      <description>&lt;h2&gt;
  
  
  Search &amp;amp; Indexing
&lt;/h2&gt;

&lt;p&gt;Event discovery is the foundation of any ticketing system. The goal is to let users quickly find concerts, sports matches, or theater events by name, location, or venue. While it looks simple from the outside, the design must handle millions of queries, abbreviations (LA vs. Los Angeles), nearby cities (Pasadena), and fuzzy matches (typos like San Franciso).&lt;/p&gt;

&lt;p&gt;This breakdown covers the baseline design first, followed by a deep dive into enhanced search capabilities.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Baseline Functional Solution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Services
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search API Service&lt;/strong&gt; – Handles user queries, normalizes input, executes searches, and returns ranked results.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indexing Service&lt;/strong&gt; – Updates the search index whenever new events are created or updated.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Event Management Service (EMS)&lt;/strong&gt; – Manages canonical event metadata and feeds it to the search index.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Databases &amp;amp; Caches
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Primary Database&lt;/strong&gt; (PostgreSQL / MySQL / DynamoDB) – Stores event metadata.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search Index&lt;/strong&gt; (Elasticsearch / OpenSearch / Solr) – Provides full-text and geospatial search.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache&lt;/strong&gt; (Redis / Memcached) – Stores trending searches and frequently accessed results.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Database Schema (Events Table)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;event_id&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="nb"&gt;BIGINT&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;primary&lt;/span&gt; &lt;span class="k"&gt;key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;event_name&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;Event&lt;/span&gt; &lt;span class="n"&gt;title&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"Taylor Swift Eras Tour"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;venue_name&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;Venue&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"SoFi Stadium"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;city_normalized&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;Canonical&lt;/span&gt; &lt;span class="n"&gt;city&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"Los Angeles"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;state&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="k"&gt;State&lt;/span&gt; &lt;span class="k"&gt;or&lt;/span&gt; &lt;span class="n"&gt;province&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"CA"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;Country&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="nv"&gt;"USA"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;location_lat&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;Venue&lt;/span&gt; &lt;span class="n"&gt;latitude&lt;/span&gt;
&lt;span class="n"&gt;location_lng&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;Venue&lt;/span&gt; &lt;span class="n"&gt;longitude&lt;/span&gt;
&lt;span class="n"&gt;event_start_time&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="k"&gt;Start&lt;/span&gt; &lt;span class="nb"&gt;datetime&lt;/span&gt;
&lt;span class="n"&gt;event_end_time&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="k"&gt;End&lt;/span&gt; &lt;span class="nb"&gt;datetime&lt;/span&gt;
&lt;span class="n"&gt;search_synonyms&lt;/span&gt; &lt;span class="err"&gt;–&lt;/span&gt; &lt;span class="n"&gt;JSON&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="n"&gt;array&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="n"&gt;synonyms&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;g&lt;/span&gt;&lt;span class="p"&gt;.,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;"LA"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"L.A."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;"Los Angeles"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  2. Why Indexes Differ by Column
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;event_id&lt;/strong&gt; → B-Tree index, fast lookup after search returns candidate IDs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;event_name, venue_name, search_synonyms&lt;/strong&gt; → Full-text or trigram indexes (support partial matches/typos).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;city_normalized, state, country&lt;/strong&gt; → B-Tree index (exact match queries).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;location_lat, location_lng&lt;/strong&gt; → GiST or SP-GiST index (PostGIS) for geo queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;event_start_time, event_end_time&lt;/strong&gt; → B-Tree index for range queries.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Composite Indexes&lt;/strong&gt; → (city_normalized, event_start_time) for common multi-column queries like "finding all upcoming events in Los Angeles, ordered by date."&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Takeaway&lt;/strong&gt;: Use the right index type per query pattern: B-Trees for equality/range, GIN/trigrams for text, GiST for geo, and composite for multi-column queries. PostgreSQL handles the system of record; Elasticsearch handles fuzzy candidate generation.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. APIs
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Search API (Basic)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Endpoint&lt;/strong&gt;: GET /search&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Query Parameters&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;q (string) → user query ("LA concerts")&lt;/li&gt;
&lt;li&gt;location (optional, lat/lng) → geo filtering&lt;/li&gt;
&lt;li&gt;radius (optional, default 30km)&lt;/li&gt;
&lt;li&gt;date_range (optional, start + end)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Response&lt;/strong&gt;: JSON object with an events array, each including:&lt;br&gt;
event_id, event_name, venue, city, state, country, event_start_time, location (lat/lng)&lt;/p&gt;
&lt;h2&gt;
  
  
  4. Example SQL Queries
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Find events in a city&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;city_normalized&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'Los Angeles'&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;event_start_time&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Find events within 30 km radius&lt;/strong&gt;:&lt;br&gt;
Use spherical distance function with lat/lng coordinates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Find events by name (text search)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;event_name&lt;/span&gt; &lt;span class="k"&gt;ILIKE&lt;/span&gt; &lt;span class="s1"&gt;'%Taylor Swift%'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  5. Problems Using PostgreSQL Alone
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Free-text search is slow at scale.&lt;/li&gt;
&lt;li&gt;Fuzzy/typo handling is limited.&lt;/li&gt;
&lt;li&gt;Synonyms/aliases require hacks.&lt;/li&gt;
&lt;li&gt;Ranking, autocomplete, aggregations are inefficient.&lt;/li&gt;
&lt;li&gt;High-QPS text search is not distributed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Why Elasticsearch/OpenSearch is Needed&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inverted indices → sub-linear text lookup&lt;/li&gt;
&lt;li&gt;Built-in typo tolerance, synonyms, stemming&lt;/li&gt;
&lt;li&gt;BM25 ranking &amp;amp; boosting&lt;/li&gt;
&lt;li&gt;Efficient geo + text + time queries&lt;/li&gt;
&lt;li&gt;Autocomplete and faceted counts&lt;/li&gt;
&lt;li&gt;Horizontally scalable and fault-tolerant&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Soundbite for interviews&lt;/strong&gt;: "Postgres is the source of truth, but breaks down for fuzzy search, synonyms, relevance, autocomplete, or scale. Production systems pair it with Elasticsearch/OpenSearch."&lt;/p&gt;

&lt;h2&gt;
  
  
  6. Enhancing User Experience
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Synonym Expansion&lt;/strong&gt;: Map "LA" → "Los Angeles" and expand queries accordingly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geolocation Queries&lt;/strong&gt;: Lat/lng filtering ensures nearby cities (e.g., Pasadena) appear.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Similarity Search&lt;/strong&gt;: Fuzzy matching using Elasticsearch analyzers or trigram indexes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;User Query Flow Example&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User types "LA concerts."&lt;/li&gt;
&lt;li&gt;Query normalized (lowercase, punctuation removed).&lt;/li&gt;
&lt;li&gt;Synonym expansion (LA → Los Angeles).&lt;/li&gt;
&lt;li&gt;Geo coordinates retrieved.&lt;/li&gt;
&lt;li&gt;Search index queried (text + geo filter).&lt;/li&gt;
&lt;li&gt;Results ranked (relevance, time, popularity, distance).&lt;/li&gt;
&lt;li&gt;Results cached if trending.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  7. Non-Functional Requirements
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Latency&lt;/strong&gt;: Redis caching → sub-100ms for popular searches&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Elasticsearch distributed indexing/sharding&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Availability&lt;/strong&gt;: Replicated DB + multi-AZ Elasticsearch&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consistency&lt;/strong&gt;: Event Management Service is source of truth; eventual consistency to search index&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  8. Deeper Dive: Fuzzy Search
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Scenario&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Dictionary: all cities (~200k–500k entries)&lt;/li&gt;
&lt;li&gt;User input: "San Franciso" (typo of "San Francisco")&lt;/li&gt;
&lt;li&gt;Goal: find the closest matching city efficiently, despite typos&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  8.1 Naive Levenshtein Distance
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Compare input string with every city in the dictionary&lt;/li&gt;
&lt;li&gt;Compute edit distance (insertions, deletions, substitutions) for each city&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Input: "San Franciso"&lt;/li&gt;
&lt;li&gt;"San Francisco" → edit distance = 2 → match&lt;/li&gt;
&lt;li&gt;"Los Angeles" → edit distance = 11 → ignored&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Data storage&lt;/strong&gt;: Simple array or list of city names (no preprocessing)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Time complexity&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Let n = number of cities, m = average city name length&lt;/li&gt;
&lt;li&gt;Computing edit distance per city: O(m²) &lt;em&gt;(see detailed explanation at end of document)&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;Total: O(n * m²) → very slow for large n&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;: Simple to implement&lt;br&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: Not scalable for real-time search with hundreds of thousands of entries&lt;/p&gt;
&lt;h3&gt;
  
  
  8.2 BK-Tree (Burkhard-Keller Tree)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: BK-Trees are designed for approximate string matching under a metric (commonly Levenshtein distance). They allow you to find all strings within a given "distance" efficiently, without scanning the entire dataset.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structure&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each node = a string (city name)&lt;/li&gt;
&lt;li&gt;Each edge = edit distance between parent and child&lt;/li&gt;
&lt;li&gt;If a child exists at that edge, insertion continues down that edge&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Root Choice&lt;/strong&gt;: The first inserted word becomes the root. Root choice doesn't affect correctness, but a "central" root can reduce tree depth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insertion Example (Detailed)&lt;/strong&gt;:&lt;br&gt;
Insertion order: Los Angeles → Pasadena → Glendale → San Jose → San Mateo → San Francisco → Fresno&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Insert Los Angeles → root&lt;/li&gt;
&lt;li&gt;Insert Pasadena → dist(Pasadena, Los Angeles) = 8 → add under Los Angeles&lt;/li&gt;
&lt;li&gt;Insert Glendale → dist(Glendale, Los Angeles) = 8 → edge exists → recurse into Pasadena → dist(Glendale, Pasadena) = 7 → add under Pasadena&lt;/li&gt;
&lt;li&gt;Insert San Jose → dist(San Jose, Los Angeles) = 11 → add under Los Angeles&lt;/li&gt;
&lt;li&gt;Insert San Mateo → dist(San Mateo, Los Angeles) = 11 → recurse into San Jose → dist(San Mateo, San Jose) = 4 → add under San Jose&lt;/li&gt;
&lt;li&gt;Insert San Francisco → dist(San Francisco, Los Angeles) = 10 → add under Los Angeles&lt;/li&gt;
&lt;li&gt;Insert Fresno → dist(Fresno, Los Angeles) = 10 → recurse into San Francisco → dist(Fresno, San Francisco) = 9 → add under San Francisco&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Final Tree&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Los Angeles──(8)──▶ Pasadena  ──(7)──▶ Glendale
              ──(10)──▶ San Francisco  ──(9)──▶ Fresno
              ──(11)──▶ San Jose       ──(4)──▶ San Mateo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Search Logic (Fuzzy Search)&lt;/strong&gt;:&lt;br&gt;
Query: "San Jsoe" (typo for San Jose), max edit distance = 2&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Start at Los Angeles → dist = 11 → allowed child edges [9,13]&lt;/li&gt;
&lt;li&gt;Children within range: San Francisco (10), San Jose (11)&lt;/li&gt;
&lt;li&gt;Check San Francisco → dist = 6 → prune branch&lt;/li&gt;
&lt;li&gt;Check San Jose → dist = 1 → ✅ Match found&lt;/li&gt;
&lt;li&gt;Explore San Jose's children → San Mateo → dist = 6 → prune&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Key Observations&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Branch pruning via [d-k, d+k] makes search efficient&lt;/li&gt;
&lt;li&gt;Chains form naturally when cities share distances&lt;/li&gt;
&lt;li&gt;Insertion order affects tree shape but not correctness&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;BK-Tree Complexity&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Distance computation per node: O(m²)&lt;/li&gt;
&lt;li&gt;Average case: O(log n) nodes visited&lt;/li&gt;
&lt;li&gt;Worst case: O(n) nodes visited&lt;/li&gt;
&lt;li&gt;Total search time: O(v * m²), v = nodes actually visited&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Intuition&lt;/strong&gt;: BK-Tree is like a binary tree where only relevant branches are explored, but each node comparison costs O(m²).&lt;/p&gt;
&lt;h3&gt;
  
  
  8.3 Trie + Levenshtein Automaton
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Store all city names in a Trie (prefix tree)&lt;/li&gt;
&lt;li&gt;Example: "San Francisco" → S → a → n → _ → F → r → …&lt;/li&gt;
&lt;li&gt;Use Levenshtein Automaton to traverse the Trie:

&lt;ul&gt;
&lt;li&gt;Track allowed edits at each step&lt;/li&gt;
&lt;li&gt;Prune paths exceeding threshold&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Time Complexity&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Worst-case: O(n)&lt;/li&gt;
&lt;li&gt;Average case with pruning: much less → efficient for large dictionaries&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;: Scales well, precise, dynamic typo handling&lt;br&gt;
&lt;strong&gt;Cons&lt;/strong&gt;: Higher implementation complexity than BK-Tree&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;Fuzzy Search with Trie + Levenshtein Automaton (Beginner-Friendly)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Suppose you have a list of city names:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;["San Francisco", "San Jose", "San Mateo", "Los Angeles", "Fresno"]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You want to search for &lt;code&gt;"San Jsoe"&lt;/code&gt; (typo) and still find &lt;code&gt;"San Jose"&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We can do this efficiently using &lt;strong&gt;two things together&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;Trie&lt;/strong&gt; (prefix tree) to store city names.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Levenshtein distance&lt;/strong&gt; to measure "how different" two strings are.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;1. Trie Basics&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A &lt;strong&gt;Trie&lt;/strong&gt; is a tree where each node is a letter. Words are formed by following a path from the root to a leaf.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Root
 |
 S
 |
 a
 |
 n
 |
(space)
 ├── F → "San Francisco"
 ├── J → "San Jose"
 └── M → "San Mateo"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Every path from root → leaf = a city name.&lt;/li&gt;
&lt;li&gt;We can quickly follow letters to see which words match a prefix.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;2. Levenshtein Distance Basics&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Levenshtein distance = &lt;strong&gt;minimum edits to turn one string into another&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Edit = insert, delete, or change a letter&lt;/li&gt;
&lt;li&gt;Example: &lt;code&gt;"San Jsoe"&lt;/code&gt; → &lt;code&gt;"San Jose"&lt;/code&gt; → 1 edit (swap &lt;code&gt;'s'&lt;/code&gt; and &lt;code&gt;'o'&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We allow a &lt;strong&gt;maximum distance&lt;/strong&gt; (e.g., 2) so we match words even with typos.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;3. How the Search Works&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Instead of comparing &lt;code&gt;"San Jsoe"&lt;/code&gt; with every city (slow), we combine &lt;strong&gt;Trie + Levenshtein Automaton&lt;/strong&gt;.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;At each Trie node, we maintain a &lt;strong&gt;distance row&lt;/strong&gt; representing &lt;strong&gt;edits needed&lt;/strong&gt; to match the query so far.&lt;/li&gt;
&lt;li&gt;Row length = query length + 1 = 8 + 1 = 9.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 0: Start at empty string&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Query = &lt;code&gt;"San Jsoe"&lt;/code&gt; (8 chars)&lt;br&gt;
Trie path = &lt;code&gt;""&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Distance row from empty string:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[0, 1, 2, 3, 4, 5, 6, 7, 8]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Index = number of query characters considered&lt;/li&gt;
&lt;li&gt;Value = edits needed to turn empty string → query prefix&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 1: Add first letter &lt;code&gt;'S'&lt;/code&gt; → Trie path &lt;code&gt;"S"&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Compute new row using formula:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;new_row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;previous_row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# deletion
&lt;/span&gt;    &lt;span class="n"&gt;new_row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;         &lt;span class="c1"&gt;# insertion
&lt;/span&gt;    &lt;span class="n"&gt;previous_row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;cost&lt;/span&gt;  &lt;span class="c1"&gt;# substitution (0 if match, 1 if not)
&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Example table:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;j&lt;/th&gt;
&lt;th&gt;Query prefix&lt;/th&gt;
&lt;th&gt;Compare 'S' vs query[j-1]&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Computation&lt;/th&gt;
&lt;th&gt;new_row[j]&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;""&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;0 + 1 = 1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;"S"&lt;/td&gt;
&lt;td&gt;S vs S&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;min(1+1,1+1,0+0)=0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;"Sa"&lt;/td&gt;
&lt;td&gt;S vs a&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(2+1,0+1,1+1)=1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;"San"&lt;/td&gt;
&lt;td&gt;S vs n&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(3+1,1+1,2+1)=2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;"San "&lt;/td&gt;
&lt;td&gt;S vs (space)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(4+1,2+1,3+1)=3&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;"San J"&lt;/td&gt;
&lt;td&gt;S vs J&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(5+1,3+1,4+1)=4&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;"San Js"&lt;/td&gt;
&lt;td&gt;S vs s&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(6+1,4+1,5+1)=5&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;"San Jso"&lt;/td&gt;
&lt;td&gt;S vs o&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(7+1,5+1,6+1)=6&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;"San Jsoe"&lt;/td&gt;
&lt;td&gt;S vs e&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(8+1,6+1,7+1)=7&lt;/td&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Resulting row:&lt;/strong&gt; &lt;code&gt;[1, 0, 1, 2, 3, 4, 5, 6, 7]&lt;/code&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 2: Add &lt;code&gt;'a'&lt;/code&gt; → Trie path &lt;code&gt;"Sa"&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Example table:&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;j&lt;/th&gt;
&lt;th&gt;Query prefix&lt;/th&gt;
&lt;th&gt;Compare 'a' vs query[j-1]&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;th&gt;Computation&lt;/th&gt;
&lt;th&gt;new_row[j]&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;""&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;td&gt;1 + 1 = 2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;"S"&lt;/td&gt;
&lt;td&gt;a vs S&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(0+1,2+1,1+1)=1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;"Sa"&lt;/td&gt;
&lt;td&gt;a vs a&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;min(1+1,1+1,0+0)=0&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;"San"&lt;/td&gt;
&lt;td&gt;a vs n&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(2+1,0+1,1+1)=1&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;"San "&lt;/td&gt;
&lt;td&gt;a vs (space)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(3+1,1+1,2+1)=2&lt;/td&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;"San J"&lt;/td&gt;
&lt;td&gt;a vs J&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(4+1,2+1,3+1)=3&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;"San Js"&lt;/td&gt;
&lt;td&gt;a vs s&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(5+1,3+1,4+1)=4&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;"San Jso"&lt;/td&gt;
&lt;td&gt;a vs o&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(6+1,4+1,5+1)=5&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;"San Jsoe"&lt;/td&gt;
&lt;td&gt;a vs e&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;min(7+1,5+1,6+1)=6&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Resulting row:&lt;/strong&gt; &lt;code&gt;[2, 1, 0, 1, 2, 3, 4, 5, 6]&lt;/code&gt;&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 3: Add &lt;code&gt;'n'&lt;/code&gt; → Trie path &lt;code&gt;"San"&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Distance row:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[3, 2, 1, 0, 1, 2, 3, 4, 5]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Minimum distance = 0 → path &lt;code&gt;"San"&lt;/code&gt; matches query &lt;code&gt;"San"&lt;/code&gt; perfectly so far.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 4: Add space &lt;code&gt;' '&lt;/code&gt; → Trie path &lt;code&gt;"San "&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Distance row:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[4, 2, 1, 1, 0, 1, 2, 3, 4]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;Minimum distance = 0 → &lt;code&gt;"San "&lt;/code&gt; matches query so far&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 5: Continue down &lt;code&gt;"San J"&lt;/code&gt; → &lt;code&gt;"San Jose"&lt;/code&gt;&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Add &lt;code&gt;'J'&lt;/code&gt; → distance row: &lt;code&gt;[5,4,3,2,1,0,1,2,3]&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add &lt;code&gt;'o'&lt;/code&gt; → &lt;code&gt;[6,5,4,3,2,1,1,2,3]&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add &lt;code&gt;'s'&lt;/code&gt; → &lt;code&gt;[7,6,5,4,3,2,1,1,2]&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Add &lt;code&gt;'e'&lt;/code&gt; → &lt;code&gt;[8,7,6,5,4,3,2,1,1]&lt;/code&gt; ✅&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Final distance = 1 → &lt;strong&gt;MATCH FOUND&lt;/strong&gt; &lt;code&gt;"San Jose"&lt;/code&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 6: Early Pruning&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;"San Francisco"&lt;/code&gt; → distance eventually &amp;gt; 2 → stop exploring&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"San Jose"&lt;/code&gt; → distance ≤ 2 → match&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;"San Mateo"&lt;/code&gt; → distance &amp;gt; 2 → stop&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Step 7: Intuition&lt;/strong&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;Walk down the Trie (letter by letter).&lt;/li&gt;
&lt;li&gt;Track minimum edits to match the query at each step (distance row).&lt;/li&gt;
&lt;li&gt;Stop exploring paths where edits exceed max allowed distance.&lt;/li&gt;
&lt;li&gt;If a complete word reaches end node within max edits → success!&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;This method powers &lt;strong&gt;fuzzy search&lt;/strong&gt;, &lt;strong&gt;autocomplete&lt;/strong&gt;, and &lt;strong&gt;spell correction&lt;/strong&gt; in real systems.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Key Advantages&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Early Pruning&lt;/strong&gt;: Stop exploring paths when distance exceeds threshold&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shared Prefixes&lt;/strong&gt;: Multiple cities with same prefix share computation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incremental Updates&lt;/strong&gt;: Distance matrix updates are O(m) per character&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory Efficient&lt;/strong&gt;: Only store one distance matrix per search path&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;6. Performance Comparison&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Time Complexity&lt;/th&gt;
&lt;th&gt;Space&lt;/th&gt;
&lt;th&gt;Implementation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Naive Levenshtein&lt;/td&gt;
&lt;td&gt;O(n × m²)&lt;/td&gt;
&lt;td&gt;O(m²)&lt;/td&gt;
&lt;td&gt;Simple&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;BK-Tree&lt;/td&gt;
&lt;td&gt;O(log n × m²)&lt;/td&gt;
&lt;td&gt;O(n)&lt;/td&gt;
&lt;td&gt;Medium&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Trie + Levenshtein Automaton&lt;/td&gt;
&lt;td&gt;O(n × m)&lt;/td&gt;
&lt;td&gt;O(n + m)&lt;/td&gt;
&lt;td&gt;Complex&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;7. When to Use&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Large dictionaries&lt;/strong&gt; (100k+ entries)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Frequent fuzzy searches&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory constraints&lt;/strong&gt; (trie is more compact than BK-tree)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Need for prefix-based filtering&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Trie + Levenshtein Automaton approach is the most sophisticated but also the most efficient for large-scale fuzzy string matching in production systems.&lt;/p&gt;

&lt;h3&gt;
  
  
  8.4 Trigram Index / Elasticsearch Approach
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;How it works&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Create Trigrams: Break each city name into 3-character overlapping sequences

&lt;ul&gt;
&lt;li&gt;Example: "San Francisco" → ["San", "an ", "n F", " Fr", "Fra", "ran", "anc", "nci", "cis", "isc", "sco"]&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Build Inverted Index: Each trigram → list of city IDs

&lt;ul&gt;
&lt;li&gt;Example: "Fra" → [San Francisco, Frankfurt]&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;Querying: Input "San Franciso" → generate trigrams → retrieve all cities containing at least one trigram → compute similarity score → return top matches&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Pros&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scales to millions of entries&lt;/li&gt;
&lt;li&gt;Handles typos, partial matches, multi-word queries&lt;/li&gt;
&lt;li&gt;Produces ranked results for UX&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Cons&lt;/strong&gt;: Requires search engine infrastructure, slightly more memory overhead&lt;/p&gt;

&lt;h2&gt;
  
  
  9. How Elasticsearch Ranks Results: BM25 + Boosting
&lt;/h2&gt;

&lt;h3&gt;
  
  
  BM25 (Best Match 25)
&lt;/h3&gt;

&lt;p&gt;Statistical ranking algorithm (improvement on TF-IDF)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key ingredients&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Term Frequency (TF)&lt;/strong&gt; → frequency of term in field, diminishing returns&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inverse Document Frequency (IDF)&lt;/strong&gt; → rare terms weigh more&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field Length Normalization&lt;/strong&gt; → shorter fields get higher weight&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Boosting
&lt;/h3&gt;

&lt;p&gt;Manually tweak ranking&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Field Boosting&lt;/strong&gt;: event_name^3, artist_name^2, city^1&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Boosting&lt;/strong&gt;: recency, proximity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business Logic Boosting&lt;/strong&gt;: trending events, sponsored events&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Interview Takeaway&lt;/strong&gt;: BM25 + Boosting ensures results are relevant (BM25) and aligned with business/user intent (boosting).&lt;/p&gt;

&lt;h2&gt;
  
  
  10. Indexing Every PostgreSQL Field: Pros &amp;amp; Cons
&lt;/h2&gt;

&lt;p&gt;When designing a large-scale ticketing system, deciding which fields to index in PostgreSQL is crucial, especially when pairing it with a specialized search engine like Elasticsearch.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pros of PostgreSQL Indexing:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Backup System&lt;/strong&gt;: PostgreSQL can act as a reliable fallback if Elasticsearch experiences an outage, ensuring continuous service.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Direct Access for Specific Queries&lt;/strong&gt;: For certain precise queries or filters (e.g., direct lookups by event_id), direct access through PostgreSQL indexes can be significantly faster.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strong Consistency &amp;amp; Data Integrity&lt;/strong&gt;: Indexes on crucial fields (like event_id, often as a primary key) are paramount for enforcing strong consistency and data integrity. While the guarantee of integrity comes from database constraints (e.g., PRIMARY KEY, UNIQUE) and transactional properties, indexes are essential for the efficient and scalable enforcement of these guarantees. They enable the database to quickly check for uniqueness and relationships, preventing issues like duplicate events, even when the events table contains millions of records. This is a core requirement that search engines are not designed to fulfill.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Simplified Read-Your-Own-Writes&lt;/strong&gt;: Immediate query results for newly created or updated events are possible directly from PostgreSQL, providing consistent results for the user who made the change.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cons of PostgreSQL Indexing:
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Increased Write Overhead&lt;/strong&gt;: More indexes generally lead to slower insert and update operations, as each index needs to be maintained.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Higher Storage Costs&lt;/strong&gt;: Each index consumes additional disk space, increasing overall storage expenses.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inefficient for Complex Queries&lt;/strong&gt;: PostgreSQL's built-in text and fuzzy search capabilities are less efficient at scale compared to specialized search engines like Elasticsearch.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Maintenance Complexity&lt;/strong&gt;: Managing and tuning a large number of indexes can add to operational overhead.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Recommendation for Ticketing Systems: Strategic PostgreSQL Indexing
&lt;/h3&gt;

&lt;p&gt;While it's not necessary to index every field in PostgreSQL, it plays a vital role. The primary responsibility for high-volume, complex search queries is offloaded to Elasticsearch, which is horizontally scalable and optimized for such tasks.&lt;/p&gt;

&lt;p&gt;PostgreSQL indexes serve two critical, complementary roles in this architecture:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Ensuring Data Integrity and Transactional Consistency&lt;/strong&gt;: Indexes on crucial fields like event_id (typically as a primary key) are paramount. They enable the database to efficiently enforce UNIQUE and PRIMARY KEY constraints, ensuring that event data is always correct and preventing issues like duplicate event entries. This is a core requirement that search engines cannot guarantee.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Providing a Failsafe&lt;/strong&gt;: Event creation and updates are typically low-volume operations in a ticketing system (compared to search queries). Therefore, the write overhead incurred by maintaining additional PostgreSQL indexes is acceptable. This investment ensures a robust and consistent backup of critical event data, providing a crucial safety net in the event of a search engine outage. The key here is that writes to the events table are very infrequent (as very few events are added relative to read operations), so the overhead of more indexes does not pose a significant problem in terms of performance.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Soundbite for Interview&lt;/strong&gt;: "Mentioning PostgreSQL indexing demonstrates you've considered distributed system failure modes and have a plan B. Downsides are outweighed by reliability, consistency, and data integrity."&lt;/p&gt;

&lt;h2&gt;
  
  
  11. Indexing Service &amp;amp; Event Management Service
&lt;/h2&gt;

&lt;p&gt;These services ensure search data is accurate, timely, and scalable, maintaining a strong connection between canonical event data in PostgreSQL and the search index in Elasticsearch/OpenSearch.&lt;/p&gt;

&lt;h3&gt;
  
  
  11.1 Event Management Service (EMS)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: EMS is the source of truth for all event metadata. Responsible for creating, updating, and managing canonical event records.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Responsibilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Manage all event-related data:

&lt;ul&gt;
&lt;li&gt;Event name, venue, city, state, country&lt;/li&gt;
&lt;li&gt;Start and end times&lt;/li&gt;
&lt;li&gt;Event capacity and ticket types&lt;/li&gt;
&lt;li&gt;Synonyms for search queries (e.g., LA → Los Angeles)&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Validate incoming event data for consistency and correctness&lt;/li&gt;

&lt;li&gt;Track changes via event versioning or timestamps&lt;/li&gt;

&lt;li&gt;Emit notifications for downstream systems when events are created or updated&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Considerations for High Scale&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Use atomic transactions in PostgreSQL to ensure consistency&lt;/li&gt;
&lt;li&gt;Implement soft deletes to prevent broken references in search index&lt;/li&gt;
&lt;li&gt;Emit event change logs to message broker (Kafka, RabbitMQ) for asynchronous processing by Indexing Service&lt;/li&gt;
&lt;li&gt;Ensure idempotency in updates to avoid duplicate indexing&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Flow&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Event created/updated in PostgreSQL&lt;/li&gt;
&lt;li&gt;EMS validates the record&lt;/li&gt;
&lt;li&gt;EMS publishes event notification (message queue / Kafka topic)&lt;/li&gt;
&lt;li&gt;Indexing Service consumes the message to update the search index&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  11.2 Indexing Service
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Purpose&lt;/strong&gt;: Keep the search index up-to-date with canonical data for fast, accurate results&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Responsibilities&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Consume event change notifications from EMS&lt;/li&gt;
&lt;li&gt;Transform data into search-friendly format:

&lt;ul&gt;
&lt;li&gt;Text fields → tokenized&lt;/li&gt;
&lt;li&gt;Locations → geo-points&lt;/li&gt;
&lt;li&gt;Dates → sortable timestamps&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;Update/delete documents in Elasticsearch/OpenSearch&lt;/li&gt;

&lt;li&gt;Handle batch updates and real-time streaming&lt;/li&gt;

&lt;li&gt;Maintain versioning to detect stale updates&lt;/li&gt;

&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Implementation Details&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Message Queue Consumption&lt;/strong&gt;: Kafka, RabbitMQ, or Kinesis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Idempotent Updates&lt;/strong&gt;: Each event has a unique ID&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bulk Indexing&lt;/strong&gt;: For new events or large updates&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Error Handling&lt;/strong&gt;: Failed updates retried, dead-letter queue stores permanently failed updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Consistency Model&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Elasticsearch is eventually consistent&lt;/li&gt;
&lt;li&gt;EMS remains the source of truth&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example Flow&lt;/strong&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Receive "event_created" message from EMS&lt;/li&gt;
&lt;li&gt;Transform event record (normalize names, add synonyms, compute geo coordinates)&lt;/li&gt;
&lt;li&gt;Push to Elasticsearch via bulk API&lt;/li&gt;
&lt;li&gt;Confirm update or retry if it fails&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  11.3 Architecture &amp;amp; Reliability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Decoupling&lt;/strong&gt;: EMS and Indexing Service communicate via message queues → prevents blocking writes in Postgres during indexing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resilience&lt;/strong&gt;: Retry mechanisms and dead-letter queues for failures&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scalability&lt;/strong&gt;: Multiple consumers can scale horizontally for high volumes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Observability&lt;/strong&gt;: Metrics (throughput, error rates, lag), logging full trace (event_id, timestamp, status)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  11.4 Interview Takeaways
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;EMS ensures canonical, consistent event data&lt;/li&gt;
&lt;li&gt;Indexing Service keeps search index current and performant&lt;/li&gt;
&lt;li&gt;Decoupled, message-driven architecture → scalable &amp;amp; fault-tolerant&lt;/li&gt;
&lt;li&gt;Idempotent + bulk operations → reliability under high load&lt;/li&gt;
&lt;li&gt;Plan B for indexing failures demonstrates awareness of real-world distributed system design&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  12. Performance Benchmarks
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Search Performance Comparison
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Category&lt;/th&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;P95 Latency&lt;/th&gt;
&lt;th&gt;QPS&lt;/th&gt;
&lt;th&gt;Memory&lt;/th&gt;
&lt;th&gt;Notes&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DB-native&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL ILIKE&lt;/td&gt;
&lt;td&gt;200ms&lt;/td&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;2GB&lt;/td&gt;
&lt;td&gt;Exact text matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;DB-native&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;PostgreSQL Trigram&lt;/td&gt;
&lt;td&gt;500ms&lt;/td&gt;
&lt;td&gt;500&lt;/td&gt;
&lt;td&gt;4GB&lt;/td&gt;
&lt;td&gt;Fuzzy text search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Elasticsearch (Fuzzy)&lt;/td&gt;
&lt;td&gt;50ms&lt;/td&gt;
&lt;td&gt;10,000+&lt;/td&gt;
&lt;td&gt;8GB&lt;/td&gt;
&lt;td&gt;Scalable fuzzy search&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Elasticsearch (Regex)&lt;/td&gt;
&lt;td&gt;80ms&lt;/td&gt;
&lt;td&gt;8,000+&lt;/td&gt;
&lt;td&gt;10GB&lt;/td&gt;
&lt;td&gt;Regex/wildcard queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;OpenSearch&lt;/td&gt;
&lt;td&gt;60ms&lt;/td&gt;
&lt;td&gt;9,000+&lt;/td&gt;
&lt;td&gt;8GB&lt;/td&gt;
&lt;td&gt;AWS-managed Elasticsearch&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Search Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Solr&lt;/td&gt;
&lt;td&gt;70ms&lt;/td&gt;
&lt;td&gt;8,500+&lt;/td&gt;
&lt;td&gt;9GB&lt;/td&gt;
&lt;td&gt;Enterprise search platform&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Algorithm&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;BK-Tree&lt;/td&gt;
&lt;td&gt;20ms&lt;/td&gt;
&lt;td&gt;5,000&lt;/td&gt;
&lt;td&gt;16GB&lt;/td&gt;
&lt;td&gt;Specialized fuzzy matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Algorithm&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Trie + Levenshtein&lt;/td&gt;
&lt;td&gt;10ms&lt;/td&gt;
&lt;td&gt;15,000+&lt;/td&gt;
&lt;td&gt;6GB&lt;/td&gt;
&lt;td&gt;Efficient fuzzy matching&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cache&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Redis Cached&lt;/td&gt;
&lt;td&gt;5ms&lt;/td&gt;
&lt;td&gt;50,000+&lt;/td&gt;
&lt;td&gt;4GB&lt;/td&gt;
&lt;td&gt;Hot queries only&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why Trigram Indexes Excel at Complex Queries
&lt;/h3&gt;

&lt;p&gt;While Trie + Levenshtein Automaton is faster for simple fuzzy matching, &lt;strong&gt;Trigram indexes are superior for complex queries&lt;/strong&gt; because they handle multiple search patterns simultaneously. Trigram indexes break text into overlapping 3-character sequences, allowing them to efficiently process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Multi-word queries&lt;/strong&gt;: "Taylor Swift concert" matches documents containing "Taylor", "Swift", and "concert" even with typos&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Partial matches&lt;/strong&gt;: "San Fran*" matches "San Francisco", "San Fernando", "San Francisco Bay"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Regex patterns&lt;/strong&gt;: Complex regular expressions like &lt;code&gt;[A-Z][a-z]+ [A-Z][a-z]+&lt;/code&gt; for proper names&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Wildcard queries&lt;/strong&gt;: "conc*" matches "concert", "conference", "conclusion"&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Phrase proximity&lt;/strong&gt;: Finding documents where "Taylor" and "Swift" appear within 5 words of each other&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Performance advantage&lt;/strong&gt;: Trigram indexes pre-compute all possible 3-character combinations, making complex pattern matching O(1) lookup time instead of O(n×m²) for each query. This is why Elasticsearch uses trigrams for regex/wildcard queries despite higher memory overhead - the complexity of pattern matching makes the trade-off worthwhile.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Real-world example&lt;/strong&gt;: A query like "LA concerts this weekend" with typos becomes much more efficient with trigram indexing because it can simultaneously match "LA" (city), "concert*" (event type), and "weekend" (time) across multiple fields and handle variations in each term.&lt;/p&gt;

&lt;h3&gt;
  
  
  Production Metrics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Cluster&lt;/strong&gt;: 3-node, 16GB RAM, 500GB index&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query Latency&lt;/strong&gt;: P50: 15ms, P95: 45ms, P99: 100ms&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Throughput&lt;/strong&gt;: 8,000 QPS sustained, 15,000 QPS burst&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Hit Rate&lt;/strong&gt;: 85% for trending searches&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  13. Caching Strategy
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Multi-Layer Architecture
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;CDN Cache&lt;/strong&gt; (CloudFlare) - Static content, 24h TTL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application Cache&lt;/strong&gt; (Redis) - Query results, 5-15min TTL&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch Query Cache&lt;/strong&gt; - Search index cache&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Cache Layers
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Query Results&lt;/strong&gt;: Popular searches cached for 15 minutes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Autocomplete&lt;/strong&gt;: City/venue suggestions cached for 1 hour&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Geo Data&lt;/strong&gt;: Location lookups cached for 24 hours&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trending Searches&lt;/strong&gt;: Top queries cached for 1 hour&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cache Invalidation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Event Updates&lt;/strong&gt;: Invalidate related searches by city/venue&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Time-based&lt;/strong&gt;: Different TTLs based on data volatility&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Warming&lt;/strong&gt;: Pre-populate trending searches during off-peak&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  14. Monitoring &amp;amp; Observability
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Key Metrics
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search Performance&lt;/strong&gt;: Latency (P50/P95/P99), success rate, error rate&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;System Health&lt;/strong&gt;: Elasticsearch cluster status, node health, shard status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Business Metrics&lt;/strong&gt;: Search volume, popular queries, user engagement&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cache Performance&lt;/strong&gt;: Hit rate, memory usage, eviction rate&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Alerting Thresholds
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Critical&lt;/strong&gt;: Search latency &amp;gt; 100ms, error rate &amp;gt; 5%, cluster health = Red&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Warning&lt;/strong&gt;: Search latency &amp;gt; 50ms, error rate &amp;gt; 1%, cluster health = Yellow&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Channels&lt;/strong&gt;: PagerDuty (critical), Slack (warnings), Email (reports)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Logging
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search Queries&lt;/strong&gt;: Query, user, latency, results count, cache hit status&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Errors&lt;/strong&gt;: Timeout, connection failures, retry attempts&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt;: Elasticsearch response times, cache hit rates&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  15. Disaster Recovery
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Failure Scenarios
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Single Node&lt;/strong&gt;: 0 downtime, automatic shard rebalancing&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cluster Failure&lt;/strong&gt;: 15-30 min downtime, restore from snapshots&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data Corruption&lt;/strong&gt;: 2-4 hours downtime, full reindex from PostgreSQL&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Recovery Procedures
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cluster Health Check&lt;/strong&gt;: Verify cluster status and shard allocation&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Snapshot Restore&lt;/strong&gt;: Restore from daily snapshots stored in S3&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Full Rebuild&lt;/strong&gt;: Reindex entire dataset from PostgreSQL source&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incremental Sync&lt;/strong&gt;: Catch up on missed updates using timestamps&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Fallback Strategies
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PostgreSQL Fallback&lt;/strong&gt;: Use ILIKE queries with reduced functionality&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read-Only Mode&lt;/strong&gt;: Serve cached results only during outages&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Graceful Degradation&lt;/strong&gt;: Show maintenance message with trending events&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Backup Strategy
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Daily Snapshots&lt;/strong&gt;: Automated daily backups to S3&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Region Replication&lt;/strong&gt;: Real-time replication to backup cluster&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;RTO/RPO&lt;/strong&gt;: 15-30 min recovery time, 15 min data loss maximum&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Appendix: Why Edit Distance is O(m²)
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;Levenshtein distance&lt;/strong&gt; (edit distance) has &lt;strong&gt;O(m²)&lt;/strong&gt; time complexity because it uses a &lt;strong&gt;dynamic programming approach&lt;/strong&gt; with a &lt;strong&gt;2D table&lt;/strong&gt; where m is the length of the strings being compared.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Algorithm Structure
&lt;/h3&gt;

&lt;p&gt;For two strings of length &lt;code&gt;m&lt;/code&gt; each, the algorithm creates a &lt;strong&gt;m × m table&lt;/strong&gt; and fills it using the following recurrence relation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;levenshtein_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Create (m+1) × (n+1) table
&lt;/span&gt;    &lt;span class="n"&gt;dp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;

    &lt;span class="c1"&gt;# Base cases
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;

    &lt;span class="c1"&gt;# Fill the table
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;s1&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;s2&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
                &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;  &lt;span class="c1"&gt;# No operation needed
&lt;/span&gt;            &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;      &lt;span class="c1"&gt;# Deletion
&lt;/span&gt;                    &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;      &lt;span class="c1"&gt;# Insertion  
&lt;/span&gt;                    &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;     &lt;span class="c1"&gt;# Substitution
&lt;/span&gt;                &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Understanding the Base Cases
&lt;/h3&gt;

&lt;p&gt;The base cases initialize the first row and first column of the DP table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Base cases
&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;dp&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What these base cases represent:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;dp[i][0] = i&lt;/code&gt;&lt;/strong&gt; (First column): &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Converting string1[0...i-1] to an empty string&lt;/li&gt;
&lt;li&gt;Requires &lt;strong&gt;i deletions&lt;/strong&gt; (delete all i characters)&lt;/li&gt;
&lt;li&gt;Example: "San Francisco" → "" needs 13 deletions&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;&lt;code&gt;dp[0][j] = j&lt;/code&gt;&lt;/strong&gt; (First row):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Converting empty string to string2[0...j-1] &lt;/li&gt;
&lt;li&gt;Requires &lt;strong&gt;j insertions&lt;/strong&gt; (insert all j characters)&lt;/li&gt;
&lt;li&gt;Example: "" → "San Franciso" needs 12 insertions&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Visual representation:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;     ""  S  a  n     F  r  a  n  c  i  s  o
""  [0][1][2][3][4][5][6][7][8][9][10][11][12]
S   [1][?][?][?][?][?][?][?][?][?][?][?][?]
a   [2][?][?][?][?][?][?][?][?][?][?][?][?]
n   [3][?][?][?][?][?][?][?][?][?][?][?][?]
    [4][?][?][?][?][?][?][?][?][?][?][?][?]
F   [5][?][?][?][?][?][?][?][?][?][?][?][?]
r   [6][?][?][?][?][?][?][?][?][?][?][?][?]
a   [7][?][?][?][?][?][?][?][?][?][?][?][?]
n   [8][?][?][?][?][?][?][?][?][?][?][?][?]
c   [9][?][?][?][?][?][?][?][?][?][?][?][?]
i   [10][?][?][?][?][?][?][?][?][?][?][?][?]
s   [11][?][?][?][?][?][?][?][?][?][?][?][?]
c   [12][?][?][?][?][?][?][?][?][?][?][?][?]
o   [13][?][?][?][?][?][?][?][?][?][?][?][?]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Why these base cases make sense:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Row 0, Column 0&lt;/strong&gt;: &lt;code&gt;dp[0][0] = 0&lt;/code&gt; (empty string to empty string = 0 operations)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Row 0, Column j&lt;/strong&gt;: &lt;code&gt;dp[0][j] = j&lt;/code&gt; (empty string to j-character string = j insertions)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Row i, Column 0&lt;/strong&gt;: &lt;code&gt;dp[i][0] = i&lt;/code&gt; (i-character string to empty string = i deletions)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These base cases provide the foundation for the dynamic programming recurrence relation that fills the rest of the table.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why O(m²)
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Table Size&lt;/strong&gt;: We create a table of size &lt;code&gt;(m+1) × (n+1)&lt;/code&gt;, which is approximately &lt;code&gt;m × m&lt;/code&gt; when strings are similar length&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Nested Loops&lt;/strong&gt;: We have two nested loops:

&lt;ul&gt;
&lt;li&gt;Outer loop: &lt;code&gt;for i in range(1, m + 1)&lt;/code&gt; → &lt;strong&gt;O(m) iterations&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Inner loop: &lt;code&gt;for j in range(1, n + 1)&lt;/code&gt; → &lt;strong&gt;O(m) iterations&lt;/strong&gt; (when n ≈ m)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total Operations&lt;/strong&gt;: &lt;code&gt;O(m) × O(m) = O(m²)&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Visual Example
&lt;/h3&gt;

&lt;p&gt;For strings "San Francisco" (m=13) and "San Franciso" (m=12):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;     S  a  n     F  r  a  n  c  i  s  o
   [0][1][2][3][4][5][6][7][8][9][10][11][12]
S  [1][0][1][2][3][4][5][6][7][8][9][10][11]
a  [2][1][0][1][2][3][4][5][6][7][8][9][10]
n  [3][2][1][0][1][2][3][4][5][6][7][8][9]
   [4][3][2][1][0][1][2][3][4][5][6][7][8]
F  [5][4][3][2][1][0][1][2][3][4][5][6][7]
r  [6][5][4][3][2][1][0][1][2][3][4][5][6]
a  [7][6][5][4][3][2][1][0][1][2][3][4][5]
n  [8][7][6][5][4][3][2][1][0][1][2][3][4]
c  [9][8][7][6][5][4][3][2][1][0][1][2][3]
i  [10][9][8][7][6][5][4][3][2][1][0][1][2]
s  [11][10][9][8][7][6][5][4][3][2][1][0][1]
c  [12][11][10][9][8][7][6][5][4][3][2][1][0]
o  [13][12][11][10][9][8][7][6][5][4][3][2][1]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Each cell requires constant time to compute&lt;/strong&gt;, but we need to fill &lt;strong&gt;13 × 12 = 156 cells&lt;/strong&gt;, which is &lt;strong&gt;O(m²)&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Matters in the Ticketing System
&lt;/h3&gt;

&lt;p&gt;In the context of the ticketing system:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;City Names&lt;/strong&gt;: "San Francisco" vs "San Franciso" (typo)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Scale Problem&lt;/strong&gt;: With 200k-500k cities, computing edit distance for each one would be:

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Per city&lt;/strong&gt;: O(m²) where m ≈ 10-15 characters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Total&lt;/strong&gt;: O(n × m²) = O(500,000 × 15²) = O(112.5 million operations)&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance Impact&lt;/strong&gt;: This is why the document mentions it's "very slow for large n"&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Alternative Solutions
&lt;/h3&gt;

&lt;p&gt;This is exactly why the document suggests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;BK-Trees&lt;/strong&gt;: O(log n) nodes visited, but each comparison still costs O(m²)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trie + Levenshtein Automaton&lt;/strong&gt;: More complex but can be optimized&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Trigram Indexing&lt;/strong&gt;: O(1) lookup time for approximate matches&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;O(m²)&lt;/strong&gt; complexity is inherent to the edit distance algorithm itself - it's the mathematical cost of computing the minimum number of operations needed to transform one string into another.&lt;/p&gt;

&lt;h2&gt;
  
  
  Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-3-seat-management-nn2"&gt;&lt;strong&gt;Part 3&lt;/strong&gt;: Seat selection and reservation (concurrency control)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h"&gt;&lt;strong&gt;Part 4&lt;/strong&gt;: Payments and ticket issuance (idempotency and consistency)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gemini.google.com/share/15d3d82059b5" rel="noopener noreferrer"&gt;Review by Google Gemini&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>systemdesign</category>
      <category>elasticsearch</category>
      <category>algorithms</category>
    </item>
    <item>
      <title>Designing a Large-Scale Ticketing System</title>
      <dc:creator>Sumedh Bala</dc:creator>
      <pubDate>Thu, 23 Oct 2025 17:41:50 +0000</pubDate>
      <link>https://dev.to/sumedhbala/designing-a-large-scale-ticketing-system-2h1c</link>
      <guid>https://dev.to/sumedhbala/designing-a-large-scale-ticketing-system-2h1c</guid>
      <description>&lt;h2&gt;
  
  
  Introduction
&lt;/h2&gt;

&lt;p&gt;Inspired by &lt;a href="https://www.youtube.com/watch?v=fhdPyoO6aXI&amp;amp;t=48s" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=fhdPyoO6aXI&amp;amp;t=48s&lt;/a&gt;&lt;br&gt;
Ticketing systems such as Ticketmaster or Eventbrite appear simple on the surface: a user finds an event, selects a seat, and completes the purchase. To a user, it feels like a straightforward transaction. In reality, these systems are among the most challenging distributed systems to design because they must handle extreme concurrency, strict consistency requirements, fraud prevention, and unpredictable demand spikes.&lt;/p&gt;

&lt;p&gt;This series will walk through the design of a ticketing system step by step, highlighting the functional requirements, non-functional requirements, design trade-offs, and architectural patterns that are relevant for a system design interview.&lt;/p&gt;

&lt;p&gt;This series takes a different approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Go deep into multiple areas – sometimes interviewers drive the follow-ups, not the candidate. Each functional requirement is treated as a system design interview in itself.&lt;/li&gt;
&lt;li&gt;Cover database schemas and APIs – some interviewers want to see practical modeling and integration.&lt;/li&gt;
&lt;li&gt;Build reusable solutions – patterns from one domain (e.g., event search) can often apply to others (e.g., product search).&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Functional Requirements
&lt;/h2&gt;

&lt;p&gt;A ticketing system supports the entire user journey: discovery, selection, purchase, and ticket delivery.&lt;/p&gt;

&lt;h3&gt;
  
  
  Event Discovery
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Search for events by location, category, or date.&lt;/li&gt;
&lt;li&gt;View event details (venue, performers, time).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Seat Selection
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Display seating maps with real-time availability.&lt;/li&gt;
&lt;li&gt;Support both reserved seating (specific seat numbers) and general admission (ticket buckets).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Reservation
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Temporarily reserve seats while a user is in the checkout flow.&lt;/li&gt;
&lt;li&gt;Expire reservations if payment is not completed within a time limit (usually 2–10 minutes depending on policy).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Purchase &amp;amp; Payment
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Securely process payments.&lt;/li&gt;
&lt;li&gt;Ensure idempotency (no double-charging if retries occur).&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Ticket Issuance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Generate a digital ticket (QR/barcode) upon successful purchase.&lt;/li&gt;
&lt;li&gt;Send confirmation via email and/or mobile app.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Non-Functional Requirements
&lt;/h2&gt;

&lt;p&gt;Large-scale ticketing systems are defined as much by how they perform under stress as by their feature set.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scalability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Handle sudden spikes (e.g., millions of users joining at ticket release time).&lt;/li&gt;
&lt;li&gt;Support horizontal scaling across multiple regions.&lt;/li&gt;
&lt;li&gt;Adapt to bursty traffic patterns rather than steady-state load.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Low Latency
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Fast responses for search, seat availability, and reservation status (&amp;lt;200 ms target for most read queries).&lt;/li&gt;
&lt;li&gt;Low latency matters because delays directly increase user drop-off during checkout.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Reliability &amp;amp; Consistency
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Ensure no double-booking of seats.&lt;/li&gt;
&lt;li&gt;Strong consistency for seat inventory, eventual consistency acceptable for less critical data (e.g., search).&lt;/li&gt;
&lt;li&gt;Techniques such as distributed locks or optimistic concurrency control are critical here.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Fault Tolerance
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Gracefully handle partial failures (e.g., payment succeeds but DB write fails).&lt;/li&gt;
&lt;li&gt;Use retries, dead-letter queues, and idempotent APIs.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Fairness &amp;amp; Security
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Queueing or rate limiting to prevent system overload.&lt;/li&gt;
&lt;li&gt;Anti-bot measures such as CAPTCHA, token-based access, and dynamic queueing systems to protect inventory.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Observability
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Real-time logging, monitoring, and alerting.&lt;/li&gt;
&lt;li&gt;Traceability for debugging failed transactions.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Roadmap for This Series
&lt;/h2&gt;

&lt;p&gt;Each upcoming part of this series will address a key aspect of the system, with diagrams, real-world examples, and interview tips:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-2-event-discovery-32o5"&gt;&lt;strong&gt;Part 2&lt;/strong&gt;: Event discovery (search &amp;amp; indexing)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-3-seat-management-nn2"&gt;&lt;strong&gt;Part 3&lt;/strong&gt;: Seat selection and reservation (concurrency control)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://dev.to/sumedhbala/part-4-payments-and-ticket-issuance-3h7h"&gt;&lt;strong&gt;Part 4&lt;/strong&gt;: Payments and ticket issuance (idempotency and consistency)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://gemini.google.com/share/15d3d82059b5" rel="noopener noreferrer"&gt;Review by Google Gemini&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;By the end of this series, you should be able to discuss the design of a large-scale ticketing system in a structured, interview-ready manner, while demonstrating the ability to balance correctness, scalability, and user experience.&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>ticketmaster</category>
    </item>
  </channel>
</rss>
