<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Thuyavan</title>
    <description>The latest articles on DEV Community by Thuyavan (@thuyavank08).</description>
    <link>https://dev.to/thuyavank08</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3953141%2F42f011c8-c156-4edd-8e89-39c92e68ce20.jpeg</url>
      <title>DEV Community: Thuyavan</title>
      <link>https://dev.to/thuyavank08</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thuyavank08"/>
    <language>en</language>
    <item>
      <title>Moving Beyond Probabilistic Outputs: Designing AI for High-Stakes Reliability</title>
      <dc:creator>Thuyavan</dc:creator>
      <pubDate>Fri, 05 Jun 2026 05:44:28 +0000</pubDate>
      <link>https://dev.to/thuyavank08/moving-beyond-probabilistic-outputs-designing-ai-for-high-stakes-reliability-707</link>
      <guid>https://dev.to/thuyavank08/moving-beyond-probabilistic-outputs-designing-ai-for-high-stakes-reliability-707</guid>
      <description>&lt;p&gt;Many of the AI applications we interact with today are built on a streamlined, direct architecture:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;User → Prompt → LLM → Response&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That works surprisingly well for:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;chat assistants,&lt;/li&gt;
&lt;li&gt;summarization,&lt;/li&gt;
&lt;li&gt;content generation,&lt;/li&gt;
&lt;li&gt;and general productivity tooling.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;High-stakes environments, where accuracy is non-negotiable, require a different level of structural support.&lt;/p&gt;

&lt;p&gt;In specialized fields like healthcare or finance, a probabilistic response isn't just a minor hurdle; it's a risk that needs to be managed through robust system design.&lt;/p&gt;

&lt;p&gt;I’ve spent the last few weeks exploring a decision-support architecture specifically tailored for these critical settings. The goal is to ensure that every output is grounded in fact, every recommendation is fully explainable, and every step of the reasoning process is auditable.&lt;/p&gt;

&lt;p&gt;The central shift in this approach is viewing the Large Language Model (LLM) not as the entire system, but as a specialized component that requires clear boundaries and deterministic oversight.&lt;/p&gt;

&lt;h3&gt;
  
  
  Rethinking the Role of the Model
&lt;/h3&gt;

&lt;p&gt;Standard AI architectures often rely heavily on the model's internal memory and prompt engineering. While impressive, LLMs are fundamentally designed to predict the next likely token, which introduces a level of uncertainty that can be challenging for regulated industries.&lt;/p&gt;

&lt;p&gt;In sectors like compliance, policy-making, or healthcare, the model needs to be supported by a framework that provides authority and verification. The architecture itself acts as a safeguard, guiding the model's reasoning toward consistent and safe outcomes.&lt;/p&gt;

&lt;h3&gt;
  
  
  An Engineering-First Framework
&lt;/h3&gt;

&lt;p&gt;This architecture treats the LLM as a "reasoning engine" situated within a larger, deterministic pipeline. The goal is a design that prioritizes visibility and control at every stage, built on the core philosophy of Orchestration Over Generation.&lt;/p&gt;

&lt;p&gt;Instead of relying on an LLM to manage its own multi-step reasoning, the entire pipeline is managed by an Agent Orchestrator. This system uses an explicit, checkpointed state machine to run the case pipeline, providing total observability and natively supporting human-in-the-loop (HITL) interrupts.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qkpl27k4w1nssvan3vx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0qkpl27k4w1nssvan3vx.png" alt="Architecture Diagram" width="800" height="584"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  The Knowledge and Data Layer
&lt;/h3&gt;

&lt;p&gt;Rather than relying on the model to "know" facts, the system retrieves them from verified databases and structured records. To combat the semantic blurring common in a standard vector database, the data foundation is split into strict categories:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Enterprise Fact Store:&lt;/strong&gt; The immutable system of record for the entity (e.g., the client or case file). It sits on PostgreSQL and handles deterministic queries based on strict temporal logic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Taxonomy Server:&lt;/strong&gt; Before the orchestrator searches for context, an Intake &amp;amp; Normalization node parses unstructured input and maps it to standardized industry codes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The Knowledge Index:&lt;/strong&gt; A hybrid retrieval system (vector + keyword) that searches over curated, versioned business rules and guidelines. Crucially, it returns passages with stable IDs to enforce strict citation.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Pipeline: Constrained Reasoning and Verification
&lt;/h3&gt;

&lt;p&gt;When a user submits an unstructured report, the orchestrator executes a tightly controlled sequence via FastAPI, ensuring every output is grounded in fact. The sequence of steps includes:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Input Normalization:&lt;/strong&gt; The Taxonomy Server parses unstructured input and maps it to standardized industry codes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deterministic Retrieval of Facts:&lt;/strong&gt; The orchestrator simultaneously builds the specific entity's context from the Fact Store and retrieves the relevant domain knowledge in a Parallel Retrieval step.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Structured Context Assembly:&lt;/strong&gt; The retrieved facts and knowledge are assembled as context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Constrained Reasoning:&lt;/strong&gt; The LLM acts purely as a Reasoner/Proposer under contract. It is strictly instructed to generate typed, evidence-bearing suggestions constrained by a JSON schema.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rule-Based Safety and Faithfulness Verification:&lt;/strong&gt; Before any output proceeds, two dedicated layers ensure its validity.&lt;/li&gt;
&lt;/ol&gt;
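
&lt;p&gt;The sequence above can be sketched as a small checkpointed state machine. This is a minimal illustration, not the actual orchestrator: the names &lt;code&gt;Stage&lt;/code&gt;, &lt;code&gt;CaseState&lt;/code&gt;, and &lt;code&gt;run_pipeline&lt;/code&gt; are hypothetical, and each stage handler is assumed to be a plain function that returns a dict of updates.&lt;/p&gt;

```python
from dataclasses import dataclass, field
from enum import Enum


class Stage(Enum):
    NORMALIZE = "normalize"
    RETRIEVE = "retrieve"
    ASSEMBLE = "assemble"
    REASON = "reason"
    VERIFY = "verify"
    DONE = "done"


# Fixed transition table: the orchestrator, not the model, decides what runs next.
NEXT_STAGE = {
    Stage.NORMALIZE: Stage.RETRIEVE,
    Stage.RETRIEVE: Stage.ASSEMBLE,
    Stage.ASSEMBLE: Stage.REASON,
    Stage.REASON: Stage.VERIFY,
    Stage.VERIFY: Stage.DONE,
}


@dataclass
class CaseState:
    stage: Stage = Stage.NORMALIZE
    data: dict = field(default_factory=dict)
    checkpoints: list = field(default_factory=list)  # audit trail of completed stages
    needs_human: bool = False


def run_pipeline(state, handlers):
    """Run stages in order, recording a checkpoint after each one.

    Each handler receives the accumulated data dict and returns a dict of
    updates. Returning "needs_human": True pauses the run after that stage;
    calling run_pipeline again with the same state resumes it.
    """
    state.needs_human = False
    while state.stage is not Stage.DONE:
        updates = dict(handlers[state.stage](state.data))
        pause = bool(updates.pop("needs_human", False))
        state.data.update(updates)
        state.checkpoints.append(state.stage.value)
        state.stage = NEXT_STAGE[state.stage]
        if pause:
            state.needs_human = True
            break
    return state
```

&lt;p&gt;Because the transition table is fixed in code, the model can only fill in the content of a stage; it cannot skip verification or reorder the steps.&lt;/p&gt;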

&lt;h3&gt;
  
  
  The Absolute Guardrail: A Dedicated Safety Engine
&lt;/h3&gt;

&lt;p&gt;A key decision in this architecture is the separation of probabilistic reasoning from deterministic safety. This engine ensures that &lt;strong&gt;no LLM is allowed to make the final safety or compliance decision.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After the LLM proposes an action, the output is intercepted by this dedicated &lt;strong&gt;Deterministic Safety Engine&lt;/strong&gt;. Built on traditional, verifiable code, this engine runs conflict-resolution, constraint-violation, and duplicate-action checks against versioned rules and structured data. If the LLM proposes an action that violates an established hard rule, it is programmatically blocked before the operator ever sees it.&lt;/p&gt;
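
&lt;p&gt;A rough sketch of what such rule checks might look like in plain Python. Everything here is illustrative: the &lt;code&gt;ProposedAction&lt;/code&gt; shape, the rule tables, and the version string are assumptions, not the real rule set.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    action_id: str
    kind: str
    target: str


# Hypothetical versioned rule data; a real system would load this from a store.
RULES_VERSION = "example-v1"
CONFLICTING_KINDS = {frozenset({"approve", "reject"})}
BLOCKED_TARGETS = {"frozen-account"}


def check_action(proposal, existing_actions):
    """Return a list of rule violations; an empty list means the action may proceed."""
    violations = []
    # Constraint check: some targets may never be acted on.
    if proposal.target in BLOCKED_TARGETS:
        violations.append("constraint: target is blocked")
    for prior in existing_actions:
        if prior.target != proposal.target:
            continue
        if prior.kind == proposal.kind:
            # Duplicate-action check: same kind on the same target.
            violations.append("duplicate of " + prior.action_id)
        elif frozenset({prior.kind, proposal.kind}) in CONFLICTING_KINDS:
            # Conflict check: mutually exclusive kinds on the same target.
            violations.append("conflict with " + prior.action_id)
    return violations
```

&lt;p&gt;The checks involve no model call, so the same proposal against the same rule version always yields the same verdict, which is what makes the decision auditable.&lt;/p&gt;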

&lt;h3&gt;
  
  
  Ensuring Faithfulness Through Verification
&lt;/h3&gt;

&lt;p&gt;A &lt;strong&gt;Faithfulness Verifier&lt;/strong&gt; then cross-checks the model's output against the retrieved evidence. This secondary NLI-style (Natural Language Inference) check confirms that every generated claim is entailed by its cited evidence. If the model hallucinates a fact, the verifier flags the claim, forces an abstention, or routes the case to human review, which is essential for building long-term trust.&lt;/p&gt;
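
&lt;p&gt;The control flow of such a verifier can be sketched without committing to a particular NLI model. In this hypothetical example, &lt;code&gt;entails&lt;/code&gt; is a pluggable callable standing in for the model, and the claim and evidence shapes are assumptions made for illustration.&lt;/p&gt;

```python
def verify_claims(claims, evidence, entails):
    """Split claims into supported and flagged.

    claims: list of dicts with "text" and "cites" (a list of passage IDs).
    evidence: dict mapping passage ID to passage text.
    entails: callable(premise, hypothesis) returning True or False; in a real
             system this would wrap an NLI model (assumption, not shown here).
    """
    supported, flagged = [], []
    for claim in claims:
        cited = [evidence[pid] for pid in claim["cites"] if pid in evidence]
        # A citation to a passage that was never retrieved is itself a failure.
        has_unknown_citation = len(cited) != len(claim["cites"])
        is_entailed = any(entails(passage, claim["text"]) for passage in cited)
        if cited and not has_unknown_citation and is_entailed:
            supported.append(claim)
        else:
            flagged.append(claim)
    return supported, flagged
```

&lt;p&gt;Claims with no citation, with a citation to an unknown passage ID, or with no entailing passage all land in the flagged list, where the orchestrator can abstain or request human review.&lt;/p&gt;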

&lt;h3&gt;
  
  
  Privacy, Latency, and Infrastructure
&lt;/h3&gt;

&lt;p&gt;Deploying a mission-critical system requires addressing data sovereignty and response times.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trust Boundaries:&lt;/strong&gt; A pluggable LLM Gateway handles all routing. If an external, hosted model is used, the gateway strips the text of sensitive PII (de-identification) before it crosses the trust boundary, and re-identifies the response when it returns.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Squeezing out Latency:&lt;/strong&gt; The pipeline relies heavily on parallelization and a Redis cache for entity summaries and prompt-cache keys. For lower-latency inference in production, the deployment strategy can shift: run the API and process orchestrators as PM2-managed services directly on dedicated VMs, and keep a self-hosted engine like vLLM co-located with the embedding generation and orchestrator. Cutting network hops this way can noticeably reduce Time-To-First-Token.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
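
&lt;p&gt;The de-identify and re-identify round trip at the trust boundary might look roughly like this. It is a deliberately narrow sketch: the two regular expressions and the placeholder token format are assumptions chosen for illustration, and real PII detection needs far broader coverage.&lt;/p&gt;

```python
import re

# Deliberately narrow patterns for illustration: emails and US-style SSNs only.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def deidentify(text):
    """Replace PII with placeholder tokens; return the clean text and the mapping."""
    mapping = {}
    for label, pattern in PII_PATTERNS.items():
        def swap(match, label=label):
            # Tokens are numbered in the order the values are found.
            token = "[{}_{}]".format(label, len(mapping) + 1)
            mapping[token] = match.group(0)
            return token
        text = pattern.sub(swap, text)
    return text, mapping


def reidentify(text, mapping):
    """Restore the original values in a model response."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text
```

&lt;p&gt;The mapping never leaves the gateway, so the hosted model only ever sees the placeholder tokens.&lt;/p&gt;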

&lt;h3&gt;
  
  
  The Future of Trustworthy AI
&lt;/h3&gt;

&lt;p&gt;Designing for reliability means shifting our focus from prompt engineering to system engineering. By creating clear trust boundaries, implementing observability, and treating the LLM as a specialized component within a robust infrastructure, we can deploy autonomous systems in environments where precision is paramount.&lt;/p&gt;

&lt;p&gt;Ultimately, the most dependable AI systems will feel less like "black boxes" and more like carefully engineered distributed systems: reliable, predictable, and ready for the most critical tasks. LLMs are not databases, and they are not deterministic logic engines. Restricting the LLM to a highly constrained reasoning role within a rigid, stateful architecture means its mistakes are caught by deterministic checks or human review before they turn into decisions.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>rag</category>
      <category>agents</category>
    </item>
    <item>
      <title>Build Log: Untangling SameSite, Same-Origin, and Cookie Auth in a Microservice Platform</title>
      <dc:creator>Thuyavan</dc:creator>
      <pubDate>Wed, 27 May 2026 13:46:23 +0000</pubDate>
      <link>https://dev.to/thuyavank08/build-log-untangling-samesite-same-origin-and-cookie-auth-in-a-microservice-platform-2dj0</link>
      <guid>https://dev.to/thuyavank08/build-log-untangling-samesite-same-origin-and-cookie-auth-in-a-microservice-platform-2dj0</guid>
      <description>&lt;p&gt;Over the last few days I went deep into one of those deceptively simple auth problems that turns into a browser security rabbit hole.&lt;/p&gt;

&lt;p&gt;The original goal sounded straightforward:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Move the platform from localStorage JWT auth toward secure cookie-based authentication across multiple microservices.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;But once I started reconciling the actual implementation against the original migration epic, I realized the real problem wasn’t JWTs.&lt;/p&gt;

&lt;p&gt;It was understanding:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;same-origin vs same-site,&lt;/li&gt;
&lt;li&gt;how browsers attach cookies,&lt;/li&gt;
&lt;li&gt;whether SameSite=None was actually necessary,&lt;/li&gt;
&lt;li&gt;and how deployment topology changes security behavior.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  The Existing Architecture
&lt;/h3&gt;

&lt;p&gt;The platform is composed of multiple services:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;login/auth service&lt;/li&gt;
&lt;li&gt;BFF&lt;/li&gt;
&lt;li&gt;billing&lt;/li&gt;
&lt;li&gt;notification&lt;/li&gt;
&lt;li&gt;frontend application&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The frontend was still heavily localStorage-token based:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Authorization: Bearer ${localStorage.getItem("token")}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Meanwhile:&lt;/p&gt;

&lt;p&gt;OTP login was already cookie-based, and the backends already supported:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;credentials: true&lt;/li&gt;
&lt;li&gt;cookie parsing&lt;/li&gt;
&lt;li&gt;auth middleware&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So the backend groundwork for cookie auth was mostly there.&lt;/p&gt;

&lt;p&gt;The original migration epic proposed:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;switch cookies to SameSite=None; Secure&lt;/li&gt;
&lt;li&gt;move frontend to cookie auth&lt;/li&gt;
&lt;li&gt;remove bearer-token usage&lt;/li&gt;
&lt;li&gt;stop returning access tokens in login responses&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;But there was a problem.&lt;/p&gt;


&lt;h3&gt;
  
  
  The Security Contradiction
&lt;/h3&gt;

&lt;p&gt;While reviewing the actual codebase, I noticed something important:&lt;/p&gt;

&lt;p&gt;The auth cookies had already been hardened to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SameSite=Strict
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That change was intentional.&lt;/p&gt;

&lt;p&gt;It had been introduced earlier to close a CSRF exposure.&lt;/p&gt;

&lt;p&gt;So now there was a contradiction:&lt;/p&gt;

&lt;p&gt;The migration epic wanted SameSite=None, but the security hardening had intentionally moved the cookies to Strict.&lt;/p&gt;

&lt;p&gt;That immediately raised the question:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Do we even need SameSite=None?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And the answer depends entirely on deployment topology.&lt;/p&gt;


&lt;h3&gt;
  
  
  Same-Origin vs Same-Site
&lt;/h3&gt;

&lt;p&gt;This turned out to be the key distinction.&lt;/p&gt;
&lt;h5&gt;
  
  
  Same-Origin
&lt;/h5&gt;

&lt;p&gt;An origin is:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;scheme + host + port
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Examples:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;URL A&lt;/th&gt;
&lt;th&gt;URL B&lt;/th&gt;
&lt;th&gt;Same-Origin?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;https://app.company.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://app.company.com/x&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;https://app.company.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://api.company.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;http://localhost:5104&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://localhost:3001&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Same-origin controls:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CORS&lt;/li&gt;
&lt;li&gt;JS access&lt;/li&gt;
&lt;li&gt;localStorage isolation&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  Same-Site
&lt;/h5&gt;

&lt;p&gt;A site is roughly the registrable domain (eTLD+1).&lt;/p&gt;

&lt;p&gt;Examples:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;URL A&lt;/th&gt;
&lt;th&gt;URL B&lt;/th&gt;
&lt;th&gt;Same-Site?&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;https://app.company.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://api.company.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;http://localhost:5104&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;http://localhost:3001&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Yes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;https://company.com&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;https://billing.io&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;This distinction matters because:&lt;/p&gt;

&lt;p&gt;SameSite cookie behavior operates at the site level, not the origin level.&lt;/p&gt;

&lt;p&gt;That means:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;app.company.com -&amp;gt; api.company.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cross-origin&lt;/li&gt;
&lt;li&gt;but still same-site&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So SameSite=Strict cookies still work there.&lt;/p&gt;

&lt;p&gt;That realization changed the entire migration plan.&lt;/p&gt;
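
&lt;p&gt;The two tables above can be reproduced with a short Python sketch. Note the assumption baked into &lt;code&gt;naive_site&lt;/code&gt;: it treats the last two host labels as the registrable domain, whereas browsers consult the Public Suffix List (and recent versions may also compare the scheme). The function names are hypothetical.&lt;/p&gt;

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url):
    """Return the (scheme, host, port) tuple that defines an origin."""
    parts = urlsplit(url)
    return (parts.scheme, parts.hostname, parts.port or DEFAULT_PORTS.get(parts.scheme))


def naive_site(url):
    """Approximate the site as the last two host labels.

    Assumption: real browsers use the Public Suffix List, so this shortcut is
    wrong for suffixes such as co.uk; it only mirrors the examples in this post.
    """
    host = urlsplit(url).hostname or ""
    return ".".join(host.split(".")[-2:])


def classify(url_a, url_b):
    """Label the relationship between two URLs as a browser would, roughly."""
    if origin(url_a) == origin(url_b):
        return "same-origin"
    if naive_site(url_a) == naive_site(url_b):
        return "cross-origin, same-site"
    return "cross-site"
```

&lt;p&gt;For example, &lt;code&gt;classify("https://app.company.com", "https://api.company.com")&lt;/code&gt; returns &lt;code&gt;"cross-origin, same-site"&lt;/code&gt;, which is the case where SameSite=Strict cookies are still sent.&lt;/p&gt;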




&lt;h3&gt;
  
  
  Reverse Proxy Topology Changes Everything
&lt;/h3&gt;

&lt;p&gt;The next major insight came from analyzing nginx routing.&lt;/p&gt;

&lt;p&gt;There are two fundamentally different deployment models.&lt;/p&gt;




&lt;h3&gt;
  
  
  Option A — Single Gateway (Same-Origin)
&lt;/h3&gt;

&lt;p&gt;Browser sees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://app.company.com/
https://app.company.com/auth/*
https://app.company.com/billing/*
https://app.company.com/notifications/*

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Internally nginx fans requests out to different services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;auth:3001
billing:3009
notifications:3010
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But the browser never sees that.&lt;/p&gt;

&lt;p&gt;So to the browser:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;everything is same-origin&lt;/li&gt;
&lt;li&gt;cookies are trivially attached&lt;/li&gt;
&lt;li&gt;no need for SameSite=None&lt;/li&gt;
&lt;li&gt;minimal CORS complexity&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the cleanest architecture.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option B — Separate API Hostnames
&lt;/h3&gt;

&lt;p&gt;Browser sees:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://app.company.com
https://api.company.com
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This becomes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cross-origin&lt;/li&gt;
&lt;li&gt;but same-site&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In this setup:&lt;/p&gt;

&lt;p&gt;SameSite=Strict still works, but CORS credentials become mandatory.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;fetch(url, {
  credentials: "include"
})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and server-side, in the CORS configuration (which must also allow an explicit origin rather than a wildcard):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
credentials: true
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
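
&lt;p&gt;What the server has to send back for a credentialed cross-origin request can be sketched framework-free in Python. The allowlist and the function name &lt;code&gt;cors_headers&lt;/code&gt; are assumptions for illustration; the header names themselves are the standard CORS response headers.&lt;/p&gt;

```python
ALLOWED_ORIGINS = {"https://app.company.com"}  # assumed allowlist


def cors_headers(request_origin):
    """Build CORS response headers for a credentialed cross-origin request."""
    if request_origin not in ALLOWED_ORIGINS:
        # Unknown origins get no CORS headers, so the browser blocks the response.
        return {}
    return {
        # Must echo a specific origin: browsers reject "*" when credentials are sent.
        "Access-Control-Allow-Origin": request_origin,
        "Access-Control-Allow-Credentials": "true",
        # Caches must not reuse this response for a different Origin.
        "Vary": "Origin",
    }
```

&lt;p&gt;In the single-gateway topology none of this is needed, because same-origin requests do not go through CORS at all.&lt;/p&gt;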






&lt;h3&gt;
  
  
  Important Realization
&lt;/h3&gt;

&lt;p&gt;A reverse proxy alone does NOT automatically make things same-origin.&lt;/p&gt;

&lt;p&gt;What matters is:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;what hostname the browser sees.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That was probably the biggest conceptual breakthrough in this debugging session.&lt;/p&gt;




&lt;h3&gt;
  
  
  SPA Routing Pitfalls
&lt;/h3&gt;

&lt;p&gt;I also explored consolidating multiple SPAs under one gateway.&lt;/p&gt;

&lt;p&gt;Example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;location /billing-app/ {
    alias /usr/share/nginx/billing/;
    try_files $uri $uri/ /billing-app/index.html;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This introduces several subtle problems.&lt;/p&gt;

&lt;h5&gt;
  
  
  1. Prefix collisions
&lt;/h5&gt;

&lt;p&gt;If:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
/billing/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;already proxies billing APIs, then the frontend cannot also live there.&lt;/p&gt;

&lt;p&gt;Solution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;/billing/ for APIs&lt;/li&gt;
&lt;li&gt;/billing-app/ for SPA&lt;/li&gt;
&lt;/ul&gt;

&lt;h5&gt;
  
  
  2. Vite base path issues
&lt;/h5&gt;

&lt;p&gt;Frontend builds must specify:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
base: "/billing-app/"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Otherwise assets load from /assets/...&lt;br&gt;
and break in production.&lt;/p&gt;
&lt;h5&gt;
  
  
  3. React Router basename
&lt;/h5&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;&amp;lt;BrowserRouter basename="/billing-app"&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Without this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;client-side navigation breaks&lt;/li&gt;
&lt;li&gt;refreshes 404&lt;/li&gt;
&lt;/ul&gt;
&lt;h5&gt;
  
  
  4. SPA fallback routing
&lt;/h5&gt;

&lt;p&gt;Each SPA needs its own fallback:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
try_files $uri /billing-app/index.html;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Otherwise deep routes fail.&lt;/p&gt;




&lt;h3&gt;
  
  
  The HTTPS Problem
&lt;/h3&gt;

&lt;p&gt;One final issue surfaced during review.&lt;/p&gt;

&lt;p&gt;The deployment was still running on:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
http://&amp;lt;raw-ip&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;But production cookies were marked:&lt;/p&gt;

&lt;p&gt;Secure&lt;/p&gt;

&lt;p&gt;Browsers reject secure cookies over plain HTTP.&lt;/p&gt;

&lt;p&gt;Which means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;cookie auth silently fails&lt;/li&gt;
&lt;li&gt;even before any SameSite logic matters&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So before the final auth migration:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;HTTPS termination needs to exist&lt;/li&gt;
&lt;li&gt;preferably with a proper domain&lt;/li&gt;
&lt;li&gt;not a raw IP&lt;/li&gt;
&lt;/ul&gt;
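
&lt;p&gt;To make the failure mode concrete, here is a small Python sketch that builds a hardened cookie with the standard library and models the browser rule described above. The cookie name &lt;code&gt;session&lt;/code&gt; and the helper &lt;code&gt;browser_would_store&lt;/code&gt; are hypothetical, and the helper is a simplification of real browser behavior.&lt;/p&gt;

```python
from http.cookies import SimpleCookie


def build_auth_cookie(token):
    """Return a Set-Cookie header value for a hardened session cookie."""
    cookie = SimpleCookie()
    cookie["session"] = token
    cookie["session"]["path"] = "/"
    cookie["session"]["httponly"] = True
    cookie["session"]["secure"] = True
    cookie["session"]["samesite"] = "Strict"
    return cookie["session"].OutputString()


def browser_would_store(set_cookie_value, page_scheme):
    """Rough model of the rule above: Secure cookies need an HTTPS connection."""
    attributes = [part.strip().lower() for part in set_cookie_value.split(";")]
    return "secure" not in attributes or page_scheme == "https"
```

&lt;p&gt;With this model, the same header value is accepted when &lt;code&gt;page_scheme&lt;/code&gt; is &lt;code&gt;"https"&lt;/code&gt; and dropped when it is &lt;code&gt;"http"&lt;/code&gt;, which is why login appears to succeed on the server while the browser never sends the cookie back.&lt;/p&gt;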




&lt;h3&gt;
  
  
  Final Direction
&lt;/h3&gt;

&lt;p&gt;After walking through all of this, the architecture direction became much clearer:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Best End-State&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;single gateway&lt;/li&gt;
&lt;li&gt;path-based routing&lt;/li&gt;
&lt;li&gt;same-origin frontend/API&lt;/li&gt;
&lt;li&gt;SameSite=Strict&lt;/li&gt;
&lt;li&gt;cookie-only auth&lt;/li&gt;
&lt;li&gt;no localStorage bearer tokens&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That gives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;simpler auth&lt;/li&gt;
&lt;li&gt;lower CSRF exposure&lt;/li&gt;
&lt;li&gt;fewer CORS headaches&lt;/li&gt;
&lt;li&gt;cleaner frontend architecture&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Biggest Lesson
&lt;/h3&gt;

&lt;p&gt;The most valuable part of this debugging session wasn’t a code change.&lt;/p&gt;

&lt;p&gt;It was realizing that:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;browser security behavior is deeply tied to deployment topology.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Two systems with identical backend code can behave completely differently depending on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;domains,&lt;/li&gt;
&lt;li&gt;ports,&lt;/li&gt;
&lt;li&gt;proxies,&lt;/li&gt;
&lt;li&gt;and what the browser actually sees.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That understanding made the rest of the migration decisions much more obvious.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>microservices</category>
      <category>architecture</category>
      <category>coding</category>
    </item>
  </channel>
</rss>
