<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: vapmail16</title>
    <description>The latest articles on DEV Community by vapmail16 (@vapmail16).</description>
    <link>https://dev.to/vapmail16</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3776163%2F62e8c47b-1731-448a-81fa-593f49dcdab2.jpeg</url>
      <title>DEV Community: vapmail16</title>
      <link>https://dev.to/vapmail16</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/vapmail16"/>
    <language>en</language>
    <item>
      <title>I set up my own chatgpt for analysing company reports — here's exactly how you can too</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Mon, 02 Mar 2026 17:27:19 +0000</pubDate>
      <link>https://dev.to/vapmail16/i-set-up-my-own-chatgpt-for-analysing-company-reports-heres-exactly-how-you-can-too-5137</link>
      <guid>https://dev.to/vapmail16/i-set-up-my-own-chatgpt-for-analysing-company-reports-heres-exactly-how-you-can-too-5137</guid>
      <description>&lt;h2&gt;
  
  
  why i did this
&lt;/h2&gt;

&lt;p&gt;i spend a lot of time analysing company reports — annual filings, quarterly earnings, financial statements. i was using chatgpt and claude for this but had a few problems:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;privacy&lt;/strong&gt; — i didn't want to upload sensitive financial documents to third party AI services&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;cost&lt;/strong&gt; — the subscriptions add up, especially if you want the good models&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;control&lt;/strong&gt; — i wanted to customise the system prompt, use specific models, and share access with a friend who does similar analysis&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;so i decided to set up my own private chatgpt-like interface running on a rented GPU. the whole thing costs me less than £1 per session and my data never leaves a server i control.&lt;/p&gt;

&lt;p&gt;here's exactly how i did it.&lt;/p&gt;




&lt;h2&gt;
  
  
  what you're building
&lt;/h2&gt;

&lt;p&gt;by the end of this guide you'll have:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;a &lt;strong&gt;chatgpt-like web interface&lt;/strong&gt; you can access from any browser&lt;/li&gt;
&lt;li&gt;the ability to &lt;strong&gt;upload PDFs&lt;/strong&gt; and chat about them (company reports, annual filings, etc.)&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;vision model&lt;/strong&gt; that can read charts, tables and images from reports&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;multi-user access&lt;/strong&gt; so you can share it with a colleague or friend&lt;/li&gt;
&lt;li&gt;a &lt;strong&gt;web research tool&lt;/strong&gt; that searches the internet and analyses results using your local model&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;and it all runs on a rented GPU that you can stop and start whenever you want.&lt;/p&gt;




&lt;h2&gt;
  
  
  the stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Vast.ai&lt;/strong&gt; — rent a GPU by the hour (way cheaper than buying one)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; — runs AI models on the GPU&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Open WebUI&lt;/strong&gt; — the chatgpt-like interface with PDF upload, conversation history, user management&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qwen2.5:32b&lt;/strong&gt; — brilliant open source model for financial analysis&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Qwen2.5-VL:7b&lt;/strong&gt; — vision model for reading charts and images&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;total cost: roughly &lt;strong&gt;$0.40/hour&lt;/strong&gt; when running, and around $0.02/hour for storage when stopped.&lt;/p&gt;




&lt;h2&gt;
  
  
  step 1: rent a GPU on vast.ai
&lt;/h2&gt;

&lt;p&gt;go to &lt;a href="https://cloud.vast.ai/" rel="noopener noreferrer"&gt;cloud.vast.ai&lt;/a&gt; and create an account if you haven't already. add some credit — $5-10 is enough to get started.&lt;/p&gt;

&lt;h3&gt;
  
  
  picking the right GPU
&lt;/h3&gt;

&lt;p&gt;click &lt;strong&gt;Search&lt;/strong&gt; and look for instances with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GPU&lt;/strong&gt;: RTX 5090 (32GB), RTX 4090 (24GB), RTX 3090 (24GB), or A6000 (48GB)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;VRAM&lt;/strong&gt;: 24GB minimum, 32GB+ ideal&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disk&lt;/strong&gt;: 60GB+&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reliability&lt;/strong&gt;: 95%+&lt;/li&gt;
&lt;/ul&gt;

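&lt;p&gt;for company report analysis, the 32B parameter model gives the best results and needs around 20GB of VRAM for the weights alone, before any memory for the conversation context. that is a tight fit on a 24GB card, so if that's all you can get, the 14B model still works well.&lt;/p&gt;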
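
&lt;p&gt;as a rough sanity check on those numbers, here is a minimal sketch of the arithmetic. the 4.75 bits per weight is back-calculated from the ~19GB download size quoted for the 32B model in step 3, and the 6GB of headroom for context is purely an assumption, so treat the result as a ballpark and not a guarantee. the function names are made up for illustration.&lt;/p&gt;

```python
def weights_gb(params_billion: float, bits_per_weight: float = 4.75) -> float:
    """Rough size of the quantised weights in GB (1 GB = 1e9 bytes).

    4.75 bits per weight is an assumption derived from a ~19GB, 32B download.
    """
    return params_billion * bits_per_weight / 8


def fits(vram_gb: float, params_billion: float, headroom_gb: float = 6.0) -> bool:
    """True if the weights plus an assumed headroom for context fit in VRAM."""
    return vram_gb >= weights_gb(params_billion) + headroom_gb


for vram, size in [(24, 14), (24, 32), (32, 32)]:
    print(f"{size}B on {vram}GB: weights ~{weights_gb(size):.1f}GB, fits={fits(vram, size)}")
```

&lt;p&gt;with these assumptions the 32B model does not fit comfortably on a 24GB card but does on a 32GB one, which lines up with the advice above.&lt;/p&gt;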

&lt;h3&gt;
  
  
  selecting the template
&lt;/h3&gt;

&lt;p&gt;this is the key bit that saves you loads of setup time.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;click &lt;strong&gt;Select Template&lt;/strong&gt; on the left panel&lt;/li&gt;
&lt;li&gt;scroll down and find &lt;strong&gt;"Open Webui (Ollama)"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;select it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;this template comes pre-configured with Ollama and Open WebUI already installed. no messing about with docker commands or manual installation.&lt;/p&gt;

&lt;h3&gt;
  
  
  before you click rent
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;set &lt;strong&gt;Container Size&lt;/strong&gt; to at least &lt;strong&gt;80GB&lt;/strong&gt; (models take up space)&lt;/li&gt;
&lt;li&gt;pick your instance from the list&lt;/li&gt;
&lt;li&gt;click &lt;strong&gt;RENT&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  a word of warning
&lt;/h3&gt;

&lt;p&gt;i tried an RTX 5090 instance first and got a GPU error — &lt;code&gt;Error: GPU error, unable to start instance&lt;/code&gt;. this happens sometimes on vast.ai, especially with newer hardware. if this happens to you, just destroy the instance and rent a different one. my second attempt with a different host worked within seconds.&lt;/p&gt;




&lt;h2&gt;
  
  
  step 2: wait for it to boot
&lt;/h2&gt;

&lt;p&gt;once you click rent, go to the &lt;strong&gt;Instances&lt;/strong&gt; tab in the left sidebar. you'll see your instance with a status indicator.&lt;/p&gt;

&lt;p&gt;it goes through a few stages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Loading&lt;/strong&gt; — downloading the docker image&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Starting&lt;/strong&gt; — preparing GPUs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Running&lt;/strong&gt; — ready to go&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the first boot can take anywhere from 30 seconds to 20 minutes depending on whether the host has the docker image cached. if it's taking ages, check the status text — you should see progress like "Pull complete" and "Verifying Checksum".&lt;/p&gt;

&lt;p&gt;once it shows a blue &lt;strong&gt;"Open"&lt;/strong&gt; button, you're good.&lt;/p&gt;




&lt;h2&gt;
  
  
  step 3: download the AI models
&lt;/h2&gt;

&lt;p&gt;click &lt;strong&gt;Open&lt;/strong&gt; on your instance. you'll see the vast.ai applications dashboard with several options.&lt;/p&gt;

&lt;p&gt;click &lt;strong&gt;"Launch Application"&lt;/strong&gt; on &lt;strong&gt;Jupyter Terminal&lt;/strong&gt;. this opens a command line in your browser.&lt;/p&gt;

&lt;p&gt;run these commands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5:32b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;this downloads the main analysis model (~19GB). takes about 10 minutes depending on the connection speed.&lt;/p&gt;

&lt;p&gt;then download the vision model for reading charts and images:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama pull qwen2.5vl:7b
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;note: it's &lt;code&gt;qwen2.5vl&lt;/code&gt; not &lt;code&gt;qwen2.5-vl&lt;/code&gt; — i got tripped up by this myself. the download is about 6GB.&lt;/p&gt;

&lt;p&gt;you can verify both models are there by running:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;ollama list
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
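
&lt;p&gt;if you'd rather check from a script, Ollama also reports the pulled models over HTTP at &lt;code&gt;/api/tags&lt;/code&gt;. a minimal sketch, assuming the server is listening on its default port 11434 on the instance; the helper names are hypothetical.&lt;/p&gt;

```python
import json
from urllib.request import urlopen

# the two model tags pulled in this step
REQUIRED = ["qwen2.5:32b", "qwen2.5vl:7b"]


def missing_models(tags: dict, required: list[str]) -> list[str]:
    """Return the required model names that are absent from an /api/tags reply."""
    installed = {m["name"] for m in tags.get("models", [])}
    return [name for name in required if name not in installed]


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Ask the local Ollama server which models it has pulled.

    The base URL is an assumption (Ollama's default port on the instance).
    """
    with urlopen(f"{base_url}/api/tags", timeout=10) as resp:
        return json.load(resp)
```

&lt;p&gt;run on the instance, &lt;code&gt;missing_models(fetch_tags(), REQUIRED)&lt;/code&gt; should come back as an empty list once both pulls have finished.&lt;/p&gt;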






&lt;h2&gt;
  
  
  step 4: open the chat interface
&lt;/h2&gt;

&lt;p&gt;go back to the applications dashboard and click &lt;strong&gt;"Launch Application"&lt;/strong&gt; on &lt;strong&gt;Open Webui&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;this opens the chatgpt-like interface. the first thing you'll see is a signup page.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;important&lt;/strong&gt;: the first person to sign up becomes the admin. so make sure you create your account before sharing the link with anyone.&lt;/p&gt;

&lt;p&gt;once you're in, you'll see a familiar chat interface with a model dropdown at the top. select &lt;strong&gt;qwen2.5:32b&lt;/strong&gt; and start chatting.&lt;/p&gt;




&lt;h2&gt;
  
  
  step 5: set up the system prompt
&lt;/h2&gt;

&lt;p&gt;this makes a massive difference to the quality of analysis you get.&lt;/p&gt;

&lt;p&gt;in Open WebUI, go to &lt;strong&gt;Settings&lt;/strong&gt; (the sliders icon at the top) and find the &lt;strong&gt;System Prompt&lt;/strong&gt; section. paste this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You are a senior financial analyst with 20 years of experience analysing
company reports, annual filings, quarterly earnings, and market data.

When analysing any document or question:
- Always cite specific numbers with their context (page, section, table)
- Flag inconsistencies between different sections of a report
- Compare metrics against industry benchmarks
- Identify both risks and opportunities
- Be precise — if you're unsure about a number, say so
- Never fabricate or hallucinate data points

Structure detailed analyses as:
1. EXECUTIVE SUMMARY
2. KEY FINANCIALS (revenue, profit, margins, growth)
3. RISKS &amp;amp; RED FLAGS
4. OPPORTUNITIES
5. OUTLOOK &amp;amp; RECOMMENDATION

For quick questions, respond concisely without this structure.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the difference between a generic model response and one with a good system prompt is night and day. it stops the model from being wishy-washy and forces it to give you structured, actionable analysis.&lt;/p&gt;




&lt;h2&gt;
  
  
  step 6: upload and analyse PDFs
&lt;/h2&gt;

&lt;p&gt;click the &lt;strong&gt;"+"&lt;/strong&gt; button at the bottom-left of the chat input to upload files.&lt;/p&gt;

&lt;p&gt;Open WebUI extracts text from PDFs automatically using its built-in RAG (retrieval augmented generation) pipeline. so you can upload an annual report and ask things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"summarise the key financials from this report"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"what are the main risk factors mentioned?"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"compare the revenue growth year over year"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"what does management say about future outlook?"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"extract all the numbers from the balance sheet"&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;
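
&lt;p&gt;to get a feel for what a RAG pipeline does with your PDF, here is a toy sketch of the retrieve-then-prompt idea. it is not Open WebUI's implementation: real pipelines typically rank chunks with embeddings, while this one uses plain word overlap, and every function name here is hypothetical.&lt;/p&gt;

```python
def chunk(text: str, size: int = 40) -> list[str]:
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def top_chunks(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Rank chunks by how many question words they share (toy scoring)."""
    q_words = set(question.lower().split())
    ranked = sorted(
        chunks,
        key=lambda c: len(q_words.intersection(c.lower().split())),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context: list[str]) -> str:
    """Place the retrieved chunks in front of the question."""
    joined = "\n---\n".join(context)
    return f"Context:\n{joined}\n\nQuestion: {question}"
```

&lt;p&gt;the point is that the model only ever sees the few chunks that were retrieved, not the whole report, which is why specific questions tend to work better than "tell me everything".&lt;/p&gt;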

&lt;h3&gt;
  
  
  for charts and images
&lt;/h3&gt;

&lt;p&gt;switch to the &lt;strong&gt;qwen2.5vl:7b&lt;/strong&gt; model using the dropdown at the top. this model can see and understand images. upload a screenshot of a chart or table and ask:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"what does this chart show?"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"extract the data from this table"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"what trend is this graph showing?"&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  step 7: share with your friend
&lt;/h2&gt;

&lt;p&gt;this was one of my main requirements — being able to share access with someone else for joint analysis.&lt;/p&gt;

&lt;p&gt;the URL in your browser when you have Open WebUI open is your access link. it looks something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://something-something.trycloudflare.com/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;send that to your friend. they can create their own account with their own username and password.&lt;/p&gt;

&lt;h3&gt;
  
  
  lock it down
&lt;/h3&gt;

&lt;p&gt;once your friend has signed up, you don't want random people creating accounts. as the admin:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;click your &lt;strong&gt;profile icon&lt;/strong&gt; (bottom-left)&lt;/li&gt;
&lt;li&gt;go to &lt;strong&gt;Admin Panel&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;navigate to &lt;strong&gt;Settings → General&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;toggle off &lt;strong&gt;"Enable New Sign Ups"&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;save&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;now only you two can access it.&lt;/p&gt;




&lt;h2&gt;
  
  
  step 8: stop it when you're done
&lt;/h2&gt;

&lt;p&gt;this is crucial for managing costs.&lt;/p&gt;

&lt;p&gt;when you're done analysing for the day:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;go to your &lt;a href="https://cloud.vast.ai/instances/" rel="noopener noreferrer"&gt;vast.ai instances page&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;click the &lt;strong&gt;stop button&lt;/strong&gt; (the power icon) — &lt;strong&gt;NOT destroy&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;your instance goes to sleep&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;when stopped, you only pay a tiny storage fee (around $0.02/hr). all your models, conversations, and settings are preserved.&lt;/p&gt;

&lt;p&gt;when you want to use it again, just click &lt;strong&gt;resume&lt;/strong&gt; and wait a minute for it to start up. everything will be exactly as you left it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;destroy&lt;/strong&gt; the instance only when you're completely done and don't need it anymore.&lt;/p&gt;




&lt;h2&gt;
  
  
  bonus: web research tool
&lt;/h2&gt;

&lt;p&gt;i also set up a python script that searches the web and feeds the results to the local model for analysis. this is useful for enriching your PDF analysis with current market data, news, and competitor information.&lt;/p&gt;

&lt;p&gt;SSH into your instance or use the Jupyter Terminal and create this file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# web-research.py
&lt;/span&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;openai&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;OpenAI&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;duckduckgo_search&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;DDGS&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;OpenAI&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;base_url&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:11434/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ollama&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;research&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c1"&gt;# search the web
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;searching: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nc"&gt;DDGS&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ddgs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ddgs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;max_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="n"&gt;search_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;[&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;+&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;] &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;title&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;body&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# analyse with local model
&lt;/span&gt;    &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;chat&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;completions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;qwen2.5:32b&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;messages&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;system&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;You are a financial analyst. Cite sources, flag risks, be precise.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;role&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;user&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;content&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Research: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;query&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Results:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;search_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Provide analysis.&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="n"&gt;temperature&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mf"&gt;0.3&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;choices&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# usage: python3 web-research.py "your search query"
&lt;/span&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;sys&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;research&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:]))&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;no query given, pass one as an argument&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;install the dependencies:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;openai duckduckgo-search
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;run it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;python3 web-research.py &lt;span class="s2"&gt;"Tesla Q4 2025 earnings analysis"&lt;/span&gt;
python3 web-research.py &lt;span class="s2"&gt;"Compare Nvidia vs AMD data centre revenue"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  costs breakdown
&lt;/h2&gt;

&lt;p&gt;let's talk real numbers.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;what&lt;/th&gt;
&lt;th&gt;cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;RTX 5090 (32GB) per hour&lt;/td&gt;
&lt;td&gt;~$0.40&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;a typical 2-3 hour analysis session&lt;/td&gt;
&lt;td&gt;~$0.80-1.20&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;storage while stopped per day&lt;/td&gt;
&lt;td&gt;~$0.50&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;monthly usage (2hrs/day, 20 days)&lt;/td&gt;
&lt;td&gt;~$25-35&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;compare that to chatgpt plus at $20/month or claude pro at $20/month: you're getting a capable 32B open model with complete privacy and control for roughly the same price.&lt;/p&gt;
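
&lt;p&gt;the monthly figure can be roughly reproduced with a few lines of arithmetic. this sketch assumes the $0.40/hour rate from the table, the ~$0.02/hour storage fee from step 8 charged for every stopped hour, and a 30-day month; actual vast.ai pricing varies by host.&lt;/p&gt;

```python
GPU_RATE = 0.40      # $/hour while running (from the table above)
STORAGE_RATE = 0.02  # $/hour while stopped (figure quoted in step 8)


def monthly_cost(hours_per_day: float, days_used: int, days_in_month: int = 30) -> float:
    """GPU time plus storage for every hour the instance sits stopped.

    Assumes the instance is kept (stopped, not destroyed) all month.
    """
    running_hours = hours_per_day * days_used
    stopped_hours = days_in_month * 24 - running_hours
    return running_hours * GPU_RATE + stopped_hours * STORAGE_RATE


print(f"estimated monthly cost: ${monthly_cost(2, 20):.2f}")
```

&lt;p&gt;for 2 hours a day over 20 days this works out to roughly $16 of GPU time plus about $13.60 of storage, so around $30, in the middle of the range quoted in the table.&lt;/p&gt;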




&lt;h2&gt;
  
  
  what i learned
&lt;/h2&gt;

&lt;p&gt;a few things worth noting from going through this process:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;template selection matters.&lt;/strong&gt; using the pre-built "Open Webui (Ollama)" template on vast.ai saved me hours of setup time. trying to install everything manually from a bare ubuntu image is painful and error-prone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;host reliability varies.&lt;/strong&gt; my first GPU instance failed with a GPU error. the second one worked instantly. if something fails, just destroy it and try a different host. don't waste time debugging someone else's hardware.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;model names can be tricky.&lt;/strong&gt; &lt;code&gt;qwen2.5-vl:7b&lt;/code&gt; doesn't exist but &lt;code&gt;qwen2.5vl:7b&lt;/code&gt; does. small details like hyphens matter. always check the &lt;a href="https://ollama.com/library" rel="noopener noreferrer"&gt;ollama library&lt;/a&gt; for exact model names.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;system prompts make a huge difference.&lt;/strong&gt; the same model with a generic prompt gives you generic answers. with a targeted financial analysis prompt, it gives you structured, specific, actionable insights. invest time in getting your system prompt right.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;the 32B model is the sweet spot.&lt;/strong&gt; it's significantly better than 14B models at understanding context, catching nuances in financial documents, and giving structured analysis. if you can get a 32GB VRAM GPU, go for the 32B model. if not, 14B is still decent.&lt;/p&gt;




&lt;h2&gt;
  
  
  the model choice guide
&lt;/h2&gt;

&lt;p&gt;depending on what GPU you can rent:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;VRAM available&lt;/th&gt;
&lt;th&gt;best model&lt;/th&gt;
&lt;th&gt;quality&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;24GB&lt;/td&gt;
&lt;td&gt;qwen2.5:14b&lt;/td&gt;
&lt;td&gt;good for summaries and basic analysis&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;32GB&lt;/td&gt;
&lt;td&gt;qwen2.5:32b&lt;/td&gt;
&lt;td&gt;excellent — catches nuances, structured output&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;48GB+&lt;/td&gt;
&lt;td&gt;qwen2.5:72b&lt;/td&gt;
&lt;td&gt;best possible — deep analysis, cross-referencing&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;for the vision model (charts, images), &lt;code&gt;qwen2.5vl:7b&lt;/code&gt; works great on any of these setups.&lt;/p&gt;
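
&lt;p&gt;the table can also be written as a small lookup. a sketch only: the thresholds are copied straight from the table above and the function name is made up.&lt;/p&gt;

```python
from typing import Optional

# (minimum VRAM in GB, model tag), rows copied from the table, largest first
MODEL_TABLE = [(48, "qwen2.5:72b"), (32, "qwen2.5:32b"), (24, "qwen2.5:14b")]


def pick_model(vram_gb: float) -> Optional[str]:
    """Return the largest model the table recommends for this much VRAM."""
    for min_vram, tag in MODEL_TABLE:
        if vram_gb >= min_vram:
            return tag
    return None  # below 24GB the table makes no recommendation


for vram in (24, 32, 48):
    print(vram, pick_model(vram))
```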




&lt;h2&gt;
  
  
  wrapping up
&lt;/h2&gt;

&lt;p&gt;the whole setup takes about 30 minutes from zero to a working system. the first time i did it, it took longer because of the GPU error and the model name typo, but once you know the steps it's straightforward.&lt;/p&gt;

&lt;p&gt;if you're doing any kind of company analysis, financial research, or document review, this is genuinely worth setting up. you get the power of a large language model with complete privacy and control over your data.&lt;/p&gt;

&lt;p&gt;the best part? when you're not using it, you stop the instance and it costs almost nothing. when you need it again, it boots up in a minute with everything preserved.&lt;/p&gt;

&lt;p&gt;if you've got questions or run into issues, drop a comment and i'll try to help.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;tools used: &lt;a href="https://vast.ai" rel="noopener noreferrer"&gt;vast.ai&lt;/a&gt; | &lt;a href="https://ollama.com" rel="noopener noreferrer"&gt;ollama&lt;/a&gt; | &lt;a href="https://openwebui.com" rel="noopener noreferrer"&gt;open webui&lt;/a&gt; | &lt;a href="https://ollama.com/library/qwen2.5" rel="noopener noreferrer"&gt;qwen2.5&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>privacy</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>NextSaaS: Would Your SaaS Pass a Security Audit? (Honest Checklist)</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Fri, 27 Feb 2026 17:14:56 +0000</pubDate>
      <link>https://dev.to/vapmail16/nextsaas-would-your-saas-pass-a-security-audit-honest-checklist-1ob2</link>
      <guid>https://dev.to/vapmail16/nextsaas-would-your-saas-pass-a-security-audit-honest-checklist-1ob2</guid>
      <description>&lt;p&gt;When I ran OWASP ZAP against my own app, I expected a clean report. I'd been careful about security from day one: parameterized queries, proper authentication, HTTPS everywhere.&lt;/p&gt;

&lt;p&gt;Instead, I found 3 medium-severity issues in the first scan.&lt;/p&gt;

&lt;p&gt;That scan taught me something important: there's a massive gap between "secure" and "provably secure." The first means you haven't been hacked yet. The second means you can demonstrate to an auditor, a customer, or a regulator that your systems are hardened, logged, and defensible.&lt;/p&gt;

&lt;p&gt;Here's the checklist I built after going through this process. Score yourself honestly.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. Encryption at Rest: Not Just HTTPS
&lt;/h2&gt;

&lt;p&gt;Most developers stop at HTTPS. "Data is encrypted in transit. We're good."&lt;/p&gt;

&lt;p&gt;Auditors ask a different question: Is PII encrypted in your database?&lt;/p&gt;

&lt;p&gt;If someone gains database access (a leaked backup, a compromised admin account, a SQL injection you missed), can they read your users' email addresses, phone numbers, and payment references in plain text?&lt;/p&gt;

&lt;p&gt;Field-level encryption is the answer. AES-256-GCM is the standard for SOC 2 and enterprise compliance.&lt;/p&gt;

&lt;p&gt;Here's the pattern I use:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import crypto from 'crypto';

const ALGORITHM = 'aes-256-gcm';
const KEY = Buffer.from(process.env.ENCRYPTION_KEY!, 'hex'); // 32 bytes

export function encrypt(text: string): string {
  const iv = crypto.randomBytes(16);
  const cipher = crypto.createCipheriv(ALGORITHM, KEY, iv);

  let encrypted = cipher.update(text, 'utf8', 'hex');
  encrypted += cipher.final('hex');

  const authTag = cipher.getAuthTag().toString('hex');
  return `${iv.toString('hex')}:${authTag}:${encrypted}`;
}

export function decrypt(encryptedText: string): string {
  const [ivHex, authTagHex, encrypted] = encryptedText.split(':');

  const iv = Buffer.from(ivHex, 'hex');
  const authTag = Buffer.from(authTagHex, 'hex');
  const decipher = crypto.createDecipheriv(ALGORITHM, KEY, iv);
  decipher.setAuthTag(authTag);

  let decrypted = decipher.update(encrypted, 'hex', 'utf8');
  decrypted += decipher.final('utf8');
  return decrypted;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;What to encrypt: email addresses, phone numbers, payment references, any PII. Not everything, just the fields that would cause damage if exposed.&lt;/p&gt;

&lt;p&gt;Quick test: If you dumped your database right now, could someone read your users' emails? If yes, you'd fail this check.&lt;/p&gt;
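
&lt;p&gt;The code above expects &lt;code&gt;ENCRYPTION_KEY&lt;/code&gt; to hold 32 bytes encoded as hex. As a minimal sketch (written in Python, with hypothetical helper names), here is one way to generate such a key and sanity-check its length before deploying:&lt;/p&gt;

```python
import secrets


def generate_key() -> str:
    """Return 32 random bytes as 64 hex characters."""
    return secrets.token_hex(32)


def is_valid_key(value: str) -> bool:
    """Check that value is valid hex and decodes to exactly 32 bytes."""
    try:
        return len(bytes.fromhex(value)) == 32
    except ValueError:
        return False


key = generate_key()
print(len(key), is_valid_key(key))
```

&lt;p&gt;A key that fails this check would make the cipher setup throw at runtime, so it is cheaper to catch it at startup.&lt;/p&gt;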

&lt;ol&gt;
&lt;li&gt;Audit Logging — The Thing That Makes or Breaks Audits
Logging is not console.log("User logged in"). Audit-grade logging means a complete, immutable, queryable trail of every significant action in your system.
Here's the schema I use:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight prisma"&gt;&lt;code&gt;model AuditLog {
  id          String   @id @default(cuid())
  userId      String?
  action      String   // LOGIN_SUCCESS, LOGIN_FAILURE, DATA_EXPORT, USER_DELETE
  entity      String   // USER, PAYMENT, SETTINGS
  entityId    String?
  before      Json?    // State before the change
  after       Json?    // State after the change
  ipAddress   String
  userAgent   String
  requestId   String   // Correlation ID for tracing
  metadata    Json?
  createdAt   DateTime @default(now())

  @@index([userId])
  @@index([action])
  @@index([entity, entityId])
  @@index([createdAt])
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The key properties auditors look for:&lt;/p&gt;

&lt;p&gt;Completeness: Every login attempt (success AND failure), every data modification (with before/after values), every admin action.&lt;br&gt;
Immutability: Append-only. No updates, no deletes. If someone tampers with logs, you've lost your evidence.&lt;br&gt;
Retention: 7 years for financial data. Configurable per data type.&lt;br&gt;
Queryable: "Show me every action User X took in the last 30 days." If you can't answer this in seconds, your logging isn't audit-ready.&lt;/p&gt;
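&lt;p&gt;The "append-only" and "queryable" properties can be sketched with Python's built-in sqlite3 module instead of Prisma. This is a minimal illustration, not the article's implementation: the table is a trimmed-down version of the schema, and the triggers, helper names, and sample rows are all assumptions.&lt;/p&gt;

```python
import sqlite3
from datetime import datetime, timedelta, timezone

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    user_id TEXT,
    action TEXT NOT NULL,
    entity TEXT NOT NULL,
    created_at TEXT NOT NULL
);
CREATE INDEX idx_audit_user_time ON audit_log (user_id, created_at);

-- Assumed enforcement of append-only: reject UPDATE and DELETE in the database
CREATE TRIGGER audit_no_update BEFORE UPDATE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
CREATE TRIGGER audit_no_delete BEFORE DELETE ON audit_log
BEGIN SELECT RAISE(ABORT, 'audit_log is append-only'); END;
""")


def record(user_id, action, entity, when):
    """Append one audit row; timestamps are stored as ISO-8601 UTC strings."""
    conn.execute(
        "INSERT INTO audit_log (user_id, action, entity, created_at) VALUES (?, ?, ?, ?)",
        (user_id, action, entity, when.isoformat()),
    )


def actions_in_last_days(user_id, days, now):
    """Answer: show me every action this user took in the last N days."""
    cutoff = now - timedelta(days=days)
    rows = conn.execute(
        "SELECT action, entity FROM audit_log "
        "WHERE user_id = ? AND created_at BETWEEN ? AND ? ORDER BY created_at",
        (user_id, cutoff.isoformat(), now.isoformat()),
    )
    return rows.fetchall()


# Hypothetical sample data
now = datetime(2026, 3, 1, 12, 0, tzinfo=timezone.utc)
record("user_x", "LOGIN_SUCCESS", "USER", now - timedelta(days=45))
record("user_x", "DATA_EXPORT", "PAYMENT", now - timedelta(days=3))
record("user_y", "LOGIN_FAILURE", "USER", now - timedelta(days=1))

recent = actions_in_last_days("user_x", 30, now)

try:
    conn.execute("DELETE FROM audit_log")
    tamper_blocked = False
except sqlite3.DatabaseError:
    tamper_blocked = True
```

&lt;p&gt;Here &lt;code&gt;recent&lt;/code&gt; contains only the export from three days ago, and the attempted delete is rejected by the trigger. In PostgreSQL the same idea is usually expressed with revoked UPDATE and DELETE privileges on the audit table.&lt;/p&gt;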

&lt;p&gt;The service function:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;export async function createAuditLog(data: {
  userId?: string;
  action: string;
  entity: string;
  entityId?: string;
  before?: any;
  after?: any;
  ipAddress: string;
  userAgent: string;
  requestId: string;
}) {
  return prisma.auditLog.create({
    data: {
      ...data,
      before: data.before ? JSON.parse(JSON.stringify(data.before)) : undefined,
      after: data.after ? JSON.parse(JSON.stringify(data.after)) : undefined,
    },
  });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Quick test: Can you tell me exactly what admin actions were performed on your system last Tuesday? If not, you'd fail this check.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Rate Limiting Per Endpoint
Global rate limiting — "100 requests per minute per IP" — is a start. But it's not enough.
Each endpoint has different attack vectors and different acceptable thresholds:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import rateLimit from 'express-rate-limit';

// Login: prevent brute force
export const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 5,                    // 5 attempts
  keyGenerator: (req) =&amp;gt; req.ip,
  message: 'Too many login attempts. Try again in 15 minutes.',
});

// Password reset: prevent email bombing
export const passwordResetLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 3,
  keyGenerator: (req) =&amp;gt; req.body.email || req.ip,
  message: 'Too many password reset requests.',
});

// API: general usage
export const apiLimiter = rateLimit({
  windowMs: 60 * 1000,      // 1 minute
  max: 100,
  keyGenerator: (req) =&amp;gt; req.user?.id || req.ip,
});

// Registration: prevent mass account creation
export const registrationLimiter = rateLimit({
  windowMs: 60 * 60 * 1000,
  max: 10,
  keyGenerator: (req) =&amp;gt; req.ip,
});

// MFA verification: prevent code guessing
export const mfaLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 3,
  keyGenerator: (req) =&amp;gt; req.sessionID || req.ip,
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Apply them individually:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;app.post('/api/auth/login', loginLimiter, authController.login);
app.post('/api/auth/reset-password', passwordResetLimiter, authController.resetPassword);
app.post('/api/auth/verify-mfa', mfaLimiter, authController.verifyMfa);
app.post('/api/auth/register', registrationLimiter, authController.register);
app.use('/api/', apiLimiter);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Quick test: Could someone attempt 1,000 logins to your API right now? If yes, you'd fail this check.&lt;/p&gt;
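&lt;p&gt;To show what a per-endpoint limiter does under the hood, here is a rough fixed-window counter in Python. It is a sketch under stated assumptions, not how express-rate-limit is implemented: the class name, the in-memory dictionary, and the injectable clock are all hypothetical, and a real deployment would need a shared store such as Redis.&lt;/p&gt;

```python
import time


class FixedWindowLimiter:
    """In-memory fixed-window counter (illustrative; not shared across processes)."""

    def __init__(self, window_seconds, max_requests, clock=time.monotonic):
        self.window_seconds = window_seconds
        self.max_requests = max_requests
        self.clock = clock
        self._hits = {}  # key -> (window_start, count)

    def allow(self, key):
        now = self.clock()
        start, count = self._hits.get(key, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one
            start, count = now, 0
        if count >= self.max_requests:
            return False
        self._hits[key] = (start, count + 1)
        return True


# One limiter per endpoint, with thresholds taken from the examples above
LIMITERS = {
    "login": FixedWindowLimiter(15 * 60, 5),
    "password_reset": FixedWindowLimiter(60 * 60, 3),
    "api": FixedWindowLimiter(60, 100),
}

# Demonstration with a fake clock so the window can be advanced instantly
fake_now = [0.0]
login = FixedWindowLimiter(15 * 60, 5, clock=lambda: fake_now[0])
attempts = [login.allow("203.0.113.7") for _ in range(6)]

fake_now[0] = 15 * 60  # fifteen minutes later
allowed_after_window = login.allow("203.0.113.7")
```

&lt;p&gt;The first five attempts from the same IP pass, the sixth is refused, and the counter resets once the 15-minute window has elapsed.&lt;/p&gt;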

&lt;ol&gt;
&lt;li&gt;PII in Logs — The Silent Compliance Killer
I found this pattern in a real production codebase:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;// ❌ WRONG: this logs passwords and emails in plain text
logger.info("User login attempt:", { email, password });
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Passwords. In. Logs. Sitting in CloudWatch, Datadog, or a log file on a server somewhere, accessible to anyone with log access.&lt;br&gt;
The fix is a PII masking format for your logger:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import winston from 'winston';

const sensitiveFields = ['password', 'token', 'secret', 'authorization',
  'creditCard', 'ssn', 'refreshToken'];

const emailRegex = /([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})/g;

// Recursively redact sensitive fields and mask email addresses
function maskValue(value: unknown, key = ''): unknown {
  if (sensitiveFields.includes(key)) return '[REDACTED]';
  if (typeof value === 'string') {
    return value.replace(emailRegex, (_match, local, domain) =&amp;gt; `${local[0]}***@${domain}`);
  }
  if (Array.isArray(value)) return value.map((item) =&amp;gt; maskValue(item));
  if (typeof value === 'object' &amp;amp;&amp;amp; value !== null) {
    return Object.fromEntries(
      Object.entries(value).map(([k, v]) =&amp;gt; [k, maskValue(v, k)])
    );
  }
  return value;
}

const piiMaskingFormat = winston.format((info) =&amp;gt; {
  // Mutate info in place: a JSON round-trip would drop the Symbol keys winston relies on
  for (const key of Object.keys(info)) {
    info[key] = maskValue(info[key], key);
  }
  return info;
});

export const logger = winston.createLogger({
  level: process.env.LOG_LEVEL || 'info',
  format: winston.format.combine(
    piiMaskingFormat(),
    winston.format.timestamp(),
    winston.format.json()
  ),
  transports: [
    new winston.transports.File({ filename: 'logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: 'logs/combined.log' }),
  ],
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Now &lt;code&gt;logger.info("Login:", { email: "john@example.com", password: "secret123" })&lt;/code&gt; outputs (level and timestamp omitted):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;{ "message": "Login:", "email": "j***@example.com", "password": "[REDACTED]" }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Quick test: Search your logs for @gmail.com or @yahoo.com right now. If you find full email addresses, you have a GDPR problem.&lt;/p&gt;
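&lt;p&gt;The masking rule itself is independent of winston. As a rough sketch, here is the same idea in plain Python, which can be handy for scrubbing existing log records. The function name &lt;code&gt;mask_pii&lt;/code&gt; and the nested sample record are assumptions for illustration; the field list and the email pattern follow the ones above.&lt;/p&gt;

```python
import re

SENSITIVE_FIELDS = {"password", "token", "secret", "authorization",
                    "creditCard", "ssn", "refreshToken"}
EMAIL_RE = re.compile(r"([a-zA-Z0-9._%+-]+)@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,})")


def mask_pii(value, key=None):
    """Redact sensitive fields and mask email addresses, recursing into containers."""
    if key in SENSITIVE_FIELDS:
        return "[REDACTED]"
    if isinstance(value, dict):
        return {k: mask_pii(v, k) for k, v in value.items()}
    if isinstance(value, list):
        return [mask_pii(v) for v in value]
    if isinstance(value, str):
        # Keep the first character of the local part and the full domain
        return EMAIL_RE.sub(lambda m: m.group(1)[0] + "***@" + m.group(2), value)
    return value


# Hypothetical log record
record = {
    "message": "Login:",
    "email": "john@example.com",
    "password": "secret123",
    "meta": {"token": "abc", "note": "cc jane.doe@corp.example.org"},
}
masked = mask_pii(record)
```

&lt;p&gt;After the call, &lt;code&gt;masked["email"]&lt;/code&gt; is &lt;code&gt;j***@example.com&lt;/code&gt;, the password and the nested token are replaced with &lt;code&gt;[REDACTED]&lt;/code&gt;, and the address embedded in the free-text note is masked as well.&lt;/p&gt;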

&lt;ol&gt;
&lt;li&gt;Security Headers — 2 Lines of Code, Massive Impact
Helmet.js handles this in Express:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import helmet from 'helmet';

app.use(helmet({
  contentSecurityPolicy: {
    directives: {
      defaultSrc: ["'self'"],
      scriptSrc: ["'self'"],
      styleSrc: ["'self'", "'unsafe-inline'"],
      imgSrc: ["'self'", 'data:', 'https:'],
    },
  },
  hsts: {
    maxAge: 31536000,
    includeSubDomains: true,
    preload: true,
  },
}));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;This gives you: X-Content-Type-Options, X-Frame-Options, Strict-Transport-Security, Content-Security-Policy, and more. Each header blocks a specific class of attacks.&lt;br&gt;
CORS configuration matters too:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import cors from 'cors';

app.use(cors({
  origin: process.env.FRONTEND_URL, // NOT "*"
  credentials: true,
  methods: ['GET', 'POST', 'PUT', 'DELETE', 'PATCH'],
  allowedHeaders: ['Content-Type', 'Authorization'],
}));
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;code&gt;origin: "*"&lt;/code&gt; is the most common security misconfiguration I see in SaaS codebases. It lets any website call your API from a visitor's browser and read the response.&lt;br&gt;
Quick test: Run &lt;code&gt;curl -I https://your-app.com&lt;/code&gt; and check the response headers. No Strict-Transport-Security? You'd fail.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Input Validation at the Edge
Every request should be validated before it reaches your controller. Not inside the controller — at the middleware layer:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;import { z } from 'zod';
import { Request, Response, NextFunction } from 'express';

// Define schema
const registerSchema = z.object({
  email: z.string().email(),
  password: z.string()
    .min(8, 'Password must be at least 8 characters')
    .regex(/[A-Z]/, 'Must contain uppercase letter')
    .regex(/[a-z]/, 'Must contain lowercase letter')
    .regex(/[0-9]/, 'Must contain number'),
  name: z.string().min(1).max(100),
});

// Validation middleware factory
export function validate(schema: z.ZodSchema) {
  return (req: Request, res: Response, next: NextFunction) =&amp;gt; {
    const result = schema.safeParse(req.body);
    if (!result.success) {
      return res.status(400).json({
        success: false,
        errors: result.error.issues.map(i =&amp;gt; ({
          field: i.path.join('.'),
          message: i.message,
        })),
      });
    }
    req.body = result.data; // Use parsed/cleaned data
    next();
  };
}

// Usage
app.post('/api/auth/register',
  registrationLimiter,
  validate(registerSchema),
  authController.register
);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Your controller now receives guaranteed clean, typed data. Zero "cannot read property of undefined" errors. And Prisma's parameterized queries handle SQL injection protection at the ORM layer.&lt;br&gt;
Quick test: Send &lt;code&gt;{"email": "not-an-email", "password": "1"}&lt;/code&gt; to your registration endpoint. Does it return a clear validation error, or does it crash?&lt;/p&gt;
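&lt;p&gt;For readers outside the Express and Zod ecosystem, the same validate-before-the-controller idea can be sketched with nothing but Python's standard library. The function name, the simplified email pattern, and the error wording for the email and name fields are assumptions; the password rules follow the schema above.&lt;/p&gt;

```python
import re

# Deliberately simple pattern for illustration, not a full RFC 5322 check
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_registration(body):
    """Return (clean_data, errors); clean_data is None when validation fails."""
    errors = []

    email = body.get("email")
    if not isinstance(email, str) or not EMAIL_RE.match(email):
        errors.append({"field": "email", "message": "Invalid email"})

    password = body.get("password")
    if not isinstance(password, str):
        password = ""
    password_rules = [
        (len(password) >= 8, "Password must be at least 8 characters"),
        (re.search(r"[A-Z]", password), "Must contain uppercase letter"),
        (re.search(r"[a-z]", password), "Must contain lowercase letter"),
        (re.search(r"[0-9]", password), "Must contain number"),
    ]
    for passed, message in password_rules:
        if not passed:
            errors.append({"field": "password", "message": message})

    name = body.get("name")
    if not isinstance(name, str) or len(name) not in range(1, 101):
        errors.append({"field": "name", "message": "Name must be 1-100 characters"})

    if errors:
        return None, errors
    # Only the expected fields are passed on, mirroring req.body = result.data
    return {"email": email, "password": password, "name": name}, []


# The quick-test payload from above
clean, errors = validate_registration({"email": "not-an-email", "password": "1"})
error_fields = [e["field"] for e in errors]

ok_clean, ok_errors = validate_registration(
    {"email": "ada@example.com", "password": "Str0ngPass", "name": "Ada", "role": "admin"}
)
```

&lt;p&gt;The quick-test payload yields a list of field-level errors instead of an exception, and the valid payload comes back without the unexpected &lt;code&gt;role&lt;/code&gt; key.&lt;/p&gt;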

&lt;p&gt;Security Audit Readiness Score&lt;br&gt;
Score yourself honestly. 10 points each:&lt;/p&gt;

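&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Check&lt;/th&gt;
&lt;th&gt;Y/N&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;PII is encrypted at rest (not just HTTPS)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Audit logs capture every login attempt (success + failure)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Audit logs include before/after values for data changes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;Logs are append-only (immutable)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Rate limiting is per-endpoint, not just global&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;No PII in log files (emails, passwords, tokens)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Security headers set via Helmet.js (or equivalent)&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;CORS is configured to specific origins (not "*")&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;All input is validated at the middleware layer&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;You can query "what did User X do last Tuesday?" in seconds&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;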

&lt;p&gt;Scoring:&lt;/p&gt;

&lt;p&gt;90-100: Audit-ready. You're ahead of most SaaS products.&lt;br&gt;
60-80: Solid foundation, gaps to close. Prioritize encryption and logging.&lt;br&gt;
30-50: Significant work needed. Start with Helmet.js and rate limiting — quickest wins.&lt;br&gt;
0-20: Honest. And fixable. Every item above is buildable in a day or less.&lt;/p&gt;

&lt;p&gt;The difference between "secure" and "audit-ready" is documentation and traceability. If you can't prove it's secure, it's not secure enough.&lt;br&gt;
Run these checks against your own app. Score honestly.&lt;br&gt;
What did you get?&lt;/p&gt;

</description>
      <category>security</category>
      <category>webdev</category>
      <category>saas</category>
      <category>ai</category>
    </item>
    <item>
      <title>I Switched From Pure Vector Search to Hybrid Retrieval in My RAG System — Here's What Changed</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Thu, 26 Feb 2026 21:51:00 +0000</pubDate>
      <link>https://dev.to/vapmail16/i-switched-from-pure-vector-search-to-hybrid-retrieval-in-my-rag-system-heres-what-changed-d16</link>
      <guid>https://dev.to/vapmail16/i-switched-from-pure-vector-search-to-hybrid-retrieval-in-my-rag-system-heres-what-changed-d16</guid>
      <description>&lt;p&gt;I've been building RAG (Retrieval-Augmented Generation) systems for a while now, and I recently made one change that boosted my retrieval accuracy from ~60% to ~85%.&lt;/p&gt;

&lt;p&gt;The change? Adding BM25 keyword matching alongside my existing vector search.&lt;/p&gt;

&lt;p&gt;That's it. No fancy model swaps. No expensive rerankers. Just combining two search strategies that complement each other's blind spots.&lt;/p&gt;

&lt;p&gt;Let me walk you through exactly what happened, why it works, and what I learned from other engineers running RAG in production.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Problem With Pure Vector Search
&lt;/h2&gt;

&lt;p&gt;Vector search (using embeddings + cosine similarity) is incredible at understanding &lt;em&gt;meaning&lt;/em&gt;. Ask it for "employee vacation policy" and it'll find documents about "time off benefits," "annual leave," and "PTO guidelines."&lt;/p&gt;

&lt;p&gt;But here's the catch — it sometimes &lt;strong&gt;misses exact terminology&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;In my test set of 50 questions against internal documentation, I kept running into this pattern:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;User query:&lt;/strong&gt; "What's the PTO policy?"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vector search found:&lt;/strong&gt; Chunks about "vacation time," "time off benefits," "leave of absence"&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Vector search missed:&lt;/strong&gt; The exact chunk that contained the acronym "PTO"&lt;/p&gt;

&lt;p&gt;The embeddings understood the &lt;em&gt;concept&lt;/em&gt; of paid time off perfectly. But when a document used a specific acronym, abbreviation, or domain-specific term, vector search would sometimes grab semantically similar but &lt;em&gt;wrong&lt;/em&gt; chunks.&lt;/p&gt;

&lt;p&gt;This matters a lot in production. Your users don't speak in perfect semantic paragraphs — they use acronyms, product names, jargon, and exact phrases they remember from documents.&lt;/p&gt;




&lt;h2&gt;
  
  
  Enter Hybrid Retrieval: Vector + BM25
&lt;/h2&gt;

&lt;p&gt;BM25 is an old-school keyword matching algorithm. It doesn't understand meaning at all — it just finds documents that contain the exact words in your query. Think of it as a very sophisticated Ctrl+F.&lt;/p&gt;
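&lt;p&gt;To make the "sophisticated Ctrl+F" idea concrete, here is a small, self-contained BM25 scorer. It is a sketch rather than the implementation used in the article: whitespace tokenization, the Lucene-style IDF variant, the defaults &lt;code&gt;k1=1.5&lt;/code&gt; and &lt;code&gt;b=0.75&lt;/code&gt;, and the three sample chunks are all assumptions.&lt;/p&gt;

```python
import math
from collections import Counter


def tokenize(text):
    """Naive lowercase whitespace tokenizer (an assumption; real systems do more)."""
    return text.lower().split()


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score every document against the query with the classic BM25 formula."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Number of documents that contain each term at least once
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # BM25 only rewards exact term matches
            idf = math.log((n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5) + 1)
            length_norm = k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / (tf[term] + length_norm)
        scores.append(score)
    return scores


# Hypothetical chunks echoing the PTO example
docs = [
    "PTO accrues at 1.5 days per month",
    "Vacation time and leave of absence rules",
    "Time off benefits for full-time staff",
]
scores = bm25_scores("PTO policy", docs)
best = max(range(len(docs)), key=scores.__getitem__)
```

&lt;p&gt;Only the first chunk contains the literal token "pto", so it is the only one with a non-zero score. The semantically related chunks score zero, which is exactly the gap vector search fills.&lt;/p&gt;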

&lt;p&gt;&lt;strong&gt;Hybrid retrieval&lt;/strong&gt; combines both:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Vector search&lt;/strong&gt; finds chunks that are &lt;em&gt;semantically similar&lt;/em&gt; to your query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BM25&lt;/strong&gt; finds chunks that contain &lt;em&gt;exact keyword matches&lt;/em&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Reciprocal Rank Fusion (RRF)&lt;/strong&gt; merges both result sets into a single ranked list&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's a simplified view of how RRF works:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reciprocal_rank_fusion&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;bm25_results&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;60&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Merge two ranked lists using RRF.
    k=60 is the standard constant that controls 
    how much weight lower-ranked results get.
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;fused_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector_results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;fused_scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fused_scores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bm25_results&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;fused_scores&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fused_scores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;k&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;rank&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Sort by combined score — docs appearing in BOTH lists rank highest
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;fused_scores&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;items&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="k"&gt;lambda&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;reverse&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The beauty of RRF is that documents appearing in &lt;strong&gt;both&lt;/strong&gt; result sets naturally bubble to the top. If a chunk is semantically relevant AND contains exact keywords, it's almost certainly the right one.&lt;/p&gt;
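&lt;p&gt;A quick way to see this effect is to run RRF on two toy result lists. The variant below works on plain lists of chunk IDs rather than document objects, and the IDs themselves are made up for illustration.&lt;/p&gt;

```python
def rrf(rankings, k=60):
    """Fuse any number of ranked ID lists with Reciprocal Rank Fusion."""
    fused = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + 1 / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True), fused


# Hypothetical results for the query "What's the PTO policy?"
vector_hits = ["vacation_time", "time_off_benefits", "pto_policy", "leave_of_absence"]
bm25_hits = ["pto_policy", "pto_faq"]

order, fused_scores = rrf([vector_hits, bm25_hits])
top_chunk = order[0]
```

&lt;p&gt;Although &lt;code&gt;pto_policy&lt;/code&gt; is only third in the vector results, it is the one chunk present in both lists, so its combined score (1/63 + 1/61) beats the top vector hit (1/61) and it ends up first.&lt;/p&gt;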




&lt;h2&gt;
  
  
  My Results
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Pure Vector&lt;/th&gt;
&lt;th&gt;Hybrid (Vector + BM25)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Accuracy (50 questions)&lt;/td&gt;
&lt;td&gt;~60%&lt;/td&gt;
&lt;td&gt;~85%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Acronym/jargon queries&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Natural language queries&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;td&gt;Strong&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Implementation complexity&lt;/td&gt;
&lt;td&gt;Simple&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The 25-percentage-point jump came almost entirely from queries involving exact terminology — acronyms, product names, specific phrases, and domain jargon.&lt;/p&gt;

&lt;p&gt;For natural language queries like "how do I request time off?", both approaches performed similarly. The hybrid approach essentially kept vector search's strengths while patching its biggest weakness.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I Learned From Other Engineers
&lt;/h2&gt;

&lt;p&gt;After sharing these results, I got some fascinating insights from engineers running RAG in production. Here are the key takeaways:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. "It depends on your data" is annoyingly true
&lt;/h3&gt;

&lt;p&gt;For &lt;strong&gt;technical/QA documentation&lt;/strong&gt; (code, APIs, specs), BM25 alone can outperform vector search because queries tend to use exact terms. For &lt;strong&gt;enterprise/business documents&lt;/strong&gt; with natural language, vector search pulls more weight. Hybrid gives you the best of both worlds, but the &lt;em&gt;weighting&lt;/em&gt; between vector and BM25 scores should be tuned for your specific corpus.&lt;/p&gt;
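&lt;p&gt;One simple way to expose that weighting as a tunable knob is a weighted variant of RRF. This is a sketch of one possible approach, not something the engineers quoted here described: the &lt;code&gt;vector_weight&lt;/code&gt; parameter and the sample IDs are assumptions.&lt;/p&gt;

```python
def weighted_rrf(vector_ids, bm25_ids, vector_weight=0.5, k=60):
    """RRF where each list's contribution is scaled by a corpus-specific weight."""
    bm25_weight = 1.0 - vector_weight
    fused = {}
    for weight, ranking in ((vector_weight, vector_ids), (bm25_weight, bm25_ids)):
        for rank, doc_id in enumerate(ranking):
            fused[doc_id] = fused.get(doc_id, 0.0) + weight / (k + rank + 1)
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical case where the two retrievers disagree on the top hit
vector_ids = ["onboarding_overview", "api_reference"]
bm25_ids = ["api_reference", "onboarding_overview"]

business_docs_order = weighted_rrf(vector_ids, bm25_ids, vector_weight=0.8)
technical_docs_order = weighted_rrf(vector_ids, bm25_ids, vector_weight=0.2)
```

&lt;p&gt;With the weight tilted toward vectors the semantic favourite wins; tilted toward BM25 the exact-term match wins. The right value is something to sweep against your own evaluation set.&lt;/p&gt;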

&lt;h3&gt;
  
  
  2. Rerankers aren't always worth the latency
&lt;/h3&gt;

&lt;p&gt;Cross-encoder rerankers (a second-pass model that re-scores your top results) are often recommended as the next step after hybrid retrieval. But several production teams reported &lt;strong&gt;minimal improvement&lt;/strong&gt; when their initial retrieval was already solid. One engineer measured an NDCG of 0.74 on their hybrid setup and saw almost no gain from adding a reranker — so they dropped it to reduce latency.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; If your hybrid retrieval is already good, a reranker might just add 100-200ms of latency for marginal improvement. Measure before committing.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Chunk size is NOT a solved problem
&lt;/h3&gt;

&lt;p&gt;I started with ~500 characters and 100-character overlap, which works OK for my use case. But the consensus from production teams is clear: &lt;strong&gt;there is no universal best chunk size.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The right size depends on:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Document type&lt;/strong&gt;: Legal docs need larger chunks (full clauses); FAQs need smaller ones&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Query patterns&lt;/strong&gt;: How specific are your users' questions?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Content structure&lt;/strong&gt;: Is your data prose, tables, code, or mixed?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A good starting heuristic: think of chunks as "one complete thought." For prose, that's roughly a paragraph. For code, it's a function. For FAQs, it's a question-answer pair.&lt;/p&gt;
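&lt;p&gt;For reference, the fixed-size baseline mentioned earlier (about 500 characters with a 100-character overlap) can be sketched in a few lines. The function name and the synthetic input are assumptions; structure-aware splitting by paragraph or function would replace the character arithmetic.&lt;/p&gt;

```python
def chunk_text(text, chunk_size=500, overlap=100):
    """Split text into fixed-size character chunks that share `overlap` characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


# Synthetic 1,200-character document
text = "".join(str(i % 10) for i in range(1200))
chunks = chunk_text(text)
chunk_lengths = [len(c) for c in chunks]
```

&lt;p&gt;This produces three chunks of 500, 500, and 400 characters, where the last 100 characters of each chunk are repeated at the start of the next one.&lt;/p&gt;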

&lt;h3&gt;
  
  
  4. Dimensionality reduction is an underexplored lever
&lt;/h3&gt;

&lt;p&gt;One interesting approach I came across: reducing embedding vectors from their native dimensions (e.g., 1536 for OpenAI) down to 25-100 dimensions using PCA, UMAP, or similar methods. The goal is to strip away noisy dimensions and keep only the ones that carry meaningful signal.&lt;/p&gt;

&lt;p&gt;This could potentially &lt;strong&gt;improve accuracy AND speed&lt;/strong&gt; — smaller vectors mean faster similarity search and less noise in the matching. Worth experimenting with if you're at scale.&lt;/p&gt;




&lt;h2&gt;
  
  
  Getting Started With Hybrid Retrieval
&lt;/h2&gt;

&lt;p&gt;If you're running pure vector search and want to try hybrid, here's a practical starting point:&lt;/p&gt;

&lt;h3&gt;
  
  
  Option A: PostgreSQL + pgvector
&lt;/h3&gt;

&lt;p&gt;If you're already using Postgres, you can run both searches in the same database:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Vector search (pgvector)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;vector_distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;query_embedding&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Keyword search (built-in full-text search; ts_rank only approximates BM25, use ParadeDB for proper BM25)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ts_rank&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;plainto_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'PTO policy'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;bm25_score&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;to_tsvector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;@@&lt;/span&gt; &lt;span class="n"&gt;plainto_tsquery&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'PTO policy'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;bm25_score&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Merge results in application code using RRF&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Option B: Dedicated tools
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Weaviate&lt;/strong&gt;, &lt;strong&gt;Qdrant&lt;/strong&gt;, and &lt;strong&gt;Milvus&lt;/strong&gt; all support hybrid search natively&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;LangChain&lt;/strong&gt; and &lt;strong&gt;LlamaIndex&lt;/strong&gt; have hybrid retriever implementations&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;NornicDB&lt;/strong&gt; (MIT licensed) handles the full pipeline — embedding, BM25, reranking — in-process with impressive latency (~7ms on 1M documents)&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Option C: Keep it simple
&lt;/h3&gt;

&lt;p&gt;If you want minimal infrastructure, just run:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Your existing vector DB for semantic search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Elasticsearch&lt;/strong&gt; or even SQLite FTS5 for BM25&lt;/li&gt;
&lt;li&gt;Merge results with the RRF function above&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's not elegant, but it works. You can optimize later.&lt;/p&gt;




&lt;h2&gt;
  
  
  The 80/20 of RAG Optimization
&lt;/h2&gt;

&lt;p&gt;Here's what I've learned about where to spend your time:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;High impact, do first:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Switch from pure vector to hybrid retrieval&lt;/li&gt;
&lt;li&gt;Build a proper evaluation set (50+ questions with expected answers)&lt;/li&gt;
&lt;li&gt;Tune your chunk boundaries to match document structure&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Moderate impact, do second:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Experiment with chunk sizes for your specific data&lt;/li&gt;
&lt;li&gt;Try different embedding models&lt;/li&gt;
&lt;li&gt;Add metadata filtering (date, source, category)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Low impact unless at scale:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cross-encoder reranking (measure before committing)&lt;/li&gt;
&lt;li&gt;Dimensionality reduction on embeddings&lt;/li&gt;
&lt;li&gt;HyDE (Hypothetical Document Embeddings)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The single best investment? &lt;strong&gt;Build an eval set.&lt;/strong&gt; Without one, you're just guessing whether your changes help or hurt.&lt;/p&gt;
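&lt;p&gt;An eval harness does not need to be elaborate. The sketch below measures how often the expected chunk shows up in the top results. The record layout, the &lt;code&gt;hit_rate&lt;/code&gt; name, and the two stand-in retrievers are hypothetical; in practice &lt;code&gt;retrieve&lt;/code&gt; would call your real search pipeline.&lt;/p&gt;

```python
def hit_rate(eval_set, retrieve, top_k=5):
    """Fraction of questions whose expected chunk appears in the top_k results."""
    hits = 0
    for item in eval_set:
        retrieved = retrieve(item["question"])[:top_k]
        if item["expected_chunk_id"] in retrieved:
            hits += 1
    return hits / len(eval_set)


# Hypothetical four-question eval set
eval_set = [
    {"question": "What's the PTO policy?", "expected_chunk_id": "pto_policy"},
    {"question": "How do I request time off?", "expected_chunk_id": "time_off_request"},
    {"question": "What is the SSO setup?", "expected_chunk_id": "sso_setup"},
    {"question": "Who approves expenses?", "expected_chunk_id": "expense_approval"},
]

# Stand-in retrievers backed by canned results instead of a real index
canned_vector = {
    "What's the PTO policy?": ["vacation_time", "leave_of_absence"],
    "How do I request time off?": ["time_off_request"],
    "What is the SSO setup?": ["login_help"],
    "Who approves expenses?": ["expense_approval"],
}
canned_hybrid = dict(canned_vector)
canned_hybrid["What's the PTO policy?"] = ["pto_policy", "vacation_time"]

vector_accuracy = hit_rate(eval_set, lambda q: canned_vector.get(q, []))
hybrid_accuracy = hit_rate(eval_set, lambda q: canned_hybrid.get(q, []))
```

&lt;p&gt;On this toy data the stand-in vector retriever scores 0.5 and the hybrid one 0.75. Rerunning the same function after every change to chunking, weights, or models is what turns tuning from guesswork into measurement.&lt;/p&gt;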




&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;Hybrid retrieval isn't new or cutting-edge — BM25 has been around since 1994. But combining it with modern vector search is one of those "boring but effective" improvements that should probably be the default for any production RAG system.&lt;/p&gt;

&lt;p&gt;If you're running RAG with pure vector search and your accuracy isn't where you want it to be, try adding BM25 before you reach for more complex solutions. It took me about half a day to implement and delivered a bigger improvement than anything else I've tried.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's your RAG setup look like?&lt;/strong&gt; Are you running pure vector, hybrid, or something else entirely? I'd love to hear what's working (or not working) for you in production.&lt;/p&gt;





</description>
      <category>rag</category>
      <category>ai</category>
      <category>machinelearning</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>Remote-to-Remote PostgreSQL Migration: Theory and Practice</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Wed, 25 Feb 2026 00:46:12 +0000</pubDate>
      <link>https://dev.to/vapmail16/remote-to-remote-postgresql-migration-theory-and-practice-424n</link>
      <guid>https://dev.to/vapmail16/remote-to-remote-postgresql-migration-theory-and-practice-424n</guid>
      <description>&lt;h1&gt;
  
  
  Remote-to-Remote PostgreSQL Migration: Theory and Practice
&lt;/h1&gt;

&lt;p&gt;A guide for anyone who needs to clone a PostgreSQL database from one server to another—whether you use Prisma, another ORM, or raw SQL. We explain the &lt;strong&gt;concepts&lt;/strong&gt; first, then show &lt;strong&gt;concrete scripts&lt;/strong&gt; you can adapt. No prior knowledge of our app is required.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Who this is for:&lt;/strong&gt; Developers or ops people planning a one-off or repeatable database move (new host, new region, new environment). The theory applies to any relational DB; the code examples use PostgreSQL and Prisma and can be translated to other stacks.&lt;/p&gt;




&lt;h2&gt;
  
  
  Theory: What is “remote-to-remote” migration?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Remote-to-remote&lt;/strong&gt; means copying a database from one &lt;strong&gt;already running&lt;/strong&gt; database server (the &lt;em&gt;source&lt;/em&gt;) to another &lt;strong&gt;already running&lt;/strong&gt; server (the &lt;em&gt;target&lt;/em&gt;). Both are “remote” in the sense that they live on a host you connect to over the network—as opposed to dumping on your laptop and restoring elsewhere.&lt;/p&gt;

&lt;p&gt;You might do this when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Changing hosting or region&lt;/strong&gt; — Moving from one cloud provider or region to another (e.g. new AWS region, new PaaS org).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Disaster recovery or failover&lt;/strong&gt; — Bringing up a replica or standby in a different datacenter.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Environment cloning&lt;/strong&gt; — Creating a staging or QA database that mirrors production.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance or isolation&lt;/strong&gt; — Moving data to a new tenant or organisation boundary (e.g. new “org” in a multi-tenant platform).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In all cases the goal is the same: the &lt;strong&gt;target&lt;/strong&gt; should end up with the same &lt;strong&gt;schema&lt;/strong&gt; (tables, columns, constraints, indexes, enums, triggers) and the same &lt;strong&gt;data&lt;/strong&gt; (rows) as the source, so you can point your application at the new database and keep running.&lt;/p&gt;




&lt;h2&gt;
  
  
  Theory: Schema vs data
&lt;/h2&gt;

&lt;p&gt;A PostgreSQL database has two main parts:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Schema (structure)&lt;/strong&gt; — Tables, columns, data types, primary keys, foreign keys, unique and check constraints, indexes, custom types (e.g. enums), sequences, triggers, functions. This is the “shape” of your data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data&lt;/strong&gt; — The actual rows in those tables.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you migrate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You must establish the &lt;strong&gt;schema&lt;/strong&gt; on the target first (so tables and constraints exist).&lt;/li&gt;
&lt;li&gt;Then you &lt;strong&gt;copy the data&lt;/strong&gt; in an order that respects foreign keys (parent rows before child rows).&lt;/li&gt;
&lt;li&gt;Finally you &lt;strong&gt;verify&lt;/strong&gt; that the target matches the source (same tables, same row counts, and optionally same schema objects).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you use an ORM or migration tool (e.g. Prisma, Flyway, Liquibase), the “schema” on the target is usually created by &lt;strong&gt;running your migrations&lt;/strong&gt; against the new database. That way the target is defined by the same migration history as the rest of your project, not by a one-off dump.&lt;/p&gt;




&lt;h2&gt;
  
  
  Theory: Two ways to get schema + data to the target
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Approach&lt;/th&gt;
&lt;th&gt;How schema gets to target&lt;/th&gt;
&lt;th&gt;How data gets to target&lt;/th&gt;
&lt;th&gt;Best when&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;A. Deploy then copy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Run your migration tool (e.g. &lt;code&gt;prisma migrate deploy&lt;/code&gt;) on the target&lt;/td&gt;
&lt;td&gt;Application-level copy: connect to both DBs, read rows from source, insert into target in FK order&lt;/td&gt;
&lt;td&gt;You want to avoid depending on &lt;code&gt;pg_dump&lt;/code&gt;/&lt;code&gt;pg_restore&lt;/code&gt; or your local Postgres client version. Works with any stack that can connect to both DBs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;B. Full dump/restore&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;pg_dump&lt;/code&gt; (schema + data) from source → &lt;code&gt;pg_restore&lt;/code&gt; into target&lt;/td&gt;
&lt;td&gt;Same dump file carries both&lt;/td&gt;
&lt;td&gt;You’re comfortable with &lt;code&gt;pg_dump&lt;/code&gt; and your client version is &lt;strong&gt;≥&lt;/strong&gt; the server’s. One-shot clone.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Why Option A is often safer:&lt;/strong&gt; &lt;code&gt;pg_dump&lt;/code&gt; and &lt;code&gt;pg_restore&lt;/code&gt; are tied to a specific PostgreSQL &lt;strong&gt;client&lt;/strong&gt; version. If the &lt;strong&gt;server&lt;/strong&gt; is newer (e.g. 17) and your laptop has 15, &lt;code&gt;pg_dump&lt;/code&gt; refuses to dump from it (it aborts with a server version mismatch error), and an older &lt;code&gt;pg_restore&lt;/code&gt; may be unable to read an archive written by a newer &lt;code&gt;pg_dump&lt;/code&gt;. With Option A, your app (e.g. Node + Prisma) talks to both databases through the same driver using plain SQL, so the version of the client tools on your machine no longer matters.&lt;/p&gt;




&lt;h2&gt;
  
  
  Theory: Foreign keys and copy order
&lt;/h2&gt;

&lt;p&gt;PostgreSQL (and any relational DB) enforces &lt;strong&gt;foreign key&lt;/strong&gt; constraints: a row in a “child” table cannot reference a non-existent row in the “parent” table. So you must insert &lt;strong&gt;parents before children&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Example: if &lt;code&gt;sessions&lt;/code&gt; has &lt;code&gt;user_id&lt;/code&gt; → &lt;code&gt;users.id&lt;/code&gt;, you must copy all &lt;code&gt;users&lt;/code&gt; rows before any &lt;code&gt;sessions&lt;/code&gt; rows. So the &lt;strong&gt;order of tables&lt;/strong&gt; in your data copy script is critical. You need a list of tables in &lt;strong&gt;dependency order&lt;/strong&gt; (topological order with respect to FKs). You can derive this from your schema or from the database’s &lt;code&gt;information_schema&lt;/code&gt; and &lt;code&gt;pg_catalog&lt;/code&gt;. In the code below we use a fixed list that we keep in sync with our schema.&lt;/p&gt;
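
&lt;p&gt;If you would rather derive the order than maintain it by hand, a depth-first topological sort is enough. The sketch below is not part of our scripts: &lt;code&gt;tablesInFkOrder&lt;/code&gt; and the &lt;code&gt;deps&lt;/code&gt; map are hypothetical names, and the map (table name to the tables it references) is assumed to have been built already, for example from &lt;code&gt;information_schema&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical helper: order tables so every parent comes before its children.
// `deps` maps a table name to the tables it references through foreign keys.
type Deps = { [table: string]: string[] };

function tablesInFkOrder(deps: Deps): string[] {
  const ordered: string[] = [];
  const visiting = new Set(); // tables on the current DFS path
  const done = new Set();     // tables already placed in `ordered`

  const visit = (table: string): void => {
    if (done.has(table)) return;
    if (visiting.has(table)) {
      throw new Error(`Circular foreign key dependency involving "${table}"`);
    }
    visiting.add(table);
    for (const parent of deps[table] ?? []) {
      // A self-reference does not affect the order of tables.
      if (parent !== table) visit(parent);
    }
    visiting.delete(table);
    done.add(table);
    ordered.push(table);
  };

  // Sort the keys so the result is deterministic between runs.
  Object.keys(deps).sort().forEach(visit);
  return ordered;
}

// Assumed example schema: sessions and audit_logs both reference users.
const order = tablesInFkOrder({
  sessions: ["users"],
  users: [],
  audit_logs: ["users"],
});
console.log(order.join(" -> ")); // users -> audit_logs -> sessions
```

&lt;p&gt;A cycle between tables makes the function throw instead of guessing an order; in that situation you would need deferred constraints or a two-pass copy, which this sketch does not cover.&lt;/p&gt;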




&lt;h2&gt;
  
  
  Theory: PostgreSQL enums and “migrate-only” targets
&lt;/h2&gt;

&lt;p&gt;PostgreSQL supports &lt;strong&gt;custom enum types&lt;/strong&gt;. Once a type exists, you can add new values with &lt;code&gt;ALTER TYPE ... ADD VALUE&lt;/code&gt;. But &lt;strong&gt;you cannot add a value to an enum that doesn’t exist&lt;/strong&gt;. If the target database was created &lt;strong&gt;only&lt;/strong&gt; by running migrations (no full clone from source), it’s possible that no earlier migration ever created a certain enum (e.g. it was added later in the app’s life, outside the migration history). In that case, a migration that only does &lt;code&gt;ALTER TYPE "ChartType" ADD VALUE 'TRANSIT_MOON'&lt;/code&gt; will &lt;strong&gt;fail&lt;/strong&gt; on that target with a “type does not exist” error.&lt;/p&gt;

&lt;p&gt;So the safe pattern in a migration is: &lt;strong&gt;if the enum doesn’t exist, create it with all values; else add the new values&lt;/strong&gt;. That way the same migration works both on (a) a DB that already had the enum (e.g. cloned from production) and (b) a DB that was built from scratch by running migrations only.&lt;/p&gt;
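
&lt;p&gt;One way to express that pattern is to generate the migration SQL from the list of enum values. This is a minimal sketch under assumptions, not our actual migration: &lt;code&gt;enumMigrationSql&lt;/code&gt;, &lt;code&gt;quoteIdent&lt;/code&gt; and &lt;code&gt;quoteLiteral&lt;/code&gt; are hypothetical helpers, &lt;code&gt;'NATAL'&lt;/code&gt; is an illustrative value, and the &lt;code&gt;pg_type&lt;/code&gt; lookup matches on type name only (it ignores the schema).&lt;/p&gt;

```typescript
// Hypothetical helpers that render the SQL for one "create or extend" enum migration.
function quoteLiteral(value: string): string {
  // SQL string literal: wrap in single quotes and double any embedded quote.
  return "'" + value.replace(/'/g, "''") + "'";
}

function quoteIdent(name: string): string {
  // SQL identifier: wrap in double quotes and double any embedded quote.
  return '"' + name.replace(/"/g, '""') + '"';
}

function enumMigrationSql(typeName: string, allValues: string[], newValues: string[]): string {
  const type = quoteIdent(typeName);
  const lines = [
    // Branch 1: the type is missing (migrate-only target), so create it with every value.
    "DO $$",
    "BEGIN",
    `  IF NOT EXISTS (SELECT 1 FROM pg_type WHERE typname = ${quoteLiteral(typeName)}) THEN`,
    `    CREATE TYPE ${type} AS ENUM (${allValues.map((v) => quoteLiteral(v)).join(", ")});`,
    "  END IF;",
    "END",
    "$$;",
  ];
  // Branch 2: the type already existed (cloned target), so add only what is missing.
  // IF NOT EXISTS turns this into a no-op when branch 1 just created the full type.
  for (const value of newValues) {
    lines.push(`ALTER TYPE ${type} ADD VALUE IF NOT EXISTS ${quoteLiteral(value)};`);
  }
  return lines.join("\n");
}

const sql = enumMigrationSql("ChartType", ["NATAL", "TRANSIT_MOON"], ["TRANSIT_MOON"]);
console.log(sql);
```

&lt;p&gt;The generated text can be pasted into a migration file. On PostgreSQL versions before 12, &lt;code&gt;ALTER TYPE ... ADD VALUE&lt;/code&gt; cannot run inside a transaction block, so check how your migration tool wraps statements before relying on this.&lt;/p&gt;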




&lt;h2&gt;
  
  
  Theory: Verification — row counts and schema diff
&lt;/h2&gt;

&lt;p&gt;Copying data can fail silently (e.g. duplicate key skips, or a few rows error out). So you should &lt;strong&gt;verify&lt;/strong&gt; the target.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Row counts&lt;/strong&gt; — For each table, compare &lt;code&gt;COUNT(*)&lt;/code&gt; on source vs target. If they match (and the set of tables matches), you have strong evidence that data was copied. This is cheap and easy to automate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema diff&lt;/strong&gt; — Compare not just tables but primary keys, foreign keys, unique constraints, indexes, check constraints, enum types, sequences, and triggers. That way you catch “target missing an index” or “target has different enum values”. Comparing by &lt;strong&gt;meaning&lt;/strong&gt; (e.g. constraint clause, not constraint name) avoids false differences when constraint names are auto-generated differently on each DB.&lt;/li&gt;
&lt;/ol&gt;
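
&lt;p&gt;The row-count check reduces to comparing two maps of table name to count. Below is a small sketch of that comparison; &lt;code&gt;compareRowCounts&lt;/code&gt; is a hypothetical name and the numbers are made up. In a real run each map would be filled from &lt;code&gt;SELECT COUNT(*)&lt;/code&gt; on the corresponding database.&lt;/p&gt;

```typescript
type RowCounts = { [table: string]: number };

interface CountMismatch {
  table: string;
  source: number | null; // null means the table is missing on that side
  target: number | null;
}

// Report every table whose count differs or that exists on only one side.
function compareRowCounts(source: RowCounts, target: RowCounts): CountMismatch[] {
  const tables = Array.from(
    new Set([...Object.keys(source), ...Object.keys(target)])
  ).sort();
  const mismatches: CountMismatch[] = [];
  for (const table of tables) {
    const s = table in source ? source[table] : null;
    const t = table in target ? target[table] : null;
    if (s !== t) mismatches.push({ table, source: s, target: t });
  }
  return mismatches;
}

// Made-up counts: payments is missing on the target, sessions lost two rows.
const mismatches = compareRowCounts(
  { users: 120, sessions: 340, payments: 15 },
  { users: 120, sessions: 338 }
);
for (const m of mismatches) {
  console.log(`${m.table}: source=${m.source} target=${m.target}`);
}
```

&lt;p&gt;An empty result means the table sets and counts agree. It does not prove that the row contents are identical, which is why the schema diff (and, if needed, spot checks on the data) is still worth running.&lt;/p&gt;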

&lt;p&gt;Optional objects (e.g. a table used only by a trigger on the source) may exist only on one side. You can either apply the same trigger/table on the target, or explicitly &lt;strong&gt;exclude&lt;/strong&gt; those from the schema comparison so they don’t cause a failure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why we did it (our case)
&lt;/h2&gt;

&lt;p&gt;We had an existing app with a &lt;strong&gt;source&lt;/strong&gt; database (current production) and a &lt;strong&gt;target&lt;/strong&gt; database (new, empty) in a new organisation. Goal: clone schema and data to the target, verify parity, then point the app at the new DB—without long downtime or manual table-by-table copy. The theory above guided how we designed the scripts.&lt;/p&gt;




&lt;h2&gt;
  
  
  What we learned (short version)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Migration order matters.&lt;/strong&gt; If migrations depend on each other (e.g. GDPR tables before a migration that adds columns), rename the migration folders so they run in the right order (e.g. &lt;code&gt;20251210_1_add_gdpr_tables&lt;/code&gt;, &lt;code&gt;20251210_2_add_error_message&lt;/code&gt;).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enums and “migrate-only” DBs.&lt;/strong&gt; If the target DB was created only by running Prisma migrations (no prior full clone), the DB might not have an enum that was added later in the schema. Our migration had to &lt;strong&gt;create the enum if it doesn’t exist&lt;/strong&gt;, then add values—otherwise &lt;code&gt;ALTER TYPE ... ADD VALUE&lt;/code&gt; fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Avoid pg_dump version mismatch.&lt;/strong&gt; Using &lt;code&gt;pg_dump&lt;/code&gt;/&lt;code&gt;pg_restore&lt;/code&gt; from a client older than the server is unreliable: &lt;code&gt;pg_dump&lt;/code&gt; refuses to dump from a server newer than itself. We preferred a &lt;strong&gt;Node + Prisma&lt;/strong&gt; data copy so we didn’t depend on local PostgreSQL client versions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Copy order = FK order.&lt;/strong&gt; Copy tables in dependency order (parents before children) to satisfy foreign keys. We maintain a single ordered list and use raw SQL with proper enum/JSONB casting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verify with row counts and schema.&lt;/strong&gt; We run a row-count comparison and a separate schema comparison (PKs, FKs, indexes, constraints, enums, sequences, triggers) so we know the new DB is really in parity.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Triggers and optional objects.&lt;/strong&gt; Things like &lt;code&gt;user_deletion_trigger&lt;/code&gt; and &lt;code&gt;user_deletion_logs&lt;/code&gt; may exist only on the source. Document them and apply the same SQL on the target (or exclude them from schema compare) so “diff” doesn’t mean “wrong.”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Special characters in passwords.&lt;/strong&gt; Use URL-encoded passwords in &lt;code&gt;DATABASE_URL&lt;/code&gt; (e.g. &lt;code&gt;^&lt;/code&gt; → &lt;code&gt;%5E&lt;/code&gt;, &lt;code&gt;}&lt;/code&gt; → &lt;code&gt;%7D&lt;/code&gt;).&lt;/li&gt;
&lt;/ol&gt;
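
&lt;p&gt;For the last point, the encoding does not have to be done by hand. A minimal sketch using the standard &lt;code&gt;encodeURIComponent&lt;/code&gt; function; &lt;code&gt;buildDatabaseUrl&lt;/code&gt; and all connection values below are hypothetical placeholders.&lt;/p&gt;

```typescript
interface DbUrlParts {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Percent-encode the parts that may contain reserved characters,
// then assemble a postgresql:// connection URL.
function buildDatabaseUrl(p: DbUrlParts): string {
  const user = encodeURIComponent(p.user);
  const password = encodeURIComponent(p.password);
  const database = encodeURIComponent(p.database);
  return `postgresql://${user}:${password}@${p.host}:${p.port}/${database}`;
}

// Placeholder credentials containing the two characters mentioned above.
const url = buildDatabaseUrl({
  user: "app",
  password: "p^ss}word",
  host: "new-host",
  port: 5432,
  database: "newdb",
});
console.log(url); // postgresql://app:p%5Ess%7Dword@new-host:5432/newdb
```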




&lt;h2&gt;
  
  
  Two ways to do it (practice)
&lt;/h2&gt;

&lt;p&gt;Below we give concrete scripts for both approaches. You can reuse them for any PostgreSQL project; replace table lists and connection details with your own.&lt;/p&gt;

&lt;h3&gt;
  
  
  Option A: Deploy migrations on target, then copy data (recommended)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; Schema comes from your migration tool (e.g. Prisma); data is copied by an app that connects to both DBs in FK order; then we verify with row counts (and optionally schema diff). No dependency on local &lt;code&gt;pg_dump&lt;/code&gt; version.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Steps:&lt;/strong&gt; Deploy schema on target → copy data with a Node script → compare row counts (and optionally schema).&lt;/p&gt;

&lt;h3&gt;
  
  
  Option B: Full clone with pg_dump / pg_restore
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; One binary dump carries schema + data. Restore into target. Your local &lt;code&gt;pg_dump&lt;/code&gt;/&lt;code&gt;pg_restore&lt;/code&gt; version should be &lt;strong&gt;≥&lt;/strong&gt; the server’s PostgreSQL version.&lt;/p&gt;

&lt;p&gt;Below we focus on &lt;strong&gt;Option A&lt;/strong&gt; (full scripts), then give the exact commands for Option B.&lt;/p&gt;




&lt;h2&gt;
  
  
  Option A: Step-by-step with code
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Prerequisites
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Node.js, &lt;code&gt;tsx&lt;/code&gt;, and Prisma in your backend.&lt;/li&gt;
&lt;li&gt;Two env vars (or args): &lt;code&gt;SOURCE_DATABASE_URL&lt;/code&gt; (old DB) and &lt;code&gt;TARGET_DATABASE_URL&lt;/code&gt; (new, empty DB). Create the target DB on your host (e.g. dcdeploy) first.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2. Deploy schema on the target
&lt;/h3&gt;

&lt;p&gt;We deploy Prisma migrations on the &lt;strong&gt;target&lt;/strong&gt;, then run &lt;code&gt;db push&lt;/code&gt; so any tables that exist in the schema but aren’t in migrations (e.g. &lt;code&gt;profiles&lt;/code&gt;, &lt;code&gt;charts_cache&lt;/code&gt;) are created.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Orchestration script:&lt;/strong&gt; &lt;code&gt;backend/scripts/deploy-migrate-compare.sh&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;#!/usr/bin/env bash&lt;/span&gt;
&lt;span class="c"&gt;#&lt;/span&gt;
&lt;span class="c"&gt;# 1) Deploy Prisma migrations to the NEW (target) database&lt;/span&gt;
&lt;span class="c"&gt;# 2) Copy all data from OLD (source) remote to NEW (target) remote (Node script, no pg_dump)&lt;/span&gt;
&lt;span class="c"&gt;# 3) Compare source vs target to verify parity&lt;/span&gt;
&lt;span class="c"&gt;#&lt;/span&gt;
&lt;span class="nb"&gt;set&lt;/span&gt; &lt;span class="nt"&gt;-e&lt;/span&gt;

&lt;span class="nv"&gt;SOURCE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;1&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;$SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nv"&gt;TARGET_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;2&lt;/span&gt;&lt;span class="k"&gt;:-&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SOURCE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="nt"&gt;-z&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Usage: SOURCE_DATABASE_URL=... TARGET_DATABASE_URL=... &lt;/span&gt;&lt;span class="nv"&gt;$0&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi

&lt;/span&gt;&lt;span class="nv"&gt;SCRIPT_DIR&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;&lt;span class="nb"&gt;dirname&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;BASH_SOURCE&lt;/span&gt;&lt;span class="p"&gt;[0]&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;pwd&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;span class="nb"&gt;cd&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SCRIPT_DIR&lt;/span&gt;&lt;span class="s2"&gt;/.."&lt;/span&gt;

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"1) Deploy migrations to TARGET database"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
npx prisma migrate deploy

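&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"1b) Sync schema (create any tables not in migrations)"&lt;/span&gt;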
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
npx prisma db push

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"2) Copy data: source remote → target remote"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nv"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SOURCE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nv"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; npx tsx scripts/migrate-remote-to-remote-data.ts

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"3) Compare source vs target"&lt;/span&gt;
&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"=========================================="&lt;/span&gt;
&lt;span class="nv"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SOURCE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nv"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; npx tsx scripts/compare-remote-databases.ts

&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Done. New database is ready; point app to TARGET_DATABASE_URL."&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run from backend:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;backend
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'postgresql://user:pass@old-host:port/olddb'&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'postgresql://user:pass@new-host:port/newdb'&lt;/span&gt;   &lt;span class="c"&gt;# URL-encode password if needed&lt;/span&gt;
./scripts/deploy-migrate-compare.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Data copy script (Node + Prisma, no pg_dump)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; We copy tables in &lt;strong&gt;FK order&lt;/strong&gt; (parents first). For each row we build a parameterised &lt;code&gt;INSERT&lt;/code&gt;. PostgreSQL does not implicitly cast a text parameter to an enum or JSONB column, so enum parameters must be cast explicitly (e.g. &lt;code&gt;$1::"ChartType"&lt;/code&gt;) and JSONB parameters must be cast as &lt;code&gt;::jsonb&lt;/code&gt;; we read column metadata from &lt;code&gt;information_schema&lt;/code&gt; and add the right cast per column type. Duplicate key errors are treated as “already exists” (skip) so the script can be re-run safely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key ideas:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Two &lt;code&gt;PrismaClient&lt;/code&gt; instances (source and target).&lt;/li&gt;
&lt;li&gt;Tables listed in dependency order (users → audit_logs → sessions → …).&lt;/li&gt;
&lt;li&gt;For each table: &lt;code&gt;SELECT *&lt;/code&gt; from source; for each row, build &lt;code&gt;INSERT&lt;/code&gt; with &lt;code&gt;$n::"EnumType"&lt;/code&gt; for enum columns and &lt;code&gt;$n::jsonb&lt;/code&gt; for JSONB.&lt;/li&gt;
&lt;li&gt;Skip rows that already exist (by &lt;code&gt;id&lt;/code&gt;) to allow re-runs.&lt;/li&gt;
&lt;/ul&gt;
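
&lt;p&gt;To make the casting idea concrete before the full script, here is an isolated sketch of how a placeholder list can be derived from column metadata. The names &lt;code&gt;placeholderFor&lt;/code&gt; and &lt;code&gt;buildInsertSql&lt;/code&gt; and the example columns are hypothetical, and the sketch assumes every &lt;code&gt;USER-DEFINED&lt;/code&gt; column in &lt;code&gt;information_schema.columns&lt;/code&gt; is an enum.&lt;/p&gt;

```typescript
interface ColumnInfo {
  column_name: string;
  data_type: string; // "USER-DEFINED" for enum columns in information_schema.columns
  udt_name: string;  // underlying type name, e.g. "ChartType" or "jsonb"
}

// Return the positional placeholder for one column, with a cast where needed.
function placeholderFor(position: number, column: ColumnInfo): string {
  if (column.data_type === "USER-DEFINED") {
    return `$${position}::"${column.udt_name}"`; // enum: cast to the named type
  }
  if (column.udt_name === "jsonb") {
    return `$${position}::jsonb`;
  }
  return `$${position}`; // other types: no explicit cast in this sketch
}

// Build the INSERT statement text; row values are bound separately as parameters.
function buildInsertSql(tableName: string, columns: ColumnInfo[]): string {
  const names = columns.map((c) => `"${c.column_name}"`).join(", ");
  const params = columns.map((c, i) => placeholderFor(i + 1, c)).join(", ");
  return `INSERT INTO "${tableName}" (${names}) VALUES (${params})`;
}

// Hypothetical columns for illustration only.
const insertSql = buildInsertSql("charts_cache", [
  { column_name: "id", data_type: "text", udt_name: "text" },
  { column_name: "chart_type", data_type: "USER-DEFINED", udt_name: "ChartType" },
  { column_name: "payload", data_type: "jsonb", udt_name: "jsonb" },
]);
console.log(insertSql);
// INSERT INTO "charts_cache" ("id", "chart_type", "payload") VALUES ($1, $2::"ChartType", $3::jsonb)
```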

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;backend/scripts/migrate-remote-to-remote-data.ts&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="cm"&gt;/**
 * Copy data from one remote database to another (remote → remote).
 * Use after running Prisma migrations on the target. No pg_dump needed.
 *
 * Usage:
 *   SOURCE_DATABASE_URL="postgresql://..." TARGET_DATABASE_URL="postgresql://..." npx tsx scripts/migrate-remote-to-remote-data.ts
 */&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PrismaClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@prisma/client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sourceUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;sourceUrl&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;targetUrl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Set both SOURCE_DATABASE_URL and TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PrismaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sourceUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PrismaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Tables in FK order (adjust for your schema)&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;TABLES_IN_ORDER&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;audit_logs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;sessions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;password_resets&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;notifications&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;notification_preferences&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;chart_settings&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;payments&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;subscriptions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;payment_webhook_logs&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;payment_refunds&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data_export_requests&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data_deletion_requests&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;consent_records&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;profiles&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;charts_cache&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;matchings&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reports&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;saved_transits&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;aspect_relationships&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ai_predictions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;house_predictions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ascendant_analyses&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;];&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;migrateTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;migrated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;skipped&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$queryRawUnsafe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`SELECT * FROM "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;migrated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;skipped&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;columns&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$queryRawUnsafe&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;
      &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;data_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;udt_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s2"&gt;`SELECT column_name, data_type, udt_name FROM information_schema.columns 
       WHERE table_schema = 'public' AND table_name = $1
       ORDER BY ordinal_position`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;tableName&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;columnNames&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;column_name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isEnum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data_type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;USER-DEFINED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;udt_name&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isJson&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;c&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data_type&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;jsonb&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;udt_name&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;jsonb&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;migrated&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;skipped&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hasId&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$queryRawUnsafe&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s2"&gt;`SELECT 1 FROM "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" WHERE id = $1 LIMIT 1`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;
          &lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;existing&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;skipped&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;placeholders&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
        &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;columnNames&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;col&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;columnNames&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;col&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="kc"&gt;undefined&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="nx"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`"&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;col&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;row&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;col&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;val&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;object&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;isArray&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;val&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;val&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;val&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;string&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;val&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;val&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="nx"&gt;vals&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;val&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;colMeta&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;columns&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isEnum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;colMeta&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;placeholders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`$&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;::"&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;colMeta&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;udt_name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;isJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;colMeta&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;placeholders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`$&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;::jsonb`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;placeholders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`$&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
          &lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="nx"&gt;idx&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$executeRawUnsafe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="s2"&gt;`INSERT INTO "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;" (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;) VALUES (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;placeholders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;, &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
          &lt;span class="p"&gt;...&lt;/span&gt;&lt;span class="nx"&gt;vals&lt;/span&gt;
        &lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="nx"&gt;migrated&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
        &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;duplicate key&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unique constraint&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;skipped&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="nx"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
          &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`   Row error in &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;migrated&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;skipped&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;errors&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="na"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;unknown&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`   Table &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;migrated&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;skipped&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Connecting to source and target...&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✅ Connected&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;TABLES_IN_ORDER&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;migrateTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sym&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errors&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;⚠️&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✅&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sym&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: migrated=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;migrated&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, skipped=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;skipped&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, errors=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;errors&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s1"&gt;Data copy finished. Run compare script to verify.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$disconnect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$disconnect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



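&lt;p&gt;&lt;strong&gt;What to change for your project:&lt;/strong&gt; Set &lt;code&gt;TABLES_IN_ORDER&lt;/code&gt; to your tables in FK order (parents first), so that every referenced row already exists on the target when its child rows are inserted. You can work out the order from the &lt;code&gt;@relation&lt;/code&gt; fields in &lt;code&gt;schema.prisma&lt;/code&gt; or from the foreign keys in your generated migration SQL.&lt;/p&gt;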
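&lt;p&gt;If the table list is long, the parents-first order can be computed rather than written by hand. The sketch below is a small depth-first topological sort. It is not part of the original scripts: the helper name &lt;code&gt;orderTablesByForeignKeys&lt;/code&gt;, the &lt;code&gt;Deps&lt;/code&gt; type, and the example table names are all hypothetical, and it assumes you have already written down which tables each table references.&lt;/p&gt;

```typescript
// Hypothetical helper (not from the article's scripts): order tables so that
// every parent table comes before the tables that reference it.
// `deps` maps each table name to the tables it points at via foreign keys.
type Deps = { [table: string]: string[] };

function orderTablesByForeignKeys(deps: Deps): string[] {
  const ordered: string[] = [];
  const state: { [table: string]: 'visiting' | 'done' } = {};

  const visit = (table: string): void => {
    if (state[table] === 'done') return;
    if (state[table] === 'visiting') {
      // A cycle between different tables cannot be resolved by ordering alone.
      throw new Error(`Circular foreign key involving "${table}"`);
    }
    state[table] = 'visiting';
    for (const parent of deps[table] ?? []) {
      // A table referencing itself does not affect table order, so skip it.
      if (parent !== table) visit(parent);
    }
    state[table] = 'done';
    ordered.push(table);
  };

  // Sort the keys so the result is deterministic for a given input.
  for (const table of Object.keys(deps).sort()) visit(table);
  return ordered;
}

// Illustrative schema only; replace with your own tables and references.
const TABLES_IN_ORDER = orderTablesByForeignKeys({
  comments: ['posts', 'users'],
  posts: ['users'],
  users: [],
});

console.log(TABLES_IN_ORDER); // [ 'users', 'posts', 'comments' ]
```

&lt;p&gt;Self-references are skipped because they do not change table order, although the rows inside such a table may still need to be inserted parent first. A cycle between two different tables throws, since no insert order can satisfy it without deferring or temporarily dropping a constraint.&lt;/p&gt;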

&lt;h3&gt;
  
  
  4. Row-count comparison script
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; We verify parity with &lt;strong&gt;row counts&lt;/strong&gt; (see “Verification” above): list the tables in both databases, then compare &lt;code&gt;COUNT(*)&lt;/code&gt; for every table that exists in both. If a table is missing on the target or the counts differ, the script exits with a failure status so the migration is not treated as successful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;backend/scripts/compare-remote-databases.ts&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="cm"&gt;/**
 * Compare two remote databases: table list and row counts per table.
 * Usage: SOURCE_DATABASE_URL="..." TARGET_DATABASE_URL="..." npx tsx scripts/compare-remote-databases.ts
 */&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;dotenv&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;dotenv&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;PrismaClient&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;@prisma/client&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;dotenv&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;config&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sourceUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;sourceUrl&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;targetUrl&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Set both SOURCE_DATABASE_URL and TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PrismaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sourceUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PrismaClient&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;datasources&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;targetUrl&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getTables&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PrismaClient&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$queryRawUnsafe&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;tablename&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`SELECT tablename FROM pg_tables WHERE schemaname = 'public' AND tablename NOT LIKE '_prisma%' ORDER BY tablename`&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tablename&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;getRowCount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;PrismaClient&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;$queryRawUnsafe&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;bigint&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s2"&gt;`SELECT COUNT(*) as count FROM "&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"`&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;Number&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]?.&lt;/span&gt;&lt;span class="nx"&gt;count&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$connect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;sourceTables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getTables&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;targetTables&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;getTables&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;common&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sourceTables&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;targetTables&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;onlyInSource&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sourceTables&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;targetTables&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;onlyInSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;❌ Tables only in SOURCE:&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;onlyInSource&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;allMatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;common&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sort&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;sourceCount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;targetCount&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="nf"&gt;getRowCount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nf"&gt;getRowCount&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;match&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sourceCount&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;targetCount&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;match&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;allMatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`  &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;match&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✅&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;❌&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;: source=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;sourceCount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, target=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;targetCount&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;allMatch&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;onlyInSource&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;✅ All compared tables have matching row counts.&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
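    &lt;span class="c1"&gt;// Set the exit code instead of exiting here, so both clients still disconnect below.&lt;/span&gt;
    &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;exitCode&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;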
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;sourcePrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$disconnect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;targetPrisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$disconnect&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nf"&gt;main&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  5. Schema comparison (optional but recommended)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; Row counts tell you “same number of rows”; they don’t tell you “same indexes or triggers”. So we run a &lt;strong&gt;schema diff&lt;/strong&gt;: query both DBs for primary keys, foreign keys, unique constraints, indexes, check constraints (compared by &lt;strong&gt;table + clause&lt;/strong&gt;, not by name), enums, sequences, and triggers. Any difference fails the run. We exclude optional objects (e.g. a logging table that exists only on source) so that “only in source” doesn’t wrongly fail the run.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;backend/scripts/compare-remote-databases-schema.ts&lt;/code&gt; (concept)&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query both DBs for: primary keys, foreign keys, unique constraints, indexes, check constraints (by table + clause), enum types, sequences, triggers.&lt;/li&gt;
&lt;li&gt;Diff the sets; if something exists only in source or only in target (and isn’t in the ignore list), exit with code 1.&lt;/li&gt;
&lt;li&gt;Full script lives in the repo at &lt;code&gt;backend/scripts/compare-remote-databases-schema.ts&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
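
&lt;p&gt;The diff step itself is small once both schemas are reduced to sets of keys. Here is a minimal Python sketch of that idea, not the actual script: the key format (&lt;code&gt;kind:table:name&lt;/code&gt;), the &lt;code&gt;diff_schema&lt;/code&gt; helper, and the sample objects are all assumptions made up for illustration.&lt;/p&gt;

```python
# Hypothetical sketch: each schema object is reduced to a string key such as
# "index:users:users_email_key". The keys and the ignore list are illustrative.

def diff_schema(source, target, ignore=frozenset()):
    """Return (only_in_source, only_in_target) as sorted lists, skipping ignored keys."""
    only_in_source = sorted(set(source) - set(target) - set(ignore))
    only_in_target = sorted(set(target) - set(source) - set(ignore))
    return only_in_source, only_in_target


source_objects = {
    "pk:users:id",
    "index:users:users_email_key",
    "table:user_deletion_logs",  # optional logging table that exists on source only
    "trigger:users:log_user_deletion",
}
target_objects = {
    "pk:users:id",
    "index:users:users_email_key",
}
ignore_list = {"table:user_deletion_logs"}

missing, extra = diff_schema(source_objects, target_objects, ignore_list)
print("only in source:", missing)
print("only in target:", extra)

# Any remaining difference should fail the run, mirroring the exit code 1 described above.
exit_code = 1 if (missing or extra) else 0
print("exit code:", exit_code)
```

&lt;p&gt;In this sample the ignored logging table is skipped, but the trigger is still reported as missing on the target, so the run would fail.&lt;/p&gt;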

&lt;p&gt;Run after data copy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'...'&lt;/span&gt; &lt;span class="nv"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s1"&gt;'...'&lt;/span&gt; npx tsx scripts/compare-remote-databases-schema.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  6. Migration order and enum migration (lessons in code)
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; Migration tools run migrations in a defined order (usually by name/timestamp). If migration B depends on something migration A creates (e.g. a table), A must run first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Migration order:&lt;/strong&gt; Some of our migrations had to run in a specific order (e.g. the GDPR tables before a later migration that referenced them). We renamed the folders so that the prefix sorts them correctly, e.g.:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;code&gt;20251210_1_add_gdpr_tables&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;20251210_2_add_error_message&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;20251210_3_add_payment_tables&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;
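
&lt;p&gt;To sanity-check a naming scheme like this before deploying, you can sort the folder names yourself. The Python sketch below assumes the migration tool orders folders by plain string comparison of their names; verify that for your tool. The last two folder names are made up to show a pitfall.&lt;/p&gt;

```python
# Folder names taken from the list above, deliberately shuffled.
folders = [
    "20251210_3_add_payment_tables",
    "20251210_1_add_gdpr_tables",
    "20251210_2_add_error_message",
]

# Plain string sort: the order a name-ordered tool would apply them in.
ordered = sorted(folders)
for name in ordered:
    print(name)

# Caveat: string comparison is not numeric, so "_10_" sorts before "_2_".
# These two names are hypothetical and only illustrate the pitfall.
pitfall = sorted(["20251210_2_b", "20251210_10_a"])
print(pitfall)
```

&lt;p&gt;If you expect ten or more migrations on the same day, zero-padding the counter (&lt;code&gt;_01_&lt;/code&gt;, &lt;code&gt;_02_&lt;/code&gt;) keeps string order and numeric order in agreement.&lt;/p&gt;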

&lt;p&gt;&lt;strong&gt;Enum that might not exist:&lt;/strong&gt; (See “Theory: PostgreSQL enums” above.) On a DB that was created only by running migrations, the &lt;code&gt;ChartType&lt;/code&gt; enum sometimes didn’t exist when we added new values. So we changed the migration to &lt;strong&gt;create the enum if missing&lt;/strong&gt;, then add the values.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;File:&lt;/strong&gt; &lt;code&gt;backend/prisma/migrations/20260214000000_add_transit_chart_types/migration.sql&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create ChartType enum if it does not exist (e.g. when DB was created from migrations only).&lt;/span&gt;
&lt;span class="c1"&gt;-- Then ensure TRANSIT_LAGNA and TRANSIT_MOON exist for transit house predictions.&lt;/span&gt;
&lt;span class="k"&gt;DO&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pg_type&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;typname&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'ChartType'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;THEN&lt;/span&gt;
    &lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="nv"&gt;"ChartType"&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;ENUM&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="s1"&gt;'D1_LAGNA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D1_RASI'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D2_HORA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D3_DREKKANA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D7_SAPTAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'D9_NAVAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D10_DASAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D12_DWADASAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D16_SHODASAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'D20_VIMSAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D24_CHATURVIMSAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D27_BHAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D30_TRIMSAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'D40_KHAVEDAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D45_AKSHAVEDAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'D60_SHASHTYAMSA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="s1"&gt;'TRANSIT_LAGNA'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'TRANSIT_MOON'&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;ELSE&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="nv"&gt;"ChartType"&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="s1"&gt;'TRANSIT_LAGNA'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TYPE&lt;/span&gt; &lt;span class="nv"&gt;"ChartType"&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="n"&gt;VALUE&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;EXISTS&lt;/span&gt; &lt;span class="s1"&gt;'TRANSIT_MOON'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;END&lt;/span&gt; &lt;span class="n"&gt;IF&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



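&lt;p&gt;If you add new enum values in Prisma, keep in mind that on a “migrate-only” target the enum might not exist yet, so “create if not exists, then add value” is safer. One caveat: on PostgreSQL versions before 12, &lt;code&gt;ALTER TYPE ... ADD VALUE&lt;/code&gt; cannot run inside a transaction block, so check your server version if this migration fails there.&lt;/p&gt;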

&lt;h3&gt;
  
  
  7. Triggers and optional objects
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Theory in practice:&lt;/strong&gt; Triggers and their side effects (e.g. a log table) are part of the “schema” in a broad sense. If the source has them and the target doesn’t, the schema diff will show “only in source”. You can either &lt;strong&gt;apply the same SQL&lt;/strong&gt; on the target (so behaviour and schema match) or &lt;strong&gt;exclude&lt;/strong&gt; those objects from the comparison. We apply the trigger SQL so the target behaves like the source.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;psql &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; backend/scripts/add-user-deletion-trigger.sql
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The schema comparison script can exclude optional tables/sequences (e.g. &lt;code&gt;user_deletion_logs&lt;/code&gt;) so they don’t cause a “diff” failure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Option B: pg_dump / pg_restore (one-shot clone)
&lt;/h2&gt;

&lt;p&gt;Use when you want a single full clone and your local PostgreSQL client version is &lt;strong&gt;≥&lt;/strong&gt; the server’s.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Prerequisites:&lt;/strong&gt; &lt;code&gt;pg_dump&lt;/code&gt;, &lt;code&gt;pg_restore&lt;/code&gt; (and optionally &lt;code&gt;psql&lt;/code&gt;) on PATH. If your default client is older than the server, set:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PG_DUMP&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/path/to/pg_dump   &lt;span class="c"&gt;# e.g. /usr/local/opt/postgresql@17/bin/pg_dump&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;PG_RESTORE&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;/path/to/pg_restore
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;One-shot (schema + data):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;backend
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"postgresql://user:pass@old-host:port/olddb"&lt;/span&gt;
&lt;span class="nb"&gt;export &lt;/span&gt;&lt;span class="nv"&gt;TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s2"&gt;"postgresql://user:pass@new-host:port/newdb"&lt;/span&gt;
./scripts/pg-migrate-remote-to-remote.sh
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Data-only (target already has schema from Prisma):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;DATA_ONLY&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;1 ./scripts/pg-migrate-remote-to-remote.sh &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$SOURCE_DATABASE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$TARGET_DATABASE_URL&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Passwords with special characters:&lt;/strong&gt; URL-encode them in the URL (e.g. &lt;code&gt;^&lt;/code&gt; → &lt;code&gt;%5E&lt;/code&gt;, &lt;code&gt;}&lt;/code&gt; → &lt;code&gt;%7D&lt;/code&gt;).&lt;/p&gt;
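
&lt;p&gt;Rather than encoding by hand, you can let a standard library do it. A minimal Python sketch using &lt;code&gt;urllib.parse.quote&lt;/code&gt;; the &lt;code&gt;build_database_url&lt;/code&gt; helper and the credentials are placeholders invented for this example:&lt;/p&gt;

```python
from urllib.parse import quote


def build_database_url(user, password, host, port, database):
    """Build a PostgreSQL URL with the user and password percent-encoded."""
    # safe="" makes quote() encode every reserved character, including "/" and "@".
    return (
        "postgresql://"
        + quote(user, safe="") + ":" + quote(password, safe="")
        + "@" + host + ":" + str(port) + "/" + database
    )


# Placeholder credentials, not real ones.
url = build_database_url("app_user", "p^ss}w@rd", "new-host", 5432, "newdb")
print(url)
```

&lt;p&gt;Encoding the &lt;code&gt;@&lt;/code&gt; in the password matters most: left raw, it would be read as the separator between credentials and host.&lt;/p&gt;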




&lt;h2&gt;
  
  
  After migration
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Point the app to the new DB: set &lt;code&gt;DATABASE_URL&lt;/code&gt; (e.g. in &lt;code&gt;.env&lt;/code&gt; or in your host’s env) to the value of &lt;code&gt;TARGET_DATABASE_URL&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Run &lt;code&gt;npx prisma generate&lt;/code&gt; in the backend.&lt;/li&gt;
&lt;li&gt;Map backend and frontend URLs in your host (e.g. dcdeploy) to the new environment.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  Summary and takeaways
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Theory (what to remember):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Remote-to-remote&lt;/strong&gt; = copy from one live DB server to another (new host, region, or environment). You need the same &lt;strong&gt;schema&lt;/strong&gt; and &lt;strong&gt;data&lt;/strong&gt; on the target.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema first, then data:&lt;/strong&gt; Create structure (tables, constraints, enums, etc.) on the target, then copy rows in &lt;strong&gt;FK order&lt;/strong&gt; so parent rows exist before child rows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Two strategies:&lt;/strong&gt; (A) Deploy your migrations on the target and copy data with an app (avoids &lt;code&gt;pg_dump&lt;/code&gt; version issues); (B) Use &lt;code&gt;pg_dump&lt;/code&gt;/&lt;code&gt;pg_restore&lt;/code&gt; when your client version matches or exceeds the server.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Enums:&lt;/strong&gt; On a migrate-only target, an enum might not exist yet; migrations should create-if-not-exists then add value so they work on both fresh and cloned DBs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verification:&lt;/strong&gt; Compare row counts (and optionally full schema: PKs, FKs, indexes, constraints, enums, sequences, triggers) so you know the target really matches the source.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Practice (what we did):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We migrated remote → remote by &lt;strong&gt;deploying Prisma migrations on the target&lt;/strong&gt;, then &lt;strong&gt;copying data with a Node script&lt;/strong&gt; (enum + JSONB safe), then &lt;strong&gt;comparing row counts and optionally schema&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;We &lt;strong&gt;ordered migrations&lt;/strong&gt; by folder naming, &lt;strong&gt;handled enums&lt;/strong&gt; with create-if-not-exists in SQL, and &lt;strong&gt;verified&lt;/strong&gt; with scripts. The same approach works for any Prisma/PostgreSQL project; adjust &lt;code&gt;TABLES_IN_ORDER&lt;/code&gt; and optional/ignore lists to your schema.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All scripts referenced here live in this repo under &lt;code&gt;backend/scripts/&lt;/code&gt;. The technical runbook is in &lt;code&gt;REMOTE_TO_REMOTE_MIGRATION.md&lt;/code&gt;.&lt;/p&gt;

</description>
      <category>database</category>
      <category>postgres</category>
    </item>
    <item>
      <title>RAG From Scratch: Build a System That Answers Questions From Your Docs</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Tue, 24 Feb 2026 11:51:48 +0000</pubDate>
      <link>https://dev.to/vapmail16/rag-from-scratch-build-a-system-that-answers-questions-from-your-docs-4h0</link>
      <guid>https://dev.to/vapmail16/rag-from-scratch-build-a-system-that-answers-questions-from-your-docs-4h0</guid>
      <description>&lt;p&gt;My first RAG system answered "I don't know" to questions that were clearly in the documents. The information was right there — paragraph three, page seven — and the AI couldn't find it.&lt;br&gt;
Turns out, my chunking strategy was destroying context. I was splitting documents every 1,000 characters like every tutorial told me to. The split landed in the middle of a sentence about quarterly revenue targets. The first half ended up in one chunk, the second half in another, and the embedding for each half was meaningless.&lt;br&gt;
That was the moment I understood: RAG isn't a retrieval problem or a generation problem. It's an architecture problem. And most tutorials stop at step one.&lt;br&gt;
Here's how to build a RAG system that actually works — from loading documents to generating accurate answers.&lt;/p&gt;

&lt;p&gt;What RAG Actually Is&lt;br&gt;
RAG stands for Retrieval-Augmented Generation. Instead of asking an LLM to answer from its training data (which might be outdated or wrong), you give it YOUR documents and let it answer from those.&lt;br&gt;
The pipeline:&lt;br&gt;
Documents → Chunk → Embed → Store in Vector DB → &lt;br&gt;
User asks question → Embed question → Search for similar chunks → &lt;br&gt;
Feed chunks + question to LLM → Generate answer&lt;br&gt;
Simple in theory. The devil is in each arrow.&lt;/p&gt;

&lt;p&gt;Step 1: Load and Chunk Your Documents&lt;br&gt;
Loading is straightforward. Chunking is where most RAG systems break.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain.document_loaders import DirectoryLoader, TextLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# Load documents
loader = DirectoryLoader('./docs', glob="**/*.md", loader_cls=TextLoader)
documents = loader.load()

# Bad chunking (what most tutorials teach)
bad_splitter = RecursiveCharacterTextSplitter(
    chunk_size=1000,
    chunk_overlap=0  # No overlap = lost context
)

# Good chunking
good_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,        # Smaller = more precise retrieval
    chunk_overlap=100,     # Overlap preserves context at boundaries
    separators=["\n## ", "\n### ", "\n\n", "\n", " "]  # Respect document structure
)

chunks = good_splitter.split_documents(documents)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Why 500 instead of 1,000? Smaller chunks mean more precise retrieval. When a user asks "What's the refund policy?", you want to retrieve the exact paragraph about refunds, not a 1,000-character block that's half refund policy and half shipping information.&lt;br&gt;
Why overlap of 100? Because sentences at chunk boundaries get split. Overlapping by 100 characters means the end of chunk N appears at the start of chunk N+1. Context preserved.&lt;br&gt;
Why respect separators? Splitting on headings and paragraphs keeps semantic units together. Splitting on character count doesn't care if it lands mid-sentence.&lt;/p&gt;
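&lt;p&gt;To see what overlap actually does at a chunk boundary, here's a minimal sketch in plain Python. It is a naive fixed-width splitter, not what &lt;code&gt;RecursiveCharacterTextSplitter&lt;/code&gt; does internally, and &lt;code&gt;chunk_with_overlap&lt;/code&gt; and the sample string are names I made up for illustration.&lt;/p&gt;

```python
def chunk_with_overlap(text, chunk_size=500, overlap=100):
    """Naive fixed-width splitter: consecutive chunks share `overlap` characters."""
    if overlap not in range(chunk_size):
        raise ValueError("overlap must be at least 0 and smaller than chunk_size")
    if not text:
        return []
    step = chunk_size - overlap
    # Stop once the previous window has already reached the end of the text
    last_start = max(len(text) - overlap, 1)
    return [text[start:start + chunk_size] for start in range(0, last_start, step)]


# Hypothetical sample text, short enough to read the boundaries by eye
sample = "abcdefghijklmnopqrstuvwxy"
for chunk in chunk_with_overlap(sample, chunk_size=10, overlap=3):
    print(chunk)
```

&lt;p&gt;This prints &lt;code&gt;abcdefghij&lt;/code&gt;, &lt;code&gt;hijklmnopq&lt;/code&gt;, &lt;code&gt;opqrstuvwx&lt;/code&gt; and &lt;code&gt;vwxy&lt;/code&gt;: each chunk starts with the last three characters of the one before it, so a sentence cut at a boundary still appears whole in at least one chunk as long as it is shorter than the overlap.&lt;/p&gt;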

&lt;p&gt;Step 2: Embed and Store&lt;br&gt;
Turn each chunk into a vector embedding and store it in a vector database.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Chroma  # or Pinecone, Weaviate, pgvector

embeddings = OpenAIEmbeddings(model="text-embedding-3-small")

# Create vector store
vectorstore = Chroma.from_documents(
    documents=chunks,
    embedding=embeddings,
    persist_directory="./chroma_db"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Embedding model choice matters. &lt;code&gt;text-embedding-3-small&lt;/code&gt; is fast and cheap. For higher accuracy on technical content, &lt;code&gt;text-embedding-3-large&lt;/code&gt; is worth the extra cost. For completely local/private data, consider open-source alternatives like &lt;code&gt;sentence-transformers/all-MiniLM-L6-v2&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Step 3: Retrieval — Where Most Systems Fail&lt;br&gt;
Naive retrieval: find the top-k most similar vectors to the user's question. This works for simple questions but fails badly when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The question uses different words than the document (semantic gap)&lt;/li&gt;
&lt;li&gt;Multiple chunks contain partial answers&lt;/li&gt;
&lt;li&gt;The most similar chunk isn't the most relevant chunk&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here are three upgrades that dramatically improve retrieval:&lt;/p&gt;

&lt;p&gt;Upgrade A: Hybrid Search (Vector + Keyword)&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain.retrievers import BM25Retriever, EnsembleRetriever

# Vector retriever (semantic similarity)
vector_retriever = vectorstore.as_retriever(search_kwargs={"k": 10})

# BM25 retriever (keyword matching)
bm25_retriever = BM25Retriever.from_documents(chunks)
bm25_retriever.k = 10

# Combine both
ensemble_retriever = EnsembleRetriever(
    retrievers=[vector_retriever, bm25_retriever],
    weights=[0.6, 0.4]  # Favor semantic, supplement with keywords
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Why hybrid? Vector search finds semantically similar content. BM25 finds exact keyword matches. Together they catch what each misses alone. Say a user asks "What is the PTO policy?": vector search finds "vacation days and time off" while BM25 catches the exact term "PTO."&lt;/p&gt;

&lt;p&gt;Upgrade B: HyDE (Hypothetical Document Embeddings)&lt;br&gt;
Instead of embedding the question, generate a hypothetical answer first, then search with that.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain.chains import HypotheticalDocumentEmbedder
from langchain.chat_models import ChatOpenAI

hyde_embeddings = HypotheticalDocumentEmbedder.from_llm(
    llm=ChatOpenAI(model="gpt-4o-mini"),
    base_embeddings=embeddings,
    prompt_key="web_search"
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The intuition: a question like "What's the refund policy?" is semantically different from the answer paragraph that describes the actual policy. But a hypothetical answer ABOUT refund policies is much closer to the real document. HyDE bridges that gap.&lt;/p&gt;

&lt;p&gt;Upgrade C: Re-ranking&lt;br&gt;
Retrieve broadly (top 20), then re-rank to find the best 5.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain.retrievers import ContextualCompressionRetriever
from langchain.retrievers.document_compressors import CrossEncoderReranker
from langchain_community.cross_encoders import HuggingFaceCrossEncoder

# CrossEncoderReranker takes a cross-encoder object, not a model name
cross_encoder = HuggingFaceCrossEncoder(model_name="cross-encoder/ms-marco-MiniLM-L-6-v2")
reranker = CrossEncoderReranker(model=cross_encoder, top_n=5)

# Cast a wide net (20 candidates) so the re-ranker has something to choose from
broad_retriever = vectorstore.as_retriever(search_kwargs={"k": 20})

compression_retriever = ContextualCompressionRetriever(
    base_compressor=reranker,
    base_retriever=broad_retriever
)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Why re-rank? Bi-encoder similarity (what vector search uses) is fast but approximate. Cross-encoder re-ranking is slower but far more accurate. Retrieve broadly, then re-rank precisely.&lt;/p&gt;
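&lt;p&gt;Going back to Upgrade A for a moment: it helps to see how two ranked lists can be merged with weights like &lt;code&gt;[0.6, 0.4]&lt;/code&gt;. The sketch below uses weighted Reciprocal Rank Fusion, which is, as far as I know, the idea behind &lt;code&gt;EnsembleRetriever&lt;/code&gt;. It is my own illustration, not LangChain's code, and the document IDs and the constant &lt;code&gt;c=60&lt;/code&gt; are assumptions.&lt;/p&gt;

```python
from collections import defaultdict


def weighted_rrf(rankings, weights, c=60):
    """Fuse several ranked lists of document ids with weighted Reciprocal Rank Fusion."""
    if len(rankings) != len(weights):
        raise ValueError("need exactly one weight per ranking")
    scores = defaultdict(float)
    for ranking, weight in zip(rankings, weights):
        for rank, doc_id in enumerate(ranking, start=1):
            # A document near the top of a list earns more than one further down
            scores[doc_id] += weight / (c + rank)
    # Highest fused score first; ties broken by id so the output is deterministic
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical results for "What is the PTO policy?"
vector_hits = ["vacation-policy", "time-off-faq", "holiday-calendar"]
bm25_hits = ["pto-accrual-table", "vacation-policy", "payroll-codes"]

fused = weighted_rrf([vector_hits, bm25_hits], weights=[0.6, 0.4])
print(fused)
```

&lt;p&gt;&lt;code&gt;vacation-policy&lt;/code&gt; comes out on top because both retrievers returned it, even though BM25 only ranked it second. That is the whole point of hybrid search: agreement between two different signals beats a high rank from just one.&lt;/p&gt;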

&lt;p&gt;Step 4: Generation (The Prompt Matters)&lt;br&gt;
The difference between a hallucinating RAG system and an honest one is usually the system prompt.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain.chains import RetrievalQA

prompt = ChatPromptTemplate.from_template("""
Answer the user's question using ONLY the provided context.
If the context doesn't contain the answer, say "I don't have
enough information to answer this question."
Never make up facts. Never extrapolate beyond the context.

Context:
{context}

Question: {question}

Answer:
""")

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)

# Build the chain
qa_chain = RetrievalQA.from_chain_type(
    llm=llm,
    retriever=compression_retriever,
    chain_type_kwargs={"prompt": prompt},
    return_source_documents=True
)

# Ask a question
result = qa_chain.invoke({"query": "What is the refund policy?"})
print(result["result"])
print("Sources:", [doc.metadata for doc in result["source_documents"]])
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Three critical instructions in that prompt: (1) ONLY use the provided context, (2) admit when you don't know, (3) never make up facts. Remove any of these and your system will hallucinate confidently.&lt;/p&gt;

&lt;p&gt;Step 5: Evaluate (The Step Everyone Skips)&lt;br&gt;
Most RAG systems ship with zero evaluation. That's like deploying a web app without testing.&lt;br&gt;
Three things to measure:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# 1. Retrieval quality: did we fetch the right chunks?
#    Pass chunk IDs (or other hashable keys), not Document objects.
def retrieval_precision(retrieved_ids, relevant_ids):
    if not retrieved_ids:
        return 0.0  # nothing retrieved, so nothing was relevant
    relevant_retrieved = set(retrieved_ids) &amp;amp; set(relevant_ids)
    return len(relevant_retrieved) / len(retrieved_ids)

# 2. Answer faithfulness: does the answer match the context?
#    (Use an LLM to verify: no hallucination beyond context)

# 3. Answer relevance: does the answer address the question?
#    (Use an LLM to score 1-5)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Build a test set of 20-50 question/answer pairs from your documents. Run your RAG against them. Track precision, faithfulness, and relevance over time. When you change chunking, retrieval, or prompts — re-run the eval.&lt;/p&gt;
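&lt;p&gt;Here's roughly what that eval loop can look like for the retrieval half. This is a sketch under assumptions: the test-set shape, the chunk IDs, and the &lt;code&gt;evaluate&lt;/code&gt; helper are all hypothetical, and the "retriever" is a canned lookup standing in for your real one. I've added recall alongside precision because a retriever can be precise and still miss half the relevant chunks.&lt;/p&gt;

```python
def precision_recall(retrieved_ids, relevant_ids):
    """Return (precision, recall) for one query, given chunk ids."""
    retrieved, relevant = set(retrieved_ids), set(relevant_ids)
    hits = len(retrieved.intersection(relevant))
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


def evaluate(test_set, retrieve):
    """Average precision and recall of `retrieve` over a labelled test set."""
    if not test_set:
        raise ValueError("test_set is empty")
    total_precision = total_recall = 0.0
    for case in test_set:
        p, r = precision_recall(retrieve(case["question"]), case["relevant_ids"])
        total_precision += p
        total_recall += r
    return {
        "precision": total_precision / len(test_set),
        "recall": total_recall / len(test_set),
    }


# Hypothetical labelled questions and a canned retriever for demonstration
test_set = [
    {"question": "What is the refund policy?", "relevant_ids": ["refunds-1", "refunds-2"]},
    {"question": "How long does shipping take?", "relevant_ids": ["shipping-1"]},
]
canned_results = {
    "What is the refund policy?": ["refunds-1", "shipping-1"],
    "How long does shipping take?": ["shipping-1"],
}

print(evaluate(test_set, canned_results.get))
```

&lt;p&gt;Swap &lt;code&gt;canned_results.get&lt;/code&gt; for a function that calls your retriever and returns chunk IDs, then keep the printed numbers somewhere you can compare them after every chunking or prompt change.&lt;/p&gt;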

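&lt;p&gt;RAG Architecture Comparison&lt;br&gt;
Not all RAG is created equal:&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Architecture&lt;/th&gt;
&lt;th&gt;How It Works&lt;/th&gt;
&lt;th&gt;Best For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Naive RAG&lt;/td&gt;
&lt;td&gt;Top-k vector search → generate&lt;/td&gt;
&lt;td&gt;Simple docs, getting started&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;HyDE&lt;/td&gt;
&lt;td&gt;Hypothetical answer → search with that&lt;/td&gt;
&lt;td&gt;When questions ≠ document language&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hybrid&lt;/td&gt;
&lt;td&gt;Vector + BM25 keyword search&lt;/td&gt;
&lt;td&gt;Most production systems&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Corrective RAG&lt;/td&gt;
&lt;td&gt;Check retrieval quality → retry if bad&lt;/td&gt;
&lt;td&gt;High-accuracy requirements&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Graph RAG&lt;/td&gt;
&lt;td&gt;Knowledge graph + vector search&lt;/td&gt;
&lt;td&gt;Complex entity relationships&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Agentic RAG&lt;/td&gt;
&lt;td&gt;Agent decides retrieval strategy&lt;/td&gt;
&lt;td&gt;Multi-step reasoning&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Most production systems should start with Hybrid + Re-ranking. It handles 80% of use cases well.&lt;/p&gt;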

&lt;p&gt;RAG Debugging Checklist&lt;br&gt;
When your RAG isn't working, check these in order:&lt;br&gt;
□ 1. Are chunks too large? (&amp;gt;1000 chars = probably too big)&lt;br&gt;
□ 2. Is there chunk overlap? (0 overlap = context loss)&lt;br&gt;
□ 3. Are you respecting document structure? (split on headings)&lt;br&gt;
□ 4. Is retrieval returning relevant chunks? (print them!)&lt;br&gt;
□ 5. Is your prompt explicit about "only use context"?&lt;br&gt;
□ 6. Are you using temperature=0 for factual answers?&lt;br&gt;
□ 7. Have you tried hybrid search? (vector + keyword)&lt;br&gt;
□ 8. Do you have an evaluation set? (20+ Q&amp;amp;A pairs minimum)&lt;br&gt;
If your RAG answers are mediocre, it's almost never the model. It's the retrieval. Fix retrieval first, always.&lt;br&gt;
What's the hardest question your RAG system can't answer?&lt;/p&gt;


</description>
      <category>ai</category>
      <category>rag</category>
      <category>tutorial</category>
      <category>python</category>
    </item>
    <item>
      <title>Vector Databases Explained: How AI Actually Understands Your Text</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Sun, 22 Feb 2026 18:22:48 +0000</pubDate>
      <link>https://dev.to/vapmail16/vector-databases-explained-how-ai-actually-understands-your-text-241n</link>
      <guid>https://dev.to/vapmail16/vector-databases-explained-how-ai-actually-understands-your-text-241n</guid>
      <description>&lt;p&gt;When I first saw that &lt;code&gt;King - Man + Woman ≈ Queen&lt;/code&gt; in vector space, something clicked. Not intellectually — I'd read about word embeddings before. But seeing it actually work, watching the maths produce the right answer from pure numbers, was the moment I finally understood why everyone was excited about embeddings.&lt;/p&gt;

&lt;p&gt;Vector databases are the backbone of every modern AI application — semantic search, recommendation engines, RAG systems, chatbots that actually know your data. But most tutorials skip the &lt;em&gt;why&lt;/em&gt; and jump straight to &lt;code&gt;pip install&lt;/code&gt;. That's like learning SQL without understanding what a relational database actually does.&lt;/p&gt;

&lt;p&gt;Let me fix that.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Are Embeddings? (And Why They Matter)
&lt;/h2&gt;

&lt;p&gt;Traditional databases store data as rows and columns. You search with exact matches: &lt;code&gt;WHERE name = 'pizza'&lt;/code&gt;. That's great for structured data. It's terrible for meaning.&lt;/p&gt;

&lt;p&gt;Vector databases store data as &lt;em&gt;embeddings&lt;/em&gt; — arrays of numbers that capture semantic meaning. The sentence "I love pizza" becomes something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[0.23, -0.41, 0.87, 0.12, -0.56, ...]  // 1,536 numbers
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's the magic: sentences with similar &lt;em&gt;meanings&lt;/em&gt; have similar numbers, regardless of the words used.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;"I love pizza"              → [0.23, -0.41, 0.87, ...]
"Pizza is my favourite food" → [0.25, -0.39, 0.85, ...]  // Very close!
"I love debugging"          → [0.67, 0.12, -0.34, ...]   // Very different.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The embedding model (like OpenAI's &lt;code&gt;text-embedding-3-small&lt;/code&gt;) has been trained on billions of text examples. It's learned that "love" and "favourite" carry similar weight, that "pizza" and "food" are related, and that "pizza" and "debugging" occupy completely different corners of meaning-space.&lt;/p&gt;

&lt;p&gt;This is what makes semantic search possible. You don't search for keywords. You search for &lt;em&gt;meaning&lt;/em&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  How Similarity Search Works
&lt;/h2&gt;

&lt;p&gt;Once your text is converted to vectors, you need to measure how close two vectors are. There are three common approaches, and which one you pick matters.&lt;/p&gt;

&lt;h3&gt;
  
  
  Cosine Similarity (Most Common)
&lt;/h3&gt;

&lt;p&gt;Measures the angle between two vectors. Ignores magnitude, focuses on direction.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;1.0&lt;/strong&gt; = identical meaning&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;0.0&lt;/strong&gt; = completely unrelated&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;-1.0&lt;/strong&gt; = opposite meaning
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;numpy&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;dot&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;numpy.linalg&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;norm&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;dot&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;norm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Use when:&lt;/strong&gt; Your embeddings vary in magnitude (most text use cases). This is the default for a reason.&lt;/p&gt;

&lt;h3&gt;
  
  
  Euclidean Distance
&lt;/h3&gt;

&lt;p&gt;Measures the straight-line distance between two points in vector space. Smaller = more similar.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use when:&lt;/strong&gt; You care about absolute differences, not just direction. Less common for text, more useful for image features or numerical data.&lt;/p&gt;

&lt;h3&gt;
  
  
  Dot Product
&lt;/h3&gt;

&lt;p&gt;Like cosine similarity but &lt;em&gt;does&lt;/em&gt; consider magnitude. Vectors that are both similar in direction and large in magnitude score highest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use when:&lt;/strong&gt; Your embedding model is designed for it (some models normalise vectors, making dot product equivalent to cosine).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;My recommendation:&lt;/strong&gt; Start with cosine similarity. It's robust, well-understood, and works for almost every text use case. Only switch if you have a specific reason.&lt;/p&gt;
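&lt;p&gt;If you want to poke at the three metrics side by side, here's a small pure-Python sketch (no NumPy needed). The three-number vectors are just the truncated toy values from earlier in this post, not real embeddings, and the helper names are mine.&lt;/p&gt;

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


def cosine_similarity(a, b):
    return dot(a, b) / (norm(a) * norm(b))


def euclidean_distance(a, b):
    return math.dist(a, b)


def normalise(a):
    """Scale a vector to unit length so only its direction remains."""
    length = norm(a)
    return [x / length for x in a]


# Toy 3-dimensional stand-ins for the example sentences (illustrative values only)
pizza = [0.23, -0.41, 0.87]
favourite = [0.25, -0.39, 0.85]
debugging = [0.67, 0.12, -0.34]

print(cosine_similarity(pizza, favourite))   # close to 1
print(cosine_similarity(pizza, debugging))   # negative

# On unit-length vectors, dot product and cosine similarity are the same number
unit_a, unit_b = normalise(pizza), normalise(debugging)
print(math.isclose(dot(unit_a, unit_b), cosine_similarity(pizza, debugging)))

# ...and squared Euclidean distance is just 2 - 2 * cosine, so rankings agree
print(math.isclose(euclidean_distance(unit_a, unit_b) ** 2,
                   2 - 2 * cosine_similarity(pizza, debugging)))
```

&lt;p&gt;The last two checks are why the choice of metric matters less than it first appears: once vectors are normalised, all three metrics rank neighbours in the same order.&lt;/p&gt;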




&lt;h2&gt;
  
  
  Indexing: Why Brute Force Doesn't Scale
&lt;/h2&gt;

&lt;p&gt;Here's the problem: if you have 1 million documents, comparing your query vector to every single stored vector takes 1 million similarity calculations. That's O(n). Slow. Unusable at scale.&lt;/p&gt;

&lt;p&gt;Vector databases solve this with specialised index structures.&lt;/p&gt;

&lt;h3&gt;
  
  
  HNSW (Hierarchical Navigable Small World)
&lt;/h3&gt;

&lt;p&gt;The most popular index type. Think of it like a network of neighbours.&lt;/p&gt;

&lt;p&gt;Imagine you're looking for a specific house in a city. Brute force: check every house on every street. HNSW: ask a neighbour, they point you closer, you ask that person's neighbour, they point you closer still. A few hops and you're there.&lt;/p&gt;

&lt;p&gt;Technically, HNSW builds a multi-layer graph where each node connects to its nearest neighbours. Higher layers are sparse (for big jumps), lower layers are dense (for precision). Search starts at the top and navigates down.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search time:&lt;/strong&gt; O(log n) — millions of vectors in milliseconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tradeoff:&lt;/strong&gt; Uses more memory (stores the graph structure)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Most use cases. Fast, accurate, well-supported&lt;/li&gt;
&lt;/ul&gt;
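&lt;p&gt;The "ask a neighbour" idea is easier to trust once you've seen it run. Below is a deliberately simplified sketch: a single-layer graph with a greedy walk, which is roughly the step HNSW repeats on each layer. It leaves out the layer hierarchy and the candidate list real implementations keep, and it builds the graph by brute force, which a real index avoids. All names and data are made up for illustration.&lt;/p&gt;

```python
import math


def build_knn_graph(points, k=2):
    """Link every point to its k nearest neighbours (brute force, illustration only)."""
    graph = {}
    for i, point in enumerate(points):
        others = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(point, points[j]),
        )
        graph[i] = others[:k]
    return graph


def greedy_search(points, graph, query, entry=0):
    """Hop to whichever neighbour is closest to the query until no hop improves."""
    current = entry
    hops = [current]
    while True:
        # Current node goes first so a tie means "stay put" and the walk terminates
        candidates = [current] + graph[current]
        best = min(candidates, key=lambda j: math.dist(query, points[j]))
        if best == current:
            return current, hops
        current = best
        hops.append(current)


# Ten hypothetical "houses" along one street
points = [(float(i), 0.0) for i in range(10)]
graph = build_knn_graph(points, k=2)

nearest, hops = greedy_search(points, graph, query=(6.2, 0.0), entry=0)
print(nearest, hops)
```

&lt;p&gt;The walk reaches point 6 after visiting six nodes. On a flat graph like this the saving is small; the sparse upper layers in real HNSW are what turn those neighbour-by-neighbour steps into big jumps. A greedy walk on one layer can also get stuck in a local minimum, which is part of why production indexes track several candidates at once.&lt;/p&gt;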

&lt;h3&gt;
  
  
  IVF (Inverted File Index)
&lt;/h3&gt;

&lt;p&gt;Divides vector space into clusters (like postal codes). At query time, only searches the nearest clusters instead of everything.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Search time:&lt;/strong&gt; Faster than brute force, less accurate than HNSW&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tradeoff:&lt;/strong&gt; Needs tuning (how many clusters? how many to search?)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Very large datasets where memory is constrained&lt;/li&gt;
&lt;/ul&gt;
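&lt;p&gt;Here's the postal-code idea as a minimal sketch. I'm assuming the centroids are already known (in practice they're learned with something like k-means), and &lt;code&gt;nprobe&lt;/code&gt; is my name for "how many clusters to search". The data is invented for the example.&lt;/p&gt;

```python
import math


def nearest_centroid(centroids, vector):
    """Index of the centroid closest to `vector`."""
    return min(range(len(centroids)), key=lambda i: math.dist(centroids[i], vector))


def build_ivf(vectors, centroids):
    """Bucket each vector id under its nearest centroid (the 'inverted file')."""
    buckets = {i: [] for i in range(len(centroids))}
    for vec_id, vec in enumerate(vectors):
        buckets[nearest_centroid(centroids, vec)].append(vec_id)
    return buckets


def ivf_search(query, vectors, centroids, buckets, nprobe=1, top_k=1):
    """Scan only the `nprobe` clusters whose centroids are closest to the query."""
    probe_order = sorted(range(len(centroids)),
                         key=lambda i: math.dist(centroids[i], query))
    candidates = [vec_id for c in probe_order[:nprobe] for vec_id in buckets[c]]
    candidates.sort(key=lambda vec_id: math.dist(vectors[vec_id], query))
    return candidates[:top_k]


# Two hypothetical clusters: one near the origin, one near (10, 10)
vectors = [(0.0, 0.0), (1.0, 0.5), (0.5, 1.0), (10.0, 10.0), (9.0, 10.5), (10.5, 9.0)]
centroids = [(0.5, 0.5), (10.0, 10.0)]  # assumed pre-computed
buckets = build_ivf(vectors, centroids)

print(ivf_search((9.2, 10.1), vectors, centroids, buckets, nprobe=1, top_k=2))
```

&lt;p&gt;With &lt;code&gt;nprobe=1&lt;/code&gt; the query only touches three of the six vectors and still returns the right neighbours (ids 4 and 3). The accuracy risk shows up for queries near a cluster boundary, where the true nearest neighbour may sit in a cluster you didn't probe. Raising &lt;code&gt;nprobe&lt;/code&gt; trades speed back for recall.&lt;/p&gt;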

&lt;h3&gt;
  
  
  Product Quantization (PQ)
&lt;/h3&gt;

&lt;p&gt;Compresses vectors by splitting them into sub-vectors and approximating each one. Dramatically reduces memory usage at the cost of some accuracy.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Tradeoff:&lt;/strong&gt; Lossy compression — some precision loss&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Billions of vectors where memory is the bottleneck&lt;/li&gt;
&lt;/ul&gt;
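&lt;p&gt;To make "split into sub-vectors and approximate each one" concrete, here's a toy sketch. The codebooks are hand-written for the example (real ones are learned per sub-space, typically with k-means, and have many more entries), and the function names are mine.&lt;/p&gt;

```python
import math


def split(vector, parts):
    """Cut a vector into `parts` equal-length sub-vectors."""
    size = len(vector) // parts
    return [tuple(vector[i * size:(i + 1) * size]) for i in range(parts)]


def encode(vector, codebooks):
    """Replace each sub-vector with the index of its nearest codebook entry."""
    codes = []
    for book, sub in zip(codebooks, split(vector, len(codebooks))):
        codes.append(min(range(len(book)), key=lambda i: math.dist(book[i], sub)))
    return codes


def decode(codes, codebooks):
    """Rebuild an approximate vector from the stored codes."""
    approx = []
    for book, code in zip(codebooks, codes):
        approx.extend(book[code])
    return approx


# Hypothetical codebooks: two sub-spaces, two entries each
codebooks = [
    [(0.0, 0.0), (1.0, 1.0)],
    [(0.0, 1.0), (1.0, 0.0)],
]

original = [0.9, 1.1, 0.1, 0.8]
codes = encode(original, codebooks)
print(codes)                      # what actually gets stored
print(decode(codes, codebooks))   # the lossy reconstruction
```

&lt;p&gt;Four floats become two small integers, &lt;code&gt;[1, 0]&lt;/code&gt;, and decode back to &lt;code&gt;[1.0, 1.0, 0.0, 1.0]&lt;/code&gt;. Close to the original, but not equal, which is exactly the precision loss the tradeoff above describes.&lt;/p&gt;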

&lt;p&gt;&lt;strong&gt;My take:&lt;/strong&gt; HNSW is the right choice for 95% of applications. It's the default index in Weaviate and Qdrant for good reason. Only reach for IVF or PQ when you're dealing with genuinely massive scale or tight memory constraints.&lt;/p&gt;




&lt;h2&gt;
  
  
  Choosing a Vector Database
&lt;/h2&gt;

&lt;p&gt;This is where people get stuck. There are too many options and every vendor says they're the best. Here's an honest comparison based on what I've actually used:&lt;/p&gt;

&lt;h3&gt;
  
  
  Pinecone — Managed, Zero Ops
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pinecone&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Pinecone&lt;/span&gt;

&lt;span class="c1"&gt;# Client-object API (replaces the older pinecone.init call)
&lt;/span&gt;&lt;span class="n"&gt;pc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Pinecone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;your-key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Index&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-index&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# Upsert
&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vectors&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.41&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.87&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...],&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Original document&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="c1"&gt;# Query
&lt;/span&gt;&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mf"&gt;0.25&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mf"&gt;0.39&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.85&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...],&lt;/span&gt; &lt;span class="n"&gt;top_k&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Fully managed. No infrastructure to worry about. Scales automatically. Great for teams that don't want to manage databases.&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Vendor lock-in. More expensive at scale. Limited querying beyond similarity search.&lt;br&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Production apps where you want zero ops overhead.&lt;/p&gt;
&lt;h3&gt;
  
  
  Weaviate — Open Source, Hybrid Search
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Open source. Supports hybrid search (vector + keyword BM25 in one query). GraphQL API. Modules for auto-vectorisation.&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; More complex to set up and manage. Heavier resource usage.&lt;br&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Applications that need both semantic and keyword search — which, in practice, is most production RAG systems.&lt;/p&gt;
&lt;h3&gt;
  
  
  Chroma — Lightweight, Great for Prototyping
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;chromadb&lt;/span&gt;

&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;chromadb&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;collection&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_collection&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;my-collection&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;I love pizza&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Pizza is my favourite food&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;ids&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;doc2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;collection&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;query_texts&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;What food do you enjoy?&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;n_results&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Dead simple API. Runs locally. Handles embedding generation for you. Perfect for experimentation.&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Not designed for production scale. Limited configuration.&lt;br&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Prototyping, local development, learning.&lt;/p&gt;
&lt;h3&gt;
  
  
  pgvector — Postgres Extension
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;EXTENSION&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="nb"&gt;SERIAL&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="n"&gt;vector&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1536&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Similarity search&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;'[0.23, -0.41, ...]'&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;distance&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Use your existing Postgres. No new infrastructure. SQL familiarity. Transactions and joins with your regular data.&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Slower than purpose-built vector DBs at scale. Limited index types.&lt;br&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Teams already on Postgres who don't want another database. Works well under 1M vectors.&lt;/p&gt;
&lt;h3&gt;
  
  
  Qdrant — Rust-Based, Fast, Self-Hosted
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pros:&lt;/strong&gt; Very fast (written in Rust). Rich filtering alongside vector search. Open source with a managed option.&lt;br&gt;
&lt;strong&gt;Cons:&lt;/strong&gt; Smaller community than Pinecone/Weaviate. Fewer integrations.&lt;br&gt;
&lt;strong&gt;Best for:&lt;/strong&gt; Performance-sensitive applications. Teams comfortable self-hosting.&lt;/p&gt;


&lt;h2&gt;
  
  
  The Vector DB Decision Checklist
&lt;/h2&gt;

&lt;p&gt;When someone asks me "which vector database should I use?", I run through this:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Are you prototyping or building for production?&lt;/strong&gt;&lt;br&gt;
Prototyping → Chroma. Get something working in 10 minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Do you already have Postgres and &amp;lt; 1M vectors?&lt;/strong&gt;&lt;br&gt;
Yes → pgvector. Don't add infrastructure you don't need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Do you need hybrid search (vector + keyword)?&lt;/strong&gt;&lt;br&gt;
Yes → Weaviate. Its hybrid search is genuinely best-in-class.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Do you want zero infrastructure management?&lt;/strong&gt;&lt;br&gt;
Yes → Pinecone. Pay more, worry less.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Is raw query speed your top priority?&lt;/strong&gt;&lt;br&gt;
Yes → Qdrant. Rust-level performance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;6. Budget-constrained and self-hosting?&lt;/strong&gt;&lt;br&gt;
Weaviate or Qdrant, both open source.&lt;/p&gt;

&lt;p&gt;There's no universal "best." There's "best for your use case, your team, and your stage."&lt;/p&gt;


&lt;h2&gt;
  
  
  Putting It Together: The Embedding Pipeline
&lt;/h2&gt;

&lt;p&gt;Here's the complete flow for any AI application using vector search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. Document comes in (PDF, webpage, chat message, etc.)
2. Split into chunks (paragraphs, sections — overlap by 200 chars)
3. Generate embeddings via API (OpenAI, Cohere, or local model)
4. Store embeddings + metadata in your vector database
5. User asks a question
6. Embed the question using the SAME model
7. Query vector DB for top-k similar chunks
8. Pass retrieved chunks + question to an LLM
9. LLM generates an answer grounded in YOUR data
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Steps 5-9 are what people call RAG (Retrieval-Augmented Generation). But the quality of the answer in step 9 depends entirely on steps 2-4. Bad chunking, the wrong embedding model, or a poorly configured index means the LLM gets irrelevant context and hallucinates confidently.&lt;/p&gt;

&lt;p&gt;The foundation matters. Get the vector layer right, and everything built on top works. Get it wrong, and no amount of prompt engineering will save you.&lt;/p&gt;
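&lt;p&gt;If you'd like to watch steps 3-8 run end to end without an API key, here's a toy sketch. Big caveat: &lt;code&gt;embed&lt;/code&gt; below just counts words, so it matches vocabulary, not meaning. It's a stand-in for a real embedding model, and &lt;code&gt;TinyVectorStore&lt;/code&gt; and the sample documents are made up for illustration.&lt;/p&gt;

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a, b):
    dot = sum(a[word] * b[word] for word in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class TinyVectorStore:
    """In-memory list of (embedding, metadata) rows with brute-force search."""

    def __init__(self):
        self.rows = []

    def add(self, text, source):
        # Steps 3-4: embed the chunk and store it with its metadata
        self.rows.append((embed(text), {"text": text, "source": source}))

    def query(self, question, top_k=2):
        # Step 6: embed the question with the SAME function used for documents
        q = embed(question)
        # Step 7: rank every stored chunk by similarity and keep the best
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [meta for _, meta in ranked[:top_k]]


store = TinyVectorStore()
store.add("Refunds are issued within 14 days of purchase", "policy.md")
store.add("Shipping takes 3 to 5 business days", "shipping.md")
store.add("Our office is closed on public holidays", "hours.md")

question = "how many days until refunds are issued"
hits = store.query(question, top_k=1)

# Step 8: hand the retrieved chunks plus the question to an LLM
context = "\n".join(hit["text"] for hit in hits)
prompt = f"Context:\n{context}\n\nQuestion: {question}"
print(hits[0]["source"])
print(prompt)
```

&lt;p&gt;The refund chunk wins here only because it shares words with the question. Replace &lt;code&gt;embed&lt;/code&gt; with a real model and the same structure starts matching on meaning instead, which is the entire reason the embedding choice in step 3 matters so much.&lt;/p&gt;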




&lt;p&gt;What's your use case? I'd genuinely love to hear — drop a comment and I'll tell you which vector DB fits. Building semantic search? RAG chatbot? Recommendation engine? The answer changes based on what you're actually building.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>database</category>
      <category>machinelearning</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>SaaS Database Design: 6 Decisions You'll Regret Getting Wrong</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Fri, 20 Feb 2026 14:11:48 +0000</pubDate>
      <link>https://dev.to/vapmail16/saas-database-design-6-decisions-youll-regret-getting-wrong-2aop</link>
      <guid>https://dev.to/vapmail16/saas-database-design-6-decisions-youll-regret-getting-wrong-2aop</guid>
      <description>&lt;p&gt;I refactored my database schema 4 times before getting it right. Each time cost me a week.&lt;/p&gt;

&lt;p&gt;The first version used auto-increment IDs everywhere. The second didn't account for soft deletes. The third had audit logging bolted on as an afterthought. The fourth tried to retrofit consent versioning into a schema that assumed a single boolean for "agreed to terms."&lt;/p&gt;

&lt;p&gt;Database schema is the one foundation you can't easily refactor once you have real data. Migrations exist, but migrating a million-row table with zero downtime while restructuring relationships is a different beast than &lt;code&gt;prisma migrate dev&lt;/code&gt; on localhost.&lt;/p&gt;

&lt;p&gt;Here are the six decisions I'd get right from Day 1.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. UUIDs for Public IDs
&lt;/h2&gt;

&lt;p&gt;Auto-increment IDs leak information. When a competitor signs up and sees &lt;code&gt;/users/47&lt;/code&gt;, they know you have 47 users. When they see &lt;code&gt;/api/payments/12&lt;/code&gt;, they know your payment volume. Worse, sequential IDs are trivially enumerable — an attacker can crawl &lt;code&gt;/api/users/1&lt;/code&gt; through &lt;code&gt;/api/users/10000&lt;/code&gt; in seconds.&lt;/p&gt;

&lt;p&gt;UUIDs close both holes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model User {
  id        String   @id @default(uuid())
  email     String   @unique
  name      String?
  role      Role     @default(USER)
  isActive  Boolean  @default(true)
  createdAt DateTime @default(now())
  updatedAt DateTime @updatedAt
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every model uses &lt;code&gt;@default(uuid())&lt;/code&gt;. With no sequential integer in the URL, IDs stop leaking volume and become impractical to enumerate, though every endpoint still needs its own authorization check. The performance cost on PostgreSQL is small with proper indexing, and the security benefit is immediate.&lt;/p&gt;
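
&lt;p&gt;Once IDs are UUIDs, it's cheap to reject anything that isn't one before it reaches the database. A minimal sketch, assuming Node's built-in &lt;code&gt;crypto&lt;/code&gt; module; &lt;code&gt;isUuid&lt;/code&gt; is a hypothetical helper name, not something Prisma provides:&lt;/p&gt;

```typescript
import { randomUUID } from "node:crypto";

// Matches the canonical 8-4-4-4-12 hex layout of a UUID (any version).
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

// Hypothetical helper: true only for well-formed UUID strings.
function isUuid(value: string): boolean {
  return UUID_PATTERN.test(value);
}

// A random (v4) UUID, the same kind of value uuid() generates by default.
const publicId = randomUUID();

console.log(isUuid(publicId)); // true
console.log(isUuid("47"));     // false
```

&lt;p&gt;A route handler could call this on &lt;code&gt;req.params.id&lt;/code&gt; and return a 400 early, so enumeration attempts with integers never turn into queries.&lt;/p&gt;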




&lt;h2&gt;
  
  
  2. RBAC as a Role Hierarchy, Not Flat Strings
&lt;/h2&gt;

&lt;p&gt;A &lt;code&gt;role: string&lt;/code&gt; field with values like &lt;code&gt;"editor"&lt;/code&gt;, &lt;code&gt;"manager"&lt;/code&gt;, &lt;code&gt;"admin"&lt;/code&gt; scattered across your codebase becomes unmaintainable fast. Every permission check turns into &lt;code&gt;if (role === 'admin' || role === 'manager' || role === 'editor')&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Use an enum with a clear hierarchy instead:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;enum Role {
  USER
  ADMIN
  SUPER_ADMIN
}

model User {
  id   String @id @default(uuid())
  role Role   @default(USER)
  // ...
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The middleware that enforces this is one function:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requireRole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;roles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Authentication required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;roles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ForbiddenError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Insufficient permissions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// ADMIN and SUPER_ADMIN can list users; only SUPER_ADMIN can delete&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/admin/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;requireRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ADMIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SUPER_ADMIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/admin/users/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;requireRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SUPER_ADMIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Because the enum is defined in Prisma, the database enforces it. No rogue &lt;code&gt;"superadmin"&lt;/code&gt; or &lt;code&gt;"Admin"&lt;/code&gt; strings sneaking in. Every role change gets audit-logged (more on that in a moment).&lt;/p&gt;
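
&lt;p&gt;The middleware above still lists every allowed role by hand. If you want the enum to behave as a real hierarchy, one option is to rank the roles and compare ranks. This is a sketch under my own assumptions: &lt;code&gt;ROLE_RANK&lt;/code&gt; and &lt;code&gt;hasAtLeast&lt;/code&gt; are hypothetical names, and the ordering is whatever your product decides.&lt;/p&gt;

```typescript
// Mirrors the Prisma enum; in a real project this type would come from @prisma/client.
type Role = "USER" | "ADMIN" | "SUPER_ADMIN";

// Hypothetical ranking: a higher number inherits everything below it.
const ROLE_RANK: { [role in Role]: number } = {
  USER: 0,
  ADMIN: 1,
  SUPER_ADMIN: 2,
};

// True when `actual` sits at or above `minimum` in the hierarchy.
function hasAtLeast(actual: Role, minimum: Role): boolean {
  return ROLE_RANK[actual] >= ROLE_RANK[minimum];
}

console.log(hasAtLeast("SUPER_ADMIN", "ADMIN")); // true
console.log(hasAtLeast("USER", "ADMIN"));        // false
```

&lt;p&gt;With a check like this, a route that requires &lt;code&gt;ADMIN&lt;/code&gt; admits &lt;code&gt;SUPER_ADMIN&lt;/code&gt; automatically, and adding a new role means touching one table instead of every route.&lt;/p&gt;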




&lt;h2&gt;
  
  
  3. Soft Deletes With &lt;code&gt;deletedAt&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Hard-deleting a user row cascades through sessions, payments, notifications, audit logs — and if anything goes wrong mid-cascade, you're left with orphaned records and no way to recover.&lt;/p&gt;

&lt;p&gt;Worse, under GDPR you should be able to show that an erasure request was actually handled. If you &lt;code&gt;DELETE FROM users WHERE id = ?&lt;/code&gt; and the row is gone, you have no record of when it was removed or that it ever existed.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;deletedAt&lt;/code&gt; pattern gives you both safety and compliance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model User {
  id           String    @id @default(uuid())
  email        String    @unique
  isActive     Boolean   @default(true)
  deletedAt    DateTime?
  anonymizedAt DateTime?
  // ...
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Soft delete sets &lt;code&gt;deletedAt&lt;/code&gt; and &lt;code&gt;isActive = false&lt;/code&gt;. For GDPR "right to erasure," you anonymize the PII (replace email with a hash, wipe name and phone) and set &lt;code&gt;anonymizedAt&lt;/code&gt;. The record still exists for accounting and audit purposes, but the personal data is gone.&lt;/p&gt;

&lt;p&gt;Hard delete is available when legally required — but it's a separate, deliberate action (&lt;code&gt;DeletionType.HARD&lt;/code&gt;) with email confirmation, not the default.&lt;/p&gt;
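
&lt;p&gt;Here's roughly what the anonymization step can look like. It's a sketch, not the exact service code: &lt;code&gt;buildAnonymizedFields&lt;/code&gt; is a hypothetical helper, and I hash the user id instead of the address so the placeholder isn't derived from the PII it replaces.&lt;/p&gt;

```typescript
import { createHash } from "node:crypto";

// Subset of the User model that matters for erasure.
interface UserPii {
  id: string;
  email: string;
  name: string | null;
}

// Hypothetical helper: returns the fields to write when anonymizing a user.
function buildAnonymizedFields(user: UserPii, now: Date) {
  // Hash the id, not the email, so nothing personal survives in the placeholder.
  const digest = createHash("sha256").update(user.id).digest("hex").slice(0, 32);
  return {
    // Unique per user, so the @unique constraint on email still holds.
    email: `deleted-${digest}@anonymized.invalid`,
    name: null,
    isActive: false,
    deletedAt: now,
    anonymizedAt: now,
  };
}

const fields = buildAnonymizedFields(
  { id: "9b2f6c1e-0000-4000-8000-000000000001", email: "ada@example.com", name: "Ada" },
  new Date(),
);
console.log(fields.email.indexOf("ada@example.com")); // -1
```

&lt;p&gt;The result can be passed as &lt;code&gt;data&lt;/code&gt; to &lt;code&gt;prisma.user.update&lt;/code&gt;. Any other PII columns you have (phone, address) would need the same treatment.&lt;/p&gt;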




&lt;h2&gt;
  
  
  4. Audit Logs as a Separate, Append-Only Table
&lt;/h2&gt;

&lt;p&gt;Audit logging isn't a "nice to have." It's the first thing auditors ask for, the first thing you'll need when debugging a production incident, and the first thing enterprise procurement teams check.&lt;/p&gt;

&lt;p&gt;It needs to be a dedicated table, not a column on other tables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model AuditLog {
  id         String    @id @default(uuid())
  userId     String?
  action     String    // "USER_LOGIN", "PAYMENT_CREATED", "ROLE_CHANGED"
  resource   String?   // "users", "payments", "mfa_methods"
  resourceId String?   // ID of affected resource
  details    Json?     // Before/after state, additional context
  ipAddress  String?
  userAgent  String?
  createdAt  DateTime  @default(now())
  archived   Boolean   @default(false)
  archivedAt DateTime?

  user User? @relation(fields: [userId], references: [id], onDelete: SetNull)

  @@index([userId])
  @@index([action])
  @@index([createdAt])
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few critical design decisions here:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;onDelete: SetNull&lt;/code&gt;&lt;/strong&gt; — if a user is deleted, their audit logs remain. You never cascade-delete audit records.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;details: Json?&lt;/code&gt;&lt;/strong&gt; — stores before/after snapshots for state changes. When someone asks "what did this user's profile look like before the admin edited it?", you have the answer.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;archived&lt;/code&gt; + &lt;code&gt;archivedAt&lt;/code&gt;&lt;/strong&gt; — financial data needs 7-year retention in many jurisdictions. Archiving lets you move old logs to cold storage without deleting them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Append-only by convention&lt;/strong&gt;: the service only creates log entries and never rewrites or deletes them (setting the archive flag is the one exception). Auditors generally expect an audit trail that can't be quietly edited after the fact.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every meaningful action in the system — login, password change, MFA setup, role change, data export, OAuth linking — writes to this table with the user's IP and user agent.&lt;/p&gt;
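
&lt;p&gt;To make the before/after idea concrete, here's a small sketch of building the payload for one row. &lt;code&gt;diffSnapshots&lt;/code&gt; and &lt;code&gt;buildAuditEntry&lt;/code&gt; are hypothetical helpers; storing only the changed keys in &lt;code&gt;details&lt;/code&gt; is my assumption, and you may prefer full snapshots.&lt;/p&gt;

```typescript
// Loose JSON-ish snapshot of a record before or after a change.
type Snapshot = { [key: string]: unknown };

// Keep only the keys whose values differ, so `details` stays small.
function diffSnapshots(before: Snapshot, after: Snapshot) {
  const changedBefore: Snapshot = {};
  const changedAfter: Snapshot = {};
  const keys = Object.keys(before).concat(Object.keys(after));
  for (const key of keys) {
    if (JSON.stringify(before[key]) !== JSON.stringify(after[key])) {
      changedBefore[key] = before[key];
      changedAfter[key] = after[key];
    }
  }
  return { before: changedBefore, after: changedAfter };
}

// Hypothetical helper: builds the `data` payload for one AuditLog row.
function buildAuditEntry(input: {
  actorId: string | null;
  action: string;
  resource: string;
  resourceId: string;
  before: Snapshot;
  after: Snapshot;
  ipAddress: string | null;
  userAgent: string | null;
}) {
  return {
    userId: input.actorId,
    action: input.action,
    resource: input.resource,
    resourceId: input.resourceId,
    details: diffSnapshots(input.before, input.after),
    ipAddress: input.ipAddress,
    userAgent: input.userAgent,
  };
}

const entry = buildAuditEntry({
  actorId: "admin-1",
  action: "ROLE_CHANGED",
  resource: "users",
  resourceId: "user-7",
  before: { role: "USER", name: "Ada" },
  after: { role: "ADMIN", name: "Ada" },
  ipAddress: "203.0.113.42",
  userAgent: "Mozilla/5.0",
});
console.log(JSON.stringify(entry.details));
// {"before":{"role":"USER"},"after":{"role":"ADMIN"}}
```

&lt;p&gt;The returned object matches the column names in the model above, so it could be handed to &lt;code&gt;prisma.auditLog.create&lt;/code&gt; as &lt;code&gt;data&lt;/code&gt;.&lt;/p&gt;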




&lt;h2&gt;
  
  
  5. Session Tracking With Device Context
&lt;/h2&gt;

&lt;p&gt;Most tutorials store sessions as a token string and an expiry. That's not enough. When a user looks at "Active Sessions" in their security settings, they expect to see &lt;em&gt;where&lt;/em&gt; they're logged in. And when you need to revoke sessions on password change, you need to know which ones exist.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model Session {
  id        String   @id @default(uuid())
  userId    String
  token     String   @unique
  expiresAt DateTime
  userAgent String?
  ipAddress String?
  createdAt DateTime @default(now())

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)

  @@index([userId])
  // no @@index([token]) needed: @unique on token already creates an index
  @@index([expiresAt])
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;userAgent&lt;/code&gt; and &lt;code&gt;ipAddress&lt;/code&gt; are captured at login and stored alongside the session. This enables:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;"Active sessions" UI&lt;/strong&gt; — show device/browser and location per session.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Password change → revoke all&lt;/strong&gt; — delete all sessions for this user, forcing re-authentication everywhere.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Suspicious activity detection&lt;/strong&gt; — flag when the same user logs in from two countries within an hour.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code&gt;@@index([expiresAt])&lt;/code&gt; matters for cleanup: a scheduled job can efficiently delete expired sessions without a full table scan.&lt;/p&gt;
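
&lt;p&gt;The "two countries within an hour" check is simple once you have a country per login. A sketch, assuming the country code comes from a GeoIP lookup on &lt;code&gt;ipAddress&lt;/code&gt; that happens elsewhere; &lt;code&gt;LoginEvent&lt;/code&gt; and &lt;code&gt;isSuspiciousLogin&lt;/code&gt; are hypothetical names:&lt;/p&gt;

```typescript
// What we know about a login; `country` is assumed to be resolved from the IP elsewhere.
interface LoginEvent {
  userId: string;
  country: string;
  at: Date;
}

const ONE_HOUR_MS = 60 * 60 * 1000;

// Flags a login when the same user appeared in a different country within the last hour.
function isSuspiciousLogin(previous: LoginEvent | null, current: LoginEvent): boolean {
  if (previous === null) return false;
  if (previous.userId !== current.userId) return false;
  if (previous.country === current.country) return false;
  const elapsedMs = Math.abs(current.at.getTime() - previous.at.getTime());
  if (elapsedMs > ONE_HOUR_MS) return false;
  return true;
}

const london: LoginEvent = { userId: "u1", country: "GB", at: new Date("2026-02-20T10:00:00Z") };
const sydney: LoginEvent = { userId: "u1", country: "AU", at: new Date("2026-02-20T10:30:00Z") };
console.log(isSuspiciousLogin(london, sydney)); // true
```

&lt;p&gt;VPNs and mobile networks will trigger false positives, so treat a hit as a reason to notify the user or ask for MFA, not as proof of compromise.&lt;/p&gt;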




&lt;h2&gt;
  
  
  6. Consent Versioning
&lt;/h2&gt;

&lt;p&gt;This is the one most SaaS templates get wrong — or skip entirely.&lt;/p&gt;

&lt;p&gt;A single &lt;code&gt;agreedToTerms: Boolean&lt;/code&gt; on the User model is legally meaningless. GDPR requires you to prove &lt;em&gt;what&lt;/em&gt; the user consented to, &lt;em&gt;which version&lt;/em&gt; of the policy, &lt;em&gt;when&lt;/em&gt;, and from &lt;em&gt;where&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;That means two models: one for consent records, one for consent versions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model ConsentRecord {
  id          String      @id @default(uuid())
  userId      String
  consentType ConsentType
  granted     Boolean     @default(false)
  grantedAt   DateTime?
  revokedAt   DateTime?
  ipAddress   String?
  userAgent   String?
  version     String?     // e.g. "2.1.0"
  versionId   String?     // FK to ConsentVersion
  expiresAt   DateTime?
  metadata    Json?
  createdAt   DateTime    @default(now())
  updatedAt   DateTime    @updatedAt

  user           User            @relation(fields: [userId], references: [id], onDelete: Cascade)
  consentVersion ConsentVersion? @relation(fields: [versionId], references: [id])

  @@unique([userId, consentType])
  @@index([consentType])
}

model ConsentVersion {
  id                String      @id @default(uuid())
  consentType       ConsentType
  version           String      // Semantic version "1.0.0"
  title             String
  content           String      @db.Text
  summary           String?     @db.Text // Summary of changes
  effectiveDate     DateTime
  expiryPeriod      Int?        // Days until consent expires
  requiresReConsent Boolean     @default(true)
  changes           String?     @db.Text
  isActive          Boolean     @default(false)
  createdAt         DateTime    @default(now())

  consentRecords ConsentRecord[]

  @@unique([consentType, version])
}

enum ConsentType {
  MARKETING_EMAILS
  ANALYTICS
  THIRD_PARTY_SHARING
  COOKIES
  TERMS_OF_SERVICE
  PRIVACY_POLICY
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Six consent types, each independently versioned. When you update your privacy policy (version 1.0 → 2.0), &lt;code&gt;requiresReConsent: true&lt;/code&gt; triggers a prompt for every user who consented to v1. The &lt;code&gt;ConsentRecord&lt;/code&gt; stores the exact version they agreed to, with IP and timestamp.&lt;/p&gt;

&lt;p&gt;When a regulator asks "did this user consent to third-party data sharing?", you don't check a boolean. You pull the record: they consented to version 1.2.0 of the THIRD_PARTY_SHARING policy on January 15th at 14:32 UTC from IP 203.0.113.42. That's the difference between "we think so" and "here's the evidence."&lt;/p&gt;
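
&lt;p&gt;The re-consent rule boils down to one question: is the stored record still valid against the currently active version? A sketch of that check, with hypothetical type and function names and only the fields it needs from the two models:&lt;/p&gt;

```typescript
// Fields of ConsentRecord this check reads.
interface ConsentRecordLike {
  granted: boolean;
  revokedAt: Date | null;
  expiresAt: Date | null;
  version: string | null;
}

// Fields of the active ConsentVersion this check reads.
interface ConsentVersionLike {
  version: string;
  requiresReConsent: boolean;
}

// True when the user's stored consent still counts for the active policy version.
function hasValidConsent(
  record: ConsentRecordLike | null,
  active: ConsentVersionLike,
  now: Date,
): boolean {
  if (record === null) return false;            // never answered
  if (!record.granted) return false;            // declined
  if (record.revokedAt !== null) return false;  // withdrawn later
  if (record.expiresAt !== null) {
    if (now.getTime() >= record.expiresAt.getTime()) return false; // expired
  }
  if (record.version === active.version) return true;
  // Older version on file: valid only if the new version doesn't demand re-consent.
  return !active.requiresReConsent;
}

const onFile = { granted: true, revokedAt: null, expiresAt: null, version: "1.0.0" };
console.log(hasValidConsent(onFile, { version: "2.0.0", requiresReConsent: true }, new Date()));  // false
console.log(hasValidConsent(onFile, { version: "1.0.1", requiresReConsent: false }, new Date())); // true
```

&lt;p&gt;When this returns &lt;code&gt;false&lt;/code&gt; for a type like &lt;code&gt;PRIVACY_POLICY&lt;/code&gt;, that's the signal to show the prompt again.&lt;/p&gt;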




&lt;h2&gt;
  
  
  The Controversial Opinion
&lt;/h2&gt;

&lt;p&gt;Prisma migrations &amp;gt; raw SQL. I know, I know.&lt;/p&gt;

&lt;p&gt;Raw SQL gives you fine-grained control. But when you're iterating fast, the DX difference is massive. &lt;code&gt;prisma migrate dev&lt;/code&gt; generates the SQL and tracks the migration history. It does not generate down migrations for you, so a rollback means writing a corrective migration or restoring from a backup. You can always hand-edit the generated SQL for complex migrations. But for the 90% case (adding a field, creating a table, adjusting an index) the productivity gain is real.&lt;/p&gt;

&lt;p&gt;That said: always review the generated SQL before running it in production. &lt;code&gt;prisma migrate deploy&lt;/code&gt;, not &lt;code&gt;prisma migrate dev&lt;/code&gt;, in prod. And keep backups before every migration. Trust the tool, but verify the output.&lt;/p&gt;




&lt;p&gt;What database decisions have burned you?&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Authentication That Actually Passes Security Audits</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Thu, 19 Feb 2026 12:59:26 +0000</pubDate>
      <link>https://dev.to/vapmail16/authentication-that-actually-passes-security-audits-2bl6</link>
      <guid>https://dev.to/vapmail16/authentication-that-actually-passes-security-audits-2bl6</guid>
      <description>&lt;p&gt;I ran a security self-audit on my own auth code. Found 3 issues in the first hour.&lt;/p&gt;

&lt;p&gt;Token expiry was too generous. Sessions weren't tracked by device. And password reset tokens weren't being invalidated after use. None of these would show up in a typical tutorial. All three would show up in an actual security review.&lt;/p&gt;

&lt;p&gt;Most auth tutorials teach you the basics: hash the password, sign a JWT, protect a route. That gets you through the demo. Auditors check for depth — token rotation, MFA implementation, session tracking, role hierarchies, rate limiting. The gap between "works in development" and "passes a security questionnaire" is bigger than most developers expect.&lt;/p&gt;

&lt;p&gt;Here's what a production auth system actually looks like, with real code.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. JWT + Refresh Tokens: The Cookie-Based Approach
&lt;/h2&gt;

&lt;p&gt;The first thing any auditor checks: where are you storing tokens?&lt;/p&gt;

&lt;p&gt;If the answer is &lt;code&gt;localStorage&lt;/code&gt;, you've already failed. Any XSS vulnerability — and every non-trivial app will eventually have one — gives an attacker full access to the token. Game over.&lt;/p&gt;

&lt;p&gt;HTTP-only cookies close that hole: the browser sends them automatically and JavaScript can't read them, so an XSS payload can't copy the token out. Here's the login flow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Access token: short-lived (15 min), signed with a dedicated secret&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;accessToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;expiresIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;15m&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Refresh token: long-lived (30 days), separate secret, stored in DB&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;refreshToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;refreshSecret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;expiresIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;30d&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Both tokens go into HTTP-only cookies — never in the response body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Set access token — httpOnly prevents XSS from reading it&lt;/span&gt;
&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cookie&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;accessToken&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;httpOnly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;secure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;        &lt;span class="c1"&gt;// HTTPS only&lt;/span&gt;
  &lt;span class="na"&gt;sameSite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;strict&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="c1"&gt;// CSRF protection&lt;/span&gt;
  &lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;15&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Set refresh token — same protections, longer lifespan&lt;/span&gt;
&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cookie&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;refreshToken&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;refreshToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;httpOnly&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;secure&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;sameSite&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;strict&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;maxAge&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;60&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Body contains user info only — no tokens exposed&lt;/span&gt;
&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;loginResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The refresh token is stored in the database tied to the session. When it's used, the old session is deleted and a new one is created. If a token is reused (i.e., someone stole it and the real user already rotated it), we know the session was compromised.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Refresh: verify token, check DB session, issue new access token&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;token&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;refreshToken&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;expiresAt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Invalid or expired refresh token&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isActive&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Account is disabled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;newAccessToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;expiresIn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;15m&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
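
&lt;p&gt;The snippet above only issues a new access token. To show the rotation and reuse detection described earlier, here's a self-contained sketch with an in-memory object standing in for the Session table. Two deliberate departures from the real flow, both assumptions on my part: tokens are opaque random strings instead of JWTs, and a rotated session is kept and marked with &lt;code&gt;rotatedAt&lt;/code&gt; instead of being deleted, because you can't detect reuse of a token you no longer remember.&lt;/p&gt;

```typescript
import { randomBytes } from "node:crypto";

interface StoredSession {
  userId: string;
  expiresAt: Date;
  rotatedAt: Date | null; // set once the token has been exchanged
}

// In-memory stand-in for the Session table, keyed by refresh token.
const sessions: { [token: string]: StoredSession } = {};

const THIRTY_DAYS_MS = 30 * 24 * 60 * 60 * 1000;

// Create a session and return its refresh token.
function issueSession(userId: string, now: Date): string {
  const token = randomBytes(32).toString("hex");
  sessions[token] = {
    userId,
    expiresAt: new Date(now.getTime() + THIRTY_DAYS_MS),
    rotatedAt: null,
  };
  return token;
}

// Drop every session for a user, forcing re-authentication everywhere.
function revokeAllForUser(userId: string): void {
  for (const token of Object.keys(sessions)) {
    if (sessions[token].userId === userId) delete sessions[token];
  }
}

// Exchange a refresh token for a new one; a second use of the same token is treated as theft.
function rotate(token: string, now: Date): string {
  const session = sessions[token];
  if (session === undefined) throw new Error("Invalid refresh token");
  if (session.rotatedAt !== null) {
    revokeAllForUser(session.userId); // reuse detected: kill every session
    throw new Error("Refresh token reuse detected");
  }
  if (now.getTime() >= session.expiresAt.getTime()) {
    throw new Error("Expired refresh token");
  }
  session.rotatedAt = now;
  return issueSession(session.userId, now);
}

const first = issueSession("user-1", new Date());
const second = rotate(first, new Date());
console.log(first !== second); // true
```

&lt;p&gt;If an attacker replays &lt;code&gt;first&lt;/code&gt; after the real user has rotated it, &lt;code&gt;rotate&lt;/code&gt; throws and wipes all of that user's sessions, including the one holding &lt;code&gt;second&lt;/code&gt;. In a database-backed version you'd do the lookup and the update in one transaction.&lt;/p&gt;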



&lt;p&gt;The middleware that protects routes tries the cookie first, falls back to the &lt;code&gt;Authorization&lt;/code&gt; header for API clients:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authenticate&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Prefer cookie (browser), fall back to header (API/mobile)&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;cookies&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;accessToken&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;authHeader&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Bearer &lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;authHeader&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;substring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;No token provided&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;decoded&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;verify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;config&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;jwt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;decoded&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;select&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;isActive&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;isActive&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found or disabled&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  2. MFA: Why SMS Fails Audits
&lt;/h2&gt;

&lt;p&gt;SMS-based MFA is convenient and insecure. SIM swapping, SS7 vulnerabilities, and social engineering at carrier stores are all practical attacks, which is why NIST SP 800-63B has classified SMS as a "restricted" authenticator since 2017. Auditors know this.&lt;/p&gt;

&lt;p&gt;TOTP (Google Authenticator, Authy) is the standard. Here's the setup flow — generate a secret, create a QR code, and store backup codes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;setupTotp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
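  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="c1"&gt;// findUnique returns null for an unknown id; fail before reading user.email&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;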

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;speakeasy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateSecret&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`MyApp (&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;)`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;qrCodeUrl&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;QRCode&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toDataURL&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;otpauth_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;width&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;512&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;errorCorrectionLevel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;M&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Generate 10 backup codes (stored hashed in production)&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;backupCodes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateBackupCodes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Store secret — not enabled until user verifies first code&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;mfaMethod&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;upsert&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;userId_method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TOTP&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;create&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;TOTP&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;base32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;update&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;base32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;isEnabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;secret&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;base32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;qrCodeUrl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;backupCodes&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The Prisma schema behind this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model MfaMethod {
  id        String        @id @default(uuid())
  userId    String
  method    MfaMethodType // TOTP or EMAIL
  secret    String?
  isEnabled Boolean       @default(false)
  isPrimary Boolean       @default(false)
  createdAt DateTime      @default(now())
  updatedAt DateTime      @updatedAt

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)
  @@unique([userId, method])
}

model MfaBackupCode {
  id        String    @id @default(uuid())
  userId    String
  code      String    @unique
  used      Boolean   @default(false)
  usedAt    DateTime?
  createdAt DateTime  @default(now())
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MFA isn't enabled until the user verifies their first code. This prevents locking someone out if setup was interrupted. And Email OTP exists as a fallback for accessibility — not everyone has a smartphone.&lt;/p&gt;
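&lt;p&gt;The "not enabled until verified" rule can be sketched as a small pure function. This is an illustration, not the template's actual code: &lt;code&gt;confirmTotpSetup&lt;/code&gt;, &lt;code&gt;PendingMfaMethod&lt;/code&gt; and &lt;code&gt;CodeVerifier&lt;/code&gt; are hypothetical names, and the verifier is injected so the sketch runs without a TOTP library. In a real service the verifier would presumably wrap a library check such as speakeasy's &lt;code&gt;totp.verify&lt;/code&gt;.&lt;/p&gt;

```typescript
// Hypothetical shape, loosely mirroring the MfaMethod model above.
type PendingMfaMethod = {
  userId: string;
  method: "TOTP" | "EMAIL";
  secret: string;
  isEnabled: boolean;
};

// Assumption: something that checks a 6-digit code against a secret.
type CodeVerifier = (secret: string, code: string) => boolean;

// Returns the method unchanged (still disabled) when the code is wrong,
// and an enabled copy only after the first successful verification.
function confirmTotpSetup(
  method: PendingMfaMethod,
  code: string,
  verify: CodeVerifier
): PendingMfaMethod {
  if (!verify(method.secret, code)) {
    return method;
  }
  return { ...method, isEnabled: true };
}

// Stand-in verifier for the demo only: accepts one fixed code.
const fakeVerifier: CodeVerifier = (_secret, code) => code === "123456";

const pending: PendingMfaMethod = {
  userId: "u1",
  method: "TOTP",
  secret: "DEMOSECRET",
  isEnabled: false,
};

const afterWrongCode = confirmTotpSetup(pending, "000000", fakeVerifier);
const afterRightCode = confirmTotpSetup(pending, "123456", fakeVerifier);
console.log(afterWrongCode.isEnabled, afterRightCode.isEnabled);
```

&lt;p&gt;Because an interrupted setup leaves &lt;code&gt;isEnabled&lt;/code&gt; false, the login flow never demands a code the user cannot produce.&lt;/p&gt;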

&lt;p&gt;During login, if MFA is enabled, we return a &lt;code&gt;requiresMfa&lt;/code&gt; response instead of the real session. The user completes MFA, and only then do we issue access/refresh tokens:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;enabledMfaMethods&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;requiresMfa&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;mfaMethod&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;primaryMethod&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="c1"&gt;// No tokens — need MFA verification first&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  3. OAuth Edge Cases Nobody Warns You About
&lt;/h2&gt;

&lt;p&gt;OAuth tutorials show "click Google, get user." Real implementations deal with:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Account linking:&lt;/strong&gt; User signs up with email/password, later clicks "Connect with Google." You need to match the email and link providers, not create a duplicate.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Email conflicts:&lt;/strong&gt; User has a Google account with &lt;code&gt;user@gmail.com&lt;/code&gt; and tries to link GitHub, which has a different email. Which one wins?&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-provider:&lt;/strong&gt; Supporting Google, GitHub, and Microsoft simultaneously means three different token formats and profile shapes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The link/unlink flow needs its own endpoints and audit logging:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/oauth/link&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;profile&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;verifyOAuthToken&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;linkOAuthToUser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;profile&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;auditLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;OAUTH_LINKED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;details&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;provider&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;ipAddress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;getClientIp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;success&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every OAuth operation is audit-logged. When something goes wrong (and it will), you need to know exactly what happened and when.&lt;/p&gt;
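&lt;p&gt;The account-linking and email-conflict cases above boil down to one decision per incoming profile. Here is a minimal sketch of that decision, assuming hypothetical &lt;code&gt;Account&lt;/code&gt; and &lt;code&gt;OAuthProfile&lt;/code&gt; shapes and a hypothetical &lt;code&gt;decideLink&lt;/code&gt; helper. Refusing to auto-link on an unverified provider email is an assumption of this sketch, not something the code above shows.&lt;/p&gt;

```typescript
type Identity = { provider: string; providerUserId: string };
type Account = { id: string; email: string; identities: Identity[] };
type OAuthProfile = {
  provider: string;
  providerUserId: string;
  email: string;
  emailVerified: boolean;
};
type LinkDecision =
  | "ALREADY_LINKED"
  | "LINK_TO_EXISTING"
  | "CREATE_ACCOUNT"
  | "REJECT_UNVERIFIED_EMAIL";

function decideLink(accounts: Account[], profile: OAuthProfile): LinkDecision {
  // 1. The provider identity is already attached to some account.
  const sameIdentity = (i: Identity): boolean =>
    i.provider === profile.provider
      ? i.providerUserId === profile.providerUserId
      : false;
  if (accounts.some((a) => a.identities.some(sameIdentity))) {
    return "ALREADY_LINKED";
  }

  // 2. No account uses this email (case-insensitive): nothing to duplicate.
  const wanted = profile.email.toLowerCase();
  const emailMatch = accounts.find((a) => a.email.toLowerCase() === wanted);
  if (emailMatch === undefined) {
    return "CREATE_ACCOUNT";
  }

  // 3. Email matches an existing account: link only if the provider
  //    vouches for the address (assumption of this sketch).
  return profile.emailVerified ? "LINK_TO_EXISTING" : "REJECT_UNVERIFIED_EMAIL";
}

const demoAccounts: Account[] = [
  {
    id: "a1",
    email: "User@Example.com",
    identities: [{ provider: "google", providerUserId: "g-1" }],
  },
];

console.log(
  decideLink(demoAccounts, {
    provider: "github",
    providerUserId: "gh-9",
    email: "user@example.com",
    emailVerified: true,
  })
);
```

&lt;p&gt;Keeping the decision separate from the database write makes each branch easy to unit test and gives the audit log a precise outcome to record.&lt;/p&gt;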




&lt;h2&gt;
  
  
  4. Session Management
&lt;/h2&gt;

&lt;p&gt;Sessions are tracked with device info. This isn't just for the "active sessions" UI — it's a security requirement:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;model Session {
  id        String   @id @default(uuid())
  userId    String
  token     String   @unique
  expiresAt DateTime
  userAgent String?
  ipAddress String?
  createdAt DateTime @default(now())

  user User @relation(fields: [userId], references: [id], onDelete: Cascade)
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On login, existing sessions for that user are cleared. On password change, all sessions are revoked. This means: compromised password → change password → attacker is immediately locked out, even if they have a valid refresh token.&lt;/p&gt;
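&lt;p&gt;As a rough illustration of the revocation rule, here is an in-memory sketch. &lt;code&gt;SessionStore&lt;/code&gt; and its methods are hypothetical stand-ins for the Prisma calls against the &lt;code&gt;Session&lt;/code&gt; model, and timestamps are plain numbers to keep the example self-contained.&lt;/p&gt;

```typescript
type SessionRow = {
  id: string;
  userId: string;
  token: string;
  expiresAt: number; // epoch milliseconds in this sketch
};

class SessionStore {
  private sessions: SessionRow[] = [];

  create(session: SessionRow): void {
    this.sessions.push(session);
  }

  // Called on password change: drop every session the user owns.
  // Returns how many sessions were revoked.
  revokeAllForUser(userId: string): number {
    const before = this.sessions.length;
    this.sessions = this.sessions.filter((s) => s.userId !== userId);
    return before - this.sessions.length;
  }

  // A refresh succeeds only while the session row exists and is unexpired.
  isValid(token: string, now: number): boolean {
    const session = this.sessions.find((s) => s.token === token);
    return session !== undefined ? session.expiresAt > now : false;
  }
}

const store = new SessionStore();
store.create({ id: "s1", userId: "u1", token: "attacker-token", expiresAt: 2000 });
store.create({ id: "s2", userId: "u1", token: "owner-token", expiresAt: 2000 });
store.create({ id: "s3", userId: "u2", token: "other-user", expiresAt: 2000 });

// Password change for u1 revokes both of u1's sessions and nobody else's.
const revokedCount = store.revokeAllForUser("u1");
console.log(revokedCount, store.isValid("attacker-token", 1000));
```

&lt;p&gt;The point of the design is that validity is decided by the server-side row, so a stolen token is useless once that row is gone.&lt;/p&gt;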




&lt;h2&gt;
  
  
  5. RBAC: Role Hierarchy With One Middleware
&lt;/h2&gt;

&lt;p&gt;Three roles (&lt;code&gt;USER&lt;/code&gt;, &lt;code&gt;ADMIN&lt;/code&gt;, &lt;code&gt;SUPER_ADMIN&lt;/code&gt;) form the hierarchy. The middleware stays simple because each route lists every role it allows, so the hierarchy is explicit at the call site:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requireRole&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nx"&gt;roles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;next&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;UnauthorizedError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Authentication required&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;roles&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ForbiddenError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Insufficient permissions&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="nf"&gt;next&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="c1"&gt;// Usage&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/admin/users&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;requireRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ADMIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SUPER_ADMIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/admin/users/:id&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;authenticate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;requireRole&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;SUPER_ADMIN&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;handler&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every role change is audit-logged. An admin promoting someone to &lt;code&gt;SUPER_ADMIN&lt;/code&gt; is a high-severity event that should be visible in your audit trail.&lt;/p&gt;
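&lt;p&gt;The middleware above matches against an explicit list of roles. If you would rather have higher roles inherit lower ones automatically, one possible variant is a rank table. This is a sketch under assumptions: &lt;code&gt;ROLE_RANK&lt;/code&gt;, &lt;code&gt;hasAtLeastRole&lt;/code&gt; and &lt;code&gt;requireMinRole&lt;/code&gt; are hypothetical names, the request type is trimmed down, and plain &lt;code&gt;Error&lt;/code&gt; stands in for &lt;code&gt;UnauthorizedError&lt;/code&gt; and &lt;code&gt;ForbiddenError&lt;/code&gt;.&lt;/p&gt;

```typescript
type Role = "USER" | "ADMIN" | "SUPER_ADMIN";

// Hypothetical rank table: a higher number inherits everything below it.
const ROLE_RANK: { [role in Role]: number } = {
  USER: 0,
  ADMIN: 1,
  SUPER_ADMIN: 2,
};

// True when `actual` sits at or above `required` in the hierarchy.
function hasAtLeastRole(actual: Role, required: Role): boolean {
  return ROLE_RANK[actual] >= ROLE_RANK[required];
}

// Minimal Express-like types so the sketch compiles on its own.
type RoleRequest = { user?: { role: Role } };
type NextFn = (err?: Error) => void;

function requireMinRole(required: Role) {
  return (req: RoleRequest, _res: unknown, next: NextFn): void => {
    if (!req.user) {
      next(new Error("Authentication required"));
      return;
    }
    if (!hasAtLeastRole(req.user.role, required)) {
      next(new Error("Insufficient permissions"));
      return;
    }
    next();
  };
}

// Demo: collect what the middleware passes to next().
const outcomes: string[] = [];
const record: NextFn = (err) => outcomes.push(err ? err.message : "ok");

requireMinRole("ADMIN")({ user: { role: "SUPER_ADMIN" } }, null, record);
requireMinRole("ADMIN")({ user: { role: "USER" } }, null, record);
requireMinRole("ADMIN")({}, null, record);
console.log(outcomes);
```

&lt;p&gt;The trade-off: a rank table is shorter at the call site, while the explicit list makes every grant visible in the route definition, which some auditors prefer.&lt;/p&gt;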




&lt;h2&gt;
  
  
  Auth Audit Readiness Checklist
&lt;/h2&gt;

&lt;p&gt;Run this against your auth. Score yourself honestly.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Check&lt;/th&gt;
&lt;th&gt;What Auditors Look For&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;Token storage&lt;/td&gt;
&lt;td&gt;HTTP-only cookies, not localStorage&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;Access token expiry&lt;/td&gt;
&lt;td&gt;15 minutes or less&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;Refresh token rotation&lt;/td&gt;
&lt;td&gt;Old token invalidated on use&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;MFA support&lt;/td&gt;
&lt;td&gt;TOTP (not just SMS)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;Backup codes&lt;/td&gt;
&lt;td&gt;Generated and stored for account recovery&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;Rate limiting&lt;/td&gt;
&lt;td&gt;Login, registration, password reset endpoints&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;Session tracking&lt;/td&gt;
&lt;td&gt;IP, user agent, device logged per session&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;Password change revocation&lt;/td&gt;
&lt;td&gt;All other sessions killed on password change&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;OAuth audit trail&lt;/td&gt;
&lt;td&gt;Link/unlink events logged with IP and timestamp&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10&lt;/td&gt;
&lt;td&gt;Role change logging&lt;/td&gt;
&lt;td&gt;Every privilege escalation tracked in audit log&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;If your auth tutorial doesn't cover token rotation and reuse detection, it's teaching you to build a vulnerability.&lt;/p&gt;
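&lt;p&gt;For readers who have not seen rotation with reuse detection, here is a minimal in-memory sketch of the idea. Everything in it is illustrative: &lt;code&gt;RefreshTokenStore&lt;/code&gt;, the &lt;code&gt;familyId&lt;/code&gt; field and the counter-based token strings are assumptions, and a real implementation would use random tokens stored hashed in the database.&lt;/p&gt;

```typescript
type RefreshRecord = {
  token: string;
  userId: string;
  familyId: string; // all tokens descended from one login share a family
  used: boolean;
};

class RefreshTokenStore {
  private records: RefreshRecord[] = [];
  private counter = 0;

  issue(userId: string, familyId: string): string {
    this.counter += 1;
    // Placeholder value; a real token must come from a secure random source.
    const token = `rt-${familyId}-${this.counter}`;
    this.records.push({ token, userId, familyId, used: false });
    return token;
  }

  // Returns a fresh token, or null when the presented token is unknown,
  // revoked, or has already been used once.
  rotate(presented: string): string | null {
    const found = this.records.find((r) => r.token === presented);
    if (found === undefined) {
      return null;
    }
    if (found.used) {
      // Reuse detected: someone replayed an old token, so revoke the
      // entire family, including the newest token.
      const familyId = found.familyId;
      this.records = this.records.filter((r) => r.familyId !== familyId);
      return null;
    }
    found.used = true;
    return this.issue(found.userId, found.familyId);
  }
}

const tokens = new RefreshTokenStore();
const first = tokens.issue("u1", "login-1");
const second = tokens.rotate(first); // normal rotation
const replay = tokens.rotate(first); // old token replayed
const afterReplay = second === null ? null : tokens.rotate(second);
console.log(second !== null, replay, afterReplay);
```

&lt;p&gt;After the replay, even the legitimate newest token stops working. That forces a fresh login, which is the intended outcome when you cannot tell the attacker from the owner.&lt;/p&gt;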




&lt;p&gt;Run this checklist against your auth. What did you score?&lt;/p&gt;

</description>
      <category>backend</category>
      <category>security</category>
      <category>tutorial</category>
      <category>webdev</category>
    </item>
    <item>
      <title>The #1 Thing Missing From Every SaaS Starter Kit</title>
      <dc:creator>vapmail16</dc:creator>
      <pubDate>Wed, 18 Feb 2026 12:16:52 +0000</pubDate>
      <link>https://dev.to/vapmail16/the-1-thing-missing-from-every-saas-starter-kit-437i</link>
      <guid>https://dev.to/vapmail16/the-1-thing-missing-from-every-saas-starter-kit-437i</guid>
      <description>&lt;p&gt;I've evaluated dozens of SaaS starter kits. Next.js boilerplates, Express templates, Rails scaffolds — you name it.&lt;/p&gt;

&lt;p&gt;They all nail the same things: authentication, a pretty dashboard, Stripe integration, maybe some admin CRUD. And they all skip the same thing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compliance.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Not "we added a cookie banner" compliance. I mean actual GDPR data export, right-to-deletion, consent management, audit logging, field-level encryption, breach notification — the stuff that turns a weekend project into something you can actually sell to businesses in the EU (or anywhere with privacy laws, which is increasingly everywhere).&lt;/p&gt;

&lt;p&gt;I spent months building these features into a SaaS template. Here's what I learned and why most starter kits get this wrong.&lt;/p&gt;




&lt;h2&gt;
  
  
  The "We'll Add It Later" Trap
&lt;/h2&gt;

&lt;p&gt;Every developer who's shipped a SaaS product knows this conversation:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"We'll handle GDPR later."&lt;br&gt;
"Let's just get to market first."&lt;br&gt;
"We only have 50 users, nobody's going to request a data export."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Then you land your first enterprise client. Their procurement team sends over a 40-page security questionnaire. Question 7: "Describe your data subject access request (DSAR) process." Question 23: "How do you handle data retention and deletion?" Question 31: "Provide evidence of audit logging for PII access."&lt;/p&gt;

&lt;p&gt;And suddenly "later" is now, and you're bolting compliance onto an architecture that was never designed for it.&lt;/p&gt;

&lt;p&gt;The fix isn't complicated — but it needs to be &lt;strong&gt;baked in from the start&lt;/strong&gt;, not layered on after.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Real GDPR Data Export Looks Like in Code
&lt;/h2&gt;

&lt;p&gt;Most starter kits that claim "GDPR support" give you a checkbox component and call it done. Here's what an actual data export implementation looks like.&lt;/p&gt;

&lt;p&gt;First, the service that collects everything you store about a user:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;generateDataExport&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;requestId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataExportRequest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;requestId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Export request not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataExportRequest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;requestId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DataExportStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PROCESSING&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;request&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="c1"&gt;// Collect ALL user data — not just the profile&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;auditLogs&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;notifications&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;payments&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;consentRecords&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;all&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;session&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;auditLog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;notification&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;payment&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="na"&gt;include&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;refunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;subscription&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
      &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;consentRecord&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findMany&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;exportData&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;user&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;email&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;role&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;expiresAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;expiresAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;ipAddress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ipAddress&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})),&lt;/span&gt;
    &lt;span class="na"&gt;auditLogs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;auditLogs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;ipAddress&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;a&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ipAddress&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})),&lt;/span&gt;
    &lt;span class="na"&gt;payments&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;payments&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
      &lt;span class="na"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;amount&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;currency&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;refunds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;refunds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;})),&lt;/span&gt;
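    &lt;span class="c1"&gt;// Include the remaining collections fetched above&lt;/span&gt;
    &lt;span class="nx"&gt;notifications&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="nx"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;consents&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;consentRecords&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;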
    &lt;span class="na"&gt;exportMetadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;requestId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;generatedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;toISOString&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;JSON&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// Update request, set 7-day download window&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;expiresAt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="nx"&gt;expiresAt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setDate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;expiresAt&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getDate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataExportRequest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;requestId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DataExportStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;COMPLETED&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;completedAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="na"&gt;downloadUrl&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`/api/gdpr/exports/&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;requestId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/download`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;fileSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;byteLength&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;exportData&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;utf8&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="nx"&gt;expiresAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;exportData&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things to notice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;You must export everything&lt;/strong&gt;: sessions, audit logs, payment history, consent records, IP addresses. Not just the profile. GDPR Article 15 gives users the right to a copy of all the personal data you process about them.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Download links expire&lt;/strong&gt; — you don't want a permanent URL to someone's entire data sitting in their email forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The request itself is tracked&lt;/strong&gt; — status goes from PENDING → PROCESSING → COMPLETED (or FAILED), with timestamps and audit logging at each step.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The route that serves the download is equally important. It validates ownership, checks expiry, and sets proper headers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nx"&gt;router&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/exports/:id/download&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nf"&gt;asyncHandler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;jsonString&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;gdprService&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getExportDataForDownload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
      &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="c1"&gt;// ensures only the data owner can download&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;filename&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;`data-export-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;req&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;params&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;.json`&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Type&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;application/json&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Content-Disposition&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;`attachment; filename="&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;"`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;jsonString&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
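
&lt;p&gt;The ownership and expiry checks live inside &lt;code&gt;getExportDataForDownload&lt;/code&gt;, which isn't shown here. As a rough sketch of what that guard might look like, here is a version that works on an in-memory record instead of Prisma. The &lt;code&gt;ExportRecord&lt;/code&gt; shape, the &lt;code&gt;checkExportAccess&lt;/code&gt; name, the &lt;code&gt;payload&lt;/code&gt; field and the error messages are all assumptions for illustration:&lt;/p&gt;

```typescript
// Hypothetical shape of a stored export; field names mirror the service code where possible.
type ExportRecord = {
  id: string;
  userId: string;
  status: 'PENDING' | 'PROCESSING' | 'COMPLETED' | 'FAILED';
  expiresAt: Date | null;
  payload: string; // serialized JSON produced by the export job (assumed storage)
};

// Returns the payload only if the caller owns the export, it is complete, and it has not expired.
function checkExportAccess(record: ExportRecord | null, requesterId: string, now: Date): string {
  // Treat "missing" and "not yours" identically so export IDs can't be probed.
  if (record === null || record.userId !== requesterId) {
    throw new Error('Export request not found');
  }
  if (record.status !== 'COMPLETED') {
    throw new Error('Export is not ready');
  }
  if (record.expiresAt === null || now.getTime() > record.expiresAt.getTime()) {
    throw new Error('Export link has expired');
  }
  return record.payload;
}
```

&lt;p&gt;Passing &lt;code&gt;now&lt;/code&gt; in as a parameter keeps the expiry branch easy to test without mocking the clock.&lt;/p&gt;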






&lt;h2&gt;
  
  
  Data Deletion Is Harder Than You Think
&lt;/h2&gt;

&lt;p&gt;"Just delete the user row." If only.&lt;/p&gt;

&lt;p&gt;Real right-to-erasure means you need two modes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Soft delete (anonymize):&lt;/strong&gt; Replace PII with anonymized values, keep the record for accounting/legal obligations. You can't delete a payment record — your accountant needs it — but you can strip the name and email.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard delete:&lt;/strong&gt; Actually remove everything. Cascade through sessions, notifications, audit logs, payments, consents.&lt;/li&gt;
&lt;/ul&gt;
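
&lt;p&gt;To make the soft-delete mode concrete, here is a minimal sketch of anonymizing a user row while keeping its primary key, so payment records can still reference it. The &lt;code&gt;UserRow&lt;/code&gt; shape, the &lt;code&gt;anonymizeUser&lt;/code&gt; name and the placeholder email format are assumptions, not the actual schema:&lt;/p&gt;

```typescript
// Hypothetical user shape; a real schema would have more PII fields to scrub.
type UserRow = {
  id: string;
  email: string;
  name: string | null;
  deletedAt: Date | null;
};

// Soft delete: keep the row (and its id, so payments still join) but replace every PII field.
function anonymizeUser(user: UserRow, now: Date): UserRow {
  return {
    id: user.id,
    // Derive the placeholder from the id so a unique constraint on email still holds.
    email: `deleted-${user.id}@anonymized.invalid`,
    name: null,
    deletedAt: now,
  };
}
```

&lt;p&gt;The function returns a new object rather than mutating its input, which makes it straightforward to pass the result to a single update call.&lt;/p&gt;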

&lt;p&gt;And crucially, deletion requires &lt;strong&gt;email confirmation&lt;/strong&gt; before it executes. You don't want a compromised session to wipe a user's entire account:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;requestDataDeletion&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;deletionType&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DeletionType&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;DeletionType&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;SOFT&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
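  &lt;span class="c1"&gt;// Load the user first: the confirmation email below needs their address and name&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;findUnique&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;where&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;NotFoundError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;User not found&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;confirmationToken&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;crypto&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;randomBytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;hex&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;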

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;deletionRequest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;prisma&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;dataDeletionRequest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;data&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;deletionType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;reason&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="nx"&gt;confirmationToken&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;DataDeletionStatus&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;PENDING&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="c1"&gt;// Send confirmation email — deletion only proceeds after click&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendDataDeletionRequestConfirmationEmail&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;to&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;user&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;confirmationLink&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;frontendUrl&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/gdpr?confirmDeletion=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;confirmationToken&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createAuditLog&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;DATA_DELETION_REQUESTED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;resource&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;data_deletion_requests&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;resourceId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;deletionRequest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;details&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;deletionType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reason&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;deletionRequest&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Every step is audit-logged. The confirmation token is cryptographically random. The request sits in PENDING until the user clicks the email link. This is the kind of thing that takes a couple of days to get right — and that's before you write the tests.&lt;/p&gt;
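
&lt;p&gt;The other half of that flow is the handler behind the email link. A minimal sketch, assuming an in-memory array in place of the &lt;code&gt;dataDeletionRequest&lt;/code&gt; table, might look like this. The &lt;code&gt;CONFIRMED&lt;/code&gt; status, the &lt;code&gt;confirmedAt&lt;/code&gt; field and the &lt;code&gt;confirmDeletion&lt;/code&gt; name are hypothetical:&lt;/p&gt;

```typescript
type DeletionStatus = 'PENDING' | 'CONFIRMED' | 'COMPLETED';

// Hypothetical in-memory stand-in for the dataDeletionRequest table.
type DeletionRequest = {
  id: string;
  userId: string;
  confirmationToken: string;
  status: DeletionStatus;
  confirmedAt: Date | null;
};

// Looks up a PENDING request by token and marks it CONFIRMED.
// Returns null for unknown or already-used tokens so a link cannot be replayed.
function confirmDeletion(requests: DeletionRequest[], token: string, now: Date): DeletionRequest | null {
  const match = requests.find((r) => r.confirmationToken === token);
  if (match === undefined || match.status !== 'PENDING') {
    return null;
  }
  match.status = 'CONFIRMED';
  match.confirmedAt = now;
  return match;
}
```

&lt;p&gt;A real implementation would probably also expire unconfirmed tokens after a fixed window and write an audit log entry on confirmation.&lt;/p&gt;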




&lt;h2&gt;
  
  
  The Iceberg Below the Surface
&lt;/h2&gt;

&lt;p&gt;Data export and deletion are the visible parts. Underneath, you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Consent management&lt;/strong&gt; with versioned consent records (users agreed to v2.1 of your privacy policy on this date, from this IP).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Audit logging&lt;/strong&gt; on every PII access — not just writes, but reads. Who looked at what, when, and why.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field-level encryption&lt;/strong&gt; for ultra-sensitive fields (phone numbers, addresses, tax IDs) as a second layer beyond database encryption.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data retention policies&lt;/strong&gt; that automatically purge expired data.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Breach notification&lt;/strong&gt; workflows. GDPR Article 33 requires notifying the supervisory authority within 72 hours of becoming aware of a breach.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CSRF protection&lt;/strong&gt; on every state-changing endpoint, because a compliance feature that can be exploited via cross-site request forgery isn't really compliant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PII masking in logs&lt;/strong&gt; so your error tracking doesn't accidentally become a data breach.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these is individually straightforward. Together, they represent weeks of careful engineering. And they all need tests — you can't just eyeball compliance.&lt;/p&gt;
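
&lt;p&gt;To make one of those items concrete, here is a small sketch of PII masking for log context objects. It is deliberately simple: the email pattern is a loose approximation, and the key list and the &lt;code&gt;maskPii&lt;/code&gt; name are assumptions you would adapt to your own schema and logger:&lt;/p&gt;

```typescript
// Matches most email addresses; deliberately simple, not a full RFC 5322 parser.
const EMAIL_PATTERN = /[^\s@]+@[^\s@]+\.[^\s@]+/g;

// Keys whose values are always redacted (hypothetical list; extend for your schema).
const SENSITIVE_KEYS = ['password', 'token', 'ipAddress', 'phone'];

// Returns a copy of a flat log context that is safer to ship to an error tracker.
function maskPii(context: { [key: string]: unknown }): { [key: string]: unknown } {
  const masked: { [key: string]: unknown } = {};
  for (const key of Object.keys(context)) {
    const value = context[key];
    if (SENSITIVE_KEYS.indexOf(key) !== -1) {
      // Known-sensitive key: drop the value entirely.
      masked[key] = '[REDACTED]';
    } else if (typeof value === 'string') {
      // Free text: scrub anything that looks like an email address.
      masked[key] = value.replace(EMAIL_PATTERN, '[EMAIL]');
    } else {
      masked[key] = value;
    }
  }
  return masked;
}
```

&lt;p&gt;This only handles flat objects. Nested contexts would need a recursive walk, which is left out to keep the sketch short.&lt;/p&gt;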




&lt;h2&gt;
  
  
  Why Starter Kits Skip This
&lt;/h2&gt;

&lt;p&gt;I get it. Compliance features don't make for exciting demo videos. Nobody screenshots a consent versioning system for their landing page. "Look at this beautiful audit log table" doesn't go viral on Twitter.&lt;/p&gt;

&lt;p&gt;But here's the business reality: &lt;strong&gt;compliance is a feature that sells&lt;/strong&gt;. Enterprise buyers specifically ask for it. EU-based customers require it by law. And the fines for getting it wrong are not theoretical — they're in the headlines every month.&lt;/p&gt;

&lt;p&gt;The starter kits that skip compliance are optimizing for the first sale. The ones that include it are optimizing for every sale after that.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I'd Tell My Past Self
&lt;/h2&gt;

&lt;p&gt;If you're building a SaaS template (or choosing one), here's the order in which I'd prioritize compliance work:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Audit logging&lt;/strong&gt; — add it first, to everything. You'll thank yourself when debugging and when filling out security questionnaires.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data export&lt;/strong&gt;: access requests under GDPR Article 15 are among the most common you'll receive. Have a one-click solution ready.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Consent management&lt;/strong&gt; — versioned, timestamped, with IP. Not a single boolean "agreed to terms."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Data deletion&lt;/strong&gt; — with soft/hard modes and email confirmation. Test the cascading deletes thoroughly.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Encryption&lt;/strong&gt; — field-level for PII, on top of whatever your database provides.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Everything else&lt;/strong&gt; — retention, breach notification, CSRF, monitoring. Important, but the first five cover 80% of what procurement teams ask about.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Build it in from day one. It's 10x easier than retrofitting it after you have real users and real data.&lt;/p&gt;
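
&lt;p&gt;As an illustration of why consent should not be a single boolean, here is a minimal sketch of an append-only consent history and a check against the current policy version. The &lt;code&gt;ConsentRecord&lt;/code&gt; fields and the &lt;code&gt;hasCurrentConsent&lt;/code&gt; helper are assumptions for the example, not the kit's actual model:&lt;/p&gt;

```typescript
// Hypothetical consent record: one row per decision, never updated in place.
type ConsentRecord = {
  userId: string;
  policy: string; // e.g. 'privacy-policy'
  version: string; // e.g. '2.1'
  granted: boolean;
  ipAddress: string;
  recordedAt: Date;
};

// The latest record for a policy wins; older rows stay intact for audits.
function hasCurrentConsent(records: ConsentRecord[], userId: string, policy: string, currentVersion: string): boolean {
  const relevant = records
    .filter((r) => r.userId === userId)
    .filter((r) => r.policy === policy)
    .sort((a, b) => b.recordedAt.getTime() - a.recordedAt.getTime());
  const latest = relevant[0];
  if (latest === undefined) {
    return false;
  }
  // Consent to an older policy version does not carry over to the current one.
  if (latest.version !== currentVersion) {
    return false;
  }
  return latest.granted;
}
```

&lt;p&gt;Because withdrawals are stored as new rows with &lt;code&gt;granted: false&lt;/code&gt;, the full history of what a user agreed to, and when, is preserved.&lt;/p&gt;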




&lt;p&gt;What compliance features do you wish your starter had? I'm genuinely curious — drop a comment below.&lt;/p&gt;

</description>
    </item>
  </channel>
</rss>
