<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Iurii Rogulia</title>
    <description>The latest articles on DEV Community by Iurii Rogulia (@iurii_rogulia).</description>
    <link>https://dev.to/iurii_rogulia</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3561015%2Fd1b53175-2e87-4fa8-9c54-90f6b713141b.jpg</url>
      <title>DEV Community: Iurii Rogulia</title>
      <link>https://dev.to/iurii_rogulia</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/iurii_rogulia"/>
    <language>en</language>
    <item>
      <title>AI Coding for Senior Developers: What to Delegate (and Not)</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Thu, 02 Jul 2026 10:00:44 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/ai-coding-for-senior-developers-what-to-delegate-and-not-3bph</link>
      <guid>https://dev.to/iurii_rogulia/ai-coding-for-senior-developers-what-to-delegate-and-not-3bph</guid>
      <description>&lt;p&gt;It's Tuesday morning. I have a task in the backlog: add a new billing event type to an existing webhook pipeline. The schema exists, the handler pattern exists, three similar event types are already wired up. I open the agent, point it at the relevant files, and describe what I need. Twelve minutes later the implementation is done, the tests pass, and I'm reading through the diff before committing.&lt;/p&gt;

&lt;p&gt;That's most of it. No drama, no revelation. This is what a productive morning looks like now.&lt;/p&gt;

&lt;p&gt;But those twelve minutes involve a specific kind of attention that I didn't have before working with AI tools regularly — a trained sense for where to slow down and where to let the agent run. That sense is what I want to write about here. Not the philosophy of it (I covered &lt;a href="https://iurii.rogulia.fi/blog/ai-coding-senior-vs-junior" rel="noopener noreferrer"&gt;why AI amplifies rather than equalizes in the previous article&lt;/a&gt;). The operational shape of an actual workday.&lt;/p&gt;

&lt;h2&gt;Where I Delegate Without Hesitation&lt;/h2&gt;

&lt;p&gt;"Trust completely" is a bad frame because I still read what the model produces. A better frame is: there are tasks where the right answer is well-defined enough that I can evaluate the output quickly without deep scrutiny. I delegate these freely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boilerplate that follows an established pattern.&lt;/strong&gt; If I need a new CRUD endpoint and ten similar endpoints already exist in the codebase, I point the agent at two of them and ask for the new one in the same style. The output is usually correct on the first pass. I read it, I verify it follows the same error-handling conventions, and I move on. The value here isn't that the model is smarter than me — it's that it can hold the existing pattern in context and reproduce it accurately without me having to do so consciously.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tests for behavior I've already specified in prose.&lt;/strong&gt; If I've written a description of what a function should do — what inputs it accepts, what edge cases it handles, what it should return or throw — the model writes tests that match that description well. This is different from asking the model to write tests for code it just wrote, which produces a different and much lower-quality result. &lt;a href="https://iurii.rogulia.fi/blog/vibe-coded-codebase-patterns" rel="noopener noreferrer"&gt;I described that pattern in detail in the first article&lt;/a&gt;. The distinction matters: tests from my specification test behavior; tests from the model's own code test the implementation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Type definitions and interfaces.&lt;/strong&gt; Describing a data structure in plain language and asking for the TypeScript interface is something the model does better than I would if I were typing from scratch. The output is complete, well-named, and includes JSDoc when I ask. I review it, adjust names where I have preferences, and move on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Regular expressions.&lt;/strong&gt; These used to take me twenty minutes even for moderately complex cases — writing the pattern, testing against edge cases in a REPL, adjusting. Now: I describe the matching requirements including the edge cases I care about, the model produces the regex, and I verify it against the inputs I actually have. Elapsed time went from roughly twenty minutes to roughly two. The model is also better than I am at producing readable regexes with named groups, because it has pattern-matched on a huge number of examples.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Format conversions and data transformations.&lt;/strong&gt; Parsing an API response into a different shape, writing a one-off CSV-to-JSON converter, building a jq pipeline for a log file. These are well-defined inputs-to-outputs with no ambiguity about what "correct" means. I describe the input format, describe the output format, and the model produces the code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bash one-liners and scripting.&lt;/strong&gt; I know enough Bash to get things done. I don't enjoy writing Bash. The model does fine here and I review the output before running anything that touches production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documentation for code I've already written.&lt;/strong&gt; I write comments where decisions need explanation; the model fills in JSDoc blocks and inline clarifications for things that are genuinely readable but underdocumented. The result is acceptable as a first draft; I adjust the parts that don't sound like how I would write.&lt;/p&gt;

&lt;h2&gt;Where I Use AI But Read Every Line&lt;/h2&gt;

&lt;p&gt;These are the high-stakes areas — tasks where I still use AI assistance because the drafting speed is real, but where I treat the output as a pull request from a developer I don't fully trust yet. Every line gets read. Important parts get tested in isolation before the code goes anywhere near production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SQL queries, especially with JOINs and multi-tenancy.&lt;/strong&gt; I build on an existing Drizzle ORM schema, and the model can produce correct queries from my description. But "correct" for a query means more than "returns results." It means filtered by the right tenant, using an index that actually exists, returning only the columns the caller needs, and behaving predictably under the data distribution I have in production. For a &lt;a href="https://iurii.rogulia.fi/services/mvp-development" rel="noopener noreferrer"&gt;SaaS product&lt;/a&gt; with proper multi-tenancy, this is the area where I see the most subtle AI-generated mistakes. For any query that touches more than one table or runs in a hot path, I read the output carefully and run &lt;code&gt;EXPLAIN ANALYZE&lt;/code&gt; on the actual production database before deploying. The model cannot know what my indexes are or what my data looks like. I do.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Anything that touches money.&lt;/strong&gt; Stripe charge creation, invoice line item calculation, VAT application, refund logic. The model drafts; I review as if I were a senior engineer who doesn't trust the author. Every conditional branch. Every rounding behavior. Every failure path. If a payment goes wrong, the damage is real and often irreversible. Speed on this code is not a priority.&lt;/p&gt;
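&lt;p&gt;One way to make "every rounding behavior" reviewable is to keep the arithmetic in integers. The following is a minimal sketch under stated assumptions: amounts are stored as integer cents, rates as basis points, and rounding is half up. The name &lt;code&gt;vatCents&lt;/code&gt; is hypothetical, and the rounding rule that actually applies depends on the jurisdiction and the payment provider.&lt;/p&gt;

```typescript
// Illustrative only: `vatCents` is a hypothetical helper, not part of any library.
// Amounts are integer minor units (cents); the rate is in basis points,
// so 25.5% is written as 2550. Integer inputs avoid binary floating-point drift.
function vatCents(netCents: number, rateBasisPoints: number): number {
  if (!Number.isInteger(netCents) || !Number.isInteger(rateBasisPoints)) {
    throw new RangeError("netCents and rateBasisPoints must be integers");
  }
  // Assumed rule: round half up, applied once to the exact integer product.
  return Math.floor((netCents * rateBasisPoints + 5000) / 10000);
}

// Usage: 100.00 net at 25.5% VAT.
const vatOnHundred = vatCents(10000, 2550);
console.log(vatOnHundred, 10000 + vatOnHundred); // 2550 12550
```

&lt;p&gt;With a single, named rounding step, the review question becomes "is this the rule we agreed on?" rather than "where does rounding happen?"&lt;/p&gt;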

&lt;p&gt;&lt;strong&gt;User input handling.&lt;/strong&gt; Any code that accepts data from outside the system: form submissions, webhook payloads, API parameters. I check that validation covers the edge cases I can think of, that error messages don't leak internal structure, and that no input can reach a database or a file system without going through the validation layer first.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Auth checks inside functions.&lt;/strong&gt; Not the architectural question of where auth should happen — that's a decision I make and document before the model writes any code. But when a function has a guard clause that checks permissions or ownership, I read it carefully. Subtle errors here — checking the wrong field, failing open rather than closed, not accounting for a null case — are the kind of thing that's invisible in tests and visible only when someone exploits it.&lt;/p&gt;
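&lt;p&gt;A small sketch of the "failing open" shape, and a fail-closed version of the same guard. The types and function names here (&lt;code&gt;UserSession&lt;/code&gt;, &lt;code&gt;InvoiceRecord&lt;/code&gt;, &lt;code&gt;canEdit&lt;/code&gt;) are invented for illustration and do not come from a real codebase.&lt;/p&gt;

```typescript
// Hypothetical shapes for illustration; real session and record types will differ.
interface UserSession {
  userId: string | null;
}
interface InvoiceRecord {
  id: string;
  ownerId: string | null;
}

// Fails open: when both ids are null (anonymous session, orphaned record),
// the comparison is true and access is granted.
function canEditFailOpen(session: UserSession, invoice: InvoiceRecord): boolean {
  return session.userId === invoice.ownerId;
}

// Fails closed: every missing value is an explicit denial before any comparison.
function canEdit(session: UserSession | null, invoice: InvoiceRecord): boolean {
  const userId = session?.userId ?? null;
  if (userId === null || invoice.ownerId === null) {
    return false;
  }
  return userId === invoice.ownerId;
}

const anonymous: UserSession = { userId: null };
const orphaned: InvoiceRecord = { id: "inv_1", ownerId: null };
console.log(canEditFailOpen(anonymous, orphaned)); // true
console.log(canEdit(anonymous, orphaned)); // false
```

&lt;p&gt;A happy-path test with two real ids passes for both versions, which is why this class of bug tends to survive the test suite.&lt;/p&gt;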

&lt;p&gt;&lt;strong&gt;Migrations.&lt;/strong&gt; The model generates them quickly and the syntax is usually correct. I read every migration twice: once for what it does, once for what it might do to existing data. I run migrations on a copy of production data before running them on production. The model has no knowledge of the actual rows in the database.&lt;/p&gt;

&lt;h2&gt;Where I Don't Delegate&lt;/h2&gt;


&lt;p&gt;This list is short because the previous article covered the principles. Here I want to state these as practices, not arguments.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Architectural decisions.&lt;/strong&gt; Where modules live, how the application is layered, what the module boundary is between two services. I make these decisions and communicate them to the model explicitly before it writes anything. If I don't, the model will make them for me, locally and inconsistently. It will also make them differently in different sessions. I use a &lt;code&gt;CLAUDE.md&lt;/code&gt; file in every project to capture these decisions — more on that in the next article in this series.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security topology.&lt;/strong&gt; Where authentication happens. Which routes are public. What an unauthenticated request is allowed to touch. The model can implement the mechanism; I decide where it lives. This is not a task that can be delegated safely because its correctness depends on understanding the full request path, and the model doesn't have that understanding unless I've given it explicitly — and even then I verify.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Schema design.&lt;/strong&gt; The model drafts DDL quickly. The decision about which columns are nullable, which indexes serve the queries that will actually run, whether to denormalize this field or keep it normalized — those require knowing the query patterns, the data growth projections, and the operational constraints. I make these decisions and then ask the model to generate the DDL.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Trade-off decisions.&lt;/strong&gt; The model is good at presenting options. "Here are three ways to implement this caching layer, with trade-offs." I use that output as a starting point. But the actual choice requires context the model doesn't have: what the operational team can support, what the cost ceiling is, what technical debt already exists in this area, what the team's experience is with each option. The model doesn't know. I tell it what I've decided, not the other way around.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Production incident debugging.&lt;/strong&gt; When something is actively broken, I want a direct path from observation to diagnosis. The model introduces latency into that path because it generates hypotheses that sound plausible but are based on general patterns, not the specific state of the system I'm looking at. In a production incident I want logs, metrics, and my own knowledge of the system — not a confident-sounding guess about what might be wrong. I turn the agent off and debug directly.&lt;/p&gt;
&lt;h2&gt;Signals That Make Me Stop&lt;/h2&gt;

&lt;p&gt;This is the part that took time to develop and that I don't see written about often. Mid-session, in real time, there are specific outputs that cause me to stop and reconsider before accepting anything else from that session.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The model creates a new file instead of modifying an existing one.&lt;/strong&gt; If I asked it to extend an existing module and it produced a new file alongside it, it didn't see the existing code or decided to work around it. Either way, the output probably duplicates something. I reject it and redirect with an explicit pointer to the existing file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A new dependency appears.&lt;/strong&gt; I did not ask for a library and I did not approve one. The model added it because it was convenient for the task. I stop here every time. Adding a dependency is a decision that affects the entire project: security surface, bundle size, maintenance burden. It is never implicit.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The model reinvented a pattern that already exists in the codebase.&lt;/strong&gt; This happens when I didn't give it enough context about what's already there. The result is a second version of something that should be unified, and now I have inconsistency I didn't start with. I reject the output and provide the existing pattern explicitly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Confident use of an API method I don't recognize.&lt;/strong&gt; Sometimes it's a hallucinated method name. Sometimes it's a real method that was added in a newer version than I'm running. Sometimes it's correct. I check every time without exception, because the failure mode for a hallucinated API call is a runtime error in a code path that looks fine statically.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Model produced this. I didn't recognize `.parseAsync` with this signature.&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parseAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;strict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// Checked: the `strict` option doesn't exist on Zod's parseAsync.&lt;/span&gt;
&lt;span class="c1"&gt;// The code would have worked silently, ignoring the unknown option.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;A &lt;code&gt;try/catch&lt;/code&gt; that swallows errors.&lt;/strong&gt; The model adds these because it has seen a lot of code that wraps things in try/catch. When the catch block logs and returns null — or worse, returns a fallback value and continues — it hides failures rather than surfacing them. I always check what catch blocks do. An error that disappears into a log is an incident waiting to happen.&lt;/p&gt;
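&lt;p&gt;The difference is easiest to see side by side. This is a minimal sketch with hypothetical function names, using &lt;code&gt;JSON.parse&lt;/code&gt; as a stand-in for any call that can fail.&lt;/p&gt;

```typescript
// Swallows: logs, returns null, and the caller cannot tell a failure
// from a payload whose JSON value really is null.
function parsePayloadSwallowing(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch (err) {
    console.error("parse failed", err);
    return null;
  }
}

// Surfaces: adds context and rethrows so the failure reaches the caller.
function parsePayload(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`payload is not valid JSON: ${reason}`);
  }
}
```

&lt;p&gt;With the first version, the malformed input &lt;code&gt;"{oops"&lt;/code&gt; and the valid input &lt;code&gt;"null"&lt;/code&gt; produce the same return value. With the second, only the valid input returns.&lt;/p&gt;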

&lt;p&gt;&lt;strong&gt;A test that checks the return type rather than the return value.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// This is not a test. This is noise.&lt;/span&gt;
&lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;returns a number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nf"&gt;calculateTax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;FI&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// This is a test.&lt;/span&gt;
&lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;applies Finnish VAT at 25.5% to the base amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;calculateTax&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;FI&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;125.5&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I wrote about this in the &lt;a href="https://iurii.rogulia.fi/blog/vibe-coded-codebase-patterns" rel="noopener noreferrer"&gt;first article&lt;/a&gt;. When I see the first pattern in model output, I reject the test and write a description of the actual behavior I want to verify.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Magic numbers without explanation.&lt;/strong&gt; &lt;code&gt;setTimeout(fn, 5000)&lt;/code&gt; — why 5000? &lt;code&gt;maxRetries = 3&lt;/code&gt; — where did 3 come from? The model uses reasonable-looking defaults without documenting the reasoning. I either add the comment myself or ask the model to explain and then document it, because whoever reads this code next (usually me, six months later) will have no idea why that number is there.&lt;/p&gt;

&lt;h2&gt;What I Do Differently Now&lt;/h2&gt;

&lt;p&gt;Working with AI tools daily has introduced habits that didn't exist in my workflow before.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I write specifications in prose before asking for code.&lt;/strong&gt; The quality of what the model produces is a direct function of the quality of the specification it receives. &lt;a href="https://iurii.rogulia.fi/blog/ai-coding-senior-vs-junior" rel="noopener noreferrer"&gt;I described the mechanism in detail in the previous article&lt;/a&gt;. Practically, this means I spend five to ten minutes writing a description of what a function needs to do — its inputs, outputs, constraints, and failure modes — before I prompt for implementation. This has had an unexpected benefit beyond AI: it surfaces ambiguities before I'm in the middle of writing code.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I use AI to review my own code.&lt;/strong&gt; After writing something non-trivial, I paste it into a session and ask the agent to find problems with it. This catches things I've become blind to after staring at the code for an hour: unchecked error paths, a null case I didn't handle, an assumption that's not documented. It is not a substitute for a human review. It is a useful additional pass that costs two minutes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I refactor more often.&lt;/strong&gt; The cost of refactoring went down because the mechanical parts (renaming, restructuring, updating call sites) are faster with AI assistance, so I do it sooner. I used to tolerate more awkwardness in a module because cleaning it up wasn't worth the time. Now the threshold for "this is worth fixing" is lower.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I use AI to explore alternatives before committing.&lt;/strong&gt; "Here's my current approach. What are two or three other ways to solve this, with trade-offs?" The model is good at this and it's made me less likely to commit to the first approach that compiles. I don't take its recommendation — I use the options as a starting point for my own assessment.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I maintain a &lt;code&gt;CLAUDE.md&lt;/code&gt; in every project.&lt;/strong&gt; A document that captures the architectural decisions, the patterns, the conventions, and the explicit constraints the model should follow. The details of what goes in it and how to write it are what the next article in this series covers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;I watch for cognitive offload in myself.&lt;/strong&gt; This is the subtler discipline and the one I think most people working with AI underestimate. The risk isn't that I delegate execution to the model — that's the whole point. The risk is that I start delegating &lt;em&gt;thinking&lt;/em&gt; without noticing: accepting a suggestion because the explanation sounded confident rather than because I evaluated the substance, skipping a verification step because I'm tired and the output looks right, treating "the model generated this" as evidence that it was considered. Leverage and dependency look identical in the moment; the difference is whether I'd still be able to make the decision without the model's draft in front of me. When I notice the dependency pattern in myself — and it happens — I stop the session, take a break, and come back when I can do the evaluation properly. This is the new failure mode AI introduces for experienced developers, and it doesn't trigger any of the signals listed above. It comes from inside.&lt;/p&gt;

&lt;h2&gt;What Hasn't Changed&lt;/h2&gt;

&lt;p&gt;This is the honest part of the balance sheet.&lt;/p&gt;

&lt;p&gt;I am only marginally faster at understanding a new codebase. This is more nuanced than I used to think. The model genuinely reduces the cost of &lt;em&gt;orientation&lt;/em&gt;: it maps structure, surfaces entry points, identifies the libraries in use, and narrates what individual files do. That part is real and useful, and it's saved me hours on unfamiliar repositories — the search cost of "where does this even start" has collapsed. But orientation and comprehension are different things. Understanding the codebase well enough to change it safely — knowing why something was built this way, what invariants are being protected, what assumptions the original author was making — still requires reading the code and thinking about it. The shortcut is to the map, not to the territory.&lt;/p&gt;

&lt;p&gt;I am not faster at estimating the complexity of tasks. The model is an optimist. When I ask it how long something will take, or when I ask it to assess the risk of a change, it systematically underestimates. My own estimates, based on experience with what actually goes wrong, are more accurate. I've stopped asking.&lt;/p&gt;

&lt;p&gt;Production incidents don't go faster. As noted above: I debug those the old way.&lt;/p&gt;

&lt;p&gt;Conversations with clients haven't changed. Understanding what a business actually needs, translating that into a technical approach, deciding what not to build — none of that is faster.&lt;/p&gt;

&lt;p&gt;Code reading as a deep skill is still slow and manual. The model can help narrate code, but narration and understanding are different things.&lt;/p&gt;

&lt;h2&gt;The Honest Assessment&lt;/h2&gt;

&lt;p&gt;AI is now part of my toolchain the same way a compiler, a debugger, and version control are. Each of those tools changed how professional developers work when they were introduced. None of them changed what the work actually requires — they changed the mechanics of doing it. AI is doing the same thing, at larger scale, in more parts of the workflow.&lt;/p&gt;

&lt;p&gt;The tool is productive where I have clear constraints and well-specified intent. It is a source of plausible-looking problems where I don't. The discipline is knowing the difference, in real time, for each task I'm about to hand off. That's a skill that develops with use and with being wrong often enough to notice the pattern.&lt;/p&gt;

&lt;p&gt;After 25 years of building systems, the part of my job that AI has not changed is the part that was never about typing: deciding what to build, deciding what not to build, and knowing which decisions can be taken back if they turn out to be wrong.&lt;/p&gt;




&lt;p&gt;If working with AI tools makes your development process feel chaotic rather than faster, that's usually a process problem, not a tool problem. My &lt;a href="https://iurii.rogulia.fi/services/fractional-cto" rel="noopener noreferrer"&gt;fractional CTO&lt;/a&gt; work often starts with sorting out exactly this — establishing the structure that makes AI-assisted development reliable rather than risky. If you've inherited a codebase that was built with AI without that structure in place, that's what my &lt;a href="https://iurii.rogulia.fi/services/rescue-projects" rel="noopener noreferrer"&gt;rescue projects&lt;/a&gt; service is for.&lt;/p&gt;

&lt;p&gt;For a deeper look at what a well-maintained AI workflow looks like in practice — the &lt;code&gt;CLAUDE.md&lt;/code&gt; file, system prompts, and mid-session stop signals — read &lt;a href="https://iurii.rogulia.fi/blog/ai-agent-codebase-prompts" rel="noopener noreferrer"&gt;Prompts That Keep an AI Agent From Wrecking Your Codebase&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>engineeringpractice</category>
      <category>workflow</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>PDF Integrity Report: May 2026</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Thu, 02 Jul 2026 10:00:42 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/pdf-integrity-report-may-2026-3p1g</link>
      <guid>https://dev.to/iurii_rogulia/pdf-integrity-report-may-2026-3p1g</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://htpbe.tech/blog/pdf-integrity-report-may-2026" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt;. The version on htpbe.tech stays in sync with the latest detection algorithm — refer to it for the canonical text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Every month we look at aggregate, anonymized data from checks processed by HTPBE and write up what the structural signals tell us about the state of PDF tampering. No file contents, no personally identifiable information — only the structural and metadata patterns the algorithm uses to classify documents.&lt;/p&gt;

&lt;p&gt;This report is about &lt;strong&gt;proportions and movement&lt;/strong&gt;, not raw counts. What share of documents came back flagged, which signals fired more or less often than the month before, which origins shifted, and what the recurring tampering shapes looked like. Those are the numbers that mean something; an absolute file count for a single month is noise by comparison.&lt;/p&gt;




&lt;h2&gt;The Shape of the Verdicts&lt;/h2&gt;

&lt;p&gt;The flagged share climbed again — from just under half in March to &lt;strong&gt;roughly seven in ten&lt;/strong&gt; in May. The bigger story is &lt;em&gt;within&lt;/em&gt; the flagged set. For the first time, &lt;strong&gt;"certain" verdicts — where multiple unambiguous structural signals converge — overtook "high-confidence" ones&lt;/strong&gt; as the single largest bucket.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Verdict&lt;/th&gt;
&lt;th&gt;Direction vs. prior months&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Certain modification&lt;/td&gt;
&lt;td&gt;▲ now the largest single bucket&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High-confidence modification&lt;/td&gt;
&lt;td&gt;▲ slightly, but overtaken by "certain"&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Not flagged&lt;/td&gt;
&lt;td&gt;▼ shrinking share&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Two forces are behind that. First, the traffic mix tilted hard toward the API this month, and API callers skew toward documents that are &lt;em&gt;already&lt;/em&gt; modified — developers testing an integration with known-bad files, and forger-bridge traffic uploading a fake to see whether it gets caught. That population produces stacked, unambiguous evidence, which lands in the "certain" tier. Second, detection coverage widened (see the version count below), so files that once scraped a "high" now cross into "certain" because a newer check adds the converging second signal.&lt;/p&gt;

&lt;p&gt;Read the flagged share as a statement about &lt;em&gt;who is submitting documents&lt;/em&gt;, not as a population-wide fraud rate.&lt;/p&gt;




&lt;h2&gt;Signals That Moved&lt;/h2&gt;

&lt;p&gt;Among flagged documents, the evidence mix shifted in a consistent direction: the &lt;strong&gt;classical first-order signals held their lead, while the newer second-order signals kept gaining share.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Up month over month:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Generator-fingerprint contradictions&lt;/strong&gt; — the declared producer says one thing, the binary structure says another. Growing fastest, because it catches files where the forger spoofed the producer string but left the structural fingerprint of the tool they actually used.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-source page-template assembly&lt;/strong&gt; — pages stamped from one source spliced onto pages from another. A new dedicated check this month moved this from "occasionally caught" to "routinely caught."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Producer-identity spoofing on re-distilled files&lt;/strong&gt; — also a May-shipped check; immediately started firing on files laundered through a re-distill step.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Missing creation date&lt;/strong&gt; — roughly a quarter of all files now arrive with no creation timestamp at all, a share that has crept up every month. A missing creation date strips out one of the cleaner forensic anchors, and its rise is itself a signal: someone is scrubbing it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Flat or down:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Date-field inconsistencies&lt;/strong&gt; — still the most common single finding by share, but no longer growing; the easy timestamp tells are increasingly being cleaned by forgers before submission.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Post-signature modification&lt;/strong&gt; — down in share, mostly because signed documents were a thin slice of the month.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The trend we have flagged all year held: a forger who learned to scrub creation dates and avoid an incremental-update trail does not necessarily know to reconcile the structural fingerprint of the tool they rebuilt the file with against the producer string they spoofed. The second-order signals are where those cases get caught.&lt;/p&gt;




&lt;h2&gt;Incremental Updates: Almost Without Exception&lt;/h2&gt;

&lt;p&gt;The cleanest signal we track, and it got cleaner. Files carrying incremental updates were flagged &lt;strong&gt;in virtually every case this month&lt;/strong&gt; — the highest rate in the four months we have published, continuing a monotonic climb. The average revision chain on those files sat around three appends.&lt;/p&gt;

&lt;p&gt;The mechanism is unchanged: incremental updates let content be appended after the original write. Legitimate workflows produce them — signature application, annotation, form-fill — but on the population reaching the tool, those clean cases have shrunk to a rounding error. When an incremental update shows up on a document submitted for fraud detection, it is now almost synonymous with post-creation editing.&lt;/p&gt;
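&lt;p&gt;For readers who want to see what an appended revision looks like at the byte level, here is a rough heuristic sketch. It is not HTPBE's detection algorithm. It assumes each saved revision ends with its own &lt;code&gt;%%EOF&lt;/code&gt; marker, so more than one marker hints at appended revisions. Linearized files and PDFs embedded inside other PDFs can also carry extra markers, so a real check has to walk the cross-reference chain. The function name &lt;code&gt;countEofMarkers&lt;/code&gt; is hypothetical.&lt;/p&gt;

```typescript
// Counts occurrences of the ASCII byte sequence "%%EOF" in a file's bytes.
// Heuristic only: a count above one suggests, but does not prove, appended revisions.
function countEofMarkers(bytes: Uint8Array): number {
  const marker = [0x25, 0x25, 0x45, 0x4f, 0x46]; // ASCII for "%%EOF"
  let count = 0;
  for (let i = 0; bytes.length - marker.length >= i; i++) {
    if (marker.every((byte, offset) => bytes[i + offset] === byte)) {
      count++;
    }
  }
  return count;
}

// Toy input standing in for a file with one original write and two appends.
const toyPdfText = "%PDF-1.7 ... %%EOF\n ... %%EOF\n ... %%EOF\n";
const toyPdfBytes = new Uint8Array(toyPdfText.split("").map((ch) => ch.charCodeAt(0)));
console.log(countEofMarkers(toyPdfBytes) - 1); // 2
```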




&lt;h2&gt;
  
  
  Representative Cases
&lt;/h2&gt;

&lt;p&gt;These are composite, anonymized illustrations of the recurring shapes the engine resolved this month — not specific files. Each maps to the structural markers that actually drove the verdict.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The spreadsheet rebuild (verdict: certain).&lt;/strong&gt; A "bank statement" arrives looking clean to the eye. Structurally, the producer field names a spreadsheet-export pipeline rather than the bank's core system, and the modification timestamp trails the creation timestamp by weeks. Two converging signals — producer mismatch plus a date gap — and the file is flagged with high certainty. This is the single most common shape in the flagged set, month after month.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The signature that held while the bytes moved (verdict: certain).&lt;/strong&gt; A signed contract shows a green "signed by" badge in the viewer. Structurally, an incremental update was appended &lt;em&gt;after&lt;/em&gt; the signature's byte range — a figure changed on page three, saved as a new revision the signature never covered. The signature stays technically valid; the document is not what was signed. This is the case digital-signature validation alone cannot catch, and structural analysis does.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The two-source invoice (verdict: certain).&lt;/strong&gt; An invoice looks like one coherent document. Structurally, page one carries the font subsets and object fingerprint of an institutional generator, while page two was spliced in from a different source — a different font-subset prefix, a stamp-coverage discontinuity at the page boundary. Multi-source page-template assembly: the body is genuine, one page was swapped.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The borrowed identity (verdict: high → certain).&lt;/strong&gt; A file declares "Adobe Acrobat" in its producer string but is missing the XMP toolkit marker and document-instance identifiers a genuine Acrobat save always writes, and its structural fingerprint matches a re-distill pipeline. Producer-identity spoofing — the May-shipped check that turns "claims to be Adobe" into a flag when the structure says otherwise.&lt;/p&gt;
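&lt;p&gt;The borrowed-identity shape can be sketched as a consistency check between what a file claims and what it carries. This is a minimal sketch under stated assumptions, not the shipped check: the input is a dictionary of already-extracted metadata, and the keys &lt;code&gt;producer&lt;/code&gt;, &lt;code&gt;xmp_toolkit&lt;/code&gt; and &lt;code&gt;instance_id&lt;/code&gt; are hypothetical names chosen for the example.&lt;/p&gt;

```python
def spoofing_signals(meta):
    """List contradictions between an Acrobat producer claim and the metadata.

    meta is a dict of already-extracted fields. The keys used here
    ("producer", "xmp_toolkit", "instance_id") are hypothetical.
    """
    producer = (meta.get("producer") or "").lower()
    if "acrobat" not in producer:
        return []  # no Acrobat claim, so nothing to contradict
    signals = []
    if not meta.get("xmp_toolkit"):
        signals.append("missing XMP toolkit marker")
    if not meta.get("instance_id"):
        signals.append("missing document-instance identifier")
    return signals


claimed = {"producer": "Adobe Acrobat Pro"}
print(spoofing_signals(claimed))
```

&lt;p&gt;An empty list means the claim was not contradicted by these two markers. It does not mean the file is genuine.&lt;/p&gt;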




&lt;h2&gt;
  
  
  Document Origin
&lt;/h2&gt;

&lt;p&gt;The origin mix shifted: &lt;strong&gt;scanned documents rose sharply, to nearly a quarter of submissions&lt;/strong&gt; — overtaking consumer-software exports — while institutional documents remained the plurality.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Origin classification&lt;/th&gt;
&lt;th&gt;Direction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Institutional (server-side / enterprise generators)&lt;/td&gt;
&lt;td&gt;plurality, steady&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scanned ("Cannot Verify")&lt;/td&gt;
&lt;td&gt;▲ sharp rise, now ~a quarter&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Consumer software ("Cannot Verify")&lt;/td&gt;
&lt;td&gt;▼ slipped below scanned&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Online editor / unknown / other&lt;/td&gt;
&lt;td&gt;small shares&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Scans and consumer-software exports fall into a "Cannot Verify" bucket, where the structural layer deliberately returns a conservative inconclusive verdict rather than an intact-or-modified call. Forcing a binary verdict on those formats would produce errors in both directions: false flags on legitimate files and false clears on altered ones. The rise in scanned share is worth watching: re-scanning a tampered printout is a known way to launder edits out of the structural record, which is exactly why a scan can never earn an "intact" verdict here.&lt;/p&gt;




&lt;h2&gt;
  
  
  Digital Signatures
&lt;/h2&gt;

&lt;p&gt;Signed documents remained a thin slice of the month, too small to quote a meaningful rate. The pattern that did appear is the one we keep reporting: a signature that is valid in the viewer does not guarantee the bytes were not altered, because incremental updates appended after signing fall outside the signed scope. Checking integrity at the structural layer, not the signature-validation layer, is what catches that — see "the signature that held while the bytes moved" above.&lt;/p&gt;
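&lt;p&gt;The signed-scope point can be made concrete with a small sketch. A PDF signature records a &lt;code&gt;/ByteRange&lt;/code&gt; array of two offset-and-length pairs describing the bytes it covers. If the second range ends before the end of the file, something was appended after signing. The function name and the raw-bytes regex below are assumptions for illustration; a real check would parse the file's objects rather than pattern-match the byte stream.&lt;/p&gt;

```python
import re

# /ByteRange [offset1 length1 offset2 length2]
BYTE_RANGE = re.compile(rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]")


def bytes_after_signature(pdf_bytes):
    """Return how many trailing bytes no signature covers.

    Returns None when no /ByteRange entry is found.
    """
    matches = BYTE_RANGE.findall(pdf_bytes)
    if not matches:
        return None
    # Use the signature whose covered region ends latest in the file.
    covered_end = max(int(m[2]) + int(m[3]) for m in matches)
    return len(pdf_bytes) - covered_end
```

&lt;p&gt;A non-zero result is a reason to inspect what was appended, not proof of tampering on its own: later signatures, timestamps and validation data are also written as appended revisions.&lt;/p&gt;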




&lt;h2&gt;
  
  
  Algorithm Development
&lt;/h2&gt;

&lt;p&gt;May was, by version count, the busiest month since launch — &lt;strong&gt;twenty-nine versions shipped&lt;/strong&gt;, up from April's eighteen. The work split three ways, as always:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;New detection categories&lt;/strong&gt; — multi-source page-template assembly, producer-identity spoofing on re-distilled files, and refinements to drawing-operator and content-stream consistency checks. The first two show up directly in the "signals that moved" section above.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;False-positive reductions&lt;/strong&gt; — roughly half the releases narrowed heuristics misfiring on legitimate document classes: professional export pipelines, multi-tool re-export chains, certain office-suite and print-to-PDF outputs, signed-document workflows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Clearer inconclusive verdicts&lt;/strong&gt; — scans, consumer-software exports and HTML-to-PDF output continued to be routed into an explicit inconclusive verdict rather than forced into intact-or-modified.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Wider coverage is itself part of why the flagged share rose: a share of the documents now flagged would have come back intact under the early-May algorithm.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Software Ecosystem
&lt;/h2&gt;

&lt;p&gt;The recurring fingerprints held. &lt;strong&gt;Online manipulation services as intermediate steps&lt;/strong&gt; — a service in the producer field with a different application in the creator field, the signature of a compress / merge / page-extract step between creation and submission. &lt;strong&gt;Design-tool origin&lt;/strong&gt; — vector- and consumer-design applications appearing where a system-generated producer belongs, on documents that purport to be business records. &lt;strong&gt;Programmatic manipulation libraries&lt;/strong&gt; — where the signal is no longer the spoofable producer string but the structural fingerprint the library leaves at the binary level. May's producer-identity-spoofing check was built for exactly that last category.&lt;/p&gt;




&lt;h2&gt;
  
  
  PDF Version Landscape
&lt;/h2&gt;

&lt;p&gt;Concentration tightened: &lt;strong&gt;PDF 1.7 alone accounted for over half the sample&lt;/strong&gt;, with 1.6, 1.4, 1.5 and 1.3 splitting most of the rest. PDF 2.0, despite nearly a decade of availability, stayed a rounding-error share.&lt;/p&gt;




&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;May 2026, in relative terms:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The flagged share climbed past seven in ten — but read it as a traffic-mix effect (API-dominated, skewed toward already-modified files), not a population fraud rate.&lt;/li&gt;
&lt;li&gt;"Certain" verdicts overtook "high-confidence" ones for the first time — converging, unambiguous evidence is becoming the norm in the flagged set.&lt;/li&gt;
&lt;li&gt;Incremental-update files were flagged almost without exception — the cleanest single signal we track, climbing every month.&lt;/li&gt;
&lt;li&gt;Second-order signals — generator-fingerprint contradictions, multi-source assembly, producer-identity spoofing — kept gaining share on the classical date and incremental-update tells.&lt;/li&gt;
&lt;li&gt;Scanned documents rose sharply, to nearly a quarter of submissions; missing creation dates continued their slow climb.&lt;/li&gt;
&lt;li&gt;Twenty-nine algorithm versions shipped, the most in any month so far.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every pattern here comes from the same forensic engine teams run on their own intake stream through the &lt;a href="https://htpbe.tech/api" rel="noopener noreferrer"&gt;PDF tamper detection API&lt;/a&gt;. If you want to run a single document through the same analysis by hand, the &lt;a href="https://htpbe.tech/pdf-tamper-detection" rel="noopener noreferrer"&gt;free checker&lt;/a&gt; does it in the browser.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This report covers checks processed by HTPBE in May 2026. File contents are not stored or analyzed; only structural metadata signals are retained. All figures are aggregate and anonymized.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>fraud</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Why AI Coding Widens the Senior–Junior Developer Gap</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Wed, 01 Jul 2026 10:00:46 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/why-ai-coding-widens-the-senior-junior-developer-gap-111p</link>
      <guid>https://dev.to/iurii_rogulia/why-ai-coding-widens-the-senior-junior-developer-gap-111p</guid>
      <description>&lt;p&gt;Two things happened in the same week.&lt;/p&gt;

&lt;p&gt;A founder sent me a repository. Thirty thousand lines, built in three months with AI assistance. It compiled. The tests were green. In the demo it looked polished. In production it had five separate authentication flows, a test suite that verified return types rather than business logic, and a database schema that disagreed with the ORM in three different ways. I wrote about the patterns in detail in &lt;a href="https://iurii.rogulia.fi/blog/vibe-coded-codebase-patterns" rel="noopener noreferrer"&gt;the previous article in this series&lt;/a&gt;. The short version: the code was locally coherent and globally incoherent, because no one had been in the role of architect.&lt;/p&gt;

&lt;p&gt;Two days later, a senior developer I respect sent me a message. He'd integrated AI coding tools into his workflow six months earlier. He said his output had roughly doubled on everything that wasn't architecture-level work. He was shipping faster, making fewer typos in tedious boilerplate, and spending more time on the parts of the job he found interesting.&lt;/p&gt;

&lt;p&gt;Same technology. Opposite results. That gap is not random.&lt;/p&gt;

&lt;h2&gt;
  
  
  Wind Does Not Have a Direction
&lt;/h2&gt;

&lt;p&gt;There's an old observation about fire and wind: wind doesn't create fire, and it doesn't choose sides. It intensifies whatever is already burning. A small flame, it extinguishes. A large fire, it feeds until it becomes something much larger.&lt;/p&gt;

&lt;p&gt;AI-assisted coding works the same way.&lt;/p&gt;

&lt;p&gt;The tools don't have preferences. They don't know whether your database migration is safe to run or whether your authentication boundary is in the right place. They produce plausible-looking output based on patterns in their training data, and they do this with consistent confidence regardless of whether the output is correct. The tool is powerful. The question is: powerful in what direction?&lt;/p&gt;

&lt;p&gt;That depends entirely on what was already there before you opened the chat.&lt;/p&gt;

&lt;p&gt;This is not a comfortable observation because it contradicts the popular narrative. The narrative says AI democratizes software development — that it gives less experienced developers access to patterns and capabilities that previously required years of practice. This is true at the surface level and misleading at the level that matters. There is a real difference between generating a plausible-looking piece of code and knowing whether that code should exist at all, whether it fits the system around it, and whether it will hold under conditions the demo never tests.&lt;/p&gt;

&lt;p&gt;AI closes the gap on the first kind of knowledge. It widens the gap on the second.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Happens Without an Architect in the Loop
&lt;/h2&gt;

&lt;p&gt;I don't think the people who produce the codebases I described in the previous article are careless or uninformed. They asked the model to help them build something, and the model helped. The problem is structural: the model has no persistent understanding of the system as a whole. Each response optimizes locally. It solves the current problem without regard for whether that solution creates a new problem two files away, or whether a nearly identical solution already exists from a different prompt three weeks earlier.&lt;/p&gt;

&lt;p&gt;A senior developer doesn't distinguish themselves by holding more in their head than a junior. Distributed systems have long since exceeded the cognitive capacity of any single person — the idea that good engineering is about maximizing mental retention is a romanticization that doesn't survive contact with real systems. What a senior does instead is build processes that make the dependency on memory unnecessary: Architecture Decision Records, explicit invariants in code, contract tests, CI gates that encode architectural constraints, schema-as-code, runbooks for failure modes. The system is made legible not by one person's heroic memory but by accumulated structure that anyone can read.&lt;/p&gt;

&lt;p&gt;This is worth saying clearly because it changes how you think about AI in the picture. AI accelerates coding velocity. The senior's job is to make sure the system is protected from its own AI-assisted speed — through the same formal processes that make it resilient to any other kind of acceleration.&lt;/p&gt;

&lt;p&gt;This is not a skill that comes from knowing a framework or a programming language. It comes from having seen a lot of systems, having watched them fail, and having developed a sense — call it taste, or judgment — for when something is wrong even before you can articulate precisely why. This judgment is not a substitute for process. It is what tells you which processes need to exist, where the invariants are, and which parts of the system deserve explicit protection.&lt;/p&gt;

&lt;p&gt;That judgment is what a junior developer doesn't have yet. Not because they're incapable of it but because it requires time and failure to develop. There's no shortcut. And when you give a developer without that judgment an AI assistant that produces confident-looking code at high volume, what you get is more code faster, along with all the problems that come from code written without the judgment to evaluate it.&lt;/p&gt;

&lt;p&gt;The junior's fundamental problem with AI is not that they accept bad suggestions. It's that they often can't distinguish the bad suggestions from the good ones. When a model proposes five different ways to approach authentication and the junior picks one, the criterion for picking is not "which of these is architecturally sound" — it's "which of these seems most familiar, or most recently discussed, or most confidently explained." The model's tone doesn't change between good suggestions and bad ones. It sounds equally sure of itself when it's right and when it's leading you into a pattern that will be painful to undo.&lt;/p&gt;
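&lt;p&gt;To make one of the senior's processes named earlier in this section concrete, here is a minimal sketch of a CI gate that encodes an architectural constraint: a layering rule saying that a domain layer must not import from infrastructure or web code. The layer names and the &lt;code&gt;FORBIDDEN&lt;/code&gt; table are hypothetical; the point is that the rule lives in a check anyone can read and any pipeline can run, not in someone's memory.&lt;/p&gt;

```python
import ast

# Hypothetical layering rule: layer name -> top-level packages it may not import.
FORBIDDEN = {"domain": ("infrastructure", "web")}


def forbidden_imports(layer, source):
    """Return the imports in source that break the layering rule for layer."""
    banned = FORBIDDEN.get(layer, ())
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            names = [node.module or ""]
        else:
            continue
        for name in names:
            # Compare only the top-level package of each import.
            if name.split(".")[0] in banned:
                found.append(name)
    return found
```

&lt;p&gt;In a pipeline, a non-empty result for any file would fail the build, which holds AI-generated code to the same boundary as hand-written code.&lt;/p&gt;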

&lt;h2&gt;
  
  
  What Happens When the Senior Is in the Loop
&lt;/h2&gt;

&lt;p&gt;The senior developer I mentioned at the start roughly doubled his output on routine work. I believe that number because it matches what I experience myself. But the mechanism is worth examining carefully, because it's not what most people assume.&lt;/p&gt;

&lt;p&gt;It's not that AI writes the code and the senior reviews it. That is the surface description, but it leaves out the most important part: the senior rejects far more than they accept. For every suggestion that gets merged, several are discarded, sent back for a rewrite, or corrected before they land. That filtering work is invisible. People talk about how AI accelerated their development; they rarely talk about how often they told it no, or how many iterations it took to get something they trusted.&lt;/p&gt;

&lt;p&gt;The senior also knows exactly which parts of the system they will never delegate. I have a short list of things I will not let a model make decisions about without heavy supervision:&lt;/p&gt;

&lt;p&gt;Database schema changes. Not because models can't generate migrations — they can, quickly and with correct syntax — but because a migration that looks right can have consequences that only show up after it runs, and the model cannot predict those consequences without knowledge of the actual production data distribution. I generate migrations with AI assistance and read every line twice before running them anywhere.&lt;/p&gt;

&lt;p&gt;Security boundaries. Where does authentication happen? Who is allowed to see what? What does an unauthenticated request touch? These are architectural decisions that need to be made deliberately, by a person who understands the full system, and written down so they can be audited. A model will write an auth check if you ask for one. Whether that check is in the right place, at the right layer, for the right reason — that requires judgment the model doesn't have.&lt;/p&gt;

&lt;p&gt;Anything that touches money. Payment flow, billing logic, invoice calculation. The model can write a Stripe integration faster than I can. I'll still read every line, test every edge case, and be the one who decides how failures are handled.&lt;/p&gt;

&lt;p&gt;Operational and runtime correctness — and this one is less obvious than the others. AI generates code that works in ideal conditions. What it does not reliably handle is behavior under pressure: race conditions, idempotency and retry storms in webhooks and queues, deadlocks, failure semantics when an external API dies mid-pipeline. The model produces code that passes tests in a clean environment; operational correctness is about what happens in the unclean one. Plausible-looking code that fails under load takes root in production before it becomes visible — which is what makes it the most expensive kind of wrong.&lt;/p&gt;
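&lt;p&gt;Idempotency under retries is a good example of what a clean-environment test does not exercise. The sketch below assumes each webhook event carries a unique &lt;code&gt;id&lt;/code&gt; and an &lt;code&gt;amount&lt;/code&gt;; the class name and the in-memory dictionary are illustrative only. A production handler would keep the processed-event record in a durable store and write it in the same transaction as the side effect.&lt;/p&gt;

```python
class WebhookProcessor:
    """Applies each event at most once, however many times it is delivered."""

    def __init__(self):
        self.processed = {}  # event id -> result of the first delivery
        self.balance = 0

    def handle(self, event):
        event_id = event["id"]
        if event_id in self.processed:
            # A retry of an event already applied: return the stored result
            # and do not repeat the side effect.
            return self.processed[event_id]
        self.balance += event["amount"]
        result = {"status": "applied", "balance": self.balance}
        self.processed[event_id] = result
        return result


processor = WebhookProcessor()
event = {"id": "evt_1", "amount": 500}
processor.handle(event)
processor.handle(event)  # simulated retry of the same delivery
print(processor.balance)
```

&lt;p&gt;A generated handler without the lookup passes a single-delivery test and double-charges on the first retry storm, which is the kind of failure described above.&lt;/p&gt;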

&lt;p&gt;Outside these areas, I use AI extensively. Migrations that don't change the schema. CRUD endpoints that follow established patterns. Type definitions and interfaces. Test cases for behavior I've already specified in prose. That qualifier matters: AI writing tests for code it just wrote is a different thing entirely, and it produces a different kind of failure; I covered that in detail in &lt;a href="https://iurii.rogulia.fi/blog/vibe-coded-codebase-patterns" rel="noopener noreferrer"&gt;the previous article&lt;/a&gt;. Regular expressions. Boilerplate for new services that should look like existing services. These are the places where AI genuinely multiplies output, because they're places where the right answer is relatively well-defined and the main cost is typing.&lt;/p&gt;

&lt;p&gt;What this means in practice: I'm much faster on the parts of a project that are fundamentally mechanical, and about the same speed on the parts that require judgment. The ratio of judgment work to mechanical work shifts — there's more of the former as a proportion of my time. I find this more interesting, not less.&lt;/p&gt;
&lt;h2&gt;
  
  
  A Concrete Illustration
&lt;/h2&gt;

&lt;p&gt;Consider something simple: prompting for a database query.&lt;/p&gt;

&lt;p&gt;A junior developer, new to a codebase, might write something like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Write me a query to get all orders for a user
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model returns a query. Probably a correct one. The junior adds it to the codebase.&lt;/p&gt;

&lt;p&gt;A senior developer, working in the same codebase, might write something like:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;I need to fetch all open orders for a given user, using the existing
Drizzle ORM schema in schema/orders.ts. Orders are soft-deleted (deleted_at column),
must be filtered by tenant_id for multi-tenancy, and we need the customer's
name from a JOIN on the customers table. Return only order_id, created_at,
status, and customer_name. The function should be in lib/queries/orders.ts
next to the existing getOrderById function, following the same pattern.
Include proper typing with InferSelectModel.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The model returns a query that fits the actual codebase. The senior reads it, adjusts two things, and merges it.&lt;/p&gt;

&lt;p&gt;Both developers used AI. The difference is not in the tool. It's in the question asked. The senior's question encodes architectural context the junior didn't know to include: the soft-delete pattern, the multi-tenancy constraint, the existing code location, the existing naming conventions. That context comes from understanding the system. AI can't supply it. You have to bring it.&lt;/p&gt;
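&lt;p&gt;To show what those constraints change in the result, here is a sketch of the query the senior's prompt describes. The prompt targets Drizzle ORM in TypeScript; this stand-in uses Python's built-in &lt;code&gt;sqlite3&lt;/code&gt; so it runs without a project around it. The table layout, the &lt;code&gt;user_id&lt;/code&gt; column, the function name and the choice of &lt;code&gt;status = 'open'&lt;/code&gt; to mean an open order are all assumptions made for the example.&lt;/p&gt;

```python
import sqlite3


def create_demo_schema(conn):
    """Hypothetical tables, just enough to run the query below."""
    conn.executescript(
        """
        CREATE TABLE customers (
            customer_id INTEGER PRIMARY KEY,
            tenant_id INTEGER NOT NULL,
            name TEXT NOT NULL
        );
        CREATE TABLE orders (
            order_id INTEGER PRIMARY KEY,
            tenant_id INTEGER NOT NULL,
            user_id INTEGER NOT NULL,
            customer_id INTEGER NOT NULL,
            status TEXT NOT NULL,
            created_at TEXT NOT NULL,
            deleted_at TEXT
        );
        """
    )


def get_open_orders_for_user(conn, tenant_id, user_id):
    """Open, non-deleted orders for one user, scoped to one tenant."""
    cursor = conn.execute(
        """
        SELECT o.order_id, o.created_at, o.status, c.name AS customer_name
        FROM orders AS o
        JOIN customers AS c
          ON c.customer_id = o.customer_id
         AND c.tenant_id = o.tenant_id
        WHERE o.tenant_id = ?
          AND o.user_id = ?
          AND o.status = 'open'
          AND o.deleted_at IS NULL
        ORDER BY o.created_at
        """,
        (tenant_id, user_id),
    )
    return cursor.fetchall()
```

&lt;p&gt;The one-line prompt would most likely yield only the &lt;code&gt;user_id&lt;/code&gt; filter. The tenant scope and the soft-delete condition are the parts that come from knowing the system, and leaving either out produces a query that still returns rows and still looks correct.&lt;/p&gt;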

&lt;p&gt;The bottleneck in software development before AI was implementation speed — who could write code fastest. The bottleneck in the AI era is specification precision. The real advantage a senior brings is not the ability to write code quickly by hand, but the ability to formalize intent: to state constraints, describe invariants, design interfaces, and define failure semantics before a line of code is written. AI writes code from a specification; the quality of that code is now a function of the quality of the specification it received. That's a different skill from typing faster, and it takes the same years to develop.&lt;/p&gt;

&lt;p&gt;This is what I mean when I say AI amplifies what you already have. The senior gets a useful response on the first prompt because the prompt reflects real understanding. The junior gets a response that will need to be rewritten — or worse, won't be recognized as wrong until it causes a production incident.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Illusion of Equal Access
&lt;/h2&gt;

&lt;p&gt;There's a version of the AI democratization story that I think is genuinely harmful, not because it's dishonest, but because it gives founders and hiring managers the wrong mental model.&lt;/p&gt;

&lt;p&gt;The story goes: AI makes anyone capable of writing production-grade software. Hire cheaper developers and give them AI tools. Get the same output at lower cost.&lt;/p&gt;

&lt;p&gt;What actually happens: you get output that looks like production-grade software. It passes a code review from someone who doesn't know what to look for. It deploys. Then real users arrive, and it starts to fail in the ways I described in the previous article — auth that breaks when sessions expire, tests that pass but don't catch logic errors, dependencies that quietly conflict, schema drift that surfaces only when you try to add a feature.&lt;/p&gt;

&lt;p&gt;Companies that have tried this are now, depending on how far along they are, either discovering these problems in production or paying to have them fixed. A few of them have hired me for that.&lt;/p&gt;

&lt;p&gt;The senior developer is expensive not because the market is irrational. It's because the value is real, and AI has not made it less real. If anything, AI has made the judgment gap more visible, because junior developers can now produce large codebases quickly, which means the consequences of poor judgment arrive sooner and at greater scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  What AI Actually Changes for Senior Developers
&lt;/h2&gt;

&lt;p&gt;To be direct about this: AI has made my working life better and my output higher. I don't think this is controversial if you're honest about the conditions.&lt;/p&gt;

&lt;p&gt;Speed on routine work: roughly 2–3× on the kinds of tasks that involve following established patterns — CRUD handlers, type definitions, boilerplate for new services, test cases against a specification I've already written.&lt;/p&gt;

&lt;p&gt;Speed on unfamiliar technology: significantly higher, sometimes 5× or more. If I'm working with a library I haven't used before, I can have a working integration faster than if I were reading documentation alone, because the model can show me idiomatic usage in context. I still read the documentation. I still verify that what the model produced is actually idiomatic. But the feedback loop is faster.&lt;/p&gt;

&lt;p&gt;There is also an effect I didn't fully anticipate: AI accelerates not just feature delivery but architectural entropy — the accumulation of inconsistencies, duplicated abstractions, and divergence between layers. Commit speed scales; so does the rate of debt accumulation. I use AI and still control entropy through the same processes I always have: ADRs, regular refactoring cycles, code review that looks across files rather than at functions in isolation. The velocity is real. So is the discipline required to keep it from compounding.&lt;/p&gt;

&lt;p&gt;What doesn't change: the time I spend on architecture decisions, on debugging production incidents, on reviewing what I've built against what I intended to build. These don't go faster. The judgment work is the same.&lt;/p&gt;

&lt;p&gt;What gets worse if I'm not careful: consistency across a large codebase, when AI writes in different sessions without shared context about previous decisions. I use a &lt;code&gt;CLAUDE.md&lt;/code&gt; file in every project I work on with AI tools — a document that describes the patterns, conventions, and constraints that the model should follow. Without it, the model optimizes locally and introduces drift. With it, the output is much more consistent. Even then, I audit for drift regularly. The tools can introduce subtle inconsistencies that only show up when you read across files.&lt;/p&gt;

&lt;p&gt;There is a structural reason this problem doesn't go away with better models or larger context windows. LLMs optimize within a session; software architecture lives between sessions — across months and years of decisions, reversals, and accumulated constraints. That is a different category of problem, not a scaling problem. A junior developer who has noticed the inconsistency problem often concludes that a longer context window will fix it. It won't. Architectural consistency lives in code, documentation, and process — not in the model's session memory. The senior knows to put it there.&lt;/p&gt;

&lt;p&gt;One honest disclaimer: everything I've described is grounded in SaaS and backend systems — which is where I work. In realtime systems, embedded software, kernels, HPC, lock-free concurrency, distributed consensus, or safety-critical code, "plausible-looking" output is not merely expensive — it can be directly dangerous. The human judgment required shifts even further toward "don't delegate anything load-bearing." I won't write about those domains because I don't practice in them, but if you do, the argument in this article applies more strongly, not less.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Means for Businesses Hiring Right Now
&lt;/h2&gt;

&lt;p&gt;I'll be brief here because it's not primarily a technical point.&lt;/p&gt;

&lt;p&gt;Companies that are replacing senior engineering judgment with AI tools and cheaper developers will accumulate technical debt at scale. The codebases will be larger and will arrive faster, and the problems in them will be harder to fix precisely because there's more code to sort through. The inbox I have for &lt;a href="https://iurii.rogulia.fi/services/rescue-projects" rel="noopener noreferrer"&gt;rescue projects&lt;/a&gt; and &lt;a href="https://iurii.rogulia.fi/services/fractional-cto" rel="noopener noreferrer"&gt;fractional CTO&lt;/a&gt; work is, in part, a record of this trend playing out in real businesses over the last two years.&lt;/p&gt;

&lt;p&gt;Companies that hired one or two senior engineers and gave them AI tools have gotten real acceleration — not because of the AI alone, but because the AI is working in a context where someone with judgment is deciding what to build, how to build it, and what to discard.&lt;/p&gt;

&lt;p&gt;This is not a prediction. It's already visible in the pattern of what work comes through my door, and in how those businesses differ when I first see them.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Honest Summary
&lt;/h2&gt;

&lt;p&gt;I am not making an argument against AI-assisted development. I use it daily. I think it's made me materially faster and, in some ways, better — it surfaces options I might not have considered, it catches the kinds of errors that come from typing too fast, and it handles the mechanical parts of coding with a patience I don't always have.&lt;/p&gt;

&lt;p&gt;I am making an argument about the conditions under which it works. Wind does not decide what burns. The fire was already there — or it wasn't. The wind only reveals which.&lt;/p&gt;

&lt;p&gt;After 25 years of building systems, what I can do that AI cannot is not merely to write better code. It is to decide what must be made explicit: which constraints belong in tests, which decisions belong in ADRs, which failure modes deserve runbooks, which boundaries must not drift, and which simplification will save a month of future work rather than create one. AI makes the expression of that understanding faster. It doesn't substitute for it.&lt;/p&gt;

&lt;p&gt;If you're working with AI and it feels like a superpower, you're probably already the fire. If it feels like you're just approving everything it suggests and hoping it works — that's worth thinking about carefully.&lt;/p&gt;




&lt;p&gt;If what I've described sounds like the codebase you're currently maintaining — the one that was built fast with AI and is now difficult to change — that's what my &lt;a href="https://iurii.rogulia.fi/services/rescue-projects" rel="noopener noreferrer"&gt;rescue projects&lt;/a&gt; service is for. If you're a senior developer or technical leader looking for someone to work alongside on complex systems, my &lt;a href="https://iurii.rogulia.fi/services/fractional-cto" rel="noopener noreferrer"&gt;fractional CTO&lt;/a&gt; work might be a better fit.&lt;/p&gt;

&lt;p&gt;For the concrete patterns that come out of AI-only development — five auth flows, wrong tests, security shortcuts — see &lt;a href="https://iurii.rogulia.fi/blog/vibe-coded-codebase-patterns" rel="noopener noreferrer"&gt;Vibe-Coded Codebase Problems&lt;/a&gt;. For the practical tools that prevent them — &lt;code&gt;CLAUDE.md&lt;/code&gt;, system prompts, stop signals — read &lt;a href="https://iurii.rogulia.fi/blog/ai-agent-codebase-prompts" rel="noopener noreferrer"&gt;Prompts That Keep an AI Agent From Wrecking Your Codebase&lt;/a&gt;. And for the &lt;a href="https://iurii.rogulia.fi/projects/vatnode-vat-validation" rel="noopener noreferrer"&gt;vatnode.dev&lt;/a&gt; codebase specifically — built with AI assistance under senior oversight — the difference in structural coherence is visible in the code.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>engineeringpractice</category>
      <category>bestpractices</category>
      <category>architecture</category>
    </item>
    <item>
      <title>Bank Statement Fraud in Lending: What Gets Altered and How to Catch It</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Wed, 01 Jul 2026 10:00:36 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/bank-statement-fraud-in-lending-what-gets-altered-and-how-to-catch-it-3hce</link>
      <guid>https://dev.to/iurii_rogulia/bank-statement-fraud-in-lending-what-gets-altered-and-how-to-catch-it-3hce</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://htpbe.tech/blog/bank-statement-fraud-in-lending" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt;. The version on htpbe.tech stays in sync with the latest detection algorithm — refer to it for the canonical text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A borrower submits a three-month bank statement PDF. The layout matches their bank’s template exactly. The running balance climbs steadily through the period, peaking just above the income threshold required for approval. There are no overdrafts. The font is correct. The logo renders cleanly. Your underwriter approves the file.&lt;/p&gt;

&lt;p&gt;None of that means the document was unaltered.&lt;/p&gt;

&lt;p&gt;According to fraud analytics firm SEON, bank statements are the most commonly falsified document in lending applications — cited in over 59% of fraudulent loan applications. The editing happens after the borrower downloads the legitimate PDF from their bank portal, before they upload it to your application form. The tools required cost nothing and require no technical skill.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Borrowers Actually Change
&lt;/h2&gt;

&lt;p&gt;The three most common alterations are running balance inflation, inserted deposits, and hidden overdrafts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running balance inflation&lt;/strong&gt; is the simplest. The borrower opens the PDF in Adobe Acrobat, selects the balance figures, and types over them. No arithmetic is recalculated: the inserted numbers are static text objects with no relationship to the surrounding transaction data. The edit is invisible to visual review but immediately detectable by any system that checks whether the edit session was recorded in the file structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Inserted deposits&lt;/strong&gt; involve adding a transaction line — typically a salary credit or a one-off transfer — to push the average monthly income above a threshold. Again, Adobe Acrobat or an online editor like iLovePDF or Smallpdf handles this in seconds. The inserted line sits alongside real transactions in the visible content layer, but the edit leaves a different fingerprint in the structural layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hidden overdrafts&lt;/strong&gt; are less common but more technically deliberate. A borrower who has run negative balances removes those rows or replaces negative figures with small positives. This can involve reformatting an entire section, and consumer PDF editors typically re-embed fonts or rewrite object streams in ways that differ structurally from the bank’s original rendering pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Visual Review Fails
&lt;/h2&gt;

&lt;p&gt;A reviewer comparing a statement on screen cannot see the xref chain, the producer field shift between bank portal and applicant upload, or the modification timestamp arriving four days after the creation timestamp. Visual review catches sloppy fraud: wrong font, misaligned columns, a logo that renders at the wrong resolution. It does not catch competent fraud performed with mainstream consumer tools.&lt;/p&gt;

&lt;p&gt;The structural evidence of an edit lives in the file’s metadata and revision history, not on the rendered page.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Structural Forensics Actually Catches
&lt;/h2&gt;

&lt;p&gt;Three structural signals cover the large majority of bank statement fraud attempts.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The xref incremental update trail.&lt;/strong&gt; When a PDF is modified and saved incrementally, the changes are appended to the end of the file and a new cross-reference (xref) section is written after them. A bank statement generated by a bank portal and saved without intermediate processing typically has a single xref section. The exceptions matter: some retail banks save incrementally on first export, mail-handling gateways and DMS pipelines can rewrite metadata in transit, and mobile banking apps occasionally re-save the PDF on share. So &lt;code&gt;xref_count &amp;gt; 1&lt;/code&gt; is a strong correlate of post-creation modification, not a deterministic marker: it earns a closer look, not an automatic rejection.&lt;/p&gt;
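
&lt;p&gt;A rough way to see this trail yourself, sketched under stated assumptions: each incremental save normally appends its own &lt;code&gt;startxref&lt;/code&gt; keyword, so counting those keywords in the raw bytes approximates the number of saved revisions. The names &lt;code&gt;countStartXrefMarkers&lt;/code&gt; and &lt;code&gt;asciiBytes&lt;/code&gt; are hypothetical, this is not how HTPBE derives &lt;code&gt;xref_count&lt;/code&gt;, and linearized PDFs can legitimately contain more than one marker, so the number is a prompt for a closer look and nothing more.&lt;/p&gt;

```typescript
// Hypothetical helper: counts "startxref" keywords in raw PDF bytes.
// Each incremental save normally appends one, so the count is a rough
// proxy for the number of saved revisions (not HTPBE's actual method).
function countStartXrefMarkers(pdfBytes: Uint8Array): number {
  // Map bytes one-to-one onto characters so binary streams cannot break decoding.
  const text = Array.from(pdfBytes, (byte) => String.fromCharCode(byte)).join("");
  return text.split("startxref").length - 1;
}

// Hypothetical helper for the demo: turns an ASCII string into bytes.
function asciiBytes(source: string): Uint8Array {
  return Uint8Array.from(Array.from(source, (char) => char.charCodeAt(0)));
}

// A toy byte stream with one original section and two appended updates.
const toyFile = asciiBytes(
  "%PDF-1.7 ... startxref 120 %%EOF ... startxref 480 %%EOF ... startxref 910 %%EOF"
);
console.log(countStartXrefMarkers(toyFile)); // 3
```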

&lt;p&gt;&lt;strong&gt;Producer mismatch.&lt;/strong&gt; The &lt;code&gt;creator&lt;/code&gt; field identifies the software that originally produced the document. The &lt;code&gt;producer&lt;/code&gt; field identifies the software that last saved it. A bank’s web portal typically generates PDFs using server-side engines: iText, Aspose.PDF, PrinceXML, or similar. When the producer field shows &lt;code&gt;Adobe Acrobat 24.2&lt;/code&gt; or &lt;code&gt;iLovePDF&lt;/code&gt; while the creator shows &lt;code&gt;Temenos&lt;/code&gt; or &lt;code&gt;FIS&lt;/code&gt;, the document was re-saved in a tool that is not part of any documented bank statement pipeline. &lt;code&gt;HTPBE_EDITING_TOOL_FINGERPRINT&lt;/code&gt; in the API response is a high-confidence anomaly — the right action is to flag for review, not auto-reject, because the customer might have opened the file in Preview to compress before email.&lt;/p&gt;
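
&lt;p&gt;The comparison can be sketched in a few lines. The two lists below are illustrative only, built from the tools named in this article, and the function names are hypothetical. Substring matching is deliberately crude here; a real rule set would be stricter and maintained per institution.&lt;/p&gt;

```typescript
// Illustrative lists taken from the tools named in this article; a real
// deployment would maintain its own, and these are not HTPBE's rules.
const INSTITUTIONAL_CREATORS = ["temenos", "fis"];
const CONSUMER_EDITORS = ["adobe acrobat", "ilovepdf", "smallpdf", "pdf24"];

// Case-insensitive substring match against a list of known names.
function matchesAny(value: string, needles: string[]): boolean {
  const lowered = value.toLowerCase();
  return needles.some((needle) => lowered.includes(needle));
}

// True when a bank-platform creator is paired with a consumer-editor producer.
function hasProducerMismatch(creator: string, producer: string): boolean {
  if (!matchesAny(creator, INSTITUTIONAL_CREATORS)) {
    return false;
  }
  return matchesAny(producer, CONSUMER_EDITORS);
}

console.log(hasProducerMismatch("Temenos T24", "iLovePDF")); // true
console.log(hasProducerMismatch("Temenos T24", "iText 7.2")); // false
```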

&lt;p&gt;&lt;strong&gt;Modification date gap.&lt;/strong&gt; Most generators write both a &lt;code&gt;CreationDate&lt;/code&gt; and a &lt;code&gt;ModDate&lt;/code&gt; into the document information dictionary, although the PDF specification treats both as optional in most files. On a clean bank export these timestamps are usually equal or within seconds of each other. On an edited document the gap reveals when the editing occurred: a statement dated Monday with a modification timestamp of Thursday means someone opened and saved the file in between. The size of the gap, the day of the week, and the relationship to the application submission date all become usable signals.&lt;/p&gt;

&lt;h2&gt;
  
  
  What an API Response Looks Like
&lt;/h2&gt;

&lt;p&gt;Here is a representative HTPBE response on a bank statement where the borrower used iLovePDF to remove two overdraft entries and adjust the closing balance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ck_4e2f9a1b-7c3d-4f8e-b5a2-9d1e6c0f8b4a"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_EDITING_TOOL_FINGERPRINT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_MULTIPLE_REVISION_LAYERS"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Temenos T24"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"producer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"iLovePDF"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"has_digital_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creation_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1751760000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1752105600&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;creator: "Temenos T24"&lt;/code&gt; identifies a core banking platform used by mid-tier banks across Europe and Australia. &lt;code&gt;producer: "iLovePDF"&lt;/code&gt; identifies a consumer online PDF editor. iLovePDF, PDF24, Smallpdf, and similar consumer editors are not part of any documented bank statement pipeline, so their producer string in a bank statement is a high-confidence anomaly. The &lt;code&gt;xref_count&lt;/code&gt; of 3 indicates three write sessions: the original generation and two subsequent saves. The modification timestamp trails the creation timestamp by four days.&lt;/p&gt;

&lt;p&gt;The verdict is &lt;code&gt;modified&lt;/code&gt; with &lt;code&gt;modification_confidence: "high"&lt;/code&gt;. That gives an underwriter the signals to decide whether to route for manual review, request a fresh download from the bank portal, or proceed.&lt;/p&gt;
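
&lt;p&gt;The four-day gap can be checked directly from the two date fields. This is a minimal sketch: the field names follow the sample response above, treating them as Unix epoch seconds is an assumption based on their values, and &lt;code&gt;modificationGapDays&lt;/code&gt; is a hypothetical helper name.&lt;/p&gt;

```typescript
// Field names follow the sample response above; treating the two date
// fields as Unix epoch seconds is an assumption based on their values.
interface DateFields {
  creation_date: number;
  modification_date: number;
}

const SECONDS_PER_DAY = 86400;

// Whole days between creation and last modification.
function modificationGapDays(fields: DateFields): number {
  const gapSeconds = fields.modification_date - fields.creation_date;
  return Math.floor(gapSeconds / SECONDS_PER_DAY);
}

const sample: DateFields = {
  creation_date: 1751760000,
  modification_date: 1752105600,
};
console.log(modificationGapDays(sample)); // 4
```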

&lt;h2&gt;
  
  
  The &lt;code&gt;inconclusive&lt;/code&gt; Signal in Lending Context
&lt;/h2&gt;

&lt;p&gt;Not every bank statement returns &lt;code&gt;modified&lt;/code&gt; or &lt;code&gt;intact&lt;/code&gt;. Some legitimate statements — particularly from smaller credit unions or non-standard banking apps — are generated in consumer software or exported via generic PDF print drivers, and return &lt;code&gt;inconclusive&lt;/code&gt;. That is not a failure verdict: it means the document was produced in software that does not identify itself as an institutional banking system, so structural integrity cannot be checked against a known baseline.&lt;/p&gt;

&lt;p&gt;In a lending context the meaning depends on the claimed institution. Major retail banks — Chase, Barclays, Commonwealth Bank, TD — all generate statements from server-side institutional PDF engines, so &lt;code&gt;inconclusive&lt;/code&gt; on a statement claimed to be from one of them is itself a signal worth escalating. The same verdict on a statement from a smaller institution where consumer-style export is normal can be treated as routine.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where It Fits in Your Underwriting Stack
&lt;/h2&gt;

&lt;p&gt;Plaid and similar open banking connectors pull transaction data directly from the bank’s API. They are the gold standard for income source-of-truth checks when the applicant has an account at a supported institution. They bypass the document layer entirely — and that is also their limit. Plaid supports a fraction of global financial institutions, and BNPL and MCA lenders frequently serve applicants whose banks are not connected, who have international income, or who hold multiple accounts. Those applicants submit PDF statements because open banking cannot reach them.&lt;/p&gt;

&lt;p&gt;Identity fraud platforms like Persona and Alloy confirm that the person submitting the application is who they claim to be — face matching, ID document checks, watchlist screening. They do not analyze the structural integrity of submitted financial documents.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://htpbe.tech/use-cases/lending" rel="noopener noreferrer"&gt;Structural PDF forensics for alternative lenders&lt;/a&gt; fills the gap between document submission and open banking coverage. It does not replace Plaid when Plaid is available; it operates on the submitted PDF regardless of whether the borrower’s institution is connected to any open banking network.&lt;/p&gt;

&lt;p&gt;For a full breakdown of how document fraud shows up in income source-of-truth check workflows, see the &lt;a href="https://htpbe.tech/use-cases/fake-bank-statement-detection" rel="noopener noreferrer"&gt;bank statement fraud detection guide&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating the Check at Application Intake
&lt;/h2&gt;

&lt;p&gt;The check runs at document upload. Before the statement reaches income parsing, before a human underwriter touches it, before a credit model sees any figures from it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.htpbe.tech/v1/analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$HTPBE_API_KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"url": "https://your-storage.example.com/statements/applicant-7821.pdf"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response is synchronous for most documents, and typical analysis time is under three seconds. For large statements, poll &lt;code&gt;GET /api/v1/result/{id}&lt;/code&gt;. Store the returned check ID (the &lt;code&gt;id&lt;/code&gt; field in the response above) against the application record. If the credit decision is later disputed, the forensic report is retrievable as a permanent audit trail showing exactly which structural signals triggered the hold.&lt;/p&gt;

&lt;p&gt;The full API reference and test scenarios are available at &lt;a href="https://htpbe.tech/api" rel="noopener noreferrer"&gt;the self-serve API&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  False Positives and Calibration
&lt;/h2&gt;

&lt;p&gt;The harder question for lending is not "how many forgeries does this catch" but "how often does it flag a clean document". A high false-positive rate is expensive: delayed approvals, customer friction, and a swelling manual-review queue. Calibration matters more than any individual forensic example.&lt;/p&gt;

&lt;p&gt;Real sources of false positives on bank statements include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A retail bank that saves incrementally on its first export, producing &lt;code&gt;xref_count &amp;gt; 1&lt;/code&gt; on an otherwise untouched file.&lt;/li&gt;
&lt;li&gt;A mail or document gateway that adds its own metadata layer at delivery, shifting the producer field away from the original generator.&lt;/li&gt;
&lt;li&gt;A customer who downloads the PDF, opens it in Preview or Adobe Reader to compress for email, and re-saves before uploading.&lt;/li&gt;
&lt;li&gt;A statement generated through a third-party aggregator pipeline — Plaid's downstream PDF render, a Yodlee export — where the producer field is the aggregator, not the bank.&lt;/li&gt;
&lt;li&gt;DMS or ECM pre-processing on the borrower's accountant's side, common for self-employed applicants who route financial documents through bookkeeping software.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;None of these are fraud, but each can produce a single modification marker in isolation. The calibration approach is to combine signals rather than treat any one as a verdict. A &lt;code&gt;modified&lt;/code&gt; outcome with one marker on a re-compressed file is a different population from &lt;code&gt;modified&lt;/code&gt; with &lt;code&gt;HTPBE_EDITING_TOOL_FINGERPRINT&lt;/code&gt; plus &lt;code&gt;xref_count = 3&lt;/code&gt; plus a four-day ModDate gap. Underwriting policy should weight the full response — &lt;code&gt;status&lt;/code&gt;, &lt;code&gt;modification_confidence&lt;/code&gt;, &lt;code&gt;modification_markers&lt;/code&gt;, the &lt;code&gt;producer&lt;/code&gt; string itself, and the claimed institution — rather than firing on marker presence alone. A practical default: manual review on &lt;code&gt;inconclusive&lt;/code&gt; for institutions with a known institutional baseline, and on &lt;code&gt;modified&lt;/code&gt; only when confidence is &lt;code&gt;high&lt;/code&gt; or two or more markers are present.&lt;/p&gt;
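
&lt;p&gt;That default can be written down as a small routing function. This is a sketch of the policy described above, not part of the HTPBE API: the response field names mirror the sample response, while &lt;code&gt;routeStatement&lt;/code&gt;, the &lt;code&gt;hasInstitutionalBaseline&lt;/code&gt; flag, and the two routing labels are hypothetical names chosen for illustration.&lt;/p&gt;

```typescript
// Field names mirror the sample response; the function, the baseline flag
// and the two outcome labels are hypothetical, not part of the HTPBE API.
interface CheckResult {
  status: string; // "intact", "modified" or "inconclusive" in this article
  modification_confidence?: string;
  modification_markers?: string[];
}

type Routing = "manual_review" | "proceed";

function routeStatement(result: CheckResult, hasInstitutionalBaseline: boolean): Routing {
  if (result.status === "inconclusive") {
    // Only escalate when the claimed bank is known to use institutional PDF engines.
    return hasInstitutionalBaseline ? "manual_review" : "proceed";
  }
  if (result.status === "modified") {
    // Require high confidence or at least two markers before queueing a review.
    const markerCount = (result.modification_markers ?? []).length;
    const strongSignal = result.modification_confidence === "high" || markerCount >= 2;
    return strongSignal ? "manual_review" : "proceed";
  }
  return "proceed";
}

const flagged: CheckResult = {
  status: "modified",
  modification_confidence: "high",
  modification_markers: ["HTPBE_EDITING_TOOL_FINGERPRINT", "HTPBE_MULTIPLE_REVISION_LAYERS"],
};
console.log(routeStatement(flagged, true)); // manual_review
```

&lt;p&gt;Keeping the policy in one function makes the threshold easy to audit and adjust as the review queue and false-positive rate are measured.&lt;/p&gt;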

&lt;h2&gt;
  
  
  What This Does Not Catch
&lt;/h2&gt;

&lt;p&gt;Structural forensics detects modifications to existing documents. Two scenarios fall outside its scope.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Statements fabricated from scratch in the same software the bank uses.&lt;/strong&gt; If a borrower somehow builds a document using the exact same PDF generation stack as their bank — same library, same parameters — the structural layer will appear consistent. This attack requires technical knowledge most borrowers do not have, and is far less common than editing an existing export. For this pattern, cross-referencing with open banking data or direct account fraud detection remains the correct control.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Statements generated in consumer software legitimately.&lt;/strong&gt; Some smaller institutions and fintech neobanks generate statements via generic export tools, mobile print drivers, or third-party aggregators. For these, &lt;code&gt;inconclusive&lt;/code&gt; is the expected and correct verdict. The signal is only meaningful when the claimed institution has a known institutional PDF generation profile.&lt;/p&gt;

&lt;p&gt;Understanding these limits matters. The forensics layer is not a fraud oracle — it is a structural signal that increases confidence and reduces the human review burden on documents that carry structural evidence of alteration.&lt;/p&gt;

&lt;p&gt;For pay stub fraud detection in the same workflow, the patterns are similar — see the &lt;a href="https://htpbe.tech/use-cases/fake-pay-stub-detection" rel="noopener noreferrer"&gt;fake pay stub detection guide&lt;/a&gt; for how income document fraud across document types can be detected with the same API call. For the broader stack view, the &lt;a href="https://htpbe.tech/blog/kyc-vs-document-forensics-pillar" rel="noopener noreferrer"&gt;KYC vs. document forensics breakdown&lt;/a&gt; covers where this layer sits alongside identity checks.&lt;/p&gt;

&lt;h2&gt;
  
  
  Frequently Asked Questions
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;How can I tell if a bank statement PDF is fake?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Visual review catches sloppy fakes — wrong fonts, misaligned columns, implausible totals. Competent edits made with Adobe Acrobat or online editors look identical to the original on screen. The reliable signal lives in the file structure: the &lt;code&gt;producer&lt;/code&gt; field, the cross-reference (xref) chain, and the gap between &lt;code&gt;CreationDate&lt;/code&gt; and &lt;code&gt;ModDate&lt;/code&gt;. A statement with &lt;code&gt;producer: "iLovePDF"&lt;/code&gt; on a major retail bank’s document is a structural anomaly worth escalating.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is producer mismatch in PDF forensics?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;producer&lt;/code&gt; field identifies the software that last saved a PDF. The &lt;code&gt;creator&lt;/code&gt; field identifies the software that originally generated it. When a bank emits a statement through its server-side engine (Temenos, Aspose, FIS) and the file later carries &lt;code&gt;producer: "Microsoft Excel"&lt;/code&gt; or an online editor signature, those two fields disagree in a way no documented bank distribution pipeline produces. That is producer mismatch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can structural forensics catch a fabricated bank statement?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Partially. A statement built from scratch in Word or Excel returns &lt;code&gt;inconclusive&lt;/code&gt;, not &lt;code&gt;modified&lt;/code&gt; — there is no prior structure to compare against. For a document claiming to be from a top-tier retail bank, &lt;code&gt;inconclusive&lt;/code&gt; is itself a signal: real bank portals do not emit PDFs with a consumer-software producer. Fabrication from inside the bank’s own infrastructure (rare, requires platform access) is out of structural scope and needs an open-banking or out-of-band check.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Does Plaid replace bank statement forensics?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. They cover different applicants. Plaid pulls transaction data directly when the borrower’s institution is connected and the borrower consents. BNPL and MCA lenders, and any lender serving applicants with international or unsupported institutions, still receive PDF statements, and that is where structural forensics applies. The two layers are complementary.&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>fintech</category>
      <category>fraud</category>
      <category>api</category>
    </item>
    <item>
      <title>Vibe Coding Problems: What AI-Generated Code Gets Wrong</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Tue, 30 Jun 2026 10:00:39 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/vibe-coding-problems-what-ai-generated-code-gets-wrong-2c6j</link>
      <guid>https://dev.to/iurii_rogulia/vibe-coding-problems-what-ai-generated-code-gets-wrong-2c6j</guid>
      <description>&lt;p&gt;A few months ago, a founder sent me a repository and said: "The developer used AI to build it. It works in demos but breaks in production and we can't figure out why."&lt;/p&gt;

&lt;p&gt;I've seen this enough times now that I have a checklist before I open the code. Not because I want to be right, but because the patterns are so consistent it would be irresponsible to pretend otherwise. The codebase compiles. The tests pass. The structure looks vaguely like something a senior developer would produce. And yet it is still broken — just in ways that don't show up until you have real users, real data, or someone reading it carefully.&lt;/p&gt;

&lt;p&gt;This is not an argument against AI-assisted development. I use it myself. This is about what happens when an LLM drives the whole project with nobody steering — no architect in the loop, no one reviewing whether the fifth approach to authentication should have replaced the first four. The output looks like code. It is not, in any meaningful sense, a designed system.&lt;/p&gt;

&lt;h2&gt;
  
  
  Five Overlapping Ways to Do the Same Thing
&lt;/h2&gt;

&lt;p&gt;The most reliable tell is duplication — not the obvious kind, but the kind that happens when a model adds a feature, is asked about it again later in a different context, and adds it again without checking what already exists.&lt;/p&gt;

&lt;p&gt;Authentication is the canonical example. In a recent rescue project — a codebase under 30k lines — I counted five distinct authentication flows: one in middleware, one in individual page components, one duplicated across API routes, one wrapped in a client-side hook that did its own token check, and one experimental Passport.js setup in a folder no route imported anymore. Each was added at a different point in the conversation history. None was deliberately redundant. The model never went back and removed what it had made obsolete.&lt;/p&gt;

&lt;p&gt;The same happens with validation. There's a Zod schema at the API boundary, a manual check function three files away, and a third set of checks inside the database utility — written at different times, with slightly different rules. Which one is authoritative? Nobody knows, including the person who asked the AI to write them.&lt;/p&gt;

&lt;p&gt;This matters in production because when you fix a bug in one layer, the other layers still have the old behaviour. You patch the API validation and the database utility silently rejects inputs that should now be valid. Or the opposite. The system is not one thing — it's an archaeological site of previous attempts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tests That Assert the Wrong Thing
&lt;/h2&gt;

&lt;p&gt;A passing test suite is not evidence that the code is correct. In a vibe-coded repo, it is often evidence of the opposite.&lt;/p&gt;

&lt;p&gt;Here's the shape of the problem:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;it&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;calculates the order total&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;total&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;calculateOrderTotal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;items&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;expect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;total&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toBe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That test passes. It will always pass, because &lt;code&gt;calculateOrderTotal&lt;/code&gt; does return a number — it just returns the wrong number. The VAT is not applied. Discounts are double-counted. The shipping is hardcoded to zero. None of this is caught because the test was written to verify that the function runs without throwing, not to verify that it produces a correct result.&lt;/p&gt;

&lt;p&gt;When a model writes tests for code it just wrote, it tends to test the implementation rather than the behaviour. It knows what the function does and it asserts that. What it doesn't know — because nobody told it — is what the function &lt;em&gt;should&lt;/em&gt; do in edge cases, error conditions, or inputs that don't look like the happy path.&lt;/p&gt;

&lt;p&gt;The result is a CI pipeline that's permanently green on a system with incorrect business logic. The tests are not lying. They are asking the wrong question.&lt;/p&gt;
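
&lt;p&gt;For contrast, here is a sketch of a test that asks the right question. Everything in it is hypothetical: the business rules (discount on the subtotal, VAT on the discounted amount, shipping added last, amounts in cents), the extra parameters on &lt;code&gt;calculateOrderTotal&lt;/code&gt;, and the small &lt;code&gt;expectEqual&lt;/code&gt; helper standing in for a test framework. The point is that the expected values are worked out by hand from the rules, not read back from the implementation.&lt;/p&gt;

```typescript
// Hypothetical order model and business rules, stated so the test has
// something concrete to check: the discount applies to the subtotal, VAT
// applies to the discounted amount, shipping is added last. Amounts in cents.
interface OrderItem {
  unitPriceCents: number;
  quantity: number;
}

function calculateOrderTotal(
  items: OrderItem[],
  discountPercent: number,
  vatPercent: number,
  shippingCents: number
): number {
  const subtotal = items.reduce((sum, item) => sum + item.unitPriceCents * item.quantity, 0);
  const discounted = subtotal - Math.round((subtotal * discountPercent) / 100);
  const vat = Math.round((discounted * vatPercent) / 100);
  return discounted + vat + shippingCents;
}

// Stand-in for a test framework assertion.
function expectEqual(actual: number, expected: number, label: string): void {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// 2 x 10.00 = 20.00; minus 10% = 18.00; plus 24% VAT = 22.32; plus 5.00 shipping = 27.32
expectEqual(
  calculateOrderTotal([{ unitPriceCents: 1000, quantity: 2 }], 10, 24, 500),
  2732,
  "discount, VAT and shipping"
);
// Edge case under these rules: an empty order costs only the shipping fee.
expectEqual(calculateOrderTotal([], 10, 24, 500), 500, "empty order");
```

&lt;p&gt;A version of the function that skips VAT or hardcodes shipping to zero fails the first assertion, which is exactly what the &lt;code&gt;typeof&lt;/code&gt; check could never do.&lt;/p&gt;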

&lt;h2&gt;
  
  
  Everything in &lt;code&gt;package.json&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Open the dependencies and you will find a history of the model's training data.&lt;/p&gt;

&lt;p&gt;There is &lt;code&gt;axios&lt;/code&gt; for HTTP. There is also &lt;code&gt;node-fetch&lt;/code&gt;. There is also the native &lt;code&gt;fetch&lt;/code&gt; with a polyfill that's not needed in the Node version being used. There are three date libraries: &lt;code&gt;moment&lt;/code&gt; (because tutorials used it for years), &lt;code&gt;date-fns&lt;/code&gt; (because someone told the model moment was deprecated), and &lt;code&gt;dayjs&lt;/code&gt; (because another prompt mentioned it was lighter). They are all installed. Two of them are used in different files. One is not used at all but was added to &lt;code&gt;package.json&lt;/code&gt; during a refactor that never happened.&lt;/p&gt;

&lt;p&gt;The same with utilities. &lt;code&gt;lodash&lt;/code&gt; and &lt;code&gt;underscore&lt;/code&gt;. &lt;code&gt;uuid&lt;/code&gt; and &lt;code&gt;nanoid&lt;/code&gt; and &lt;code&gt;crypto.randomUUID()&lt;/code&gt; all used in different places for the same purpose. A PDF library installed alongside a different PDF library, each used in one file.&lt;/p&gt;

&lt;p&gt;Unused dependencies are not just an aesthetic problem. They expand the attack surface for dependency vulnerabilities. They bloat the build. They make it harder to understand what the system actually uses. And they are almost impossible to clean up confidently without reading every import in the codebase — because there is no documentation saying why any of them were added.&lt;/p&gt;

&lt;p&gt;The proportion is consistent enough to be predictable. The last three vibe-coded repos I audited had between 25% and 35% of their direct dependencies entirely unused. In one of them, removing the unused entries cut the production Docker image by roughly 40%.&lt;/p&gt;
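
&lt;p&gt;A first-pass audit can be sketched as a comparison between the declared dependencies and the import specifiers actually found in the source. The function names are hypothetical, and collecting the import specifiers from the codebase is assumed to happen elsewhere. Packages that are used only from config files or the command line (build tools, type packages, plugins) will show up as unused here, so the output is a list of candidates to check by hand.&lt;/p&gt;

```typescript
// Hypothetical helper: reduces an import specifier to its package name,
// e.g. "lodash/fp" to "lodash" and "@scope/pkg/sub" to "@scope/pkg".
function packageNameOf(specifier: string): string {
  const depth = specifier.startsWith("@") ? 2 : 1;
  return specifier.split("/").slice(0, depth).join("/");
}

// Returns declared dependencies that no collected import refers to.
function findUnusedDependencies(declared: string[], importSpecifiers: string[]): string[] {
  const used = new Set(importSpecifiers.map(packageNameOf));
  return declared.filter((dependency) => !used.has(dependency));
}

const declaredDeps = ["axios", "node-fetch", "moment", "date-fns", "dayjs"];
const foundImports = ["axios", "date-fns/format", "dayjs", "./utils/dates"];
console.log(findUnusedDependencies(declaredDeps, foundImports)); // ["node-fetch", "moment"]
```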

&lt;h2&gt;
  
  
  Security Patterns Copied from Tutorials
&lt;/h2&gt;

&lt;p&gt;The model has processed an enormous amount of tutorial code. Tutorials are written to be easy to follow, which means they cut security corners that a production system cannot afford.&lt;/p&gt;

&lt;p&gt;I find access tokens stored in &lt;code&gt;localStorage&lt;/code&gt; because that is how most OAuth tutorials store them — it is easy to read from JavaScript, which is exactly the problem. I find CORS configured as &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; on an API that handles authenticated requests, because that is the configuration that makes the demo work without the CORS error. I find SQL built by string concatenation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`SELECT * FROM orders WHERE user_id = '&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;'`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is an injection vulnerability. The model has seen this pattern in thousands of tutorials and Stack Overflow answers where the point was to demonstrate something else, not to demonstrate safe query construction.&lt;/p&gt;
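
&lt;p&gt;A minimal sketch of the safer shape, assuming a driver that takes the SQL text and the values as separate arguments and uses PostgreSQL-style &lt;code&gt;$1&lt;/code&gt; placeholders. The &lt;code&gt;ParameterisedQuery&lt;/code&gt; type and the &lt;code&gt;ordersByUser&lt;/code&gt; helper are illustrative names, not something taken from an audited codebase.&lt;/p&gt;

```typescript
// Hypothetical shape: SQL text with placeholders, plus values sent separately.
interface ParameterisedQuery {
  text: string;
  values: string[];
}

// Same lookup as the concatenated version, but userId never becomes part of the SQL text.
function ordersByUser(userId: string): ParameterisedQuery {
  return {
    text: "SELECT * FROM orders WHERE user_id = $1",
    values: [userId],
  };
}

// A hostile input stays inert: it travels as a value, not as SQL.
const hostile = ordersByUser("1' OR '1'='1");
console.log(hostile.text); // SELECT * FROM orders WHERE user_id = $1
console.log(hostile.values);
```

&lt;p&gt;With node-postgres, for example, this pair would be passed as &lt;code&gt;client.query(text, values)&lt;/code&gt;, and the driver sends the values apart from the statement.&lt;/p&gt;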

&lt;p&gt;The one that I find most alarming, because it is the hardest to reverse: API keys committed into the codebase, sometimes in client-side bundles. The model puts configuration where it has seen configuration put before — directly in the code, or in a &lt;code&gt;.env&lt;/code&gt; file that was never added to &lt;code&gt;.gitignore&lt;/code&gt;. The key is now in the git history. Rotating it is mandatory. If it is a third-party service key, you must assume it was seen.&lt;/p&gt;

&lt;p&gt;I am not describing carelessness. I am describing what happens when a system that has learned from millions of examples of code-written-for-explanation is asked to produce code-that-must-actually-be-secure. These are different tasks. The model was not told they were different.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Schema That Disagrees With the Code
&lt;/h2&gt;

&lt;p&gt;Database drift is the slowest-moving problem and often the most expensive to fix.&lt;/p&gt;

&lt;p&gt;The migration files exist. They do not reflect the current state of the database, because migrations were generated at different stages of development and some were edited directly rather than through a new migration. Others were run manually on the production database by someone who was in a hurry. The migration history and the actual schema are no longer the same document.&lt;/p&gt;

&lt;p&gt;The ORM models have columns that do not exist. The database has columns that the ORM does not know about. Foreign keys are defined in the code but not enforced at the database level — or enforced in the database but not declared in the ORM, so queries that cross that boundary fail in unpredictable ways. Indexes were added at the model's suggestion but are on columns that are never actually queried, while the columns that appear in every &lt;code&gt;WHERE&lt;/code&gt; clause have no index.&lt;/p&gt;

&lt;p&gt;The practical consequence: you cannot do a fresh setup from the migration history alone. The schema and the code are permanently entangled with the specific database instance that was used during development. Moving to a new environment requires manual reconstruction. Rolling back a bad migration is not possible.&lt;/p&gt;
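
&lt;p&gt;The column mismatches described above can be surfaced with a plain set difference. This is a rough sketch that assumes both column lists have already been collected (for instance from the ORM's model metadata and from the database's own catalogue); &lt;code&gt;diffColumns&lt;/code&gt; and the sample column names are hypothetical.&lt;/p&gt;

```typescript
interface ColumnDrift {
  missingInDatabase: string[]; // declared by the ORM model, absent from the table
  unknownToModel: string[];    // present in the table, absent from the ORM model
}

// Compares the columns one model declares with the columns its table really has.
function diffColumns(modelColumns: string[], databaseColumns: string[]): ColumnDrift {
  const inModel = new Set(modelColumns);
  const inDatabase = new Set(databaseColumns);
  return {
    missingInDatabase: modelColumns.filter((column) => !inDatabase.has(column)),
    unknownToModel: databaseColumns.filter((column) => !inModel.has(column)),
  };
}

const drift = diffColumns(
  ["id", "user_id", "total", "discount_code"],
  ["id", "user_id", "total", "legacy_status"],
);
console.log(drift.missingInDatabase); // discount_code
console.log(drift.unknownToModel);    // legacy_status
```

&lt;p&gt;Running a check like this per table gives a concrete list to reconcile before anyone trusts the migration history again.&lt;/p&gt;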

&lt;h2&gt;
  
  
  A README That Describes a Different Product
&lt;/h2&gt;

&lt;p&gt;The documentation was written by the same model that built the app — and it imagines what the finished version would look like.&lt;/p&gt;

&lt;p&gt;The README describes role-based permissions that exist only as placeholder comments. The API reference lists endpoints that were planned but never added. The "quick start" fails on step three because a dependency was renamed and the docs weren't. Two of the environment variables it lists go by different names in the code that actually reads them.&lt;/p&gt;

&lt;p&gt;This is not the usual kind of documentation drift, where someone wrote accurate docs and the code moved on. The model produces confident-sounding documentation based on the intended design, not the actual implementation. It describes the system it was asked to build, not the system it built. A new developer setting up the project is working from a description of software that does not exist.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Happens
&lt;/h2&gt;

&lt;p&gt;None of this is random. It follows directly from how large language models work.&lt;/p&gt;

&lt;p&gt;The model's training data is dominated by tutorial code, Stack Overflow answers, and open-source repositories in various states of completion. Tutorial code is written to illustrate a point, not to be maintained. Stack Overflow answers solve the specific question asked, without regard for what surrounds them. Open-source repositories include everything from exemplary production code to abandoned weekend projects — and the model cannot easily distinguish between them.&lt;/p&gt;

&lt;p&gt;More structurally: the model optimises for plausible-looking output. Each response is evaluated on whether it seems correct, not on whether it integrates cleanly with the thirty responses that preceded it. There is no persistent memory of previous architectural decisions. When you ask it to add authentication at the start of a project and ask again halfway through, it adds authentication again — and there is no mechanism that asks whether the first attempt should be removed.&lt;/p&gt;

&lt;p&gt;There is also no one in the role of architect. An experienced developer making these decisions would delete the previous approach before adding a new one. They would read the test and ask whether it is actually testing the right thing. They would notice that three HTTP clients are installed and remove two of them. The model does none of this unless explicitly prompted, and even then it only does it locally — for the current file or function, not the system as a whole.&lt;/p&gt;

&lt;p&gt;The result is not incompetent code. It is code that is locally coherent but globally incoherent — each piece makes sense in isolation, and the whole doesn't hold together.&lt;/p&gt;

&lt;p&gt;There is also a quieter cause that is harder to write about because it concerns hiring decisions rather than technology. Many of the projects I see arrive in this state because the company used AI assistance to substitute for a senior engineering review that nobody on the team was qualified to do. The model was not asked to draft a feature that a senior engineer would then evaluate — it was the only entity in the loop with any opinion about software design. That arrangement produces the codebases described in this article more reliably than any specific prompt or tool choice. The fix is not better prompts. The fix is putting an experienced developer back into the decision path, even part-time.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Cleanup Actually Looks Like
&lt;/h2&gt;


&lt;p&gt;The sequence I follow when I take one of these projects on:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit first.&lt;/strong&gt; Read the codebase before touching it. Map what's actually there: every auth mechanism, every validation layer, every dependency that's imported somewhere. The goal is to understand the actual system, not the intended one. This is also what I do in a &lt;a href="https://iurii.rogulia.fi/services/technical-due-diligence" rel="noopener noreferrer"&gt;technical due diligence&lt;/a&gt; engagement before a client commits to acquiring or building on top of an existing codebase.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dead code removal.&lt;/strong&gt; Once you have the map, remove what is not used. This is tedious and requires confidence — you need to be sure a dependency is genuinely unused before removing it. Tools like &lt;code&gt;depcheck&lt;/code&gt; help, but manual review is necessary for anything that touches shared utilities.&lt;/p&gt;
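
&lt;p&gt;To make the idea concrete, here is a deliberately naive sketch of the check that tools like &lt;code&gt;depcheck&lt;/code&gt; automate. It is not how &lt;code&gt;depcheck&lt;/code&gt; is implemented; the function names and sample inputs are made up for illustration, and it only looks for quoted module specifiers in source text.&lt;/p&gt;

```typescript
// True when the source text contains the package as a quoted module specifier.
function isReferenced(packageName: string, source: string): boolean {
  for (const quote of ["'", '"']) {
    // Exact specifier ('axios') or a subpath import ('lodash/merge').
    if (source.includes(quote + packageName + quote) || source.includes(quote + packageName + "/")) {
      return true;
    }
  }
  return false;
}

// Declared dependencies that no source file appears to import or require.
// Misses dynamic imports, CLI-only tools, and packages referenced from config files.
function findUnusedDependencies(declared: string[], sources: string[]): string[] {
  return declared.filter((packageName) => !sources.some((source) => isReferenced(packageName, source)));
}

const declared = ["axios", "got", "node-fetch", "lodash"];
const sources = [
  'import axios from "axios";',
  "import merge from 'lodash/merge';",
];
// got and node-fetch are reported as candidates for removal.
console.log(findUnusedDependencies(declared, sources));
```

&lt;p&gt;The blind spots listed in the comment are exactly why the manual review step cannot be skipped.&lt;/p&gt;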

&lt;p&gt;&lt;strong&gt;Real tests.&lt;/strong&gt; Not tests that assert the code runs — tests that assert it produces correct results for known inputs, including edge cases and error conditions. This usually means reading the actual business logic and asking: what is this function supposed to do, and what are the cases it can fail? Write those tests before refactoring anything.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dependency cull.&lt;/strong&gt; Consolidate to one HTTP client, one date library, one ID generation utility. Document why each remaining dependency exists.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Security pass.&lt;/strong&gt; Tokens out of &lt;code&gt;localStorage&lt;/code&gt;. CORS locked to actual origins. SQL through parameterised queries or an ORM with no raw interpolation. Secrets out of the repository and into an actual secrets manager. If anything was committed, rotate it.&lt;/p&gt;
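
&lt;p&gt;For the CORS item, a minimal sketch of an origin allowlist, assuming the server can set the response header per request. &lt;code&gt;resolveAllowedOrigin&lt;/code&gt; and the example origins are hypothetical.&lt;/p&gt;

```typescript
// Returns the value to send in Access-Control-Allow-Origin, or null to omit the header.
function resolveAllowedOrigin(requestOrigin: string | undefined, allowlist: string[]): string | null {
  if (requestOrigin === undefined) {
    return null; // no Origin header, so there is nothing to echo back
  }
  // Exact string match only: no wildcard and no substring checks.
  return allowlist.indexOf(requestOrigin) !== -1 ? requestOrigin : null;
}

const allowlist = ["https://app.example.com", "https://admin.example.com"];
console.log(resolveAllowedOrigin("https://app.example.com", allowlist));  // https://app.example.com
console.log(resolveAllowedOrigin("https://evil.example.net", allowlist)); // null
```

&lt;p&gt;When the header echoes a specific origin like this, the response should also carry &lt;code&gt;Vary: Origin&lt;/code&gt; so that caches do not serve one origin's response to another.&lt;/p&gt;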

&lt;p&gt;This is not a rewrite. The underlying application logic is usually salvageable — the business rules, the data model, most of the UI. What needs replacing is the infrastructure that holds it together.&lt;/p&gt;

&lt;p&gt;I emphasise this because the first instinct of most founders, when they realise what they have, is to throw it away and start again. That instinct is almost always wrong. A rewrite means rebuilding everything the AI happened to get right alongside everything it got wrong, paying for it a second time, and arriving — months later — at a system that has not yet been tested against real users. Stabilising what exists is usually two to four times cheaper than starting over, and you keep the parts that already work. The codebase is in a worse state than it looks. It is rarely in a worse state than starting from zero.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is a vibe-coded codebase?&lt;/strong&gt; A codebase built primarily by prompting an LLM with no experienced developer reviewing the output, making architectural decisions, or enforcing consistency. The code compiles and demos work, but the system is globally incoherent: duplicate auth flows, tests that assert the wrong thing, security patterns copied from tutorials, and a schema that has drifted from the migration history.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Should I rewrite a vibe-coded codebase or fix it?&lt;/strong&gt; Almost always fix it. The underlying business logic and data model are usually salvageable. A rewrite means rebuilding everything the AI happened to get right alongside everything it got wrong, paying for it twice, and arriving months later at a system with no real-user battle-testing. Stabilising what exists is typically two to four times cheaper than starting over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What are the most common security problems in AI-generated code?&lt;/strong&gt; Three patterns I find in nearly every vibe-coded codebase: access tokens stored in &lt;code&gt;localStorage&lt;/code&gt; (readable by any JavaScript on the page); CORS configured as &lt;code&gt;Access-Control-Allow-Origin: *&lt;/code&gt; on authenticated APIs; and SQL built by string concatenation instead of parameterised queries. API keys committed to git history are also common; if found, rotate them immediately.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I audit a codebase for vibe-coded problems?&lt;/strong&gt; Read before touching anything. Map every authentication mechanism, every validation layer, and every dependency that is actually imported somewhere. Use &lt;code&gt;depcheck&lt;/code&gt; to identify unused packages, then manually verify anything that touches shared utilities. Look for duplicate implementations of the same concept. The goal is to understand the actual system before assuming the README describes it correctly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I prevent these problems when using AI to write code?&lt;/strong&gt; Put an experienced developer back into the decision path, even part-time. An LLM should draft features that a senior engineer evaluates and integrates, not drive the whole project. Practically: use a &lt;code&gt;CLAUDE.md&lt;/code&gt; or equivalent to enforce architectural rules at the prompt level, run lint gates on every commit, and never let the model add a dependency or auth flow without a human reviewing whether the existing one should be removed first.&lt;/p&gt;




&lt;p&gt;If what I've described sounds like the codebase on your laptop — or the one a developer just handed you — that's what my &lt;a href="https://iurii.rogulia.fi/services/rescue-projects" rel="noopener noreferrer"&gt;rescue projects&lt;/a&gt; service is for. I'll tell you honestly what's there and what it'll take to fix it.&lt;/p&gt;

&lt;p&gt;For the broader argument about why AI amplifies existing skill gaps rather than closing them, see &lt;a href="https://iurii.rogulia.fi/blog/ai-coding-senior-vs-junior" rel="noopener noreferrer"&gt;AI Coding Is Wind&lt;/a&gt;. For the practical tools that prevent these patterns from forming in the first place — &lt;code&gt;CLAUDE.md&lt;/code&gt;, lint gates, stop prompts — read &lt;a href="https://iurii.rogulia.fi/blog/ai-agent-codebase-prompts" rel="noopener noreferrer"&gt;Prompts That Keep an AI Agent From Wrecking Your Codebase&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>technicaldebt</category>
      <category>architecture</category>
      <category>bestpractices</category>
    </item>
    <item>
      <title>PDF Digital Signature Chain: Detect PDFs Signed Then Modified</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Tue, 30 Jun 2026 10:00:37 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/pdf-digital-signature-chain-detect-pdfs-signed-then-modified-1npg</link>
      <guid>https://dev.to/iurii_rogulia/pdf-digital-signature-chain-detect-pdfs-signed-then-modified-1npg</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://htpbe.tech/blog/pdf-signature-chain-verification" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt;. The version on htpbe.tech stays in sync with the latest detection algorithm — refer to it for the canonical text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The PDF digital signature chain is a cryptographic lock: it computes a hash over a specific byte range of the file and stores that hash inside the document. If any byte within that range changes, the signature becomes invalid. This is what makes digitally signed PDFs trustworthy — in theory.&lt;/p&gt;

&lt;p&gt;In practice, the PDF specification allows content to be appended after a signature without invalidating it. This is intentional: it supports legitimate workflows such as adding a second signature, appending a certification, or adding long-term validation (LTV) data. But it also creates a bypass: an attacker can append modified content after the signature, and the signature itself remains cryptographically valid. The signed portion did not change. The document did.&lt;/p&gt;

&lt;p&gt;This is how a PDF gets signed then modified without triggering a certificate-level alert — and it is one of two high-stakes tampering patterns that HTPBE detects at &lt;code&gt;certain&lt;/code&gt; confidence.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a PDF digital signature actually covers
&lt;/h2&gt;

&lt;p&gt;When a signing platform — DocuSign, Adobe Sign, HelloSign, or any PDF-signing library — signs a document, it records a &lt;code&gt;/ByteRange&lt;/code&gt; entry in the signature dictionary. This entry specifies exactly which bytes of the file the cryptographic hash covers:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/ByteRange [0 840 1256 4390]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;/ByteRange&lt;/code&gt; is a list of offset/length pairs, so this entry signs 840 bytes starting at offset 0 (bytes 0–839) and 4390 bytes starting at offset 1256 (bytes 1256–5645). The gap (bytes 840–1255) is reserved for the signature value itself; it cannot be signed because it contains the signature.&lt;/p&gt;

&lt;p&gt;After signing, the file is 5646 bytes long, so its last byte is byte 5645. If content is appended, the file grows beyond that boundary. The signature covers bytes 0–5645, minus the gap. Everything from offset 5646 onward is unsigned.&lt;/p&gt;
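
&lt;p&gt;The arithmetic can be sketched as a small function. It treats &lt;code&gt;/ByteRange&lt;/code&gt; as offset/length pairs and reports exclusive end offsets; &lt;code&gt;describeCoverage&lt;/code&gt; and its return shape are illustrative and are not HTPBE's implementation.&lt;/p&gt;

```typescript
interface Segment {
  start: number;        // first signed byte offset
  endExclusive: number; // first offset past the segment
}

interface SignedCoverage {
  segments: Segment[];
  signedEnd: number;    // first offset past the last signed byte
  unsignedTail: number; // bytes in the file beyond signedEnd
}

// Turns a flat /ByteRange array into segments and measures what lies past them.
function describeCoverage(byteRange: number[], fileSize: number): SignedCoverage {
  if (byteRange.length === 0 || byteRange.length % 2 !== 0) {
    throw new Error("ByteRange must hold offset/length pairs");
  }
  const segments: Segment[] = [];
  let signedEnd = 0;
  for (let i = 0; i !== byteRange.length; i += 2) {
    const start = byteRange[i];
    const endExclusive = start + byteRange[i + 1];
    segments.push({ start, endExclusive });
    signedEnd = Math.max(signedEnd, endExclusive);
  }
  return { segments, signedEnd, unsignedTail: Math.max(0, fileSize - signedEnd) };
}

const byteRange = [0, 840, 1256, 4390];
console.log(describeCoverage(byteRange, 5646).unsignedTail); // 0
console.log(describeCoverage(byteRange, 6100).unsignedTail); // 454
```

&lt;p&gt;A non-zero tail is not proof of tampering on its own, since later signatures and LTV data are appended the same way. It only marks the bytes the signer never saw.&lt;/p&gt;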

&lt;p&gt;An e-signature platform showing “signature valid” on such a document is not lying: the signed byte range is intact. But the document you are reading is larger than what the signer approved.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to detect a PDF signed then modified: two patterns at &lt;code&gt;certain&lt;/code&gt; confidence
&lt;/h2&gt;

&lt;p&gt;HTPBE raises two markers at &lt;code&gt;certain&lt;/code&gt; confidence for signature-related tampering. &lt;code&gt;certain&lt;/code&gt; is the highest confidence level in the system. Unlike &lt;code&gt;high&lt;/code&gt;-confidence markers, which indicate strong but circumstantial evidence, &lt;code&gt;certain&lt;/code&gt; markers are binary. Either the bytes changed or they did not. Either the signature object exists or evidence of its removal is in the file.&lt;/p&gt;

&lt;h3&gt;
  
  
  Pattern 1: Modifications after digital signature
&lt;/h3&gt;

&lt;p&gt;The detection reads the signature dictionary’s &lt;code&gt;/ByteRange&lt;/code&gt;, calculates the covered range, and compares it against the actual file size. If there are xref entries pointing to objects outside the signed byte range, the document was modified after signing.&lt;/p&gt;

&lt;p&gt;Structural evidence that confirms this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The file has an incremental update (a new xref section appended after the last signature event)&lt;/li&gt;
&lt;li&gt;The new xref section contains object references that fall outside the &lt;code&gt;/ByteRange&lt;/code&gt; of the existing signature&lt;/li&gt;
&lt;li&gt;The signature itself remains internally valid — only the unsigned append is flagged&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This pattern appears when someone takes a signed contract, appends a modified page (or replaces a referenced object through an incremental update), and re-distributes the file. The signing platform’s signature UI may still show “valid” because the platform validates the byte range it knows about, not the full document.&lt;/p&gt;

&lt;p&gt;The API response for this pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"certain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Modifications after digital signature"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"has_digital_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modifications_after_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"signature_removed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Pattern 2: Signature removed
&lt;/h3&gt;

&lt;p&gt;The second pattern is removal. A signed PDF has its signature object stripped and the file is saved as a new document. The goal is to produce a file that does not show a signature at all — removing the constraint of an immutable signed payload.&lt;/p&gt;

&lt;p&gt;This is harder to clean up completely. The PDF specification’s incremental-update model means prior xref entries leave traces in the file structure. When a signature object is removed, the object slot persists in the xref table as a free entry. Additionally, HTPBE’s ghost-info scan inspects object streams (&lt;code&gt;ObjStm&lt;/code&gt;) for residual metadata — Info dictionaries, annotation references, and signature field stubs that were present before removal but not cleanly erased.&lt;/p&gt;

&lt;p&gt;The structural evidence for removal:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;code&gt;/SigFlags&lt;/code&gt; entry in the AcroForm dictionary (or a residual reference to one) indicates the document had a signature form field&lt;/li&gt;
&lt;li&gt;The corresponding signature object is absent or marked free in the xref&lt;/li&gt;
&lt;li&gt;Object streams preserve the original Info dictionary with signing-tool metadata that contradicts the document’s current state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;API response for signature removal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"certain"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Signature removed from document"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"has_digital_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modifications_after_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"signature_removed"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Why these are &lt;code&gt;certain&lt;/code&gt;, not &lt;code&gt;high&lt;/code&gt;
&lt;/h2&gt;

&lt;p&gt;Most forensic markers in this system produce &lt;code&gt;high&lt;/code&gt; confidence because they rely on inferential evidence: a timestamp inconsistency, a producer mismatch, an unexpected xref count. These signals are reliable but not deterministic. A single &lt;code&gt;high&lt;/code&gt; marker is suspicious. Multiple &lt;code&gt;high&lt;/code&gt; markers together constitute a strong case.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;certain&lt;/code&gt; confidence works differently. The underlying evidence is structural and self-consistent:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For modifications after signing:&lt;/strong&gt; the signature’s byte range is embedded in the document. The file size is a fact. Whether xref entries exist outside the signed range is an arithmetic check. There is no ambiguity.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;For signature removal:&lt;/strong&gt; the PDF xref model records object lifecycle events. A free-entry slot where a signature object previously lived, combined with residual metadata that references it, is evidence that does not require interpretation. The object existed and was removed.&lt;/p&gt;

&lt;p&gt;Neither pattern requires comparing the document against an external original. The file carries its own forensic record.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integrating PDF digital signature chain fraud detection into a contract workflow
&lt;/h2&gt;

&lt;p&gt;The check is a single API call. Upload the PDF to accessible storage (S3, Cloudflare R2, Google Cloud Storage — presigned URLs work), then submit the URL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.htpbe.tech/v1/analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"url": "https://your-storage.example.com/contracts/executed-agreement-2024.pdf"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response comes back immediately with a check ID. Retrieve the full result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.htpbe.tech/v1/result/&lt;span class="o"&gt;{&lt;/span&gt;check_id&lt;span class="o"&gt;}&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a contract workflow, the fields to branch on are &lt;code&gt;modifications_after_signature&lt;/code&gt; and &lt;code&gt;signature_removed&lt;/code&gt;. Both being &lt;code&gt;false&lt;/code&gt; on a signed document (where &lt;code&gt;has_digital_signature: true&lt;/code&gt;) means the signed byte range covers the full document and no removal evidence was found:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;SignatureCheckResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;intact&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;modified&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;inconclusive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;modification_confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;certain&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;none&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;modification_markers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;has_digital_signature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;modifications_after_signature&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;signature_removed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;xref_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;assessSignedContract&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;SignatureCheckResult&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;valid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tampered&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unsigned&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Certain-confidence tampering: reject immediately&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;modifications_after_signature&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;signature_removed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;tampered&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Signed and structurally intact&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;has_digital_signature&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;intact&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;valid&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// No signature present — escalate for manual review in a workflow&lt;/span&gt;
  &lt;span class="c1"&gt;// that expects signed documents&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;has_digital_signature&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;unsigned&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// Modified (high confidence) but no signature-specific markers&lt;/span&gt;
  &lt;span class="c1"&gt;// Producer mismatch, incremental updates without signature bypass, etc.&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Testing without real contracts
&lt;/h3&gt;

&lt;p&gt;Test keys accept mock URLs that return deterministic responses. For signature-chain testing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Returns modifications_after_signature: true&lt;/span&gt;
https://api.htpbe.tech/v1/test/modified-medium.pdf

&lt;span class="c"&gt;# Returns signature_removed: true&lt;/span&gt;
https://api.htpbe.tech/v1/test/signature-removed.pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Use these in your CI test suite to cover both &lt;code&gt;certain&lt;/code&gt;-confidence branches without consuming production quota.&lt;/p&gt;
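
&lt;p&gt;As a minimal sketch of such a CI test, the fixtures below stand in for the two mock responses. Only the four fields used by the routing code above are modelled. &lt;code&gt;SignatureResult&lt;/code&gt;, &lt;code&gt;classify&lt;/code&gt;, and every fixture value other than the flag each mock URL is described as returning are assumptions for illustration, not the documented response shape.&lt;/p&gt;

```typescript
// Hypothetical subset of the analysis result: only the fields the routing code reads.
type SignatureResult = {
  has_digital_signature: boolean;
  modifications_after_signature: boolean;
  signature_removed: boolean;
  status: string;
};

type Verdict = "tampered" | "valid" | "unsigned" | "review";

// Mirrors the visible branches of the routing function; the name is an assumption.
function classify(result: SignatureResult): Verdict {
  if (result.modifications_after_signature || result.signature_removed) {
    return "tampered";
  }
  if (!result.has_digital_signature) {
    return "unsigned";
  }
  return result.status === "intact" ? "valid" : "review";
}

// Stand-in for the modified-medium.pdf mock (assumed values except the first flag).
const modifiedAfterSigning: SignatureResult = {
  has_digital_signature: true,
  modifications_after_signature: true,
  signature_removed: false,
  status: "modified",
};

// Stand-in for the signature-removed.pdf mock (assumed values except signature_removed).
const signatureRemoved: SignatureResult = {
  has_digital_signature: false,
  modifications_after_signature: false,
  signature_removed: true,
  status: "modified",
};

const verdicts = [modifiedAfterSigning, signatureRemoved].map(classify);
console.log(verdicts); // both fixtures route to "tampered"
```

&lt;p&gt;In a real suite the fixtures would be replaced by the parsed responses for the two mock URLs, so the same assertions exercise the actual integration.&lt;/p&gt;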

&lt;h2&gt;
  
  
  How this complements DocuSign and Adobe Sign validation
&lt;/h2&gt;

&lt;p&gt;Platforms like DocuSign and Adobe Sign validate the signing certificate chain and timestamp integrity at the time of signing. That is a different layer from what HTPBE checks.&lt;/p&gt;

&lt;p&gt;DocuSign’s validation answers: “Was this document signed with a valid certificate at the claimed time?”&lt;/p&gt;

&lt;p&gt;HTPBE’s analysis answers: “Has this document been modified since it was signed, or has the signature been removed?”&lt;/p&gt;

&lt;p&gt;These are complementary checks. A document can pass DocuSign certificate validation while still containing unsigned content appended after the fact — because DocuSign validates the byte range it signed, not the file as you receive it now.&lt;/p&gt;

&lt;p&gt;The practical scenario where both layers matter: an executed agreement is downloaded from a signing platform, modified by one party (additional clause appended, a payment term changed via an incremental update), and re-submitted as the “final executed version.” DocuSign’s certificate is intact. HTPBE surfaces the post-sign modification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Certificate trust validation is out of scope.&lt;/strong&gt; HTPBE does not validate that the signing certificate comes from a trusted CA, is within its validity period, or chains to a known root. That is the e-signature platform’s job. HTPBE operates on structural evidence, not PKI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;LTV annotations are not flagged.&lt;/strong&gt; Long-Term Validation (LTV) data — certificate revocation records, timestamp tokens — is legitimately appended to signed PDFs after the signing event. HTPBE accounts for this and does not flag LTV appends as modifications after signature.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Counter-signatures on the same byte range.&lt;/strong&gt; A workflow where a second party adds their signature to an already-signed document creates a valid multi-signature PDF with an incremental update. HTPBE distinguishes this from content modification because the appended xref section contains a new signature object, not arbitrary content changes. This is treated as &lt;code&gt;intact&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Clean removal with full file rewrite.&lt;/strong&gt; If an attacker takes a signed PDF and rewrites the entire file (rebuilding the xref table from scratch, regenerating all object streams, and removing all structural traces of the signature), the ghost-info scan may not surface residual evidence. This requires deliberate, expert-level counter-forensic work and is not the profile of typical contract fraud.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who should use this check
&lt;/h2&gt;

&lt;p&gt;This check is most relevant for teams that receive executed contracts, certified documents, or signed legal agreements from counterparties and need to check post-receipt integrity.&lt;/p&gt;

&lt;p&gt;Specific workflows where it applies:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Legal operations and contract management platforms&lt;/strong&gt; — checking that executed agreements received from opposing counsel or vendors match what was sent for signing. See how HTPBE integrates into &lt;a href="https://htpbe.tech/use-cases/contract-tamper-detection" rel="noopener noreferrer"&gt;legal contract workflows&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Financial services compliance&lt;/strong&gt; — loan agreements, insurance policy documents, and disclosure forms often carry digital signatures. A compliance audit that includes structural fraud detection of signed documents catches tampering that certificate validation alone misses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;M&amp;amp;A and due diligence workflows&lt;/strong&gt; — document rooms accumulate hundreds of executed agreements. Running batch checks through the &lt;a href="https://htpbe.tech/api" rel="noopener noreferrer"&gt;PDF signature fraud detection API&lt;/a&gt; surfaces any that were modified after signing before they influence deal terms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;E-signature platform integrations&lt;/strong&gt; — if you build on top of DocuSign or Adobe Sign and store executed documents in your own system, adding a structural check at the point of storage creates an immutable forensic record for each document from the moment it enters your custody.&lt;/p&gt;

&lt;p&gt;The check is fast enough to run synchronously at document intake: average analysis time is under three seconds for contracts up to 10 MB.&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>api</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Health Check Endpoint in Node.js: Liveness vs Readiness</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Mon, 29 Jun 2026 10:00:38 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/health-check-endpoint-in-nodejs-liveness-vs-readiness-418p</link>
      <guid>https://dev.to/iurii_rogulia/health-check-endpoint-in-nodejs-liveness-vs-readiness-418p</guid>
      <description>&lt;p&gt;Your load balancer is routing traffic to a server whose database connection pool is exhausted. Docker restarted a container that never finished its startup migrations. Kubernetes replaced a healthy pod because the liveness probe hit an endpoint that returned 503 on a transient Redis timeout.&lt;/p&gt;

&lt;p&gt;All of these happen because &lt;code&gt;/health&lt;/code&gt; returned the wrong thing — or because nobody designed it carefully enough.&lt;/p&gt;

&lt;p&gt;A health check endpoint is not a ping. It is the interface between your application and the infrastructure that decides whether your application lives or dies, receives traffic or gets replaced. Getting this interface wrong costs you incidents. Getting it right costs you an afternoon.&lt;/p&gt;

&lt;h2&gt;
  
  
  Liveness vs Readiness vs Startup
&lt;/h2&gt;

&lt;p&gt;Kubernetes formalised a distinction that applies to any containerised workload:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Liveness probe&lt;/strong&gt; — "Is this process still running correctly, or has it become a zombie?" If the liveness probe fails, the container is restarted. The question is about process health, not dependency health. If your app is live but Postgres is down, you do not want the container restarted — you want it to stop receiving traffic until Postgres recovers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Readiness probe&lt;/strong&gt; — "Is this instance ready to serve requests right now?" If the readiness probe fails, the container is removed from the load balancer rotation. Traffic stops coming to it. The container keeps running. When the probe passes again, traffic resumes. This is the correct mechanism for handling a temporarily unavailable database or Redis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Startup probe&lt;/strong&gt; — "Has the application finished initialising?" Some apps take 10–30 seconds on boot — running migrations, warming caches, establishing connection pools. The startup probe gives you time to do this without triggering false liveness failures. Once the startup probe passes, liveness and readiness probes take over.&lt;/p&gt;

&lt;p&gt;In Docker Compose or Docker Swarm without Kubernetes, you get a single &lt;code&gt;HEALTHCHECK&lt;/code&gt; directive. The semantics are simpler: healthy or unhealthy, with no "remove from load balancer rotation" state for a container that should keep running. What happens to an unhealthy container depends on the runtime. Swarm stops and replaces a task once it has failed &lt;code&gt;retries&lt;/code&gt; consecutive checks. Plain Docker and Docker Compose only mark the container &lt;code&gt;unhealthy&lt;/code&gt;; a restart happens only if something acts on that status, such as an autoheal-style watcher or your deploy tooling. This is an important difference from Kubernetes: wherever a failed healthcheck is acted on, the action is a restart, not a graceful drain. The consequence: if you check external dependencies (Postgres, Redis) in your Docker healthcheck and those dependencies go down, your container can be restarted even though a restart cannot fix a database outage. Design accordingly: set &lt;code&gt;retries&lt;/code&gt; high (3–5) and &lt;code&gt;interval&lt;/code&gt; long (30s+) to tolerate transient failures without triggering unnecessary restarts. Save the aggressive dependency checking for your monitoring system.&lt;/p&gt;

&lt;p&gt;The practical mapping:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Probe&lt;/th&gt;
&lt;th&gt;What it checks&lt;/th&gt;
&lt;th&gt;Failure action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Liveness&lt;/td&gt;
&lt;td&gt;Process is alive, not deadlocked&lt;/td&gt;
&lt;td&gt;Restart container&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Readiness&lt;/td&gt;
&lt;td&gt;Dependencies reachable, app ready for load&lt;/td&gt;
&lt;td&gt;Remove from load balancer rotation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Startup&lt;/td&gt;
&lt;td&gt;App initialisation complete&lt;/td&gt;
&lt;td&gt;Delay liveness/readiness probes&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
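
&lt;p&gt;The table can be read as a small decision function. This is a sketch under stated assumptions: &lt;code&gt;AppState&lt;/code&gt; and &lt;code&gt;probeStatusCode&lt;/code&gt; are hypothetical names, and the only external fact relied on is that Kubernetes HTTP probes treat status codes from 200 to 399 as success.&lt;/p&gt;

```typescript
type Probe = "liveness" | "readiness" | "startup";

// Hypothetical snapshot of what the app knows about itself.
type AppState = {
  startupComplete: boolean; // migrations run, caches warm, pools established
  dependenciesReachable: boolean; // e.g. Postgres and Redis respond
};

// Returns the HTTP status code a probe endpoint would answer with.
function probeStatusCode(probe: Probe, state: AppState): number {
  // Answering at all shows the process is not deadlocked.
  // Dependency outages must not trigger a restart, so liveness ignores them.
  if (probe === "liveness") return 200;

  // Startup only reports whether initialisation has finished.
  if (probe === "startup") return state.startupComplete ? 200 : 503;

  // Readiness: serve traffic only when booted and dependencies respond.
  if (!state.startupComplete) return 503;
  return state.dependenciesReachable ? 200 : 503;
}

// Postgres is down: the pod stays alive but leaves the rotation.
const postgresDown: AppState = { startupComplete: true, dependenciesReachable: false };
console.log(probeStatusCode("liveness", postgresDown)); // 200
console.log(probeStatusCode("readiness", postgresDown)); // 503
```

&lt;p&gt;Keeping liveness independent of &lt;code&gt;dependenciesReachable&lt;/code&gt; is the point of the sketch: the same outage produces a drain, not a restart.&lt;/p&gt;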

&lt;h2&gt;
  
  
  What to Check
&lt;/h2&gt;

&lt;p&gt;A useful health endpoint checks the things your app needs to serve requests correctly. For a typical Node.js API backed by Postgres and Redis:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Database connectivity&lt;/strong&gt; — a lightweight query that exercises the connection pool. Not a &lt;code&gt;SELECT 1&lt;/code&gt; to the database server directly, but through your ORM/connection pool, so you detect pool exhaustion and misconfigured connection strings, not just network reachability.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Redis connectivity&lt;/strong&gt; — a &lt;code&gt;PING&lt;/code&gt; command. If Redis is down and you depend on it for caching or session state, you are degraded. If you depend on it for rate limiting that gates all requests, you may be unhealthy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background job queue health&lt;/strong&gt; — include this in readiness only if serving HTTP traffic directly depends on queue capacity. For example: if your API enqueues jobs and immediately returns a response that assumes the job will be processed, a backed-up or stuck queue is a readiness concern. If jobs run in the background independently of request handling, queue health belongs in &lt;code&gt;/metrics&lt;/code&gt;, not &lt;code&gt;/health&lt;/code&gt; — a flooded queue should trigger an alert, not take your API offline. In vatnode, queue health lives in metrics: the API can accept Stripe webhooks and return 200 even if workers are temporarily stuck; the jobs will drain once workers recover.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Disk space&lt;/strong&gt; — optional but useful in containerised environments where logs or temporary files can fill a volume. A simple &lt;code&gt;df&lt;/code&gt; check can prevent a class of incidents where the container fills its storage and starts failing writes silently.&lt;/p&gt;

&lt;p&gt;What not to check in a health endpoint:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;External third-party APIs (Stripe, Mailgun, VIES). Their transient failures should not take your service offline. Handle them with circuit breakers at the call site.&lt;/li&gt;
&lt;li&gt;Business logic assertions. Health is an infrastructure concern, not a data consistency check.&lt;/li&gt;
&lt;li&gt;Anything that takes more than a second to complete under normal conditions.&lt;/li&gt;
&lt;/ul&gt;
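
&lt;p&gt;For the first item, a circuit breaker at the call site can be as small as the sketch below. It is illustrative only: the class, its method names, and the thresholds are assumptions, and a production service would more likely use an established library. Time is passed in explicitly so the behaviour is easy to test.&lt;/p&gt;

```typescript
// Minimal circuit breaker sketch; names and thresholds are illustrative.
class CircuitBreaker {
  private failures = 0;
  private openedAt: number | null = null;
  private readonly maxFailures: number;
  private readonly cooldownMs: number;

  constructor(maxFailures: number, cooldownMs: number) {
    this.maxFailures = maxFailures;
    this.cooldownMs = cooldownMs;
  }

  // True when a call to the third-party API may be attempted.
  canRequest(now: number): boolean {
    if (this.openedAt === null) return true;
    // After the cooldown, let a trial call through (half-open).
    return now - this.openedAt >= this.cooldownMs;
  }

  // A successful call closes the circuit and clears the failure count.
  recordSuccess(): void {
    this.failures = 0;
    this.openedAt = null;
  }

  // Enough consecutive failures open (or re-open) the circuit.
  recordFailure(now: number): void {
    this.failures += 1;
    if (this.failures >= this.maxFailures) this.openedAt = now;
  }
}

// Hypothetical breaker guarding calls to a payment provider.
const paymentBreaker = new CircuitBreaker(3, 30000);
paymentBreaker.recordFailure(0);
paymentBreaker.recordFailure(10);
paymentBreaker.recordFailure(20); // third failure opens the circuit at t=20

console.log(paymentBreaker.canRequest(1000)); // false: still cooling down
console.log(paymentBreaker.canRequest(30020)); // true: trial call allowed
```

&lt;p&gt;While the circuit is open, the call site fails fast or falls back, and the health endpoint stays unaffected by the third party's outage.&lt;/p&gt;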

&lt;h2&gt;
  
  
  The Response Format
&lt;/h2&gt;

&lt;p&gt;A health endpoint that just returns &lt;code&gt;200 OK&lt;/code&gt; with no body is better than nothing, but barely. A useful response tells you &lt;em&gt;what&lt;/em&gt; is healthy, so that when something breaks you know what to investigate.&lt;/p&gt;

&lt;p&gt;Here is the structure I use:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// types/health.ts&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;ComponentHealth&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;error&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;HealthResponse&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;uptime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// seconds&lt;/span&gt;
  &lt;span class="nl"&gt;checks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ComponentHealth&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;ComponentHealth&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="nl"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;?:&lt;/span&gt; &lt;span class="nx"&gt;ComponentHealth&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;status&lt;/code&gt; at the top level is a rollup. If all checks pass, it is &lt;code&gt;ok&lt;/code&gt;. If non-critical checks fail (Redis is down but the app can serve cached data from memory), it is &lt;code&gt;degraded&lt;/code&gt;. If critical checks fail (database unreachable), it is &lt;code&gt;unhealthy&lt;/code&gt;.&lt;/p&gt;
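
&lt;p&gt;A minimal sketch of that rollup, assuming each component check already reports its own severity (a critical dependency answers &lt;code&gt;unhealthy&lt;/code&gt; on failure, a non-critical one answers &lt;code&gt;degraded&lt;/code&gt;). The function name &lt;code&gt;rollupStatus&lt;/code&gt; is hypothetical, and the type is repeated only to keep the example self-contained.&lt;/p&gt;

```typescript
type HealthStatus = "ok" | "degraded" | "unhealthy";

// Rolls component statuses up to the worst one reported.
function rollupStatus(statuses: HealthStatus[]): HealthStatus {
  if (statuses.some((s) => s === "unhealthy")) return "unhealthy";
  if (statuses.some((s) => s === "degraded")) return "degraded";
  return "ok";
}

// Database fine, Redis down (reported as degraded by its check).
console.log(rollupStatus(["ok", "degraded"])); // "degraded"

// Database unreachable: the critical failure wins over everything else.
console.log(rollupStatus(["unhealthy", "degraded"])); // "unhealthy"
```

&lt;p&gt;Pushing the critical versus non-critical decision into each check keeps the rollup itself trivial: it never needs to know which component is which.&lt;/p&gt;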

&lt;p&gt;The &lt;code&gt;version&lt;/code&gt; field is your current deployment version or git SHA. This is extremely useful when debugging — it lets you verify that the instance you are looking at is actually running the code you just deployed.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;uptime&lt;/code&gt; field catches restart loops. An instance with 30 seconds of uptime that is supposed to have been running for days has restarted recently.&lt;/p&gt;

&lt;h2&gt;
  
  
  Implementing It in Hono
&lt;/h2&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// routes/health.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Hono&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;hono&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;sql&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;drizzle-orm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/redis&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;orderQueue&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/queues/order-queue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="kd"&gt;type&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;HealthResponse&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;ComponentHealth&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/types/health&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;health&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Hono&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;PROCESS_START&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkDatabase&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ComponentHealth&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;race&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sql&lt;/span&gt;&lt;span class="s2"&gt;`SELECT 1`&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;never&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;timeout&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unknown error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkRedis&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ComponentHealth&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;race&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ping&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
      &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;never&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;timeout&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;PONG&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;throw&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`unexpected response: &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;response&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Redis down = degraded, not unhealthy (depends on your app)&lt;/span&gt;
      &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unknown error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;checkQueue&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;ComponentHealth&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;race&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
      &lt;span class="nx"&gt;orderQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getJobCounts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;waiting&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;active&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;never&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;reject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;timeout&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt; &lt;span class="mi"&gt;1500&lt;/span&gt;&lt;span class="p"&gt;)),&lt;/span&gt;
    &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="c1"&gt;// Thresholds are system-specific — calibrate against your normal throughput.&lt;/span&gt;
    &lt;span class="c1"&gt;// A high-volume queue may have 1000 waiting jobs as steady state;&lt;/span&gt;
    &lt;span class="c1"&gt;// a low-volume queue may flag 10 failed jobs as a problem.&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;isHealthy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;failed&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;waiting&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;isHealthy&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;start&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt; &lt;span class="k"&gt;instanceof&lt;/span&gt; &lt;span class="nb"&gt;Error&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;message&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unknown error&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// This is the readiness-style check — full dependency verification.&lt;/span&gt;
&lt;span class="c1"&gt;// Wire it to /health/ready (Kubernetes) or /health (Caddy/monitoring).&lt;/span&gt;
&lt;span class="c1"&gt;// Docker's container healthcheck should call /health/live instead.&lt;/span&gt;
&lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/health/ready&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;allSettled&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;
    &lt;span class="nf"&gt;checkDatabase&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nf"&gt;checkRedis&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="nf"&gt;checkQueue&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="p"&gt;]);&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;db_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fulfilled&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;database&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;
      &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;check threw&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;redis_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fulfilled&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;
      &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;check threw&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;queue_result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
    &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;fulfilled&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
      &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;
      &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;latencyMs&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;check threw&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// Determine overall status&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="na"&gt;overallStatus&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HealthStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;db_result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;overallStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;redis_result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;queue_result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;overallStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;degraded&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HealthResponse&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;overallStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;APP_VERSION&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unknown&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;uptime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;PROCESS_START&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="na"&gt;checks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;database&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;db_result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;redis_result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;queue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;queue_result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="c1"&gt;// 200 for ok and degraded — load balancer should still route traffic&lt;/span&gt;
  &lt;span class="c1"&gt;// 503 for unhealthy — remove from rotation&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;httpStatus&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;overallStatus&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unhealthy&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;httpStatus&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;health&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  HTTP Status Codes: Why 200 for Degraded
&lt;/h2&gt;

&lt;p&gt;This surprises people: returning &lt;code&gt;200&lt;/code&gt; for a degraded service is intentional — but it depends on what your load balancer does with the response.&lt;/p&gt;

&lt;p&gt;The reasoning: if you return &lt;code&gt;503&lt;/code&gt; when Redis is temporarily unavailable and your app can still serve most requests from Postgres, you just took yourself out of the load balancer rotation. All your instances are likely degraded at the same time if Redis is down — so you just took down your entire service over a non-fatal condition.&lt;/p&gt;

&lt;p&gt;The general rule:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;200&lt;/code&gt; — route traffic here (ok or degraded, app is serving requests)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;503&lt;/code&gt; — do not route traffic here (unhealthy, app cannot serve requests)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Your monitoring system reads the response body and alerts on &lt;code&gt;degraded&lt;/code&gt; status. The load balancer cares about the HTTP status; your alerting cares about the body. Keep these concerns separate.&lt;/p&gt;

&lt;p&gt;A few important caveats:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Check what your load balancer actually does.&lt;/strong&gt; Some older or simpler load balancers (HAProxy with basic config, certain managed cloud LBs) only look at status codes, not the body. The 200-for-degraded pattern assumes your LB and monitoring are separate consumers with separate configurations. If your LB is the only health check consumer, nothing reads the body and a degraded state goes unnoticed, so you may need to return &lt;code&gt;503&lt;/code&gt; for degraded too and accept the rotation risk described above.&lt;/p&gt;
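
&lt;p&gt;As a rough sketch of that split, the status-to-HTTP mapping and the monitoring decision can live in two small functions. The names &lt;code&gt;toHttpStatus&lt;/code&gt;, &lt;code&gt;shouldAlert&lt;/code&gt; and the &lt;code&gt;degradedFails&lt;/code&gt; flag are hypothetical and not part of the handler above; the flag models the stricter fallback for setups where the load balancer is the only consumer.&lt;/p&gt;

```typescript
type HealthStatus = "ok" | "degraded" | "unhealthy";

// What the load balancer sees: only the HTTP status code.
// degradedFails is a hypothetical switch for LB-only setups.
function toHttpStatus(status: HealthStatus, degradedFails = false): 200 | 503 {
  if (status === "unhealthy") return 503;
  if (status === "degraded") return degradedFails ? 503 : 200;
  return 200;
}

// What monitoring sees: the parsed response body, regardless of HTTP status.
function shouldAlert(body: { status: HealthStatus }): boolean {
  return body.status !== "ok";
}

console.log(toHttpStatus("degraded"));          // 200
console.log(toHttpStatus("degraded", true));    // 503
console.log(shouldAlert({ status: "degraded" })); // true
```

&lt;p&gt;Keeping the two decisions in separate functions makes it harder for a change in alerting policy to accidentally change routing behaviour.&lt;/p&gt;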

&lt;p&gt;&lt;strong&gt;Consider a separate internal port.&lt;/strong&gt; A cleaner architecture separates concerns by port: the external port serves user traffic, an internal-only port (e.g., 9090) serves &lt;code&gt;GET /health&lt;/code&gt; with full details for the load balancer and monitoring system. This way the detailed health response never touches the public network, and you avoid the content-negotiation problem entirely. Not always worth the operational overhead, but worth knowing the pattern exists.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Timeout Circuit Breaker
&lt;/h2&gt;

&lt;p&gt;The single most important constraint on a health endpoint: it must never hang.&lt;/p&gt;

&lt;p&gt;If your database connection pool is exhausted, a &lt;code&gt;SELECT 1&lt;/code&gt; query will sit in the queue waiting for a connection to become available. Without a timeout, your health check hangs for 30 seconds, the load balancer times out waiting for a response, and it marks your instance unhealthy — not because it is, but because the health check itself blocked.&lt;/p&gt;

&lt;p&gt;Every check in the example above uses &lt;code&gt;Promise.race&lt;/code&gt; with a timeout. The thresholds I use:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database: 2 seconds (a &lt;code&gt;SELECT 1&lt;/code&gt; over a local connection should complete in under 5ms; if it takes 2 seconds something is wrong)&lt;/li&gt;
&lt;li&gt;Redis: 1 second (&lt;code&gt;PING&lt;/code&gt; should be sub-millisecond)&lt;/li&gt;
&lt;li&gt;BullMQ queue counts: 1.5 seconds (reads from Redis, same logic)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One important caveat: &lt;code&gt;Promise.race&lt;/code&gt; bounds the health endpoint's response time, but it does not cancel the underlying operation. If &lt;code&gt;db.execute(sql`SELECT 1`)&lt;/code&gt; loses the race, it keeps running inside the connection pool, because the database driver has no way to know you stopped waiting. This means a pool under pressure can accumulate abandoned queries alongside new ones. Where possible, use driver-level timeouts in addition to &lt;code&gt;Promise.race&lt;/code&gt;: most Postgres clients expose a &lt;code&gt;query_timeout&lt;/code&gt; or &lt;code&gt;statement_timeout&lt;/code&gt; setting, and &lt;code&gt;statement_timeout&lt;/code&gt; in particular is enforced by the Postgres server itself, so the query really is cancelled.&lt;/p&gt;
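
&lt;p&gt;One way to narrow that gap is to pass an &lt;code&gt;AbortSignal&lt;/code&gt; into the check and abort it when the timer fires. The sketch below is an illustration, not the code from the handler above: &lt;code&gt;runCheck&lt;/code&gt; and the &lt;code&gt;Probe&lt;/code&gt; shape are hypothetical names, and whether a given database driver honours an abort signal is driver-specific, so treat the signal as a best-effort hint.&lt;/p&gt;

```typescript
type HealthStatus = "ok" | "degraded" | "unhealthy";

interface CheckResult {
  status: HealthStatus;
  latencyMs: number;
  error?: string;
}

// Hypothetical probe shape: receives an AbortSignal and returns a promise.
// Typed loosely (unknown) to keep the sketch short.
type Probe = (signal: AbortSignal) => unknown;

async function runCheck(probe: Probe, timeoutMs: number) {
  const controller = new AbortController();
  const start = Date.now();
  let cancelTimer = () => {};

  const timeout = new Promise((_resolve, reject) => {
    const timer = setTimeout(() => {
      controller.abort(); // tell the probe to stop, if it listens
      reject(new Error("timeout"));
    }, timeoutMs);
    cancelTimer = () => clearTimeout(timer);
  });

  try {
    await Promise.race([probe(controller.signal), timeout]);
    const ok: CheckResult = { status: "ok", latencyMs: Date.now() - start };
    return ok;
  } catch (err) {
    const failed: CheckResult = {
      status: "degraded",
      latencyMs: Date.now() - start,
      error: err instanceof Error ? err.message : "unknown error",
    };
    return failed;
  } finally {
    cancelTimer(); // do not leave a pending timer behind after a fast probe
  }
}
```

&lt;p&gt;For HTTP dependencies the signal can be handed straight to &lt;code&gt;fetch(url, { signal })&lt;/code&gt;. For database clients that do not accept a signal, the driver-level timeout remains the mechanism that actually stops the query.&lt;/p&gt;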

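&lt;p&gt;Because the checks run in parallel through &lt;code&gt;Promise.allSettled&lt;/code&gt;, the worst case is the slowest single timeout (2 seconds), so the total health check should complete in well under 3 seconds. Configure your Docker or Kubernetes timeout to 5–10 seconds to give it headroom without making your orchestrator wait indefinitely.&lt;/p&gt;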

&lt;h2&gt;
  
  
  Docker and Kubernetes Configuration
&lt;/h2&gt;

&lt;p&gt;In Docker Compose:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;services&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;api&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp:latest&lt;/span&gt;
    &lt;span class="na"&gt;healthcheck&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="c1"&gt;# Use /health/live, not the full /health. An unhealthy status can lead to a restart&lt;/span&gt;
      &lt;span class="c1"&gt;# (Swarm, or a watchdog container), not a graceful drain, so a DB outage should not fail this check.&lt;/span&gt;
      &lt;span class="na"&gt;test&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="pi"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;CMD"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wget"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;--spider"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;-q"&lt;/span&gt;&lt;span class="pi"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;http://localhost:3001/health/live"&lt;/span&gt;&lt;span class="pi"&gt;]&lt;/span&gt;
      &lt;span class="na"&gt;interval&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;30s&lt;/span&gt;
      &lt;span class="na"&gt;timeout&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;10s&lt;/span&gt;
      &lt;span class="na"&gt;retries&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;start_period&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;20s&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;wget --spider&lt;/code&gt; (instead of &lt;code&gt;wget -qO-&lt;/code&gt;) makes the intent explicit: exit code 0 on 2xx, non-zero on anything else. The &lt;code&gt;-qO-&lt;/code&gt; variant prints the body to stdout, which is noise in Docker logs and makes the behavior less obvious.&lt;/p&gt;

&lt;p&gt;The &lt;code&gt;start_period&lt;/code&gt; gives your app time to finish initialising — running migrations, establishing connection pools — before the health check starts counting failures. Without it, your container can fail health checks during startup and get restarted in a loop before it has had a chance to boot.&lt;/p&gt;

&lt;p&gt;In Kubernetes, split liveness and readiness:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# kubernetes/deployment.yaml&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;containers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;api&lt;/span&gt;
      &lt;span class="na"&gt;image&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;myapp:latest&lt;/span&gt;
      &lt;span class="na"&gt;livenessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/health/live&lt;/span&gt;
          &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3001&lt;/span&gt;
        &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
        &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;
        &lt;span class="na"&gt;timeoutSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
        &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3&lt;/span&gt;
      &lt;span class="na"&gt;readinessProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/health/ready&lt;/span&gt;
          &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3001&lt;/span&gt;
        &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
        &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
        &lt;span class="na"&gt;timeoutSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
        &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
      &lt;span class="na"&gt;startupProbe&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
        &lt;span class="na"&gt;httpGet&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
          &lt;span class="na"&gt;path&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;/health/live&lt;/span&gt;
          &lt;span class="na"&gt;port&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;3001&lt;/span&gt;
        &lt;span class="na"&gt;initialDelaySeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
        &lt;span class="na"&gt;periodSeconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
        &lt;span class="na"&gt;failureThreshold&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;12&lt;/span&gt; &lt;span class="c1"&gt;# Allow 60 seconds for startup&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
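
&lt;p&gt;To see what these numbers imply, here is a rough back-of-the-envelope calculation of how long a failure has to persist before Kubernetes acts. It approximates the window as &lt;code&gt;periodSeconds × failureThreshold&lt;/code&gt; and ignores &lt;code&gt;timeoutSeconds&lt;/code&gt; and scheduling jitter; &lt;code&gt;failureWindowSeconds&lt;/code&gt; is a hypothetical helper, not a Kubernetes API.&lt;/p&gt;

```typescript
interface ProbeConfig {
  initialDelaySeconds: number;
  periodSeconds: number;
  failureThreshold: number;
}

// Approximate seconds of consecutive failures before the probe is considered failed.
function failureWindowSeconds(probe: ProbeConfig): number {
  return probe.periodSeconds * probe.failureThreshold;
}

// Values copied from the manifest above.
const liveness: ProbeConfig = { initialDelaySeconds: 10, periodSeconds: 30, failureThreshold: 3 };
const readiness: ProbeConfig = { initialDelaySeconds: 5, periodSeconds: 10, failureThreshold: 2 };
const startup: ProbeConfig = { initialDelaySeconds: 5, periodSeconds: 5, failureThreshold: 12 };

console.log(failureWindowSeconds(liveness));  // 90: roughly 90s of failures before a restart
console.log(failureWindowSeconds(readiness)); // 20: roughly 20s before removal from rotation
console.log(startup.initialDelaySeconds + failureWindowSeconds(startup)); // 65: startup budget including the initial delay
```

&lt;p&gt;The asymmetry is deliberate: readiness reacts quickly because leaving rotation is cheap and reversible, while liveness is slow because a restart is disruptive.&lt;/p&gt;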



&lt;p&gt;With separate probes, implement separate routes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// /health/live — only checks if the process itself is functional&lt;/span&gt;
&lt;span class="c1"&gt;// No external dependencies; if this fails, restart the container&lt;/span&gt;
&lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/health/live&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ok&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;uptime&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;floor&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;PROCESS_START&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// /health/ready — checks if the instance can serve traffic&lt;/span&gt;
&lt;span class="c1"&gt;// Uses the full check above&lt;/span&gt;
&lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/health/ready&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// ... full dependency checks, returns 503 if unhealthy&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you are running Docker Compose without Kubernetes, use a minimal &lt;code&gt;/health/live&lt;/code&gt; endpoint for Docker's container healthcheck. Use the full &lt;code&gt;/health&lt;/code&gt; or &lt;code&gt;/health/ready&lt;/code&gt; endpoint for external monitoring, Caddy upstream checks, or manual diagnostics, not for anything that triggers restarts. Docker marks a container &lt;code&gt;unhealthy&lt;/code&gt; after &lt;code&gt;retries&lt;/code&gt; consecutive failures. Plain Docker Engine and Compose do not restart it on that signal alone, but Swarm replaces unhealthy tasks and watchdog containers are commonly used to restart them, so what Docker calls should be process-level only.&lt;/p&gt;

&lt;h2&gt;
  
  
  Caching the Result and Managing Load
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;/health&lt;/code&gt; can become your most frequently called endpoint. A load balancer checking every 10 seconds, a monitoring system checking every minute, an autoscaler reading it constantly — all hitting the same Postgres and Redis that serve your actual users.&lt;/p&gt;

&lt;p&gt;If each health check fires a &lt;code&gt;SELECT 1&lt;/code&gt; to the database, you are generating steady background load that compounds with your regular query traffic. At low volume this is irrelevant; on a high-traffic system with a loaded database, it matters.&lt;/p&gt;

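&lt;p&gt;The fix: cache the health check result for a few seconds. A 5-second TTL caps real dependency checks at roughly 12 per minute per instance, no matter how many consumers poll the endpoint. The saving comes from consumers that poll more often than the TTL or from several consumers polling at once; a single load balancer checking every 10 seconds still triggers a fresh check on every poll, because the cached result has already expired by then.&lt;br&gt;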
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;cachedResult&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;body&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HealthResponse&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nl"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;CACHE_TTL_MS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;5000&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="nx"&gt;health&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;/health/ready&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedResult&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;cachedResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ts&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;CACHE_TTL_MS&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;cachedResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;cachedResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="c1"&gt;// ... run checks ...&lt;/span&gt;

  &lt;span class="nx"&gt;cachedResult&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;httpStatus&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;ts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;httpStatus&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="mi"&gt;503&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The tradeoff: a failure that occurs between cache refreshes will not be visible for up to 5 seconds. For most systems this is acceptable — load balancers already have &lt;code&gt;failureThreshold&lt;/code&gt; and &lt;code&gt;interval&lt;/code&gt; buffers built in. If you need sub-second failure detection, you probably need a more sophisticated monitoring pipeline, not a faster health endpoint.&lt;/p&gt;

&lt;p&gt;One caveat: cache &lt;code&gt;ok&lt;/code&gt; and &lt;code&gt;degraded&lt;/code&gt; results freely, but consider a shorter TTL (or no cache) for &lt;code&gt;unhealthy&lt;/code&gt;. A single transient Postgres timeout that resolves in 200ms should not cause 5 seconds of &lt;code&gt;503&lt;/code&gt; responses. A simple approach: set the TTL based on the result status — 5 seconds for &lt;code&gt;ok&lt;/code&gt;, 2 seconds for &lt;code&gt;degraded&lt;/code&gt;, 0 for &lt;code&gt;unhealthy&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;A related point: if your monitoring system needs rich queue metrics, failed job counts, and latency histograms — that data belongs in a metrics endpoint (&lt;code&gt;/metrics&lt;/code&gt; in Prometheus format, or a dedicated internal route), not in &lt;code&gt;/health&lt;/code&gt;. The healthcheck tells the orchestrator whether to route traffic. Metrics tell your team what is happening inside the system. These are different questions with different consumers.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Not to Expose Publicly
&lt;/h2&gt;

&lt;p&gt;The full JSON response with component statuses and latencies is useful for your monitoring system and your team. It should not be publicly accessible without authentication.&lt;/p&gt;

&lt;p&gt;A response body like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"latencyMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"redis"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"degraded"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"error"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"connect ECONNREFUSED 10.0.0.5:6379"&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;tells an attacker your internal Redis IP address and that it is currently unreachable. The error strings from failed checks often contain connection strings, hostnames, and infrastructure details.&lt;/p&gt;

&lt;p&gt;Options:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Restrict by network&lt;/strong&gt; — serve the full response only to requests from internal networks or specific IPs. Your load balancer and monitoring system are internal; public-facing traffic is not.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Two endpoints&lt;/strong&gt; — a public &lt;code&gt;/health&lt;/code&gt; that returns only &lt;code&gt;{ "status": "ok" }&lt;/code&gt; or &lt;code&gt;{ "status": "unhealthy" }&lt;/code&gt; without details, and an authenticated &lt;code&gt;/health/details&lt;/code&gt; that returns the full response.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Strip errors in production&lt;/strong&gt; — include &lt;code&gt;error&lt;/code&gt; fields only when &lt;code&gt;NODE_ENV !== "production"&lt;/code&gt;, or behind an &lt;code&gt;?verbose=1&lt;/code&gt; query param gated by an internal header.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;On vatnode, there are two health routes in production: &lt;code&gt;/health&lt;/code&gt; is publicly reachable (UptimeRobot and Caddy need it) but returns a sanitized body — status rollup and latencies, no error strings or infrastructure details. &lt;code&gt;/health/details&lt;/code&gt; is on an internal network interface behind IP allowlisting and returns the full response including error messages from failed checks. UptimeRobot gets enough signal to alert on downtime; the team gets full diagnostics when they need them.&lt;/p&gt;

&lt;h2&gt;
  
  
  Graceful Degradation and Not Restarting Too Fast
&lt;/h2&gt;

&lt;p&gt;One pattern that burns people: a liveness probe that checks external dependencies and triggers container restarts when a downstream service is flaky.&lt;/p&gt;

&lt;p&gt;Postgres goes down for 30 seconds. Your liveness probe checks Postgres. It fails 3 times in a row. Kubernetes restarts your pod. The new pod starts up, Postgres is still recovering, the liveness probe fails again. Kubernetes restarts again. Now you have a restart loop on top of a database outage, and your application is never in a stable state long enough to serve the requests that do not need Postgres.&lt;/p&gt;

&lt;p&gt;The fix: liveness probes check only the process itself (is the Node.js event loop responsive?). Readiness probes check external dependencies. When Postgres recovers, the readiness probe passes, traffic resumes, and no container was ever unnecessarily restarted.&lt;/p&gt;

&lt;p&gt;In Docker Compose, you only have one healthcheck, and its failure semantics are different from Kubernetes. Docker itself only marks the container &lt;code&gt;unhealthy&lt;/code&gt;; the restart comes from whatever acts on that status (Swarm mode, or an autoheal-style watcher alongside plain Compose). Either way, there is no "remove from rotation without restarting" equivalent. This makes the Docker case harder: if you include Postgres in your healthcheck, a database outage will trigger container restarts, which is exactly what the Kubernetes section warns against.&lt;/p&gt;

&lt;p&gt;The pragmatic approach for Docker Compose: keep the healthcheck strictly process-level — no external dependencies at all. Point it at &lt;code&gt;/health/live&lt;/code&gt;, which checks only that the Node.js event loop is responsive. Set &lt;code&gt;retries&lt;/code&gt; high (3–5) and &lt;code&gt;interval&lt;/code&gt; long (30s+) to absorb transient failures, and use your external monitoring system (Caddy upstream check, UptimeRobot, Grafana) for richer dependency checks. Accept that Docker's healthcheck is a liveness probe in all but name, and build your graceful degradation logic inside the application (circuit breakers, fallback paths) rather than relying on the orchestrator to make the distinction for you.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Good Looks Like
&lt;/h2&gt;

&lt;p&gt;On vatnode, three consumers read health endpoints with different intentions: Docker calls &lt;code&gt;/health/live&lt;/code&gt; every 30 seconds to decide whether to restart the container; Caddy's upstream health check calls &lt;code&gt;/health&lt;/code&gt; (readiness-style, full dependency check) every 30 seconds to decide whether to route traffic to this upstream; UptimeRobot calls &lt;code&gt;/health&lt;/code&gt; every 5 minutes for external availability monitoring. Each consumer gets the endpoint that matches its semantics. Under normal conditions, the &lt;code&gt;/health&lt;/code&gt; response looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"version"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"a3f2c19"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"uptime"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;847293&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"checks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"database"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"latencyMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"redis"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"latencyMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"queue"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ok"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"latencyMs"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Uptime of 847293 seconds means roughly 9.8 days since last restart. Database latency of 2ms means the connection pool is healthy and the query is executing normally. This response takes me 3 seconds to read and tells me the system is fine.&lt;/p&gt;

&lt;p&gt;When something is wrong, the response tells me where to look immediately — without SSHing into the server, without tailing logs, without waiting for a monitoring alert to fire.&lt;/p&gt;




&lt;p&gt;If you're building production infrastructure that needs to stay reliable — whether that's a &lt;a href="https://iurii.rogulia.fi/services/mvp-development" rel="noopener noreferrer"&gt;SaaS API&lt;/a&gt;, an e-commerce backend, or a worker-heavy &lt;a href="https://iurii.rogulia.fi/services/api-integrations" rel="noopener noreferrer"&gt;integration platform&lt;/a&gt; — healthcheck design is one of those details that seems boring until the 2 AM incident when it's the only tool you have.&lt;/p&gt;

&lt;p&gt;I've run these patterns in production across several systems, from &lt;a href="https://iurii.rogulia.fi/projects/vatnode-vat-validation" rel="noopener noreferrer"&gt;vatnode.dev&lt;/a&gt; to &lt;a href="https://iurii.rogulia.fi/projects/pikkuna-ecommerce-platform" rel="noopener noreferrer"&gt;pikkuna.fi&lt;/a&gt;. If you need a senior developer who can own production reliability end-to-end — &lt;a href="https://iurii.rogulia.fi/contact" rel="noopener noreferrer"&gt;get in touch&lt;/a&gt;. I'm available for freelance projects and long-term engagements.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/blog/background-jobs-nodejs" rel="noopener noreferrer"&gt;Background Jobs in Node.js: BullMQ, pg-boss, or Just a Cron?&lt;/a&gt; — BullMQ queue health in practice&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/blog/self-hosting-caddy-docker-vps" rel="noopener noreferrer"&gt;Self-Hosting a Production API on a €6/month VPS&lt;/a&gt; — Docker Compose healthcheck and Caddy upstream health checks in context&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/projects/vatnode-vat-validation" rel="noopener noreferrer"&gt;Vatnode VAT Validation SaaS&lt;/a&gt; — production system where these health patterns run&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;External documentation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://kubernetes.io/docs/tasks/configure-pod-container/configure-liveness-readiness-startup-probes/" rel="noopener noreferrer"&gt;Kubernetes liveness, readiness, and startup probes&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.docker.com/reference/dockerfile/#healthcheck" rel="noopener noreferrer"&gt;Docker HEALTHCHECK instruction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://hono.dev/" rel="noopener noreferrer"&gt;Hono documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.bullmq.io/guide/queues/getters" rel="noopener noreferrer"&gt;BullMQ getJobCounts API&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>node</category>
      <category>typescript</category>
      <category>hono</category>
      <category>bullmq</category>
    </item>
    <item>
      <title>PDF Fraud Detection in Loan Origination</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Mon, 29 Jun 2026 10:00:36 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/pdf-fraud-detection-in-loan-origination-3ma8</link>
      <guid>https://dev.to/iurii_rogulia/pdf-fraud-detection-in-loan-origination-3ma8</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://htpbe.tech/blog/pdf-fraud-detection-loan-origination" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt;. The version on htpbe.tech stays in sync with the latest detection algorithm — refer to it for the canonical text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A borrower submits a loan application through your LOS. Bank statements, a W-2, two pay stubs, a tax return. Everything looks right — the account numbers, the employer name, the income figure that places the borrower within your debt-to-income threshold.&lt;/p&gt;

&lt;p&gt;Three weeks later, the loan closes. Eighteen months after that, it defaults. When you pull the origination file in a repurchase review, someone finally opens the PDF in the right tool and sees it: the producer field on the bank statement shows “iLovePDF”. The modification date is three days after the creation date. The balance figures were edited after the bank generated the file.&lt;/p&gt;

&lt;p&gt;The fraud was in the PDF the entire time. Your LOS never checked.&lt;/p&gt;

&lt;p&gt;PDF fraud detection in loan origination closes this gap. A structural forensics check at document intake examines the file itself — not just what it says, but whether its internal history is consistent with the system that claims to have generated it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The document fraud surface in loan origination
&lt;/h2&gt;

&lt;p&gt;Every loan application that requires income or asset verification is a document fraud surface. The documents involved follow a predictable pattern:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Bank statements&lt;/strong&gt; — downloaded from the borrower’s online banking portal as a PDF. Real statements are generated by institutional document systems: Chase, Wells Fargo, Bank of America, HSBC. Their producer signatures are consistent and identifiable. An edited statement carries the producer of whichever tool modified it last.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;W-2s&lt;/strong&gt; — issued by employers and generated either by payroll software (ADP, Paychex, Gusto) or filed and downloaded through tax preparation platforms (TurboTax, H&amp;amp;R Block, IRS e-file). A W-2 whose claimed issuer is a national employer but whose producer is a consumer PDF editor should not exist in a normal workflow.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pay stubs&lt;/strong&gt; — generated by payroll platforms (ADP Workforce Now, Paychex Flex, Gusto, Rippling). Each has a distinct producer signature. A pay stub that claims ADP origin but carries a different producer was modified between generation and submission.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tax returns&lt;/strong&gt; — either IRS-issued transcripts or preparer-generated documents from TurboTax, H&amp;amp;R Block, or a CPA’s practice management software. The producer field narrows the expected origin significantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Asset letters and employment verification letters&lt;/strong&gt; — generated on institutional letterhead, typically by the issuer’s document management system or a staffing platform. These documents should show producer strings matching enterprise software, not consumer tools.&lt;/p&gt;

&lt;p&gt;The modification pattern is the same across all of these: the borrower takes a legitimate PDF, opens it in an editor, changes a number — income, balance, employment date, contribution amount — and submits the modified file. The edit takes five minutes. Without structural forensics, detection is nearly impossible.&lt;/p&gt;
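
&lt;p&gt;As a rough illustration of the producer mismatch described above, the sketch below flags a producer string that names a consumer tool. The tool list and the function name are assumptions made for this example, built only from tools mentioned in this article. It is not HTPBE’s detection logic, and real producer strings vary by tool version.&lt;/p&gt;

```python
# Illustrative only: consumer tools named in this article, not an exhaustive list.
CONSUMER_TOOL_HINTS = ("ilovepdf", "smallpdf", "adobe acrobat", "microsoft word")


def names_consumer_tool(producer):
    """Return True if the PDF producer string mentions a known consumer tool."""
    if not producer:
        return False
    lowered = producer.lower()
    return any(hint in lowered for hint in CONSUMER_TOOL_HINTS)


print(names_consumer_tool("Smallpdf"))  # True
print(names_consumer_tool("Wells Fargo Document Services"))  # False
```

&lt;p&gt;A substring match like this only shows the idea. A production check would compare the producer against what the claimed issuer is known to emit, rather than against a deny list.&lt;/p&gt;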

&lt;h2&gt;
  
  
  Why your LOS does not catch this
&lt;/h2&gt;

&lt;p&gt;Encompass, Blend, Byte, and SimpleNexus are routing and workflow platforms. They intake documents, attach them to loan files, route them to the appropriate stage, and hold them for underwriter review. That is what they were built to do.&lt;/p&gt;

&lt;p&gt;Document fraud detection in a mortgage LOS is not part of that design. These platforms do not inspect whether the PDF is structurally consistent with the system that allegedly generated it — that was never in their scope.&lt;/p&gt;

&lt;p&gt;The OCR extraction layer that sits in front of many LOS platforms — Ocrolus, FormFree, Finicity — has a different role. These tools read the document’s content: the income figures, the account balances, the employment dates. They extract numbers from the page and compare them against stated income or bank transaction data.&lt;/p&gt;

&lt;p&gt;Content extraction and structural forensics are different checks. OCR reads what the document says. Structural forensics reads whether the document’s internal history is consistent with the system that claims to have generated it. A skillfully edited bank statement can pass OCR extraction cleanly — the numbers on the page are internally consistent, the transactions add up, the balance matches the running total. The forgery is invisible at the content layer. It is only visible in the file structure.&lt;/p&gt;

&lt;p&gt;Neither your LOS nor your OCR vendor closes this gap. Both process the document as presented. Neither examines whether the document was modified before it arrived.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the structural signals actually look like
&lt;/h2&gt;

&lt;p&gt;A real bank statement from Wells Fargo, Chase, or Bank of America carries a producer string generated by the bank’s document management system. The creation date reflects when the statement was generated. The modification date is absent or matches the creation date within seconds. The xref table has one entry — the original generation, nothing else.&lt;/p&gt;

&lt;p&gt;An edited version of that statement shows a different picture. Here is what HTPBE returns on a typical altered bank statement submitted in a mortgage application:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ck_4e2a1b9f-7c3d-4f8e-b2a1-5d0c9e3a7f2b"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_EDITING_TOOL_FINGERPRINT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_MULTIPLE_REVISION_LAYERS"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Wells Fargo Document Services"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"producer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Smallpdf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"has_digital_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creation_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1743465600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1743724800&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;creator: "Wells Fargo Document Services"&lt;/code&gt; alongside &lt;code&gt;producer: "Smallpdf"&lt;/code&gt; is not a combination that occurs in any legitimate document workflow. Wells Fargo generates the statement; nothing in a normal mortgage process re-saves it in Smallpdf. The creation-to-modification gap of three days confirms the edit window. Three xref entries — the original generation and two subsequent edit sessions — tell you how many times it was touched.&lt;/p&gt;
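
&lt;p&gt;The arithmetic behind those two numbers, as a small sketch. It assumes the date fields are Unix timestamps in seconds and that each xref entry after the first corresponds to one later save. Both are readings of the example response, not documented HTPBE semantics.&lt;/p&gt;

```python
SECONDS_PER_DAY = 86_400

# Fields copied from the example response above.
result = {
    "creation_date": 1743465600,
    "modification_date": 1743724800,
    "xref_count": 3,
}

# Gap between generation and the last save, in days.
gap_days = (result["modification_date"] - result["creation_date"]) / SECONDS_PER_DAY

# Assumption: every xref entry beyond the first is one later edit session.
edit_sessions = result["xref_count"] - 1

print(gap_days)       # 3.0
print(edit_sessions)  # 2
```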

&lt;p&gt;For a W-2 that should have originated from ADP Workforce Now and was instead edited in Adobe Acrobat:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ck_8f1b3d2e-5a4c-4e7f-c3b2-9e1d4a6b8c0f"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_EDITING_TOOL_FINGERPRINT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_DATES_DISAGREE"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ADP Workforce Now"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"producer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Adobe Acrobat DC"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"has_digital_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creation_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1735689600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1743120000&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;HTPBE_DATES_DISAGREE&lt;/code&gt; with a 3-month gap between creation and modification on a W-2 is not a timing artifact. W-2s are generated in January and submitted to lenders in January or February of the same year. A modification date in late March on a document with a January creation date means someone opened and re-saved it well after initial generation.&lt;/p&gt;
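
&lt;p&gt;To see the calendar dates behind that gap, the two timestamps from the W-2 response can be converted directly. This sketch assumes they are Unix epoch seconds and reads them as UTC; the variable names are chosen for this example.&lt;/p&gt;

```python
from datetime import datetime, timezone

# Timestamps copied from the W-2 response above, read as UTC epoch seconds.
created = datetime.fromtimestamp(1735689600, tz=timezone.utc)
modified = datetime.fromtimestamp(1743120000, tz=timezone.utc)

print(created.date())             # 2025-01-01
print(modified.date())            # 2025-03-28
print((modified - created).days)  # 86
```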

&lt;h2&gt;
  
  
  What &lt;code&gt;inconclusive&lt;/code&gt; means in a mortgage context
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;inconclusive&lt;/code&gt; is the verdict returned when a document was created with consumer software — Microsoft Word, Google Docs, LibreOffice, Canva — that does not leave the structural markers present in institutionally-generated documents.&lt;/p&gt;

&lt;p&gt;For mortgage documents, &lt;code&gt;inconclusive&lt;/code&gt; on a document that claims institutional origin is itself a red flag.&lt;/p&gt;

&lt;p&gt;A bank statement that returns &lt;code&gt;inconclusive&lt;/code&gt; with &lt;code&gt;producer: "Microsoft Word"&lt;/code&gt; was not generated by a bank. Banks do not produce statements in Microsoft Word. The document was built from scratch in consumer software, which is not the same as editing an existing statement but is equally disqualifying for underwriting purposes.&lt;/p&gt;

&lt;p&gt;The routing rule for mortgage document intake:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;intact&lt;/code&gt; — structural signals consistent with claimed origin, proceed to underwriting queue&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;modified&lt;/code&gt; — post-creation edit detected, route to pre-underwriting review with named markers&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;inconclusive&lt;/code&gt; on a bank statement, W-2, pay stub, or tax return — document origin inconsistent with institutional claim, treat as &lt;code&gt;modified&lt;/code&gt; and hold for review&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;inconclusive&lt;/code&gt; on a borrower-authored document — a personal letter of explanation, a gift letter — is expected and acceptable. The signal only matters when the document claims to come from an institutional source.&lt;/p&gt;

&lt;h2&gt;
  
  
  Integration at document intake
&lt;/h2&gt;

&lt;p&gt;The check belongs at the moment the PDF arrives in your intake pipeline — before it is attached to the loan file, before it is queued for the underwriter, before OCR extraction runs. At that point, the document URL is available, the file is in storage, and the check takes under three seconds.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;

&lt;span class="n"&gt;HTPBE_API_KEY&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;environ&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;HTPBE_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="n"&gt;BASE_URL&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.htpbe.tech/v1&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;span class="n"&gt;HEADERS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Authorization&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Bearer &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;HTPBE_API_KEY&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# Document types that claim institutional origin in a mortgage application
&lt;/span&gt;&lt;span class="n"&gt;INSTITUTIONAL_DOC_TYPES&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bank_statement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pay_stub&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tax_return&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asset_letter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employment_letter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;check_loan_document&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pdf_url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc_type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;
    Run structural forensics on a loan application document.
    Returns a routing decision and audit metadata.

    doc_type: &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bank_statement&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;w2&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pay_stub&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tax_return&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; |
              &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asset_letter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;employment_letter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; | &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;personal_letter&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;
    &lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;submit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;post&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/analyze&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;url&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;pdf_url&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;check_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;submit&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;httpx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;BASE_URL&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/result/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;check_id&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;HEADERS&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;raise_for_status&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;markers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;modification_markers&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="n"&gt;producer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;producer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;creator&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;creator&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Modified — post-creation edit confirmed
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;modified&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pre_underwriting_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;check_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;check_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Structural modification detected: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;, &lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;markers&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;creator&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;creator&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;producer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Inconclusive on a document claiming institutional origin
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inconclusive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;doc_type&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;INSTITUTIONAL_DOC_TYPES&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pre_underwriting_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;check_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;check_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;doc_type&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; produced by consumer software (&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;), &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
                &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inconsistent with institutional origin&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
            &lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;producer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

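    &lt;span class="c1"&gt;# Fail closed: hold any verdict this routing does not recognize
&lt;/span&gt;    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;intact&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inconclusive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;hold&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;pre_underwriting_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;check_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;check_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;reason&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Unrecognized verdict: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;# Intact, or inconclusive on a personal document
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;action&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proceed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;queue&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;underwriting&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;check_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;check_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;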
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;check_id&lt;/code&gt; is stored against the document record in the loan file. If the loan is selected for a QC audit, repurchase review, or regulatory examination, the forensic report is retrievable at any point via &lt;code&gt;GET /v1/result/{check_id}&lt;/code&gt;, the same endpoint the intake code calls. The report includes the verdict, the named markers, the producer and creator strings, and the timestamp of the check: a complete audit trail attached to the document without requiring the original file.&lt;/p&gt;

&lt;p&gt;For LOS platforms that process documents via webhook (Blend’s document event hooks, Encompass’s pipeline triggers), the check runs asynchronously on document receipt. The routing decision is applied before the document advances to the next workflow stage.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the underwriter sees
&lt;/h2&gt;

&lt;p&gt;The structural verdict surfaces alongside the document in the loan file. It is not a score or a probability — it is a named set of signals that the underwriter can read and act on.&lt;/p&gt;

&lt;p&gt;For a held document, the underwriter sees:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The verdict: &lt;code&gt;modified&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The named markers: &lt;code&gt;HTPBE_EDITING_TOOL_FINGERPRINT&lt;/code&gt;, &lt;code&gt;HTPBE_MULTIPLE_REVISION_LAYERS&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The creator: &lt;code&gt;Chase Document Management&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The producer: &lt;code&gt;iLovePDF&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;The check ID linking to the full forensic report&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a black box. The underwriter can explain the hold, escalate it for borrower clarification, or reject the document based on a documented structural finding. That documented finding matters for the compliance angle.&lt;/p&gt;

&lt;h2&gt;
  
  
  Compliance: TRID, HMDA, and adverse action documentation
&lt;/h2&gt;

&lt;p&gt;TRID requires that lenders maintain a documented basis for decisions in the loan origination process. HMDA requires that adverse action be supported by identifiable reasons. When a lender rejects a document — or takes adverse action on an application that included fraudulent documents — the regulatory expectation is that the basis for that decision can be stated.&lt;/p&gt;

&lt;p&gt;“The bank statement appeared altered” is a subjective finding. “The bank statement returned a &lt;code&gt;modified&lt;/code&gt; verdict with markers &lt;code&gt;HTPBE_EDITING_TOOL_FINGERPRINT&lt;/code&gt; and &lt;code&gt;HTPBE_MULTIPLE_REVISION_LAYERS&lt;/code&gt; — the document’s &lt;code&gt;creator&lt;/code&gt; field shows Chase Document Management and the &lt;code&gt;producer&lt;/code&gt; field shows iLovePDF, with a modification date three days after creation” is a documented, machine-generated structural finding.&lt;/p&gt;

&lt;p&gt;Named structural markers from HTPBE translate directly into adverse action documentation. The check ID links to a permanent, retrievable record of the finding. The audit trail exists from the moment the document was checked — without any manual step from the underwriter or compliance team.&lt;/p&gt;
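
&lt;p&gt;As a rough illustration of how a result could be turned into a loan-file note, the sketch below formats the verdict, markers, creator, and producer into one sentence. The function name &lt;code&gt;finding_note&lt;/code&gt; and the placeholder check ID are hypothetical, and the field names are assumed to match the ones the intake code reads; this is not an HTPBE-provided helper.&lt;/p&gt;

```python
def finding_note(doc_label, result):
    """Render a forensic result dict as a one-sentence note for the loan file."""
    markers = result.get("modification_markers", [])
    parts = [f"{doc_label} returned a '{result['status']}' verdict"]
    if markers:
        parts.append("with markers " + ", ".join(markers))
    # Creator and producer are quoted verbatim so the note matches the report
    parts.append(
        f"(creator: {result.get('creator') or 'unknown'}; "
        f"producer: {result.get('producer') or 'unknown'})"
    )
    return " ".join(parts) + f". Check ID: {result['id']}."


example = {
    "id": "ck_example",  # placeholder, not a real check ID
    "status": "modified",
    "modification_markers": [
        "HTPBE_EDITING_TOOL_FINGERPRINT",
        "HTPBE_MULTIPLE_REVISION_LAYERS",
    ],
    "creator": "Chase Document Management",
    "producer": "iLovePDF",
}
note = finding_note("Bank statement", example)
print(note)
```

&lt;p&gt;Keeping the note a pure function of the stored result means it can be regenerated later from the check ID alone, without touching the original file.&lt;/p&gt;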

&lt;h2&gt;
  
  
  What this does not catch
&lt;/h2&gt;

&lt;p&gt;Structural forensics has a defined scope. Two patterns fall outside it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Documents fabricated from scratch in the correct software.&lt;/strong&gt; If a fraudster creates a bank statement using the same document system a real bank uses, or registers a business with a payroll provider and generates a real ADP pay stub with inflated figures, the structural signals will be consistent with a legitimate document. The content is false; the structure is clean. Forensic PDF analysis cannot detect this pattern. Verifying income against source data (Plaid, The Work Number, an IRS income source-of-truth check) is the appropriate control for fraud fabricated at the source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Encrypted or password-protected PDFs.&lt;/strong&gt; A PDF with strong encryption cannot be analyzed for structural signals. The check returns &lt;code&gt;inconclusive&lt;/code&gt; by necessity. For loan document intake, receiving an encrypted document from a borrower without prior arrangement is itself unusual and worth flagging.&lt;/p&gt;

&lt;p&gt;For the fraud pattern that accounts for the majority of LOS document fraud — taking a legitimate PDF and editing it with available tools — structural forensics catches it consistently. That is because the tools available for editing PDFs (Adobe Acrobat, Smallpdf, iLovePDF, PDF24, Microsoft Word’s PDF export) all leave recoverable traces in the file structure.&lt;/p&gt;
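
&lt;p&gt;The encrypted-PDF case described above can be screened cheaply before the forensic call. The sketch below is a heuristic under stated assumptions: it scans the raw bytes for an &lt;code&gt;/Encrypt&lt;/code&gt; key, which encrypted PDFs carry in their trailer or cross-reference stream dictionary. The function names and the returned action strings are hypothetical, and a byte scan can false-positive, so a hit is a reason to look closer rather than a verdict.&lt;/p&gt;

```python
def has_encrypt_entry(pdf_bytes):
    """Heuristic: True if the raw bytes contain an /Encrypt dictionary key."""
    return b"/Encrypt" in pdf_bytes


def intake_precheck(pdf_bytes, arranged_in_advance=False):
    """Return a hypothetical pre-check action for an uploaded file."""
    # The %PDF- header is expected near the start of the file
    if b"%PDF-" not in pdf_bytes[:1024]:
        return "reject_not_pdf"
    # Encryption without prior arrangement is unusual at loan intake
    if has_encrypt_entry(pdf_bytes) and not arranged_in_advance:
        return "flag_encrypted"
    return "submit_for_analysis"


print(intake_precheck(b"%PDF-1.7\n... /Encrypt 12 0 R ..."))
print(intake_precheck(b"%PDF-1.7\n... plain document ..."))
```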

&lt;h2&gt;
  
  
  Where to go from here
&lt;/h2&gt;

&lt;p&gt;The check runs against any document URL your LOS has access to — a presigned S3 URL, a Cloudflare R2 link, a Blob URL. There is no file upload to a third-party service in the critical path.&lt;/p&gt;

&lt;p&gt;Teams integrating into Encompass, Blend, or a custom LOS pipeline can start with the web tool at &lt;a href="https://htpbe.tech/" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt; and check a sample of application documents from recent closed loans before committing to an API build. The tool is free to try, with 5 free checks on signup, then pay-per-check (credit packs from $5) or a subscription from $15/mo. The results on a closed-loan sample frequently surface modifications that were missed at origination.&lt;/p&gt;

&lt;p&gt;For teams ready to build, API access starts at &lt;a href="https://htpbe.tech/api" rel="noopener noreferrer"&gt;$15/month&lt;/a&gt; with test keys available on all plans for integration testing before live documents are involved.&lt;/p&gt;

&lt;p&gt;For the full mortgage use case — including pay stub fraud detection patterns, asset letter signals, and employment letter checks — see &lt;a href="https://htpbe.tech/use-cases/mortgage" rel="noopener noreferrer"&gt;mortgage document fraud detection&lt;/a&gt; and &lt;a href="https://htpbe.tech/use-cases/lending" rel="noopener noreferrer"&gt;fintech lending fraud detection&lt;/a&gt;. For pay stub specifics, see &lt;a href="https://htpbe.tech/use-cases/fake-pay-stub-detection" rel="noopener noreferrer"&gt;fake pay stub detection&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://htpbe.tech/auth/signup" rel="noopener noreferrer"&gt;Register for API access&lt;/a&gt; and run the first check in under ten minutes.&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>fintech</category>
      <category>fraud</category>
      <category>api</category>
    </item>
    <item>
      <title>PDF Font Subset Divergence: Forensic Tampering Detection</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Sun, 28 Jun 2026 11:00:37 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/pdf-font-subset-divergence-forensic-tampering-detection-e6k</link>
      <guid>https://dev.to/iurii_rogulia/pdf-font-subset-divergence-forensic-tampering-detection-e6k</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://htpbe.tech/blog/pdf-font-subset-divergence" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt;. The version on htpbe.tech stays in sync with the latest detection algorithm — refer to it for the canonical text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Inside every PDF that embeds fonts is a six-character prefix that most people never notice. For forensic analysis, it is one of the most precise signals a document carries: it tells you, per page, whether that font was embedded during the same rendering session as the rest of the document.&lt;/p&gt;

&lt;p&gt;When those prefixes diverge across pages, the document was not produced in a single pass. Prefix divergence is structural evidence of page assembly from multiple sources — visible in the file itself, without any reference copy.&lt;/p&gt;

&lt;h2&gt;
  
  
  What font subsetting is
&lt;/h2&gt;

&lt;p&gt;PDF renderers do not embed the full font file for every typeface they use. Embedding a full font — even a modest one — adds hundreds of kilobytes to a file for no practical benefit when only a fraction of its glyphs appear in the document. Instead, the renderer embeds a &lt;em&gt;subset&lt;/em&gt;: only the glyphs actually used on that page or in that rendering session.&lt;/p&gt;

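&lt;p&gt;The PDF specification requires that embedded font subsets be tagged with a prefix of exactly six uppercase letters, followed by a &lt;code&gt;+&lt;/code&gt; and the font name. The specification leaves the choice of letters arbitrary, requiring only that different subsets in the same file carry different tags; many renderers generate them at random:&lt;br&gt;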
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ABCDEF+Arial
XKZWQP+TimesNewRoman
MJVHRT+SourceSansPro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The prefix is generated fresh at save time by the rendering engine. It is not derived from the font content, the document content, or any deterministic hash. It is random and local to that rendering session.&lt;/p&gt;

&lt;p&gt;This has a structural consequence: when a renderer produces a multi-page document in a single call, all font subsets originate from one session. Their prefixes may differ, since each subset gets its own tag, but they are all generated by the same process in the same execution context.&lt;/p&gt;
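
&lt;p&gt;To make the tag format concrete, here is a minimal sketch that splits a font name into its subset tag and base name. The helper &lt;code&gt;split_subset_tag&lt;/code&gt; is hypothetical, and it assumes the font name has already been extracted from the PDF with the leading slash of the name object removed.&lt;/p&gt;

```python
import re

# Six uppercase ASCII letters, a plus sign, then the base font name
SUBSET_TAG = re.compile(r"^([A-Z]{6})\+(.+)$")


def split_subset_tag(font_name):
    """Return (tag, base_name); tag is None when the name has no subset tag."""
    match = SUBSET_TAG.match(font_name)
    if match is None:
        return None, font_name
    return match.group(1), match.group(2)


print(split_subset_tag("ABCDEF+Arial"))  # ('ABCDEF', 'Arial')
print(split_subset_tag("Helvetica"))     # (None, 'Helvetica')
```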

&lt;h2&gt;
  
  
  The forensic signal
&lt;/h2&gt;

&lt;p&gt;When pages in a PDF carry font subsets with prefix patterns that are inconsistent with a single-session origin, it is a signal that the pages were rendered or assembled separately.&lt;/p&gt;

&lt;p&gt;This is not about comparing two specific prefix strings — the prefixes are random and have no comparable value. The signal comes from structural inconsistency: fonts that should share a rendering context do not, or page-level font data shows patterns consistent with independent session origins.&lt;/p&gt;

&lt;p&gt;Three concrete scenarios produce this pattern:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Multi-call AI document generation.&lt;/strong&gt; Language model APIs that render PDF output page-by-page — sending each page as a separate generation request — produce independent font sessions per page. Each page’s embedded fonts carry subsets from a distinct rendering context. A three-page document generated this way contains three separate font-subsetting environments. The prefix patterns do not align across pages the way they would in a document rendered in a single pass.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Page insertion from a foreign source.&lt;/strong&gt; A common fraud pattern in document tampering is inserting a page from one PDF into another: a bank statement with a page replaced from a different statement, or a contract with a substituted signature page. The inserted page was rendered by a different renderer, in a different session, at a different time. Its font subsets carry a different context signature than the surrounding pages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Template reuse with copy-pasted page objects.&lt;/strong&gt; Some document assembly tools construct PDFs by duplicating page objects from template files rather than re-rendering content. The duplicated pages bring their original font subsets with them. When the assembly tool adds new pages alongside these imported objects, the new pages carry fresh font subsets from the assembler, while the imported pages retain subsets from the original template’s renderer.&lt;/p&gt;
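
&lt;p&gt;A much cruder heuristic than the structural analysis described above, shown only to illustrate the idea: list every base font that appears under more than one subset tag, together with the pages each tag occurs on. This is not HTPBE's detection method. The function names are hypothetical, the per-page font names are assumed to have been extracted already, and a legitimate file can embed two subsets of the same typeface, so a hit is a prompt for review, not proof of assembly.&lt;/p&gt;

```python
from collections import defaultdict


def tags_by_base_font(fonts_per_page):
    """Map each base font to the pages on which each subset tag appears.

    fonts_per_page: {page_number: [font names such as "ABCDEF+Arial"]}
    """
    seen = defaultdict(lambda: defaultdict(set))
    for page, names in fonts_per_page.items():
        for name in names:
            tag, plus, base = name.partition("+")
            is_subset = (
                plus == "+" and len(tag) == 6
                and tag.isascii() and tag.isalpha() and tag.isupper()
            )
            if is_subset:
                seen[base][tag].add(page)
    return seen


def divergent_fonts(fonts_per_page):
    """Base fonts that are embedded under more than one subset tag."""
    result = {}
    for base, tags in tags_by_base_font(fonts_per_page).items():
        if len(tags) != 1:
            result[base] = {tag: sorted(pages) for tag, pages in tags.items()}
    return result


pages = {
    1: ["ABCDEF+Arial", "MJVHRT+SourceSansPro"],
    2: ["ABCDEF+Arial"],
    3: ["XKZWQP+Arial"],  # same typeface, different subset tag
}
print(divergent_fonts(pages))  # {'Arial': {'ABCDEF': [1, 2], 'XKZWQP': [3]}}
```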

&lt;h2&gt;
  
  
  What this exposes in practice
&lt;/h2&gt;

&lt;p&gt;Font subset divergence is most relevant for two categories of documents.&lt;/p&gt;

&lt;p&gt;The first is AI-rendered financial summaries and reports. Automated document generation pipelines that use LLM-based rendering (common in fintech for generating statements, summaries, or reports programmatically) often operate page-by-page. When a received document claims to be a system-generated output but its pages show independent font session signatures, the claimed origin is inconsistent with single-pass institutional generation. This does not prove fraud — it identifies a structural anomaly that warrants scrutiny.&lt;/p&gt;

&lt;p&gt;The second is manually assembled multi-source documents. HR and lending workflows regularly encounter documents where an applicant has taken pages from different source documents and combined them into a single PDF: a pay stub with a page from a different employer’s statement, or a lease agreement with a different tenant’s financial page inserted. Font subset divergence surfaces the page-assembly boundary directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  How HTPBE surfaces this
&lt;/h2&gt;

&lt;p&gt;HTPBE’s multi-session detection layer scans embedded font subsets across all pages and compares the rendering context signatures. When divergence is detected that is inconsistent with a single-session origin, the analysis adds a modification marker to the result.&lt;/p&gt;

&lt;p&gt;A response for a document with font subset divergence looks like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ck_7e3b1a9d-4f2c-4d8a-c3e1-5b9f2a0e7d4c"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_PAGES_FROM_MULTIPLE_SOURCES"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_MULTIPLE_REVISION_LAYERS"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"has_digital_signature"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Adobe Acrobat"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"producer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Adobe Acrobat"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creation_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1746057600&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1747872000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"page_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;HTPBE_PAGES_FROM_MULTIPLE_SOURCES&lt;/code&gt; marker indicates that pages in this document were assembled from rendering contexts that are structurally inconsistent with single-pass generation. Combined with &lt;code&gt;HTPBE_MULTIPLE_REVISION_LAYERS&lt;/code&gt; (two xref entries, meaning the file was opened and saved after initial creation), this result is consistent with page insertion after the original document was produced.&lt;/p&gt;
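
&lt;p&gt;The two date fields in the response look like Unix epoch seconds. That is an assumption based on their magnitude, not something the response itself states. Under that assumption, a short sketch converts them to readable dates and shows the gap between creation and modification:&lt;/p&gt;

```typescript
// Assumption: creation_date and modification_date are Unix epoch seconds.
const creationDate = 1746057600;
const modificationDate = 1747872000;

// Date expects milliseconds, so scale the seconds by 1000.
const toIso = (epochSeconds: number): string =>
  new Date(epochSeconds * 1000).toISOString();

// Whole days between the two timestamps.
const daysBetween = (modificationDate - creationDate) / 86400;

console.log(toIso(creationDate)); // 2025-05-01T00:00:00.000Z
console.log(toIso(modificationDate)); // 2025-05-22T00:00:00.000Z
console.log(daysBetween); // 21
```

&lt;p&gt;A three-week gap proves nothing by itself, but it fits the second xref entry: the file was saved again after it was first produced.&lt;/p&gt;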

&lt;p&gt;To submit a document for analysis:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.htpbe.tech/v1/analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"url": "https://your-storage.example.com/submissions/application-pack.pdf"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Retrieve the result:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl https://api.htpbe.tech/v1/result/ck_7e3b1a9d-4f2c-4d8a-c3e1-5b9f2a0e7d4c &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full result object includes page count, xref structure, digital signature state, and the complete &lt;code&gt;modification_markers&lt;/code&gt; array. Marker identifiers are stable strings, so they are safe to store and use in downstream routing logic.&lt;/p&gt;

&lt;h2&gt;
  
  
  Limitations
&lt;/h2&gt;

&lt;p&gt;Font subset divergence is not a &lt;code&gt;certain&lt;/code&gt;-confidence marker. It is &lt;code&gt;high&lt;/code&gt; confidence: a strong structural signal, but not cryptographically provable in the way that signature tampering is.&lt;/p&gt;

&lt;p&gt;Several legitimate production workflows produce divergent font subsets:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Document assembly pipelines.&lt;/strong&gt; Enterprise content management systems sometimes compose final PDFs from independently rendered components — a cover page from one service, body pages from another, appendices from a third. This is architecturally legitimate and produces the same structural pattern as page insertion fraud.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Print-and-scan-and-append workflows.&lt;/strong&gt; Some legal and compliance workflows scan physical pages and append them to electronically generated PDFs. The scanned pages carry no font subsets (raster content has no embedded fonts), so they introduce a different kind of structural discontinuity rather than prefix divergence — but the broader pattern of mixed rendering contexts is common in legitimate document production.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Certain PDF/A archival tools.&lt;/strong&gt; Some PDF/A conversion and compliance tools re-embed or re-subset fonts during archival processing. This can cause legitimate documents to show divergent prefix contexts if the conversion tool processed pages independently.&lt;/p&gt;

&lt;p&gt;When &lt;code&gt;HTPBE_PAGES_FROM_MULTIPLE_SOURCES&lt;/code&gt; appears alongside other markers — &lt;code&gt;HTPBE_MULTIPLE_REVISION_LAYERS&lt;/code&gt;, &lt;code&gt;HTPBE_EDITING_TOOL_FINGERPRINT&lt;/code&gt;, &lt;code&gt;HTPBE_DATES_DISAGREE&lt;/code&gt; — the cumulative signal is substantially stronger than any individual marker. When it appears in isolation, it warrants investigation rather than automatic rejection.&lt;/p&gt;
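
&lt;p&gt;The isolated-versus-corroborated distinction can be sketched as a small triage helper. The marker names come from this article; the &lt;code&gt;triageMultiSource&lt;/code&gt; function, its return labels, and the rule that any single corroborating marker is enough are assumptions for illustration, not part of the API:&lt;/p&gt;

```typescript
// Hypothetical triage helper. Marker names are taken from the article;
// the labels and the corroboration rule are assumptions.
const MULTI_SOURCE = "HTPBE_PAGES_FROM_MULTIPLE_SOURCES";
const CORROBORATING = [
  "HTPBE_MULTIPLE_REVISION_LAYERS",
  "HTPBE_EDITING_TOOL_FINGERPRINT",
  "HTPBE_DATES_DISAGREE",
];

type Triage = "no_signal" | "investigate" | "escalate";

function triageMultiSource(markers: string[]): Triage {
  if (!markers.includes(MULTI_SOURCE)) return "no_signal";
  const corroborated = markers.some((m) => CORROBORATING.includes(m));
  // In isolation: investigate, never auto-reject.
  // With a corroborating marker: treat as a stronger cumulative signal.
  return corroborated ? "escalate" : "investigate";
}

console.log(triageMultiSource([MULTI_SOURCE]));
console.log(triageMultiSource([MULTI_SOURCE, "HTPBE_MULTIPLE_REVISION_LAYERS"]));
```

&lt;p&gt;Keeping the isolated case on a separate path makes it harder for a legitimate assembly pipeline to end up in an auto-reject branch by accident.&lt;/p&gt;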

&lt;h2&gt;
  
  
  Integrating font-level forensics into document workflows
&lt;/h2&gt;

&lt;p&gt;For backend workflows that need to handle this marker explicitly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;HTPBEResult&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;intact&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;modified&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;inconclusive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;modification_confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;certain&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;high&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;none&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;modification_markers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="nl"&gt;xref_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;page_count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;creator&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;producer&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;evaluateDocumentResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;HTPBEResult&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;claimsInstitutionalOrigin&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;boolean&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;accept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reject&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;modified&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Certain-confidence markers: auto-reject&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;certainMarkers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HTPBE_POST_SIGNATURE_EDIT&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HTPBE_SIGNATURE_REMOVED&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;HTPBE_DATES_DISAGREE&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;hasCertain&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;modification_markers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;some&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt;
      &lt;span class="nx"&gt;certainMarkers&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;includes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hasCertain&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reject&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// High-confidence: reject if document claims institutional origin,&lt;/span&gt;
    &lt;span class="c1"&gt;// otherwise route to review&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;claimsInstitutionalOrigin&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;reject&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;inconclusive&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;claimsInstitutionalOrigin&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Consumer software origin for a document claiming bank/gov origin&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;accept&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For workflows that need to flag assembled or AI-generated multi-page documents, &lt;code&gt;HTPBE_PAGES_FROM_MULTIPLE_SOURCES&lt;/code&gt; is one of several markers the aggregate verdict weighs together — the API returns the combined result, so you act on the verdict rather than re-deriving the signal yourself.&lt;/p&gt;

&lt;h2&gt;
  
  
  What font subset divergence does not catch
&lt;/h2&gt;

&lt;p&gt;A forger who renders an entire fraudulent document in a single rendering session — fabricating all pages together with one tool call — produces consistent font subsets. This marker targets assembly-based fraud and multi-session generation, not single-session fabrication from scratch.&lt;/p&gt;

&lt;p&gt;For single-session fabrication, other layers are more relevant: producer/creator mismatches against known institutional generators, timestamp anomalies, and structural patterns in the xref chain. The full &lt;a href="https://htpbe.tech/how" rel="noopener noreferrer"&gt;forensic analysis&lt;/a&gt; runs all of these concurrently; font subset divergence is one input to the aggregate verdict, not the only one. The &lt;a href="https://htpbe.tech/blog/pdf-xref-table-forensics" rel="noopener noreferrer"&gt;PDF xref table forensics&lt;/a&gt; post covers how the update chain independently surfaces modification history — a complementary signal to font-level analysis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who this matters to
&lt;/h2&gt;

&lt;p&gt;Developers building document intake pipelines for lending, HR, insurance, or legal tech platforms can use this marker to triage documents that warrant manual inspection. It is not a binary reject signal in isolation — it is a routing signal that sends specific document patterns to a review queue rather than auto-accept.&lt;/p&gt;

&lt;p&gt;Security researchers and forensic analysts can use the &lt;code&gt;modification_markers&lt;/code&gt; array to reconstruct a document’s assembly history with greater precision than xref counts alone. Font-level session data provides page-granularity information about where document boundaries likely exist.&lt;/p&gt;

&lt;p&gt;The &lt;a href="https://htpbe.tech/api" rel="noopener noreferrer"&gt;HTPBE API&lt;/a&gt; returns this and all other forensic markers in a single response. Plans start at $15/month for 30 documents — or use the &lt;a href="https://htpbe.tech/" rel="noopener noreferrer"&gt;web tool&lt;/a&gt; to run an immediate check: free to try, with 5 free checks on signup, then pay-per-check (credit packs from $5).&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>forensics</category>
      <category>api</category>
      <category>webdev</category>
    </item>
    <item>
      <title>PostgreSQL Production Checklist: UUIDs, RLS, Indexes, Pooling</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Sun, 28 Jun 2026 10:00:37 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/postgresql-production-checklist-uuids-rls-indexes-pooling-2dfl</link>
      <guid>https://dev.to/iurii_rogulia/postgresql-production-checklist-uuids-rls-indexes-pooling-2dfl</guid>
      <description>&lt;p&gt;PostgreSQL is the only database I trust for production SaaS. Not because the alternatives are bad — but because Postgres rewards the time you invest in learning it. After using it across vatnode.dev, pikkuna.fi, pi-pi.ee, and htpbe.tech, I have a set of patterns I reach for every time without thinking. This is that collection.&lt;/p&gt;

&lt;p&gt;This is not a tutorial. It's opinionated, and I'll explain the "why" behind each decision. If you know basic SQL and want to know what experienced practitioners actually do — this is the article I wish existed when I started.&lt;/p&gt;

&lt;h2&gt;
  
  
  UUIDs vs Serial IDs
&lt;/h2&gt;

&lt;p&gt;I use two ID strategies, depending on the table's purpose.&lt;/p&gt;

&lt;p&gt;For &lt;strong&gt;user-facing entities&lt;/strong&gt; — users, orders, invoices, API keys — I use UUID v7 via the &lt;a href="https://www.npmjs.com/package/uuidv7" rel="noopener noreferrer"&gt;&lt;code&gt;uuidv7&lt;/code&gt;&lt;/a&gt; package. Two reasons: enumeration prevention and index performance. If your order ID is &lt;code&gt;12345&lt;/code&gt;, a determined attacker can iterate through orders. With a UUID, that's not feasible. And unlike UUID v4 (fully random), UUID v7 is time-sorted — new rows always land at the end of the B-tree, not at a random position. This matters at scale: on a 5M-row table, UUID v4 produces a 285MB index; UUID v7 produces a 118MB index with 3x lower INSERT latency. I covered the full breakdown in &lt;a href="https://iurii.rogulia.fi/blog/uuid-v7-ulid-nanoid" rel="noopener noreferrer"&gt;UUID v7, ULID, and NanoID compared&lt;/a&gt;.&lt;/p&gt;
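
&lt;p&gt;To see why time-sorted IDs land at the end of the index, here is a minimal sketch of the UUID v7 layout: a 48-bit millisecond timestamp up front, followed by the version, variant, and random bits. The &lt;code&gt;sketchUuidV7&lt;/code&gt; helper is hypothetical and uses &lt;code&gt;Math.random&lt;/code&gt; purely for illustration; real IDs should come from the &lt;code&gt;uuidv7&lt;/code&gt; package.&lt;/p&gt;

```typescript
// Illustrative sketch of the UUID v7 layout (RFC 9562), not a production generator.
// Math.random is NOT cryptographically secure; use the `uuidv7` package for real IDs.
function randomHex(length: number): string {
  let out = "";
  for (let i = 0; i !== length; i++) {
    out += Math.floor(Math.random() * 16).toString(16);
  }
  return out;
}

// Hypothetical helper: builds a v7-shaped UUID for a given millisecond timestamp.
function sketchUuidV7(timestampMs: number): string {
  // 48-bit timestamp as 12 hex digits, most significant digit first.
  const ts = timestampMs.toString(16).padStart(12, "0");
  // Variant nibble is one of 8, 9, a, b (top two bits are 10).
  const variant = "89ab".charAt(Math.floor(Math.random() * 4));
  return [
    ts.slice(0, 8),
    ts.slice(8, 12),
    "7" + randomHex(3), // version nibble 7 plus 12 random bits
    variant + randomHex(3),
    randomHex(12),
  ].join("-");
}

const earlier = sketchUuidV7(Date.parse("2026-06-28T10:00:00Z"));
const later = sketchUuidV7(Date.parse("2026-06-28T10:00:01Z"));

// The timestamp prefix makes string order follow creation order.
console.log(earlier, later, later > earlier);
```

&lt;p&gt;Because the leading digits only ever grow, each new key sorts after the existing ones, which is what keeps inserts on the right-hand edge of the B-tree.&lt;/p&gt;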

&lt;p&gt;Important: PostgreSQL's built-in &lt;code&gt;gen_random_uuid()&lt;/code&gt; generates UUID v4, not v7. PostgreSQL 18 adds a native &lt;code&gt;uuidv7()&lt;/code&gt; function; on earlier versions, generate v7 in application code and pass it explicitly.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;-- No DEFAULT on id — the application generates UUID v7 before INSERT&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For &lt;strong&gt;internal join tables&lt;/strong&gt; — &lt;code&gt;user_roles&lt;/code&gt;, &lt;code&gt;order_items&lt;/code&gt;, &lt;code&gt;tag_assignments&lt;/code&gt; — I use &lt;code&gt;BIGSERIAL&lt;/code&gt;. These IDs never surface in URLs or APIs, so enumeration isn't a concern. &lt;code&gt;BIGSERIAL&lt;/code&gt; is faster to index, smaller on disk, and slightly better for sequential inserts.&lt;/p&gt;

&lt;p&gt;In Drizzle ORM:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pgTable&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;bigserial&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;timestamp&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;drizzle-orm/pg-core&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;uuidv7&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;uuidv7&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pgTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;orders&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;primaryKey&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;$defaultFn&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;uuidv7&lt;/span&gt;&lt;span class="p"&gt;()),&lt;/span&gt;
  &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;references&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;created_at&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;withTimezone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;defaultNow&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orderItems&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pgTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order_items&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;bigserial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;primaryKey&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;references&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;product_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;integer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;quantity&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Timestamps: Always UTC, Always Two Columns
&lt;/h2&gt;

&lt;p&gt;Every table I create gets &lt;code&gt;created_at&lt;/code&gt; and &lt;code&gt;updated_at&lt;/code&gt;. No exceptions. I've debugged too many issues caused by "we didn't think we'd need that timestamp" to ever skip it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;gen_random_uuid&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="n"&gt;UUID&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;REFERENCES&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="n"&gt;plan&lt;/span&gt; &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Always &lt;code&gt;TIMESTAMPTZ&lt;/code&gt; (timestamp with time zone), never &lt;code&gt;TIMESTAMP&lt;/code&gt;. Postgres stores &lt;code&gt;TIMESTAMPTZ&lt;/code&gt; as UTC internally and converts on read based on the session timezone. If you store &lt;code&gt;TIMESTAMP&lt;/code&gt; and later need to handle users across time zones, you're retrofitting a problem that didn't have to exist.&lt;/p&gt;
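
&lt;p&gt;The convert-on-read behaviour can be mimicked outside the database. This is a rough TypeScript analogy, not Postgres itself: one UTC instant is stored, and the rendering depends on the zone requested at read time. The zone names are arbitrary examples.&lt;/p&gt;

```typescript
// One stored instant, kept in UTC (the analogue of a TIMESTAMPTZ value).
const stored = new Date("2026-06-28T10:00:00Z");

// Rendering depends on the zone asked for at read time,
// much like the session timezone in Postgres.
const render = (timeZone: string): string =>
  new Intl.DateTimeFormat("en-GB", {
    timeZone,
    hour: "2-digit",
    minute: "2-digit",
  }).format(stored);

const helsinki = render("Europe/Helsinki");
const newYork = render("America/New_York");

// Same instant, different wall-clock readings.
console.log(helsinki, newYork, stored.toISOString());
```

&lt;p&gt;The stored value never changes; only the presentation does. A plain &lt;code&gt;TIMESTAMP&lt;/code&gt; column keeps the wall-clock reading and drops the information needed to do this conversion.&lt;/p&gt;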

&lt;p&gt;For &lt;code&gt;updated_at&lt;/code&gt;, I use a trigger rather than relying on the ORM to set it. ORMs miss updates done via raw SQL, admin tools, or migration scripts.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;OR&lt;/span&gt; &lt;span class="k"&gt;REPLACE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;update_updated_at&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;RETURNS&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="err"&gt;$$&lt;/span&gt;
&lt;span class="k"&gt;BEGIN&lt;/span&gt;
  &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;updated_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;RETURN&lt;/span&gt; &lt;span class="k"&gt;NEW&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;END&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="err"&gt;$$&lt;/span&gt; &lt;span class="k"&gt;LANGUAGE&lt;/span&gt; &lt;span class="n"&gt;plpgsql&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TRIGGER&lt;/span&gt; &lt;span class="n"&gt;set_updated_at&lt;/span&gt;
&lt;span class="k"&gt;BEFORE&lt;/span&gt; &lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;
&lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;EACH&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;EXECUTE&lt;/span&gt; &lt;span class="k"&gt;FUNCTION&lt;/span&gt; &lt;span class="n"&gt;update_updated_at&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Soft Deletes: Useful, With One Critical Index
&lt;/h2&gt;

&lt;p&gt;Soft deletes — marking a row as deleted with a &lt;code&gt;deleted_at&lt;/code&gt; timestamp instead of removing it — are useful when you need an audit trail, or when "deleted" means something recoverable (cancelled subscription, archived document).&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;ADD&lt;/span&gt; &lt;span class="k"&gt;COLUMN&lt;/span&gt; &lt;span class="n"&gt;deleted_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The query pattern for fetching live records:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;deleted_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The gotcha: a standard index on &lt;code&gt;deleted_at&lt;/code&gt; rarely helps. Most rows are live, so &lt;code&gt;deleted_at IS NULL&lt;/code&gt; matches most of the table and the planner usually prefers a sequential scan over the index. What you actually want is a &lt;strong&gt;partial index&lt;/strong&gt;, one that covers only the rows you'll actually query:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Standard index — doesn't help much for WHERE deleted_at IS NULL queries&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_documents_deleted_at&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;deleted_at&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Partial index — only indexes live rows. Much smaller, much faster.&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_documents_active&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;deleted_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



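&lt;p&gt;The partial index only stores rows where &lt;code&gt;deleted_at IS NULL&lt;/code&gt;. For a table with 10% deleted records, that's roughly 10% fewer entries than a full index on the same columns. More importantly, any query for live records that filters on &lt;code&gt;user_id&lt;/code&gt; and repeats the &lt;code&gt;deleted_at IS NULL&lt;/code&gt; condition can use this index directly.&lt;/p&gt;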
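
&lt;p&gt;To make the size difference concrete, here is a toy TypeScript model of what each index would hold. It is a sketch under loud assumptions, not how Postgres stores B-trees: the &lt;code&gt;Doc&lt;/code&gt; type, &lt;code&gt;fullIndex&lt;/code&gt;, &lt;code&gt;partialIndex&lt;/code&gt;, and the ten rows are all made up for illustration.&lt;/p&gt;

```typescript
// Toy model: an "index" is just the list of entries it would hold.
type Doc = { id: number; userId: number; deletedAt: string | null };
type IndexEntry = { userId: number; id: number };

// Ten made-up rows; the first one is soft-deleted (10% of the table).
const docs: Doc[] = Array.from({ length: 10 }, (_, i) => ({
  id: i + 1,
  userId: (i % 3) + 1,
  deletedAt: i === 0 ? "2026-01-01T00:00:00Z" : null,
}));

// A full index has one entry per row, deleted or not.
function fullIndex(rows: Doc[]): IndexEntry[] {
  return rows.map((r) => ({ userId: r.userId, id: r.id }));
}

// A partial index WHERE deleted_at IS NULL skips soft-deleted rows entirely.
function partialIndex(rows: Doc[]): IndexEntry[] {
  return rows
    .filter((r) => r.deletedAt === null)
    .map((r) => ({ userId: r.userId, id: r.id }));
}

console.log(fullIndex(docs).length, partialIndex(docs).length); // 10 9
```

&lt;p&gt;The saving scales with the share of deleted rows, so the partial index pays off most on tables where a large fraction of rows has been soft-deleted.&lt;/p&gt;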

&lt;p&gt;When NOT to use soft deletes: high-volume tables where deleted rows accumulate fast (event logs, audit trails), and tables where regulatory compliance requires actual deletion (GDPR right to erasure). For GDPR, you need hard deletes. Trying to justify soft deletes as a GDPR exemption is a conversation you do not want to have with a data protection authority.&lt;/p&gt;

&lt;h2&gt;
  
  
  JSONB: Where It Helps and Where It Hurts
&lt;/h2&gt;

&lt;p&gt;Postgres JSONB is genuinely useful in specific cases. I reach for it in three situations:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flexible attributes&lt;/strong&gt; where the shape isn't fully known at design time — product metadata, user preferences, feature flags per tenant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit logs&lt;/strong&gt; where you want to store the full before/after state of a row without designing a separate schema for every table.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;External API responses&lt;/strong&gt; — store the raw Stripe event, the PostNord shipment response, whatever third-party payload you received. Useful for debugging and for replaying events without re-fetching from the source. In pikkuna.fi, I store every API response from PostNord and Zoho in a &lt;code&gt;raw_payload JSONB&lt;/code&gt; column alongside the parsed fields.&lt;/p&gt;

&lt;p&gt;Where it hurts: anything you filter or sort on regularly. If you're querying &lt;code&gt;WHERE metadata-&amp;gt;&amp;gt;'status' = 'active'&lt;/code&gt;, you're fighting the type system: &lt;code&gt;-&amp;gt;&amp;gt;&lt;/code&gt; returns text, and without an expression index the planner has no per-key statistics to estimate how many rows match. Extract frequently queried fields into proper typed columns.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Good use of JSONB — audit log with flexible payload&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;auditLog&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pgTable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;audit_log&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;bigserial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;mode&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;primaryKey&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;user_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;references&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;users&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;action&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;tableName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;text&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;table_name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;recordId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;uuid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;record_id&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;before&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;before&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;after&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;jsonb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;after&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
  &lt;span class="na"&gt;createdAt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;timestamp&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;created_at&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;withTimezone&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;notNull&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;defaultNow&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// Bad use of JSONB — filtering on JSON fields you should have typed&lt;/span&gt;
&lt;span class="c1"&gt;// SELECT * FROM products WHERE attributes-&amp;gt;&amp;gt;'category' = 'electronics'&lt;/span&gt;
&lt;span class="c1"&gt;// Just add a category column.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you find yourself writing &lt;code&gt;-&amp;gt;&amp;gt;'field'&lt;/code&gt; in WHERE clauses more than once, that field belongs in a proper column.&lt;/p&gt;
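
&lt;p&gt;One way to apply that rule at write time is to split an incoming payload into typed columns plus the raw JSONB. The sketch below is illustrative only: &lt;code&gt;ShipmentRow&lt;/code&gt;, &lt;code&gt;toShipmentRow&lt;/code&gt;, and the &lt;code&gt;status&lt;/code&gt; and &lt;code&gt;trackingId&lt;/code&gt; field names are hypothetical and do not reflect any real carrier API.&lt;/p&gt;

```typescript
// Hypothetical row shape: typed columns for what you filter on, raw JSONB for the rest.
type Payload = { [key: string]: unknown };

type ShipmentRow = {
  status: string; // typed column, easy to index and filter
  trackingId: string; // typed column
  rawPayload: Payload; // stored as JSONB for debugging and replay
};

// Promote the frequently queried fields and keep the full payload untouched.
function toShipmentRow(payload: Payload): ShipmentRow {
  const status = payload["status"];
  const trackingId = payload["trackingId"];
  if (typeof status !== "string" || typeof trackingId !== "string") {
    throw new Error("payload is missing status or trackingId");
  }
  return { status, trackingId, rawPayload: payload };
}

const parsed = toShipmentRow({ status: "delivered", trackingId: "T-1", weightGrams: 420 });
console.log(parsed.status); // delivered
```

&lt;p&gt;Queries then filter on the typed &lt;code&gt;status&lt;/code&gt; column, while &lt;code&gt;rawPayload&lt;/code&gt; stays available for the fields nobody filters on.&lt;/p&gt;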

&lt;h2&gt;
  
  
  Row-Level Security for Multi-Tenant SaaS
&lt;/h2&gt;


&lt;p&gt;Row-level security (RLS) is a Postgres feature that enforces data access rules at the database layer. For multi-tenant SaaS — where every user should see only their own data — it's the most reliable isolation layer you can add.&lt;/p&gt;

&lt;p&gt;Without RLS, your application code is the only thing standing between a user and another user's data. One missed &lt;code&gt;WHERE user_id = $1&lt;/code&gt; clause and data leaks. RLS makes isolation structural.&lt;/p&gt;

&lt;p&gt;Basic pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Enable RLS on the table&lt;/span&gt;
&lt;span class="k"&gt;ALTER&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt; &lt;span class="n"&gt;ENABLE&lt;/span&gt; &lt;span class="k"&gt;ROW&lt;/span&gt; &lt;span class="k"&gt;LEVEL&lt;/span&gt; &lt;span class="k"&gt;SECURITY&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- Policy: users can only see their own documents&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="n"&gt;POLICY&lt;/span&gt; &lt;span class="n"&gt;documents_user_isolation&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
  &lt;span class="k"&gt;FOR&lt;/span&gt; &lt;span class="k"&gt;ALL&lt;/span&gt;
  &lt;span class="k"&gt;USING&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;current_setting&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'app.current_user_id'&lt;/span&gt;&lt;span class="p"&gt;)::&lt;/span&gt;&lt;span class="n"&gt;UUID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In your application, set the context before queries in the same transaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// lib/db-context.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;sql&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;drizzle-orm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;withUserContext&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;
&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;T&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sql&lt;/span&gt;&lt;span class="s2"&gt;`SELECT set_config('app.current_user_id', &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;userId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;, true)`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;fn&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The &lt;code&gt;true&lt;/code&gt; parameter in &lt;code&gt;set_config&lt;/code&gt; makes the setting transaction-local — it resets after the transaction ends. This is important: if you use a connection pool, you don't want one user's context bleeding into the next query on the same connection.&lt;/p&gt;
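
&lt;p&gt;The difference between session-level and transaction-local settings can be shown with a toy model of one pooled connection. This is a sketch, not a Postgres client: &lt;code&gt;FakeConnection&lt;/code&gt; and &lt;code&gt;seenByNextRequest&lt;/code&gt; are hypothetical names, and returning &lt;code&gt;null&lt;/code&gt; for an unset value is a simplification of this model.&lt;/p&gt;

```typescript
// Toy model of a single pooled connection. Not a real Postgres client.
class FakeConnection {
  private session: { [key: string]: string } = {};
  private local: { [key: string]: string } = {};

  // Models set_config(name, value, is_local).
  setConfig(name: string, value: string, isLocal: boolean): void {
    if (isLocal) {
      this.local[name] = value;
    } else {
      this.session[name] = value;
    }
  }

  // Returns the visible value, or null when nothing is set (in this model).
  currentSetting(name: string): string | null {
    return this.local[name] ?? this.session[name] ?? null;
  }

  // COMMIT or ROLLBACK discards transaction-local settings only.
  endTransaction(): void {
    this.local = {};
  }
}

// Request A sets its user id and finishes; the connection returns to the pool.
// The return value is what request B would see on that same connection.
function seenByNextRequest(isLocal: boolean): string | null {
  const conn = new FakeConnection();
  conn.setConfig("app.current_user_id", "user-a", isLocal);
  conn.endTransaction();
  return conn.currentSetting("app.current_user_id");
}

console.log(seenByNextRequest(false)); // user-a (session-level value leaked)
console.log(seenByNextRequest(true)); // null (transaction-local value discarded)
```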

&lt;p&gt;I use Supabase-style RLS on some projects (&lt;code&gt;auth.uid()&lt;/code&gt;) and the &lt;code&gt;set_config&lt;/code&gt; pattern on others depending on whether I control the auth layer. Both work — the key is consistency.&lt;/p&gt;

&lt;h2&gt;
  
  
  Indexes: What Actually Matters
&lt;/h2&gt;

&lt;p&gt;Over-indexing is as harmful as under-indexing. Every index slows down writes, consumes storage, and gives the query planner one more option to evaluate. Here's what I actually add.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Partial indexes&lt;/strong&gt; — covered above for soft deletes. Also useful for status-filtered queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Only index rows that are in an actionable state&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_jobs_pending&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;jobs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'pending'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Covering indexes&lt;/strong&gt; — include columns in the index so Postgres can answer the query from the index alone, without hitting the table:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- A query that fetches user_id and status for orders in a date range&lt;/span&gt;
&lt;span class="c1"&gt;-- can be answered entirely from this index&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_orders_covering&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;INCLUDE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Composite index column order&lt;/strong&gt; — start with columns used in equality filters, then range or sort columns. Selectivity matters, but query shape matters more: a column with high cardinality that only appears in a range predicate should not go first if a lower-cardinality equality column appears in every query.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Weaker order: leads with low-cardinality status (3 possible values), so a lookup by user_id alone can't seek into the index&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_bad&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Better order: user_id is highly selective and, assuming every query filters on it, belongs first; status narrows within a user&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_good&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;subscriptions&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
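
&lt;p&gt;Why the leading column matters can be seen by sorting a few entries the way a composite B-tree would order its keys. The sketch below is a simplified model with made-up data; &lt;code&gt;buildIndex&lt;/code&gt; and &lt;code&gt;rangesForUser&lt;/code&gt; are hypothetical helpers, and a real index is a tree rather than a flat sorted array.&lt;/p&gt;

```typescript
type Entry = { userId: number; status: string };
type ColumnOrder = "user_first" | "status_first";

// Made-up data: 4 users, each with one row per status.
const rows: Entry[] = [];
for (const userId of [1, 2, 3, 4]) {
  for (const status of ["active", "canceled", "past_due"]) {
    rows.push({ userId, status });
  }
}

// Sort entries the way an index on (user_id, status) or (status, user_id) orders its keys.
function buildIndex(data: Entry[], order: ColumnOrder): Entry[] {
  return data.slice().sort((a, b) => {
    const byUser = a.userId - b.userId;
    const byStatus = a.status === b.status ? 0 : a.status > b.status ? 1 : -1;
    if (order === "user_first") {
      return byUser !== 0 ? byUser : byStatus;
    }
    return byStatus !== 0 ? byStatus : byUser;
  });
}

// Count the separate contiguous stretches of the index that hold one user's entries.
function rangesForUser(index: Entry[], userId: number): number {
  let ranges = 0;
  let previousMatched = false;
  for (const entry of index) {
    const matched = entry.userId === userId;
    if (matched) {
      if (!previousMatched) {
        ranges += 1;
      }
    }
    previousMatched = matched;
  }
  return ranges;
}

console.log(rangesForUser(buildIndex(rows, "user_first"), 2)); // 1
console.log(rangesForUser(buildIndex(rows, "status_first"), 2)); // 3
```

&lt;p&gt;With &lt;code&gt;user_id&lt;/code&gt; first, one user's entries sit in a single contiguous range. With &lt;code&gt;status&lt;/code&gt; first, the same entries are scattered across one range per status value.&lt;/p&gt;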



&lt;p&gt;When NOT to index: tables under 10,000 rows (sequential scan is often faster), columns that change frequently (every update rewrites the index entry), and columns you only filter on in bulk admin queries (use a separate read replica or just accept the scan).&lt;/p&gt;

&lt;h2&gt;
  
  
  CTEs: Readability and the Optimization Fence
&lt;/h2&gt;

&lt;p&gt;Common table expressions (&lt;code&gt;WITH&lt;/code&gt; queries) are the most underused tool for readable SQL. I reach for them whenever a query would otherwise require nested subqueries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Without CTE — hard to read&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'paid'&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- With CTE — each step is named and readable&lt;/span&gt;
&lt;span class="k"&gt;WITH&lt;/span&gt; &lt;span class="n"&gt;recent_users&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'90 days'&lt;/span&gt;
&lt;span class="p"&gt;),&lt;/span&gt;
&lt;span class="n"&gt;recent_paid_orders&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
  &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt;
  &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'30 days'&lt;/span&gt;
    &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'paid'&lt;/span&gt;
  &lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;COALESCE&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;order_count&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;order_count&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;recent_users&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;
&lt;span class="k"&gt;LEFT&lt;/span&gt; &lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;recent_paid_orders&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;u&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;One thing to know: since Postgres 12, CTEs are &lt;strong&gt;not&lt;/strong&gt; automatically optimization fences. Before Postgres 12, the planner always materialized each CTE (evaluated it once and stored the result). Since 12, the planner can inline a CTE into the main query and optimize across it, provided the CTE is non-recursive, has no side effects, and is referenced only once. If you explicitly need materialization (e.g., to stop the planner from pushing outer conditions into an expensive subquery), use &lt;code&gt;WITH cte AS MATERIALIZED (...)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;This matters when optimizing slow queries — don't assume a CTE is cached or is being optimized in isolation.&lt;/p&gt;

&lt;h2&gt;
  
  
  Transactions and &lt;code&gt;FOR UPDATE&lt;/code&gt;: Preventing Race Conditions
&lt;/h2&gt;

&lt;p&gt;Whenever two concurrent operations read the same row and both decide to act on it, you have a race condition. Classic examples: inventory decrement, seat booking, credit balance deduction.&lt;/p&gt;

&lt;p&gt;The naive pattern:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Race condition — two concurrent requests both read stock = 1&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;stock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
    &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="c1"&gt;// Both requests pass the check and both write stock = 0: two units sold, only one decrement recorded.&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The fix is &lt;code&gt;SELECT ... FOR UPDATE&lt;/code&gt; inside a transaction. This locks the row until the transaction commits or rolls back, forcing concurrent requests to wait:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// lib/inventory.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;products&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/db/schema&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;sql&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;drizzle-orm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;decrementStock&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="nb"&gt;Promise&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;boolean&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Lock the row — concurrent requests will wait here&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;select&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;stock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stock&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;update&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;product&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;product&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stock&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// Out of stock&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;stock&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sql&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;stock&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; - 1`&lt;/span&gt; &lt;span class="p"&gt;})&lt;/span&gt;
      &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;products&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;productId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The atomic &lt;code&gt;stock - 1&lt;/code&gt; expression in the UPDATE is also intentional: it applies the decrement relative to the current value at write time, not relative to the value you read earlier. On its own this prevents lost updates even without &lt;code&gt;FOR UPDATE&lt;/code&gt;, but the stock check still needs the row lock so that two buyers cannot both pass it for the last unit. Combining both gives you correctness under high concurrency.&lt;/p&gt;
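
&lt;p&gt;To make the difference between an absolute write and a relative decrement concrete, here is a minimal sketch in plain Python. It only simulates the interleaving of two transactions; no database is involved, and the function names are made up for illustration.&lt;/p&gt;

```python
# Illustrative simulation only: plain Python standing in for two concurrent
# transactions. No database is involved.

def absolute_write(stock: int) -> int:
    """Both transactions read the same value, then write `read_value - 1`."""
    read_a = stock          # transaction A reads
    read_b = stock          # transaction B reads before A writes
    stock = read_a - 1      # A writes
    stock = read_b - 1      # B overwrites A's write: one decrement is lost
    return stock


def relative_write(stock: int) -> int:
    """Each transaction applies `stock = stock - 1` against the current value."""
    stock = stock - 1       # A's UPDATE
    stock = stock - 1       # B's UPDATE sees A's result
    return stock


print(absolute_write(5))  # 4 (two sales, one decrement)
print(relative_write(5))  # 3
```

&lt;p&gt;The relative form only fixes the arithmetic. Whether the sale should happen at all is still decided by the stock check, which is why the row lock stays in place.&lt;/p&gt;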

&lt;h2&gt;
  
  
  Connection Pooling: The Math You Need to Do
&lt;/h2&gt;

&lt;p&gt;Postgres caps simultaneous connections with the &lt;code&gt;max_connections&lt;/code&gt; setting. The default is 100. Each connection consumes ~5–10MB of memory. On a $20/month VPS, you do not have room for 100 open connections.&lt;/p&gt;

&lt;p&gt;The problem with serverless and edge deployments is that each function invocation can open its own connection. 50 concurrent requests to a Next.js API route = 50 connections. Under traffic, this exhausts the Postgres connection limit and requests start failing with &lt;code&gt;sorry, too many clients already&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The solution is PgBouncer or an equivalent pooler (managed platforms such as Supabase and Neon ship a built-in pooler). The pooler sits between your application and Postgres, maintaining a small number of real connections and multiplexing many application connections through them.&lt;/p&gt;

&lt;p&gt;A rough sanity check for initial sizing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Available connections ≈ (RAM in MB / 10) - 5 (reserved for superuser)
Pool size per service ≈ available connections / number of services
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For a 1GB VPS with Postgres and one application, the formula gives just under 100 connections; after leaving headroom for the OS and Postgres itself, ~80 application connections is a reasonable starting point. Actual limits depend on &lt;code&gt;work_mem&lt;/code&gt;, &lt;code&gt;shared_buffers&lt;/code&gt;, autovacuum workers, and query complexity. Treat this as a sanity check, not a formula to follow blindly.&lt;/p&gt;
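
&lt;p&gt;The sizing rule above is simple enough to script. The sketch below is a direct transcription of the two lines of the rule of thumb; the function names and the 10MB-per-connection default are assumptions for illustration, and the result is a ceiling to start from rather than a recommendation.&lt;/p&gt;

```python
def available_connections(ram_mb: int, per_connection_mb: int = 10, reserved: int = 5) -> int:
    """Rule of thumb from the text: (RAM in MB / 10) - 5 reserved for superuser."""
    return max(ram_mb // per_connection_mb - reserved, 0)


def pool_size_per_service(available: int, services: int) -> int:
    """Split the available connections evenly across services."""
    if not services > 0:
        raise ValueError("services must be a positive integer")
    return available // services


budget = available_connections(1024)      # 1GB VPS
print(budget)                             # 97
print(pool_size_per_service(budget, 1))   # 97
print(pool_size_per_service(budget, 3))   # 32
```

&lt;p&gt;The raw number for 1GB lands a little above the starting point quoted in the text, which is the gap you would expect once memory for the OS and Postgres itself is set aside.&lt;/p&gt;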

&lt;p&gt;PgBouncer configuration I use in transaction pooling mode (recommended for most SaaS apps):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight ini"&gt;&lt;code&gt;&lt;span class="nn"&gt;[databases]&lt;/span&gt;
&lt;span class="py"&gt;myapp&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;host=localhost port=5432 dbname=myapp&lt;/span&gt;

&lt;span class="nn"&gt;[pgbouncer]&lt;/span&gt;
&lt;span class="py"&gt;pool_mode&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;transaction&lt;/span&gt;
&lt;span class="py"&gt;max_client_conn&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;1000&lt;/span&gt;
&lt;span class="py"&gt;default_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;25&lt;/span&gt;
&lt;span class="py"&gt;reserve_pool_size&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;5&lt;/span&gt;
&lt;span class="py"&gt;reserve_pool_timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;3&lt;/span&gt;
&lt;span class="py"&gt;server_idle_timeout&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="s"&gt;600&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Transaction pooling mode returns connections to the pool after each transaction, not after the session ends. This is the most efficient mode. The tradeoff: you cannot rely on session-level state such as session-scoped &lt;code&gt;SET&lt;/code&gt;, prepared statements that persist across requests, or &lt;code&gt;LISTEN/NOTIFY&lt;/code&gt; subscriptions that outlive a transaction. For RLS this matters: set the tenant context with &lt;code&gt;set_config(..., true)&lt;/code&gt;, where the final &lt;code&gt;true&lt;/code&gt; makes the setting transaction-local, which transaction pooling handles correctly.&lt;/p&gt;
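
&lt;p&gt;Why session-scoped settings are dangerous under transaction pooling is easier to see in a toy model. The class below is not PgBouncer or Postgres code; it is a hypothetical stand-in for one pooled server connection, and &lt;code&gt;app.tenant_id&lt;/code&gt; is an assumed setting name.&lt;/p&gt;

```python
# Toy model, not PgBouncer's implementation: one shared server connection,
# handed to whichever client runs the next transaction.

class ServerConnection:
    def __init__(self):
        self.session_settings = {}      # survives until the connection closes
        self.transaction_settings = {}  # cleared at COMMIT/ROLLBACK

    def set_config(self, key, value, is_local):
        target = self.transaction_settings if is_local else self.session_settings
        target[key] = value

    def current_setting(self, key):
        if key in self.transaction_settings:
            return self.transaction_settings[key]
        return self.session_settings.get(key)

    def commit(self):
        self.transaction_settings.clear()


# Tenant A uses a session-scoped setting: the value outlives the transaction.
conn = ServerConnection()
conn.set_config("app.tenant_id", "tenant-a", is_local=False)
conn.commit()
leaked = conn.current_setting("app.tenant_id")  # what the next client inherits

# Same flow with a transaction-scoped setting on a fresh connection.
conn = ServerConnection()
conn.set_config("app.tenant_id", "tenant-a", is_local=True)
conn.commit()
clean = conn.current_setting("app.tenant_id")

print(leaked)  # tenant-a
print(clean)   # None
```

&lt;p&gt;In the first flow the next client to receive the connection would run its queries under tenant A's context; in the second there is nothing left to inherit.&lt;/p&gt;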

&lt;h2&gt;
  
  
  Migrations: Never Touch the Schema Manually
&lt;/h2&gt;

&lt;p&gt;This sounds obvious and people still do it. I've inherited production databases where no one knows exactly what schema is running, because "someone made a quick change in the admin panel six months ago."&lt;/p&gt;

&lt;p&gt;The rule is absolute: every schema change goes through a migration file, committed to version control, applied by a migration tool. No exceptions for "just a quick index" or "just adding one column."&lt;/p&gt;

&lt;p&gt;I use Drizzle ORM's migration workflow:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Generate migration from schema changes&lt;/span&gt;
npx drizzle-kit generate

&lt;span class="c"&gt;# Apply migrations (run in CI/CD, not manually)&lt;/span&gt;
npx drizzle-kit migrate
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The migration file is a plain SQL file, committed alongside the code change that needs it. Every deployment applies pending migrations before the new code starts.&lt;/p&gt;

&lt;p&gt;On rollback strategy: Postgres DDL is transactional. If you add a column, index, and constraint in one migration, and the constraint fails, the whole migration rolls back. This is a feature, so design your migrations to be atomic. For large tables, be careful with operations that hold &lt;code&gt;ACCESS EXCLUSIVE&lt;/code&gt; locks for a long time (adding NOT NULL constraints to existing columns, rebuilding indexes). Use &lt;code&gt;CREATE INDEX CONCURRENTLY&lt;/code&gt;, which cannot run inside a transaction block and therefore needs its own non-transactional migration, and run &lt;code&gt;ALTER TABLE ... SET NOT NULL&lt;/code&gt; only after a matching &lt;code&gt;CHECK (... IS NOT NULL)&lt;/code&gt; constraint has been added and validated in an earlier migration.&lt;/p&gt;

&lt;p&gt;One pattern I always follow: never drop a column in the same deployment that removes its usage from the application code. Deploy the code change first (so nothing reads or writes the column any more), then drop the column in a subsequent migration. This avoids downtime from deployment ordering.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Looks Like Across Projects
&lt;/h2&gt;

&lt;p&gt;In vatnode.dev — RLS for tenant isolation on validation logs, partial indexes on &lt;code&gt;deleted_at&lt;/code&gt; and &lt;code&gt;status&lt;/code&gt;, UUID v4 for all API-exposed IDs (vatnode predates my switch to v7; new projects use v7), PgBouncer in front of Postgres. The schema has 11 tables; migrations are applied automatically on each Coolify deployment.&lt;/p&gt;

&lt;p&gt;In pikkuna.fi — JSONB for storing raw PostNord and Zoho API responses alongside structured order data, &lt;code&gt;FOR UPDATE&lt;/code&gt; transactions for order status transitions, &lt;code&gt;updated_at&lt;/code&gt; triggers on every table. 100% automated order pipeline means concurrent webhooks hit the same order rows — without locking, you get races.&lt;/p&gt;

&lt;p&gt;In pi-pi.ee — soft deletes on products and categories (store operators want to recover things), covering indexes on the orders table for the admin dashboard queries, all timestamps UTC.&lt;/p&gt;




&lt;p&gt;Good PostgreSQL architecture is mostly boring: stable IDs, explicit ownership, predictable migrations, and indexes that match the queries you actually run. The value isn't in any individual pattern — it's in never having to debug the class of problem each one prevents.&lt;/p&gt;

&lt;p&gt;These patterns are not exotic. They're the defaults I reach for because skipping them has caused me real problems in the past. ID enumeration, missing audit trails, oversized connection pools, schema drift: each is a class of incident that the corresponding pattern is designed to rule out.&lt;/p&gt;

&lt;p&gt;If you're &lt;a href="https://iurii.rogulia.fi/services/mvp-development" rel="noopener noreferrer"&gt;building a SaaS or API-backed product&lt;/a&gt; and want a database foundation that holds under production load — &lt;a href="https://iurii.rogulia.fi/contact" rel="noopener noreferrer"&gt;get in touch&lt;/a&gt;. I'm available for freelance projects and longer-term engagements.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Related reading:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/blog/uuid-v7-ulid-nanoid" rel="noopener noreferrer"&gt;UUID v7, ULID, and NanoID: Which One Should You Use?&lt;/a&gt; — deeper look at ID strategies and insert performance&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/blog/stripe-webhooks-production" rel="noopener noreferrer"&gt;Stripe Webhooks Done Right: Production Architecture&lt;/a&gt; — idempotency patterns that depend on the PostgreSQL foundation above&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/blog/build-saas-nextjs-checklist" rel="noopener noreferrer"&gt;Production SaaS Checklist: Launch in 8 Weeks With Next.js&lt;/a&gt; — the full checklist that includes database setup&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/projects/vatnode-vat-validation" rel="noopener noreferrer"&gt;Vatnode VAT Validation SaaS&lt;/a&gt; — 11-table schema running these patterns in production&lt;/li&gt;
&lt;li&gt;&lt;a href="https://orm.drizzle.team/docs/overview" rel="noopener noreferrer"&gt;Drizzle ORM documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.postgresql.org/docs/current/ddl-rowsecurity.html" rel="noopener noreferrer"&gt;PostgreSQL official documentation — Row Security Policies&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.pgbouncer.org/config.html" rel="noopener noreferrer"&gt;PgBouncer documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>typescript</category>
      <category>node</category>
      <category>drizzle</category>
      <category>postgres</category>
    </item>
    <item>
      <title>H-1B Salary Slip Fraud Detection: Altered Indian Payslips at the Offer Stage</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Sat, 27 Jun 2026 10:41:33 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/h-1b-salary-slip-fraud-detection-altered-indian-payslips-at-the-offer-stage-487p</link>
      <guid>https://dev.to/iurii_rogulia/h-1b-salary-slip-fraud-detection-altered-indian-payslips-at-the-offer-stage-487p</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;Originally published at &lt;a href="https://htpbe.tech/blog/h1b-indian-salary-slip-fraud" rel="noopener noreferrer"&gt;htpbe.tech&lt;/a&gt;. The version on htpbe.tech stays in sync with the latest detection algorithm — refer to it for the canonical text.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;A mid-sized US fintech company extends an offer to a senior engineer based in Bangalore. The candidate submits three months of payslips from their current employer, a well-known Indian IT services firm, showing a monthly CTC of ₹18 lakhs. The offer is set accordingly. Three weeks later, during background screening, the background verification (BGV) vendor contacts the employer and receives no response. The candidate explains that Indian companies routinely ignore BGV inquiries and submits an employment confirmation letter instead.&lt;/p&gt;

&lt;p&gt;The actual salary on those payslips was ₹11 lakhs. The candidate had opened the PDFs in a browser-based editor, changed the compensation figures, and re-exported them. Everything else — the employer letterhead, the PF deductions, the company logo — was authentic.&lt;/p&gt;

&lt;p&gt;This is the dominant salary fraud vector in Indian-to-US hiring pipelines. H-1B salary slip fraud detection is not part of most offer workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why H-1B Salary Slip Fraud Detection Fails in Standard BGV Workflows
&lt;/h2&gt;

&lt;p&gt;Indian background screening has a structural gap that enables document fraud: employer non-response rates on employment and compensation verification requests routinely run between 15% and 30%, according to data from major BGV vendors. Indian employers, particularly mid-size IT services companies and startups, frequently do not maintain dedicated BGV response teams. HR contacts leave. Emails go unanswered for weeks.&lt;/p&gt;

&lt;p&gt;When a BGV vendor cannot confirm compensation, the candidate is typically asked to provide supporting documentation: payslips, an offer letter, or a compensation certificate. This is a reasonable fallback. It is also the point where document fraud enters the workflow.&lt;/p&gt;

&lt;p&gt;The candidate already has their real payslips. They know the exact format, the correct deductions line, the PF and ESI structure. Editing the compensation figure and re-exporting takes under ten minutes. The resulting document looks identical to a genuine payslip because it is a genuine payslip — with one number changed.&lt;/p&gt;

&lt;p&gt;US immigration counsel reviewing documents for H-1B prevailing wage compliance, and HR teams calibrating offers against claimed compensation, work from those figures. If the underlying documents have been altered, the entire wage analysis is based on false data.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the Altered Payslip Actually Contains
&lt;/h2&gt;

&lt;p&gt;Indian payroll is dominated by a small number of software platforms: Keka, GreytHR, Darwinbox, Zoho Payroll, and SAP SuccessFactors for larger enterprises. Each platform generates PDFs with a recognizable structural signature — a specific &lt;code&gt;Producer&lt;/code&gt; value, a single-pass xref structure, and metadata that reflects an automated document generation event rather than a manual save.&lt;/p&gt;

&lt;p&gt;When a candidate downloads their genuine payslip and opens it in a PDF editor — Adobe Acrobat, Foxit, Smallpdf, ILovePDF, or even a browser’s built-in save-as function — the editing software appends its own changes to the file. In most cases it writes its own name into the &lt;code&gt;Producer&lt;/code&gt; field. It adds a new cross-reference entry recording the edit session. It updates the modification timestamp.&lt;/p&gt;

&lt;p&gt;The result is a file that claims to have been generated by Keka or GreytHR — because the original producer metadata is still present in the file’s early sections — but also carries a second production record from the editing tool. The modification timestamp is later than the document’s claimed issuance date. The edit history shows at least two sessions: the original payroll generation and the candidate’s edit.&lt;/p&gt;

&lt;p&gt;None of this is visible when you open the PDF. It is preserved in the file’s internal structure.&lt;/p&gt;

&lt;h2&gt;
  
  
  What HTPBE Detects
&lt;/h2&gt;

&lt;p&gt;When a salary slip with this edit pattern is submitted through HTPBE, the analysis reads those structural layers and returns a verdict based on what the file itself recorded.&lt;/p&gt;

&lt;p&gt;A realistic API response for an altered Indian payslip:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ck_7e2a91f4-bc34-4d7e-a108-3f5c8e012bd9"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"status"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"modified"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"high"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_markers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_EDITING_TOOL_FINGERPRINT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_MULTIPLE_REVISION_LAYERS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="s2"&gt;"HTPBE_DATES_DISAGREE"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creator"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Keka HR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"producer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Smallpdf"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"creation_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1738368000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"modification_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1745712000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"xref_count"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three signals are present in this response:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;HTPBE_EDITING_TOOL_FINGERPRINT&lt;/code&gt;&lt;/strong&gt; — The file’s internal records name two different software systems. Keka HR generated the original document. Smallpdf processed it afterward. Payroll software does not send finished payslips through Smallpdf. The mismatch indicates the file was reopened in a different tool after initial generation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;HTPBE_MULTIPLE_REVISION_LAYERS&lt;/code&gt;&lt;/strong&gt; — Three xref entries exist in the file. The first corresponds to the original payroll generation. The later entries record editing sessions. A genuine payslip from an automated payroll system has exactly one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;HTPBE_DATES_DISAGREE&lt;/code&gt;&lt;/strong&gt; — The modification timestamp is nearly three months after the creation date. A payslip generated once by a payroll system has matching creation and modification times. A gap of months indicates the file was opened and resaved long after it was originally issued.&lt;/p&gt;

&lt;p&gt;Together, these markers produce a &lt;code&gt;modified&lt;/code&gt; verdict. No comparison against an original is needed — the file’s own structure records what happened.&lt;/p&gt;
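
&lt;p&gt;If you want to sanity-check a response on your side before routing it, the three signals can be restated from the fields shown above. This is a reader-side sketch, not HTPBE's detection logic; the function name and output keys are assumptions, and the sample values are copied from the example response.&lt;/p&gt;

```python
from datetime import datetime, timezone


def describe_signals(result: dict) -> dict:
    """Summarize the three signals using only fields present in the response."""
    created = datetime.fromtimestamp(result["creation_date"], tz=timezone.utc)
    modified = datetime.fromtimestamp(result["modification_date"], tz=timezone.utc)
    return {
        # Creator and producer name different software
        "tools_differ": result["creator"] != result["producer"],
        # Revisions beyond the single one an automated payroll export has
        "extra_revisions": result["xref_count"] - 1,
        # Whole days between creation and last modification
        "days_between_dates": (modified - created).days,
    }


sample = {
    "creator": "Keka HR",
    "producer": "Smallpdf",
    "creation_date": 1738368000,
    "modification_date": 1745712000,
    "xref_count": 3,
}

print(describe_signals(sample))
# {'tools_differ': True, 'extra_revisions': 2, 'days_between_dates': 85}
```

&lt;p&gt;The 85-day gap is the "nearly three months" described above. Treat the API's &lt;code&gt;status&lt;/code&gt; field as the verdict and a summary like this as context for the reviewer.&lt;/p&gt;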

&lt;h2&gt;
  
  
  The INCONCLUSIVE Case: Small Employers and Word-Generated Slips
&lt;/h2&gt;

&lt;p&gt;Not all Indian employers use structured payroll software. Many small and mid-size employers — boutique agencies, early-stage startups, family-owned businesses — generate payslips in Microsoft Word or Google Docs, export them to PDF, and send them by email. These documents have no institutional payroll metadata. They originate from consumer software.&lt;/p&gt;

&lt;p&gt;For these files, HTPBE returns &lt;code&gt;inconclusive&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;inconclusive&lt;/code&gt; is not a fraud flag. It means the document was created with consumer software, and there is no structural basis for checking whether it was modified after creation.&lt;/p&gt;

&lt;p&gt;The correct interpretation depends on the context:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If the candidate works at a known large employer&lt;/strong&gt; — Infosys, Wipro, TCS, HCL, Cognizant, or any company with structured payroll infrastructure — an &lt;code&gt;inconclusive&lt;/code&gt; verdict on their payslip is a significant flag. Those companies use enterprise payroll systems. Their payslips do not originate in Word. Escalate to BGV.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If the candidate works at a small employer or consultancy&lt;/strong&gt; where Word-generated slips are common, &lt;code&gt;inconclusive&lt;/code&gt; alone does not indicate fraud. Combine it with other signals: Is the document’s visual layout consistent? Are the PF/ESI deduction calculations correct? Does the employer size match the claimed compensation band? Is there corroborating documentation?&lt;/p&gt;

&lt;p&gt;The distinction matters in practice. Rejecting every &lt;code&gt;inconclusive&lt;/code&gt; document in an Indian hiring pipeline would produce a high false-positive rate for candidates from smaller employers. The more defensible approach is to treat &lt;code&gt;inconclusive&lt;/code&gt; as a signal that feeds a risk-scoring model, not as an automatic rejection.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Integrate This Into Your Offer and Verification Workflow
&lt;/h2&gt;

&lt;p&gt;The right time to run this check is at the offer stage, before the compensation figure is finalized and before H-1B sponsorship documentation is started.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Collect payslips as a URL or upload them to your document storage.&lt;/strong&gt;&lt;br&gt;
HTPBE analyzes PDFs by URL. When candidates submit payslips through your ATS portal or onboarding platform, generate a presigned URL from your document storage (S3, Azure Blob, Google Cloud Storage) and pass it to the &lt;a href="https://htpbe.tech/api" rel="noopener noreferrer"&gt;payslip fraud detection API&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Submit the URL to HTPBE.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://api.htpbe.tech/v1/analyze &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer YOUR_API_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"url": "https://your-storage.example.com/candidates/12345/payslip-jan.pdf"}'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 3: Route based on the response.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;route_payslip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;employer_is_large&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;modified&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Structural evidence of post-creation edit — escalate immediately
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate_to_hr&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inconclusive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;employer_is_large&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Large employer should not produce Word-origin payslips — escalate to BGV
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;escalate_to_bgv&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;inconclusive&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="c1"&gt;# Small employer context — flag for manual review, not auto-reject
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;manual_review&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="c1"&gt;# intact
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;proceed&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Store the HTPBE check ID alongside the candidate record. If a hiring decision is later challenged, the forensic report is retrievable from &lt;code&gt;GET /api/v1/result/{check_id}&lt;/code&gt; and provides an auditable basis for the decision.&lt;/p&gt;
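
&lt;p&gt;One possible shape for that stored record is sketched below. The response keys &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;status&lt;/code&gt;, and &lt;code&gt;modification_markers&lt;/code&gt; come from the example response earlier; the function name, the record layout, and the candidate ID format are assumptions, so adapt them to your own ATS schema.&lt;/p&gt;

```python
from datetime import datetime, timezone


def build_audit_record(candidate_id: str, result: dict, decision: str) -> dict:
    """Bundle the check ID, verdict, and routing decision for later audits."""
    return {
        "candidate_id": candidate_id,
        "htpbe_check_id": result["id"],
        "status": result["status"],
        "markers": list(result.get("modification_markers", [])),
        "decision": decision,
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


record = build_audit_record(
    "cand-12345",  # hypothetical candidate ID
    {
        "id": "ck_7e2a91f4-bc34-4d7e-a108-3f5c8e012bd9",
        "status": "modified",
        "modification_markers": ["HTPBE_DATES_DISAGREE"],
    },
    "escalate_to_hr",
)
print(record["htpbe_check_id"], record["decision"])
```

&lt;p&gt;Keeping the routing decision next to the check ID means a later review can see both what the file analysis reported and what your workflow did with it.&lt;/p&gt;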

&lt;h2&gt;
  
  
  The Cross-Border Pattern: UK Tier 2 and Australian 482 Visas
&lt;/h2&gt;

&lt;p&gt;The same structural fraud pattern appears in UK and Australian immigration workflows. Indian nationals applying for UK Skilled Worker visas (formerly Tier 2) and Australian Temporary Skill Shortage (subclass 482) visas must provide proof of current overseas earnings as part of the sponsorship and salary assessment process.&lt;/p&gt;

&lt;p&gt;The incentive is identical: higher claimed current earnings support a stronger case for the role’s salary level and the visa’s skills threshold. The method is identical: genuine payslip, edited compensation figure, re-exported PDF.&lt;/p&gt;

&lt;p&gt;UK Home Office and Australian Department of Home Affairs guidance both require sponsors to check that salary evidence is genuine. The structural forensic checks that detect this fraud pattern in US H-1B workflows apply directly to Tier 2 and 482 applications. If your organization sponsors visas across multiple jurisdictions, a single API integration covers all three workflows.&lt;/p&gt;

&lt;h2&gt;
  
  
  Who This Is For
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;US HR and talent acquisition teams&lt;/strong&gt; processing Indian candidates for H-1B sponsorship or US-based offers where salary history informs the offer. Run HTPBE at the document collection stage, before offer letters are finalized and before immigration counsel is engaged.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Background verification vendors&lt;/strong&gt; covering Indian employment. When primary-source verification fails and the candidate provides supporting payslip documentation, forensic analysis is the check that BGV currently lacks. The &lt;a href="https://htpbe.tech/fake-salary-slip-detection" rel="noopener noreferrer"&gt;fake salary slip detection page&lt;/a&gt; covers the full detection pattern for payslip fraud across geographies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Immigration compliance teams and counsel&lt;/strong&gt; reviewing compensation documentation for prevailing wage determinations and visa petitions. A &lt;code&gt;modified&lt;/code&gt; verdict on a submitted payslip is a material fact for the record. Document it before the petition is filed, not after.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;HR platforms and ATS vendors&lt;/strong&gt; looking to add document integrity checks to their candidate verification layer. The &lt;a href="https://htpbe.tech/use-cases/hr-hiring" rel="noopener noreferrer"&gt;HR and hiring solutions page&lt;/a&gt; covers how HTPBE integrates into recruiting workflows at the platform level.&lt;/p&gt;

&lt;p&gt;The API is available on the &lt;a href="https://htpbe.tech/auth/signup" rel="noopener noreferrer"&gt;Starter plan at $15/month&lt;/a&gt;. For BGV vendors and ATS platforms processing large volumes, the Growth and Pro plans support up to 1,500 checks per month.&lt;/p&gt;

</description>
      <category>pdf</category>
      <category>fintech</category>
      <category>fraud</category>
      <category>webdev</category>
    </item>
    <item>
      <title>BullMQ vs pg-boss vs Cron: Node.js Background Jobs Compared</title>
      <dc:creator>Iurii Rogulia</dc:creator>
      <pubDate>Sat, 27 Jun 2026 10:03:48 +0000</pubDate>
      <link>https://dev.to/iurii_rogulia/bullmq-vs-pg-boss-vs-cron-nodejs-background-jobs-compared-iej</link>
      <guid>https://dev.to/iurii_rogulia/bullmq-vs-pg-boss-vs-cron-nodejs-background-jobs-compared-iej</guid>
      <description>&lt;p&gt;Your order confirmation email is sometimes delayed. Your nightly report occasionally runs twice. Your integration with the third-party API times out and nobody notices until a customer complains three days later.&lt;/p&gt;

&lt;p&gt;These are background job problems. And the solution is almost always "you need a proper job queue" — but that phrase hides a decision with real infrastructure consequences. Redis-backed or Postgres-backed? A cron schedule or event-driven workers? Retries with backoff or fire-and-forget?&lt;/p&gt;

&lt;p&gt;I have dealt with this across several production systems: on &lt;a href="https://iurii.rogulia.fi/projects/vatnode-vat-validation" rel="noopener noreferrer"&gt;vatnode.dev&lt;/a&gt;, where BullMQ workers handle subscription logic triggered by Stripe webhooks, and on &lt;a href="https://iurii.rogulia.fi/projects/pikkuna-ecommerce-platform" rel="noopener noreferrer"&gt;pikkuna.fi&lt;/a&gt;, where the full order pipeline — CRM update, shipment creation, invoice generation, email — runs through an async worker chain. Here is how I make that decision.&lt;/p&gt;

&lt;h2&gt;
  
  
  Three Levels of Background Work
&lt;/h2&gt;

&lt;p&gt;Before comparing tools, it helps to be precise about what "background job" actually means in your case.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 1: Scheduled tasks, single process, low stakes.&lt;/strong&gt; You need to send a weekly digest email, refresh a cache at 6 AM, or clean up expired sessions every hour. The work is triggered by time, not events. If it fails, it will run again in an hour. You are running one app instance.&lt;/p&gt;

&lt;p&gt;This is what &lt;code&gt;node-cron&lt;/code&gt; or an OS-level cron is for. Bringing in Redis and a queue worker for this is over-engineering.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 2: Event-driven work that needs retries, deduplication, or concurrency control.&lt;/strong&gt; A Stripe webhook fires, you need to create an order, update a CRM, and send a confirmation email. Any of these steps can fail. You might be running multiple app instances. You need to guarantee delivery and avoid duplicates.&lt;/p&gt;

&lt;p&gt;This is what a job queue is for. Either BullMQ or pg-boss applies here.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Level 3: High-throughput processing pipelines.&lt;/strong&gt; You are ingesting events at thousands per minute, running parallel transformations, or building complex job dependency graphs (fan-out, fan-in, rate-limited sub-queues). Sub-second job pickup latency matters.&lt;/p&gt;

&lt;p&gt;This is where BullMQ (Redis-backed) becomes the right tool over Postgres alternatives.&lt;/p&gt;

&lt;p&gt;Most applications are Level 1 or Level 2. I will focus there.&lt;/p&gt;
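
&lt;p&gt;As a rough sketch, the three levels can be written down as a small classifier. Everything here is illustrative: the &lt;code&gt;Workload&lt;/code&gt; shape, the &lt;code&gt;classifyWorkload&lt;/code&gt; name, and the 1,000 jobs-per-minute cutoff are assumptions standing in for "thousands per minute", not measured limits.&lt;/p&gt;

```typescript
// Hypothetical names throughout; the thresholds mirror the prose above
// and are not hard limits.
type Workload = {
  trigger: "time" | "event";
  appInstances: number;
  needsRetries: boolean;
  jobsPerMinute: number;
  needsJobGraphs: boolean; // fan-out, fan-in, rate-limited sub-queues
};

type Level = 1 | 2 | 3;

function classifyWorkload(w: Workload): Level {
  // Level 3: sustained high throughput or complex dependency graphs.
  if (w.jobsPerMinute >= 1000 || w.needsJobGraphs) {
    return 3;
  }
  // Level 2: event-driven work, multiple instances, or retry requirements.
  if (w.trigger === "event" || w.appInstances > 1 || w.needsRetries) {
    return 2;
  }
  // Level 1: time-triggered, single instance, fine to wait for the next run.
  return 1;
}

const toolForLevel = {
  1: "node-cron or OS cron",
  2: "job queue (BullMQ or pg-boss)",
  3: "BullMQ",
};

const weeklyDigest: Workload = {
  trigger: "time",
  appInstances: 1,
  needsRetries: false,
  jobsPerMinute: 1,
  needsJobGraphs: false,
};

console.log(toolForLevel[classifyWorkload(weeklyDigest)]); // node-cron or OS cron
```

&lt;p&gt;The order of the checks matters: a workload that qualifies for Level 3 is classified there even if it also matches Level 2 criteria.&lt;/p&gt;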

&lt;h2&gt;
  
  
  Simple Cron: When It Is Actually Fine
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;node-cron&lt;/code&gt; and similar packages are underrated for genuinely simple use cases. The implementation is minimal:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// lib/scheduler.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;cron&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;node-cron&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/db&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;sessions&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/db/schema&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;lt&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;drizzle-orm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;startScheduler&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Run every day at 3 AM&lt;/span&gt;
  &lt;span class="nx"&gt;cron&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;schedule&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;0 3 * * *&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;deleted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;lt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;expiresAt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Date&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
        &lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;returning&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;sessions&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;

      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Cleaned up &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;deleted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; expired sessions`&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Session cleanup failed:&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Call &lt;code&gt;startScheduler()&lt;/code&gt; from your app entrypoint and you are done.&lt;/p&gt;

&lt;p&gt;These are the reasons to move to a queue:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No distribution.&lt;/strong&gt; If you deploy two instances, both run the cron — your cleanup job runs twice. For idempotent cleanup that's harmless; for an email send, it isn't.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No retries.&lt;/strong&gt; If the database is temporarily unavailable, the job throws and nothing re-runs it until the next scheduled interval.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No visibility.&lt;/strong&gt; You cannot see which jobs ran, which failed, or how long they took.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are not reasons to avoid cron — they are the checklist for deciding whether cron is sufficient. If none of these matter for your use case, keep it simple.&lt;/p&gt;
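
&lt;p&gt;If duplicate runs across instances are the only gap, one common stop-gap before adopting a queue is a lease: each instance tries to claim the run, and only the winner executes. The sketch below is a minimal model, assuming an in-memory object as a stand-in for a shared store (in practice a Postgres row or a Redis key, where the check-and-set must be atomic). &lt;code&gt;LeaseStore&lt;/code&gt; and &lt;code&gt;tryAcquireLease&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```typescript
// In-memory stand-in for a shared store. All names here are hypothetical.
type Lease = { owner: string; expiresAt: number };
type LeaseStore = { [jobName: string]: Lease | undefined };

function tryAcquireLease(
  store: LeaseStore,
  jobName: string,
  owner: string,
  ttlMs: number,
  now: number
): boolean {
  const current = store[jobName];
  const expired = current === undefined || now >= current.expiresAt;
  if (!expired) {
    return false; // an unexpired lease exists: skip this run
  }
  // In a real shared store this read-then-write must be a single atomic step.
  store[jobName] = { owner, expiresAt: now + ttlMs };
  return true;
}

const leases: LeaseStore = {};
const startedAt = Date.now();

// Two instances fire the same cron tick at the same moment.
const instanceA = tryAcquireLease(leases, "session-cleanup", "instance-a", 60_000, startedAt);
const instanceB = tryAcquireLease(leases, "session-cleanup", "instance-b", 60_000, startedAt);

console.log(instanceA, instanceB); // true false
```

&lt;p&gt;This only addresses duplicate runs. Retries and visibility are still missing, which is where a queue starts to pay for itself.&lt;/p&gt;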

&lt;h2&gt;
  
  
  BullMQ: The Redis-Backed Standard
&lt;/h2&gt;


&lt;p&gt;&lt;a href="https://docs.bullmq.io/" rel="noopener noreferrer"&gt;BullMQ&lt;/a&gt; is the most widely used job queue in the Node.js ecosystem. It uses Redis as its backing store and provides retries with configurable backoff, job prioritization, concurrency control, delayed jobs, repeating jobs, job dependencies, and a solid UI via &lt;a href="https://github.com/felixmosh/bull-board" rel="noopener noreferrer"&gt;Bull Board&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here is the basic setup. A queue for enqueueing, a worker for processing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// lib/queues/order-queue.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Worker&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;Job&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bullmq&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;@/lib/redis&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// ioredis instance&lt;/span&gt;

&lt;span class="kr"&gt;interface&lt;/span&gt; &lt;span class="nx"&gt;OrderJobData&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nl"&gt;customerId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;// The queue — used by your API/webhook handlers to enqueue work&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orderQueue&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Queue&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;OrderJobData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order-processing&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;defaultJobOptions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;exponential&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// retries after 3s, 6s, 12s (4 attempts = 3 retries)&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;removeOnComplete&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;count&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;removeOnFail&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Keep failed jobs for inspection&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// The worker — runs in a separate process or alongside your app&lt;/span&gt;
&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;orderWorker&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;Worker&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;OrderJobData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order-processing&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Job&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nx"&gt;OrderJobData&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;customerId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateProgress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;syncOrderToCrm&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateProgress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;40&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;createShipment&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateProgress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;70&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;generateInvoice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;customerId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateProgress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendConfirmationEmail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;customerId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;updateProgress&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;100&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;processed&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;connection&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;redis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="c1"&gt;// Process up to 3 orders simultaneously&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="nx"&gt;orderWorker&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
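  &lt;span class="c1"&gt;// "failed" fires on every failed attempt, not only the last one&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;`Order job &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt; failed on attempt &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;?.&lt;/span&gt;&lt;span class="nx"&gt;attemptsMade&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;:`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;err&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// Once job.attemptsMade reaches the attempts limit, alert your team (Telegram, Slack, PagerDuty, whatever you use)&lt;/span&gt;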
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
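
&lt;p&gt;The retry schedule implied by &lt;code&gt;attempts&lt;/code&gt; and exponential backoff can be worked out by hand. The sketch below assumes that &lt;code&gt;attempts&lt;/code&gt; counts the first try and that the delay before retry &lt;em&gt;n&lt;/em&gt; is the base delay times 2 to the power of (&lt;em&gt;n&lt;/em&gt; - 1), which is how BullMQ documents its built-in exponential strategy. &lt;code&gt;retryDelaysMs&lt;/code&gt; is a hypothetical helper, not part of BullMQ.&lt;/p&gt;

```typescript
// Hypothetical helper: lists the wait before each retry.
// Assumes `attempts` includes the first try, so there are attempts - 1 retries.
function retryDelaysMs(attempts: number, baseDelayMs: number): number[] {
  const retries = Math.max(0, attempts - 1);
  return Array.from({ length: retries }, (_, i) => baseDelayMs * 2 ** i);
}

// Worst-case time a job spends waiting between attempts before it finally fails.
function totalBackoffMs(attempts: number, baseDelayMs: number): number {
  return retryDelaysMs(attempts, baseDelayMs).reduce((sum, d) => sum + d, 0);
}

console.log(retryDelaysMs(4, 3000)); // [ 3000, 6000, 12000 ]
console.log(totalBackoffMs(4, 3000)); // 21000
```

&lt;p&gt;The total is worth checking against your alerting expectations: with these settings a permanently failing job takes at least 21 seconds of backoff, plus processing time, before it lands in the failed set.&lt;/p&gt;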



&lt;p&gt;Enqueueing from a Stripe webhook handler is a single line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orderQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;process-order&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;customerId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;jobId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`order-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="c1"&gt;// Ignored if a job with this ID still exists in the queue&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



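&lt;p&gt;The &lt;code&gt;jobId&lt;/code&gt; is how BullMQ handles deduplication: if a job with that ID still exists in the queue, whether waiting, active, delayed, or finished but not yet removed, the new enqueue is silently ignored. This is useful for webhook retries, but the protection ends as soon as the job is removed, which with the &lt;code&gt;removeOnComplete&lt;/code&gt; setting above happens once 500 newer jobs have completed. To guard against late duplicates you still need a database-level idempotency check.&lt;/p&gt;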
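
&lt;p&gt;A database-level idempotency check can be as small as "claim the key or skip". The sketch below is a minimal model under stated assumptions: &lt;code&gt;ProcessedKeys&lt;/code&gt; is a hypothetical in-memory stand-in for a table with a unique constraint on the key (for example an &lt;code&gt;INSERT ... ON CONFLICT DO NOTHING&lt;/code&gt; in Postgres), and &lt;code&gt;handleOrderPaid&lt;/code&gt; is an illustrative handler, not code from the projects mentioned above.&lt;/p&gt;

```typescript
// Hypothetical stand-in for a table with a unique constraint on `key`.
class ProcessedKeys {
  private seen: { [key: string]: true } = Object.create(null);

  // Returns true only for the first caller that claims the key.
  claim(key: string): boolean {
    if (this.seen[key]) {
      return false;
    }
    this.seen[key] = true;
    return true;
  }
}

const processed = new ProcessedKeys();
const sentEmails: string[] = [];

function handleOrderPaid(orderId: string): "processed" | "duplicate" {
  if (!processed.claim(`order-paid:${orderId}`)) {
    return "duplicate"; // e.g. the webhook was redelivered after the job was removed
  }
  sentEmails.push(orderId); // stands in for the real side effect
  return "processed";
}

console.log(handleOrderPaid("ord_1")); // processed
console.log(handleOrderPaid("ord_1")); // duplicate
```

&lt;p&gt;In a real handler the claim and the side effect should share a transaction where possible, so a crash between the two does not leave a key claimed with no work done.&lt;/p&gt;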

&lt;p&gt;&lt;strong&gt;What BullMQ does well:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Job prioritization (numeric priority field on each job)&lt;/li&gt;
&lt;li&gt;Delayed jobs — schedule a follow-up email 3 days after sign-up&lt;/li&gt;
&lt;li&gt;Repeating jobs with cron expressions — a proper replacement for &lt;code&gt;node-cron&lt;/code&gt; that works across multiple instances&lt;/li&gt;
&lt;li&gt;Job dependencies with &lt;code&gt;FlowProducer&lt;/code&gt; — fan-out, fan-in, pipelines&lt;/li&gt;
&lt;li&gt;Rich UI with Bull Board&lt;/li&gt;
&lt;li&gt;Large community, mature documentation&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to watch out for:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Redis is a separate infrastructure component to manage. If Redis goes down, job processing stops: nothing can be enqueued or picked up until it recovers, and jobs already in the queue survive a restart only if Redis persistence (AOF or RDB snapshots) is enabled. For most applications this is acceptable; for a payment pipeline it means you need Redis HA (Redis Sentinel or Cluster, or a managed service like Upstash).&lt;/p&gt;

&lt;p&gt;The other thing: BullMQ workers are long-running processes. In a Next.js deployment, you need a separate worker process. On vatnode, the &lt;a href="https://iurii.rogulia.fi/blog/turborepo-nextjs-hono-monorepo" rel="noopener noreferrer"&gt;Turborepo monorepo&lt;/a&gt; has a dedicated &lt;code&gt;apps/worker&lt;/code&gt; package that runs alongside the Next.js app in the same container.&lt;/p&gt;

&lt;h2&gt;
  
  
  pg-boss: Postgres as Your Queue
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/timgit/pg-boss" rel="noopener noreferrer"&gt;pg-boss&lt;/a&gt; takes a different approach: it uses Postgres as the queue backing store, maintaining a &lt;code&gt;pgboss&lt;/code&gt; schema with job tables, state transitions, and indexes. No Redis required.&lt;/p&gt;

&lt;p&gt;The key insight: if you already use Postgres, you can enqueue a job in the same database transaction as your business logic. This gives you ACID guarantees that Redis cannot match.&lt;/p&gt;

&lt;p&gt;Consider this scenario: you receive a payment confirmation and need to (1) update the order status and (2) enqueue a job to send a confirmation email. With Redis-backed queues:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Without transactional enqueue — there's a window where the DB write succeeds&lt;/span&gt;
&lt;span class="c1"&gt;// but the queue enqueue fails, and the email never gets sent&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;paid&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orderQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;send-confirmation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt; &lt;span class="c1"&gt;// This can fail independently&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With pg-boss, both operations happen in the same Postgres transaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// lib/queues/pg-boss-queue.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="nx"&gt;PgBoss&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pg-boss&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;boss&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;PgBoss&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;DATABASE_URL&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;boss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;start&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

&lt;span class="c1"&gt;// In your payment handler — inside a transaction&lt;/span&gt;
&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;transaction&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;status&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;paid&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;}).&lt;/span&gt;&lt;span class="nf"&gt;where&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;eq&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orders&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;

  &lt;span class="c1"&gt;// Enqueue through the transaction's connection so it is atomic with the DB write.&lt;/span&gt;
  &lt;span class="c1"&gt;// pg-boss accepts a `db` option exposing executeSql(text, values); wire it to the&lt;/span&gt;
  &lt;span class="c1"&gt;// raw pg client from the active transaction, not the ORM wrapper&lt;/span&gt;
  &lt;span class="c1"&gt;// (e.g. tx.client with node-postgres, or the equivalent your adapter exposes).&lt;/span&gt;
  &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;boss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sendOnce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;send-confirmation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;db&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;executeSql&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;query&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;values&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="s2"&gt;`confirmation-&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt; &lt;span class="c1"&gt;// Deduplication key (fourth argument)&lt;/span&gt;
  &lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the transaction rolls back, the job is also rolled back. The email cannot be sent for an order that does not exist in the database. This is the killer feature of Postgres-backed queues.&lt;/p&gt;
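
&lt;p&gt;The all-or-nothing behaviour is easier to see in a toy model. The sketch below is not pg-boss code: &lt;code&gt;State&lt;/code&gt; and &lt;code&gt;transaction&lt;/code&gt; are hypothetical names, and the "database" is a plain object. It only illustrates the property Postgres provides for real when the job row and the order row live in the same database.&lt;/p&gt;

```typescript
// A toy model of "business write and enqueue commit together or not at all".
type State = {
  orders: { [id: string]: string }; // orderId -> status
  jobs: { name: string; orderId: string }[];
};

function transaction(state: State, work: (draft: State) => void): State {
  // Work on a copy; the caller only sees it if `work` finishes without throwing.
  const draft: State = {
    orders: { ...state.orders },
    jobs: [...state.jobs],
  };
  work(draft); // a throw here propagates and the original state is untouched
  return draft;
}

let state: State = { orders: { ord_1: "pending" }, jobs: [] };

// Commit: the status change and the job appear together.
state = transaction(state, (tx) => {
  tx.orders["ord_1"] = "paid";
  tx.jobs.push({ name: "send-confirmation", orderId: "ord_1" });
});

// Rollback: the failure discards both the order write and the enqueue.
try {
  state = transaction(state, (tx) => {
    tx.orders["ord_2"] = "paid";
    tx.jobs.push({ name: "send-confirmation", orderId: "ord_2" });
    throw new Error("payment verification failed");
  });
} catch {
  // nothing to clean up: no order row and no job for ord_2
}

console.log(state.jobs.length); // 1
console.log(state.orders["ord_2"]); // undefined
```

&lt;p&gt;With a Redis-backed queue the two writes live in different systems, so no single rollback covers both. That gap is what patterns like the transactional outbox exist to close.&lt;/p&gt;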

&lt;p&gt;Setting up a worker in pg-boss:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;boss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;work&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;send-confirmation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;teamSize&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;teamConcurrency&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;orderId&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;job&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;sendConfirmationEmail&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;orderId&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What pg-boss does well:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Transactional job enqueue — the main reason to choose it&lt;/li&gt;
&lt;li&gt;No additional infrastructure if you already have Postgres&lt;/li&gt;
&lt;li&gt;At-least-once delivery with configurable retry policy&lt;/li&gt;
&lt;li&gt;Job deduplication with &lt;code&gt;sendOnce&lt;/code&gt; and a key&lt;/li&gt;
&lt;li&gt;Scheduled and recurring jobs with cron syntax&lt;/li&gt;
&lt;li&gt;Reasonable throughput for most web applications&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;What to watch out for:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Postgres is not Redis. pg-boss workers poll the database for new jobs, so pickup latency is bounded by the polling interval (a couple of seconds by default, and tunable) rather than being sub-millisecond. For typical web application workloads, such as order processing, webhook handling, and scheduled emails, pg-boss handles the volume comfortably; for sustained high-throughput pipelines, BullMQ is the safer choice. The UI tooling is also sparse compared to BullMQ: if your team needs dashboards and visibility, you will need to build something yourself or query the &lt;code&gt;pgboss&lt;/code&gt; tables directly.&lt;/p&gt;

&lt;p&gt;Also: pg-boss adds tables to your Postgres instance. The schema is well-isolated (its own schema), but it is another thing in your database. Migrations and schema management become part of your normal deployment process.&lt;/p&gt;

&lt;h2&gt;
  
  
  Dead Letter Queues, Retries, and Deduplication
&lt;/h2&gt;

&lt;p&gt;These three features separate a reliable job queue from a fragile one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Retries with backoff&lt;/strong&gt;: both BullMQ and pg-boss support exponential backoff. The key settings are the maximum number of attempts and the delay policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// BullMQ&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;attempts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nl"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;exponential&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;delay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2000&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// attempts counts the first run too, so 5 attempts = 4 retries, delayed 2s, 4s, 8s, 16s&lt;/span&gt;

&lt;span class="c1"&gt;// pg-boss&lt;/span&gt;
&lt;span class="nx"&gt;boss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;my-job&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;retryLimit&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;retryDelay&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;retryBackoff&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="c1"&gt;// retryDelay is in seconds; retryBackoff enables exponential scaling&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Do not set retries to unlimited. A job that fails 50 times is either broken or pointing at a broken dependency — you want it to stop and alert you.&lt;/p&gt;
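
&lt;p&gt;Before settling on a retry policy, it can help to print the delay schedule it implies. The sketch below assumes each retry waits twice as long as the previous one, starting from the base delay, and that the first attempt runs immediately. &lt;code&gt;backoffSchedule&lt;/code&gt; is a hypothetical helper written for this example, not part of BullMQ or pg-boss.&lt;/p&gt;

```typescript
// Hypothetical helper: lists the wait before each retry, assuming the delay
// doubles every time (baseDelayMs * 2^(retry - 1)). Not a library API.
function backoffSchedule(totalAttempts: number, baseDelayMs: number): number[] {
  // The first attempt runs immediately, so there are totalAttempts - 1 retries.
  const retries = Math.max(totalAttempts - 1, 0);
  return Array.from({ length: retries }, (_, i) => baseDelayMs * 2 ** i);
}

const schedule = backoffSchedule(5, 2000);
// Total time spent waiting if every attempt fails.
const worstCaseMs = schedule.reduce((sum, delay) => sum + delay, 0);

console.log(schedule);    // [ 2000, 4000, 8000, 16000 ]
console.log(worstCaseMs); // 30000
```

&lt;p&gt;The worst-case total is the number to compare against how long a downstream outage typically lasts: if the dependency is usually down for minutes, a 30-second retry window will exhaust itself too early.&lt;/p&gt;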

&lt;p&gt;&lt;strong&gt;Dead letter queues&lt;/strong&gt;: in BullMQ, jobs that exhaust all retries move to the &lt;code&gt;failed&lt;/code&gt; state. They stay there (if &lt;code&gt;removeOnFail: false&lt;/code&gt;) and you can inspect them via Bull Board or query Redis directly. In pg-boss, failed jobs stay in the job table with a &lt;code&gt;failed&lt;/code&gt; state and an &lt;code&gt;output&lt;/code&gt; column containing the error.&lt;/p&gt;

&lt;p&gt;In both cases, set up an alert on failed jobs. In production with BullMQ, I use a &lt;code&gt;worker.on('failed')&lt;/code&gt; listener that sends a Telegram message with the job ID and error. Five minutes of debugging a failed job is much better than discovering it three days later when a customer complains.&lt;/p&gt;
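
&lt;p&gt;A minimal sketch of such a listener is shown here. To keep it runnable without Redis, a plain Node.js &lt;code&gt;EventEmitter&lt;/code&gt; stands in for the BullMQ worker, and the alert is collected in an array instead of being sent anywhere. &lt;code&gt;FailedJobInfo&lt;/code&gt; and &lt;code&gt;formatFailureAlert&lt;/code&gt; are hypothetical names introduced for this example.&lt;/p&gt;

```typescript
import { EventEmitter } from "node:events";

// Fields this sketch reads from a failed job (an assumption; a real job object carries more).
interface FailedJobInfo {
  id?: string;
  name: string;
  attemptsMade: number;
}

// Hypothetical formatter: builds the text of the alert message.
function formatFailureAlert(job: FailedJobInfo | undefined, err: Error): string {
  if (job === undefined) return `Job failed (job data unavailable): ${err.message}`;
  return `Job ${job.name} #${job.id ?? "unknown"} failed after ${job.attemptsMade} attempt(s): ${err.message}`;
}

// Stand-in for a queue worker so the example runs on its own.
const worker = new EventEmitter();
const sentAlerts: string[] = [];

worker.on("failed", (job: FailedJobInfo | undefined, err: Error) => {
  // In production this would call your alerting channel instead.
  sentAlerts.push(formatFailureAlert(job, err));
});

// Simulate a job that ran out of retries.
worker.emit("failed", { id: "42", name: "process-order", attemptsMade: 5 }, new Error("Stripe API timeout"));
console.log(sentAlerts[0]);
// Job process-order #42 failed after 5 attempt(s): Stripe API timeout
```

&lt;p&gt;Keeping the formatting in a pure function makes the alert text easy to unit test, independent of the queue and the messaging service.&lt;/p&gt;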

&lt;p&gt;&lt;strong&gt;Deduplication&lt;/strong&gt;: BullMQ deduplicates on &lt;code&gt;jobId&lt;/code&gt;, and pg-boss uses &lt;code&gt;sendOnce&lt;/code&gt; with a user-defined key. The important distinction is how long the protection lasts. BullMQ ignores a new job only while a job with the same &lt;code&gt;jobId&lt;/code&gt; still exists in Redis, so once a finished job is removed (for example via &lt;code&gt;removeOnComplete&lt;/code&gt;), the same ID can be enqueued again. pg-boss's &lt;code&gt;sendOnce&lt;/code&gt; key deduplicates within a configurable retention window. Neither window is permanent, so for most webhook-driven workflows, database-level idempotency on top of queue-level deduplication is the right approach regardless of which queue you use.&lt;/p&gt;
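
&lt;p&gt;The database-level check can be sketched as follows. An in-memory &lt;code&gt;Set&lt;/code&gt; stands in for a table with a unique constraint on the event ID, and a counter stands in for the real side effect. &lt;code&gt;handleOrderPaid&lt;/code&gt; and the event IDs are hypothetical.&lt;/p&gt;

```typescript
// In-memory stand-in for a table with a UNIQUE constraint on the event ID.
const processedEventIds = new Set([] as string[]);
let emailsSent = 0;

// Hypothetical handler: performs the side effect at most once per event ID,
// no matter how many times the queue delivers the job.
function handleOrderPaid(eventId: string): boolean {
  if (processedEventIds.has(eventId)) return false; // duplicate delivery: skip
  processedEventIds.add(eventId);
  emailsSent += 1; // stands in for sending the confirmation email
  return true;
}

console.log(handleOrderPaid("evt_123")); // true  (first delivery does the work)
console.log(handleOrderPaid("evt_123")); // false (retry of the same event is ignored)
console.log(emailsSent);                 // 1
```

&lt;p&gt;With a real database, the check and the write should be a single atomic statement (an insert that fails or does nothing on conflict), because a separate read followed by a write can race when two workers pick up the same event.&lt;/p&gt;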

&lt;h2&gt;
  
  
  Monitoring: What Actually Matters
&lt;/h2&gt;

&lt;p&gt;Queue depth, failed job count, and job processing duration are the three numbers that tell you whether your background processing is healthy.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// BullMQ — get queue metrics&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;counts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;orderQueue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getJobCounts&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;waiting&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;active&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;completed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;failed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;delayed&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// { waiting: 0, active: 2, completed: 1847, failed: 3, delayed: 0 }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For pg-boss:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Check queue state directly in Postgres&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;pgboss&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;createdon&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;INTERVAL&lt;/span&gt; &lt;span class="s1"&gt;'24 hours'&lt;/span&gt;
&lt;span class="k"&gt;GROUP&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;state&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The metric that surprises people most is &lt;strong&gt;job age&lt;/strong&gt; — how long a job sits in the waiting state before a worker picks it up. If this grows, you either have too few workers or your workers are blocked on slow downstream calls. Increase &lt;code&gt;concurrency&lt;/code&gt; in your worker config or add more worker instances.&lt;/p&gt;
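
&lt;p&gt;Job age can be derived from the creation time of the oldest waiting job. The sketch below assumes each job carries a creation &lt;code&gt;timestamp&lt;/code&gt; in epoch milliseconds; &lt;code&gt;WaitingJob&lt;/code&gt; and &lt;code&gt;oldestJobAgeMs&lt;/code&gt; are hypothetical names, and the sample timestamps are made up.&lt;/p&gt;

```typescript
// Minimal shape assumed for a waiting job: creation time in epoch milliseconds.
interface WaitingJob {
  id: string;
  timestamp: number;
}

// Hypothetical helper: age in ms of the oldest waiting job, or 0 for an empty queue.
function oldestJobAgeMs(waiting: WaitingJob[], nowMs: number): number {
  if (waiting.length === 0) return 0;
  const oldest = Math.min(...waiting.map((job) => job.timestamp));
  return Math.max(nowMs - oldest, 0);
}

const now = 1700000060000;
const ageMs = oldestJobAgeMs(
  [
    { id: "a", timestamp: 1700000000000 },
    { id: "b", timestamp: 1700000045000 },
  ],
  now,
);

console.log(ageMs); // 60000
```

&lt;p&gt;Alerting when this value stays above a threshold for several consecutive checks tends to be more useful than alerting on queue depth alone, since a deep queue that drains quickly is not a problem.&lt;/p&gt;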

&lt;p&gt;I track these metrics with a simple Prometheus exporter that reads BullMQ counts every 30 seconds and exposes them for Prometheus to scrape, with Grafana dashboards on top. The setup is about a hundred lines of code and has caught two incidents before they became customer-visible problems.&lt;/p&gt;
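
&lt;p&gt;The core of such an exporter is turning job counts into the Prometheus text exposition format. The following is a minimal sketch of that step only, not the exporter described above. The metric name &lt;code&gt;queue_jobs&lt;/code&gt;, the helper &lt;code&gt;toPrometheusText&lt;/code&gt;, and the queue name are assumptions made for the example.&lt;/p&gt;

```typescript
// Counts per job state, for example { waiting: 0, active: 2, failed: 3 }.
type JobCounts = { [state: string]: number };

// Hypothetical helper: renders counts in the Prometheus text exposition format.
// The metric name `queue_jobs` is an assumption made for this sketch.
function toPrometheusText(queueName: string, counts: JobCounts): string {
  const lines = [
    "# HELP queue_jobs Number of jobs per queue and state.",
    "# TYPE queue_jobs gauge",
  ];
  // Sort states so the output is stable between scrapes.
  for (const state of Object.keys(counts).sort()) {
    lines.push(`queue_jobs{queue="${queueName}",state="${state}"} ${counts[state]}`);
  }
  return lines.join("\n") + "\n";
}

console.log(toPrometheusText("orders", { waiting: 0, active: 2, failed: 3 }));
// # HELP queue_jobs Number of jobs per queue and state.
// # TYPE queue_jobs gauge
// queue_jobs{queue="orders",state="active"} 2
// queue_jobs{queue="orders",state="failed"} 3
// queue_jobs{queue="orders",state="waiting"} 0
```

&lt;p&gt;Serving this string from an HTTP endpoint is enough for Prometheus to scrape it; using one gauge with a &lt;code&gt;state&lt;/code&gt; label keeps dashboards and alert rules simple.&lt;/p&gt;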

&lt;h2&gt;
  
  
  Serverless Alternatives
&lt;/h2&gt;

&lt;p&gt;If you are on Vercel or another serverless platform, long-running worker processes are not an option. Two tools worth knowing:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://vercel.com/docs/cron-jobs" rel="noopener noreferrer"&gt;Vercel Cron&lt;/a&gt; — HTTP endpoints called on a schedule. Simple, integrated, works with the App Router. Fine for Level 1 scheduled tasks. Not suitable for event-driven work or complex retry logic.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.inngest.com/" rel="noopener noreferrer"&gt;Inngest&lt;/a&gt; — event-driven background functions with retries, delays, and step functions. Designed for serverless. The DX is excellent; the trade-off is a third-party dependency and pricing that scales with invocations.&lt;/p&gt;

&lt;p&gt;I do not use either in my current production stack because I run Node.js on VPS infrastructure with Docker, where long-running workers are straightforward. But if your deployment target is serverless and you need more than basic cron, Inngest is the most production-ready option I have evaluated.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Decision Matrix
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Single instance, scheduled tasks, low stakes&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;node-cron&lt;/code&gt; or OS cron&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multi-instance, scheduled tasks&lt;/td&gt;
&lt;td&gt;BullMQ repeatable jobs (replaces cron)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event-driven work, already using Postgres, want ACID enqueue&lt;/td&gt;
&lt;td&gt;pg-boss&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event-driven work, already using Redis, need rich UI&lt;/td&gt;
&lt;td&gt;BullMQ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;High throughput (thousands of jobs/minute)&lt;/td&gt;
&lt;td&gt;BullMQ&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Serverless deployment&lt;/td&gt;
&lt;td&gt;Inngest or Vercel Cron&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Complex job graphs (fan-out, dependencies)&lt;/td&gt;
&lt;td&gt;BullMQ FlowProducer&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The pg-boss vs BullMQ decision usually comes down to whether you already have Redis in your stack. If you do — BullMQ is the obvious choice. If you do not, and your throughput is under a few hundred jobs per minute, pg-boss lets you skip a Redis dependency entirely without meaningful trade-offs for typical web application workloads.&lt;/p&gt;

&lt;p&gt;On vatnode, I chose BullMQ because Redis was already there for rate limiting and caching. Adding BullMQ cost zero new infrastructure. On a greenfield project without Redis, I would evaluate pg-boss seriously.&lt;/p&gt;

&lt;h2&gt;
  
  
  What This Looks Like in Production
&lt;/h2&gt;

&lt;p&gt;On vatnode, BullMQ workers handle subscription lifecycle jobs triggered by Stripe webhooks — plan activations, usage resets, cancellation workflows. Worker concurrency is set to 5, which is enough to handle burst traffic during billing cycles without saturating the Postgres connection pool. Failed jobs go to the &lt;code&gt;failed&lt;/code&gt; state and trigger a Telegram alert; I have had fewer than 10 permanent failures in the past six months, all due to temporary Stripe API unavailability.&lt;/p&gt;

&lt;p&gt;On &lt;a href="https://iurii.rogulia.fi/projects/pikkuna-ecommerce-platform" rel="noopener noreferrer"&gt;pikkuna.fi&lt;/a&gt;, the order processing chain runs through a BullMQ worker that calls Zoho CRM, PostNord, Netvisor, and Mailgun in sequence — a pattern I also apply to any &lt;a href="https://iurii.rogulia.fi/services/api-integrations" rel="noopener noreferrer"&gt;API integration work&lt;/a&gt; where third-party calls need retry logic and visibility. The full chain — from Stripe webhook to sent confirmation email — completes in under 2 minutes. Before this architecture, intermittent failures in one integration would silently break the rest of the chain with no visibility. Now each step is retried independently and failed steps appear in the monitoring dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;BullMQ vs pg-boss: which should I choose?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The decision usually comes down to your existing stack. If you already have Redis for rate limiting or caching, BullMQ is the obvious choice: zero new infrastructure, rich UI via Bull Board, high throughput. If you have Postgres but no Redis, pg-boss lets you skip a Redis dependency entirely with reasonable throughput for typical web application workloads. The killer feature of pg-boss is transactional job enqueue: the job and your DB write happen atomically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How does BullMQ handle failed jobs?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Failed jobs move to the &lt;code&gt;failed&lt;/code&gt; state after exhausting all retry attempts. If you set &lt;code&gt;removeOnFail: false&lt;/code&gt;, they stay in Redis and can be inspected via Bull Board or queried directly. Set up a &lt;code&gt;worker.on('failed')&lt;/code&gt; listener that alerts your team via Telegram, Slack, or PagerDuty. A job that fails 50 times is either broken or pointing at a broken dependency: cap your retry attempts and alert rather than looping indefinitely.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Can I use BullMQ or pg-boss on Vercel?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;No. Both require long-running worker processes, which serverless functions do not support. On Vercel, use Vercel Cron for simple scheduled tasks or Inngest for event-driven background work with retries and step functions. Inngest is the most production-ready serverless queue option I have evaluated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I prevent duplicate jobs from Stripe webhook retries?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In BullMQ, set a deterministic &lt;code&gt;jobId&lt;/code&gt; (e.g. &lt;code&gt;'order-' + orderId&lt;/code&gt;); if a job with that ID is already waiting or active, the new enqueue is ignored. In pg-boss, use &lt;code&gt;sendOnce()&lt;/code&gt; with a deduplication key. Both approaches cover active-window deduplication. For completed jobs, add a database-level idempotency check: query for an existing processed record before doing any work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How do I run BullMQ workers alongside a Next.js app?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In a standalone Next.js deployment, workers need a separate process. In my vatnode.dev Turborepo monorepo, there is a dedicated &lt;code&gt;apps/worker&lt;/code&gt; package that runs in the same Docker container alongside the Next.js app. The worker process starts independently and shares the Redis connection. For simple setups, you can also start workers from a separate entry point file invoked with &lt;code&gt;node&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;If you are &lt;a href="https://iurii.rogulia.fi/services/mvp-development" rel="noopener noreferrer"&gt;building a SaaS or e-commerce platform&lt;/a&gt; and running into the limits of synchronous request handling or simple cron jobs, you will hit exactly these decisions. I have implemented BullMQ-based pipelines across several production systems and can help you design an architecture that matches your actual throughput and reliability requirements.&lt;/p&gt;

&lt;p&gt;Background jobs are not about moving work out of the request. They are about making failure visible and recoverable.&lt;/p&gt;

&lt;p&gt;If you need a senior developer who can own background job infrastructure end-to-end — &lt;a href="https://iurii.rogulia.fi/contact" rel="noopener noreferrer"&gt;get in touch&lt;/a&gt;. I am available for freelance projects and long-term engagements.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/blog/stripe-webhooks-production" rel="noopener noreferrer"&gt;Stripe Webhooks Done Right: Production Architecture&lt;/a&gt; — how BullMQ fits into the webhook processing pipeline&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/projects/vatnode-vat-validation" rel="noopener noreferrer"&gt;Vatnode VAT Validation SaaS&lt;/a&gt; — BullMQ workers in production&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://iurii.rogulia.fi/projects/pikkuna-ecommerce-platform" rel="noopener noreferrer"&gt;Pikkuna E-commerce Platform&lt;/a&gt; — async order processing chain&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;External documentation:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.bullmq.io/" rel="noopener noreferrer"&gt;BullMQ documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/timgit/pg-boss" rel="noopener noreferrer"&gt;pg-boss documentation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://github.com/felixmosh/bull-board" rel="noopener noreferrer"&gt;Bull Board UI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://www.inngest.com/docs" rel="noopener noreferrer"&gt;Inngest documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>node</category>
      <category>typescript</category>
      <category>bullmq</category>
      <category>pgboss</category>
    </item>
  </channel>
</rss>
