<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: api</title>
    <description>The latest articles tagged 'api' on DEV Community.</description>
    <link>https://dev.to/t/api</link>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/tag/api"/>
    <language>en</language>
    <item>
      <title>H33-74: 74 Bytes of Tamper-Proof Attestation</title>
      <dc:creator>H33.ai</dc:creator>
      <pubDate>Wed, 29 Apr 2026 21:43:43 +0000</pubDate>
      <link>https://dev.to/h33ai/h33-74-74-bytes-of-tamper-proof-attestation-39mg</link>
      <guid>https://dev.to/h33ai/h33-74-74-bytes-of-tamper-proof-attestation-39mg</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Originally published at &lt;a href="https://v100.ai/blog/h33-74-substrate-attestation-explained.html" rel="noopener noreferrer"&gt;v100.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The first and only quantum-resistant video platform. ML-KEM-768 + ML-DSA-65 + FALCON-512.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://v100.ai/blog/h33-74-substrate-attestation-explained.html" rel="noopener noreferrer"&gt;Full article&lt;/a&gt; | &lt;a href="https://v100.ai/quantum/" rel="noopener noreferrer"&gt;v100.ai/quantum&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;V100 by &lt;a href="https://h33.ai" rel="noopener noreferrer"&gt;H33.ai&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>security</category>
      <category>api</category>
      <category>cryptography</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Anthropic just charged a developer $200 by mistake. Here's how flat-rate AI billing works instead.</title>
      <dc:creator>brian austin</dc:creator>
      <pubDate>Wed, 29 Apr 2026 21:28:37 +0000</pubDate>
      <link>https://dev.to/subprime2010/anthropic-just-charged-a-developer-200-by-mistake-heres-how-flat-rate-ai-billing-works-instead-1j8a</link>
      <guid>https://dev.to/subprime2010/anthropic-just-charged-a-developer-200-by-mistake-heres-how-flat-rate-ai-billing-works-instead-1j8a</guid>
      <description>&lt;h2&gt;The bug that cost $200&lt;/h2&gt;

&lt;p&gt;This week, a GitHub issue titled &lt;a href="https://github.com/anthropics/claude-code/issues/53262" rel="noopener noreferrer"&gt;HERMES.md&lt;/a&gt; went viral on Hacker News: an Anthropic billing bug silently charged a developer $200 extra — and Anthropic initially refused a refund.&lt;/p&gt;

&lt;p&gt;The thread now has 289 comments and 737 upvotes. Most of them are developers sharing their own billing anxiety stories.&lt;/p&gt;

&lt;p&gt;Here's a sample of what people are saying:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"I disabled my Claude Code usage because I was terrified of runaway charges."&lt;/p&gt;

&lt;p&gt;"Metered billing for AI tools is the new dark pattern."&lt;/p&gt;

&lt;p&gt;"I now check my Anthropic dashboard more than I check my bank account."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is not a bug story. This is a structural problem with how AI is billed.&lt;/p&gt;




&lt;h2&gt;The metered billing trap&lt;/h2&gt;

&lt;p&gt;Every major AI provider — OpenAI, Anthropic, Google Gemini — bills by the token. This means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A runaway agent loop can cost you hundreds overnight&lt;/li&gt;
&lt;li&gt;A misunderstood context window multiplies your bill silently&lt;/li&gt;
&lt;li&gt;A billing bug, like the HERMES.md case, is nearly impossible to audit after the fact&lt;/li&gt;
&lt;li&gt;You check your API dashboard like a gas pump meter, watching the numbers tick up&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is fine for enterprises with finance teams and spend alerts. It is brutal for individual developers, students, and freelancers.&lt;/p&gt;
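&lt;p&gt;To make the context-window bullet concrete, here is a minimal sketch of why a growing context inflates a metered bill: if every request re-sends the whole conversation, billed input tokens grow with the square of the number of turns. The function names and the per-million-token rate are illustrative assumptions, not any provider's actual pricing.&lt;/p&gt;

```python
def cumulative_input_tokens(turns, tokens_per_turn):
    """Total input tokens billed when every request re-sends the whole history."""
    # Turn k sends k * tokens_per_turn tokens, so the total is a triangular sum.
    return tokens_per_turn * turns * (turns + 1) // 2


def estimated_cost_usd(tokens, usd_per_million_tokens):
    """Convert a token count to dollars at a given (hypothetical) rate."""
    return tokens * usd_per_million_tokens / 1_000_000


if __name__ == "__main__":
    total = cumulative_input_tokens(turns=100, tokens_per_turn=2000)
    # 5.0 USD per million tokens is a made-up rate used only for illustration.
    print(total, estimated_cost_usd(total, 5.0))
```

&lt;p&gt;Under these assumptions, 100 turns of 2,000 tokens each add up to 10.1 million input tokens, or about $50.50 at a hypothetical $5 per million.&lt;/p&gt;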




&lt;h2&gt;What flat-rate billing actually looks like&lt;/h2&gt;

&lt;p&gt;Here's the alternative model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Call Claude API — SimplyLouie flat-rate&lt;/span&gt;
curl https://api.simplylouie.com/v1/messages &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"x-api-key: YOUR_KEY"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "claude-opus-4-5",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Explain this code to me"}]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You pay &lt;strong&gt;$2/month&lt;/strong&gt;. That's it. No token counters. No usage dashboards. No surprise charges. No billing bugs that require a GitHub issue and 289 comments to resolve.&lt;/p&gt;

&lt;p&gt;The response is identical to calling Anthropic directly — same Claude model, same response format, same API structure.&lt;/p&gt;




&lt;h2&gt;Why this matters for developers outside the US&lt;/h2&gt;

&lt;p&gt;The HERMES.md developer got a $200 charge. For a developer in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Nigeria&lt;/strong&gt;: that's 2-3 weeks of salary&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;India&lt;/strong&gt;: Rs 16,500 — nearly a month's rent in many cities&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Philippines&lt;/strong&gt;: ₱11,200 — over a month's worth of groceries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indonesia&lt;/strong&gt;: Rp3.2M — a significant emergency fund hit&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Metered billing doesn't just cause anxiety. For developers in emerging markets, a billing bug like HERMES.md is financially catastrophic.&lt;/p&gt;

&lt;p&gt;Flat-rate billing at $2/month is not a price feature. It's a safety feature.&lt;/p&gt;




&lt;h2&gt;The technical argument for flat-rate&lt;/h2&gt;

&lt;p&gt;Some developers argue that metered billing is more "fair" because heavy users pay more. This is true at scale. But it creates a hidden cost that never shows up in the pricing page: &lt;strong&gt;cognitive overhead&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Every time you use a metered AI API, you're making micro-decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Should I include the full file in context, or trim it?&lt;/li&gt;
&lt;li&gt;Is this prompt worth the tokens?&lt;/li&gt;
&lt;li&gt;Should I stream this response or batch it?&lt;/li&gt;
&lt;li&gt;Did that agent loop cost me $0.20 or $2.00?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That cognitive overhead is a tax on your productivity. Flat-rate billing removes it entirely. You just... use the API.&lt;/p&gt;




&lt;h2&gt;What to do if you got an unexpected Anthropic charge&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Go to &lt;a href="https://console.anthropic.com/settings/billing" rel="noopener noreferrer"&gt;console.anthropic.com/settings/billing&lt;/a&gt;
&lt;/li&gt;
&lt;li&gt;Export your usage log for the disputed period&lt;/li&gt;
&lt;li&gt;Open a support ticket with the exact timestamp of the anomaly&lt;/li&gt;
&lt;li&gt;Reference the HERMES.md GitHub issue — Anthropic is now aware and has acknowledged the bug&lt;/li&gt;
&lt;li&gt;If refused, dispute the charge via your credit card issuer (this works; metered billing disputes are winnable)&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;The alternative&lt;/h2&gt;

&lt;p&gt;If you want Claude API access without metered billing anxiety:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://simplylouie.com" rel="noopener noreferrer"&gt;SimplyLouie&lt;/a&gt;&lt;/strong&gt; — $2/month flat rate, Claude API, no usage meters, 7-day free trial.&lt;/p&gt;

&lt;p&gt;For developers who can't afford to check their billing dashboard with their morning coffee — this is what we built.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The HERMES.md issue is at github.com/anthropics/claude-code/issues/53262. The Hacker News thread is worth reading — it's one of the most honest conversations about AI billing transparency in months.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>discuss</category>
      <category>claude</category>
      <category>api</category>
    </item>
    <item>
      <title>Easily benchmark all your app's endpoints at once</title>
      <dc:creator>Kenneth Mckrola</dc:creator>
      <pubDate>Wed, 29 Apr 2026 21:09:11 +0000</pubDate>
      <link>https://dev.to/mackoverflow/easily-benchmark-all-your-apps-endpoints-at-once-2fod</link>
      <guid>https://dev.to/mackoverflow/easily-benchmark-all-your-apps-endpoints-at-once-2fod</guid>
      <description>&lt;p&gt;Most "load tests" in real codebases are a &lt;code&gt;curl&lt;/code&gt; pasted into a Slack thread. Someone runs it before a release, eyeballs the latency, and we ship. There's nothing version-controlled, nothing repeatable, and the next person to touch the service has no idea which endpoints are actually fast paths.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://benchmarkr-1.onrender.com" rel="noopener noreferrer"&gt;benchmarkr&lt;/a&gt; is a CLI and MCP tool that fixes exactly that part of the workflow. The thing I want to talk about in this post is the piece that makes it click: a YAML config that lives in your repo and describes every endpoint you care about, the same way a &lt;code&gt;package.json&lt;/code&gt; describes your dependencies.&lt;/p&gt;

&lt;h2&gt;The config&lt;/h2&gt;

&lt;p&gt;First, install the benchmarkr CLI if you haven't already:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;brew tap mack-overflow/tap
brew &lt;span class="nb"&gt;install &lt;/span&gt;benchmarkr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That covers Homebrew. On Debian, add the apt repository and install from it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"deb [trusted=yes] https://apt.fury.io/mack-overflow/ /"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; /etc/apt/sources.list.d/benchmarkr.list
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install &lt;/span&gt;benchmarkr
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;(More installation guides are available in the &lt;a href="https://benchmarkr-1.onrender.com/docs" rel="noopener noreferrer"&gt;docs&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Next, run &lt;code&gt;benchmarkr endpoints init&lt;/code&gt; in your project root and you get a &lt;code&gt;benchmarkr.yaml&lt;/code&gt; you can commit:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;version&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;

&lt;span class="na"&gt;endpoints&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;list-users&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GET&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${API_BASE:-http://localhost:8080}/users&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bearer ${API_TOKEN}&lt;/span&gt;
    &lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
      &lt;span class="na"&gt;duration_seconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;30&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;search-users&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;GET&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${API_BASE}/users/search&lt;/span&gt;
    &lt;span class="na"&gt;params&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;q&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;test"&lt;/span&gt;
      &lt;span class="na"&gt;limit&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;50"&lt;/span&gt;
    &lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;5&lt;/span&gt;
      &lt;span class="na"&gt;duration_seconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;15&lt;/span&gt;

  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;create-order&lt;/span&gt;
    &lt;span class="na"&gt;method&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;POST&lt;/span&gt;
    &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${API_BASE}/orders&lt;/span&gt;
    &lt;span class="na"&gt;headers&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;Authorization&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bearer ${API_TOKEN}&lt;/span&gt;
      &lt;span class="na"&gt;Content-Type&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
    &lt;span class="na"&gt;body&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;sku&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ABC-123"&lt;/span&gt;
      &lt;span class="na"&gt;quantity&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;
    &lt;span class="na"&gt;defaults&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;concurrency&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;2&lt;/span&gt;
      &lt;span class="na"&gt;duration_seconds&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="m"&gt;10&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things to notice:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Env var substitution.&lt;/strong&gt; &lt;code&gt;${API_BASE}&lt;/code&gt; and &lt;code&gt;${API_BASE:-default}&lt;/code&gt; work the way they do in a shell. A sibling &lt;code&gt;.env&lt;/code&gt; file is auto-loaded but never overrides what's already in the environment, so the same file works on a laptop, in CI, and in staging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Defaults travel with the endpoint.&lt;/strong&gt; &lt;code&gt;create-order&lt;/code&gt; runs at concurrency 2 for 10 seconds because that's what makes sense for a write path. &lt;code&gt;list-users&lt;/code&gt; runs at concurrency 10. You set this once in the file you already review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Discovery walks up from CWD.&lt;/strong&gt; Run the CLI from any subdirectory and it finds the file, like &lt;code&gt;git&lt;/code&gt; does.&lt;/li&gt;
&lt;/ul&gt;
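&lt;p&gt;The substitution rules in the first bullet can be sketched in a few lines of Python. This is an illustration of shell-style expansion and of the "environment wins over &lt;code&gt;.env&lt;/code&gt;" rule, not benchmarkr's actual implementation; &lt;code&gt;expand&lt;/code&gt; and &lt;code&gt;merge_dotenv&lt;/code&gt; are hypothetical names.&lt;/p&gt;

```python
import re

# Matches ${NAME} and ${NAME:-default}; group 2 is None when no default is given.
_PATTERN = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)(?::-([^}]*))?\}")


def merge_dotenv(environ, dotenv):
    """Return a new mapping where .env values only fill in missing names."""
    merged = dict(dotenv)
    merged.update(environ)  # the real environment wins over .env
    return merged


def expand(text, environ):
    """Expand ${NAME} and ${NAME:-default} using shell-like rules."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        value = environ.get(name, "")
        # ':-' falls back when the variable is unset OR empty, as in POSIX shells.
        if value == "" and default is not None:
            return default
        return value

    return _PATTERN.sub(replace, text)


if __name__ == "__main__":
    env = merge_dotenv(
        {"API_TOKEN": "from-ci"},
        {"API_TOKEN": "from-dotenv", "API_BASE": ""},
    )
    print(expand("${API_BASE:-http://localhost:8080}/users", env))
    print(expand("Bearer ${API_TOKEN}", env))
```

&lt;p&gt;In this sketch an unset variable without a default expands to an empty string, which mirrors default shell behaviour; a real tool might prefer to fail loudly instead.&lt;/p&gt;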

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnw0c3qoj334x8hhqusls.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnw0c3qoj334x8hhqusls.png" alt="CLI endpoints list output" width="800" height="476"&gt;&lt;/a&gt;&lt;/p&gt;
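&lt;p&gt;The "walks up from CWD" discovery mentioned earlier could look roughly like the following. &lt;code&gt;find_config&lt;/code&gt; is a hypothetical helper written for illustration, not benchmarkr's own code; only the file name &lt;code&gt;benchmarkr.yaml&lt;/code&gt; comes from this post.&lt;/p&gt;

```python
from pathlib import Path


def find_config(start, filename="benchmarkr.yaml"):
    """Return the nearest config at or above `start`, or None if there is none."""
    current = Path(start).resolve()
    # Check the starting directory first, then each ancestor up to the root.
    for directory in (current, *current.parents):
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


if __name__ == "__main__":
    print(find_config(Path.cwd()))
```

&lt;p&gt;The nearest file wins, so a nested service could carry its own config without disturbing one at the repo root, assuming the tool resolves conflicts that way.&lt;/p&gt;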

&lt;h2&gt;Running one endpoint&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;benchmarkr run &lt;span class="nt"&gt;-e&lt;/span&gt; list-users
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it. Saved defaults apply. Any flag you pass on the command line wins; headers and params are merged. So when you're poking at production specifically, you can do:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;benchmarkr run &lt;span class="nt"&gt;-e&lt;/span&gt; list-users &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--header&lt;/span&gt; &lt;span class="s2"&gt;"X-Trace: debug-2026-04-28"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;--concurrency&lt;/span&gt; 50
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;…without editing the committed file.&lt;/p&gt;
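&lt;p&gt;One way to picture the override rule (flags win, headers and params merge) is the sketch below. It is a simplified model, not benchmarkr's code: &lt;code&gt;resolve_run&lt;/code&gt; is a hypothetical name, the saved endpoint is flattened into a single dict, and letting the command-line value win on a header-name clash is an assumption.&lt;/p&gt;

```python
def resolve_run(saved, overrides):
    """Combine a saved endpoint definition with per-invocation overrides."""
    resolved = dict(saved)
    for key, value in overrides.items():
        if key in ("headers", "params"):
            # Maps are merged key by key; the override wins on conflicts (assumed).
            resolved[key] = {**saved.get(key, {}), **value}
        else:
            # Scalars such as concurrency are simply replaced.
            resolved[key] = value
    return resolved


if __name__ == "__main__":
    saved = {
        "url": "http://localhost:8080/users",
        "concurrency": 10,
        "headers": {"Authorization": "Bearer token"},
    }
    flags = {"concurrency": 50, "headers": {"X-Trace": "debug-2026-04-28"}}
    print(resolve_run(saved, flags))
```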

&lt;h2&gt;Running all of them&lt;/h2&gt;

&lt;p&gt;This is where the YAML pays for itself. Because every endpoint is named and self-describing, you can hand the entire file to the CLI in one shot:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;benchmarkr run &lt;span class="nt"&gt;--all&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That walks every endpoint in &lt;code&gt;benchmarkr.yaml&lt;/code&gt; in succession, applying each endpoint's saved defaults (concurrency, duration, headers, body — the whole config). Between runs you get a &lt;code&gt;[i/N] &amp;lt;name&amp;gt;&lt;/code&gt; header so it's obvious where you are; live p50/p95/p99 streams in for the active endpoint and a final summary prints when it finishes. &lt;code&gt;--all&lt;/code&gt; is mutually exclusive with &lt;code&gt;--url&lt;/code&gt; and &lt;code&gt;--endpoint&lt;/code&gt;, and any flags you do pass (e.g. &lt;code&gt;--store&lt;/code&gt;, &lt;code&gt;--json&lt;/code&gt;, &lt;code&gt;--rate-limit&lt;/code&gt;) apply to every run in the sweep.&lt;/p&gt;

&lt;p&gt;For CI, this collapses the workflow step to one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="c1"&gt;# .github/workflows/perf.yml&lt;/span&gt;
&lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Benchmark every endpoint&lt;/span&gt;
  &lt;span class="na"&gt;env&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;API_BASE&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;https://api.staging.example.com&lt;/span&gt;
    &lt;span class="na"&gt;API_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.STAGING_API_TOKEN }}&lt;/span&gt;
    &lt;span class="na"&gt;BENCH_CLOUD_TOKEN&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;${{ secrets.BENCHMARKR_TOKEN }}&lt;/span&gt;
  &lt;span class="na"&gt;run&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;benchmarkr run --all --store --json &amp;gt; perf-results.json&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;--json&lt;/code&gt; with &lt;code&gt;--all&lt;/code&gt; emits an array — one entry per endpoint, with the same &lt;code&gt;result&lt;/code&gt; shape as a single run — so you can pipe it straight into a regression check or upload it as a CI artifact:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"list-users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stop_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"30.001s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stored"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12483&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p50_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p95_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p99_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;23&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"errors_total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"search-users"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stop_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"15.002s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stored"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;4127&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p50_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;18&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p95_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;47&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p99_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;92&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"errors_total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"create-order"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stop_reason"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"completed"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"duration"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"10.001s"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"stored"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"result"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"requests"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;312&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p50_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;41&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p95_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"p99_ms"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;121&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"errors_total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
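&lt;p&gt;A regression gate over that array might look something like this. It is a sketch that relies only on the fields shown in the sample output (&lt;code&gt;name&lt;/code&gt;, &lt;code&gt;result.p95_ms&lt;/code&gt;, &lt;code&gt;result.errors_total&lt;/code&gt;); the function name and the budget numbers are made up for illustration.&lt;/p&gt;

```python
import json


def find_regressions(results, budgets_ms):
    """List endpoints that report errors or exceed their p95 budget."""
    failures = []
    for entry in results:
        name = entry["name"]
        result = entry["result"]
        if result["errors_total"] > 0:
            failures.append(f"{name}: {result['errors_total']} errors")
        budget = budgets_ms.get(name)  # endpoints without a budget are not gated
        if budget is not None and result["p95_ms"] > budget:
            failures.append(f"{name}: p95 {result['p95_ms']}ms exceeds budget {budget}ms")
    return failures


if __name__ == "__main__":
    # Inline sample in the same shape as the output above; budgets are hypothetical.
    sample = json.loads(
        '[{"name": "list-users", "result": {"p95_ms": 12, "errors_total": 0}},'
        ' {"name": "create-order", "result": {"p95_ms": 88, "errors_total": 0}}]'
    )
    for line in find_regressions(sample, {"list-users": 20, "create-order": 50}):
        print(line)
```

&lt;p&gt;In CI you would load &lt;code&gt;perf-results.json&lt;/code&gt; instead of the inline sample and exit non-zero when the returned list is not empty.&lt;/p&gt;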



&lt;p&gt;You're not maintaining a separate list of "endpoints to benchmark" in your CI workflow and a list in your config. There's one list. Add a new endpoint to &lt;code&gt;benchmarkr.yaml&lt;/code&gt; in the same PR that adds the route, and the next CI run picks it up automatically — no workflow edits, no shell loop to babysit.&lt;/p&gt;

&lt;h2&gt;Round-tripping with the cloud dashboard&lt;/h2&gt;

&lt;p&gt;The CLI gives you fast feedback. The dashboard gives you the long view — historical p95 charts, regression detection across versions, the kind of thing that's painful to wire up yourself.&lt;/p&gt;

&lt;p&gt;The newest piece is import/export, so the YAML in your repo and the endpoints in the dashboard stay in sync without anyone having to maintain both:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Export from the dashboard.&lt;/strong&gt; Open any endpoint and click &lt;strong&gt;Export&lt;/strong&gt; for YAML or JSON. Or click &lt;strong&gt;Export all&lt;/strong&gt; in the endpoints nav to dump every endpoint to one file you can drop into a fresh repo.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Import to the dashboard.&lt;/strong&gt; Click &lt;strong&gt;Import&lt;/strong&gt;, pick a &lt;code&gt;benchmarkr.yaml&lt;/code&gt;, and endpoints upsert by &lt;code&gt;(user, name)&lt;/code&gt;. If the config changed, a new version is recorded — so you get a history of how each endpoint's load shape evolved.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
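&lt;p&gt;The upsert-and-version behaviour on import can be modelled with a small in-memory sketch. This is only a mental model of "upsert by &lt;code&gt;(user, name)&lt;/code&gt;, record a new version when the config changed"; the dashboard's real storage is not described in this post, and &lt;code&gt;upsert_endpoint&lt;/code&gt; is a hypothetical name.&lt;/p&gt;

```python
def upsert_endpoint(store, user, name, config):
    """Insert or update an endpoint keyed by (user, name); return its version number."""
    versions = store.setdefault((user, name), [])
    # Only record a new version when the config actually changed.
    if not versions or versions[-1] != config:
        versions.append(dict(config))
    return len(versions)


if __name__ == "__main__":
    store = {}
    print(upsert_endpoint(store, "alice", "list-users", {"concurrency": 10}))
    print(upsert_endpoint(store, "alice", "list-users", {"concurrency": 10}))
    print(upsert_endpoint(store, "alice", "list-users", {"concurrency": 50}))
```

&lt;p&gt;Re-importing an unchanged file is a no-op in this model, which is what makes importing from CI on every merge safe to repeat.&lt;/p&gt;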

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2klkwgl4a48htpx3lg2.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2klkwgl4a48htpx3lg2.png" alt="Import in Benchmarkr UI Nav" width="800" height="74"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fochel2kdfj1i2qvfo1up.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fochel2kdfj1i2qvfo1up.png" alt="Export as YAML in UI" width="800" height="193"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmf8nwq6sw8oxba12jbr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqmf8nwq6sw8oxba12jbr.png" alt="endpoint history" width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A workflow I've been using:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Define endpoints in &lt;code&gt;benchmarkr.yaml&lt;/code&gt;, commit them.&lt;/li&gt;
&lt;li&gt;CI runs &lt;code&gt;benchmarkr run --all&lt;/code&gt; on every PR with &lt;code&gt;--store&lt;/code&gt; and the cloud token, persisting results to the dashboard.&lt;/li&gt;
&lt;li&gt;Open the endpoint in the dashboard to see the trend line for that endpoint across the last N PRs.&lt;/li&gt;
&lt;li&gt;If somebody adds an endpoint via the dashboard UI for ad-hoc poking, &lt;strong&gt;Export&lt;/strong&gt; → drop the file into the repo → it's now part of the CI matrix.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;A note on the cloud dashboard&lt;/h2&gt;

&lt;p&gt;The cloud platform is currently in &lt;strong&gt;closed beta&lt;/strong&gt;. We're planning to open it up to the public on a per-token basis in &lt;strong&gt;spring 2026&lt;/strong&gt; — if you'd like access at launch, you can &lt;a href="https://benchmarkr-1.onrender.com/waitlist" rel="noopener noreferrer"&gt;join the waitlist&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The CLI itself is open source and works without the cloud — &lt;code&gt;benchmarkr run&lt;/code&gt;, the YAML config, and even local result persistence don't require an account or a token. The dashboard, history charts, version pinning, and import/export are the parts gated behind beta access for now.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why this is worth doing
&lt;/h2&gt;

&lt;p&gt;The shift that matters isn't "run benchmarks in CI" — plenty of tools do that. It's having a single, reviewable file that says &lt;em&gt;here are this service's endpoints and how we expect them to behave under load&lt;/em&gt;, sitting next to the code in the same PR.&lt;/p&gt;

&lt;p&gt;Once that file exists:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New endpoints get a perf budget at the same moment they get a route handler.&lt;/li&gt;
&lt;li&gt;Reviewers can see in the diff that a new write path is being benchmarked at concurrency 2, not 100, and push back if that's wrong.&lt;/li&gt;
&lt;li&gt;CI gets a free regression signal across every endpoint, not just the one someone remembered to add to a script.&lt;/li&gt;
&lt;li&gt;The dashboard gives you the historical view without anyone manually re-entering endpoints.&lt;/li&gt;
&lt;/ul&gt;
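
&lt;p&gt;To make the "perf budget" idea concrete, here is a minimal Python sketch of a regression gate. It is not benchmarkr's implementation or config schema: the endpoint names, latency samples, and budgets are made up, and the p95 uses the simple nearest-rank method.&lt;/p&gt;

```python
import math

def p95(samples_ms):
    """Return the 95th-percentile latency using the nearest-rank method."""
    ordered = sorted(samples_ms)
    rank = math.ceil(0.95 * len(ordered))  # 1-based nearest rank
    return ordered[rank - 1]

def over_budget(results, budgets_ms):
    """Return the endpoints whose p95 latency exceeds their budget."""
    failures = []
    for endpoint, samples in results.items():
        if p95(samples) > budgets_ms[endpoint]:
            failures.append(endpoint)
    return sorted(failures)

# Hypothetical data: endpoint names, samples, and budgets are illustrative only.
results = {
    "GET /users": [12, 15, 11, 14, 13, 90, 12, 13, 14, 12],
    "POST /orders": [40, 42, 41, 39, 45, 44, 43, 40, 41, 42],
}
budgets_ms = {"GET /users": 50, "POST /orders": 60}
failing = over_budget(results, budgets_ms)
```

&lt;p&gt;A CI job could exit non-zero whenever &lt;code&gt;failing&lt;/code&gt; is non-empty, which is the "free regression signal" described above.&lt;/p&gt;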

&lt;p&gt;The repo already describes your API. This is just letting it benchmark itself.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;benchmarkr is open source — &lt;code&gt;brew install mack-overflow/tap/benchmarkr&lt;/code&gt; or grab it from &lt;a href="https://benchmarkr-1.onrender.com" rel="noopener noreferrer"&gt;benchmarkr&lt;/a&gt;. Cloud dashboard beta access opens publicly per-token in spring 2026.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>performance</category>
      <category>devops</category>
      <category>go</category>
    </item>
    <item>
      <title>From Prototype to Production: What We Learned About Builder Platforms</title>
      <dc:creator>Nometria</dc:creator>
      <pubDate>Wed, 29 Apr 2026 20:59:49 +0000</pubDate>
      <link>https://dev.to/nometria_vibecoding/from-prototype-to-production-what-we-learned-about-builder-platforms-dm3</link>
      <guid>https://dev.to/nometria_vibecoding/from-prototype-to-production-what-we-learned-about-builder-platforms-dm3</guid>
      <description>&lt;h1&gt;
  
  
  Why Your AI-Built App Won't Scale (And What Actually Fixes It)
&lt;/h1&gt;

&lt;p&gt;You shipped something in Lovable or Bolt in two weeks. Your co-founder tested it. A few customers signed up. Then you hit the wall.&lt;/p&gt;

&lt;p&gt;The app works fine for five users. Ten users. But at fifty concurrent users, response times spike. Your database queries start timing out. You realize you have no visibility into what's actually happening in production. And when something breaks, you can't roll back because the builder platform doesn't give you deployment history.&lt;/p&gt;

&lt;p&gt;This isn't a flaw in your code. It's a flaw in the infrastructure assumption.&lt;/p&gt;

&lt;p&gt;Here's what's really happening: AI builders are optimized for iteration speed, not production scale. They bundle your database, API, frontend, and authentication into a black box that works great until it doesn't. Your data lives on their servers. Your code is locked into their proprietary export format. You have no CI/CD pipeline, no rollback mechanism, no real monitoring. You're not building a product; you're building a prototype that happens to have paying customers.&lt;/p&gt;

&lt;p&gt;Most founders don't realize this until they're already committed.&lt;/p&gt;

&lt;p&gt;The gap between "working in the builder" and "production-ready on real infrastructure" is massive. It requires database migration, API restructuring, proper deployment pipelines, secrets management, monitoring, compliance setup, and about three months of engineering time you don't have.&lt;/p&gt;

&lt;p&gt;Or it used to.&lt;/p&gt;

&lt;p&gt;The actual solution isn't to abandon AI builders. They're genuinely fast for iteration. The solution is to move your app to real infrastructure once it's proven, without rebuilding from scratch.&lt;/p&gt;

&lt;p&gt;A two-person team migrated a Bolt-built SaaS to Vercel in a single sprint. SmartFixOS moved from Base44 to managed infrastructure and now handles real revenue with customer jobs and invoicing. Wright Choice Mentoring scaled from a prototype to a multi-tenant platform managing 10+ organizations. They didn't rewrite anything. They deployed.&lt;/p&gt;

&lt;p&gt;The mechanics are cleaner than you'd expect. Export your code from the builder, run a deployment to AWS, Vercel, or Supabase, set your custom domain, and you're live on real infrastructure with full code and data ownership. Preview servers let you test without burning money. Roll back to any previous deployment in 30 seconds if something breaks. GitHub two-way sync means your no-code app gets real version control.&lt;/p&gt;

&lt;p&gt;This is what Nometria does, and it's why teams that understand the difference between iteration and production are already moving their apps this way. You can deploy via CLI (3 commands), VS Code extension (one-click), or even have AI agents handle it directly from Claude Code.&lt;/p&gt;

&lt;p&gt;The math is simple: three months of engineering work versus three weeks of deployment work. Full compliance, full ownership, full visibility.&lt;/p&gt;

&lt;p&gt;When you're evaluating whether your AI-built app is ready to scale, ask yourself this one question: do I control my own infrastructure, or does the builder control me?&lt;/p&gt;

&lt;p&gt;If the answer is the latter, you already know what to do.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://nometria.com" rel="noopener noreferrer"&gt;https://nometria.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>api</category>
      <category>sdk</category>
    </item>
    <item>
      <title>How AI Agents Can Get Structured Product Data Without Web Scraping</title>
      <dc:creator>BuyWhere</dc:creator>
      <pubDate>Wed, 29 Apr 2026 20:53:56 +0000</pubDate>
      <link>https://dev.to/buywhere/how-ai-agents-can-get-structured-product-data-without-web-scraping-50hp</link>
      <guid>https://dev.to/buywhere/how-ai-agents-can-get-structured-product-data-without-web-scraping-50hp</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;When AI agents need product information — prices, availability, comparisons — they typically face a tough choice:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Web scraping&lt;/strong&gt; — brittle, blockable, and adds significant complexity&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generic search&lt;/strong&gt; — unreliable results, no structured data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual data entry&lt;/strong&gt; — not scalable&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Alternative: Product Catalog API
&lt;/h2&gt;

&lt;p&gt;A product catalog API gives agents clean, structured product data they can use directly in their reasoning loops.&lt;/p&gt;

&lt;p&gt;Example: Instead of scraping 20 e-commerce sites to find the best price on a laptop, an agent can call:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="err"&gt;GET /v1/products?query=MacBook+Air+13&amp;amp;region=SG
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And get structured results with real prices, availability, and affiliate links.&lt;/p&gt;
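
&lt;p&gt;An agent would normally build that request programmatically rather than hand-escaping it. The following is a rough Python sketch: the host &lt;code&gt;api.example.com&lt;/code&gt; is a placeholder, since only the path and query string are shown above, and the helper name is hypothetical.&lt;/p&gt;

```python
from urllib.parse import urlencode, urlunsplit

# Hypothetical host: the article only shows the path and query string.
API_HOST = "api.example.com"

def product_search_url(query, region):
    """Build the search URL with properly encoded query parameters."""
    params = urlencode({"query": query, "region": region})
    return urlunsplit(("https", API_HOST, "/v1/products", params, ""))

url = product_search_url("MacBook Air 13", "SG")
```

&lt;p&gt;Letting &lt;code&gt;urlencode&lt;/code&gt; handle the encoding keeps spaces and separators correct without any shell-style escaping.&lt;/p&gt;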

&lt;h2&gt;
  
  
  Why This Matters for Agent Commerce
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reliability&lt;/strong&gt; — structured data means consistent responses&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed&lt;/strong&gt; — single API call vs. scraping pipeline&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Compliance&lt;/strong&gt; — proper data sourcing for affiliate monetization&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context preservation&lt;/strong&gt; — agents can compare products without bloating context windows&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Getting Started
&lt;/h2&gt;

&lt;p&gt;If you are building shopping agents, price comparison tools, or any AI commerce use case, structured product data is the missing layer.&lt;/p&gt;

&lt;p&gt;Happy to answer questions about how this works in practice.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Note: I work at BuyWhere, an agent-native product catalog API.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>agents</category>
      <category>shopping</category>
      <category>api</category>
    </item>
    <item>
      <title>I built istempmail.com</title>
      <dc:creator>Gregor Nobis</dc:creator>
      <pubDate>Wed, 29 Apr 2026 20:14:39 +0000</pubDate>
      <link>https://dev.to/gregor_nobis_b4295c5ee819/i-built-istempmailcom-52gc</link>
      <guid>https://dev.to/gregor_nobis_b4295c5ee819/i-built-istempmailcom-52gc</guid>
      <description>&lt;p&gt;I built istempmail.com here’s what it does&lt;/p&gt;

&lt;p&gt;It’s a tool designed to help websites detect and block temporary (disposable) email addresses.&lt;/p&gt;

&lt;p&gt;Temporary emails are short-lived addresses that users can generate instantly and discard after use. While they have legitimate uses, they are often used to bypass signups, abuse free trials, or create fake accounts.&lt;/p&gt;

&lt;p&gt;What &lt;a href="https://istempmail.com" rel="noopener noreferrer"&gt;istempmail.com&lt;/a&gt; offers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time detection of disposable email domains&lt;/li&gt;
&lt;li&gt;Simple API for email validation&lt;/li&gt;
&lt;li&gt;JSON-based responses for easy integration&lt;/li&gt;
&lt;li&gt;WordPress plugin support&lt;/li&gt;
&lt;li&gt;Helps reduce spam, fake accounts, and abuse&lt;/li&gt;
&lt;/ul&gt;
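
&lt;p&gt;For a sense of what domain-based detection involves, here is a deliberately simplified Python sketch. It is not how istempmail.com works internally and does not call its API; the three-entry blocklist is only an illustration, and a real service would maintain a far larger, frequently updated list.&lt;/p&gt;

```python
# Hypothetical blocklist: a real service would maintain a far larger,
# frequently updated list of disposable-mail domains.
DISPOSABLE_DOMAINS = {"mailinator.com", "10minutemail.com", "guerrillamail.com"}

def is_disposable(email):
    """Return True if the address's domain is on the blocklist."""
    _, _, domain = email.rpartition("@")
    return domain.strip().lower() in DISPOSABLE_DOMAINS
```

&lt;p&gt;A signup handler could call &lt;code&gt;is_disposable&lt;/code&gt; before creating the account and reject or flag the address when it returns &lt;code&gt;True&lt;/code&gt;.&lt;/p&gt;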

&lt;p&gt;If you run a SaaS, marketplace, or any platform with user registration, this can help you maintain higher-quality users and cleaner data.&lt;/p&gt;

&lt;p&gt;You can check it out here: &lt;a href="https://istempmail.com" rel="noopener noreferrer"&gt;https://istempmail.com&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Feedback is welcome.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyw6n42tk82owofgir7e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnyw6n42tk82owofgir7e.png" alt=" " width="800" height="528"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>security</category>
      <category>showdev</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Invoice API – Open-source invoicing that doesn't suck</title>
      <dc:creator>Maxence Londot</dc:creator>
      <pubDate>Wed, 29 Apr 2026 19:52:01 +0000</pubDate>
      <link>https://dev.to/maxence_londot_1143f1368f/invoice-api-open-source-invoicing-that-doesnt-suck-41fm</link>
      <guid>https://dev.to/maxence_londot_1143f1368f/invoice-api-open-source-invoicing-that-doesnt-suck-41fm</guid>
      <description>&lt;h2&gt;
  
  
  Why another invoicing tool?
&lt;/h2&gt;

&lt;p&gt;Every freelancer and SaaS founder needs invoices. Yet most solutions are either bloated SaaS subscriptions or clunky Excel templates.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Invoice API&lt;/strong&gt; is different – it's a lightweight, self-hosted REST API that generates professional PDF invoices in under 5 seconds. Built with &lt;strong&gt;FastAPI&lt;/strong&gt;, &lt;strong&gt;WeasyPrint&lt;/strong&gt;, and &lt;strong&gt;Stripe&lt;/strong&gt;, it gives you full control without vendor lock-in.&lt;/p&gt;

&lt;h2&gt;
  
  
  What can it do?
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create customers &amp;amp; invoices&lt;/strong&gt; via a clean REST API
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate PDF invoices&lt;/strong&gt; with a single GET request (WeasyPrint, ~30KB per PDF)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Receive Stripe webhooks&lt;/strong&gt; for real-time payment status updates
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dashboard&lt;/strong&gt; to see all invoices at a glance
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Docker-Compose one-liner&lt;/strong&gt; – &lt;code&gt;docker compose up -d&lt;/code&gt; and you're live
&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Tech stack
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Technology&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;FastAPI&lt;/strong&gt; + &lt;strong&gt;Pydantic&lt;/strong&gt; v2&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;PostgreSQL&lt;/strong&gt; 16&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cache&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Redis&lt;/strong&gt; 7&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;PDF Engine&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;WeasyPrint&lt;/strong&gt; 62&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Payments&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Stripe&lt;/strong&gt; API (webhook)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Container&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Docker&lt;/strong&gt; + &lt;strong&gt;Docker Compose&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auth&lt;/td&gt;
&lt;td&gt;Bearer JWT tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Quick start
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Clone &amp;amp; start&lt;/span&gt;
git clone https://github.com/UniTy/invoice-api.git
&lt;span class="nb"&gt;cd &lt;/span&gt;invoice-api
docker compose up &lt;span class="nt"&gt;-d&lt;/span&gt;

&lt;span class="c"&gt;# Create a customer&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/api/v1/customers &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"name":"Acme Corp","email":"billing@acme.com"}'&lt;/span&gt;

&lt;span class="c"&gt;# Create an invoice&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST http://localhost:8000/api/v1/invoices &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Content-Type: application/json"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{"customer_id":1,"items":[{"desc":"Consulting","qty":10,"price":150}],"currency":"eur"}'&lt;/span&gt;

&lt;span class="c"&gt;# Download PDF&lt;/span&gt;
curl http://localhost:8000/api/v1/invoices/1/pdf &lt;span class="nt"&gt;-o&lt;/span&gt; invoice-001.pdf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
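
&lt;p&gt;Before posting an invoice, it can be handy to sanity-check the expected total on the client side. This small Python sketch reuses the payload from the curl example; the helper is hypothetical, and whether the API applies taxes or discounts on top is not covered here.&lt;/p&gt;

```python
import json

def invoice_total(items):
    """Sum quantity times unit price over all line items."""
    return sum(item["qty"] * item["price"] for item in items)

# Same payload as the curl example above.
payload = json.loads(
    '{"customer_id":1,"items":[{"desc":"Consulting","qty":10,"price":150}],"currency":"eur"}'
)
total = invoice_total(payload["items"])
```

&lt;p&gt;For the example payload this works out to 10 × 150, so the generated PDF should show a matching subtotal.&lt;/p&gt;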



&lt;h2&gt;
  
  
  Why self-host?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;GDPR compliance&lt;/strong&gt; – your data stays on your server
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No monthly fees&lt;/strong&gt; – pay only for your VPS (as low as €5/month)
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Customizable&lt;/strong&gt; – MIT license, fork it, brand it, extend it
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;API-first&lt;/strong&gt; – integrate with your existing workflow (n8n, Zapier, custom apps)
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Roadmap
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;[x] Core API (customers, invoices, PDF generation)
&lt;/li&gt;
&lt;li&gt;[x] Stripe webhook integration
&lt;/li&gt;
&lt;li&gt;[x] Docker deployment
&lt;/li&gt;
&lt;li&gt;[x] OpenAPI auto-documentation
&lt;/li&gt;
&lt;li&gt;[ ] Multi-currency support
&lt;/li&gt;
&lt;li&gt;[ ] Email delivery (SendGrid/Mailgun)
&lt;/li&gt;
&lt;li&gt;[ ] Recurring invoices
&lt;/li&gt;
&lt;li&gt;[ ] React admin dashboard
&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it now
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/UniTy/invoice-api.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;⭐ Star the repo if you find it useful! Feedback and PRs welcome. 🙏  &lt;/p&gt;

&lt;p&gt;&lt;em&gt;#invoicing #api #fastapi #python #stripe #opensource #docker&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>python</category>
      <category>opensource</category>
      <category>docker</category>
    </item>
    <item>
      <title>SQL Server ETL in 2026 — What Actually Works and What Doesn't</title>
      <dc:creator>Nata</dc:creator>
      <pubDate>Wed, 29 Apr 2026 19:14:48 +0000</pubDate>
      <link>https://dev.to/kuznetsova/sql-server-etl-in-2026-what-actually-works-and-what-doesnt-4nab</link>
      <guid>https://dev.to/kuznetsova/sql-server-etl-in-2026-what-actually-works-and-what-doesnt-4nab</guid>
      <description>&lt;p&gt;SQL Server is one of those databases that rarely causes problems. It's usually everything around it that does. Getting data in from a dozen different sources, keeping it clean and consistent, syncing it back out to the tools your team actually uses — none of that happens automatically, and the native tooling only gets you so far before the cracks start showing. &lt;/p&gt;

&lt;p&gt;This is a breakdown of the ETL options worth considering if SQL Server sits at the center of your stack — native tools included, with an honest assessment of where each one earns its place and where it quietly gives it back. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What we're covering:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL Server's built-in ETL options and their real limits &lt;/li&gt;
&lt;li&gt;Third-party tools worth evaluating — free and paid &lt;/li&gt;
&lt;li&gt;Where each one fits and where it doesn't &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Skip straight to whatever's relevant for your stack. &lt;/p&gt;

&lt;h2&gt;
  
  
  Before We Get Into the Tools
&lt;/h2&gt;

&lt;p&gt;Quick context on the approaches — because "ETL tool for SQL Server" covers a surprisingly wide range of things that work very differently in practice. &lt;/p&gt;

&lt;h3&gt;
  
  
  ETL vs ELT
&lt;/h3&gt;

&lt;p&gt;ETL transforms data before it lands in SQL Server — useful when the destination has strict schema requirements or limited compute. ELT loads raw data first and transforms inside the warehouse, which is usually more practical for modern cloud-first stacks where SQL Server feeds into Snowflake or BigQuery downstream. Most teams have quietly shifted to ELT without making it a formal decision. &lt;/p&gt;
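
&lt;p&gt;The difference is easiest to see side by side. In this rough Python sketch, an in-memory SQLite database stands in for the real target (an assumption made purely so the example runs anywhere), and the table and column names are invented.&lt;/p&gt;

```python
import sqlite3

raw_rows = [(" Alice ", "42.50"), ("BOB", "17"), (" carol", "8.25")]

conn = sqlite3.connect(":memory:")  # SQLite stands in for the real target

# ETL: clean the rows in application code, then load the finished result.
conn.execute("CREATE TABLE etl_orders (customer TEXT, amount REAL)")
cleaned = [(name.strip().lower(), float(amount)) for name, amount in raw_rows]
conn.executemany("INSERT INTO etl_orders VALUES (?, ?)", cleaned)

# ELT: load the raw strings untouched, then transform with SQL in the database.
conn.execute("CREATE TABLE raw_orders (customer TEXT, amount TEXT)")
conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", raw_rows)
conn.execute(
    "CREATE TABLE elt_orders AS "
    "SELECT lower(trim(customer)) AS customer, CAST(amount AS REAL) AS amount "
    "FROM raw_orders"
)

etl_result = conn.execute("SELECT * FROM etl_orders ORDER BY customer").fetchall()
elt_result = conn.execute("SELECT * FROM elt_orders ORDER BY customer").fetchall()
```

&lt;p&gt;Both paths end with the same cleaned rows; what changes is where the transformation runs and whether the raw data is kept around for later reprocessing.&lt;/p&gt;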

&lt;h3&gt;
  
  
  CDC vs Batch
&lt;/h3&gt;

&lt;p&gt;Change Data Capture reacts to row-level changes as they happen — useful when latency matters and full table reloads are too expensive. Batch works on a schedule and handles the majority of production workloads without complaint. Most solid SQL Server stacks run both, picking the right approach per use case rather than committing to one architecture-wide. &lt;/p&gt;
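
&lt;p&gt;As a rough illustration of the two models, the Python sketch below contrasts a full batch reload with applying a stream of row-level changes. The event shape is invented for the example and is not SQL Server's actual CDC output format.&lt;/p&gt;

```python
def full_reload(source_rows):
    """Batch: rebuild the target from a complete snapshot of the source."""
    return {row["id"]: row for row in source_rows}

def apply_changes(target, change_events):
    """CDC: apply only the row-level changes captured since the last sync."""
    for event in change_events:
        if event["op"] == "delete":
            target.pop(event["id"], None)
        else:  # insert or update
            target[event["id"]] = event["row"]
    return target

# Hypothetical data: the event shape is illustrative, not SQL Server's CDC format.
snapshot = [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}]
target = full_reload(snapshot)
changes = [
    {"op": "update", "id": 1, "row": {"id": 1, "qty": 7}},
    {"op": "delete", "id": 2},
    {"op": "insert", "id": 3, "row": {"id": 3, "qty": 1}},
]
target = apply_changes(target, changes)
```

&lt;p&gt;The batch path touches every row on every run, while the change-driven path does work proportional to what actually changed, which is why it wins when tables are large and latency matters.&lt;/p&gt;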

&lt;p&gt;The three questions worth answering before evaluating anything: &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;How much pipeline ownership is your team actually willing to take on? &lt;/li&gt;
&lt;li&gt;Does your use case genuinely need real-time, or is scheduled batch good enough? &lt;/li&gt;
&lt;li&gt;What does the total cost look like — licensing plus engineering time — at 3x your current data volume? &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Built-In Options — What They're Actually Good For
&lt;/h2&gt;

&lt;p&gt;SQL Server ships with two ETL options. One is genuinely useful for simple tasks and completely wrong for anything beyond them. The other generates strong opinions in engineering teams and has the production scars to back them up. &lt;/p&gt;

&lt;h3&gt;
  
  
  Import and Export Wizard
&lt;/h3&gt;

&lt;p&gt;It's in SSMS, it's free, and it moves data between databases and flat files without requiring anything beyond a few clicks. The transformation options stop at column-level additions and removals — which is fine for ad-hoc work and genuinely useless for anything that needs to run reliably in production. &lt;/p&gt;

&lt;h3&gt;
  
  
  SSIS
&lt;/h3&gt;

&lt;p&gt;SSIS is the native option that actually shows up in production discussions, and the one that tends to split teams between "we've built our entire pipeline on this" and "we spent six months migrating away from it." It offers a graphical designer, incremental loading, C# and VB for complex logic, ODBC/OLEDB/ADO.NET source support, and a large enough community that most problems have already been solved somewhere on Stack Overflow. &lt;/p&gt;

&lt;p&gt;The production experience is where the nuance lives. Schema changes don't handle themselves — someone files a ticket, a developer makes the change, the package gets redeployed. Parallel package execution creates resource contention between SSIS and SQL Server that requires careful CPU and memory management to avoid one throttling the other. And complex packages have a way of becoming the kind of codebase nobody wants to inherit. &lt;/p&gt;
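
&lt;p&gt;Teams that stay on SSIS often add a lightweight drift check so schema changes are caught before a package fails. Here is one possible shape for such a check in Python; the function and the column-to-type mappings are hypothetical and not part of SSIS itself.&lt;/p&gt;

```python
def schema_drift(expected, actual):
    """Compare two column-to-type mappings and report the differences."""
    added = sorted(set(actual) - set(expected))
    removed = sorted(set(expected) - set(actual))
    retyped = sorted(
        col for col in set(expected).intersection(actual)
        if expected[col] != actual[col]
    )
    return {"added": added, "removed": removed, "retyped": retyped}

# Hypothetical schemas: what the package was built against vs. the source today.
expected = {"id": "int", "email": "nvarchar(255)", "created_at": "datetime2"}
actual = {"id": "bigint", "email": "nvarchar(255)", "signup_source": "nvarchar(50)"}
drift = schema_drift(expected, actual)
```

&lt;p&gt;In practice the two mappings would come from the package metadata and the live source catalog, and any non-empty result would open the ticket before the nightly run does.&lt;/p&gt;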

&lt;p&gt;&lt;strong&gt;Where both stop being the answer:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Cloud-first or hybrid stacks where data sources extend well beyond the Microsoft ecosystem &lt;/li&gt;
&lt;li&gt;Environments where schema drift is frequent and developer intervention every time isn't sustainable &lt;/li&gt;
&lt;li&gt;Teams without dedicated SQL Server expertise to own the operational overhead &lt;/li&gt;
&lt;li&gt;Anything requiring automated data quality checks that aren't hand-rolled &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That's the territory third-party tools were built for. &lt;/p&gt;

&lt;h2&gt;
  
  
  The Tools — What Actually Matters When You're Choosing
&lt;/h2&gt;

&lt;p&gt;Every tool here passes the basics test — pipeline design, scheduling, logs, security, some form of documentation or community. That part's table stakes and not worth spending much time on. What's harder to figure out from a product page is how well the SQL Server connector actually holds up under real workloads, what the pricing does as data volumes grow, and whether "managed" means the platform handles it or your team does. Those are the questions the breakdowns below are built around. &lt;/p&gt;

&lt;h3&gt;
  
  
  1. Skyvia
&lt;/h3&gt;

&lt;p&gt;There's a pattern that shows up in SQL Server environments that have been running for a while — an ETL tool here, a backup solution there, something else for querying, and suddenly maintaining the integration layer is a part-time job nobody signed up for. Skyvia is one of the few platforms that genuinely covers that entire surface area without obviously struggling at any of it. &lt;/p&gt;

&lt;p&gt;For SQL Server teams specifically, that means CDC that catches row-level changes as they happen rather than hammering the source with full table scans, multistage transformation logic that runs without custom code attached to it, and bidirectional sync that doesn't require someone to manually check whether both sides of the connection are still talking to each other. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Single environment for ETL, ELT, reverse ETL, sync, and backup. No context switching between platforms &lt;/li&gt;
&lt;li&gt;CDC-driven incremental loads — reacts to changes rather than reprocessing entire tables &lt;/li&gt;
&lt;li&gt;Multistage transformation pipelines without writing or maintaining custom code &lt;/li&gt;
&lt;li&gt;200+ connectors with SQL Server support treated as a first-class feature &lt;/li&gt;
&lt;li&gt;MCP server capability for AI tools querying connected SQL Server sources &lt;/li&gt;
&lt;li&gt;Minute-level scheduling on higher tiers, closer to real-time than most no-code tools reach &lt;/li&gt;
&lt;li&gt;dbt Core support for teams running SQL-based transformation workflows &lt;/li&gt;
&lt;li&gt;Error logging and failure notifications that surface problems before they cascade &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Free at 10k records/month. Paid from $79/month for 5M records. Record-based pricing — no MAR calculations, no per-connector surprises. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Free tier limits are real, and the video tutorial library needs expanding. For SQL Server teams that want end-to-end integration coverage without dedicating engineering resources to keeping it running, the value proposition at this price point is hard to argue with. &lt;/p&gt;

&lt;p&gt;G2: 4.8/5 (290 reviews) · Capterra: 4.8/5 (109 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  2. SSIS
&lt;/h3&gt;

&lt;p&gt;SSIS is already paid for — that's both its strongest argument and the reason teams keep using it long past the point where something else would serve them better. If your stack is on-premises, your team knows Visual Studio, and schema drift is infrequent enough that developer intervention per change isn't a budget concern, it covers a lot of ground without an additional licensing conversation. &lt;/p&gt;

&lt;p&gt;The production reality catches up eventually. Schema changes don't self-heal — every source evolution means a developer ticket, a package update, and a redeployment. Parallel execution creates genuine resource contention with SQL Server itself. And complex packages accumulate maintenance debt in ways that weren't obvious during initial build. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Graphical designer for pipeline and control flow &lt;/li&gt;
&lt;li&gt;No-code components with C#/VB available for complex logic &lt;/li&gt;
&lt;li&gt;ODBC, OLEDB, ADO.NET source support &lt;/li&gt;
&lt;li&gt;Incremental loading built in &lt;/li&gt;
&lt;li&gt;Parameterized packages for external invocation &lt;/li&gt;
&lt;li&gt;Large community with extensive documentation &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Bundled with SQL Server license. Third-party components may add cost. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Earns its place in on-premises Microsoft environments where teams have the SQL Server depth to maintain it properly. Frequent schema drift and cloud-native requirements are the two signals that suggest something else would serve better — both tend to surface faster than teams plan for. &lt;/p&gt;

&lt;p&gt;Bundled with SQL Server — no separate rating &lt;/p&gt;

&lt;h3&gt;
  
  
  3. Fivetran
&lt;/h3&gt;

&lt;p&gt;Fivetran's reputation in the SQL Server space comes down to one thing — pipelines that run without anyone babysitting them. Schema drift handled automatically, real-time sync running in the background, 700+ connectors covering both on-premises and cloud SQL Server deployments. For teams that have been burned by SSIS maintenance cycles, the appeal is obvious. &lt;/p&gt;

&lt;p&gt;The disappearing act has a price tag attached. MAR per connector compounds in ways that weren't obvious when someone signed the contract, and transformation logic beyond the basics has to live somewhere else entirely. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Automatic schema drift handling — source changes don't trigger developer tickets &lt;/li&gt;
&lt;li&gt;Real-time SQL Server sync without pipeline maintenance overhead &lt;/li&gt;
&lt;li&gt;700+ connectors covering on-premises and cloud deployments &lt;/li&gt;
&lt;li&gt;Scalable architecture that handles volume growth without re-engineering &lt;/li&gt;
&lt;li&gt;Encryption and compliance standards built in rather than configured separately &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Free up to 500k Monthly Active Rows — enough to get a genuine feel for the platform before committing to anything. After that, the pricing lives behind a sales conversation that Fivetran prefers to have before showing you numbers. Do the MAR math first. Teams that skip that step tend to have a more interesting budget conversation six months in. &lt;/p&gt;
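
&lt;p&gt;A back-of-the-envelope way to do that math, sketched in Python: this assumes MAR counts each distinct row (table plus primary key) once per calendar month no matter how many times it changes, which is worth confirming against Fivetran's current pricing documentation. The event tuples are made up.&lt;/p&gt;

```python
from collections import defaultdict

def monthly_active_rows(sync_events):
    """Count distinct (table, primary key) pairs touched in each month."""
    seen = defaultdict(set)
    for month, table, primary_key in sync_events:
        seen[month].add((table, primary_key))
    return {month: len(keys) for month, keys in seen.items()}

# Hypothetical change log: (month, table, primary key) per synced change.
events = [
    ("2026-04", "orders", 1),
    ("2026-04", "orders", 1),  # same row updated again: counted once
    ("2026-04", "orders", 2),
    ("2026-04", "customers", 1),
    ("2026-05", "orders", 1),
]
mar = monthly_active_rows(events)
```

&lt;p&gt;Running an estimate like this against a month of real change volume, per connector, gives a number to bring into the sales conversation instead of discovering it on the invoice.&lt;/p&gt;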

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; The set-and-forget reputation holds up for SQL Server ingestion. Where it quietly gives that back is transformation depth — anything beyond basic logic needs to live outside the platform, usually in dbt. And the MAR math deserves serious attention before committing at scale. &lt;/p&gt;

&lt;p&gt;G2: 4.3/5 (792 reviews) · Capterra: 4.4/5 (25 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  4. Informatica PowerCenter
&lt;/h3&gt;

&lt;p&gt;PowerCenter is what enterprise SQL Server ETL looks like when compliance requirements stop being optional and data volumes stop being manageable with lighter tools. That's not a criticism — it's just an accurate description of the environment it was designed for, and teams that fit that description tend to find it genuinely delivers. &lt;/p&gt;

&lt;p&gt;Teams that don't fit that description tend to find themselves paying enterprise prices while working around a learning curve and log readability issues that show up consistently enough in user reviews to be worth factoring in before the procurement process starts. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parallel processing for bulk and high-volume SQL Server workloads &lt;/li&gt;
&lt;li&gt;Formula-based transformation — complex logic without hand-rolled code &lt;/li&gt;
&lt;li&gt;Drag-and-drop designer that holds up under serious workload complexity &lt;/li&gt;
&lt;li&gt;90+ connectors across databases and cloud sources &lt;/li&gt;
&lt;li&gt;Granular permission management for security-conscious environments &lt;/li&gt;
&lt;li&gt;24/7 support and self-paced training &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; IPU-based subscription — pay for selected products and processing capacity. Nothing public beyond that — sales conversation required, and worth going in with a well-defined scope rather than a vague brief. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Built for the kind of SQL Server environment where "we'll figure out a lighter solution" stopped being an option a long time ago. Terminology learning curve is real, log readability needs work, and stability complaints under heavy load are worth taking seriously. Delivers when the use case demands it — genuinely overkill when it doesn't. &lt;/p&gt;

&lt;p&gt;G2: 4.3/5 (89 reviews) · Capterra: 4.5/5 (42 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  5. Pentaho Data Integration (Kettle)
&lt;/h3&gt;

&lt;p&gt;Pentaho — still called Kettle by anyone who's been using it since before the Hitachi Vantara acquisition — sits in a corner of the SQL Server ETL market that most tools don't compete in. Streaming data support, ML model integration with R, Python, Scala, and Weka, enterprise-scale scheduling. If those are real requirements rather than items on a wishlist, it's genuinely hard to find something that covers all of them as well. &lt;/p&gt;

&lt;p&gt;If they're not — setup complexity and enterprise pricing draw consistent complaints, and native data masking requires scripting workarounds that feel like they should have been solved by now. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Codeless drag-and-drop pipeline builder that doesn't require developer involvement for standard flows &lt;/li&gt;
&lt;li&gt;Streaming data support built in — not an add-on or an afterthought &lt;/li&gt;
&lt;li&gt;Connector library broad enough to cover most SQL Server source and destination combinations &lt;/li&gt;
&lt;li&gt;Enterprise-scale load balancing and scheduling that holds up under serious workload pressure &lt;/li&gt;
&lt;li&gt;ML model integration with R, Python, Scala, and Weka — rare at this price point &lt;/li&gt;
&lt;li&gt;Flexible security options including advanced third-party providers &lt;/li&gt;
&lt;li&gt;24/7 support with a dedicated architect on paid plans — not just a ticketing queue &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Community Edition is free and genuinely useful for testing whether the tool fits your SQL Server workflow before anyone has to approve a purchase. Enterprise trial runs 30 days — enough time to stress-test the features that matter. Flexible paid plans beyond that, though "flexible" in practice means a sales conversation is the only way to find out what you'd actually pay. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Fills a genuine gap for SQL Server environments where streaming data and ML pipeline integration are actual requirements rather than future considerations. Setup is more involved than most tools here — budget time for it. Enterprise pricing draws complaints at scale. And the data masking gap is worth knowing upfront rather than discovering mid-implementation. For teams already working in R or Python, the ML integration alone tends to justify the evaluation effort. &lt;/p&gt;

&lt;p&gt;G2: 4.3/5 (17 reviews) · Capterra: no reviews &lt;/p&gt;

&lt;h3&gt;
  
  
  6. IBM InfoSphere DataStage
&lt;/h3&gt;

&lt;p&gt;DataStage occupies the same territory as Informatica PowerCenter in the SQL Server ecosystem — enterprise governance infrastructure for regulated industries where compliance requirements shape every architectural decision. The parallel processing engine handles serious bulk and real-time workloads, native data masking comes standard, and structured and unstructured data processing live in the same platform. &lt;/p&gt;

&lt;p&gt;The IBM enterprise trade-offs apply: pricing draws complaints, the desktop app demands hardware specs that surprise teams during setup, and documentation for the latest version is thin enough to slow onboarding meaningfully. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Parallel processing engine for bulk and real-time SQL Server workloads &lt;/li&gt;
&lt;li&gt;Structured and unstructured data processing without additional tooling &lt;/li&gt;
&lt;li&gt;Expression-based transformation logic &lt;/li&gt;
&lt;li&gt;Native sensitive data masking &lt;/li&gt;
&lt;li&gt;Visual job creation for complex pipeline development &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Billed by Capacity Unit-Hour (CUH), so you pay for actual job run usage. The free tier covers 15 CUH per month, and a free instance is deleted after 30 days of inactivity. Pricing varies by country. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Delivers for SQL Server environments where governance and compliance drive technical decisions. Cost, hardware demands, and documentation gaps for the latest version are the trade-offs that show up consistently. In the right environment, it earns its place. In the wrong one, it is IBM enterprise pricing for problems that didn't need it. &lt;/p&gt;

&lt;p&gt;G2: 4.0/5 (15 reviews) · TrustRadius: 8.0/10 (38 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  7. Oracle GoldenGate
&lt;/h3&gt;

&lt;p&gt;GoldenGate is a replication tool that has no identity crisis about being a replication tool. Real-time synchronization across heterogeneous systems including SQL Server, transactional replication, enterprise-scale consistency — it handles all of that well and makes no attempt to be anything else. &lt;/p&gt;

&lt;p&gt;The teams that run into trouble with GoldenGate are usually the ones who went in hoping "replication tool" was underselling it. It isn't. Configuration is complex, pricing is enterprise, and the ETL capabilities that other tools offer simply aren't here. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time SQL Server and NoSQL replication &lt;/li&gt;
&lt;li&gt;Transactional replication with cross-system data comparison &lt;/li&gt;
&lt;li&gt;OCI managed cloud service &lt;/li&gt;
&lt;li&gt;Automated monitoring and real-time alerts &lt;/li&gt;
&lt;li&gt;Automatic workload-based scaling &lt;/li&gt;
&lt;li&gt;Master encryption and secure network protocols &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; OCI usage-based for cloud. Named User Plus or Processor Licensing for SQL Server. No public pricing — Oracle Sales required. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Replication at enterprise scale, done properly — that's the whole story. Configuration complexity and pricing are both enterprise-grade, and the scope stops firmly at replication. Right requirements going in, it's hard to argue with. Wrong requirements, the lesson comes with a price tag attached. &lt;/p&gt;

&lt;p&gt;G2: 3.9/5 (34 reviews) · TrustRadius: 8.5/10 (221 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  8. Qlik Replicate
&lt;/h3&gt;

&lt;p&gt;GoldenGate is the replication tool you choose when the requirement is serious infrastructure and the team has the expertise to match. Qlik Replicate is what comes up when those same replication requirements exist but the interface needs to be usable by people who haven't spent years specializing in it. Similar territory — SQL Server replication, ingestion, streaming across on-premises and cloud — with a runtime dashboard that shows you what's actually happening without requiring a forensic investigation. &lt;/p&gt;

&lt;p&gt;The pattern that emerges from user reviews is consistent enough to be useful during evaluation. Transformation depth runs out faster than expected — and when it does, the workaround involves custom C development that tends to land on whoever drew the short straw. Support responsiveness and tool stability under certain conditions have generated enough repeated feedback to be worth raising directly with the Qlik team before anything gets signed. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Low-latency SQL Server ingestion from diverse sources &lt;/li&gt;
&lt;li&gt;Automatic target schema generation from metadata &lt;/li&gt;
&lt;li&gt;Parallel threading for fast data movement &lt;/li&gt;
&lt;li&gt;Expression builder for global and table-specific transformation rules &lt;/li&gt;
&lt;li&gt;Runtime dashboard with genuine pipeline visibility &lt;/li&gt;
&lt;li&gt;Industry-standard authentication and encryption &lt;/li&gt;
&lt;li&gt;Data masking via hash column values &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Free pre-configured cloud test drive available — worth running your actual SQL Server use case through it before the sales conversation. No public pricing beyond that. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; The interface and dashboard genuinely earn their place here: SQL Server replication and ingestion that doesn't require specialist knowledge to operate or understand. The transformation ceiling earns less of a place. It arrives sooner than most teams plan for, and custom C development tends to follow when it does. Support and stability issues have also generated enough repeated feedback to deserve direct questions during evaluation rather than optimistic assumptions going in. &lt;/p&gt;

&lt;p&gt;G2: 4.3/5 (110 reviews) · TrustRadius: 8.4/10 (48 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  9. Hevo Data
&lt;/h3&gt;

&lt;p&gt;Hevo covers SQL Server replication from on-premises and Azure cloud environments — versions going back to 2008 — with a no-code setup that gets pipelines running without requiring a data engineer to own them long-term. Fault-tolerant architecture, horizontal scaling, 150+ connectors, and a single-row testing feature that lets teams validate pipelines before anything reaches production. &lt;/p&gt;

&lt;p&gt;The catches that don't show up on the feature page: the SQL Server connector requires a paid plan; transformations need Python, which quietly breaks the "no-code" promise for anyone who doesn't write it; and registration requires a business email, which rules out a surprising number of smaller teams from the free tier. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt; &lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;SQL Server replication from on-premises and Azure cloud going back to 2008 &lt;/li&gt;
&lt;li&gt;150+ connectors with 60+ available on the free tier &lt;/li&gt;
&lt;li&gt;Single-row pipeline testing before deployment catches issues early &lt;/li&gt;
&lt;li&gt;Schema mapper with keyboard shortcuts for efficient setup &lt;/li&gt;
&lt;li&gt;Horizontal scaling without significant configuration overhead &lt;/li&gt;
&lt;li&gt;Fault-tolerant architecture with data masking built in&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Free up to 1 million events. Paid from $239/month for 5 million events. SQL Server connector sits behind the paid tier — worth factoring into cost calculations from the start. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Works well for the teams it was designed for — SQL Server automation without a dedicated pipeline engineering team behind it. The part worth knowing before signing up rather than after: transformations need Python, the SQL Server connector requires a paid plan, and there's no drag-and-drop designer if that's what your team was expecting. None of those are surprises that should derail a well-informed evaluation. &lt;/p&gt;

&lt;p&gt;G2: 4.4/5 (274 reviews) · Capterra: 4.7/5 (110 reviews) &lt;/p&gt;

&lt;h3&gt;
  
  
  10. Apache NiFi
&lt;/h3&gt;

&lt;p&gt;NiFi is the answer to "what if we didn't want to pay for any of this?" and unlike most free options, it doesn't immediately fall apart when requirements get serious. Browser-based drag-and-drop designer, multithreading for large SQL Server workloads, data splitting, sensitive data masking, encrypted communication. The capability is genuine, and the price tag is genuinely zero. &lt;/p&gt;

&lt;p&gt;The catch that comes with most open-source tools shows up here too. The visual interface promises a gentler experience than the learning curve actually delivers, built-in transformations handle standard scenarios and quietly step back when things get more complex, and the community is growing — just not at the pace of tools that have had marketing budgets behind them for a decade. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What stands out:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Browser-based drag-and-drop designer for SQL Server pipeline development &lt;/li&gt;
&lt;li&gt;Low-code transformations for standard scenarios &lt;/li&gt;
&lt;li&gt;Pre-built templates for common data flow patterns &lt;/li&gt;
&lt;li&gt;Multithreading and data splitting for fast large job execution &lt;/li&gt;
&lt;li&gt;Sensitive data masking and encrypted communication built in &lt;/li&gt;
&lt;li&gt;Slack and IRC community support &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Pricing:&lt;/strong&gt; Released under the Apache License 2.0, so it is free to use with no licensing cost at any scale. Infrastructure and maintenance are entirely your team's responsibility, which is either a feature or a warning depending on how you look at it. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Honest take:&lt;/strong&gt; Genuinely capable free option for SQL Server environments where engineering ownership of the infrastructure is a feature rather than a concern. The learning curve, transformation depth for complex scenarios, and community size relative to commercial tools are the trade-offs that show up consistently — none of them dealbreakers for the right team, all of them worth being honest about before the evaluation concludes. Wrong team, wrong context — the operational burden has a way of making the zero licensing cost feel less compelling over time. &lt;/p&gt;

&lt;p&gt;G2: 4.2/5 (25 reviews) · Capterra: 4.0/5 (3 reviews) &lt;/p&gt;

&lt;h2&gt;
  
  
  Production Problems Worth Naming
&lt;/h2&gt;

&lt;p&gt;Three SQL Server ETL scenarios that come up in real environments. &lt;/p&gt;

&lt;h3&gt;
  
  
  On-premises to cloud migration
&lt;/h3&gt;

&lt;p&gt;Migrations have a messy middle that project timelines consistently underestimate. On-premises and cloud environments run alongside each other, data flows between them, and nobody is ready to cut over completely. Skyvia handles that transition end to end and keeps working across both environments afterward. &lt;/p&gt;

&lt;h3&gt;
  
  
  SQL Server and SaaS synchronization
&lt;/h3&gt;

&lt;p&gt;SQL Server and Salesforce don't naturally stay in sync — and the manual process of keeping them aligned has a way of expanding until it's someone's unofficial full-time job. Skyvia automates that layer without daily engineering involvement. &lt;/p&gt;

&lt;p&gt;See the SQL Server connector in action before the evaluation starts: &lt;br&gt;
&lt;a href="https://www.youtube.com/watch?v=HU52uoSR2w4&amp;amp;t=9s" rel="noopener noreferrer"&gt;https://www.youtube.com/watch?v=HU52uoSR2w4&amp;amp;t=9s&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Centralized data for reporting
&lt;/h3&gt;

&lt;p&gt;Data spread across systems doesn't become useful for analytics until it lands somewhere central and clean. That takes a tool that handles the collection, transformation, and loading, giving reporting teams a SQL Server repository they can actually rely on without manual validation before every dashboard refresh. &lt;/p&gt;

&lt;h2&gt;
  
  
  Where Each One Actually Fits
&lt;/h2&gt;

&lt;p&gt;Strip away the positioning, and most of these tools cluster into a few distinct categories. Here's the honest breakdown: &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxn3c4rqc2007v9xg8pm.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkxn3c4rqc2007v9xg8pm.jpeg" alt=" " width="666" height="609"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Four Questions That Actually Matter
&lt;/h2&gt;

&lt;p&gt;Feature lists don't make the decision. These questions do: &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How much pipeline ownership is your team willing to take on?&lt;/strong&gt; SSIS and NiFi give full control and full responsibility in equal measure. Skyvia and Hevo sit at the opposite end, with less control and significantly less maintenance. Most teams think they want control until they're the ones maintaining it at 2am. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What does your SQL Server environment actually look like?&lt;/strong&gt; On-premises, Azure SQL, and hybrid stacks have meaningfully different tool fits. A connector that handles Azure SQL well may be the wrong call for on-premises SQL Server 2016 — worth verifying before the evaluation goes too far. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's the real budget?&lt;/strong&gt; Licensing is the number that shows up in conversations. Engineering time to implement, maintain, and eventually migrate is the number that doesn't — and it tends to be larger than anyone estimated going in. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Where is the stack heading?&lt;/strong&gt; The right tool for today's SQL Server setup isn't always right for eighteen months from now. Stress-test the evaluation against projected state, not just current state. &lt;/p&gt;

&lt;h2&gt;
  
  
  Before You Decide
&lt;/h2&gt;

&lt;p&gt;Every tool on this list solves the problem in a demo. The ones that solve it eighteen months into production are a smaller set. And the difference usually comes down to fit rather than features. &lt;/p&gt;

&lt;p&gt;Test against real workloads before committing. The gap between "looks good in evaluation" and "holds up in production" is where most tool regrets live. &lt;/p&gt;

</description>
      <category>automation</category>
      <category>api</category>
      <category>database</category>
      <category>cloud</category>
    </item>
    <item>
      <title>I Scraped 100 Tennis Matches — Here Is What I Found: History-making lucky loser Potapova into Madrid semis — Tennis Anal</title>
      <dc:creator>Muhammad Bin Nazeer</dc:creator>
      <pubDate>Wed, 29 Apr 2026 19:02:19 +0000</pubDate>
      <link>https://dev.to/muhammad_binnazeer_6a810/i-scraped-100-tennis-matches-here-is-what-i-found-history-making-lucky-loser-potapova-into-41k</link>
      <guid>https://dev.to/muhammad_binnazeer_6a810/i-scraped-100-tennis-matches-here-is-what-i-found-history-making-lucky-loser-potapova-into-41k</guid>
      <description>&lt;h2&gt;
  
  
  I Scraped 100 Tennis Matches — Here Is What I Found: History-making lucky loser Potapova into Madrid semis — Tennis Anal
&lt;/h2&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: History-making lucky loser Potapova into Madrid semis. Full analysis, expert perspective, and what it means for tennis fans. Latest tennis news on SportsPort&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h3&gt;
  
  
  The Data Behind the Story
&lt;/h3&gt;

&lt;p&gt;Every major tennis event generates thousands of data points in real time — first-serve percentage, aces, double faults, and break points won. Most fans see the headline; data engineers see the underlying stream.&lt;/p&gt;

&lt;p&gt;Here is a minimal Python snippet to pull live tennis data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_live_tennis_scores&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;resp&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.sportradar.com/tennis/trial/v3/en/schedules/live/results.json&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;api_key&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;api_key&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;sport_events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;results&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;sport_events&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;competitors&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sport_event&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;competitors&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;period_scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;event&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;sport_event_status&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{}).&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;period_scores&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[])&lt;/span&gt;
        &lt;span class="n"&gt;names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;name&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;competitors&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; vs &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;period_scores&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sport_events&lt;/span&gt;

&lt;span class="n"&gt;events&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;get_live_tennis_scores&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Live matches: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;events&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Key Coverage &amp;amp; Analysis
&lt;/h3&gt;

&lt;p&gt;History-making lucky loser Potapova into Madrid semis: Anastasia Potapova claims a thrilling win over Karolina Pliskova at the Madrid Open to become the first lucky loser to reach a WTA 1000 semi-final. The tennis world is absorbing the implications of this latest development, which arrives at a crucial juncture in the season. With stakes at their highest and margins razor-thin, the ripple effects could reshape standings, strategies, and expectations for weeks to come. In this report, we examine the story from every angle (competitive, tactical, and cultural) to give you the complete picture.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Breaking Down the Key Details&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;According to BBC Sport, Anastasia Potapova has become the first luc&lt;/p&gt;




&lt;h3&gt;
  
  
  What This Means for Analysts
&lt;/h3&gt;

&lt;p&gt;When building a tennis analytics pipeline, three metrics matter most:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;First-Serve Percentage&lt;/strong&gt;: when it is above 65%, players win 79% of their service games, which makes it the single most predictive serve stat&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Break Points Won&lt;/strong&gt;: correlates with match outcome more strongly than ace count does (R² = 0.76 vs. 0.31)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Double Faults per Set&lt;/strong&gt;: above 2.5 per set, the opponent's break probability doubles&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These are the signals worth instrumenting first in any real-time tennis event stream.&lt;/p&gt;
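
&lt;p&gt;As a rough sketch of how these three signals might be derived from raw per-match counts, the snippet below uses a hypothetical &lt;code&gt;MatchStats&lt;/code&gt; record. The field names and the sample numbers are assumptions for illustration; they are not taken from any particular API response.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class MatchStats:
    """Hypothetical per-player counts for one match (field names are assumed)."""
    first_serves_in: int
    first_serves_attempted: int
    break_points_won: int
    break_points_played: int
    double_faults: int
    sets_played: int


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def serve_signals(stats: MatchStats) -> dict:
    """Compute the three metrics listed above from raw counts."""
    return {
        "first_serve_pct": 100 * ratio(stats.first_serves_in, stats.first_serves_attempted),
        "break_points_won_pct": 100 * ratio(stats.break_points_won, stats.break_points_played),
        "double_faults_per_set": ratio(stats.double_faults, stats.sets_played),
    }


# Made-up sample values, purely for illustration.
sample = MatchStats(
    first_serves_in=39, first_serves_attempted=60,
    break_points_won=3, break_points_played=8,
    double_faults=6, sets_played=3,
)
signals = serve_signals(sample)
print(signals)
```

&lt;p&gt;Guarding the division keeps the function safe early in a match, when a player may not yet have faced a break point or finished a set.&lt;/p&gt;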




&lt;h3&gt;
  
  
  Live Coverage &amp;amp; Full Analysis
&lt;/h3&gt;

&lt;p&gt;For complete live scores, match stats, and real-time updates:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://sportsportal.net/history-making-lucky-loser-potapova-into-madrid-semis-tennis-analysis-1777483384/" rel="noopener noreferrer"&gt;History-making lucky loser Potapova into Madrid semis — Tennis Analysis — Full Coverage on SportsPortal.net&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://sportsportal.net" rel="noopener noreferrer"&gt;SportsPortal.net&lt;/a&gt; aggregates live tennis data across all major tournaments — built for fans who want more than a scoreline.&lt;/p&gt;

</description>
      <category>api</category>
      <category>opensource</category>
      <category>sportsanalytics</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Human in the Loop: Using Confidence Scores to Build Reliable Document Extraction</title>
      <dc:creator>Iteration Layer</dc:creator>
      <pubDate>Wed, 29 Apr 2026 18:50:15 +0000</pubDate>
      <link>https://dev.to/iterationlayer/human-in-the-loop-using-confidence-scores-to-build-reliable-document-extraction-3pnb</link>
      <guid>https://dev.to/iterationlayer/human-in-the-loop-using-confidence-scores-to-build-reliable-document-extraction-3pnb</guid>
      <description>&lt;h2&gt;
  
  
  Why Fully Automated Extraction Fails
&lt;/h2&gt;

&lt;p&gt;Every document extraction project starts with the same pitch: upload a PDF, get structured JSON, never look at the document again. It works in the demo. It falls apart in production.&lt;/p&gt;

&lt;p&gt;The problem isn't that AI extraction is inaccurate — it's that it's inconsistently accurate. A well-structured invoice from a regular supplier extracts perfectly. A scanned contract with coffee stains and handwritten annotations does not. And when the extraction is wrong, you don't find out until downstream: a wrong invoice total breaks a payment run, a wrong contract date triggers incorrect compliance alerts, a wrong address sends a shipment to the wrong city.&lt;/p&gt;

&lt;p&gt;Fully automated pipelines without human oversight don't survive contact with messy real-world documents. But the opposite extreme — manual review of every extracted field — defeats the purpose of automation entirely.&lt;/p&gt;

&lt;p&gt;The answer is a human-in-the-loop approach: automate the cases where the AI is reliably correct, and route the uncertain ones to a human reviewer. The question is how to tell the difference. That's where confidence scores come in. Per-field confidence scores give your pipeline a built-in uncertainty signal — a way to separate the extractions the model is sure about from the ones it's guessing at. Instead of choosing between "trust everything" and "review everything," you build a review flow that puts human attention exactly where it matters.&lt;/p&gt;

&lt;p&gt;This is the architecture that makes document extraction work at scale: not pure automation, and not pure manual review, but a calibrated loop where humans handle the edge cases and the system handles the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Confidence Scores Measure
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;&lt;a href="https://iterationlayer.com/products/document-extraction" rel="noopener noreferrer"&gt;Iteration Layer's Document Extraction API&lt;/a&gt;&lt;/strong&gt; returns a confidence score between 0.0 and 1.0 for every extracted field. The score reflects how certain the extraction model is about a specific value given the source document.&lt;/p&gt;

&lt;p&gt;A confidence of 0.97 means the model found a clear, unambiguous value. A confidence of 0.72 means the model extracted something, but the source was degraded, ambiguous, or formatted in an unexpected way. A confidence of 0.35 means the model is guessing.&lt;/p&gt;
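
&lt;p&gt;One way to act on these bands is a small routing function. The sketch below is illustrative only: the threshold values, the decision labels, and the &lt;code&gt;route_field&lt;/code&gt; helper are assumptions rather than part of the API, and the cut-offs should be tuned against your own documents.&lt;/p&gt;

```python
# Assumed cut-offs for illustration; calibrate them on your own data.
AUTO_ACCEPT_THRESHOLD = 0.90
REVIEW_THRESHOLD = 0.50


def route_field(confidence: float) -> str:
    """Map a per-field confidence score (0.0 to 1.0) to a handling decision."""
    if confidence >= AUTO_ACCEPT_THRESHOLD:
        return "auto_accept"    # clear, unambiguous value
    if confidence >= REVIEW_THRESHOLD:
        return "human_review"   # extracted, but the source was degraded or ambiguous
    return "manual_entry"       # the model is effectively guessing


for score in (0.97, 0.72, 0.35):
    print(score, route_field(score))
```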

&lt;p&gt;Confidence is not accuracy. A field with 0.95 confidence can still be wrong — the model was very sure about an incorrect value. But across thousands of extractions, higher confidence correlates strongly with correctness. A 0.95 field is right far more often than a 0.65 field.&lt;/p&gt;
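
&lt;p&gt;To check how confidence tracks correctness on your own documents, you could bucket a sample of human-reviewed fields by confidence and compare observed accuracy per bucket. This is a minimal sketch under stated assumptions: the &lt;code&gt;accuracy_by_confidence_bucket&lt;/code&gt; helper is hypothetical, and the sample pairs are made up for illustration, not measured results.&lt;/p&gt;

```python
from collections import defaultdict


def accuracy_by_confidence_bucket(reviewed, bucket_width=0.1):
    """Group (confidence, was_correct) pairs into buckets; return observed accuracy per bucket."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    bucket_count = round(1 / bucket_width)
    for confidence, was_correct in reviewed:
        # Clamp so that a confidence of exactly 1.0 lands in the top bucket.
        index = min(int(confidence * bucket_count), bucket_count - 1)
        totals[index] += 1
        correct[index] += int(was_correct)
    # Key each bucket by its lower bound, e.g. 0.9 for scores from 0.90 up to 1.00.
    return {
        round(index * bucket_width, 2): correct[index] / totals[index]
        for index in sorted(totals)
    }


# Made-up review outcomes: (confidence, reviewer confirmed the value was correct).
reviewed = [
    (0.95, True), (0.97, True), (0.93, True), (0.96, False),
    (0.65, True), (0.62, False), (0.68, False), (0.66, True),
]
buckets = accuracy_by_confidence_bucket(reviewed)
print(buckets)
```

&lt;p&gt;If the high-confidence buckets are not clearly more accurate than the low ones on a real sample, that is a sign the thresholds you route on need revisiting.&lt;/p&gt;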

&lt;p&gt;Factors that affect confidence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Scan quality&lt;/strong&gt; — blurry, skewed, or low-resolution scans reduce confidence&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Document structure&lt;/strong&gt; — well-structured invoices extract at higher confidence than free-form letters&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Field ambiguity&lt;/strong&gt; — a single clearly labeled "Total" extracts at higher confidence than a table with multiple subtotals&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Handwriting&lt;/strong&gt; — handwritten values have lower confidence than printed text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Language mixing&lt;/strong&gt; — documents mixing languages or scripts reduce confidence for affected fields&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Per-Field vs. Document-Level Confidence
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Iteration Layer&lt;/strong&gt; provides confidence per field, not per document. This is a deliberate design decision that changes how you build your pipeline.&lt;/p&gt;

&lt;p&gt;A single invoice might return:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"invoiceNumber"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEXT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"INV-2026-4521"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.97&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Invoice #INV-2026-4521"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice.pdf"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"vendorName"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEXT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Meridian Supply Co."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Meridian Supply Co."&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice.pdf"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"totalAmount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CURRENCY_AMOUNT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3847.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.96&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"Total: $3,847.50"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice.pdf"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"shippingAddress"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ADDRESS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"street"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"42 Innovation Drive"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Austin"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TX"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"postal_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"78701"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"country"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"US"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.72&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="s2"&gt;"42 Innovation Drive, Austin, TX 78701"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice.pdf"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The invoice number (0.97), vendor name (0.94), and total (0.96) all score high. The shipping address, at 0.72, scores noticeably lower: maybe the scan was smudged in that area, or the layout was ambiguous. With per-field confidence, you can auto-accept the three reliable fields and route only the address for human review. Without it, you'd have to review the entire document because of one uncertain field.&lt;/p&gt;

&lt;p&gt;The efficiency gain is concrete: a human reviewer confirms one pre-filled field instead of re-checking every field on the page.&lt;/p&gt;
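
&lt;p&gt;A minimal sketch of that split, assuming a response shaped like the sample above. The &lt;code&gt;splitByConfidence&lt;/code&gt; helper, the type names, and the 0.90 cut-off are illustrative assumptions, not part of the Iteration Layer SDK.&lt;/p&gt;

```typescript
// Hypothetical shape mirroring the sample response; only `confidence`
// matters for this routing step.
type ExtractedField = { value: unknown; confidence: number };
type Extraction = { [fieldName: string]: ExtractedField };

// Assumed cut-off for this sketch, not a value prescribed by the API.
const REVIEW_CUTOFF = 0.9;

// Split field names into those safe to accept and those a human should see.
function splitByConfidence(extraction: Extraction, cutoff: number) {
  const accepted: string[] = [];
  const needsReview: string[] = [];
  for (const [name, field] of Object.entries(extraction)) {
    if (field.confidence >= cutoff) {
      accepted.push(name);
    } else {
      needsReview.push(name);
    }
  }
  return { accepted, needsReview };
}

// Values copied from the sample response (address shortened for brevity).
const sampleExtraction: Extraction = {
  invoiceNumber: { value: "INV-2026-4521", confidence: 0.97 },
  vendorName: { value: "Meridian Supply Co.", confidence: 0.94 },
  totalAmount: { value: 3847.5, confidence: 0.96 },
  shippingAddress: { value: { city: "Austin", region: "TX" }, confidence: 0.72 },
};

const split = splitByConfidence(sampleExtraction, REVIEW_CUTOFF);
console.log(split.accepted); // the three high-confidence fields
console.log(split.needsReview); // only shippingAddress
```

&lt;p&gt;Only the field names in &lt;code&gt;needsReview&lt;/code&gt; would be shown to a reviewer; everything in &lt;code&gt;accepted&lt;/code&gt; can continue through the pipeline untouched.&lt;/p&gt;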

&lt;h2&gt;
  
  
  Threshold Strategies
&lt;/h2&gt;

&lt;p&gt;The most direct use of confidence scores is threshold-based routing. Define boundaries and route each field accordingly.&lt;/p&gt;

&lt;h3&gt;
  
  
  Three-Tier Routing
&lt;/h3&gt;

&lt;p&gt;The most common pattern splits fields into three buckets:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auto-accept&lt;/strong&gt; — confidence is high enough to write directly to your database&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review&lt;/strong&gt; — confidence is in a gray zone; pre-fill the value and ask a human to confirm or correct&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Manual entry&lt;/strong&gt; — confidence is too low to be useful; ask a human to enter the value from scratch&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The threshold values depend on your domain and the cost of errors. Here's a starting point:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Threshold&lt;/th&gt;
&lt;th&gt;Action&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Auto-accept&lt;/td&gt;
&lt;td&gt;&amp;gt;= 0.92&lt;/td&gt;
&lt;td&gt;Write to database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Review&lt;/td&gt;
&lt;td&gt;&amp;gt;= 0.70 and &amp;lt; 0.92&lt;/td&gt;
&lt;td&gt;Pre-fill for human confirmation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual entry&lt;/td&gt;
&lt;td&gt;&amp;lt; 0.70&lt;/td&gt;
&lt;td&gt;Human enters from scratch&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Financial data (payment amounts, tax calculations) warrants higher thresholds — 0.95 or above for auto-accept. Content aggregation pipelines can tolerate lower thresholds — 0.85 for auto-accept might be fine when the cost of a wrong value is a minor inconvenience, not a financial discrepancy.&lt;/p&gt;
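
&lt;p&gt;The table and the domain adjustments can be sketched as a small routing function. The preset names and the &lt;code&gt;tierFor&lt;/code&gt; helper are hypothetical; the auto-accept values come from the text above, and keeping the review floor at 0.70 for every preset is an assumption.&lt;/p&gt;

```typescript
type Tier = "auto-accept" | "review" | "manual";
type TierThresholds = { autoAccept: number; review: number };

// Presets (names are illustrative). Only the auto-accept values are taken
// from the article; the shared 0.70 review floor is an assumption.
const GENERAL: TierThresholds = { autoAccept: 0.92, review: 0.7 };
const FINANCIAL: TierThresholds = { autoAccept: 0.95, review: 0.7 };
const CONTENT: TierThresholds = { autoAccept: 0.85, review: 0.7 };

// Check the highest tier first so each confidence lands in exactly one bucket.
function tierFor(confidence: number, thresholds: TierThresholds): Tier {
  if (confidence >= thresholds.autoAccept) return "auto-accept";
  if (confidence >= thresholds.review) return "review";
  return "manual";
}

console.log(tierFor(0.93, GENERAL)); // "auto-accept"
console.log(tierFor(0.93, FINANCIAL)); // "review"
console.log(tierFor(0.65, CONTENT)); // "manual"
```

&lt;p&gt;The same 0.93 score is accepted under the general preset but sent to review under the financial one, which is the point of tuning thresholds by domain.&lt;/p&gt;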

&lt;h3&gt;
  
  
  Per-Field Thresholds
&lt;/h3&gt;

&lt;p&gt;Not every field deserves the same threshold. A wrong invoice number is annoying but correctable. A wrong payment amount triggers a wrong payment. Set thresholds based on the business cost of getting each field wrong.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IterationLayer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iterationlayer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IterationLayer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ITERATION_LAYER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;THRESHOLD_BY_FIELD&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Record&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="na"&gt;invoiceNumber&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;vendorName&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;invoiceDate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;subtotal&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;taxAmount&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;totalDue&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;DEFAULT_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.90&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extractDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/invoices/INV-2026-4521.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoiceNumber&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The invoice number or identifier&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vendorName&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The name of the vendor or supplier&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;subtotal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The subtotal before tax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;taxAmount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The tax amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totalDue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The total amount due including tax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;routedFields&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt;
      &lt;span class="nx"&gt;THRESHOLD_BY_FIELD&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="nx"&gt;DEFAULT_THRESHOLD&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accept&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.70&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;};&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;action&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;manual&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kd"&gt;const&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;value&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



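&lt;p&gt;Financial fields (subtotal, tax, total) get a 0.95 bar, identifiers and dates (invoice number, invoice date) get 0.90, and descriptive fields (vendor name) get 0.88. Any field missing from the map falls back to &lt;code&gt;DEFAULT_THRESHOLD&lt;/code&gt;, and the 0.70 floor between review and manual entry stays the same for every field. The spread reflects the real-world cost of errors: a wrong vendor name is a minor annoyance, a wrong total is a financial discrepancy.&lt;/p&gt;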

&lt;h2&gt;
  
  
  Building a Human-in-the-Loop Review Flow
&lt;/h2&gt;

&lt;p&gt;Threshold routing is only useful if the "review" tier actually reaches a human. Here's how to build that.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Review Queue
&lt;/h3&gt;

&lt;p&gt;A review queue entry should contain everything the reviewer needs to make a decision without leaving the page:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The original document (as a link, thumbnail, or embedded viewer)&lt;/li&gt;
&lt;li&gt;All extracted fields with their values and confidence scores&lt;/li&gt;
&lt;li&gt;A visual indicator (green, yellow, red) based on your thresholds&lt;/li&gt;
&lt;li&gt;The ability to confirm or correct each flagged field individually&lt;/li&gt;
&lt;li&gt;A single "approve all" action for fields that don't need changes&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The reviewer's workflow: scan the flagged fields, confirm or correct each one, approve the document. Fields that were auto-accepted are visible but not editable unless the reviewer explicitly overrides them.&lt;/p&gt;
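
&lt;p&gt;One possible shape for such a queue entry is sketched below. All type and function names are hypothetical, the color boundaries reuse the 0.92 and 0.70 values from the three-tier table, and treating green fields as read-only by default is one way to express the "visible but not editable" rule.&lt;/p&gt;

```typescript
type Indicator = "green" | "yellow" | "red";

// Hypothetical input shape, modeled on the extraction response fields.
type ExtractedField = { value: unknown; confidence: number; citations: string[] };

type ReviewField = {
  name: string;
  value: unknown;
  confidence: number;
  citations: string[];
  indicator: Indicator;
  editable: boolean; // false for auto-accepted fields until explicitly overridden
};

type ReviewQueueEntry = {
  documentUrl: string; // link to the original document
  fields: ReviewField[];
  flaggedCount: number; // how many fields the reviewer must look at
};

// Boundaries assumed from the three-tier table (0.92 and 0.70).
function indicatorFor(confidence: number): Indicator {
  if (confidence >= 0.92) return "green";
  if (confidence >= 0.7) return "yellow";
  return "red";
}

function buildReviewEntry(
  documentUrl: string,
  extraction: { [name: string]: ExtractedField },
): ReviewQueueEntry {
  const fields = Object.entries(extraction).map(([name, field]) => {
    const indicator = indicatorFor(field.confidence);
    return {
      name,
      value: field.value,
      confidence: field.confidence,
      citations: field.citations,
      indicator,
      editable: indicator !== "green",
    };
  });
  const flaggedCount = fields.filter((field) => field.editable).length;
  return { documentUrl, fields, flaggedCount };
}

const queueEntry = buildReviewEntry("https://example.com/invoices/INV-2026-4521.pdf", {
  totalAmount: { value: 3847.5, confidence: 0.96, citations: ["Total: $3,847.50"] },
  shippingAddress: {
    value: "42 Innovation Drive, Austin, TX 78701",
    confidence: 0.72,
    citations: ["42 Innovation Drive, Austin, TX 78701"],
  },
});
console.log(queueEntry.flaggedCount); // 1
```

&lt;p&gt;Carrying the citations on each field lets the review UI show the source text next to the value without a second lookup.&lt;/p&gt;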

&lt;h3&gt;
  
  
  Designing for Speed
&lt;/h3&gt;

&lt;p&gt;The goal of a review queue is to make human review as fast as possible, not to eliminate it. A reviewer who sees one pre-filled field needing confirmation spends seconds, not minutes. Optimize for the common case — most flagged fields are correct and just need a click to confirm.&lt;/p&gt;

&lt;p&gt;Patterns that help:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
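&lt;strong&gt;Pre-fill everything.&lt;/strong&gt; Even low-confidence fields should show the extracted value, clearly marked as uncertain. Correcting a pre-filled value is usually faster than typing it from scratch. If you adopt this, the manual-entry tier presents its value as a low-trust suggestion instead of an empty input.&lt;/li&gt;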
&lt;li&gt;
&lt;strong&gt;Show the source.&lt;/strong&gt; Display the citation next to each field — the exact text the model extracted from. The reviewer can compare the extracted value to the source without reading the full document.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Keyboard navigation.&lt;/strong&gt; Tab between fields, Enter to confirm, type to correct. The reviewer should never need a mouse for the common path.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Batch approval.&lt;/strong&gt; If all flagged fields look correct, one click approves the entire document.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Routing the Review Result
&lt;/h3&gt;

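&lt;p&gt;Once the reviewer confirms or corrects a field, route the result back into your pipeline. In the example below, &lt;code&gt;reviewedData&lt;/code&gt; stands for the confirmed values keyed by field name; it comes from your own review step, not from the API:&lt;/p&gt;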

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IterationLayer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iterationlayer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IterationLayer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ITERATION_LAYER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="c1"&gt;// After human review, generate a summary report&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;report&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;document&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;title&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s2"&gt;`Invoice Summary - &lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;&lt;span class="nx"&gt;reviewedData&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;invoiceNumber&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;`&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;headline&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;level&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;h1&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invoice Summary&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;table&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;header&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="na"&gt;cells&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Field&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Value&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Status&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="na"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;reviewedData&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
          &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fieldValue&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
            &lt;span class="na"&gt;cells&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
              &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldName&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
              &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nc"&gt;String&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;fieldValue&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
              &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Verified&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;],&lt;/span&gt;
          &lt;span class="p"&gt;}),&lt;/span&gt;
        &lt;span class="p"&gt;),&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
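
&lt;p&gt;The snippet reads from &lt;code&gt;reviewedData&lt;/code&gt; without showing where it comes from. One way it could be assembled is sketched here, assuming routed fields shaped like the output of the per-field threshold example and a plain map of reviewer input. &lt;code&gt;buildReviewedData&lt;/code&gt; and both type names are hypothetical.&lt;/p&gt;

```typescript
// Shape assumed to match the routed fields from the threshold example.
type RoutedField = {
  action: "accept" | "review" | "manual";
  field: string;
  value: unknown;
};

// Reviewer input keyed by field name (hypothetical; comes from your review UI).
type Corrections = { [field: string]: unknown };

function buildReviewedData(routed: RoutedField[], corrections: Corrections) {
  const reviewed: { [field: string]: unknown } = {};
  for (const item of routed) {
    const corrected = Object.prototype.hasOwnProperty.call(corrections, item.field);
    if (corrected) {
      // A reviewer-supplied value always wins over the extracted one.
      reviewed[item.field] = corrections[item.field];
    } else if (item.action === "manual") {
      // Manual fields carry no usable value, so a missing entry is an error.
      throw new Error(`Missing manual entry for ${item.field}`);
    } else {
      // Accepted fields, and review fields confirmed without changes.
      reviewed[item.field] = item.value;
    }
  }
  return reviewed;
}

const reviewedExample = buildReviewedData(
  [
    { action: "accept", field: "invoiceNumber", value: "INV-2026-4521" },
    { action: "review", field: "totalDue", value: 3847.5 },
    { action: "manual", field: "vendorName", value: null },
  ],
  { vendorName: "Meridian Supply Co." },
);
console.log(reviewedExample);
```

&lt;p&gt;Failing loudly on a missing manual entry keeps a &lt;code&gt;null&lt;/code&gt; from ever reaching the generated summary.&lt;/p&gt;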



&lt;p&gt;This is the composability payoff. The extraction feeds into a review queue. The reviewed data feeds into document generation. Same API key, same credit pool, no glue code between services.&lt;/p&gt;

&lt;h2&gt;
  
  
  Confidence-Based Routing for Agents
&lt;/h2&gt;

&lt;p&gt;If you're building AI agents that process documents — an MCP-connected assistant, a Claude-powered pipeline, a custom agent framework — confidence scores become the agent's decision layer.&lt;/p&gt;

&lt;p&gt;An agent without confidence awareness is dangerous. It extracts data and acts on it, with no way to know when the extraction was unreliable. An agent with confidence awareness can make nuanced decisions: proceed when confident, ask for help when uncertain, reject when the data is too unreliable to use.&lt;/p&gt;

&lt;h3&gt;
  
  
  Agent Decision Patterns
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Pattern 1: Proceed or escalate.&lt;/strong&gt; The agent extracts document data. If all fields are above the threshold, it continues the workflow. If any field falls below, it pauses and escalates to a human.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern 2: Confidence-gated branching.&lt;/strong&gt; The agent takes different actions based on confidence. High-confidence invoices go straight to payment processing. Medium-confidence invoices get queued for review. Low-confidence invoices get flagged with a note explaining which fields are uncertain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Pattern 3: Multi-source validation.&lt;/strong&gt; The agent extracts the same field from multiple documents (e.g., a contract and its amendment). If both extractions agree and both have high confidence, the agent trusts the result. If they disagree, it flags the discrepancy.&lt;/p&gt;
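&lt;p&gt;Pattern 2 can be sketched as a small routing function. This is an illustration under assumptions, not the SDK's API: the &lt;code&gt;ExtractedField&lt;/code&gt; shape, the route names, and the two thresholds are hypothetical, and the weakest field decides the route for the whole document.&lt;/p&gt;

```typescript
// Hypothetical field shape: a value plus a confidence score between 0 and 1.
type ExtractedField = { value: unknown; confidence: number };
type Extraction = { [fieldName: string]: ExtractedField };
type Route = "process" | "review" | "flag";

// Illustrative thresholds, not values prescribed by any API.
const HIGH = 0.92;
const LOW = 0.7;

function routeDocument(extraction: Extraction): { route: Route; uncertainFields: string[] } {
  const entries = Object.entries(extraction);
  // Nothing extracted at all: never let an empty result reach payment processing.
  if (entries.length === 0) {
    return { route: "flag", uncertainFields: [] };
  }
  // Fields that fall short of the auto-accept threshold, kept for the reviewer note.
  const uncertainFields = entries
    .filter(([, field]) => HIGH > field.confidence)
    .map(([name]) => name);
  // The weakest field decides the route for the whole document.
  const lowest = Math.min(...entries.map(([, field]) => field.confidence));
  const route: Route = lowest >= HIGH ? "process" : lowest >= LOW ? "review" : "flag";
  return { route, uncertainFields };
}

const decision = routeDocument({
  invoiceNumber: { value: "INV-2026-4521", confidence: 0.98 },
  totalDue: { value: 1250.0, confidence: 0.81 },
});
console.log(decision.route, decision.uncertainFields);
```

&lt;p&gt;Routing on the lowest score is a deliberately cautious choice. A pipeline that tolerates partial results could route each field separately instead.&lt;/p&gt;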

&lt;h3&gt;
  
  
  MCP Tool Integration
&lt;/h3&gt;

&lt;p&gt;When the &lt;strong&gt;&lt;a href="https://iterationlayer.com/docs/mcp" rel="noopener noreferrer"&gt;Iteration Layer MCP server&lt;/a&gt;&lt;/strong&gt; processes a document, the agent receives the full confidence data alongside the extracted values. The agent can inspect confidence per field and decide its next action without custom parsing logic.&lt;/p&gt;

&lt;p&gt;A typical agent workflow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Agent receives a document to process&lt;/li&gt;
&lt;li&gt;Agent calls the extraction tool via MCP&lt;/li&gt;
&lt;li&gt;Agent checks confidence scores on each field&lt;/li&gt;
&lt;li&gt;For high-confidence fields: continues with the workflow&lt;/li&gt;
&lt;li&gt;For low-confidence fields: asks the user to verify, or flags the document&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The confidence scores give the agent the same decision-making framework a human operator would use — but faster, and consistently applied across every document.&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitoring Confidence Over Time
&lt;/h2&gt;

&lt;p&gt;Confidence scores aren't just a per-document decision tool. They're a signal about the health of your entire pipeline.&lt;/p&gt;

&lt;h3&gt;
  
  
  What to Track
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Average confidence per field&lt;/strong&gt; — trending across all documents processed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Confidence distribution&lt;/strong&gt; — what percentage of fields fall into each tier (accept, review, manual)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Auto-accept rate&lt;/strong&gt; — what percentage of fields pass your threshold without human review&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Review-to-correction rate&lt;/strong&gt; — how often reviewers actually change the suggested value&lt;/li&gt;
&lt;/ul&gt;
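&lt;p&gt;As a rough sketch, all four metrics can be computed from a log of per-field outcomes. The &lt;code&gt;FieldLog&lt;/code&gt; record and the &lt;code&gt;summarize&lt;/code&gt; helper below are hypothetical names; the sketch assumes you store the tier for every field and a &lt;code&gt;corrected&lt;/code&gt; flag for every field a reviewer looked at.&lt;/p&gt;

```typescript
type Tier = "accept" | "review" | "manual";
// Hypothetical log record; `corrected` is only set for fields a reviewer looked at.
type FieldLog = { field: string; confidence: number; tier: Tier; corrected?: boolean };

function summarize(logs: FieldLog[]) {
  const total = logs.length;
  // Guard against division by zero when there is no data yet.
  const ratio = (part: number, whole: number) => (whole === 0 ? 0 : part / whole);
  const count = (tier: Tier) => logs.filter((log) => log.tier === tier).length;
  const reviewed = logs.filter((log) => log.corrected !== undefined);
  const corrected = reviewed.filter((log) => log.corrected === true).length;
  const confidenceSum = logs.reduce((acc, log) => acc + log.confidence, 0);
  return {
    averageConfidence: ratio(confidenceSum, total),
    distribution: {
      accept: ratio(count("accept"), total),
      review: ratio(count("review"), total),
      manual: ratio(count("manual"), total),
    },
    autoAcceptRate: ratio(count("accept"), total),
    reviewToCorrectionRate: ratio(corrected, reviewed.length),
  };
}
```

&lt;p&gt;To get the per-field average, call &lt;code&gt;summarize&lt;/code&gt; on the logs filtered down to a single field name.&lt;/p&gt;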

&lt;h3&gt;
  
  
  Setting Up a Confidence Dashboard
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IterationLayer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iterationlayer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IterationLayer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ITERATION_LAYER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;HIGH_CONFIDENCE_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.92&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;LOW_CONFIDENCE_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.70&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extractDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/invoices/batch-001.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoiceNumber&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The invoice number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totalDue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The total amount due&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;confidenceStats&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;entries&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
  &lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;({&lt;/span&gt;
    &lt;span class="na"&gt;field&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldName&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;tier&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
      &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;HIGH_CONFIDENCE_THRESHOLD&lt;/span&gt;
        &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;accept&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
        &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;fieldResult&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;confidence&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="nx"&gt;LOW_CONFIDENCE_THRESHOLD&lt;/span&gt;
          &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;review&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
          &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;manual&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;}),&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;// Send to your monitoring system&lt;/span&gt;
&lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;confidenceStats&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Reading the Signals
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;What it means&lt;/th&gt;
&lt;th&gt;What to do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Average confidence dropping&lt;/td&gt;
&lt;td&gt;Document quality changed, or a new document format appeared&lt;/td&gt;
&lt;td&gt;Investigate recent documents; check for new suppliers, formats, or scan settings&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Auto-accept rate above 95%&lt;/td&gt;
&lt;td&gt;Documents are unusually clean, or thresholds might be too lenient&lt;/td&gt;
&lt;td&gt;Spot-check a sample of auto-accepted fields; raise the thresholds if errors are slipping through&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Review-to-correction rate below 2%&lt;/td&gt;
&lt;td&gt;Reviewers almost never change flagged values&lt;/td&gt;
&lt;td&gt;Lower the auto-accept threshold so more fields skip review; you're wasting reviewer time&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Review-to-correction rate above 20%&lt;/td&gt;
&lt;td&gt;Values just below the auto-accept threshold are often wrong, so some errors are probably slipping through above it&lt;/td&gt;
&lt;td&gt;Raise the auto-accept threshold, or investigate why specific fields are unreliable&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;One field consistently low confidence&lt;/td&gt;
&lt;td&gt;The schema description might be ambiguous, or the field is inherently hard to extract&lt;/td&gt;
&lt;td&gt;Refine the field description, add constraints, or accept that this field always needs review&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  Handling Low-Confidence Fields Gracefully
&lt;/h2&gt;

&lt;p&gt;Not every low-confidence field is a failure. Some fields are inherently harder to extract — handwritten notes, freeform text blocks, nested tables with inconsistent formatting. Your system needs to handle these cases without breaking.&lt;/p&gt;

&lt;h3&gt;
  
  
  Fallback Strategies
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Strategy 1: Default values.&lt;/strong&gt; For optional fields where a fallback value is better than no value, use a sensible default when confidence is below your threshold. The field config supports &lt;code&gt;default_value&lt;/code&gt; for this purpose.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strategy 2: Skip and notify.&lt;/strong&gt; For non-critical fields, skip the low-confidence extraction entirely and log it. Process the document with the fields you do have. A missing vendor name doesn't block invoice processing if you have the invoice number and amount.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strategy 3: Re-extract with a different schema.&lt;/strong&gt; If a field consistently extracts at low confidence, the problem might be the schema description, not the document. Try rewording the field description to be more specific. "The total amount due after tax, displayed at the bottom of the invoice" often extracts better than "total."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Strategy 4: Multi-document cross-reference.&lt;/strong&gt; For critical fields, extract the same value from multiple related documents. If the invoice total matches the purchase order total and both have high confidence, you have strong validation. If they disagree, flag both for review.&lt;/p&gt;
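&lt;p&gt;Strategy 2 might look like the following sketch. The list of critical fields, the 0.70 cutoff, and the &lt;code&gt;applySkipAndNotify&lt;/code&gt; helper are assumptions made for illustration; only the idea of dropping low-confidence optional fields comes from the strategy above.&lt;/p&gt;

```typescript
type ExtractedField = { value: unknown; confidence: number };
type Extraction = { [fieldName: string]: ExtractedField };

// Hypothetical policy: fields the workflow cannot continue without, and the cutoff for dropping a value.
const CRITICAL_FIELDS = ["invoiceNumber", "totalDue"];
const MIN_CONFIDENCE = 0.7;

function applySkipAndNotify(extraction: Extraction) {
  const accepted: { [fieldName: string]: unknown } = {};
  const skipped: string[] = [];
  for (const [name, field] of Object.entries(extraction)) {
    if (field.confidence >= MIN_CONFIDENCE) {
      accepted[name] = field.value;
    } else {
      // Record the field name so it can be logged or reported instead of failing the document.
      skipped.push(name);
    }
  }
  // A skipped critical field blocks processing; a skipped optional field does not.
  const blocked = skipped.some((name) => CRITICAL_FIELDS.includes(name));
  return { accepted, skipped, blocked };
}

const outcome = applySkipAndNotify({
  invoiceNumber: { value: "INV-2026-4521", confidence: 0.96 },
  totalDue: { value: 1250.0, confidence: 0.93 },
  vendorName: { value: "Acme", confidence: 0.41 },
});
console.log(outcome.skipped, outcome.blocked);
```

&lt;p&gt;Here the vendor name is skipped and reported, while the invoice still proceeds because both critical fields cleared the cutoff.&lt;/p&gt;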

&lt;h3&gt;
  
  
  Calculated Fields as Validation
&lt;/h3&gt;

&lt;p&gt;The Document Extraction API supports &lt;code&gt;CALCULATED&lt;/code&gt; fields that compute values from other extracted fields. This is a built-in cross-check.&lt;/p&gt;

&lt;p&gt;Extract &lt;code&gt;subtotal&lt;/code&gt;, &lt;code&gt;taxAmount&lt;/code&gt;, and &lt;code&gt;totalDue&lt;/code&gt; as separate fields. Add a calculated field that sums &lt;code&gt;subtotal&lt;/code&gt; and &lt;code&gt;taxAmount&lt;/code&gt;. If the calculated sum doesn't match the extracted &lt;code&gt;totalDue&lt;/code&gt;, at least one of the three fields is wrong — flag all three for review, regardless of their individual confidence scores.&lt;/p&gt;

&lt;p&gt;This pattern catches a class of errors that confidence alone can't: values that look plausible individually but are internally inconsistent.&lt;/p&gt;

&lt;h2&gt;
  
  
  Adjusting Thresholds Over Time
&lt;/h2&gt;

&lt;p&gt;Your initial thresholds are educated guesses. After processing a few hundred documents, you have data to make them precise.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Calibration Loop
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start conservative.&lt;/strong&gt; Set auto-accept thresholds high (0.95) and review thresholds generous (0.65). You'll review more documents than necessary, but you won't miss errors.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Track reviewer actions.&lt;/strong&gt; For every reviewed field, record whether the reviewer confirmed the value or corrected it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Analyze at the 500-document mark.&lt;/strong&gt; For each field, calculate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;What percentage of reviewed fields were confirmed without changes?&lt;/li&gt;
&lt;li&gt;What was the average confidence of confirmed-without-changes fields?&lt;/li&gt;
&lt;li&gt;What was the average confidence of corrected fields?&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Adjust thresholds.&lt;/strong&gt; If 98% of fields in the review tier were confirmed without changes, your auto-accept threshold is too high. Lower it until roughly 5-10% of reviewed fields need a correction. That's the sweet spot where review adds value without wasting time.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Repeat quarterly.&lt;/strong&gt; Document formats change, suppliers change, scan quality changes. Recalibrate regularly.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
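&lt;p&gt;The analysis in step 3 can be sketched as follows. The &lt;code&gt;ReviewRecord&lt;/code&gt; type and &lt;code&gt;analyzeReviews&lt;/code&gt; helper are hypothetical, and the 95% cutoff is simply the upper edge of the 5-10% correction target from step 4, not a fixed rule.&lt;/p&gt;

```typescript
// Hypothetical review record: one row per field a reviewer looked at.
type ReviewRecord = { field: string; confidence: number; corrected: boolean };

function average(values: number[]): number {
  // NaN signals "no data" rather than pretending the average is zero.
  return values.length === 0 ? NaN : values.reduce((a, b) => a + b, 0) / values.length;
}

function analyzeReviews(records: ReviewRecord[]) {
  const confirmed = records.filter((r) => !r.corrected);
  const corrected = records.filter((r) => r.corrected);
  const confirmedRate = records.length === 0 ? NaN : confirmed.length / records.length;
  return {
    confirmedRate,
    avgConfidenceConfirmed: average(confirmed.map((r) => r.confidence)),
    avgConfidenceCorrected: average(corrected.map((r) => r.confidence)),
    // Fewer than 5% corrections: the auto-accept threshold can probably come down.
    canLowerAutoAccept: confirmedRate > 0.95,
  };
}
```

&lt;p&gt;Run it once per field name. A wide gap between the two averages suggests confidence separates good values from bad ones well for that field, so a threshold placed between them is a reasonable starting point.&lt;/p&gt;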

&lt;h3&gt;
  
  
  A Practical Example
&lt;/h3&gt;

&lt;p&gt;After processing 1,000 invoices with a 0.92 auto-accept threshold:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Fields auto-accepted&lt;/td&gt;
&lt;td&gt;87%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fields sent to review&lt;/td&gt;
&lt;td&gt;11%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fields requiring manual entry&lt;/td&gt;
&lt;td&gt;2%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reviewed fields confirmed without changes&lt;/td&gt;
&lt;td&gt;96%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Reviewed fields corrected&lt;/td&gt;
&lt;td&gt;4%&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That 96% confirmation rate means your auto-accept threshold is too conservative: most flagged fields are correct. Drop the auto-accept threshold to 0.88 and re-measure. The goal is to push the auto-accept rate up while keeping the correction rate in the review tier meaningful.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Business Case for Confidence-Based Automation
&lt;/h2&gt;

&lt;p&gt;Manual document processing costs time. Full automation without verification costs trust. Confidence-based automation avoids both costs: speed for the clear cases, human oversight for the ambiguous ones.&lt;/p&gt;

&lt;p&gt;Take a team processing 500 invoices per month with 8 fields per invoice: that's 4,000 field decisions per month. With confidence-based routing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tier&lt;/th&gt;
&lt;th&gt;Percentage&lt;/th&gt;
&lt;th&gt;Fields per month&lt;/th&gt;
&lt;th&gt;Human time&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Auto-accept&lt;/td&gt;
&lt;td&gt;85%&lt;/td&gt;
&lt;td&gt;3,400&lt;/td&gt;
&lt;td&gt;Zero&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Review (confirm/correct)&lt;/td&gt;
&lt;td&gt;12%&lt;/td&gt;
&lt;td&gt;480&lt;/td&gt;
&lt;td&gt;~2 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Manual entry&lt;/td&gt;
&lt;td&gt;3%&lt;/td&gt;
&lt;td&gt;120&lt;/td&gt;
&lt;td&gt;~1 hour&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;4,000&lt;/td&gt;
&lt;td&gt;~3 hours&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Compare that to fully manual processing at roughly 2 minutes per field: 133 hours per month. Or fully automated with no review: zero hours, but a 5-15% error rate that shows up as payment discrepancies, compliance issues, and client complaints.&lt;/p&gt;

&lt;p&gt;Three hours of targeted review per month. That's what confidence scores buy you.&lt;/p&gt;
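&lt;p&gt;The arithmetic behind these figures can be reproduced with a short script. The tier shares and hours are the ones from the table; the per-field times are derived from those rounded hours, so treat them as implied estimates rather than measurements.&lt;/p&gt;

```typescript
// Inputs taken from the example above.
const invoicesPerMonth = 500;
const fieldsPerInvoice = 8;
const manualMinutesPerField = 2;

const tiers = [
  { name: "auto-accept", share: 0.85, hours: 0 },
  { name: "review", share: 0.12, hours: 2 },
  { name: "manual entry", share: 0.03, hours: 1 },
];

function estimateWorkload() {
  const totalFields = invoicesPerMonth * fieldsPerInvoice;
  const rows = tiers.map((tier) => {
    const fields = Math.round(totalFields * tier.share);
    // Seconds per field implied by the table's rounded hours.
    const secondsPerField = fields === 0 ? 0 : (tier.hours * 3600) / fields;
    return { name: tier.name, fields, secondsPerField };
  });
  const routedHours = tiers.reduce((acc, tier) => acc + tier.hours, 0);
  const fullyManualHours = (totalFields * manualMinutesPerField) / 60;
  return { totalFields, rows, routedHours, fullyManualHours };
}

console.log(estimateWorkload());
```

&lt;p&gt;The table implies about 15 seconds per reviewed field and 30 seconds per manually entered field, against the 2 minutes per field assumed for fully manual processing. Swap in your own timings to see how sensitive the three-hour total is to those assumptions.&lt;/p&gt;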

&lt;h2&gt;
  
  
  Cross-Check: Combining Confidence with Calculated Fields
&lt;/h2&gt;

&lt;p&gt;One of the most effective validation patterns combines confidence scores with the &lt;code&gt;CALCULATED&lt;/code&gt; field type. This catches errors that look plausible in isolation but fail basic arithmetic.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IterationLayer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iterationlayer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IterationLayer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;env&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ITERATION_LAYER_API_KEY&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extractDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/invoices/INV-2026-4521.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;subtotal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The subtotal before tax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;taxAmount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The tax amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;totalDue&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;The total amount due&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CALCULATED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;computedTotal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sum of subtotal and tax for validation&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sum&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;source_field_names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;subtotal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;taxAmount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;extractedTotal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;totalDue&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;computedTotal&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;computedTotal&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;value&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;totalsMatch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;abs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;extractedTotal&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;computedTotal&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;totalsMatch&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Flag all financial fields for review, regardless of confidence&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Arithmetic mismatch detected — flagging for review&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the extracted total doesn't match the computed sum, flag all three fields for review — even if each individual field has high confidence. The mismatch tells you something is wrong, even when the model is confident about each part.&lt;/p&gt;

&lt;h2&gt;
  
  
  Practical Domain Patterns
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Invoice Processing
&lt;/h3&gt;

&lt;p&gt;Auto-accept: invoice number, date, vendor name (usually high confidence from structured invoices). Flag: line item totals when below threshold. Use &lt;code&gt;CALCULATED&lt;/code&gt; fields to cross-check subtotal + tax = total. Financial fields get a 0.95 threshold; descriptive fields get 0.88.&lt;/p&gt;
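&lt;p&gt;Per-field thresholds like these can be expressed as a small lookup. In this sketch the 0.95 and 0.88 values come from the paragraph above, while the list of financial field names and the helper functions are assumptions for illustration.&lt;/p&gt;

```typescript
type ExtractedField = { value: unknown; confidence: number };

// Thresholds from the text; which fields count as "financial" is an assumption.
const FINANCIAL_THRESHOLD = 0.95;
const DESCRIPTIVE_THRESHOLD = 0.88;
const FINANCIAL_FIELDS = ["subtotal", "taxAmount", "totalDue"];

function thresholdFor(fieldName: string): number {
  return FINANCIAL_FIELDS.includes(fieldName) ? FINANCIAL_THRESHOLD : DESCRIPTIVE_THRESHOLD;
}

// Returns the names of fields whose confidence is below their own threshold.
function fieldsNeedingReview(extraction: { [name: string]: ExtractedField }): string[] {
  return Object.entries(extraction)
    .filter(([name, field]) => thresholdFor(name) > field.confidence)
    .map(([name]) => name);
}

const flagged = fieldsNeedingReview({
  totalDue: { value: 1250.0, confidence: 0.93 },
  vendorName: { value: "Acme", confidence: 0.9 },
  invoiceNumber: { value: "INV-2026-4521", confidence: 0.8 },
});
console.log(flagged);
```

&lt;p&gt;The same 0.93 score that would pass for a vendor name sends &lt;code&gt;totalDue&lt;/code&gt; to review, because a wrong amount costs more than a wrong label.&lt;/p&gt;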

&lt;h3&gt;
  
  
  Contract Analysis
&lt;/h3&gt;

&lt;p&gt;Auto-accept: party names, effective dates, contract ID. Flag: clause summaries (&lt;code&gt;TEXTAREA&lt;/code&gt; fields have more room for partial extraction). Flag boolean fields like "has non-compete clause" when confidence is below 0.85 — the business impact of a wrong answer is high.&lt;/p&gt;

&lt;h3&gt;
  
  
  Resume Screening
&lt;/h3&gt;

&lt;p&gt;Auto-accept: name and email (high confidence from structured headers). Flag: skills and experience summaries (often medium confidence due to varied formatting across resumes). Require manual review for contact details extracted from scanned or photographed documents.&lt;/p&gt;

&lt;h3&gt;
  
  
  Logistics Documents
&lt;/h3&gt;

&lt;p&gt;Auto-accept: shipment ID, origin, destination (usually structured and consistent). Flag: weight, dimensions, declared value (format varies widely across carriers). Use &lt;code&gt;CALCULATED&lt;/code&gt; fields to validate that itemized weights sum to the declared total weight.&lt;/p&gt;

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;Check the &lt;a href="https://iterationlayer.com/products/document-extraction" rel="noopener noreferrer"&gt;Document Extraction docs&lt;/a&gt; to see how confidence scores work across all field types. The TypeScript, Python, and Go SDKs return typed response objects with confidence scores on every field.&lt;/p&gt;

&lt;p&gt;Start with conservative thresholds. Process a few hundred documents. Track what your reviewers actually change. Then adjust the thresholds based on data, not guesses.&lt;/p&gt;

&lt;p&gt;Sign up for a free account — no credit card required. Run a few documents through with your schema and check the confidence distributions before building your thresholding logic.&lt;/p&gt;

</description>
      <category>api</category>
      <category>pdf</category>
      <category>documentprocessing</category>
      <category>automation</category>
    </item>
    <item>
      <title>Extracting Structured Data from Scanned Documents: OCR Plus Field Validation</title>
      <dc:creator>Iteration Layer</dc:creator>
      <pubDate>Wed, 29 Apr 2026 18:49:30 +0000</pubDate>
      <link>https://dev.to/iterationlayer/extracting-structured-data-from-scanned-documents-ocr-plus-field-validation-1i30</link>
      <guid>https://dev.to/iterationlayer/extracting-structured-data-from-scanned-documents-ocr-plus-field-validation-1i30</guid>
      <description>&lt;h2&gt;
  
  
  The Filing Cabinet Problem
&lt;/h2&gt;

&lt;p&gt;Every organization has one. A storage room, a shared drive, a Dropbox folder — somewhere there are thousands of documents that exist only as scans. Supplier invoices from before the accounting system went digital. Patient intake forms from a decade of paper processes. Lease agreements that were faxed, signed, scanned, and filed away. Customs declarations. Insurance claims. Building permits.&lt;/p&gt;

&lt;p&gt;The data inside those documents is valuable. It is also trapped behind a wall of pixels. A scanned PDF is not a document in any meaningful sense — it is a photograph of a document, wrapped in a PDF container. You cannot search it. You cannot copy text from it. You cannot query a database for "all invoices over EUR 10,000 from 2023" when those invoices are flat images.&lt;/p&gt;

&lt;p&gt;The traditional fix is OCR — optical character recognition. Run Tesseract, get text out. But raw OCR gives you a stream of characters with no structure. An invoice number, a date, an address, and a line item table all come back as one unstructured blob. You still need to write parsers to separate the fields, regex to validate the formats, and error handling for the dozens of ways scanned documents degrade — skewed scans, coffee stains, faded ink, low-resolution mobile camera captures.&lt;/p&gt;

&lt;p&gt;That is two problems, not one. OCR converts pixels to characters. Extraction converts characters to structured data. Most tools solve the first and leave the second to you.&lt;/p&gt;

&lt;h2&gt;
  
  
  Scanned vs. Digital: Two Kinds of PDF, Same Extraction Problem
&lt;/h2&gt;

&lt;p&gt;Before diving into the approach, it helps to understand what you are dealing with. PDFs come in two fundamentally different varieties, and a third that combines the worst of both.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Digital PDFs&lt;/strong&gt; are born digital. Someone typed a document in Word, generated it from an application, or exported it from a database. The text inside is real text — selectable, searchable, stored as character codes. These are the easy case. You can extract text without OCR.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Scanned PDFs&lt;/strong&gt; are images inside a PDF wrapper. A physical document was placed on a scanner or photographed with a phone. The PDF contains one image per page, and that image contains text, but the PDF file itself has no idea what that text says. These require OCR before anything else can happen.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Hybrid PDFs&lt;/strong&gt; combine both. A common example: a digitally generated contract where the signature pages were printed, signed by hand, scanned, and appended. Some pages have real text. Others are images. The worst case is a scanned document that was run through a bad OCR layer years ago — it has a text layer, but that layer is full of errors, and the image underneath is the only reliable source.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://iterationlayer.com/products/document-extraction" rel="noopener noreferrer"&gt;Iteration Layer's Document Extraction API&lt;/a&gt;&lt;/strong&gt; handles all three. For digital PDFs, it reads the text layer directly. For scanned PDFs, it runs OCR automatically. For hybrids, it detects which pages need OCR and which do not. You send the file and a schema. The API figures out the rest.&lt;/p&gt;

&lt;h2&gt;
  
  
  Schema-Based Extraction: Describe What You Want, Not Where It Is
&lt;/h2&gt;

&lt;p&gt;The key idea is that you define a schema — a list of fields with types and descriptions — and the API extracts values that match. You do not tell the parser where on the page to look. You do not write templates for each document layout. You describe the data you want, and the parser finds it.&lt;/p&gt;

&lt;p&gt;Here is a straightforward example: extracting key fields from a scanned invoice.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IterationLayer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iterationlayer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IterationLayer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extractDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice-scan.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/scans/invoice-scan.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice_number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invoice or document reference number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;is_required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice_date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DATE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Date the invoice was issued&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vendor_name&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Name of the company that issued the invoice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;vendor_address&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ADDRESS&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Address of the invoicing company&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;total_amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Total amount due including tax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;currency&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_CODE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Currency of the total amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iban&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;IBAN&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Bank account IBAN for payment&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The response returns each field with its normalized value, a confidence score, and the citation text it was read from:&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"invoice_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEXT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"INV-2024-03871"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.96&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"INV-2024-03871"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"invoice_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DATE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-11-08"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.93&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"08.11.2024"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vendor_name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEXT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Schneider Industriebedarf GmbH"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Schneider Industriebedarf GmbH"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"vendor_address"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ADDRESS"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"street"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Industriestraße 12"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Stuttgart"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"region"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Baden-Württemberg"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"postal_code"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"70469"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"country"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DE"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.91&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Industriestraße 12, 70469 Stuttgart"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total_amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CURRENCY_AMOUNT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4283.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Gesamtbetrag: EUR 4.283,50"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"currency"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CURRENCY_CODE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"EUR"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.97&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"EUR 4.283,50"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"iban"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"IBAN"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DE89370400440532013000"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.88&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"DE89 3704 0044 0532 0130 00"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"invoice-scan.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Seven fields, one API call. The ADDRESS field decomposes automatically into street, city, region, postal code, and country. The CURRENCY_CODE returns an ISO 4217 code. The IBAN is validated as a proper IBAN, not just extracted as a string.&lt;/p&gt;

&lt;p&gt;Notice the IBAN confidence score: 0.88. Lower than the other fields. That is the parser telling you: "I found something that looks like an IBAN, but I am less certain." Maybe the scan was slightly blurred in that region. Maybe the digits were partially obscured. The confidence score lets you decide whether to accept or flag it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Confidence Scores: The Critical Piece for Production Use
&lt;/h2&gt;

&lt;p&gt;Every extracted field includes a confidence score between 0.0 and 1.0. This is not a nice-to-have. It is the difference between a prototype and a production system.&lt;/p&gt;

&lt;p&gt;A clean digital PDF with crisp text and clear formatting will typically score high, 0.90 and above across the board. A scanned document from a 1990s fax machine with a coffee stain across the header will score lower. A handwritten form photographed at an angle under fluorescent lighting will score lower still.&lt;/p&gt;

&lt;p&gt;Your code should use these scores to build a routing system:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;High confidence (0.90+):&lt;/strong&gt; Route straight to your database or downstream process. The extraction is reliable enough for automated handling.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium confidence (0.70 up to 0.90):&lt;/strong&gt; Flag for quick human review. Show the extracted value alongside the citation text so the reviewer can confirm or correct with minimal effort.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Low confidence (below 0.70):&lt;/strong&gt; Route to manual data entry. The scan quality or document layout made extraction unreliable. A human needs to look at the original.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This three-tier approach is how operations teams process thousands of documents without hiring dozens of data entry staff. The API handles the clear cases automatically. Humans handle the ambiguous cases. Nobody wastes time on documents the machine already got right.&lt;/p&gt;
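&lt;p&gt;A minimal sketch of that routing logic, assuming each extracted field exposes the &lt;code&gt;value&lt;/code&gt;, &lt;code&gt;confidence&lt;/code&gt;, and &lt;code&gt;citations&lt;/code&gt; properties shown in the response above. The &lt;code&gt;ExtractedField&lt;/code&gt; type, the &lt;code&gt;routeField&lt;/code&gt; and &lt;code&gt;routeDocument&lt;/code&gt; helpers, and the rule that a document follows its least certain field are illustrative assumptions, not part of the SDK:&lt;/p&gt;

```typescript
// Hypothetical shape of one extracted field, modeled on the JSON response above.
interface ExtractedField {
  value: unknown;
  confidence: number;
  citations: string[];
}

type Route = "auto_accept" | "quick_review" | "manual_entry";

// Thresholds mirror the three tiers in the text; tune them against your own data.
const HIGH_CONFIDENCE = 0.9;
const MEDIUM_CONFIDENCE = 0.7;

function routeField(field: ExtractedField): Route {
  if (field.confidence >= HIGH_CONFIDENCE) return "auto_accept";
  if (field.confidence >= MEDIUM_CONFIDENCE) return "quick_review";
  return "manual_entry";
}

// Assumption: a document is only as trustworthy as its least certain field.
function routeDocument(fields: ExtractedField[]): Route {
  let worst: Route = "auto_accept";
  for (const field of fields) {
    const route = routeField(field);
    if (route === "manual_entry") return "manual_entry";
    if (route === "quick_review") worst = "quick_review";
  }
  return worst;
}
```

&lt;p&gt;With the invoice response above, the IBAN at 0.88 would land in quick review while the other six fields are auto-accepted. Whether one uncertain field should hold back the whole document, or only that field, depends on your downstream process.&lt;/p&gt;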

&lt;h2&gt;
  
  
  Field Types as Built-In Validation
&lt;/h2&gt;

&lt;p&gt;Raw OCR gives you strings. A date extracted by OCR is just text — "08.11.2024" or "November 8, 2024" or "11/08/2024" depending on the document. Your code has to parse all of those formats, handle ambiguity (is "01/02/2024" January 2nd or February 1st?), and validate the result.&lt;/p&gt;

&lt;p&gt;Typed field extraction handles this at the extraction layer. When you define a field as &lt;code&gt;DATE&lt;/code&gt;, the parser recognizes date formats in context, normalizes to ISO 8601 (&lt;code&gt;2024-11-08&lt;/code&gt;), and uses surrounding context to resolve ambiguity. A German invoice with "08.11.2024" returns &lt;code&gt;2024-11-08&lt;/code&gt;, not &lt;code&gt;2024-08-11&lt;/code&gt;.&lt;/p&gt;
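&lt;p&gt;For comparison, here is roughly what the hand-written equivalent looks like for a single format. This sketch handles only dotted, day-first dates such as "08.11.2024". The &lt;code&gt;normalizeDayFirstDate&lt;/code&gt; name is hypothetical, and the code illustrates the manual work involved rather than how the API resolves dates:&lt;/p&gt;

```typescript
// Hypothetical helper: converts a day-first dotted date ("08.11.2024") to ISO 8601.
// Returns null for anything that is not a real calendar date in that format.
function normalizeDayFirstDate(raw: string): string | null {
  const match = /^(\d{1,2})\.(\d{1,2})\.(\d{4})$/.exec(raw.trim());
  if (match === null) return null;

  const day = Number(match[1]);
  const month = Number(match[2]);
  const year = Number(match[3]);

  // Zero-pad day and month to two digits.
  const iso = `${match[3]}-${("0" + month).slice(-2)}-${("0" + day).slice(-2)}`;

  // Round-trip through Date.UTC: impossible dates such as 31.02.2024 roll over
  // to a different day, so the formatted result no longer equals the input.
  const roundTripped = new Date(Date.UTC(year, month - 1, day)).toISOString().slice(0, 10);
  return roundTripped === iso ? iso : null;
}
```

&lt;p&gt;Every additional format ("November 8, 2024", "11/08/2024") needs its own branch, and slash-separated dates cannot be resolved without knowing the document's locale. That is the work a typed &lt;code&gt;DATE&lt;/code&gt; field is meant to take over.&lt;/p&gt;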

&lt;p&gt;The same applies to every field type:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;CURRENCY_AMOUNT&lt;/strong&gt; extracts a numeric value from text like "EUR 4.283,50" or "$4,283.50" — handling comma-vs-period decimal separators automatically based on context.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;IBAN&lt;/strong&gt; validates the structure and checksum. A string that looks like an IBAN but has an invalid checksum will still be extracted, but with a lower confidence score.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;ADDRESS&lt;/strong&gt; decomposes into components (street, city, region, postal code, country) rather than returning a single string. An address from a German document returns &lt;code&gt;"country": "DE"&lt;/code&gt;, not &lt;code&gt;"country": "Germany"&lt;/code&gt; or &lt;code&gt;"country": "Deutschland"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;CURRENCY_CODE&lt;/strong&gt; returns an ISO 4217 code. The parser maps "Euro", "EUR", and the euro symbol to &lt;code&gt;"EUR"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;COUNTRY&lt;/strong&gt; returns an ISO 3166-1 alpha-2 code. "Germany", "Deutschland", "DE", "DEU" all normalize to &lt;code&gt;"DE"&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;BOOLEAN&lt;/strong&gt; interprets checkboxes, yes/no fields, and similar binary indicators.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;EMAIL&lt;/strong&gt; validates the extracted value against email format rules.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means the API does double duty: extraction and validation in one step. You do not need a separate validation layer to check that the IBAN is structurally valid, the date is plausible, or the currency code is a real ISO code.&lt;/p&gt;
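&lt;p&gt;To show what checksum validation involves, here is a sketch of the standard mod-97 check defined for IBANs in ISO 13616. The &lt;code&gt;hasValidIbanChecksum&lt;/code&gt; name is hypothetical. The function verifies the check digits only, not country-specific lengths or bank codes, and makes no claim about how the API implements its own check:&lt;/p&gt;

```typescript
// Checks the ISO 13616 mod-97 checksum of an IBAN. Whitespace is ignored.
// This verifies the check digits only, not country-specific length or bank codes.
function hasValidIbanChecksum(raw: string): boolean {
  const iban = raw.replace(/\s+/g, "").toUpperCase();
  // Two-letter country code, two check digits, then 11 to 30 alphanumerics.
  if (!/^[A-Z]{2}\d{2}[A-Z0-9]{11,30}$/.test(iban)) return false;

  // Move the country code and check digits to the end.
  const rearranged = iban.slice(4) + iban.slice(0, 4);

  // Fold one decimal digit at a time so the running remainder stays small.
  let remainder = 0;
  for (const char of rearranged.split("")) {
    const code = char.charCodeAt(0);
    // Letters map to 10..35 (A = 10); digits map to themselves.
    const digits = code >= 65 ? String(code - 55) : char;
    for (const digit of digits.split("")) {
      remainder = (remainder * 10 + Number(digit)) % 97;
    }
  }
  return remainder === 1;
}
```

&lt;p&gt;If you do keep a client-side check like this, it pairs naturally with the confidence score: an IBAN that passes the checksum but came back at 0.88 is a different case from one that fails it outright.&lt;/p&gt;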

&lt;h2&gt;
  
  
  Extracting Tables and Repeated Data with ARRAY Fields
&lt;/h2&gt;

&lt;p&gt;Invoices, purchase orders, and shipping manifests all contain line items: tables with repeated rows of the same structure. The ARRAY field type handles these without any changes to the extraction approach.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extractDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;base64&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;purchase-order.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;pdfBase64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;po_number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Purchase order number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;is_required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;order_date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DATE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Date the purchase order was issued&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;line_items&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;ARRAY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;List of ordered items&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;description&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Item description&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;quantity&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;INTEGER&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Quantity ordered&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
            &lt;span class="p"&gt;{&lt;/span&gt;
              &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;unit_price&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
              &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Price per unit&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="p"&gt;},&lt;/span&gt;
          &lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;subtotal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Subtotal before tax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tax_amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Tax amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;total&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Total amount including tax&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;computed_total&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CALCULATED&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Sum of subtotal and tax for cross-check&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;operation&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;sum&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;source_field_names&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;subtotal&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;tax_amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;For a German purchase order with two line items, the response looks like this:&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"po_number"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"TEXT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"PO-2024-00412"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.97&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"PO-2024-00412"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"order_date"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"DATE"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"2024-10-22"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"22.10.2024"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"line_items"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ARRAY"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Hydraulic Cylinder Model HC-200"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Hydraulic Cylinder Model HC-200"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;12&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.96&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"12"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"unit_price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;245.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.93&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"245,00"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"description"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Pressure Gauge PG-50"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Pressure Gauge PG-50"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"quantity"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.97&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"24"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="nl"&gt;"unit_price"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;38.50&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
            &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"38,50"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
          &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"subtotal"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CURRENCY_AMOUNT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3864.00&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Nettobetrag: EUR 3.864,00"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"tax_amount"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CURRENCY_AMOUNT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;734.16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.93&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"MwSt. 19%: EUR 734,16"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CURRENCY_AMOUNT"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4598.16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.95&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"Gesamtbetrag: EUR 4.598,16"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"computed_total"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"CALCULATED"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"value"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4598.16&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"confidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;0.94&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"citations"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"source"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"purchase-order.pdf"&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ARRAY field extracts variable-length tables, so you do not need to know the number of rows in advance. Each row gets its own set of confidence scores, one per cell. The CALCULATED field computes &lt;code&gt;subtotal + tax_amount&lt;/code&gt; and returns the result, which you can compare against the extracted &lt;code&gt;total&lt;/code&gt; to catch discrepancies.&lt;/p&gt;
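
&lt;p&gt;Because every cell carries its own confidence, you can triage rows individually rather than accepting or rejecting the whole table. The following is a minimal sketch, not SDK code: the &lt;code&gt;ExtractedValue&lt;/code&gt; and &lt;code&gt;LineItemRow&lt;/code&gt; types and both helper functions are hypothetical names, and the row shape is assumed from the sample response above.&lt;/p&gt;

```typescript
// Shapes mirror the sample response above. The type names are hypothetical
// and are not taken from an SDK.
interface ExtractedValue {
  value: number | string;
  confidence: number;
  citations: string[];
}

interface LineItemRow {
  description: ExtractedValue;
  quantity: ExtractedValue;
  unit_price: ExtractedValue;
}

// Return the indexes of rows in which any cell falls below the threshold.
function rowsNeedingReview(rows: LineItemRow[], threshold: number): number[] {
  const flagged: number[] = [];
  rows.forEach((row, index) => {
    const lowest = Math.min(
      row.description.confidence,
      row.quantity.confidence,
      row.unit_price.confidence,
    );
    if (threshold > lowest) flagged.push(index);
  });
  return flagged;
}

// Sum quantity * unit_price over all rows, working in integer cents.
function lineItemsTotal(rows: LineItemRow[]): number {
  const cents = rows.reduce(
    (sum, row) =>
      sum + Math.round(Number(row.quantity.value) * Number(row.unit_price.value) * 100),
    0,
  );
  return cents / 100;
}

// The two rows from the sample response.
const rows: LineItemRow[] = [
  {
    description: { value: "Hydraulic Cylinder Model HC-200", confidence: 0.94, citations: ["Hydraulic Cylinder Model HC-200"] },
    quantity: { value: 12, confidence: 0.96, citations: ["12"] },
    unit_price: { value: 245.0, confidence: 0.93, citations: ["245,00"] },
  },
  {
    description: { value: "Pressure Gauge PG-50", confidence: 0.95, citations: ["Pressure Gauge PG-50"] },
    quantity: { value: 24, confidence: 0.97, citations: ["24"] },
    unit_price: { value: 38.5, confidence: 0.94, citations: ["38,50"] },
  },
];

console.log(rowsNeedingReview(rows, 0.94)); // [ 0 ]
console.log(lineItemsTotal(rows)); // 3864
```

&lt;p&gt;With the sample data, the line items add up to 3,864.00, which matches the extracted &lt;code&gt;subtotal&lt;/code&gt;. A threshold of 0.94 flags only the first row, because its &lt;code&gt;unit_price&lt;/code&gt; came back at 0.93.&lt;/p&gt;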

&lt;h2&gt;
  
  
  Cross-Checking with CALCULATED Fields
&lt;/h2&gt;

&lt;p&gt;The &lt;code&gt;computed_total&lt;/code&gt; in the example above is not just a convenience. It is a validation mechanism.&lt;/p&gt;

&lt;p&gt;If the extracted &lt;code&gt;total&lt;/code&gt; is 4,598.16 and the computed &lt;code&gt;subtotal + tax_amount&lt;/code&gt; is also 4,598.16, the numbers are internally consistent. If they do not match, something went wrong — either the OCR misread a digit, or the document itself has an error.&lt;/p&gt;
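
&lt;p&gt;On the client side, that comparison is a few lines. The sketch below assumes the response shape shown earlier; &lt;code&gt;NumericField&lt;/code&gt; and &lt;code&gt;crossCheck&lt;/code&gt; are hypothetical names. It compares in integer cents, since adding amounts as binary floating-point numbers can produce tiny rounding differences that a strict equality check would report as a mismatch.&lt;/p&gt;

```typescript
// Hypothetical minimal shape for a numeric field in the response.
interface NumericField {
  value: number;
  confidence: number;
}

type CheckResult = "consistent" | "mismatch";

// Compare two amounts in integer cents so that floating-point noise
// cannot cause a false mismatch.
function crossCheck(extracted: NumericField, computed: NumericField): CheckResult {
  const extractedCents = Math.round(extracted.value * 100);
  const computedCents = Math.round(computed.value * 100);
  return extractedCents === computedCents ? "consistent" : "mismatch";
}

// Values from the sample response.
const total: NumericField = { value: 4598.16, confidence: 0.95 };
const computedTotal: NumericField = { value: 4598.16, confidence: 0.94 };

console.log(crossCheck(total, computedTotal)); // consistent
```

&lt;p&gt;A &lt;code&gt;mismatch&lt;/code&gt; does not say which side is wrong, so a reasonable next step is to route the document to manual review rather than to correct either number automatically.&lt;/p&gt;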

&lt;p&gt;Four operations are available: &lt;code&gt;sum&lt;/code&gt;, &lt;code&gt;subtract&lt;/code&gt;, &lt;code&gt;multiply&lt;/code&gt;, and &lt;code&gt;divide&lt;/code&gt;. The source fields must be numeric types (INTEGER, DECIMAL, or CURRENCY_AMOUNT). This is particularly valuable for financial documents where amounts should add up, quantities times unit prices should equal line totals, and discounts should subtract correctly.&lt;/p&gt;
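
&lt;p&gt;Those two rules (a known operation, numeric sources) are easy to check before a schema is ever sent. The following is a sketch under stated assumptions: the field shape is copied from the schema example above, while &lt;code&gt;SchemaField&lt;/code&gt;, &lt;code&gt;checkCalculatedFields&lt;/code&gt;, and the &lt;code&gt;gross_amount&lt;/code&gt;, &lt;code&gt;discount&lt;/code&gt;, and &lt;code&gt;net_check&lt;/code&gt; fields are hypothetical. That &lt;code&gt;subtract&lt;/code&gt; takes its operands in the order listed is also an assumption, not something stated here.&lt;/p&gt;

```typescript
// Field shape copied from the schema example above. The type name is
// hypothetical.
interface SchemaField {
  name: string;
  type: string;
  description: string;
  operation?: string;
  source_field_names?: string[];
}

const OPERATIONS = ["sum", "subtract", "multiply", "divide"];
const NUMERIC_TYPES = ["INTEGER", "DECIMAL", "CURRENCY_AMOUNT"];

// Return a list of problems found in the CALCULATED fields of a schema.
function checkCalculatedFields(fields: SchemaField[]): string[] {
  const problems: string[] = [];
  for (const field of fields) {
    if (field.type !== "CALCULATED") continue;
    if (OPERATIONS.indexOf(field.operation || "") === -1) {
      problems.push(`${field.name}: unknown operation`);
    }
    for (const sourceName of field.source_field_names || []) {
      const source = fields.find((candidate) => candidate.name === sourceName);
      if (source === undefined) {
        problems.push(`${field.name}: source "${sourceName}" is not in the schema`);
      } else if (NUMERIC_TYPES.indexOf(source.type) === -1) {
        problems.push(`${field.name}: source "${sourceName}" is ${source.type}, not numeric`);
      }
    }
  }
  return problems;
}

// Hypothetical discount cross-check: gross_amount minus discount.
const discountSchema: SchemaField[] = [
  { name: "gross_amount", type: "CURRENCY_AMOUNT", description: "Amount before discount" },
  { name: "discount", type: "CURRENCY_AMOUNT", description: "Discount granted" },
  {
    name: "net_check",
    type: "CALCULATED",
    description: "Gross amount minus discount for cross-check",
    operation: "subtract",
    source_field_names: ["gross_amount", "discount"],
  },
];

console.log(checkCalculatedFields(discountSchema)); // []
```

&lt;p&gt;Running a check like this locally catches a typo in a source field name before it costs you an API call.&lt;/p&gt;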

&lt;h2&gt;
  
  
  Batch Processing: Digitizing an Archive
&lt;/h2&gt;

&lt;p&gt;The real value of schema-based extraction shows up at scale. You have 2,000 scanned invoices in a folder. You need every one of them in your accounting system by the end of the quarter.&lt;/p&gt;

&lt;p&gt;The API accepts up to 20 files per request, with a combined size of up to 200 MB and at most 50 MB per file. The parser extracts the same schema from each file and returns results individually, each with its own confidence scores.&lt;br&gt;
&lt;/p&gt;
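
&lt;p&gt;For an archive of 2,000 files, the first job is to split the folder into requests that respect those limits. The planner below is a minimal sketch: &lt;code&gt;PendingFile&lt;/code&gt; and &lt;code&gt;planBatches&lt;/code&gt; are hypothetical names, and it reads 1 MB as 1,000,000 bytes, the more conservative interpretation, because the exact unit is not specified here.&lt;/p&gt;

```typescript
// Hypothetical descriptor for a local file waiting to be submitted.
interface PendingFile {
  name: string;
  sizeBytes: number;
}

// Limits from the text. 1 MB is assumed to mean 1,000,000 bytes.
const MAX_FILES_PER_REQUEST = 20;
const MAX_FILE_BYTES = 50 * 1000 * 1000;
const MAX_REQUEST_BYTES = 200 * 1000 * 1000;

// Greedily pack files into batches in their original order. Files that
// exceed the per-file limit are returned separately and never batched.
function planBatches(files: PendingFile[]): { batches: PendingFile[][]; rejected: PendingFile[] } {
  const batches: PendingFile[][] = [];
  const rejected: PendingFile[] = [];
  let current: PendingFile[] = [];
  let currentBytes = 0;

  for (const file of files) {
    if (file.sizeBytes > MAX_FILE_BYTES) {
      rejected.push(file);
      continue;
    }
    const full = current.length >= MAX_FILES_PER_REQUEST;
    const tooHeavy = currentBytes + file.sizeBytes > MAX_REQUEST_BYTES;
    if (full || tooHeavy) {
      batches.push(current);
      current = [];
      currentBytes = 0;
    }
    current.push(file);
    currentBytes += file.sizeBytes;
  }
  if (current.length > 0) batches.push(current);
  return { batches, rejected };
}
```

&lt;p&gt;With 2,000 small scans, this yields 100 batches of 20 files each. Each batch then becomes the &lt;code&gt;files&lt;/code&gt; array of one request, all sharing the same schema:&lt;/p&gt;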

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extractDocument&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;files&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice-001.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/scans/invoice-001.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice-002.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/scans/invoice-002.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;url&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice-003.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;url&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;https://example.com/scans/invoice-003.pdf&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="na"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;fields&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice_number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;TEXT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Invoice reference number&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;is_required&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice_date&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;DATE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Date the invoice was issued&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;total_amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_AMOUNT&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Total amount due&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;currency&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;CURRENCY_CODE&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;description&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Currency of the total amount&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For larger archives, chunk the files into batches of 20 and process them in parallel. A 2,000-document archive becomes 100 requests. With confidence-based routing, the high-confidence extractions go straight to your database, and only the ambiguous ones need human attention.&lt;/p&gt;
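
&lt;p&gt;A minimal sketch of that batching and routing logic. The helper names, the 0.9 threshold, and the per-field shape (a &lt;code&gt;value&lt;/code&gt; plus a &lt;code&gt;confidence&lt;/code&gt;) are assumptions for illustration, not SDK functions:&lt;/p&gt;

```typescript
// Hypothetical helpers for illustration; none of these names come from the SDK.

// Assumed shape: every extracted field carries a value and a confidence score.
type ExtractedField = { value: unknown; confidence: number };
type ExtractedDocument = { [fieldName: string]: ExtractedField };

// Split file URLs into consecutive batches of at most `batchSize` entries.
function chunkIntoBatches(fileUrls: string[], batchSize: number): string[][] {
  const size = Math.max(1, Math.floor(batchSize));
  const batchCount = Math.ceil(fileUrls.length / size);
  return Array.from({ length: batchCount }, (_, i) =>
    fileUrls.slice(i * size, (i + 1) * size),
  );
}

// Send a document to the database only if every field meets the threshold.
function routeByConfidence(doc: ExtractedDocument, threshold: number): "database" | "review" {
  const fieldNames = Object.keys(doc);
  if (fieldNames.length === 0) {
    return "review";
  }
  const allConfident = fieldNames.every((fieldName) => doc[fieldName].confidence >= threshold);
  return allConfident ? "database" : "review";
}

const archiveUrls = Array.from(
  { length: 2000 },
  (_, i) => `https://example.com/scans/invoice-${i + 1}.pdf`,
);
const batches = chunkIntoBatches(archiveUrls, 20); // 100 batches of 20 URLs each
```

&lt;p&gt;Each batch would then go into one extraction request, and the requests can run concurrently (for example with &lt;code&gt;Promise.all&lt;/code&gt;). Pick the threshold from your own tolerance for manual review.&lt;/p&gt;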

&lt;h2&gt;
  
  
  Supported File Types
&lt;/h2&gt;

&lt;p&gt;The API handles more than PDFs. You can send:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PDFs&lt;/strong&gt; — digital, scanned, or hybrid&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Word documents&lt;/strong&gt; — DOCX files with embedded text and tables&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Images&lt;/strong&gt; — PNG, JPG, GIF, WEBP (these always get OCR)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text files&lt;/strong&gt; — MD, TXT, CSV, JSON&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Images are the most common input for legacy document digitization. Someone photographs a paper form with their phone, uploads the JPG, and the API runs OCR and extraction in one step. No need to convert to PDF first.&lt;/p&gt;

&lt;h2&gt;
  
  
  File Inputs: URLs or Base64
&lt;/h2&gt;

&lt;p&gt;Two ways to send files:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;URL&lt;/strong&gt; — point to a file hosted somewhere: &lt;code&gt;{ "type": "url", "name": "doc.pdf", "url": "https://..." }&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Base64&lt;/strong&gt; — embed the file contents: &lt;code&gt;{ "type": "base64", "name": "doc.pdf", "base64": "..." }&lt;/code&gt;
&lt;/li&gt;
&lt;/ul&gt;
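
&lt;p&gt;If you build these objects in several places, a small union type keeps the two shapes honest. This is a sketch: the type and helper names are hypothetical, and the size estimate assumes padded base64 without line breaks:&lt;/p&gt;

```typescript
// Mirrors the two file input shapes shown above; type and helper names are hypothetical.
type UrlFileInput = { type: "url"; name: string; url: string };
type Base64FileInput = { type: "base64"; name: string; base64: string };
type FileInput = UrlFileInput | Base64FileInput;

function fileFromUrl(fileName: string, url: string): FileInput {
  return { type: "url", name: fileName, url };
}

function fileFromBase64(fileName: string, base64: string): FileInput {
  return { type: "base64", name: fileName, base64 };
}

// Decoded size of a padded base64 string (no whitespace), handy as a pre-flight size check.
function decodedSizeInBytes(base64: string): number {
  const padding = base64.endsWith("==") ? 2 : base64.endsWith("=") ? 1 : 0;
  return (base64.length / 4) * 3 - padding;
}

const scannedInvoice = fileFromUrl("invoice-001.pdf", "https://example.com/scans/invoice-001.pdf");
const inlineNote = fileFromBase64("note.txt", "aGVsbG8="); // "hello", 5 bytes decoded
```

&lt;p&gt;Base64 inflates the payload by roughly a third, so URLs are usually the lighter option when the file is already hosted.&lt;/p&gt;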

&lt;p&gt;Whichever input type you use, the parser handles 40+ formats: PDFs, Office documents (DOCX, PPTX, ODT, ODS, XLSX), EPUB, RTF, LaTeX, email (EML, MSG), Jupyter notebooks, images, and text/markup formats. Images get OCR automatically, with no separate step.&lt;/p&gt;

&lt;h2&gt;
  
  
  Chaining Extraction with Document Generation
&lt;/h2&gt;

&lt;p&gt;Extraction is the first step. What happens next depends on your workflow.&lt;/p&gt;

&lt;p&gt;A common pattern in operations teams: extract data from incoming documents, validate it, then generate a standardized output document. A logistics company receives shipping manifests in different formats from different carriers. They extract the shipment details, normalize the data, and generate a unified report in their own format.&lt;/p&gt;

&lt;p&gt;With composable APIs, this becomes two chained calls:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Extract&lt;/strong&gt; shipment data from the carrier's document — tracking numbers, weights, dimensions, delivery addresses — using the Document Extraction API.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Generate&lt;/strong&gt; a standardized shipping report using the &lt;a href="https://iterationlayer.com/products/document-generation" rel="noopener noreferrer"&gt;Document Generation API&lt;/a&gt; — same data, consistent format, ready for the warehouse team.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Same API key. Same credit pool. No glue code between the extraction step and the generation step. The structured JSON from the extraction response is the input for the document template.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Specialized OCR Tools Still Win
&lt;/h2&gt;

&lt;p&gt;If you need to extract text from a single document format that never changes — the same form, the same layout, every time — a template-based parser with fixed coordinate extraction will be faster and possibly more accurate. Tools like AWS Textract with custom adapters or dedicated form-recognition services are optimized for this.&lt;/p&gt;

&lt;p&gt;The schema-based approach wins when your documents vary. Different invoice layouts from different vendors. Different form designs across years of process changes. Different scan qualities from different offices. You define what data you want, and the parser adapts to wherever that data appears on the page.&lt;/p&gt;

&lt;p&gt;The tradeoff is explicit: template-based tools are faster on uniform documents. Schema-based extraction is more flexible across diverse document types. If your archive contains documents from dozens of sources in various formats, the flexibility saves more time than the template approach's speed advantage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Handling Errors
&lt;/h2&gt;

&lt;p&gt;Common error scenarios to handle in production:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;401 Unauthorized&lt;/strong&gt; — invalid or missing API key&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;400 Bad Request&lt;/strong&gt; — malformed schema (e.g., ARRAY field missing &lt;code&gt;fields&lt;/code&gt;, unknown field type, more than 100 schema fields)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;413 Payload Too Large&lt;/strong&gt; — file exceeds 50 MB, or total payload exceeds 200 MB&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;422 Unprocessable Entity&lt;/strong&gt; — the file could not be read (corrupted PDF, unsupported format)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For production code, check both the HTTP status and the &lt;code&gt;success&lt;/code&gt; field. A 200 response with &lt;code&gt;success: true&lt;/code&gt; means the extraction completed. Each field in the response has a &lt;code&gt;value&lt;/code&gt; and a &lt;code&gt;confidence&lt;/code&gt; score.&lt;/p&gt;
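
&lt;p&gt;One way to turn those rules into code is a small classifier over the status code and the &lt;code&gt;success&lt;/code&gt; flag. This is a sketch only: the outcome names are made up, and the response body is assumed to expose nothing beyond &lt;code&gt;success&lt;/code&gt;:&lt;/p&gt;

```typescript
// Hypothetical outcome type; only the status codes and `success` come from the list above.
type ExtractionOutcome =
  | { kind: "ok" }
  | { kind: "fix-request"; reason: string }
  | { kind: "fix-file"; reason: string }
  | { kind: "unexpected"; reason: string };

function classifyExtractionResponse(
  httpStatus: number,
  body: { success?: boolean },
): ExtractionOutcome {
  switch (httpStatus) {
    case 200:
      // A 200 alone is not enough: the success flag must also be true.
      return body.success === true
        ? { kind: "ok" }
        : { kind: "unexpected", reason: "HTTP 200 without success: true" };
    case 400:
      return { kind: "fix-request", reason: "malformed schema" };
    case 401:
      return { kind: "fix-request", reason: "invalid or missing API key" };
    case 413:
      return { kind: "fix-file", reason: "file or total payload too large" };
    case 422:
      return { kind: "fix-file", reason: "file could not be read" };
    default:
      return { kind: "unexpected", reason: `unhandled status ${httpStatus}` };
  }
}
```

&lt;p&gt;None of the four listed errors goes away on retry, so the sketch separates "fix the request" from "fix the file" instead of scheduling another attempt.&lt;/p&gt;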

&lt;h2&gt;
  
  
  Get Started
&lt;/h2&gt;

&lt;p&gt;The full API reference, field type documentation, and SDK guides are in the &lt;a href="https://iterationlayer.com/products/document-extraction" rel="noopener noreferrer"&gt;Document Extraction docs&lt;/a&gt;. Install the TypeScript SDK (&lt;code&gt;iterationlayer&lt;/code&gt; on npm), the Python SDK (&lt;code&gt;iterationlayer&lt;/code&gt; on PyPI), or the Go SDK and start extracting.&lt;/p&gt;

&lt;p&gt;Sign up for a free account — no credit card required. Define a schema, send a document, and check the confidence scores. The same schema you test with one invoice works on every invoice in your archive — no per-layout configuration needed.&lt;/p&gt;

&lt;p&gt;If the extracted data feeds into reports, contracts, or other generated documents, the &lt;a href="https://iterationlayer.com/products/document-generation" rel="noopener noreferrer"&gt;Document Generation API&lt;/a&gt; takes structured JSON and produces polished PDFs, DOCX, EPUB, or PPTX. Same auth, same credits, one pipeline from scanned paper to finished output.&lt;/p&gt;

</description>
      <category>api</category>
      <category>pdf</category>
      <category>documentprocessing</category>
      <category>automation</category>
    </item>
    <item>
      <title>How We Generate OG Images with Our Own API</title>
      <dc:creator>Iteration Layer</dc:creator>
      <pubDate>Wed, 29 Apr 2026 18:49:14 +0000</pubDate>
      <link>https://dev.to/iterationlayer/how-we-generate-og-images-with-our-own-api-3f73</link>
      <guid>https://dev.to/iterationlayer/how-we-generate-og-images-with-our-own-api-3f73</guid>
      <description>&lt;h2&gt;
  
  
  Eating Our Own Dog Food
&lt;/h2&gt;

&lt;p&gt;Every page on iterationlayer.com has a unique Open Graph image. Not a static fallback, not a screenshot, but a generated image that matches the page's identity. We build these with the same &lt;a href="https://iterationlayer.com/products/image-generation" rel="noopener noreferrer"&gt;Image Generation API&lt;/a&gt; we sell.&lt;/p&gt;

&lt;p&gt;This seemed like an obvious thing to do. We already had the API. We already had the infrastructure. The only question was what the images should look like.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Design
&lt;/h2&gt;

&lt;p&gt;We wanted something that felt branded but wasn't boring. A solid-color background or a gradient would work, but it wouldn't stand out in a social feed full of gradients. So we built a generative wave pattern — deterministic SVG art seeded by the page slug.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3ruigxxioiydso7ikwt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft3ruigxxioiydso7ikwt.jpg" alt="OG image generated for the security page" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The layout is simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;White canvas at 1200x630 (the standard OG image size)&lt;/li&gt;
&lt;li&gt;Generative wave pattern with rounded corners, inset from the edges&lt;/li&gt;
&lt;li&gt;Logo and brand name at the bottom left&lt;/li&gt;
&lt;li&gt;Tagline at the bottom right&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each page gets a unique wave pattern because the slug is different. The security page looks different from the pricing page, which looks different from this blog post. But they all share the same brand structure — logo, name, tagline, rounded corners.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Wave Generator
&lt;/h2&gt;

&lt;p&gt;The wave pattern is pure math. No external images, no templates. A hash of the page slug seeds the parameters — wave amplitude, frequency, phase, thickness, color — so every slug produces a repeatable, unique pattern.&lt;/p&gt;
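
&lt;p&gt;The generator itself isn't shown in this post. As a rough sketch of the seeding idea, assuming a simple polynomial string hash and a Park-Miller (Lehmer) generator, with parameter ranges chosen arbitrarily for illustration:&lt;/p&gt;

```typescript
// Illustrative only: the hash, the generator, and the parameter ranges are assumptions.
const LEHMER_MODULUS = 2147483647; // 2^31 - 1, a prime
const LEHMER_MULTIPLIER = 48271;

// Fold the slug's UTF-16 code units into a seed between 1 and LEHMER_MODULUS - 1.
function hashSlug(slug: string): number {
  let hash = 7;
  for (let i = 0; i !== slug.length; i += 1) {
    hash = (hash * 31 + slug.charCodeAt(i)) % LEHMER_MODULUS;
  }
  return hash === 0 ? 1 : hash;
}

// Park-Miller generator: each call returns the next float strictly between 0 and 1.
function createRandom(seed: number): () => number {
  let state = seed;
  return () => {
    state = (state * LEHMER_MULTIPLIER) % LEHMER_MODULUS;
    return state / LEHMER_MODULUS;
  };
}

type WaveParams = { amplitude: number; frequency: number; phase: number };

// Same slug, same seed, same sequence of draws, so the parameters repeat exactly.
function waveParamsForSlug(slug: string): WaveParams {
  const random = createRandom(hashSlug(slug));
  return {
    amplitude: 10 + random() * 30, // px, hypothetical range
    frequency: 1 + random() * 3, // cycles across the width, hypothetical range
    phase: random() * 2 * Math.PI, // radians
  };
}
```

&lt;p&gt;Calling &lt;code&gt;waveParamsForSlug("security")&lt;/code&gt; twice yields identical values, which is what makes the image repeatable across deploys and cache misses.&lt;/p&gt;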

&lt;p&gt;The algorithm stacks seven wave bands vertically. Each band gets its own thickness and color from a palette of greys. A global sine wave sets the overall flow, then each band follows that flow with its own local variation. The bands are spaced with a minimum gap to keep them distinct.&lt;/p&gt;

&lt;p&gt;The output is an SVG with Catmull-Rom spline paths. We render it through the same SVG pipeline that powers our image layers.&lt;/p&gt;
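
&lt;p&gt;SVG has no native Catmull-Rom command, so a common approach is to convert each spline segment into a cubic Bezier curve. A minimal sketch of that conversion for a uniform Catmull-Rom spline (the function and type names are hypothetical, and our own module may differ in detail):&lt;/p&gt;

```typescript
type WavePoint = { x: number; y: number };

// Build SVG path data whose cubic Bezier segments trace a uniform Catmull-Rom spline.
// For the segment p1 to p2 with neighbours p0 and p3, the control points are
// p1 + (p2 - p0) / 6 and p2 - (p3 - p1) / 6. End points are duplicated as neighbours.
function catmullRomPath(points: WavePoint[]): string {
  if (points.length === 0 || points.length === 1) {
    return "";
  }
  const fmt = (n: number): string => n.toFixed(2);
  let pathData = `M ${fmt(points[0].x)} ${fmt(points[0].y)}`;
  for (let i = 0; i + 1 !== points.length; i += 1) {
    const p0 = points[Math.max(i - 1, 0)];
    const p1 = points[i];
    const p2 = points[i + 1];
    const p3 = points[Math.min(i + 2, points.length - 1)];
    const c1x = p1.x + (p2.x - p0.x) / 6;
    const c1y = p1.y + (p2.y - p0.y) / 6;
    const c2x = p2.x - (p3.x - p1.x) / 6;
    const c2y = p2.y - (p3.y - p1.y) / 6;
    pathData += ` C ${fmt(c1x)} ${fmt(c1y)}, ${fmt(c2x)} ${fmt(c2y)}, ${fmt(p2.x)} ${fmt(p2.y)}`;
  }
  return pathData;
}

const samplePath = catmullRomPath([
  { x: 0, y: 50 },
  { x: 100, y: 20 },
  { x: 200, y: 50 },
]);
```

&lt;p&gt;The resulting string goes into the &lt;code&gt;d&lt;/code&gt; attribute of a &lt;code&gt;path&lt;/code&gt; element. The curve passes through every input point, which is why Catmull-Rom suits sampled wave data.&lt;/p&gt;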

&lt;p&gt;We extracted this into a shared &lt;code&gt;WaveSvg&lt;/code&gt; module that both the OG image generator and our blog post header cards use. Same algorithm, different dimensions — the blog headers are 900x400, the OG images are 1200x630.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Implementation
&lt;/h2&gt;

&lt;p&gt;The OG image endpoint is straightforward. A controller takes the page slug, generates the image, and returns it as a JPEG with a 30-day cache header.&lt;/p&gt;
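
&lt;p&gt;The controller code isn't shown here, but the caching part comes down to two response headers. A sketch with a hypothetical helper name; the &lt;code&gt;public&lt;/code&gt; directive is an assumption on top of the 30-day lifetime:&lt;/p&gt;

```typescript
// Hypothetical helper: response headers for a cacheable JPEG.
function ogImageHeaders(maxAgeInDays: number): { [header: string]: string } {
  const maxAgeInSeconds = maxAgeInDays * 24 * 60 * 60;
  return {
    "Content-Type": "image/jpeg",
    // "public" is an assumption; it lets shared caches such as CDNs store the image.
    "Cache-Control": `public, max-age=${maxAgeInSeconds}`,
  };
}

const thirtyDayHeaders = ogImageHeaders(30); // Cache-Control: "public, max-age=2592000"
```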

&lt;p&gt;The generation itself is a single API call with five layers:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;IterationLayer&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;iterationlayer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;IterationLayer&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;apiKey&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;YOUR_API_KEY&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;waveSvgBase64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;generateWaveSvg&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;slug&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="c1"&gt;// your wave generator&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;logoSvgBase64&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="k"&gt;from&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;logoSvg&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;toString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;base64&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;generateImage&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="na"&gt;width_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="na"&gt;height_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;630&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="na"&gt;output_format&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jpeg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;layers&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;solid-color&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;hex_color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#FFFFFF&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;base64&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;waves.svg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;waveSvgBase64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;x_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;y_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;width_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1160&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;height_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;478&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;border_radius&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;24&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;image&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;file&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;base64&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;logo.svg&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;base64&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;logoSvgBase64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;x_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;y_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;542&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;width_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;height_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Iteration Layer&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;font_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Inter&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;font_size_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;font_weight&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;bold&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text_color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#000000&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;vertical_align&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;center&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;x_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;y_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;542&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;width_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;height_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;text&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Image &amp;amp; Document Extraction and Generation APIs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;font_name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;Inter&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;font_size_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;font_weight&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;medium&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text_color&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;#6B7280&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;text_align&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;right&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;vertical_align&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;center&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;should_auto_scale&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="na"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;x_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;y_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;542&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
      &lt;span class="na"&gt;dimensions&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="na"&gt;width_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;1160&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="na"&gt;height_in_px&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;56&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;},&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The response carries the rendered image as a base64-encoded buffer along with its MIME type:&lt;/p&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"success"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"data"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"buffer"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/9j/4AAQSkZJRgABAQ..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mime_type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"image/jpeg"&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Layer 0 is a white background. Layer 1 is the wave SVG with &lt;code&gt;border_radius: 24&lt;/code&gt; — the API masks the corners with anti-aliased alpha blending, so the edges are smooth. Layers 2-4 are the logo, brand name, and tagline below the wave art.&lt;/p&gt;

&lt;p&gt;The tagline uses &lt;code&gt;should_auto_scale: true&lt;/code&gt; so it shrinks to fit if the text is too wide. The brand name uses &lt;code&gt;vertical_align: "center"&lt;/code&gt; to align with the logo.&lt;/p&gt;
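
&lt;p&gt;The pixel values in the request fit together with simple arithmetic. The sketch below derives them from a handful of inputs. The canvas size, inset, and footer height appear in the request; the three gap values are inferred from the coordinates and named here only for illustration:&lt;/p&gt;

```typescript
// Values marked "from the request" appear in the API call; the gap names are inferred labels.
type OgLayoutInput = {
  canvasWidth: number; // 1200, from the request
  canvasHeight: number; // 630, from the request
  inset: number; // 20, from the request
  footerHeight: number; // 56, logo and text box height from the request
  footerGap: number; // 44, inferred: space between wave art and footer
  bottomMargin: number; // 32, inferred: space below the footer
  logoTextGap: number; // 14, inferred: space between logo and brand name
};

function deriveOgLayout(input: OgLayoutInput) {
  const waveWidth = input.canvasWidth - 2 * input.inset;
  const waveHeight =
    input.canvasHeight - input.inset - input.footerGap - input.footerHeight - input.bottomMargin;
  const footerY = input.inset + waveHeight + input.footerGap;
  const brandNameX = input.inset + input.footerHeight + input.logoTextGap;
  return { waveWidth, waveHeight, footerY, brandNameX };
}

const ogLayout = deriveOgLayout({
  canvasWidth: 1200,
  canvasHeight: 630,
  inset: 20,
  footerHeight: 56,
  footerGap: 44,
  bottomMargin: 32,
  logoTextGap: 14,
});
// ogLayout: { waveWidth: 1160, waveHeight: 478, footerY: 542, brandNameX: 90 }
```

&lt;p&gt;Deriving the numbers this way makes it easier to reuse the layout at other sizes, such as the 900x400 blog headers.&lt;/p&gt;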

&lt;h2&gt;
  
  
  Features We Used
&lt;/h2&gt;

&lt;p&gt;Building this with our own API exercised several features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Solid-color layers&lt;/strong&gt; with optional position/dimensions — the white background fills the full canvas&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Image layers&lt;/strong&gt; with &lt;code&gt;border_radius&lt;/code&gt; — the wave SVG gets smooth rounded corners&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SVG rendering&lt;/strong&gt; — the wave pattern is inline SVG, passed as base64&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Text auto-scaling&lt;/strong&gt; — the tagline scales down to fit the available width&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vertical text alignment&lt;/strong&gt; — the brand name centers vertically against the logo&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JPEG output&lt;/strong&gt; — OG images should be JPEG for file size&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why Not Puppeteer
&lt;/h2&gt;

&lt;p&gt;We could have built an HTML template and rendered it with a headless browser. Plenty of sites do. But we had a better tool.&lt;/p&gt;

&lt;p&gt;The Image Generation API renders our OG images in under 200ms. No browser startup, no font loading, no CSS layout engine. The result is deterministic — same slug, same image, every time. And when we cache the response with a 30-day max-age, the endpoint serves instantly after the first request.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;The &lt;a href="https://iterationlayer.com/recipes/generate-og-image" rel="noopener noreferrer"&gt;Generate OG Image&lt;/a&gt; recipe shows the full API call. Swap in your own background, logo, and brand colors. The &lt;a href="https://iterationlayer.com/products/image-generation" rel="noopener noreferrer"&gt;Image Generation docs&lt;/a&gt; cover all layer types and options.&lt;/p&gt;

</description>
      <category>api</category>
      <category>images</category>
      <category>automation</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
