Oleksandr Gamanyuk for Hipa.ai

Posted on Jun 19

How we auto-generate a unique data infographic for every one of our research reports (with zero LLM tokens for the pixels)

We build hipa.ai — a platform that helps patients and healthy volunteers find and apply to clinical trials. One of our content surfaces is a monthly clinical-trials research report, published at three scopes:

National, e.g. US Clinical Trials — May 2026
Per state, e.g. California Clinical Trials
Per city (top ~50 US metros), e.g. Houston, TX clinical trials and New York, NY clinical trials

These reports feed our broader clinical trials directory, where people search recruiting studies by city, condition, and drug.

That's 1 national + 50 states + ~50 cities, every single month. Every one of those pages needs a hero image that is unique, on-brand, and actually about the data on that page. You can't hand-design a hundred new infographics a month, and stock photos of test tubes are worthless for SEO and worse for trust.

So we generate them. This post is the why and the how, with the real code.

Why generate images at all?

Five reasons, roughly in order of how much they drove the decision:

1. Scale. ~100 new reports/month. Manual design doesn't survive contact with that number.
2. Freshness. The data refreshes monthly from ClinicalTrials.gov (via the AACT mirror), the same pipeline that powers our live clinical trials search and drug catalog. A hand-made image is stale the moment the next month's snapshot lands. A generated one is regenerated for free.
3. Brand consistency. One layout engine → every image has the same dark gradient, the same lime accent, the same typography. No drift.
4. Trust + SEO. The image is the data: recruiting counts, top conditions, top sponsors. A reader (and Google's image understanding) sees real numbers, not a decorative photo. The infographic doubles as the article's social/OG card source.
5. Cost. This is the part people get wrong. The image is deterministic from the data, so it costs zero LLM tokens. We use an LLM (Gemini 2.5 Flash) to write the prose, but every pixel is computed from the SQL aggregate. More on this below, because it also dictates the pipeline ordering.

The three image tracks

Each report actually ships three different visuals, generated three different ways:

| Track | What | Size | How | Stored? |
| --- | --- | --- | --- | --- |
| 1 | The big data infographic (the in-article hero) | 2400×1350 | Satori / `next/og` | Yes (Azure Blob) |
| 2 | The social / OG card | 1200×630 | Satori / `next/og`, on demand | No (per-request) |
| 3 | In-page interactive charts | n/a | Chart.js in the browser | No (client-side DOM) |

The star is Track 1, so let's go deep there.

Here's a real one, the national May 2026 report (served from a live blob URL):

[Image: national May 2026 report infographic]

And a city-scoped one (same engine, scope-aware panels):

[Image: city-scoped report infographic]

How: render React → PNG with Satori, not a headless browser

The first instinct for "turn a chart into a PNG on the server" is usually Puppeteer (spin up headless Chrome, screenshot a <canvas>) or node-canvas. We use neither. We use Satori via Next.js's ImageResponse (next/og).

Why Satori:

No headless browser. Puppeteer means bundling Chromium (~300MB), cold-start pain, and a flaky process to babysit on every render. Satori itself is a pure function from JSX to SVG; ImageResponse then rasterizes that SVG to PNG.
It runs anywhere Next runs, including the edge/serverless image route. No native Chromium dependency.
The constraint Satori imposes — flexbox layout + inline SVG only, no arbitrary CSS — turns out to be fine for an infographic. Bars are <div>s with a width: "<pct>%". Trend lines are an SVG <polyline>. That's the whole vocabulary you need.

The tradeoff: you can't drop in Chart.js server-side (it wants a canvas). So we built a tiny chart-primitive library that speaks Satori's dialect.

The shared primitive library

Everything lives in src/lib/infographic-charts.tsx. The file's own header says it best:

/**
 * Satori-compatible chart primitives for infographic images.
 *
 * All components return JSX that works inside `ImageResponse` (next/og).
 * Satori supports flexbox layout and inline SVG — no Chart.js.
 */

It exports StatCard, HorizontalBars, StackedBar, Sparkline, SectionTitle, Panel, and an InfographicShell wrapper, plus the canonical size:

export const infographicSize = { width: 2400, height: 1350 } as const;

A "bar chart" is just a flex row where the fill <div>'s width is a percentage. No library, no canvas:

export function HorizontalBars({ items, labelWidth = 100, labelFontSize = 30 }) {
  const maxPct = Math.max(...items.map((i) => i.pct), 1);
  return (
    <div style={{ display: "flex", flexDirection: "column", gap: 24, width: "100%" }}>
      {items.map((item) => (
        <div key={item.label} style={{ display: "flex", alignItems: "center", gap: 20 }}>
          {/* label … */}
          <div style={{ display: "flex", flex: 1, height: 48,
                        background: "rgba(255,255,255,0.06)", borderRadius: 10, overflow: "hidden" }}>
            <div style={{
              display: "flex",
              width: `${Math.round((item.pct / maxPct) * 100)}%`,  // ← the entire "chart"
              height: "100%",
              background: item.color,
              borderRadius: 10,
            }} />
          </div>
          {/* value + percent … */}
        </div>
      ))}
    </div>
  );
}
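
One detail worth spelling out: the bar widths are scaled relative to the largest item, not to 100, so the top bar always fills the track. Here is a minimal sketch of just that width math, pulled out of the JSX so it can be checked in isolation. The helper name `barWidths` and the sample items are made up for illustration.

```typescript
// Hypothetical helper (not from the original file): the width math from
// HorizontalBars, without any rendering.
interface BarItem {
  label: string;
  pct: number; // share of the total, 0-100
}

function barWidths(items: BarItem[]): string[] {
  // Same floor of 1 as the component, so an all-zero list never divides by 0.
  const maxPct = Math.max(...items.map((i) => i.pct), 1);
  return items.map((i) => `${Math.round((i.pct / maxPct) * 100)}%`);
}

// Placeholder data: the largest item (40) becomes the full-width bar.
const widths = barWidths([
  { label: "A", pct: 40 },
  { label: "B", pct: 20 },
  { label: "C", pct: 10 },
]);
console.log(widths); // ["100%", "50%", "25%"]
```

The printed value next to each bar still carries the true percentage; only the fill is normalized.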

A trend line is the one place we reach for SVG — Satori renders inline SVG, so a sparkline is a single <polyline>:

const points = values.map((v, i) => {
  const x = padding + (i / (values.length - 1)) * innerW;
  const y = padding + innerH - ((v - min) / range) * innerH;
  return `${Math.round(x)},${Math.round(y)}`;
}).join(" ");

return (
  <svg width={width} height={height} viewBox={`0 0 ${width} ${height}`} style={{ display: "flex" }}>
    <polyline points={points} fill="none" stroke={color} strokeWidth="5"
              strokeLinecap="round" strokeLinejoin="round" />
  </svg>
);
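
The excerpt above assumes `min`, `range`, `innerW`, and `innerH` are already defined. A standalone sketch of the point math might look like the following. The function name, the `padding` default, and the two guards (for a single value and for a flat series) are assumptions, not the original code.

```typescript
// Hypothetical standalone version of the sparkline point math.
function sparklinePoints(
  values: number[],
  width: number,
  height: number,
  padding = 10,
): string {
  if (values.length === 0) return "";
  const innerW = width - padding * 2;
  const innerH = height - padding * 2;
  const min = Math.min(...values);
  const max = Math.max(...values);
  const range = max - min || 1; // flat series: avoid dividing by zero
  const lastIndex = Math.max(values.length - 1, 1); // single value: same guard

  return values
    .map((v, i) => {
      const x = padding + (i / lastIndex) * innerW;
      // SVG y grows downward, so larger values get smaller y.
      const y = padding + innerH - ((v - min) / range) * innerH;
      return `${Math.round(x)},${Math.round(y)}`;
    })
    .join(" ");
}

const pts = sparklinePoints([10, 20, 15], 120, 60);
console.log(pts); // "10,50 60,10 110,30"
```

Without the guards, a one-element series would produce `NaN` coordinates, because `i / (values.length - 1)` becomes `0 / 0`.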

The brand lives in InfographicShell — one wrapper, so every image across the site (research, news, etc.) shares the exact same skin:

background: "linear-gradient(135deg, #0f172a 0%, #12304a 50%, #0f172a 100%)",
// header: Hipa.ai logo • lime dot • SCOPE TITLE … month on the right
// footer: "Source: ClinicalTrials.gov / AACT" … "Hipa.ai"

export const LIME = "#d9f99d";   // the one accent color, used everywhere
export const MUTED = "#94a3b8";
export const CARD_BG = "rgba(255,255,255,0.07)";  // frosted cards

Composing the report infographic

src/lib/research-infographic.tsx exports a single function:

createTrialsInfographic(stats, logoSrc?) // → ImageResponse

It takes the stats aggregate (not prose, not the LLM output — the raw numbers) and lays out a header, a row of five StatCards (Recruiting Now / New This Month / Closing Within 90d / …), then a six-panel grid: recruiting by condition, by sponsor, and a third panel that switches by scope (by city / by state for national / top facilities for a city report), then interventions, eligibility, and the enrollment target.

It also does domain-specific cleanup so the labels fit: abbreviateSponsor("University of California, San Francisco") returns "UCSF" and strips suffixes such as Inc., LLC, and Pharmaceuticals. A fixed INTERVENTION_COLORS map keeps "Drug" blue and "Device" purple across every single image.
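
To make the label cleanup concrete, here is one way it could be written. Only the function name `abbreviateSponsor`, the UCSF example, the stripped suffixes, and the name `INTERVENTION_COLORS` come from the post. The lookup table, the regex, the hex values, and the other helper names are illustrative assumptions.

```typescript
// Hypothetical sketch, not the contents of the real module.
const KNOWN_ABBREVIATIONS: Record<string, string> = {
  "University of California, San Francisco": "UCSF",
};

// Trailing corporate suffixes to drop, optionally preceded by a comma.
const SUFFIX_RE = /,?\s+(Inc\.?|LLC|Pharmaceuticals)$/i;

// Own-property lookup, so keys like "constructor" never hit Object.prototype.
function lookup(table: Record<string, string>, key: string): string | undefined {
  return Object.prototype.hasOwnProperty.call(table, key) ? table[key] : undefined;
}

function abbreviateSponsor(name: string): string {
  const trimmed = name.trim();
  const known = lookup(KNOWN_ABBREVIATIONS, trimmed);
  if (known !== undefined) return known;
  // Strip repeatedly so "X Pharmaceuticals, Inc." loses both suffixes.
  let out = trimmed;
  while (SUFFIX_RE.test(out)) out = out.replace(SUFFIX_RE, "");
  return out;
}

const INTERVENTION_COLORS: Record<string, string> = {
  Drug: "#3b82f6", // blue (hex value is an assumption)
  Device: "#a855f7", // purple (hex value is an assumption)
};
const FALLBACK_COLOR = "#94a3b8"; // assumption: reuse the MUTED gray

function interventionColor(type: string): string {
  return lookup(INTERVENTION_COLORS, type) ?? FALLBACK_COLOR;
}

console.log(abbreviateSponsor("University of California, San Francisco")); // "UCSF"
console.log(abbreviateSponsor("Example Pharmaceuticals, Inc.")); // "Example"
```

Keeping the color map fixed, rather than assigning colors by rank, is what makes "Drug" the same color in every image regardless of how the categories sort that month.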

The key property: the function is pure over the data. Same stats in → same PNG out. Zero tokens. The pixels never touch an LLM.

The pipeline ordering (this is where we got burned)

Generating the image is the easy part. When you generate it is what bit us.

Each report is produced by scripts/trials/generate-trials-research.ts: it reads the AACT Postgres mirror, builds the aggregate, calls the LLM for the body, upserts the Mongo doc, and generates+uploads the infographic — inside the same per-document loop body:

// Generate and upload infographic (per doc, right after the upsert)
try {
  const imgResponse = createTrialsInfographic(agg, logoSrc);
  const imgBuffer = Buffer.from(await imgResponse.arrayBuffer());
  const blobName = `research-${monthlySlug}.png`;
  await uploadPublicBlob(blobName, imgBuffer);
} catch (imgErr) { /* log, continue — don't kill the whole run */ }

We learned three rules the hard way (a sibling pipeline once shipped 23 articles with 404 hero images because it ran "generate all docs → then upload all images" as two phases — that opens a window where the article exists but its image doesn't):

1. Image generation lives in the same per-doc body as the article upsert. No two-phase "all articles, then all images." That window is exactly where broken images come from.
2. Image before the LLM call. The PNG is deterministic and free; the LLM call costs tokens. Render+upload the image first; if upload throws, abort the doc before spending a single token on prose nobody will see.
3. Skip-existing is mandatory. Reruns check "does this blob already exist?" and continue. The pipeline is safe to re-run any time: it only does the missing work, and never re-bills the LLM for an article that's already live.
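
Put together, the three rules suggest a loop shaped roughly like the sketch below. It is a dependency-injected illustration, not the real script: `runPipeline`, `PipelineDeps`, and every method name on it are hypothetical stand-ins for the blob, render, LLM, and database calls.

```typescript
// Hypothetical sketch of the per-document loop implied by the three rules.
interface ReportDoc {
  slug: string;
  stats: unknown;
}

interface PipelineDeps {
  blobExists(name: string): Promise<boolean>;
  renderPng(stats: unknown): Promise<Uint8Array>; // deterministic, no tokens
  uploadBlob(name: string, png: Uint8Array): Promise<void>;
  writeProse(stats: unknown): Promise<string>; // the only paid step
  upsertArticle(slug: string, body: string): Promise<void>;
}

interface RunResult {
  generated: string[];
  skipped: string[];
  failed: string[];
}

async function runPipeline(docs: ReportDoc[], deps: PipelineDeps): Promise<RunResult> {
  const result: RunResult = { generated: [], skipped: [], failed: [] };
  for (const doc of docs) {
    const blobName = `research-${doc.slug}.png`;
    try {
      // Rule 3: skip-existing, so a rerun never re-bills the LLM.
      if (await deps.blobExists(blobName)) {
        result.skipped.push(doc.slug);
        continue;
      }
      // Rule 2: free image first; a throw here aborts the doc before any tokens.
      const png = await deps.renderPng(doc.stats);
      await deps.uploadBlob(blobName, png);
      // Rule 1: prose and upsert share the same per-doc body as the image.
      const body = await deps.writeProse(doc.stats);
      await deps.upsertArticle(doc.slug, body);
      result.generated.push(doc.slug);
    } catch (err) {
      // Log and move on; one bad doc should not kill the whole run.
      console.error(`doc ${doc.slug} failed:`, err);
      result.failed.push(doc.slug);
    }
  }
  return result;
}
```

One caveat with keying the skip purely on the blob: if the image uploads but the prose step fails, a rerun would skip that doc. A production version would probably also check that the article exists before skipping.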

A lazy fallback route

If the batch script never ran for a given slug (or a one-off page is requested), there's a self-healing route at src/app/research/[slug]/infographic/route.ts. It checks the blob, generates on miss, uploads, and 302s to the CDN — with 24h ISR:

export const revalidate = 86400; // 24h ISR

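// NextResponse comes from "next/server"; the blob, article, and logo helpers are project modules (imports elided).
export async function GET(_req: Request, { params }: { params: Promise<{ slug: string }> }) {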
  const { slug } = await params;
  const blobName = `research-${slug}.png`;

  if (await publicBlobExists(blobName)) {
    return NextResponse.redirect(publicBlobUrl(blobName), 302); // already done
  }

  const article = await getResearchArticleBySlug(slug);
  if (!article?.stats) return new Response("Not found", { status: 404 });

  const buffer = Buffer.from(await createTrialsInfographic(article.stats, logoSrc).arrayBuffer());
  await uploadPublicBlob(blobName, buffer);            // generate once, cache forever
  return NextResponse.redirect(publicBlobUrl(blobName), 302);
}

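So the image is generated effectively once, whether by the batch pipeline or the first visitor, then served straight off the CDN after that. Two simultaneous first requests could both miss the existence check and render, but the output is deterministic, so the duplicate render produces the same image.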

Storage: cheap, public, cached

Images go to Azure Blob Storage (account hipaaipublic, container infographics), world-readable, with a 7-day cache header baked in at upload:

// uploadPublicBlob(...)
blobHTTPHeaders: {
  blobContentType: "image/png",
  blobCacheControl: "public, max-age=604800", // 7 days at the CDN/browser
}
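
For reference, the two lifetimes used in this post are plain seconds arithmetic. The helper name `publicCacheControl` is hypothetical; only the resulting values appear in the real code.

```typescript
// Worked arithmetic for the 24h ISR window and the 7-day cache header.
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400, the `revalidate` value

// Hypothetical helper: builds the Cache-Control string for N days.
function publicCacheControl(days: number): string {
  return `public, max-age=${days * SECONDS_PER_DAY}`;
}

console.log(SECONDS_PER_DAY); // 86400
console.log(publicCacheControl(7)); // "public, max-age=604800"
```

A long `max-age` is comfortable here because each monthly report gets its own blob name, so a new month means a new URL rather than a changed file behind an old one.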

The article page just references the deterministic URL — no DB lookup needed to find the image:

<img
  src={publicBlobUrl(`research-${slug}.png`)}  // → https://hipaaipublic.blob.core.windows.net/infographics/research-<slug>.png
  alt={`Clinical trials infographic for ${scope} — ${article.month}`}
  width={1200} height={675} loading="lazy"
/>

Because the URL is a pure function of the slug, the page never queries anything to render its hero — it just points at the predictable blob path. (Alt text is templated from the scope + month, not LLM-generated — it's structured data we already have.)
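
As an illustration of "the URL is a pure function of the slug", the helper could be as small as the sketch below. The account name, container name, and URL shape are the ones quoted in this post. The function bodies and the `infographicUrl` wrapper are guesses at the shape, not the real implementation.

```typescript
// Sketch only: values from the post, function bodies assumed.
const STORAGE_ACCOUNT = "hipaaipublic";
const CONTAINER = "infographics";

function publicBlobUrl(blobName: string): string {
  // encodeURIComponent is a precaution; normal slugs pass through unchanged.
  return `https://${STORAGE_ACCOUNT}.blob.core.windows.net/${CONTAINER}/${encodeURIComponent(blobName)}`;
}

// Hypothetical wrapper: slug in, hero image URL out, no I/O involved.
function infographicUrl(slug: string): string {
  return publicBlobUrl(`research-${slug}.png`);
}

console.log(infographicUrl("us-clinical-trials-2026-05"));
// https://hipaaipublic.blob.core.windows.net/infographics/research-us-clinical-trials-2026-05.png
```

Because nothing in the function touches the network or a database, the batch script, the fallback route, and the page component can all call it and agree on the same address.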

Track 2 & 3, briefly

The OG/social card (1200×630) is generated on demand by src/app/research/[slug]/opengraph-image.tsx — same Satori engine, same brand, but a simpler eyebrow/title/subtitle layout. It's referenced in the page's NewsArticle JSON-LD so social platforms and Google get a clean card. Not stored; cheap enough to make per request. You can hit one live: /research/us-clinical-trials-2026-05/opengraph-image.

In-page interactive charts are the one place we do use Chart.js — because here we have a real browser. react-chartjs-2 renders a "Data at a Glance" section (closing-soon by condition, new trials by city/condition, a stacked new-vs-closing trend) client-side after mount. Live DOM, not an image — interactive tooltips, no PNG.

So the rule of thumb we landed on:

Server-side, no browser → Satori (flexbox + SVG → PNG). Client-side, real browser → Chart.js. Never drag a headless Chromium into the build just to screenshot a chart.

The one exception: raw SVG → `sharp`

For a different surface (drug-development timelines on /drug/* pages) we generate Gantt-style program-span charts as hand-written SVG strings, then rasterize with sharp:

sharp(Buffer.from(svg), { density: 300 }).resize({ width: 2400 }).png()

Worth mentioning as the "other approach": when the layout is geometric enough that JSX is awkward, raw SVG + sharp at density: 300 gives you crisp, print-ready output. But for anything card-and-grid shaped, Satori-from-JSX wins on maintainability — you're writing React, not string-concatenating <rect>s.
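
For a sense of what "hand-written SVG strings" means in practice, here is a minimal sketch of a span chart builder. Everything in it is hypothetical: the `ProgramSpan` shape, the layout constants, and the sample programs are placeholders, not the real /drug/* code or data.

```typescript
// Hypothetical sketch: one row per program, one <rect> per time span.
interface ProgramSpan {
  label: string;
  startYear: number;
  endYear: number;
}

function escapeXml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function ganttSvg(spans: ProgramSpan[], width = 800, rowHeight = 40): string {
  const minYear = Math.min(...spans.map((s) => s.startYear));
  const maxYear = Math.max(...spans.map((s) => s.endYear));
  const yearSpan = Math.max(maxYear - minYear, 1); // avoid dividing by zero
  const labelWidth = 200; // left gutter reserved for program names
  const plotWidth = width - labelWidth;

  const rows = spans.map((s, i) => {
    const x = labelWidth + ((s.startYear - minYear) / yearSpan) * plotWidth;
    const w = ((s.endYear - s.startYear) / yearSpan) * plotWidth;
    const y = i * rowHeight;
    return (
      `<text x="0" y="${Math.round(y + rowHeight * 0.65)}" font-size="16">${escapeXml(s.label)}</text>` +
      `<rect x="${x.toFixed(1)}" y="${y + 8}" width="${w.toFixed(1)}" height="${rowHeight - 16}" rx="4" fill="#d9f99d"/>`
    );
  });

  const height = Math.max(spans.length, 1) * rowHeight;
  return `<svg xmlns="http://www.w3.org/2000/svg" width="${width}" height="${height}" viewBox="0 0 ${width} ${height}">${rows.join("")}</svg>`;
}

// Placeholder programs, not real data.
const svg = ganttSvg([
  { label: "Program A", startYear: 2020, endYear: 2024 },
  { label: "Program B", startYear: 2022, endYear: 2024 },
]);
console.log(svg.length > 0); // true
```

The resulting string is what would be handed to `sharp(Buffer.from(svg), { density: 300 })` as in the line above. The string concatenation also shows why this approach gets awkward for anything beyond simple geometry.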

Takeaways

For pSEO at scale, generate the images. Manual design doesn't scale to ~100 fresh, on-brand, data-accurate visuals a month.
Keep the pixels deterministic. Let the LLM write prose; compute every pixel from your data. It's free, reproducible, and cache-friendly.
Satori (next/og) beats a headless browser for server-rendered charts. Build a tiny flexbox+SVG primitive library once; reuse the skin everywhere.
Order matters: image (free) before LLM (paid), in the same per-doc body, with skip-existing. A two-phase design is how you ship 404 hero images.
Make the image URL a pure function of the slug so pages render their hero with zero extra DB calls, and add a lazy generate-on-miss route so it self-heals.

Go poke at a few live reports; every infographic on them is generated by the pipeline above.

About Hipa.ai

Hipa.ai helps patients and healthy volunteers discover and apply to clinical trials across the US. If this post brought you in from the technical side, here's the product side of the same data:

🔎 Find clinical trials — search recruiting studies by location, condition, and drug
🏙️ Clinical trials near you — city-level recruiting listings (e.g. Houston, TX)
💊 Drug catalog — trials, alternatives, and updates by drug
📊 Monthly research reports — the national / state / city infographics this post is about
📰 Healthcare data news — provider and registry trends
