DEV Community

KNALLHART.DEV

The scrollHeight Lie: How I Finally Got Full-Page Screenshots Right with Playwright

If you've ever tried to take a full-page screenshot of a modern website
programmatically, you've probably run into the same wall I did:
document.body.scrollHeight lies. Constantly.

The problem

Lazy-loaded images, infinite scroll sections, sticky headers, scroll-triggered
animations — all of these mess with the page's reported height in ways that
make scrollHeight an unreliable signal for "have I reached the bottom of
this page yet?"

I was building knallhart.dev, a tool that takes a
full-page screenshot of any website and sends an AI-generated critique via
email. The screenshot step turned out to be the hardest part of the entire
build — harder than the AI integration, harder than the payment flow.

What didn't work

My first approach was the obvious one: read scrollHeight, calculate how
many scroll steps are needed, scroll that many times, done.

const pageHeight = await page.evaluate(() => document.body.scrollHeight);
const steps = Math.ceil(pageHeight / 600);

for (let i = 0; i < steps; i++) {
  await page.evaluate((y) => window.scrollBy(0, y), 600);
  await page.waitForTimeout(200);
}

This worked on simple static pages and broke immediately on anything modern.
Pages with lazy-loaded content report a short initial scrollHeight, then
grow as you scroll — so the calculated step count is wrong before you've
even started. Pages with infinite scroll never stop growing at all.
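To make the failure concrete, here is a small simulation that runs in plain Node, no browser needed. `createFakeLazyPage` and all its numbers (1800px initial height, 800px viewport, 1200px of new content per trigger) are made up for illustration; real pages grow far less predictably.

```javascript
// Hypothetical model of a lazy-loading page. Not a Playwright API.
function createFakeLazyPage({ initialHeight, finalHeight, growBy, viewport }) {
  let height = initialHeight;
  let scrollY = 0;
  return {
    scrollHeight: () => height,
    scrollY: () => scrollY,
    scrollBy(dy) {
      // Browsers clamp the scroll position to the scrollable range.
      scrollY = Math.min(scrollY + dy, height - viewport);
      // Hitting the current bottom "lazy-loads" another chunk of content.
      if (scrollY >= height - viewport && height < finalHeight) {
        height = Math.min(height + growBy, finalHeight);
      }
    },
  };
}

const viewport = 800;
const fakePage = createFakeLazyPage({
  initialHeight: 1800,
  finalHeight: 6000,
  growBy: 1200,
  viewport,
});

// The naive plan: read the height once, then scroll a fixed number of times.
const plannedSteps = Math.ceil(fakePage.scrollHeight() / 600);
for (let i = 0; i < plannedSteps; i++) {
  fakePage.scrollBy(600);
}

// The page grew while we scrolled, so the plan ends short of the real bottom.
const reachedBottom = fakePage.scrollY() >= fakePage.scrollHeight() - viewport;
console.log({ plannedSteps, scrollY: fakePage.scrollY(), height: fakePage.scrollHeight(), reachedBottom });
```

In this toy model the plan is three steps, but by the time they finish the page has already grown from 1800px to 3000px and the scroll position is nowhere near the new bottom.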

What actually works

Instead of trying to calculate the destination upfront, I scroll step by
step and watch whether window.scrollY is still changing. If it stops
changing for a few consecutive checks, I've actually hit the bottom —
regardless of what scrollHeight claims.

let lastScrollY = -1;
let noChangeCount = 0;
const maxAttempts = 40; // safety cap, ~8 seconds

for (let i = 0; i < maxAttempts; i++) {
  await page.evaluate(() => window.scrollBy(0, 600));
  await page.waitForTimeout(200);

  const currentScrollY = await page.evaluate(() => window.scrollY);

  if (currentScrollY === lastScrollY) {
    noChangeCount++;
    if (noChangeCount >= 3) break; // confirmed: no more progress
  } else {
    noChangeCount = 0;
    lastScrollY = currentScrollY;
  }
}

await page.evaluate(() => window.scrollTo(0, 0));

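This is more robust because it doesn't trust any single height value; it
trusts observed behavior over time. The browser clamps scrolling at the end
of the document, so once the page stops growing, scrollBy has no effect and
scrollY stays put. A page that's still loading content will keep moving
scrollY; a page that's truly done won't, no matter how confusing its
scrollHeight is. Infinite-scroll pages never stall at all, which is why the
loop still needs the maxAttempts cap.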
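One possible refinement, sketched here as an assumption rather than what knallhart.dev actually runs: wrap the loop in a function that reports why it stopped, so the caller can tell a real bottom apart from an infinite-scroll page that only hit the safety cap. The `driver` adapter and the name `scrollUntilStalled` are hypothetical; injecting the three operations also lets the logic run without a browser.

```javascript
// driver is a hypothetical adapter: { scrollBy(dy), getScrollY(), wait(ms) }.
// With Playwright it could be wired up roughly like this:
//   {
//     scrollBy: (dy) => page.evaluate((y) => window.scrollBy(0, y), dy),
//     getScrollY: () => page.evaluate(() => window.scrollY),
//     wait: (ms) => page.waitForTimeout(ms),
//   }
async function scrollUntilStalled(
  driver,
  { step = 600, delayMs = 200, stallChecks = 3, maxAttempts = 40 } = {}
) {
  let lastScrollY = -1;
  let noChangeCount = 0;

  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    await driver.scrollBy(step);
    await driver.wait(delayMs);

    const currentScrollY = await driver.getScrollY();

    if (currentScrollY === lastScrollY) {
      noChangeCount++;
      if (noChangeCount >= stallChecks) {
        // Position unchanged for several consecutive checks: real bottom.
        return { reason: "stalled", attempts: attempt, scrollY: currentScrollY };
      }
    } else {
      noChangeCount = 0;
      lastScrollY = currentScrollY;
    }
  }

  // Still moving after maxAttempts: probably infinite scroll.
  return { reason: "cap", attempts: maxAttempts, scrollY: lastScrollY };
}
```

A caller could log or flag the `"cap"` case, since the screenshot of such a page is a truncated view by design.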

The bonus problem: images

Triggering the scroll isn't enough on its own — lazy-loaded images often
need a moment to actually finish loading after they enter the viewport.
I added an explicit wait for image load events before taking the final
screenshot:

```javascript
await page.evaluate(async () => {
  const images = Array.from(document.querySelectorAll("img"));
  await Promise.all(
    images.map((img) => {
      if (img.complete) return Promise.resolve();
      return new Promise((resolve) => {
        img.addEventListener("load", resolve);
        img.addEventListener("error", resolve);
        setTimeout(resolve, 5000); // don't hang forever on one broken image
      });
    })
  );
});
```
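A slightly tidier variant of the per-image wait, offered as a sketch: it clears the timer and removes both listeners once any outcome fires, and it reports which outcome that was. The name `waitForImage` and the outcome strings are my own; the function only relies on `complete`, `addEventListener`, and `removeEventListener`, so it would have to be defined inside the `page.evaluate` callback to run in the browser.

```javascript
// Resolves with "complete", "load", "error", or "timeout" (hypothetical labels).
function waitForImage(img, timeoutMs = 5000) {
  // complete is also true for images that already failed, so this covers both.
  if (img.complete) return Promise.resolve("complete");

  return new Promise((resolve) => {
    const finish = (outcome) => {
      // Clean up so nothing lingers after the first outcome.
      clearTimeout(timer);
      img.removeEventListener("load", onLoad);
      img.removeEventListener("error", onError);
      resolve(outcome);
    };
    const onLoad = () => finish("load");
    const onError = () => finish("error");
    const timer = setTimeout(() => finish("timeout"), timeoutMs);

    img.addEventListener("load", onLoad);
    img.addEventListener("error", onError);
  });
}
```

Collecting the outcomes with `Promise.all(images.map((img) => waitForImage(img)))` would show how many images timed out, which is handy when a screenshot comes back with blank boxes.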

What I'd tell past me

Don't trust any single DOM measurement as an endpoint signal on a page you
don't control. Modern websites are dynamic enough that almost any static
value can lie to you at some point. Behavior over time — does this keep
changing or not — is a much sturdier signal than asking the DOM "are we
there yet?"

If you're curious what this screenshot pipeline turned into:
knallhart.dev — it roasts your website with AI
and emails you three things that are actually wrong with it.

Happy to go deeper on any part of this if useful.
