"Why your Playwright screenshots show for Japanese / Chinese / Korean text, and the 3-line Dockerfile fix"

#playwright #docker #i18n #e2e

I opened the screenshot artifact for our codens.ai landing page smoke test and the page was full of square boxes. Where the Japanese hero copy should have been, there was a row of □□□□□. Where the feature names were, more boxes. The nav looked like an ancient artifact from a half-decoded file.

The page itself was fine. I had the dev server open in another tab and the Japanese rendered perfectly. The problem was inside the Playwright container.

Three lines in the Dockerfile fixed it:

    fonts-noto-cjk \
    fonts-noto-cjk-extra \
    fonts-noto-color-emoji \

That is the entire fix. If you only came for the answer, you can close the tab now. If you want to know why this happens and where else it will bite you, keep reading.

What is actually happening

The official Playwright Docker image (and most slim base images people build on) only installs Latin fonts. In our case it was fonts-liberation plus fonts-dejavu-core. That is enough to render English, most European languages, basic punctuation, and not much else.

When Chromium tries to paint a character it has no glyph for, it does the only thing it can do. It draws the missing-glyph placeholder, which on most systems is that hollow rectangle people call a tofu box. The character code is correct. The DOM is correct. The page is correct. The screenshot rendering side just has no shape to draw.

This is the part that confuses people the first time. The browser is not broken. The test is not broken. The page is not broken. The container does not have the font installed, so when the screenshot is composited there is nothing to fill the box with.

You can verify this in two seconds. SSH into the container, run fc-list | grep -i cjk, and you will see an empty result. That is the whole story.

The fix

Three apt packages, added to whatever RUN apt-get install block already exists in your Dockerfile.

Before:

RUN apt-get update && apt-get install -y \
    fonts-liberation \
    fonts-dejavu-core \
    && rm -rf /var/lib/apt/lists/*

After:

RUN apt-get update && apt-get install -y \
    fonts-liberation \
    fonts-dejavu-core \
    fonts-noto-cjk \
    fonts-noto-cjk-extra \
    fonts-noto-color-emoji \
    && rm -rf /var/lib/apt/lists/*

What each one buys you:

fonts-noto-cjk is the main package. It covers Japanese kana, the Han characters used in both Japanese and Simplified Chinese, and Korean Hangul. This is the one that fixes most of the boxes.
fonts-noto-cjk-extra covers the long tail. Traditional Chinese variants, less common Han glyphs, characters that show up in proper nouns. Worth including because the cost is small and you do not want to debug a single rare character later.
fonts-noto-color-emoji is the one people forget. If your page has any emoji, you will get tofu for those too. Most modern marketing pages have at least a checkmark or a sparkle somewhere.

Image size impact is about 70 MB on a Debian or Ubuntu base. CJK font files are large because there are tens of thousands of glyphs. If you are squeezing every megabyte you can use the smaller variable-weight subset, but for a CI image used by a test runner the 70 MB is irrelevant.

I shipped this in commit 40422650 for Codens Blue, our QA agent. Rebuilt the image, reran the same smoke test, and the screenshot came out with actual readable Japanese.

Why you only notice after the fact

This is the annoying part. Nothing in your test suite tells you the screenshot is broken.

Unit tests pass. The page renders correctly when a human visits it. The Playwright test reports green because the test only checks that the page loaded and the screenshot was saved. CI is happy. The artifact thumbnail in the GitHub Actions UI is tiny and you cannot tell tofu from text at that size.

You notice when someone opens the screenshot to share it. A designer asks for the latest LP screenshot to compare against a Figma mock. A stakeholder pulls a screenshot for a Slack thread. A regression alert fires and you open the diff. That is when the boxes show up and someone asks why the page is full of squares.

You can technically assert against tofu rendering inside the test. Sample a region that should contain CJK text, check that not every pixel in that region is identical white, fail if it looks suspiciously uniform. I have seen people do this. The implementation cost almost never beats the cost of just installing the fonts once. Three lines of Dockerfile beats a hundred lines of pixel sampling logic.

The same trap is everywhere

Playwright is just the messenger. Anything that wraps a headless Chromium in a Docker container has this problem if the base image lacks CJK fonts.

Puppeteer, pyppeteer, playwright-python, Selenium with headless Chrome, any custom screenshot service built on chrome-launcher, server-side rendering pipelines that use headless Chrome to generate Open Graph images. Same root cause every time. Same fix every time.

If your product touches any audience outside Latin script, default to installing the CJK and emoji fonts in your base image. Treat it as part of the container setup, not as a thing you wait to hit. The cost is 70 MB and three lines. The cost of not doing it is some future Slack message that says "why is the page full of boxes" and then an afternoon of confused debugging.

Wrap

That is the whole thing. Three apt packages, one rebuild, done. If you are running Codens Blue or any other screenshot-based QA flow against a multilingual page, this is the first place to look when boxes appear.

If you want to see the actual landing page these screenshots are taken from, it lives at https://www.codens.ai/en/.