Wes

Posted on • Originally published at wshoffner.dev

What's Actually in Your Docker Image? Reading the Parser That Tells You

Everyone who works with Docker images eventually asks the question: why is this image 2 GB? You squint at your Dockerfile, count the layers, maybe add a .dockerignore and rebuild. The image is still 1.8 GB. You suspect there's a 400 MB node_modules directory hiding in layer 3 from a build stage that didn't clean up after itself, but you can't see it without extracting the whole thing and poking around.

The standard answer for the last six years has been dive. 48,000 stars. Works fine for small images. But if you're dealing with large images (5 GB and up), dive starts to struggle. It loads the entire layer tree into memory, the UI lags, and sometimes it just hangs. If you've hit that wall, you've probably done what I did: extracted the tarball manually, grepped through file listings, and wished someone had written a faster tool.

Someone did.

What Is xray?

xray is a TUI-based layer inspector for OCI-compliant container images, written in Rust. Built by h33333333 as a solo project, it parses Docker and Podman images and shows you per-layer file changes with filtering by path, regex, file size, or layer. The selling point over dive is performance: xray claims ~80 MB of RAM for an 8 GB image, versus dive loading the whole tree. It does this through a custom streaming parser that processes the nested tar archives without fully extracting them.

117 stars. About 15 months of development. Shipped v1.4.0 last week with Podman support and fully customizable keybindings. Two prior external contributors (a docs fix and an optimization). Zero tests until this article.

The Snapshot

Project: xray
Stars: 117 at time of writing
Maintainer: Solo (h33333333), responds within 24 hours
Code health: 6,300 lines, clean workspace separation, zero unwrap() in production paths. Zero tests before our PR
Docs: Good README with usage, installation, and a planned-improvements section
Contributor UX: No CONTRIBUTING.md, no CI, but the maintainer ships requested features in under two weeks
Worth using: Yes, especially for large images. Faster than dive, leaner on memory

Under the Hood

The project is a 4-crate Cargo workspace. xray holds the parser, TUI, and config. xray-docker implements the Docker socket API for pulling images directly. xray-podman wraps the Podman CLI. formatted-index-macro is a small proc macro for index formatting.

The parser is where the interesting engineering lives. An OCI image is a tar archive containing JSON manifests and multiple blobs, some of which are themselves tar archives (the layers), some of which are gzipped tar archives. The parser needs to identify blob types, parse nested archives, and build a tree of file changes across layers. The entry point is Parser::parse_image(), which takes any Read + Seek source and walks the top-level tar entries.
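The blob classification step is small enough to sketch. Here is an illustrative std-only version built from the standard magic numbers (gzip's 0x1f 0x8b prefix, the "ustar" marker in a tar header); the enum and function names are mine, not xray's actual code:

```rust
#[derive(Debug, PartialEq)]
enum BlobType {
    Gzip,  // gzip magic bytes: 0x1f 0x8b
    Tar,   // "ustar" at byte offset 257 of the header
    Empty, // an all-zero block
    Json,  // fallback: manifests and config files
}

/// Classify a blob from its first 512 bytes (one tar block).
fn detect_blob_type(block: &[u8]) -> BlobType {
    if block.len() >= 2 && block[0] == 0x1f && block[1] == 0x8b {
        BlobType::Gzip
    } else if block.len() >= 262 && block[257..262] == *b"ustar" {
        BlobType::Tar
    } else if block.iter().all(|&b| b == 0) {
        BlobType::Empty
    } else {
        BlobType::Json
    }
}
```

The check order matters: an all-zero block fails the gzip and tar checks first, and JSON is a fallback rather than a positive detection.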

Blob type detection reads the first 512 bytes and checks for magic numbers: gzip (0x1f 0x8b), tar (ustar at byte offset 257), all zeros (empty), or falls back to JSON. The tricky part is handling nested tar blobs. The tar crate expects to own an archive from start to finish, tracking its own cursor position. But xray needs to pause the outer archive, parse an inner one (the layer), and then resume. The solution is SeekerWithOffset, a wrapper that normalizes seek positions using offsets. When the parser enters a layer, it calls mark_offset() to set a baseline, constructs a new inner archive over the same reader, and the offset translation keeps both archives consistent. It's a clean workaround for a library limitation without forking the dependency.

Each layer produces a LayerChangeSet, which is a tree of Node values. Nodes are either files or directories, each with a status: Added, Modified, or Deleted. The merge logic combines layer trees into a cumulative view. Two directories merge their children recursively. A file that exists in both layers becomes Modified. A whiteout file (.wh.filename) marks the target as Deleted and propagates recursively through subdirectories. The tree also handles an edge case where a node changes type between layers. Some images create a symlink, then later create a directory at the same path with children inside it. The insert logic detects this and promotes the file to a directory on the fly.
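The merge rules read almost like a spec, so here is a toy version to make them concrete. The node type is deliberately simplified (real xray nodes also carry sizes and more metadata), and only per-file whiteouts are handled:

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Status { Added, Modified, Deleted }

#[derive(Debug, PartialEq)]
enum Node {
    File(Status),
    Dir(BTreeMap<String, Node>),
}

/// Merge a newer layer (`upper`) into the cumulative view (`lower`).
fn merge(lower: &mut BTreeMap<String, Node>, upper: BTreeMap<String, Node>) {
    for (name, upper_node) in upper {
        // A whiteout file ".wh.<name>" marks <name> as deleted; the node is
        // kept (not removed) so the tool can still display it as Deleted.
        if let Some(target) = name.strip_prefix(".wh.") {
            if let Some(node) = lower.get_mut(target) {
                mark_deleted(node);
            }
            continue;
        }
        let merged = match (lower.remove(&name), upper_node) {
            // Two directories merge their children recursively.
            (Some(Node::Dir(mut old)), Node::Dir(new)) => {
                merge(&mut old, new);
                Node::Dir(old)
            }
            // A file that exists in both layers becomes Modified.
            (Some(Node::File(_)), Node::File(_)) => Node::File(Status::Modified),
            // New entries, and type changes (e.g. file promoted to
            // directory), take the upper node as-is.
            (_, node) => node,
        };
        lower.insert(name, merged);
    }
}

/// Deletion propagates recursively through subdirectories.
fn mark_deleted(node: &mut Node) {
    match node {
        Node::File(status) => *status = Status::Deleted,
        Node::Dir(children) => children.values_mut().for_each(mark_deleted),
    }
}
```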

Filtering happens after the tree is built. Four filter types compose together: path (absolute or relative, with partial matching), regex (matched against individual path components), size (minimum bytes), and layer-change (show only files modified in a specific layer). The filter walks the tree and prunes nodes that don't match, removing empty directories whose children were all filtered out. The path filter uses RestorablePath, a custom iterator that tracks the current component index and can reset to its original state for backtracking during relative path searches.
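The prune-and-retain shape is worth seeing in miniature. Here is a sketch of a minimum-size filter over a toy tree, including the empty-directory cleanup; the types are illustrative, not xray's:

```rust
use std::collections::BTreeMap;

#[derive(Debug, PartialEq)]
enum Entry {
    File { size: u64 },
    Dir(BTreeMap<String, Entry>),
}

/// Keep files at or above `min` bytes; drop directories whose
/// children were all filtered out.
fn retain_min_size(tree: &mut BTreeMap<String, Entry>, min: u64) {
    tree.retain(|_, entry| match entry {
        Entry::File { size } => *size >= min,
        Entry::Dir(children) => {
            // Recurse first, then keep the directory only if anything survived.
            retain_min_size(children, min);
            !children.is_empty()
        }
    });
}
```

BTreeMap::retain makes the empty-directory rule a one-liner: filter the children, then retain the directory only when it still has any.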

What's rough: there are no tests (before our PR), no CI, and the project requires nightly Rust due to let_chains in the filter code. The keybindings module has broken doctests. Opaque whiteouts (.wh..wh..opq) are skipped with a FIXME comment rather than handled. None of these are structural problems. The architecture is sound, the code is consistent, and the parser logic is correct for the cases it handles.

The Contribution

The README's "Planned Improvements" section lists "Add unit and fuzz tests" explicitly. Zero tests existed. I wrote 55 unit tests covering the parser core.

Getting into the codebase was straightforward because the parser is completely decoupled from the TUI. You can read and test parser/util.rs, parser/node/, and parser/mod.rs without touching anything related to rendering, keybindings, or terminal state. The node tree logic is the most complex part, and the InnerNode type is self-contained enough that you can construct test trees in a few lines and verify insert/merge/delete behavior without any I/O.

The tests cover three areas: utility functions (SHA256 hex parsing, tar block math, blob type detection via magic numbers), tree operations (node insertion with auto-created intermediate directories, layer merging with status transitions, whiteout propagation, tree iteration with depth tracking), and the filter system (all four filter types plus directory retention verification). Three files changed, 838 insertions, no production code modifications.

PR #19 was merged the same day it was submitted. The maintainer asked for one test to be removed (a redundant case that didn't actually verify the behavior it claimed to), I fixed it, rebased, and it was in.

The Verdict

xray is for anyone working with container images that are too large for dive to handle comfortably. If your CI builds produce multi-gigabyte images and you need to understand where the size comes from, this is the tool. It's also worth using as a reference if you're interested in how OCI images are structured. Reading the parser teaches you the tar-in-tar layout, the role of JSON manifests, how layer diffs work through whiteout files, and how gzip detection happens at the byte level.

The project is actively growing. v1.4.0 just shipped with Podman support, the maintainer is responsive, and the architecture can absorb new features without structural changes. The main gap now is CI (a natural follow-up, since the tests exist but nothing enforces them). The nightly requirement is the only real adoption friction: you need rustup toolchain install nightly before building from source.

What would push xray further: CI enforcement of clippy and tests, handling opaque whiteouts, and getting listed in more package managers (currently just crates.io and AUR). The parser is solid enough to be a library that other tools build on. Right now it's coupled to the TUI binary, but the workspace structure already separates the concerns cleanly.

Go Look At This

xray on GitHub. Install with cargo +nightly install --locked xray-tui and point it at one of your larger images. The filter system (hit / for path, r for regex, s for size) is where it earns its keep.

Our merged PR adding the test suite is here. If you want to contribute, opaque whiteout handling and a GitHub Actions CI pipeline are both open territory. The maintainer moves fast.

This is Review Bomb, a series where I find under-the-radar projects on GitHub, read the code, contribute something, and write it up. If you know a project that deserves more eyeballs, drop it in the comments.


This post was originally published at wshoffner.dev/blog. If you liked it, the Review Bomb series lives there too.

Top comments (1)

Alvarito1983

This lands at exactly the right moment for me. I've been building NEXUS Security — a CVE scanner for self-hosted Docker environments — and the "why is this image 2 GB" question is one I've been living with for weeks.

The SeekerWithOffset solution for nested tar parsing is elegant. The problem of pausing an outer archive to parse an inner one without forking the dependency is the kind of thing that looks obvious in retrospect and takes real thought to get right.

What I find most interesting is the 80 MB vs full-tree-in-memory tradeoff. Dive's approach works fine until it doesn't, and when it fails it fails badly — exactly the wall you described. The streaming parser isn't just a performance optimization, it changes the failure mode. Instead of "hangs on large images," you get consistent behavior at any size. That's a more important property than the raw memory numbers.

One thing I'd be curious about: how does xray handle images built with multi-stage builds where intermediate layers were squashed? The whiteout file handling you describe covers per-file deletions, but squashed images drop all intermediate history. Does the layer tree still give you useful information in that case, or does it just show the final squashed layer as a single "added" changeset?

Going to try this on the nexus-nexus-security image. Grype found 48 vulnerabilities in it. Would be interesting to see which layers they're actually living in.