Exploring Metal 4 Placement Sparse Buffers: Granularity Limits in 3D Textures
Introduction
With the release of Metal 4 at WWDC25, I was particularly interested in the updated Placement Sparse Buffers (and textures). I wanted to see how the new APIs change memory management and whether they reduce memory use more effectively for 3D resources such as Signed Distance Fields (SDFs).
To find out, I built a verification app to audit the actual behavior on physical hardware.
You can find the full source code here:
tatsuya-ogawa/MetalPlacementSparseVerification
Verification App Overview
The app renders a sphere by raymarching an SDF stored in a 3D texture. It compares three memory strategies:
- Dense: Fully allocated 3D texture (Baseline)
- Hardware Sparse (Metal 4): Using the new Placement Sparse APIs
- Software Atlas (Bricks): A custom implementation of a 3D brick atlas
Finding 1: The "64x64x1" Granularity Bottleneck
While implementing the Hardware Sparse mode using Metal 4, I encountered a physical constraint that significantly impacts 3D resource optimization: Sparse Page Granularity.
On a physical iOS device, with the R32Float pixel format and a 16 KB page size, device.sparseTileSize reports a tile of 64 x 64 x 1 texels. At 4 bytes per texel, that is exactly one page: 64 x 64 x 1 x 4 = 16,384 bytes.
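The relationship between tile shape and page size can be checked with a few lines of Swift. This is a minimal sketch, not code from the repo: `tileBytes` is a hypothetical helper, the Metal part assumes the `sparseTileSize(with:pixelFormat:sampleCount:)` query on `MTLDevice` and an Apple GPU family 6 or later device, and it is only compiled where the Metal framework is available.

```swift
#if canImport(Metal)
import Metal
#endif

// Hypothetical helper: bytes covered by one sparse tile of the given shape.
func tileBytes(width: Int, height: Int, depth: Int, bytesPerTexel: Int) -> Int {
    return width * height * depth * bytesPerTexel
}

// The 64 x 64 x 1 tile reported for R32Float (4 bytes per texel).
let reportedTileBytes = tileBytes(width: 64, height: 64, depth: 1, bytesPerTexel: 4)
print(reportedTileBytes) // 16384, i.e. one 16 KB page

#if canImport(Metal)
// Ask the device which tile shape it uses for a 3D R32Float texture.
// Assumption: sparse textures need Apple GPU family 6 or later.
if let device = MTLCreateSystemDefaultDevice(), device.supportsFamily(.apple6) {
    let tile = device.sparseTileSize(with: .type3D, pixelFormat: .r32Float, sampleCount: 1)
    print("tile: \(tile.width) x \(tile.height) x \(tile.depth)")
    print("page bytes: \(device.sparseTileSizeInBytes)")
}
#endif
```

For comparison, an $8 \times 8 \times 8$ block of the same format covers only 2,048 bytes, an eighth of a page.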
Why is this a problem?
When storing only the "shell" of an object (like an SDF surface), each tile is a flat plate: one texel deep, but with a large 64x64 footprint. If the surface grazes even a single texel of a tile, the entire 64x64x1 block (16 KB) is committed to memory.
At a resolution of $256^3$:
- Dense: ~102.4 MB
- Hardware Sparse: 48.4 MB
That is a little under half of the dense allocation, but it is not as efficient as it could be: the coarse 64x64 footprint of each tile captures a lot of empty space around the surface.
Finding 2: Superiority of the Software Brick Atlas
To overcome this, I implemented a traditional "Software Brick Atlas" using $8 \times 8 \times 8$ blocks.
- Software Atlas (Bricks): 15.8 MB
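For readers unfamiliar with the technique, the core of a brick atlas is an indirection table that maps each virtual brick to a slot in a compact atlas. The following is a minimal CPU-side sketch of that idea, not the repo's implementation: `BrickAtlas`, `commitBrick`, and `resolve` are hypothetical names, and a real version would keep the table and the atlas in GPU textures.

```swift
// Hypothetical CPU-side model of a brick atlas with an indirection table.
struct BrickAtlas {
    let brickSize: Int                    // texels per brick edge, e.g. 8
    let bricksPerAxis: Int                // virtual grid size / brick size
    private(set) var indirection: [Int]   // virtual brick -> atlas slot, -1 if empty
    private(set) var slotCount = 0        // number of bricks actually stored

    init(gridSize: Int, brickSize: Int) {
        let n = gridSize / brickSize
        self.brickSize = brickSize
        self.bricksPerAxis = n
        self.indirection = Array(repeating: -1, count: n * n * n)
    }

    // Index of the brick that contains texel (x, y, z).
    private func tableIndex(x: Int, y: Int, z: Int) -> Int {
        let bx = x / brickSize, by = y / brickSize, bz = z / brickSize
        return (bz * bricksPerAxis + by) * bricksPerAxis + bx
    }

    // Gives the brick containing (x, y, z) an atlas slot if it has none yet.
    mutating func commitBrick(containingX x: Int, y: Int, z: Int) -> Int {
        let i = tableIndex(x: x, y: y, z: z)
        if indirection[i] < 0 {
            indirection[i] = slotCount
            slotCount += 1
        }
        return indirection[i]
    }

    // Maps a virtual texel to (atlas slot, texel inside that brick), or nil if empty.
    func resolve(x: Int, y: Int, z: Int) -> (slot: Int, localX: Int, localY: Int, localZ: Int)? {
        let slot = indirection[tableIndex(x: x, y: y, z: z)]
        if slot < 0 { return nil }
        return (slot, x % brickSize, y % brickSize, z % brickSize)
    }

    // Payload size of the atlas for the given texel format.
    func atlasBytes(bytesPerTexel: Int) -> Int {
        return slotCount * brickSize * brickSize * brickSize * bytesPerTexel
    }
}

var atlas = BrickAtlas(gridSize: 256, brickSize: 8)
let slotA = atlas.commitBrick(containingX: 130, y: 40, z: 7)
let slotB = atlas.commitBrick(containingX: 135, y: 47, z: 0) // same brick as above
let hit = atlas.resolve(x: 130, y: 40, z: 7)
let miss = atlas.resolve(x: 0, y: 0, z: 0)
print(slotA, slotB, atlas.atlasBytes(bytesPerTexel: 4))
```

Only bricks that were committed occupy atlas memory; every other entry costs one table index. The price is an extra lookup per sample, which is part of the implementation complexity mentioned in the conclusion.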
The result was clear: at 15.8 MB, the software atlas used only about one third of the memory of the hardware sparse path (48.4 MB). Small cubes ($8^3$, 2 KB each) follow the surface of the sphere much more tightly than thin plates ($64 \times 64 \times 1$, 16 KB each), so far more empty voxels are excluded.
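The effect of tile shape can be illustrated with a small geometric model. This is a rough sketch under stated assumptions, not the app's measurement code: it counts the axis-aligned tiles that intersect a thin band around a sphere in a $256^3$ grid, with an assumed radius of 96 texels and an assumed band of 2 texels. `TileShape`, `axisRange`, and `touchedTiles` are hypothetical names, and the byte counts cover texel payload only, so they are not expected to match the measured figures.

```swift
// Hypothetical tile shape in texels.
struct TileShape {
    let x: Int
    let y: Int
    let z: Int
}

// Nearest and farthest distance along one axis from `center` to the interval [lo, hi].
func axisRange(lo: Double, hi: Double, center: Double) -> (near: Double, far: Double) {
    let near = max(lo - center, 0.0, center - hi)
    let far = max(abs(lo - center), abs(hi - center))
    return (near, far)
}

// Counts tiles whose box intersects the shell between radius - band and radius + band.
func touchedTiles(gridSize: Int, tile: TileShape, radius: Double, band: Double) -> Int {
    let c = Double(gridSize) / 2.0 // sphere is centered in the grid
    var count = 0
    for z0 in stride(from: 0, to: gridSize, by: tile.z) {
        let rz = axisRange(lo: Double(z0), hi: Double(z0 + tile.z), center: c)
        for y0 in stride(from: 0, to: gridSize, by: tile.y) {
            let ry = axisRange(lo: Double(y0), hi: Double(y0 + tile.y), center: c)
            for x0 in stride(from: 0, to: gridSize, by: tile.x) {
                let rx = axisRange(lo: Double(x0), hi: Double(x0 + tile.x), center: c)
                let nearest = (rx.near * rx.near + ry.near * ry.near + rz.near * rz.near).squareRoot()
                let farthest = (rx.far * rx.far + ry.far * ry.far + rz.far * rz.far).squareRoot()
                if nearest <= radius + band && farthest >= radius - band {
                    count += 1
                }
            }
        }
    }
    return count
}

let gridSize = 256
let bytesPerTexel = 4 // R32Float
let radius = 96.0     // assumed sphere radius in texels
let band = 2.0        // assumed narrow band around the surface

let slab = TileShape(x: 64, y: 64, z: 1)
let brick = TileShape(x: 8, y: 8, z: 8)

let slabBytes = touchedTiles(gridSize: gridSize, tile: slab, radius: radius, band: band)
    * slab.x * slab.y * slab.z * bytesPerTexel
let brickBytes = touchedTiles(gridSize: gridSize, tile: brick, radius: radius, band: band)
    * brick.x * brick.y * brick.z * bytesPerTexel

print("64x64x1 tiles: \(slabBytes) bytes")
print("8x8x8 bricks:  \(brickBytes) bytes")
```

Under these assumptions the bricks commit noticeably fewer bytes than the plates, which matches the direction of the measured result even though the absolute numbers depend on the radius, the band width, and whatever else the app allocates.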
Metal 4 vs. Metal 3: What Actually Changed?
The biggest takeaway from this verification is that while Metal 4 introduces a more refined API (Placement Sparse Buffers, improved residency sets, etc.), the achievable memory reduction for sparse 3D textures is the same as in Metal 3.
The "minimum unit" of memory allocation is defined by the hardware's sparse page size. Since that hasn't changed, Metal 4 doesn't provide any inherent memory-saving advantage over Metal 3 for 3D textures. Both are limited by the same tile dimensions.
Conclusion
Metal 4's Placement Sparse Buffers offer a modern and clean API for resource management. However, for 3D memory optimization, we must still respect the physical limits of hardware tile granularity.
If you are dealing with sparse 3D data where high-density packing is critical, a Software Brick Atlas remains a superior choice despite the added implementation complexity.
Feel free to check out the repo for the implementation details!
tatsuya-ogawa/MetalPlacementSparseVerification
Verified on physical iOS hardware with Metal 4 support.
