Series: Unity GPU Instancing - Learning Out Loud -> Part 1 of 4
Part 1 - Why Instancing Exists: A Hands-On Baseline
Summary
We're going to spawn one hundred thousand cubes, profile the pain, flip on Enable GPU Instancing, then add per-renderer colors with MaterialPropertyBlock without breaking instancing. By the end, we'll have screenshots and numbers that explain where the time goes.
Prerequisites
- Unity 6000.2.2f1 Universal 3D project.
- Create a URP Unlit material named `Mat_Instanced`.
- Set VSync Count to `Don't Sync` (Default).
- Set `Application.targetFrameRate = -1` in a bootstrap script.
- Disable Dynamic Batching for consistent measurements.
- Keep Game view Stats, Profiler, and Frame Debugger handy.
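The VSync and frame-rate prerequisites can both be applied from one small component. This is a minimal sketch, not the script used for the numbers in this post; the class name `BenchmarkBootstrap` is made up for illustration.

```csharp
using UnityEngine;

// Hypothetical bootstrap component (the name is an assumption).
// Put it on any GameObject in the scene before entering Play Mode.
public class BenchmarkBootstrap : MonoBehaviour
{
    void Awake()
    {
        // 0 corresponds to "Don't Sync" for VSync Count.
        QualitySettings.vSyncCount = 0;

        // -1 tells Unity to use the platform's default target frame rate.
        Application.targetFrameRate = -1;
    }
}
```

Setting both in `Awake` means they apply before the first frame is measured, whatever the active quality level is set to in the editor.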
Project Setup
We're going to create an empty scene named `Part1_Baseline`. Create a GameObject named `InstancingPlayground` and attach the spawner script. Assign `Mat_Instanced` to the `baseMaterial` field. For the first test, make sure the material's Enable GPU Instancing checkbox is off.
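A spawner for this setup might look roughly like the sketch below. It is a stand-in, not the original script: only the `baseMaterial` field name comes from the steps above, while the class name `CubeSpawner`, the flat grid layout, and the `spacing` value are assumptions.

```csharp
using UnityEngine;

// Stand-in spawner sketch. Attach to the InstancingPlayground GameObject.
public class CubeSpawner : MonoBehaviour
{
    public Material baseMaterial;   // assign Mat_Instanced in the Inspector
    public int count = 100000;      // number of cubes to spawn
    public float spacing = 1.5f;    // assumed distance between cubes

    void Start()
    {
        // Lay the cubes out on a square grid in the XZ plane.
        int side = Mathf.CeilToInt(Mathf.Sqrt(count));

        for (int i = 0; i < count; i++)
        {
            GameObject cube = GameObject.CreatePrimitive(PrimitiveType.Cube);
            cube.transform.SetParent(transform, false);
            cube.transform.localPosition =
                new Vector3((i % side) * spacing, 0f, (i / side) * spacing);

            // Physics is not part of this test, so drop the default collider.
            Destroy(cube.GetComponent<Collider>());

            // sharedMaterial keeps every renderer on the same material asset
            // instead of creating a per-renderer copy.
            cube.GetComponent<MeshRenderer>().sharedMaterial = baseMaterial;
        }
    }
}
```

Using `sharedMaterial` rather than `material` matters here: touching `material` would clone the material per renderer, and the cubes would no longer share one material.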
Experiment A: Baseline with Instancing Off
We're going to use the scene as it is now. Make sure the material's Enable GPU Instancing is off. Enter Play Mode and record Game view Stats, Profiler Main Thread time, and a Frame Debugger capture. Expect many thousands of draws.
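If we also want an average FPS figure that doesn't depend on reading the Stats overlay by eye, a small logger is one way to get it. This is a sketch under assumptions: the class name `AverageFpsLogger` and the five-second window are arbitrary, and it is not necessarily how the FPS values in the results table were collected.

```csharp
using UnityEngine;

// Hypothetical helper: logs average FPS and frame time over a fixed window.
public class AverageFpsLogger : MonoBehaviour
{
    public float windowSeconds = 5f;  // assumed averaging window

    float elapsed;  // unscaled seconds accumulated in the current window
    int frames;     // frames counted in the current window

    void Update()
    {
        // unscaledDeltaTime ignores Time.timeScale, so the result reflects
        // real frame durations.
        elapsed += Time.unscaledDeltaTime;
        frames++;

        if (elapsed >= windowSeconds)
        {
            float avgFps = frames / elapsed;
            float avgMs = 1000f * elapsed / frames;
            Debug.Log($"avg FPS: {avgFps:F1}, avg frame time: {avgMs:F1} ms ({frames} frames)");

            elapsed = 0f;
            frames = 0;
        }
    }
}
```

Skipping the first window after entering Play Mode is a good idea, since it includes the cost of spawning the cubes.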
Experiment B: Flip on Enable GPU Instancing
For this we're going to tick the checkbox on `Mat_Instanced`. Repeat the same measurements. Expect lower main thread time but possibly a similar draw count, because each renderer still issues a draw.
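The checkbox maps to `Material.enableInstancing`, so the A/B switch can also be driven and logged from a script. A minimal sketch, with the class name `InstancingToggle` made up for illustration:

```csharp
using UnityEngine;

// Hypothetical helper: sets and logs the instancing flag for a run.
public class InstancingToggle : MonoBehaviour
{
    public Material baseMaterial;   // assign Mat_Instanced in the Inspector
    public bool enableInstancing;   // false for Experiment A, true for B

    void Awake()
    {
        // Same flag as the Enable GPU Instancing checkbox on the material.
        baseMaterial.enableInstancing = enableInstancing;
        Debug.Log($"{baseMaterial.name}: enableInstancing = {baseMaterial.enableInstancing}");
    }
}
```

Be aware that in the editor this edits the material asset itself, so the change sticks after leaving Play Mode. The log line is mostly there so each capture records which state it was measured in.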
Experiment C: Per-Instance Color via MPB
Use Shader Graph or an HLSL shader that declares the color as an instanced property, and set a per-renderer color with `MaterialPropertyBlock` so instancing remains intact.
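On the C# side, setting a per-renderer color through a `MaterialPropertyBlock` can be sketched as follows. The property name `_BaseColor` is an assumption and has to match whatever color property the shader declares as instanced; the class name `PerRendererColor` and the random colors are also just for illustration.

```csharp
using UnityEngine;

// Hypothetical helper: gives every child renderer its own color
// without creating per-renderer material copies.
public class PerRendererColor : MonoBehaviour
{
    // Assumed property name; it must match the instanced color in the shader.
    static readonly int BaseColorId = Shader.PropertyToID("_BaseColor");

    // Call this once the cubes exist, for example at the end of the
    // spawner's Start method.
    public void ApplyColors()
    {
        // One block can be reused: SetPropertyBlock copies the values.
        var block = new MaterialPropertyBlock();

        foreach (MeshRenderer r in GetComponentsInChildren<MeshRenderer>())
        {
            block.SetColor(BaseColorId, Random.ColorHSV());
            r.SetPropertyBlock(block);
        }
    }
}
```

Because the material asset is never modified or cloned, all renderers keep sharing `Mat_Instanced`, and the color differences travel as per-instance data instead.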
Part 1 Results
Scenario | Count | SRP | Instancing | Per-Instance Color | Technique | Batches | SetPass | Main Thread (ms) | FPS (avg) |
---|---|---|---|---|---|---|---|---|---|
Baseline (GameObjects) | 100000 | Off | Off | — | GO | 103324 | 47 | 52.2 | 19.1 |
GameObjects + Instancing | 100000 | Off | On | — | GO | 70764 | 47 | 46.6 | 21.5 |
GameObjects + Instancing + MPB | 100000 | Off | On | Yes | GO | 157 | 24 | 30.9 | 32.3 |
Conclusions
- The checkbox alone is not magic! Turning on GPU instancing while keeping everything as separate GameObjects helped only a little: batches fell from 103,324 to 70,764 and main thread time from 52.2 ms to 46.6 ms. SetPass stayed at 47, so the CPU state setup didn't improve.
- The real win came from using instanced properties correctly. When we switched to per-instance color via MaterialPropertyBlock (with the shader marking the color as an instanced property), Unity actually combined draws, and batches dropped to 157.
What does this mean so far?
1. Enabling GPU Instancing is necessary but not sufficient.
Keep materials and keywords identical, and feed variation through instanced properties (MPB or instanced arrays). This lets Unity instance draws automatically. Still, 157 batches (roughly 640 cubes per batch on average) is getting close to the per-pass limit, and that count doesn't include other passes like shadows.
2. CPU is a big bottleneck when dealing with GameObjects.
Even after the huge batch drop, 30.9 ms of main thread time is nowhere near the roughly 16.7 ms per frame that 60 FPS needs. It's overhead from the transforms, culling, and renderers of 100k GameObjects. To scale further, we'll need to move to `Graphics.DrawMeshInstanced` (Part 2!) and then Indirect (Part 3).
3. SetPass tells the story.
No change with the checkbox alone (47 to 47!) means we aren't reducing state binds; the big drop to 24 with MPB shows we're finally reusing the same pass across big instance groups.
Next Post - Part 2!
We will compare SRP Batcher with and without instancing, then switch to `Graphics.DrawMeshInstanced` to achieve real batching.
I realized after the fact that I forgot to include the shader file and the spawning file! Happy to share if anyone is interested