WebGPU Engine from Scratch Part 9: Shadow Maps

This was a topic I'd wanted to get to for a while. There are two common ways to do shadows in 3D graphics (plus some less common ones). Raytracing is the most accurate and conceptually simple, but very expensive. When we can't afford that, the standard approach is shadow mapping, because it's inexpensive and works well enough. This is what most games you've seen use, and you can usually tell by the characteristic artifacts where shadows look "pixelated". That's what I want to implement. There are tons of tweaks to shadow maps that improve results, but for now I'm just looking at the simpler version.

How it works

The intuition here is that we treat each light like a camera and take a picture of the scene with it. This picture is just a depth map. Then, when we render the scene, for each point we check whether the light's camera saw it: that is, whether the fragment's distance from the light matches the depth the light recorded in that direction. If it matches, the fragment is not in shadow. If the distance in the depth map is shorter, something else occluded that point between it and the light.
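In pseudocode the per-fragment test looks something like this (just a sketch to fix the idea; transform, toUv, and sampleDepth are stand-ins for things we'll actually build below):

//sketch: the per-fragment shadow test (helper names are illustrative)
function isInShadow(worldPosition, lightViewProjection, shadowMap) {
    const lightSpace = transform(lightViewProjection, worldPosition); //project into the light's clip space
    const uv = toUv(lightSpace); //remap x,y from [-1,1] to [0,1]
    const nearestDepth = sampleDepth(shadowMap, uv); //closest thing the light saw in this direction
    return lightSpace.z > nearestDepth; //something nearer to the light occludes this fragment
}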

Directional lighting

We have two light types, directional and point. Directional is the easier of the two, so we'll start there. A directional light modeled as a camera doesn't have any perspective because it's infinitely wide (we do have to decide on a finite cut-off for the depth map dimensions though). So it's really just an orthographic camera pointed in the direction of the light. Let's do that.

Housekeeping

First some housekeeping. We'll need to build a new pipeline, but because these take up a lot of space I want to start moving them out of the main engine class. Each engine now gets its own folder so it can have sub-resources specific to it. Small path changes.

Building the shadow map

Next we need to build the new pipeline that creates the shadow map. This is basically just rendering the geometry with no fragment shader. Since we turned on depth testing we'll automatically get the depth output even though we aren't rendering any color. We'll also need the appropriate bindings for the vertex transforms. Most of this can be borrowed from the camera code, but we do need to decide how to set the bounds for the projection matrix.
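For reference, a depth-only pipeline in WebGPU can simply omit the fragment stage entirely; a minimal sketch (the module, layout, and buffer variables are placeholders for whatever your engine already has):

//sketch: depth-only pipeline for the shadow pass (names are placeholders)
const shadowPipeline = device.createRenderPipeline({
    label: "shadow-map-pipeline",
    layout: shadowPipelineLayout,
    vertex: {
        module: shadowShaderModule,
        entryPoint: "vertex_main",
        buffers: vertexBufferLayouts
    },
    //no fragment stage: the depth attachment is the only output we want
    depthStencil: {
        depthWriteEnabled: true,
        depthCompare: "less",
        format: "depth32float"
    },
    primitive: { topology: "triangle-list" }
});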

For the directional light, in order to build the view matrix we need to pick a position. Since it's orthographic it shouldn't really matter, as long as it's far enough out to encompass the scene. I've kept it slightly generic using constants, even though we could simplify if we're always looking at the origin. 0.5 is a good enough distance to show the teapot, but it needs to be manually calibrated per scene for now. We also need to pick the bounds of the view frustum. I went with -2 to 2 with a depth of 1 (this will be updated later because it turns out it's not a great choice). I created a new entity called ShadowMappedLight with this logic. The reason is that it makes sense to group this with the light, but it's implementation specific (ray-tracing, for example, doesn't need these concepts), so I subclassed it.

//shadow-mapped-light.js
import { Light } from "./light.js";
import { getLookAtMatrix, getOrthoMatrix, scaleVector, subtractVector } from "../utilities/vector.js";

const distance = 0.5;
const center = [0, 0, 0];
const frustumScale = 2;

export class ShadowMappedLight extends Light {
    //position the light "camera" opposite its direction, looking at the scene center
    getViewMatrix() {
        const lightPosition = scaleVector(subtractVector(center, this.direction), distance);
        return getLookAtMatrix(lightPosition, center);
    }

    //orthographic frustum sized to the shadow map's aspect ratio
    getProjectionMatrix(aspectRatio) {
        const right = aspectRatio * frustumScale;
        return getOrthoMatrix(-right, right, -frustumScale, frustumScale, 0.1, distance * 2);
    }
}

One thing to be careful about that I didn't notice before: when the camera, or in this case the light, is looking straight down the Y axis, the view matrix gets filled with NaNs because the forward vector is parallel to the default up vector and their cross product is the zero vector. In this case we need to choose a different up. I've modified getLookAtMatrix to handle this:

//vector.js
export function getLookAtMatrix(position, target, up = UP) {
    const forward = normalizeVector(subtractVector(target, position));

    //if forward is (nearly) parallel to up, the cross product degenerates, so pick a different up
    if(Math.abs(dotVector(forward, up)) > 0.999){
        up = Math.abs(forward[1]) < 0.999 ? UP : FORWARD;
    }

    const right = normalizeVector(crossVector(up, forward));
    const newUp = crossVector(forward, right);

    return new Float32Array([
        right[0], newUp[0], forward[0], 0,
        right[1], newUp[1], forward[1], 0,
        right[2], newUp[2], forward[2], 0,
        -dotVector(position, right), -dotVector(position, newUp), -dotVector(position, forward), 1
    ]);
}

If the forward vector is too similar to up we choose FORWARD instead. Note that a direction that merely contains a Y component, like [0,-1,1], normalizes to a dot product of about -0.707 against UP, which is under the threshold, so UP is still correctly chosen; only a (nearly) full Y direction triggers the fallback. This should make things a bit more robust.
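A quick sanity check of the degenerate case (assuming UP is [0,1,0] and FORWARD is [0,0,1] as module constants in vector.js):

//looking straight down the Y axis: forward is [0,-1,0], parallel to the default up
const m = getLookAtMatrix([0, 0.5, 0], [0, 0, 0]);
//without the fallback, crossVector(up, forward) would be the zero vector and the
//normalize would fill the matrix with NaNs; with it, every entry stays finite
console.assert(m.every(Number.isFinite));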

We also need to get the aspect ratio of the shadow map to generate the view frustum bounds so that we're not distorting the image. I'm not sure if it's strictly necessary but it puts me at ease.

The rest is nearly identical to the main scene bind group, minus the stuff we don't need.

//gpu-engine.js
//create the matrices from above...
const scene = {
    viewMatrix,
    projectionMatrix,
    modelMatrix: getTranspose(mesh.getModelMatrix(), [4, 4]), //change to col major?
    normalMatrix: getTranspose(
        getInverse(
            trimMatrix(
                mesh.getModelMatrix(),
                new Float32Array([1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1]),
                [4, 4],
                [3, 3]
            ),
            [3, 3]
        ),
        [3, 3])
};
const sceneData = packStruct(scene, [
    ["viewMatrix", "mat4x4f32"],
    ["projectionMatrix", "mat4x4f32"],
    ["modelMatrix", "mat4x4f32"],
    ["normalMatrix", "mat3x3f32"]
]);
const sceneBuffer = this.#device.createBuffer({
    size: sceneData.byteLength,
    usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
    label: "shadow-map-scene-buffer"
});
this.#device.queue.writeBuffer(sceneBuffer, 0, sceneData);
const sceneBindGroup = this.#device.createBindGroup({
    label: "shadow-map-scene-bind-group",
    layout: bindGroupLayouts.get("scene"),
    entries: [
        {
            binding: 0,
            resource: {
                buffer: sceneBuffer,
                offset: 0,
                size: sceneData.byteLength
            }
        }
    ]
});
passEncoder.setBindGroup(0, sceneBindGroup);

When debugging this I highly recommend rendering an actual scene, because the depth buffer is a little finicky to look at even with our debug tool. Speaking of which, I added an option, shouldGammaScale:

//debug-utils.js
/**
 * 
 * @param {GPUDevice} device 
 * @param {*} context 
 * @param {{ shouldGammaScale?: boolean }} options 
 * @returns 
 */
export function setupExtractDepthBuffer(device, context, options = {}) {
//debug-utils.js
const shaderModule = device.createShaderModule({
    label: "depth-buffer-export-shader",
    code: `
        struct VertexOut {
            @builtin(position) frag_position : vec4<f32>,
            @location(0) clip_position: vec4<f32>,
            @location(2) uv: vec2<f32>
        };
        @group(0) @binding(0) var depthSampler: sampler;
        @group(0) @binding(1) var depthTex: texture_depth_2d;
        @vertex
        fn vertex_main(@location(0) position: vec2<f32>) -> VertexOut
        {
            var output : VertexOut;
            output.frag_position =  vec4(position, 1.0, 1.0);
            output.clip_position = vec4(position, 1.0, 1.0);
            output.uv = position.xy * 0.5 + vec2<f32>(0.5, 0.5);
            return output;
        }
        @fragment
        fn fragment_main(fragData: VertexOut) -> @location(0) vec4<f32> {
            let depth = textureSample(depthTex, depthSampler, fragData.uv);

            ${options.shouldGammaScale
                ? `
                let gamma_depth = pow(depth, 10.0);
                return vec4<f32>(gamma_depth, gamma_depth, gamma_depth, 1.0);
                `
                : `
                return vec4<f32>(depth, depth, depth, 1.0);
                `
            }
        }
    `
});

The option exists because this depth buffer doesn't really need gamma scaling; we're using the depth range more appropriately here (I probably could have made this a parameter at the render call instead of setup, but then I'd have to add bindings and I was lazy). Anyway, here's the depth buffer.

Depth buffer of teapot, upside down

Hmmmm... it's correct for being taken from the side ([0,0,-1]), but it's upside down (note: if you want to get in closer, shorten the distance constant). This is because the depth texture has Y = 0 at the top, increasing downward; we didn't notice this last time because the scene was symmetric. A small fix to the shader flips the Y axis:

output.uv = vec2(position.x * 0.5 + 0.5, 1.0 - (position.y * 0.5 + 0.5));

Depth buffer of teapot, rightside up

At first I thought this was just a visualization bug but, nope, even when using the shadow map we need to flip the Y (V value). I spent a lot of time debugging that one.

Binding the shadow maps

Now that we have the map, we need to use it. One annoying limitation is that depth textures cannot have layers, which would probably be the most intuitive way to support multiple shadow-casting lights. Instead we have to create a finite number of bindings. If we have too many, it won't run; if we have too few, we need to fill them with dummy depth textures.

//gpu-engine.js - initializeLights
this.#shadowMaps.set("dummy", this.#device.createTexture({
    label: "dummy-depth-texture",
    size: { width: 1, height: 1, depthOrArrayLayers: 1 },
    format: "depth32float",
    usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
}));

We also need a sampler. This one is for depth so it needs to do comparisons:

//gpu-engine.js - initializeSampler
this.#samplers.set("shadow-map-default", this.#device.createSampler({
    label: "shadow-map-default-sampler",
    compare: "less",
    magFilter: "nearest",
    minFilter: "nearest"
}));

Then we can set up the bind group. We need access to the same transforms we used to make the shadow map so we can take the world space position and recover the coordinates in the light's view-projection space, which can then be mapped to UV space in the texture. You could also combine the two matrices so you don't have to pass both, since we never use them individually, but for now I've split them out to be explicit.
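Combining them would just be a premultiplication on the CPU before packing; a hypothetical sketch (multiplyMatrix is assumed to exist in the vector utilities, and the argument order may need to match your matrix convention):

//hypothetical: pack one light view-projection matrix instead of two
const lightViewProjection = multiplyMatrix(
    light.getProjectionMatrix(shadowMapAspectRatio),
    light.getViewMatrix(),
    [4, 4]
);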

//gpu-engine.js
setMainShadowBindGroup(passEncoder, bindGroupLayouts, lights, shadowMaps){
    const lightsArray = lights.entries().toArray();
    if(lightsArray.length > 4){
        throw new Error("This engine can only handle shadows from 4 lights max");
    }
    const lightTransforms = [];
    const shadowMapsToBind = new Array(4).fill(null);
    let i = 0;
    for(const [key, light] of lightsArray){
        const shadowMap = shadowMaps.get(key);
        shadowMapsToBind[i++] = shadowMap;
        if(shadowMap){
            const shadowMapAspectRatio = shadowMap.width / shadowMap.height;
            lightTransforms.push({
                projectionMatrix: light.getProjectionMatrix(shadowMapAspectRatio),
                viewMatrix: light.getViewMatrix()
            })
        } else {
            lightTransforms.push({
                projectionMatrix: getEmptyMatrix([4, 4]),
                viewMatrix: getEmptyMatrix([4, 4])
            });
        }
    }
    const lightTransformData = packArray(lightTransforms, [
        ["projectionMatrix", "mat4x4f32"],
        ["viewMatrix", "mat4x4f32"]
    ]);
    const lightTransformBuffer = this.#device.createBuffer({
        size: lightTransformData.byteLength,
        usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST | GPUBufferUsage.STORAGE,
        label: "main-shadow-light-transform-buffer"
    });
    this.#device.queue.writeBuffer(lightTransformBuffer, 0, lightTransformData);
    const dummyView = shadowMaps.get("dummy").createView();
    const shadowBindGroup = this.#device.createBindGroup({
        label: "main-shadow-bind-group",
        layout: bindGroupLayouts.get("shadows"),
        entries: [
            {
                binding: 0,
                resource: this.#samplers.get("shadow-map-default")
            },
            {
                binding: 1,
                resource: {
                    buffer: lightTransformBuffer,
                    offset: 0,
                    size: lightTransformData.byteLength
                }
            },
            ...(shadowMapsToBind.map((value, i) => {
                return {
                    binding: i + 2,
                    resource: value ? value.createView() : dummyView
                };
            }))
        ]
    });
    passEncoder.setBindGroup(3, shadowBindGroup);
}

There's a lot of boilerplate for filling in dummy values when the number of lights and shadow maps don't line up. The key steps: create the transforms and pack them, then attach the shadow maps (or dummy views) until we reach the max (4). This is hard-coded manually since there's no way to bind an array of depth maps per light. Make sure to call this from setMainBindGroups with the right parameters.

//gpu-engine.js - setMainBindGroups
this.setMainShadowBindGroup(passEncoder, bindGroupLayouts, lights, shadowMaps)

The shader

We'll start with 4 possible lights, since our scenes shouldn't get too complicated yet. In the shader we'll set up this bind group:

const shadow_max = 4;
struct LightTransform {
    projection_matrix: mat4x4<f32>,
    view_matrix: mat4x4<f32>,
}

@group(3) @binding(0) var shadow_sampler: sampler_comparison;
@group(3) @binding(1) var<storage, read> light_transforms: array<LightTransform>;
@group(3) @binding(2) var shadow_map_0: texture_depth_2d;
@group(3) @binding(3) var shadow_map_1: texture_depth_2d;
@group(3) @binding(4) var shadow_map_2: texture_depth_2d;
@group(3) @binding(5) var shadow_map_3: texture_depth_2d;

Notice the sampler_comparison. This lets WGSL know it's a comparison sampler, which is also necessary for "auto" bind group layouts to work correctly (back in the debug utils we made manual bind group layouts). Reminder if you are debugging: make sure you actually use the bindings in the shader, or they get eliminated!
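For comparison, a manual bind group layout has to spell these types out explicitly; a sketch with just the sampler and one depth texture entry (the rest follow the same pattern):

//sketch: manual layout entries for a comparison sampler and a depth texture
const shadowBindGroupLayout = device.createBindGroupLayout({
    label: "shadow-bind-group-layout",
    entries: [
        {
            binding: 0,
            visibility: GPUShaderStage.FRAGMENT,
            sampler: { type: "comparison" } //matches sampler_comparison in WGSL
        },
        {
            binding: 2,
            visibility: GPUShaderStage.FRAGMENT,
            texture: { sampleType: "depth" } //depth textures aren't filterable floats
        }
        //...plus the storage buffer and the other shadow map bindings
    ]
});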

Shadow comparison

We need access to the light-space transform to convert a world space coordinate into a light space one (and then turn that into a shadow map UV). Once we have that, in the fragment shader we can work backward to get the UV to sample.

fn get_shadow(world_position: vec4<f32>) -> f32 {
    var shadow = 1.0;
    for(var i = 0; i < shadow_max; i++){
        let shadow_space = light_transforms[i].projection_matrix * light_transforms[i].view_matrix * world_position;
        let shadow_projection = shadow_space.xyz / shadow_space.w;
        let shadow_uv = shadow_projection.xy * 0.5 + vec2<f32>(0.5);
        let shadow_depth = shadow_projection.z;
        switch(i){
            case 0: { shadow *= textureSampleCompare(shadow_map_0, shadow_sampler, shadow_uv, shadow_depth); }
            case 1: { shadow *= textureSampleCompare(shadow_map_1, shadow_sampler, shadow_uv, shadow_depth); }
            case 2: { shadow *= textureSampleCompare(shadow_map_2, shadow_sampler, shadow_uv, shadow_depth); }
            case 3: { shadow *= textureSampleCompare(shadow_map_3, shadow_sampler, shadow_uv, shadow_depth); }
            default: {} //shouldn't happen...
        }
    }
    return shadow;
}
//pbr.wgsl
@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32> {
    //...other stuff
    var shadow = get_shadow(frag_data.world_position);
    total_color *= shadow; //add this line

    var tone_mapped_color = total_color / (total_color + vec3(1.0));
    return vec4(pow(tone_mapped_color, vec3(1.0/2.2)), 1.0);
}

We first multiply by the light's projection-view matrix, which gives us light clip-space coordinates, then divide by w to project into the equivalent of "screen" space for the light (-1 to 1). We can then transform that into UVs, because we know the edges will line up with the shadow map if we convert coordinates from (-1, 1) to (0, 1). The z value is the fragment's depth from the light's point of view, in the same normalized range the shadow map stores.
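As a concrete example of the remap (plain JS just for illustration):

//light-space NDC x,y run from -1 to 1; UVs run from 0 to 1
const ndc = [-0.5, 0.25];
const u = ndc[0] * 0.5 + 0.5; //-0.5 -> 0.25
const v = ndc[1] * 0.5 + 0.5; //0.25 -> 0.625 (we'll revisit the Y direction shortly)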

If we know how far from the light the fragment is supposed to be, we can compare that with the value in the shadow map. If the shadow map shows less, the fragment was occluded and there's a shadow on it. The value we get from textureSampleCompare isn't boolean though; it's 0.0 or 1.0, where 1.0 is not occluded and 0.0 is occluded (remember the compare function in the sampler is less, so if the provided value is less than the value in the shadow map it returns 1.0). Note that it's strictly less, because equal means it's in shadow (and barring precision issues this is what we expect: the shadow map depth should be equal or less; in practice greater is possible too, and that's not in shadow). If we changed the shadow map sampler to linear filtering, the value could actually vary between 0 and 1 depending on how many samples passed, which would give us softer shadows. The output is simply multiplied into the light to get the final result (this is actually not correct, it needs to be per light, but it's a starting point).
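To make the comparison concrete, with made-up numbers:

//the sampler's compare is "less": 1.0 when (reference < stored), else 0.0
const stored = 0.40;                     //depth the light recorded at this texel
let reference = 0.55;                    //fragment is farther from the light than the occluder
console.log(reference < stored ? 1 : 0); //0: occluded, in shadow
reference = 0.39;                        //fragment is the nearest thing the light saw
console.log(reference < stored ? 1 : 0); //1: lit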

If everything was correct (always double check your bindings are actually uploading the resources you think they are!) we'll get something like this (light direction is [0,-1,0]):

Shadowed teapot with severe shadow acne

This is shadow "acne". The places the shadow map sampled are slightly different from the points in the scene, so on steep faces we get fragments that are slightly nearer or farther but correspond to the same shadow map texel. We can add a bias value to play around with, which basically adds a tolerance so that fragments slightly deeper than the stored value still count as lit:

shadow *= textureSampleCompare(shadow_map_0, shadow_sampler, shadow_uv, shadow_depth - 0.05);

Cranking it up to 0.05 (which is really high) reduces but doesn't fully eliminate the artifacts at the most oblique angles.

Shadowed teapot with some acne

If we use linear filtering it somewhat improves but is still a bit weird.

Shadowed teapot with some blurred acne

We can also up the resolution from 1280x720 to 4096x4096:

Shadowed teapot with some blurred acne and less jagged edges

This makes the shadow higher resolution and reduces aliasing. As for the holes, those exist in the model itself. Playing around with it more (bias values, light position, shadow map frustum, etc.), not everything looks quite right. I get a sharp cutoff at a certain point.

Shadowed teapot with bug where back part of scene is oddly in shadow

The acne artifacts come from the bias, but the directional light (-1,-1,0) should be shadowing the back part of the rug. As explained in the section about debug visualizations, I'd forgotten to flip the Y coordinate on the shadow map. This took a while to figure out; I actually had to project the depth texture onto the scene to find it. That isn't easy, because depth textures can't be sampled normally unless you use manual bind group layouts. In the end I was forced to switch away from auto-binding because there didn't seem to be a way to indicate a sampler is non-filtering. I guess it gives a little more control, but it's a lot of boilerplate to carry around. Here's the fixed version:

-let shadow_uv = shadow_projection.xy * 0.5 + vec2<f32>(0.5);
+let shadow_uv = vec2(shadow_projection.x * 0.5 + 0.5, 1.0 - (shadow_projection.y * 0.5 + 0.5));

Teapot with a good looking shadow

Multiple lights

So the code above has a lot of boilerplate to accommodate multiple lights, but there's still only one light. In fact, the code itself is wrong: we need to associate each shadow map with its light, because a map should only occlude its own light, not all of them, so we can't just multiply the final output. This meant more refactoring and code shuffling. The lights should carry their own transforms so it's easier to iterate over them, and the shadow maps really are tied to the light itself. Lights should also have a property indicating whether they cast shadows at all, since in a complex scene we don't want to pay for too many shadow passes and can just ignore unimportant lights.

//shadow-mapped-light.js
import { Light } from "./light.js";
import { getLookAtMatrix, getOrthoMatrix, scaleVector, subtractVector } from "../utilities/vector.js";

const distance = 0.75;
const center = [0, 0, 0];
const frustumScale = 2;

export class ShadowMappedLight extends Light {
    #hasShadow = false;

    constructor(options){
        super(options);
        this.#hasShadow = options.hasShadow ?? false;
    }
    getViewMatrix() {
        const lightPosition = scaleVector(subtractVector(center, this.direction), distance);
        return getLookAtMatrix(lightPosition, center);
    }

    getProjectionMatrix(aspectRatio) {
        const right = aspectRatio * frustumScale;
        return getOrthoMatrix(-right, right, -frustumScale, frustumScale, 0.1, Math.min(distance * 2, 2.0));
    }
    set hasShadow(value){
        this.#hasShadow = value;
    }
    get hasShadow(){
        return this.#hasShadow;
    }
}
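Usage might look something like this (hasShadow and direction come from the code above; the other option names are assumptions about the base Light class):

//hypothetical usage; option names besides hasShadow are assumed from the Light base class
const sun = new ShadowMappedLight({
    type: "directional",
    direction: [-1, -1, 0],
    color: [1, 1, 1, 1],
    hasShadow: true
});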

Once we have this, we can dump the whole shadow bind group entirely. This is actually better, since I found that the max bind groups on my machine is only 4, so it's best not to use more than we need. Everything goes in the light bind group instead.

setMainLightBindGroup(passEncoder, bindGroupLayouts, lights, shadowMaps){
    let shadowMapIndex = 0;
    const shadowMappedLights = lights
        .entries()
        .map(([key, value]) => {
            const shadowMap = shadowMaps.get(key);
            //not every light has a shadow map, so guard the aspect ratio
            const shadowMapAspectRatio = shadowMap ? shadowMap.width / shadowMap.height : 1;
            return {
                typeInt: value.typeInt,
                position: value.position,
                direction: value.direction,
                color: value.color,
                shadowMap,
                projectionMatrix: shadowMap ? value.getProjectionMatrix(shadowMapAspectRatio) : getEmptyMatrix([4,4]),
                viewMatrix: shadowMap ? value.getViewMatrix() : getEmptyMatrix([4,4]),
                hasShadow: value.hasShadow ? 1 : 0,
                shadowMapIndex: (value.hasShadow && shadowMap) ? shadowMapIndex++ : -1
            };
        }).toArray();
    const shadowMapsToBind = shadowMappedLights
        .filter(lightData => lightData.shadowMapIndex > -1)
        .map(lightData => lightData.shadowMap);
    const lightData = packArray(shadowMappedLights,
    [ 
        ["typeInt", "u32"],
        ["position", "vec3f32"],
        ["direction", "vec3f32"],
        ["color", "vec4f32"],
        ["projectionMatrix", "mat4x4f32"],
        ["viewMatrix", "mat4x4f32"],
        ["hasShadow", "u32"],
        ["shadowMapIndex", "i32"]
    ]
    , 64);
    const lightBuffer = this.#device.createBuffer({
        size: lightData.byteLength,
        usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST | GPUBufferUsage.STORAGE,
        label: "main-light-buffer"
    });
    this.#device.queue.writeBuffer(lightBuffer, 0, lightData);
    const lightCountBuffer = this.#device.createBuffer({
        size: 4,
        usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
        label: "main-light-count-buffer"
    });
    this.#device.queue.writeBuffer(lightCountBuffer, 0, new Int32Array([lights.size]));
    const placeholderView = shadowMaps.get("placeholder").createView({ label: "placeholder-view" });
    const lightBindGroup = this.#device.createBindGroup({
        label: "main-light-bind-group",
        layout: bindGroupLayouts.get("lights"),
        entries: [
            {
                binding: 0,
                resource: {
                    buffer: lightBuffer,
                    offset: 0,
                    size: lightData.byteLength
                }
            },
            {
                binding: 1,
                resource: {
                    buffer: lightCountBuffer,
                    offset: 0,
                    size: 4
                }
            },
            {
                binding: 2,
                resource: this.#samplers.get("shadow-map-default")
            },
            ...(getRange({ end: 3 }).map((index) => {
                const shadowMap = shadowMapsToBind[index];
                return {
                    binding: index + 3,
                    resource: shadowMap ? shadowMap.createView({ label: `shadow-view-${index}`}) : placeholderView
                };
            })),
            {
                binding: 7,
                resource: this.#samplers.get("shadow-map-debug")
            }
        ]
    });
    passEncoder.setBindGroup(2, lightBindGroup);
}

We can do this with just two lists. First we iterate over the lights and pack their new properties in. Remember that bools need to be converted to ints. We also need to keep track of which shadow map slot is being used, since some lights might not cast shadows. The second list figures out which slots are actually filled and by which shadow map. Like last time, we bind 4 slots and any that don't have shadow maps get a dummy (I actually renamed these variables and keys to "placeholder" because it describes them better).

Now on to the shader.

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32>
{   
    var surface_albedo = textureSample(albedo_map, albedo_sampler, frag_data.uv).rgb;
    var roughness_from_map = textureSample(roughness_map, roughness_sampler, frag_data.uv).x;
    var roughness = max(mix(material.roughness, roughness_from_map, f32(material.use_specular_map)), 0.0001);
    var f0 = mix(vec3(0.04, 0.04, 0.04), material.base_reflectance, material.metalness);
    var total_color = vec3(0.0);
    var normal = normalize(frag_data.normal);
    var shadow = get_shadow(frag_data.world_position);

    for(var i: u32 = 0; i < light_count.count; i++){
        var light = lights[i];
        var light_distance = length(light.position - frag_data.world_position.xyz);
        var to_light = vec3(0.0);

        switch light.light_type {
            case 0: {
                to_light = normalize(light.position - frag_data.world_position.xyz);
            }
            case 1: {
                to_light = normalize(-light.direction);
            }
            default: {}
        }

        var attenuation = 1.0 / pow(light_distance, 2.0);
        var radiance = light.color.rgb * attenuation;
        var lit_color = get_bdrf(
            surface_albedo, 
            f0, 
            roughness, 
            material.metalness,
            normal, 
            radiance, 
            to_light,
            scene.camera_position, 
            frag_data.world_position.xyz
        );

        let shadowed_color = lit_color * shadow;
        total_color += shadowed_color;
    }

    var tone_mapped_color = total_color / (total_color + vec3(1.0));
    return vec4(pow(tone_mapped_color, vec3(1.0/2.2)), 1.0);
}

The main change is that the shadow multiplication moved inside the loop so it can be applied per light, with the result added into the total. get_shadow has minor changes to read the transforms from the light object and to use the has_shadow property to skip the calculation for lights that don't need it.

fn get_shadow(world_position: vec4<f32>) -> f32 {
    var shadow = 1.0;
    for(var i: u32 = 0; i < light_count.count; i++){
        if lights[i].has_shadow == 0 {
            continue;
        }
        let shadow_space = lights[i].projection_matrix * lights[i].view_matrix * world_position;
        let shadow_projection = shadow_space.xyz / shadow_space.w;
        let shadow_uv = vec2(shadow_projection.x * 0.5 + 0.5, 1.0 - (shadow_projection.y * 0.5 + 0.5));
        let shadow_depth = shadow_projection.z;
        switch(lights[i].shadow_map_index){
            case 0: { shadow *= textureSampleCompare(shadow_map_0, shadow_sampler, shadow_uv, shadow_depth - 0.01); }
            case 1: { shadow *= textureSampleCompare(shadow_map_1, shadow_sampler, shadow_uv, shadow_depth - 0.01); }
            case 2: { shadow *= textureSampleCompare(shadow_map_2, shadow_sampler, shadow_uv, shadow_depth - 0.01); }
            case 3: { shadow *= textureSampleCompare(shadow_map_3, shadow_sampler, shadow_uv, shadow_depth - 0.01); }
            default: { } //shouldn't happen...
        }
    }
    return shadow;
}

This took a little debugging to make sure all the values were actually being bound and packed, and that the shadow maps were rendering correctly before being bound. We can test that multiple directional lights do in fact cast shadows:

Teapot casting multiple shadows

We can play around with turning off shadows, moving the lights and changing colors to convince ourselves it works.

Cleaning up the shadows a bit

I'm still not too happy with the look of the shadows and the banding (also called shadow "acne"), so I wanted to try something to reduce it at oblique angles. I tried scaling the bias factor based on the dot product of the light direction and the normal (i.e. surfaces more oblique to the light get more bias):

let diffuse_factor = max(dot(normal, to_light), 0.0);
let bias = mix(shadow_bias, 0.0, diffuse_factor);

At first I got this wrong because I mixed in the wrong order (aligned vectors have a dot product near 1.0, and that case should tend toward 0.0 bias). The result looks better now, but I also found that just setting the bias to a constant 0.05 with the fixed shadow code works fine. Whatever.

Teapot without shadow acne

We can actually optimize a little by letting the GPU handle this. We can do it by setting some values on the depth stencil state when creating the shadow map pipeline. To add a constant bias we set depthBias; this is similar to subtracting a bias in the main pass, except the bias is added to the depth map as it gets rendered. For a bias that accounts for the slope of the fragment we use depthBiasSlopeScale. These go where the depth buffer is defined in the pipeline descriptor (not the render pass):

//pipelines.js - getShadowMapPipeline
depthStencil: {
    depthWriteEnabled: true,
    depthCompare: "less",
    depthBias: 1000000,
    depthBiasSlopeScale: 2.0,
    format: "depth32float"
}

I removed the bias term in the shader, used the above instead, and got the same result. Tuning the depthBias value is kinda stupid because the buffer is non-linear; as best I can tell you just guess and check, so I don't know how sustainable this is or how to calculate it correctly. But it works for now.
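As a rough sanity check of why the number has to be so large: depthBias is multiplied by the minimum resolvable depth difference, which for float depth formats depends on the exponent of the depth values (this follows the polygon-offset formula WebGPU inherits from Vulkan/D3D; treat the exact constants below as an assumption):

//back-of-envelope: for depth32float values around 0.5 the float exponent is -1,
//so the minimum resolvable difference r is about 2^(-1 - 23)
const r = Math.pow(2, -1 - 23);    //≈ 6e-8
const effectiveBias = 1000000 * r; //≈ 0.06
//which is the same ballpark as the 0.05 we were subtracting in the shader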

Where to go from here

At this point we have working shadows for directional lights, which are the easiest but also the most useful (e.g. sunlight). We don't have spotlights yet, so we can skip those for now (they're the same but with a perspective transform). We do have point lights, which are more complicated because we basically have to render 6 views into a cubemap, but after that it's the same idea. I'm curious what limits that will butt up against. There are also lots of techniques for trading shadow map size against visual quality (cascaded shadow maps, perspective shadow maps, etc.).
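For the curious, the six views for a point light would look something like this (a sketch, not wired into the engine; addVector is assumed and the up vectors follow the usual cubemap conventions):

//sketch: six 90-degree views for a point light shadow cubemap
const faces = [
    { dir: [1, 0, 0], up: [0, -1, 0] },  //+X
    { dir: [-1, 0, 0], up: [0, -1, 0] }, //-X
    { dir: [0, 1, 0], up: [0, 0, 1] },   //+Y
    { dir: [0, -1, 0], up: [0, 0, -1] }, //-Y
    { dir: [0, 0, 1], up: [0, -1, 0] },  //+Z
    { dir: [0, 0, -1], up: [0, -1, 0] }  //-Z
];
for (const { dir, up } of faces) {
    const viewMatrix = getLookAtMatrix(light.position, addVector(light.position, dir), up);
    //...render the scene's depth into the matching cube face with a 90-degree perspective projection
}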

Code

https://github.com/ndesmic/geo/compare/v0.6...v0.7
