WebGPU Engine from Scratch Part 12: Ambient Lighting

One persistent quality issue with the engine is that the shadows are very hard because things are either lit or not lit. In the real world there's a lot of ambient light coming in from all angles because rays bounce off the walls, clouds and other objects, and we don't take this into account. Realistically this can't be computed exactly because the work explodes exponentially as each ray diffuses into multiple rays with every bounce. The most sophisticated renderers use things like path tracing to aggressively estimate this light. Our version will do something similar, though more suitable for real-time rendering without ray-tracing support.

The API

There's a few ways we could tackle this. We could create a new type of light, "ambient", that just adds this to the scene. I don't want to go this route for a few reasons. The main one is that the shape of the API will be different from other lights. In fact, I already think that maybe the different light types should be broken up into different entities rather than trying to share the same interface, since point lights don't need a direction, directional lights don't need a position, and spotlights (whenever we get around to them) need position, direction and a field of view. In the case of ambient lights we either want a color (which could also encode intensity if components are > 1.0), or we want a cubemap that defines the color of the light coming in from each direction, also known as image based lighting (IBL).

It would also be scoped in a way that lights currently aren't: if it's set on a group, it applies to all nested elements in that group but nothing outside, since we might want different ambient lights for different parts of the scene. In this case ambient light only applies to two entities, meshes and groups. I think it would be more annoying to look at a group's children to decide the ambient light than to make it a property of the group itself. Even if we decide later to change that, the lookup would still come from the group, so the group would still have methods to get the ambient light for its children. So an attribute seems like the best place to start.

The fun part here is dealing with the types and multi-inheritance for Transformable and the new AmbientLit class.

Adventures in jsdoc

So we're going to try to do some fancy stuff to make the types work. As usual we'll be using jsdoc to avoid any build step or extra tooling (we have deno, which recently re-added the bundle command, so we don't actually need another tool, which is nice, but maybe you don't want to use deno for web code and 0 performance overhead is still infinitely better than ts-go). It will cost us a little bit of ugly syntax though; I don't blame you if you're following along and want to go full ts.

Getting interfaces to work in jsdoc is somewhat annoying. At first I thought you could just annotate the interface class with @interface, but this forces you to implement the private fields as well (which doesn't actually work), so I feel that's just wrong and should be fixed. We actually have to declare the interface separately, which is a bit of a shame because it means keeping the two in sync manually. Try as I might, you can't just create the interface with jsdoc like this:

/**
 * @typedef {{ x?: number, y?: number, z?: number }} TransformValues
 *
 * @typedef {Object} ITransformable
 * @property {Float32Array} modelMatrix
 * @property {Float32Array} worldMatrix
 * @property {(params: TransformValues) => this} translate
 * @property {(params: TransformValues) => this} scale
 * @property {(params: TransformValues) => this} rotate
 * @property {() => void} resetTransforms
 */

The reason is that we're also using ESM and jsdoc apparently was never updated to handle it: there's no way to export typedefs, so they get block-scoped to the module. We could put them in the same file as mesh.js but that's messy and doesn't properly segment things by function. What we can do instead is use a d.ts file to hold the types. While this is technically typescript, these d.ts files are never shipped with the code, they just exist for the editor (normally ts compiles into these files). They're easy to write because they contain pure type declarations.

//transformable.d.ts
export type TransformValues = { x?: number, y?: number, z?: number };

export interface ITransformable {
  readonly modelMatrix: Float32Array;
  readonly worldMatrix: Float32Array;
  translate(params: TransformValues): this;
  scale(params: TransformValues): this;
  rotate(params: TransformValues): this;
  resetTransforms(): void;
}

The next problem is that we can't actually import them the normal way, again because of jsdoc ESM compatibility. So instead of:

/** @implements {import("./transformable.d.ts").ITransformable} */

We need

/** @typedef {import("./transformable.d.ts").ITransformable} ITransformable
* @implements {ITransformable}
*/

You need the typedef for it to be picked up. With that in place it will now type-check, assuming you add @ts-check to the top of the mesh.js file, so we have full type-checking on the interface. There are a few other type errors though. For example, in bakeTransforms there's a line like this:

const transformedNormals = chunk(this.normals, this.normalSize)
    .map(values => multiplyMatrixVector(normalMatrix, values, this.normalSize)) //error on values
    .map(values => normalizeVector(values))
    .toArray();

There will be an error on values because typescript thinks it's an any[], which isn't compatible with Float32Array<ArrayBuffer>. This is a nasty little hangup because while TypedArrays are compatible with Arrays for this purpose, typescript doesn't know that. Without getting too wild we can just override the typing to make it work:

const transformedNormals = chunk(this.normals, this.normalSize)
    .map(values => multiplyMatrixVector(normalMatrix, /** @type {Float32Array} */ (/** @type {unknown} */ (values)), this.normalSize))
    .map(values => normalizeVector(values))
    .toArray();

Since we can't directly cast any[] to Float32Array because they are incompatible we first cast to unknown and then to Float32Array. It's a weird typescript pattern that occasionally comes up.

Creating a multi-inheritance pattern

Now that we have types to make sure we're doing things correctly, let's implement the interface. We can keep transformable.js and just create a new Transformable inside of the Mesh class (Transformable should also implement ITransformable).

/**
 * @implements {ITransformable}
 */
export class Mesh {
    //...existing code
    #transformable;

    constructor(mesh) {
        this.#transformable = new Transformable();
        //...existing code
    }

    //...existing code
    get modelMatrix(){
        return this.#transformable.modelMatrix;
    }
    get worldMatrix(){
        return this.#transformable.worldMatrix;
    }

    //Transformable
    translate(params){
        this.#transformable.translate(params);
        return this;
    }
    scale(params){
        this.#transformable.scale(params);
        return this;
    }
    rotate(params){
        this.#transformable.rotate(params);
        return this;
    }
    resetTransforms(){
        this.#transformable.resetTransforms();
    }
}

I believe this is just called composition (delegation to an internal instance), but it's been a while. Do the same with the AmbientLit:

//ambient-lit.d.ts
export interface IAmbientLit {
    ambientLightMap: any;
}
//ambient-lit.js
//@ts-check
/** @typedef {import("./ambient-lit.d.ts").IAmbientLit} IAmbientLit */

/** @implements {IAmbientLit} */
export class AmbientLit {
    #ambientLightMap;

    constructor(options = {}){
        this.#ambientLightMap = options.ambientLightMap;
    }

    set ambientLightMap(val){
        this.#ambientLightMap = val;
    }
    get ambientLightMap(){
        return this.#ambientLightMap;
    }
}
//mesh.js
/**
 * @implements {ITransformable}
 * @implements {IAmbientLit}
 */
export class Mesh {
    #ambientLit;
    //...existing code
    constructor(){
        this.#ambientLit = new AmbientLit();
    }

    set ambientLightMap(val){
        this.#ambientLit.ambientLightMap = val;
    }
    get ambientLightMap(){
        return this.#ambientLit.ambientLightMap;
    }
}

If we need more of these patterns in the future this is how we can do them (there are other ways using non-class objects, but modern JS really wanted to push us toward classes so I don't want to fight it).
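
Just to illustrate how the delegation plays out in use (a throwaway example, not engine code), the forwarding methods return the Mesh itself so chaining still works even though the real state lives in the inner objects:

//hypothetical usage of the composed Mesh
const mesh = new Mesh({ /* ...mesh data... */ });

//each call delegates to the inner Transformable and returns the Mesh,
//so transforms chain exactly as before
mesh.translate({ x: 1 })
    .rotate({ y: Math.PI / 4 })
    .scale({ x: 2, y: 2, z: 2 });

//and the ambient light map property delegates to the inner AmbientLit
mesh.ambientLightMap = "debug-irradiance-map";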

A simple single color case

There are multiple ways we could do this. The simplest is to say if the ambient light is defined as a color then we add that color. The problem here is that when we expand to the IBL version we need to branch depending on the data that we have. I think the simplest way is to force everything through a unified pipeline even if simple cases are more complex. To this end, we need a way to create a cube map of a single color (in reality it will use an array of colors that are the same value). We already have createColorTexture from gpu-utils.js so let's just expand that.

/**
 * Creates a 1x1 texture of a color or a layered texture for an array of colors, colors are in float format
 * @param {GPUDevice} device
 * @param {{ label?: string, color?: [number, number, number, number], colors?: [number, number, number, number][] }} options
 * @returns {GPUTexture}
 */
export function createColorTexture(device, options = {}) {
    const colors = options.colors ?? [options.color];
    const size = {
        height: 1,
        width: 1,
        depthOrArrayLayers: colors.length
    };

    const texture = device.createTexture({
        label: options.label,
        size,
        format: 'rgba8unorm',
        usage: GPUTextureUsage.TEXTURE_BINDING | GPUTextureUsage.COPY_DST,
    });

    colors.forEach((color, layer) => {
        const texel = color
        ? new Uint8Array(color.map(v => v*255))
        : new Uint8Array([255, 255, 255, 255]);

        device.queue.writeTexture(
            { texture, origin: { x: 0, y: 0, z: layer } },
            texel,
            { bytesPerRow: 4, rowsPerImage: 1 },
            { width: 1, height: 1, depthOrArrayLayers: 1 }
        );
    });

    return texture;
}

There was actually an old bug here where the float values were not scaled into int values. Anyway, now if we pass it an array it will make a layered texture. But now we need to update <geo-texture> to support multiple colors. First we need to parse these attributes:

export function parseListOfFloatVector(text, length, defaultValue){
    return text?.trim()
        ? text.split(";").map(v => v.trim().split(",").map(x => parseFloat(x.trim())).slice(0, length))
        : defaultValue
}

Since , is already a separator for vector elements, we use ; as the outer list delimiter so we can define attributes with multiple colors. This is immediately useful for the debug-cubemap because we can eliminate the external images and define it inline.
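
For example (made-up values, just to show the shape), a two-color attribute string parses into an array of float vectors, and a missing attribute falls back to the default:

//"1,0,0,1; 0,0.25,0,1" -> [[1, 0, 0, 1], [0, 0.25, 0, 1]]
const parsed = parseListOfFloatVector("1,0,0,1; 0,0.25,0,1", 4, [[1, 1, 1, 1]]);

//a null/empty attribute -> [[1, 1, 1, 1]] (the provided default)
const fallback = parseListOfFloatVector(null, 4, [[1, 1, 1, 1]]);

Then parsing the <geo-texture> element picks whichever attribute was supplied: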

async function parseTexture(textureEl){
    const name = textureEl.getAttribute("name");
    const src = textureEl.getAttribute("src");
    const srcs = parseListOrDefault(textureEl.getAttribute("srcs"));
    const color = textureEl.getAttribute("color");
    const colors = textureEl.getAttribute("colors");
    let value;
    if (src) {
        value = { entity: "texture", image: await loadImage(src), name  };
    } else if(srcs){
        value = { entity: "texture", images: await Promise.all(srcs.map(s => loadImage(s))), name } 
    } else if (color) {
        value = { entity: "texture", color: parseFloatVector(color, 4), name };
    } else if (colors){
        value = { entity: "texture", colors: parseListOfFloatVector(colors, 4), name }
    }

    return value;
}

So in this case we can use the plural colors to define a list of colors for <geo-texture>.

- <geo-texture name="debug-background" srcs="../../img/debug-cube/red-right.png, ../../img/debug-cube/red-left.png, ../../img/debug-cube/green-top.png, ../../img/debug-cube/green-bottom.png, ../../img/debug-cube/blue-back.png, ../../img/debug-cube/blue-front.png"></geo-texture>
+ <geo-texture name="debug-background" colors="1,0,0,1; 0.25,0,0,1; 0,1,0,1; 0,0.25,0,1; 0,0,1,1; 0,0,0.25,1"></geo-texture>

Nice little upgrade.

Struct packing 201

I want to pass the ambient light data to the pbr shader but the light bindgroup is already pretty packed. I see some room to optimize the light data into a single struct so let's try that first. The first thing is to change the signatures of packStruct and packArray to take an options object, because passing empty positional parameters is ugly and error prone.


/**
 * @typedef {[string,GpuType | Prop[]]} Prop
 * @typedef {Prop[]} Schema
 * 
 * @param {object} data 
 * @param {Schema} schema
 * @param {{ minSize?: number, buffer?: ArrayBuffer, offset?: number }} options
 */
export function pack(data, schema, options = {}){
    const offset = options.offset ?? 0;

    if(Array.isArray(data)){
        const { totalSize: structSize } = getAlignments(getValuesFromEntriesRecursive(schema), { minSize: options.minSize });
        const outBuffer = options.buffer ?? new ArrayBuffer(structSize * data.length);

        for(let i = 0; i < data.length; i++){
            pack(data[i], schema, {
                minSize: options.minSize, 
                buffer: outBuffer, 
                offset: offset + i * structSize
            });
        }
        return outBuffer;
    } else {
        const lastSchema = schema.at(-1);
        const lastProp = data[/**@type {Prop} */(lastSchema)[0]]; 
        const count = (Array.isArray(lastProp) && Array.isArray(/** @type {Prop} */(lastSchema)[1])) ? lastProp.length : 1; //if last data and schema are arrays then it's an array
        const { offsets, totalSize } = getAlignments(getValuesFromEntriesRecursive(schema), { minSize: options.minSize, arrayCount: count });
        const outBuffer = options.buffer ?? new ArrayBuffer(totalSize);
        const dataView = new DataView(outBuffer);

        for(let i = 0; i < schema.length; i++){
            let type;
            let name;
            let value;

            if(Array.isArray(schema[i])){
                name = schema[i][0];
                type = schema[i][1];
                value = data[name];
            } else {
                type = schema[i];
                value = data;
            }

            if(value === undefined){
                throw new Error(`Value lookup for prop '${name}' failed!  Double check the prop name is correct.`);
            }
            //TODO: add other GPU Types
            const totalOffset = offset + offsets[i];
            switch(type){
                case "i32": {
                    dataView.setInt32(totalOffset, value, true);
                    break;
                }
                case "u32": {
                    dataView.setUint32(totalOffset, value, true);
                    break;
                }
                case "f32": {
                    dataView.setFloat32(totalOffset, value, true);
                    break;
                }
                case "vec2u32": {
                    dataView.setUint32(totalOffset, value[0], true);
                    dataView.setUint32(totalOffset + 4, value[1], true);
                    break;
                }
                case "vec2f32": {
                    dataView.setFloat32(totalOffset, value[0], true);
                    dataView.setFloat32(totalOffset + 4, value[1], true);
                    break;
                }
                case "vec3f32": {
                    dataView.setFloat32(totalOffset, value[0], true);
                    dataView.setFloat32(totalOffset + 4, value[1], true);
                    dataView.setFloat32(totalOffset + 8, value[2], true);
                    break;
                }
                case "vec4f32": {
                    dataView.setFloat32(totalOffset, value[0], true);
                    dataView.setFloat32(totalOffset + 4, value[1], true);
                    dataView.setFloat32(totalOffset + 8, value[2], true);
                    dataView.setFloat32(totalOffset + 12, value[3], true);
                    break;
                }
                case "mat2x2f32": {
                    dataView.setFloat32(totalOffset, value[0], true);
                    dataView.setFloat32(totalOffset + 4, value[1], true);

                    dataView.setFloat32(totalOffset + 8, value[2], true);
                    dataView.setFloat32(totalOffset + 12, value[3], true);
                    break;
                }
                case "mat3x3f32": {
                    dataView.setFloat32(totalOffset, value[0], true);
                    dataView.setFloat32(totalOffset + 4, value[1], true);
                    dataView.setFloat32(totalOffset + 8, value[2], true);

                    dataView.setFloat32(totalOffset + 16, value[3], true);
                    dataView.setFloat32(totalOffset + 20, value[4], true);
                    dataView.setFloat32(totalOffset + 24, value[5], true);

                    dataView.setFloat32(totalOffset + 32, value[6], true);
                    dataView.setFloat32(totalOffset + 36, value[7], true);
                    dataView.setFloat32(totalOffset + 40, value[8], true);
                    break;
                }
                case "mat4x4f32": {
                    dataView.setFloat32(totalOffset, value[0], true);
                    dataView.setFloat32(totalOffset + 4, value[1], true);
                    dataView.setFloat32(totalOffset + 8, value[2], true);
                    dataView.setFloat32(totalOffset + 12, value[3], true);

                    dataView.setFloat32(totalOffset + 16, value[4], true);
                    dataView.setFloat32(totalOffset + 20, value[5], true);
                    dataView.setFloat32(totalOffset + 24, value[6], true);
                    dataView.setFloat32(totalOffset + 28, value[7], true);

                    dataView.setFloat32(totalOffset + 32, value[8], true);
                    dataView.setFloat32(totalOffset + 36, value[9], true);
                    dataView.setFloat32(totalOffset + 40, value[10], true);
                    dataView.setFloat32(totalOffset + 44, value[11], true);

                    dataView.setFloat32(totalOffset + 48, value[12], true);
                    dataView.setFloat32(totalOffset + 52, value[13], true);
                    dataView.setFloat32(totalOffset + 56, value[14], true);
                    dataView.setFloat32(totalOffset + 60, value[15], true);
                    break;
                }
                default: {
                    if(Array.isArray(type)){
                        if(Array.isArray(value) && i !== (schema.length - 1)){
                            throw new Error("Array must be the last element in a struct!")
                        }
                        pack(value, /** @type {Prop[]}*/(type), { buffer: outBuffer, offset: totalOffset });
                    } else {
                        throw new Error(`Cannot pack type ${type} at prop index ${i} with value ${value}`);
                    }
                }
            }
        }
        return outBuffer;
    }
}

/**
 * @param {GpuType[]} typesToPack
 * @param {{ minSize?: number, arrayCount?: number }} options
 */
export function getAlignments(typesToPack, options = {}){
    let offset = 0;
    let maxAlign = 0;
    const offsets = new Array(typesToPack.length);

    for(let i = 0; i < typesToPack.length; i++){
        let align;
        let size;
        if(Array.isArray(typesToPack[i])){
            const alignSize = getAlignments(/** @type {GpuType[]} */(typesToPack[i]));
            align = alignSize.maxAlign;
            size = alignSize.totalSize * (options.arrayCount ?? 1);
        }else {
            const alignSize = gpuTypeAlignSize[typesToPack[i]];
            align = alignSize.align;
            size = alignSize.size;
        }   

        if(maxAlign < align){
            maxAlign = align;
        }

        offset = getPaddedSize(offset, align);
        offsets[i] = offset;
        offset += size;
    }
    return {
        offsets,
        maxAlign,
        totalSize: getPaddedSize(offset, maxAlign, options.minSize)
    };
}

I've combined all the logic into two functions. The main one is pack, which packs either arrays or structs (the align table has also been converted from tuples to objects). I've also upgraded it to handle array-in-struct cases since it didn't before. Array alignment is based on whatever the alignment of the struct's members is. Furthermore, a struct may only have one array and it must be at the very end. This is a restriction WGSL has, and after writing the updated packer I kind of see why: it's hard to know where the end of the data structure is if arrays of arbitrary length sit in the middle. With this new packer we can also do nested structs; it basically checks if the type is an array and then, if the value is a scalar, packs a nested struct, otherwise packs a nested array. The arrays themselves can hold scalar values by making the type a string instead of a tuple.
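
Here's a small made-up example (the prop names are just for illustration) showing the schema shape with a trailing array of structs, which is the same pattern the light data uses below:

//a struct whose last member is an array of structs;
//the array must be the final prop, mirroring WGSL's runtime-sized array rule
const buffer = pack(
    {
        pointCount: 2,
        points: [
            { position: [0, 0, 0], intensity: 1.0 },
            { position: [1, 2, 3], intensity: 0.5 }
        ]
    },
    [
        ["pointCount", "u32"],
        ["points", [
            ["position", "vec3f32"],
            ["intensity", "f32"]
        ]]
    ]
);
//buffer is an ArrayBuffer laid out with WGSL-compatible alignment, ready for writeBuffer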

I used typescript to help me update the call sites, but this required adding @ts-check to gpu-engine.js, which lit the file up in red. I did find a few places where I'd made mistakes, but there were other problems too, like the lack of webgpu types:

deno add @webgpu/types
//deno.json
{
  "imports": {
    "@webgpu/types": "npm:@webgpu/types@^0.1.65",
  },
  "lint": {
    "rules": {
      "exclude": [
        "no-explicit-any"
      ]
    }
  },
  "compilerOptions": {
    "noImplicitAny": false,
    "types": [
      "@webgpu/types"
    ],
    "lib": [
      "esnext",
      "dom"
    ]
  }
}

You'll see some new coercion strategies in the code as a result because I had to appease the checker. Let's update the bind group for lights:

setMainLightBindGroup(passEncoder, bindGroupLayouts, lights, shadowMaps) {
    let shadowMapIndex = 0;
    const shadowMappedLights = lights
        .entries()
        .map(([key, value]) => {
            const shadowMap = shadowMaps.get(key);
            const shadowMapAspectRatio = shadowMap.width / shadowMap.height;
            const combinedModelMatrix = multiplyMatrix(value.worldMatrix, [4,4], value.modelMatrix, [4,4]);
            return {
                typeInt: value.typeInt,
                position: multiplyMatrixVector(combinedModelMatrix, value.position, 4),
                direction: multiplyMatrixVector(combinedModelMatrix, value.direction, 4),
                color: value.color,
                shadowMap,
                projectionMatrix: shadowMap ? getLightProjectionMatrix(shadowMapAspectRatio) : getEmptyMatrix([4, 4]), //probably needs transpose
                viewMatrix: shadowMap ? getLightViewMatrix(value.direction) : getEmptyMatrix([4, 4]), //probably needs transpose
                castsShadow: value.castsShadow ? 1 : 0,
                shadowMapIndex: (value.castsShadow && shadowMap) ? shadowMapIndex++ : -1
            };
        }).toArray();
    const shadowMapsToBind = shadowMappedLights
        .filter(lightData => lightData.shadowMapIndex > -1)
        .map(lightData => lightData.shadowMap);
    const lightData = pack(
        {
            lights: shadowMappedLights,
            lightCount: shadowMappedLights.length,
        },
        [
            ["lightCount", "u32"],
            ["lights", [
                ["typeInt", "u32"],
                ["position", "vec3f32"],
                ["direction", "vec3f32"],
                ["color", "vec4f32"],
                ["projectionMatrix", "mat4x4f32"],
                ["viewMatrix", "mat4x4f32"],
                ["castsShadow", "u32"],
                ["shadowMapIndex", "i32"]
            ]]
        ],
        { minSize: 64 }
    );
    const lightBuffer = this.#device.createBuffer({
        size: lightData.byteLength,
        usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST | GPUBufferUsage.STORAGE,
        label: "main-light-buffer"
    });
    this.#device.queue.writeBuffer(lightBuffer, 0, lightData);
    const placeholderView = shadowMaps.get("placeholder").createView({ label: "placeholder-view" });
    const lightBindGroup = this.#device.createBindGroup({
        label: "main-light-bind-group",
        layout: bindGroupLayouts.get("lights"),
        entries: [
            {
                binding: 0,
                resource: {
                    buffer: lightBuffer,
                    offset: 0,
                    size: lightData.byteLength
                }
            },
            {
                binding: 1,
                resource: this.#samplers.get(DEFAULT_SHADOW_SAMPLER)
            },
            ...(getRange({ end: 3 }).map((index) => {
                const shadowMap = shadowMapsToBind[index];
                return {
                    binding: index + 2, //manually offset bind index
                    resource: shadowMap ? shadowMap.createView({ label: `shadow-view-${index}` }) : placeholderView
                };
            })),
        ]
    });
    passEncoder.setBindGroup(2, lightBindGroup);
}

We can now combine two of the bindings into one, which is nice. You'll also need to update the pipeline (not shown) as well as the shader (also not shown), but those changes should be fairly obvious.

Passing the ambient light map

First we need to parse it off the element.

//geo-markup-parser.js
function updateMeshAttributes(meshEl, mesh) {
    //..

    const ambientLightMap = meshEl.getAttribute("ambient-light-map");
    //...
}

Simple. We'll also need to set up a placeholder in case we don't define it.

//gpu-engine.js
async initializeScene(scene) {
    //...
    this.#textures.set(PLACEHOLDER_CUBEMAP, createColorTexture(this.#device, { label: "placeholder-cubemap", colors: [
        [0,0,0,1],[0,0,0,1],[0,0,0,1],[0,0,0,1],[0,0,0,1],[0,0,0,1]
    ] }));
    //..
}

It's important that it's black because we'll sample it like any other ambient map, it just won't contribute anything.

setMainLightBindGroup(passEncoder, bindGroupLayouts, lights, shadowMaps, ambientLightMap) {
    //...
    const lightBindGroup = this.#device.createBindGroup({
        label: "main-light-bind-group",
        layout: bindGroupLayouts.get("lights"),
        entries: [
            //...
            {
                binding: 6,
                resource: this.#textures.get(ambientLightMap ?? PLACEHOLDER_CUBEMAP).createView({ dimension: "cube", label: "ambient-light-cube-view"}) 
            }
        ]
    });
    //...
}

As the final binding we'll pass in the cubemap defined as the ambient light map or the placeholder if none.
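
For that to work, the "lights" bind group layout (not shown above) also needs a matching entry for the cube texture. A sketch of what that entry looks like:

//added to the "lights" bind group layout entries (sketch)
{
    binding: 6,
    visibility: GPUShaderStage.FRAGMENT,
    texture: {
        sampleType: "float",
        viewDimension: "cube"
    }
}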

Lastly let's update the shader:

//pbr.wgsl
@group(2) @binding(0) var<storage, read> light_data: LightData;
@group(2) @binding(1) var shadow_sampler: sampler_comparison;
@group(2) @binding(2) var shadow_map_0: texture_depth_2d;
@group(2) @binding(3) var shadow_map_1: texture_depth_2d;
@group(2) @binding(4) var shadow_map_2: texture_depth_2d;
@group(2) @binding(5) var shadow_map_3: texture_depth_2d;
+@group(2) @binding(6) var ambient_light_map: texture_cube<f32>;

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32>
{   
    var surface_albedo = textureSample(albedo_map, albedo_sampler, frag_data.uv).rgb;
    var roughness_from_map = textureSample(roughness_map, roughness_sampler, frag_data.uv).x;
    var roughness = max(mix(material.roughness, roughness_from_map, f32(material.use_specular_map)), 0.0001);
    var f0 = mix(vec3(0.04, 0.04, 0.04), material.base_reflectance, material.metalness);
    var total_color = vec3(0.0);
    var normal = round_small_mag_3(normalize(frag_data.normal));

    let i = 0u;
    for(var i: u32 = 0; i < light_data.light_count; i++){
        let light = light_data.lights[i];
        let light_distance = length(light.position - frag_data.world_position.xyz);
        var to_light = vec3(0.0);

        switch light.light_type {
            case 0: { //point
                to_light = normalize(light.position - frag_data.world_position.xyz);
            }
            case 1: { //directional
                to_light = normalize(-light.direction);
            }
            default: {}
        }

        let attenuation = 1.0 / pow(light_distance, 2.0);
        let radiance = light.color.rgb * attenuation;
        let lit_color = get_bdrf(
            surface_albedo, 
            f0, 
            roughness, 
            material.metalness,
            normal, 
            radiance, 
            to_light,
            scene.camera_position, 
            frag_data.world_position.xyz
        );
        let diffuse_factor = max(dot(normal, to_light), 0.0);
        let shadow = get_shadow(frag_data.world_position, diffuse_factor);
        let shadowed_color = lit_color * shadow;

+       let ambient_light = textureSample(ambient_light_map, albedo_sampler, normal);

        total_color += shadowed_color;
+       total_color += ambient_light.rgb;
    }

    let tone_mapped_color = total_color / (total_color + vec3(1.0));
    return vec4(pow(tone_mapped_color, vec3(1.0/2.2)), 1.0);
}

I'm recycling the albedo sampler since it's a simple bilinear filter, but maybe we could pass one in too, I just don't see the point.

The effect is interesting on a white teapot:

A teapot with lighting from a simple (and wrong) IBL setup

This is not yet correct but you can kinda see what it's going for using the very simple debug environment. It looks like we're getting indirect lighting from the walls. However, there are two big issues: one is that we're not representing the other objects (the rug), and we're getting leakage from the floor. The other is that we're only measuring light from a single point along the normal, when realistically we need all of the light from the hemisphere around each point. The nice part is that we can precompute this with a slightly different sort of cubemap built specifically for this purpose called an irradiance map.

Light probes

To build this map we'll need the concept of a light probe, which is kind of like a virtual light-measuring camera that sits in the environment. For the sake of authoring it will be part of the markup, although I'm not excited about that because it's a concept tied to this specific implementation, I just don't have a better way to model it yet.

The basic idea will look like this:

<geo-probe type="irradiance" position="0, 0, 0" output-name="debug-irradiance-map" samples="10"></geo-probe>

We have an element called <geo-probe> that has a type (maybe we'll make other types for things like specular light later), a position in the environment, a texture name to output to, and the number of samples we take per texel (this number impacts quality, but we can't take every angle in the hemisphere because that would require infinite compute, so we take a finite number of samples and average them).

//geo-markup-parser.js
function parseProbe(probeEl){
    const name = probeEl.getAttribute("name");
    const type = probeEl.getAttribute("type");
    const outputName = probeEl.getAttribute("output-name");
    const samples = parseInt(probeEl.getAttribute("samples"), 10);
    const position = parseFloatVector(probeEl.getAttribute("position"), 3);
    const resolution = parseInt(probeEl.getAttribute("resolution"), 10);

    const probe = new Probe({
        name,
        type,
        outputName,
        position,
        samples,
        resolution
    });

    return probe;
}

Hopefully the plumbing is pretty natural now so I'm going to skip over that but all we need to do for now is to add it to the probe set. We also need the texture to draw to.

//gpu-engine.js
initializeProbe(probe, defaultName){
    const key = probe.name ?? defaultName;
    this.#probes.set(key, probe);
    this.#textures.set(`${key}-cubemap`, this.#device.createTexture({
        size: [probe.resolution, probe.resolution, 6],
        format: "rgba32float",
        usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
    }));
}

Refactoring the pipeline

I'm not going to go too deep into the changes because they are large, but the basic code has not changed. I was finding it hard to work with the existing pipelines because they had a lot of code cluttering gpu-engine.js and having to add a 4th was just too much. So that code was moved into classes under the pipelines folder. Each class implements an interface with createPipeline, which sets up the pipeline descriptor, and render, which does the rendering for that pipeline, since these are always paired. This also means the bind group code moves into each pipeline class. There's a bunch of changes for types and to fit the shared interface; using types really helps make sure everything is still working.
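
Roughly, each pipeline class ends up with this shape (an illustrative sketch; the real interface lives in pipeline.d.ts):

//sketch of the shared pipeline shape
export class ExamplePipeline {
    #bindGroupLayouts = new Map();
    #pipeline;

    async createPipeline(device, options = {}) {
        //build bind group layouts, the pipeline layout and the GPURenderPipeline here
    }

    render(device, root, attachmentViews, info) {
        //set bind groups and encode this pipeline's pass
    }
}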

Since everything is bundled up, this makes it much easier to add new pipelines and declutters a lot. It also means things like tracking the bindgroups don't require extra data structures. Worth it, but it was a lot of work (maybe LLMs could have done okay at this task?). The number of file changes in the PR increases a lot because of this, but it's mostly code moving, not changing. One actual change was to stop using multiple command encoders (a mistake I didn't catch early on) and put all the passes into a single one, which I'm sure helps performance.

Rendering the scene as a cubemap

For the next part I'm going to add a new phase to the engine. Before we start rendering we'll have a phase where the preprocessing stuff happens. This way we can separate out those things and maybe show a loading indicator while we wait for them to process.

In this preprocess step we can setup another pipeline to create the irradiance map. The first part of this step will be very similar to the main pipeline in terms of passes and bindings but it will render a cubemap. This is sort of a caching step as the next step will be to collect light sample data for each texel of the irradiance map using that cubemap rendering.

One of the biggest differences for rendering is that we'll need cameras for each side of the cube map. This is done by taking the probes and making a view matrix in each of the 6 directions. The projection matrix is always a 1:1 aspect ratio, so we just need one resolution measurement, and it always covers 90 degrees. The near and far planes can be adjusted as necessary, perhaps parameterized (I didn't go that far).
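
The six directions are just the axis-aligned unit vectors in cubemap face order. Something like this is what the CUBEMAP_DIRECTIONS constant used below assumes (my sketch, the actual constant lives elsewhere in the engine):

//cubemap face directions in +X, -X, +Y, -Y, +Z, -Z order
export const CUBEMAP_DIRECTIONS = [
    [1, 0, 0],
    [-1, 0, 0],
    [0, 1, 0],
    [0, -1, 0],
    [0, 0, 1],
    [0, 0, -1]
];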

//probe.js
import { getProjectionMatrix } from "../utilities/vector.js";

export class Probe {
    static FIELD_OF_VIEW = 90;
    static NEAR = 0.01;
    static FAR = 5;

    #name;
    #type;
    #position;
    #outputName;
    #samples;
    #resolution;

    constructor(probe){
        this.name = probe.name;
        this.type = probe.type;
        this.position = probe.position;
        this.outputName = probe.outputName;
        this.samples = probe.samples;
        this.resolution = probe.resolution ?? 32;
    }

    static getProjectionMatrix(resolution) {
        return getProjectionMatrix(
            resolution,
            resolution,
            Probe.FIELD_OF_VIEW,
            Probe.NEAR,
            Probe.FAR
        );
    }

    set name(val){
        this.#name = val;
    }
    get name(){
        return this.#name;
    }

    set type(val){
        this.#type = val;
    }
    get type(){
        return this.#type;
    }

    set position(val){
        if(val.length === 3){
            this.#position = new Float32Array([...val, 1]);
        } else {
            this.#position = new Float32Array(val);
        }
    }
    get position(){
        return this.#position;
    }

    set outputName(val){
        this.#outputName = val;
    }
    get outputName(){
        return this.#outputName;
    }

    set samples(val){
        this.#samples = val;
    }
    get samples(){
        return this.#samples;
    }

    set resolution(val){
        this.#resolution = val;
    }
    get resolution(){
        return this.#resolution;
    }
}

The render method of the new cubemap rendering pipeline class then uses these probes to render each face of the cube:

    /**
     * @param {GPUDevice} device
     * @param {Mesh | Group} root
     * @param {AttachmentViews} attachmentViews
     * @param {{
     *  meshContainers: Map<Mesh, MeshContainer>,
     *  lights: Map<string | symbol, Light>,
     *  shadowMaps: Map<string | symbol,  GPUTexture>,
     *  textures: Map<string | symbol, GPUTexture>,
     *  samplers: Map<string | symbol, GPUSampler>,
     *  materials: Map<string | symbol, Material>,
     *  probes: Map<string | symbol, Probe>,
     *  cameras: Map<string | symbol, Camera>,
     *  primaryCamera: Camera,
     *  background: IBackground,
     *  commandEncoder: GPUCommandEncoder
     * }} info
     */
    render(device, root, attachmentViews, info) {
        const commandEncoder = info.commandEncoder ?? device.createCommandEncoder({
            label: "cubemap-render-command-encoder",
        });

        const width = 720;
        const height = 720;
        const sceneCubeMap = device.createTexture({
            size: [width, height, 6],
            format: "rgba8unorm",
            usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING,
            dimension: "2d"
        });
        const sceneDepthMap = device.createTexture({
            label: "depth-texture",
            size: {
                width,
                height,
                depthOrArrayLayers: 1
            },
            format: "depth32float",
            usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
        });

        info.textures.set("ir-test", sceneCubeMap);

        for (const probe of info.probes.values()) {
            for (let i = 0; i < 6; i++) {
                const viewMatrix = getWorldToCameraMatrixFromDirection(probe.position, CUBEMAP_DIRECTIONS[i]);
                const colorView = sceneCubeMap.createView({ dimension: "2d", baseArrayLayer:  i, arrayLayerCount: 1, label: `cubemap-render-scene-cube-map-${i}` });
                const depthView = sceneDepthMap.createView({ label: "cubemap-render-scene-depth-map" });
                const attachmentViews = { colorView, depthView };

                const innerInfo = { ...info, commandEncoder, 
                    primaryCamera: new Camera({ 
                        name: `cubemap-render-map-direction-${i}`,
                        position: probe.position,
                        direction: CUBEMAP_DIRECTIONS[i],
                        fieldOfView: Probe.FIELD_OF_VIEW, 
                        near: Probe.NEAR, 
                        far: Probe.FAR, 
                        screenWidth: probe.resolution, 
                        screenHeight: probe.resolution,
                        isPerspective: true
                    })
                };

                this.#shadowPipeline?.render(device, root, attachmentViews, innerInfo);
                this.renderScene(device, commandEncoder, root, attachmentViews, innerInfo, viewMatrix, probe, i);
                this.#backgroundPipeline?.render(device, info.background.mesh, attachmentViews, innerInfo);
            }
        }
        if(!info.commandEncoder){
            device.queue.submit([commandEncoder.finish()]);
        }
    }

For each probe, we iterate over all 6 sides, render an image for each, and finally arrange those 6 images into the cubemap. The first thing was just to copy the pbr.wgsl because most of it is still relevant. I removed the ambient light map (since it doesn't exist in this phase) and used the six different view matrices as the cameras, rendering to the six layers of the cubemap texture. The nice part about abstracting the pipelines is that we can now nest them inside one another, so we can just reuse the shadow and background pipelines as-is. To test that the shader is working, I moved the probe to 0,0,-2 and displayed the +Z side showing the teapot. The output looks like this:

A teapot in front of a blue background representing positive Z

The whole cubemap looks like this:

An image with 6 images representing the sides of a cubemap using colors to debug the direction

Once we can render all 6 sides into a cubemap, we then need to use that cubemap to do the work of sampling the light coming into our point.

Creating the irradiance map

Finally we can start sampling. We take a parameter that defines the sample count and carve it up across the two dimensions we sample over; since we're using spherical coordinates those are theta and phi. To get the number of steps per dimension we take the square root of the total sample count.

Each sample needs to be scaled by the cosine of the angle between the sample direction and the normal, which is basically dot(N, to_light), the weighted amount of light given the normal. This could theoretically be done during rendering like we do with diffuse shading, but it's easier and slightly cheaper to precompute it and bake it into the irradiance map. We also need to scale by the spherical area term: we're integrating over a spherical grid, and while the angular step in one dimension is constant, the actual patch of sphere it covers shrinks as we approach the pole, so samples there must count for less. (Note that in the shader below phi is measured up from the tangent plane rather than down from the normal, so dot(N, to_light) works out to sin(phi) and the area term to cos(phi); with the more common convention the factors swap, but the product is the same.)

Finally we weight each sample by d_phi * d_theta, the angular area of the little patch it represents (these patches add up to cover the whole hemisphere).
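
As a quick sanity check of those weights (throwaway code, not part of the engine), numerically summing cos(phi) * sin(phi) * d_phi * d_theta over the hemisphere converges to pi, the same normalization constant that shows up again in the cosine-weighted version later:

//midpoint-rule integration of the per-sample weight over the hemisphere
const steps = 64;
const dTheta = (2 * Math.PI) / steps;
const dPhi = (Math.PI / 2) / steps;
let total = 0;
for (let i = 0; i < steps; i++) {       //theta steps
    for (let j = 0; j < steps; j++) {   //phi steps
        const phi = (j + 0.5) * dPhi;
        total += Math.cos(phi) * Math.sin(phi) * dPhi * dTheta;
    }
}
console.log(total); //~3.14159, i.e. pi

With that in mind, here's the irradiance shader: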

const PI = 3.14159265359;
const TWO_PI = 2 * PI;
const HALF_PI = PI / 2;

struct VertexOut {
    @builtin(position) frag_position : vec4<f32>,
    @location(0) clip_position: vec4<f32>,
    @location(1) uv: vec2<f32>
};
struct Int {
    value: u32
};

@group(0) @binding(0) var environment_sampler: sampler;
@group(0) @binding(1) var environment_map: texture_cube<f32>;
@group(0) @binding(2) var<uniform> face: Int;

@vertex
fn vertex_main(@location(0) position: vec2<f32>) -> VertexOut
{
    var output : VertexOut;
    output.frag_position = vec4(position, 0.0, 1.0);
    output.clip_position = vec4(position, 0.0, 1.0);
    output.uv = vec2(position.x * 0.5 + 0.5, 1.0 - (position.y * 0.5 + 0.5));
    return output;
}

fn get_direction_from_uv_and_index(i: u32, uv: vec2<f32>) -> vec3<f32>
{
    let xy = uv * 2.0 - 1.0;

    switch i {
        case 0: {
            return vec3(1.0, -xy.y, -xy.x);
        }
        case 1: {
            return vec3(-1.0, -xy.y, xy.x);
        }
        case 2: {
            return vec3(xy.x, 1.0, xy.y);
        }
        case 3: {
            return vec3(xy.x, -1.0, -xy.y);
        }
        case 4: {
            return vec3(xy.x, -xy.y, 1.0);
        }
        case 5: {
            return vec3(-xy.x, -xy.y, -1.0);
        }
        default {
            //shouldn't pass this
            return vec3(0.0, 0.0, 0.0);
        }
    }
}

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32> {
    let normal = normalize(get_direction_from_uv_and_index(face.value, frag_data.uv));
    var irradiance = vec3(0.0);

    const sample_count = 128u;

    let samples_per_dimension = u32(ceil(sqrt(f32(sample_count))));
    let d_theta = TWO_PI / f32(samples_per_dimension);
    let d_phi = HALF_PI / f32(samples_per_dimension);

    for(var i: u32 = 0; i < samples_per_dimension; i = i + 1u){
        var theta = (f32(i) + 0.5) * d_theta;
        for(var j: u32 = 0; j < samples_per_dimension; j = j + 1u){
            var phi = (f32(j) + 0.5) * d_phi;
            let sin_phi = sin(phi);
            let cos_phi = cos(phi);
            let x = cos_phi * cos(theta);
            let y = sin_phi;
            let z = cos_phi * sin(theta);
            let local_dir = vec3(x,y,z);

            let up = normal;
            let tangent = select(vec3<f32>(0.0, 1.0, 0.0), vec3<f32>(1.0, 0.0, 0.0), abs(normal.y) > 0.99);
            let right = normalize(cross(tangent, up));
            let forward = cross(up, right);
            let direction = normalize(local_dir.x * right + local_dir.y * normal + local_dir.z * forward);
            let weight = cos_phi * sin_phi;
            irradiance += textureSample(environment_map, environment_sampler, direction).rgb * weight;
        }
    }

    irradiance *= d_phi * d_theta;
    return vec4(irradiance, 1.0);
}

The normals are perturbed by the local direction vector and we need to convert them back into world space. There's a line in there (the select) which accounts for times when the normal is too close to the up direction to form a valid cross product (we had this problem earlier when doing space transforms, this time it's in the shader itself). The resulting irradiance map looks like this:

A colored irradiance cubemap with lots of starburst artifacts

Looking at this we can see a few things going on (this is also way higher resolution than we actually need...). What we are getting seems reasonable: if we look at the mid points they are sampling from the areas we expect (we see green at the top of each vertical face, for example). However, the sampling is pretty bad. We can see clear starburst patterns on the top and bottom where the very regular spacing leaves gaps, and rings toward the centers where the sample density is much greater. This is usable but maybe we can do better.

Cosine Weighted Distribution

One way we can fix the issue is by using a cosine-weighted distribution. The idea is that we should sample more densely the closer we are to the normal, because light near the normal direction contributes much more than light at grazing angles (the dot product). That means we'll get closer to the true color if we focus our sampling there. The other part is that we can be a little more random: the starbursts are likely because our sampling is too rigid and periodic, and if it were a little more chaotic we'd likely get fewer gaps. This leads to a new problem: WGSL doesn't give us anything to produce random numbers, so how do we deal with that? Luckily there's a paper that addresses just this topic: https://indico.cern.ch/event/93877/contributions/2118070/attachments/1104200/1575343/acat3_revised_final.pdf

Using some fancy bit math we can get pseudo-random numbers that are good enough for the job entirely on the GPU itself. I'm not going to go into the details, just translate it into something we can use. Here's a shader to generate "random" noise:

const U32_MAX: u32 = 4294967295u;
const SCREEN_WIDTH = 320;

fn seed_per_thread(id: u32) -> u32 {
    return id * 1099087573u;
}
fn taus_step(z: u32, s1: u32, s2: u32, s3: u32, m: u32) -> u32 {
    let b = ((z << s1) ^ z) >> s2;
    return ((z & m) << s3) ^ b;
}
fn lcg_step(z: u32, a: u32, c: u32) -> u32 {
    return a * z + c;
}
fn hybrid_taus(z1: u32, z2: u32, z3: u32, z4: u32) -> f32 {
    let r = taus_step(z1, 13u, 19u, 12u, 4294967294u) ^
            taus_step(z2, 2u, 25u, 4u, 4294967288u) ^
            taus_step(z3, 3u, 11u, 17u, 4294967280u) ^
            lcg_step(z4, 1664525u, 1013904223u);
    return f32(r) * 2.3283064365387e-10;
}

fn rand_uint(last_r: u32) -> u32 {
    let z1 = taus_step(last_r, 13u, 19u, 12u, 429496729u);
    let z2 = taus_step(last_r, 2u, 25u, 4u, 4294967288u);
    let z3 = taus_step(last_r, 3u, 11u, 17u, 429496280u);
    let z4 = lcg_step(1664525, last_r, 1013904223u);
    return (z1 ^ z2 ^ z3 ^ z4);
}

fn uint_to_normalized_float(value: u32) -> f32 {
    return f32(value) / (f32(U32_MAX) + 1.0); //+1 to account for 0
}


@group(0) @binding(0) var my_sampler: sampler;
@group(0) @binding(1) var my_texture: texture_2d<f32>;

struct VertexOut {
    @builtin(position) position : vec4<f32>,
    @location(0) uv : vec2<f32>
};

@vertex
fn vertex_main(@location(0) position: vec2<f32>, @location(1) uv: vec2<f32>) -> VertexOut
{
    var output : VertexOut;
    output.position = vec4<f32>(position, 0.0, 1.0);
    output.uv = uv;
    return output;
}

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32>
{
    let id = u32(frag_data.position.x) + (u32(frag_data.position.y) * SCREEN_WIDTH);
    let seed = seed_per_thread(id);

    let int_r = rand_uint(seed);
    let int_g = rand_uint(int_r);
    let int_b = rand_uint(int_g);

    let r = uint_to_normalized_float(int_r);
    let g = uint_to_normalized_float(int_g);
    let b = uint_to_normalized_float(int_b);

    return vec4(r, g, b, 1.0);
}

This follows the paper using shifts, XORs and magic primes. Since we don't have a thread id, the per-thread seed is generated by taking the pixel position and flattening it into a single index. To get each successive number we pass in the last value we got. I looked into stateful ways to do this but they seemed more trouble than they were worth; the downside is you have to keep track of the intermediate uint yourself, which limits how much functional-style code you can write. The noise we get looks pretty decent to me.

A screen of pixels generated from random noise

We can then take this random number generation and apply it to our sampling routine.

const PI = 3.14159265359;
const TWO_PI = 2 * PI;
const HALF_PI = PI / 2;
const U32_MAX: u32 = 4294967295u;

struct VertexOut {
    @builtin(position) frag_position : vec4<f32>,
    @location(0) clip_position: vec4<f32>,
    @location(1) uv: vec2<f32>
};
struct IrradianceParams {
    face_index: u32,
    screen_width: u32,
    sample_count: u32
};

fn seed_per_thread(id: u32) -> u32 {
    return id * 1099087573u;
}
fn taus_step(z: u32, s1: u32, s2: u32, s3: u32, m: u32) -> u32 {
    let b = ((z << s1) ^ z) >> s2;
    return ((z & m) << s3) ^ b;
}
fn lcg_step(z: u32, a: u32, c: u32) -> u32 {
    return a * z + c;
}
fn hybrid_taus(z1: u32, z2: u32, z3: u32, z4: u32) -> f32 {
    let r = taus_step(z1, 13u, 19u, 12u, 4294967294u) ^
            taus_step(z2, 2u, 25u, 4u, 4294967288u) ^
            taus_step(z3, 3u, 11u, 17u, 4294967280u) ^
            lcg_step(z4, 1664525u, 1013904223u);
    return f32(r) * 2.3283064365387e-10;
}

fn rand_uint(last_r: u32) -> u32 {
    let z1 = taus_step(last_r, 13u, 19u, 12u, 429496729u);
    let z2 = taus_step(last_r, 2u, 25u, 4u, 4294967288u);
    let z3 = taus_step(last_r, 3u, 11u, 17u, 429496280u);
    let z4 = lcg_step(1664525u, last_r, 1013904223u);
    return (z1 ^ z2 ^ z3 ^ z4);
}

fn uint_to_normalized_float(value: u32) -> f32 {
    return f32(value) / (f32(U32_MAX) + 1.0);
}

@group(0) @binding(0) var environment_sampler: sampler;
@group(0) @binding(1) var environment_map: texture_cube<f32>;
@group(0) @binding(2) var<uniform> params: IrradianceParams;

@vertex
fn vertex_main(@location(0) position: vec2<f32>) -> VertexOut
{
    var output : VertexOut;
    output.frag_position = vec4(position, 0.0, 1.0);
    output.clip_position = vec4(position, 0.0, 1.0);
    output.uv = vec2(position.x * 0.5 + 0.5, 1.0 - (position.y * 0.5 + 0.5));
    return output;
}

fn get_direction_from_uv_and_index(i: u32, uv: vec2<f32>) -> vec3<f32>
{
    let xy = uv * 2.0 - 1.0;

    switch i {
        case 0: {
            return vec3(1.0, -xy.y, -xy.x);
        }
        case 1: {
            return vec3(-1.0, -xy.y, xy.x);
        }
        case 2: {
            return vec3(xy.x, 1.0, xy.y);
        }
        case 3: {
            return vec3(xy.x, -1.0, -xy.y);
        }
        case 4: {
            return vec3(xy.x, -xy.y, 1.0);
        }
        case 5: {
            return vec3(-xy.x, -xy.y, -1.0);
        }
        default {
            //shouldn't pass this
            return vec3(0.0, 0.0, 0.0);
        }
    }
}

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32> {
    let id = u32(frag_data.frag_position.x) + (u32(frag_data.frag_position.y) * params.screen_width);
    let seed = seed_per_thread(id);
    let normal = normalize(get_direction_from_uv_and_index(params.face_index, frag_data.uv));

    var irradiance = vec3(0.0);
    var rand_int_a = rand_uint(seed); //seed value
    var rand_int_b = rand_uint(rand_int_a); //seed value

    for(var i: u32; i < params.sample_count; i = i + 1u){
        rand_int_a = rand_uint(rand_int_b);
        rand_int_b = rand_uint(rand_int_a);
        let theta = uint_to_normalized_float(rand_int_a) * TWO_PI;
        let r_square = uint_to_normalized_float(rand_int_b);
        let radius = sqrt(r_square);
        let local_tangent = radius * cos(theta);
        let local_bitangent = radius * sin(theta);
        let local_normal = sqrt(1 - r_square);
        let local_dir = vec3(local_tangent, local_bitangent, local_normal);
        let up = normal;
        let tangent = select(vec3<f32>(0.0, 1.0, 0.0), vec3<f32>(1.0, 0.0, 0.0), abs(normal.y) > 0.99);
        let right = normalize(cross(tangent, up));
        let forward = cross(up, right);
        let world_direction = normalize(local_dir.x * right + local_dir.y * forward + local_dir.z * up);
        irradiance += textureSample(environment_map, environment_sampler, world_direction).rgb;
    }

    irradiance /= f32(params.sample_count);
    irradiance *= PI;
    return vec4(irradiance, 1.0);
}

The first thing is we are entirely in spherical coordinates, which makes things much easier. We get a normalized random float for theta and for the radius term. The tangent and bitangent of the sample are just the converted polar coordinates. The sqrt(1 - r^2) term is more interesting. To model the hemisphere we can think of sampling from the unit circle at the base (x^2 + y^2 = 1), a cross-section of the hemisphere in the tangent/bitangent plane. However, as we get higher in elevation (toward the normal direction) the circle shrinks at a rate of sqrt(1 - z^2): at height 1 the circle has radius 0, and at height 0 it has radius 1. Sampling the base disk uniformly (radius = sqrt of a uniform random number) and projecting up onto the hemisphere is what biases the samples toward the normal direction.
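
Here's the same mapping as a CPU-side sketch (throwaway code, just to see the math outside the shader): two uniform random numbers become a direction in the local tangent/bitangent/normal frame, with samples clustering toward the normal.

//cosine-weighted hemisphere sample in the local frame:
//returns [tangent, bitangent, normal] components
function cosineWeightedSample(u1, u2) {
    const theta = 2 * Math.PI * u1;     //angle around the normal
    const rSquare = u2;
    const radius = Math.sqrt(rSquare);  //uniform point on the unit disk
    return [
        radius * Math.cos(theta),
        radius * Math.sin(theta),
        Math.sqrt(1 - rSquare)          //project up onto the hemisphere
    ];
}

//e.g. cosineWeightedSample(Math.random(), Math.random())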

The resulting map is a lot better:

A better looking irradiance map with some dithered artifacting but much less startbursting

Note that to get the right brightness we divide by the sample count to average and then multiply by pi. The pi comes from the cosine-weighted probability density we're sampling from (the cosine term integrated over the hemisphere is pi), so it puts the average back on the same scale as the grid version. If you don't, you'll get a much darker map. With the same number of samples, this is much, much better.

But this cubemap is fairly expensive to render and it would be nice if we could lower the resolution or sample count. Lowering the resolution looks bad though (64x64 with 128 samples):

Pixelated low-res irradiance map

It feels like we could run a blur filter over it to make it look better (I won't be detailing this part because I already have another post on convolutions, but take a look at convolution-pipeline.js in the code if you want to see it). It's also not perfect: you can see visible seams where the edge behavior is incorrect. Realistically this doesn't improve much and will need more sophisticated methods to look better.

A pixelated and blurry irradiance map

I also parameterized the pipelines better so things like resolution of the cubemaps weren't hardcoded. If you see differences in the final code that's why.

The final output is a little pixelated but adjustable with parameters.

Final scene show from a few angles, colored light from the walls reflects off a white teapot

Outputting a cubemap

Something helpful is the ability to output the cubemap or a texture so we can look at it. I made a whole pipeline to do this:

//@ts-check

/**
 * @typedef {import("../../../entities/camera.js").Camera} Camera
 * @typedef {import("../../../entities/material.js").Material} Material
 * @typedef {import("../../../entities/probe.js").Probe} Probe
 * @typedef {import("../../../entities/mesh.js").Mesh} Mesh
 * @typedef {import("../../../entities/light.js").Light} Light
 * @typedef {import("../../../entities/group.js").Group} Group
 * @typedef {import("../../../entities/background.d.ts").IBackground} IBackground
 * @typedef {import("../../../entities/pipeline.d.ts").IPipeline} IPipeline
 * @typedef {import("../../../entities/pipeline.d.ts").MeshContainer} MeshContainer
 * @typedef {import("../../../entities/pipeline.d.ts").AttachmentViews} AttachmentViews
 */

import { uploadShader } from "../../../utilities/wgpu-utils.js";

/**
 * @implements {IPipeline}
 */
export class DisplayTexturePipeline {
    #bindGroupLayouts = new Map();
    #pipeline;
    #sampler;
    #vertexBuffer;

    /**
     * 
     * @param {GPUDevice} device 
     * @param {{ textureType?: "texture" | "depthmap" | "cubemap", sampler?: GPUSampler }} options 
     */
    async createPipeline(device, options = {}) {
        const textureType = options.textureType ?? "texture";
        const vertexBufferDescriptor = [{
            attributes: [
                {
                    shaderLocation: 0,
                    offset: 0,
                    format: "float32x2"
                }
            ],
            arrayStride: 8,
            stepMode: "vertex"
        }];

        let samplerType;
        let sampleType;
        let viewDimension;
        let url;

        switch(textureType){
            case "texture": {
                samplerType = "filtering";
                sampleType = "float";
                viewDimension = "2d";
                url = "../shaders/display-texture.wgsl";
                break;
            }
            case "cubemap": {
                samplerType = "filtering";
                sampleType = "float";
                viewDimension = "2d-array";
                url = "../shaders/display-cubemap.wgsl";
                break;
            }
            case "depthmap": {
                samplerType = "non-filtering";
                sampleType = "depth";
                viewDimension = "2d";
                url = "../shaders/display-texture.wgsl";
                break;
            }
        }

        const relativeShaderUrl = import.meta.resolve(url);
        const shaderModule = await uploadShader(device, relativeShaderUrl);

        this.#bindGroupLayouts.set("texture", device.createBindGroupLayout({
            entries: [
                {
                    binding: 0,
                    visibility: GPUShaderStage.FRAGMENT,
                    sampler: {
                        type: samplerType
                    }
                },
                {
                    binding: 1,
                    visibility: GPUShaderStage.FRAGMENT,
                    texture: {
                        sampleType,
                        viewDimension 
                    }
                }
            ]
        }));

        const pipelineLayout = device.createPipelineLayout({
            label: "display-texture-pipeline-layout",
            bindGroupLayouts: this.#bindGroupLayouts.values().toArray()
        });

        const pipelineDescriptor = {
            label: "display-texture-pipeline",
            layout: pipelineLayout,
            vertex: {
                module: shaderModule,
                entryPoint: "vertex_main",
                buffers: vertexBufferDescriptor
            },
            fragment: {
                module: shaderModule,
                entryPoint: "fragment_main",
                targets: [
                    { format: "rgba8unorm" }
                ]
            },
            primitive: {
                topology: "triangle-list"
            }
        };

        this.#pipeline = device.createRenderPipeline(pipelineDescriptor);


        this.#sampler = options.sampler;
        const vertices = new Float32Array([
            -1.0, -1.0,
            3.0, -1.0,
            -1.0, 3.0
        ]);

        this.#vertexBuffer = device.createBuffer({
            label: "display-texture-export-tri",
            size: vertices.byteLength,
            usage: GPUBufferUsage.VERTEX | GPUBufferUsage.COPY_DST,
        });
        device.queue.writeBuffer(this.#vertexBuffer, 0, vertices);
    }

    /**
     * @param {GPUDevice} device
     * @param {Mesh} root
     * @param {AttachmentViews} attachmentViews
     * @param {{
     *  meshContainers: Map<Mesh, MeshContainer>,
     *  lights: Map<string | symbol, Light>,
     *  shadowMaps: Map<string | symbol, GPUTexture>,
     *  textures: Map<string | symbol, GPUTexture>,
     *  samplers: Map<string | symbol, GPUSampler>,
     *  materials: Map<string | symbol, Material>,
     *  cameras: Map<string | symbol, Camera>,
     *  primaryCamera: Camera;
     *  probes: Map<string | symbol, Probe>,
     *  background: IBackground
     *  commandEncoder: GPUCommandEncoder
     * }} info
     * @param {GPUTextureView?} args
     */
    render(device, root, attachmentViews, info, args) {
        const commandEncoder = info.commandEncoder ?? device.createCommandEncoder({
            label: "display-texture-command-encoder"
        });

        const passEncoder = commandEncoder.beginRenderPass({
            label: "display-texture-render-pass",
            //@ts-expect-error WGPU types are not good
            colorAttachments: [
                {
                    storeOp: "store",
                    loadOp: "clear",
                    clearValue: { r: 0, g: 0, b: 0, a: 1 },
                    view: attachmentViews.colorView
                }
            ]
        });

        const textureBindGroup = device.createBindGroup({
            label: "display-texture-bind-group",
            layout: this.#bindGroupLayouts.get("texture"),
            entries: [
                { binding: 0, resource: this.#sampler },
                { binding: 1, resource: args},
            ]
        });

        passEncoder.setPipeline(this.#pipeline);
        passEncoder.setBindGroup(0, textureBindGroup);
        passEncoder.setVertexBuffer(0, this.#vertexBuffer);
        passEncoder.draw(3);
        passEncoder.end();

        if(!info.commandEncoder){
            device.queue.submit([commandEncoder.finish()]);
        }
    }
}

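A hypothetical wiring of it, just to show the shape of the calls (names like displayPipeline, context, info and irradianceCubemap are illustrative, but createPipeline and render match the class above):

// Sketch: dump the irradiance cubemap to the canvas so we can eyeball it.
const displayPipeline = new DisplayTexturePipeline();
await displayPipeline.createPipeline(device, {
    textureType: "cubemap",
    sampler: device.createSampler({ magFilter: "linear", minFilter: "linear" })
});

displayPipeline.render(
    device,
    root,
    // the pipeline targets rgba8unorm, so the canvas context should be configured with that format
    { colorView: context.getCurrentTexture().createView() },
    info, // the usual per-frame scene info; only commandEncoder is read here
    irradianceCubemap.createView({ dimension: "2d-array", arrayLayerCount: 6 }) // each cubemap face as a layer
);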

It works just like the old depth map debugger, but expanded to use our new pipeline abstraction. As a refresher: we draw a single triangle large enough to cover the whole screen, and then for each fragment we sample the corresponding texel.

struct VertexOut {
    @builtin(position) frag_position : vec4<f32>,
    @location(0) clip_position: vec4<f32>,
    @location(2) uv: vec2<f32>
};
@group(0) @binding(0) var my_sampler: sampler;
@group(0) @binding(1) var my_texture: texture_2d<f32>;

@vertex
fn vertex_main(@location(0) position: vec2<f32>) -> VertexOut
{
    var output : VertexOut;
    output.frag_position =  vec4(position, 1.0, 1.0);
    output.clip_position = vec4(position, 1.0, 1.0);
    output.uv = vec2(position.x * 0.5 + 0.5, 1.0 - (position.y * 0.5 + 0.5));
    return output;
}

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32> {
    let value = textureSample(my_texture, my_sampler, frag_data.uv);
    return value;
}

For the cubemap we create a view as a texture_2d_array (because actual cubemap views are sampled by direction vectors rather than UVs). Then for each patch of pixels we sample a layer of the texture:

struct VertexOut {
    @builtin(position) frag_position : vec4<f32>,
    @location(0) clip_position: vec4<f32>,
    @location(2) uv: vec2<f32>
};
@group(0) @binding(0) var my_sampler: sampler;
@group(0) @binding(1) var my_texture: texture_2d_array<f32>;

@vertex
fn vertex_main(@location(0) position: vec2<f32>) -> VertexOut
{
    var output : VertexOut;
    output.frag_position =  vec4(position, 1.0, 1.0);
    output.clip_position = vec4(position, 1.0, 1.0);
    output.uv = vec2(position.x * 0.5 + 0.5, 1.0 - (position.y * 0.5 + 0.5));
    return output;
}

@fragment
fn fragment_main(frag_data: VertexOut) -> @location(0) vec4<f32> {
    let tile_width = 0.25;
    let tile_height = 0.33;
    var layer: i32 = -1;
    var origin: vec2<f32> = vec2(0.0, 0.0);

    //top
    if(frag_data.clip_position.x > 0.0 && frag_data.clip_position.x < 0.5 && frag_data.clip_position.y <= 1.0 && frag_data.clip_position.y > 0.33) {
        layer = 2;
        origin = vec2(tile_width * 2, 0.0);
    //back
    } else if(frag_data.clip_position.x >= -1.0 && frag_data.clip_position.x < -0.5 && frag_data.clip_position.y <= 0.33 && frag_data.clip_position.y > -0.33) {
        layer = 5;
        origin = vec2(0.0, tile_height);
    //left
    } else if(frag_data.clip_position.x >= -0.5 && frag_data.clip_position.x < 0.0 && frag_data.clip_position.y <= 0.33 && frag_data.clip_position.y > -0.33) {
        layer = 1;
        origin = vec2(tile_width, tile_height);
    //front
    } else if(frag_data.clip_position.x >= 0.0 && frag_data.clip_position.x < 0.5 && frag_data.clip_position.y <= 0.33 && frag_data.clip_position.y > -0.33) {
        layer = 4;
        origin = vec2(tile_width * 2, tile_height);
    //right
    } else if(frag_data.clip_position.x >= 0.5 && frag_data.clip_position.y <= 0.33 && frag_data.clip_position.y > -0.33) {
        layer = 0;
        origin = vec2(tile_width * 3, tile_height);
    //bottom
    } else if(frag_data.clip_position.x > 0.0 && frag_data.clip_position.x < 0.5 && frag_data.clip_position.y <= -0.33) {
        layer = 3;
        origin = vec2(tile_width * 2, tile_height * 2);
    }

    let dims = textureDimensions(my_texture, 0);
    let cubemap_px = vec2<f32>(dims.xy);
    let tile_px = vec2<f32>(tile_width * cubemap_px.x, tile_height * cubemap_px.y); //get tile in terms of texture pixels
    let half_texel = vec2<f32>(0.5) / tile_px;       

    let local_uv = clamp((frag_data.uv - origin) / vec2(tile_width, tile_height), vec2(half_texel), vec2(1.0 - half_texel));

    let sample = textureSample(my_texture, my_sampler, local_uv, layer);

    if(layer == -1){
        return vec4(0.0, 0.0, 0.0, 1.0);
    }
    return sample;
}

The faces are laid out as a cross with 3 rows and 4 columns of tiles. For each tile we take the position inside the tile and scale it to get the local UV to sample, and if we're not in a valid tile we just return black. The order here is important: if you try to call textureSample inside the if blocks you'll get errors, because textureSample needs implicit derivatives and so can't appear in non-uniform control flow where not all threads would take the sample at the same time. I guess that makes sense thinking about how GPUs work, but it's a tad unexpected. You might also wonder about the half-texel business. We calculate what half a texel is in the space of the cubemap layer and pull the UVs in from the edges by that amount so we never sample right at the boundary, which can produce ugly seams. I think in this case the seams could be because the sampler uses repeat rather than mirror addressing, but either way it's nice not to depend on the sampler since it was parameterized (mostly for nearest-neighbor scenarios).
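As an aside, one way to make this less dependent on the addressing mode in the first place (a hedged alternative, not what the repo currently does) would be to create the debug sampler with clamp-to-edge addressing:

// Hypothetical debug sampler: clamp-to-edge means a UV landing exactly on 0.0 or 1.0
// reads the edge texel instead of wrapping around to the opposite side of the face.
const debugSampler = device.createSampler({
    addressModeU: "clamp-to-edge",
    addressModeV: "clamp-to-edge",
    magFilter: "nearest", // nearest keeps low-res maps inspectable pixel-for-pixel
    minFilter: "nearest"
});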

Further Exploration

There were a lot of things I didn't get to because this chapter was already too long. Apparently spherical harmonics can be used to compress the data we have. I don't know anything about that but it seems interesting. In a real scene objects move around so being able to generate lighting for these objects in real time would be an interesting upgrade. Apparently the most common way is to generate a lattice of probes and interpolate between them. It would also be nice to cache the preprocessing stuff so it doesn't have to compute that on every load.

There's also some cleanup to do. It would be nice if the main pipeline was combined with the background pipeline. It would also be nice if the cubemap render was able to just call the main pipeline in general. Maybe at that point a probe just becomes a new type of camera. They could be used for other things like point lights or for reflections. Ideas for another day.

Conclusion

This one took a while to learn about. I really wasn't expecting to jump headfirst into global illumination, but I took my original idea for adding ambient light and then researched what people actually tend to do. It's a lot to take in, but it's pretty cool that this works and it's way higher fidelity than I originally intended. I might have to take a break after this though, as I kinda burned out at the finish line, and there's a ton of refactorings nagging me that I can hopefully knock out by themselves and stay mostly on topic for once.

The code

https://github.com/ndesmic/geo/compare/v0.9...v0.10

Links

https://learnopengl.com/PBR/IBL/Diffuse-irradiance
https://indico.cern.ch/event/93877/contributions/2118070/attachments/1104200/1575343/acat3_revised_final.pdf
