ndesmic

WebGPU Engine from Scratch Part 10: Markup Language and Scene Graph

One thing that has become really annoying while doing manual tests is that I have to set up scenes. These are built by adding objects to the various dictionaries in the gpu-engine.js file, which involves lots of scrolling, and you can't see the scene description as a whole. This time I want to create an SVG-like scene description for this purpose.

The scene as an object

The engine shouldn't deal with anything like this; it should just take raw scene data. It does, but it's scattered all over the place, so let's combine it all into one object that we can pass in, handing each initialize method its slice of that object. At this point I'm not sure if the entries should be classes themselves or just plain objects passed to a class constructor. For now they are classes, since I don't see an issue with keeping those external.

Cameras

initializeCameras(cameras){
    for(const [key, camera] of Object.entries(cameras)){
        this.#cameras.set(key, camera);
    }
}

Nothing too interesting here, just iterate over the object to turn it into the map.
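Every initialize method follows this same shape: a plain keyed object in, a Map keyed the same way out. As a tiny standalone sketch of the pattern (the camera object here is a stand-in, not the engine's Camera class):

```javascript
// The same object-to-Map pattern each initialize method uses.
const cameras = { main: { fieldOfView: 90 } };

const cameraMap = new Map();
for (const [key, camera] of Object.entries(cameras)) {
    cameraMap.set(key, camera);
}

console.log(cameraMap.get("main").fieldOfView); // 90
```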

Textures

export const DEPTH_TEXTURE = Symbol("depth-texture");
export const PLACEHOLDER_TEXTURE = Symbol("placeholder-texture");

async initializeTextures(textures) {
    for(const [key, texture] of Object.entries(textures)){
        if(texture.image ?? texture.images){
            this.#textures.set(key, await uploadTexture(this.#device, texture.image ?? texture.images, { label: `${key}-texture` }));
        } else if(texture.color){
            this.#textures.set(key, createColorTexture(this.#device, { color: texture.color, label: `${key}-texture` }));
        }
    }
    //default textures
    this.#textures.set(DEPTH_TEXTURE, this.#device.createTexture({
        label: "depth-texture",
        size: {
            width: this.#canvas.width,
            height: this.#canvas.height,
            depthOrArrayLayers: 1
        },
        format: "depth32float",
        usage: GPUTextureUsage.RENDER_ATTACHMENT
    }));
    this.#textures.set(PLACEHOLDER_TEXTURE, createColorTexture(this.#device, { label: "placeholder-texture" }));
}

There are two types of textures: those that come from images and those that are flat colors, so we handle each case based on which key is passed in. For some semantic ergonomics we support either image or images, but they're normalized the same way in both cases. We also have our two textures that are necessary for the pipeline to function. I've changed those to use symbol keys so they don't get overridden from the outside unintentionally; these need to be applied to the Material class, which references them as well. I've pushed the async fetching outside of the engine because that's really not its concern. With that, I've updated uploadTexture to not do the fetching:

//wgpu-utils.js
/**
 * Loads an image url, uploads to GPU and returns texture ref.
 * Cubemaps defined like [+X, -X, +Y, -Y, +Z, -Z]
 * @param {GPUDevice} device 
 * @param {HTMLImageElement | HTMLImageElement[]} imageOrImages
 * @param {{ label?: string }} options 
 */
export function uploadTexture(device, imageOrImages, options = {}) {
    const images = [].concat(imageOrImages);
    const size = {
        width: images[0].width,
        height: images[0].height,
        depthOrArrayLayers: images.length
    };

    const texture = device.createTexture({
        label: options.label,
        size,
        dimension: "2d",
        format: `rgba8unorm`,
        usage: GPUTextureUsage.COPY_DST | GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
    });

    images.forEach((img, layer) => {
        device.queue.copyExternalImageToTexture(
            {
                source: img,
                flipY: true
            },
            {
                texture,
                origin: [0, 0, layer]
            },
            {
                width: img.width,
                height: img.height,
                depthOrArrayLayers: 1
            }
        );
    });

    return texture;
}

Materials

initializeMaterials(materials) {
    for (const [key, material] of Object.entries(materials)) {
        this.#materials.set(key, material);
    }
}

Materials are simple.

Samplers

I don't see a good reason right now to let these be set from the outside, so they are left alone. If we do pass them in later, we'll need to build a wrapper object for them and iterate over them like usual. I did add symbols for the default sampler and the default shadow sampler, and those references need to be updated.

Meshes

initializeMeshes(meshes){
    for(const [key, mesh] of Object.entries(meshes)){
        const { vertexBuffer, indexBuffer } = uploadMesh(this.#device, mesh, { label: `${key}-mesh` });
        this.#meshContainers.set(key, { mesh, vertexBuffer, indexBuffer });
    }
}

These are pretty simple too, but we need to make sure GPU-related activities like uploadMesh stay in the engine itself. We'd also like to push async network work outside of the engine, so fetchObj lives in the component.

Lights

initializeLights(lights){
    for (const [key, light] of Object.entries(lights)) {
        this.#lights.set(key, light)
    }
    for(const key of this.#lights.keys()){
        this.#shadowMaps.set(key, this.#device.createTexture({
            label: `shadow-map-${key}`,
            size: {
                width: 2048,
                height: 2048,
                depthOrArrayLayers: 1
            },
            format: "depth32float",
            usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
        }));
    }
    this.#shadowMaps.set("placeholder", this.#device.createTexture({
        label: "placeholder-depth-texture",
        size: { width: 1, height: 1, depthOrArrayLayers: 1 },
        format: "depth32float",
        usage: GPUTextureUsage.RENDER_ATTACHMENT | GPUTextureUsage.TEXTURE_BINDING
    }));
}

Nothing fancy here either, except that we need to set up the shadow maps. Shadow maps are an internal-only thing, so I'm not assigning symbols to anything. Since the ShadowMappedLight class is passed in, this might indicate some internal leakage...

Pipelines

These shouldn't be exposed.

Pipeline - Mesh associations

initializePipelineMeshes(pipelineMeshes){
    for(const [key, pipelineMesh] of Object.entries(pipelineMeshes)){
        this.#pipelineMesh.set(key, pipelineMesh);
    }
}

Nothing to it, although this means the author needs to know about our pipelines and what they are named. Should pipelines be exposed, or do we need some other abstraction so we can hide the pipeline details?

Here's how the new initialization works:

await this.engine.initialize({
    scene: {
        cameras: {
            "main": new Camera({
                position: [0.5, 0.2, -0.5],
                screenHeight: this.dom.canvas.height,
                screenWidth: this.dom.canvas.width,
                fieldOfView: 90,
                near: 0.01,
                far: 5,
                isPerspective: true
            })
        },
        textures: {
            "marble": { image: await loadImage("./img/marble-white/marble-white-base.jpg") },
            "marble-roughness": { image: await loadImage("./img/marble-white/marble-white-roughness.jpg") },
            "red-fabric": { image: await loadImage("./img/red-fabric/red-fabric-base.jpg") },
            "red-fabric-roughness": { image: await loadImage("./img/red-fabric/red-fabric-roughness.jpg") },
            "gold": { color: [0, 0, 0, 1] },
        },
        materials : {
            "marble": new Material({
                texture: "marble",
                useSpecularMap: true,
                specularMap: "marble-roughness"
            }),
            "red-fabric": new Material({
                texture: "red-fabric",
                useSpecularMap: true,
                specularMap: "red-fabric-roughness"
            }),
            "gold": new Material({
                texture: "gold",
                useSpecularMap: false,
                roughness: 0.2,
                metalness: 1,
                baseReflectance: [1.059, 0.773, 0.307]
            })
        },
        meshes: {
            "teapot": (await fetchObjMesh("./objs/teapot.obj", { reverseWinding: true }))
                .useAttributes(["positions", "uvs", "normals"])
                .normalizePositions()
                .resizeUvs(2)
                .setMaterial("gold"),
            "rug": new Mesh(surfaceGrid(2, 2))
                .useAttributes(["positions", "uvs", "normals"])
                .translate({ y: -0.25 })
                .bakeTransforms()
                .setMaterial("red-fabric")
        },
        lights: {
            "light": new ShadowMappedLight({
                type: "directional",
                color: [1.0, 1.0, 1.0, 1],
                direction: [0, -1, 1],
                hasShadow: true,
            })
        },
        pipelineMeshes: {
            "main": ["teapot", "rug"]
        }
    }
});

It's all nicely packed into one structure. It's not perfect: ShadowMappedLight probably shouldn't be part of this, and the scene knows about our pipelines, which might not exist in a different engine. So there will likely be more work here in the future to keep those implementation details from leaking into the scene representation.

Building the markup

Might as well use HTML, because it's already parsed as DOM. We don't really need custom elements or anything, just the ability to read it. In order to get the key for each entity, they will all be required to have a key attribute. I had originally gone with id before deciding that was a bad idea because it conflates DOM ids with our entities' key naming.
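The getKey helper used throughout the parser isn't shown in this post; a minimal version, assuming it just reads the required attribute and fails loudly when it's missing, might look like:

```javascript
// Hypothetical sketch of getKey: read the required `key` attribute,
// throwing with the element type so bad markup is easy to track down.
function getKey(element, type) {
    const key = element.getAttribute("key");
    if (!key) {
        throw new Error(`geo-${type} element is missing a key attribute`);
    }
    return key;
}
```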

We do this with a new utility function parseScene, which takes a wc-geo element as a parameter.

export async function parseScene(element) { //more below

Cameras

//geo-markup-parser.js
function parseCamera(cameraEl, options) {
    const key = getKey(cameraEl, "camera");

    return [
        key,
        new Camera({
            position: parseVector(cameraEl.getAttribute("position"), 3),
            screenHeight: cameraEl.getAttribute("height") ?? options.defaultHeight,
            screenWidth: cameraEl.getAttribute("width") ?? options.defaultWidth,
            fieldOfView: cameraEl.getAttribute("fov") ?? 90,
            near: cameraEl.getAttribute("near") ?? 0.01,
            far: cameraEl.getAttribute("far") ?? 5,
            isPerspective: !cameraEl.hasAttribute("is-orthographic")
        })
    ];
}

//geo-markup-parser.js - parseScene
const cameras = Object.fromEntries(Array.from(element.querySelectorAll("geo-camera"))
    .map(c => parseCamera(c, { defaultHeight: element.dom.canvas.height, defaultWidth: element.dom.canvas.width })));

The only interesting thing here is that we need default values for screenHeight and screenWidth, which come from the canvas. It's a bit of a hack to just read them out.
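parseVector is another helper that isn't shown. Based on how it's called (a comma-separated attribute string, an expected length, and sometimes a default), a sketch might be:

```javascript
// Hypothetical sketch of parseVector: "0.5, 0.2, -0.5" -> [0.5, 0.2, -0.5].
// Returns the default (or undefined) when the attribute is absent.
function parseVector(text, length, defaultValue) {
    if (text === null || text === undefined) return defaultValue;
    const values = text.split(",").map(part => parseFloat(part.trim()));
    if (values.length !== length || values.some(Number.isNaN)) {
        throw new Error(`Expected a vector of length ${length}, got "${text}"`);
    }
    return values;
}
```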

Textures

// geo-markup-parser.js
async function parseTexture(textureEl){
    const key = getKey(textureEl, "texture")
    const src = textureEl.getAttribute("src");
    const color = textureEl.getAttribute("color");
    let value;
    if (src) {
        value = { image: await loadImage(src) };
    } else if (color) {
        value = { color: parseVector(color, 4) };
    }

    return [key, value];
}
// geo-markup-parser.js - parseScene
const textures = Object.fromEntries(await Promise.all(Array.from(element.querySelectorAll("geo-texture"))
        .map(parseTexture)));

Textures come in two types: a solid color or an image URL. Unfortunately the image case is async, which complicates things (ideally async work wouldn't happen in the parser, but whatever).
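loadImage also isn't shown in the post; assuming all we need is an HTMLImageElement whose pixels are ready to upload, it could be as small as:

```javascript
// Hypothetical sketch of loadImage: create an image element for the URL and
// wait on decode() so the pixels are ready before uploading to the GPU.
function loadImage(url) {
    const image = new Image();
    image.src = url;
    return image.decode().then(() => image);
}
```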

Materials

For the materials I decided we'll stick with the PBR paradigm going forward, so instances of the string "texture" are renamed to "albedoMap" and "specularMap" to "roughnessMap".

// geo-markup-parser.js
function parseMaterial(materialEl) {
    const key = getKey(materialEl, "material");
    const roughnessMap = materialEl.getAttribute("roughness-map");
    const albedoMap = materialEl.getAttribute("albedo-map");

    return [
        key,
        new Material({
            name: key,
            albedoMap: albedoMap,
            roughnessMap: roughnessMap,
            useRoughnessMap: !!roughnessMap,
            roughness: parseFloatOrDefault(materialEl.getAttribute("roughness")),
            metalness: parseFloatOrDefault(materialEl.getAttribute("metalness")),
            baseReflectance: parseVector(materialEl.getAttribute("base-reflectance"), 3)
        })
    ]
}
// geo-markup-parser.js - parseScene
const materials = Object.fromEntries(Array.from(element.querySelectorAll("geo-material"))
    .map(parseMaterial));

Nothing interesting otherwise.
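parseFloatOrDefault is one more unshown helper; presumably it parses the attribute string and falls back when it's missing or malformed:

```javascript
// Hypothetical sketch of parseFloatOrDefault: attributes come back as strings
// (or null), so coerce to a number and fall back when that fails.
function parseFloatOrDefault(text, defaultValue = 0) {
    const value = parseFloat(text);
    return Number.isNaN(value) ? defaultValue : value;
}
```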

Meshes

The surface grid can be its own element since it's a generative primitive mesh. I've renamed the height and width parameters to rowCount and colCount respectively, because those names align better with what's happening and don't imply orientation.

// geo-markup-parser.js
async function parseMesh(meshEl) {
    const key = getKey(meshEl, "mesh");
    const reverseWinding = meshEl.hasAttribute("reverse-winding");
    const src = meshEl.getAttribute("src");
    const mesh = await fetchObjMesh(src, { reverseWinding });

    updateMeshAttributes(meshEl, mesh);

    return [key, mesh];
}
// geo-markup-parser.js
function parseSurfaceGrid(meshEl) {
    const key = getKey(meshEl, "surface-grid");
    const rowCount = parseInt(meshEl.getAttribute("row-count"), 10);
    const colCount = parseInt(meshEl.getAttribute("col-count"), 10);
    const mesh = new Mesh(surfaceGrid(rowCount, colCount));

    updateMeshAttributes(meshEl, mesh);

    return [key, mesh];
}
// geo-markup-parser.js - parseScene
const meshes = Object.fromEntries(await Promise.all(Array.from(element.querySelectorAll("geo-mesh"))
    .map(parseMesh)));
const surfaceGrids = Object.fromEntries(Array.from(element.querySelectorAll("geo-surface-grid"))
    .map(parseSurfaceGrid));
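Both parsers share updateMeshAttributes, which isn't shown. Judging from the attributes that appear in the markup later (attributes, normalize, resize-uvs, translate, bake-transforms, material), a sketch of it might be:

```javascript
// Hypothetical sketch of updateMeshAttributes: apply the shared markup
// attributes onto the Mesh fluent API. The exact set is an assumption.
function updateMeshAttributes(meshEl, mesh) {
    const attributes = meshEl.getAttribute("attributes");
    if (attributes) {
        mesh.useAttributes(attributes.split(",").map(a => a.trim()));
    }
    if (meshEl.hasAttribute("normalize")) {
        mesh.normalizePositions();
    }
    const resizeUvs = meshEl.getAttribute("resize-uvs");
    if (resizeUvs) {
        mesh.resizeUvs(parseFloat(resizeUvs));
    }
    const translate = meshEl.getAttribute("translate");
    if (translate) {
        const [x, y, z] = translate.split(",").map(parseFloat);
        mesh.translate({ x, y, z });
    }
    if (meshEl.hasAttribute("bake-transforms")) {
        mesh.bakeTransforms();
    }
    const material = meshEl.getAttribute("material");
    if (material) {
        mesh.setMaterial(material);
    }
}
```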

Lights

After thinking about it harder, the ShadowMappedLight class doesn't need to exist after all. Since we can calculate the view matrix and projection matrix from a few constants and non-light values, we can turn these into helper functions. The only exception is hasShadow, which can be moved into Light itself because it does make sense in other engine contexts. I did that and renamed it to castsShadow because it makes more grammatical sense.

//light-utils.js
const distance = 0.75;
const center = [0, 0, 0];
const frustumScale = 2;

export function getLightViewMatrix(direction) {
    const lightPosition = scaleVector(subtractVector(center, direction), distance);
    return getLookAtMatrix(lightPosition, center);
}

export function getLightProjectionMatrix(aspectRatio) {
    const right = aspectRatio * frustumScale;
    return getOrthoMatrix(-right, right, -frustumScale, frustumScale, 0.1, Math.min(distance * 2, 2.0));
}

Even this is dubious because it's just some near-generic transforms that could also be removed, but I want some level of abstraction because I feel that in the future this might handle more than it does now, like point lights and spotlights.

// geo-markup-parser.js
function parseLights(lightEl) {
    const key = getKey(lightEl, "light");
    const light = new Light({
        type: lightEl.getAttribute("type") ?? "point",
        color: parseVector(lightEl.getAttribute("color"), 4, [1, 1, 1, 1]),
        direction: parseVector(lightEl.getAttribute("direction"), 3, [0, 0, 0]),
        castsShadow: lightEl.hasAttribute("casts-shadow")
    });

    return [key, light];
}
// geo-markup-parser.js - parseScene
const lights = Object.fromEntries(Array.from(element.querySelectorAll("geo-light"))
        .map(parseLights));

Pipeline mesh

I still don't like it but we'll take it from the meshes themselves.

// geo-markup-parser.js
function getPipelineMesh(meshEl) {
    const pipeline = meshEl.getAttribute("pipeline");
    const meshKey = meshEl.getAttribute("key");

    return {
        pipeline,
        meshKey
    };
}
// geo-markup-parser.js - parseScene
const pipelineMeshes = Array.from(element.querySelectorAll("geo-mesh, geo-surface-grid"))
        .map(getPipelineMesh);

Finally we return the rest:

// geo-markup-parser.js - parseScene
if (Object.keys(cameras).length === 0) {
    throw new Error("Need a 'main' camera defined");
}
return {
    cameras,
    textures,
    materials,
    meshes: { ...meshes, ...surfaceGrids },
    lights,
    pipelineMeshes
};

This code is not fantastic: we don't have much validation, and better error messages would be nice. It also won't work on the server without a DOM polyfill. It might be nice if some resources could be anonymous and just nested, like textures under materials for example. But it gets the job done, and if there's more need, maybe we can do a bit more.

Grouping and scene graph

One thing that would be nice, though, is some grouping. The idea is that we group related parts of a scene so we can do things like apply transforms to all of them at once. We can start with a group class.

//group.js
import { getIdentityMatrix, getRotationXMatrix, getRotationYMatrix, getRotationZMatrix, getScaleMatrix, getTranslationMatrix, multiplyMatrix } from "../utilities/vector.js";

export class Group {
    #children = [];
    #transforms = [];

    constructor(options) {
        this.#children = options.children;
    }

    get children() {
        return this.#children;
    }

    getModelMatrix() {
        return this.#transforms.reduce((mm, tm) => multiplyMatrix(tm, [4, 4], mm, [4, 4]), getIdentityMatrix());
    }

    translate({ x = 0, y = 0, z = 0 }) {
        this.#transforms.push(getTranslationMatrix(x, y, z));
        return this;
    }
    scale({ x = 1, y = 1, z = 1 }) {
        this.#transforms.push(getScaleMatrix(x, y, z));
        return this;
    }
    rotate({ x, y, z }) {
        //there's an order dependency here... something something quaternions...
        if (x) {
            this.#transforms.push(getRotationXMatrix(x));
        }
        if (y) {
            this.#transforms.push(getRotationYMatrix(y));
        }
        if (z) {
            this.#transforms.push(getRotationZMatrix(z));
        }
        return this;
    }
}

This just collects meshes and keeps a transform array. Technically we could add more types of things to the group, like lights and cameras, but for simplicity today we'll just consider meshes. I'm also not going to deal with subclassing yet, though it makes sense that "transformable" things would share a base class.
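That "order dependency" comment in rotate is worth making concrete. Here's a tiny self-contained check (row-major flat arrays and a minimal multiply written just for this demo, not the engine's vector.js) showing that translate-then-rotate and rotate-then-translate land somewhere different:

```javascript
// Row-major 4x4 multiply, inlined just for this demo.
function multiply(a, b) {
    const out = new Array(16).fill(0);
    for (let row = 0; row < 4; row++)
        for (let col = 0; col < 4; col++)
            for (let k = 0; k < 4; k++)
                out[row * 4 + col] += a[row * 4 + k] * b[k * 4 + col];
    return out;
}

const translateX2 = [1,0,0,2, 0,1,0,0, 0,0,1,0, 0,0,0,1]; // move +2 on x
const rotateZ90  = [0,-1,0,0, 1,0,0,0, 0,0,1,0, 0,0,0,1]; // 90° about z

// translate first, then rotate: the translation itself gets rotated onto y
const rotThenTrans = multiply(rotateZ90, translateX2);
// rotate first, then translate: the translation stays on x
const transThenRot = multiply(translateX2, rotateZ90);

console.log(rotThenTrans[3], rotThenTrans[7]); // 0 2
console.log(transThenRot[3], transThenRot[7]); // 2 0
```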

To parse a group

//geo-markup-parser.js
/**
 * 
 * @param {HTMLElement} groupEl 
 */
async function parseGroup(groupEl){
    const key = getKey(groupEl, "group");
    const children = await Promise.all(Array.from(groupEl.children).map(async c => {
        switch(c.tagName){
            case "GEO-MESH": {
                return (await parseMesh(c))[1];
            }
            case "GEO-SURFACE-GRID": {
                return parseSurfaceGrid(c)[1];
            }
            default: {
                throw new Error(`Group doesn't support ${c.tagName} children`)
            }
        }
    }))

    const group = new Group({
        children
    });

    return [key, group];
}

//geo-markup-parser.js - parseScene
const groups = Object.fromEntries(await Promise.all(Array.from(element.children).filter(c => c.tagName === "GEO-GROUP")
    .map(parseGroup)));

Parsing is simple, but I've modified it so that it only checks one level deep. This is a limitation for now, but it sets up the recursion. Note that for the children we only need the value, not the key; once recursion is set up the keys will matter less. Likewise, meshes will not be searched for in the whole tree, just the top level.

//geo-markup-parser.js - parseScene
const meshes = Object.fromEntries(await Promise.all(Array.from(element.children).filter(c => c.tagName === "GEO-MESH")
    .map(parseMesh)));

And hook it up in the engine:

//gpu-engine.js - initialize
this.initializeGroups(scene.groups);

//gpu-engine.js
initializeGroups(groups){
    for(const [key, group] of Object.entries(groups)){
        this.initializeMeshes(group.children);
        this.#groups.set(key, group);
    }
}

At this point nothing should change.

To render the groups we need to make the rendering recursive.

//gpu-engine.js - renderShadowMaps
renderShadowMaps(){
    const commandEncoder = this.#device.createCommandEncoder({
        label: "shadow-map-command-encoder"
    });
    const shadowMapPipelineContainer = this.#pipelines.get("shadow-map");
    for(const [key, light] of this.#lights){
        let isFirstPass = true;
        const passEncoder = commandEncoder.beginRenderPass({
            label: `shadow-map-render-pass`,
            colorAttachments: [],
            depthStencilAttachment: {
                view: this.#shadowMaps.get(key).createView(),
                depthClearValue: 1.0,
                depthStoreOp: "store",
                depthLoadOp: isFirstPass ? "clear" : "load",
            }
        });
        passEncoder.setPipeline(shadowMapPipelineContainer.pipeline);
        const renderRecursive = (meshOrGroup) => {
            if(meshOrGroup instanceof Group){
                for(const child of meshOrGroup.children){
                    renderRecursive(child)
                }
            } else {
                const shadowMap = this.#shadowMaps.get(key);
                const meshContainer = this.#meshContainers.get(meshOrGroup);
                shadowMapPipelineContainer.bindMethod(passEncoder, shadowMapPipelineContainer.bindGroupLayouts, light, shadowMap, meshOrGroup);
                passEncoder.setVertexBuffer(0, meshContainer.vertexBuffer);
                passEncoder.setIndexBuffer(meshContainer.indexBuffer, "uint16");
                passEncoder.drawIndexed(meshContainer.mesh.indices.length);
            }
        }
        for (const meshName of this.#pipelineMesh.get("main")) {
            const group = this.#groups.get(meshName);
            renderRecursive(group);
        }
        passEncoder.end();
        isFirstPass = false;
    }
    this.#device.queue.submit([commandEncoder.finish()]);
}

//gpu-engine.js - renderScene
renderScene(){
    const commandEncoder = this.#device.createCommandEncoder({
        label: "main-command-encoder"
    });
    const camera = this.#cameras.get("main");
    let isFirstPass = true;
    const depthView = this.#textures.get(DEPTH_TEXTURE).createView();
    for (const [pipelineName, meshNames] of this.#pipelineMesh.entries()) {
        const passEncoder = commandEncoder.beginRenderPass({
            label: `${pipelineName}-render-pass`,
            colorAttachments: [
                {
                    storeOp: "store",
                    loadOp: isFirstPass ? "clear" : "load",
                    clearValue: { r: 0.1, g: 0.3, b: 0.8, a: 1.0 },
                    view: this.#context.getCurrentTexture().createView()
                }
            ],
            depthStencilAttachment: {
                view: depthView,
                depthClearValue: 1.0,
                depthStoreOp: "store",
                depthLoadOp: isFirstPass ? "clear" : "load"
            }
        });
        const pipelineContainer = this.#pipelines.get(pipelineName);
        passEncoder.setPipeline(pipelineContainer.pipeline);
        const renderRecursive = (meshOrGroup) => {
            if(meshOrGroup instanceof Group){
                for(const child of meshOrGroup.children){
                    renderRecursive(child)
                }
            } else {
                const meshContainer = this.#meshContainers.get(meshOrGroup);
                pipelineContainer.bindMethod(passEncoder, pipelineContainer.bindGroupLayouts, camera, meshContainer.mesh, this.#lights, this.#shadowMaps);
                passEncoder.setVertexBuffer(0, meshContainer.vertexBuffer);
                passEncoder.setIndexBuffer(meshContainer.indexBuffer, "uint16");
                passEncoder.drawIndexed(meshContainer.mesh.indices.length);
            }
        }
        for (const meshName of meshNames) {
            const group = this.#groups.get(meshName);
            renderRecursive(group);
        }
        passEncoder.end();
        isFirstPass = false;
    }
    this.#device.queue.submit([commandEncoder.finish()]);
}

Note that since meshes no longer have globally unique keys, I changed the mesh container lookup to key on the mesh object itself.

//gpu-engine.js - initializeMeshes
initializeMeshes(meshes){
    for(const [key, mesh] of Object.entries(meshes)){
        const { vertexBuffer, indexBuffer } = uploadMesh(this.#device, mesh, { label: `${key}-mesh` });
        this.#meshContainers.set(mesh, { mesh, vertexBuffer, indexBuffer });
    }
}

This again should render the same scene provided we update the markup.

<wc-geo>
    <geo-camera key="main" position="0.5, 0.2, -0.5"></geo-camera>
    <geo-texture key="marble" src="./img/marble-white/marble-white-base.jpg"></geo-texture>
    <geo-texture key="marble-roughness" src="./img/marble-white/marble-white-roughness.jpg"></geo-texture>
    <geo-texture key="red-fabric" src="./img/red-fabric/red-fabric-base.jpg"></geo-texture>
    <geo-texture key="red-fabric-roughness" src="./img/red-fabric/red-fabric-roughness.jpg"></geo-texture>
    <geo-texture key="gold" color="0, 0, 0, 1"></geo-texture>
    <geo-material key="marble" roughness-map="marble-roughness" albedo-map="marble"></geo-material>
    <geo-material key="red-fabric" roughness-map="red-fabric-roughness" albedo-map="red-fabric"></geo-material>
    <geo-material key="gold" roughness="0.2" metalness="1" base-reflectance="1.059, 0.773, 0.307" albedo-map="gold"></geo-material>
    <geo-group key="teapot-rug" pipeline="main" rotate="1.5707963267948966, 0, 0">
        <geo-mesh 
            key="teapot" 
            normalize
            bake-transforms 
            reverse-winding 
            src="./objs/teapot.obj" 
            resize-uvs="2" 
            material="gold" 
            attributes="positions, normals, uvs">
        </geo-mesh>
        <geo-surface-grid 
            key="rug" 
            row-count="2" 
            col-count="2" 
            translate="0, -0.25, 0" 
            bake-transforms 
            material="red-fabric" 
            attributes="positions, normals, uvs">
        </geo-surface-grid>
    </geo-group>

    <geo-light key="light1" type="directional" color="1, 1, 1, 1" direction="0, -1, 1" casts-shadow></geo-light>
</wc-geo>

I've moved the pipeline attribute onto the group instead. The one thing that doesn't work yet is the rotation, because we need to recursively set up the transform.

Transforms on groups

Groups should be recursive in order to make full use of them, so we should start parsing them recursively:

// geo-markup-parser.js
/**
 * 
 * @param {HTMLElement} groupEl 
 */
async function parseGroup(groupEl){
    const key = getKey(groupEl, "group");
    const children = await Promise.all(Array.from(groupEl.children).map(async c => {
        switch(c.tagName){
            case "GEO-MESH": {
                return (await parseMesh(c))[1];
            }
            case "GEO-SURFACE-GRID": {
                return parseSurfaceGrid(c)[1];
            }
            case "GEO-QUAD": {
                return parseQuad(c);
            }
            case "GEO-CUBE": {
                return parseCube(c)
            }
            case "GEO-GROUP": {
                return (await parseGroup(c))[1]
            }
            default: {
                throw new Error(`Group doesn't support ${c.tagName} children`)
            }
        }
    }));

    const group = new Group({
        children
    });

    const translate = parseVector(groupEl.getAttribute("translate"), 3);
    if (translate) {
        group.translate({ x: translate[0], y: translate[1], z: translate[2] });
    }

    const rotate = parseVector(groupEl.getAttribute("rotate"), 3);
    if (rotate) {
        group.rotate({ x: rotate[0], y: rotate[1], z: rotate[2] });
    }

    const scale = parseVector(groupEl.getAttribute("scale"), 3);
    if (scale) {
        group.scale({ x: scale[0], y: scale[1], z: scale[2] });
    }

    return [key, group];
}

It's not very clean, because we have keys we aren't using, but it'll do for now. In the engine we should initialize things recursively:

//gpu-engine.js
initializeMesh(mesh, key) {
    const { vertexBuffer, indexBuffer } = uploadMesh(this.#device, mesh, { label: `${key}-mesh` });
    this.#meshContainers.set(mesh, { mesh, vertexBuffer, indexBuffer });
}
initializeMeshes(meshes) {
    for (const [key, mesh] of Object.entries(meshes)) {
        this.initializeMesh(mesh, key);
    }
}
initializeGroups(groups) {
    for (const [key, group] of Object.entries(groups)) {
        this.initializeGroup(group, key);
    }
}
initializeGroup(group, key) {
    for (const child of group.children) {
        if (child instanceof Mesh) {
            this.initializeMesh(child);
        } else if (child instanceof Group) {
            this.initializeGroup(child);
        }
    }
    this.#groups.set(key, group);
}

This still allows things to work the other way for now but if we restricted everything to a group this would become cleaner.

In order to actually apply the transforms, we'll create a new concept: a world matrix per entity. (I also changed the getModelMatrix method to a getter, since I doubt we'll need parameters for it.)

//mesh.js
#worldMatrix = getIdentityMatrix();

get worldMatrix() {
    return this.#worldMatrix;
}
/**
 * @param {Float32Array} value 
 */
set worldMatrix(value){
    this.#worldMatrix = value;
}

This world matrix tells the object about its parent transforms. The idea is that the world matrix is set recursively: at each group we multiply the group's model matrix into its world matrix, and the result becomes the world matrix for each of that group's child entities. By recursively multiplying them, we get coordinates in the top-most space, the true global world space.

//group.js
import { getIdentityMatrix, getRotationXMatrix, getRotationYMatrix, getRotationZMatrix, getScaleMatrix, getTranslationMatrix, multiplyMatrix } from "../utilities/vector.js";

export class Group {
    #children = [];
    #transforms = [];
    #worldMatrix = getIdentityMatrix();

    constructor(options) {
        this.#children = options.children;
    }

    get children() {
        return this.#children;
    }

    get modelMatrix() {
        return this.#transforms.reduce((mm, tm) => multiplyMatrix(tm, [4, 4], mm, [4, 4]), getIdentityMatrix());
    }

    get worldMatrix(){
        return this.#worldMatrix;
    }

    set worldMatrix(value){
        this.#worldMatrix = value;
        this.updateWorldMatrix();
    }

    updateWorldMatrix(){
        const worldMatrix = multiplyMatrix(this.modelMatrix, [4,4], this.#worldMatrix, [4,4]);
        for(const child of this.#children){
            child.worldMatrix = worldMatrix;
        }
    }

    translate({ x = 0, y = 0, z = 0 }) {
        this.#transforms.push(getTranslationMatrix(x, y, z));
        this.updateWorldMatrix();
        return this;
    }
    scale({ x = 1, y = 1, z = 1 }) {
        this.#transforms.push(getScaleMatrix(x, y, z));
        this.updateWorldMatrix();
        return this;
    }
    rotate({ x, y, z }) {
        //there's an order dependency here... something something quaternions...
        if (x) {
            this.#transforms.push(getRotationXMatrix(x));
        }
        if (y) {
            this.#transforms.push(getRotationYMatrix(y));
        }
        if (z) {
            this.#transforms.push(getRotationZMatrix(z));
        }
        this.updateWorldMatrix();
        return this;
    }
}

Here, whenever a group is updated with a transform it will recursively update its children, and if they are groups, their children. This is the magic that makes nesting work.
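To see the propagation concretely, here's a minimal, self-contained sketch under some assumptions: row-major flat arrays, translation-only transforms, and tiny stand-ins (`identity`, `translation`, `multiply4x4`, `MiniGroup`) for the engine's vector.js utilities and Group class:

```javascript
// Minimal 4x4 helpers (row-major, flat arrays) standing in for vector.js.
const identity = () => [1,0,0,0, 0,1,0,0, 0,0,1,0, 0,0,0,1];
const translation = (x, y, z) => [1,0,0,x, 0,1,0,y, 0,0,1,z, 0,0,0,1];
function multiply4x4(a, b) {
    const out = new Array(16).fill(0);
    for (let r = 0; r < 4; r++) {
        for (let c = 0; c < 4; c++) {
            for (let k = 0; k < 4; k++) {
                out[r * 4 + c] += a[r * 4 + k] * b[k * 4 + c];
            }
        }
    }
    return out;
}

// Stripped-down group: any transform or world matrix change pushes
// modelMatrix * parentWorldMatrix down to every child, recursively.
class MiniGroup {
    #children;
    #modelMatrix = identity();
    #worldMatrix = identity();
    constructor(children = []) {
        this.#children = children;
    }
    get worldMatrix() { return this.#worldMatrix; }
    set worldMatrix(value) {
        this.#worldMatrix = value;
        this.updateWorldMatrix();
    }
    translate(x, y, z) {
        this.#modelMatrix = multiply4x4(translation(x, y, z), this.#modelMatrix);
        this.updateWorldMatrix();
    }
    updateWorldMatrix() {
        const world = multiply4x4(this.#modelMatrix, this.#worldMatrix);
        for (const child of this.#children) {
            child.worldMatrix = world; //triggers recursion on nested groups
        }
    }
}

const mesh = { worldMatrix: identity() };
const inner = new MiniGroup([mesh]);
const outer = new MiniGroup([inner]);
outer.translate(1, 0, 0);
inner.translate(0, 2, 0);
// mesh.worldMatrix now carries both translations: [3] === 1 (x), [7] === 2 (y)
```

Translating either group, at any depth, lands in the leaf mesh's world matrix without the mesh knowing anything about its ancestors.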

The mesh-level multiplication will happen in the shader, so we'll pass this matrix in as well.

//gpu-engine.js
setMainSceneBindGroup(passEncoder, bindGroupLayouts, camera, mesh) {
    const scene = {
        viewMatrix: camera.getViewMatrix(),
        projectionMatrix: camera.getProjectionMatrix(),
        modelMatrix: getTranspose(mesh.modelMatrix, [4, 4]), //change to col major
        worldMatrix: mesh.worldMatrix,
        normalMatrix: getTranspose(
            getInverse(
                trimMatrix(
                    multiplyMatrix(mesh.worldMatrix, [4, 4], mesh.modelMatrix, [4, 4]),
                    [4, 4],
                    [3, 3]
                ),
                [3, 3]
            ),
            [3, 3]),
        cameraPosition: camera.getPosition()
    };
    const sceneData = packStruct(scene, [
        ["viewMatrix", "mat4x4f32"],
        ["projectionMatrix", "mat4x4f32"],
        ["modelMatrix", "mat4x4f32"],
        ["worldMatrix", "mat4x4f32"],
        ["normalMatrix", "mat3x3f32"],
        ["cameraPosition", "vec3f32"]
    ]);
    const sceneBuffer = this.#device.createBuffer({
        size: sceneData.byteLength,
        usage: GPUBufferUsage.UNIFORM | GPUBufferUsage.COPY_DST,
        label: "main-scene-buffer"
    });
    this.#device.queue.writeBuffer(sceneBuffer, 0, sceneData);
    const sceneBindGroup = this.#device.createBindGroup({
        label: "main-scene-bind-group",
        layout: bindGroupLayouts.get("scene"),
        entries: [
            {
                binding: 0,
                resource: {
                    buffer: sceneBuffer,
                    offset: 0,
                    size: sceneData.byteLength
                }
            }
        ]
    });
    passEncoder.setBindGroup(0, sceneBindGroup);
}

The main changes here are adding the world matrix to the struct we pass in, and multiplying the world matrix with the model matrix before doing the inverse-transpose so the normal matrix accounts for parent transforms.
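For reference, that normal matrix math in plain JavaScript: take the upper 3x3 of the combined world * model matrix, invert it, transpose it. This is a sketch with self-contained 3x3 helpers (`inverse3`, `transpose3`, `normalMatrix` are illustrative names, not the engine's vector.js functions):

```javascript
// Normal matrix: inverse-transpose of a 3x3 (row-major, flat array).
function transpose3(m) {
    return [m[0], m[3], m[6], m[1], m[4], m[7], m[2], m[5], m[8]];
}
function inverse3(m) {
    const [a, b, c, d, e, f, g, h, i] = m;
    const det = a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g);
    // adjugate / determinant
    return [
        (e * i - f * h) / det, (c * h - b * i) / det, (b * f - c * e) / det,
        (f * g - d * i) / det, (a * i - c * g) / det, (c * d - a * f) / det,
        (d * h - e * g) / det, (b * g - a * h) / det, (a * e - b * d) / det
    ];
}
const normalMatrix = (m3) => transpose3(inverse3(m3));

// A non-uniform scale shows why rotating normals by the plain model
// matrix is wrong: scaling geometry by (2, 1, 1) must scale normal x by 1/2.
const nm = normalMatrix([2, 0, 0, 0, 1, 0, 0, 0, 1]);
// nm[0] === 0.5, nm[4] === 1, nm[8] === 1
```

For pure rotations the inverse-transpose is the rotation itself, which is why the bug only shows up once scaling or shearing enters the hierarchy.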

On the shader side it's just one more multiplication:

@vertex
fn vertex_main(@location(0) position: vec3<f32>, @location(1) uv: vec2<f32>, @location(2) normal: vec3<f32>) -> VertexOut
{
    var output : VertexOut;
    output.frag_position =  scene.projection_matrix * scene.view_matrix * scene.world_matrix * scene.model_matrix * vec4<f32>(position, 1.0);
    output.world_position = scene.world_matrix * scene.model_matrix * vec4<f32>(position, 1.0);
    output.uv = uv;
    output.normal = scene.normal_matrix * normal;

    return output;
}

Now, we could just bake this transform data right into the model matrix before passing it in. I don't have a great answer for why I'm not doing that, other than to spell things out more explicitly, but I expect at some point this will collapse back into a single matrix, at least as far as the shader is concerned.

As a test I nested two groups doing different rotations:

<wc-geo>
    <geo-camera key="main" position="0, 0, -2"></geo-camera>
    <geo-texture key="marble" src="./img/marble-white/marble-white-base.jpg"></geo-texture>
    <geo-texture key="marble-roughness" src="./img/marble-white/marble-white-roughness.jpg"></geo-texture>
    <geo-texture key="red-fabric" src="./img/red-fabric/red-fabric-base.jpg"></geo-texture>
    <geo-texture key="red-fabric-roughness" src="./img/red-fabric/red-fabric-roughness.jpg"></geo-texture>
    <geo-texture key="gold" color="0, 0, 0, 1"></geo-texture>
    <geo-material key="marble" roughness-map="marble-roughness" albedo-map="marble"></geo-material>
    <geo-material key="red-fabric" roughness-map="red-fabric-roughness" albedo-map="red-fabric"></geo-material>
    <geo-material key="gold" roughness="0.2" metalness="1" base-reflectance="1.059, 0.773, 0.307" albedo-map="gold"></geo-material>
    <geo-group key="teapot-rug-0" pipeline="main" rotate="0, 1.5707963267948966, 0">
    <geo-group key="teapot-rug-1" rotate="1.5707963267948966, 0, 0">
        <geo-mesh 
            key="teapot" 
            normalize
            bake-transforms 
            reverse-winding 
            src="./objs/teapot.obj" 
            resize-uvs="2" 
            material="gold" 
            attributes="positions, normals, uvs">
        </geo-mesh>
        <geo-surface-grid 
            key="rug" 
            row-count="2" 
            col-count="2" 
            translate="0, -0.25, 0" 
            bake-transforms 
            material="red-fabric" 
            attributes="positions, normals, uvs">
        </geo-surface-grid>
    </geo-group>
    </geo-group>

    <geo-light key="light1" type="directional" color="1, 1, 1, 1" direction="-1, 0, 0" casts-shadow></geo-light>
</wc-geo>

Which yields this image:

The teapot scene after the rotations; the teapot is flipped vertically on the right side

Debugging

One way I tried to test the result was to make a cube and shine a light on it from one side, which would verify that the normals are correct after rotation. But I had a problem. When shining light from one side I would get an image like this:

A cube on a blue background, 2 sides are lit despite the light shining on exactly one side

Two sides were lit. This was puzzling and took a while to figure out, since if I just drew the normal vectors as colors everything looked correct. The problem is floating-point precision error: even a very slight near-zero perturbation that tips a normal toward the light will make the material look noticeably lit. To fix this I made a function that rounds small values toward 0.

fn round_small_mag_3(v: vec3<f32>) -> vec3<f32> {
  return select(v, vec3<f32>(0.0), abs(v) < vec3<f32>(1e-6));
}

This might not be necessary for "real" scenes, since it's unlikely planes will be exactly orthogonal to the light in that way, and a viewer probably wouldn't notice in a complex scene. For a technical rendering like this, however, it definitely looks off, so we apply the function to the normal to clean up those almost-zero components. vec3<f32>(1e-6) is the per-element threshold; you can tune the value depending on how bad the precision issue is.
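The same clamp is easy to poke at in JavaScript outside the shader. A sketch (`roundSmallMag3` is a hypothetical name; the default epsilon of 1e-6 matches the WGSL above):

```javascript
// Per-component twin of round_small_mag_3: snap magnitudes below
// epsilon to exactly zero, leave everything else untouched.
const roundSmallMag3 = (v, epsilon = 1e-6) =>
    v.map(x => Math.abs(x) < epsilon ? 0 : x);

roundSmallMag3([1e-8, -1e-7, 0.5]); // → [0, 0, 0.5]
```

Like the WGSL `select`, the comparison is component-wise, so only the nearly-zero components get snapped while real normal directions pass through unchanged.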

Conclusion

It's nice working on a project nobody else uses because you can always go in and fix all your questionable naming conventions and clean up bad abstractions. While this didn't add much functionality, nor was it especially complicated, having a clean codebase really helps.

Of course, the code here is a little iffy because it was added ad hoc to see which conventions stick, and it will need some cleanup.

Code

https://github.com/ndesmic/geo/pull/4/files
