ndesmic

Posted on Dec 28, 2021

Raytracing 3D Engine from Scratch Part 1: Simple Raycasting

#raytracing #vanillajs #webcomponents

As my WebGL 3D engine gets more mature I thought it would be a good idea to dip a little into raytracing. Ray tracing has gotten a bit of a promotional bump due to the hardware raytracing units built into modern GPUs. One of the nice thing about raytracing is that it is much more accurate at simulating light but on top of that it's actually much simpler than some of the concepts going on in raster graphics using WebGL. Its simply not used as much because it's very expensive. Each pixel on the screen is a ray that can have technically infinite bounces. Despite that we can still make ray tracing engines that have real-time performance, even if they need to sacrifice resolution and frame-rate. This is my attempt (though I don't expect expect it to get to "real-time" performance).

Also, we'll borrow code from the WebGL engine because many parts are still the same. Please reference that series: https://dev.to/ndesmic/series/12426

Boilerplate

As will my other projects we'll contain the whole engine inside a web component. Let's start with a base:

export class WcGeoRt extends HTMLElement {
    #context;
    #width = 1280;
    #height = 720;

    constructor(){
        super();
        this.bind(this);
    }
    bind(element){
        element.attachEvents = element.attachEvents.bind(element);
        element.cacheDom = element.cacheDom.bind(element);
        element.createShadowDom = element.createShadowDom.bind(element);
    }
    async connectedCallback() {
        this.createShadowDom();
        this.cacheDom();
        this.attachEvents();

        this.#context = this.dom.canvas.getContext("2d");

        this.render();
    }
    createShadowDom() {
        this.shadow = this.attachShadow({ mode: "open" });
        this.shadow.innerHTML = `
                <style>
                    :host { display: block; }
                </style>
                <canvas width="${this.#width}" height="${this.#height}"></canvas>
            `;
    }
    cacheDom() {
        this.dom = {
            canvas: this.shadow.querySelector("canvas")
        };
    }
    render(){
        this.#context.fillStyle = "#ff0000";
        this.#context.fillRect(0, 0, this.#width, this.#height);
    }
    attachEvents() {
    }

    //Attrs
    attributeChangedCallback(name, oldValue, newValue) {
        if (newValue !== oldValue) {
            this[name] = newValue;
        }
    }
    set height(value) {
        this.#height = value;
        if (this.dom) {
            this.dom.canvas.height = value;
        }
    }
    set width(value) {
        this.#width = value;
        if (this.dom) {
            this.dom.canvas.height = value;
        }
    }
}

customElements.define("wc-geo-rt", WcGeoRt);

Pretty simple. We just have a canvas and it draws a red box so we can make sure it works.

Camera

In the WebGL engine we introduced some classes for common scene entities. We'll port over some of that. First up is the camera:

import { UP, subtractVector, crossVector, normalizeVector } from "../lib/vector.js";
import { cartesianToLatLng, latLngToCartesian, clamp } from "../lib/math-helpers.js";

export class Camera {
    #position = [0, 0, -1];
    #target = [0, 0, 0];
    #screenWidth;
    #screenHeight;
    #near = 0.01;
    #far = 5;

    constructor(camera) {
        this.#position = camera.position;
        this.#screenWidth = camera.screenWidth;
        this.#screenHeight = camera.screenHeight;
        this.#near = camera.near ?? this.#near;
        this.#far = camera.far ?? this.#far;
    }

    moveTo(x, y, z) {
        this.#position = [x, y, z];
    }

    moveBy({ x = 0, y = 0, z = 0 }) {
        this.#position[0] += x;
        this.#position[1] += y;
        this.#position[2] += z;
    }

    panBy({ x = 0, y = 0, z = 0 }) {
        this.#position[0] += x;
        this.#target[0] += x;
        this.#position[1] += y;
        this.#target[1] += y;
        this.#position[2] += z;
        this.#target[2] += z;
    }

    orbitBy({ lat = 0, long = 0, radius = 0 }) {
        const [r, currentLat, currentLng] = this.getOrbit();
        const newLat = clamp(currentLat + lat, -Math.PI / 2, Math.PI / 2);
        const newRadius = Math.max(0.1, r + radius);
        this.#position = latLngToCartesian([newRadius, newLat, currentLng - long]);
    }

    zoomBy(value) {
        const [r, currentLat, currentLng] = this.getOrbit();
        const newRadius = Math.max(0.1, r / value);
        this.#position = latLngToCartesian([newRadius, currentLat, currentLng]);
    }

    lookAt(x, y, z) {
        this.#target = [x, y, z];
    }

    getForwardDirection(){
        return normalizeVector(subtractVector(this.#target, this.#position));
    }

    getRightDirection(){
        return crossVector(UP, this.getForwardDirection());
    }

    getUpDirection(){
        return crossVector(this.getForwardDirection(), this.getRightDirection());
    }

    getAspectRatio(){
        return this.#screenWidth / this.#screenHeight;
    }

    getOrbit() {
        const targetDelta = subtractVector(this.#position, this.#target);
        return cartesianToLatLng(targetDelta);
    }

    getPosition() {
        return this.#position;
    }

    setPosition(position) {
        this.#position = position;
    }
}

I trimmed a lot out. I won't actually deal with camera movement right now as it's simply not real-time enough (but I'd like to revisit that sometime) so you can remove the methods that move the camera if you'd like. It's interesting because all of that perspective stuff just falls away. I've also added some getter methods for getting the camera's axes orientation.

Math libraries

And we'll need vector.js and math-helpers.js to provide basic vector arithmetic and other basic stuff:

export function transpose(matrix) {
    return [
        [matrix[0][0], matrix[1][0], matrix[2][0], matrix[3][0]],
        [matrix[0][1], matrix[1][1], matrix[2][1], matrix[3][1]],
        [matrix[0][2], matrix[1][2], matrix[2][2], matrix[3][2]],
        [matrix[0][3], matrix[1][3], matrix[2][3], matrix[3][3]]
    ];
}

export function getRotationXMatrix(theta) {
    return [
        [1, 0, 0, 0],
        [0, Math.cos(theta), -Math.sin(theta), 0],
        [0, Math.sin(theta), Math.cos(theta), 0],
        [0, 0, 0, 1]
    ];
}

export function getRotationYMatrix(theta) {
    return [
        [Math.cos(theta), 0, Math.sin(theta), 0],
        [0, 1, 0, 0],
        [-Math.sin(theta), 0, Math.cos(theta), 0],
        [0, 0, 0, 1]
    ];
}

export function getRotationZMatrix(theta) {
    return [
        [Math.cos(theta), -Math.sin(theta), 0, 0],
        [Math.sin(theta), Math.cos(theta), 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]
    ];
}

export function getTranslationMatrix(x, y, z) {
    return [
        [1, 0, 0, x],
        [0, 1, 0, y],
        [0, 0, 1, z],
        [0, 0, 0, 1]
    ];
}

export function getScaleMatrix(x, y, z){
    return [
        [x, 0, 0, 0],
        [0, y, 0, 0],
        [0, 0, z, 0],
        [0, 0, 0, 1]
    ];
}

export function multiplyMatrix(a, b) {
    const matrix = [
        new Array(4),
        new Array(4),
        new Array(4),
        new Array(4)
    ];
    for (let c = 0; c < 4; c++) {
        for (let r = 0; r < 4; r++) {
            matrix[r][c] = a[r][0] * b[0][c] + a[r][1] * b[1][c] + a[r][2] * b[2][c] + a[r][3] * b[3][c];
        }
    }

    return matrix;
}

export function getIdentityMatrix() {
    return [
        [1, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 1, 0],
        [0, 0, 0, 1]
    ];
}

export function multiplyMatrixVector(vector, matrix) {
    //normalize 3 vectors
    if (vector.length === 3) {
        vector.push(1);
    }

    return [
        vector[0] * matrix[0][0] + vector[1] * matrix[0][1] + vector[2] * matrix[0][2] + vector[3] * matrix[0][3],
        vector[0] * matrix[1][0] + vector[1] * matrix[1][1] + vector[2] * matrix[1][2] + vector[3] * matrix[1][3],
        vector[0] * matrix[2][0] + vector[1] * matrix[2][1] + vector[2] * matrix[2][2] + vector[3] * matrix[2][3],
        vector[0] * matrix[3][0] + vector[1] * matrix[3][1] + vector[2] * matrix[3][2] + vector[3] * matrix[3][3]
    ];
}

export function getVectorMagnitude(vec) {
    let sum = 0;
    for(const el of vec){
        sum += el ** 2;
    }
    return Math.sqrt(sum);
}

export function addVector(a, b) {
    return [
        a[0] + b[0],
        a[1] + b[1],
        a[2] + b[2]
    ];
}

export function subtractVector(a, b) {
    return [
        a[0] - b[0],
        a[1] - b[1],
        a[2] - b[2]
    ];
}

//not a great name
export function multiplyVector(vec, s) {
    return [
        vec[0] * s,
        vec[1] * s,
        vec[2] * s
    ];
}

export function divideVector(vec, s) {
    return [
        vec[0] / s,
        vec[1] / s,
        vec[2] / s
    ];
}

export function normalizeVector(vec) {
    return divideVector(vec, getVectorMagnitude(vec));
}

export function crossVector(a, b) {
    return [
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0]
    ];
}

export function dotVector(a, b) {
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2];
}

export function invertVector(vec){
    return vec.map(x => -x);
}

export const UP = [0, 1, 0];
export const FORWARD = [0, 0, 1];
export const RIGHT = [1, 0, 0];

And our math helpers.

export const TWO_PI = Math.PI * 2
export const QUARTER_TURN = Math.PI / 2;

export function normalizeAngle(angle) {
    if (angle < 0) {
        return TWO_PI - (Math.abs(angle) % TWO_PI);
    }
    return angle % TWO_PI;
}

export function radToDegrees(rad) {
    return rad * 180 / Math.PI;
}

export function cartesianToLatLng([x, y, z]) {
    const radius = Math.sqrt(x ** 2 + y ** 2 + z ** 2);
    return [
        radius,
        (Math.PI / 2) - Math.acos(y / radius),
        normalizeAngle(Math.atan2(x, -z)),
    ];
}

export function latLngToCartesian([radius, lat, lng]) {
    lng = -lng + Math.PI / 2;
    return [
        radius * Math.cos(lat) * Math.cos(lng),
        radius * Math.sin(lat),
        radius * -Math.cos(lat) * Math.sin(lng),
    ];
}

export function clamp(value, low, high) {
    low = low !== undefined ? low : Number.MIN_SAFE_INTEGER;
    high = high !== undefined ? high : Number.MAX_SAFE_INTEGER;
    if (value < low) {
        value = low;
    }
    if (value > high) {
        value = high;
    }
    return value;
}

If you followed along with the previous series these are the same, I've just removed some functions we aren't using to simplify.

Shooting rays

Rays are shot from the center of the camera in different directions. One of the nice things to modeling pixels this way is that we don't need to directly worry about perspective transforms. The way it works is that we will start from the top row and go column by column, row by row casting a ray out. For this most simple version, we'll just test if anything was hit and draw that pixel in red.

So what direction do we shoot the rays? There's a few way to do this but a simple perspective correct way is to visualize the view window as a box in space. We actually did this in WebGL when we defined "clip space". Typically the left edge is [-1,0,0], right is [1,0,0], top is [0,1,0], bottom is [0,-1,0] with the center at [0,0,0] so we'll just pick that. If you wanted a different base scale you can change those.

Once the space is defined we can find the amount of change in view coordinate per amount in change in pixels. As it turns out finding the half-width and half-height values are more useful because we can use them elsewhere.

const halfVolumeHeight = 1;
const halfPixelHeight = this.#height / 2;
const pixelHeightRatio = halfVolumeHeight / halfPixelHeight;
const halfVolumeWidth = this.cameras.default.getAspectRatio();
const halfPixelWidth = this.#width / 2;
const pixelWidthRatio = halfVolumeWidth / halfPixelWidth;

The half height is 1 because the view port's top is at [0,1,0] and the center is at [0,0,0] so 1 unit of Y spans half the screen. Width is slightly different. Depending on the dimensions of the rendering surface you might need something wider than tall (typically on monitors but if this were for phones you might want to flip it). I'm just going to assume you're using a monitor since web development on a phone is hard. Anyway, we can get the aspect ratio of the canvas and multiply the half width (also 1 unit) by it to get a correct aspect ratio image (if you chose 1 and the canvas wasn't square you'll get squished images). Finally we divide that by half the pixel width to get how much a coordinate changes per pixel.

Next we iterate over all the pixels. If you've never done direct pixel manipulation on a canvas, you need to getImageData and then iterate across the rows and columns. The image data is a flat array of pixels represented by 4 bytes for RGBA, so we need to add an extra *4 to move from pixel to pixel. Finally you can put the data back.

const pixelData = this.#context.getImageData(0, 0, this.#width, this.#height);

for (let row = 0; row < this.#height; row++) {
    for (let col = 0; col < this.#width; col++) {
        const index = (row * this.#width * 4) + (col * 4);
                //const color = getPixelFor(row, col);
        pixelData.data[index + 0] = color[0];
        pixelData.data[index + 1] = color[1];
        pixelData.data[index + 2] = color[2];
        pixelData.data[index + 3] = 255;
    }
}

this.#context.putImageData(pixelData, 0, 0);

This is the basic form with getPixelFor being an implementation detail.

Next let's talk about rays. Rays are models as structs of data an origin position and a direction. So to cast a ray, for each pixel it'll look like this:

const xDelta = multiplyVector(this.cameras.default.getRightDirection(), (col - halfPixelWidth) * pixelWidthRatio);
const yDelta = multiplyVector(multiplyVector(this.cameras.default.getUpDirection(), -1), (row - halfPixelHeight) * pixelHeightRatio);
const ray = {
    origin: this.cameras.default.getPosition(),
    direction: normalizeVector(addVector(addVector(this.cameras.default.getForwardDirection(), xDelta), yDelta))
};

Using the pixel ratio about we can figure out the x and y the ray is pointing toward. We assume the center shoots straight forward, the very left will shoot to the left of the view box [-1,0,0] and the right will shoot toward the right of the view box [1,0,0]. Same for Y but obviously in the Y direction. This is all relative to the camera's coordinate space. So first we need to figure out which direction is forward for the camera. This is the direction it's looking. We're reusing the camera from the WebGL project that faces a target position and by default it's looking at the origin. The Camera class has a method to get the forward direction. We can find the right direction by crossing the forward vector with UP ([0,1,0]). Since these fall in a vertical plane, the cross product gets us a vector perpendicular to both, which will be the right direction (make sure the cross order is correct otherwise you'll get the inverse "left"). Finally we can get the camera's up direction by crossing forward and right. This "up" is slightly different than true UP if the camera is tilted but the assumption here is that the camera is always right-side-up and level.

Okay, so now that we have directions in camera space we just need to multiply them by the current column or row, times the pixel to coordinate ratio. Since the center is at 0, we need to subtract one half width/height to put us properly negative. yDelta also needs to be multiplied by -1 to flip it, since pixel coordinates are top-down and not bottom-up like our view units. We can add the X and Y deltas to a forward facing ray to get the direction which we'll normalize because directions should really be normalized.

Finding a Collision

Now that we have the main screen drawing loop, we need to use those rays. What we want to find is where it intersects with an object. The canonical "Hello World" for raytracing is a bunch of marbles. This I think is for a few reasons:

1) Spheres are hard in raster graphics but easy in raytracing
2) Spheres have normals that point in all directions so it gives interesting effects when you're bouncing rays
3) Sphere intersections are easy enough to compute

So we'll also be using spheres. Similar to the other project we'll add a method to initialize our "meshes."

createMeshes(){
    this.meshes = {
        sphere: {
            position: [0,0,0],
            radius: 1
        }
    }
}

This drops a sphere of radius 1 at the origin. Next comes probably the hardest part of the whole thing, figuring out if we have hit the sphere.

sphereIntersection(ray, sphere) {
    const a = dotVector(ray.direction, ray.direction);
    const cameraToCenter = subtractVector(ray.origin, sphere.position);
    const b = 2 * dotVector(ray.direction, cameraToCenter);
    const c = dotVector(cameraToCenter, cameraToCenter) - sphere.radius ** 2;
    const discriminant = (b ** 2) - (4 * a * c);
    if(discriminant < 0) return undefined; //no solution, no hit
    const s1 = (-b + Math.sqrt(discriminant)) / 2*a;
    const s2 = (-b - Math.sqrt(discriminant)) / 2*a;
    if(s1 < 0 || s2 < 0) return undefined; //either facing away or origin is inside sphere, no hit
    return Math.min(s1, s2);
}

There are two ways to do this, a geometric way and an analytical way. The geometric way confuses me so I won't talk about it. This code uses the analytical way. The basic idea is that the equation for a sphere is x**2 + y**2 + z**2 - radius**2 = 0 and the equation for a ray is origin + (direction * t). We can use the ray equation to parameterize x,y,z by t. Then we just need to solve for t. After some substitution you can get a quadratic function which we can solve using the quadratic equation. The values in the code above directly correspond to the values you need to plug in.

Since it's quadratic we have 3 cases:

1) There is no solution, the discriminant is < 1, no hit
2) There is one solution, the discriminant is 0, hit
3) There are two solutions, the discriminant is > 0, we need to find the closest hit

For 3 there's also an extra case we can take care of depending on how you want things to work. If the solutions have opposite signs then it means that the origin is inside the sphere. In this case if you want the sphere to be hollow you can use the positive result, otherwise you might want to just ignore it (similar to back-face culling in a 3d game). Negative solutions mean that the intersection was actually behind the camera and we don't draw those. In the 2 hit case, we find the closest hit as the other hit would be the back side of the sphere.

Ok, hard part is over. Now to bring it all together. For each ray we need to test against each sphere in the scene and find the closest one as spheres that intersect further are visibly behind other spheres and we shouldn't see them:

intersectObjects(ray) {
    let closest = { distance: Infinity, mesh: null };
    for (let mesh of Object.values(this.meshes)) {
        const distance = this.sphereIntersection(ray, mesh);
        if (distance != undefined && distance < closest.distance) {
            closest = { distance, mesh };
        }
    }
    return closest;
}

We'll also keep track of which mesh it came from, that might be useful later.

Finally we do our coloring. If we get any hit: draw red, otherwise white.

raytrace(ray){
    const intersection = this.intersectObjects(ray);
    if (intersection.distance === Infinity) {
        return [255, 255, 255];
    }
    return [255, 0, 0];
}

raytrace is called once for every pixel in the loop and returns the pixel's color.

Results

This is super simple, so it's not going to be that impressive but:

We get a red circle, kinda like the Japanese flag. To prove to ourselves this is working we can move the sphere around. Like to the left and right:

Or top and bottom:

This seems reasonable. We could also move the camera around but because of how it works (spheres look the same at all angles), it's not going to give a good idea of what we're looking at unless it's off center.

What's really cool is how little code this took compared to WebGL. There's fewer view transforms, no buffers, no constructing geometery or complex shaders and how that it actually does a better job of rendering a sphere. On the flip side it's much, much slower, not just because of JS but because of the amount of computation.

Ok. That's a good stopping point for now. You can check out the code here: https://github.com/ndesmic/geort/tree/v1 or the code pen up top.