Recently I found myself in need of low-res sprites of some low-poly 3D models, rendered from different angles, for an upcoming project. Something like this:
Possible ways to obtain them included:
- learn a little bit of Blender
- make it in WebGL
- write my own renderer
My short adventure with Blender had already traumatized me, and WebGL has a somewhat confusing API. And what better opportunity to brush up on linear algebra than doing some 3D transformations in my own 3D engine?
Here is a link to the GitHub repo.
How 3D objects are represented
Super short intro to 3D graphics: all 3D objects are represented by triangles in 3D space. Each triangle is made up of 3 points, and each point holds 3 values: x, y and z.
Our goal is to transform arrays of those points, colors, etc. into a nice .png image that we can use as a sprite.
Most 3D graphics you see use perspective projection, where objects further from the camera appear smaller. We will focus only on orthographic projection, where that's not the case: no matter how far an object is from the camera, its size doesn't change.
More reading on that.
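As a one-line sketch of what orthographic projection means in code (the example points are made up, and the variable names are mine, not the renderer's):

```python
import numpy as np

# Orthographic projection in a nutshell: after rotating into camera space,
# a point's screen position is just its (x, y) -- z is dropped entirely,
# so distance from the camera never changes apparent size or position.
points = np.array([
    [1.0, 2.0, 5.0],    # close to the camera
    [1.0, 2.0, 500.0],  # far from the camera
])
projected = points[:, :2]
print(projected)  # both rows project to the same screen position (1, 2)
```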
So what we will build here is:
✔️ a functioning 3D renderer with adjustable camera direction
✔️ Z-buffering to accurately represent overlapping triangles
✔️ basic directional shading
What we won't cover (at least for now):
❌ shadows cast by objects onto other objects
❌ texturing
❌ perspective projection
Also, it won't run on the GPU. It won't be super efficient. It won't be enough to render complex scenes (or possibly even simple ones) in real time. But it will be fun and satisfying; at least it was for me.
The first image
Install the dependencies: we will use numpy for matrix operations and Pillow to turn arrays into images.
pip install Pillow numpy
Create an `Object3D` class in a renderer.py file for storing individual models.
renderer.py
```python
import numpy as np
from PIL import Image


class Object3D:
    def __init__(self, points, triangles, colors):
        self.points = points
        self.triangles = triangles
        self.colors = colors
```
- `points`: array containing the coordinates of the points in the 3D object
- `triangles`: array containing the indices of the points making up each triangle
- `colors`: array of RGB values for each triangle (e.g. white = `[255, 255, 255]`)
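As a quick illustration of how these three arrays fit together (the single-triangle data here is made up for the example):

```python
import numpy as np


class Object3D:
    def __init__(self, points, triangles, colors):
        self.points = points
        self.triangles = triangles
        self.colors = colors


# one white triangle lying in the z = 0 plane
points = np.array([[0, 0, 0], [1, 0, 0], [0, 1, 0]])
triangles = np.array([[0, 1, 2]])     # indices into `points`
colors = np.array([[255, 255, 255]])  # one RGB row per triangle

obj = Object3D(points, triangles, colors)
# numpy fancy indexing resolves index triples into actual coordinates:
print(obj.points[obj.triangles].shape)  # (1, 3, 3): 1 triangle, 3 vertices, 3 coords
```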
And a `Renderer` class.
renderer.py
```python
class Renderer:
    def __init__(
        self,
        objects,
        viewport,
        resolution,
        camera_angle=0.0,
        camera_pitch=0.0,
    ):
        self._objects = objects
        self._viewport = viewport
        self._resolution = resolution
        self._camera_angle = camera_angle
        self._camera_pitch = camera_pitch

        resolution_x, resolution_y = self._resolution
        self._screen = np.ones((resolution_y, resolution_x, 3), "uint8") * 120

        x_min, y_min, x_max, y_max = self._viewport
        self._range_x = np.linspace(x_min, x_max, resolution_x)
        self._range_y = np.linspace(y_max, y_min, resolution_y)

    def render_scene(self, output_path):
        for object_3d in self._objects:
            self._render_object(object_3d)
        im = Image.fromarray(self._screen)
        im.save(output_path)
```
- `_objects`: list of 3D objects in the scene
- `_viewport`: visible region in the form `(min_x, min_y, max_x, max_y)`
- `_resolution`: output image resolution; `(80, 48)` means the image will be 80px wide and 48px high
- `_camera_angle` and `_camera_pitch`: define the camera's position and direction as on the diagram below
- `_screen`: array representing the RGB bitmap of the output image
- `_range_x`, `_range_y`: arrays holding the x and y world coordinates of each column/row of pixels
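To get a feel for `_range_x` and `_range_y`, here is what they would contain for a hypothetical tiny 5×5 render of the (-2, -2, 2, 2) viewport:

```python
import numpy as np

# Each pixel column/row maps to one world coordinate. Note that range_y
# runs from y_max DOWN to y_min, because row 0 of the bitmap is the top
# of the image while larger world y values are "up".
x_min, y_min, x_max, y_max = (-2, -2, 2, 2)
range_x = np.linspace(x_min, x_max, 5)
range_y = np.linspace(y_max, y_min, 5)
print(range_x)  # [-2. -1.  0.  1.  2.]
print(range_y)  # [ 2.  1.  0. -1. -2.]
```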
So far nothing too fancy. Now we will implement `_render_object`.
renderer.py
```python
class Renderer:
    ...

    def _render_object(self, object_3d):
        projected_points = self._get_object_projected_points(object_3d)
        projected_triangles = projected_points[object_3d.triangles]
        for triangle_points, color in zip(projected_triangles, object_3d.colors):
            self._render_triangle(triangle_points, color)
```
First we need to convert world coordinates to camera coordinates using `_get_object_projected_points`.
renderer.py
```python
class Renderer:
    ...

    def _get_object_projected_points(self, object_3d):
        return (
            object_3d.points
            @ _get_z_rotation_matrix(self._camera_angle)
            @ _get_x_rotation_matrix(self._camera_pitch)
        )


def _get_x_rotation_matrix(angle):
    return np.array(
        [
            [1, 0, 0],
            [0, np.cos(angle), -np.sin(angle)],
            [0, np.sin(angle), np.cos(angle)],
        ]
    )


def _get_z_rotation_matrix(angle):
    return np.array(
        [
            [np.cos(angle), -np.sin(angle), 0],
            [np.sin(angle), np.cos(angle), 0],
            [0, 0, 1],
        ]
    )
```
We do so by first rotating the object's coordinates by `_camera_angle` around the Z axis, then by `_camera_pitch` around the X axis. You can find the rotation matrices for each axis on Wikipedia.
Now let's write a method for rendering an individual triangle.
renderer.py
```python
class Renderer:
    ...

    def _render_triangle(self, points, color):
        bounding_box = _get_bounding_box(points)
        for screen_x, scene_x in enumerate(self._range_x):
            if scene_x < bounding_box[0, 0] or scene_x > bounding_box[1, 0]:
                continue
            for screen_y, scene_y in enumerate(self._range_y):
                if scene_y < bounding_box[0, 1] or scene_y > bounding_box[1, 1]:
                    continue
                if not _point_in_triangle(np.array([scene_x, scene_y, 0]), points):
                    continue
                self._screen[screen_y, screen_x, :] = color


def _get_bounding_box(points):
    return np.array(
        [
            [np.min(points[:, 0]), np.min(points[:, 1])],
            [np.max(points[:, 0]), np.max(points[:, 1])],
        ]
    )
```
- `screen_x`: integer in range `[0, resolution[0])`
- `scene_x`: float in range `[viewport[0], viewport[2]]`
We iterate over each pixel and skip it if the current point doesn't lie within the triangle's bounding box. Otherwise we check whether the point is inside the triangle, and if so, we color the current pixel.
Checking whether a point is inside a triangle is actually not a trivial problem. I copy-pasted the solution from this Stack Overflow post.
renderer.py
```python
def _sign(p1, p2, p3):
    return (p1[0] - p3[0]) * (p2[1] - p3[1]) - (p2[0] - p3[0]) * (p1[1] - p3[1])


def _point_in_triangle(p, triangle):
    a, b, c = triangle
    d1 = _sign(p, a, b)
    d2 = _sign(p, b, c)
    d3 = _sign(p, c, a)
    has_neg = (d1 < 0) or (d2 < 0) or (d3 < 0)
    has_pos = (d1 > 0) or (d2 > 0) or (d3 > 0)
    return not (has_neg and has_pos)
```
Without going into too much detail: we check whether the point lies on the same side of each of the three half-planes defined by the triangle's edges. If it does, the point is inside the triangle. This method works for both clockwise and counterclockwise triangles.
Let's now create a simple scene and render it in main.py
main.py
```python
import numpy as np

from renderer import Renderer, Object3D

WHITE = [255, 255, 255]
YELLOW = [200, 200, 0]
GREEN = [0, 200, 0]
BLUE = [0, 0, 200]
RED = [200, 0, 0]
ORANGE = [200, 100, 0]

cube_points = np.array(
    [
        [-1, -1, -1],
        [-1, 1, -1],
        [1, 1, -1],
        [1, -1, -1],
        [-1, -1, 1],
        [-1, 1, 1],
        [1, 1, 1],
        [1, -1, 1],
    ]
)

cube_triangles = np.array(
    [
        # bottom
        [0, 2, 1],
        [0, 3, 2],
        # top
        [4, 5, 6],
        [4, 6, 7],
        # back
        [1, 2, 6],
        [1, 6, 5],
        # front
        [0, 4, 7],
        [0, 7, 3],
        # left
        [0, 1, 5],
        [0, 5, 4],
        # right
        [3, 7, 6],
        [3, 6, 2],
    ]
)

cube_colors = np.array(
    [
        YELLOW,
        YELLOW,
        WHITE,
        WHITE,
        ORANGE,
        ORANGE,
        RED,
        RED,
        GREEN,
        GREEN,
        BLUE,
        BLUE,
    ]
)

cube = Object3D(cube_points, cube_triangles, cube_colors)

renderer = Renderer(
    [cube],
    (-2, -2, 2, 2),
    (300, 300),
    camera_angle=np.deg2rad(45),
    camera_pitch=np.deg2rad(60),
)
renderer.render_scene("scene.png")
```
Let's run the script.
It's not looking like a cube yet because, as we can see, the faces have been drawn in a somewhat random order and overlap each other in unpredictable ways.
We can also see that the orange face (back) and the green face (left) are drawn, even though they shouldn't be visible from the camera's direction. Triangles in 3D engines have only a single visible side, to avoid unnecessary computation. We will address this first.
Removing triangles that do not face the camera
Let's add another property to Renderer
renderer.py
```python
class Renderer:
    def __init__(
        ...
    ):
        ...
        self._camera_dir = np.array([0, 0, -1])
```
And modify `_render_object`.
renderer.py
```python
class Renderer:
    ...

    def _render_object(self, object_3d):
        projected_points = self._get_object_projected_points(object_3d)
        projected_triangles = projected_points[object_3d.triangles]
        visible_mask = self._get_screen_dot_products(projected_triangles) < 0
        if not any(visible_mask):
            return
        for triangle_points, color in zip(
            projected_triangles[visible_mask], object_3d.colors[visible_mask]
        ):
            self._render_triangle(triangle_points, color)

    def _get_screen_dot_products(self, triangles):
        normals = np.cross(
            (triangles[:, 0] - triangles[:, 1]),
            (triangles[:, 2] - triangles[:, 1]),
        )
        return (self._camera_dir * normals).sum(axis=1)
```
How do we know if a triangle is visible?
We first need to find the triangle's surface normal, which is the cross product of two of its edge vectors.
The dot product is a measure of how closely two vectors align, in terms of the directions they point in. So let's calculate the dot product of the camera direction and each triangle's surface normal. A value less than 0 means the triangle is visible, a value greater than 0 means it's invisible, and 0 means the two vectors are perpendicular (in which case the triangle is edge-on to the camera and also not visible).
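A small standalone sketch of this culling test (the triangles here are example data of my own, not from the scene):

```python
import numpy as np

camera_dir = np.array([0, 0, -1])  # the camera looks down the -z axis


def screen_dot(triangle):
    # same normal computation as in _get_screen_dot_products, for one triangle
    normal = np.cross(triangle[0] - triangle[1], triangle[2] - triangle[1])
    return camera_dir @ normal


# An example triangle in the z = 0 plane whose normal (as computed above)
# points toward the camera, and the same triangle with two vertices swapped,
# flipping its normal away from the camera:
facing = np.array([[0, 0, 0], [0, 1, 0], [1, 0, 0]])
flipped = facing[[0, 2, 1]]

print(screen_dot(facing))   # negative -> visible
print(screen_dot(flipped))  # positive -> culled (back-facing)
```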
Running the script now, we can see something resembling a Rubik's cube.
But the renderer is still lacking a very basic capability. If we try to render triangles stacked on top of each other, the result will depend on the ordering of the triangles defined in `Object3D.triangles`.
Although the green triangle is stacked on top of the red triangle in camera and world coordinates, the green triangle was drawn before the red one.
We could find a way to sort the triangles and render them in order, from the ones farthest from the camera to the ones closest. But sorting the triangles isn't actually an easy task. We would also run into cases where two or more triangles intersect, or where we can't accurately sort the triangles without splitting them into smaller ones.
✨Z-buffering✨
This is where Z-buffering comes in. Instead of operating on whole triangles, we will calculate and store a Z value for every pixel painted onto our bitmap, in a separate array. Each cell is initialized to -infinity; then, before coloring a pixel in `_screen`, we compare the depth of the current point on the triangle with the depth stored for that pixel in the `_z_buffer` array.
How do we calculate the Z coordinate of a point with known X and Y coordinates on the triangle's plane? We need to solve a simple linear system to get the rate of change of Z along the X and Y axes.
For a triangle ABC, let's first calculate vectors AB and AC:

v1 = B - A
v2 = C - A

Then solve the following system for dz = (dz_x, dz_y):

dz_x * v1.x + dz_y * v1.y = v1.z
dz_x * v2.x + dz_y * v2.y = v2.z
renderer.py
```python
def _calculate_delta_z(points):
    v_ab = points[1] - points[0]
    v_ac = points[2] - points[0]
    slope = np.array([v_ab[:2], v_ac[:2]])
    zs = np.array([v_ab[2], v_ac[2]])
    return np.linalg.solve(slope, zs)
```
Once we have our delta Z (dz) and pick an arbitrary vertex of the triangle (a), we can calculate the depth of any point (p):

v_ap = p - a
depth = a.z + dot(v_ap, dz)
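As a sanity check, here is `_calculate_delta_z` applied to a made-up triangle lying on the plane z = 2x + 3y + 1:

```python
import numpy as np


def _calculate_delta_z(points):
    v_ab = points[1] - points[0]
    v_ac = points[2] - points[0]
    slope = np.array([v_ab[:2], v_ac[:2]])
    zs = np.array([v_ab[2], v_ac[2]])
    return np.linalg.solve(slope, zs)


# Triangle lying on the plane z = 2x + 3y + 1
points = np.array([
    [0.0, 0.0, 1.0],
    [1.0, 0.0, 3.0],
    [0.0, 1.0, 4.0],
])
dz = _calculate_delta_z(points)
print(dz)  # [2. 3.] -- the plane's slopes along x and y

# Depth of an arbitrary point on that plane, e.g. (0.2, 0.3):
p = np.array([0.2, 0.3])
depth = points[0][2] + (p - points[0][:2]) @ dz
print(depth)  # ~2.3, i.e. 2*0.2 + 3*0.3 + 1
```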
Let's add the z-buffer array and modify the `_render_triangle` method:
renderer.py
```python
class Renderer:
    def __init__(
        ...
    ):
        ...
        self._z_buffer = np.ones((resolution_y, resolution_x)) * -np.inf

    ...

    def _render_triangle(self, points, color):
        bounding_box = _get_bounding_box(points)
        # np.linalg.solve can throw an error so we need to handle it
        try:
            delta_z = _calculate_delta_z(points)
        except np.linalg.LinAlgError:
            return
        for screen_x, scene_x in enumerate(self._range_x):
            if scene_x < bounding_box[0, 0] or scene_x > bounding_box[1, 0]:
                continue
            for screen_y, scene_y in enumerate(self._range_y):
                if scene_y < bounding_box[0, 1] or scene_y > bounding_box[1, 1]:
                    continue
                if not _point_in_triangle(np.array([scene_x, scene_y, 0]), points):
                    continue
                depth = (
                    points[0][2]
                    + (np.array([scene_x, scene_y]) - points[0][:2]) @ delta_z
                )
                if depth <= self._z_buffer[screen_y, screen_x]:
                    continue
                self._screen[screen_y, screen_x, :] = color
                self._z_buffer[screen_y, screen_x] = depth
```
The stacked triangles are now rendered correctly, regardless of their ordering in `Object3D.triangles`.
The renderer is still lacking one basic and aesthetically pleasing feature: shading. For now, each pixel is colored with the color defined for its triangle. So we will add directional lighting to make the rendered image more pleasing.
Basic directional lighting
We will determine how much light a triangle receives in a similar way to how we determined whether it is visible. But this time we don't just need to know whether the dot product is greater or less than 0; we need its exact value. So we will operate on unit vectors, i.e. vectors whose length is exactly 1. To convert a vector to a unit vector, calculate its length and divide the vector by it.

For a vector A:

length = sqrt(A.x^2 + A.y^2 + A.z^2)
unit_A = A / length
renderer.py
```python
def _normalize_vector(p):
    return 1 / np.sqrt((p**2).sum()) * p
```
The dot product of two unit vectors yields a value between -1 (pointing in opposite directions) and 1 (pointing in exactly the same direction).
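A quick check of `_normalize_vector` and the dot-product range (the example vector is chosen arbitrarily):

```python
import numpy as np


def _normalize_vector(p):
    # scale the vector by the reciprocal of its Euclidean length
    return 1 / np.sqrt((p**2).sum()) * p


a = _normalize_vector(np.array([1.0, -1.0, -1.0]))
print(np.linalg.norm(a))  # ~1.0: unit length
print(np.dot(a, a))       # ~1.0: same direction
print(np.dot(a, -a))      # ~-1.0: opposite directions
```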
Add light direction as a property to Renderer object:
renderer.py
```python
class Renderer:
    def __init__(
        ...
    ):
        ...
        self._light_dir = _normalize_vector(np.array([1, -1, -1]))
```
renderer.py
```python
class Renderer:
    ...

    def _calculate_color(self, points, color):
        v_ba = points[0] - points[1]
        v_bc = points[2] - points[1]
        surface_unit_vector = _normalize_vector(np.cross(v_ba, v_bc))
        factor = 1 - (np.dot(self._light_dir, surface_unit_vector) + 1) / 2
        r, g, b = color
        r = int(factor * r)
        g = int(factor * g)
        b = int(factor * b)
        return np.array([r, g, b], "uint8")
```
I normalize the dot product to the [0, 1] range and multiply the RGB values by 1 minus it, but you can use a fancier function for that.
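To see what this factor does at the extremes, here is the same formula as a standalone function (`shade_factor` is my name for it, not the renderer's):

```python
import numpy as np


def shade_factor(light_dir, surface_unit_vector):
    # map the dot product from [-1, 1] to a brightness factor in [0, 1]:
    # 1 = surface faces the light head-on, 0 = surface faces directly away
    return 1 - (np.dot(light_dir, surface_unit_vector) + 1) / 2


light = np.array([0.0, 0.0, -1.0])  # light shining straight down
print(shade_factor(light, np.array([0.0, 0.0, 1.0])))   # 1.0: fully lit
print(shade_factor(light, np.array([0.0, 0.0, -1.0])))  # 0.0: fully dark
```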
📢 DISCLAIMER 📢
I chose to tie the direction of the light to the position/direction of the camera. To make it independent, you would need to perform the same steps, but use the triangle's untransformed world coordinates instead of its projected points.
Now let's modify `_render_triangle`:
renderer.py
class Renderer:
...
def _render_triangle(self, points, color):
shaded_color = self._calculate_color(points, color)
...
for screen_x, scene_x in enumerate(self._range_x):
...
for screen_y, scene_y in enumerate(self._range_y):
...
self._screen[screen_y, screen_x, :] = shaded_color
self._z_buffer[screen_y, screen_x] = depth
Change the cube's colors to white to see the difference better:
Something more complex than a cube.
If you want to see it in action with something more interesting than a cube, copy the contents of sherman.py on GitHub. I created this model from scratch by adding primitives and methods for manipulating and merging objects.
I also used interpolation to create this animated sequence of rendered images:
So I can say mission accomplished: it's exactly the thing I wanted for the task of creating sprites.
If there is some interest in this topic, I'm not ruling out further tutorials extending the renderer. Possible topics include shadows, animation and texturing.