I love video games... even if it's not that apparent. Recently I've been looking into the making of the godfather of all first-person shooters, the original DOOM (1993). I'll be honest: John Carmack, the lead programmer behind DOOM, is a name I wasn't familiar with, but after looking into his work? It's kind of weird that he's not a household name yet. I'm pretty sure he is one in the world of programmers, though. The actual reason I'm looking into all this is that I want to learn how video games work, mostly the image rendering side. The more I looked into it, the more I realized how much I'm lacking in the basics, and what better way to learn the basics than to go through Computerphile's graphics series with John Chapman?
So, this article is the first of a five-part series where I break down each video in the playlist. It's my attempt at explaining things as simply as John Chapman did for me.
You're saying the matrix stuff I learned in high school is important?
Just a heads up: for those of you who already know about matrices and dread them (like me), I promise I'm going to make this interesting. For those who don't, prepare to get your mind blown :)
First question, how do we take a 3D object and represent it in the computer?
We use something known as a coordinate system (specifically, the Cartesian coordinate system). In two dimensions it involves an x-axis and a y-axis, and you plot each point by how far it sits from the origin along each axis, measured in equal units. So a pair of numbers like (3, 2) is enough to describe the location of a point, as shown in the image above.
What about three dimensions? Well, one more axis comes into play, called the z-axis. If you can visualize it, the image above is what plotting points in three dimensions might look like.
Yay! Now you know how a Coordinate System works. Now let's jump into binary space partitioning (BSP) which is a method for space partitioning which recursively subdivides a Euclidean space into two convex sets by using hyperplanes as partitions.
Just kidding, don't stop reading yet. Let's now try to represent a simple 3D object, let's say a pyramid (pretty confident you all know what that is) in a coordinate system.
As shown in the image, a pyramid has 5 vertices. However, all we're doing right now is specifying the corners and our representation tells us nothing about the faces of the pyramid.
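Here's a minimal sketch of what "specifying the corners" could look like in code. The letters follow the labels used for the pyramid in this article, but the actual numbers are made up by me for illustration, so don't read them as the coordinates from the image.

```python
# Hypothetical coordinates for the pyramid's five corners, as (x, y, z).
# The letters match the article's labels; the numbers are assumptions.
vertices = {
    "A": (0.0, 1.0, 0.0),    # the tip of the pyramid
    "B": (1.0, 0.0, 1.0),    # the four base corners all sit at height y = 0
    "C": (1.0, 0.0, -1.0),
    "D": (-1.0, 0.0, -1.0),
    "E": (-1.0, 0.0, 1.0),
}

# At this stage the computer only knows about five loose points.
# Nothing here says which points are joined into a face.
for name, (x, y, z) in vertices.items():
    print(f"{name}: x={x}, y={y}, z={z}")
```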
So how do we represent the faces?
Well, one way to go about it is to give each face of the pyramid three vertices of its own. Treating every face as a triangle for now (we'll get to the square bottom in a moment), the pyramid's five faces need 5 times 3 equals 15 vertices overall. But... don't you think that's a lot of duplication?
For a pyramid it's not a big deal. I get it. But what about a 3D object that has hundreds or thousands of points? Depending on the shape, the duplication can be insanely huge, which none of us wants. Bleh.
If you look at the pyramid in the image, we have labeled each point with a letter. Using those labels, I can say one of the faces has points ABC, where each letter stands for that point. Similarly, other faces are ADC, AEB and so on. Congrats, we successfully represented the faces of a pyramid without using 15 individual points!
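To make the difference concrete, here's a rough sketch comparing the two approaches for the four slanted faces. The coordinates are hypothetical, and the fourth face (AED) is my assumption for what "and so on" covers. The helper name `face_coordinates` is also just something I made up.

```python
# Hypothetical coordinates, stored exactly once per corner.
vertices = {
    "A": (0.0, 1.0, 0.0),
    "B": (1.0, 0.0, 1.0),
    "C": (1.0, 0.0, -1.0),
    "D": (-1.0, 0.0, -1.0),
    "E": (-1.0, 0.0, 1.0),
}

# Each slanted face is just three labels pointing back into `vertices`.
side_faces = ["ABC", "ADC", "AEB", "AED"]


def face_coordinates(face):
    """Look up the actual coordinates for a face such as "ABC"."""
    return [vertices[label] for label in face]


# Duplicating approach: every face carries its own copy of each corner.
duplicated_count = sum(len(face) for face in side_faces)

# Labeled approach: count how many distinct corners the faces share.
unique_count = len({label for face in side_faces for label in face})

print("coordinates stored with duplication:", duplicated_count)
print("coordinates stored with labels:", unique_count)
print("face ABC resolves to:", face_coordinates("ABC"))
```

The saving looks small here, but the apex A alone is shared by all four slanted faces, and the gap grows quickly on bigger models.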
Wait! We're not done yet. What about the bottom face? The bottom face of a pyramid is a square. So according to the image we can represent it as a square and make it have 4 vertices. In this case it'll be EDCB. So, problem solved? Umm...well yes, but no.
What do you mean by 'yes, but no?'
A better approach is to divide the square base into two triangles, as shown in the image. Why triangles? This can get a little trippy so bear with me XD
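Splitting the base EDCB could be sketched like this. The function name `split_quad` is hypothetical, and cutting along the first corner's diagonal is just one possible choice (the other diagonal works too).

```python
def split_quad(quad):
    """Split a four-cornered face into two triangles along one diagonal.

    `quad` lists the corners in order around the face, e.g. "EDCB".
    The cut runs from the first corner to the third one.
    """
    a, b, c, d = quad
    return [a + b + c, a + c + d]


base_triangles = split_quad("EDCB")
print(base_triangles)  # ['EDC', 'ECB']
```

Both triangles share the edge from E to C, so together they cover exactly the same square as before.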
The best option for situations like these is to always divide these objects into triangles. In the words of John Chapman,
"Any 3D surface in the universe is able to be approximated by fitting together triangles."
But why though? Why triangles?
- It's the simplest polygon. There are only 3 vertices. No brainer.
- Triangles are easy and fast for GPUs to process. They are more or less the native language of GPUs.
- Most importantly, any three vertices in 3D space, however you arrange them, will always be coplanar. Don't worry, I'll explain.
What does coplanar even mean?
Imagine a triangular piece of rigid cardboard. Hold two corners down and try to lift it from the third one. Does the triangle bend? I don't think so.
Now try the same with a rectangular piece of cardboard. Hold down 3 points, try lifting from the fourth one. I don't know about you, but I'm pretty sure you can't do that without bending the cardboard.
That's all coplanar means: every point lies on the same flat plane. In the case of a triangle, all three points always lie on one plane. It can rotate or tilt, but it can never bend. Four points come with no such guarantee.
What's crazy is that every shape can be divided into triangles, or at least approximated by them! Have a look at the example given below on how a cylinder is divided into triangles.
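If you want to see the cardboard experiment in numbers, here's a small sketch. It assumes one common way of testing coplanarity: take three of the points, build the arrow that sticks straight out of their plane (a cross product), and check whether the fourth point has drifted along that arrow. The helper names and the two example quads are mine, not from the video.

```python
def subtract(p, q):
    """Vector pointing from q to p."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    """Cross product: a vector perpendicular to both u and v."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def is_coplanar(a, b, c, d, tolerance=1e-9):
    """True if d lies on the plane that passes through a, b and c."""
    plane_normal = cross(subtract(b, a), subtract(c, a))
    return abs(dot(plane_normal, subtract(d, a))) <= tolerance


# A flat square lying on the ground (hypothetical coordinates).
flat_quad = [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0, 1)]
# The same square with its fourth corner lifted, like the bent cardboard.
bent_quad = [(0, 0, 0), (1, 0, 0), (1, 0, 1), (0, 0.5, 1)]

print(is_coplanar(*flat_quad))  # True
print(is_coplanar(*bent_quad))  # False
```

Notice the function needs four points before it has anything to check. With only three, there's no fourth point that could leave the plane, which is the whole appeal of triangles.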
The more triangles we use, the smoother the edges become.
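As a rough sketch of the idea, here's one way the curved side of a cylinder could be built from triangles. The function name, the default radius and height, and the choice to leave out the top and bottom caps are all my own assumptions.

```python
import math


def cylinder_side_triangles(segments, radius=1.0, height=1.0):
    """Approximate the curved side of a cylinder with 2 * segments triangles."""
    bottom = []
    top = []
    for i in range(segments):
        angle = 2 * math.pi * i / segments
        x = radius * math.cos(angle)
        z = radius * math.sin(angle)
        bottom.append((x, 0.0, z))      # point on the bottom ring
        top.append((x, height, z))      # matching point on the top ring

    triangles = []
    for i in range(segments):
        j = (i + 1) % segments          # wrap around to close the ring
        # Each flat strip between two neighbours is a quad split in two.
        triangles.append((bottom[i], bottom[j], top[j]))
        triangles.append((bottom[i], top[j], top[i]))
    return triangles


for segments in (6, 12, 48):
    count = len(cylinder_side_triangles(segments))
    print(segments, "segments ->", count, "triangles")
```

With 6 segments the "cylinder" is really a hexagonal prism. Every time the segment count goes up, each flat strip gets narrower and the silhouette gets closer to a true circle.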
Before I 'wind' (no pun intended) up this article, there's one more problem we need to address. How do we recognize the faces of an object, and which side of each face points outward? Heads up: the names we gave to each vertex come into play here. The order in which we list a face's vertices (clockwise or counter-clockwise) is what tells the front of a face from its back, and that matters for more advanced concepts such as lighting and shadows on 3D objects. We normally call this ordering 'winding'.
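Here's a small sketch of why the order of the letters matters. It assumes the usual trick of using a cross product to find which way a triangle faces. The triangle and helper names are hypothetical, and which order counts as the "front" is a convention that depends on the graphics API you use.

```python
def subtract(p, q):
    """Vector pointing from q to p."""
    return (p[0] - q[0], p[1] - q[1], p[2] - q[2])


def cross(u, v):
    """Cross product: a vector perpendicular to both u and v."""
    return (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )


def face_normal(a, b, c):
    """Direction a triangle faces, given its corners in winding order."""
    return cross(subtract(b, a), subtract(c, a))


# A hypothetical triangle lying flat on the xy-plane.
A = (0, 0, 0)
B = (1, 0, 0)
C = (0, 1, 0)

# Same three corners, listed in opposite orders.
print(face_normal(A, B, C))  # (0, 0, 1)
print(face_normal(A, C, B))  # (0, 0, -1)
```

Same triangle, same points, but listing them as ABC makes the face point one way and ACB makes it point the exact opposite way. That's why face names like ABC or ADC aren't just labels: their order carries information.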
This was pretty fun. I almost forgot what it was like researching topics on your own and writing articles from scratch. Hopefully my motivation lasts for the next 4 articles in this series.
Thank you so much for reading, guys! Have a look at the video I was talking about below:









