Rendering myself with NVIDIA's Instant NeRF

#ai #nvidia #nerf #opensource

Back in January 1999, a few friends and I went to the cinema to watch "Enemy of the State". It was a really popular movie and you have most likely seen it, but if you have not: in short, it is a high-tech action thriller in which the main character is chased by secret agents. In one scene, Jack Black, playing the main hacker, uses footage from a video camera to reconstruct a 3D scene, trying to figure out what Will Smith is carrying in his bag. I remember we enjoyed the movie, but we commented that this scene was far-fetched. Today, 23 years later, I have to say that technology has come a long way, and that scene is not so far-fetched anymore.

NeRF (Neural Radiance Field) is a fully connected neural network capable of generating a 3D scene from a collection of 2D images. NVIDIA's Instant NeRF is the fastest neural rendering model developed so far, achieving speedups of up to 1000 times, and it is capable of rendering 3D scenes in seconds. Instant NeRF was showcased at NVIDIA GTC at the end of March this year, and the examples shown were just amazing. You can check out the official announcement here.

I had to try it out, and I decided to render myself standing in a forest. If you think about it, the scene itself is pretty complex with all the trees around, but I wanted to see how well Instant NeRF could handle such a scene. My wife Helena took 150 pictures while walking around me in the forest, making almost a full circle. The following preview shows some of the pictures taken:

NVIDIA's Instant NeRF is available on GitHub. If you would like to try it out, you will of course need an NVIDIA GPU. The examples shown in the repository were rendered with an RTX 3090. Unfortunately, I have a GTX 1660 Ti with only 1536 CUDA cores (the RTX 3090 has 10496 CUDA cores), but I still managed to get really nice results.

In this blog post I will give you a high-level overview of the setup, along with some tips that I figured out along the way. Besides the information in the official repository, you can find a great tutorial here if you would like to know how it is done in detail.

After you clone the repository locally, install all the dependencies and build the project with CMake. Copy your pictures into your data folder and run colmap2nerf.py to generate the transforms.json file. The --run_colmap flag used below requires COLMAP to be installed on your machine.

python scripts/colmap2nerf.py --colmap_matcher exhaustive --run_colmap --aabb_scale 8 --images data/<insert data folder name>

Please note that the aabb_scale parameter specifies the extent of the scene. The default value is 1, and it can be set to larger powers of 2, up to 128. This parameter defines a bounding box around your scene. Instant NeRF assumes the pictures are taken so that the object of interest is always in the center, so a higher aabb_scale value means you are extending the bounding box around the object in the center. For the scene I was rendering, after a few tries I found that a value of 8 worked best.
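Since aabb_scale only makes sense as a power of 2 in the range described above, a tiny check can save a failed run. This is just a minimal sketch based on the rule stated in this post (powers of 2 from 1 up to 128); the function names are my own and are not part of instant-ngp.

```python
def is_valid_aabb_scale(value, max_scale=128):
    """Return True if value is a power of 2 between 1 and max_scale.

    The upper limit of 128 is taken from the description above;
    it is an assumption of this sketch, not something read from instant-ngp.
    """
    # A positive power of 2 has exactly one bit set, so value & (value - 1) is 0.
    return 1 <= value <= max_scale and (value & (value - 1)) == 0


def allowed_aabb_scales(max_scale=128):
    """List every power of 2 from 1 up to max_scale, inclusive."""
    scales = []
    scale = 1
    while scale <= max_scale:
        scales.append(scale)
        scale *= 2
    return scales


print(allowed_aabb_scales())   # [1, 2, 4, 8, 16, 32, 64, 128]
print(is_valid_aabb_scale(8))  # True
print(is_valid_aabb_scale(6))  # False
```

If a scene looks cut off at the edges, trying the next value in this list is a reasonable first step.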

The generated transforms.json file contains a lot of information about the pictures used (such as path, sharpness, transform matrix, etc.), and you need this file in order to render the 3D scene:

<insert path to instant-ngp>\build\testbed.exe --scene data/<insert data folder name>
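As a side note, because the generated file stores a sharpness value per picture, it can be handy to list the blurriest pictures and consider removing them before training. The following is only a sketch: it assumes the file has a top-level "frames" list whose entries carry "file_path" and "sharpness" keys, which matches what I saw in my own file but may differ between versions. The sample data and file names are made up for illustration.

```python
import json
from pathlib import Path


def load_transforms(path):
    """Read the generated JSON file into a dictionary."""
    return json.loads(Path(path).read_text())


def blurriest_frames(transforms, count=5):
    """Return the file paths of the `count` least sharp frames.

    Assumes each entry in transforms["frames"] has "file_path" and
    "sharpness" keys. Frames without a sharpness value are treated
    as 0.0 and therefore show up first.
    """
    frames = transforms.get("frames", [])
    ranked = sorted(frames, key=lambda frame: frame.get("sharpness", 0.0))
    return [frame.get("file_path") for frame in ranked[:count]]


# Hypothetical sample mimicking the structure described above.
sample = {
    "aabb_scale": 8,
    "frames": [
        {"file_path": "images/0001.jpg", "sharpness": 310.5},
        {"file_path": "images/0002.jpg", "sharpness": 42.1},
        {"file_path": "images/0003.jpg", "sharpness": 128.0},
    ],
}

print(blurriest_frames(sample, count=2))  # ['images/0002.jpg', 'images/0003.jpg']
```

For a real run you would call load_transforms on the file inside your data folder and pass the result to blurriest_frames.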

The Neural Graphics Primitives window will pop up, and initially you will see something like this:

In just a few seconds the picture will become much clearer, and you can easily monitor the progress:

If, like me, you are using a graphics card less powerful than the RTX 3090, you can pause the training in order to zoom in or rotate the scene; otherwise you can do it live. After a minute or two you will not see any further improvement, and if you are satisfied with the result you can stop the training. Use the camera option to set waypoints around the scene, save the current snapshot and waypoints, and go back to the command prompt to render a video with the render.py script:

python scripts/render.py --scene data/<insert data folder name> --n_seconds <insert number of seconds> --fps <insert number of fps> --render_name <insert your video name> --width <insert video width> --height <insert video height>
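The render time grows with the number of frames, which is simply seconds multiplied by fps, so on a weaker card it is worth checking that number before you start. Below is a small sketch that builds the argument list for the command above and computes the frame count. It only uses the flags shown in this post; the helper names and the example values (data/forest, forest_render) are hypothetical.

```python
def total_frames(n_seconds, fps):
    """Number of frames the video will contain."""
    return n_seconds * fps


def build_render_command(scene, n_seconds, fps, render_name, width, height):
    """Build the render.py call as a list suitable for subprocess.run.

    Only the flags shown in the command above are used.
    """
    for label, value in (("n_seconds", n_seconds), ("fps", fps),
                         ("width", width), ("height", height)):
        if value <= 0:
            raise ValueError(f"{label} must be positive, got {value}")
    return [
        "python", "scripts/render.py",
        "--scene", scene,
        "--n_seconds", str(n_seconds),
        "--fps", str(fps),
        "--render_name", render_name,
        "--width", str(width),
        "--height", str(height),
    ]


# Hypothetical values for illustration only.
command = build_render_command("data/forest", 10, 30, "forest_render", 1280, 720)
print(total_frames(10, 30))  # 300
print(command)
```

Passing the list to subprocess.run(command, check=True) from the repository root would then start the render. On my GTX 1660 Ti I would start with a short, low-resolution test before committing to a longer video.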

The final result of my render can be seen in the following video:

I personally find photogrammetry amazing. Recently I have seen several great things in this field, and NVIDIA's Instant NeRF is really impressive. Being able to generate 3D scenes and objects from 2D images opens up a lot of possibilities, such as generating digital twins, quickly training autonomous robots to navigate environments, and much more.

My wife Helena (kudos for taking pictures of me in the forest) said this video render looks like it is from some parallel dimension. No matter how great the video looks, I have not yet managed to explain to her why I need to invest a few thousand dollars in a new graphics card, but I will continue to work on that. ;)

Thanks for reading! If you found it interesting, feel free to like and share!

DEV Community
