Short/TLDR version
Last year, I got an old miner (a PC with multiple GPUs) for free, and I have been learning some Machine Learning recently. I decided to revive the old miner and use it for ML. It was fun and rewarding, and it reminded me of my interest in PCs. However, I think renting a cloud computer with a GPU is the better option, unless you have a free PC and want to experiment. Some cloud services, like Google’s Colab, are even free.
Background
I have been learning Machine Learning (ML) from Jeremy Howard’s FastAI course. The course recommends using cloud computers for learning ML, as they are more convenient and powerful. However, I wanted to run my Jupyter notebooks locally on an old miner that I got for free. A workmate left me this miner (equipped with 4 GPUs) because it was too big to bring along when he moved to another city.
1st challenge: No hard drive or OS.
The old miner had neither a hard drive nor an operating system (OS). An old Linux HDD that I had kept as a backup failed to boot, so I bought a cheap new 2TB HDD to start with. I also wanted to learn how to boot over the network using PXE boot.
I chose iVentoy as my PXE boot server, because it seemed simple and easy to use. I had an old Thinkpad with Ubuntu installed, which I used as the host machine. I connected the Thinkpad and the miner directly with a crossover cable. The general steps I followed were:
- Download iVentoy
- Download the ISO and copy it to iVentoy's iso directory
- Start iVentoy
This guide helped me to understand the basic steps
I encountered some minor issues, such as the miner having only 4GB of RAM (I later upgraded it to 16GB). The PXE boot process copies the ISO image into RAM, which meant that larger ISOs like Ubuntu 22.04 would not fit. I decided to try Proxmox, which had a fairly small ISO.
2nd challenge: Accessing the GPUs in virtualization.
I chose Proxmox as my OS, because it had a small ISO and it worked with PXE boot. I also wanted to have the flexibility to experiment with different virtual machines on the miner. However, this option also posed a new challenge: how to access the GPUs from the guest OS.
At first, Proxmox would not install, because it said that the hardware did not support virtualization. I thought the miner was too old for that. But after checking the Proxmox requirements and the Intel ARK specs, I realized that it should work. I just had to enable some settings in the BIOS:
- Intel VT - required for hypervisors like Proxmox
- VT-d - required for PCI passthrough (it provides the IOMMU and interrupt remapping)
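A quick way to see what the CPU advertises from any running Linux system (for example the Thinkpad or a live environment) is to look for the `vmx` (Intel VT-x) or `svm` (AMD-V) flag in `/proc/cpuinfo`. The sketch below is only an illustration, and the function name `virtualization_flags` is made up for this post. Note that the flag shows the CPU supports the feature; depending on the kernel version it may not tell you whether the feature is actually enabled in the BIOS.

```python
from pathlib import Path


def virtualization_flags(cpuinfo_text: str) -> set[str]:
    """Return the hardware-virtualization flags found in /proc/cpuinfo text.

    'vmx' means Intel VT-x, 'svm' means AMD-V.
    (Hypothetical helper, named for this example only.)
    """
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            # Each "flags" line is a space-separated list of CPU features.
            found |= {"vmx", "svm"} & set(value.split())
    return found


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        flags = virtualization_flags(cpuinfo.read_text())
        print("Virtualization flags:", ", ".join(sorted(flags)) or "none found")
    else:
        print("/proc/cpuinfo not found (probably not a Linux system)")
```

If neither flag shows up, the CPU likely lacks hardware virtualization. If the flag is there but the installer still complains, the BIOS settings above are the first thing to check.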
After Proxmox was installed, I had to make the guest OS use the four GPUs exclusively. This required a PCI passthrough, which allows the guest OS to directly access the hardware devices. The following guides were very helpful:
This video also helped a lot
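I won't repeat the guides here, but one thing worth checking on the Proxmox host before passing anything through is that the IOMMU is active and how the devices are grouped, since devices in the same IOMMU group generally have to be passed through together. Linux exposes the groups under `/sys/kernel/iommu_groups`. Here is a minimal sketch that lists them; the function name `iommu_groups` is my own, and the output will of course depend on your hardware.

```python
from pathlib import Path


def iommu_groups(root: str = "/sys/kernel/iommu_groups") -> dict[int, list[str]]:
    """Map each IOMMU group number to the PCI addresses it contains.

    Returns an empty dict when the directory is missing, which usually
    means the IOMMU is not enabled (or this is not a Linux host).
    """
    base = Path(root)
    if not base.is_dir():
        return {}
    groups = {}
    for group_dir in base.iterdir():
        devices_dir = group_dir / "devices"
        if group_dir.name.isdigit() and devices_dir.is_dir():
            groups[int(group_dir.name)] = sorted(p.name for p in devices_dir.iterdir())
    # Sort by group number so the listing is easy to read.
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    groups = iommu_groups()
    if not groups:
        print("No IOMMU groups found; check the VT-d setting and kernel options.")
    for number, devices in groups.items():
        print(f"group {number}: {' '.join(devices)}")
```

The PCI addresses printed here can be matched against the output of `lspci` to find which groups the GPUs landed in.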
To verify that I got it working, I used nvidia-smi and PyTorch.
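My original output is not reproduced here, but a check along these lines can be run inside the guest VM. It is a minimal sketch, assuming the NVIDIA driver and a CUDA build of PyTorch are installed in the guest; the helper name `gpu_report` is hypothetical. `nvidia-smi -L` lists the GPUs the driver can see, and `torch.cuda` reports what PyTorch can use.

```python
import shutil
import subprocess


def gpu_report() -> dict:
    """Collect what nvidia-smi and PyTorch can see.

    Missing tools are reported as None instead of raising, so the
    function also runs on machines without a GPU stack.
    """
    report = {"nvidia_smi": None, "torch_cuda_available": None, "torch_devices": []}

    # `nvidia-smi -L` prints one line per GPU visible to the driver.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
        report["nvidia_smi"] = result.stdout.strip().splitlines()

    try:
        import torch
    except ImportError:
        return report  # PyTorch is not installed in this environment

    report["torch_cuda_available"] = torch.cuda.is_available()
    if report["torch_cuda_available"]:
        report["torch_devices"] = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return report


if __name__ == "__main__":
    for key, value in gpu_report().items():
        print(f"{key}: {value}")
```

With passthrough working, I would expect both lists to contain one entry per GPU passed to the guest.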
3rd challenge: Finding a place for the big machine.
I was so absorbed in getting the miner to work that I didn't think about where to put it. It was a big and noisy machine, and it needed a lot of power and cooling. I learned how to make all four GPUs run together, and I forked this benchmark to test them with PyTorch DDP. I was able to train a model using all four GPUs, which was very satisfying. Comment below if you want me to write about this in detail.
My wife noticed that I was very happy with my project. She said, "You always enjoy fixing and solving things." She was very supportive and suggested that we find a place for the miner. We decided to get a glass coffee table and put the miner underneath it for now. It looked nice; as my daughter said, "It looks legit dad!"