I recently got a new laptop but didn't want to say goodbye to my 2014 MacBookPro (10,1) yet. Then I remembered it has an NVIDIA graphics card, so I thought maybe I could use it for training toy deep learning models. I had never trained a deep learning model before.
My goal was to train a tacotron2 model. Tacotron is Google's deep learning-based speech synthesis system. I was not able to set up my old laptop to train a tacotron2 model, but I found a different way to achieve my goal: using Google Colab. Still, I'm sharing my journey in case someone can learn from the mistakes I made.
I installed Ubuntu 18.04. I searched for related documentation on help.ubuntu.com and wiki.ubuntu.com, but it seemed outdated, so I just googled "install ubuntu 18.04 MacBookPro" and found a Medium post. Following the instructions in the post just worked. I also installed an NVIDIA driver via Software Settings.
Training tacotron2 requires PyTorch 1.0. Although it seemed like I could install PyTorch via pip, I wanted to try installing it from source to gain better control over it. So I decided to install cuda and cudnn myself. It seemed like cuda 10.1 and cudnn 7.6.x were the latest PyTorch-compatible versions of cuda and cudnn as of March 2020.
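Once some PyTorch build is installed, one way to confirm which cuda and cudnn versions it was compiled against is to ask PyTorch itself. The following is a minimal sketch, not part of my original setup; the helper name `report_versions` is made up for illustration, and it simply returns `None` when PyTorch is not importable.

```python
def report_versions():
    """Return the versions PyTorch reports, or None if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return {
        "torch": torch.__version__,
        # CUDA version the build was compiled with (None for CPU-only builds).
        "cuda": torch.version.cuda,
        # cuDNN version as an integer (None if cuDNN is unavailable).
        "cudnn": torch.backends.cudnn.version(),
    }


if __name__ == "__main__":
    print(report_versions())
```

Comparing this output against the versions installed through apt is a cheap sanity check before starting a long training run.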
NVIDIA's instructions for installing cuda and cudnn weren't as straightforward as I had hoped. After running the main cuda install command
sudo apt-get install cuda-10-1
(rather than sudo apt-get install cuda-10, because I wanted to control the cuda version), I was seeing errors like
... found 'diversion of /usr/lib/x86_64-linux-gnu/libGL.so.1 to /usr/lib/x86_64-linux-gnu/libGL.so.1.distrib by nvidia-340' ...
I found a solution on Stack Overflow and shamelessly applied it without understanding the commands. It fixed the problem. I suspected the error was caused by the NVIDIA driver I had installed earlier while setting up Ubuntu, but I never confirmed it. I also did not understand why the solution worked, but in the interest of time, I marched forward. The cudnn installation was smooth.
Now it was time to build PyTorch from source. Although I should have been using conda, I just used pip3 in the interest of time and started following the instructions. After running the grand python setup.py install, I got stuck:
... /home/mjyc/.local/lib/python3.6/site-packages/torch/cuda/__init__.py:134: UserWarning: Found GPU0 GeForce GT 750M which is of cuda capability 3.0. PyTorch no longer supports this GPU because it is too old. The minimum cuda capability that we support is 3.5. ...
It turns out the MacBookPro 10,1's GPU, a GeForce GT 750M, was too old for PyTorch 1.4 (the latest compatible PyTorch version seems to be 0.3.1). My first reaction to this error message was to just buy an external GPU (eGPU). However, a quick Google search showed that the eGPU case alone costs about $300, and I learned that just choosing which GPU to buy requires some research.
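In hindsight, checking the GPU's compute capability up front would have saved me the build. Below is a minimal sketch of such a check, assuming a PyTorch install is already available (for example from pip). The 3.5 threshold is taken from the warning above and may differ in other PyTorch releases; the helper names `is_supported` and `local_gpu_capability` are hypothetical.

```python
# Minimum compute capability quoted in the warning above. Assumption: other
# PyTorch releases may use a different threshold.
MIN_CAPABILITY = (3, 5)


def is_supported(capability, minimum=MIN_CAPABILITY):
    """Return True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)


def local_gpu_capability():
    """Return the first GPU's (major, minor) capability, or None if unavailable."""
    try:
        import torch
    except ImportError:
        return None
    if not torch.cuda.is_available():
        return None
    return torch.cuda.get_device_capability(0)


if __name__ == "__main__":
    cap = local_gpu_capability()
    if cap is None:
        print("No usable CUDA GPU found (or PyTorch is not installed).")
    else:
        print(f"Compute capability {cap[0]}.{cap[1]}, supported: {is_supported(cap)}")
```

For the GT 750M, which the warning reports as capability 3.0, `is_supported((3, 0))` returns `False`.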
At this point, I realized I had spent much more time and effort than I originally budgeted, so I gave up on turning my old MacBookPro into a workstation and started looking for an alternative. I read a Reddit thread suggesting a cloud solution and looked into which service would be a good starting point. Google Colab seemed like a good place to start, so I stopped my exploration here.
To me, the lesson learned from this journey was to always focus on the end goal instead of the means. Given that my goal was to train a deep learning model, the means of achieving it (a laptop workstation or the cloud) should not have mattered.