DEV Community

Rodolfo Mendes

Posted on • Originally published at reinforcementlearning4.fun

5 Reasons Why Google Colaboratory is the Right Tool for Beginner Data Scientists

With the growing value of big data and machine learning, data science has attracted interest from professionals in many areas of expertise. Suppose you are one of them: you have studied linear algebra, calculus, probability, and machine learning, and now you want to put this knowledge into practice.

All you want to do is load a small dataset, explore it, create a few visualizations, and train a simple model. So you search the Internet for the right tool to start your brand new data science project, and you find a lot of options. You install new software and libraries and spend some time reading tutorials, but you still can't decide which tool to use.

In the next sections, we help you with this decision by listing five reasons that make Google Colab the right tool for beginner data scientists.

1. It does not require installation

Although software installation should not be a problem, it still takes time. You have to read the installation instructions, choose the right software version, and sometimes even free up space on your disk. After some hours, you notice that what should be a straightforward step is taking much more time than you would like. As time goes by, you feel so frustrated that you decide your data science project can wait a little longer, and then you go play some video games.

So, this is the first reason why Google Colab is excellent for beginners. Because the tool is hosted in the cloud, you don't have to waste time on software installation. All you need is an Internet connection and a Web browser, and you can start coding your data science project right away, leaving no room for procrastination.

2. Pre-installed libraries

Installing a programming language and a development environment is not enough to start developing your data science project. You also need to install various libraries to help you with tasks like manipulating datasets, plotting charts, or training machine learning models. And although installing these libraries should not be a challenging task, beginners can make mistakes that take some time to fix, such as installing an outdated version.

With Google Colab, the most widely used data science libraries, such as pandas, scikit-learn, and TensorFlow, are installed by default. This does not mean that you are limited to these tools: you can install additional libraries if you need them. But having the most common libraries already installed saves you precious time and helps you bootstrap your project more quickly.
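To see what is already available before installing anything, you can ask Python which versions are present. The snippet below is a minimal sketch, not part of the Colab documentation: the helper name `installed_versions` is hypothetical, and the package list simply mirrors the libraries mentioned above. It relies only on the standard library, so it also runs outside Colab.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(package_names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in package_names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this environment.
            found[name] = None
    return found


# Distribution names as published on PyPI; this list is only an example.
report = installed_versions(["pandas", "scikit-learn", "tensorflow"])
for name, installed in report.items():
    print(f"{name}: {installed or 'not installed'}")
```

If a library you need is reported as not installed, you can usually add it from a notebook cell with a shell command such as `!pip install <package>`.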

3. Great documentation

Great tools are useless if you can't learn how to use them. For Google Colab, this is not a problem. On the Google Colab homepage, you find comprehensive documentation that helps you use the tool more effectively.

The documentation begins with simple instructions on how to use cells and execute Python commands. As you read the following sections, you find more complex examples and complete guides on how to load data from different sources, how to visualize data, and even how to work with hardware accelerators. Finally, the Google Colab documentation contains end-to-end examples so that you can put all the pieces together.

Following the documentation saves you from one more tedious task: searching the Internet for guides and tutorials. With the Colab documentation, you have a lot of resources in one place. This way, you can concentrate on active learning and projects instead of losing time on Web searches and forums.

4. Easy to share your work

Once you start developing your analyses and models, you will want to publish a portfolio so that potential employers can discover your skills. Google Colab saves your notebooks directly to your Google Drive account, so sharing your work is straightforward. You can make a notebook public or restrict access to people who have the shareable link. Another option is to save the notebook directly to one of your GitHub repositories, as if you were saving to a regular file system, so you don't have to deal with Git commands like add, commit, and push, which can be a little confusing for beginners.

5. Free GPUs and TPUs

After gaining some practice with tabular data and regular machine learning models, you decide to move on to deep learning projects, such as building an image classifier or transferring Van Gogh's style to a family picture. However, deep learning algorithms require lots of data and training time, and running them on regular CPUs can make your project impractical.

Because neural networks rely on massive matrix operations, GPUs can significantly speed up deep learning algorithms. The problem is the high price of GPUs. They can be too expensive, and if you work on a laptop, you usually cannot install a new video card, so you would have to buy a new machine. Another option would be to provision a GPU-powered virtual machine from a cloud provider, but once more, the prices are impractical for beginners and even for small professional projects. And even if you already have a GPU installed in your computer, you still need to install and configure the proper drivers so that the numerical libraries can use the GPU for calculations, which can be a time-consuming and error-prone task.

With Google Colab, you have free GPUs and TPUs available to run your notebooks. In your notebook settings, you can enable a GPU or TPU with just one click. And because the libraries are pre-installed, you don't need to worry about configuring drivers to use the GPU. All you have to do is enable hardware acceleration and import the proper libraries in your code.
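After enabling hardware acceleration, it is worth confirming that your code can actually see the GPU. The following is a minimal sketch under a few assumptions: it uses TensorFlow, one of the pre-installed libraries mentioned earlier, the helper name `list_gpus` is hypothetical, and the function returns an empty list when TensorFlow is not installed so that the snippet also runs on other machines.

```python
def list_gpus():
    """Return the names of GPUs visible to TensorFlow, or [] if none are found."""
    try:
        import tensorflow as tf  # assumed to be pre-installed, as on Colab
    except ImportError:
        # Without TensorFlow we cannot query devices this way.
        return []
    return [device.name for device in tf.config.list_physical_devices("GPU")]


gpus = list_gpus()
if gpus:
    print("GPU available:", ", ".join(gpus))
else:
    print("No GPU visible; check that hardware acceleration is enabled.")
```

If the list is empty on Colab, the notebook is most likely still running on a CPU-only runtime.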

Conclusion

Google Colab is a fantastic tool for getting started with data science and machine learning. You don't have to waste time installing new software or libraries, and you have proper documentation, examples, and code snippets to insert into your notebook. It saves all your work automatically to your Google Drive account, or you can send it to your GitHub repositories. And if you want to work on problems that require more computational power, you have GPUs and TPUs available. All of this is free.

Finally, with Google Colab, you have no excuse to procrastinate on your data science project. You can create your first project in just 5 minutes. The short guide linked below will help you create your first machine learning project quickly:

Create your first machine learning model in 5 minutes with Google Colab

So don't lose any more time, and start your project right now!

Top comments (1)

Alexandro Disla

I don't know if you are a real Data scientist or Machine Learning Engineer. But We must have an #dataScience on dev.to for sure. Since Every software engineer who sure swear they hate math. Just do think they can do everything. Well Of course i am talking about you specifically. Great Articles!!!!!