Data is the new oil: every day, a huge amount of data is generated the world over. Virtually every human action today generates some data, and people are racing to make meaningful use of it to arrive at decisions that improve their lives. This has given rise to a whole range of new careers, and data science is one of them. It is the use of scientific methods, mathematics, statistics and computer science to derive meaningful insights from data. This is a brief roadmap for developing your skills in this field.
What you need to learn
Mathematics
Mathematics and statistics form the foundations of data science. We rely heavily on statistics to understand data, calculate distributions and probabilities, and make inferences. The mean, the standard deviation and hypothesis testing are some of the tools used to draw insights from data. An understanding of mathematics and statistics is therefore essential.
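To make the mean, standard deviation and hypothesis testing a little more concrete, here is a minimal sketch using only Python's standard library. The sample values and the hypothesized mean of 12 are made up for illustration, and the one-sample t statistic is just one simple choice of test, not the only one.

```python
import math
import statistics

# Hypothetical sample: made-up daily sales counts, for illustration only
sample = [12, 15, 11, 14, 13, 16, 10, 13]

# Central tendency and spread
mean = statistics.mean(sample)
std_dev = statistics.stdev(sample)  # sample standard deviation (divides by n - 1)

# One-sample t statistic: how far is the sample mean from an assumed
# population mean, measured in standard errors?
hypothesized_mean = 12  # assumption chosen for this example
standard_error = std_dev / math.sqrt(len(sample))
t_stat = (mean - hypothesized_mean) / standard_error

print(f"mean={mean}, std_dev={std_dev:.2f}, t={t_stat:.3f}")
```

To turn the t statistic into a p-value you would compare it against a t distribution with n - 1 degrees of freedom, which a library such as SciPy can do for you.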
Python and/or R
Python and R are the most popular tools that data scientists use in this field. Python is an open-source, high-level programming language that I would definitely recommend.
It is easy to learn, has wide community support and comes with extensive libraries. R is a popular tool too, and it is great for statistical analysis and research work. I suggest you try out both and find which one works for you.
Understand databases
Once your mathematics and programming skills are polished, the next step is to understand databases. Data is stored in databases, so knowing your way around retrieving, updating and manipulating it is a must. MySQL and PostgreSQL are among the most popular relational database management systems (RDBMS), and MongoDB is a popular NoSQL option.
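Here is a small sketch of what retrieving, updating and manipulating data looks like in practice. It uses SQLite through Python's built-in sqlite3 module as a stand-in, simply because it needs no server; the table, names and cities are hypothetical, and the same basic SQL carries over to MySQL and PostgreSQL with minor differences.

```python
import sqlite3

# In-memory SQLite database: a stand-in for a real MySQL/PostgreSQL server
conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a hypothetical table and insert a few made-up rows
cur.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
cur.executemany(
    "INSERT INTO customers (name, city) VALUES (?, ?)",
    [("Amina", "Nairobi"), ("Brian", "Kampala"), ("Chen", "Nairobi")],
)

# Update: one customer moves to a different city
cur.execute("UPDATE customers SET city = ? WHERE name = ?", ("Mombasa", "Brian"))

# Retrieve: count customers per city
cur.execute("SELECT city, COUNT(*) FROM customers GROUP BY city ORDER BY city")
rows = cur.fetchall()
print(rows)

conn.commit()
conn.close()
```

The `?` placeholders pass values separately from the SQL text, which is a good habit to build early because it avoids SQL injection.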
Master visualization
To make sense of data, it has to be presented in a visual, graphical format so that it communicates to a wider audience. There are a lot of tools for this, both free and subscription-based. Tableau, Excel, Power BI and Matplotlib are some of them. Master a few based on your needs.
Big data and cloud
Though not strictly needed for a beginner, as you grow in this field you will come across extremely large and complex datasets that cannot be analyzed using the methods discussed so far. Hadoop, Apache Kafka, Apache Spark, Databricks, BigQuery and Amazon Redshift are some of the terms and technologies you'll come across.
Join competitions and practice
Every skill you learn has to be perfected through practice. Data scientists and AI engineers use online competition platforms like Kaggle (by Google) and Zindi (focused on Africa) to practice, compete and be part of a community. Be sure to check them out, and good luck if you decide to enter a competition.
Get a job
Easier said than done, but at this point getting an entry-level job or internship is a good idea. Industry experience is different from practice projects: you get to work on real-life problems affecting businesses and societies.
Be part of a community
Everyone likes to be part of a community, so go on: you are into data now. Learn, engage and contribute.