What's the best way to get started with machine learning?

If you have the time and a love for online courses, I highly recommend Udacity's Intro to Machine Learning. I think this is a great way to get a high-level understanding without digging too deep into the mathematics behind the algorithms.


Depending on your background, linear algebra and calculus are vital to understanding what's happening under the hood of the algorithms.
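
To make that a bit more concrete, here is a minimal sketch (my own illustration, not taken from any of the courses mentioned in this thread) of fitting a straight line by gradient descent in plain Python. The function name `fit_line`, the learning rate, and the step count are assumptions picked for the toy data; the only point is to show where the derivatives come in.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y ~ w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Error of the current line on every data point.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Calculus: partial derivatives of the mean squared error
        # with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Take a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


if __name__ == "__main__":
    xs = [0.0, 1.0, 2.0, 3.0, 4.0]
    ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy data lying exactly on y = 2x + 1
    w, b = fit_line(xs, ys)
    print(round(w, 3), round(b, 3))  # should be close to 2.0 and 1.0
```

With more than one input feature, the sums become matrix-vector products, which is where the linear algebra comes in.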

There are plenty of resources online, but it can get overwhelming without a plan of attack, so here are my two cents:

You can follow this plan created by Siraj, as well as a fast-paced crash course created by Google.


Happy learning!

Obviously I think my machine learning textbook is pretty good :) However, the current version is admittedly rather hard for beginners. I am in the process of writing the second edition, which will ramp up more slowly, making it more accessible to beginners. (I'm also adding new content on deep learning, reinforcement learning, etc.) But it will take me a while to finish (~2 years?).

In the meantime, there are many good books available. See, e.g., this list: josephmisiti/awesome-machine-learning. One book which I think is particularly good for beginners is An Introduction to Statistical Learning, by James, Witten, Hastie and Tibshirani. It has a few references to concepts from frequentist statistics, such as p-values (which you can safely ignore :), and doesn't cover topics such as deep learning or graphical models, but it's nevertheless a good place to start.

You can also learn machine learning here: hackr.io/tutorials/learn-machine-l...

Classifying documents, or websites, can be an interesting way to start. Start with a list of domains that you like on a topic, and a bunch of unrelated domains. Now try to build a system that will pick out the other domains you like. This will introduce several of the concepts behind machine learning:

  • attribute extraction: picking data out of the websites for use as attributes (this can be as simple as parsing the HTML and pulling out words)
  • training: using known good data (a subset of the domains you like) to train the system
  • classification: let the system run on the new data. You have only two categories at this point so it should be easy to manually confirm the results.

I'm specifically not including any technologies or programming languages here. All languages have numerous libraries, and searching for the concept words above, or just for "machine learning", should turn up many. But the above gives the basic outline of what you're trying to do, to help guide the search.

(Note: this basic task doesn't involve neural networks. Don't get side-tracked, at least not yet.)
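
If you do want something concrete to compare against, the three steps above might look roughly like this in Python, using only the standard library. This is a sketch, not a recommendation of any particular tool: the names `extract_words`, `train`, and `classify` are hypothetical, the pages are passed in as HTML strings rather than fetched from real domains, and the scoring is a simple naive Bayes word count, which is one common choice among many.

```python
import math
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the text nodes of an HTML page and ignores the tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def extract_words(html):
    """Attribute extraction: parse the HTML and pull out lowercase words."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return re.findall(r"[a-z]+", " ".join(parser.chunks).lower())


def train(labelled_pages):
    """Training: count words per label from (html, label) pairs."""
    word_counts = {}
    page_counts = Counter()
    for html, label in labelled_pages:
        word_counts.setdefault(label, Counter()).update(extract_words(html))
        page_counts[label] += 1
    return word_counts, page_counts


def classify(html, model):
    """Classification: pick the label with the highest naive Bayes score."""
    word_counts, page_counts = model
    vocabulary = set()
    for counts in word_counts.values():
        vocabulary.update(counts)
    words = extract_words(html)
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        total = sum(counts.values())
        # Log prior of the label plus log likelihood of each word,
        # with add-one smoothing so unseen words do not zero the score.
        score = math.log(page_counts[label] / sum(page_counts.values()))
        for word in words:
            score += math.log((counts[word] + 1) / (total + len(vocabulary)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


if __name__ == "__main__":
    # Made-up pages standing in for "domains you like" and unrelated ones.
    pages = [
        ("<h1>Python tips</h1><p>code functions testing</p>", "liked"),
        ("<p>compiler code debugging python</p>", "liked"),
        ("<h1>Celebrity gossip</h1><p>fashion red carpet</p>", "other"),
        ("<p>fashion gossip celebrity news</p>", "other"),
    ]
    model = train(pages)
    print(classify("<p>testing python code</p>", model))
```

With only two categories and a handful of pages, it is easy to check each prediction by hand, which is the whole point of starting this small.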

Once you get into it, you'll start learning about the statistical models being used, and can start branching off. You can look at classifying into more categories, identifying attributes automatically, correlating documents with each other, and predicting patterns and behavior.
