DEV Community 👩‍💻👨‍💻

James McPherson

How I started learning Apache Spark

I've realised over the years that the best way for me to start learning a new language, toolkit or technology is to dive right in and start trying to solve problems with it.

This was most definitely true for Apache Spark, which I had to learn recently in order to prepare for a #DataScience interview.

I wrote a utility to Extract information from my 6+ years of PV Inverter data, Transform it and Load it (#ETL) into #DataFrames, which I then query for record dates, minimum and maximum output, and daily average output. In keeping with my standard practice, I've put that code on GitHub and written a blog post about the process. See more (much more!) at https://www.jmcpdotcom.com/blog/posts/2019-10-11-apache-spark-init/

Apache #Spark, #ETL, #Python
