DEV Community

Suttipong Kullawattana

Posted on Oct 15, 2022

How to set up PySpark with Jupyter Notebook for Data Engineers

#datascience #python #beginners #tutorial

This post summarizes how I set up PySpark with Jupyter Notebook for building ETL pipelines. The setup takes four steps.

First step: install Python 3.9.1, the interpreter that PySpark will use.

Second step: install Scala with Homebrew: $ brew install scala

Third step: start the PySpark shell: $ pyspark
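Before moving on, it can help to confirm that the tools from the first three steps are actually visible on your PATH. The snippet below is only a rough pre-flight check, not part of the original setup. The function name check_environment is hypothetical, and the Java check is an added assumption: the steps above do not mention it, but Spark runs on the JVM and needs a Java runtime.

```python
import importlib.util
import shutil
import sys


def check_environment(min_python=(3, 9)):
    """Return a dict mapping each prerequisite to True (found) or False (missing)."""
    return {
        # Step 1: a recent enough Python interpreter
        "python": sys.version_info >= min_python,
        # Assumption: a Java runtime is on PATH (Spark runs on the JVM)
        "java": shutil.which("java") is not None,
        # Step 2: Scala installed, for example via Homebrew
        "scala": shutil.which("scala") is not None,
        # Step 3: the pyspark package is importable and its launcher is on PATH
        "pyspark_package": importlib.util.find_spec("pyspark") is not None,
        "pyspark_command": shutil.which("pyspark") is not None,
    }


for name, ok in check_environment().items():
    print(f"{name:16} {'found' if ok else 'MISSING'}")
```

Anything reported as MISSING is worth fixing before starting the shell. The check only looks for the commands and the package; it does not verify their versions beyond the Python minimum.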

Fourth step: create and run a DataFrame.
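As a minimal sketch of the fourth step, the cell below builds a small DataFrame and filters it. The sample rows, column names, app name, and the function run_dataframe_demo are all made up for illustration. Inside the pyspark shell a session named spark already exists; in a plain Jupyter kernel you would create one yourself, which is what getOrCreate() does here.

```python
# Hypothetical sample data, used only for this illustration
PEOPLE = [("Alice", 34), ("Bob", 45), ("Carol", 29)]
COLUMNS = ["name", "age"]


def run_dataframe_demo(min_age=30):
    """Build a small DataFrame, keep rows with age >= min_age, return their names."""
    # Imported inside the function so the cell can be defined
    # even before PySpark is installed.
    from pyspark.sql import SparkSession

    # Reuses the running session if there is one, otherwise starts a local one.
    spark = (
        SparkSession.builder
        .master("local[*]")
        .appName("pyspark-jupyter-demo")  # hypothetical app name
        .getOrCreate()
    )

    df = spark.createDataFrame(PEOPLE, COLUMNS)
    df.printSchema()  # print the column names and inferred types

    older = df.filter(df.age >= min_age).orderBy("age")
    older.show()  # print the filtered rows as a text table

    return [row["name"] for row in older.collect()]
```

Call run_dataframe_demo() in a notebook cell once PySpark is installed. The session is left running on purpose so that later cells can reuse it.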

References: apache-spark; Getting started with mongodb, pyspark and jupyter-notebook; How to install pyspark on mac
