To conclude, here is a summary of how to set up PySpark for building an ETL pipeline with Jupyter Notebook.
First step, install Python 3.9.1.
Second step, install Scala: $ brew install scala
Third step, start the PySpark shell: $ pyspark
Fourth step, create and run a DataFrame.
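As a rough way to confirm the steps above worked, here is a minimal sketch in Python. The helper names `check_setup` and `run_sample_dataframe` are hypothetical and are not part of PySpark. The sample rows and the app name are made up for illustration. Treating any Python 3.9 or newer as acceptable is my assumption, not something the steps require.

```python
import importlib.util
import shutil
import sys


def check_setup():
    """Return a dict mapping each setup component to True (found) or False (missing)."""
    return {
        # Step 1: the post installs Python 3.9.1; accepting any 3.9+ is an assumption.
        "python>=3.9": sys.version_info >= (3, 9),
        # Step 2: the `scala` executable that `brew install scala` should provide.
        "scala on PATH": shutil.which("scala") is not None,
        # Step 3: the `pyspark` launcher and the importable Python package.
        "pyspark on PATH": shutil.which("pyspark") is not None,
        "pyspark importable": importlib.util.find_spec("pyspark") is not None,
    }


def run_sample_dataframe():
    """Step 4: build and display a tiny DataFrame (needs a working Spark install)."""
    # Imported lazily so check_setup() still works when PySpark is missing.
    from pyspark.sql import SparkSession

    spark = (
        SparkSession.builder
        .master("local[*]")          # run Spark locally on all available cores
        .appName("etl-smoke-test")   # hypothetical app name
        .getOrCreate()
    )
    # Made-up sample rows, only to confirm a DataFrame can be created and shown.
    df = spark.createDataFrame([(1, "alice"), (2, "bob")], ["id", "name"])
    df.show()
    spark.stop()


if __name__ == "__main__":
    for name, ok in check_setup().items():
        print(f"{name}: {'ok' if ok else 'missing'}")
```

If every check reports ok, calling `run_sample_dataframe()` from a Jupyter Notebook cell should print a two-row table. If it fails, the missing component in the check output is a reasonable place to start looking.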
References: apache-spark; Getting started with mongodb, pyspark and jupyter-notebook; How to install pyspark on mac