IntelliJ IDEA and Spark make a great combination for real Big Data development. IntelliJ IDEA is a strong IDE for Spark, whether you are using Scala, Java or Python. In this guide we will set up IntelliJ, Spark and Scala to support the development of Apache Spark applications in the Scala language.
Install IntelliJ Scala plugins
First of all we need to install the required plugins into our IntelliJ. Go to File -> Settings -> Plugins and look for both Scala and Sbt.
After installing them, you might need to restart your IDE. Do that if prompted.
Create Spark with Scala project
Now we can start creating our first sample Scala project. Go to File -> New -> Project and then select Scala / Sbt.
On the next screen, choose the right version of Scala. Your chosen version has to be compatible with the version of Spark you will be using, because Spark libraries are published separately for each supported Scala version. In my case it was Scala 2.12.
Add Spark libraries to Sbt
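Now, in our newly created project, find the build.sbt file and add the following lines:
name := "SparkTest"
version := "0.1"
scalaVersion := "2.12.8"

// Spark 2.4.0 is the first release published for Scala 2.12;
// Spark 2.3.x artifacts exist only for Scala 2.11 and would not resolve here.
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "2.4.3",
  "org.apache.spark" %% "spark-sql" % "2.4.3"
)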
After that, IntelliJ should ask if you want to download the new dependencies. If prompted, click yes. If the IDE does not ask, you probably have automatic downloading of dependencies turned on, which is totally fine. These two libraries add support for the Spark code we will be writing in a moment.
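The compatibility requirement mentioned earlier comes down to how sbt resolves these dependencies. The %% operator appends the project's Scala binary version to the artifact name, so with a 2.12.x scalaVersion sbt looks for spark-core_2.12. Below is a small sketch of the equivalent explicit form, written as an alternative to the %% lines rather than an addition to them. The sparkVersion value is an assumption for illustration, so substitute the release you actually target.

```scala
// build.sbt (sketch): what %% expands to when scalaVersion is 2.12.x.
// The Spark version below is an assumption; use the one you target.
val sparkVersion = "2.4.3"

libraryDependencies ++= Seq(
  // Same artifact as: "org.apache.spark" %% "spark-core" % sparkVersion
  "org.apache.spark" % "spark-core_2.12" % sparkVersion,
  "org.apache.spark" % "spark-sql_2.12" % sparkVersion
)
```

If sbt reports an unresolved dependency at this point, a likely cause is that the chosen Spark release was not published for your Scala version.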
Run the Spark Scala application in IntelliJ
Let’s create a basic application and test if everything runs properly. Create an object named FirstSparkApplication and paste in the code below:
import org.apache.spark.sql.SparkSession

object FirstSparkApplication extends App {
  // Start a local Spark session that uses all available cores
  val spark = SparkSession.builder
    .master("local[*]")
    .appName("Sample App")
    .getOrCreate()

  // Distribute three sample sentences as an RDD
  val data = spark.sparkContext.parallelize(
    Seq("I like Spark", "Spark is awesome", "My first Spark job is working now and is counting down these words")
  )

  // Keep only the lines that mention "awesome"
  val filtered = data.filter(line => line.contains("awesome"))
  filtered.collect().foreach(println)

  // Release the session's resources
  spark.stop()
}
Now just execute it, and in the run console, among Spark's log messages, you should see the string Spark is awesome.
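Once the filter example runs, a natural next step is to actually count the words in those sample sentences. The sketch below is one possible way to do it with the same RDD API. The object name WordCountSketch and the tokenize helper are hypothetical names introduced here for illustration, and it uses a main method instead of extends App.

```scala
import org.apache.spark.sql.SparkSession

// Hypothetical follow-up example; the names are illustrative only.
object WordCountSketch {
  // Split a line into lowercase words, dropping empty tokens
  def tokenize(line: String): Seq[String] =
    line.toLowerCase.split("\\s+").filter(_.nonEmpty).toSeq

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("Word Count Sketch")
      .getOrCreate()

    // Same sample sentences as in the first application
    val lines = spark.sparkContext.parallelize(
      Seq("I like Spark", "Spark is awesome", "My first Spark job is working now and is counting down these words")
    )

    val counts = lines
      .flatMap(tokenize)            // one record per word
      .map(word => (word, 1))       // pair each word with a count of 1
      .reduceByKey(_ + _)           // sum the counts per word
      .collect()                    // bring the small result to the driver
      .sortBy { case (word, count) => (-count, word) } // most frequent first

    counts.foreach { case (word, count) => println(s"$word: $count") }

    spark.stop()
  }
}
```

With the three sentences above, "spark" and "is" each appear three times, so they end up at the top of the printed list.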
Summary
I hope you have found this post useful. If so, don’t hesitate to like or share this post. Additionally you can follow me on my social media if you fancy so :)