Ivan G

Using Deequ 1.1 with Spark 3

If you upgrade AWS Deequ to the latest version at the time of writing (1.1.0) and use it with Spark 3.0.1, sbt fails with the following error:

[error] (update) Conflicting cross-version suffixes in: org.apache.spark:spark-launcher, org.apache.spark:spark-sketch, org.apache.spark:spark-kvstore, org.json4s:json4s-ast, org.apache.spark:spark-catalyst, org.apache.spark:spark-network-shuffle, com.twitter:chill, org.apache.spark:spark-sql, org.scala-lang.modules:scala-xml, org.json4s:json4s-jackson, com.fasterxml.jackson.module:jackson-module-scala, org.json4s:json4s-core, org.apache.spark:spark-unsafe, org.json4s:json4s-scalap, org.scala-lang.modules:scala-parser-combinators, org.apache.spark:spark-tags, org.apache.spark:spark-core, org.apache.spark:spark-network-common
[error] Total time: 6 s, completed 10-Feb-2021 13:07:46

This happens because Deequ has transitive dependencies on artifacts built for Scala 2.11, for some unknown reason (a bug?). You can fix it by excluding those artifacts in build.sbt:

name := "dq"

scalaVersion := "2.12.12"

val sparkVersion = "3.0.1"

libraryDependencies += "org.apache.spark" %% "spark-sql" % sparkVersion % "provided"


// https://mvnrepository.com/artifact/com.amazon.deequ/deequ
// Deequ 1.1.0 pulls in transitive libraries cross-compiled for Scala 2.11;
// exclude them to avoid the "Conflicting cross-version suffixes" error.
libraryDependencies += ("com.amazon.deequ" % "deequ" % "1.1.0_spark-3.0-scala-2.12")
  .exclude("org.scalanlp", "breeze_2.11")
  .exclude("com.chuusai", "shapeless_2.11")
  .exclude("org.apache.spark", "spark-core_2.11")
  .exclude("org.apache.spark", "spark-sql_2.11")
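
Once the build resolves, a small smoke test can confirm that Deequ actually runs against Spark 3. The following is only a minimal sketch and not part of the original setup: the object name DeequSmokeTest, the sample data, and the chosen checks are made up for illustration. Because spark-sql is marked "provided" in the build above, it assumes Spark is available at runtime, for example via spark-submit.

```scala
import com.amazon.deequ.VerificationSuite
import com.amazon.deequ.checks.{Check, CheckLevel}
import org.apache.spark.sql.SparkSession

// Hypothetical smoke test: the name and the data are illustrative only.
object DeequSmokeTest {
  def main(args: Array[String]): Unit = {
    // Local session for a quick check; assumes Spark 3.0.1 is on the runtime classpath.
    val spark = SparkSession.builder()
      .appName("deequ-smoke-test")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Tiny in-memory dataset with a unique, non-null "id" column.
    val df = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "label")

    // Run a few basic constraints through Deequ's verification suite.
    val result = VerificationSuite()
      .onData(df)
      .addCheck(
        Check(CheckLevel.Error, "smoke test")
          .hasSize(_ == 3)
          .isComplete("id")
          .isUnique("id"))
      .run()

    println(s"Deequ check status: ${result.status}")
    spark.stop()
  }
}
```

If the Scala 2.11 artifacts were still on the classpath, a run like this would be a likely place for binary incompatibilities to show up, so it is a cheap sanity check after changing the excludes.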

P.S. Originally published on my blog.
