Dependencies for Spark-Streaming and Twitter-Streaming in SBT

I was trying to use the following dependencies in my build.sbt, but SBT keeps reporting an "unresolved dependency" error.

libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter_2.11" % "2.2.0.1.0.0-SNAPSHOT"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.2.0"

I'm using Spark 2.2.0. What are the correct dependencies?

There are 2 best solutions below

Below are the Maven dependencies you need to add for Spark Twitter streaming.

<dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-streaming_2.11</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.apache.bahir</groupId>
    <artifactId>spark-streaming-twitter_2.11</artifactId>
    <version>2.0.0</version>
</dependency>
<dependency>
    <groupId>org.twitter4j</groupId>
    <artifactId>twitter4j-core</artifactId>
    <version>4.0.4</version>
</dependency>
<dependency>
    <groupId>org.twitter4j</groupId>
    <artifactId>twitter4j-stream</artifactId>
    <version>4.0.4</version>
</dependency>
<dependency>
    <groupId>com.twitter</groupId>
    <artifactId>jsr166e</artifactId>
    <version>1.1.0</version>
</dependency>
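
Since the question is about build.sbt, here is a rough SBT translation of the Maven entries above. It is only a sketch: the versions are copied as-is from the XML, and the scalaVersion value is an assumption (any 2.11.x release should give the same _2.11 suffix). Adjust the versions to match the Spark release you actually run.

```scala
// build.sbt (sketch; versions copied from the Maven snippet above)
scalaVersion := "2.11.12" // assumed; needed so that %% resolves to the _2.11 artifacts

libraryDependencies ++= Seq(
  // %% appends the Scala binary version, giving spark-streaming_2.11
  "org.apache.spark" %% "spark-streaming"         % "2.0.0",
  "org.apache.bahir" %% "spark-streaming-twitter" % "2.0.0",
  // Plain Java libraries carry no Scala suffix, so they use a single %
  "org.twitter4j"    %  "twitter4j-core"          % "4.0.4",
  "org.twitter4j"    %  "twitter4j-stream"        % "4.0.4",
  "com.twitter"      %  "jsr166e"                 % "1.1.0"
)
```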

The question was posted a while ago, but I ran into the same problem this week. Here is the solution for those who still have the problem:

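As you can see here, the artifact name to use with SBT is "spark-streaming-twitter", while with Maven it is "spark-streaming-twitter_2.11". This is because the SBT %% operator automatically appends the Scala binary version to the artifact name (the binary version is the scalaVersion with its last number dropped, so 2.11.12 becomes 2.11). Writing "spark-streaming-twitter_2.11" after %%, as in the question, therefore makes SBT look for "spark-streaming-twitter_2.11_2.11", which does not exist.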
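
To make the naming rule concrete, here is a small build.sbt sketch, assuming scalaVersion is set to a 2.11.x release as in this answer. The Spark version is the one from the question; the commented-out lines are shown for comparison only.

```scala
// build.sbt (sketch illustrating %% versus %)
scalaVersion := "2.11.12" // Scala binary version is 2.11

// With %%, SBT appends "_2.11" itself and looks up spark-streaming_2.11
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.2.0"

// Same artifact written with a single % and the suffix spelled out by hand:
// libraryDependencies += "org.apache.spark" % "spark-streaming_2.11" % "2.2.0"

// Mixing both styles doubles the suffix (spark-streaming_2.11_2.11) and fails to resolve:
// libraryDependencies += "org.apache.spark" %% "spark-streaming_2.11" % "2.2.0"
```

Using %% everywhere keeps the suffix tied to scalaVersion, so only one line has to change when the Scala version changes.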

But the thing is that the only artifact that works is "spark-streaming-twitter_2.11". For example, with Scala 2.12, you will get this error:

[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[warn]  ::          UNRESOLVED DEPENDENCIES         ::
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::
[warn]  :: org.apache.bahir#spark-streaming-twitter_2.12;2.3.2: not found
[warn]  ::::::::::::::::::::::::::::::::::::::::::::::

But if you use Scala 2.11, it should work fine. Here is a working sbt file:

name := "twitter-read"

version := "0.1"

// Must be a 2.11.x release: with Scala 2.12 the Bahir artifact below was not found
scalaVersion := "2.11.12"

// %% appends the Scala binary version (_2.11) to each artifact name
libraryDependencies += "org.apache.spark" %% "spark-core" % "2.4.2"
libraryDependencies += "org.apache.spark" %% "spark-sql" % "2.4.2"
libraryDependencies += "org.apache.spark" %% "spark-streaming" % "2.4.2" % "provided"

// Java libraries: single %, no Scala suffix
libraryDependencies += "org.twitter4j" % "twitter4j-core" % "3.0.3"
libraryDependencies += "org.twitter4j" % "twitter4j-stream" % "3.0.3"

// Resolves to spark-streaming-twitter_2.11
libraryDependencies += "org.apache.bahir" %% "spark-streaming-twitter" % "2.3.2"