Pushing Spark Streaming RDDs to Neo4j - Scala


I need to establish a connection from Spark Streaming to the Neo4j graph database. The RDDs are of type ((is,I),(am,Hello),(sam,happy),...). I need to create an edge between each pair of words in Neo4j.

In the Spark Streaming documentation I found

dstream.foreachRDD { rdd =>
  rdd.foreachPartition { partitionOfRecords =>
    // ConnectionPool is a static, lazily initialized pool of connections
    val connection = ConnectionPool.getConnection()
    partitionOfRecords.foreach(record => connection.send(record))
    ConnectionPool.returnConnection(connection)  // return to the pool for future reuse
  }
}

as the pattern for pushing data to an external database.

I am doing this in Scala, and I am a little confused about how to go about it. I found AnormCypher and the Neo4jScala wrapper. Can I use these to get the work done? If so, how? If not, are there any better alternatives?

Thank you all....

2 Answers

Best answer:

I did an experiment with AnormCypher, like this:

import org.anormcypher._
import org.apache.spark.{SparkConf, SparkContext}

// Point AnormCypher at the Neo4j REST endpoint
implicit val connection = Neo4jREST.setServer("localhost", 7474, "/db/data/")

val conf = new SparkConf().setAppName("Simple Application")
val sc = new SparkContext(conf)
val logData = sc.textFile(FILE, 4).cache()

// One CREATE per word; execute() returns true on success, so
// filter(identity) keeps the successful writes and count() tallies them
val count = logData
  .flatMap(_.split(" "))
  .map(w =>
    Cypher("CREATE (:Word {text: {text}})")
      .on("text" -> w).execute()
  ).filter(identity).count()
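
This only creates Word nodes. For the edges between word pairs that the question asks about, a hypothetical extension could look like the following (the pairs RDD name and the :FOLLOWS relationship type are illustrative; MERGE is used so repeated words don't create duplicate nodes):

val edgeCount = pairs  // an RDD[(String, String)] such as ((is,I),(am,Hello),...)
  .map { case (from, to) =>
    Cypher(
      """MERGE (a:Word {text: {from}})
        |MERGE (b:Word {text: {to}})
        |CREATE (a)-[:FOLLOWS]->(b)""".stripMargin)
      .on("from" -> from, "to" -> to)
      .execute()
  }.filter(identity).count()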

Neo4j 2.2.x has great concurrent write performance that you can exploit from Spark, so the more threads writing to Neo4j concurrently, the better. If you can also batch statements, say 100 to 1000 per request, better still; see the sketch below.
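
As a sketch of that batching, combining it with the foreachPartition pattern from the question (this assumes AnormCypher serializes a Scala Seq parameter as a Cypher collection, which is worth verifying; UNWIND requires Neo4j 2.1+, and the batch size of 500 is illustrative):

logData
  .flatMap(_.split(" "))
  .foreachPartition { words =>
    // one connection per partition, created on the executor
    implicit val connection = Neo4jREST.setServer("localhost", 7474, "/db/data/")
    words.grouped(500).foreach { batch =>
      // one request creates up to 500 nodes
      Cypher("UNWIND {texts} AS t CREATE (:Word {text: t})")
        .on("texts" -> batch)
        .execute()
    }
  }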

Another answer:

Take a look at MazeRunner (http://www.kennybastani.com/2014/11/using-apache-spark-and-neo4j-for-big.html) as it will give you some ideas.