Not serializable result: org.apache.hadoop.io.IntWritable when reading Sequence File with Spark / Scala


I'm reading a sequence file that logically contains an Int and a String per record.

If I do this:

val sequence_data = sc.sequenceFile("/seq_01/seq-directory/*", classOf[IntWritable], classOf[Text])
                  .map{case (x, y) => (x.toString(), y.toString().split("/")(0), y.toString().split("/")(1))}
                  .collect

this is fine, because the IntWritable is converted to a String.

If I do this:

val sequence_data = sc.sequenceFile("/seq_01/seq-directory/*", classOf[IntWritable], classOf[Text])
                  .map{case (x, y) => (x, y.toString().split("/")(0), y.toString().split("/")(1))}
                  .collect 

then I get this error immediately:

org.apache.spark.SparkException: Job aborted due to stage failure: Task 5.0 in stage 42.0 (TID 692) had a not serializable result: org.apache.hadoop.io.IntWritable

The underlying reason isn't really clear to me. It's serialization, but why is it so difficult? This seems to be yet another serialization aspect to be aware of, and it is only noted at run time.
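As far as I can tell, it is only triggered when the IntWritable actually has to leave the executor JVM, e.g. when collect ships task results back to the driver; a minimal sketch of that (same file layout assumed):

    val raw = sc.sequenceFile("/seq_01/seq-directory/*", classOf[IntWritable], classOf[Text])

    raw.count()    // fine: only a Long is sent back to the driver
    raw.collect()  // fails: the (IntWritable, Text) pairs themselves must be serialized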

1 Answer

BEST ANSWER

If the goal is just to get an Int value, you need to call get() on the writable:

.map{case (x, y) => (x.get(), y.toString().split("/")(0), y.toString().split("/")(1))}
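Put together with the rest of the original snippet, a sketch (same path and record layout assumed):

    import org.apache.hadoop.io.{IntWritable, Text}

    val sequence_data = sc.sequenceFile("/seq_01/seq-directory/*", classOf[IntWritable], classOf[Text])
      .map { case (x, y) =>
        val parts = y.toString().split("/")  // copy the Text into a plain String first
        (x.get(), parts(0), parts(1))        // x.get() unwraps the IntWritable to an Int
      }
      .collect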

Then the JVM handles serialization of a plain Int, instead of failing on an IntWritable, which does not implement the Serializable interface.

String, by contrast, does implement Serializable, which is why the first version works.
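If you really want to keep the Writables themselves in the RDD instead of unwrapping them, another option (a sketch, not tested against your setup) is to switch Spark to Kryo serialization and register the Writable classes on the SparkConf:

    import org.apache.spark.SparkConf
    import org.apache.hadoop.io.{IntWritable, Text}

    val conf = new SparkConf()
      .setAppName("seq-file-example")  // hypothetical app name
      .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
      .registerKryoClasses(Array(classOf[IntWritable], classOf[Text]))

That said, unwrapping with get()/toString() right after reading is usually the better move anyway, because Hadoop record readers reuse the same Writable objects across records.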