Convert InputStream to Java/Scala object

While reading from an InputStream, how do I convert its contents to a Java/Scala object? An example use case: receive a CSV file as a stream and parse it row by row, on the fly.

For example: I have

case class Row(v1: String, v2: String, v3: String)

and a single row of a sample CSV file is (Andy, Morgan, Male). Now suppose I receive this CSV as an InputStream, and it has millions of rows and can't be held in memory. Is it possible to convert the InputStream data to the above-mentioned case class, use the instance for my purpose, discard it, and repeat this process for the entire stream?

A vague example would be on the lines of:

try( val inputstream = new FileInputStream("file.txt") ) {
  var data = inputstream.read();
  while(data != ???){
    ////// somehow convert/buffer the data and convert to Row class mentioned above
    data = inputstream.read();
  }
}

I want to understand the internals, so I'd be very thankful for a solution in native Java/Scala without any third-party libraries.

There is 1 best solution below

There's actually a Java input stream called ObjectInputStream that you could use for this. It deserializes objects that were previously written with an ObjectOutputStream, and you can then cast each one to your class.

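import java.io.{EOFException, FileInputStream, ObjectInputStream}

// Assumes file.txt was written with ObjectOutputStream.writeObject, one call per Row.
val objectInputStream = new ObjectInputStream(new FileInputStream("file.txt"))
try {
    // readObject signals the end of the stream by throwing EOFException,
    // so there is no sentinel value to compare against.
    while (true) {
        val data = objectInputStream.readObject().asInstanceOf[Row]
        /* Do stuff here */
    }
} catch {
    case _: EOFException => // all Row objects have been read
    /* Catch other cases */
} finally {
    objectInputStream.close()
}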

Of course, this assumes that your InputStream is actually carrying serialized Row objects (written on the sending side with an ObjectOutputStream); otherwise you'd have to go about this differently. If the file is too big to fit in memory to begin with, you may want to have the sender stream these Row objects rather than doing the conversion on the receiving end.
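To make that assumption concrete, here is a minimal round-trip sketch of both sides. It uses an in-memory byte array in place of a real file or socket, and the names RowStreamDemo, writeRows and readRows are made up for illustration. It relies on Scala case classes being Serializable.

```scala
import java.io.{ByteArrayInputStream, ByteArrayOutputStream, EOFException, InputStream, ObjectInputStream, ObjectOutputStream}

// Same shape as the Row in the question; Scala case classes are Serializable.
case class Row(v1: String, v2: String, v3: String)

// Hypothetical helper object, for illustration only.
object RowStreamDemo {
  // Sending side: one writeObject call per Row.
  def writeRows(rows: Seq[Row]): Array[Byte] = {
    val bytes = new ByteArrayOutputStream()
    val out = new ObjectOutputStream(bytes)
    try rows.foreach(r => out.writeObject(r)) finally out.close()
    bytes.toByteArray
  }

  // Receiving side: hand each Row to `handle` as it is read.
  def readRows(in: InputStream)(handle: Row => Unit): Unit = {
    val objIn = new ObjectInputStream(in)
    try {
      while (true) handle(objIn.readObject().asInstanceOf[Row])
    } catch {
      case _: EOFException => // no more objects in the stream
    } finally objIn.close()
  }

  def main(args: Array[String]): Unit = {
    val bytes = writeRows(Seq(Row("Andy", "Morgan", "Male")))
    readRows(new ByteArrayInputStream(bytes))(row => println(row))
  }
}
```

The same readRows should work on a FileInputStream or a socket stream, as long as the other end wrote the rows with writeObject.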

I did, however, find this class, which seems fairly versatile and has constructors like these (among many others):

CSVFileReader(File f) /*or*/ CSVFileReader(String filename, CSVFormat format)

It has a method for reading line by line (readLine(), which returns a CSVLine), which may help you in converting each line to a Row object.
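If you'd rather stay with the standard library, the same line-by-line idea can be sketched with java.io.BufferedReader. This is only a rough sketch: CsvRowReader, parseLine and foreachRow are hypothetical names, and the parser assumes exactly three comma-separated fields per line with no quoted or embedded commas, which real CSV does not guarantee.

```scala
import java.io.{BufferedReader, ByteArrayInputStream, InputStream, InputStreamReader}
import java.nio.charset.StandardCharsets

// Same shape as the Row in the question.
case class Row(v1: String, v2: String, v3: String)

// Hypothetical helper object, for illustration only.
object CsvRowReader {
  // Naive split: assumes three comma-separated fields and no quoted commas.
  def parseLine(line: String): Option[Row] =
    line.split(",", -1).map(_.trim) match {
      case Array(v1, v2, v3) => Some(Row(v1, v2, v3))
      case _                 => None // malformed line; skipped in this sketch
    }

  // Reads one line at a time, so only the current line and its Row
  // need to be in memory, however large the stream is.
  def foreachRow(in: InputStream)(handle: Row => Unit): Unit = {
    val reader = new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))
    try {
      var line = reader.readLine()
      while (line != null) { // readLine returns null at the end of the stream
        parseLine(line).foreach(handle)
        line = reader.readLine()
      }
    } finally reader.close()
  }

  def main(args: Array[String]): Unit = {
    // In-memory stand-in for the real CSV stream; the second line is deliberately malformed.
    val csv = "Andy, Morgan, Male\nnot a valid row\n"
    val in = new ByteArrayInputStream(csv.getBytes(StandardCharsets.UTF_8))
    foreachRow(in)(row => println(row))
  }
}
```

Passing a FileInputStream instead of the ByteArrayInputStream should process a large file the same way, since each Row becomes eligible for garbage collection once the handler returns.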

Hope this helps!