How can I parse a CSV file in Java with a low memory footprint?

I read the file through an InputStream, but when a column contains a "," the parser treats it as a column separator. For example, the row abc, xyz, "m,n" is parsed as abc, xyz, m, n, so m and n end up in separate columns.

There are 2 best solutions below

3
On BEST ANSWER

I really like the CSVParser from Apache Commons CSV. This is almost verbatim from their user guide:

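import java.io.FileReader;
import java.io.Reader;
import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

// try-with-resources closes the parser and the reader, even if parsing fails
try (Reader reader = new FileReader("input.csv");
     CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT.withHeader())) {
    // Records are read one at a time while iterating
    for (final CSVRecord record : parser) {
        // Lookup by column name requires a header mapping, hence withHeader()
        final String string = record.get("SomeColumn");
        // ...
    }
}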

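This is simple, configurable and record-oriented: the parser reads records one at a time as you iterate, so the whole file does not have to be held in memory.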

You could configure it like this:

final CSVParser parser = new CSVParser(reader, CSVFormat.DEFAULT.withHeader().withDelimiter(';'));

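For the record, changing the delimiter is unnecessary for your data: CSVFormat.DEFAULT already uses a comma as the delimiter and a double quote as the quote character, so a quoted field such as "m,n" stays in one column. If your file really has spaces between the comma and the opening quote, as in your example, you may also need withIgnoreSurroundingSpaces().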

This would be my first attempt, to see whether it fits into memory. If it doesn't, can you be a little more specific about what you mean by a low memory footprint?
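To illustrate what quote-aware, line-by-line parsing amounts to, here is a minimal sketch that uses only the standard library. The class QuotedCsvLine and its methods are hypothetical names made up for this example, not part of Commons CSV or any other library. It assumes comma delimiters, double-quote quoting with "" as the escaped quote, and no quoted fields that span several lines, so treat it as an illustration rather than a replacement for a real parser.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper for illustration only; not part of any library.
final class QuotedCsvLine {

    // Splits one line on commas, treating commas inside double quotes as data.
    static List<String> split(String line) {
        List<String> fields = new ArrayList<>();
        StringBuilder field = new StringBuilder();
        boolean inQuotes = false;
        for (int i = 0; i < line.length(); i++) {
            char c = line.charAt(i);
            if (inQuotes) {
                if (c == '"' && i + 1 < line.length() && line.charAt(i + 1) == '"') {
                    field.append('"'); // a doubled quote is an escaped quote
                    i++;
                } else if (c == '"') {
                    inQuotes = false;  // closing quote
                } else {
                    field.append(c);   // includes commas inside quotes
                }
            } else if (c == '"') {
                inQuotes = true;       // opening quote
            } else if (c == ',') {
                fields.add(field.toString());
                field.setLength(0);
            } else {
                field.append(c);
            }
        }
        fields.add(field.toString());  // last field on the line
        return fields;
    }

    // Streams the input: only the current line is held in memory.
    // Assumption: a record never spans more than one line.
    static void forEachRecord(Reader in, Consumer<List<String>> handler) {
        try {
            BufferedReader reader = new BufferedReader(in);
            String line;
            while ((line = reader.readLine()) != null) {
                handler.accept(split(line));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Calling QuotedCsvLine.forEachRecord(new FileReader("input.csv"), row -> ...) would hand each row to the callback as it is read. Memory use is then bounded by the longest line rather than by the file size, which is the same property you get from iterating over a CSVParser.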

0
On

There are many third-party CSV parsing libraries, such as:

  1. univocity-parsers

  2. Apache Commons CSV

  3. OpenCSV

  4. Super CSV

I use the univocity CSV parser, which is very fast and can automatically detect the separator used in the rows. The other libraries listed above are worth a look as well.