Is it possible to parse the CSV rows into JsonNodes having the correct value types


It's easy to parse a CSV into a List<Map<String,String>> as described in https://github.com/FasterXML/jackson-dataformats-text/tree/master/csv#with-column-names-from-first-row

However, type inference then has to be done manually, e.g. like what I try to do below. Using Boolean.parseBoolean etc. feels like bloat to me, as Jackson already does all of that pretty nicely...

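import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectReader;
import com.fasterxml.jackson.databind.node.ArrayNode;
import com.fasterxml.jackson.dataformat.csv.CsvMapper;
import com.fasterxml.jackson.dataformat.csv.CsvParser;
import com.fasterxml.jackson.dataformat.csv.CsvSchema;
import java.io.FileInputStream;
import java.util.Iterator;
import java.util.Map.Entry;

// Use the first CSV row as the column names.
CsvSchema schema = CsvSchema.emptySchema().withHeader();
CsvMapper mapper = new CsvMapper();
// Configure the mapper before deriving a reader from it.
mapper.enable(CsvParser.Feature.WRAP_AS_ARRAY);

ObjectReader reader = mapper.readerFor(ArrayNode.class).with(schema);
// `file` is the CSV file to read (a java.io.File or a path String).
JsonNode arrayNode = reader.readTree(new FileInputStream(file));

// Inspect the value types of the first data row.
JsonNode jsonNode = arrayNode.get(0);
if (jsonNode.isObject()) {
    Iterator<Entry<String, JsonNode>> fields = jsonNode.fields();
    while (fields.hasNext()) {
        Entry<String, JsonNode> entry = fields.next();
        JsonNode value = entry.getValue();
        if (value.isBoolean()) {
            // nah
        } else if (value.isNumber()) {
            // yeah
        } else if (value.isTextual()) {
            // nah
        }
    }
}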

Is it possible to parse the CSV rows into JsonNodes having the correct value types without relying on a POJO to provide them?

Best answer

There are two ways to read your question.

Is it theoretically possible to parse the CSV rows into JsonNodes having the correct value types ...?

The answer is Yes. For example:

  • You can write your own CSV parser from scratch that emits JsonNode objects.
  • You can take the List<Map<String,String>> from your existing CSV parser and apply some heuristics [1] to convert it to the corresponding JsonNode structure (whatever that might be).
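
The second option could be sketched roughly as follows. This is only an illustration under stated assumptions: CsvTypeGuesser, guess and guessRow are hypothetical names, not part of Jackson or any CSV library, and the precedence chosen here (boolean, then integer, then decimal, then plain text) is just one of many possible heuristics.

```java
import java.math.BigDecimal;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Jackson or any CSV library.
final class CsvTypeGuesser {

    private CsvTypeGuesser() {}

    /** Guesses a typed value for one CSV cell; falls back to the original string. */
    static Object guess(String raw) {
        if (raw == null) {
            return null;
        }
        String s = raw.trim();
        if (s.equalsIgnoreCase("true") || s.equalsIgnoreCase("false")) {
            return Boolean.parseBoolean(s);
        }
        try {
            // Note the ambiguity: an identifier such as "007" also becomes the number 7.
            return Long.parseLong(s);
        } catch (NumberFormatException ignored) {
            // Not an integer that fits in a long; try a decimal next.
        }
        try {
            return new BigDecimal(s);
        } catch (NumberFormatException ignored) {
            // Not numeric at all; keep the cell as text.
        }
        return raw;
    }

    /** Applies guess() to every cell of a row, preserving the column order of the input map. */
    static Map<String, Object> guessRow(Map<String, String> row) {
        Map<String, Object> typed = new LinkedHashMap<>();
        for (Map.Entry<String, String> cell : row.entrySet()) {
            typed.put(cell.getKey(), guess(cell.getValue()));
        }
        return typed;
    }
}
```

If a JsonNode is really what you want at the end, a map produced by guessRow could presumably be passed to Jackson's ObjectMapper.valueToTree(...), which builds a tree from ordinary Java values. The guessing itself stays in your own code, which is exactly where the ambiguity has to be resolved.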

Is it possible to parse the CSV rows into JsonNodes having the correct value types ... using an existing parser?

The answer is almost certainly No. A well-designed [2] general purpose CSV parser won't emit JSON data structures, and a well-designed [2] general purpose JSON parser won't accept CSV format input.


Reading between the lines, though, I suspect that you are using JsonNode just because it is a convenient way of representing loosely typed information. Strings work just as well as a pure representation, and you can implement the conversion to more strongly typed representations using a simple custom utility or wrapper class.
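
A wrapper of that kind might look something like the sketch below. CsvRow and its accessor names are hypothetical, chosen only for illustration. The row keeps its cells as plain Strings and converts them on demand, so the caller, who knows what each column is supposed to contain, picks the type instead of a heuristic.

```java
import java.util.Map;

// Hypothetical wrapper for illustration; the class and method names are assumptions.
final class CsvRow {
    private final Map<String, String> cells;

    CsvRow(Map<String, String> cells) {
        this.cells = cells;
    }

    /** Raw text of a column, or null if the column is absent. */
    String text(String column) {
        return cells.get(column);
    }

    /** Parses the column as a long; throws NumberFormatException if it is not one. */
    long asLong(String column) {
        return Long.parseLong(require(column).trim());
    }

    /** Parses the column as a double; throws NumberFormatException if it is not one. */
    double asDouble(String column) {
        return Double.parseDouble(require(column).trim());
    }

    /** Accepts only "true" or "false" (case-insensitive); anything else is rejected. */
    boolean asBoolean(String column) {
        String s = require(column).trim();
        if (s.equalsIgnoreCase("true")) {
            return true;
        }
        if (s.equalsIgnoreCase("false")) {
            return false;
        }
        throw new IllegalArgumentException("Not a boolean in column '" + column + "': " + s);
    }

    private String require(String column) {
        String value = cells.get(column);
        if (value == null) {
            throw new IllegalArgumentException("Missing column: " + column);
        }
        return value;
    }
}
```

With the List<Map<String,String>> you already have, each map can be wrapped as new CsvRow(map) and read with calls such as row.asLong("id"). A bad cell then fails loudly at the point of use rather than being silently treated as text.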


[1] Herein lies a problem. CSV is essentially typeless, so you need a heuristic to tell you whether a value is a boolean, an integer, a floating-point number, or a string. But the conversion is ambiguous, and the lines in a CSV file can be inconsistent. So your conversion to JsonNode objects is liable to be unreliable if you don't have a "schema" for the CSV file.

[2] In both cases, this would violate the Separation of Concerns (SoC) design principle. This does not mean that such an API would always be wrong. Sometimes it is appropriate to ignore design principles in specific circumstances for pragmatic reasons. However, such a design would not be general purpose.