I'm trying to build a parser that takes in a fixed-length file with multiple newline-separated records (each of which has a variable number of repeating segments) and parses each record to a POJO. It should then write the POJO to a JSON document and also insert it as a document into a MongoDB collection (one collection per fixed-length file; my initial thought is that I can just import the generated JSON into the db, but I'm not sure whether that's more or less efficient).
However, the parser should also be able to take some kind of CSV-style mapping file that defines the field names and lengths of the fixed-length file. Effectively, this should allow the parser to handle any fixed-length file, given a matching mapping file.
My thoughts so far:
I'm thinking of using Apache Camel to handle unmarshalling the data from fixed-length records to POJOs (via the BeanIO component), as well as marshalling the POJOs to JSON.
Parse the CSV to get the field names coupled with their lengths, then find some way to define a POJO and a JSON schema from that information (for simplicity's sake, I'm assuming I can also pull each field's data type from this CSV).
What I need help with:
Is there a way to generate POJO/class definitions from the data I can pull from the CSV?
Is it also possible to generate some kind of JSON schema from the CSV to marshal the POJO to?
Thanks. Might have more questions as I think about this, but this is all I have for now.
The BeanIO data format in Camel will help you write a mapping between the records in your fixed-length file and a POJO. There is no way to generate the POJO classes from the CSV file automatically; you will have to do that manually.
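For reference, a minimal BeanIO mapping for a fixed-length stream looks something like the sketch below. The stream name, class, and field names/lengths are placeholders; in your case they would come from whatever your CSV mapping file defines.

    <beanio xmlns="http://www.beanio.org/2012/03">
      <stream name="fixedLengthStream" format="fixedlength">
        <!-- one <record> element per record layout; class is the POJO you wrote by hand -->
        <record name="person" class="com.example.Person">
          <field name="firstName" length="10"/>
          <field name="lastName" length="10"/>
          <field name="age" length="3" type="int"/>
        </record>
      </stream>
    </beanio>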
Once you create your mapping file, you can process your files as follows:
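(A sketch: the mapping file name mappings.xml, the stream name fixedLengthStream, and the file endpoints below are assumptions; adapt them to your setup.)

    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.dataformat.beanio.BeanIODataFormat;
    import org.apache.camel.model.dataformat.JsonLibrary;

    public class FixedLengthRoute extends RouteBuilder {
        @Override
        public void configure() throws Exception {
            // mapping file and stream name must match your BeanIO mapping
            BeanIODataFormat beanio =
                new BeanIODataFormat("mappings.xml", "fixedLengthStream");

            from("file:inbox")                        // pick up fixed-length files
                .unmarshal(beanio)                    // fixed-length records -> POJOs
                .marshal().json(JsonLibrary.Jackson)  // POJOs -> JSON
                .to("file:outbox");                   // write out the JSON document
        }
    }

If you would rather insert straight into MongoDB instead of (or in addition to) writing JSON files, the camel-mongodb component can do the insert from the same route.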