I have a route which is supposed to read a huge XML file and then write a CSV file with a header. Each XML record needs to be transformed first, so I unmarshal it to a Java POJO and then marshal it again to write it into a CSV file.
I can't load all of the records in memory, as the file contains more than 200k records.
Issue: I am only seeing the last record in the CSV file. Not sure why the data isn't being appended to the existing file.
Any idea how to make it work? The header is required in the CSV. I don't see any other option to transform the stream and write it to CSV with headers without unmarshalling it to a POJO first. I tried BeanIO as well, but it requires me to add a header record, and I'm not sure how that can be injected into a stream.
from("{{xml.files.route}}")
.split(body().tokenizeXML("EMPLOYEE", null))
.streaming()
.unmarshal().jacksonXml(Employee.class)
.marshal(bindyDataFormat)
.to("file://C:/Files/Test/emp/csv/?fileName=test.csv")
.end();
If I try to append to the existing file, the CSV file gets a header row added for each record:
.to("file://C:/Files/Test/emp/csv/?fileName=test.csv&fileExist=append")
Your problem here is related to camel-bindy, not the file component. It expects you to marshal collections of objects rather than individual objects, so if you marshal each object individually and have @CsvRecord(generateHeaderColumns = true) on your Employee class, you'll get headers every time you marshal a single Employee object.

You could set generateHeaderColumns to false and write the header line to the file manually. One way to obtain the headers for a Bindy-annotated class is to collect the fields annotated with @DataField using org.apache.commons.lang3.reflect.FieldUtils from Apache Commons Lang and build the header string from their position, columnName and field name. I usually prefer camel-stream over the file component when I need to stream something to a file, but using the file component with append probably works just as well.
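For illustration, here is a minimal sketch of that approach. It reuses the Employee class, the endpoints and the bindyDataFormat from the question, assumes @CsvRecord(generateHeaderColumns = false) with a comma separator, and introduces a hypothetical EmployeeCsvRoute class plus an "xmlFile" exchange property so the header line can be written first and the records appended afterwards.

import java.lang.reflect.Field;
import java.util.Comparator;
import java.util.stream.Collectors;

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.dataformat.bindy.annotation.DataField;
import org.apache.camel.dataformat.bindy.csv.BindyCsvDataFormat;
import org.apache.commons.lang3.reflect.FieldUtils;

public class EmployeeCsvRoute extends RouteBuilder {

    @Override
    public void configure() {
        // Employee is assumed to carry @CsvRecord(generateHeaderColumns = false)
        // so Bindy does not emit a header for every marshalled record.
        BindyCsvDataFormat bindyDataFormat = new BindyCsvDataFormat(Employee.class);
        String header = buildHeader(Employee.class, ",");

        from("{{xml.files.route}}")
            // stash the incoming XML file, write the header line once
            // (default fileExist=Override starts a fresh file),
            // then restore the body and stream the records
            .setProperty("xmlFile", body())
            .setBody(constant(header + System.lineSeparator()))
            .to("file://C:/Files/Test/emp/csv/?fileName=test.csv")
            .setBody(exchangeProperty("xmlFile"))
            .split(body().tokenizeXML("EMPLOYEE", null)).streaming()
                .unmarshal().jacksonXml(Employee.class)
                .marshal(bindyDataFormat)
                .to("file://C:/Files/Test/emp/csv/?fileName=test.csv&fileExist=Append")
            .end();
    }

    // Builds the CSV header from the @DataField annotations, ordered by pos(),
    // preferring columnName over the raw field name when one is declared.
    private static String buildHeader(Class<?> recordClass, String separator) {
        return FieldUtils.getFieldsListWithAnnotation(recordClass, DataField.class).stream()
            .sorted(Comparator.comparingInt((Field f) -> f.getAnnotation(DataField.class).pos()))
            .map(f -> f.getAnnotation(DataField.class).columnName().isEmpty()
                    ? f.getName()
                    : f.getAnnotation(DataField.class).columnName())
            .collect(Collectors.joining(separator));
    }
}

With the header written once up front and fileExist=Append on the record writes, each marshalled Employee row lands below the header instead of overwriting the file.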
Example: