I have a problem parsing an XML file. I want to clean up values that contain only whitespace characters such as \t and \n.
XML File: http://ftp.thinkimmo.com/home/immoanzeigen24/immo.xml
{
  trim: true,
  normalize: true,
  attrValueProcessors: [cleanValue, name => name],
  valueProcessors: [cleanValue, name => name]
}
cleanValue:
const cleanValue = value => {
  return value.toString().trim().replace("\t","atest");
};
I tried cleaning it with a lot of regexes I found online, but the value always stays like the following:
"verwaltung_objekt": {
  "objektadresse_freigeben": "0",
  "verfuegbar_ab": "nachaasjkdhkjshadjkashdAbsprache",
  "bisdatum": "2016-01-15",
  "min_mietdauer": "\n\t\t\t\t",
  "max_mietdauer": "\n\t\t\t\t",
}
This is a difficult one!
I'd suggest following a simple strategy: pre-process the XML data before you parse it.
This should resolve your issue at least.
If you just do something like:
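Here is a minimal sketch of such a pre-processing step. It assumes the stubborn values come from elements whose content is nothing but line breaks and tabs. `trimXml` is a hypothetical helper name (not part of any parser library), and the sample string is made up from the field names in the question:

```javascript
// Hypothetical helper: drop whitespace-only runs that sit between two tags,
// so <min_mietdauer>\n\t\t\t\t</min_mietdauer> becomes an empty element.
// Text that contains real characters is left untouched.
const trimXml = xml => xml.replace(/>\s+</g, "><");

// Made-up sample modelled on the fields shown in the question.
const sample =
  "<verwaltung_objekt>\n" +
  "\t<bisdatum>2016-01-15</bisdatum>\n" +
  "\t<min_mietdauer>\n\t\t\t\t</min_mietdauer>\n" +
  "</verwaltung_objekt>";

console.log(trimXml(sample));
// <verwaltung_objekt><bisdatum>2016-01-15</bisdatum><min_mietdauer></min_mietdauer></verwaltung_objekt>
```

Note that this regex is deliberately blunt: it would also strip whitespace-only runs inside CDATA sections or mixed content, so check that this is acceptable for your feed.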
Then parse the trimmed XML data. Whitespace-only values such as "\n\t\t\t\t" should no longer show up in the output, which is a bit more like what you want!