xml2js valueProcessor removing \t and \n


I have a problem parsing an XML file. I want to clean up values that consist only of whitespace characters such as \t and \n.

XML File: http://ftp.thinkimmo.com/home/immoanzeigen24/immo.xml

{
  trim: true,
  normalize: true,
  attrValueProcessors: [cleanValue, name => name],
  valueProcessors: [cleanValue, name => name]
}

cleanValue:

const cleanValue = value => {
  return value.toString().trim().replace("\t","atest");
};

I tried cleaning it with a lot of regexes I found online, but the value always stays like the following:

 "verwaltung_objekt": {
      "objektadresse_freigeben": "0",
      "verfuegbar_ab": "nachaasjkdhkjshadjkashdAbsprache",
      "bisdatum": "2016-01-15",
      "min_mietdauer": "\n\t\t\t\t",
      "max_mietdauer": "\n\t\t\t\t",
}

Best answer:

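This is a difficult one!

As far as I can tell, xml2js does not run trim, normalize or valueProcessors on text that consists only of whitespace, so cleanValue never gets a chance to touch those values.

I'd suggest a simple strategy instead: pre-process the XML string before you parse it, so the whitespace-only content is already gone by the time xml2js sees it.

That should at least resolve your issue.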

If you just do something like:

function trimXml(xml) {
    return xml.replace(/>\s+</g, "><");
}

xml = trimXml(xml);
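
To see what that regex does in isolation, here is a minimal sketch that runs it on a small made-up fragment. The sample string is an assumption modelled on the field names in the question, not something copied from the real feed.

```javascript
// Same helper as above: drop whitespace-only runs between a ">" and the next "<".
function trimXml(xml) {
    return xml.replace(/>\s+</g, "><");
}

// Hypothetical fragment for illustration only (not taken from the real file).
const sample =
    "<verwaltung_objekt>\n\t" +
    "<min_mietdauer>\n\t\t\t\t</min_mietdauer>\n\t" +
    "<bisdatum>2016-01-15</bisdatum>\n" +
    "</verwaltung_objekt>";

const trimmed = trimXml(sample);
console.log(trimmed);
// <verwaltung_objekt><min_mietdauer></min_mietdauer><bisdatum>2016-01-15</bisdatum></verwaltung_objekt>
```

Real text such as the date is left alone, because the pattern only matches whitespace that sits directly between two tags. One caveat: it will also collapse whitespace between tags in mixed content or inside CDATA, which is probably fine for a data feed like this one but worth keeping in mind.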

Then parse the trimmed XML. The output should now look like this:

"verwaltung_objekt": [
    {
        "objektadresse_freigeben": [
            "1"
        ],
        "abdatum": [
            "2017-03-01"
        ],
        "min_mietdauer": [
            ""
        ],
        "max_mietdauer": [
            ""
        ]
    }
],

Which is a bit more like what you want!
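
If you'd rather not touch the raw XML, a different option is to clean the result after parsing. The sketch below is only an illustration: blankWhitespaceOnly is a hypothetical helper, not part of xml2js, and the parsed object is a hand-written stand-in for the parser output shown in the question.

```javascript
// Hypothetical helper (not an xml2js API): returns a copy of the parsed
// result in which every string made up only of whitespace becomes "".
function blankWhitespaceOnly(node) {
    if (typeof node === "string") {
        return /^\s*$/.test(node) ? "" : node;
    }
    if (Array.isArray(node)) {
        // Arrow wrapper so map's extra index/array arguments are not passed on.
        return node.map(item => blankWhitespaceOnly(item));
    }
    if (node !== null && typeof node === "object") {
        const out = {};
        for (const [key, value] of Object.entries(node)) {
            out[key] = blankWhitespaceOnly(value);
        }
        return out;
    }
    return node; // numbers, booleans, null, undefined pass through unchanged
}

// Hand-written stand-in for the parser output from the question.
const parsed = {
    verwaltung_objekt: {
        bisdatum: "2016-01-15",
        min_mietdauer: "\n\t\t\t\t",
        max_mietdauer: "\n\t\t\t\t"
    }
};

const cleaned = blankWhitespaceOnly(parsed);
console.log(JSON.stringify(cleaned));
// {"verwaltung_objekt":{"bisdatum":"2016-01-15","min_mietdauer":"","max_mietdauer":""}}
```

This walks both objects and arrays, so it should work whether or not the parser wraps values in arrays, and it leaves the original object unmodified.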