Replace & with &amp; using Jackson ObjectMapper


In the application I'm working on, we have a requirement to transform a huge JSON document into an even bigger XML document. The two structures are very different, so we decided to create an XML file matching the XSD and populate its fields using the Unified Expression Language. For example:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<Objects xmlns="..">
  <Object>
    <Field>${json.field}</Field>
  </Object>
</Objects>

The replacement of ${json.field} is done using JUEL.

After running the JUEL process, we unmarshal the XML string into the object and continue processing. The unmarshal code looks something like this:

  private XmlObjects unmarshal(StringReader xmlString) {
    try {
      Unmarshaller unmarshaller = jaxbContext.createUnmarshaller();
      return (XmlObjects) unmarshaller.unmarshal(xmlString);
    } catch (JAXBException e) {
      throw new RuntimeException(e);
    }
  }

The problem we are facing is that json.field can contain characters that are not allowed in XML, for example &, < or >.

An easy fix is to replace every & with &amp; in the above method, but that does not solve the problem for < or >, and I cannot replace those at that point.
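
To make the limitation concrete: escaping has to happen on each value before it is substituted into the template, because once the value is inside the XML string its < and > can no longer be told apart from real markup. A minimal sketch of such a per-value escape follows. The class name XmlEscaper is hypothetical, and where it would be hooked into the JUEL step is not shown.

```java
/** Hypothetical helper: escapes a single value before it is placed into XML text. */
final class XmlEscaper {
    private XmlEscaper() {}

    /** Replaces the five characters that have special meaning in XML markup. */
    static String escape(String value) {
        StringBuilder out = new StringBuilder(value.length() + 16);
        for (int i = 0; i < value.length(); i++) {
            char c = value.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // prints: a string with &amp; and &lt;tag&gt;
        System.out.println(escape("a string with & and <tag>"));
    }
}
```

Applying the same method to the whole document instead of to each value would also escape the template's own tags, which is why it cannot simply be moved into unmarshal.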

What I would like to do is use Jackson to make the replacement when the JSON is mapped into the POJO, but I cannot find a way to do it. So far I've tried creating a custom CharacterEscapes class and setting it on the ObjectMapper, but that didn't work.

So, this is the test that summarizes everything:

  @Test
  public void test() throws IOException {
    ObjectMapper objectMapper = Jackson.newObjectMapper();
    objectMapper.getFactory().setCharacterEscapes(new XMLCharacterEscapes());

    String json = "{\"variable\":\"a string with &\"}";
    FooJson fooJson = objectMapper.readValue(json, FooJson.class);
    assertEquals("a string with &amp;", fooJson.getVariable());
  }

This is the XMLCharacterEscapes class:

public class XMLCharacterEscapes extends CharacterEscapes {

  private final int[] asciiEscapes;

  public XMLCharacterEscapes() {
    int[] esc = CharacterEscapes.standardAsciiEscapesForJSON();
    esc['&'] = CharacterEscapes.ESCAPE_CUSTOM;
    asciiEscapes = esc;
  }

  @Override
  public int[] getEscapeCodesForAscii() {
    return asciiEscapes;
  }

  @Override
  public SerializableString getEscapeSequence(int i) {
    return new SerializedString("&amp;");
  }
}

There are 2 best solutions below


Enclose your field in a CDATA section

<Field><![CDATA[${json.field}]]></Field>
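
One caveat worth sketching: a CDATA section ends at the first "]]>", so a value containing that sequence would still break the document. A common workaround is to split the sequence across two adjacent CDATA sections. The sketch below assumes the value is substituted into an already opened CDATA section, as in the template line above. The names CdataValues, escapeForCdata and parseFieldText are hypothetical.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

/** Hypothetical helper for values that are substituted inside a CDATA section. */
final class CdataValues {
    private CdataValues() {}

    /** Splits every "]]>" so that it cannot terminate the enclosing CDATA section early. */
    static String escapeForCdata(String value) {
        return value.replace("]]>", "]]]]><![CDATA[>");
    }

    /** Parses the XML and returns the text content of its root element (for checking only). */
    static String parseFieldText(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML could not be parsed", e);
        }
    }

    public static void main(String[] args) {
        String value = "a & b <c> ]]> d";
        String xml = "<Field><![CDATA[" + escapeForCdata(value) + "]]></Field>";
        // The parser joins the adjacent CDATA sections back into the original value.
        System.out.println(parseFieldText(xml).equals(value));
    }
}
```

Values without "]]>" pass through unchanged, so the helper is only a safety net for that one sequence.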


If I may suggest an alternative, could you instead have your unmarshaller apply JUEL to the element values as it reads the XML? Or, could you walk the XML object graph after unmarshalling and apply JUEL to the element values?

Edit: It seems like the problem is just the order in which you apply the replacements. Once you have a parsed XML document, you should be able to set any values you like, and the XML API will take care of proper escaping.
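
A rough sketch of that ordering, using only the JDK's DOM and Transformer APIs: the template is parsed first, while it is still well-formed, and the placeholders are then replaced inside the text nodes. The class name TemplateFiller and the plain Map lookup are stand-ins for illustration. In the real code the lookup would be the JUEL evaluation, and the filled document could be handed to the unmarshaller directly instead of being serialized.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: parse the template first, then fill placeholders in text nodes. */
final class TemplateFiller {
    private TemplateFiller() {}

    static String fill(String templateXml, Map<String, String> values) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(templateXml)));
            replaceInTextNodes(doc.getDocumentElement(), values);

            // Serializing the DOM escapes markup characters in text nodes.
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            StringWriter out = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(out));
            return out.toString();
        } catch (Exception e) {
            throw new IllegalStateException("Template could not be processed", e);
        }
    }

    /** Walks the tree and substitutes ${key} placeholders in every text node. */
    private static void replaceInTextNodes(Node node, Map<String, String> values) {
        if (node.getNodeType() == Node.TEXT_NODE) {
            String text = node.getNodeValue();
            for (Map.Entry<String, String> e : values.entrySet()) {
                text = text.replace("${" + e.getKey() + "}", e.getValue());
            }
            node.setNodeValue(text); // raw value; no manual escaping needed here
            return;
        }
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            replaceInTextNodes(children.item(i), values);
        }
    }

    public static void main(String[] args) {
        String template = "<Objects><Object><Field>${json.field}</Field></Object></Objects>";
        System.out.println(fill(template, Map.of("json.field", "a & b < c")));
    }
}
```

Because the value only ever exists as node text, it never has to be valid XML on its own.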