How to convert WebAnno Named Entity annotations for use in OpenNLP?


Based on this issue, I need to export in XMI format and use DKPro Core to convert to brat format:

https://github.com/webanno/webanno/issues/328

I tried this code, but without success:

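import org.apache.uima.fit.factory.AnalysisEngineFactory;
import org.apache.uima.fit.factory.CollectionReaderFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import de.tudarmstadt.ukp.dkpro.core.io.brat.BratWriter;
import de.tudarmstadt.ukp.dkpro.core.io.xmi.XmiReader;

public void convert() throws Exception {
    // Read every *.xmi file in /tmp and write the brat output back to /tmp.
    SimplePipeline.runPipeline(
            CollectionReaderFactory.createReaderDescription(XmiReader.class,
                    XmiReader.PARAM_SOURCE_LOCATION, "/tmp",
                    XmiReader.PARAM_PATTERNS, XmiReader.INCLUDE_PREFIX + "*.xmi"),
            AnalysisEngineFactory.createEngineDescription(BratWriter.class,
                    BratWriter.PARAM_TARGET_LOCATION, "/tmp"));
}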

Best answer

The brat dialect that the DKPro Core BratWriter produces may differ from the one OpenNLP expects. The brat file format is quite flexible, so a writer and a reader can both be valid brat and still disagree on details.
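
To check whether the dialects match, it can help to look at what actually ends up in the generated .ann files. The following is only a minimal sketch using plain Java, not part of DKPro Core or OpenNLP: it reads the text-bound annotation lines of the brat standoff format (ID, tab, type and character offsets, tab, covered text) so the entity type names and spans can be compared with what the consuming tool expects. The class name BratAnnSketch and the sample types PER and LOC are made up for illustration; the type names that BratWriter really emits are not confirmed here.

```java
import java.util.ArrayList;
import java.util.List;

public class BratAnnSketch {

    /** One text-bound annotation (a "T" line) from a brat .ann file. */
    static final class Entity {
        final String id;
        final String type;
        final int begin;
        final int end;
        final String text;

        Entity(String id, String type, int begin, int end, String text) {
            this.id = id;
            this.type = type;
            this.begin = begin;
            this.end = end;
            this.text = text;
        }
    }

    /** Parses the "T" lines of a .ann file; all other line kinds are skipped. */
    static List<Entity> parse(String annContent) {
        List<Entity> entities = new ArrayList<>();
        for (String line : annContent.split("\\R")) {
            if (!line.startsWith("T")) {
                continue; // relations, events, attributes, notes, ...
            }
            String[] cols = line.split("\t", 3);
            if (cols.length < 3) {
                continue; // not a complete text-bound annotation
            }
            // Second column: type name, then one or more "begin end" fragments.
            String[] typeAndOffsets = cols[1].split(" ", 2);
            if (typeAndOffsets.length < 2) {
                continue;
            }
            // Discontinuous spans look like "0 5;10 15"; this sketch keeps
            // only the first begin and the last end.
            String[] fragments = typeAndOffsets[1].split(";");
            String[] first = fragments[0].trim().split(" ");
            String[] last = fragments[fragments.length - 1].trim().split(" ");
            entities.add(new Entity(cols[0], typeAndOffsets[0],
                    Integer.parseInt(first[0]), Integer.parseInt(last[1]), cols[2]));
        }
        return entities;
    }

    public static void main(String[] args) {
        // Hypothetical annotations over the text "John lives in Berlin".
        String sample = "T1\tPER 0 4\tJohn\n"
                + "T2\tLOC 14 20\tBerlin\n"
                + "R1\tlivesIn Arg1:T1 Arg2:T2\n";
        for (Entity e : parse(sample)) {
            System.out.println(e.id + " " + e.type + " [" + e.begin + "," + e.end + ") " + e.text);
        }
    }
}
```

Running it on a file written by BratWriter and on a file known to work with OpenNLP should make differences in type names or span encoding visible.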

If you are using the built-in Named Entity layer in WebAnno, then I would propose an alternative route:

  • Stay with the XMI export
  • Load the XMI with DKPro Core 1.9.0-SNAPSHOT and feed it to the OpenNlpNamedEntityRecognizerTrainer component

That should avoid the need for the additional conversion step.

Disclosure: I am a WebAnno and DKPro Core developer.

Suggestions that didn't work: