Scala XML pull parser and location

I have been asked to write a utility to search a large number of XML files for elements with a missing attribute. The department responsible cannot just make the attribute mandatory in the DTD because it will break hundreds of files. They want to edit them manually over a period of days/weeks.

I am writing a small command-line tool in Scala 2.8.1. I want to use a "pull" parser so that I can keep my code purely functional and run it multi-threaded.

I need the location of XML events. The API provided in Java 6 (javax.xml.stream.XMLStreamReader) has a method, getLocation(), that returns a Location object carrying the line number of the current event. I can use this to write messages that tell the user where to look for the missing attribute.
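For illustration, here is a minimal sketch of that StAX approach driven from Scala. The names MissingAttributeFinder, Hit and findMissing are hypothetical, as are the sample element "item" and attribute "id"; only the javax.xml.stream calls come from the Java API. It is written with a plain while loop for brevity, so it is not a purely functional version.

```scala
import java.io.StringReader
import javax.xml.stream.{XMLInputFactory, XMLStreamConstants, XMLStreamReader}
import scala.collection.mutable.ListBuffer

object MissingAttributeFinder {
  // Hypothetical result type: the element name and the position reported for it.
  case class Hit(element: String, line: Int, column: Int)

  // Collects every start tag named `element` that lacks `attribute`.
  def findMissing(reader: XMLStreamReader, element: String, attribute: String): List[Hit] = {
    val hits = ListBuffer[Hit]()
    while (reader.hasNext) {
      if (reader.next() == XMLStreamConstants.START_ELEMENT &&
          reader.getLocalName == element &&
          // A null namespace URI means the namespace is not checked.
          reader.getAttributeValue(null, attribute) == null) {
        // Location is documented as the point where the current event ends,
        // and it is only valid until the next call to next().
        val loc = reader.getLocation
        hits += Hit(element, loc.getLineNumber, loc.getColumnNumber)
      }
    }
    hits.toList
  }

  def main(args: Array[String]): Unit = {
    // Sample input (an assumption for the demo): the second <item> has no id.
    val xml = "<doc>\n  <item id=\"1\"/>\n  <item/>\n</doc>"
    val reader = XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml))
    try {
      findMissing(reader, "item", "id").foreach { h =>
        println(h.element + " is missing 'id' near line " + h.line)
      }
    } finally {
      reader.close()
    }
  }
}
```

Because each file gets its own XMLStreamReader, files can be processed on separate threads without sharing parser state.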

I would prefer to use the pull parser in scala.xml.pull.XMLEventReader, but it does not appear to offer location information.

Am I missing something? Is it somewhere else in the Scala API?

Best answer

As far as I can see, this is not provided by the XMLEventReader API.

It relies on io.Source and so could expose the location, but it does not. I don't see an easy way to work around that, because the object that has access to the position is private.

You may want to make your own copy of XMLEventReader that produces a custom XMLEvent carrying the position. The method to modify would be override def elemStart, which has access to the position and could emit an additional event, say EvPos(line: Int, column: Int), after each EvElemStart.

You may also consider using Scala 2.9.RC1, as certain performance-related bugs were fixed in it.