Running EPubCheck inside Nailgun or Drip


The epubcheck.jar tool from the IDPF/W3C is expanding to cover the new possibilities in ePub. As a result, the number of libraries required to check everything is increasing, which lengthens the run time for checking a single ePub file. epubcheck v3 took about 3 s per ePub, while epubcheck v4 takes up to 6.5 s. There are about twice as many underlying libraries to load.

So I have been looking into ways to keep epubcheck running, so that the JVM does not have to start up and reload every library for each file. (We sometimes have to check hundreds of ePubs at a time.)

Possible solutions to reduce the library load overhead and JVM startup time are Drip or Nailgun. To load the libraries and call epubcheck from the command line with either tool, however, all of the jar files have to be on the classpath, and the class com.adobe.epubcheck.tool.Checker must be called explicitly.

Using both Drip and Nailgun, I get the same SAXParseException error:

org.xml.sax.SAXParseException; systemId: jar:file:/app-lib/epubcheck-4.0.2/epubcheck.jar!/com/adobe/epubcheck/schema/20/rng/container.rng; lineNumber: 4; columnNumber: 71; root element of schema must have a namespace

This happens with a file that validates just fine when the JAR is run directly on the command line:

java -jar /app-lib/epubcheck-4.0.2/epubcheck.jar FILE.epub

I'm at a loss as to what the issue might be, especially as Java isn't my strong suit.

Turns out that Red Hat has its own version of the SAX parser library that was stepping on the toes of the one in epubcheck. I was starting both Drip and Nailgun with the system libraries included in the classpath.

Starting Nailgun (or Drip) without the shared system libraries removed the SAXParseException error.
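One way to look for this kind of conflict is to list the jars in a shared directory that register their own JAXP SAX parser. The following is only a rough sketch: the function name find_sax_providers is made up for illustration, it assumes unzip is installed, and it only catches jars that declare a parser through the standard META-INF/services/javax.xml.parsers.SAXParserFactory entry, so a conflicting library packaged differently would not show up.

```shell
# Hypothetical helper (not part of epubcheck, Nailgun or Drip):
# print every jar in the given directory that registers a JAXP SAX parser.
find_sax_providers() {
    for jar in "$1"/*.jar; do
        [ -e "$jar" ] || continue   # the glob matched nothing
        # A jar that ships its own SAX parser usually declares it in this service file.
        if unzip -l "$jar" 2>/dev/null \
            | grep -q 'META-INF/services/javax.xml.parsers.SAXParserFactory'; then
            echo "$jar"
        fi
    done
}
```

For example, find_sax_providers /usr/share/java would list candidate jars in the shared directory. Any jar it prints is a candidate to keep off the classpath, not proof that it is the culprit.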

For Nailgun:

java -cp "/usr/share/java/nailgun-server-0.9.3-SNAPSHOT.jar:/app-lib/epubcheck-4.0.2/*:/app-lib/epubcheck-4.0.2/lib/*" com.martiansoftware.nailgun.NGServer 127.0.0.1

then, for the NG client:

ng com.adobe.epubcheck.tool.Checker FILE.epub

For Drip:

drip -cp "/app-lib/epubcheck-4.0.2/*:/app-lib/epubcheck-4.0.2/lib/*" com.adobe.epubcheck.tool.Checker FILE.epub

For what it's worth, Drip doesn't actually do what I needed: the reserve JVM spins up with the Java settings and classpath predefined, but running the class still causes the libraries to be loaded from scratch. Its speed is exactly the same as the plain "java" command.

Nailgun takes the full time on the first run (5-7 seconds), then is much faster (0.8-1.3 seconds) on subsequent runs.
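Since the point was to check hundreds of ePubs at a time, a loop over the ng client is the natural next step. This is a minimal sketch rather than a tested production script: the function name check_all is made up, it assumes the NGServer shown above is already running and ng is on the PATH, and it assumes the ng client passes epubcheck's exit status through (non-zero when validation fails).

```shell
# Hypothetical helper: validate every .epub in a directory through a running Nailgun server.
check_all() {
    dir=$1
    failed=0
    for f in "$dir"/*.epub; do
        [ -e "$f" ] || continue   # the glob matched nothing
        # Assumption: ng returns epubcheck's exit status, non-zero on failure.
        if ! ng com.adobe.epubcheck.tool.Checker "$f"; then
            echo "FAILED: $f" >&2
            failed=$((failed + 1))
        fi
    done
    echo "$failed file(s) failed validation"
    [ "$failed" -eq 0 ]   # overall status: success only if nothing failed
}
```

Called as check_all /path/to/epubs (a placeholder path), only the first file should pay the full library-loading cost. The rest run against the already warm JVM.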