I am facing a problem when building an application that uses PDFBox. The application can read books with JBIG2 images when I run it from the IDE (NetBeans 8.1); the Maven dependencies for JBIG2 are in pom.xml. The problem appears when I build the application as a fat jar. When I run the fat jar with the same input PDF, it gives the following error:
“Cannot read JBIG2 image: jbig2-imageio is not installed”
The threads that discuss this error do not seem to solve my problem: they say that a Maven dependency has to be added to the pom, but that dependency is already in my pom.
I have also checked that the jbig2 library classes are inside the fat jar, so I have no idea what is happening.
I have isolated the problem in a tiny application that looks like this:
    public static void main( String[] args )
    {
        String fileName = null;
        if( args.length == 0 )
        {
            fileName = "test.pdf";
        }
        else
        {
            fileName = args[0];
        }
        PdfDocumentWrapper doc = null;
        try
        {
            PdfboxFactory factory = new PdfboxFactory();
            doc = factory.createPdfDocumentWrapper();
            doc.loadPdf( fileName );
            for( int ii = 0; ii < doc.getNumberOfPages(); ii++ )
            {
                int pageNum = ii+1;
                System.out.println("\n\nProcessing page: " + pageNum +"\n---------------------------------");
                List<ImageWrapper> imageList = doc.getImagesOfPage(ii);
                int jj=0;
                for( ImageWrapper image: imageList )
                {
                    jj++;
                    System.out.println(String.format(" Page[%d]. Image[%d] -> bounds: %s",
                        pageNum, jj, image.getBounds().toString() ) );
                }
            }
        }
        catch( Exception ex )
        {
            ex.printStackTrace();
        }
        finally
        {
            if( doc != null )
            {
                try
                {
                    doc.close();
                }
                catch( Exception ex )
                {
                    ex.printStackTrace();
                }
            }
        }
    }
I have placed the whole isolated example project here (to help solve the issue): http://www.frojasg1.com/20200504.PdfImageExtractor.zip
When I run that application from the IDE, it produces the following output:
Processing page: 1
---------------------------------
Page[1]. Image[1] -> bounds: java.awt.Rectangle[x=17,y=33,width=442,height=116]
Page[1]. Image[2] -> bounds: java.awt.Rectangle[x=53,y=513,width=376,height=124]
Page[1]. Image[3] -> bounds: java.awt.Rectangle[x=101,y=250,width=285,height=5]
------------------------------------------------------------------------
When I run the application from the command line, it gives the following output:
$ java -jar ./PdfImageExtractor-v1.0-SNAPSHOT-all.jar
Processing page: 1
---------------------------------
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
may 04, 2020 3:40:18 PM org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
GRAVE: Cannot read JBIG2 image: jbig2-imageio is not installed
Does anybody know why the fat jar is not able to read JBIG2 images?
I posted the same question on the PDFBox users mailing list, and here is the answer:
And the solution:
Thank you very much, it was that!
I was able to generate the required META-INF files in the fat jar by merging the ones from the ImageIO jars, which took a few extra lines in pom.xml.
Problem solved.
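For anyone hitting the same error, here is a minimal sketch of how one might check from inside the running jar whether the ImageIO plugin registrations survived packaging. It assumes the relevant service file is META-INF/services/javax.imageio.spi.ImageReaderSpi (the standard service-provider location for ImageIO readers) and that the plugin registers the format name "JBIG2". The class name ImageIoSpiCheck and its helper methods are made up for illustration and are not part of PDFBox or my project.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical helper: reports which ImageIO reader registrations are visible.
public class ImageIoSpiCheck {

    // Assumed location of the reader registrations inside each plugin jar.
    static final String SPI_RESOURCE =
            "META-INF/services/javax.imageio.spi.ImageReaderSpi";

    // Lists every copy of the service file visible to the given class loader.
    // In a fat jar there is only one such entry, so it has to contain the
    // merged contents of all the plugin jars.
    static List<URL> findServiceFiles(ClassLoader loader) {
        try {
            return Collections.list(loader.getResources(SPI_RESOURCE));
        } catch (IOException ex) {
            throw new UncheckedIOException(ex);
        }
    }

    // Returns true if ImageIO can find at least one reader for the format name.
    static boolean hasReaderFor(String formatName) {
        ImageIO.scanForPlugins(); // pick up plugins on the application class path
        Iterator<ImageReader> readers =
                ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        ClassLoader loader = ImageIoSpiCheck.class.getClassLoader();
        for (URL url : findServiceFiles(loader)) {
            System.out.println("service file: " + url);
        }
        System.out.println("JBIG2 reader available: " + hasReaderFor("JBIG2"));
    }
}
```

Running this both from the IDE and from the fat jar should show the difference: if the JBIG2 line prints false only in the fat jar, the service file was probably overwritten rather than merged during packaging.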