How can I find out in Java if a PDF file contains a JBIG2 image?

I am using Apache PDFBox to read a PDF file and convert it into a JPEG image.

import java.io.ByteArrayInputStream;
import java.awt.image.BufferedImage;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
...

byte[] fileBytes;
...
BufferedImage image;
// try-with-resources closes the document even if rendering throws
try (PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(fileBytes))) {
    image = new PDFRenderer(pdDocument).renderImage(0); // render the first page
}

Sometimes the PDF document contains JBIG2 images. I am using the JBIG2 ImageIO Plugin for PDFBox to process such documents correctly, and this works fine. However, after the conversion I would like to know whether the original PDF document contained a JBIG2 image or not.

I checked the PDDocument Javadoc, but I cannot figure out a way to answer this seemingly simple question: Does a given PDF document contain at least one JBIG2 image or not?

Since I am already using PDFBox, a solution with the means of PDFBox would be preferred, but other suggestions would also be highly appreciated.

1 Answer

Get ExtractImages.java from the PDFBox source code download. In its write2file method, find this line:

String suffix = pdImage.getSuffix();

Now add something like this right after it:

if ("jb2".equals(suffix))
{
    // do your stuff here, e.g. remember that the document contains a JBIG2 image
}

Then remove everything else from that method, since you only need the check and not the image files that ExtractImages would otherwise write.
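If adapting ExtractImages is more than you need, a cruder check that does not depend on PDFBox at all is to look for the filter name /JBIG2Decode in the raw bytes you already hold. This is only a sketch of a heuristic, not a parser: the class name Jbig2Sniffer and the method mentionsJbig2 are made up for illustration, it can report a false positive when the name merely appears somewhere in the file (for example in an image that no page uses), and it can miss unusual files that write the name with #-escapes or reference the filter indirectly. Treat the result as a hint; the getSuffix() check on the actual images is the more reliable test.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper (not part of PDFBox): scans raw PDF bytes for the
// JBIG2 filter name. A heuristic only; it does not parse the PDF.
final class Jbig2Sniffer {

    // Name of the JBIG2 stream filter as it is written in a PDF stream dictionary.
    private static final byte[] MARKER =
            "/JBIG2Decode".getBytes(StandardCharsets.US_ASCII);

    private Jbig2Sniffer() {
    }

    /** Returns true if the byte sequence "/JBIG2Decode" occurs anywhere in pdfBytes. */
    static boolean mentionsJbig2(byte[] pdfBytes) {
        outer:
        for (int i = 0; i <= pdfBytes.length - MARKER.length; i++) {
            for (int j = 0; j < MARKER.length; j++) {
                if (pdfBytes[i + j] != MARKER[j]) {
                    continue outer; // mismatch, try the next start position
                }
            }
            return true; // all marker bytes matched at position i
        }
        return false;
    }
}
```

With the variables from the question this would be called as boolean hadJbig2 = Jbig2Sniffer.mentionsJbig2(fileBytes); either before or after rendering, since it never touches the PDDocument.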