I am using Apache PDFBox to read a PDF file and convert it into a JPEG image.
import java.io.ByteArrayInputStream;
import java.awt.image.BufferedImage;
import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.rendering.PDFRenderer;
...
byte[] fileBytes;
...
PDDocument pdDocument = PDDocument.load(new ByteArrayInputStream(fileBytes));
BufferedImage image = new PDFRenderer(pdDocument).renderImage(0);
pdDocument.close();
Sometimes the PDF document contains JBIG2 images. I am using the JBIG2 ImageIO Plugin for PDFBox to correctly process such PDF documents. This works fine. But after the conversion I would like to know whether the original PDF document contained a JBIG2 image or not.
I checked the PDDocument Javadoc, but I cannot figure out a way to answer this seemingly simple question: Does a given PDF document contain at least one JBIG2 image or not?
Since I am already using PDFBox, a solution that uses PDFBox itself would be preferred, but other suggestions would also be highly appreciated.
Get the source code of ExtractImages.java from the source code download or from here. Search for this line in the write2file method:
Now add some code like
Then remove all the rest from that method.
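Separately from the ExtractImages change, a rough check that needs no PDFBox API at all is to scan the raw file bytes for the filter name /JBIG2Decode. This is a different, heuristic technique and only a sketch. Stream objects cannot be stored inside compressed object streams, so an image's filter entry is normally readable in the file body, but the scan assumes the name is written literally (no # escapes) and it can report a false positive if the same bytes happen to occur elsewhere in the file. The names Jbig2Sniffer and mentionsJbig2 are made up for illustration.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper, not part of PDFBox: a heuristic byte scan.
final class Jbig2Sniffer {

    // The PDF filter name that marks a JBIG2-compressed stream.
    private static final byte[] MARKER =
            "/JBIG2Decode".getBytes(StandardCharsets.US_ASCII);

    // Returns true if the byte sequence "/JBIG2Decode" occurs anywhere in the data.
    static boolean mentionsJbig2(byte[] pdfBytes) {
        outer:
        for (int i = 0; i <= pdfBytes.length - MARKER.length; i++) {
            for (int j = 0; j < MARKER.length; j++) {
                if (pdfBytes[i + j] != MARKER[j]) {
                    continue outer; // mismatch, try the next start offset
                }
            }
            return true; // all marker bytes matched at offset i
        }
        return false;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.out.println("usage: java Jbig2Sniffer <file.pdf>");
            return;
        }
        byte[] fileBytes = Files.readAllBytes(Paths.get(args[0]));
        System.out.println(mentionsJbig2(fileBytes));
    }
}
```

With the code from the question, the call would be Jbig2Sniffer.mentionsJbig2(fileBytes), made before or after rendering, since it only reads the byte array.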