I am trying to extract all the images in a PDF file using PDFBox. It works fine for PDFs containing JPEG and PNG images, but not for OpenJPEG2000 images. I get the error below:
org.apache.pdfbox.contentstream.PDFStreamEngine operatorException
SEVERE: Cannot read JPEG2000 image: Java Advanced Imaging (JAI) Image I/O Tools are not installed
The same exception occurs with every version of PDFBox I tried, including the standalone jar.
I have also included the necessary dependencies in pom.xml:
<dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>jbig2-imageio</artifactId>
</dependency>
<!-- For legal reasons (incompatible license), these two dependencies
     are to be used only in the tests and may not be distributed. -->
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-core</artifactId>
</dependency>
<dependency>
    <groupId>com.github.jai-imageio</groupId>
    <artifactId>jai-imageio-jpeg2000</artifactId>
</dependency>
Any help will be appreciated.
Copy the imaging-related .jar files into the lib subdirectory, and then use this command line:
Use ";" as the classpath separator on Windows and ":" on Linux.
org.apache.pdfbox.tools.PDFBox is the name of the main class.
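To confirm that the extra jars really ended up on the classpath, it can help to ask ImageIO directly whether a JPEG2000 reader is registered. The sketch below is only a diagnostic, not part of PDFBox. The class name Jpeg2000ReaderCheck and the helper hasReader are made up for illustration, and it assumes the jai-imageio plugin registers itself under the format name "JPEG2000".

```java
import java.util.Iterator;
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;

// Hypothetical diagnostic class; the name is an assumption, not a PDFBox API.
class Jpeg2000ReaderCheck {

    // Returns true if at least one ImageIO reader is registered for the format name.
    static boolean hasReader(String formatName) {
        // Pick up plugins from jars that are on the current classpath.
        ImageIO.scanForPlugins();
        Iterator<ImageReader> readers = ImageIO.getImageReadersByFormatName(formatName);
        return readers.hasNext();
    }

    public static void main(String[] args) {
        // "JPEG2000" is assumed to be the name the jai-imageio-jpeg2000 plugin registers.
        String format = "JPEG2000";
        if (hasReader(format)) {
            System.out.println("A reader for " + format + " is available.");
        } else {
            System.out.println("No reader for " + format + " found; check the classpath.");
        }
    }
}
```

Run it with the same classpath you use for PDFBox. If it reports that no reader was found, the plugin jars are not being picked up, which would match the "not installed" message in the log.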