I am trying to read the contents of a PDF file using Java-Selenium. Below is my code. getWebDriver is a custom method in the framework. It returns the webdriver.
URL urlOfPdf = new URL(this.getWebDriver().getCurrentUrl());
BufferedInputStream fileToParse = new BufferedInputStream(urlOfPdf.openStream());
PDFParser parser = new PDFParser((RandomAccessRead) fileToParse);
parser.parse();
String output = new PDFTextStripper().getText(parser.getPDDocument());
The second line of the code gives compile time error if I don't parse it to RandomAccessRead
type.
And when I parse it, I get this run time error:
java.lang.ClassCastException: java.io.BufferedInputStream cannot be cast to org.apache.pdfbox.io.RandomAccessRead
I need help with getting rid of these errors.
First of, unless you want to interfere in the PDF loading process, there is no need to explicitly use the
PdfParser
class. You can instead use a staticPDDocument.load
method:Otherwise, if you do want to interfere in the loading process, you have to create a
RandomAccessRead
instance for yourBufferedInputStream
, you cannot simply cast it because the classes are not related.You can do that like this
(This essentially is copied and pasted from the source of
PDDocument.load
.)