Python does not print PDF text with PyPDF2


I tried to print pages of a pdf document:

import PyPDF2
FILE_PATH = 'my.pdf'
with open(FILE_PATH, mode='rb') as f:
    reader = PyPDF2.PdfFileReader(f)
    page = reader.getPage(0)  # I also tried other pages, e.g. 1, 2, ...
    print(page.extractText())

But I only get a lot of blank space and no error message. Could it be that this pdf version (my.pdf) is not supported by PyPDF2?

This solved it (prints all pages of the document). Thanks

from pdfreader import SimplePDFViewer

with open("my.pdf", "rb") as fd:
    viewer = SimplePDFViewer(fd)
    # Pages are numbered from 1, so the range must run from 1 to (number of pages + 1)
    for i in range(1, 16):
        viewer.navigate(i)
        viewer.render()
        page_text = "".join(viewer.canvas.strings)
        print(page_text)

There are 2 best solutions below

Best answer

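Try pdfreader:

from pdfreader import SimplePDFViewer

with open("my.pdf", "rb") as fd:
    viewer = SimplePDFViewer(fd)
    viewer.render()  # render the current (first) page

    page_0_content = viewer.canvas.text_content  # text content with PDF markup
    page_0_text = "".join(viewer.canvas.strings)  # decoded strings joined into plain text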
Second answer

If the output is blank, the PDF is probably being opened, but its format can't be parsed by PyPDF2, so it just outputs blanks. It is also worth passing an absolute file path instead of a relative one, to rule out reading the wrong file. If all else fails, try different PDFs; if one of them works and yours doesn't, you might need to convert yours to that working type.
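
To check the two things suggested above (the path and the PDF version), something like the following might help. It is only a rough sketch using the standard library: the helper name `pdf_header_version` is made up for illustration, and it assumes the file starts with the usual `%PDF-x.y` header. It only tells you which version the file declares. It does not prove that PyPDF2 can or cannot extract its text, since blank output can also come from scanned pages that have no text layer.

```python
import re
from pathlib import Path
from typing import Optional


def pdf_header_version(path: str) -> Optional[str]:
    """Return the version declared in the '%PDF-x.y' header, or None if no header is found.

    Hypothetical helper for illustration; not part of PyPDF2 or pdfreader.
    """
    # Resolve to an absolute path so a wrong working directory shows up immediately.
    resolved = Path(path).expanduser().resolve()
    if not resolved.is_file():
        raise FileNotFoundError(f"No such file: {resolved}")
    with resolved.open("rb") as f:
        head = f.read(1024)  # the header is expected at the very start of the file
    match = re.search(rb"%PDF-(\d+\.\d+)", head)
    return match.group(1).decode("ascii") if match else None
```

Calling `pdf_header_version("my.pdf")` on a PDF that works and on the one that doesn't gives you something concrete to compare before converting anything.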