pdf2image conversion of multi page PDFs to images returns the last page on all images

Question

5.2k Views Asked by Simon Mortensen At 07 June 2025 at 17:11

When I use the pdf2image Python package and pass a multi-page PDF to convert_from_bytes() or convert_from_path(), the output list does contain multiple images, but every image shows the last PDF page (I would have expected each image to represent one page of the PDF).

The output looks something like this:

Any idea why this would occur? I can't find a solution to this online. I've found a vague suggestion that the use_cropbox argument might help, but changing it has no effect.

def convert(self, opened_file):
    # Read the PDF bytes and convert the pages to PPM image objects.
    # self.pdf2image is the pdf2image module, stored on the instance.
    try:
        _ppm_pages = self.pdf2image.convert_from_bytes(
            opened_file.read(),
            grayscale=True,
        )
    except Exception as e:
        print(f"[CreateJPEG] Could not convert PDF pages to JPEG image due to error: \n    '{e}'")
        return

    # Do stuff with _ppm_pages
    for img in _ppm_pages:
        img.show()  # ...all images in that list are of the last page

Sometimes the output is instead an empty 1x1 image, and I haven't found a reason for that either. If you have any idea what that is about, please let me know!

Thanks in advance, Simon

EDIT: Added code.

EDIT: So, when I try this in a random notebook, it actually works fine.

I've removed a few detours I used in my original code, and now it works. Still not sure what the underlying reason was though...

All the same, thanks for your help, everyone!

There are 2 best solutions below

**Amiga500** · Answer 1

Amiga500 On 10 March 2022 at 10:16

I'm using this right now....

from pdf2image import convert_from_path

imgSet = convert_from_path(pathToPDF, 500)

That gives me a list of images in imgSet (the second argument, 500, is the DPI).
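To check whether the returned list really holds the same page several times, one option is to fingerprint the pixel data of each image. This is only a minimal sketch: the helper names (page_fingerprints, all_pages_identical, report_pages) are made up for illustration, and it assumes the images are PIL images as returned by convert_from_path.

```python
import hashlib


def page_fingerprints(images):
    """Return a short SHA-256 digest of each image's raw pixel data."""
    return [hashlib.sha256(img.tobytes()).hexdigest()[:12] for img in images]


def all_pages_identical(images):
    """True when there are several images and they all share one fingerprint."""
    return len(images) > 1 and len(set(page_fingerprints(images))) == 1


def report_pages(path_to_pdf, dpi=200):
    """Convert a PDF and print size and fingerprint per page (hypothetical helper)."""
    from pdf2image import convert_from_path  # lazy import; needs poppler installed

    images = convert_from_path(path_to_pdf, dpi)
    digests = page_fingerprints(images)
    for number, (img, digest) in enumerate(zip(images, digests), start=1):
        # A 1x1 image would also be visible here through img.size
        print(f"page {number}: size={img.size} digest={digest}")
    if all_pages_identical(images):
        print("All images are pixel-identical.")
    return images
```

If the digests differ right after the conversion but the pages look identical further down in your own code, the surrounding code is the more likely culprit than pdf2image itself, which would fit the asker's observation that the same call works in a clean notebook.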

**balu** · Answer 2

I guess you have to do something like this, as shown in the unit tests of the package.

with open("./tests/test.pdf", "rb") as pdf_file:
    images_from_bytes = convert_from_bytes(pdf_file.read(), fmt="jpg")
    self.assertTrue(images_from_bytes[0].format == "JPEG")
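Outside of a test case, the same call can be followed by writing one file per page, which makes it easy to see whether each page came out differently. A rough sketch, assuming poppler is installed; page_filename and save_pages are hypothetical helpers, not part of pdf2image.

```python
from pathlib import Path


def page_filename(page_number, stem="page", ext="jpg"):
    """Build a zero-padded file name such as 'page_003.jpg' (1-based page number)."""
    return f"{stem}_{page_number:03d}.{ext}"


def save_pages(pdf_path, out_dir):
    """Convert every page of a PDF to JPEG and write one file per page."""
    from pdf2image import convert_from_bytes  # lazy import; needs poppler installed

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Read the whole file in binary mode, as in the test excerpt above
    pdf_bytes = Path(pdf_path).read_bytes()
    images = convert_from_bytes(pdf_bytes, fmt="jpg")

    written = []
    for number, image in enumerate(images, start=1):
        target = out_dir / page_filename(number)
        image.save(target, "JPEG")
        written.append(target)
    return written
```

Calling save_pages("./tests/test.pdf", "out") should then leave page_001.jpg, page_002.jpg and so on in the out folder, one per PDF page.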
