Python: merge PDFs together while preserving hyperlinks


I am trying to merge two PDFs together. One PDF is a table of contents that I create manually using fpdf, with links to specific pages. The other PDF is the text that the table of contents links to.


from PyPDF2.merger import PdfFileMerger

merger = PdfFileMerger()
merger.append("toc.pdf")
merger.append("temp.pdf")
merger.write("combined.pdf")

But I get the following error:

PdfReadWarning: Object 19 0 not defined. [pdf.py:1628]
Traceback (most recent call last):
...
...
raise utils.PdfReadError("Could not find object.")
PyPDF2.utils.PdfReadError: Could not find object.

I think the error occurs because the hyperlinks point to nothing: the target pages have not been created in the table of contents PDF. If I create the table of contents without the hyperlinks, merging works correctly. Is there any way to merge the files so that the hyperlinks are preserved?

To clarify: I believe I can't add the content PDFs to the table of contents from the start, as pyfpdf doesn't seem to support combining PDF files.

Edit: more code


import math
import time

from PyPDF2 import PdfFileMerger, PdfFileReader

merger = PdfFileMerger()
pages = []
chapters = []
for file in pdfs:
    read_pdf = PdfFileReader(file)
    txt = read_pdf.getPage(0)
    page_content = txt.extractText()
    chapter = helper_functions.get_chapter_from_pdf_txt(page_content)

    pages.append(read_pdf.getNumPages())
    chapters.append(chapter)
    merger.append(fileobj=file)
merger.write("temp.pdf")
pdfs.append("temp.pdf")
merger.close()

num_pages = sum(pages)
toc_len = 0
if toc_orientation == "P":
    toc_len = math.ceil(len(pages) / 27)
if toc_orientation == "L":
    toc_len = math.ceil(len(pages) / 17)

print(num_pages)
print(toc_len)

### Creating toc
toc = compile_toc(chapters, pages, orientation=toc_orientation)
pdf = PDF()
pdf.set_title("")
pdf.table_of_contents(toc, orientation=toc_orientation, create_hyperlink=True)
pdf.output("toc.pdf", 'F')
pdf.close()
time.sleep(2)

merger = PdfFileMerger()
merger.append(PdfFileReader(open("toc.pdf", 'rb')))
merger.append(PdfFileReader(open("temp.pdf", 'rb')))
merger.write("combined.pdf")
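
For the links in the table of contents to land on the right pages of the combined file, each target presumably has to be shifted by the number of pages the table of contents itself occupies. Below is a minimal sketch of that arithmetic only, assuming the 27 (portrait) and 17 (landscape) rows per page used in the code above. The function names are hypothetical and are not part of fpdf or PyPDF2, and this does not by itself address the PdfReadError.

```python
import math

# Assumption taken from the question's code: 27 TOC rows fit on a
# portrait ("P") page and 17 on a landscape ("L") page.
ROWS_PER_TOC_PAGE = {"P": 27, "L": 17}


def toc_page_count(num_chapters, orientation="P"):
    """Number of pages the table of contents itself occupies."""
    return math.ceil(num_chapters / ROWS_PER_TOC_PAGE[orientation])


def chapter_start_pages(pages, orientation="P"):
    """Return the 1-based page in the combined PDF where each chapter starts.

    `pages` holds the page count of each source file, in merge order.
    """
    offset = toc_page_count(len(pages), orientation)
    starts = []
    next_page = offset + 1  # first page after the table of contents
    for count in pages:
        starts.append(next_page)
        next_page += count
    return starts


# Three source files with 3, 5 and 2 pages and a one-page portrait TOC.
print(chapter_start_pages([3, 5, 2], "P"))  # [2, 5, 10]
```

These start pages are the values a link target would need if the table of contents and the content ended up in the same document.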