Merge multiple PDF files that contain an AcroForm


I have a two-page PDF file with a form field on the first page. I successfully fill out the form from a CSV file and save each result as a separate file, about 400 PDF files in total. Now I need to merge them into one file so that I can print them in bulk, but I have not found a suitable solution.

My code does create a final file, but all pages contain the same form data.

from pdfrw import PdfReader, PdfWriter

def merge(pdf_list):
    writer = PdfWriter()
    for fname in pdf_list:
        r = PdfReader(fname)
        acro_form = r.Root.AcroForm
        writer.addpages(r.pages)
        writer.trailer.Root.AcroForm = acro_form
    writer.write("./OUT/output.pdf")

There are 2 best solutions below


No doubt the author no longer needs an answer but for future readers finding this question:

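import pdfrw
from pdfrw import PdfReader, PdfWriter


def merge_pdfs(pdf_list):
    output_writer = PdfWriter()
    for index, pdf_file in enumerate(pdf_list):
        input_reader = PdfReader(pdf_file)
        for page in input_reader.pages:
            if '/Annots' in page and isinstance(page.Annots, pdfrw.PdfArray):
                for annot in page.Annots:
                    # Append the file index so that field names no longer collide.
                    if annot['/T']:
                        annot.T = annot.T + "__" + str(index)
                    elif annot['/Parent'] is not None:
                        # The name is stored on the parent field, not on the widget.
                        annot['/Parent'].T = annot['/Parent'].T + "__" + str(index)

            output_writer.addpage(page)

        # The merged file uses the AcroForm of the last input file.
        output_writer.trailer.Root.AcroForm = input_reader.Root.AcroForm

    output_writer.write("merged.pdf")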

This function takes a list of PDF files and edits the AcroForm information so that every instance of a form field is unique across the files, by appending __[index] to its name, where index is the position of the file in the list.
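
The naming scheme itself can be illustrated without any PDF library. The following is a minimal sketch in plain Python; suffixed_name and the sample field names are hypothetical and only model the idea that same-named fields from different files stop colliding once the file index is appended.

```python
def suffixed_name(name, index):
    """Return the field name with the per-file suffix described above."""
    return f"{name}__{index}"


# Hypothetical example: three filled-in copies of one template, each with
# the same two field names.
files = [["name", "address"], ["name", "address"], ["name", "address"]]

# Without the suffix, the merged document has only two distinct names,
# so every copy of "name" refers to the same field.
plain = [field for fields in files for field in fields]

# With the suffix, every field in the merged document has its own name.
renamed = [
    suffixed_name(field, index)
    for index, fields in enumerate(files)
    for field in fields
]

print(len(set(plain)), len(set(renamed)))
```

In this model the count of distinct names grows from 2 to 6, one per field per file, which is why each merged page can then keep its own value.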

Fields that appear only once within a PDF file carry their name directly on the annotation (annot['/T']). Fields that appear several times keep the field name and value in a parent object (annot['/Parent']) and only the placement information on each child annotation. At least the PDF files I handled were simple enough to follow these rules, in which case the script above is enough.
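
The two cases can be modeled with plain dictionaries standing in for PDF objects. This is only a sketch under that assumption: rename_field, the dict layout, and the seen set are hypothetical and not part of pdfrw. The seen set keeps a parent that is shared by several child annotations from being renamed more than once.

```python
def rename_field(annot, index, seen):
    """Append "__<index>" to the object that holds the field name.

    `annot` is a plain dict standing in for a widget annotation. The name
    lives under "T" on the annotation itself or, failing that, on the dict
    stored under "Parent". `seen` holds ids of objects already renamed.
    """
    owner = annot if annot.get("T") else annot.get("Parent")
    if owner is None or not owner.get("T") or id(owner) in seen:
        return
    owner["T"] = f'{owner["T"]}__{index}'
    seen.add(id(owner))


# A field with a single widget: the name sits on the annotation.
single = {"T": "name", "Rect": [0, 0, 100, 20]}

# A field with two widgets: the name sits on the shared parent.
parent = {"T": "date"}
kid_a = {"Parent": parent, "Rect": [0, 30, 100, 50]}
kid_b = {"Parent": parent, "Rect": [0, 60, 100, 80]}

seen = set()
for annot in (single, kid_a, kid_b):
    rename_field(annot, 7, seen)

print(single["T"], parent["T"])
```

Here the shared parent ends up as date__7 rather than date__7__7, even though two child annotations point to it.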

This removes collisions between form field names and the OP's approach then works.


You can use the PyPDF2 module to merge all PDF files into a single PDF:

import os
from PyPDF2 import PdfMerger

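# Sort the names so the page order is predictable, and skip a previous result.
pdfs = sorted(a for a in os.listdir() if a.endswith(".pdf") and a != "result.pdf")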

merger = PdfMerger()

for pdf in pdfs:
    merger.append(pdf)

merger.write("result.pdf")
merger.close()