Creating a Python script performing Mail Merge on a word document using an Excel file and generating a PDF File with a different Page Setup

400 Views Asked by At

To give some context of the problem, I wrote a Python script that performs mail merge utilizing a word template and an excel sheet as the data source. The Python script that I wrote is able to generate the required word document as well as the corresponding PDF files. There are about 95,000 records that are in the excel file for which I am creating the word as well as the PDF files. I am processing the records in batches of 3000 each to make it more efficient. Every record (row) in the Excel sheet takes one Page in the word document. Now, what I want to do is that, using programming only, I want to change the Page Setup for the PDF file from 1 Page Per Sheet to 2 Pages Per Sheet. I have tried using all the packages that I could to resolve this issue but I have not been successful so far. Can any one here help me with it? Below is my code:

import os
import win32com.client as win32

working_directory = '<workig_directory>'
source_name = os.path.join(working_directory, 'test.xlsx')
destination_folder = os.path.join(working_directory, 'Destination')

# Create a Word application instance
wordApp = win32.Dispatch('Word.Application')
wordApp.Visible = True

# Open Word Template
sourceDoc = wordApp.Documents.Open(os.path.join(working_directory, 'test_doc.docx'))
mail_merge = sourceDoc.MailMerge
mail_merge.OpenDataSource(Name:=os.path.join(working_directory, source_name),
                          sqlstatement:= "SELECT * FROM [Sheet1$]")

record_count = mail_merge.DataSource.RecordCount

# Define the batch size (e.g., 3,000 records per batch)
batch_size = 3000

# Calculate the number of batches needed
num_batches = (record_count + batch_size - 1) // batch_size

# Initialize a counter for file naming
file_counter = 1

try:
    # Perform Mail Merge in batches
    for batch_index in range(num_batches):
        # Calculate the start and end records for the current batch
        start_record = batch_index * batch_size + 1
        end_record = min((batch_index + 1) * batch_size, record_count)

        # Set the range for the mail merge
        mail_merge.DataSource.FirstRecord = start_record
        mail_merge.DataSource.LastRecord = end_record

        # Execute the mail merge for the current batch
        mail_merge.Execute(False)

        # Create the base name for the output file with the record range
        base_name = f'output_file_{start_record}-{end_record}'

        targetDoc = wordApp.ActiveDocument

        # Save the document with the computed file name
        targetDoc.SaveAs2(os.path.join(destination_folder, base_name + '.docx'), 16)

        targetDoc.ExportAsFixedFormat(os.path.join(destination_folder, base_name), ExportFormat=17)

        targetDoc.Close(False)
        targetDoc = None

        # Increment the file counter
        file_counter += 1

finally:
    # Close the source document and release the Word application object
    sourceDoc.Close(False)
    wordApp.Quit()
0

There are 0 best solutions below