Python-docx: how to save each edited ".docx" file while retaining the original name ".docx" in a new folder

368 Views Asked by At

I'm new to Python:docx and need some help with my script.

  1. the script should pick up all ".docx" in that folder.
  2. For each of those ".docx", the Python script replaces certain strings.
  3. After editing each ".docx", these are saved into a different folder under the original .docx file. I tried a few things but I was unable to complete item 3.

Code below:

# NEW FUNCTION:
    import docx
    import glob
    
    # This section goes thru the entire order folder (about 1350 docx files in the folder) and picks up .docx to edit then save it into a new folder while keeping the original customer name.
    for i in glob.glob(r'C:\Users\LOCAL\Documents\TOPPAN_MOR_BASE_PART\*.docx', recursive=True): 

    # this section edits the new values into the docx file.
        def replacer(p, replace_dict):
            inline = p.runs  # Specify the list being used
            for j in range(0, len(inline)):
    
            # Iterate over the dictionary
                for k, v in replace_dict.items():
                    if k in inline[j].text:
                        inline[j].text = inline[j].text.replace(k, v)
            return p
    
    
    # Replace Paragraphs & save each docx file with its name intact after content edits.
    file_suffix = raw_input(i) # should save file wit original name??
    doc = docx.Document(".docx")  # Get the file
    dict = {'Don X': 'TOPPAN', '5459APSIM':'5499AP', 'Special Mask Build Notes': 'Special Mask Build Notes: Special Mask Build Notes: TOPPAN 4X to 5X'}  # Build the dict
    for p in doc.paragraphs:  # If needed, iter over paragraphs
        p = replacer(p, dict)  # Call the new replacer function

# This section saves the original *.docx into NewOrder folder. it's not working.
doc.save(r'C:\Users\LOCAL\Documents\TOPPAN_MOR_BASE_PART\NewOrder\SUBSTITUTE'+file_suffix+'docx')
1

There are 1 best solutions below

5
David Moruzzi On

With Python3, please ensure pip install python-docx.

import docx
import glob
import os
import shutil

# get all docx files in current directory
files = glob.glob('*.docx')

# create backup folder
if not os.path.exists('backup'):
    os.makedirs('backup')
# copy original files
for file in files:
    # copy files to backup folder
    shutil.copy(file, 'backup')

# replacement dictionary
replacements = {'test':'TEST'}

# loop through all files
for file in files:
    # open file
    doc = docx.Document(file)
    # loop through all paragraphs
    for para in doc.paragraphs:
        # for each of the words to be replaced, evaluate and do replacement
        for word in replacements.keys():
            # if paragraph contains the word to be replaced
            if word in para.text:
                # replace the word with the new one
                para.text = para.text.replace(word, replacements[word])
    # save file
    doc.save(file)