I am trying to create a combined docx file that is the concatenation of two docx files. I have the following Python code:
from docx import Document

files = ['Doc2.docx', 'Doc3.docx']

def combine_word_documents(files):
    combined_document = Document('empty.docx')
    count, number_of_files = 0, len(files)
    for file in files:
        sub_doc = Document(file)

        # Don't add a page break if you've
        # reached the last file.
        if count < number_of_files - 1:
            sub_doc.add_page_break()

        for element in sub_doc._document_part.body._element:
            combined_document._document_part.body._element.append(element)
        count += 1

    combined_document.save('both_docx_files.docx')

combine_word_documents(files)
The issues are:
- In the resulting both_docx_files.docx file, the contents of the source files overlap one another instead of appearing on separate pages.
- Images are lost.
Any help or advice is appreciated.
I tried the Python code above. The docx files should be concatenated one after the other in the new docx file.
You are adding the page break at the beginning of the first file rather than at the end. Move the test and the page break so that they come after the inner for loop.
You need to provide more information about the second issue, as it is not clear what you mean.
[edit]
I have installed the python-docx module and tried to reproduce your problem. It seems that the package cannot copy pictures from one document to another, and as far as I can tell from the documentation there is no obvious way of identifying an element as a picture.
I also get an exception from the supplied code on the lines that reference _document_part.body. I was able to correct it by replacing that with the simple _body, so the version I downloaded (v1.1.0) may be different from the one you are using.[/edit]