Split a .docx file into two documents using officer


I have a .docx document into which I will later insert another document using block_pour_docx(). Before that, I want to split the original document into two, at the first place where grepl() finds a given term.

I am not a seasoned programmer, so I have been using ChatGPT to help build this. I can read the .docx in, inspect it using docx_summary(), and find the split point using grepl(). The problem is the next step, where I try to split the document into two new block_lists.

library(officer)

# Load the existing Word document
doc_split <- read_docx(background_facts_recon_file_name)

# Get the document's blocks
content <- docx_summary(doc_split)

# Find the position where the document says "Reconstruction"
split_position <- NULL
for (i in seq_len(nrow(content))) {
  if (grepl("Reconstruction", content$text[i])) {
    split_position <- i
    break
  }
}

if (is.null(split_position)) {
  stop("Text 'Reconstruction' not found in the background facts/reconstruction document.")
}

# Split the document into two parts
part1 <- block_list(doc_split)[1:(split_position - 1)]
part2 <- block_list(doc_split)[split_position:nrow(content)]

# Create new documents for each part
doc1 <- read_docx()
doc2 <- read_docx()

# Add blocks to each document
for (block in part1) {
  doc1 <- body_add(doc1, block)
}
for (block in part2) {
  doc2 <- body_add(doc2, block)
}
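
The index arithmetic in the split step can be checked on its own, without officer. The following is a minimal base R sketch under stated assumptions: `content` is a toy data frame standing in for the output of docx_summary() (only the `text` column is used), and `split_rows()` is a hypothetical helper, not an officer function. It also avoids a pitfall in `1:(split_position - 1)`: when the match is in row 1, that expression evaluates to `c(1, 0)` rather than an empty range, whereas `seq_len()` returns an empty vector.

```r
# Toy stand-in for the data frame returned by docx_summary(); only `text` is used.
content <- data.frame(
  text = c("Background facts", "First fact", "Reconstruction", "Step one"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: row indices before, and from, the first match of `pattern`.
split_rows <- function(text, pattern) {
  hits <- grep(pattern, text, fixed = TRUE)  # indices of all matching rows
  if (length(hits) == 0) {
    stop("Text '", pattern, "' not found.")
  }
  split_position <- hits[1]                  # first match only
  list(
    part1 = seq_len(split_position - 1),       # empty when the match is row 1
    part2 = seq(split_position, length(text))  # match row through the last row
  )
}

rows <- split_rows(content$text, "Reconstruction")
content$text[rows$part1]
content$text[rows$part2]
```

This only produces row indices into the summary table. Turning those indices into two separate Word documents still has to be done with officer itself, which this sketch does not attempt.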