Create PDF in addition to Word docx using officer


I am using officer (I used to use ReporteRs) within a loop to create 150 unique documents. However, I need these documents exported from R as both Word docx files AND PDFs.

Is there a way to export the document created with officer to a pdf?



You can convert your docx to PDF with the convert_to_pdf function from the docxtractr package.

Note that this function uses LibreOffice to do the conversion, so you have to install LibreOffice first and tell docxtractr where soffice.exe lives. Read more about paths for different operating systems here.

Here is a simple example of converting several docx documents to PDF on a Windows machine. I have Windows 10 and LibreOffice 6.4 installed. Imagine that you have X Word documents stored in the data folder and you want to create the same number of PDFs in the data/pdf folder (you have to create the pdf folder beforehand).

library(dplyr)
library(purrr)
library(docxtractr)

# Point docxtractr at the LibreOffice executable first
set_libreoffice_path("C:/Program Files/LibreOffice/program/soffice.exe")

# 1) List the Word documents
words <- list.files("data/",
                    pattern = "\\.docx$",
                    full.names = TRUE)

# 2) Custom function
word2pdf <- function(path){
  
  # Extract the file name without its directory or extension (base R,
  # so no extra package is needed)
  name <- tools::file_path_sans_ext(basename(path))
  
  convert_to_pdf(path,
                 pdf_file = paste0("data/pdf/",
                                   name,
                                   ".pdf"))
  
}

# 3) Convert each file (walk() is used because only the side effect matters)
words %>%
  walk(word2pdf)
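
Since the question builds the documents in a loop, the conversion could also happen right after each docx is written. This is only a minimal sketch under assumptions: pdf_target() and write_docx_and_pdf() are hypothetical helper names, the report content and file names are placeholders, and it assumes LibreOffice is installed and set_libreoffice_path() has already been called as above.

```r
# Hypothetical helper: derive the PDF path that belongs to a docx file.
pdf_target <- function(docx_path, out_dir = dirname(docx_path)) {
  stem <- tools::file_path_sans_ext(basename(docx_path))
  file.path(out_dir, paste0(stem, ".pdf"))
}

# Hypothetical helper: write one placeholder report with officer,
# then convert it with docxtractr (which calls LibreOffice).
write_docx_and_pdf <- function(id, out_dir = "data") {
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  docx_path <- file.path(out_dir, sprintf("report_%03d.docx", id))

  doc <- officer::read_docx()
  doc <- officer::body_add_par(doc, paste("Report number", id))
  print(doc, target = docx_path)

  docxtractr::convert_to_pdf(docx_path, pdf_file = pdf_target(docx_path))
  invisible(docx_path)
}

# Usage (not run here): for (i in 1:150) write_docx_and_pdf(i)
```

Keeping the path logic in its own small function makes it easy to check the file names without having LibreOffice available.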

That's possible, but the solution I have depends on LibreOffice. Here is the code I am using; I hope it helps. I've hard-coded the LibreOffice path, so you will probably have to adapt or improve the code that builds the variable cmd_.

The code converts a PPTX or DOCX file to PDF.

office_shot <- function( file, wd = getwd() ){
  # Quote both paths so that spaces do not break the shell command
  cmd_ <- sprintf(
    "/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to pdf --outdir %s %s",
    shQuote(wd), shQuote(file) )
  system(cmd_)

  # LibreOffice writes <name>.pdf into wd; return that file name
  pdf_file <- gsub("\\.(docx|pptx)$", ".pdf", basename(file))
  pdf_file
}
office_shot(file = "your_presentation.pptx")
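
If the hard-coded path is a problem, one option is to look for soffice in a few places and call it through system2(). This is a rough sketch, not a tested cross-platform solution: find_soffice(), soffice_args() and convert_with_soffice() are hypothetical names, and the candidate paths are just the two install locations mentioned on this page.

```r
# Hypothetical helper: return the first soffice executable that exists.
find_soffice <- function() {
  candidates <- c(
    Sys.which("soffice"),                                    # on the PATH
    "/Applications/LibreOffice.app/Contents/MacOS/soffice",  # macOS default
    "C:/Program Files/LibreOffice/program/soffice.exe"       # Windows default
  )
  hits <- candidates[nzchar(candidates) & file.exists(candidates)]
  if (length(hits) == 0) {
    stop("LibreOffice (soffice) not found; adjust the candidate paths")
  }
  unname(hits[1])
}

# Hypothetical helper: build the command-line arguments for a headless
# conversion. Paths are quoted because system2() pastes args into one line.
soffice_args <- function(file, outdir = getwd()) {
  c("--headless", "--convert-to", "pdf",
    "--outdir", shQuote(outdir), shQuote(file))
}

# Convert one docx/pptx file and return the expected PDF path.
convert_with_soffice <- function(file, outdir = getwd()) {
  status <- system2(find_soffice(), soffice_args(file, outdir))
  if (status != 0) stop("soffice exited with status ", status)
  file.path(outdir, paste0(tools::file_path_sans_ext(basename(file)), ".pdf"))
}

# Usage (not run here): convert_with_soffice("your_presentation.pptx")
```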

I've been using RDCOMClient to convert my officer-created docx files to PDFs. This requires Windows with Microsoft Word installed.

library(RDCOMClient)

file <- "C:/path/to your/doc.docx"
wordApp <- COMCreate("Word.Application") # creates the COM object
wordApp[["Documents"]]$Open(Filename=file) # opens your docx in wordApp
wordApp[["ActiveDocument"]]$SaveAs("C:/path/to your/doc.pdf", FileFormat=17) # saves as PDF
wordApp$Quit() # quits the COM Word application

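I found the FileFormat=17 value here: https://learn.microsoft.com/en-us/office/vba/api/word.wdexportformat (17 is also the value of wdFormatPDF in the WdSaveFormat enumeration, which is the one SaveAs uses).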

I've been able to put the above in a loop to convert multiple docx files to PDFs quickly, too.
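
Roughly, such a loop could look like the sketch below. Treat it as an illustration rather than my exact code: pdf_path_for() and docx_to_pdf_word() are hypothetical names, closing each document with SaveChanges = 0 (do not save) is an addition of mine, and it only runs on Windows with Word and RDCOMClient installed.

```r
# Hypothetical helper: swap the .docx extension for .pdf.
pdf_path_for <- function(docx) {
  sub("\\.docx$", ".pdf", docx, ignore.case = TRUE)
}

# Convert several docx files with a single Word instance.
docx_to_pdf_word <- function(docx_files) {
  word_app <- RDCOMClient::COMCreate("Word.Application")
  on.exit(word_app$Quit(), add = TRUE)  # always close Word, even on error

  for (docx in docx_files) {
    # Word wants absolute paths; use backslashes on Windows
    docx_abs <- normalizePath(docx, winslash = "\\")
    word_app[["Documents"]]$Open(Filename = docx_abs)
    word_app[["ActiveDocument"]]$SaveAs(pdf_path_for(docx_abs), FileFormat = 17)
    word_app[["ActiveDocument"]]$Close(SaveChanges = 0)
  }
  invisible(pdf_path_for(docx_files))
}

# Usage (not run here):
# docx_to_pdf_word(list.files("data", pattern = "\\.docx$", full.names = TRUE))
```

Starting Word once and reusing it for every file is what keeps the loop fast.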

Hope this helps!