Error converting docx to pdf using Unoconv


I am trying to convert .docx files to .pdf files using Unoconv. LibreOffice is installed on my server, and the same script works for another website on that server.

Using the line use Unoconv\Unoconv; results in an HTTP ERROR 500.

Does anyone know why I get an HTTP ERROR 500?

Here is my script:

<?php
    require './Unoconv.php';
    use Unoconv\Unoconv;
        
    $originFilePath = './uf/invoice/17/word/202100021.docx';
    $outputDirPath  = './uf/invoice/17/pdf/202100021.pdf';
    
    Unoconv::convertToPdf($originFilePath, $outputDirPath);

    header("Content-type:application/pdf");
    header("Content-Disposition:attachment;filename=202100021.pdf");
?>

Here is my Unoconv.php script:

<?php

namespace Unoconv;

class Unoconv {

    public static function convert($originFilePath, $outputDirPath, $toFormat)
    {
        $command = 'unoconv --format %s --output %s %s';
        $command = sprintf($command, $toFormat, $outputDirPath, $originFilePath);
        system($command, $output);

        return $output;
    }

    public static function convertToPdf($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'pdf');
    }

    public static function convertToTxt($originFilePath, $outputDirPath)
    {
        return self::convert($originFilePath, $outputDirPath, 'txt');
    }

}
?>

There are 3 best solutions below

0
On

I've observed that LibreOffice can be a little quirky when doing conversions, especially when running in headless mode from a webserver account.

The simplest thing to try is to edit the shebang (first line) of the unoconv script so that it uses the same Python binary that is shipped with LibreOffice:

#!/usr/bin/env python

should become (after checking where LibreOffice is installed on your system):

#!/opt/libreoffice7.1/program/python

Otherwise, I have worked around the problem by invoking libreoffice directly (without Unoconv):

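    function convertDocxToPdf($docfile, $realPDFName)
    {
        $dir = dirname($docfile);
        // LibreOffice saves the PDF here: same directory and base name as the source
        $pdf = $dir . DIRECTORY_SEPARATOR . basename($docfile, '.docx') . '.pdf';
        // 2>&1 captures any error output in $ret for debugging
        $ret = shell_exec("export HOME={$dir} && /usr/bin/libreoffice --headless --convert-to pdf --outdir '{$dir}' '{$docfile}' 2>&1");
        if (!file_exists($pdf)) {
            return false;
        }
        // Move the result to its final name
        return rename($pdf, $realPDFName);
    }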

Note the export HOME={$dir} directive: it ensures that temporary lock files are saved in the current directory, where the web server presumably has full permissions. If this requirement isn't met, LibreOffice fails silently. At least, I observed it failing and could not find an error message anywhere; I only found out what was going on by using strace.
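The writability requirement described above can also be checked before LibreOffice is started, so the failure is no longer silent. This is only a minimal sketch: the function name buildConvertCommand is hypothetical, and passing every path through escapeshellarg() is an addition of mine (it guards against spaces or quotes in file names), not something the original command did.

```php
<?php
// Hypothetical helper (not part of Unoconv or LibreOffice): builds the
// conversion command after verifying that the HOME directory is writable.
function buildConvertCommand(string $binary, string $docfile, string $homeDir): string
{
    // LibreOffice needs a writable HOME; fail loudly here instead of silently later.
    if (!is_dir($homeDir) || !is_writable($homeDir)) {
        throw new RuntimeException("HOME directory is not writable: {$homeDir}");
    }

    // escapeshellarg() quotes each value so unusual file names cannot break the command.
    return sprintf(
        'export HOME=%s && %s --headless --convert-to pdf --outdir %s %s 2>&1',
        escapeshellarg($homeDir),
        escapeshellarg($binary),
        escapeshellarg(dirname($docfile)),
        escapeshellarg($docfile)
    );
}
```

The returned string can be passed to shell_exec() exactly like the hand-written command; the exception message should tell you which directory the web server account cannot write to.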

So your code would become:

$originFilePath = './uf/invoice/17/word/202100021.docx';
$outputDirPath  = './uf/invoice/17/pdf/202100021.pdf';

$dir    = dirname($originFilePath);
$pdf    = $dir . DIRECTORY_SEPARATOR . basename($originFilePath, '.docx').'.pdf';
$ret = shell_exec("export HOME={$dir} && /usr/bin/libreoffice --headless --convert-to pdf --outdir '{$dir}' '{$originFilePath}' 2>&1");
// $ret will contain any errors
if (!file_exists($pdf)) {
    die("Conversion error: " . htmlentities($ret));
}
rename($pdf, $outputDirPath);

header("Content-type:application/pdf");
header("Content-Disposition:attachment;filename=202100021.pdf");
readfile($outputDirPath);

I assume that libreoffice is available at the usual alternatives link, "/usr/bin/libreoffice". Otherwise, you need to retrieve its path with the terminal command "which libreoffice". Or, from a PHP script:

<?php
header('Content-Type: text/plain');
print "If this works:\n";
system('which libreoffice 2>&1');
print "\n-- otherwise a different attempt, returning too much information --\n";
system('locate libreoffice');
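
If shell commands such as which are not available to the web server account, the binary can also be probed for directly from PHP. A minimal sketch, assuming a hand-maintained list of candidate paths: the function name findLibreOffice is hypothetical, and apart from /usr/bin/libreoffice the candidates in the usage note are guesses that you should adapt to your own installation.

```php
<?php
// Hypothetical helper: returns the first candidate path that exists and is
// executable by the current (web server) user, or null if none qualifies.
function findLibreOffice(array $candidates): ?string
{
    foreach ($candidates as $path) {
        if (is_file($path) && is_executable($path)) {
            return $path;
        }
    }
    return null;
}
```

For example, findLibreOffice(['/usr/bin/libreoffice', '/usr/bin/soffice']) returns whichever of the two the current user is allowed to execute. A null result means that either the path is wrong or the web server account lacks execute permission on it.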
2
On

@Alex is correct about wrapping in try/catch first, but should the syntax be:

...
} catch(\Exception $e){
...
10
On

Start by wrapping your code in try...catch to get the error message first:

<?php
// "use" must appear at file scope; placing it inside the try block is a parse error
use Unoconv\Unoconv;

try {
    require 'Unoconv.php';
    
    $map1 = $_SESSION['companyid'];
    $filename = $result1['filename'];
    
    $originFilePath = './uf/doc/'.$map1.'/word/'.$filename.'.docx';
    $outputDirPath  = './uf/doc/'.$map1.'/pdf/'.$filename.'.pdf';
    
    Unoconv::convertToPdf($originFilePath, $outputDirPath);
    
    header("Content-type:application/pdf");
    header("Content-Disposition:attachment;filename=".$filename.".pdf");
    readfile($outputDirPath);
} catch (\Exception $e) {
    die($e->getMessage());
}
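
One caveat: PHP's system() and exec() do not throw exceptions when the external command fails. They only report an exit code, so a try...catch around the Unoconv class as posted has nothing to catch if unoconv itself fails. A minimal sketch of a wrapper that turns a non-zero exit code into an exception follows; the name runOrFail is hypothetical and is not part of the Unoconv class from the question.

```php
<?php
// Hypothetical helper: runs a shell command, captures stdout and stderr,
// and throws if the command exits with a non-zero status.
function runOrFail(string $command): string
{
    $outputLines = [];
    $exitCode = 0;

    // exec() fills $outputLines with the command's output and $exitCode with its status.
    exec($command . ' 2>&1', $outputLines, $exitCode);
    $output = implode("\n", $outputLines);

    if ($exitCode !== 0) {
        throw new RuntimeException("Command failed with exit code {$exitCode}: {$output}");
    }
    return $output;
}
```

If Unoconv::convert() ran its command through a wrapper like this instead of calling system() directly, the catch block would receive the exit code and whatever unoconv printed, which is usually more useful than a bare HTTP 500.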