Apache POI 5: PDF converter from docx generates overlapping words

When converting a docx file (testDocument.docx) to PDF using xdocreport, the output file (testDocument-new.pdf) has some overlapping words.

To reproduce the issue, use the following code:

@Test
void simpletestconversion() {
    try(InputStream in = new FileInputStream(docPath);
        OutputStream out = new FileOutputStream(pdfPath)) {

        XWPFDocument document = new XWPFDocument(in);
        PdfOptions pdfOptions = PdfOptions.create();
        // Use a special font provider for Chinese
        pdfOptions.fontProvider(CHINESE_FONT_PROVIDER);

        PdfConverter.getInstance().convert(document, out, pdfOptions);
    } catch(Exception e) {
        e.printStackTrace();
    }
}

with the Chinese font provider defined as follows (the font file is NotoSansCJK-Regular.ttc):

private static final IFontProvider CHINESE_FONT_PROVIDER = (familyName, encoding, size, style, color) -> {
    try {
        BaseFont bf = BaseFont.createFont("NotoSansCJK-Regular.ttc" + ",0", BaseFont.IDENTITY_H,
                                          BaseFont.NOT_EMBEDDED);
        Font font = new Font(bf, size, style, color);
        if(familyName != null) {
            font.setFamily(familyName);
        }
        return font;
    } catch(DocumentException | IOException e) {
        log.error("Font error", e);
        return ITextFontRegistry.getRegistry().getFont(familyName, encoding, size, style, color);
    }
};

and the following dependencies in the pom file:

...
<apache-poi.version>5.2.3</apache-poi.version>
...
<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi</artifactId>
    <version>5.2.3</version>
</dependency>

<dependency>
    <groupId>org.apache.poi</groupId>
    <artifactId>poi-ooxml-full</artifactId>
    <version>5.2.3</version>
</dependency>

<dependency>
    <groupId>fr.opensagres.xdocreport</groupId>
    <artifactId>fr.opensagres.poi.xwpf.converter.pdf</artifactId>
    <version>2.0.4</version>
</dependency>

Thank you so much for your help. I haven't found any related issues.
