pdf2htmlEX error during conversion - CMap is not valid and got dropped for font


I'm using pdf2htmlEX v0.18.8.rc1 (https://github.com/pdf2htmlEX/pdf2htmlEX/releases/tag/v0.18.8.rc1), installed from the Debian package pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb.

When I run the conversion, I get many messages like this:

Working: 97/100ToUnicode CMap is not valid and got dropped for font: b7

The resulting files are empty, without any text.
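To narrow down which fonts trigger the message, the captured conversion output can be scanned for it. The following is only a minimal sketch: `dropped_fonts` is a hypothetical helper name, and the pattern is based solely on the message text quoted above, assuming the font id is a short alphanumeric token such as `b7`.

```python
import re
from collections import Counter

# Matches the warning quoted above and captures the font id (e.g. "b7").
# search-style matching is used because the progress text ("Working: 97/100")
# can be fused onto the same line as the warning.
DROPPED = re.compile(
    r"ToUnicode CMap is not valid and got dropped for font: (\w+)"
)


def dropped_fonts(log_text: str) -> Counter:
    """Count how many times each font id appears in the warnings."""
    return Counter(DROPPED.findall(log_text))
```

For example, after saving the converter's output (both stdout and stderr) to a file such as `convert.log` (a hypothetical name), `dropped_fonts(open("convert.log").read())` gives a per-font count, which can then be compared against the font list that `pdffonts` from poppler-utils prints for the same PDF.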

I'm running it via Docker; this is my Dockerfile:

FROM ubuntu:20.04

RUN echo 'debconf debconf/frontend select Noninteractive' | debconf-set-selections
ENV DEBIAN_FRONTEND=noninteractive

RUN dpkg --configure -a
RUN apt-get clean
RUN apt-get update 
RUN apt-get install -f -y python3
RUN apt-get install dialog apt-utils -y
RUN apt-get install -f -y python3-pip 
RUN apt-get install -f -y python3-setuptools 
RUN apt-get install -f -y wget 
RUN apt-get install -f -y poppler-utils
RUN apt-get install -f -y poppler-data
RUN apt-get install -f -y jq 
RUN apt-get install -f -y zip unzip
RUN apt-get install -f -y pdftk 
RUN apt-get install -f -y ffmpeg
RUN apt-get install -f -y libfontforge-dev
RUN apt-get install -f -y pdftk-java
RUN apt-get install -f -y ghostscript
RUN pip3 install --upgrade pip \
    && apt-get clean
RUN pip3 --no-cache-dir install --upgrade awscli

WORKDIR /tmp

COPY lib/pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb /tmp
RUN apt install -y ./pdf2htmlEX-0.18.8.rc1-master-20200630-Ubuntu-focal-x86_64.deb

RUN wget https://www.imagemagick.org/download/ImageMagick.tar.gz && \
    tar -xf ImageMagick.tar.gz && \
    cd ImageMagick* && \
    ./configure && \
    make && \
    make install && \
    ldconfig /usr/local/lib

How can I resolve this?
