Running pdf2htmlEX on Heroku


I'm trying to run pdf2htmlEX on Heroku. At first I thought of compiling pdf2htmlEX on a VM with the same stack as Heroku and then including the binary on the git repo. That did not work (I kept getting problems with dependencies).

As there is no Heroku buildpack for running pdf2htmlEX specifically, I decided to try using heroku-buildpack-multi with heroku-buildpack-ruby and heroku-buildpack-apt (buildpack-apt adds support for apt-based dependencies during both compile and runtime). Because the pdf2htmlEX package is not in the main repositories (it's in ppa:coolwanglu/pdf2htmlex), I couldn't just add pdf2htmlEX to the Aptfile (which is where you specify your apt dependencies).

I ended up getting the dependencies for pdf2htmlEX:

pdf2htmlex
  Depends: libc6
  Depends: libcairo2
  Depends: libfontforge1
  Depends: libfreetype6
  Depends: libgcc1
  Depends: libpoppler44
  Depends: libstdc++6
  Suggests: ttfautohint

Taking this into account I made my Aptfile the following:

libc6
libcairo2
libfontforge1
libfreetype6
libpoppler44
libgcc1
libstdc++6
ttfautohint
http://ftp.us.debian.org/debian/pool/main/p/pdf2htmlex/pdf2htmlex_0.14.6+ds-1+b1_amd64.deb
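The step from the dependency list to the Aptfile can be sketched as a small filter. This is only an illustration: `deps_to_aptfile` is a hypothetical helper name, and it assumes input in the `Depends:` / `Suggests:` layout shown above (the format printed by `apt-cache depends`).

```shell
#!/bin/sh
# Hypothetical helper: turn a dependency listing into Aptfile lines.
# Keeps the package names from "Depends:" and "Suggests:" entries and
# skips everything else, including the package header line.
deps_to_aptfile() {
    awk '$1 == "Depends:" || $1 == "Suggests:" { print $2 }'
}

# Example run on the dependency list from the question, fed on stdin.
deps_to_aptfile <<'EOF'
pdf2htmlex
  Depends: libc6
  Depends: libcairo2
  Depends: libfontforge1
  Depends: libfreetype6
  Depends: libgcc1
  Depends: libpoppler44
  Depends: libstdc++6
  Suggests: ttfautohint
EOF
```

On a machine where the package is known to apt, the same filter could be fed from `apt-cache depends pdf2htmlex`; the .deb URL line still has to be added by hand.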

The issue is that if I get a bash prompt in a one-off dyno and try to run pdf2htmlEX I get the following error:

pdf2htmlEX: /app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by pdf2htmlEX)
pdf2htmlEX: /app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.21' not found (required by pdf2htmlEX)
pdf2htmlEX: /app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6: version `GLIBCXX_3.4.20' not found (required by /app/.apt/usr/lib/x86_64-linux-gnu/libpoppler.so.57)

The few posts I found on Stack Overflow about this specific error were not particularly helpful. It seems to have something to do with libstdc++6, but I can't figure out how to solve it.

Any ideas? Also, if you know an easier way of running pdf2htmlEX on heroku, please do let me know.

2 Answers

Accepted answer:

I just got this working - many thanks for the post that set me in the right direction.

First, I ended up using the pdf2htmlEX package from the PPA instead of the Debian package you referenced.

It looks like the issue is that the package you referenced was compiled against a newer libstdc++6 than the one installed by the libstdc++6 line in your Aptfile, which is why the GLIBCXX_3.4.20 and GLIBCXX_3.4.21 symbol versions are reported as not found. To fix it, I replaced the libstdc++6 line with an explicit reference to a specific, more recent .deb hosted on kernel.org (see the Aptfile below).
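One way to check this kind of mismatch is to list the GLIBCXX symbol versions that a given libstdc++.so.6 actually provides. The following is a minimal sketch, not something from the original post: `glibcxx_versions` is a hypothetical helper name, and the library path is the one from the error message in the question.

```shell
#!/bin/sh
# Hypothetical helper: list the GLIBCXX symbol versions embedded in a
# libstdc++ shared object.
# grep -a treats the binary as text; -o prints only the matching part.
# The pattern requires a digit after the underscore, so names such as
# GLIBCXX_FORCE_NEW are not reported.
glibcxx_versions() {
    LC_ALL=C grep -a -o 'GLIBCXX_[0-9][0-9.]*' "$1" | sort -u
}

# Path taken from the error message in the question; only run if present.
lib=/app/.apt/usr/lib/x86_64-linux-gnu/libstdc++.so.6
if [ -f "$lib" ]; then
    glibcxx_versions "$lib"
fi
```

If GLIBCXX_3.4.20 and GLIBCXX_3.4.21 are missing from the output, the installed libstdc++6 is older than what the pdf2htmlEX binary was built against.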

I also replaced the libpoppler reference with an explicit libpoppler57 .deb, but you may not need to.

My final, working Aptfile:

libc6
libfontforge1
libgcc1
libjs-pdf
http://mirrors.kernel.org/ubuntu/pool/main/g/gcc-5/libstdc++6_5.3.1-5ubuntu2_amd64.deb
https://mirrors.kernel.org/ubuntu/pool/main/p/poppler/libpoppler57_0.38.0-0.ubuntu1_amd64.deb
https://launchpad.net/~coolwanglu/+archive/ubuntu/pdf2htmlex/+files/pdf2htmlex_0.12-1~git201411121058r1a6ec-0ubuntu1~trusty1_amd64.deb
ttfautohint

Hope this helps!

Second answer:

Along with the libraries mentioned above in the Aptfile, supply the --data-dir option to pdf2htmlEX when converting.

One of the main issues when converting is that on Ubuntu pdf2htmlEX is installed at /usr/bin/pdf2htmlEX, whereas on Heroku it is installed at /app/.apt/usr/bin/pdf2htmlEX. When data-dir is not supplied, it looks for /usr/bin/pdf2htmlEX by default.
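A minimal sketch of such a call, under stated assumptions: `convert_pdf`, `PDF2HTMLEX_BIN`, and `PDF2HTMLEX_DATA_DIR` are hypothetical names introduced here, and the default data directory below simply assumes the data files sit under the /app/.apt prefix. Check where they actually are on your dyno before relying on it.

```shell
#!/bin/sh
# Hypothetical wrapper: run pdf2htmlEX with an explicit --data-dir.
# PDF2HTMLEX_DATA_DIR and PDF2HTMLEX_BIN are made-up override variables;
# the default data directory is an assumption, not a verified path.
convert_pdf() {
    data_dir=${PDF2HTMLEX_DATA_DIR:-/app/.apt/usr/share/pdf2htmlEX}
    "${PDF2HTMLEX_BIN:-pdf2htmlEX}" --data-dir "$data_dir" "$1" "$2"
}

# Usage (not executed here):
#   convert_pdf input.pdf output.html
```

Keeping the path in a variable makes it easy to point the same code at a different location locally and on Heroku.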

Tested on heroku-18 with v0.16.0-poppler-0.62.0-ubuntu-18.04

If you are using Ruby, you can use the Kristin gem from its master branch.