postbank_pdf2csv: how to set up with Cygwin on Windows?


I wanted to convert some Postbank PDFs to CSV and found Postbank_PDF2CSV, which relies on command-line utilities and parses their output with a bash script.
I wanted to try it with Cygwin under Windows, since I had no Linux system at hand, and was fairly confident it would work.

It did not work right away and greeted me with this message (German for "This script only works with FreeBSD, Linux or macOS"):

$ ./postbank_pdf2csv_bis_2023-02.sh -?
Dieses Skript funktioniert nur mit FreeBSD, Linux or Macos.

I tried removing the overzealous restriction, installed some required libraries, and fixed a script bug (leftover test code that echoed hallo and exited the script before the CSV was generated), but I still could not make it work and got errors like:

GPL Ghostscript 10.01.2: Unrecoverable error, exit code 1

Although I have a GitHub account, I could not file an issue or get in contact through the project itself.


There are 1 best solutions below

Andreas Covidiot On

I finally got it working. Here are the important steps, which should help others get it done or could be folded into the scripts and wiki of the project:

  1. install Cygwin and select the wget package during setup; it is needed to download a package manager in the next step

  2. download and install apt-cyg, so that further tools can be installed without going back to setup.exe:

    $ wget rawgit.com/transcode-open/apt-cyg/master/apt-cyg
    $ install apt-cyg /bin
    
  3. make the script's internal OS check in postbank_pdf2csv*.sh pass under Cygwin by adding a Cygwin condition:

    • old:
      if [ "$(uname -o)" = "FreeBSD" ] || [ "$(uname -o)" = "Darwin" ] ; then
    • new:
      if [ "$(uname -o)" = "FreeBSD" ] || [ "$(uname -o)" = "Darwin" ] || [ "$(uname -o)" = "Cygwin" ] ; then
  4. some older versions of the script contain leftover test code that makes the script exit early; fix this by commenting out (#) those lines, e.g.:

    #echo 'hallo'
    #exit
    
  5. README.md lists some required tools; with apt-cyg they can be installed like this:

    $ apt-cyg install ghostscript poppler pstotext

  6. various errors still arise because of missing libraries; in summary, I fixed them like this:

    • the typical error is
      C:/cygwin64/bin/pdftotext.exe: error while loading shared libraries: ?: cannot open shared object file: No such file or directory

    • to determine the missing libraries, run cygcheck on the failing binary and repeat until nothing is reported as missing, e.g.: $ cygcheck /usr/bin/pdftotext.exe which may print something like:

    C:\cygwin64\bin\pdftotext.exe
    
      C:\cygwin64\bin\cygpoppler-106.dll
        C:\cygwin64\bin\cygwin1.dll
          C:\Windows\system32\KERNEL32.dll
            C:\Program Files\Eclipse Adoptium\jdk-17.0.2.8-hotspot\bin\api-ms-win-core-rtlsupport-l1-1-0.dll
            C:\Windows\system32\ntdll.dll
            C:\Windows\system32\KERNELBASE.dll
    ...
          C:\cygwin64\bin\cygssh2-1.dll
            C:\cygwin64\bin\cygcrypto-1.1.dll
          C:\cygwin64\bin\cygzstd-1.dll
    cygcheck: track_down: could not find cygfontconfig-1.dll
    
    cygcheck: track_down: could not find cygfreetype-6.dll
    
    cygcheck: track_down: could not find cygjpeg-8.dll
    
    cygcheck: track_down: could not find cyglcms2-2.dll
    
    cygcheck: track_down: could not find cygnspr4.dll
    
    cygcheck: track_down: could not find cygnss3.dll
    
    cygcheck: track_down: could not find cygopenjp2-7.dll
    
    cygcheck: track_down: could not find cygplc4.dll
    
    cygcheck: track_down: could not find cygpng16-16.dll
    
    cygcheck: track_down: could not find cygsmime3.dll
    
    cygcheck: track_down: could not find cygtiff-6.dll
    
    • the library dependency solution for me was:
      $ apt-cyg install libpoppler106 libfontconfig1 libfreetype6 libjpeg8 liblcms2_2 libnspr4 libnss3 libopenjp2_7 libpng16 libtiff6 libdeflate0 libjbig2 libwebp7 libexpat1 libgs10 libX11_6 libXt6 libpaper1 libtiff7 libxcb1 libICE6 libSM6 libXau6 libXdmcp6 libiconv
  7. if you get the error GPL Ghostscript 10.01.2: Unrecoverable error, exit code 1 mentioned in the question: I fixed it by moving my PDFs to a local drive (C:); before that they were on a mapped network share
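
The OS check from step 3 grows by one `[ ... ]` test per platform. As a minimal sketch, the same three values could be matched with a `case` statement instead. The function name `matches_os_check` is hypothetical and not part of postbank_pdf2csv, and the sketch only reproduces the condition from the modified line. What the script does after the check is not shown here.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of postbank_pdf2csv: returns 0 when the
# given OS name is one of the three values tested in the modified check.
matches_os_check() {
    case "$1" in
        FreeBSD|Darwin|Cygwin) return 0 ;;
        *) return 1 ;;
    esac
}

# Usage: feed it the same value the script tests, i.e. the output of `uname -o`.
os_name="$(uname -o 2>/dev/null)"
if matches_os_check "$os_name"; then
    echo "OS check matched: $os_name"
else
    echo "OS check did not match: $os_name"
fi
```

Under Cygwin, `uname -o` prints `Cygwin`, which is why adding that value to the condition is enough.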

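The iterative cygcheck run from step 6 can be made less tedious by filtering its output down to the missing DLL names. The following is a rough sketch that assumes the missing libraries are reported in the `track_down: could not find ...` form shown above. The function name `missing_dlls` is made up for this illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads cygcheck output on stdin and prints the
# names of DLLs reported as "could not find", one per line, de-duplicated.
missing_dlls() {
    tr -d '\r' \
        | sed -n 's/^.*track_down: could not find \(.*\)$/\1/p' \
        | sort -u
}

# Usage (only meaningful inside Cygwin, hence the guard):
if command -v cygcheck >/dev/null 2>&1; then
    # cygcheck may write these messages to stderr, so merge both streams.
    cygcheck /usr/bin/pdftotext.exe 2>&1 | missing_dlls
fi
```

Each printed name can then be mapped to a package, for example with `cygcheck -p cygfontconfig-1.dll`, which searches the Cygwin package repository and needs an internet connection. Repeat after installing until the list is empty.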