Efficient image compression for PDF embedding with Linux


I would like to compress scanned text (monochrome or few colours) and store it in PDF (or maybe DjVu) files. I remember getting very good results on Windows with Acrobat and "ZRLE"-compressed monochrome TIFFs embedded into PDF. The algorithm was lossless, as far as I remember. Now I am looking for a way to get good results on Linux. It should save storage and avoid loss (I do not mind losing colours, but I do not want e.g. JPEG compression, which would create noisy results for text scans). I need it for batch conversion, so I was thinking of the ImageMagick convert command. But which output format should I use to get good results and still be able to embed the images into PDF files (for example using pdflatex)? Or is it generally better to use DjVu files?


There are 2 best solutions below

0
On BEST ANSWER

jbig2enc, an encoder for images using JBIG2 compression, was originally written for Google Books by Adam Langley:

https://github.com/agl/jbig2enc

I forked it to include the latest improvements by Rubypdf and others:

https://github.com/DingoDog/jbig2enc

I also built several binaries of jbig2enc for Puppy Linux (they may also work on other distributions):

http://dokupuppylinux.info/programs:encoders
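
For batch conversion, a minimal sketch of how jbig2enc might be driven from Python is given here. It assumes the encoder is installed as `jbig2` on the PATH and that the flags `-s` (symbol mode), `-p` (PDF-ready output) and `-b` (output basename) behave as described in the project's README; the function names `build_jbig2_command` and `encode_directory`, the `*.tif` pattern and the default basename are purely illustrative. Symbol mode may merge similar-looking glyphs, so if strictly lossless output is required, check `jbig2 --help` for the generic (non-symbol) mode before relying on this.

```python
import shutil
import subprocess
from pathlib import Path


def build_jbig2_command(pages, basename="output"):
    """Return the argv list for one jbig2enc run over all given pages.

    Assumed flags (verify against your build with `jbig2 --help`):
      -s  symbol mode, -p  PDF-ready output, -b  basename of output files.
    """
    cmd = ["jbig2", "-s", "-p", "-b", basename]
    cmd += [str(page) for page in pages]
    return cmd


def encode_directory(scan_dir, basename="output"):
    """Encode every *.tif in scan_dir with a single jbig2enc call."""
    pages = sorted(Path(scan_dir).glob("*.tif"))
    if not pages:
        raise FileNotFoundError(f"no .tif scans found in {scan_dir}")
    if shutil.which("jbig2") is None:
        raise RuntimeError("jbig2 (jbig2enc) was not found on PATH")
    # check=True raises CalledProcessError if the encoder fails.
    subprocess.run(build_jbig2_command(pages, basename), check=True)
```

Calling `encode_directory("scans")` should leave a symbol table and per-page files named after the basename in the working directory. The repository ships a small Python script (named `pdf.py` in the original README, renamed in some forks) that wraps those files into a PDF, which can then be included from pdflatex like any other PDF.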

1
On

DjVu is not a bad choice, but if you want to stay with PDF for better compatibility, you may want to look into lossless JBIG2 compression.

Quote from Wikipedia:

Overall, the algorithm used by JBIG2 to compress text is very similar to the JB2 compression scheme used in the DjVu file format for coding binary images.