Convert PDF to HTML and preserve letter spacing

We have many PDF files that need to be converted to HTML so that users can search for and highlight words in a web browser, and so that we can index the searchable words server-side.

I have used the online service at http://pdf.investintech.com (Step 1, Step 2 on linked page) to convert a PDF to HTML. I've tried others as well.

The PDFs have tables with background shading. This converter uses absolute positioning for each line of text, which works well, but each rendered line of text is slightly longer than in the PDF, so it extends beyond the shaded background area.

Can you point me to a solution that preserves the letter spacing, so that converting to HTML does not change the rendered length of each line? Vertical alignment is important as well, but absolutely positioning a separate div per line takes care of that nicely.
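
To make the mismatch concrete, here is a minimal sketch of the arithmetic a per-line correction would involve. It is only an illustration, not something the converter above does: the function names and the example widths are hypothetical, the rendered width would have to be measured in the browser, and it assumes CSS letter-spacing is applied to the gaps between characters.

```python
def letter_spacing_correction(target_width_px: float,
                              rendered_width_px: float,
                              text: str) -> float:
    """Return a CSS letter-spacing value (px) that would stretch or shrink
    a rendered line so it matches the width the line had in the PDF.

    Assumption: spacing is distributed over the len(text) - 1 gaps
    between characters.
    """
    gaps = len(text) - 1
    if gaps <= 0:
        # A single character (or empty line) has no gaps to adjust.
        return 0.0
    return (target_width_px - rendered_width_px) / gaps


def line_style(left_px: float, top_px: float, spacing_px: float) -> str:
    """Build an inline style for one absolutely positioned line of text."""
    return (
        f"position:absolute;left:{left_px}px;top:{top_px}px;"
        f"letter-spacing:{spacing_px:.3f}px"
    )
```

A line that is wider in the browser than in the PDF gets a small negative letter-spacing, which pulls it back inside the shaded area.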

1 Answer

After some additional Google searching, I found this project, which seems to be more robust than any other I could find. It is especially good at text alignment and selection handling.

https://github.com/coolwanglu/pdf2htmlEX
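
Since there are many files to convert, a small wrapper script may help. The sketch below assumes the `pdf2htmlEX` binary is installed and on the PATH, and that your version accepts the `--zoom` and `--dest-dir` options (check `pdf2htmlEX --help`, since options can differ between releases). The helper names and file paths are hypothetical.

```python
import subprocess
from pathlib import Path


def build_pdf2htmlex_command(pdf_path, dest_dir, zoom=1.0):
    """Return the argument list for one pdf2htmlEX run.

    Assumes the --zoom and --dest-dir options are available in the
    installed version.
    """
    return [
        "pdf2htmlEX",
        "--zoom", str(zoom),
        "--dest-dir", str(dest_dir),
        str(pdf_path),
    ]


def convert_all(pdf_dir, dest_dir, zoom=1.0):
    """Convert every .pdf file in pdf_dir, writing the HTML into dest_dir."""
    Path(dest_dir).mkdir(parents=True, exist_ok=True)
    for pdf_path in sorted(Path(pdf_dir).glob("*.pdf")):
        # check=True raises CalledProcessError if a conversion fails.
        subprocess.run(
            build_pdf2htmlex_command(pdf_path, dest_dir, zoom),
            check=True,
        )
```

Calling `convert_all("pdfs", "html")` would then process a whole folder in one go; the resulting HTML can be fed to whatever server-side indexer you use.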