pdf2htmlEX's output shows Times New Roman font for only a few characters?

I have never seen anything like this. I use a tool called pdf2htmlEX, which converts a PDF to HTML, but I have a weird issue. Look at this screenshot:

See the first character (W)? It's in Times New Roman. Now here's the even weirder part:

Only the W and the ' are in Times New Roman (2 glyphs), while the rest are in Liberation Sans. How on earth is that possible? How is pdf2htmlEX able to use a different font for each character?

Mind you, if I write these characters anywhere else, they're all in a sans-serif font (the document was originally in Verdana, so that's why).

Any clue why this is happening and how I can fix it?

1 Answer

Sooo I might've found an answer, but it's honestly not the one I wanted.

The PDF I have, which was created in Microsoft Word and exported as PDF, never used the character W anywhere. When I added a W somewhere in the document and converted it again, it showed up normally in the HTML.

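I have a feeling the font pdf2htmlEX outputs only contains the characters actually used in the document, so any character that never appeared in the PDF falls back to the browser's default font (Times New Roman here). Very odd.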
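To make that guess a bit more concrete, here's a rough Python sketch of what I think is going on. It doesn't call pdf2htmlEX or read any real font file. The function names (`build_subset`, `missing_characters`, `resolve_fonts`) and the two font labels are made up for illustration. It assumes the converted font only covers characters that appeared in the original PDF, and that the browser picks a fallback font character by character for everything else.

```python
# Hypothetical labels, not real font names from pdf2htmlEX output.
EMBEDDED = "embedded-subset"
FALLBACK = "browser-fallback"


def build_subset(document_text):
    """Characters a subsetted font would cover (assumption: only those used)."""
    return {ch for ch in document_text if not ch.isspace()}


def missing_characters(text, subset):
    """Characters in `text` that the subsetted font has no glyph for."""
    return sorted({ch for ch in text if not ch.isspace() and ch not in subset})


def resolve_fonts(text, subset):
    """Pair each character with the font that would end up rendering it."""
    return [
        (ch, EMBEDDED if ch.isspace() or ch in subset else FALLBACK)
        for ch in text
    ]


if __name__ == "__main__":
    # Text that existed in the PDF at conversion time (no "W", no apostrophe).
    subset = build_subset("hats new here")

    # Text typed into the HTML afterwards.
    edited = "What's new"

    print(missing_characters(edited, subset))  # ["'", 'W']
    for ch, font in resolve_fonts(edited, subset):
        print(repr(ch), font)
```

If this is right, running something like `missing_characters` over any text you plan to add would tell you up front which characters will drop to the fallback font.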

Not sure I have a fix for this, but now I have an explanation at least.