Is there a high-fidelity way to convert HTML into PDF and DOCX?

I need to convert HTML files into PDF and DOCX respectively (just the HTML -> PDF part would be good enough for now, though).

I know there are projects that help with what I want to achieve; I am currently using HTML-Renderer for the PDF part and OpenXML for the DOCX.

I've tried HTML-Renderer, but the fidelity of the conversion is not great: I read somewhere that I can't make headers and footers with HTML for multipage formats. Furthermore, the conversion cuts off the end of the text where it passes from one page to the next.

As for the DOCX, I don't know what the best options are.

If possible, I want to know what good high-fidelity ways there are to convert HTML to those formats. Any help is greatly appreciated.

I'm open to ideas/advice on how to make it myself, but right now I don't have the time to do so, so I would much rather use an existing NuGet/DLL/library.

There is 1 solution below

You could consider shelling out to pandoc:

For visual appeal, you might like the Eisvogel template:

...which, although designed for Markdown, ought to work for well-structured, semantic HTML as input to Pandoc too.
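
To make "shelling out" concrete, here is a minimal sketch in Python (not the asker's .NET stack; the same idea maps onto launching a child process there). The helper names `build_pandoc_command` and `convert` are hypothetical, and the sketch assumes `pandoc` is on the PATH and, for PDF output with Eisvogel, that a LaTeX engine and the template are already installed. It uses only pandoc's `-f`, `-o`, `--template` and `--pdf-engine` options.

```python
import shutil
import subprocess
from pathlib import Path


def build_pandoc_command(src, dst, template=None, pdf_engine=None):
    """Build the argument list for one pandoc run (nothing is executed here)."""
    # pandoc picks the output format from the extension of the -o file.
    cmd = ["pandoc", "-f", "html", str(src), "-o", str(dst)]
    if template:
        # e.g. "eisvogel", assuming that template is installed for pandoc
        cmd.append(f"--template={template}")
    if pdf_engine:
        # e.g. "xelatex"; only relevant when the output is a PDF
        cmd.append(f"--pdf-engine={pdf_engine}")
    return cmd


def convert(src, dst, **options):
    """Run pandoc as a child process and return the output path."""
    if shutil.which("pandoc") is None:
        raise RuntimeError("pandoc was not found on PATH")
    # check=True raises CalledProcessError if pandoc exits with an error.
    subprocess.run(build_pandoc_command(src, dst, **options), check=True)
    return Path(dst)
```

With a hypothetical input file, `convert("report.html", "report.docx")` would produce the DOCX, and `convert("report.html", "report.pdf", template="eisvogel", pdf_engine="xelatex")` the PDF. Whether the result is faithful enough depends on the HTML: pandoc converts document structure rather than rendering CSS layout.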