Converting docx into pdf from an ASP.NET MVC app


I am trying to convert a docx file into a PDF from an ASP.NET MVC application. Until now I have been using the Microsoft Office interop SaveAs command, but it sometimes (not always) fails with the error "command failed". I have read that it is deprecated and no longer supported by Microsoft, and that Microsoft does not recommend using it from an ASP.NET application, so I am looking for alternatives.

I have seen that there is a good one, Aspose.Words, but it is not free, and I am interested in a free one. Is there currently any free alternative that is compatible with Microsoft docx documents and capable of converting them to PDF without problems?


I am interested in a free one

There isn't one. Office/Word's .docx file format is incredibly long and complicated (see below), so writing a program that can fully parse a Word document is a mammoth undertaking on its own, let alone the equally important tasks of generating a visual-formatting model from the parsed document and then converting that model to a PDF file by generating PostScript/PDF commands from it.

This is what the OOXML specification looks like when it's printed out:

[Image: the OOXML specification printed out]

(Source: https://fussnotes.typepad.com/plexnex/2007/05/ooxml_more_than_1.html )

Then consider all the features and edge cases present in the Word formatting model: tables, headings, drop caps, captions, floating text boxes, WordArt, and so on (and don't forget embedded and external content using OLE!).

Non-visual processing of the XML representation of a Word document is actually trivial and can be done with any XML library, though you should use an OOXML-schema-aware library so that you process the Word document correctly (and don't end up inserting a paragraph into a header, or a caption that fills the page).
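To illustrate how little is needed for the non-visual part, here is a minimal sketch in Python using only the standard library. It relies on two facts about the format: a .docx file is a ZIP archive, and the main body is stored in the part `word/document.xml` using the WordprocessingML namespace. The function name `extract_paragraphs` is made up for this example, and the sketch is deliberately not schema-aware: it only reads text and ignores headers, footnotes, fields, and everything related to layout.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace of the main WordprocessingML elements (w:p, w:r, w:t, ...).
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def extract_paragraphs(docx_file):
    """Return the plain text of every paragraph in the main document part.

    docx_file may be a file path or a binary file-like object.
    Hypothetical helper for illustration only; it reads text and nothing else.
    """
    # A .docx file is a ZIP archive; the body lives in word/document.xml.
    with zipfile.ZipFile(docx_file) as archive:
        xml_bytes = archive.read("word/document.xml")

    root = ET.fromstring(xml_bytes)
    paragraphs = []
    # Every w:p element is a paragraph (including those inside tables).
    for paragraph in root.iter(f"{{{W_NS}}}p"):
        # The visible text is split across w:t elements inside runs (w:r).
        text = "".join(t.text or "" for t in paragraph.iter(f"{{{W_NS}}}t"))
        paragraphs.append(text)
    return paragraphs
```

Calling `extract_paragraphs("report.docx")` (the file name is a placeholder) returns a list of strings, one per paragraph. The same approach should carry over to .NET with `System.IO.Compression.ZipArchive` and `System.Xml.Linq`. Note that nothing here gets you any closer to a PDF: it says nothing about fonts, line breaks, or page layout, which is exactly the hard part.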

Everything else is the difficult (and expensive) part of the problem. This is why, even today, almost 40 years after Word was first released and 15 years after the OOXML format specification was released, third-party software like OpenOffice (née StarOffice) and Apple iWork still cannot fully and correctly import or render Word documents.