MarkLogic - MS Word and Excel to XML through CPF WEBDAV


I am using a Developer license to learn MarkLogic; I am a certified MarkLogic developer.

https://docs.marklogic.com/guide/cpf/default

Following the guide linked above, I can successfully generate XML from an input PDF file loaded through WebDAV, but I cannot generate XML from Microsoft Word or Excel files. I have enabled all the pipelines. The Excel and Word documents load successfully, but I do not see any XML generated for them.

What could be the reason? Any guidance would be appreciated, since I need this feature to demonstrate a prototype.

Best answer

Upconversion is supported for old-style MS Office documents (.doc). For new-style documents (.docx), you need to make sure the OpenXML Extract and WordprocessingML Process pipelines are attached to the domain. They extract the Office XML from the .docx zip and clean it up slightly.
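
To make "extract the Office XML from the .docx zip" concrete: a .docx file is a zip package whose main body lives in a part named word/document.xml. The Python sketch below is only an illustration of that packaging, not MarkLogic code and not what the pipelines run internally. The helper names (list_xml_parts, build_sample_package) are hypothetical, and the sample package is a tiny stand-in built in memory, not a complete Word document.

```python
import io
import zipfile

# Real WordprocessingML namespace used by word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def list_xml_parts(package_bytes: bytes) -> dict[str, str]:
    """Return {part name: XML text} for every .xml part in a zip package."""
    parts = {}
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as package:
        for name in package.namelist():
            if name.endswith(".xml"):
                parts[name] = package.read(name).decode("utf-8")
    return parts


def build_sample_package() -> bytes:
    """Build a tiny in-memory stand-in for a .docx (hypothetical sample data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr(
            "word/document.xml",
            f"<w:document xmlns:w='{W_NS}'><w:body/></w:document>",
        )
    return buffer.getvalue()


# Unpack the sample and show which XML parts it contains.
parts = list_xml_parts(build_sample_package())
for name in sorted(parts):
    print(name, len(parts[name]), "characters")
```

Running the same list_xml_parts helper on the bytes of a real .docx should list its XML parts in the same way, which is a quick check that the file you loaded is a new-style document rather than an old binary .doc.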