What are the steps to verify the integrity of these document types: doc, docx, docm, odt, rtf, pdf, odf, odp, xls, xlsx, xlsm, ppt, pptm?
Or at least some of them, typically at the point where they are uploaded to a content repository.
I guess the InputStream is read correctly from the multipart HTTP request 99.99% of the time; otherwise an exception would be thrown and I could act on it. But a user can upload a file that is already corrupted. Do I use third-party libraries to check for that? I didn't see anything like that in odftoolkit, itextpdf, pdfbox, Apache POI or Tika.
For all of the file formats listed above there are third-party libraries that can open them. I don't know of a "verification only" library, but being able to open a file without exceptions is at least a basic check that it conforms to the claimed format. One such (commercial) library is Aspose. I'm not affiliated, just a happy customer.
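
If you want a cheap pre-check before handing the file to a full parser, here is a minimal sketch using only the JDK. It is not a replacement for actually opening the document with a format library. It relies on a few container-level facts: PDF files normally start with `%PDF-`, RTF with `{\rtf`, the binary Office formats (doc, xls, ppt) are OLE2 compound files with a fixed 8-byte signature, and the OOXML and OpenDocument formats are ZIP archives. The class and method names (`UploadSanityCheck`, `looksIntact`, `zipReadsCleanly`) are made up for this example, and I am assuming the upload fits in memory as a `byte[]`.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Locale;
import java.util.zip.ZipInputStream;

// Hypothetical helper: container-level sanity checks only, not full validation.
final class UploadSanityCheck {

    private static final byte[] PDF = "%PDF-".getBytes(StandardCharsets.US_ASCII);
    private static final byte[] RTF = "{\\rtf".getBytes(StandardCharsets.US_ASCII);
    // OLE2 compound file signature used by the binary doc/xls/ppt formats.
    private static final byte[] OLE2 = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };
    // ZIP local file header signature ("PK\003\004").
    private static final byte[] ZIP = {'P', 'K', 3, 4};

    static boolean startsWith(byte[] data, byte[] magic) {
        return data.length >= magic.length
                && Arrays.equals(Arrays.copyOf(data, magic.length), magic);
    }

    // Streams through every entry; ZipInputStream throws on truncated
    // compressed data or a CRC mismatch while an entry is being read.
    static boolean zipReadsCleanly(byte[] data) {
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(data))) {
            byte[] buf = new byte[8192];
            int entries = 0;
            while (zip.getNextEntry() != null) {
                while (zip.read(buf) != -1) {
                    // discard the bytes, we only care that reading succeeds
                }
                entries++;
            }
            return entries > 0;
        } catch (IOException | IllegalArgumentException e) {
            return false;
        }
    }

    // Assumption: the caller passes the file extension without the dot.
    static boolean looksIntact(String extension, byte[] data) {
        switch (extension.toLowerCase(Locale.ROOT)) {
            case "pdf":
                return startsWith(data, PDF);
            case "rtf":
                return startsWith(data, RTF);
            case "doc": case "xls": case "ppt":
                return startsWith(data, OLE2);
            case "docx": case "docm": case "xlsx": case "xlsm": case "pptm":
            case "odt": case "odp": case "odf":
                return startsWith(data, ZIP) && zipReadsCleanly(data);
            default:
                return false; // unknown type: let the caller decide
        }
    }
}
```

You would call it as `UploadSanityCheck.looksIntact("docx", bytes)` right after reading the upload. A `false` result means the file is very likely broken or mislabelled. A `true` result only means it is not obviously broken: the header checks say nothing about the rest of the file, and `ZipInputStream` does not read the central directory at the end of the archive, so damage there goes unnoticed. For anything stricter you still need to open the document with a format-specific library and watch for exceptions.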