I have some OLE files in OLE2 format from a legacy system. I think they are Office Word or Excel documents with embedded objects (e.g. pictures). If I rename a file to a docx or xlsx extension, Office says the file is corrupted.
Can I extract the contents of these OLE files with an existing C# library and save them as Word or Excel documents?
NOTE:
- I think the OlePres\d\d\d streams are the embedded OLE objects.
- The Ole stream says the object is embedded, not linked.
- The CompObj stream indicates the file type, e.g. Microsoft Word Document.
- For package-type OLE files, I followed the blog post below and successfully extracted the file from the Ole10Native stream: https://eigenein.wordpress.com/2011/08/03/how-to-extract-ole-attachment-body-from-ole10native-stream/
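A quick check that may explain the "corrupted" message after renaming to docx or xlsx: those formats are ZIP packages, while an OLE2 file is a compound file with a different signature. The sketch below only looks at the first bytes of a file; the function names are made up for illustration, and Python is used here only because it needs nothing beyond the standard library.

```python
# Signature of an OLE2 / compound file (doc, xls, and OLE containers).
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")
# Signature of a ZIP local file header (docx and xlsx are ZIP packages).
ZIP_MAGIC = b"PK\x03\x04"


def sniff_container(header: bytes) -> str:
    """Classify a file by its leading bytes: 'ole2', 'zip', or 'unknown'."""
    if header.startswith(OLE2_MAGIC):
        return "ole2"
    if header.startswith(ZIP_MAGIC):
        return "zip"
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a file and classify them (hypothetical helper)."""
    with open(path, "rb") as f:
        return sniff_container(f.read(8))
```

A file that reports "ole2" will not open as docx or xlsx just by renaming it, because Office expects a ZIP package for those extensions.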
Updates: (Possible solution)
For new-style formats (e.g. xlsx, docx), I can extract the Package stream and save it as an xlsx or docx file.
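Once the Package stream bytes have been read (with whatever OLE library is used), it may help to check what they contain before choosing an extension. This is a minimal sketch, assuming the usual part names that Word and Excel write (word/document.xml and xl/workbook.xml); the function name and the "bin" fallback are made up for illustration.

```python
import io
import zipfile


def guess_ooxml_extension(package_bytes: bytes) -> str:
    """Return 'docx', 'xlsx', or 'bin' for bytes read from a Package stream."""
    # A docx/xlsx payload must be a valid ZIP archive.
    if not zipfile.is_zipfile(io.BytesIO(package_bytes)):
        return "bin"
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as zf:
        names = set(zf.namelist())
    # Assumption: the main part uses the conventional name for each format.
    if "word/document.xml" in names:
        return "docx"
    if "xl/workbook.xml" in names:
        return "xlsx"
    return "bin"
```

The returned extension can then be used when writing the bytes to disk; anything reported as "bin" probably needs a closer look by hand.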
For old-style formats (e.g. xls, doc), simply renaming the OLE file to that extension works. However, some of these files cannot be opened in MS Office, although they open fine in LibreOffice. So I use the LibreOffice command-line tool to convert them to the same format, e.g.
soffice --convert-to docx *.docx --outdir ../Converted
The converted files can then be opened in MS Office.
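The LibreOffice conversion step could also be scripted for a whole folder. The sketch below assumes soffice is on the PATH and adds the --headless flag so no window opens (that flag is not in my command above); the function names and folder layout are made up for illustration.

```python
import subprocess
from pathlib import Path


def build_convert_command(src: Path, out_dir: Path) -> list:
    """Build a soffice command that re-saves src in its own format."""
    target = src.suffix.lstrip(".")  # e.g. "doc" for "report.doc"
    return [
        "soffice", "--headless",
        "--convert-to", target,
        "--outdir", str(out_dir),
        str(src),
    ]


def convert_all(in_dir: Path, out_dir: Path, pattern: str = "*.doc") -> None:
    """Run the conversion for every file in in_dir matching pattern."""
    out_dir.mkdir(parents=True, exist_ok=True)
    for src in sorted(in_dir.glob(pattern)):
        # check=True raises CalledProcessError if soffice exits with an error.
        subprocess.run(build_convert_command(src, out_dir), check=True)
```

For example, `convert_all(Path("."), Path("../Converted"), "*.xls")` would re-save every xls file in the current folder into ../Converted.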