Parse OLE-Object Email Attachments from Outlook (Java)


Situation: The system fetches emails from a mail server via standard methods (POP3) and sends them to the archiving component as multipart messages (*.eml files).

If the mail was sent from Outlook, it may contain an OLE object, for example an MS Word or MS Excel document. There are several ways to include such an object, for example via the menu "Insert->Object".

Problem: Our requirement is now to extract those OLE objects and archive them as separate attachments. Ideally this would be done in Java or another JVM language. Other languages and frameworks are possible, but they must work across platforms (Windows, Linux, Unix).

The problem is that we haven't found any library, or any function in the libraries we know, that does this.

The first issue is that the message the receiver gets depends on how Outlook is configured:

  • It may send RTF messages: then the receiver gets a message with an attachment "Untitled Attachment.bin".
  • It may send HTML messages: then the receiver gets a message including an attachment "oledata.mso".

What we've tried so far: We tried Apache POI, specifically POIFS, to load the file "oledata.mso", but it complained that a header value is wrong:

Invalid header signature; read 0xD7EC9C7800013000, expected 0xE11AB1A1E011CFD0 - Your file appears not to be a valid OLE2 document
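Decoding the value from this message gives a hint. POI reports the first eight bytes as a little-endian 64-bit number, so on disk they are 00 30 01 00 78 9C EC D7. The pair 78 9C is the usual zlib stream header, which suggests (this is an assumption drawn only from the error message, not from any specification) that oledata.mso is a small length prefix followed by zlib-compressed data. Below is a minimal sketch using only the JDK; the class `MsoProbe` and its method names are made up for illustration:

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.zip.DataFormatException;
import java.util.zip.Inflater;

final class MsoProbe {
    // OLE2 / Compound File Binary signature in on-disk byte order.
    static final byte[] OLE2_MAGIC = {
        (byte) 0xD0, (byte) 0xCF, 0x11, (byte) 0xE0,
        (byte) 0xA1, (byte) 0xB1, 0x1A, (byte) 0xE1
    };

    // Converts the 64-bit value from the POI message back to on-disk byte order.
    static byte[] signatureBytes(long littleEndianValue) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(littleEndianValue)
                .array();
    }

    // Returns the first offset that looks like a zlib header, or -1.
    // A zlib header is two bytes (CMF, FLG) whose 16-bit value is a multiple of 31;
    // only CMF == 0x78 (deflate, 32 KiB window) is considered here.
    static int findZlibOffset(byte[] data) {
        for (int i = 0; i + 1 < data.length; i++) {
            int cmf = data[i] & 0xFF;
            int flg = data[i + 1] & 0xFF;
            if (cmf == 0x78 && ((cmf << 8) | flg) % 31 == 0) {
                return i;
            }
        }
        return -1;
    }

    // Inflates the zlib stream starting at the given offset.
    // Assumption: everything from offset onward is one zlib stream.
    static byte[] inflateFrom(byte[] data, int offset) {
        Inflater inflater = new Inflater();
        inflater.setInput(data, offset, data.length - offset);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[8192];
        try {
            while (!inflater.finished()) {
                int n = inflater.inflate(buf);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    break; // truncated input or preset dictionary required: stop
                }
                out.write(buf, 0, n);
            }
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("Not a zlib stream at offset " + offset, e);
        } finally {
            inflater.end();
        }
        return out.toByteArray();
    }

    // True if the buffer begins with the OLE2 signature that POIFS expects.
    static boolean startsWithOle2Magic(byte[] data) {
        if (data.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if (data[i] != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

The idea would be to read oledata.mso into a byte array, call `findZlibOffset`, inflate from that offset, and check `startsWithOle2Magic` on the result. If that check passes, the inflated bytes could presumably be wrapped in a `ByteArrayInputStream` and handed to POI's `POIFSFileSystem`. Whether the file holds one stream or several, and what the leading four bytes mean, remains unverified.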

We found a website discussing the same issue. As far as we understand, oledata.mso is a collection of Compound File Binary files, each of which should be parsable with POI individually, because OpenMCDF does the same thing as POI. On that website they somehow separate the parts and parse them separately, but we haven't found a similar function or any specification of how this is done.

Can anybody please shed some light on this?
