I am trying to extract all the embedded files from a Word file (.docx) and save them in a separate folder. I followed the example from the Apache Tika project here: https://svn.apache.org/repos/asf/tika/trunk/tika-example/src/main/java/org/apache/tika/example/ExtractEmbeddedFiles.java
It parses most of the embedded objects correctly, but it extracts embedded WordPad files as OleObject.bin. I want to get the WordPad files in the same format in which they were embedded in the document.
I am new to Apache Tika and have not been able to find a solution through a Google search. There was a mention of a fix related to this problem in Tika 1.3, but I am using 1.18, so I assume it is already fixed and that I am missing something in my implementation. Any help with this issue would be appreciated.
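
For comparison, here is a minimal sketch that does not use Tika at all and only dumps the raw embedded parts of the .docx to a folder. It assumes the .docx is a ZIP package and that Word stores embedded objects under word/embeddings/ (the class name DocxEmbeddingsDump and the method dump are made up for illustration). It can help confirm whether the WordPad object is already stored in the package as an OLE .bin wrapper; it does not unwrap that container.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical helper, not part of Apache Tika.
public class DocxEmbeddingsDump {

    // Assumption: embedded objects of a .docx live under this path in the ZIP.
    static final String EMBEDDINGS_PREFIX = "word/embeddings/";

    // Copies every entry under word/embeddings/ into outDir, unchanged,
    // and returns the paths of the files that were written.
    public static List<Path> dump(Path docx, Path outDir) throws IOException {
        Files.createDirectories(outDir);
        List<Path> written = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(Files.newInputStream(docx))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                String name = entry.getName();
                if (entry.isDirectory() || !name.startsWith(EMBEDDINGS_PREFIX)) {
                    continue;
                }
                // Keep only the last path segment so nothing is written outside outDir.
                String fileName = name.substring(name.lastIndexOf('/') + 1);
                if (fileName.isEmpty()) {
                    continue;
                }
                Path target = outDir.resolve(fileName);
                // Reads up to the end of the current ZIP entry; the stream stays open.
                Files.copy(zip, target, StandardCopyOption.REPLACE_EXISTING);
                written.add(target);
            }
        }
        return written;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: DocxEmbeddingsDump <file.docx> <output-dir>");
            return;
        }
        for (Path p : dump(Paths.get(args[0]), Paths.get(args[1]))) {
            System.out.println(p);
        }
    }
}
```

If the WordPad object shows up here as an oleObject .bin file too, then the .bin is how it is stored in the package itself, and the original file would have to be unwrapped from that OLE container.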