Apache Tika does not extract text from OneNote files correctly


When I extract text from a OneNote file with Apache Tika, one line appears multiple times in the extracted output.

I tried using OneNoteParser directly instead of AutoDetectParser, but the output is still the same.

I couldn't find a bug report about this on the Apache Tika website either. Maybe it needs a fix on the Apache Tika side.
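
To make the duplication easier to describe in a bug report, a small helper along these lines could list which lines of the extracted text occur more than once. This is only a sketch: it uses the Java standard library alone and does not call Tika, and the class name `DuplicateLineReport`, the method `countRepeats`, and the sample text are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DuplicateLineReport {

    // Hypothetical helper: returns every non-blank line that occurs
    // more than once in the extracted text, mapped to its count.
    public static Map<String, Integer> countRepeats(String extractedText) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : extractedText.split("\\R")) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore blank lines
            }
            counts.merge(trimmed, 1, Integer::sum);
        }
        // Keep only the lines that were seen at least twice.
        counts.values().removeIf(n -> n < 2);
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample standing in for the text extracted from a OneNote file.
        String sample = "Meeting notes\nBuy milk\nMeeting notes\nMeeting notes\n";
        countRepeats(sample).forEach((line, n) -> System.out.println(n + "x " + line));
    }
}
```

Running the same check on the output of both AutoDetectParser and OneNoteParser would show whether the same lines are repeated in each case.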
