TIKA - Compute Content-Encoding of a document

I'm using Tika 1.26 to extract metadata from a document.

I first tried the Tika Server and then switched to the programmatic API. However, although the documentation states that a document's Content-Encoding should be returned by the /meta API or the MetadataParser, the property is not actually returned in either case.

I found that the API that actually returns a charset is the CharsetDetector, but I don't know how to invoke that same API through the Tika Server, and I'm stuck at this point.
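For reference, here is a minimal sketch of the programmatic call I mean. It assumes the class in question is `org.apache.tika.parser.txt.CharsetDetector` with tika-parsers 1.26 on the classpath; the class name `EncodingSketch`, the helper `detectEncoding`, and the sample text are made up for illustration only.

```java
import java.nio.charset.StandardCharsets;

import org.apache.tika.parser.txt.CharsetDetector;
import org.apache.tika.parser.txt.CharsetMatch;

public class EncodingSketch {

    // Hypothetical helper: returns the detector's best-guess charset name
    // for the given bytes, or null if the detector produces no match.
    static String detectEncoding(byte[] content) {
        CharsetDetector detector = new CharsetDetector();
        detector.setText(content);              // raw bytes of the document
        CharsetMatch match = detector.detect(); // best match, may be null
        return match == null ? null : match.getName();
    }

    public static void main(String[] args) {
        // Illustrative sample only; in practice these would be the document's bytes.
        byte[] sample = "Grüße aus Köln, café, naïve, señor, żółć, œuvre"
                .getBytes(StandardCharsets.UTF_8);
        System.out.println(detectEncoding(sample));
    }
}
```

This gives me a charset guess locally, but the result is a statistical guess rather than a declared Content-Encoding, and I have not found the equivalent call on the server side.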

Can someone point out the correct way to handle this use case, or tell me whether I'm doing something wrong?
