High CPU consumption by Apache Tika

Asked by Yogesh Tembe on 19 March 2024 at 04:55 (80 views)

I am using Apache Tika to extract text from PDF files. My problem is that the Tika service shows CPU spikes of 100-400% on Linux.

I'm using Tika 2.9.1, which is the latest stable version of Tika. I also observed the same CPU spikes when using Tika 1.20.

I'm using the following code to get the text from a PDF file:

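import java.io.FileInputStream;
import java.io.InputStream;
import org.apache.tika.metadata.Metadata;
import org.apache.tika.parser.ParseContext;
import org.apache.tika.parser.pdf.PDFParser;
import org.apache.tika.sax.BodyContentHandler;

BodyContentHandler handler = new BodyContentHandler();
Metadata metadata = new Metadata();
ParseContext pcontext = new ParseContext();

// Parse the document using the PDF parser; try-with-resources closes the stream.
try (InputStream inputstream = new FileInputStream("Example.pdf")) {
    PDFParser pdfparser = new PDFParser();
    pdfparser.parse(inputstream, handler, metadata, pcontext);
}

// Print the extracted content of the document.
System.out.println("Contents of the PDF: " + handler.toString());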

Is there any parameter that I can set to reduce Tika's CPU usage?
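
One possible direction, offered as a sketch under assumptions rather than a confirmed fix: a reading above 100% means more than one core is busy, which often points to several documents being parsed at the same time (JVM background threads such as the garbage collector can also contribute). If that is the case here, capping the number of simultaneous parses bounds the cores that parsing can occupy. The example below uses only `java.util.concurrent.Semaphore` from the JDK. The class name `BoundedParsing` and the cap value are hypothetical and are not part of Tika.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.Semaphore;

// Hypothetical helper (not a Tika class): limits how many parse tasks run at once.
final class BoundedParsing {
    private final Semaphore permits;

    // maxConcurrentParses is an assumed tuning knob, e.g. 1 or 2 on a 4-core host.
    BoundedParsing(int maxConcurrentParses) {
        if (maxConcurrentParses < 1) {
            throw new IllegalArgumentException("maxConcurrentParses must be >= 1");
        }
        this.permits = new Semaphore(maxConcurrentParses);
    }

    // Blocks until a permit is free, runs the task, and always releases the permit,
    // so at most maxConcurrentParses tasks execute at the same time.
    <T> T run(Callable<T> parseTask) throws Exception {
        permits.acquire();
        try {
            return parseTask.call();
        } finally {
            permits.release();
        }
    }

    // Number of parses that could start right now without waiting.
    int availablePermits() {
        return permits.availablePermits();
    }

    // Intended use with the code from the question (assumes Tika is on the classpath):
    //   String text = limiter.run(() -> {
    //       pdfparser.parse(inputstream, handler, metadata, pcontext);
    //       return handler.toString();
    //   });
}
```

This does not make an individual parse cheaper. It only trades throughput for a lower CPU ceiling, so it helps only if the spikes come from concurrent parses.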
