PDFMiner adding extra new lines

49 Views Asked by lali At 24 December 2023 at 06:19

While employing PDFMinerLoader to parse PDF files, I've observed that it introduces additional new lines when encountering bullets or numbers. For example:

Original PDF:

1. use the...
2. replace the..
3. update the..

I got:

1.
2.
3.
use the..
replace the..
update the..

Similar issues occur with bullet points, such as: ●

How can I address this problem?

I attempted to switch to an alternative parser method, but it yielded unsatisfactory results, specifically causing text concatenation issues.
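One possible workaround is to post-process the extracted text rather than change the parser. The sketch below assumes the output looks like the sample above: a run of lines that contain only a list marker ("1.", "2.", "●"), followed by the item texts in the same order. The function name `rejoin_list_markers` and the marker pattern are hypothetical and are not part of PDFMiner or LangChain; adjust the pattern to the markers your PDFs actually use.

```python
import re

# Assumption: a "marker-only" line is a number followed by "." or ")",
# or a single bullet glyph, with nothing else on the line.
MARKER_ONLY = re.compile(r"^\s*(?:\d+[.)]|[●•▪◦])\s*$")


def rejoin_list_markers(text: str) -> str:
    """Attach detached list markers to the text lines that follow them."""
    out = []
    pending = []  # markers still waiting for their item text
    for line in text.splitlines():
        if MARKER_ONLY.match(line):
            pending.append(line.strip())
        elif pending and line.strip():
            # Pair the oldest waiting marker with this non-empty line.
            out.append(f"{pending.pop(0)} {line.strip()}")
        else:
            out.append(line)
    out.extend(pending)  # markers that never found a matching text line
    return "\n".join(out)


if __name__ == "__main__":
    extracted = "1.\n2.\n3.\nuse the..\nreplace the..\nupdate the.."
    print(rejoin_list_markers(extracted))
```

With PDFMinerLoader, this could be applied to each document's `page_content` after loading. The pairing is purely positional, so it can mis-assign markers when the extracted order differs from the sample above. If you call pdfminer.six directly, tuning `LAParams` (for example a larger `char_margin`) may keep a marker and its text on the same line, though whether that helps depends on the layout of the PDF.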
