PDFMiner adding extra newlines

When parsing PDF files with PDFMinerLoader, I've noticed that it inserts extra newlines at bullets and list numbers, so each marker ends up separated from its text. For example:

Original PDF:

  1. use the...
  2. replace the..
  3. update the..

I got:

1.
2.
3.
use the..
replace the..
update the..

The same thing happens with bullet characters such as ●.

How can I address this problem?

I tried switching to a different parser, but the results were unsatisfactory: it concatenated text that should have stayed separate.
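
One possible workaround, sketched here under assumptions rather than as a confirmed fix, is to post-process the extracted text and re-pair each marker with its line. The sketch assumes the output always looks like the example above: a run of marker-only lines followed by the same number of text lines. The `MARKER` pattern and the `rejoin_list_markers` helper are hypothetical names for illustration, not part of PDFMiner or PDFMinerLoader.

```python
import re

# Assumption: a "marker" line contains only a list number ("1.", "2)")
# or a single bullet glyph. Adjust the character class to your PDFs.
MARKER = re.compile(r"^\s*(?:\d+[.)]|[●•▪◦\-*])\s*$")


def rejoin_list_markers(text: str) -> str:
    """Re-pair runs of marker-only lines with the text lines that follow."""
    lines = text.split("\n")
    out = []
    i = 0
    while i < len(lines):
        if not MARKER.match(lines[i]):
            out.append(lines[i])
            i += 1
            continue

        # Collect the run of consecutive marker-only lines.
        j = i
        while j < len(lines) and MARKER.match(lines[j]):
            j += 1
        markers = [m.strip() for m in lines[i:j]]

        # Collect up to the same number of non-empty, non-marker lines.
        k = j
        bodies = []
        while (
            k < len(lines)
            and len(bodies) < len(markers)
            and lines[k].strip()
            and not MARKER.match(lines[k])
        ):
            bodies.append(lines[k].strip())
            k += 1

        if len(bodies) == len(markers):
            # Counts match: merge marker and text pairwise.
            out.extend(f"{m} {b}" for m, b in zip(markers, bodies))
            i = k
        else:
            # Pattern does not match the assumption; leave the markers as-is.
            out.extend(lines[i:j])
            i = j
    return "\n".join(out)
```

This would be applied to the page text after loading, for example to each document's text content. If the marker and text counts differ, the lines are left untouched, so the helper should not make the output worse, but it also will not help with layouts that differ from the example.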
