PDF conversion with poppler-utils: Is there a way to avoid decoding difficulties?

260 Views Asked by lowflyer7 At 13 April 2023 at 20:07

I am converting PDF files to text on Ubuntu with the pdftotext tool from poppler-utils. Unfortunately, I keep running into a problem where some files are not converted properly.

A correctly converted file looks like this:

  82 => '23:00 23:00 - 05:00 05:00 01:30',
  83 => 'Page 1 of 5',
  84 => 'Generated on Feb 05, 2023 17:11',

But some files result in something like this:

  82 => 'WĂƌƚŝĂůK&&;ĞŶĐƌŽĂĐŚĞĚďǇ',
  83 => 'ĚƵƚǇͿ',
  84 => 'ϬϬ͗ϭϯͲϮϯ͗ϱϵ D',
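
To find the affected files in a batch, a rough check along these lines might help. This is only a sketch: the function names and the 0.3 threshold are made up for illustration, and it assumes the documents are mostly plain ASCII text like the good sample above, so a line dominated by non-ASCII characters is treated as suspicious. It calls pdftotext with `-enc UTF-8` and `-` (write to stdout), both of which are standard pdftotext options.

```python
import subprocess


def extract_text(pdf_path: str) -> str:
    """Run pdftotext (from poppler-utils) and return the extracted text."""
    result = subprocess.run(
        ["pdftotext", "-enc", "UTF-8", pdf_path, "-"],  # "-" sends the text to stdout
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def looks_garbled(line: str, threshold: float = 0.3) -> bool:
    """Heuristic: True if more than `threshold` of the non-space chars are non-ASCII.

    The threshold is an arbitrary assumption; tune it for documents that
    legitimately contain accented or non-Latin text.
    """
    chars = [c for c in line if not c.isspace()]
    if not chars:
        return False
    non_ascii = sum(1 for c in chars if ord(c) > 127)
    return non_ascii / len(chars) > threshold


def garbled_lines(text: str) -> list[str]:
    """Return the lines of `text` that the heuristic flags as garbled."""
    return [line for line in text.splitlines() if looks_garbled(line)]
```

Calling `garbled_lines(extract_text("some.pdf"))` (the file name is a placeholder) returns the suspicious lines, so an empty list suggests the file converted cleanly under this heuristic.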

Both documents are PDF version 1.4 and appear to have been encoded with the same software, so I'm at a loss as to what is causing this problem.
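
One thing that could be compared between a good and a bad file is the font table reported by pdffonts, which ships with poppler-utils. Its `uni` column says whether a font carries an explicit ToUnicode map. Whether a missing or faulty map is actually the cause here is an assumption, not something established above, and a font that reports `yes` can still have an incorrect map. The sketch below (function names are hypothetical) lists the fonts whose `uni` column is `no`.

```python
import subprocess


def fonts_without_tounicode(pdffonts_output: str) -> list[str]:
    """Return font names whose 'uni' column is 'no' in pdffonts output.

    Columns are: name, type, encoding, emb, sub, uni, object ID (two numbers).
    The type may contain spaces ("CID TrueType"), so fields are read from
    the right-hand side of each row.
    """
    names = []
    for line in pdffonts_output.splitlines()[2:]:  # skip header and separator rows
        parts = line.split()
        if len(parts) < 8:
            continue  # not a complete font row
        uni = parts[-3]  # parts[-2] and parts[-1] are the object ID
        if uni == "no":
            names.append(parts[0])  # assumes the font name has no spaces
    return names


def check_pdf(pdf_path: str) -> list[str]:
    """Run pdffonts on a file and list the fonts lacking a ToUnicode map."""
    result = subprocess.run(
        ["pdffonts", pdf_path], capture_output=True, text=True, check=True
    )
    return fonts_without_tounicode(result.stdout)
```

If `check_pdf` returns fonts for the badly converted file but not for the good one, that difference would be a reasonable place to start looking.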

Does anyone have a suggestion for what to try next?
