I have a table in a PDF that looks like this:

I want to turn it into a pandas DataFrame. This is what I tried:
import tabula
df = tabula.read_pdf(filename, pages=pages)[0]
When I print the result, this is what I get:
Pos Name Surname DateOfBirth Address
1 James Brown 1923-01-02 1313 E Main St,
Nan Nan Nan Nan Portage MI 49024-
Nan Nan Nan Nan 2001
2 Abram Red 1934-07-15 1313 E Main St,
Nan Nan Nan Nan Portage MI 49024-
Nan Nan Nan Nan 2001
And so on.
How can I obtain the desired output? In other words, how can I tell the parser that "1313 E Main St, Portage MI 49024-2001" is a single cell belonging to one row?
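If the extraction really does come back split like this, one possible workaround is to stitch the continuation rows back together in pandas after the fact. The following is only a minimal sketch under stated assumptions: a row with a value in Pos starts a new record, rows with NaN in Pos continue the record above, and only Address wraps. The frame `raw` and the helpers `join_fragments` and `merge_continuation_rows` are hypothetical names made up for this illustration; they are not part of tabula or pandas.

```python
import numpy as np
import pandas as pd

# Hypothetical frame that mimics the split rows printed above.
raw = pd.DataFrame({
    "Pos": [1, np.nan, np.nan, 2, np.nan, np.nan],
    "Name": ["James", np.nan, np.nan, "Abram", np.nan, np.nan],
    "Surname": ["Brown", np.nan, np.nan, "Red", np.nan, np.nan],
    "DateOfBirth": ["1923-01-02", np.nan, np.nan,
                    "1934-07-15", np.nan, np.nan],
    "Address": ["1313 E Main St,", "Portage MI 49024-", "2001",
                "1313 E Main St,", "Portage MI 49024-", "2001"],
})


def join_fragments(parts):
    """Join wrapped cell fragments with spaces, except after a trailing hyphen."""
    out = ""
    for part in parts.dropna().astype(str):
        if out and not out.endswith("-"):
            out += " "
        out += part
    return out


def merge_continuation_rows(df, key="Pos", wrapped="Address"):
    """Collapse rows whose `key` is NaN into the nearest record above them."""
    # Assumption: a non-null `key` marks the first line of a record.
    record_id = df[key].notna().cumsum().rename("record_id")
    # Keep the first non-null value of every column except the wrapped one.
    agg = {col: "first" for col in df.columns if col != wrapped}
    agg[wrapped] = join_fragments
    merged = df.groupby(record_id).agg(agg)
    # Restore the original column order and drop the helper index.
    return merged[list(df.columns)].reset_index(drop=True)


merged = merge_continuation_rows(raw)
print(merged)
```

Pos stays a float column here because it contained NaN before merging; it can be cast back with `astype(int)` once every remaining row has a value. If other columns can wrap as well, they would need the same joining treatment as Address.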
I can't reproduce the issue with any of three different libraries:
tabula-py:
pdfplumber:
pymupdf:
Output:
NB: They all give the same output except for tabula (where \n is replaced by \r).
PDF used (file.pdf):
Generated with: