I am using tabula-py to extract tables from a PDF, with lattice mode for parsing. It works well for every row except the first one.
Code:
from tabula import read_pdf

df = read_pdf("filename.pdf", pages=21, multiple_tables=True, lattice=True)
Table in the PDF: [image]
Output from Tabula: [image]
The PDF contains multiple tables with varying areas and numbers of columns. As you can see in the image, lattice mode works well for the 2nd and 3rd rows, but not for the 1st row.
I also tried the camelot library, but it raises a PyPDF2 deprecation error.
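
For reference, here is a minimal sketch of two `read_pdf` variations that may be worth checking. It rests on an assumption that cannot be confirmed from the text alone: that the first row goes wrong because it is being promoted to column headers, or because the detected table region is off. The helper names `build_read_kwargs` and `extract_tables` are hypothetical; `pandas_options`, `area` and `guess` are existing `read_pdf` parameters.

```python
def build_read_kwargs(page, area=None):
    """Assemble keyword arguments for tabula.read_pdf (hypothetical helper)."""
    kwargs = {
        "pages": page,
        "multiple_tables": True,
        "lattice": True,
        # Assumption: the first row is being consumed as the header.
        # header=None keeps it as an ordinary data row instead.
        "pandas_options": {"header": None},
    }
    if area is not None:
        # area is (top, left, bottom, right) in PDF points.
        kwargs["area"] = list(area)
        # Turn off automatic region guessing when an explicit area is given.
        kwargs["guess"] = False
    return kwargs


def extract_tables(path, page, area=None):
    """Call tabula-py with the assembled options; returns a list of DataFrames."""
    # Imported lazily because tabula-py needs a Java runtime to be installed.
    from tabula import read_pdf

    return read_pdf(path, **build_read_kwargs(page, area))
```

Calling `extract_tables("filename.pdf", 21)` and printing each returned DataFrame would show whether the first row now appears as data. Because the tables differ in area and column count, an explicit `area` would have to be chosen per table, so it is only practical for a few problem pages.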