How can I convert image coordinates to PDF coordinates when using pdf2image and table-transformers?

Question

How can I convert image coordinates to PDF coordinates when using pdf2image and table-transformers?

964 Views Asked by siddharth patel At 22 May 2023 at 07:48

I am using pdf2image to convert pdf to images and detecting tables with table-transformers. I need help with coordinates.

Issue is, I am getting perfect table borders but pixels in images are different from PDF coordinates. Any way to convert image coordinates to PDF coordinates? Here is my code for reference:

from pdf2image import convert_from_path

images = convert_from_path('/content/Sample Statement Format Bancslink.pdf')

for i in range(len(images)):
  images[i].save('/content/pages_sbi/page'+str(i)+'.jpeg')

Original Q&A

There are 2 best solutions below

**Jorj McKie** · Answer 1 · 2023-05-22T09:21:23.200000

Here is how to use PyMuPDF to transform image coordinates back to PDF page coordinates.

This of course works page by page. So in the following, an image file is assumed to be made from the corresponding page.

import fitz  # PyMuPDF import

doc = fitz.open("input.pdf")
page = doc[pno]  # page number pno is 0-based
image = f"image{pno}.jpg"  # filename of the matching image of the page

# rectangle, e.g. one that wraps a table in the image
# x0, y0 are coordinates of its top-left point
# x1, y1 is the bottom-right point
rect = fitz.Rect(x0, y0, x1, y1)

# make a PyMuPDF iamge from the JPEG
pix = fitz.Pixmap(image)

# make a matrix that converts any image coordinates to page coordinates
mat = pix.irect.torect(page.rect)

# now every image coordinate can be converted to page coordinates
# e.g. this is the table rect in page coordinates:
pdfrect = rect * mat

# if you don't want PyMuPDF objects as rectangle, just use
# tuple(pdfrect) to retrieve the 4 coordinates

Just as an aside, PyMuPDF is also able to render pages to images. So if your table detection mechanism can be invoke page, by page, you could make a loop like this:

Read page using PyMuPDF
Convert page to an image. Could be in memory, too.
Pass page image to table recognizer, which returns table coordinates
Use table coordinates and convert them to page coordinates as shown above.

**siddharth patel** · Answer 2 · 2023-05-22T13:59:13.233000

Alright, found perfect solution which will work on almost all problems.
Consider this as your code for PDF to Image:

from pdf2image import convert_from_path

images = convert_from_path('PATH')

!mkdir pages

for i in range(len(images)):
  images[i].save('/content/pages/page'+str(i)+'.jpeg')

Now, you need to get data of PDF first:

from pypdf import PdfReader

reader = PdfReader('PATH')
box = reader.pages[0].mediabox

pdf_width = box.width
pdf_height = box.height

Now read and get data about image:

import cv2
im = cv2.imread('/content/pages/page0.jpeg')
height, width, channels = im.shape

Now consider x_1, x_2, y_1 and y_2 as coordinates in image. To get location of same in PDF, use following code:

x_1  = x_1/width*pdf_width
y_1  = y_1/width*pdf_width
x_2  = x_2/width*pdf_width
y_2  = y_2/width*pdf_width

Use this coordinates for your work.

How can I convert image coordinates to PDF coordinates when using pdf2image and table-transformers?

There are 2 best solutions below

Related Questions in PYTHON

Related Questions in PYTHON-3.X

Related Questions in PDF2IMAGE

Trending Questions

Popular # Hahtags

Popular Questions