Is it possible to put the captions generated by AI models back into the pdf file?

60 Views Asked by Clara At 22 January 2024 at 10:06

I have a pdf that contains multiple pages, where each page consists of text and/or images. I have found ways to extract images from a pdf file, and I have found ways to use AI models to generate captions for those images. But is it possible to put the captions generated by the AI model back into the pdf file, next to the corresponding images? If it is possible, what library should I use? Or does anyone know how to code it?

There is 1 best solution below

Jorj McKie On 22 January 2024 at 10:35

You can use PyMuPDF to write text onto PDF pages, and it offers multiple ways to do so.

Note: I am a maintainer and the original creator of PyMuPDF.

You need to locate the image position on the page first. Then decide on a rectangle (for example, just above or below the image's bounding box) to receive the caption text.

For example, if the image's bounding box is called bbox, define rect = (bbox.x0, bbox.y1, bbox.x1, bbox.y1 + 20). This is a rectangle directly below the image with the same width as bbox and a height of 20 points.
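The rectangle arithmetic above can be sketched as a small helper. This is only an illustration: caption_rect and its height, gap, and above parameters are hypothetical names, not part of PyMuPDF. It assumes page coordinates where y grows downward, as PyMuPDF uses.

```python
def caption_rect(bbox, height=20, gap=0, above=False):
    """Return (x0, y0, x1, y1) for a caption strip as wide as the image.

    bbox is the image bounding box as (x0, y0, x1, y1) in page coordinates,
    with y growing downward. height and gap are in points.
    """
    x0, y0, x1, y1 = bbox
    if above:
        # Strip sits on top of the image: its bottom edge is gap above y0.
        return (x0, y0 - gap - height, x1, y0 - gap)
    # Strip sits under the image: its top edge is gap below y1.
    return (x0, y1 + gap, x1, y1 + gap + height)


# Same rectangle as in the text: below the image, 20 points high.
print(caption_rect((72, 100, 300, 250)))
```

With gap=0 and above=False this gives exactly the rect described in the answer; the other options are just variations on the same idea.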

Then call page.insert_htmlbox(rect, caption), where caption is the caption text.

That method also lets you align (e.g. center) the caption text via CSS styling instructions, like page.insert_htmlbox(rect, caption, css="* {text-align: center;}").
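Putting the steps together, a minimal sketch might look like the following. It is one possible arrangement, not the only way to do it. The function name add_captions and the captions mapping (image xref to caption text) are assumptions made for illustration; how the AI-generated captions are keyed to images depends on how the images were extracted. The PyMuPDF calls used are page.get_images(), page.get_image_rects(), and page.insert_htmlbox().

```python
import html


def add_captions(doc, captions, height=20):
    """Write a centered caption below every image that has one.

    doc      -- an open PyMuPDF document (iterating it yields pages)
    captions -- hypothetical mapping: image xref -> caption text
    height   -- height of the caption strip in points
    Returns the number of captions written.
    """
    written = 0
    for page in doc:
        for img in page.get_images(full=True):
            xref = img[0]  # first entry of each image item is its xref
            caption = captions.get(xref)
            if not caption:
                continue
            # The same image can be shown more than once on a page.
            for bbox in page.get_image_rects(xref):
                rect = (bbox.x0, bbox.y1, bbox.x1, bbox.y1 + height)
                # The text is parsed as HTML, so escape characters like "<".
                page.insert_htmlbox(
                    rect, html.escape(caption), css="* {text-align: center;}"
                )
                written += 1
    return written
```

It could be called as add_captions(doc, captions) after opening the file with PyMuPDF, followed by doc.save("captioned.pdf"), where the output file name is a placeholder. Note that the strip below the image may overlap existing page content, so the height or position may need adjusting for a given layout.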
