Document AI - Effect of multi-page files on model performance

33 Views Asked by Yaniv Ben-Malka At 26 March 2024 at 15:52

I’ve noticed that it’s possible to upload multi-page files to Document AI, so that all pages stay connected to each other by being associated with the same file.

My use case is invoice files that I would like to extract data from, using a custom extractor.
Most of the invoices are 1-pagers, but some of them span 2 pages. In those cases the second page is usually sparser than the first and does not contain most of the information.

My question is: will a trained model's performance differ between the following two file upload approaches?

1. Uploading each page as a separate file, even when an invoice spans multiple pages (I split it in a preprocessing step beforehand)
2. Uploading each file as is, without splitting it into pages
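
To make the two approaches concrete, here is a minimal sketch of how the upload units differ. It only plans which pages end up in which upload; it does not call Document AI or read any PDF. The names `UploadUnit` and `plan_uploads` and the file names are hypothetical, made up for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UploadUnit:
    """One file as it would be uploaded (hypothetical helper type)."""
    source: str    # name of the original invoice file
    pages: tuple   # zero-based page indices contained in this upload


def plan_uploads(page_counts, split_pages):
    """Return the upload units for either approach.

    page_counts: dict mapping invoice file name -> number of pages.
    split_pages: True  -> one upload per page (first approach),
                 False -> one upload per invoice file (second approach).
    """
    units = []
    for source, n_pages in page_counts.items():
        if split_pages:
            # Each page becomes its own upload and loses its link to its siblings.
            units.extend(UploadUnit(source, (i,)) for i in range(n_pages))
        else:
            # The whole invoice stays together as a single multi-page upload.
            units.append(UploadUnit(source, tuple(range(n_pages))))
    return units


# Assumed example data: one 1-page invoice and one 2-page invoice.
invoices = {"invoice_a.pdf": 1, "invoice_b.pdf": 2}
per_page = plan_uploads(invoices, split_pages=True)
whole_file = plan_uploads(invoices, split_pages=False)
```

With this example data, the first approach produces three uploads, the last of which contains only the sparse second page of `invoice_b.pdf`, while the second approach produces two uploads and keeps both pages of that invoice together.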

I assume that the performance of option #2 will be equal to or greater than that of option #1. My question is mainly whether it makes a difference at all, as uploading pages separately has its own advantages for us (our use case is a bit more complicated; I simplified it for the explanation).
