How to handle a multipage PDF file with tess4j


I am using tess4j to recognize text in an image file.

    Pix pix = Leptonica1.pixRead(image.getPath());
    TessAPI1.TessBaseAPIInit3(tessBaseAPI, tessDataPath, "eng");
    TessAPI1.TessBaseAPISetImage2(tessBaseAPI, pix);
    // TessAPI1.TessBaseAPIProcessPages(tessBaseAPI, image.getPath(), "", 0, null);

    PointerByReference pixa = null;
    PointerByReference blockids = null;
    Boxa boxa = TessAPI1.TessBaseAPIGetComponentImages(tessBaseAPI, ITessAPI.TessPageIteratorLevel.RIL_TEXTLINE, 1, pixa, blockids);

For multi-page TIFF files, TessBaseAPIGetComponentImages() returns the Boxa information for the first page only. If I instead call TessAPI1.TessBaseAPIProcessPages(tessBaseAPI, image.getPath(), "", 0, null), only the last page's information is returned. How can I process the recognition results page by page for a multi-page file?

Thanks.
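
One possible direction, offered only as a sketch: split the multi-page TIFF into one image per page first, then run the existing per-image recognition code once for each page. The example below uses only the JDK's ImageIO TIFF reader (bundled since Java 9) and does not call tess4j at all. The class name TiffPageSplitter and the method readPages are hypothetical names chosen for illustration, and the OCR step is left as a comment because how each page is handed to Tesseract depends on which tess4j entry point is in use.

```java
import javax.imageio.ImageIO;
import javax.imageio.ImageReader;
import javax.imageio.stream.ImageInputStream;
import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper: not part of tess4j.
final class TiffPageSplitter {

    /** Reads every page of a multi-page TIFF into its own BufferedImage. */
    static List<BufferedImage> readPages(File tiff) {
        try (ImageInputStream in = ImageIO.createImageInputStream(tiff)) {
            if (in == null) {
                throw new IllegalArgumentException("Cannot open " + tiff);
            }
            Iterator<ImageReader> readers = ImageIO.getImageReaders(in);
            if (!readers.hasNext()) {
                throw new IllegalArgumentException("No image reader for " + tiff);
            }
            ImageReader reader = readers.next();
            try {
                reader.setInput(in);
                // true = scan the whole file so the page count is exact
                int pageCount = reader.getNumImages(true);
                List<BufferedImage> pages = new ArrayList<>(pageCount);
                for (int i = 0; i < pageCount; i++) {
                    pages.add(reader.read(i));
                }
                return pages;
            } finally {
                reader.dispose();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        List<BufferedImage> pages = readPages(new File(args[0]));
        for (int i = 0; i < pages.size(); i++) {
            BufferedImage page = pages.get(i);
            // Assumption: this is where each page would be passed to the
            // OCR step (set the image, then collect the component boxes),
            // so results are gathered per page instead of per file.
            System.out.println("page " + i + ": "
                    + page.getWidth() + "x" + page.getHeight());
        }
    }
}
```

Loading all pages at once keeps the example short; for large scans it may be preferable to read and process one page per loop iteration so that only a single page is held in memory.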
