FineReader Engine Java SDK. How to ignore pictures during conversion from PDF to DOCX

Question

497 Views Asked by Uladzislau Kaminski At 29 July 2025 at 11:45

I need to find a way to ignore pictures and photos in a PDF document when converting it to a DOCX file.

I am creating an instance of FineReader Engine:

IEngine engine = Engine.InitializeEngine(
engineConfig.getDllFolder(), engineConfig.getCustomerProjectId(),
engineConfig.getLicensePath(), engineConfig.getLicensePassword(), "", "", false);

After that, I am converting a document:

IFRDocument document = engine.CreateFRDocument();
document.AddImageFile(file.getAbsolutePath(), null, null);
document.Process(null);
String exportPath = FileUtil.prepareExportPath(file, resultFolder);
document.Export(exportPath, FileExportFormatEnum.FEF_DOCX, null);

As a result, all images from the original PDF document end up in the converted DOCX file.

There are 3 best solutions below

Duman Zhanbolatov On 30 October 2019 at 08:56

When you export a PDF to DOCX, you should pass export parameters. For this you can use IRTFExportParams. You can get this object like so:

IRTFExportParams irtfExportParams = engine.CreateRTFExportParams();

and then set its WritePictures property like this:

irtfExportParams.setWritePictures(false);

Here, IEngine engine is the main interface. I think you know how to initialize it ;)

You also have to pass processing parameters to document.Process() (where document is your IFRDocument). Process() takes an IDocumentProcessingParams object, which has a setPageProcessingParams() method. Pass it an IPageProcessingParams object, which you can get from engine.CreatePageProcessingParams(). That object has these methods:

iPageProcessingParams.setPerformAnalysis(true);
iPageProcessingParams.setPageAnalysisParams(iPageAnalysisParams);

Pass true to the first method, and pass an IPageAnalysisParams object to the second (IPageAnalysisParams iPageAnalysisParams = engine.CreatePageAnalysisParams()).

As the last step, call setDetectPictures(false) on iPageAnalysisParams. That's all :)

And when you export the document, pass the export params like this:

IFRDocument document = engine.CreateFRDocument();
document.Export(filePath, FileExportFormatEnum.FEF_DOCX, irtfExportParams);

I hope my answer helps everyone :)

Sergii On 22 August 2019 at 08:58

What do the input PDF pages contain? What is expected in MS Word? It would be great if you could attach an example of an input PDF file and an example of the desired result in MS Word format. Then it will be much easier to give a useful recommendation.

**gdaly** · Accepted Answer

I'm not really familiar with PDF to DOCX conversion, but I think you could try custom profiles tailored to your needs.

At some point in your code you should create an Engine object, and then a Document object (or an IFRDocument object, depending on your application). Add this line just before handing your document to the engine for processing:

engine.LoadProfile(PROFILE_FILENAME);

Then create your profile file with the processing parameters described in the "Working with Profiles" section of the documentation packaged with your FRE installation. Do not forget to add this to your file:

... some params under other sections

[PageAnalysisParams]
DetectText = TRUE       --> force text detection
DetectPictures = FALSE  --> ignore pictures
... other params under PageAnalysisParams

... some params under other sections

It works the same way for barcodes, etc. But remember to benchmark your results when adding or removing things from this file, as it may affect processing speed and the overall quality of your result.
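
For illustration, here is a minimal sketch of generating such a profile file from plain Java, so the file and the code that loads it stay together. The class name NoPicturesProfile and the file name no-pictures.ini are hypothetical. The section and keys are copied from the snippet above. The "-->" notes are left out on the assumption that they are explanatory annotations rather than profile syntax, so check the "Working with Profiles" documentation before relying on this.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

/** Writes a minimal profile file that switches picture detection off. */
final class NoPicturesProfile {

    // Section name and keys are taken from the answer above.
    // A real profile will probably need further sections and keys.
    static List<String> lines() {
        return List.of(
            "[PageAnalysisParams]",
            "DetectText = TRUE",
            "DetectPictures = FALSE");
    }

    // Writes the profile lines to the given path and returns that path.
    // The content is pure ASCII, so UTF-8 output is byte-identical to ASCII.
    static Path write(Path target) throws IOException {
        return Files.write(target, lines(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) throws IOException {
        // Hypothetical default file name; pass another path as the first argument.
        Path profile = write(Path.of(args.length > 0 ? args[0] : "no-pictures.ini"));
        System.out.println("Profile written to " + profile.toAbsolutePath());
        // With the SDK on the classpath, the call from the answer would follow:
        // engine.LoadProfile(profile.toString());
    }
}
```

The generated path is what you would then pass as PROFILE_FILENAME in the LoadProfile call shown earlier.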
