Split Word document pages and convert them to PDF with GemBox.Document

I want to split a Word document into PDF files by searching each page for a specific word. A special word marks the start of a block that may span one page or more. For example, in a file with three pages, there is a special word on the first page and the next special word on the third page. I want to save the first and second pages as one PDF and the third page as a separate PDF. Each PDF file should be named after the special word found on its pages.

My problem is that I don't know how to loop over the pages, read the content of each page to find the special word, and save the pages as PDF. Thank you.

There is 1 best solution below

Here is how you can do it.

  1. Paginate your Word document using the DocumentModel.GetPaginator method.
  2. Read the text content of each page using the FrameworkElement.ToText extension method on the page's PageContent.
  3. Save the selected pages to PDF using the DocumentModelPage.Save method.

In other words, try the following:

using GemBox.Document;

// If using the Professional version, put your serial key below.
ComponentInfo.SetLicense("FREE-LIMITED-KEY");

string search = "Your Specific Word";
string inputPath = "input.docx";

// Load Word document.
var document = DocumentModel.Load(inputPath);

// 1. Get document's pages.
var pages = document.GetPaginator().Pages;

for (int i = 0, count = pages.Count; i < count; ++i)
{
    // 2. Read page's text content.
    DocumentModelPage page = pages[i];
    string pageTextContent = page.PageContent.ToText();

    // 3. Save page as PDF if it contains the searched word.
    if (pageTextContent.Contains(search))
    {
        // Use the one-based page number in the file name.
        string outputPath = $"{search}_{i + 1}.pdf";
        page.Save(outputPath);
    }
}
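
The loop above saves only the pages that contain the word. The question also asks for the pages that follow a special word to stay with it until the next special word appears. A minimal sketch of that grouping step is shown below, assuming you have already collected each page's text (for example with ToText as above). The PageGrouping class and GroupPages method are hypothetical names, not part of GemBox.Document, and the sketch only computes page ranges; writing each range to a PDF is left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of GemBox.Document.
public static class PageGrouping
{
    // Returns one (Keyword, First, Last) range per special word found.
    // Page indexes are zero-based and inclusive.
    public static List<(string Keyword, int First, int Last)> GroupPages(
        IReadOnlyList<string> pageTexts, IReadOnlyList<string> keywords)
    {
        var groups = new List<(string Keyword, int First, int Last)>();

        for (int i = 0; i < pageTexts.Count; i++)
        {
            // Find the first special word that occurs on this page, if any.
            string found = null;
            foreach (string keyword in keywords)
            {
                if (pageTexts[i].IndexOf(keyword, StringComparison.Ordinal) >= 0)
                {
                    found = keyword;
                    break;
                }
            }

            if (found != null)
            {
                // A special word starts a new group on this page.
                groups.Add((found, i, i));
            }
            else if (groups.Count > 0)
            {
                // No special word: extend the current group to this page.
                var current = groups[groups.Count - 1];
                groups[groups.Count - 1] = (current.Keyword, current.First, i);
            }
            // Pages before the first special word are skipped.
        }

        return groups;
    }
}
```

For the three-page example in the question, with special words on the first and third pages, this yields two ranges: pages 0 to 1 for the first word and page 2 for the second. Each range can then be used to decide which pages go into which PDF file and what to name it.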