I'm using GemBox.Pdf and I need to extract individual chapters in a PDF file as a separate PDF files.
The first page (maybe the second page as well) contains TOC (Table Of Contents) and I need to split the rest of the PDF pages based on it:
Also, those PDF documents that are split, should be named as the chapters they contains.
I can split the PDF based on the number of pages for each document (I figured that out using this example):
using (var source = PdfDocument.Load("Chapters.pdf"))
{
int pagesPerSplit = 3;
int count = source.Pages.Count;
for (int index = 1; index < count; index += pagesPerSplit)
{
using (var destination = new PdfDocument())
{
for (int splitIndex = 0; splitIndex < pagesPerSplit; splitIndex++)
destination.Pages.AddClone(source.Pages[index + splitIndex]);
destination.Save("Chapter " + index + ".pdf");
}
}
}
But I can't figure out how to read and process that TOC and incorporate the chapters splitting base on its items.
EDIT:
On that same page that you linked, there is now Split PDF file by bookmarks (outlines) example.
ORIGINAL:
You should iterate through the document's bookmarks (outlines) and split it based on the bookmark destination pages.
For instance, try this:
Note, from the screenshot it seems that your chapter bookmarks include the order's number (roman numerals). If needed, you can easily remove those with something like this: