I have successfully compiled PoDoFo 0.10.3 in Visual Studio 2022. Now I want to use this library to extract text from a PDF document, but I am struggling with the API. I can't even find an example of how to do that...
void parseOneFile(const string_view& filename)
{
    PdfMemDocument document;
    document.Load(filename);
    // iterate over all pages of the whole pdf document
    for (int pn = 0; pn < document.GetPageCount(); ++pn)
    {
        PoDoFo::PdfPage* page = document.GetPage(pn);
        // todo: extract the text from the page
    }
}
Unfortunately, the code above does not compile (class PoDoFo::PdfMemDocument has no member GetPageCount).
Does anyone have an idea how to do this?
I just want to extract the text and save it in a container like std::vector<std::string> for further processing.
Thank you!
After reading the API, I was able to write the following lines of code:
But I'm not sure if I'm on the right track... I still don't have the data / the text (of type std::string).
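For reference, here is a minimal sketch of one way the loop could look with the 0.10.x page API. It assumes, as far as I can tell from the 0.10 headers, that pages are reached through document.GetPages(), counted with GetCount(), fetched with GetPageAt(), and that a page offers ExtractTextTo() filling a vector of PoDoFo::PdfTextEntry, each with a std::string member Text. These names should be checked against the installed headers. The helper name collectText is hypothetical, and it is written as a template only so that the snippet itself does not depend on the PoDoFo headers.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: collects the text of every page into one flat list.
// Document is assumed to expose the PoDoFo 0.10-style page API
// (GetPages(), GetCount(), GetPageAt(), ExtractTextTo()).
// Entry is the text-entry type, assumed to have a std::string member Text.
template <typename Entry, typename Document>
std::vector<std::string> collectText(Document& document)
{
    std::vector<std::string> result;

    // The page collection replaces the old GetPageCount()/GetPage() calls.
    auto& pages = document.GetPages();
    for (unsigned i = 0; i < pages.GetCount(); ++i)
    {
        auto& page = pages.GetPageAt(i);

        // Each entry holds one extracted piece of text from the page.
        std::vector<Entry> entries;
        page.ExtractTextTo(entries);

        for (const auto& entry : entries)
            result.push_back(entry.Text);
    }
    return result;
}
```

With PoDoFo this would presumably be called as collectText<PoDoFo::PdfTextEntry>(document) after document.Load(filename). The result is one string per extracted text entry, not one per page, so the entries may still need to be joined per page for further processing.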