I want to convert each page of a PDF file to a separate image. To do this, I use Ghostscript.NET.
The problem is that I can't figure out why pageImage is null on the line System.Drawing.Image pageImage = rasterizer.GetPage(dpi, i);
Here is the method I use:
    public static List<string> GetPDFPageText(Stream pdfStream, string dataPath)
    {
        try
        {
            int dpi = 100;
            GhostscriptVersionInfo lastInstalledVersion =
                GhostscriptVersionInfo.GetLastInstalledVersion(
                    GhostscriptLicense.GPL | GhostscriptLicense.AFPL,
                    GhostscriptLicense.GPL);
            List<string> textParagraphs = new List<string>();
            using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer())
            {
                rasterizer.Open(pdfStream, lastInstalledVersion, false);
                for (int i = 1; i <= rasterizer.PageCount; i++)
                {
                    // here is the problem, pageImage is null
                    System.Drawing.Image pageImage = rasterizer.GetPage(dpi, i);
                    // rest of code is unrelated to problem..
                }
            }
            return textParagraphs;
        }
        catch (Exception ex)
        {
            // pass the original exception along so its details are not lost
            throw new Exception("An error occurred.", ex);
        }
    }
The pdfStream parameter of the function comes from the code below:
    using (StreamCollection streamCollection = new StreamCollection())
    {
        FileStream imageStream = new FileStream(imagePath, FileMode.Open, FileAccess.Read);
        // This is the parameter I used for "Stream pdfStream"
        FileStream pdfStream = new FileStream(pdfPath, FileMode.Open, FileAccess.Read);
        streamCollection.Streams.Add(imageStream);
        streamCollection.Streams.Add(pdfStream);
        PDFHelper.SavePDFByFilesTest(dataPath, streamCollection.Streams, mergedFilePath);
    }
I am already comfortable with the StreamCollection class because I have used it before in a similar situation and it worked. I verified that the file path is correct and that the stream contains the file. I also tried using a MemoryStream instead of a FileStream, and a file name instead of a stream, just to see whether the problem was related to them. Do you have any suggestions? I would really appreciate it.
Okay, I figured out why it didn't work. I am using the latest version of Ghostscript (9.56.1), as K J mentioned (thank you for the response), and it uses a new PDF interpreter by default. I assume it didn't work properly because the new interpreter is very recent and may still have some minor problems. I added the following line to use the good old PDF interpreter:
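A minimal sketch of what that line might look like, assuming the switch is passed through the rasterizer's CustomSwitches list and that -dNEWPDF=false is the Ghostscript option that selects the old interpreter in 9.56.x. The switch presumably has to be added before Open is called; pdfStream and lastInstalledVersion are the variables from the method in the question:

```csharp
using Ghostscript.NET.Rasterizer;

using (GhostscriptRasterizer rasterizer = new GhostscriptRasterizer())
{
    // Assumption: custom switches are read when the document is opened,
    // so this line comes before Open.
    rasterizer.CustomSwitches.Add("-dNEWPDF=false");

    rasterizer.Open(pdfStream, lastInstalledVersion, false);
    // ... call rasterizer.GetPage(dpi, i) as before
}
```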
I also set the resolution of the produced image with the following line:
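A sketch of such a line, assuming the resolution is set through Ghostscript's standard -r switch, again added to CustomSwitches before Open. The value 300x300 is only an illustrative placeholder, not the value actually used:

```csharp
// Hypothetical value: 300 dpi horizontally and vertically.
// "rasterizer" is the GhostscriptRasterizer instance from the method above.
rasterizer.CustomSwitches.Add("-r300x300");
```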
Furthermore, I will share the structure of the StreamCollection class. I used here as a reference to implement this class. Hope it helps someone.
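A rough sketch of what such a class could look like, inferred only from how it is used in the question (a Streams list, and a using block that implies IDisposable). The member layout and the dispose behavior are assumptions, not the actual implementation:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical reconstruction: owns a list of streams and disposes
// all of them when the collection itself is disposed.
public sealed class StreamCollection : IDisposable
{
    public List<Stream> Streams { get; } = new List<Stream>();

    public void Dispose()
    {
        foreach (Stream stream in Streams)
        {
            stream?.Dispose();
        }
        Streams.Clear();
    }
}
```

With a class like this, the FileStream objects added in the calling code are closed when the using block ends, so they do not need their own using statements.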