For loop goes through the first 2 items on the list then hits a Traceback error on the third

I'm using a for loop to go through a list and build the arguments for the pdfkit module. It works fine for the first two items on the list, then raises an error on the third. This is my code:

import pdfkit
import time


link1 = "https://www."
link2 = ".com"
pdf = ".pdf"
for line in open('links.txt'):
  print(line.strip("\n\r"))
  newlink = link1 + line.strip("\n\r") + link2
  print(newlink)
  newpdf = line.strip("\n\r") + pdf
  print(newpdf)
  pdfkit.from_url(newlink, newpdf)
print('Finished')

And it's pulling from this list:

bing
yahoo
google

It successfully completes the first two items and writes a PDF for each, then I get this error:

Traceback (most recent call last):
  File "new.py", line 14, in <module>
    pdfkit.from_url(newlink, newpdf)
  File "/usr/local/lib/python2.7/dist-packages/pdfkit/api.py", line 26, in from_url
    return r.to_pdf(output_path)
  File "/usr/local/lib/python2.7/dist-packages/pdfkit/pdfkit.py", line 156, in to_pdf
    raise IOError('wkhtmltopdf reported an error:\n' + stderr)
IOError: wkhtmltopdf reported an error:

Does anybody know why I'm getting this error and how to fix it?

There are 2 best solutions below

Best answer

I haven't found a fix for the network error in wkhtmltopdf yet. Instead, I found an alternative library that works, called WeasyPrint.

Here is a version of your code that uses WeasyPrint instead of pdfkit.

from weasyprint import HTML

link1 = "https://www."
link2 = ".com"
pdf = ".pdf"
for line in open('links.txt'):
  print(line.strip("\n\r"))
  newlink = link1 + line.strip("\n\r") + link2
  print("newlink "+newlink)
  newpdf = line.strip("\n\r") + pdf
  print(newpdf)
  HTML(newlink).write_pdf(newpdf)
print('Finished')

Hopefully this helps.

Second answer

When I ran the same code, it got stuck on "yahoo", whereas google and a few other websites I tried worked. It threw the following error for me:

raise IOError("wkhtmltopdf exited with non-zero code {0}. error:\n{1}".format(exit_code, stderr))
OSError: wkhtmltopdf exited with non-zero code 1. error:
Loading pages (1/6)
QFont::setPixelSize: Pixel size <= 0 (0)
QFont::setPixelSize: Pixel size <= 0 (0)
libpng warning: iCCP: known incorrect sRGB profile
Counting pages (2/6)                                               
QFont::setPixelSize: Pixel size <= 0 (0)
QFont::setPixelSize: Pixel size <= 0 (0)
Resolving links (4/6)                                                       
Loading headers and footers (5/6)                                           
Printing pages (6/6)
Done                                                                      
Exit with code 1 due to network error: ProtocolFailure

As you can see, wkhtmltopdf exits with a network error (ProtocolFailure), which implies that it could not load the page for some reason. I suspect the error you received came from a similar source. So, if the choice of websites was arbitrary, simply pick websites that work.

If not, let me know and I'll dig into the wkhtmltopdf documentation to try to figure out the source of the error.
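
In the meantime, if the list can't be changed, one option is to catch the error so that a single failing site doesn't stop the whole loop. This is only a minimal sketch, not a fix for the underlying wkhtmltopdf problem: the names convert_all and main are made up for illustration, and the converter is passed in as an argument so the loop can be tried out without wkhtmltopdf installed.

```python
def convert_all(names, convert, prefix="https://www.", suffix=".com"):
    """Convert each site name to a PDF and keep going when one fails.

    `convert` is any callable taking (url, output_path), for example
    pdfkit.from_url. Returns a list of (url, error message) pairs for
    the sites that could not be converted.
    """
    failed = []
    for raw in names:
        name = raw.strip()  # drop the trailing newline and stray spaces
        if not name:
            continue  # skip blank lines in the input
        url = prefix + name + suffix
        try:
            convert(url, name + ".pdf")
        except (IOError, OSError) as exc:
            # pdfkit raises IOError/OSError when wkhtmltopdf reports an error
            failed.append((url, str(exc)))
    return failed


def main():
    # Assumption: pdfkit and the wkhtmltopdf binary are installed, and
    # links.txt holds one site name per line, as in the question.
    import pdfkit

    with open("links.txt") as handle:
        failed = convert_all(handle, pdfkit.from_url)
    for url, reason in failed:
        print("Could not convert {0}: {1}".format(url, reason))
    print("Finished")
```

Calling main() runs the conversion against links.txt. Sites that fail are reported at the end rather than aborting the run, so the remaining PDFs are still written.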