Multiple kernels in Spyder - Python3


I need to get the source code of about 4,000 web pages and extract a few numbers from each. I do this with urllib and .split(), store the results in a dataframe, and export to CSV. After running cProfile:

    ncalls  tottime  percall  cumtime  percall  filename:lineno(function)
       290    0.003    0.000  411.894    1.420  request.py:1281(http_open)
       290    0.002    0.000  411.956    1.421  request.py:140(urlopen)
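For context, here is a minimal sketch of the fetch-and-parse flow described above (url_list, the split markers, and the column names are placeholders, not my real values):

    from urllib.request import urlopen
    import pandas as pd

    url_list = ["http://example.com/page1", "http://example.com/page2"]  # ~4,000 URLs in practice

    rows = []
    for url in url_list:
        html = urlopen(url).read().decode("utf-8", errors="replace")
        # extract a number sitting between two known markers via .split()
        number = html.split("MARKER_BEFORE")[1].split("MARKER_AFTER")[0]
        rows.append({"url": url, "number": number})

    pd.DataFrame(rows).to_csv("numbers.csv", index=False)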

The http_open and urlopen calls above account for nearly all of the runtime. Is there a workaround to fetch the source code faster? If not, are there any disadvantages to splitting the URLs among 6 different kernels, so that each only has to fetch some 650 pages, and running them in parallel instead of using threading? I am new to Python 3.
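For comparison, this is roughly what the threading route would look like with the standard library (fetch and url_list are the placeholder names from the sketch above):

    from concurrent.futures import ThreadPoolExecutor
    from urllib.request import urlopen

    def fetch(url):
        return url, urlopen(url, timeout=30).read().decode("utf-8", errors="replace")

    # 20 worker threads; each blocks on network I/O, so they overlap the waits
    with ThreadPoolExecutor(max_workers=20) as pool:
        for url, html in pool.map(fetch, url_list):
            pass  # parse html with .split() as before

Since the work is I/O-bound, the GIL should not be a limiting factor, so threads would presumably give a similar speedup to separate kernels without having to coordinate six consoles by hand.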

Also, is the cProfile excerpt above evidence that fetching the source code is the bottleneck? What other factors could contribute to the slow speed here? I have a decent 8 Mbps connection, but I believe the TCP handshake is what takes so long.
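One quick sanity check, assuming a reachable test URL: time a single request end to end. If one fetch takes on the order of a second, that is consistent with the ~1.42 s per call shown in the percall column above, pointing at the network rather than the parsing:

    import time
    from urllib.request import urlopen

    start = time.perf_counter()
    body = urlopen("http://example.com").read()
    print("fetched %d bytes in %.2f s" % (len(body), time.perf_counter() - start))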
