Pyspider console: phantomjs not found, continue running without it


I am trying to start a scraping project with Pyspider. I installed the required libraries:

  • Pyspider
  • PhantomJs
  • Tornado
  • Wsgidav (the required version 2.4)
  • Jsmin

After installation I got this error:

File "c:\users{:))}\appdata\local\programs\python\python37\lib\site-packages\pyspider\run.py", line 231

async=True, get_object=False, no_input=False):

SyntaxError: invalid syntax

I solved this problem by renaming every variable named "async" to "_async" (I use Python 3.7, and in this version async became a reserved keyword).

I started the project again with the command:

python -m pyspider.run

and got this output:

C:\Users\yosser\AppData\Local\Programs\Python\Python37-32\lib\site-packages\pyspider\libs\utils.py:196: FutureWarning: timeout is not supported on your platform.

warnings.warn("timeout is not supported on your platform.", FutureWarning)

[W 200425 12:55:44 run:413] phantomjs not found, continue running without it.

[I 200425 12:55:46 result_worker:49] result_worker starting...

[I 200425 12:55:47 processor:211] processor starting...

[I 200425 12:55:47 scheduler:647] scheduler starting...

[I 200425 12:55:47 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0

[I 200425 12:55:47 result_worker:66] result_worker exiting...

[I 200425 12:55:47 scheduler:782] scheduler.xmlrpc listening on 127.0.0.1:23333

[I 200425 12:55:48 tornado_fetcher:638] fetcher starting...

[I 200425 12:56:47 scheduler:586] in 5m: new:0,success:0,retry:0,failed:0

The Pyspider server is down (localhost:5000 not found). I suspect the cause is this line of the output:

[W 200425 12:55:44 run:413] phantomjs not found, continue running without it.

I also changed the file "webui/webdav.py" as suggested in this answer, but that did not help. Please, I need to resolve this as soon as possible. Thank you.


There is 1 solution below


async has been a reserved keyword since Python 3.7. Therefore, when using pyspider on Python 3.7 or later, you need to rename async to _async in every file that uses it as a variable or argument name. Start with the files and lines that the console reports as errors when you start pyspider.
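To see why the rename is needed, here is a minimal sketch that compiles a function signature using async as an argument name, then the same signature after the rename. The function name `example` is hypothetical and only stands in for the pyspider code from the traceback; the helper `lines_mentioning_async` is likewise my own illustration, not part of pyspider, and only lists candidate lines for manual review (it will also flag comments and strings).

```python
import keyword
import re
import sys

# Matches "async" as a whole word; "_async" does not match because
# "_" counts as a word character, so there is no boundary before "a".
ASYNC_WORD = re.compile(r"\basync\b")

# Hypothetical function using the argument names from the traceback.
OLD_SOURCE = (
    "def example(async=True, get_object=False, no_input=False):\n"
    "    return async\n"
)
NEW_SOURCE = OLD_SOURCE.replace("async", "_async")


def compiles(source):
    """Return True if the source text compiles on the running interpreter."""
    try:
        compile(source, "<example>", "exec")
        return True
    except SyntaxError:
        return False


def lines_mentioning_async(source):
    """Return (line_number, line) pairs containing the bare word async."""
    return [
        (number, line)
        for number, line in enumerate(source.splitlines(), start=1)
        if ASYNC_WORD.search(line)
    ]


if __name__ == "__main__":
    print("Python version:", sys.version_info[:2])
    print("async is a keyword:", keyword.iskeyword("async"))
    print("old signature compiles:", compiles(OLD_SOURCE))
    print("renamed signature compiles:", compiles(NEW_SOURCE))
    for number, line in lines_mentioning_async(OLD_SOURCE):
        print("needs review, line", number, ":", line.strip())
```

On Python 3.7 or later the old signature should fail to compile and the renamed one should succeed. To check a real file, you could read its text and pass it to `lines_mentioning_async`.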

To use PhantomJS, you need to have PhantomJS installed. If you are running pyspider in all mode, PhantomJS is enabled if its executable is in the PATH.

Make sure phantomjs is working by running

$ pyspider phantomjs

Is PhantomJS installed, and is its executable somewhere in the PATH?
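One quick way to answer that question is to ask Python where it would find the executable. This is a minimal sketch using the standard library's `shutil.which`, run with the same interpreter you use for pyspider; the helper name `find_executable` is my own and not part of pyspider, and I am assuming the PATH seen by this script is the same one pyspider sees.

```python
import os
import shutil


def find_executable(name="phantomjs", path=None):
    """Return the full path of the executable, or None if it is not found.

    With path=None, shutil.which searches the directories in the PATH
    environment variable of the current process.
    """
    return shutil.which(name, path=path)


if __name__ == "__main__":
    location = find_executable()
    if location:
        print("phantomjs found at:", location)
    else:
        print("phantomjs is not on the PATH. Directories searched:")
        for directory in os.environ.get("PATH", "").split(os.pathsep):
            print("  ", directory)
```

If nothing is found, add the folder that contains the phantomjs executable to the PATH, open a new console so the change takes effect, and start pyspider again.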