Scripted Browser Scraper

What can I use to achieve the following: script a browser, or otherwise make requests to a server, in order to log in and browse the site, e.g. find links and navigate to them?
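To make the "find links and navigate to those links" step concrete, here is a minimal sketch in plain Node.js JavaScript (not CoffeeScript, and not node.io). The function name `extractLinks` and the regex-based parsing are assumptions made for illustration; a real scraper would normally use a proper HTML parser, since a regex will miss or mis-match unusual markup.

```javascript
// Rough sketch: pull href values out of anchor tags and resolve them
// against the page URL. extractLinks is a hypothetical helper name.
function extractLinks(html, baseUrl) {
  // Matches <a ... href="..."> or <a ... href='...'>; deliberately simplistic.
  const anchorHref = /<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1/gi;
  const seen = new Set();

  for (const match of html.matchAll(anchorHref)) {
    let resolved;
    try {
      // Resolve relative links such as "/about" against the page URL.
      resolved = new URL(match[2], baseUrl);
    } catch (err) {
      continue; // Skip hrefs that cannot be parsed as URLs.
    }
    // Keep only links that can actually be requested over HTTP(S).
    if (resolved.protocol === 'http:' || resolved.protocol === 'https:') {
      seen.add(resolved.href);
    }
  }
  // A Set keeps insertion order, so links come back in page order, deduplicated.
  return [...seen];
}
```

Each returned URL can then be requested in turn, which is the "navigate" half of the task.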

For now, since I am into NodeJS, I have been looking at node.io. It lets you scrape a site quite easily, but the problem is that when I try to POST (to log in), I get nothing back!

nodeio = require "node.io"

nodeio.scrape ->

    @post "http://localhost/auth/login", {
        username: "username"
        password: "password"
    }, ->

        console.log "=====After Login====="

But I just get

OK: Job complete

Even if the login fails, shouldn't I still reach the console.log after the login?
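One way to narrow this down might be to send the same login POST with Node's built-in `http` module and look at what the server actually returns. The sketch below is an assumption-laden illustration, not node.io's API: the helper names `cookieHeader`, `request`, and `loginAndFetch` are hypothetical, and it assumes the server uses a form-encoded login with a cookie-based session over plain HTTP.

```javascript
const http = require('node:http');
const querystring = require('node:querystring');

// Build a Cookie request header from the Set-Cookie response headers,
// keeping only the name=value part of each cookie.
function cookieHeader(setCookie = []) {
  return setCookie.map((cookie) => cookie.split(';')[0].trim()).join('; ');
}

// Send one HTTP request and resolve with the status, headers and body.
function request(url, { method = 'GET', headers = {}, body } = {}) {
  return new Promise((resolve, reject) => {
    const req = http.request(url, { method, headers }, (res) => {
      let data = '';
      res.setEncoding('utf8');
      res.on('data', (chunk) => { data += chunk; });
      res.on('end', () => {
        resolve({ status: res.statusCode, headers: res.headers, body: data });
      });
    });
    req.on('error', reject);
    if (body) req.write(body);
    req.end();
  });
}

// Log in with a form POST, then fetch another page with the session cookie.
async function loginAndFetch(loginUrl, credentials, nextUrl) {
  const form = querystring.stringify(credentials);
  const login = await request(loginUrl, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/x-www-form-urlencoded',
      'Content-Length': Buffer.byteLength(form),
    },
    body: form,
  });
  // Printed whether or not the credentials were accepted.
  console.log('login status:', login.status);

  const cookie = cookieHeader(login.headers['set-cookie']);
  return request(nextUrl, { headers: cookie ? { Cookie: cookie } : {} });
}
```

For example, `loginAndFetch("http://localhost/auth/login", { username: "username", password: "password" }, "http://localhost/")` would log the login status and resolve with the follow-up page; the second URL here is only a placeholder.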


Then I was thinking that maybe it's better to implement this by scripting a browser instead, since that would more closely simulate a real request?

There are 2 best solutions below

4
On

node.io seems like a good tool for the job, but I'd also recommend zombie.js. It seems to be geared mostly towards testing, but judging by the docs it should work well for scraping, too.

If you want to go the scripted browser route, ignore my answer. :)

2
On

Selenium or Watir let you script a browser. They drive an actual browser, so they will be slower than lower-level tools, but they do everything a browser does (e.g., run JavaScript).