I am essentially trying to scrape a page on the fly: when you hit this URL, it outputs the result of the scrape job. Everything works wonderfully the first time. The second time I try it (with different parameters passed through job.options.args), it won't even execute the node.io job's run() function, and scrape_result comes back empty (I expect an object).
Any thoughts? How can I ensure the new results get returned the second time? For my scrape job I'm using almost exactly example #3 from here: https://github.com/chriso/node.io/wiki/Scraping
Excerpt from scraper.js (the rest is just like example #3):
run: function() {
    var book = this.options.args[0].book;
    var chapter = this.options.args[0].chapter;
    this.getHtml('http://www.url.com' + book + '/' + chapter + '?lang=eng', function(err, $) {
        // ...
Then my app.js
var scrip_scraper = require('./scraper.js');

app.get('/verses/:book/:chapter', function (req, res) {
    var params = {
        book: req.param('book'),
        chapter: req.param('chapter')
    };

    scrip_scraper.job.options.args[0] = params;
    //scrip_scraper.job.options.args.push(chapter);
    console.log(scrip_scraper.job.options.args);

    nodeio.start(scrip_scraper, function (err, scrape_result) {
        console.log(scrape_result);
    }, true);
}); //app.get('/verses/:book/:chapter')
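For context on the failure mode: every request mutates the same job.options.args object, so any callback that reads it later sees whichever request wrote last. A minimal plain-Node sketch contrasting shared state with a per-call argument (illustrative names and book values, no node.io required):

```javascript
// Mimic node.io's shared job object: one options.args for every request.
var job = { options: { args: [] } };

// Deferred reads stand in for the async callbacks inside run()/getHtml().
var deferredShared = [];
var deferredIsolated = [];

function handleShared(params) {
    job.options.args[0] = params;                   // mutates shared state
    deferredShared.push(function () { return job.options.args[0]; });
}

function handleIsolated(params) {
    deferredIsolated.push(function () { return params; }); // closure over the argument
}

// Two overlapping requests arrive before either "async" read runs:
handleShared({ book: 'alma' });
handleShared({ book: 'nephi' });
handleIsolated({ book: 'alma' });
handleIsolated({ book: 'nephi' });

// By the time the callbacks fire, the shared args have been clobbered:
var seenShared = deferredShared.map(function (read) { return read().book; });     // ['nephi', 'nephi']
var seenIsolated = deferredIsolated.map(function (read) { return read().book; }); // ['alma', 'nephi']
console.log(seenShared, seenIsolated);
```

Both shared reads come back with the second request's book, while each isolated read keeps the data it was called with.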
You're probably running into scoping issues: options.args can change while a request is still in flight. Try passing the input to the job as a function argument so it cannot be changed by another request. Here's an example you could adapt to your needs.
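A sketch of that approach (my own, not the answer's original code), assuming node.io feeds each element of a job's input to run() as an argument and that nodeio.start(job, options, callback, capture_output) accepts per-call options that override the job's input; the names and URL are carried over from the question, and the actual parsing would follow example #3:

```javascript
// scraper.js (sketch)
var nodeio = require('node.io');

exports.job = new nodeio.Job({ timeout: 10 }, {
    input: [],                  // overridden per request from app.js
    run: function (params) {
        // params is this run's own input row -- nothing is read from the
        // shared options.args, so concurrent requests can't clobber it.
        this.getHtml('http://www.url.com/' + params.book + '/' + params.chapter + '?lang=eng', function (err, $) {
            if (err) this.exit(err);
            // ... scrape with $ exactly as in example #3 ...
            this.emit(/* scraped verses */);
        });
    }
});

// app.js (sketch)
var nodeio = require('node.io');
var scraper = require('./scraper.js');

app.get('/verses/:book/:chapter', function (req, res) {
    var params = {
        book: req.param('book'),
        chapter: req.param('chapter')
    };

    // Hand this request its own input row instead of mutating shared state.
    nodeio.start(scraper.job, { input: [params] }, function (err, scrape_result) {
        if (err) return res.send(500);
        res.send(scrape_result);
    }, true);
});
```

Because each run reads only its own input row, a second request can no longer overwrite the first request's parameters mid-scrape.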