How to get 'Next Page' link with Scrubyt

Asked by robintw on 03 October 2008 at 20:56

I'm trying to use Scrubyt to get the details from this page: http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?section=events. I've managed to get the titles and detail URLs from the list, but I can't get next_page to make the scraper move on to the next page. I assume that's because I'm not using the correct pattern for the next page link. I've tried the string "Next Page", and I've also tried the link's XPath. Any other ideas?

The code is below:

require 'rubygems'
require 'scrubyt'

nuffield_data = Scrubyt::Extractor.define do
  fetch 'http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?section=events'

  event do
    title 'The Coast of Mayo'
    #url "href", :type => :attribute
    link_url
  end

  next_page "Next Page", :limit => 2
end

nuffield_data.to_xml.write($stdout,1)

Answered by user6325 on 04 October 2008 at 10:34 (best answer)

Try this with a slightly different URL:

fetch 'http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php'

Scrubyt seems to have trouble with the "?section=events" query string at the end of the URL.

When it looks for the next page, it tries to request this URL:

http://www.nuffieldtheatre.co.uk/cn/events/?pageNum_rsSearch=1&totalRows_rsSearch=39&section=events

instead of:

http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?pageNum_rsSearch=1&totalRows_rsSearch=39&section=events

Removing the query string from the end of the URL seems to fix this. You might want to file this as a bug.
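If you want to double-check which URL the next page link ought to resolve to, independently of Scrubyt, a rough sketch like the one below might help. It assumes the "Next Page" link's href is a query-only reference (the page markup isn't shown in this thread, so next_href is a guess), and resolve_query_only is a hypothetical helper written for this example, not part of Scrubyt. It only shows what the correct target looks like; it doesn't explain why Scrubyt drops the file name.

```ruby
require 'uri'

# Hypothetical helper (not part of Scrubyt): resolve a query-only href such as
# "?page=2" against a base URL by keeping the base path and replacing only
# the query string, which is how RFC 3986 treats query-only references.
def resolve_query_only(base, href)
  raise ArgumentError, 'href must start with "?"' unless href.start_with?('?')

  uri = URI.parse(base)
  uri.query = href[1..-1]   # drop the leading "?" and swap in the new query
  uri.fragment = nil
  uri.to_s
end

fetch_url = 'http://www.nuffieldtheatre.co.uk/cn/events/event_listings.php?section=events'

# ASSUMPTION: the href of the "Next Page" link; the real markup is not quoted here.
next_href = '?pageNum_rsSearch=1&totalRows_rsSearch=39&section=events'

# The URL Scrubyt was reported to request, copied from the answer above.
observed = 'http://www.nuffieldtheatre.co.uk/cn/events/?pageNum_rsSearch=1&totalRows_rsSearch=39&section=events'

expected = resolve_query_only(fetch_url, next_href)

puts "expected: #{expected}"
puts "observed: #{observed}"
puts "expected path: #{URI.parse(expected).path}"
puts "observed path: #{URI.parse(observed).path}"
```

Comparing the two paths makes the missing event_listings.php segment easy to spot, which could be handy evidence to attach to a bug report.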