I'm setting up a new web scraper using Apify to scrape a page with pagination. Usually, I'd use the request queue, Link Selector, and Pseudo-URL method. However, the page I'm trying to scrape has dynamic "next page" buttons, and the link is triggered by a JavaScript function.
What would be the best way to tell Apify's web scraper to go to the next page?
Any way to simulate a manual click on the button?
Or to use the number sequence at the end of the URL (www.domain.com/discover/recent?page=2)?
Looking at this particular website, it looks like every next page (as you already mentioned) has ?page=<i> in the URL, so you could just enqueue the next page at the end of the page function by using context.enqueueRequest().
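A minimal sketch of that first approach, assuming the page function receives a context exposing request.url and enqueueRequest() as mentioned above. The names nextPageUrl and MAX_PAGES are hypothetical helpers introduced here for illustration, and the cap of 5 pages is an arbitrary stop condition you would replace with your own (for example, stopping when a page yields no items):

```javascript
// Hypothetical cap so the crawl stops somewhere; adjust or replace it
// with a check such as "no items found on this page".
const MAX_PAGES = 5;

// Build the URL of the following page by incrementing the `page` query
// parameter. A URL without `page` is treated as page 1.
// Expects an absolute URL (with scheme), as request.url normally is.
function nextPageUrl(currentUrl) {
    const url = new URL(currentUrl);
    const current = Number.parseInt(url.searchParams.get('page') || '1', 10);
    url.searchParams.set('page', String(current + 1));
    return url.href;
}

// Page function sketch: scrape the current page, then enqueue the next one.
async function pageFunction(context) {
    const { request } = context;

    // ... extract whatever you need from the current page here ...

    const next = nextPageUrl(request.url);
    const nextPageNumber = Number(new URL(next).searchParams.get('page'));
    if (nextPageNumber <= MAX_PAGES) {
        await context.enqueueRequest({ url: next });
    }

    return { url: request.url };
}
```

Only the pageFunction body would go into the scraper's Page function field; nextPageUrl can be declared inside it if the editor expects a single function.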
Another option is to use the Cheerio crawler with the XHR links, which look like this one: https://webflow.com/api/discover/sites/recent?limit=12&offset=0&sort=&cloneable=false&tag=, where offset would be 0, 12, 24, etc. This way you'll get an array of structured JSON objects in the response (representing the 12 items loaded on each page), and you'd also save some Compute Units, as you don't really need the browser.
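To illustrate the offset pattern, here is a small sketch that generates the API URLs for the first few pages. The function name buildApiUrls and the pageCount argument are hypothetical; the query parameters are copied from the example URL above, and the total number of pages available is not something this snippet knows:

```javascript
const API_BASE = 'https://webflow.com/api/discover/sites/recent';

// Build one API URL per page: offset = 0, limit, 2 * limit, ...
function buildApiUrls(pageCount, limit = 12) {
    const urls = [];
    for (let page = 0; page < pageCount; page++) {
        const params = new URLSearchParams({
            limit: String(limit),
            offset: String(page * limit),
            sort: '',
            cloneable: 'false',
            tag: '',
        });
        urls.push(`${API_BASE}?${params.toString()}`);
    }
    return urls;
}

// Example: URLs for the first three pages (offsets 0, 12, 24).
console.log(buildApiUrls(3));
```

The resulting list could be supplied as start URLs for the Cheerio-based crawler, so each request returns JSON directly instead of a rendered page.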
Hope this helps!