How to enable JS in colly

I have a lot of experience with Scrapy, but for some reasons I have to use colly in this project. I'm trying to scrape data from a website, but it returns: "To regain access, please make sure that cookies and JavaScript are enabled before reloading the page."

Part of my code is as follows:

func crawl(search savedSearch) {
    c := colly.NewCollector()
    extensions.RandomUserAgent(c)
    /* for debugging to see what is the result
    c.OnHTML("*", func(e *colly.HTMLElement) {
        fmt.Println(e.Text)
        os.Exit(1)
    })*/
    c.OnHTML(".result-list__listing", func(e *colly.HTMLElement) {
        listingId, err := strconv.Atoi(e.Attr("data-id"))
        if err != nil {
            // Skip elements whose data-id is missing or not a number.
            return
        }
        if !listingExist(search.id, listingId) {
            fmt.Println("Listing found " + strconv.Itoa(listingId))
            saveListing(search.id, listingId)
            notifyUser(search.user, listingId)
        } else {
            fmt.Println("item is already crawled")
        }
    })

The docs mention "Automatic cookie and session handling", so the problem is probably JavaScript. How can I overcome this? As a first step, how can I enable JS in colly?

There is 1 best solution below


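Colly is a good choice for static HTML pages, but it does not execute JavaScript. To scrape JS-driven pages you need a different strategy: drive a real (headless) browser. Browsers expose a common remote-control protocol (for Chrome, the Chrome DevTools Protocol), and there are client libraries for it in many languages, including Go (chromedp is one example).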
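
One way to structure that "different strategy" is to keep the cheap static fetch and only fall back to a JS-capable fetcher when the block page from the question shows up. The sketch below is a rough illustration under assumptions, not colly's or any library's API: `Fetcher`, `needsJS`, `fetchWithFallback`, and the example URL are hypothetical names, and the fetchers in `main` are stubs so it runs offline. In a real program the static fetcher might wrap colly or net/http, and the rendered one a headless browser.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// blockMarker is taken from the message quoted in the question.
// Other sites word this differently, so treat it as an assumption.
const blockMarker = "cookies and JavaScript are enabled"

// Fetcher returns the HTML for a URL. It is a hypothetical type:
// one implementation could wrap a plain HTTP client, another a
// headless browser.
type Fetcher func(url string) (string, error)

// needsJS reports whether the HTML looks like the "enable JavaScript" block page.
func needsJS(html string) bool {
	return strings.Contains(html, blockMarker)
}

// fetchWithFallback tries the static fetcher first and calls the
// JS-capable fetcher only when the block page is detected.
func fetchWithFallback(url string, static, rendered Fetcher) (string, error) {
	html, err := static(url)
	if err != nil {
		return "", fmt.Errorf("static fetch: %w", err)
	}
	if !needsJS(html) {
		return html, nil
	}
	if rendered == nil {
		return "", errors.New("page requires JavaScript and no rendering fetcher was provided")
	}
	html, err = rendered(url)
	if err != nil {
		return "", fmt.Errorf("rendered fetch: %w", err)
	}
	return html, nil
}

func main() {
	// Stub fetchers stand in for real network calls.
	static := func(string) (string, error) {
		return "To regain access, please make sure that cookies and JavaScript are enabled before reloading the page.", nil
	}
	rendered := func(string) (string, error) {
		return `<div class="result-list__listing" data-id="42"></div>`, nil
	}

	html, err := fetchWithFallback("https://example.com/search", static, rendered)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(html)
}
```

The HTML returned by the rendered fetcher can then be parsed with the same `.result-list__listing` selector used in the question. Some sites also apply bot detection that a headless browser alone will not pass, so this fallback is not guaranteed to work everywhere.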