I am using Colly to scrape an e-commerce website, looping over many products.
Here is the snippet of my code that gets a sub-title:
c.OnXML("/html/body/div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1/1234", func(e *colly.XMLElement) {
    fmt.Println(e.Text)
})
However, not all products have a sub-title, so the XPath above does not work in all cases.
When I reach a product that does not have a sub-title, my code crashes with this error:
panic: expression must evaluate to a node-set
Here is my code so far:
c := colly.NewCollector()
c.OnError(func(_ *colly.Response, err error) {
    log.Println("Something went wrong:", err)
})
// Sub-title
c.OnXML("/html/body/div[4]/div/div[3]/div[2]/div/div[1]/div[3]/div/div/h1/1234", func(e *colly.XMLElement) {
    fmt.Println(e.Text)
})
c.OnRequest(func(r *colly.Request) {
    fmt.Println("Visiting", r.URL)
})
c.Visit("https://www.lazada.vn/-i1701980654-s7563711492.html")
Here is what I want (pseudocode):
c.OnXML("/html/b.....v/h1/1234", func(e *colly.XMLElement) {
    if no error {
        fmt.Println("NO ERROR")
    } else {
        fmt.Println("GOT ERROR")
    }
})
I think I have figured out what went wrong in your code. Let me start from the end result. As you can see, the error originates from the panic statement at line 473 of the parse.go file. The xpath package has a method called parseNodeTest that switches on p.r.typ.
The value of p.r.typ is itemNumber (28). This sends the switch into its default branch, which raises the error. The methods invoked before parseNodeTest (you can see them in your IDE's call stack) set typ to this value for the literal 1234, and that is what makes the XPath query invalid. To make it work, you have to remove the 1234 and put a valid value in its place.
Let me know if this solves your issue, thanks!
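If you also want the "no error / got error" branching from the question, one option is to turn the panic into an ordinary error with recover. The following is only a minimal sketch using the standard library: safely and brokenQuery are hypothetical names introduced here, and brokenQuery merely stands in for the library call that panics on the invalid expression.

```go
package main

import "fmt"

// safely runs fn and converts any panic raised inside it into an error.
// (Hypothetical helper, not part of Colly or the xpath package.)
func safely(fn func()) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("recovered from panic: %v", r)
		}
	}()
	fn()
	return nil
}

// brokenQuery is a stand-in for a call that panics on an invalid
// XPath expression; it only reproduces the message from the question.
func brokenQuery() {
	panic("expression must evaluate to a node-set")
}

func main() {
	// A call that panics is reported as an error instead of crashing.
	if err := safely(brokenQuery); err != nil {
		fmt.Println("GOT ERROR:", err)
	} else {
		fmt.Println("NO ERROR")
	}

	// A call that does not panic yields a nil error.
	if err := safely(func() {}); err != nil {
		fmt.Println("GOT ERROR:", err)
	} else {
		fmt.Println("NO ERROR")
	}
}
```

In the scraper, the same wrapper could be placed around c.Visit(...), assuming the panic is raised while the response is being processed. This only keeps the program alive; fixing the XPath expression itself is still the real solution.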