Colly not finding the body tag by xpath but finding it by selector name

I'm learning web scraping using gocolly. When I try to find the tag using selector name body, it successfully finds it. However, when I try to find the body tag by xpath /html/body, it fails to find it.

I have used OnHTML() with a simple callback function:

collector.OnHTML("/html/body", func(element *colly.HTMLElement) {
    fmt.Println("Found Body")
})

Any idea why this is happening?

Also, when looking at tutorials, I noticed that the selector passed to OnHTML() is sometimes wrapped in "" (double quotes) and sometimes in `` (back-ticks). Is there a difference between the two?

Finally, how do I search for an element by ID? When I try to find the ID #layout-container under the body, Colly does not find it:

collector.OnHTML("#layout-container", func(element *colly.HTMLElement) {
    fmt.Println("Found Layout Container")
})

Thanks in advance!

There is 1 best solution below

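OnHTML() takes a goquery (CSS-style) selector, not an XPath expression, so /html/body is not a valid selector there. From an HTML perspective, the /html part is already implied when using OnHTML(), so "body" is enough.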

You would use /html/body with OnXML() instead, as shown in colly_test.go. Its documentation reads: "Function will be executed on every XML element matched by the xpath Query parameter".

The test using OnHTML shows only "body".