How to add the start of a URL to a Colly link list


I'm somewhat new to Go and am trying to scrape several webpages using Colly. Two of the pages return incomplete (relative) links. The code and its output are below:

func PaloNet() {

    c := colly.NewCollector(
        colly.AllowedDomains("security.paloaltonetworks.com"),
    )

    c.OnHTML(".list", func(e *colly.HTMLElement) {
        PaloNetlinks := e.ChildAttrs("a", "href")
        fmt.Println("\n\n PaloAlto Security: \n\n", PaloNetlinks)
    })

    c.Visit("https://security.paloaltonetworks.com/")

}

Output:

[/CVE-2022-0031 /CVE-2022-42889 /PAN-SA-2022-0006 /CVE-2022-0030 /CVE-2022-0029 /PAN-SA-2022-0005 /CVE-2022-28199 /PAN-SA-2022-0004 /CVE-2022-0028 /PAN-SA-2022-0003 /CVE-2022-0024 /CVE-2022-0026 /CVE-2022-0025 /CVE-2022-0027 /PAN-SA-2022-0001 /PAN-SA-2022-0002 /CVE-2022-0023 /CVE-2022-0778 /CVE-2022-22963 /CVE-2022-0022 /CVE-2021-44142 /CVE-2022-0016 /CVE-2022-0017 /CVE-2022-0020 /CVE-2022-0011 /csv?]

As you can see, the links are missing the 'https://security.paloaltonetworks.com' prefix. What would be the best way to add it to the start of each link?

Ahmet Buğra Okyay On BEST ANSWER

You can do it like this: keep the base URL in a variable and prepend it to each relative href as you collect the links.

func PaloNet() {
    // Base URL without a trailing slash, because every href already starts with "/".
    visitUrl := "https://security.paloaltonetworks.com"
    urls := []string{}

    c := colly.NewCollector(
        colly.AllowedDomains("security.paloaltonetworks.com"),
    )

    c.OnHTML(".list", func(e *colly.HTMLElement) {
        PaloNetlinks := e.ChildAttrs("a", "href")

        // Prefix each relative link with the base URL to make it absolute.
        for i := 0; i < len(PaloNetlinks); i++ {
            urls = append(urls, visitUrl+PaloNetlinks[i])
        }

        fmt.Println("\n\n PaloAlto Security: \n\n", urls)
    })

    c.Visit(visitUrl)
}
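
String concatenation works here because every href starts with "/" and the base has no trailing slash. If that is not guaranteed, resolving each href against the base with the standard library's net/url package may be more robust. Below is a minimal sketch of that idea; the helper name absoluteLinks is hypothetical (it is not part of Colly), and the sample hrefs are taken from the output in the question.

```go
package main

import (
	"fmt"
	"net/url"
)

// absoluteLinks is a hypothetical helper (not part of Colly): it resolves each
// href against base, so relative links become absolute and already-absolute
// links are passed through.
func absoluteLinks(base string, hrefs []string) ([]string, error) {
	baseURL, err := url.Parse(base)
	if err != nil {
		return nil, fmt.Errorf("parse base %q: %w", base, err)
	}

	out := make([]string, 0, len(hrefs))
	for _, h := range hrefs {
		ref, err := url.Parse(h)
		if err != nil {
			return nil, fmt.Errorf("parse href %q: %w", h, err)
		}
		// ResolveReference applies the usual relative-URL resolution rules.
		out = append(out, baseURL.ResolveReference(ref).String())
	}
	return out, nil
}

func main() {
	links, err := absoluteLinks(
		"https://security.paloaltonetworks.com/",
		[]string{"/CVE-2022-0031", "/PAN-SA-2022-0006"},
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, l := range links {
		fmt.Println(l)
	}
}
```

Inside the OnHTML callback you could pass e.ChildAttrs("a", "href") to such a helper. Colly's Request type also has an AbsoluteURL method, so calling e.Request.AbsoluteURL(href) for each href should give a similar result, resolved against the page that was actually visited.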