I'm trying to web-scrape latitude and longitude for Zillow houses in R with the rvest and dplyr packages, using the SelectorGadget tool to find CSS selectors.
I'm trying to find the latitude and longitude for each listing and store them in the data frame I build with the code below. This is what I have so far. Can anyone help?
link = "https://www.zillow.com/arlington-va/2_p/?searchQueryState=%7B%22pagination%22%3A%7B%22currentPage%22%3A2%7D%2C%22usersSearchTerm%22%3A%22arlington%2C%20virginia%22%2C%22mapBounds%22%3A%7B%22west%22%3A-77.46492611914063%2C%22east%22%3A-76.73708188085938%2C%22south%22%3A38.64364888623124%2C%22north%22%3A39.117234332841704%7D%2C%22regionSelection%22%3A%5B%7B%22regionId%22%3A30258%2C%22regionType%22%3A6%7D%5D%2C%22isMapVisible%22%3Atrue%2C%22filterState%22%3A%7B%22ah%22%3A%7B%22value%22%3Atrue%7D%2C%22sort%22%3A%7B%22value%22%3A%22globalrelevanceex%22%7D%7D%2C%22isListVisible%22%3Atrue%7D"
library(rvest)
library(dplyr)

page = read_html(link)
# selector for the address assumed from Zillow's card markup; adjust if needed
address = page %>% html_nodes(".list-card-addr") %>% html_text()
bed = page %>% html_nodes(".list-card-details li:nth-child(1)") %>% html_text()
bath = page %>% html_nodes(".list-card-details li:nth-child(2)") %>% html_text()
sqfoot = page %>% html_nodes(".list-card-details li:nth-child(3)") %>% html_text()
price = page %>% html_nodes(".list-card-price") %>% html_text()
marketime = page %>% html_nodes(".list-card-variable-text") %>% html_text()
houses = data.frame(address, bed, bath, sqfoot, price, marketime) %>%
  mutate(bed = as.numeric(substring(bed, 1, 1)),
         bath = as.numeric(substring(bath, 1, 1)),
         sqfoot = as.numeric(gsub(" sqft|,", "", sqfoot)),
         price = as.numeric(gsub("[$,]", "", price)))
You can extract all the listing info from the script tags on the page (though I believe Zillow has an API, which would be a better source).
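As a rough sketch of the script-tag approach: Zillow's search results pages have historically embedded one `<script type="application/ld+json">` block per listing, with schema.org markup whose `name` field holds the address and whose `geo` object holds the coordinates. The field names (`name`, `geo$latitude`, `geo$longitude`) below assume that markup is still present; inspect the raw JSON on the live page first, since Zillow changes its page structure often and may block automated requests.

```r
library(rvest)
library(jsonlite)

page <- read_html(link)

# Pull every JSON-LD block out of the page and parse it
ld <- page %>%
  html_nodes('script[type="application/ld+json"]') %>%
  html_text() %>%
  lapply(fromJSON)

# Keep only the entries that actually carry geo coordinates
# (some blocks describe the search page itself, not a listing)
listings <- Filter(function(x) !is.null(x$geo), ld)

# Build a data frame of address + coordinates from the assumed fields
latlong <- do.call(rbind, lapply(listings, function(x) {
  data.frame(address   = x$name,
             latitude  = as.numeric(x$geo$latitude),
             longitude = as.numeric(x$geo$longitude),
             stringsAsFactors = FALSE)
}))
```

If the number of rows matches your card scrape, you can then join `latlong` onto `houses` by `address` (e.g. with `dplyr::left_join`) rather than relying on row order.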