I'm trying to use R to scrape the city-data.com pages for a couple of hundred California cities and return a single table with one row per city and one column per city variable. I'd like to be able to input a list of URLs, one for each city. Right now I can come close to scraping a single record from a single city URL using:
library(XML)
# URL of a single city page
city.url <- "http://www.city-data.com/city/Acalanes-Ridge-California.html"
# which = 2 selects the second HTML table on the page
city.df <- readHTMLTable(city.url, header = TRUE, which = 2, stringsAsFactors = FALSE)
head(city.df, 1)
It returns:
  Males: 568 (50.0%)
1 Females: 569 (50.0%)
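For reference, the multi-city version I have in mind would look roughly like the sketch below. The helper names (`parse_labelled`, `bind_fill`, `scrape_cities`) are made up for illustration, and it assumes the second table on every city page holds "Label: value" cells like the one above, which I have only checked on this one page. It reads the table with `header = FALSE` so the first row is not used up as a column name.

```r
# Hypothetical helper: turn strings like "Males: 568 (50.0%)" into a
# one-row data frame whose column names are the labels before the colon.
parse_labelled <- function(x) {
  x <- x[grepl(":", x, fixed = TRUE)]
  values <- trimws(sub("^[^:]*:", "", x))
  names(values) <- trimws(sub(":.*$", "", x))
  data.frame(t(values), check.names = FALSE, stringsAsFactors = FALSE)
}

# Hypothetical helper: stack one-row data frames that may not share the
# same columns, filling the missing ones with NA.
bind_fill <- function(rows) {
  cols <- unique(unlist(lapply(rows, names)))
  filled <- lapply(rows, function(r) {
    for (m in setdiff(cols, names(r))) r[[m]] <- NA_character_
    r[cols]
  })
  do.call(rbind, filled)
}

# One row per URL. `reader` is swappable so the logic can be tried offline;
# the default assumes table 2 is the one of interest on every page.
scrape_cities <- function(urls, reader = function(u)
    XML::readHTMLTable(u, header = FALSE, which = 2, stringsAsFactors = FALSE)) {
  rows <- lapply(urls, function(u) {
    tbl <- tryCatch(reader(u), error = function(e) NULL)
    # A page that fails to load or parse still gets a row, with NAs elsewhere
    if (is.null(tbl)) return(data.frame(url = u, stringsAsFactors = FALSE))
    # Paste the cells of each table row back into a single string
    cells <- do.call(paste, unname(as.list(tbl)))
    data.frame(url = u, parse_labelled(cells),
               check.names = FALSE, stringsAsFactors = FALSE)
  })
  bind_fill(rows)
}
```

The intended call would be something like `scrape_cities(city.urls)`, where `city.urls` is a character vector with one city-data.com URL per city.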
I'd really appreciate any advice. Dollar General is trying to build in our community and I'm trying to quickly put together an impact analysis to examine what happens to small towns after a Dollar General is built. Thanks!