How to extract data from Weather Underground (wunderground) into CSV files

I am trying to download historical data from Weather Underground in CSV format by running a for loop with one iteration i per date.

To keep the run time short while testing, I am only running the code from 1 September 2019 to today's date (I will eventually need to run it for a period of a few years). In each iteration I build the URL like this:

myurl[i] <- paste("https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=",station,"&day=",my_dates$day[i],"&month=",my_dates$month[i],"&year=",my_dates$year[i],"&graphspan=day&format=1", sep = "")

Here i indexes a date between the start and end dates, e.g. https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=IISLEOFW4&day=1&month=9&year=2019&graphspan=day&format=1

start_date <- as.POSIXct("2019-09-01",tz="gmt")
end_date <-  as.POSIXct(Sys.Date(),tz="gmt")

dates <- seq.POSIXt(start_date,end_date,by="day")

my_dates<-as.integer(unlist(strsplit(as.character(dates),"-")))

my_dates<-array(my_dates,dim=c(3,length(my_dates)/3))

my_dates<-as.data.frame(t(my_dates))

colnames(my_dates)<-c("year","month","day")

station <- "IISLEOFW4"


for(i in 1:length(my_dates)) {

  myurl[i] <- paste("https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=",station,"&day=",my_dates$day[i],"&month=",my_dates$month[i],"&year=",my_dates$year[i],"&graphspan=day&format=1", sep = "")
  folder <- "D:\\WeatherData"
  myfile <- paste("NewportWeather",rownames(my_dates),".csv")
  download.file(url = myurl[i], file.path(folder, myfile, fsep = "\\" ))
}

When I run this, only the data for one day is downloaded to the correct file location, not the data for every day in the date range. Is anyone able to help me download the data for the entire date range, either into separate files or all appended to the same file?

You will need to change the folder the file is downloaded into to run the code.

Thanks for helping!

There is 1 solution below

I have sorted it out!

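As @AndrewGustar pointed out, I needed to incorporate i into the filename so that each iteration writes to its own file. Without it, rownames(my_dates) returns every row name at once, so myfile was not a single file name. It can be done like this (paste0() avoids the spaces that paste() would insert around i):

myfile[i] <- paste0("NewportWeather",i,".csv")

I also had to change length(my_dates) to nrow(my_dates) in the for loop, because length() of a data frame returns the number of columns (3 here), not the number of rows.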
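To see both points in isolation, here is a small sketch that needs no download. demo_dates is a made-up data frame with the same three columns as my_dates, used only for illustration.

```r
# A small data frame shaped like my_dates: one row per day, three columns
demo_dates <- data.frame(year = rep(2019L, 5), month = rep(9L, 5), day = 1:5)

n_cols <- length(demo_dates)  # number of columns, so a loop over 1:length() stops early
n_rows <- nrow(demo_dates)    # number of rows, i.e. the number of days

# seq_len() is a safer loop range than 1:n because it is empty when n is 0
for (i in seq_len(n_rows)) {
  # paste() separates its arguments with a space by default; paste0() does not
  spaced  <- paste("NewportWeather", i, ".csv")
  compact <- paste0("NewportWeather", i, ".csv")
}
```

After the loop, spaced holds a name with spaces around the index, while compact holds the name without them.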

The final code looks like this:

start_date <- as.POSIXct("2019-09-01",tz="gmt")
end_date <-  as.POSIXct(Sys.Date(),tz="gmt")

dates <- seq.POSIXt(start_date,end_date,by="day")

my_dates<-as.integer(unlist(strsplit(as.character(dates),"-")))

my_dates<-array(my_dates,dim=c(3,length(my_dates)/3))

my_dates<-as.data.frame(t(my_dates))

colnames(my_dates)<-c("year","month","day")

station <- "IISLEOFW4"


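# Create the vectors that the loop fills in, so the code also runs in a fresh session
myurl <- character(nrow(my_dates))
myfile <- character(nrow(my_dates))
folder <- "D:\\WeatherData"

for(i in 1:nrow(my_dates)) {

  myurl[i] <- paste("https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=",station,"&day=",my_dates$day[i],"&month=",my_dates$month[i],"&year=",my_dates$year[i],"&graphspan=day&format=1", sep = "")
  # paste0() joins the pieces without the spaces that paste() adds by default
  myfile[i] <- paste0("NewportWeather",i,".csv")
  download.file(url = myurl[i], destfile = file.path(folder, myfile[i], fsep = "\\" ))
}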
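
As an alternative to splitting the dates into a year/month/day data frame, the same URLs can be built directly from a Date sequence, and the daily files can be appended into one file afterwards. The following is only a minimal sketch, not a tested replacement. build_wu_url(), download_days() and combine_days() are hypothetical helper names, and the date-stamped file names are my own choice. It assumes the endpoint from the question still responds and that every daily file parses as CSV with the same columns.

```r
# Hypothetical helper: build the daily-history URL for one station and a vector of Dates
build_wu_url <- function(station, dates) {
  paste0(
    "https://www.wunderground.com/weatherstation/WXDailyHistory.asp?ID=", station,
    "&day=", as.integer(format(dates, "%d")),
    "&month=", as.integer(format(dates, "%m")),
    "&year=", format(dates, "%Y"),
    "&graphspan=day&format=1"
  )
}

# Hypothetical helper: download each day to its own file, skipping days that fail
download_days <- function(station, dates, folder) {
  urls  <- build_wu_url(station, dates)
  files <- file.path(folder, paste0("NewportWeather_", format(dates, "%Y-%m-%d"), ".csv"))
  for (i in seq_along(urls)) {
    tryCatch(
      download.file(url = urls[i], destfile = files[i], quiet = TRUE),
      error = function(e) message("Skipping ", urls[i], ": ", conditionMessage(e))
    )
  }
  invisible(files)
}

# Hypothetical helper: append all daily files that exist into a single CSV
# (assumes every file has the same columns)
combine_days <- function(files, out_file) {
  files <- files[file.exists(files)]
  if (length(files) == 0) return(invisible(NULL))
  daily <- lapply(files, read.csv, stringsAsFactors = FALSE)
  combined <- do.call(rbind, daily)
  write.csv(combined, out_file, row.names = FALSE)
  invisible(combined)
}

# Building the URLs needs no network access
dates <- seq(as.Date("2019-09-01"), as.Date("2019-09-03"), by = "day")
urls  <- build_wu_url("IISLEOFW4", dates)
```

To actually fetch the data, something like files <- download_days("IISLEOFW4", dates, "D:/WeatherData") followed by combine_days(files, "D:/WeatherData/NewportWeather_all.csv") would give both the separate files and the appended one. Naming files by date rather than by loop index keeps the names stable if the start date changes.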