I have the following problem:
For an analysis of weather effects on volunteers observing nature (animals, plants etc.) for a citizen science web page, I need to match the daily observations with the weather information of the nearest weather station. I'm using rdwd (for data of German weather service) and already managed to combine each observation location with the nearest weather station. So I now have a data frame (my_df_example) like this with 100 rows:
ID            Date        lat       long      Station_id  Stationname
1317186439    2019-05-03  47.77411  9.540569  4094        Weingarten, Kr. Ravensburg
-2117439060   2019-05-19  48.87217  9.396229  10510       Winterbach/Remstal
-630183789    2019-04-30  48.86810  9.285427  4928        Stuttgart (Schnarrenberg)
-390672435    2019-05-10  50.71187  8.706279  1639        Giessen/Wettenberg
262182713     2019-05-01  50.82548  8.892961  3164        Coelbe, Kr. Marburg-Biedenkopf
-373270631    2019-05-24  51.61666  7.950153  5480        Werl
with dput(my_df_example):
structure(list(ID = c(1317186439L, -2117439060L, -630183789L, -390672435L, 262182713L, -373270631L,...
Datum = structure(c(1556841600, 1558224000, 1556582400, 1557446400, 1556668800, 1558656000, 1558224000, 1557532800,..., class = c("POSIXct", "POSIXt"), tzone = "UTC"),
lat = c(47.7741093721703, 48.8721672952686, 48.8681024146134, 50.7118683229165, 50.8254843786222, 51.6166575725419, 48.7357007677785,...
long = c(9.54056899481679, 9.3962287902832, 9.28542673587799, 8.70627880096436, 8.89296054840088, 7.95015335083008, 11.3105964660645,...
Stations_id = c(4094L, 10510L, 4928L, 1639L, 3164L, 5480L, 3484L,...
Stationsname = c("Weingarten, Kr. Ravensburg", "Winterbach/Remstal", "Stuttgart (Schnarrenberg)", "Giessen/Wettenberg", "Coelbe, Kr. Marburg-Biedenkopf", "Werl",...
row.names = c("58501", "89910", "69539", "24379", "45331", "77191", "50028",
class = "data.frame")
What I need to do now is get the weather information for each station on that specific date. I'm trying to use the rdwd package in R to do so. I have tried two options so far, neither of which worked.
Option 1:
urls <- selectDWD(name=my_df_final$Stationsname, res="daily", var="kl", per="historical", outvec=TRUE)
kl <- dataDWD(urls[1:100])
That gives me a list of 100 lists. Each list of the 100 includes the weather data for every recorded day of a certain station. So I would need to filter the data from those lists so that the date matches the dates in my_df_example. I don't know how to extract info from a list inside a list though.
Option 2:
stat <- my_df_example$Stationname
link <- selectDWD(c(stat), res="daily", var="kl", per="hist")
file <- dataDWD(link, read=FALSE)
clim <- readDWD(file, varnames=TRUE)
The problem here is that dataDWD doesn't work for lists, and since "link" includes multiple station names, it is not just a vector.
I don't really know if one of these options is the right way at all or if an alternative would make more sense.
Thank you for any advice you can give.
Regarding your problem:

Once you have your list of lists (kl), you can subset this "meta"-list for the information you are looking for with a small function. In it, x represents the object kl passed to the function definition. The %in% operator, as its letters indicate, looks for the elements in common between the $MESS_DATUM and $Date variables and (&) also for the matches between STATIONS_ID and Station_id. which() ensures that no logical surprises occur while subsetting the data, and as.Date() returns a common date format for both data frames.

After performing the extraction, you have to collapse the information into a single data frame. Since all the columns in all the lists inside the meta-list are the same, you can use do.call() + rbind() directly. To avoid messy row names, reset them afterwards.
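A minimal sketch of these steps in base R, assuming kl is a list of data frames that each contain STATIONS_ID and MESS_DATUM columns. The toy kl and my_df_example objects (including the TMK values) are made up for illustration, and extract_matches is a hypothetical helper name, not part of rdwd.

```r
# Toy stand-ins (made-up values) for the objects described in the question
my_df_example <- data.frame(
  ID          = c(1L, 2L),
  Date        = as.POSIXct(c("2019-05-03", "2019-05-19"), tz = "UTC"),
  Station_id  = c(4094L, 10510L),
  Stationname = c("Weingarten, Kr. Ravensburg", "Winterbach/Remstal")
)
kl <- list(
  data.frame(STATIONS_ID = 4094L,
             MESS_DATUM  = as.Date(c("2019-05-02", "2019-05-03", "2019-05-19")),
             TMK         = c(10.1, 11.2, 14.0)),
  data.frame(STATIONS_ID = 10510L,
             MESS_DATUM  = as.Date(c("2019-05-03", "2019-05-19")),
             TMK         = c(9.5, 13.3))
)

# Hypothetical helper: x is the list of station data frames (kl),
# query is the data frame holding the observation dates and station IDs
extract_matches <- function(x, query) {
  lapply(x, function(station) {
    keep <- which(
      as.Date(station$MESS_DATUM) %in% as.Date(query$Date) &
        station$STATIONS_ID %in% query$Station_id
    )
    station[keep, , drop = FALSE]
  })
}

matches <- extract_matches(kl, my_df_example)

# Collapse the list into one data frame and reset the row names
weather <- do.call(rbind, matches)
rownames(weather) <- NULL
```

Because %in% only checks membership, each station keeps every date that occurs anywhere in the query, so this toy example yields four rows rather than two. Matching on station ID and date together in the merge step narrows the result down to exact pairs.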
Then, to see the station names in the query data set, merge the query with my_final_df:
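One way this merge could look, as a sketch with made-up toy data. It assumes the collapsed weather data frame is called weather (a hypothetical name) and joins on both station ID and date, so that only exact station/date pairs remain.

```r
# Toy stand-ins (made-up values); in practice these come from the steps above
my_df_example <- data.frame(
  ID          = c(1L, 2L),
  Date        = as.POSIXct(c("2019-05-03", "2019-05-19"), tz = "UTC"),
  Station_id  = c(4094L, 10510L),
  Stationname = c("Weingarten, Kr. Ravensburg", "Winterbach/Remstal")
)
weather <- data.frame(
  STATIONS_ID = c(4094L, 4094L, 10510L, 10510L),
  MESS_DATUM  = as.Date(c("2019-05-03", "2019-05-19",
                          "2019-05-03", "2019-05-19")),
  TMK         = c(11.2, 14.0, 9.5, 13.3)
)

# Bring both date columns to the same class before joining
query <- my_df_example
query$Date <- as.Date(query$Date)
weather$MESS_DATUM <- as.Date(weather$MESS_DATUM)

# Join on station ID and date together
result <- merge(query, weather,
                by.x = c("Station_id", "Date"),
                by.y = c("STATIONS_ID", "MESS_DATUM"))
```

Joining on both keys also drops rows where a station merely shares a date with another observation.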
The resulting data set matches the dates and the stations' IDs and names you first provided in my_df_example.

Given more time, maybe someone will tell us how to solve this with tidyverse notation, because I suspect the subsetting and extraction are even more straightforward with that package.