I have multiple dataframes with different numbers of rows. I would like to combine them into one dataframe, but before I do that I want to run several functions on each of them. My main problem is that I don't know how to apply multiple functions to multiple dataframes.
I have a list of the filenames. For the example dataframes the list is:
stationlist <- c("StationAlpha","StationBeta")
These are two example dataframes:
StationAlpha<-
structure(list(YEAR = c(1979L, 1979L, 1979L, 1979L, 1979L), MONTH = 8:12,
X1 = c(0, 5.6, 7.6, 0, 11), X2 = c(0, 2.6, 4.7, 0.4, 0),
X3 = c(0, 0, 0, 0, 0)), .Names = c("YEAR", "MONTH", "X1",
"X2", "X3"), row.names = c(NA, 5L), class = "data.frame") ### Example file 1
StationBeta<-
structure(list(YEAR = c(2001L, 2001L, 2001L, 2001L, 2001L), MONTH = 4:8,
X1 = c(2.5, 3.3, 18, 13.5, 0), X2 = c(0.9, 10, 0, 1.5, 0),
X3 = c(9, 0, 0, 7.5, 3.7)), .Names = c("YEAR", "MONTH", "X1",
"X2", "X3"), row.names = c(NA, 5L), class = "data.frame") ### Example file 2
Each dataframe looks like this:
YEAR|MONTH | X1 |X2 |X3 |etc.
...
2001| 4 |2.5 |0.9 |9.0 |...
2001| 5 |3.3 |10.0 |0.0 |...
2001| 6 |18.0 |0.0 |0.0 |...
2001| 7 |13.5 |1.5 |7.5 |...
2001| 8 |0.0 |0.0 |3.7 |...
....etc.
These are the functions that I want to execute on all the dataframes:
library(reshape2) # melt() with the variable.name argument comes from reshape2
test <- melt(StationAlpha, id = c("YEAR", "MONTH"), variable.name = "day") # combine day columns into one long column
test$day <- as.numeric(gsub("X", "", test$day)) # change days to numeric values
test$date <- as.Date(with(test, paste(YEAR, MONTH, day, sep = "-")), "%Y-%m-%d") # make date from year, month, day
# make a data frame with dates and day of year
Ex.dates <- seq(as.Date("1940/01/01"), as.Date("2015/12/31"), "days") # vector with dates
df.append <- data.frame(date = Ex.dates)
df.append$DayOfYear <- as.numeric(strftime(df.append$date, format = "%j")) # day of year as a number
# Merge both dataframes on column "date", keeping every date even when there is no measurement for it
df <- merge(df.append, test, by = "date", all = TRUE)
df$value <- ifelse(df$value < 0, -99, df$value) # replace negative values with -99
This is what I would like the combined dataframe to look like:
date |DayOfYear | YEAR |MONTH|day|StationAlpha|StationBeta|StationEtc
...
2001-05-01| 121 |2005 |5 |1 |5.6 |3.3 |5.1
2001-05-02| 122 |2005 |5 |2 |2.6 |10.0 |7.3
2001-05-03| 123 |2005 |5 |3 |0.0 |0.0 |0.0
2001-05-04| 124 |2005 |5 |4 |NA |NA |2.1
2001-05-05| 125 |2005 |5 |5 |NA |NA |0.0
....etc.
Could you show me, or give me a hint, how I could use a for loop on these example dataframes? Or lapply/sapply, if that gives the same result?
How to apply the functions to every dataframe:
1) List the dataframes
2) Use a for loop to execute each function on every dataframe in the list, and do the same for the other functions.
3) Combine all the dataframes in the list into one, while maintaining the columns with unique column names:
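The three steps above could be sketched roughly as follows. This is only one possible way to do it, not necessarily the intended one: it assumes `melt()` comes from the reshape2 package, the helper name `process_station` is made up for illustration, and `Map()` stands in for the for loop (a `for (name in names(stations))` loop that fills a list would give the same result). Each station's value column is named after the station, so the column names stay unique after merging.

```r
library(reshape2)  # assumed source of melt()

# Example data from the question, rebuilt with data.frame() for brevity
StationAlpha <- data.frame(YEAR = 1979L, MONTH = 8:12,
                           X1 = c(0, 5.6, 7.6, 0, 11),
                           X2 = c(0, 2.6, 4.7, 0.4, 0),
                           X3 = c(0, 0, 0, 0, 0))
StationBeta <- data.frame(YEAR = 2001L, MONTH = 4:8,
                          X1 = c(2.5, 3.3, 18, 13.5, 0),
                          X2 = c(0.9, 10, 0, 1.5, 0),
                          X3 = c(9, 0, 0, 7.5, 3.7))
stationlist <- c("StationAlpha", "StationBeta")

# 1) List the dataframes: mget() looks them up by name and returns a named list
stations <- mget(stationlist)

# 2) Wrap the per-station steps in one function (hypothetical helper)
process_station <- function(df, name) {
  # one row per day; the value column gets the station name
  long <- melt(df, id.vars = c("YEAR", "MONTH"),
               variable.name = "day", value.name = name)
  long$day <- as.numeric(gsub("X", "", long$day))
  long$date <- as.Date(paste(long$YEAR, long$MONTH, long$day, sep = "-"),
                       "%Y-%m-%d")
  long[[name]] <- ifelse(long[[name]] < 0, -99, long[[name]])
  # drop impossible dates (e.g. 31 September) and keep only date + value
  long[!is.na(long$date), c("date", name)]
}

# Apply the function to every station, passing each dataframe with its name
processed <- Map(process_station, stations, names(stations))

# 3) Merge all stations on "date", keeping dates that occur in any station
combined <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE),
                   processed)

# Calendar with every date and its day of year, as in the question
calendar <- data.frame(date = seq(as.Date("1940-01-01"),
                                  as.Date("2015-12-31"), by = "day"))
calendar$DayOfYear <- as.numeric(format(calendar$date, "%j"))
calendar$YEAR  <- as.integer(format(calendar$date, "%Y"))
calendar$MONTH <- as.integer(format(calendar$date, "%m"))
calendar$day   <- as.integer(format(calendar$date, "%d"))

# Keep every calendar date; stations without data for a date get NA
result <- merge(calendar, combined, by = "date", all.x = TRUE)
```

YEAR, MONTH and day are rebuilt from the date after merging rather than carried through each station, which avoids duplicated columns such as YEAR.x and YEAR.y in the combined dataframe.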