combine multiple arguments into loop for multiple dataframes


I have multiple dataframes with different numbers of rows. I would like to combine them into one dataframe, but before I do that I want to apply several functions to each of them in a loop. My main problem is that I don't know how to apply multiple functions to multiple dataframes.

I have a list of the file names. For the example dataframes, the list is:

stationlist <- c("StationAlpha","StationBeta")

These are two example dataframes:

StationAlpha<-
structure(list(YEAR = c(1979L, 1979L, 1979L, 1979L, 1979L), MONTH = 8:12, 
X1 = c(0, 5.6, 7.6, 0, 11), X2 = c(0, 2.6, 4.7, 0.4, 0), 
X3 = c(0, 0, 0, 0, 0)), .Names = c("YEAR", "MONTH", "X1", 
"X2", "X3"), row.names = c(NA, 5L), class = "data.frame") ### Example file 1

StationBeta<-
structure(list(YEAR = c(2001L, 2001L, 2001L, 2001L, 2001L), MONTH = 4:8, 
X1 = c(2.5, 3.3, 18, 13.5, 0), X2 = c(0.9, 10, 0, 1.5, 0), 
X3 = c(9, 0, 0, 7.5, 3.7)), .Names = c("YEAR", "MONTH", "X1", 
"X2", "X3"), row.names = c(NA, 5L), class = "data.frame") ### Example file 2

Looks like:

YEAR|MONTH | X1  |X2   |X3   |etc.
...
2001|   4  |2.5  |0.9  |9.0  |...
2001|   5  |3.3  |10.0 |0.0  |...
2001|   6  |18.0 |0.0  |0.0  |...
2001|   7  |13.5 |1.5  |7.5  |...
2001|   8  |0.0  |0.0  |3.7  |...
....etc.

These are the functions that I want to execute on all the dataframes:

library(reshape2) # provides melt()
test <- melt(StationAlpha, id.vars = c("YEAR", "MONTH"), variable.name = "day") # combine the day columns into one long column
test$day <- as.numeric(gsub("X", "", test$day)) # change days to numeric values
test$date <- as.Date(with(test, paste(YEAR, MONTH, day, sep = "-")), "%Y-%m-%d") # make date from year, month, day

# make a data frame with dates and day of year
Ex.dates <- seq(as.Date("1940/01/01"), as.Date("2015/12/31"), "days") # vector with dates
df.append <- data.frame(date = Ex.dates)
df.append$DayOfYear <- as.numeric(strftime(df.append$date, format = "%j"))

# Merge both dataframes on column "date", keeping all rows from both (dates without data get NA)
df <- merge(df.append, test, by = "date", all = TRUE)
df$value <- ifelse(df$value < 0, -99, df$value) # replace negative values with -99

This is what I would like the dataframes combined into one to look like:

date      |DayOfYear | YEAR |MONTH|day|StationAlpha|StationBeta|StationEtc
...
2001-05-01|   121    |2005  |5    |1  |5.6         |3.3        |5.1
2001-05-02|   122    |2005  |5    |2  |2.6         |10.0       |7.3
2001-05-03|   123    |2005  |5    |3  |0.0         |0.0        |0.0
2001-05-04|   124    |2005  |5    |4  |NA          |NA         |2.1
2001-05-05|   125    |2005  |5    |5  |NA          |NA         |0.0
....etc.

Could you show me, or give me a hint, how to use a for loop on these example dataframes? Or lapply/sapply, if that gives the same result?

There is 1 solution below

How to apply the functions to every dataframe:

1) List the dataframes

stationlist <- mget(ls(pattern = "X9"))
# This makes a list of all data frames whose names match "X9", which is characteristic of my data frame names. For the example, the following is sufficient:
stationlist <- list(StationAlpha, StationBeta)

2) Use a for loop to apply each function to every dataframe in the list

my.melt <- function(x){
  x <- melt(x, id.vars=c("YEAR","MONTH"), 
            variable.name="day")
  x
} # melt function in order to be able to melt the days of the month

for (i in seq_along(stationlist)){
  stationlist[[i]] <- my.melt(stationlist[[i]]) 
} # melt the days of the month

for (i in seq_along(stationlist)){
  stationlist[[i]]$day <- gsub("X","",stationlist[[i]]$day) 
} # remove the X from the string value, before turning it into a day number

And so on: use a for loop of the same form for each of the other functions.
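
Since the question also asks about lapply, here is a minimal sketch of the same idea with all per-dataframe steps collected in one function. It assumes melt() comes from the reshape2 package; the helper name prepare_station and the small example values are made up for illustration and are not part of the original code.

```r
library(reshape2)  # provides melt(); assumed to be installed

# Two small example data frames in the same shape as in the question
# (the values are illustrative only)
StationAlpha <- data.frame(YEAR = 1979L, MONTH = 8:9,
                           X1 = c(0, 5.6), X2 = c(0, 2.6))
StationBeta  <- data.frame(YEAR = 2001L, MONTH = 4:5,
                           X1 = c(2.5, 3.3), X2 = c(0.9, -1))
stationlist <- list(StationAlpha = StationAlpha, StationBeta = StationBeta)

# Hypothetical helper: every step for one station in a single function
prepare_station <- function(x) {
  x <- melt(x, id.vars = c("YEAR", "MONTH"), variable.name = "day")
  x$day <- as.numeric(gsub("X", "", x$day))       # "X1" -> 1
  x$date <- as.Date(paste(x$YEAR, x$MONTH, x$day, sep = "-"), "%Y-%m-%d")
  x$value <- ifelse(x$value < 0, -99, x$value)    # replace negative values with -99
  x
}

# lapply() calls the helper once per data frame and returns a list
prepared <- lapply(stationlist, prepare_station)
str(prepared$StationBeta)
```

Because the list is named, the result keeps the station names, which is convenient for the combining step later on.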

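3) Combine all the dataframes in the list into one, keeping the columns with unique column names. Note that merge() joins on every column name the two dataframes have in common, so to get one column per station, the value column of each dataframe needs a unique name (for example the station name) before this step: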

Combined_df<- Reduce(function(x, y) merge(x, y, all=TRUE),stationlist) # combine all into one dataframe
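
As a rough sketch of how the merge could produce one column per station, the "value" column can be renamed to the station name first and the merge restricted to the shared key. The two tiny data frames below are hypothetical stand-ins for already-melted stations, and merging on "date" alone is an assumption made for the example.

```r
# Two already-melted stations (hypothetical values), each with a "value" column
stationlist <- list(
  StationAlpha = data.frame(date = as.Date(c("2001-05-01", "2001-05-02")),
                            value = c(5.6, 2.6)),
  StationBeta  = data.frame(date = as.Date(c("2001-05-02", "2001-05-03")),
                            value = c(10, 0))
)

# Rename "value" to the station name so every data frame has a unique column
renamed <- Map(function(x, nm) {
  names(x)[names(x) == "value"] <- nm
  x
}, stationlist, names(stationlist))

# Merge on the shared key only; all = TRUE keeps dates missing from a station as NA
Combined_df <- Reduce(function(x, y) merge(x, y, by = "date", all = TRUE), renamed)
print(Combined_df)
```

If YEAR, MONTH, day and DayOfYear should also appear in the result, they could be included in the by argument, provided they are present in every data frame.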