How can ddply (and similar functions) work with multiple data frames?
For example, I have one dataframe with information about cars in a family
car <- data.frame(name=c('aaa','aaa','bbb'), cars=c('honda','chevy','datsun'))
and a second dataframe with family members
people <- data.frame(name=c('aaa','bbb','bbb'), age=c(25,18,33))
I would like to apply a function
neatfun <- function(car_chunk, people_chunk) {
  # analysis with age and type of cars
}
to the corresponding chunks of car and people, something along the lines of
analysis <- ddply( list(car,people), "name", neatfun)
where ddply would split the list of dataframes by name and then pass the corresponding chunks of each dataframe to the neatfun function.
At the moment, I'm willing to assume that every "name" appears in all data frames, so I don't have to worry about families with cars (but no people) or with people (but no cars).
Thanks
Without knowing exactly what you mean by 'some analysis', I can see a few ways to proceed. Start off by combining your data into a single dataframe.
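A minimal sketch of the combining step, using base R's merge on the shared name column. It relies on the question's assumption that every name appears in both data frames; the variable name df is just my choice for the joined result.

```r
# Data frames from the question
car <- data.frame(name = c("aaa", "aaa", "bbb"),
                  cars = c("honda", "chevy", "datsun"))
people <- data.frame(name = c("aaa", "bbb", "bbb"),
                     age = c(25, 18, 33))

# Join on the shared "name" column. Within a family, every car row is
# paired with every person row, so some values are repeated.
df <- merge(car, people, by = "name")
df
```

Keep in mind that the joined data frame has one row per car/person pair within a family, so counts should be taken over distinct values rather than raw rows.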
Then use dplyr operations to do analysis.
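As a rough illustration, since I don't know what the real analysis is, here is a per-family summary on the joined data frame. The summary columns n_cars and mean_age are made up for the example.

```r
library(dplyr)

# Data frames from the question, joined on "name"
car <- data.frame(name = c("aaa", "aaa", "bbb"),
                  cars = c("honda", "chevy", "datsun"))
people <- data.frame(name = c("aaa", "bbb", "bbb"),
                     age = c(25, 18, 33))
df <- merge(car, people, by = "name")

# One row per family. n_distinct() is used for the car count because
# the join repeats each car once per family member.
family_summary <- df %>%
  group_by(name) %>%
  summarise(n_cars = n_distinct(cars),
            mean_age = mean(age))
family_summary
```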
If you need something more complex, such as a function that takes a data frame, does some work on it, and returns a new data frame, you can write that function, split your data into a list, apply the function to each piece, and then either keep the results in a list or, if it makes sense, bind them back together.
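A sketch of that split/apply/combine pattern in base R. The function summarise_family is a hypothetical stand-in for whatever the real analysis does.

```r
# Data frames from the question, joined on "name"
car <- data.frame(name = c("aaa", "aaa", "bbb"),
                  cars = c("honda", "chevy", "datsun"))
people <- data.frame(name = c("aaa", "bbb", "bbb"),
                     age = c(25, 18, 33))
df <- merge(car, people, by = "name")

# Hypothetical per-family function: takes one family's rows and
# returns a one-row data frame
summarise_family <- function(chunk) {
  data.frame(name = chunk$name[1],
             n_cars = length(unique(chunk$cars)),
             oldest = max(chunk$age))
}

chunks <- split(df, df$name)                 # named list, one data frame per family
results <- lapply(chunks, summarise_family)  # apply the function to each piece
analysis <- do.call(rbind, results)          # bind the pieces back together
analysis
```

If the function really needs the two chunks kept separate, as neatfun does in the question, another option under the same every-name-in-both assumption is to split each data frame by name and walk the two lists in parallel, for example Map(neatfun, split(car, car$name), split(people, people$name)).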
This is assuming we have the same df from above, a joined dataframe. (The merge option listed above should do the same thing as a left_join in dplyr.)