I have this dataframe:
using DataFrames
d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])
and I want to do a groupby. In Python I would do:
d.groupby('class')[['num','last']].mean()
How can I do the same in Julia?
I have been trying to use combine and groupby, but with no success so far.
Update: I managed to do it this way:
using Statistics  # provides mean
gd = groupby(d, :class)
combine(gd, :num => mean, :last => mean)
Is there any better way to do it?
It depends what you mean by "a better way". You can apply the same function to multiple columns like this:
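A minimal sketch of the broadcasting form, assuming DataFrames.jl 1.x and the data frame from the question (mean comes from the Statistics standard library):

```julia
using DataFrames, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])
gd = groupby(d, :class)

# .=> broadcasts the pair constructor, so this expands to
# :num => mean, :last => mean
res = combine(gd, [:num, :last] .=> mean)
```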
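or, if you had a lot of columns and wanted to apply mean to all columns except the grouping column, you could select the columns by excluding the grouping one, or (if you want to avoid having to remember which column was the grouping one) ask the grouped data frame for its non-grouping columns.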
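Both selections might look like the sketch below. It assumes DataFrames.jl 1.x, where Not and valuecols are exported, and rebuilds the question's data so that it runs on its own:

```julia
using DataFrames, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])
gd = groupby(d, :class)

# Every column except the grouping column, selected by exclusion
res_not = combine(gd, names(d, Not(:class)) .=> mean)

# The same columns, obtained from the grouped data frame itself,
# so the grouping column does not have to be named again
res_valuecols = combine(gd, valuecols(gd) .=> mean)
```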
These are the basic schemas. The other issue is how to name your target columns. By default they get a name of the form "source_function", for example num_mean. You can keep the original column names instead (this is sometimes preferred) by passing renamecols=false to combine.
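To make the two naming behaviours concrete, here is a short sketch under the same assumptions (DataFrames.jl 1.x, data from the question); the variable names default_names and kept_names are only illustrative:

```julia
using DataFrames, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])
gd = groupby(d, :class)

# Default: target columns are named source_function
default_names = combine(gd, [:num, :last] .=> mean)

# renamecols=false keeps the source column names for the results
kept_names = combine(gd, [:num, :last] .=> mean, renamecols=false)
```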
Alternatively, you can pass identity as the function that generates the target column name:
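A sketch of that variant, assuming a DataFrames.jl version that accepts a function in the target-name position (1.x does):

```julia
using DataFrames, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])
gd = groupby(d, :class)

# source => function => naming function; identity returns the
# source column name unchanged, so the results stay num and last
res = combine(gd, [:num, :last] .=> mean .=> identity)
```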
The last example shows that the last part can be any function that takes the source column name as a string and returns the target column name, so you can do:
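For instance, with a hypothetical naming scheme that prefixes each source column name with mean_of_ (the prefix is purely illustrative):

```julia
using DataFrames, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])
gd = groupby(d, :class)

# The anonymous function receives the source column name as a string
# and returns the name to use for the result column
res = combine(gd, [:num, :last] .=> mean .=> (col -> "mean_of_" * col))
```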
Edit
Doing the operation in a single line:
You can do just:
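A sketch of the one-line form, nesting the groupby call directly inside combine (same assumptions as above):

```julia
using DataFrames, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])

# Group and aggregate in one expression, without an intermediate variable
res = combine(groupby(d, :class), [:num, :last] .=> mean)
```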
The benefit of storing groupby(d, :class) in a variable is that you perform the grouping once and can then reuse the resulting object many times, which speeds things up.
Also, if you use DataFramesMeta.jl you could write e.g.:
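One possible spelling, assuming a recent DataFramesMeta.jl release (0.10 or later) that provides the @chain and @combine macros; older releases used a different macro syntax, so treat this as a sketch rather than the exact code the answer had in mind:

```julia
using DataFrames, DataFramesMeta, Statistics  # Statistics provides mean

d = DataFrame(class=["A","A","A","B","C","D","D","D"],
              num=[10,20,30,40,20,20,13,12],
              last=[3,5,7,9,11,13,100,12])

# @chain passes the result of each step as the first argument of the next
res = @chain d begin
    groupby(:class)
    @combine(:num = mean(:num), :last = mean(:last))
end
```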
which is more typing, but this is a style that people coming from R tend to like.