Can DataFramesMeta handle multicolumn transformation or summarization at once?


I was wondering whether I can replicate the following in DataFramesMeta.

Here is how it is done in DataFrames, applying several functions to several columns at once.

using DataFrames, Statistics
df=DataFrame(
           a=string.("cat",rand(1:3,100)),
           x=rand(1:100, 100),
           y=rand(2:10:100, 100),
           z=rand(0.5:0.1:2, 100)
         )
combine(groupby(df,:a), [:x,:y,:z].=>[mean sum])

Is there an equivalent in DataFramesMeta? I didn't see any example with more than one column. In the documentation I found only the use of $ for replicating the (symbol => f => newname) logic, and the form :y=>f(:x), which I guess handles one column at a time.

Thanks.


There is 1 best solution below


For multicolumn manipulation you can use DataFrameMacros. If you would rather stay with DataFramesMeta, you can do it as follows:

using DataFramesMeta, Statistics
@by  df :a $([:x,:y,:z].=>[mean sum])
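
The $ simply passes the expression through to DataFrames, so the work is done by the broadcast itself. As a rough sketch of what that broadcast produces (the variable name `ops` is only illustrative and is not part of either package):

```julia
using Statistics  # provides `mean`; `sum` lives in Base

# Broadcasting a 3-element vector of column names against a 1×2 row of
# functions yields a 3×2 matrix of `column => function` pairs.
ops = [:x, :y, :z] .=> [mean sum]

size(ops)   # (3, 2)
ops[1, 1]   # the pair :x => mean
ops[3, 2]   # the pair :z => sum

# vec(ops) lists the pairs in column-major order:
# all the `mean` pairs first, then all the `sum` pairs.
flat = vec(ops)
```

Every column is paired with every function, which is why a single expression covers all six summaries.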

# with DataFrameMacros

using DataFrameMacros, Statistics
@combine(@groupby(df,:a)          
        , "mean_{}" =  mean({Not(:a)})
        ,  "sum_{}" =  sum({Not(:a)}))
    
#alternative
gdf =  @groupby(df, :a)
@combine( gdf, "mean_{}" =  mean({valuecols(gdf)})
             ,  "sum_{}" =  sum({valuecols(gdf)})
         )

#or (if you don't have @chain available)
gdf |> x->  @combine(x, "mean_{}" = mean({valuecols(x)})
                         , "sum_{}" =  sum({valuecols(x)})
                    )

DataFrameMacros is well suited to complex multicolumn manipulation. Inside the braces you can also use DataFrames column selectors such as All(), Not(), or valuecols(gdf) to pick the columns directly.