How to get distinct count in aggregate

Asked by janpeterka on 25 November 2019 at 14:29

I simply want a distinct-count aggregation.

I have this code:

data_frame = data_frame.group_by(:job_id)
                       .aggregate(job_id: :max, bid_id: :count)

I want something like this:

data_frame = data_frame.group_by(:job_id)
                       .aggregate(job_id: :max, bid_id: :distinct_count)

I know no statistical method like that is implemented yet. Is there any other way?



janpeterka On 25 November 2019 at 14:29 BEST ANSWER

I found one way to do this:

data_frame = data_frame.group_by(:job_id)
                       .aggregate(job_id: :max,
                                  bid_id: lambda { |x| x.uniq.size })

or maybe better yet:

data_frame = data_frame.group_by(:job_id)
                       .aggregate(job_id: :max,
                                  bid_id: ->(x) { x.uniq.size })

I am not sure if it is the right way, but it seems to work.

This pandas solution helped me.
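
To make it clearer what the lambda computes per group, here is a minimal sketch in plain Ruby, without Daru. The sample rows and the names `rows`, `distinct_count`, and `distinct_bids` are made up for illustration; in the Daru version the lambda receives the group's column values rather than a plain Array, so treat this only as an approximation of the same idea.

```ruby
# Hypothetical sample data; the column names mirror the question.
rows = [
  { job_id: 1, bid_id: 10 },
  { job_id: 1, bid_id: 10 }, # duplicate bid for job 1
  { job_id: 1, bid_id: 11 },
  { job_id: 2, bid_id: 20 }
]

# Same callable as in the answer: number of unique values in a group.
distinct_count = ->(values) { values.uniq.size }

# Group rows by job_id, then apply the lambda to each group's bid_id values.
distinct_bids = rows
  .group_by { |row| row[:job_id] }
  .transform_values { |group| distinct_count.call(group.map { |row| row[:bid_id] }) }

p distinct_bids
```

Job 1 has three rows but only two different bids, so a plain count and a distinct count differ for that group.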
