Stratified Sampling using R or Python


I have a dataset with 400K observations and 250 features, and I would like to perform stratified sampling on it.

I have looked at many links, but they all show examples that stratify on only one or two variables, including the target.

Can anybody please help me with how to perform stratified sampling in R or Python?

Thanks in advance!


There is 1 best solution below


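If you first group your data.frame by the stratification variable(s), you can sample from each group using dplyr's sample_n(). Here ID stands for the grouping column, and 10 rows are drawn per group: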

library(dplyr)
# Group by the stratification column (ID here), then draw 10 rows from each group.
sample.df <- df %>% group_by(ID) %>% sample_n(10)
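
Since the question also asks about Python, here is a minimal sketch of the same group-then-sample idea using only the standard library. It is not from the original answer. The function name stratified_sample, the toy rows, and the column names target and region are all hypothetical stand-ins. Stratifying on several variables just means using the combination of their values as the group key.

```python
import random

def stratified_sample(rows, strata_cols, frac, seed=0):
    """Draw roughly `frac` of the rows from every stratum.

    rows: list of dicts, one dict per observation
    strata_cols: column names whose combined values define a stratum
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible

    # Bucket rows by the tuple of values in the stratification columns.
    strata = {}
    for row in rows:
        key = tuple(row[c] for c in strata_cols)
        strata.setdefault(key, []).append(row)

    sample = []
    for group in strata.values():
        # Keep at least one row per stratum so small strata are not dropped.
        k = max(1, round(len(group) * frac))
        sample.extend(rng.sample(group, k))  # sampling without replacement
    return sample

# Hypothetical toy data: 'target' and 'region' stand in for real columns.
rows = [
    {"id": i, "target": i % 2, "region": "north" if i % 3 else "south"}
    for i in range(600)
]
sample = stratified_sample(rows, ["target", "region"], frac=0.1)
```

With a pandas DataFrame, the same effect should be achievable with df.groupby([...]).sample(frac=...), available in pandas 1.1 and later. Note that with 250 features you would normally stratify on a handful of key variables only, because every extra variable multiplies the number of strata and shrinks each one.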