How to remove bias from stratified samples


I am trying to stratify my dataset into 50 cohorts using about 10 metrics and then run experiments on those cohorts. The metrics have high variance, and as a result the cohorts differ significantly on every metric. What methods can I use to reduce this bias?

letdatado On

It depends heavily on the nature of your data, your objectives, and your domain expertise. Cohorts are usually balanced with feature engineering techniques: scaling, so that the metrics are on a comparable footing, and dimension reduction, so that you stratify on a few composite scores instead of 10 separate metrics. Clustering, stratification, sensitivity analysis, regularization, and weighting the metrics can be used as well. Beyond that, you can stratify on the metrics first and then manually adjust the cohort assignments so that the cohorts are more balanced and represent the population better.
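As a rough illustration of the scaling, dimension reduction, and stratification steps, here is a minimal sketch using only NumPy. It is one possible approach, not the only one. The function names `assign_cohorts` and `max_abs_smd` are made up for this example, and the choice to stratify on the first principal component alone is an assumption that may not suit your data.

```python
import numpy as np


def assign_cohorts(metrics, n_cohorts=50, seed=0):
    """Assign each row of `metrics` (units x metrics) to one of `n_cohorts`.

    Hypothetical helper: scale, reduce to one composite score, then
    randomize within blocks of similar units.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(metrics, dtype=float)

    # Scaling: z-score each metric so high-variance metrics do not dominate.
    std = X.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant metrics
    Z = (X - X.mean(axis=0)) / std

    # Dimension reduction: score each unit on the first principal component.
    _, _, vt = np.linalg.svd(Z, full_matrices=False)
    score = Z @ vt[0]

    # Stratification: sort by score, cut into blocks of `n_cohorts` similar
    # units, and spread each block randomly across the cohorts.
    order = np.argsort(score)
    cohorts = np.empty(len(X), dtype=int)
    for start in range(0, len(order), n_cohorts):
        block = order[start:start + n_cohorts]
        cohorts[block] = rng.permutation(n_cohorts)[:len(block)]
    return cohorts


def max_abs_smd(metrics, cohorts):
    """Largest gap between a cohort mean and the overall mean, in SD units."""
    X = np.asarray(metrics, dtype=float)
    std = X.std(axis=0)
    std[std == 0] = 1.0
    means = np.array([X[cohorts == c].mean(axis=0) for c in np.unique(cohorts)])
    return float(np.abs((means - X.mean(axis=0)) / std).max())
```

Because each block of similar units is spread across all cohorts, cohort sizes differ by at most one. The sketch only balances along the composite score, so it is worth checking every metric afterwards with something like `max_abs_smd` and adjusting the assignment (for example, re-drawing with another `seed`) if some metric is still badly imbalanced.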