How much data can sklearn handle with kernel density estimation

1.6k Views Asked by formath At 10 November 2014 at 11:51

I have a data set with 40 million lines (about 8Mb), where each line is a single float. I want to use sklearn kernel density estimation to fit this data set with a Gaussian kernel, but it's too slow on my PC (4GB RAM, 256GB SSD). So, can sklearn KDE handle a data set with a million or more samples?

Hugues Fontenelle On 10 November 2014 at 12:02 BEST ANSWER

Yes, scikit-learn can handle a lot of data. But as you found out, your machine may not be powerful enough for it. Alternatively, you may need to use the software more efficiently. Read "Strategies to scale computationally: bigger data" in the scikit-learn documentation.

Edit: "Density estimation for large dataset" on Cross Validated is quite relevant.
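
To make "use the software more efficiently" a bit more concrete, here is a minimal sketch of one common workaround: fit the KDE on a random subsample and allow a small relative error in the tree-based evaluation. This is not taken from the linked pages, just one option. The helper name fit_subsampled_kde, the subsample size, the bandwidth and the rtol value are all assumptions chosen for illustration, and the synthetic normal data stands in for the real file, so tune them for your own data.

```python
import numpy as np
from sklearn.neighbors import KernelDensity


def fit_subsampled_kde(values, n_fit=20_000, bandwidth=0.2, rtol=1e-4, seed=0):
    """Fit a Gaussian KDE on a random subsample of a 1-D array.

    Hypothetical helper: n_fit, bandwidth and rtol are illustrative
    defaults, not recommended values.
    """
    rng = np.random.default_rng(seed)
    values = np.asarray(values, dtype=float)
    n_fit = min(n_fit, values.size)
    # Draw the subsample without replacement so no point is counted twice.
    sample = rng.choice(values, size=n_fit, replace=False)
    # rtol > 0 lets the tree-based evaluation trade a little accuracy for speed.
    kde = KernelDensity(kernel="gaussian", bandwidth=bandwidth, rtol=rtol)
    kde.fit(sample.reshape(-1, 1))  # scikit-learn expects shape (n_samples, n_features)
    return kde


# Synthetic stand-in for the real data set (assumption: one float per line).
data = np.random.default_rng(42).normal(size=200_000)

kde = fit_subsampled_kde(data)

# Evaluate on a small grid instead of on every original sample.
grid = np.linspace(-4, 4, 200).reshape(-1, 1)
density = np.exp(kde.score_samples(grid))  # score_samples returns log-density
```

Evaluating on a grid matters as much as the subsampling: the cost of score_samples grows with both the number of fitted points and the number of query points, so scoring all 40 million samples against all 40 million samples is what makes the naive approach so slow.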
