Unable to run a selective pandas profiling for large dataset


I have a large dataset with 100 columns and 100,000 rows, and I'm trying to run a pandas profiling report on it. The resulting HTML file is very large (300 MB), and I can't open it in any browser.

So I tried minimal=True but that just provides Interactions.

Can I run a selective pandas profiling report to view only the Interactions report, or only the missing values report?

I tried this but ran into errors:

ProfileReport(df,variables=False,Interactions=True, Correlations=False, Missing_values=False, Sample=False)


There is 1 best solution below


Several improvements are possible by tuning the configuration:

  • Plotting interactions for 100 columns generates up to 100 x 100 = 10,000 plots, which accounts for much of the file size. You can narrow this down to the columns you're interested in by specifying a target (see the documentation).
  • ProfileReport(df,variables=False,Interactions=True, Correlations=False, Missing_values=False, Sample=False) is not the right syntax: argument names are lowercase, and a section is switched off with None rather than False (see this page).
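
As a rough sketch of the two points above, the helper below builds lowercase keyword arguments that set every unwanted section to None and, optionally, limits interactions to a few target columns. The section names (samples, correlations, missing_diagrams, duplicates, interactions) and the interactions "targets" setting are taken from the pandas-profiling configuration docs as I remember them, so treat them as assumptions and check them against your installed version. The helper names selective_kwargs and build_report are made up for illustration.

```python
# Section names assumed from the pandas-profiling configuration docs;
# verify them against the version you have installed.
OPTIONAL_SECTIONS = (
    "samples",
    "correlations",
    "missing_diagrams",
    "duplicates",
    "interactions",
)


def selective_kwargs(keep=(), interaction_targets=None):
    """Build ProfileReport keyword arguments (hypothetical helper).

    Every optional section not listed in `keep` is set to None, which is
    how a section is disabled. Sections in `keep` are left at their defaults.
    """
    unknown = set(keep) - set(OPTIONAL_SECTIONS)
    if unknown:
        raise ValueError(f"unknown section(s): {sorted(unknown)}")
    kwargs = {name: None for name in OPTIONAL_SECTIONS if name not in keep}
    if "interactions" in keep and interaction_targets:
        # Only plot interactions involving the listed columns,
        # instead of every pair of columns.
        kwargs["interactions"] = {"targets": list(interaction_targets)}
    return kwargs


def build_report(df, **kwargs):
    """Create the report (hypothetical wrapper).

    The import is deferred so selective_kwargs can be used and inspected
    even where the profiling package is not installed.
    """
    from pandas_profiling import ProfileReport  # newer releases: ydata_profiling

    return ProfileReport(df, **kwargs)
```

For an interactions-only report you would then call something like build_report(df, **selective_kwargs(keep=["interactions"], interaction_targets=["price"])).to_file("report.html"), where "price" stands in for one of your own column names. For a missing-values-only report, pass keep=["missing_diagrams"] instead.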