I have a dataframe with 11 columns and about 18k rows. The last column is either a 1 or 0, but when I use .describe() all I get is
count 19020
unique 2
top 1
freq 12332
Name: Class, dtype: int64
as opposed to an actual statistical analysis with mean, std, etc.
Is there a way to do this?
If your numeric (0, 1) column is not being picked up automatically by .describe(), it might be because it's not actually encoded as an int dtype. You can see this in the documentation of the .describe() method, which tells you that the default for the include parameter covers only numeric types.
My suggestion would be the following:
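A minimal sketch of these steps, assuming a dataframe df whose indicator column num was read in as strings. The sample data is made up for illustration, and the column and dataframe names are placeholders:

```python
import pandas as pd

# Made-up sample data: the 0/1 indicator column "num" is stored as strings
df = pd.DataFrame({
    "x": [0.5, 1.5, 2.5, 3.5],
    "num": ["1", "0", "1", "1"],
})

# Step 1: check how each column is actually encoded
print(df.dtypes)

# Step 2: cast the indicator column to an integer dtype if it is not one already
if not pd.api.types.is_integer_dtype(df["num"]):
    df["num"] = df["num"].astype(int)

# Step 3: describe() now reports mean, std, min, quartiles and max for the column
summary = df["num"].describe()
print(summary)
```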
That is, first check the datatypes (I'm assuming the column is called num and the dataframe df, but feel free to substitute with the right ones). If this indicator (0, 1) column is indeed not encoded as an int/integer type, then cast it as such by using .astype(int). Then you can freely use df.describe() and perhaps even specify which data types you want to include in the description output, for more fine-grained control.
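For the fine-grained control mentioned above, here is a short sketch of the include parameter. The dataframe and its column names (x, label, num) are made up for illustration:

```python
import numpy as np
import pandas as pd

# Made-up sample data with two numeric columns and one text column
df = pd.DataFrame({
    "x": [0.5, 1.5, 2.5, 3.5],
    "label": ["a", "b", "a", "a"],
    "num": [1, 0, 1, 1],
})

# Default: only the numeric columns are summarized
numeric_only = df.describe()

# The same selection, spelled out explicitly
numeric_explicit = df.describe(include=[np.number])

# Every column; rows that do not apply to a column are filled with NaN
everything = df.describe(include="all")

print(numeric_only)
print(everything)
```

With include="all", the text column gets count, unique, top and freq, while the numeric columns get the usual mean, std and quartiles.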