Convert data with LabelEncoder

I wrote this code to convert categorical features with LabelEncoder:

from numpy import ravel
from sklearn.preprocessing import LabelEncoder

# encode categorical columns as integer labels with LabelEncoder
cols = ['ToolType', 'TestType', 'BatteryType']
le = LabelEncoder()
for col in cols:
    data[col] = data[col].astype('|S')  # convert object to str type before applying the label encoder
    le.fit(ravel(data[col]))
    data[col] = le.transform(ravel(data[col]))

Those columns contain null values, and running the code raises this error:

TypeError: ufunc 'isnan' not supported for the input types, and the inputs could not be safely coerced to any supported types according to the casting rule ''safe''

Does anyone know how to fix this? Thank you.

Best answer

This line converts the column to NumPy bytes_ (byte strings), which the encoder does not support:

data[col] = data[col].astype('|S')

If you want to convert to string, change '|S' to str:

data[col] = data[col].astype(str)
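
Since the question mentions null values, here is a minimal sketch of the corrected loop on made-up data. The sample values and the "missing" placeholder are assumptions for illustration, not part of the original post; filling nulls explicitly before the cast keeps the result from depending on how astype(str) renders a missing value.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data; the real `data` frame is not shown in the question.
data = pd.DataFrame({
    "ToolType": ["drill", "saw", None, "drill"],
    "TestType": ["A", None, "B", "A"],
    "BatteryType": ["Li", "NiMH", "Li", None],
})
cols = ["ToolType", "TestType", "BatteryType"]

le = LabelEncoder()
for col in cols:
    # Replace nulls with an explicit placeholder (an assumption), then cast to str
    # so the encoder only ever sees plain Python strings.
    values = data[col].fillna("missing").astype(str)
    # fit_transform learns the sorted unique labels and maps each value to its index.
    data[col] = le.fit_transform(values)

print(data)
```

With this approach the placeholder becomes a category of its own, so rows with nulls get their own integer code rather than raising an error.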

As a side note, you can cut the loop down to one line with apply() and fit_transform:

data[cols] = data[cols].astype(str).apply(le.fit_transform)
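
One thing to keep in mind: a single LabelEncoder is refit on every column, so afterwards it only remembers the classes of the last column it saw. If you need to map the integers back to the original labels, a rough sketch is to keep one encoder per column. The `encoders` dict and the sample values below are hypothetical names and data chosen for illustration.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data for illustration only.
data = pd.DataFrame({
    "ToolType": ["drill", "saw", "drill"],
    "TestType": ["A", "B", "A"],
})
cols = ["ToolType", "TestType"]

# Keep a separate fitted encoder for each column so the mapping is not lost.
encoders = {}
for col in cols:
    encoders[col] = LabelEncoder()
    data[col] = encoders[col].fit_transform(data[col].astype(str))

# Recover the original labels of one column from its integer codes.
original = encoders["ToolType"].inverse_transform(data["ToolType"])
print(list(original))
```

This costs a few more lines than the one-liner, but each column's classes_ stays available for decoding later.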