Changing Column Type in Pandas DataFrame to int64


I am trying to change a column's data type from type: object to type: int64 within a DataFrame using .map().

   df['one'] = df['one'].map(convert_to_int_with_error)

Here is my function:

import numpy as np

def convert_to_int_with_error(x):
    # Treat empty strings, a single space, and None as missing values.
    if x in ['', None, ' ']:
        return None
    try:
        return np.int64(x)
    except ValueError as e:
        # Report the value that could not be parsed and mark it as missing.
        print(e)
        return None

This completes successfully. However, when I check the data type after completion, it reverts to type: float:

print("%s is a %s after converting" % ('one', df['one'].dtype))

Best answer:

I think the problem is that the values your function returns as None are converted to NaN. NaN is a float, so the integer values in the column are cast to float as well (see the docs).
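
A minimal sketch of that upcast, assuming a made-up Series (the sample values and the helper name to_int_or_none are only for illustration, not the asker's data):

```python
import numpy as np
import pandas as pd

# Hypothetical sample column: strings, one of which is empty.
s = pd.Series(['1', '2', '', '4'], dtype=object)

def to_int_or_none(x):
    """Return np.int64(x), or None when x cannot be parsed."""
    try:
        return np.int64(x)
    except ValueError:
        return None

mapped = s.map(to_int_or_none)

# The None becomes NaN, and the whole column is stored as float.
print(mapped.dtype)
print(mapped.isna().sum())
```

Every value that parsed is still numerically whole, but the column can no longer be a plain int64 because that dtype has no way to represent a missing value.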

Instead of map, you can use to_numeric with the parameter errors='coerce' to convert problematic values to NaN:

df['one'] = pd.to_numeric(df['one'], errors='coerce')
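
Note that the result is still float64 whenever at least one value is coerced to NaN. If an integer dtype is really needed, two possible options are the nullable Int64 extension dtype (capital I) or filling the NaNs before casting. A rough sketch, assuming a made-up column (the sample values and the fill value 0 are assumptions for illustration):

```python
import pandas as pd

# Hypothetical sample data: an empty string and a non-numeric value.
df = pd.DataFrame({'one': ['1', '2', '', 'abc', '5']}, dtype=object)

# Unparseable entries become NaN, so the dtype is float64.
converted = pd.to_numeric(df['one'], errors='coerce')
print(converted.dtype)

# Option 1: nullable integer dtype, missing values are kept as <NA>.
nullable = converted.astype('Int64')
print(nullable.dtype)

# Option 2: replace missing values with a placeholder, then cast to int64.
filled = converted.fillna(0).astype('int64')
print(filled.dtype)
```

Which option fits depends on whether a placeholder such as 0 could be confused with real data; the nullable dtype keeps the missing values distinguishable.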