Pandas - Change values in output of dataframe.applymap(type)

294 Views Asked by At

I am writing a code to compare the datatype of each value in an excel spreadsheet to a database definition spreadsheet which lists the required field/datatypes.

I am using dataframe.applymap(type) to check all of the values in the excel sheet which holds my data.

data_types = location_df.applymap(type)

The output of the above block is this:

  Location_ID    Location_Name      AltName          X_UTM            Y_UTM           Type_Code            QA_QC         Comments
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>
  <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>

I want to change the values in the output dataframe so that the value matches an analogous string value found in my database definition file. For example, I want to change all instances of <class 'str'> to 'TEXT'.

I've been trying to use pandas.replace to do the job, but it's not giving the desired results:

for col in data_types:
     data_types[col] = data_types[col].replace("<class 'str'>", "TEXT", regex=True)

The print output does not have a changed value:

      Location_ID    Location_Name      AltName          X_UTM            Y_UTM           Type_Code            QA_QC         Comments
     <class 'int'>  <class 'str'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>  <class 'float'>    <class 'str'>

I've been able to use the above .replace method to change values in a dataframe when the values are not in the <class*> format. Is anyone able to explain why the .replace method does not work in this case, as well as the 'correct' method to manipulate the values in the data_types dataframe?

0

There are 0 best solutions below