appending new columns to python dataframe


Newer Anaconda installations of Python 3.11 fail with the traceback below when I try to update an existing dataframe using df.loc with a new key (i.e., when I try to add new columns to existing rows in my df):

Traceback (most recent call last):
  File "my.py", line 293, in <module>
    ...
  File "utils.py", line 961, in update_dataframe_with_new_data
    my_df.loc[my_df['Name'] == x1, 'newName']  = x1
                   ~~~~~~~~~~^^^^^^^^^^^^
  File "/apps/anaconda/2024.02/lib/python3.11/site-packages/pandas/core/frame.py", line 3893, in __getitem__
    indexer = self.columns.get_loc(key)
              ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/apps/anaconda/2024.02/lib/python3.11/site-packages/pandas/core/indexes/range.py", line 418, in get_loc
    raise KeyError(key)
KeyError: 'newName'

I am using Python 3.11 from Anaconda.

I do not manage this installation.

I found that the 2024.02 version of Anaconda, with Python 3.11, exhibits this issue.

If I fall back to the 2023.07 version of Anaconda, Python 3.11 does not exhibit this issue.

So I suspect that I was relying on an unintentional feature that has since been deprecated (but is not being caught and reported as such).

I would like to find a better way to update a dataframe with new columns (keys). I do not necessarily know a priori what these new columns will be, so I need to handle this dynamically rather than defining the additional columns when I first initialize the df.

Updates so far:

  • Initializing the column beforehand does not help: my_df['newName'] = np.nan
  • The newer Anaconda installation in my environment uses pandas 2.1.4, whereas the older 2023.07 installation uses pandas 1.5.3.
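
For reference, here is a minimal sketch of the pattern I am trying to get working, written against a small made-up frame rather than my real data. The helper name set_where, its parameters, and the sample columns are all hypothetical. As far as I understand, assigning through .loc with a column label that does not exist yet normally creates that column, so the sketch also checks that the column used in the filter is actually present. The traceback above goes through __getitem__ and indexes/range.py, which may mean the frame only has default integer column labels at that point, but that is a guess on my part.

```python
import numpy as np
import pandas as pd


def set_where(df, match_col, match_value, new_col, new_value):
    """Set new_col to new_value on rows where match_col == match_value.

    Hypothetical helper for illustration; not taken from my utils.py.
    """
    # Fail early with a readable message if the filter column is missing,
    # e.g. because the frame only has default integer column labels.
    if match_col not in df.columns:
        raise KeyError(
            f"filter column {match_col!r} not found; "
            f"columns are {list(df.columns)}"
        )

    # Create the target column on demand. An object dtype is used so that
    # strings can be stored later without a dtype-mismatch warning.
    if new_col not in df.columns:
        df[new_col] = pd.Series(np.nan, index=df.index, dtype="object")

    # Boolean-mask assignment through .loc on the now-existing column.
    df.loc[df[match_col] == match_value, new_col] = new_value
    return df


# Made-up sample data.
df = pd.DataFrame({"Name": ["a", "b", "c"], "Value": [1, 2, 3]})
set_where(df, "Name", "b", "newName", "b")
print(df)
```

Because the column is created only when it is missing, the new column names do not need to be known when the frame is first built.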