Why does pandas.apply(id, axis=1) return the same id for all rows?


I'm facing a problem I cannot understand. Consider this scenario:

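import pandas as pd

df_mock = pd.DataFrame({'v': [[1,2,3],[4,5,6],[7,8,9]]})

class O:
  def __init__(self, row):
    # Stores a reference to the row object passed in by apply(), not a copy
    self.row = row

  def calc(self):
    self.v = self.row.v

# Build one O per row, then read the row's value back through each object
df_mock['obj'] = df_mock.apply(lambda row: O(row), axis=1)
df_mock['obj'].apply(lambda o: o.calc())
print(df_mock['obj'].apply(lambda o: o.v))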

When I run this, I get:

0    [7, 8, 9]
1    [7, 8, 9]
2    [7, 8, 9]
Name: obj, dtype: object

I expected each object to keep a reference to its own row in O.row. Instead, after the apply, every object ends up holding the last row.

Why does this happen? Does pandas' apply(axis=1) reuse a single row object and pass that same reference to every call?
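To make the suspicion concrete, here is a minimal pure-Python sketch of the pattern being asked about. It is an analogy only, not pandas code: `Row` and `reused_rows` are hypothetical names, and the sketch assumes a producer that overwrites one object in place instead of creating a new one per row.

```python
# Hypothetical stand-in for a row object; this is NOT a pandas class.
class Row:
    def __init__(self):
        self.v = None


def reused_rows(data):
    """Yield ONE Row object repeatedly, overwriting its contents each time."""
    row = Row()
    for values in data:
        row.v = values  # mutate in place instead of creating a new Row
        yield row


data = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Storing the yielded object keeps a reference to the same Row every time.
kept = [row for row in reused_rows(data)]
same_ids = len({id(row) for row in kept}) == 1
last_values = [row.v for row in kept]

print(same_ids)     # True
print(last_values)  # [[7, 8, 9], [7, 8, 9], [7, 8, 9]]
```

If apply(axis=1) works along these lines, every stored reference would point at one object whose contents are those of the last row, which matches the output above.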

The effect is easier to see if you just run:

df_mock.apply(id, axis=1)

It outputs the same id for every row:

0    139938239801360
1    139938239801360
2    139938239801360
dtype: int64
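A possible workaround, sketched under the assumption that apply(axis=1) may hand the same Series object to every call (which the identical ids suggest but do not prove), is to store a copy of the row instead of the row itself. `Series.copy()` is a standard pandas method; the rest mirrors the code from the question.

```python
import pandas as pd

df_mock = pd.DataFrame({'v': [[1, 2, 3], [4, 5, 6], [7, 8, 9]]})


class O:
    def __init__(self, row):
        # Assumption: `row` may be a Series that apply() reuses between calls,
        # so keep an independent copy rather than a reference to it.
        self.row = row.copy()

    def calc(self):
        self.v = self.row.v


objs = df_mock.apply(lambda row: O(row), axis=1)
objs.apply(lambda o: o.calc())

result = objs.apply(lambda o: o.v).tolist()
print(result)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
```

With the copy in place, each object reports the values of its own row regardless of whether the row object is reused. Note that identical `id` values on their own are weak evidence, because Python can also give the same id to different objects whose lifetimes do not overlap.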