I have a DataFrame where one column contains lists as cell contents, something like the following:
import pandas as pd
df = pd.DataFrame({
    'col_lists': [[1, 2, 3], [5]],
    'col_normal': [8, 9]
})
>>> df
   col_lists  col_normal
0  [1, 2, 3]           8
1        [5]           9
I would like to apply a transformation to each element of col_lists, for example:
df['col_lists'] = df.apply(
    lambda row: [None if (element % 2 == 0) else element for element in row['col_lists']],
    axis=1
)
>>> df
      col_lists  col_normal
0  [1, None, 3]           8
1           [5]           9
With this DataFrame it works as I expect. However, when I apply the same code to another DataFrame I get a bizarre result, where for each row the column contains only the first element of the list:
df2 = pd.DataFrame({
    'col_lists': [[1, 2], [5]],  # length of first list is smaller here
    'col_normal': [8, 9]
})
df2['col_lists'] = df2.apply(
    lambda row: [None if (element % 2 == 0) else element for element in row['col_lists']],
    axis=1
)
>>> df2
   col_lists  col_normal
0        1.0           8
1        5.0           9
I have two questions:
(1) What is going on here? Why am I getting a correct result for df, but not for df2?
(2) How can I correctly apply a transformation to lists within a DataFrame?
First, I think working with lists in pandas is not a good idea. But if you really need it, try upgrading pandas, because for me it works fine in pandas 0.23.4:
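If upgrading is not an option, one possible workaround is to call apply on the column itself (Series.apply) instead of on the whole frame with axis=1. The sketch below assumes the transformation only needs the values in col_lists; the helper name drop_even is hypothetical and not from the question. Because the function is applied per cell rather than per row, pandas has no reason to expand the returned lists into columns, which appears to be what goes wrong in df2, where the first list happens to have as many elements as the frame has columns.

```python
import pandas as pd

df2 = pd.DataFrame({
    'col_lists': [[1, 2], [5]],
    'col_normal': [8, 9],
})


def drop_even(values):
    """Return a copy of the list with even elements replaced by None."""
    return [None if v % 2 == 0 else v for v in values]


# Series.apply calls drop_even once per cell and stores each returned
# list as a single object in the resulting column.
df2['col_lists'] = df2['col_lists'].apply(drop_even)
print(df2)
```

This keeps the lists intact whatever their length, and col_normal is left untouched.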