Very weird performance behavior when combining pandas isin() with boolean indexing, measured with line_profiler


As seen in the line_profiler output below, I am doing an isin() call which takes only 11126 (in line_profiler's timer units) to complete. Then I do boolean indexing with the result of that isin() call, but suddenly the time needed to complete that step is ~17x higher at 187088.

      Hits       Time   Per Hit  % Time  Line Contents
         2    11126.0    5563.0     0.5  randomness = ~dataframe.certificate_status.isin(
         1        4.0       4.0     0.0      [
                                                 "tamagotchi",
                                                 "nintendo",
                                                 "megaman",
                                                 "mic_check",
                                                 "onetwothree",
                                                 "test",
                                                 "else",
                                                 "something",
                                             ]
                                         )

         1   187088.0  187088.0     8.9  dataframe = dataframe.loc[randomness]

I actually expected boolean indexing to be faster than isin(). Can someone explain why I am getting the results seen here?
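For anyone who wants to reproduce the comparison outside line_profiler, here is a minimal sketch. The real data is not shown here, so the row count, the 20 extra float columns, and the "valid"/"expired" status values are all made-up assumptions; only the column name certificate_status and the exclusion list come from the code above. It times the two steps separately and prints what each one produces.

```python
import time

import numpy as np
import pandas as pd

# Hypothetical stand-in for the real frame: sizes and extra columns are assumptions.
rng = np.random.default_rng(0)
n_rows, n_extra_cols = 100_000, 20

excluded = [
    "tamagotchi", "nintendo", "megaman", "mic_check",
    "onetwothree", "test", "else", "something",
]
# "valid" and "expired" are invented values so that some rows survive the filter.
statuses = excluded + ["valid", "expired"]

dataframe = pd.DataFrame(
    rng.random((n_rows, n_extra_cols)),
    columns=[f"col_{i}" for i in range(n_extra_cols)],
)
dataframe["certificate_status"] = rng.choice(statuses, size=n_rows)

# Step 1: build the boolean mask (one value per row, from a single column).
start = time.perf_counter()
randomness = ~dataframe["certificate_status"].isin(excluded)
isin_seconds = time.perf_counter() - start

# Step 2: boolean indexing (builds a new frame with every column of the kept rows).
start = time.perf_counter()
filtered = dataframe.loc[randomness]
loc_seconds = time.perf_counter() - start

print(f"isin: {isin_seconds:.6f}s -> boolean Series of length {len(randomness)}")
print(f"loc:  {loc_seconds:.6f}s -> {filtered.shape[0]} rows x {filtered.shape[1]} columns")
```

The timings will differ by machine and by data, so the sketch prints them rather than claiming a ratio. One thing it does make visible is that the isin() step only touches one column, while the .loc step has to materialize all columns for the selected rows, so varying n_extra_cols may be a useful experiment.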
