Apply scikit-learn murmurhash3_32 on a Pandas dataframe


I am trying to apply MurmurHash to a pandas DataFrame. I wanted to use scikit-learn's murmurhash3_32 (any other simple suggestion would be appreciated). I tried:

import pandas as pd
from sklearn.utils.murmurhash import murmurhash3_32

df = pd.DataFrame({'a': [100, 1000], 'b': [200, 2000]}, dtype='int32')
df.apply(murmurhash3_32)

But I get

TypeError: ("key 0 100\n1 1000\nName: a, dtype: int32 with type class 'pandas.core.series.Series' is not supported. Explicit conversion to bytes is required", 'occurred at index a')

But scikit-learn is supposed to handle int32: https://scikit-learn.org/dev/modules/generated/sklearn.utils.murmurhash3_32.html#sklearn.utils.murmurhash3_32

Any ideas or recommendations?


There is 1 solution below


Stupid mistake, not sure if I should delete my question:

apply passes each column to the function as a whole Series, and murmurhash3_32 does not accept a Series.

Using applymap works as expected, because it passes each element to the function individually.
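To make the difference concrete, here is a minimal sketch reusing the DataFrame from the question. The helper name hash_element is only illustrative, and the hasattr check is an assumption about the installed pandas version: recent pandas releases (2.1 and later) offer DataFrame.map as the element-wise method and deprecate applymap, so the sketch uses whichever one is available.

```python
import pandas as pd
from sklearn.utils import murmurhash3_32

df = pd.DataFrame({'a': [100, 1000], 'b': [200, 2000]}, dtype='int32')

# apply hands each column to the function as one Series object,
# which is why murmurhash3_32 rejected it in the question.
seen = df.apply(lambda col: type(col).__name__)


def hash_element(value):
    # Illustrative helper: convert to a plain Python int before hashing,
    # so the call does not depend on how pandas boxes each element.
    return murmurhash3_32(int(value))


# Element-wise method: DataFrame.map on newer pandas, applymap on older ones.
elementwise = df.map if hasattr(df, "map") else df.applymap
hashed = elementwise(hash_element)

print(seen)
print(hashed)
```

The result has the same shape, index and columns as df, with every cell replaced by its 32-bit hash (signed by default; murmurhash3_32 also takes positive=True for unsigned values).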