I'm working to perform a rolling time-aware count within a grouped subset of a data frame, which would dynamically create new columns for each ActivityType an and ContactId as of a certain date.
I believe I'm using the correct syntax when calling the groupby and rolling methods, but I'm still receiving an TypeError: incompatible index of inserted column with frame index - which I'm not sure why.
Your time and help is greatly appreciated.
Expected output:
ContactId ActivityType Date Call_90D Meeting_90D Conference_90D Webinar_90D
David Call 01/01/2015 1 0 0 0
David Meeting 01/17/2015 1 1 0 0
David Call 02/12/2015 2 1 0 0
Ryan Conference 01/01/2015 0 0 1 0
Ryan Meeting 01/17/2015 0 1 1 0
Ryan Webinar 02/12/2015 0 1 1 1
Louis Call 03/29/2015 1 0 0 0
Louis Call 03/30/2015 2 0 0 0
Louis Webinar 09/01/2015 0 0 0 1
Bryan Call 01/12/2015 1 0 0 0
Bryan Call 01/17/2015 2 0 0 0
Bryan Call 02/01/2015 3 0 0 0
code
import pandas as pd
data = [['David', 'Call', '1/1/2015'], ['David', 'Meeting', '1/17/2015'], ['David', 'Call', '2/12/2015'],
['Ryan', 'Conference', '1/1/2015'], ['Ryan', 'Meeting', '1/17/2015'], ['Ryan', 'Webinar', '2/12/2015'],
['Louis', 'Call', '3/29/2015'], ['Louis', 'Call', '3/30/2015'], ['Louis', 'Webinar', '9/1/2015'],
['Bryan', 'Call', '1/12/2015'], ['Bryan', 'Call', '1/17/2015'], ['Bryan', 'Call', '2/1/2015']]
df = pd.DataFrame(data, columns=['ContactId', 'ActivityType', 'Date'])
df.Date = pd.to_datetime(df.Date)
activities = df['ActivityType'].values.tolist()
for a in activities:
df[a] = df['ActivityType'].apply(lambda x: 1 if x == a else 0)
new_col_name = '%s_90D' % a
df[new_col_name] = df.groupby(by='ContactId', as_index=False, group_keys=False).rolling(window='90D', on='Date')[a].sum()
del df[a]
print(df)
Error:
line 20, in <module>
df[new_col_name] = df.groupby(by='ContactId', as_index=False, group_keys=False).rolling(window='90D', on='Date')[a].sum()
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2331, in __setitem__
self._set_item(key, value)
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2397, in _set_item
value = self._sanitize_column(key, value)
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2547, in _sanitize_column
value = reindexer(value)
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2542, in reindexer
raise TypeError('incompatible index of inserted column '
TypeError: incompatible index of inserted column with frame index