None of the solutions in the existing KeyError posts addressed my problem, hence this question.
I have the following column in a Pandas DataFrame:
df['EventDate']
0 26-12-2016
1 23-12-2016
2 16-12-2016
3 15-12-2016
4 11-12-2016
5 10-12-2016
6 07-12-2016
Now I am trying to split the date and extract the four-digit year into another Series, using the command below:
trial=df["EventDate"].str.split("-",2,expand=True)
Using the third column of the result (label 2), I am able to get all the year values:
df.year=trial[2]
Checking the data type of the year column now:
type(df.year)
Out[80]: pandas.core.series.Series
Yes, it is a Pandas Series, assigned to df.year from trial[2]:
print(trial[2])
0 2016
1 2016
2 2016
3 2016
4 2016
Now I am trying to group by the year column, and that is where I get the error:
yearwise=df.groupby('year')
Traceback (most recent call last):
  File "<ipython-input-81-cf39b80933c4>", line 1, in <module>
    yearwise=df.groupby('year')
  File "C:\WINPYTH\python-3.5.4.amd64\lib\site-packages\pandas\core\generic.py", line 4416, in groupby
    **kwargs)
  File "C:\WINPYTH\python-3.5.4.amd64\lib\site-packages\pandas\core\groupby.py", line 1699, in groupby
    return klass(obj, by, **kwds)
  File "C:\WINPYTH\python-3.5.4.amd64\lib\site-packages\pandas\core\groupby.py", line 392, in __init__
    mutated=self.mutated)
  File "C:\WINPYTH\python-3.5.4.amd64\lib\site-packages\pandas\core\groupby.py", line 2690, in _get_grouper
    raise KeyError(gpr)
KeyError: 'year'
Can you please help me resolve this KeyError and group by the year column?
A THOUSAND thanks in advance for your answers.
The fundamental misunderstanding here is that you think doing df.year = trial[2] creates a column called year in df, but this is not true! Check df.columns after the assignment and year will not be listed.
So, what is df.year? It is an attribute of df, which is not the same as a column. In Python, you can assign attributes using the dot notation, so this works without throwing errors. You can confirm by printing out df.__dict__, where year shows up as a key.
If you want to actually assign to a column, you'll need to use the [...] indexing syntax: df['year'] = trial[2]. After that, df.groupby('year') can find the column.
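To make the difference concrete, here is a minimal sketch that rebuilds the EventDate column from the question and walks through both assignments. The variable names cols_before and attr_in_dict are only illustrative, and the warning filter is an assumption about recent pandas versions, which emit a UserWarning when a new attribute is set this way; older versions stay silent.

```python
import warnings

import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"EventDate": ["26-12-2016", "23-12-2016", "16-12-2016",
                                 "15-12-2016", "11-12-2016", "10-12-2016",
                                 "07-12-2016"]})

# Split "dd-mm-yyyy" into three string columns labelled 0, 1 and 2.
trial = df["EventDate"].str.split("-", n=2, expand=True)

# Dot assignment sets a plain Python attribute, not a column.
# The filter only hides the warning that newer pandas versions raise here.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    df.year = trial[2]

cols_before = list(df.columns)       # still only the original column
attr_in_dict = "year" in df.__dict__  # the Series lives on the instance
print(cols_before)
print(attr_in_dict)

# groupby looks at column labels, so it cannot find 'year' yet.
try:
    df.groupby("year")
except KeyError as exc:
    print("KeyError:", exc)

# Bracket assignment creates a real column.
df["year"] = trial[2]

yearwise = df.groupby("year")
print(yearwise.size())
```

Note that the year values are still strings at this point, because str.split returns strings. If numeric years are needed, something like trial[2].astype(int) should work before assigning the column.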