Plot line chart by grouping columns in dataframe

I have a CSV file whose data I grouped by month, and I then used cumsum to calculate the running total for each month in a dataframe.

Using this code:

df = df.sort_index(sort_remaining=True).sort_values('months')
df['value'] = df.groupby('months')['value'].cumsum()
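For readers who want to reproduce this step, here is a minimal sketch on made-up data. The sample values are an assumption, not the asker's CSV, and `kind='mergesort'` is added here because the default sort algorithm of `sort_values` is not guaranteed to be stable, so rows could otherwise be reordered within a month before the running total is taken.

```python
import pandas as pd

# Hypothetical sample data; the real CSV has thousands of rows.
df = pd.DataFrame({
    'months': [2, 1, 1, 2, 1],
    'value':  [2, 50, 30, 1, 40],
})

# Sort by index first, then by month with a stable algorithm,
# so rows keep their index order inside each month.
df = df.sort_index().sort_values('months', kind='mergesort')

# Running total that restarts for every month.
df['value'] = df.groupby('months')['value'].cumsum()

print(df)
```

Each month now carries its own cumulative series, which is the shape the plotting step starts from.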

Example output, shown here in Excel; my DataFrame looks the same but has thousands of rows:

[image: output example]

I would now like to plot a chart grouped by month, with one line per month, so that I end up with 12 lines showing how the value moved higher or lower over time.

The output plot should look like the following chart, showing the cumsum of each month:

[image: chart showing cumsum of each month]

Thanks to @jezrael it is now working. Below is the plot:

[image: working output]

Best answer

I believe you need pivot, with rename to show month names instead of numbers, and cumcount to create the new index values:

d = {1: 'Jan', 2: 'Feb', 3: 'Mar', 4: 'Apr', 5: 'May', 6: 'Jun',
     7: 'Jul', 8: 'Aug', 9: 'Sep', 10: 'Oct', 11: 'Nov', 12: 'Dec'}

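# cumcount numbers the rows within each month (0, 1, 2, ...); it becomes the new index
g = df.groupby('months').cumcount()
# current pandas pivots on column names, so g is added as a helper column ('row' is an arbitrary name)
df.assign(row=g).pivot(index='row', columns='months', values='value').rename_axis(None).rename(columns=d).plot()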

Detail:

print(df.assign(row=g).pivot(index='row', columns='months', values='value').rename_axis(None).rename(columns=d))
months    Jan   Feb   Mar   Apr
0        50.0   2.0  10.0   5.0
1        80.0   3.0  16.0  20.0
2       120.0   8.0  31.0  40.0
3       140.0  11.0  34.0  50.0
4         NaN  15.0  43.0  75.0
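To try the reshaping step on its own, the sketch below wraps it in a small function and runs it on made-up data. The function name `to_wide`, the helper column `pos`, and the sample values are assumptions for illustration; it uses `DataFrame.pivot` with column names, which is the form current pandas versions accept.

```python
import pandas as pd

MONTH_NAMES = {1: 'Jan', 2: 'Feb', 3: 'Mar', 4: 'Apr', 5: 'May', 6: 'Jun',
               7: 'Jul', 8: 'Aug', 9: 'Sep', 10: 'Oct', 11: 'Nov', 12: 'Dec'}


def to_wide(df):
    """Return one column per month, indexed by row position within the month."""
    # Position of each row inside its month: 0, 1, 2, ...
    pos = df.groupby('months').cumcount()
    return (df.assign(pos=pos)
              .pivot(index='pos', columns='months', values='value')
              .rename_axis(None)              # drop the helper index name
              .rename(columns=MONTH_NAMES))   # 1 -> 'Jan', 2 -> 'Feb', ...


# Hypothetical, already cumulated values for two months of unequal length.
sample = pd.DataFrame({
    'months': [1, 1, 1, 2, 2],
    'value':  [50, 80, 120, 2, 3],
})

wide = to_wide(sample)
print(wide)
```

Months with fewer rows are padded with NaN at the end. Calling `wide.plot()` would then draw one line per month column, provided matplotlib is installed.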

EDIT:

To plot only some months, select a subset of the columns:

months = ['Mar','Apr']
g = df.groupby('months').cumcount()
df.assign(row=g).pivot(index='row', columns='months', values='value').rename_axis(None).rename(columns=d)[months].plot()

Or filter the months in the input DataFrame with boolean indexing and isin:

df = df[df['months'].isin([3,4])]
g = df.groupby('months').cumcount()
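# only months 3 and 4 are left after the filter, so no column selection is needed here
df.assign(row=g).pivot(index='row', columns='months', values='value').rename_axis(None).rename(columns=d).plot()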
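The two ways of restricting the months can be compared side by side. This is a sketch under assumed sample data; `to_wide` is a hypothetical helper, not part of pandas. Selecting columns after the pivot keeps the row count of the longest month in the whole frame, while filtering first keeps only as many rows as the selected months need.

```python
import pandas as pd

d = {1: 'Jan', 3: 'Mar', 4: 'Apr'}

# Hypothetical cumulated values; January is the longest month.
df = pd.DataFrame({
    'months': [1, 1, 1, 1, 3, 3, 4, 4, 4],
    'value':  [50, 80, 120, 140, 10, 16, 5, 20, 40],
})


def to_wide(frame):
    """Pivot to one column per month, indexed by position within the month."""
    pos = frame.groupby('months').cumcount()
    return (frame.assign(pos=pos)
                 .pivot(index='pos', columns='months', values='value')
                 .rename_axis(None)
                 .rename(columns=d))


# Option 1: pivot everything, then select the month columns.
by_columns = to_wide(df)[['Mar', 'Apr']]

# Option 2: filter the rows first, then pivot.
by_rows = to_wide(df[df['months'].isin([3, 4])])

print(by_columns)
print(by_rows)
```

For plotting the difference is cosmetic, since the trailing all-NaN rows in the first result produce no points.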