I have the following chart and I want to annotate the bar with the highest value. The problem is that the x-axis has text labels rather than numeric coordinates.
How to annotate the top of a stacked bar with the greatest height
144 views. Asked by Ayman M.
There are 2 best solutions below
- Bar tick locations are usually 0-indexed, especially if the tick labels are categorical.
- The easiest option is to use `.pivot_table` to aggregate the `mean` for each group, and create a separate variable, `tot`, for the maximum total bar height relative to the `index`.
  - The `pivot_table` index will be the x-axis and the column headers will be the bar groups.
- `pandas.DataFrame.plot` with `kind='bar'` and `stacked=True` offers the easiest option for plotting stacked bars. `pandas` uses `matplotlib` as the default plotting backend.
- Use `.bar_label`, as explained in this answer and this answer, to annotate the bars.
  - The `fmt` parameter accepts a `lambda` expression, which is used to filter the labels to match `tot`. This works from `matplotlib v3.7`; otherwise a custom `labels` parameter must be used, as shown in the linked answers.
  - The segments for each color group are in `ax.containers`, where `ax.containers[0]` is the bottom segments and `ax.containers[1]` is the top segments.
  - `label_type='edge'` is the default, which results in the annotation being the sum of the bar heights.
- If the months are not ordered on the x-axis, then the `'month'` column can be set with `pd.Categorical` and `ordered=True`.
  - Use `from calendar import month_abbr` to get an ordered list of abbreviated month names.
  - `df.month = pd.Categorical(values=df.month, categories=month_abbr[1:], ordered=True)`
- Tested in `python 3.12.0`, `pandas 2.1.2`, `matplotlib 3.8.1`, `seaborn 0.13.0`
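As a minimal sketch of the month-ordering step, the snippet below uses a small hypothetical `DataFrame` (not the `flights` data) with months out of order. It assumes an English locale, since `month_abbr` is locale-dependent.

```python
from calendar import month_abbr

import pandas as pd

# hypothetical sample data with the months in arbitrary order
df = pd.DataFrame({'month': ['Mar', 'Jan', 'Feb', 'Jan'],
                   'passengers': [132, 112, 118, 115]})

# month_abbr[0] is an empty string, so slice from 1 to get 'Jan' through 'Dec'
df['month'] = pd.Categorical(values=df['month'], categories=month_abbr[1:], ordered=True)

# the pivoted index now follows calendar order instead of alphabetical order
pt = df.pivot_table(index='month', values='passengers', aggfunc='mean', observed=True)
print(pt.index.tolist())
```

With alphabetical sorting, `'Feb'` would come before `'Jan'`; the ordered categorical keeps the x-axis in calendar order.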
import seaborn as sns # seaborn is only used for the sample data, but pandas and matplotlib are imported as dependencies
import numpy as np  # for sample data
# sample data: this is a pandas.DataFrame
df = sns.load_dataset('flights')[['month', 'passengers']]
np.random.seed(2023)
df['Gender'] = np.random.choice(['Male', 'Female'], size=len(df))
# pivot and aggregate the mean
pt = df.pivot_table(index='month', columns='Gender', values='passengers', aggfunc='mean')
# calculate the max value by the index
tot = pt.sum(axis=1).max()
# plot the stacked bars
ax = pt.plot(kind='bar', stacked=True, rot=0, figsize=(7, 5), xlabel='Month',
ylabel='Mean Number of Passengers', title='Annotation Demonstration')
# annotate the top group of bars
ax.bar_label(ax.containers[1], fmt=lambda x: f'{x:0.0f}' if x == tot else '')
# move the legend: cosmetics
ax.legend(title='Gender', bbox_to_anchor=(1, 0.5), loc='center left', frameon=False)
# remove the top and right spines: cosmetics
ax.spines[['top', 'right']].set_visible(False)
df.head()
month passengers Gender
0 Jan 112 Female
1 Feb 118 Female
2 Mar 132 Male
3 Apr 129 Female
4 May 121 Female
pt
Gender Female Male
month
Jan 233.000000 259.250000
Feb 209.428571 270.800000
Mar 282.375000 245.750000
Apr 289.000000 245.166667
May 238.571429 318.400000
Jun 264.000000 378.400000
Jul 336.166667 366.500000
Aug 343.500000 358.666667
Sep 274.400000 322.428571
Oct 340.333333 192.833333
Nov 191.333333 274.333333
Dec 252.833333 270.833333
pt.sum(axis=1)
month
Jan 492.250000
Feb 480.228571
Mar 528.125000
Apr 534.166667
May 556.971429
Jun 642.400000
Jul 702.666667
Aug 702.166667
Sep 596.828571
Oct 533.166667
Nov 465.666667
Dec 523.666667
dtype: float64
tot
702.6666666666667
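For `matplotlib` versions before 3.7, where `fmt` does not accept a callable, one possible fallback is to build the label list manually and pass it through the `labels` parameter of `bar_label`. The sketch below uses a small hypothetical stand-in for `pt`, so the numbers differ from the output above.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import pandas as pd

# hypothetical stand-in for the pivoted table `pt`
pt = pd.DataFrame({'Female': [233.0, 336.0, 343.5], 'Male': [259.25, 367.0, 358.5]},
                  index=['Jan', 'Jul', 'Aug'])

totals = pt.sum(axis=1)  # total height of each stacked bar
tot = totals.max()       # height of the tallest bar

ax = pt.plot(kind='bar', stacked=True, rot=0)

# one label per bar: the total for the tallest bar, an empty string for the rest
labels = [f'{v:0.0f}' if v == tot else '' for v in totals]

# ax.containers[-1] holds the top segments; the default label_type='edge'
# places each label at the top of the stack
annotations = ax.bar_label(ax.containers[-1], labels=labels)
```

Because `labels` is aligned with the bars by position, this also works when the tick labels are text.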


The data and example below demonstrate how to label the tallest bar. However, the approach assumes that the bars were drawn directly using `matplotlib` and that the data is a `numpy` array. If you produced your plot using `pandas` or some other plotting library, then the approach below would need to be modified.
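The following is a minimal sketch under those same assumptions (bars drawn directly with `matplotlib`, data in `numpy` arrays). It is not the answer's original code; the category names and values are hypothetical.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

# hypothetical data: two stacked segments per category
categories = ['A', 'B', 'C', 'D']
bottom_vals = np.array([3.0, 5.0, 2.0, 4.0])
top_vals = np.array([2.0, 4.0, 6.0, 1.0])

x = np.arange(len(categories))   # bars sit at x = 0, 1, 2, ... even with text tick labels
totals = bottom_vals + top_vals  # full height of each stacked bar
idx = int(np.argmax(totals))     # position of the tallest bar

fig, ax = plt.subplots()
ax.bar(x, bottom_vals, label='bottom')
ax.bar(x, top_vals, bottom=bottom_vals, label='top')
ax.set_xticks(x)
ax.set_xticklabels(categories)

# annotate only the tallest bar, offset 3 points above its top edge
ann = ax.annotate(f'{totals[idx]:0.0f}', xy=(x[idx], totals[idx]),
                  xytext=(0, 3), textcoords='offset points',
                  ha='center', va='bottom')
ax.legend()
```

The key point is that the x-coordinate for the annotation is the bar's integer position, not its text label.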