The color is not matching between two subplots & legend order


I'm struggling with two small formatting issues with my Seaborn boxplot and histogram (plotted as subplots).

  1. The colors between the two subplots are slightly different even though the coded colors are exactly the same.
  2. I'm trying to rearrange the order of the legend so 'Group A' appears above 'Group B'
import matplotlib.pyplot as plt
import seaborn as sns

groupA = [94, 74, 65, 36, 32, 65, 56, 59, 24, 133, 16, 8, 18]
groupB = [1, 1, 1, 1, 2, 7, 7, 10, 15, 16, 17, 17, 19, 29, 31, 32, 43, 43, 44, 47, 56, 64, 64, 80, 81, 87, 103, 121, 121, 121, 187, 197, 236, 292, 319, 8, 12, 12, 14, 14, 15, 16, 16, 20, 20, 33, 36, 37, 37, 44, 46, 48, 51, 51, 54, 57, 72, 74, 95, 103, 103, 107, 134, 199, 216, 228, 254]
# boxplot on top (30% of the height), histogram below, sharing the x-axis
f, (ax_boxplot, ax_histogram) = plt.subplots(2, sharex=True, gridspec_kw={'height_ratios': (0.3,0.7)}, figsize=(10,10))
sns.boxplot(data=[groupA, groupB], ax=ax_boxplot, orient='h', palette=['green', 'silver'])
ax_boxplot.tick_params(axis='y', left=False, labelleft=False)
# ax_histogram is already the current axes; passing it explicitly makes the target clear
sns.histplot(data=[groupA, groupB], bins=34, binrange=(0,340), palette=['green', 'silver'], alpha=1, edgecolor='black', ax=ax_histogram)
ax_histogram.tick_params(axis='both', labelsize=18)
ax_histogram.legend(labels=['groupB', 'groupA'], fontsize=16, frameon=False)
plt.xlabel("Days", fontsize=24, labelpad=20)
plt.ylabel("Count", fontsize=24, labelpad=20)
sns.despine()

What I have tried so far:

  • For the colors: I tried setting the alpha to 1 in the histogram, but there still seems to be a slight difference.
  • For the legend: Tried playing around with hue_order and handles, but didn't have any luck getting that to work.

Image of histogram and boxplot subplots


There are 3 best solutions below

0
On BEST ANSWER

Try passing saturation=1 in your call to boxplot. Unless specified, saturation defaults to 0.75, which is why the box colors look duller than the histogram colors.

The documentation says:

saturation : float, optional

Proportion of the original saturation to draw colors at. Large patches often look better with slightly desaturated colors, but set this to 1 if you want the plot colors to perfectly match the input color.
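To see what the default does to a named color, here is a minimal sketch that compares 'green' as matplotlib resolves it with the same color desaturated to 0.75 through seaborn's `desaturate` utility. It assumes that this utility reflects what `boxplot` applies internally. That is consistent with the documentation quoted above, but it is not stated in the answer itself.

```python
from matplotlib.colors import to_rgb
import seaborn as sns

# the color exactly as requested in the palette
requested = to_rgb('green')

# the same color at the default boxplot saturation (0.75)
# assumption: boxplot desaturates its palette in the same way as this helper
default_box = sns.desaturate('green', 0.75)

# with a proportion of 1 the color is returned unchanged
exact_box = sns.desaturate('green', 1)

print("requested:      ", requested)
print("saturation=0.75:", default_box)
print("saturation=1 matches requested:", exact_box == requested)
```

The RGB triples differ at 0.75 and agree at 1, which is the mismatch visible between the two subplots.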

0
On

Your coloring issue comes up frequently on Stack Overflow, e.g. in "Avoid Seaborn barplot desaturation of colors", "Seaborn chart colors are different from those specified by palette", and "Inconsistent colours from custom seaborn palette". Seaborn's author prefers desaturated colors for rectangles, so desaturation is enabled by default.

Seaborn creates its own legends, which often differ from what you get by calling matplotlib's ax.legend(...). To change the parameters of the legend, Seaborn has a sns.move_legend() function. move_legend is primarily meant to change the position, but you can also change other parameters (except the item labels). As the "new" position is a required parameter, you can use loc='best', which is the default.

For the labels in the legend, Seaborn's usual approach is a "long form" dataframe where one column is used as hue=. But Seaborn also supports a dictionary as data, in which case the dictionary keys serve as the legend labels.

Note that unless you add sns.histplot(..., multiple='stack') (or multiple='dodge'), the bars of the last drawn histogram will hide (partially or totally) the bars of the other histogram. That can be very confusing (that's why by default some transparency is set).

import matplotlib.pyplot as plt
import seaborn as sns

groupA = [94, 74, 65, 36, 32, 65, 56, 59, 24, 133, 16, 8, 18]
groupB = [1, 1, 1, 1, 2, 7, 7, 10, 15, 16, 17, 17, 19, 29, 31, 32, 43, 43, 44, 47, 56, 64, 64, 80, 81, 87, 103, 121, 121, 121, 187, 197, 236, 292, 319, 8, 12, 12, 14, 14, 15, 16, 16, 20, 20, 33, 36, 37, 37, 44, 46, 48, 51, 51, 54, 57, 72, 74, 95, 103, 103, 107, 134, 199, 216, 228, 254]
f, (ax_boxplot, ax_histogram) = plt.subplots(2, sharex=True,
                                             gridspec_kw={'height_ratios': (0.3, 0.7)}, figsize=(10, 10))

sns.boxplot(data=[groupA, groupB], ax=ax_boxplot, orient='h',
            palette=['green', 'silver'], saturation=1)
ax_boxplot.tick_params(axis='y', left=False, labelleft=False)

sns.histplot(data={'Group A': groupA, 'Group B': groupB},
             bins=34, binrange=(0, 340),
             palette=['green', 'silver'], alpha=1, edgecolor='black')
ax_histogram.tick_params(axis='both', labelsize=18)
ax_histogram.set_xlabel("Days", fontsize=24, labelpad=20)
ax_histogram.set_ylabel("Count", fontsize=24, labelpad=20)
sns.move_legend(ax_histogram, loc='best', fontsize=24, frameon=False)
sns.despine()

plt.show()

sns.histplot with changed legend

3
On
  • As mentioned by @JohanC, putting the data into a long-form dataframe has the benefit of allowing seaborn to automatically add the labels, and deal with the order.
  • pd.DataFrame(data=v, columns=['Days']).assign(Group=group) is used to create a dataframe for each list, where .assign creates a column called 'Group' for the name of the data. The two dataframes are combined with pd.concat.
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# set the matplotlib rc parameters (global setting)
params = {'axes.labelsize': 24,
          'axes.titlesize': 24,
          'axes.labelpad': 20,
          'axes.spines.top': False,
          'axes.spines.right': False,
          'ytick.labelsize': 18,
          'xtick.labelsize': 18,
          'legend.fontsize': 20,
          'legend.frameon': False,
          'legend.title_fontsize': 16}

plt.rcParams.update(params)

# create the dataframe with a column defining the groups
# (groupA and groupB are the lists from the question)
df = pd.concat(
    [pd.DataFrame(data=v, columns=['Days']).assign(Group=group)
     for v, group in zip([groupA, groupB], ['A', 'B'])],
    ignore_index=True)

# create the figure and axes
fig, (ax_boxplot, ax_histogram) = plt.subplots(2, sharex=True, gridspec_kw={'height_ratios': (0.3,0.7)}, figsize=(10,10))

# plot the histplot from df
sns.histplot(data=df, x='Days', hue='Group', bins=34, binrange=(0,340), palette=['green', 'silver'], alpha=1, edgecolor='black', ax=ax_histogram)

# plot the boxplot from df
# saturation=1 keeps the box colors identical to the histogram colors
sns.boxplot(data=df, x='Days', y='Group', ax=ax_boxplot, palette=['green', 'silver'], saturation=1)
ax_boxplot.tick_params(axis='y', left=False, labelleft=False, bottom=False)
_ = ax_boxplot.set(xlabel='', ylabel='')


df.head(15)

    Days Group
0     94     A
1     74     A
2     65     A
3     36     A
4     32     A
5     65     A
6     56     A
7     59     A
8     24     A
9    133     A
10    16     A
11     8     A
12    18     A
13     1     B
14     1     B