I am trying to plot a stripplot with 3 categories (column assigned to x) and have the marker sizes vary based on a column in a dataframe. However, the sizes don't line up even when I set the sizes attribute to the same column as y. (I am using sizes = df["col"], because sizes = "col" raises TypeError: len() of unsized object.) Since the values are the same for both sizes and y, I'd expect to see smaller markers at the bottom and larger markers at the top. Instead, there doesn't appear to be any correlation between the size of a marker and its position on the y-axis.
After some investigation, pulling out the PathCollections and comparing the actual values (.get_offsets()) with the size values (.get_sizes()), it is clear that the same array of sizes is being used for each category.
Is this feature not properly implemented yet? I tried assigning the categories as hue instead of x, but I get a StopIteration error. The only solution I've found is to iterate through each category and plot it on a separate axis in a row of axes. This is clunky, and surely there's a better way.
Here is a very simplified version of my code:
sns.stripplot(data=df,
              x='category_col',
              y='value_col',
              sizes=df['value_col'])
The sns.stripplot documentation doesn't mention sizes= as a possible parameter. The function shares some code with sns.swarmplot, which relies on all points having the same size. But hue should work without problems, at least in the latest version. (In seaborn, hue comes in two flavors: either a limited set of values interpreted as categories with individual colors, or a numerical range which works with color mapping.)

Here is how hue could be used, starting from seaborn's 'tips' dataset:
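A minimal sketch of both hue flavors follows. To keep it runnable offline, it builds a small synthetic frame with tips-like column names (day, time, total_bill) rather than downloading the dataset; those values are made up for illustration, and sns.load_dataset('tips') could be swapped in. It also assumes seaborn 0.12 or later, where stripplot maps a numeric hue through a colormap.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic stand-in for the 'tips' dataset (assumption: invented values,
# only the column names mirror the real dataset).
rng = np.random.default_rng(0)
n = 120
tips_like = pd.DataFrame({
    "day": rng.choice(["Thur", "Fri", "Sat"], size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
    "total_bill": rng.uniform(5, 50, size=n).round(2),
})

fig, (ax_num, ax_cat) = plt.subplots(ncols=2, figsize=(10, 4), sharey=True)

# Numeric hue: the color of each point follows its y value via a colormap.
sns.stripplot(data=tips_like, x="day", y="total_bill",
              hue="total_bill", palette="viridis", ax=ax_num)
ax_num.set_title("numeric hue")

# Categorical hue: one distinct color per level of 'time'.
sns.stripplot(data=tips_like, x="day", y="total_bill",
              hue="time", ax=ax_cat)
ax_cat.set_title("categorical hue")

fig.tight_layout()
# Call plt.show() to display the figure in an interactive session.
```

Note that hue only changes the color of the points; the marker size in a stripplot stays uniform.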
A scatterplot does have size= (column to indicate the size of the dots) and sizes= (to indicate the range of sizes) parameters. By converting the categorical x column to numbers and manually adding some jitter, you can create a strip plot.
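The approach could be sketched as follows. The column names category_col and value_col are taken from the question; the data itself, the helper column x_jittered, the jitter width of 0.2, and the size range (10, 200) are all assumptions chosen for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Made-up data using the question's column names.
rng = np.random.default_rng(0)
n = 120
df = pd.DataFrame({
    "category_col": rng.choice(["a", "b", "c"], size=n),
    "value_col": rng.uniform(1, 100, size=n),
})

# Map each category to an integer position (0, 1, 2, ...) ...
categories = sorted(df["category_col"].unique())
codes = pd.Categorical(df["category_col"], categories=categories).codes
# ... and add uniform horizontal jitter around that position.
df["x_jittered"] = codes + rng.uniform(-0.2, 0.2, size=n)

fig, ax = plt.subplots()
sns.scatterplot(
    data=df,
    x="x_jittered",
    y="value_col",
    hue="category_col",
    hue_order=categories,
    size="value_col",   # column that drives the marker size
    sizes=(10, 200),    # smallest and largest marker area
    ax=ax,
)

# Restore categorical tick labels on the numeric x-axis.
ax.set_xticks(range(len(categories)))
ax.set_xticklabels(categories)
ax.set_xlabel("category_col")
# Call plt.show() to display the figure in an interactive session.
```

Because size and y use the same column here, marker size grows with height on the y-axis within every category, which is the behavior the question was after.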