I'm trying to add annotations (r and p values for each regression) to the lmplots in a FacetGrid. Because I'm using "hue", each plot has two regression lines, so I get two annotations stacked on top of each other. I'd like either to display them in different locations or, ideally, to compute the values from the complete dataset rather than from the subsets split by the argument passed to hue. I assume I need to modify "data" in the annotate function for that, but I cannot figure out how.
I did manage to do it by creating a dataframe that contains all r and p values and looping through g.axes_dict.items(), but I would like a more elegant solution where the values can be calculated and displayed directly.
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from scipy import stats

# Sample data (named "sample" rather than "dict" to avoid shadowing the built-in)
sample = {
    'ID': ['A','B','C','D','A','B','C','D','A','B','C','D','A','B','C','D'],
    'SCORE': [18,20,37,40,34,21,24,12,34,54,23,43,23,31,65,78],
    'AGE': [34,54,46,65,43,23,54,23,43,54,23,32,56,42,12,43],
    'GENDER': [1,1,1,1,2,2,2,2,1,1,1,1,2,2,2,2]
}
df = pd.DataFrame(sample)

g = sns.lmplot(x='SCORE', y='AGE', data=df, hue='GENDER',
               col='ID', height=3, aspect=1)

def annotate(data, **kws):
    # Called once per facet and per hue level, so "data" is a single hue group
    r, p = stats.pearsonr(data['SCORE'], data['AGE'])
    ax = plt.gca()
    ax.text(.05, .8, 'r={:.2f}, p={:.2g}'.format(r, p),
            transform=ax.transAxes)

g.map_dataframe(annotate)
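The workaround mentioned above was not posted, so the following is only a sketch of what it might look like. It assumes the statistics are computed per ID from all rows of that ID (both GENDER levels); the names rows and stats_df are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd
import seaborn as sns
from scipy import stats

df = pd.DataFrame({
    'ID': ['A', 'B', 'C', 'D'] * 4,
    'SCORE': [18, 20, 37, 40, 34, 21, 24, 12, 34, 54, 23, 43, 23, 31, 65, 78],
    'AGE': [34, 54, 46, 65, 43, 23, 54, 23, 43, 54, 23, 32, 56, 42, 12, 43],
    'GENDER': [1, 1, 1, 1, 2, 2, 2, 2, 1, 1, 1, 1, 2, 2, 2, 2],
})

# Precompute r and p for every ID from all of its rows (both GENDER levels).
rows = []
for id_, group in df.groupby('ID'):
    r, p = stats.pearsonr(group['SCORE'], group['AGE'])
    rows.append({'ID': id_, 'r': r, 'p': p})
stats_df = pd.DataFrame(rows).set_index('ID')

g = sns.lmplot(x='SCORE', y='AGE', data=df, hue='GENDER',
               col='ID', height=3, aspect=1)

# axes_dict maps each col level (here the ID) to its Axes,
# so the precomputed values can be looked up by key.
for id_, ax in g.axes_dict.items():
    r, p = stats_df.loc[id_, 'r'], stats_df.loc[id_, 'p']
    ax.text(.05, .8, 'r={:.2f}, p={:.2g}'.format(r, p),
            transform=ax.transAxes)
```

This yields one annotation per facet, but the statistics live in a separate table that has to be kept in sync with the plot.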
The tips dataset is being used because the sample data in the OP causes scipy to generate ConstantInputWarning: An input array is constant; the correlation coefficient is not defined.
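To keep one annotation per hue level without the labels overlapping, each level can be given its own y-position. The following is a minimal sketch, not a definitive implementation. So that it runs without downloading anything, it assumes a small synthetic frame (tips_like, a hypothetical stand-in with tips-style column names) instead of the real tips data; Y_POS is likewise an illustrative name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns
from scipy import stats

# Synthetic stand-in for the tips data (an assumption, so no download is needed).
rng = np.random.default_rng(0)
n = 80
tips_like = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "time": rng.choice(["Lunch", "Dinner"], n),
    "smoker": rng.choice(["Yes", "No"], n),
})
tips_like["tip"] = 0.15 * tips_like["total_bill"] + rng.normal(0, 1, n)

# One y-position (in axes coordinates) per hue level, so the labels do not overlap.
Y_POS = {"Yes": 0.90, "No": 0.80}

def annotate(data, **kws):
    # With hue set, map_dataframe calls this once per facet and per hue level,
    # so `data` holds a single hue group.
    level = data["smoker"].iloc[0]
    r, p = stats.pearsonr(data["total_bill"], data["tip"])
    ax = plt.gca()
    ax.text(0.05, Y_POS[level], f"{level}: r={r:.2f}, p={p:.2g}",
            transform=ax.transAxes, color=kws.get("color"))

g = sns.lmplot(data=tips_like, x="total_bill", y="tip",
               hue="smoker", col="time", height=3, aspect=1)
g.map_dataframe(annotate)
```

Passing the color keyword through to ax.text makes each label match its regression line, which makes the two lines of text easier to tell apart.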
Use a dict to define the y-position for each hue category.
With .map_dataframe, for each facet, each hue group is plotted separately, which can be seen by displaying data in def annotate.
If the plot is separated by hue, then separate statistics should be plotted.

Iterate through g.axes.flat
Instead of .map_dataframe, iterate over .axes, which easily allows for calculations and annotations to be made with all the data for each facet.

Iterate through g.axes_dict.items()
This gives the col= groups as keys, but then hard-coding 'time' is required for creating data.
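Both ways of iterating over the facets can be sketched as below. This is an illustration under stated assumptions rather than the original code: tips_like is a synthetic stand-in with tips-style columns, annotate_facet and make_grid are hypothetical helpers, and the g.axes.flat variant relies on seaborn's default facet titles of the form "time = Lunch" and on string-valued levels.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
import seaborn as sns
from scipy import stats

# Synthetic stand-in for the tips data (an assumption, so no download is needed).
rng = np.random.default_rng(0)
n = 80
tips_like = pd.DataFrame({
    "total_bill": rng.uniform(5, 50, n),
    "time": rng.choice(["Lunch", "Dinner"], n),
    "smoker": rng.choice(["Yes", "No"], n),
})
tips_like["tip"] = 0.15 * tips_like["total_bill"] + rng.normal(0, 1, n)

def annotate_facet(ax, facet):
    # `facet` holds every row of the facet, regardless of hue level.
    r, p = stats.pearsonr(facet["total_bill"], facet["tip"])
    ax.text(0.05, 0.9, f"r={r:.2f}, p={p:.2g}", transform=ax.transAxes)

def make_grid():
    return sns.lmplot(data=tips_like, x="total_bill", y="tip",
                      hue="smoker", col="time", height=3, aspect=1)

# Option 1: g.axes.flat. The default title is "<col variable> = <level>",
# so the column name and level are read back from it (no hard-coded 'time').
g1 = make_grid()
for ax in g1.axes.flat:
    col_var, level = ax.get_title().split(" = ")
    annotate_facet(ax, g1.data[g1.data[col_var] == level])

# Option 2: g.axes_dict.items(). The keys are the col levels only,
# so the column name 'time' has to be hard-coded when selecting the data.
g2 = make_grid()
for level, ax in g2.axes_dict.items():
    annotate_facet(ax, tips_like[tips_like["time"] == level])
```

Either loop produces a single annotation per facet computed from both hue groups. The title-parsing variant would break if the titles were customized, in which case the axes_dict form is the safer choice.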