I want to label only the highest value on my heatmap, but only its first digit is showing and I don't know why. Shrinking the font doesn't help. While writing this it occurred to me that skipping the annot argument and adding the text manually might work, but I can't work out how to do that with subplots.
You can see what I'm getting here:
Toy data generation
import numpy as np
import pandas as pd
np.random.seed(42)
n_rows = 10**6
n_ids = 1000
n_groups = 3
times = np.random.normal(12, 2.5, n_rows).round().astype(int) + np.random.choice([0,24,48,72,96,120,144], size=n_rows, p=[0.2,0.2,0.2,0.2,0.15,0.04,0.01])
timeslots= np.arange(168)
id_list = np.random.randint(low=1000, high=5000, size=1000)
ID_probabilities = np.random.normal(10, 1, n_ids-1)
ID_probabilities = ID_probabilities/ID_probabilities.sum()
final = 1 - ID_probabilities.sum()
ID_probabilities = np.append(ID_probabilities,final)
id_col = np.random.choice(id_list, size=n_rows, p=ID_probabilities)
data = pd.DataFrame(times[:,None]==timeslots, index=id_col)
n_ids = data.index.nunique()
data = data.groupby(id_col).sum()
data['grp'] = np.random.choice(range(n_groups), n_ids)
data
Copy pasta sample of the toy data:
       0   1   2   3   4   5   6   7   8   9  ...  159  160  161  162  163  164  165  166  167  grp
1011   0   0   0   0   0   0   2   3  15  21  ...    1    1    0    0    0    0    0    0    0    1
1016   0   0   0   0   0   0   4   3  18  41  ...    2    0    0    0    0    0    0    0    0    2
1020   0   0   0   0   0   1   1   2   6  16  ...    1    1    0    0    0    0    0    0    0    0
1024   0   0   0   0   0   0   2   3   7  13  ...    0    1    1    0    0    0    0    0    0    0
1029   0   0   0   0   0   0   1   5   3  14  ...    1    0    1    0    0    0    0    0    0    1
...  ... ... ... ... ... ... ... ... ... ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...  ...
4965   0   0   0   0   0   2   4   2  10   9  ...    0    1    0    0    0    0    0    0    0    1
4984   0   0   0   0   0   1   0   6  10  12  ...    0    0    0    0    0    0    0    0    0    2
4989   0   0   0   0   0   1   3   4   7  16  ...    1    1    0    0    0    0    0    0    0    0
4995   0   0   0   0   2   0   2   2   2  23  ...    0    1    0    0    0    0    0    0    0    0
4999   0   0   0   0   0   1   1   7   9  11  ...    0    0    0    0    0    0    0    0    0    2
My code for generating the graphs
import seaborn as sns
import matplotlib.pyplot as plt
rows = 1
cols = n_groups
# profiles['grp'] = results
grpr = data.groupby('grp')
actual_values = []
fig, axs = plt.subplots(rows, cols, figsize=(cols*3, rows*3), sharey=True, sharex=True)
for grp, df in grpr:
    plt.subplot(rows, cols, grp+1)
    # Empty label array; only the cell holding the maximum gets a label
    annot_labels = np.empty_like(df[range(168)].sum(), dtype=str)
    annot_mask = df[range(168)].sum() == df[range(168)].sum().max()
    actual_values.append(df[range(168)].max().max())
    annot_labels[annot_mask] = str(df[range(168)].max().max())
    # Reshape the 168 hourly totals into 7 days x 24 hours
    sns.heatmap(df[range(168)].sum().values.reshape(7,-1), cbar=False, annot=annot_labels.reshape(7,-1), annot_kws={'rotation':90, 'fontsize':'x-small'}, fmt='')
    ppl = df.shape[0]
    journs = int(df.sum().sum()/1000)
    plt.title(f'{grp}: {ppl:,} people, {journs:,}k trips')
for ax in axs.flat:
    ax.set(xlabel='Hour', ylabel='Day')
    ax.set_yticklabels(['M','T','W','T','F','S','S'], rotation=90)
# Hide x labels and tick labels for top plots and y ticks for right plots.
for ax in axs.flat:
    ax.label_outer()
plt.suptitle(f"Why don't these labels work? Actual values = {actual_values}")
plt.tight_layout()
plt.show()
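The manual-text idea mentioned at the top of the question can be sketched roughly as follows. This is only an illustration under assumptions: `grids` is made-up stand-in data rather than the toy data above, and `imshow` is used in place of `sns.heatmap` to keep the example small. The main point is to call `text` on each individual Axes instead of going through `plt.*`. With `imshow` the cell centres sit at integer coordinates; with `sns.heatmap` they should sit at `hour + 0.5`, `day + 0.5`.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for the per-group totals: 3 groups of 7 days x 24 hours.
grids = [rng.integers(0, 500, size=(7, 24)) for _ in range(3)]

fig, axs = plt.subplots(1, len(grids), figsize=(9, 3), sharex=True, sharey=True)
labels = []
for ax, grid in zip(axs.flat, grids):
    ax.imshow(grid, aspect="auto")
    # Row (day) and column (hour) of the largest cell in this panel.
    day, hour = np.unravel_index(np.argmax(grid), grid.shape)
    label = str(grid[day, hour])
    # Draw the label on this specific Axes, centred on the cell.
    ax.text(hour, day, label, ha="center", va="center",
            rotation=90, fontsize="x-small", color="white")
    labels.append(label)
fig.tight_layout()
```

Because the label is an ordinary Python string passed straight to `ax.text`, no fixed-width string array is involved.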

Thanks to the comment from @TrentonMcKinney and this post on numpy array fixed length strings I have a simple solution. Creating the empty array like this results in an array of strings that can each hold only 1 character, so anything longer is truncated on assignment:
annot_labels = np.empty_like(df[range(168)].sum(), dtype=str)
Changing the dtype fixes the problem. For example,
np.empty_like(a, dtype='U5')
creates an array whose elements can each hold up to 5 Unicode characters.
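A small standalone check of the truncation, as a sketch rather than part of the original code. It uses `np.zeros` instead of `np.empty_like` so the unlabelled cells are guaranteed to be empty strings (`np.empty*` does not initialise its contents). The `dtype=object` variant is an extra suggestion of mine, not something from the linked post.

```python
import numpy as np

# dtype=str with no length becomes '<U1': each element holds one character.
short = np.zeros(3, dtype=str)
short[1] = "1234"  # silently truncated to its first character
print(short.dtype, short)

# Reserve enough characters for the longest label up front.
wide = np.zeros(3, dtype="U5")
wide[1] = "1234"
print(wide.dtype, wide)

# dtype=object stores ordinary Python strings, so there is no length limit.
flexible = np.full(3, "", dtype=object)
flexible[1] = "1234"
print(flexible)
```

If the longest label is not known in advance, sizing the dtype from the data (for instance from `len(str(max_value))`) or using `dtype=object` avoids guessing a width.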