I've been trying all day to make this work and it's starting to make me angry! All I want to do is create the pandas series that upsetplot needs as input, as detailed here:
https://pypi.org/project/upsetplot/
I don't understand how the generate_data function is manipulating its sets to make a series. I would have assumed that there was a simple way to do this by calling set(), but I can't seem to find it.
So I began manipulating my dataframes directly instead, but I suspect those attempts were misguided.
Thus I resort to providing a simple dataframe below and pray that some kind soul can enlighten me.
import pandas as pd
from matplotlib import pyplot as plt
from upsetplot import generate_data, plot
df = pd.DataFrame({'john':[1,2,3,5,7,8],
'jerry':[1,2,5,7,9,2],
'josie':[2,2,3,2,5,6],
'jean':[6,5,7,6,2,4]})
df = pd.DataFrame({'john':[True,False,True,False,True,False],
'jerry':[True,True,False,True,False,True],
'josie':[True,False,False,True,False,False],
'jean':[True,False,False,True,False,False],
'food':['apple','carrot','choc','bread','ham','nut']})
The example from the package home page:
from upsetplot import generate_data
example = generate_data(aggregated=True)
example # doctest: +NORMALIZE_WHITESPACE
set0   set1   set2
False  False  False      56
              True      283
       True   False    1279
              True     5882
True   False  False      24
              True       90
       True   False     429
              True     1957
Name: value, dtype: int64
Aggregate the counts with GroupBy.size, grouping by all columns except food:
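A minimal sketch of that aggregation, assuming the boolean dataframe from the question. The names set_cols and counts are introduced here for illustration only; the result is a Series indexed by a boolean MultiIndex, which is the same shape as the generate_data example.

```python
import pandas as pd

# Boolean membership dataframe from the question.
df = pd.DataFrame({
    'john':  [True, False, True, False, True, False],
    'jerry': [True, True, False, True, False, True],
    'josie': [True, False, False, True, False, False],
    'jean':  [True, False, False, True, False, False],
    'food':  ['apple', 'carrot', 'choc', 'bread', 'ham', 'nut'],
})

# Every column except 'food' is treated as a set-membership flag
# (an assumption about how the data is meant to be read).
set_cols = [c for c in df.columns if c != 'food']

# Each distinct True/False combination becomes one MultiIndex entry,
# and size() counts how many rows fall into it.
counts = df.groupby(set_cols).size()

print(counts)
```

The resulting counts series should then be usable as input to plot from upsetplot, which the question already imports, followed by plt.show().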