The UpSet plot tutorial in the documentation has this example with movies: https://upsetplot.readthedocs.io/en/stable/formats.html#When-category-membership-is-indicated-in-DataFrame-columns
After creating the data from the "Genre" memberships and plotting it, how do I also list the names of the movies?
In the plot, I want to print the list of movies at each intersection. So at the intersection of size 48, I want to list the 48 movies.
Upset plot python list row names
Asked by Uqhah
In the example on the documentation page, this information is contained in the dataframe `movies_by_genre`, which is defined as `movies_by_genre = from_indicators(genre_indicators, data=movies)`. We can extract the required information from this dataframe. We just need to make sure that the order of the boolean tuple of length 20, `(True, False, ....., True)`, is the same in the pandas Series object `intersection` and the pandas Series object `movies_by_genre.Genres`. I used a dict to map the order of columns. For reproducibility, the end-to-end Python script and its output are given below:
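As a minimal sketch of this lookup (not the script from the answer), the snippet below builds a small made-up movies table, indexes it by one boolean level per genre, and collects the titles per intersection. The boolean-indexed frame is assumed to mirror the shape that `from_indicators(..., data=movies)` returns; the table contents, the `Title` column, and the name `titles_by_intersection` are hypothetical. Matching each flag to its genre by index level name, rather than by position, avoids the tuple-ordering concern mentioned above.

```python
from collections import defaultdict

import pandas as pd

# Hypothetical stand-in for the movies table used in the docs example.
movies = pd.DataFrame({
    "Title": ["Movie A", "Movie B", "Movie C", "Movie D"],
    "Genres": [["Drama"], ["Drama", "Comedy"], ["Comedy"], ["Drama", "Comedy"]],
})
genres = ["Comedy", "Drama"]

# One boolean indicator column per genre.
genre_indicators = pd.DataFrame(
    {g: [g in gs for gs in movies["Genres"]] for g in genres}
)

# Assumption: a frame indexed by one boolean level per genre, which is
# the shape the answer relies on for movies_by_genre.
movies_by_genre = movies.copy()
movies_by_genre.index = pd.MultiIndex.from_frame(genre_indicators)

# Map each intersection (the genres whose flag is True) to its titles.
level_names = list(movies_by_genre.index.names)
titles_by_intersection = defaultdict(list)
for flags, title in zip(movies_by_genre.index, movies_by_genre["Title"]):
    members = tuple(name for name, flag in zip(level_names, flags) if flag)
    titles_by_intersection[members].append(title)
titles_by_intersection = dict(titles_by_intersection)

# Print the size, member genres, and titles of every intersection.
for members, titles in titles_by_intersection.items():
    print(len(titles), members, titles)
```

With the real data, the same loop should apply to the frame returned by `from_indicators`, provided the column holding the movie names is substituted for `Title`.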
EDIT:
Upon clarification from OP, the list of names should be printed on the plot. So, we can follow the same method and put the text on the plot manually. I did the following:
1. I modified the `_plot_bars()` function inside `upsetplot.plotting.py` so that it can add text from a list parameter called `lol_of_intersection_names`; `lol` stands for list of lists. Additionally, I added an `alpha` parameter to make the bars semi-transparent when `ax.bar` is called; otherwise the text will not be visible (`alpha = 0.5` in my example).
2. I attached the list to the object `u` of class `UpSet` so that it can be accessed inside the function `_plot_bars()`.
Finally, the output looks as shown below:
However, given the long list of names, I am unsure of the practical value of plotting like this. Only when I save the image at 600 DPI can I zoom in and read the names of the movies.
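The idea behind the EDIT, semi-transparent bars with the names written over them, can be sketched with plain matplotlib, without touching upsetplot's internals. This is only an illustration under assumptions: the bar heights and name lists are made up, and `lol_of_intersection_names` simply reuses the parameter name from the steps above.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend, so no display is needed
import matplotlib.pyplot as plt

# Hypothetical intersection sizes and the names belonging to each bar.
sizes = [2, 1, 1]
lol_of_intersection_names = [["Movie B", "Movie D"], ["Movie A"], ["Movie C"]]

fig, ax = plt.subplots()
positions = range(len(sizes))

# alpha=0.5 keeps the bars semi-transparent so the text stays readable.
ax.bar(positions, sizes, alpha=0.5)

# Write one name per line, starting at the base of each bar.
for x, names in zip(positions, lol_of_intersection_names):
    ax.text(x, 0, "\n".join(names), ha="center", va="bottom", fontsize=6)

ax.set_ylabel("Intersection size")
```

Saving with `fig.savefig(path, dpi=600)` gives the high-resolution image mentioned above; with dozens of names per bar the text will still be small.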