Using Python to draw a mosaic | marimekko chart with custom colors and labels

I would like to use Python to draw a mosaic | marimekko chart with custom colors and labels.

The following code works fine

import plotly.graph_objects as go

year = ['2019', '2020', '2021', '2022']

fig1 = go.Figure()

fig1.add_trace(go.Bar(x=year, y=[20, 18, 14, 10], text=['20', '18', '14', '10'], name='brand 1'))
fig1.add_trace(go.Bar(x=year, y=[10, 15, 20, 22], text=['10', '15', '20', '22'], name='brand 2'))
fig1.add_trace(go.Bar(x=year, y=[6,   8, 10, 12], text=[ '6',  '8', '10', '12'], name='brand 3'))

fig1.update_layout(barmode='stack')

fig1.write_image('test_1.png')    

However, I want to sort the data for each year by the data passed via y. That means the code would look like (I'll leave out the sorting, that's not the question here).

fig2.add_trace(go.Bar(x=year, y=[20, 18, 20, 22], text=['20: brand 1', '18: brand 1', '20: brand 2', '22: brand 2']))
fig2.add_trace(go.Bar(x=year, y=[10, 15, 14, 12], text=['10: brand 2', '15: brand 2', '14: brand 1', '12: brand 3']))
fig2.add_trace(go.Bar(x=year, y=[ 6,  8, 10, 10], text=[ '6: brand 3',  '8: brand 3', '10: brand 3', '10: brand 1']))

Of course, I still want to use the same colors per brand (not per position), so in addition to the appropriately sorted data, I need to pass two more arrays for custom label texts (works fine) and for the corresponding custom colors (I don't see how to do that).

Question 1: How can I pass an array of custom colors to each trace so that each brand always gets the same color? Is there anything like

fig1.add_trace(go.Bar(x=year, y=[20, 18, 14, 10], colors=...))

Question 2: Is there another option to create a mosaic | marimekko chart with varying x-widths which is not based on plotly?

The expected code is something like

# the color map 
the_brand_cmap = plt.get_cmap('seismic_r') 
the_brand_norm = co.TwoSlopeNorm(vmin=-max_abs, vcenter=0, vmax=max_abs)

...

for i in years: # the loop is over the years, not over the brands!

    # some more code to sort df per year and to extract the brand names and colors per year

    fig1.add_trace(go.Bar( # this adds a trace for the i-th year
        x=np.cumsum(xwidths) - xwidths,
        y=ysizes_norm, 
        width=xwidths,
        marker_color=the_brand_cmap(the_brand_norm(colors)), # the colors for each year
        text=brand_name))

The expected result is shown in the attached image (not reproduced here).

There is 1 best solution below

I have created a Marimekko chart from your data, based on the examples in the reference. For each year, add a new column holding each brand's percentage share of that year's total. The bar widths are the yearly totals. To give each brand a fixed color, create a dictionary that maps brands to colors and look the color up when adding each brand's trace to the stacked chart.

import plotly.graph_objects as go
import numpy as np
import pandas as pd

year = ['2019', '2020', '2021', '2022']
data = {'brand 1': [20, 18, 14, 10],
       'brand 2': [10, 15, 20, 22],
       'brand 3': [6,   8, 10, 12]
       }

df = pd.DataFrame.from_dict(data)

df = df.T
df.columns = year
for c in df.columns:
    df[c+'_%'] = df[c].apply(lambda x: (x / df.loc[:,c].sum()) * 100)

widths = np.array([sum(df['2019']), sum(df['2020']), sum(df['2021']), sum(df['2022'])])
marker_colors = {'brand 1': 'darkblue', 'brand 2': 'darkgreen', 'brand 3': 'crimson'}

fig1 = go.Figure()

for idx in df.index:
    dff = df.filter(items=[idx], axis=0)
    shares = dff[dff.columns[4:]].values[0]  # the four '<year>_%' columns
    fig1.add_trace(go.Bar(
        x=np.cumsum(widths) - widths,  # left edge of each year's bar
        y=shares,
        width=widths,
        offset=0,  # anchor each bar at its left edge instead of centering it on x
        marker_color=marker_colors[idx],
        text=['{:.2f}%'.format(x) for x in shares],
        name=idx
    ))

fig1.update_xaxes(
    tickvals=np.cumsum(widths)-widths,
    ticktext= ["%s<br>%d" % (l, w) for l, w in zip(year, widths)]
)

fig1.update_xaxes(range=[0, widths.sum()])  # the axis ends at the sum of all bar widths
fig1.update_yaxes(range=[0, 100])

fig1.update_layout(barmode='stack')

#fig1.write_image('test_1.png')
fig1.show()
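
To make the arithmetic behind the chart explicit, here is a minimal sketch in plain Python (no pandas or plotly) that computes the bar widths, the left edges used as x positions, and the percentage shares from the same data. The names left_edges and shares are my own and are not part of the code above.

```python
from itertools import accumulate

year = ['2019', '2020', '2021', '2022']
data = {
    'brand 1': [20, 18, 14, 10],
    'brand 2': [10, 15, 20, 22],
    'brand 3': [6, 8, 10, 12],
}

# One bar width per year: the total over all brands in that year.
widths = [sum(vals[j] for vals in data.values()) for j in range(len(year))]

# Left edge of each bar: the running total of the widths before it.
left_edges = [0] + list(accumulate(widths))[:-1]

# Share of each brand within its year, in percent.
shares = {
    brand: [100 * v / w for v, w in zip(vals, widths)]
    for brand, vals in data.items()
}

print(widths)      # [36, 41, 44, 44]
print(left_edges)  # [0, 36, 77, 121]
for brand, pct in shares.items():
    print(brand, ['{:.2f}%'.format(p) for p in pct])
```

The last left edge plus the last width (121 + 44 = 165) is the right end of the x-axis, and the shares of each year add up to 100.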


Since the objective is to draw the segments of each year in order of increasing value, the outer loop should iterate over the years and the inner loop over the brands of that year in ascending numerical order, so that the largest value ends up at the top.

widths = np.array([sum(df['2019']), sum(df['2020']), sum(df['2021']), sum(df['2022'])])
marker_colors = {'brand 1': 'darkblue', 'brand 2': 'darkgreen', 'brand 3': 'crimson'}

new_widths = (np.cumsum(widths) - widths).tolist()
new_widths.append(np.cumsum(widths)[-1])

fig = go.Figure()

for i, c in enumerate(df.columns[4:]):
    # Shares of one year, smallest first, so the largest segment ends up on top.
    shares = df[c].sort_values(ascending=True)
    base = 0.0
    for br, value in shares.items():
        fig.add_trace(go.Bar(
            x=[new_widths[i]],  # left edge of this year's bar
            y=[value],
            width=widths[i],
            offset=0,  # anchor the bar at its left edge instead of centering it on x
            base=base,  # start this segment where the previous one ended
            marker_color=marker_colors[br],
            text='{:.2f}%'.format(value),
            name=br
        ))
        base += value

# Show each brand only once in the legend.
names = set()
fig.for_each_trace(
    lambda trace:
        trace.update(showlegend=False)
        if (trace.name in names) else names.add(trace.name))

fig.update_xaxes(
    tickvals=np.cumsum(widths)-widths,
    ticktext= ["%s<br>%d" % (l, w) for l, w in zip(year, widths)]
)

fig.update_xaxes(range=[0, widths.sum()])  # the axis ends at the sum of all bar widths
fig.update_yaxes(range=[0, 100])

fig.update_layout(barmode='stack')
fig.show()
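
As an alternative sketch for Question 1: marker_color also accepts a list with one color per bar, so the per-position traces from the question can carry a color list next to the text list. The helper traces_by_rank below is a hypothetical name of mine, not a plotly function. It only prepares the lists in plain Python and assumes that ties within a year need no special ordering.

```python
year = ['2019', '2020', '2021', '2022']
data = {
    'brand 1': [20, 18, 14, 10],
    'brand 2': [10, 15, 20, 22],
    'brand 3': [6, 8, 10, 12],
}
marker_colors = {'brand 1': 'darkblue', 'brand 2': 'darkgreen', 'brand 3': 'crimson'}


def traces_by_rank(data, colors, descending=True):
    """Return one dict per stacking position with parallel y/text/color lists."""
    n_years = len(next(iter(data.values())))
    traces = [{'y': [], 'text': [], 'marker_color': []} for _ in data]
    for j in range(n_years):
        # Rank the brands within year j by their value.
        ranked = sorted(data, key=lambda b: data[b][j], reverse=descending)
        for pos, brand in enumerate(ranked):
            traces[pos]['y'].append(data[brand][j])
            traces[pos]['text'].append(f'{data[brand][j]}: {brand}')
            traces[pos]['marker_color'].append(colors[brand])
    return traces


traces = traces_by_rank(data, marker_colors)
for t in traces:
    print(t)
```

Each dictionary can then be unpacked into a trace, for example fig.add_trace(go.Bar(x=year, showlegend=False, **t)), combined with barmode='stack'. Since a trace no longer corresponds to one brand, the legend would have to be built separately.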
