plotly don't show zeros in area plot


For context: I'd like to make a plot in plotly showing the evolution of an investment portfolio, where the values of the assets are stacked on top of each other. Since assets are bought and sold, not every asset should be shown over the entire range of the curve. The example below illustrates this. Leading or trailing zeros indicate that the asset was not in the portfolio at that moment.

import pandas as pd
import plotly.express as px
import numpy as np
data = {"Asset 1": [0, 1, 2, 3, 4, 5], "Asset 2": [0, 0, 2, 3, 2, 2], "Asset 3": [1, 1, 3, 0, 0, 0]}
df = pd.DataFrame(data)
fig = px.area(df)
fig.show()

This results in the following figure (image not reproduced here).

The problem is that at the indicated time (index=4), Asset 3 is no longer in the portfolio, hence its value of 0. However, it is still shown, and the bigger problem is that this makes it impossible to see the value of Asset 2, which is in the portfolio.

I tried changing the zeros to NaN values to indicate that they don't exist, but that gives exactly the same figure.

data2 = {"a": [np.nan, 1, 2, 3, 4, 5], "b": [np.nan, np.nan, 2, 3, 2, 2], "c": [1, 1, 3, np.nan, np.nan, np.nan]}
df2 = pd.DataFrame(data2)
fig2 = px.area(df2)
fig2.show()



There are 3 best solutions below


I am afraid I cannot construct an elegant solution. However, this works for most of the requirements you stated. How it works:

  • Instead of using the automatic stacking, draw the lines one by one yourself.
  • That means you have to pre-process the dataframe a little by calculating the values of column a+b and column a+b+c.
  • plotly.express offers limited custom control. Instead of plotly.express, use plotly.graph_objects. They have similar syntax.
  • The order in which the "traces" (i.e. lines) are added is important. The last trace rendered gets placed on top. In your problem statement, the lines are drawn from the left-most to the right-most column, which is why overlaps favor the columns further to the right.
  • The NaN values have to be zero-filled manually before plotting. Otherwise the filled areas form weird shapes, since your sample data contains a fair number of NaNs.
import pandas as pd
import numpy as np

import plotly.graph_objects as go

data = {"a": [np.nan, 1, 2, 3, 4, 5], "b": [np.nan, np.nan, 2, 3, 2, 2], "c": [1, 1, 3, np.nan, np.nan, np.nan]}
df = pd.DataFrame(data)

# fill NAs with zeros before doing anything
df = df.fillna(0)

fig = go.Figure()

# Add the traces one by one. Order matters: the last one is drawn on top, along with its hover info.
fig.add_trace(go.Scatter(
    x=df.index,
    y=df['a'],
    mode='lines',
    fill='tonexty',  # first trace: no previous trace, so this fills down to zero
))

fig.add_trace(go.Scatter(
    x=df.index,
    y=df['a'] + df['b'],  # sum of 'a' and 'b'
    mode='lines',
    fill='tonexty',  # fill the area between this line and the previous trace
))

fig.add_trace(go.Scatter(
    x=df.index,
    y=df['a'] + df['b'] + df['c'],  # sum of 'a', 'b' and 'c'
    mode='lines',
    fill='tonexty',  # fill the area between this line and the previous trace
))

# pin the y-axis to start at zero; otherwise an area below zero is shown
fig.update_layout(yaxis=dict(range=[0, df.sum(axis=1).max() * 1.05]))
fig.show()

The resulting plot looks like this (image: stackedplot_with_line).

The green line, representing the values of df['a']+df['b']+df['c'], still sits on top. However, the hover label now shows the cumulative value df['a']+df['b']+df['c'] rather than the value of an individual asset.

In fact, I find these asset-allocation-style plots prettier without the edge lines:

(image: stackedplot_no_line)

This can be done by setting mode='none' for each of the three traces.

Remarks:

  • Another approach I tried, for anyone reading: treat each filled area and each line as two separate traces. This requires defining custom color pairs (a solid color and its half-transparent counterpart), and it gave some buggy results. A further difficulty is that traces with the stackgroup argument set cannot contain NaN values: NaNs are either zero-filled or interpolated, which produces bad plots in the context of this problem.

This thread describes a way to hide the lines, which should give the desired effect: Remove series border lines from plotly express area chart


If you set hovermode='x', you will probably get what you want. If you have many stacked features, it is probably better to set it to 'x unified' to avoid overcrowding the graph as much as possible.

You can check it out here: https://plotly.com/python/hover-text-and-formatting/

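# Option 1: show a hover label for every trace at the hovered x position
fig.update_layout(hovermode='x')
# Option 2: merge those labels into a single box (less crowded with many traces)
fig.update_layout(hovermode='x unified')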