Automating a plotting routine in Python

Objective: I would like to create a function that plots every column of a given data frame as a stacked plot (one subplot per column). The data frame can have any number of columns N.

A generic code for plotting a stacked plot in Plotly is the following:

from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot
import plotly.graph_objs as go
from plotly import tools

trace1 = go.Scatter(
    x=[0, 1, 2],
    y=[10, 11, 12]
)
trace2 = go.Scatter(
    x=[0, 1, 2],
    y=[100, 110, 120],
)
trace3 = go.Scatter(
    x=[0, 1, 2],
    y=[1000, 1100, 1200],
)
fig = tools.make_subplots(rows=3, cols=1, specs=[[{}], [{}], [{}]],
                          shared_xaxes=True, shared_yaxes=False,
                          vertical_spacing=0.1)
fig.append_trace(trace1, 1, 1)
fig.append_trace(trace2, 2, 1)
fig.append_trace(trace3, 3, 1)
plot(fig)

The How: How do I create a loop that will create the code for Plotly to plot?

My Attempt

import pandas as pd
import numpy as np

df = pd.DataFrame()
df['x'] = np.array([0, 1, 2])
df['y1'] = np.array([10, 11, 12])
df['y2'] = np.array([100, 110, 120])
df['y3'] = np.array([1000, 1100, 1200])

d = {}
for i in np.arange(df.shape[0]):
    d["trace{0}".format(i)] = "go.Scatter(x=[{0}],y=[{1}])".format(df.iloc[:,0], df.iloc[:, i])

fig = tools.make_subplots(rows=3, cols=1, specs=[[{}], [{}], [{}]],
                          shared_xaxes=True, shared_yaxes=False,
                          vertical_spacing=0.1)

for index, key in enumerate(d):
    fig.append(d[key], index+1, 1)
plot(fig)

Running this, I get the following error:

     23 for index, key in enumerate(d):
---> 24         fig.append(d[key], index+1, 1)
     25 plot(fig)
     26

TypeError: 'NoneType' object is not callable

How do I make it work?

Best Answer

You need to build fig the same way Plotly builds it. Running the generic code from your question, I get the following structure for fig:

In [4]: fig
Out[4]:
{'data': [{'type': 'scatter',
   'x': [0, 1, 2],
   'xaxis': 'x1',
   'y': [10, 11, 12],
   'yaxis': 'y1'},
  {'type': 'scatter',
   'x': [0, 1, 2],
   'xaxis': 'x1',
   'y': [100, 110, 120],
   'yaxis': 'y2'},
  {'type': 'scatter',
   'x': [0, 1, 2],
   'xaxis': 'x1',
   'y': [1000, 1100, 1200],
   'yaxis': 'y3'}],
 'layout': {'xaxis1': {'anchor': 'y3', 'domain': [0.0, 1.0]},
  'yaxis1': {'anchor': 'free',
   'domain': [0.7333333333333334, 1.0],
   'position': 0.0},
  'yaxis2': {'anchor': 'free',
   'domain': [0.3666666666666667, 0.6333333333333333],
   'position': 0.0},
  'yaxis3': {'anchor': 'x1', 'domain': [0.0, 0.26666666666666666]}}}
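
The yaxis domain values in this layout are consistent with splitting the unit interval into three equal rows separated by vertical_spacing=0.1. Below is a minimal sketch of that arithmetic. The helper name subplot_domains is hypothetical, and this is not Plotly's own implementation, only a calculation that reproduces the numbers shown above.

```python
def subplot_domains(rows, vertical_spacing):
    """Return [bottom, top] y-domains for stacked rows, top row first.

    Hypothetical helper: it mirrors the numbers in the printed layout,
    it is not taken from Plotly's source.
    """
    if rows < 1:
        raise ValueError("rows must be at least 1")
    # Height left for each row after reserving the gaps between rows.
    height = (1.0 - vertical_spacing * (rows - 1)) / rows
    if height <= 0:
        raise ValueError("vertical_spacing is too large for this many rows")
    domains = []
    for r in range(rows):  # r = 0 is the top row (yaxis1)
        top = 1.0 - r * (height + vertical_spacing)
        domains.append([top - height, top])
    return domains


for axis_number, (bottom, top) in enumerate(subplot_domains(3, 0.1), start=1):
    print("yaxis{0}: [{1:.4f}, {2:.4f}]".format(axis_number, bottom, top))
```

The same arithmetic shows why a fixed vertical_spacing stops working as N grows: the gaps alone use up the whole figure once vertical_spacing * (rows - 1) reaches 1.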

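Your attempt calls fig.append instead of fig.append_trace. It also stores each trace as a string rather than a go.Scatter object, and it loops over the rows of the data frame (df.shape[0]) instead of its columns. With those changes made, I have created a function that takes a data frame and plots all of its columns as a stacked plot: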

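import numpy as np
import pandas as pd
import plotly.graph_objs as go
from plotly import tools
from plotly.offline import plot


def plot_plotly(dataframe):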
    """
    Plots all of the columns in a given dataframe as a stacked plot.

    Note: Plotly becomes extremely slow with more than about 100,000 data
    points, so this function returns without plotting when the data frame
    has 100,000 rows or more.

    Example:
    ---------
    df = pd.DataFrame()
    df['x'] = np.array([0, 1, 2])
    df['y1'] = np.array([10, 11, 12])
    df['y2'] = np.array([100, 110, 120])
    df['y3'] = np.array([1000, 1100, 1200])
    df['y4'] = np.array([2000, 3000, 1000])

    # Selecting first four columns
    df1 = df.iloc[:, :4]

    plot_plotly(df1)
    """

    if dataframe.shape[0] >= 100000:
        print("Data Frame too Large to plot")
        return None

    # Column 0 supplies the shared x values; every other column gets its own row.
    n_rows = dataframe.shape[1] - 1
    x_values = list(dataframe.iloc[:, 0].values)

    # Keep the traces in a list so they are appended in column order.
    traces = [go.Scatter(x=x_values, y=list(dataframe.iloc[:, i + 1].values))
              for i in range(n_rows)]

    # make_subplots expects one single-cell spec per row.
    spec_list = [[{}] for _ in range(n_rows)]
    fig = tools.make_subplots(rows=n_rows, cols=1, specs=spec_list,
                              shared_xaxes=True, shared_yaxes=False,
                              vertical_spacing=0.1)

    # Subplot rows are 1-indexed.
    for index, trace in enumerate(traces):
        fig.append_trace(trace, index + 1, 1)
    return plot(fig)