Plotting rows of a df based on their label (colour coded)

I am trying to plot a dataset in which each row holds one series of values. I want to plot every row as its own line (so there will be 6 lines), coloured according to its label (e.g. label 0 is blue, label 1 is red). The example df is:

Label,col1,col2,col3
0,43.55,98.86,2.34
1,21.42,51.42,71.05
0,49.17,13.55,101.00
0,5.00,17.88,28.00
1,44.00,2.42,34.69
1,41.88,144.00,9.75

So the lines for rows 0, 2 and 3 should be blue, and the lines for rows 1, 4 and 5 should be red.

I tried converting the rows to columns and plotting:

import pandas as pd

df = pd.read_csv(dfname, header=0)
df.index = df['Label']  # use the label as the index
df_T = df.iloc[:, 1:].T  # drop the Label column, then transpose

df_T looks like this:

Label       0       1       0      0        1       1
col1    43.55   21.42   49.17   5.00    44.00   41.88
col2    98.86   51.42   13.55   17.88   2.42    144.00
col3    2.34    71.05   101.00  28.00   34.69   9.75

However, when I plot the columns with df_T.plot.line(), each column gets its own independent colour.

I would be happy to have your help, thanks!

There are 2 best solutions below

jezrael On BEST ANSWER

You can simplify your code by using DataFrame.set_index followed by a transpose, then pass the color parameter to DataFrame.plot, mapping the 0/1 labels to colours with a dictionary:

df_T = df.set_index('Label').T

d = {0: 'blue', 1: 'red'}  # label -> colour, as requested in the question
df_T.plot(color=df_T.columns.map(d))
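
As a self-contained check of the approach above, here is a minimal sketch that builds the question's sample data inline (rather than reading a CSV) and uses the colours requested in the question (0 blue, 1 red). The names palette and line_colors are my own, and the Agg backend line is only an assumption so the script runs without a display; drop it for interactive use.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive plotting
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Label": [0, 1, 0, 0, 1, 1],
    "col1": [43.55, 21.42, 49.17, 5.00, 44.00, 41.88],
    "col2": [98.86, 51.42, 13.55, 17.88, 2.42, 144.00],
    "col3": [2.34, 71.05, 101.00, 28.00, 34.69, 9.75],
})

# Each original row becomes a column whose name is its label
df_T = df.set_index("Label").T

# One colour per column, in column order
palette = {0: "blue", 1: "red"}
line_colors = [palette[label] for label in df_T.columns]

# Six lines, coloured by label; the legend is suppressed because
# it would otherwise repeat "0" and "1" three times each
ax = df_T.plot(color=line_colors, legend=False)
print(line_colors)
```

Passing a plain list instead of the mapped Index makes it easy to print and inspect which colour each line will get before plotting.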

A.A. On

In your approach, setting the DataFrame's index to Label was a correct step: after df_T = df.iloc[:,1:].T, the labels become the column headers of df_T. However, the plotter does not use those headers to choose colours.

When you ran df_T.plot.line(), each column of df_T (that is, each original row) was drawn in the next colour of the default colour cycle, because nothing told the method to colour the lines by their Label values.

You can try this:

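import pandas as pd
import matplotlib.pyplot as plt

data = {
    "Label": [0, 1, 0, 0, 1, 1],
    "col1": [43.55, 21.42, 49.17, 5.00, 44.00, 41.88],
    "col2": [98.86, 51.42, 13.55, 17.88, 2.42, 144.00],
    "col3": [2.34, 71.05, 101.00, 28.00, 34.69, 9.75]
}
df = pd.DataFrame(data)

# Transpose so that each original row becomes a column named after its label
df_T = df.set_index('Label').T

colors = {0: 'blue', 1: 'red'}
fig, ax = plt.subplots()
for label in df['Label'].unique():
    # Column names repeat, so df_T[label] selects every line with this label
    df_label = df_T[label]
    # ax=ax keeps all lines on one figure instead of opening a new one per label
    df_label.plot.line(ax=ax, color=colors[label], legend=False)

# Each line is labelled with its column name; keep one legend entry per label
handles, labels = ax.get_legend_handles_labels()
by_label = dict(zip(labels, handles))
ax.legend(list(by_label.values()), [f'Label {name}' for name in by_label])
plt.show()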
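
A variation that skips the transpose, offered only as a sketch: iterate over the rows and draw each one with matplotlib directly, giving a legend entry only the first time a label appears. The names value_cols and seen are my own, and the Agg backend line is an assumption so the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; remove for interactive plotting
import matplotlib.pyplot as plt
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "Label": [0, 1, 0, 0, 1, 1],
    "col1": [43.55, 21.42, 49.17, 5.00, 44.00, 41.88],
    "col2": [98.86, 51.42, 13.55, 17.88, 2.42, 144.00],
    "col3": [2.34, 71.05, 101.00, 28.00, 34.69, 9.75],
})

colors = {0: "blue", 1: "red"}
value_cols = ["col1", "col2", "col3"]

fig, ax = plt.subplots()
seen = set()
for _, row in df.iterrows():
    label = int(row["Label"])
    # Labels starting with an underscore are left out of the legend,
    # so only the first line of each group gets an entry
    legend_text = f"Label {label}" if label not in seen else "_nolegend_"
    seen.add(label)
    ax.plot(value_cols, row[value_cols].to_numpy(dtype=float),
            color=colors[label], label=legend_text)

ax.legend()
```

This keeps the row-to-label link explicit in the loop, at the cost of bypassing the pandas plotting helpers.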