How can I display overlapping categories?


I have a Python script that plots a dataset with the following metrics: "99th percentile", "50th percentile", "Mean", "2xx", and "4xx/5xx". The 'IsError' column is the label behind 'Service_Status': it indicates whether the service is in a normal or anomalous state.

I create a pair plot to get a better visual understanding of the dataset. However, in many cases the anomalous and normal points overlap in the graph, giving the visual impression that only one of them exists. For example, when '3_True' and '3_False' behave the same, only '3_False' is visible in the plot, which falsely suggests that there are no anomalous occurrences. I've tried adjusting the alpha to the maximum, but without success. Any suggestions on how to address this issue?

Script:

    import pandas as pd
    import matplotlib.pyplot as plt
    import seaborn as sns
    
    data = pd.read_csv('../datasets/application anomalies/data.csv')
    
    # 'Time' is not plotted, so drop it.
    data = data.drop('Time', axis=1)
    
    # Combine the service and the error flag into one label, e.g. '3_True'.
    data['Service_Status'] = data['Service'].astype(str) + '_' + data['IsError'].astype(str)
    
    # Diamonds ('D') for normal points, crosses ('X') for anomalous ones.
    markers = {'0_False': 'D', '1_False': 'D', '2_False': 'D', '3_True': 'X', '4_False': 'D', '5_False': 'D', '6_False': 'D', '3_False': 'D', '0_True': 'X', '4_True': 'X', '1_True': 'X','5_True':'X','6_True':'X'}
    
    custom_palette = sns.color_palette("Set1", len(data['Service_Status'].unique()))
    
    # Large, mostly transparent markers, styled per Service_Status.
    scatter_kws = {'s': 100, 'alpha': 0.2, 'style': data['Service_Status'],  'markers': markers}
    
    sns.pairplot(data, hue='Service_Status', vars=["99th percentile", "50th percentile", "Mean", "2xx", "4xx/5xx"], diag_kind="kde", plot_kws=scatter_kws, corner=False)
    
    plt.show()
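
One common way to reveal coincident points is to add a small random offset (jitter) to the values before plotting. The following is only a minimal sketch of that idea: `jitter`, `scale`, and `seed` are hypothetical names, not part of seaborn or of the script above, and the offsets are meant for display only since they slightly distort the values.

```python
import random


def jitter(values, scale=0.01, seed=0):
    """Return a copy of `values` with small uniform noise added.

    The noise is at most `scale` times the range of the values,
    so identical points are nudged apart without moving far.
    (Hypothetical helper, for illustration only.)
    """
    values = list(values)
    if not values:
        return []
    # Fall back to 1.0 when all values are equal, so the noise is not zero.
    span = (max(values) - min(values)) or 1.0
    rng = random.Random(seed)  # seeded so the plot is reproducible
    return [v + rng.uniform(-scale, scale) * span for v in values]
```

Assuming this suits the data, it could be applied to a column before calling `sns.pairplot`, for example `data['Mean'] = jitter(data['Mean'])`.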


Dataset sample:

https://github.com/jnobre/data-sample/blob/main/data-sample.csv
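
To confirm that anomalous points are hidden rather than absent, one could count how many anomalous rows share exact coordinates with a normal row of the same service. This is a sketch under assumptions: `hidden_overlaps` and `sample_rows` are hypothetical names, the sample values are made up for illustration, and the check only catches exact matches (points can also overlap visually without being identical).

```python
from collections import defaultdict


def hidden_overlaps(rows, x, y, service="Service", flag="IsError"):
    """Count anomalous rows whose (x, y) equals a normal row of the same service.

    `rows` is a list of dicts (it is iterated twice).
    Returns {service: count} for services with at least one match.
    """
    # Collect the coordinates of all normal points, per service.
    normal = defaultdict(set)
    for r in rows:
        if not r[flag]:
            normal[r[service]].add((r[x], r[y]))

    # Count anomalous points that land exactly on a normal point.
    counts = defaultdict(int)
    for r in rows:
        if r[flag] and (r[x], r[y]) in normal[r[service]]:
            counts[r[service]] += 1
    return dict(counts)


# Made-up rows, for illustration only.
sample_rows = [
    {"Service": 3, "IsError": False, "Mean": 10.0, "2xx": 5},
    {"Service": 3, "IsError": True, "Mean": 10.0, "2xx": 5},   # coincides with the row above
    {"Service": 3, "IsError": True, "Mean": 12.0, "2xx": 1},   # distinct point
    {"Service": 4, "IsError": True, "Mean": 10.0, "2xx": 5},   # no normal row for service 4
]
print(hidden_overlaps(sample_rows, "Mean", "2xx"))
```

With the DataFrame from the script, the rows could be obtained with `data.to_dict('records')`, for example `hidden_overlaps(data.to_dict('records'), 'Mean', '2xx')`.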
