Python - x-ticks placed wrong in a bar plot

I created a bar plot (histogram) of categorical data by looping over the columns I want to plot. The problem is that the x-ticks are not centered under the bars, and a zero shows up in the plot even though it appears nowhere in my dataset. Could someone explain how to center the x-ticks and where that zero comes from?

I'd appreciate your help. Thanks!

Here's my code:

import matplotlib.pyplot as plt

def distribution(data, colName, title, plot_name):

    fig = plt.figure(figsize=(50, 10))

    for i, feature in enumerate(colName):
        ax = fig.add_subplot(1, 2, i+1)
        ax.hist(data[feature], bins = 25, color = 'maroon')
        ax.tick_params(axis='x', rotation=90, labelsize=20)
        ax.set_ylim((0, 2000))
        ax.set_yticks([0, 500, 1000, 1500, 2000])
        ax.set_yticklabels([0, 500, 1000, 1500, ">2000"], fontsize=20)
        ax.bar_label(ax.containers[0], size=20)

    ax.set_title(title, fontsize = 30)
    plt.savefig(plot_name, format='png')
    plt.show()
        
distribution(df, ['Subtype 1'], "Distribution of Subtype 1", "Distribution of Subtype 1.png" )

There is 1 solution below

I removed the for loop entirely, because a pandas groupby is much easier to work with. My mistake was using the histogram function instead of a bar plot: a histogram bins values along a numeric axis, so it is the wrong tool for counting categories. The new code uses barplot from the seaborn library, and the result now represents the dataset correctly.
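To illustrate where the zero labels and the off-center ticks most likely come from, here is a minimal sketch using NumPy only. The label list is hypothetical and stands in for the 'Subtype 1' column. It relies on Matplotlib placing string categories at integer positions 0, 1, 2, ... in order of first appearance, and np.histogram stands in for the binning that ax.hist performs with bins=25.

```python
import numpy as np

# Hypothetical category labels standing in for the 'Subtype 1' column.
labels = ["A", "B", "A", "C", "B", "A"]

# Matplotlib maps string categories to integer positions 0, 1, 2, ...
# in order of first appearance; reproduce that mapping by hand.
categories = list(dict.fromkeys(labels))
positions = np.array([categories.index(v) for v in labels])

# hist(..., bins=25) splits the range [0, 2] into 25 equal-width bins,
# just as it would for any numeric data.
counts, edges = np.histogram(positions, bins=25)

occupied = np.flatnonzero(counts)
print("occupied bins:", occupied)                      # only 3 of the 25 bins hold data
print("empty bins:", int((counts == 0).sum()))         # the rest have height 0
print("first bin spans:", edges[0], "to", edges[1])    # tick 0 sits on its left edge
```

With only a few categories, most of the 25 bins are empty, and bar_label writes a "0" above each of them. The category ticks stay at the integer positions, which fall on bin edges rather than on bar centers for the outermost categories.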

Here's my new code:

import matplotlib.pyplot as plt
import seaborn as sns

# Count the rows per category and store the result in a column named 'Count'
df = df.groupby("Subtype 1").size().reset_index(name="Count")

def distribution(data, colName_x, colName_y, title, plot_name):
    plt.figure(figsize=(50, 10))
    # barplot draws one bar per category, centered on its tick
    ax = sns.barplot(x=colName_x, y=colName_y, data=data, color='maroon')
    ax.tick_params(axis='x', rotation=90, labelsize=20)
    ax.set_ylim((0, 2000))
    ax.set_yticks([0, 500, 1000, 1500, 2000])
    ax.set_yticklabels([0, 500, 1000, 1500, ">2000"], fontsize=20)
    ax.bar_label(ax.containers[0], size=20)

    ax.set_title(title, fontsize=30)
    plt.savefig(plot_name, format='png', bbox_inches="tight")
    plt.show()

# Pass column names; seaborn looks them up in `data`
distribution(df, 'Subtype 1', 'Count', "Distribution of Subtype 1", "Distribution of Subtype 1.png")
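For anyone who prefers to stay with plain Matplotlib, the same idea can be sketched without seaborn: count the categories first, then draw one bar per category. This is only an illustration under assumptions; `df_demo` and its values are hypothetical stand-ins for the real DataFrame, and the Agg backend is selected only so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data standing in for the original DataFrame.
df_demo = pd.DataFrame({"Subtype 1": ["A", "B", "A", "C", "B", "A"]})

# One count per category, sorted by category label.
counts = df_demo["Subtype 1"].value_counts().sort_index()

fig, ax = plt.subplots(figsize=(8, 4))
# ax.bar centers each bar on its category position, so ticks line up.
bars = ax.bar(counts.index, counts.values, color="maroon")
ax.bar_label(bars, size=12)  # one label per category, no empty bins
ax.tick_params(axis="x", rotation=90)
ax.set_title("Distribution of Subtype 1")

# Bar centers, for comparison with the tick positions.
centers = [b.get_x() + b.get_width() / 2 for b in bars]
print(centers, list(ax.get_xticks()))
```

Because there is exactly one bar per category, no empty bars exist to be labeled with zero, and each tick sits under the middle of its bar.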
