I have four bars in my histogram, which represent the frequency of letters, digits, special characters and alphanumeric values in my file.
I did the following to get the frequency of each family:
    import pandas as pd

    def find_group(val):
        # Classify a value by the kind of characters it contains
        val = str(val).lower()
        if val.isalpha():
            return 'alphabet'
        elif val.isdigit():
            return 'digit'
        elif val.isalnum() and any(c.isalpha() for c in val) and any(c.isdigit() for c in val):
            return 'alphanum'
        else:
            return 'spec-char'

    df = pd.read_csv('file1.csv', sep=',')
    df = df.astype(str)
    df.manual_raw_value = df.manual_raw_value.str.lower()
    # Count how many values fall into each group and plot the counts
    df.manual_raw_value.apply(find_group).value_counts().plot(kind='bar')
and I got the following for the first file and the second file.
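To illustrate how the four groups are assigned, here is a small self-contained sketch of the classifier applied to a few strings. The sample values are made up for illustration and are not taken from file1.csv. Note that it calls `isalnum()` with parentheses, so the check is actually evaluated.

```python
def find_group(val):
    """Classify a value as 'alphabet', 'digit', 'alphanum' or 'spec-char'."""
    val = str(val).lower()
    if val.isalpha():
        return 'alphabet'
    elif val.isdigit():
        return 'digit'
    elif val.isalnum() and any(c.isalpha() for c in val) and any(c.isdigit() for c in val):
        return 'alphanum'
    else:
        return 'spec-char'


# Hypothetical sample values, one per expected group plus a decimal number
samples = ['hello', '2017', 'abc123', 'a-b', '12.5']
for s in samples:
    print(s, '->', find_group(s))
```

Anything containing a character that is neither a letter nor a digit (a hyphen, a dot, a space) ends up in 'spec-char', which is why '12.5' is not counted as a digit.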
Now I would like to get a stacked bar plot, so I did the following:
    import matplotlib.pyplot as plt

    normal = df.manual_raw_value.apply(find_group).value_counts()
    aug = df2.manual_raw_value.apply(find_group).value_counts()
    sub_df = pd.concat([normal, aug], keys=["alphabet", "digit", "spec-char", "alpha-num"]).unstack()
    sub_df.plot(kind='bar', stacked=True, rot=1)
    plt.show()
However, I got a wrong plot.
EDIT-1
I'm supposed to get a plot like the following, with red bars for

    aug = df2.manual_raw_value.apply(find_group).value_counts()

and blue bars for

    normal = df.manual_raw_value.apply(find_group).value_counts()
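
A minimal sketch of one way such a plot might be produced, assuming the goal is one stacked bar per group with a blue segment for `normal` and a red segment for `aug`. In `pd.concat`, `keys` labels the concatenated objects, so this sketch passes one key per Series and concatenates along `axis=1`. The helper name `combine_counts`, the `GROUPS` list and the counts are hypothetical stand-ins for the real `value_counts()` results.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Row order for the bars; labels follow the ones returned by find_group
GROUPS = ["alphabet", "digit", "spec-char", "alphanum"]


def combine_counts(normal, aug):
    """Put two value_counts() Series side by side, one column per file."""
    # One key per concatenated Series; the keys become the column names
    combined = pd.concat([normal, aug], axis=1, keys=["normal", "aug"])
    # A group absent from one file shows up as NaN: treat it as zero
    return combined.reindex(GROUPS).fillna(0).astype(int)


if __name__ == "__main__":
    # Made-up counts standing in for the real value_counts() output
    normal = pd.Series({"alphabet": 120, "digit": 45, "spec-char": 30, "alphanum": 12})
    aug = pd.Series({"alphabet": 80, "digit": 60, "alphanum": 25})

    sub_df = combine_counts(normal, aug)
    # Each row is one bar; the columns are stacked as blue (normal) and red (aug)
    sub_df.plot(kind="bar", stacked=True, rot=0, color=["blue", "red"])
    plt.show()
```

With the real data, `normal` and `aug` would be the two `value_counts()` results shown above instead of the made-up Series.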