How do I count non-alphanumeric characters in a text document with Python?


Here is my code. I'm having trouble figuring out how to make my program count non-alphanumeric characters. I am a new coding student, so go easy on me.

infile = open("Gettysburg.txt", "r")
data = infile.readlines()
non_alpha_num = 0
uppercase_count = 0
lowercase_count = 0
whitespace_count = 0
digit_count = 0
for character in data:
    if character.isupper():
        uppercase_count += 1
    elif character.islower():
        lowercase_count += 1
    elif character.isspace():
        whitespace_count +=1
    elif character.isdigit():
        digit_count +=1
    if not character.isalnum() and not character.isspace():
        non_alpha_num += 1
    print("Jake's text document counter")
    print('The uppercase count is ', uppercase_count)
    print('The lowercase count is ', lowercase_count)
    print('The digit count is ', digit_count)
    print('The whitespace count is ', whitespace_count)
    print('The non alphanumeric count is ', non_alpha_num)

Sruthi On BEST ANSWER

Try:

if not character.isalnum():
    non_alpha_num += 1

To exclude whitespace:

if not character.isalnum() and not character.isspace():
    non_alpha_num += 1
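
As a quick check of how the two conditions differ, here is a small sketch on a made-up sample string (the string is an assumption for illustration, not taken from the original file):

```python
sample = "Four score, and 7 years ago!"

# Everything that is not a letter or digit, spaces included
not_alnum = sum(1 for ch in sample if not ch.isalnum())

# Same test, but skipping whitespace, so only punctuation and symbols remain
punctuation_only = sum(
    1 for ch in sample if not ch.isalnum() and not ch.isspace()
)

print(not_alnum, punctuation_only)
```

The first count includes the five spaces; the second keeps only the comma and the exclamation mark.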

EDIT: Following @ShadowRanger's comment: readlines() returns a list of lines, so your loop iterates over whole lines, not individual characters. Loop over each character within each line instead:

# Read all lines; the with block closes the file automatically
with open("Gettysburg.txt", "r") as infile:
    data = infile.readlines()

uppercase_count = 0
lowercase_count = 0
whitespace_count = 0
digit_count = 0
non_alpha_num = 0

for line in data:
    # Each item in data is a whole line, so loop over its characters
    for character in line:
        if character.isupper():
            uppercase_count += 1
        elif character.islower():
            lowercase_count += 1
        elif character.isspace():
            whitespace_count += 1
        elif character.isdigit():
            digit_count += 1
        elif not character.isalnum():
            # Whitespace was already handled above, so this is punctuation and symbols
            non_alpha_num += 1


print("Jake's text document counter")
print('The uppercase count is ', uppercase_count)
print('The lowercase count is ', lowercase_count)
print('The digit count is ', digit_count)
print('The whitespace count is ', whitespace_count)
print('The non alphanumeric count is ', non_alpha_num)
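
If you want to test the counting logic without the file, one option is to move it into a function that takes a string. This is only a sketch: count_characters is a hypothetical helper name, and the sample lines are made up to stand in for what readlines() would return.

```python
def count_characters(text):
    """Tally character categories in a string (hypothetical helper name)."""
    counts = {"upper": 0, "lower": 0, "digit": 0, "whitespace": 0, "non_alnum": 0}
    for ch in text:
        if ch.isupper():
            counts["upper"] += 1
        elif ch.islower():
            counts["lower"] += 1
        elif ch.isdigit():
            counts["digit"] += 1
        elif ch.isspace():
            counts["whitespace"] += 1
        elif not ch.isalnum():
            counts["non_alnum"] += 1
    return counts


# Made-up sample lines standing in for the result of readlines()
lines = ["Four score and 7 years ago,\n", "our fathers...\n"]

# Joining the lines gives one string, so a single loop visits every character
result = count_characters("".join(lines))
print(result)
```

With a real file, infile.read() returns the whole text as one string, which could be passed to the function directly instead of joining lines.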