Character count in Python


The task: read a word from the user, count how many times each character occurs, and display the results sorted by count in descending order and, for equal counts, by character in ascending order. For example, if the user enters "management", the output should be:

a 2
e 2
m 2
n 2
g 1
t 1

This is the code I wrote for the task:

string=input().strip()
set1=set(string)
lis=[]
for i in set1:
 lis.append(i)
lis.sort()
while len(lis)>0:
 maxi=0
 for i in lis:
  if string.count(i)>maxi:
   maxi=string.count(i)
 for j in lis:
  if string.count(j)==maxi:
   print(j,maxi)
   lis.remove(j)

This code gives me the following output for the string "management":

a 2
m 2
e 2
n 2
g 1
t 1

"m" and "e" are not in sorted order. What is wrong with my code?


1
On BEST ANSWER

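The problem with your code is in how the variable maxi and the two for loops interact. "e" does not come second because the second loop removes items from lis while it is still iterating over it: once "a" is removed, "e" shifts into its slot and the loop steps past it. "e" is only printed on the next pass of the while loop, when maxi is again 2, which is why it appears after "m".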

 for i in lis:
  if string.count(i)>maxi:
   maxi=string.count(i)

 for j in lis:
  if string.count(j)==maxi:
   print(j,maxi)

There are several ways to achieve what you are looking for; the other answers explain some of them.
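To make the skipping visible, here is a minimal sketch that records which elements a for loop actually visits while matching elements are removed from the same list. The helper name visited_while_removing is made up for this illustration and is not part of the asker's code.

```python
def visited_while_removing(items, should_remove):
    """Return the elements a for loop visits when matching elements
    are removed from the list during iteration."""
    items = list(items)  # work on a copy so the caller's list is untouched
    visited = []
    for item in items:
        visited.append(item)
        if should_remove(item):
            items.remove(item)  # later elements shift one slot to the left
    return visited


word = "management"
letters = sorted(set(word))  # ['a', 'e', 'g', 'm', 'n', 't']
print(visited_while_removing(letters, lambda ch: word.count(ch) == 2))
# prints ['a', 'g', 'm', 't']: 'e' and 'n' are never visited in this pass
```

Each skipped letter slid into the slot of a letter that had just been removed, so the loop's internal index moved past it.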

0
On

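You can use a simple Counter for that:

>>> from collections import Counter
>>> Counter("management")
Counter({'m': 2, 'a': 2, 'n': 2, 'e': 2, 'g': 1, 't': 1})

Note that on Python 3.7+ letters with equal counts are shown in first-seen order, not alphabetical order, so the result still needs sorting to match the required output.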
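A Counter on its own only does the counting. One possible way to get the requested order from it, assuming Python 3.7 or later (where most_common() keeps equal counts in first-seen order), is to feed it the characters already sorted. The function name chars_by_count is hypothetical.

```python
from collections import Counter


def chars_by_count(word):
    # Counting the characters in alphabetical order means that ties are
    # first seen alphabetically; most_common() then orders by count,
    # highest first, and leaves ties in that first-seen order
    # (documented behaviour on Python 3.7+).
    return Counter(sorted(word)).most_common()


for char, count in chars_by_count("management"):
    print(char, count)
```

For "management" this prints a, e, m, n with a count of 2, followed by g and t with a count of 1.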
1
On

I'm not really sure what you are trying to achieve with a while loop and two for loops nested inside it. If you only need each character printed with its count, a single for loop over the sorted list is enough:

for i in lis:
    print(i, string.count(i))

With this the output will be as follows (in alphabetical order, not yet sorted by count):

a 2
e 2
g 1
m 2
n 2
t 1
0
On

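As answered before, you can use a Counter to get the character counts; there is no need to build a set or a list.

For sorting, use the built-in sorted function, which accepts a function through its key parameter. Because sorted is stable, you can sort alphabetically first and then sort by count without losing the alphabetical order among equal counts. Read more about sorting and lambda functions.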

>>> from collections import Counter
>>> c = Counter('management')
>>> sorted(c.items())
[('a', 2), ('e', 2), ('g', 1), ('m', 2), ('n', 2), ('t', 1)]
>>> alpha_sorted = sorted(c.items())
>>> sorted(alpha_sorted, key=lambda x: x[1])
[('g', 1), ('t', 1), ('a', 2), ('e', 2), ('m', 2), ('n', 2)]
>>> sorted(alpha_sorted, key=lambda x: x[1], reverse=True) # Reverse ensures you get descending sort
[('a', 2), ('e', 2), ('m', 2), ('n', 2), ('g', 1), ('t', 1)]
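
The two-pass approach works because Python's sort is stable, including with reverse=True. As a rough check, the sketch below compares it with a single pass that uses a tuple key; the variable names two_pass and one_pass are only for this illustration.

```python
from collections import Counter

counts = Counter("management")

# Two passes: secondary key (character) first, then primary key (count).
# Stability keeps the alphabetical order among characters with equal counts.
by_char = sorted(counts.items())
two_pass = sorted(by_char, key=lambda pair: pair[1], reverse=True)

# One pass: a tuple key, where negating the count puts larger counts first.
one_pass = sorted(counts.items(), key=lambda pair: (-pair[1], pair[0]))

print(two_pass == one_pass)  # prints True
```

Either form is fine; the tuple key simply states both sort criteria in one place.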
0
On

The issue with your code is that you remove elements from the list while you are still iterating over it. Here, you remove "a", whereupon "e" takes its spot at the front of the list, and the loop advances to the next position, which now holds "g". Thus "e" is skipped until the next pass of the while loop, by which time "m" has already been printed.

Try separating your printing and your removal, and don't remove elements from a list you're currently iterating over - instead, try adding all other elements to a new list.

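string = input().strip()
lis = sorted(set(string))  # unique characters in ascending order
while len(lis) > 0:
    # Find the highest count among the remaining characters.
    maxi = 0
    for i in lis:
        if string.count(i) > maxi:
            maxi = string.count(i)

    # Print every character with that count; lis is not modified here.
    for j in lis:
        if string.count(j) == maxi:
            print(j, maxi)

    # Rebuild lis from the characters that were not printed.
    dupelis = lis
    lis = []
    for k in dupelis:
        if string.count(k) != maxi:
            lis.append(k)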

management
a 2
e 2
m 2
n 2
g 1
t 1
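
The same idea (print one group, then rebuild the list instead of removing from it) can be written more compactly. This is only a sketch of that approach; the function name grouped_by_count is invented here, and it returns the pairs rather than printing them so the result is easy to check.

```python
def grouped_by_count(word):
    remaining = sorted(set(word))  # unique characters, ascending
    result = []
    while remaining:
        # Highest count among the characters not yet emitted.
        top = max(word.count(ch) for ch in remaining)
        # Emit every character with that count, in alphabetical order.
        result.extend((ch, top) for ch in remaining if word.count(ch) == top)
        # Build a new list instead of removing from the one being read.
        remaining = [ch for ch in remaining if word.count(ch) != top]
    return result


for ch, n in grouped_by_count("management"):
    print(ch, n)
```

Like the original, this calls word.count() repeatedly, which is fine for a single word but does more work than counting once.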


0
On

The easiest way to count the characters is to use Counter, as suggested by some previous answers. After that, the trick is to come up with a measure that takes both the count and the character into account to achieve the sorting. I have the following:

from collections import Counter

c = Counter('management')

sc = sorted(c.items(),
            key=lambda x: -1000 * x[1] + ord(x[0]))

for char, count in sc:
    print(char, count)

c.items() yields (character, count) tuples, which we can sort with sorted().

The key parameter is what matters here. sorted() puts items with smaller key values first, so a big count has to map to a small key value.

I give the count (x[1]) a large negative weight (-1000), then add the code point of the character (ord(x[0])). The result is a sort order that takes the count into account first and the character second.

An underlying assumption is that ord(x[0]) never exceeds 1000, which holds for English letters (ASCII code points are below 128).
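
To show where that assumption matters, the sketch below compares the weighted key with a tuple key on a word containing a character whose code point is above 1000. The tuple key is an alternative introduced here, not part of the answer above, and both function names are hypothetical.

```python
from collections import Counter


def weighted_order(word):
    # The key from the answer: count weighted by -1000 plus the code point.
    return sorted(Counter(word).items(),
                  key=lambda x: -1000 * x[1] + ord(x[0]))


def tuple_order(word):
    # Alternative: compare (-count, character) pairs; no magic constant.
    return sorted(Counter(word).items(), key=lambda x: (-x[1], x[0]))


word = "a\u20ac\u20ac"  # 'a' once, the euro sign (code point 8364) twice

print(weighted_order(word))  # 'a' comes first although its count is lower
print(tuple_order(word))     # the euro sign comes first, as intended
```

For plain English words the two keys give the same order; they only diverge once code points differ by 1000 or more.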