Finding the most occurring letter in every position of a string in a list of strings


I have a list of strings called words such that

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']

I have to find the most frequent letter at every position across the strings. For example, to find the most frequent first letter, I check the first letter of every string and get 'h', because it is the letter that occurs most often. (If two letters occur the same number of times, I take the one that comes first alphabetically.) The second letter is 'a', because it occurs most often at the second position of all the words; then 'r', because it is the most frequent third letter; and so on. At the end I want the string maxOccurs = "hareennt", which contains the most frequent letter for each position. This is what I have coded so far:

maxOccurs = ""
listOfChars = []

for i in range(len(words)):
    for item in words:
        listOfChars.append(item[i])

    maxOccurs += max(set(listOfChars), key=listOfChars.count)
    listOfChars.clear()

It raises an IndexError (string index out of range) when i == 4, obviously because not every word has the same length, but I can't figure out how to fix it. I would appreciate any help. P.S. I can't use any imports.


There are 2 best solutions below

MangoNrFive On BEST ANSWER

This works:

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']


maxOccurs = ""
listOfChars = []

for i in range(len(max(words, key=len))):
    for item in words:
        try:
            listOfChars.append(item[i])
        except IndexError:
            pass

    maxOccurs += max(sorted(set(listOfChars)), key=listOfChars.count)
    listOfChars.clear()

I made 3 changes to your code:

  1. Iterate by the length of the longest word in the outer for-loop
  2. Access the characters of the string in a try-block, to deal with different-length words
  3. Sort the set of characters before calling max, so that when several letters occur equally often, the alphabetically first one is returned
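
The three changes above can also be written without a try-block, by skipping words that are too short and counting with a plain dict. This is only a minimal sketch, still import-free; the function name most_common_per_position is made up for illustration and is not part of the original answer:

```python
def most_common_per_position(words):
    """Return the most frequent letter at each position, ties broken alphabetically."""
    result = ""
    longest = max((len(w) for w in words), default=0)
    for i in range(longest):
        counts = {}
        for word in words:
            if i < len(word):  # skip words too short to have position i
                counts[word[i]] = counts.get(word[i], 0) + 1
        # Highest count wins; for equal counts, the alphabetically smaller letter wins.
        result += min(counts, key=lambda ch: (-counts[ch], ch))
    return result


words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park',
         'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
print(most_common_per_position(words))  # hareennt
```

Counting in a dict avoids calling listOfChars.count once per distinct letter, which may matter for long lists.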

If imports were allowed, I would do this:

from statistics import mode
from itertools import zip_longest


words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
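# Sorting each column first makes mode() return the alphabetically first letter on ties
# (Python 3.8+, where mode() returns the first mode encountered).
maxOccurs = "".join(mode(sorted("".join(chars))) for chars in zip_longest(*words, fillvalue=""))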
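
To see what the one-liner iterates over: zip_longest transposes the words into one tuple per position and pads the shorter words with fillvalue. A small illustration, using a made-up three-word list rather than the data from the question:

```python
from itertools import zip_longest

sample = ['hi', 'hey', 'a']  # hypothetical short list, not the question's data

# Each tuple holds the letters at one position; "" stands in for missing letters,
# so joining a tuple yields just the letters that actually exist there.
columns = ["".join(chars) for chars in zip_longest(*sample, fillvalue="")]
print(columns)  # ['hha', 'ie', 'y']
```
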
Iguananaut On

The standard library is full of nice utilities for counting. Here's a one-liner that does it:

>>> from collections import Counter
>>> from itertools import zip_longest
>>> words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park', 'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']
>>> ''.join(Counter(filter(None, chars)).most_common(1)[0][0] for chars in zip_longest(*words))
'horeennt'

The only difference is that it returns 'horeennt' instead of 'hareennt', because 'o' and 'a' appear equally often in the second position, and Counter.most_common(1) returns the first item encountered when there is a tie.
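
If the alphabetical tie-break from the question matters, one option is to pick the winner with an explicit key instead of most_common. A minimal sketch, assuming the same words list; the helper name pick is hypothetical:

```python
from collections import Counter
from itertools import zip_longest

words = ['house', 'garden', 'kitchen', 'balloon', 'home', 'park',
         'affair', 'kite', 'hello', 'portrait', 'angel', 'surfing']


def pick(chars):
    """Most frequent letter in one column; ties go to the alphabetically first."""
    # zip_longest pads short words with None, so drop those before counting.
    counts = Counter(c for c in chars if c is not None)
    return min(counts, key=lambda ch: (-counts[ch], ch))


result = ''.join(pick(chars) for chars in zip_longest(*words))
print(result)  # hareennt
```
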