How to remove strings from a list if they are shorter than the longest string in the list, in Python 2.7?

Basically, if I have a list such as:

test = ['cat', 'dog', 'house', 'a', 'range', 'abc']
max_only(test)

The output should be:

['house', 'range']

'cat' has length 3, 'dog' 3, 'house' 5, 'a' 1, 'range' 5, and 'abc' 3. The longest strings are 'house' and 'range', so they're returned.

I tried with something like this but, of course, it doesn't work :)

def max_only(lst):
    ans_lst = []
    for i in lst:
        ans_lst.append(len(i))   
        for k in range(len(lst)):
            if len(i) < max(ans_lst):
                lst.remove(lst[ans_lst.index(max(ans_lst))])
    return lst

Could you help me?

Thank you.

EDIT: What about the same thing for the min length element?

There are 5 best solutions below

2
On

Use a list comprehension and max:

>>> test = ['cat', 'dog', 'house', 'a', 'range', 'abc']
>>> max_ = max(len(x) for x in test)     # Find the length of the longest string.
>>> [x for x in test if len(x) == max_]  # Keep only the strings of that length.
['house', 'range']
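
One caveat: max() raises ValueError when given an empty sequence. Below is a minimal sketch of the same approach with a guard for that case; the function name longest_strings is made up for illustration and is not part of the answer above.

```python
def longest_strings(words):
    # Hypothetical helper name, chosen only for this sketch.
    if not words:
        # max() would raise ValueError on an empty sequence.
        return []
    longest = max(len(w) for w in words)
    return [w for w in words if len(w) == longest]

print(longest_strings(['cat', 'dog', 'house', 'a', 'range', 'abc']))  # ['house', 'range']
print(longest_strings([]))  # []
```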
2
On

You can use max() with key=len, which returns the longest string in the list (the first one, if several tie).

>>> len_max = len(max(test, key=len))
>>> [x for x in test if len(x) == len_max]
['house', 'range']

If you then keep all the strings that have the same length as that element, you get the desired result.
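
For the min-length variant asked about in the question's edit, the same pattern should carry over by swapping max() for min(). A small sketch, assuming the test list from the question:

```python
test = ['cat', 'dog', 'house', 'a', 'range', 'abc']

# min() with key=len returns the first shortest string; take its length.
len_min = len(min(test, key=len))

# Keep every string that shares that minimum length.
shortest = [x for x in test if len(x) == len_min]
print(shortest)  # ['a']
```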

9
On

A solution that loops just once:

def max_only(lst):
    result, maxlen = [], -1
    for item in lst:
        itemlen = len(item)
        if itemlen == maxlen:
            result.append(item)
        elif itemlen > maxlen:
            result[:], maxlen = [item], itemlen
    return result

max(iterable) has to loop through the whole list once, and a list comprehension picking out items of matching length has to loop through the list again. The above version loops through the input list just once.

If your input is not a sequence but an iterator, this algorithm will still work, whereas anything that first calls max() won't: max() exhausts the iterator just to find the maximum length, leaving nothing for the second pass.
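
To illustrate the iterator point, here is a small sketch comparing the single-pass function with a two-pass version when both are handed an iterator. The name two_pass is a placeholder used only for this comparison.

```python
def max_only(lst):
    # Single pass: track the best length seen so far and the items that have it.
    result, maxlen = [], -1
    for item in lst:
        itemlen = len(item)
        if itemlen == maxlen:
            result.append(item)
        elif itemlen > maxlen:
            result[:], maxlen = [item], itemlen
    return result

def two_pass(lst):
    # Hypothetical name: max() first, then a list comprehension.
    max_ = max(len(x) for x in lst)
    return [x for x in lst if len(x) == max_]

words = ['cat', 'dog', 'house', 'a', 'range', 'abc']
print(max_only(iter(words)))  # ['house', 'range']
print(two_pass(iter(words)))  # [] because max() already consumed the iterator
```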

Timing comparison on 100 random words of length 0 to 9, with each call repeated 1 million times:

>>> import timeit
>>> import random
>>> import string
>>> words = [''.join([random.choice(string.ascii_lowercase) for _ in range(1, random.randrange(11))]) for _ in range(100)]
>>> def max_only(lst):
...     result, maxlen = [], -1
...     for item in lst:
...         itemlen = len(item)
...         if itemlen == maxlen:
...             result.append(item)
...         elif itemlen > maxlen:
...             result[:], maxlen = [item], itemlen
...     return result
... 
>>> timeit.timeit('f(words)', 'from __main__ import max_only as f, words')
23.173006057739258
>>> def max_listcomp(lst):
...     max_ = max(len(x) for x in lst)
...     return [x for x in lst if len(x) == max_]
... 
>>> timeit.timeit('f(words)', 'from __main__ import max_listcomp as f, words')
36.34060215950012

Replacing result.append() with a cached r_append = result.append outside the for loop shaves off another 2 seconds:

>>> def max_only(lst):
...     result, maxlen = [], -1
...     r_append = result.append
...     for item in lst:
...         itemlen = len(item)
...         if itemlen == maxlen:
...             r_append(item)
...         elif itemlen > maxlen:
...             result[:], maxlen = [item], itemlen
...     return result
... 
>>> timeit.timeit('f(words)', 'from __main__ import max_only as f, words')
21.21125817298889

And by popular request, a min_only() version:

def min_only(lst):
    result, minlen = [], float('inf')
    r_append = result.append
    for item in lst:
        itemlen = len(item)
        if itemlen == minlen:
            r_append(item)
        elif itemlen < minlen:
            result[:], minlen = [item], itemlen
    return result
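
Since max_only() and min_only() differ only in the comparison and the starting value, the two could be folded into one function. This is just a sketch of that idea; extreme_only and its better parameter are hypothetical names, and None is used here as the "nothing seen yet" marker instead of -1 or float('inf').

```python
import operator

def extreme_only(items, better):
    # Hypothetical generalization: better(a, b) says whether length a beats length b.
    result, best = [], None
    for item in items:
        n = len(item)
        if best is None or better(n, best):
            # New best length: start a fresh result list.
            result, best = [item], n
        elif n == best:
            result.append(item)
    return result

test = ['cat', 'dog', 'house', 'a', 'range', 'abc']
print(extreme_only(test, operator.gt))  # ['house', 'range']
print(extreme_only(test, operator.lt))  # ['a']
```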

More fun still, a completely different tack: sorting on length:

from itertools import groupby

def max_only(lst):
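    # sorted() stays stable with reverse=True, so ties keep their input order.
    return list(next(groupby(sorted(lst, key=len, reverse=True), key=len))[1])

def min_only(lst):
    return list(next(groupby(sorted(lst, key=len), key=len))[1])

These work by sorting by length, then picking out the first group of words with equal length. For max_only() we sort in reverse so the longest words come first; because sorted() remains stable with reverse=True, words of equal length keep their original order and no re-reversal is needed. Sorting has an O(N log N) cost, making this less efficient than the two-pass O(N) solutions in other answers here or my single-pass O(N) solution above (the timing below assumes the function was saved as max_only_sorted):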

>>> timeit.timeit('f(words)', 'from __main__ import max_only_sorted as f, words')
52.725801944732666

Still, the sorting approach gives you a fun one-liner.
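
To see why taking the first group is enough, it may help to look at the intermediate values. A short sketch using the list from the question, sorted in ascending order as in min_only():

```python
from itertools import groupby

test = ['cat', 'dog', 'house', 'a', 'range', 'abc']

# Stable sort by length: equal-length words keep their original order.
by_length = sorted(test, key=len)
print(by_length)  # ['a', 'cat', 'dog', 'abc', 'house', 'range']

# groupby() bundles consecutive words that share the same length.
groups = [(length, list(group)) for length, group in groupby(by_length, key=len)]
print(groups)  # [(1, ['a']), (3, ['cat', 'dog', 'abc']), (5, ['house', 'range'])]
```

The first group holds the shortest words; sorting in descending order puts the longest words in the first group instead.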

2
On

This works:

max_len = len(max(test, key=len))

result = [word for word in test if len(word) == max_len]
5
On
>>> test = ['cat', 'dog', 'house', 'a', 'range', 'abc']
>>> filter(lambda x,m=max(map(len, test)):len(x)==m, test)
['house', 'range']

In Python 3.x, filter() returns an iterator rather than a list, so you would need to use list(filter(...)).
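
A sketch of how that might look in a form that runs on both Python 2.7 and 3.x; the maximum length is computed up front here rather than through a lambda default argument, which is a stylistic change from the one-liner above.

```python
test = ['cat', 'dog', 'house', 'a', 'range', 'abc']

# Length of the longest string.
longest = max(map(len, test))

# list() forces the result into a list on Python 3, where filter() is lazy.
result = list(filter(lambda x: len(x) == longest, test))
print(result)  # ['house', 'range']
```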