I want to check if the list values have some level of "closeness". Is there a good algorithm to do this? Bonus points for the most pythonic way.
Valid
[1,7,8,9]
[3,4,100,101,102,103,104,105]
Not Valid
[1,8,9]
[1,10]
[100,200,300,400,500]
Here is a simple linear-time algorithm for an array a that is already sorted (as in the examples; otherwise it needs to be sorted beforehand at an extra O(n log n) cost). The idea is to construct and test each maximal qualifying subsequence that starts at a given position low.
low = middle = high = 1
while low <= length(a):
    advance middle to the largest i such that a[i] * 0.8 <= a[low]
    advance high to the largest i such that a[i] <= a[middle] * 1.2
    if (high - low + 1) / length(a) >= 0.7:
        return true
    low = low + 1
return false
Since low, middle, and high only ever increase from 1 through length(a), the running time is always linear in length(a).

If the matching subsequence of a is desired, one can output a[low]...a[high] instead of true.
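A direct Python translation of the pseudocode above, assuming the criterion implied by the examples (at least 70% of the values lie within a 20% band); the function name is illustrative:

```python
def is_valid(values, spread=0.2, fraction=0.7):
    """True if at least `fraction` of the sorted values fit inside a band
    where the lowest is within `spread` below some anchor value `middle`
    and the highest is within `spread` above it."""
    a = sorted(values)
    n = len(a)
    middle = high = 0
    for low in range(n):
        # advance middle to the largest index with a[middle] * 0.8 <= a[low]
        while middle + 1 < n and a[middle + 1] * (1 - spread) <= a[low]:
            middle += 1
        middle = max(middle, low)
        # advance high to the largest index with a[high] <= a[middle] * 1.2
        high = max(high, middle)
        while high + 1 < n and a[high + 1] <= a[middle] * (1 + spread):
            high += 1
        if (high - low + 1) / n >= fraction:
            return True
    return False
```

Because all three indices only move forward, the loop body does O(n) total pointer work over the whole run, matching the linear bound above.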
Here is an algorithm that takes O(n log n) time.

sort the array
for i in range(len(array)):
    begin = binary search for the smallest index such that array[begin] >= array[i] * 0.8
    end   = binary search for the largest index such that array[end] <= array[i] * 1.2
    if (end - begin + 1) >= len(array) * 0.7:
        # 70% of the values are within 20% of array[i],
        # i.e. all elements between begin and end are within 20% of array[i]
        return true
Several optimizations, including changing the iteration order, are possible.
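The binary searches map naturally onto the standard library's bisect module; a sketch, with an illustrative function name, assuming the 70%-within-20%-of-some-value criterion:

```python
import bisect

def is_valid_nlogn(values, spread=0.2, fraction=0.7):
    """O(n log n): sort once, then for each value count via binary search
    how many elements fall within `spread` (20%) of it."""
    a = sorted(values)
    n = len(a)
    for x in a:
        # smallest index with a[begin] >= x * 0.8
        begin = bisect.bisect_left(a, x * (1 - spread))
        # one past the largest index with a[end] <= x * 1.2
        end = bisect.bisect_right(a, x * (1 + spread))
        if end - begin >= fraction * n:
            return True
    return False
```

Note this anchors the 20% band on each value in turn, which is a slightly different (symmetric) criterion than the linear algorithm's; both agree on the examples in the question.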
For small lists a brute-force O(n^2) check will suffice.
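Such a brute-force check might look like the following (a sketch, assuming the same 70%-within-20% criterion; the function name is mine). It needs no sorting, which keeps it simple for short inputs:

```python
def is_valid_quadratic(values, spread=0.2, fraction=0.7):
    """O(n^2) brute force: for each value, linearly count how many
    values fall within `spread` (20%) of it."""
    n = len(values)
    for x in values:
        close = sum(1 for y in values
                    if x * (1 - spread) <= y <= x * (1 + spread))
        if close >= fraction * n:
            return True
    return False
```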