What is the space complexity of a prime sieve with data in proportion to number of primes?


I'm practicing writing algorithms optimized for space or time complexity. With prime sieves, it seems that at minimum you must store a list of all primes found, so space proportional to the number of primes found looks like the least such an algorithm can use.

  • Is this rationale valid?
  • How would the space complexity for this algorithm be evaluated?

Wikipedia's article on the sieve of Atkin says it uses O(N^(1/2)) space. What I'm unsure about is how a sieve can use O(N^(1/2)) space when the number of primes up to N exceeds this; that's why it seems the space must at minimum be proportional to the number of primes. Am I conflating the size of the output with the algorithm's working space?
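As a quick numeric check of that concern (a throwaway bit-array counting sieve, written here just for illustration), the prime count π(N) does exceed √N from very early on:

```python
from math import isqrt

def count_primes(n):
    # Straightforward bit-array sieve of Eratosthenes, used only to count primes.
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, isqrt(n) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytes(len(range(p * p, n + 1, p)))
    return sum(sieve)

for n in (100, 10_000, 1_000_000):
    print(n, count_primes(n), isqrt(n))   # pi(N) vs sqrt(N)
# e.g. for N = 1_000_000: pi(N) = 78498, sqrt(N) = 1000
```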

In this paper on the sieve of Atkin, their algorithm prints "the prime numbers up to N...Here “memory” does not include the paper used by the printer." This seems like an unfair calculation of space.

  • I would appreciate clarification on how this should be / is actually measured objectively.
def prime_sieve(limit):
    # Maps each known composite to one of its prime factors; at any moment
    # the dict holds exactly one pending multiple per prime found so far.
    factors = dict()
    primes = []
    factors[4] = 2
    primes.append(2)
    for n in range(3, limit + 1):
        if n not in factors:
            # n has no recorded factor, so it is prime; register a future
            # multiple of n as a composite.
            factors[n * 2] = n
            primes.append(n)
        else:
            # n is composite; slide its recorded factor forward to the next
            # multiple not already present, then discard the entry for n.
            prime = factors.get(n)
            m = n + prime
            while m in factors:
                m += prime
            factors[m] = prime
            del factors[n]
    return primes
There are 3 solutions below.

BEST ANSWER

The space complexity for this algorithm is len(factors) + len(primes); the size of the dictionary plus the size of the list.

In this case, the bookkeeping is heavier than you might expect: the dictionary always holds exactly one entry per prime found, so len(factors) == len(primes), but each entry is a full key-value pair rather than the single bit per candidate that a bit-array sieve of size limit would use, and the keys stored run past the limit. For example, at the end of prime_sieve(100) the dictionary still holds these 25 entries, every key larger than 100:

{129: 43, 134: 67, 141: 47, 142: 71, 146: 73, 158: 79, 166: 83, 178: 89, 194: 97, 102: 17, 104: 2, 105: 3, 106: 53, 110: 11, 111: 37, 112: 7, 114: 19, 115: 23, 116: 29, 117: 13, 118: 59, 120: 5, 122: 61, 123: 41, 124: 31}
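You can check this empirically with a variant of the question's sieve that also returns the working dictionary (the helper name is mine):

```python
def prime_sieve_with_dict(limit):
    # Same algorithm as the question's prime_sieve, but also returns the
    # working dictionary so its final size and keys can be inspected.
    factors = {4: 2}
    primes = [2]
    for n in range(3, limit + 1):
        if n not in factors:
            factors[n * 2] = n
            primes.append(n)
        else:
            prime = factors[n]
            m = n + prime
            while m in factors:
                m += prime
            factors[m] = prime
            del factors[n]
    return primes, factors

primes, factors = prime_sieve_with_dict(100)
print(len(primes), len(factors))       # one dict entry per prime found: 25 25
print(all(k > 100 for k in factors))   # every remaining key lies past the limit: True
```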

There are several prime number sieves, with varying time and space complexity; see e.g. Wikipedia and questions like "How do I reduce the space complexity in Sieve of Eratosthenes to generate primes between a and b?"

Also, note that there's no need for prime = factors.get(n) - you've already checked if n not in factors, so you can just use prime = factors[n] (or prime = factors.pop(n), which also replaces the later del factors[n]).

ANSWER

" With prime sieves, at minimum you must store a list of all primes found. "

Incorrect. You only need the primes at or below the square root of your upper limit to sieve out all the primes in that range.
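A minimal sketch of that idea (names and interface are mine): sieve a window [lo, hi] using only the "core" primes at or below √hi.

```python
from math import isqrt

def primes_in_range(lo, hi):
    # Sieve the interval [lo, hi] using only the core primes <= sqrt(hi);
    # working space is the core primes plus one flag per number in the window.
    core_limit = isqrt(hi)
    core = bytearray([1]) * (core_limit + 1)
    core[0:2] = b"\x00\x00"
    for p in range(2, isqrt(core_limit) + 1):
        if core[p]:
            core[p * p :: p] = bytes(len(range(p * p, core_limit + 1, p)))
    core_primes = [p for p in range(2, core_limit + 1) if core[p]]

    window = bytearray([1]) * (hi - lo + 1)
    for p in core_primes:
        # First multiple of p inside the window (never p itself).
        start = max(p * p, ((lo + p - 1) // p) * p)
        for m in range(start, hi + 1, p):
            window[m - lo] = 0
    return [lo + i for i, f in enumerate(window) if f and lo + i >= 2]

print(primes_in_range(90, 120))   # [97, 101, 103, 107, 109, 113]
```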

And if your sieve is incremental and unbounded, then you only need the primes at or below the square root of the current production point.

How is that possible? By using a separate supply of the "core" primes (those at or below the square root), which it is perfectly OK to calculate with the same function, recursively. For an example, see this answer.
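Here is a sketch of such a recursive, unbounded generator; the structure follows the well-known "postponed" incremental sieve of Eratosthenes (this code is my illustration, not the linked answer's). The dictionary only ever holds one pending multiple per core prime, i.e. O(sqrt(N)/log N) entries when producing primes up to N.

```python
from itertools import count

def postponed_sieve():
    # Unbounded incremental sieve of Eratosthenes. The dict holds one pending
    # multiple per *core* prime (primes <= sqrt of the current candidate),
    # supplied by a recursive instance of this same generator.
    yield 2
    yield 3
    yield 5
    yield 7
    factors = {}
    core = postponed_sieve()        # recursive supply of core primes
    next(core)                      # skip 2 (we only step over odd numbers)
    p = next(core)                  # p = 3
    p_sq = p * p                    # 9: first composite that needs p
    for n in count(9, 2):
        if n in factors:
            step = factors.pop(n)   # n is a known composite
        elif n < p_sq:
            yield n                 # no core prime divides n: it is prime
            continue
        else:                       # n == p_sq: start crossing off p's multiples
            step = 2 * p
            p = next(core)
            p_sq = p * p
        m = n + step
        while m in factors:
            m += step
        factors[m] = step

gen = postponed_sieve()
print([next(gen) for _ in range(10)])   # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```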

And it's perfectly legit not to count the produced primes - you can indeed send them to the printer, or an external file, etc. So, the space complexity of such a sieve will be O( sqrt(N)/log(N) ) ~ O( sqrt( n/log(n) )) for n primes up to N ~= n * log n.

Also, don't go near the sieve of Atkin. Word on the street is that it's impossible to beat a properly wheel-optimized sieve of Eratosthenes with it (search for answers by GordonBGood on the subject, like this one).

ANSWER

The space complexity measurement is perfectly fair. If you replace primes.append(n) with yield n, and you process the primes one by one in a consumer routine without storing them all (for example, to find a prime with a particular property), then the storage required for the primes themselves is O(1) as a function of the number of primes.

(yield is the Python way of constructing a generator, a type of co-routine that emits values to the caller and saves the state of the function so it can be re-entered.)
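The question's sieve rewritten as a generator along those lines (a sketch):

```python
def prime_sieve_gen(limit):
    # Same algorithm as the question's prime_sieve, but yields each prime
    # instead of accumulating it, so the consumer decides how much to keep.
    factors = {4: 2}
    yield 2
    for n in range(3, limit + 1):
        if n not in factors:
            factors[n * 2] = n
            yield n
        else:
            prime = factors.pop(n)
            m = n + prime
            while m in factors:
                m += prime
            factors[m] = prime

# Find the first prime above 50 without ever storing the full prime list.
print(next(p for p in prime_sieve_gen(100) if p > 50))   # 53
```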