I was wondering if anyone had any suggestions on what I can do to improve the running speed of my code. I wrote a Sieve of Atkin
function that returns a vector including all primes from [2, max]
inclusive.
Here is my code.
void atkin_sieve (unsigned int max, std::vector <unsigned int> & primes) {
// Sieve array up to max defaulted to false
// Index's [0, max] correspond to numbers
// Dynamic memory so all values default false
bool* sieve = new bool [max + 1];
// Square root of max number
unsigned int sqrt_max = int (sqrt (max));
// Unsigned integers declared to save time
unsigned int n, x, y;
// TODO - Explain this
for (x = 1; x < sqrt_max; x++) {
for (y = 1; y < sqrt_max; y++) {
n = (4 * x * x) + (y * y);
if (n <= max && (n % 12 == 1 || n % 12 == 5))
sieve [n] = !sieve [n];
n = (3 * x * x) + (y * y);
if (n <= max && (n % 12 == 7))
sieve [n] = !sieve [n];
n = (3 * x * x) - (y * y);
if (x > y && n <= max && (n % 12 == 11))
sieve [n] = !sieve [n];
}
}
// TODO - Explain this
for (x = 5; x < sqrt_max; x++) {
if (sieve [x]) {
n = x * x;
for (y = n; y <= max; y += n)
sieve [y] = false;
}
}
// Push primes 2, 3, and 5
primes.push_back(2);
primes.push_back(3);
primes.push_back(5);
// Start from prime 7, skip even numbers
for (x = 7; x <= max; x += 2) {
// If the number is prime
if (sieve [x])
// Push it into the vector
primes.push_back(x);
}
// Delete the sieve array
delete [] sieve;
}
I have a few questions about what I might be able to do better in order to optimize this function.
I initialized the sieve array as a dynamic array of booleans so that they would all default to false, is it faster to have the sieve be dynamic like this or should I keep it as a normal array?
I am storing the primes in a vector using a for loop after the algorithm has processed, is there a faster way that i can find all the primes in the sieve to store them in the vector?
Any other tips, tricks, hints, or code is welcome and very much appreciated, thank you.
As the the other answer has suggested, profiling is the best way to work out where the time is going.
Some suggestions and comments, some of which are probably obvious
I don't believe you actually have a guarantee of zero initialisation on your allocation of sieve; to do so you'd need to use
bool *sieve = new bool[max+1]();
. If you find things are working correctly then you're either being lucky here or the running under a compiler that zero-initialises debug builds. If that's the case, try a release build with optimisations enabled and you'll see some speed up.Your
bool[]
is most likely not going to be using up 1 bit per element, but probably a byte. You might find using astd::vector<bool>
more efficient as it's typically specialised to store the booleans densely - 1 bit per bool. You'd be trading off a reduced memory footprint vs. increased complexity of reading/writing an individual boolean.Try pre-sizing the primes array to
max / log(max)
, with appropriate input validation, as an approximation to the number of primes less than max.If the function is being called repeatedly in your application, try reusing previous sieve arrays to answer subsequent calls.