I was wondering if anyone had any suggestions on what I can do to improve the running speed of my code. I wrote a Sieve of Atkin function that returns a vector including all primes from [2, max] inclusive.
Here is my code.
void atkin_sieve (unsigned int max, std::vector <unsigned int> & primes) {
// Sieve array up to max defaulted to false
// Index's [0, max] correspond to numbers
// Dynamic memory so all values default false
bool* sieve = new bool [max + 1];
// Square root of max number
unsigned int sqrt_max = int (sqrt (max));
// Unsigned integers declared to save time
unsigned int n, x, y;
// TODO - Explain this
for (x = 1; x < sqrt_max; x++) {
for (y = 1; y < sqrt_max; y++) {
n = (4 * x * x) + (y * y);
if (n <= max && (n % 12 == 1 || n % 12 == 5))
sieve [n] = !sieve [n];
n = (3 * x * x) + (y * y);
if (n <= max && (n % 12 == 7))
sieve [n] = !sieve [n];
n = (3 * x * x) - (y * y);
if (x > y && n <= max && (n % 12 == 11))
sieve [n] = !sieve [n];
}
}
// TODO - Explain this
for (x = 5; x < sqrt_max; x++) {
if (sieve [x]) {
n = x * x;
for (y = n; y <= max; y += n)
sieve [y] = false;
}
}
// Push primes 2, 3, and 5
primes.push_back(2);
primes.push_back(3);
primes.push_back(5);
// Start from prime 7, skip even numbers
for (x = 7; x <= max; x += 2) {
// If the number is prime
if (sieve [x])
// Push it into the vector
primes.push_back(x);
}
// Delete the sieve array
delete [] sieve;
}
I have a few questions about what I might be able to do better in order to optimize this function.
I initialized the sieve array as a dynamic array of booleans so that they would all default to false, is it faster to have the sieve be dynamic like this or should I keep it as a normal array?
I am storing the primes in a vector using a for loop after the algorithm has processed, is there a faster way that i can find all the primes in the sieve to store them in the vector?
Any other tips, tricks, hints, or code is welcome and very much appreciated, thank you.
Depending on how many primes you are returning, using
std::vectormay be causing a problem, each time a vector has to grow to handle more objects than its capacity it will probably need to create a new array and copy all the values from the old array to the new. Consider using astd::listor astd::dequeto avoid this issue. If you really need avectorit may be faster to loop through and count the number of primes, thenreservethat much size in the vector, so that the vector never needs to grow.You should probably do some profiling on the code - how you do this depends on your development environment. This should tell you where most of your code's time is spent. Without that info, you could spend ages optimizing one part of the code when it won't have a huge effect on the result.
At the very least add some timing code, so you can see whether any changes you make help, and by how much.
The best optimizations are usually changing the algorithm, usually doing things like loop unrolling etc are best left to the compiler, unless the code is particularly time critical (even then maybe not).
Also, make sure you are compiling with optimizations on - this can make a huge difference.