Problem with quicksort function on arrays of 64 bits integers in C

98 Views Asked by At

I am currently working in C with large sets of 64-bit integers, and I naturally need a quicksort algorithm to simplify any incoming function on those sets. I tried a classic implementation already found in several places on this website (such as here) and ended up with the following code:

#include <stdio.h>
#include <stdlib.h>         
#include <time.h>
#include <stdint.h>

static void swap_int(uint64_t *a, uint64_t *b)
{
    uint64_t tmp = *a;
    *a  = *b;
    *b = tmp;
}

uint64_t QuickSortPartition(uint64_t *array, uint64_t begin, uint64_t end) {

    uint64_t i = begin, j;

    for (j = begin; j <= end; j++)
    {
        if (array[j] < array[end])
            swap_int(array + j, array + i++);
    }
    swap_int(array + i, array + end);

    return i;
}

void QuickSortFunction(uint64_t *array, uint64_t begin, uint64_t end) {
    if (begin < end) {
        uint64_t pivot = QuickSortPartition(array, begin, end);
        QuickSortFunction(array, begin, pivot - 1);
        QuickSortFunction(array, pivot + 1, end);
    }
}

It works perfectly fine for small sets as far as I have tested. I then went on to pseudorandomly generate 64 bits integers using the following function, and test the quicksort:

uint64_t rnd64(uint64_t n)
{
    const uint64_t z = 0x9FB21C651E98DF25;

    n ^= ((n << 49) | (n >> 15)) ^ ((n << 24) | (n >> 40));
    n *= z;
    n ^= n >> 35;
    n *= z;
    n ^= n >> 28;

    return n;
}

int main(int argc, char const *argv[]) {
    int n = 64;
    uint64_t Size = strtoull(argv[1], NULL, 10);

    uint64_t *S = malloc(Size * sizeof(uint64_t));

    uint64_t state = 1;
    for (uint64_t i = 0; i < Size; i++)
    {
        const uint64_t n = rnd64(state++);
        S[i] = n;
    }

    QuickSortFunction(S, 0, Size - 1);

    printf("Sorted S:\n");
    Display_set(S, Size, n);
}

Where Size is arbitrarily entered as an argv parameter. Moreover, Display_set is a standard function printing each element of S in binary and in order. This works perfectly fine for Size <= 11, but once Size = 11, this produces a segmentation fault. What am I doing wrong here?

I tried implementing other standard quicksort methods such as the pivot in first or last position, to no avail. As this function should (unless I am also wrong here) work for int type, I am guessing the transition to uint64_t induces an error.

4

There are 4 best solutions below

5
ZWORX52 On BEST ANSWER

You don't have to reinvent the wheel; check out the C standard library function qsort. It lets you specify the size of the elements, which you can set to sizeof(your integer type).

If quicksort isn't the algorithm you want, its sister functions mergesort and heapsort may be more useful.

The comparison function -- the last argument -- can be simply implemented as

int compare_int_type(const void *a, const void *b) {
    return *(const int_type *)a - *(const int_type *)b;
}

qsort is then called as

// for the last argument, you're passing the function,
// so it's important not to write compare().
qsort(array, size, sizeof(*array), compare);
3
chux - Reinstate Monica On

On review, OP says, Size was 11 or so and therefore the overflow issue below is not a concern for OP now.


Potential overflow

The type of the numbers is uint64_t --> good.

The array size and indexing should be size_t. If Size is a value more than SIZE_MAX/sizeof(uint64_t), then Size*sizeof(uint64_t) is too big for malloc(size_t) and you have problems.

Limit array.

0
rcgldr On

The code in the question just needs one line fixed, to change begin and end from uint64_t to int64_t. This is done to handle cases l like begin = 0 and end = 0xffffffffffffffff, which should be begin = 0 and end = -1 so that if (begin < end) { is false in those cases.

void QuickSortFunction(uint64_t *array, int64_t begin, int64_t end){

QuickSortPartition() is doing one more loop than needed, but it's not a problem.

    for (j = begin; j < end; j++)       /* no need for j <= end */
0
chqrlie On

There is a subtle problem in your implementation: if QuickSortPartition returns 0, which happens when the pivot element is greater or equal to all entries at the beginning of the array, passing pivot - 1 in a recursive call passes a very large (UINT64_MAX) value as the end index, causing QuickSortPartition to iterate far beyond the end of the array, invoking undefined behavior such as a segmentation fault as you observe.

Here is a modified version:

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>         
#include <time.h>

static void swap_uint64(uint64_t *a, uint64_t *b) {
    uint64_t tmp = *a;
    *a  = *b;
    *b = tmp;
}

size_t QuickSortPartition(uint64_t *array, size_t begin, size_t end) {
    size_t i = begin;
    for (size_t j = begin; j < end; j++) {
        if (array[j] < array[end])
            swap_uint64(array + j, array + i++);
    }
    swap_uint64(array + i, array + end);
    return i;
}

void QuickSortFunction(uint64_t *array, size_t begin, size_t end) {
    if (begin < end) {
        size_t pivot = QuickSortPartition(array, begin, end);
        if (pivot > begin)
            QuickSortFunction(array, begin, pivot - 1);
        QuickSortFunction(array, pivot + 1, end);
    }
}

Note however that this simplistic implementation has a pathological case if the array is already sorted: the complexity is quadratic and the recursion depth can reach the length of the array, causing a stack overflow.

The recursion problem can be handled with this modified version that only recurses on the smaller slice:

void QuickSortFunction(uint64_t *array, size_t begin, size_t end) {
    while (begin < end) {
        size_t pivot = QuickSortPartition(array, begin, end);
        if (pivot - begin < end - pivot) {
            if (pivot > begin)
                QuickSortFunction(array, begin, pivot - 1);
            begin = pivot + 1;
        } else {
            QuickSortFunction(array, pivot + 1, end);
            end = pivot - 1;
        }
    }
}

This does not solve the pathologic cases where substantial portions of the array are already sorted. To sort the array more reliably, with a method that may not always be a variation of QuickSort, you should use the standard library function qsort:

#include <stdlib.h>

int compare_uint64(const void *aa, const void *bb) {
    const uint64_t *a = aa;
    const uint64_t *b = bb;
    return (*a > *b) - (*a < *b);
}

void QuickSortFunction(uint64_t *array, size_t begin, size_t end) {
    if (begin < end)
        qsort(array, end - begin + 1, sizeof(*array), compare_uint64);
}

For very large sets, you could get better performance by implementing Radix Sort, which is well suited for arrays of unsigned integers.