Make RadixSort with linked lists on C++

403 Views Asked by At

I'm learning C++ and learning linked lists. I'm currently trying to make a radix sort for this type of lists, but my approach is not working so I was wondering if anyone could advice me on how to do it. Here's my code:

void simpleList::radixSort() {
    for (int i = 0; i < numberOfDigits(); i++){
        Node * tmp = firstNode;
        for (int j = 0; j < counter(); j++){
            int pow = 10;
            for (int k = 0; k < 10; k++){
                if (tmp -> data % pow == k){
                    insertFirst(tmp->data);
                    tmp = tmp -> next;
                }
                pow= pow * 10;
            }
        }
    }

}

The function numberOfDigits() gets you the amount of digits on the max number of the list, and counter() the amount of elements in the list. Also insertFirst() puts the number at the beggining of the list.

3

There are 3 best solutions below

2
Dúthomhas On BEST ANSWER

A few notes to help you on your way:

Radix Sort

A radix sort is any kind of (stable n-ary bucket sort) repeated until you run out of digits in your sort keys. Make your life simple: use binary as your radix.

Keys

You are using int values in your list, but that need not be the case. In general, you need a value→key function that returns an integer key for each element of your list, where “equivalent” elements return the same key. For a list of integers the value→key function is the identity function, so easy enough.
For convenience, I will use lambdas for the value→key functions below.

Reducing Passes

You can reduce the number of times you bucket sort to only those bits that are not the same for all keys. Before the first pass through your list, you do not know anything about the keys, so we can gather information at the same time as the first bucket sort:

key_type key_mask       =  0; // bits to sort
key_type common_mask    = ~0; // (bits set in all keys)
auto value_to_bucket_fn = [&]( const T & element )
{
  // here we gather information about the keys
  key_type key = user_supplied_value_to_key_fn( element );
  key_mask    |= key;
  common_mask &= key;
  // here we return our first bucket key == bit 0
  return key & 1;
};

binary_bucket_sort( value_to_bucket_fn );

Once we have made that first pass, we can get a bitmask indicating which bits need sorting in our keys:

key_mask ^= common_mask;  // key_mask ← set bits == bits to sort

All remaining passes can now be managed with a simple loop, ending when we run out of bits needing sorting:

for (int shift = 1;  key_mask >>= 1;  shift++)
  if (key_mask & 1)
    binary_bucket_sort( [&]( const T & element )
    {
      return (user_supplied_value_to_key_fn( element ) >> shift) & 1;
    } );

Bucket Sort

A linked list is perfect for bucket sorts of large, unwieldly objects. Remember: the bucket sort must be stable, meaning it must not mix up the order of “equivalent” elements. This is imperative for a radix sort to work properly.

For a binary bucket sort over a linked list, life is pretty simple — only two buckets are necessary, and you only need keep track of the last node in each bucket to append.

If you are using a doubly-linked list that bookkeeping is done for you. If you are using a singly-linked list you will have to do it manually.

Node   heads[2] = { Node{},    Node{}    };
Node * tails[2] = { &heads[0], &heads[1] };

while (head)
{
  int bucket = value_to_bucket_fn( head->data );
  tails[bucket]->next = head;  // append current head to correct bucket
  tails[bucket]       = head;  // bucket’s new tail is the current head
  head = head->next;  // pop current head; get new current head
}

Hopefully you can see how easy this would be to expand to any number of buckets. Still, we will stick with two.

Once you have split all the nodes into the two buckets, just join them back together into your new complete list. Don’t forget to clean up the tail’s next pointer!

head = heads[0]->next;
tails[0]->next = heads[1]->next;
tails[1]->next = nullptr;

Be sure to check out trincot’s answer to see how he defined his singly-linked list with a lastNode pointer and useful functions to make all this splitting into buckets (“partitions”) and joining the final list into invocations of very inexpensive member functions.

Generics

This answer spends some time going on about keys and non-integer values. I have defined my list type’s nodes as:

struct node_type
{
  T           data;
  node_type * next;
};

And I have defined the sorting functions as:

template <typename ValueToBucket>
void binary_bucket_sort( ValueToBucket value_to_bucket );
  
template <typename ValueToKey>
void radix_sort( ValueToKey value_to_key );

Then, when I sorted my test lists, I used variations of:

list <int> xs;
...
xs.radix_sort( []( int x ) { return x; } );

You can do things like observe the stability in the sort by playing with the value→key function (lambda). For example, I could define a list of integers where the one’s digit didn’t matter:

xs.radix_sort( []( int x ) { return x / 10; } );

Or a list of floats where the fractional part only mattered to two decimal places:

xs.radix_sort( []( double x ) { return static_cast <long long> ( x * 100 ); } );

I could have a list of Student where I am only interested in sorting by the student’s ID:

xs.radix_sort( []( const Student & student ) { return student.ID; } );

As long as the value→key function returns a sufficiently unique integer value we can use radix sort.

0
trincot On

The main problem in your approach is that the only thing that can happen with a node is that it eventually gets moved to the start of the list. But there is nothing else that can happen with a node. For instance, there is no logic that leaves a node where it is and then moves on to the next. Instead the code keeps looking at the same node until it can be moved. It should be clear that this cannot result in a sorted order.

Secondly, if you are using radix 10, you will need 10 different possible destinations for a node ("buckets"), depending on the digit that is inspected. These would be linked lists as well, but you need to somehow manage them. Then when all nodes have been distributed over this buckets, the buckets should be joined again into a single list.

I would suggest using radix 2. Then you only need two "buckets". Also, I would suggest to keep track of the last node in a list (if you haven't already done so), with a lastNode member in your class.

Here is an implementation with radix 2 and the use of two bucket linked lists in each pass:

#include <iostream>
#include <vector>

class Node {
public:
    Node *next = nullptr;
    int data;

    Node (int data): data(data) {};
    Node (int data, Node *next): data(data), next(next) {};
};

class simpleList {
public:
    Node *firstNode = nullptr;
    Node *lastNode = nullptr;

    simpleList() {}

    simpleList(std::vector<int> data) {
        for (auto value: data) {
            append(value);
        }
    }

    void clear() { // Does not free nodes
        firstNode = lastNode = nullptr;
    }

    // Three variants of append. To append:
    // * A node
    // * A value, for which a Node will be created
    // * Another linked list, which will be emptied
    void append(Node *node) {
        if (!firstNode) {
            firstNode = node;
        } else {
            lastNode->next = node;
        }
        lastNode = node;
        node->next = nullptr;
    }

    void append(int data) {
        append(new Node(data));
    }

    void append(simpleList *other) {
        if (firstNode) {
            lastNode->next = other->firstNode;
        } else {
            firstNode = other->firstNode;
        }
        if (other->firstNode) {
            lastNode = other->lastNode;
            other->clear();
        }
    }

    Node *popFirstNode() {
        auto node = firstNode;
        if (firstNode) {
            firstNode = firstNode->next;
            if (!firstNode) {
                lastNode = nullptr;
            }
            node->next = nullptr;
        }
        return node;
    }

    void print() {
        auto node = firstNode;
        while (node) {
            std::cout << node->data << " ";
            node = node->next;
        }
        std::cout << "\n";
    }

    void radixSort() {
        bool hasMoreBits = true;
        simpleList *partitions[] = {new simpleList(), new simpleList()};
        for (int bit = 0; hasMoreBits; bit++) {
            hasMoreBits = false;
            while (firstNode) {
                hasMoreBits |= ((firstNode->data >> bit) >> 1) != 0;
                int digit = (firstNode->data >> bit) & 1;
                partitions[digit]->append(popFirstNode());
            }
            append(partitions[0]);
            append(partitions[1]);
        }
    }

};

// Demo
int main() {
    auto list = new simpleList({4, 9, 1, 2, 6, 8, 3, 7, 5});
    list->print();
    list->radixSort();
    list->print();
}
0
rcgldr On

Example of a base 256 (8 bit) radix sort for linked list using 64 bit unsigned integers for data. The list structure used in the code uses a pointer to pointer for the tail to simplify the code. A base 256 (8 bit, 8 sort passes) is about 8 times as fast as a base 2 (single bit, 64 sort passes) radix sort.

typedef struct NODE_{                   // node struct
    struct NODE_ * next;
    uint64_t       data;
}NODE;

typedef struct LIST_{                   // list struct
    NODE *  phead;
    NODE ** pptail;
}LIST;

NODE * RadixSort(NODE * plist)
{
LIST alist[256];                        // array of lists
NODE *pnode = plist;
uint32_t i, j, k;
size_t x;
    for(k = 0; k < 64; k += 8){         // radix sort
        for (i = 0; i < 256; i++) {     //  reset alist
            alist[i].phead = 0;
            alist[i].pptail = &alist[i].phead;
        }
        pnode = plist;                  //  split into lists
        while(pnode){
            x = ((pnode->data) >> k) & 0xff;
            *(alist[x].pptail) = pnode;
            alist[x].pptail = &(pnode->next);
            pnode = pnode->next;
        }
        //                              // concatenate lists
        for(i = 0; alist[i].phead == 0; i++);
        plist = alist[i].phead;
        j = i;
        for(++i; i < 256; ++i){
            if(alist[i].phead == 0)
                continue;
            *(alist[j].pptail) = alist[i].phead;
            j = i;
        }
        *(alist[j].pptail) = 0;
    }
    return plist;
};