What are bit vectors and how do I use them to convert chars to ints?

364 Views Asked by At

Here's the explanation for our task when implementing a set data structure in C "The set is constructed as a Bit vector, which in turn is implemented as an array of the data type char."

My confusion arises from the fact that almost all the functions we're given take in a set and an int as shown in the function below yet our array is made up of chars. How would I call functions if they can only take in ints when I have an array of chars? Here's my attempt att calling the function in my main function as well as the structs and example of function used.

int main(){

  set *setA = set_empty();
  set_insert("green",setA );

}

struct set {
    int capacity;
    int size;
    char *array;
};

void set_insert(const int value, set *s)
{
    if (!set_member_of(value, s)) {
        int bit_in_array = value; // To make the code easier to read

        // Increase the capacity if necessary
        if (bit_in_array >= s->capacity) {
            int no_of_bytes = bit_in_array / 8 + 1;
            s->array = realloc(s->array, no_of_bytes);
            for (int i = s->capacity / 8 ; i < no_of_bytes ; i++) {
                s->array[i] = 0;
            }
            s->capacity = no_of_bytes * 8;
        }

        // Set the bit
        int byte_no = bit_in_array / 8;
        int bit = 7 - bit_in_array % 8;
        s->array[byte_no] = s->array[byte_no] | 1 << bit;
        s->size++;
    }
}
1

There are 1 best solutions below

0
the busybee On

TL;DR: The types of the index (value in your case) and the indexed element of an array (array in your case) are independent from each other. There is no conversion.


Most digital systems these days store their value in bits, each of them can hold only 0 or 1.

An integer value can therefore be viewed as a binary number, a value to the base of 2. It is a sequence of bits, each of which assigned a power of two. See the Wikipedia page on two's complement for details. But this aspect is not relevant for your issue.

Relevant is the view that an integer value is a sequence of bits. The simplest integer type of C is the char. It holds commonly 8 bits. We can assign indexes to these bits, and therefore think of them as a "vector", mathematically. Some people start to count "from the left", others start to count "from the right". Other common terms in this area are "MSB" and "LSB", see this Wikipedia page for more.

To access an element of a vector, you use its index. On a common char this is commonly a value between 0 and 7, inclusively. Remember, in CS we start to count from zero. The type of this index can be any integer wide enough to hold the value, for example an int. This is why you use a int in your case. This data type is independent from the type of elements in the vector.

How to solve the problem, if you need more than 8 bits? Well, then you can use more chars. This is the reason why your structure holds (a pointer to) an array of chars. All n chars of the array represent a vector of n * 8 bits, and you call this amount the "capacity".

Another option is to use a wider type, like a long or even a long long. And you can build an array of elements of these types, too. However, the widths of such types are commonly not equal in all systems.

BTW, the mathematical "vector" is the same thing as an "array" in CS. Different science areas, different terms.

Now, what is a "set"? I hope your script explains that a bit better than I can... It is a collection that contains an element only once or not at all. All elements are distinct. In your case the elements are represented by (small) integers.

Given a vector of bits of arbitrary capacity, we can "map" an element of a set on a bit of this vector by its index. This is done by storing a 1 in the mapped bit, if the element is present in the set, or 0, if it is not.

To access the correct bit, we need the index of the single char in the array, and the index of the bit in this char. You calculate these values in the lines:

        int byte_no = bit_in_array / 8;
        int bit = 7 - bit_in_array % 8;

All variables are of type int, most probably because this is the common type. It can be any other integer type, like a size_t for example, as long as it can hold the necessary values, even different types for the different variables.

With these two values at hand, you can "insert" the element into the set. For this action, you set the respective bit to 1:

        s->array[byte_no] = s->array[byte_no] | 1 << bit;

Please note that the shift operator << has a higher precedence than the bit-wise OR operator |. Some coding style rules request to use parentheses to make this clear, but you can also use this even clearer assignment:

        s->array[byte_no] |= 1 << bit;