Binary search algorithm using `size_t` instead of using `int`

308 Views Asked by At

This is the binary search algorithm from Wikipedia. It uses int type indexing. The use of the int type here is necessary because the algorithm sometimes uses the negative index values while searching.

int binary_search(unsigned array[], int length, unsigned needle) {
  int l = 0, r = length - 1, m;
  while (l <= r) {
    m = (l + r) / 2;
    if (array[m] < needle)
      l = m + 1;
    else if (array[m] > needle)
      r = m - 1;
    else
      return m;
  }
  return -1;
}

How to modify this algorithm to use size_t (unsigned int) instead of int?

What is an optimal solution for this?

size_t binary_search(unsigned array[], size_t length, unsigned needle) {
  size_t l = 0, r = length - 1, m;

  ???

  return (size_t) -1;
}
3

There are 3 best solutions below

10
Vlad from Moscow On BEST ANSWER

We, beginners, should help each other.:)

Here you are.

As you specified the language tags C and C++ then the demonstration program below can be run in C and C++. That is there are no special features of C++ except using names bool, false and true that in C can be used if to include header <stdbool.h> Also in C you need to include header <stdio.h> instead of <iostream>

#include <iostream>

size_t binary_search( const int a[], size_t n, int value )
{
    size_t left = 0, right = n, middle = 0;
    bool found = false;

    while ( !found && left != right )
    {
        middle = left + ( right - left ) / 2;

        if ( value < a[middle] )
        {
            right = middle;
        }
        else if ( a[middle] < value )
        {
            left = middle + 1;
        }
        else
        {
            found = true;
        }
    }

    return found ? middle : -1;
}

int main( void )
{
    int a[] = { 1, 3, 5, 7, 9 };
    const size_t N = sizeof( a ) / sizeof( *a );

    for ( int value = 0; value <= a[N - 1] + 1; value++ )
    {
        size_t pos = binary_search( a, N, value );

        if ( pos != -1 )
        {
            printf( "%d is found at %zu.\n", value, pos );
        }
        else
        {
            printf( "%d is not found.\n", value );
        }
    }
}

The program output is

0 is not found.
1 is found at 0.
2 is not found.
3 is found at 1.
4 is not found.
5 is found at 2.
6 is not found.
7 is found at 3.
8 is not found.
9 is found at 4.
10 is not found.

Pay attention to that in general in C++ you will need to write a template function while in C you will write the function similarly to the implementation of bsearch.

Without using the variable found the function can look as

size_t binary_search( const int a[], size_t n, int value )
{
    size_t left = 0, right = n;

    while ( left != right )
    {
        size_t middle = left + ( right - left ) / 2;

        if ( value < a[middle] )
        {
            right = middle;
        }
        else if ( a[middle] < value )
        {
            left = middle + 1;
        }
        else
        {
            return middle;
        }
    }

    return -1;
}
0
Zwy On

Using int is to make sure `r = length - 1' is negative if the array is empty. So if you want to use size_t, just add a check in case of empty array.

0
chqrlie On

There is a classic problem in the posted code:

  • m = (l + r) / 2; may cause an arithmetic overflow for large arrays, invoking undefined behavior. On many architectures, m may get a negative value, causing further undefined behavior when dereferencing array[m]. m should be computed as

        m = l + (r - l) / 2;
    
  • Note that the expression m = (l + r) / 2; would still cause problems for unsigned index types if the array is even larger.

To use index type size_t and thus handle larger arrays, you must also change the return type to size_t and use as a convention a return value of SIZE_MAX to mean not found.

Here is an alternative using size_t index types:

size_t binary_search(const unsigned array[], size_t length, unsigned needle) {
    size_t lo = 0, hi = length;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (array[mid] < needle) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    if (lo < length && array[lo] == needle) {
        return lo;
    } else {
        return (size_t)-1;  // SIZE_MAX
    }
}