Why isn't the code reflecting the actual mathematical behaviour when doing a binary search guess?

Question

Why isn't the code reflecting the actual mathematical behaviour when doing a binary search guess?

79 Views Asked by Perseus_Lynx At 10 March 2024 at 10:06

I decided to create a small program that would just find a number in a given range using what I think is binary search. The code works well, as far as I can tell. The issue appeared when I wanted to make a mathematical function that would predict the average amount of guesses needed to guess the given number, and rather intuitively, the function was:

log₂(x) + 1

However, when testing the code experimentally, the result on large numbers like 1e12 resembled more the formula:

log₂(x) - 1

Also, it could be that the formula is not correct as I have my doubts about it. Anyway here is the code, in case you find any issues with the calculations:

use rand::Rng;
use std::io;
use std::cmp::Ordering;
use std::time::Instant;

fn read_input() -> i64 {
    println!("Insert the range of the number [1 - <input>]: ");

    let mut max_range_str = String::new();

    io::stdin()
        .read_line(&mut max_range_str)
        .expect("Couldn't read the line");

    let max_range = max_range_str
        .trim()
        .parse()
        .expect("Couldn't convert to int");

    println!("");
    
    max_range
}

fn create_secret_number(max_range: i64) -> i64 {
    rand::thread_rng().gen_range(1..=max_range)
}

fn main() {
    let max_range = read_input();
    let mut loop_count = 0;
    let mut sum_of_guesses = 0;
    loop {
        let start_time = Instant::now();
        let secret_number = create_secret_number(max_range);

        let mut iterations = 0;
        let mut ceiling = max_range;
        let mut floor = 0;
        let mut guess = max_range / 2;

        loop {
            iterations += 1;
            match guess.cmp(&secret_number) {
                Ordering::Greater => ceiling = guess,
                Ordering::Less => floor = guess,
                Ordering::Equal => {
                    let elapsed_time = start_time.elapsed();
                    println!("The correct number was {}", secret_number);
                    println!("You win after {} iterations in {:?} seconds \n", iterations, elapsed_time);
                    sum_of_guesses += iterations;
                    break;
                }
            }
            guess = (ceiling + floor) / 2;
        }

        loop_count += 1;
        let average_guesses = sum_of_guesses / loop_count;
        println!("Average guess is now: {}", average_guesses);
    }
}

Original Q&A

There are 1 best solutions below

**Markus von Broady** · Answer 1 · 2024-03-10T13:28:53.470000

I'm learning Rust myself [a disclaimer you shouldn't be inspired by my coding conventions].

Let's start with a simple algorithm. Start with a range from 1 to some power of 2. I'm assuming all ranges are a power of 2. Under this assumption, your code positions the guess on the maximum of 1st half. So if the guess is lower than the secret number, you guessed the right half. Then you can divide this half into its own two halves and repeat the process:

            1..8
    1..4←?→         5..8
1..2    3..4    5..6    7..8

I modified your code fixing some errors, and removing console printing, which is so slow, you're probably better off iterating over a million tries at once instead of continuously iterating and waiting until you reach the million eventually:

use rand::Rng;
use std::io;

fn read_input() -> i64 {
    println!("Insert the range of the number [1 - <input>]: ");

    let mut max_range_str = String::new();

    io::stdin()
        .read_line(&mut max_range_str)
        .expect("Couldn't read the line");

    let max_range = max_range_str
        .trim()
        .parse()
        .expect("Couldn't convert to int");

    println!("");
    
    max_range
}

fn create_secret_number(max_range: i64) -> i64 {
    rand::thread_rng().gen_range(1..=max_range)
}

fn main() {
    let max_range = read_input();
    let mut sum_of_guesses = 0;
    let tries = 100000;
    for _ in 1..=tries { // was infinite
        let secret_number = create_secret_number(max_range);

        let mut ceiling = max_range;
        let mut floor = 1;  // was 0, but range starts at 1!
        
        for iterations in 0..max_range { // ensure it doesn't hang but break early
            if floor == ceiling {
                sum_of_guesses += iterations;
                break;
            }
            // position guess at upper bound of first pair
            let guess = (floor + ceiling) / 2;
            
            if guess > secret_number {
                // the pair guessed correctly
                ceiling = guess;
            } else {
                // the pair guessed wrongly - so the other is the right one
                // +1 to move from upper bound of 1st pair to lower bound of 2nd
                floor = guess + 1;
            }
        }
    }
    let average_guesses = sum_of_guesses as f64 / tries as f64; // was int div!
    println!("Average guess is: {}", average_guesses);
}

Keep in mind I could just replace the max_range in for iterations in 0..max_range with a log₂ of the maximum, but then I would be assuming what I'm proving… Perhaps obviously, the result is an average of log₂ guesses (e.g. 7 for a range of 128), because that's the number of subdivisions (128, 64, 32, 16, 8, 4, 2; the last step of the iteration is not counted because there's nothing to guess).

Your code, however, has a small chance, to break early:

1..128 - 1/128 chance to break early
1..64 (or 65..128) - 1/64 chance to break early
1..32 (or ...) - 1/32 chance
1..16 - 1/16
1/8
1/4
1/2

So as you can see, the bigger the range, the more it approaches 100% chance to break early. Except, as I already said, it's unfair to say you can break early from the innermost pair (e.g. 1..2), because that would be the last guess anyway. Therefore, given the assumption in bold, you're approaching 50% chance to break early.

If for some erroneous reason you would assume that breaking early means decreasing the sum_of_guesses by just 1, then the bigger the range, the closer the actual average would be to the log₂max - 0.5. However, perhaps again obviously, the earlier you break, the more guesses you avoid:

1/4 chance to save 1 guess = 1/4 guesses saved by average
1/8 chance to save 2 guesses = 2/8 = 1/4 guesses saved by average
1/16 * 3 = 3/16
1/32 * 4 = 4/32 = 1/8

and so on, approaching one guess saved.

Why isn't the code reflecting the actual mathematical behaviour when doing a binary search guess?

There are 1 best solutions below

Related Questions in MATH

Related Questions in RUST

Related Questions in BINARY-SEARCH

Trending Questions

Popular # Hahtags

Popular Questions