I wrote a function meant to act like the <stdlib.h> function atoi:

int _atoi(char *s)
{
    int i, neg = 1, n = 0;

    for (i = 0; s[i] != '\0'; i++)
    {
        if ((s[i] >= '0') && (s[i] <= '9'))
            n = n * 10 + (s[i] - '0');
        else if (s[i] == '-')
            neg *= -1;
    }
    n *= neg;
    return (n);
}

When I run it with something like

nb = _atoi("         +      +    -    -98 Battery Street; San Francisco, CA 94111 - USA             ");
printf("%d\n", nb);

The output is -9894111

But with similar code such as:

int _atoi(char *s)
{
    int sign = 1, i = 0, res = 0;
    while (!(s[i] <= '9' && s[i] >= '0') && s[i] != '\0')
    {
        if (s[i] == '-')
            sign *= -1;
        i++;
    }
    while (s[i] <= '9' && s[i] >= '0' && s[i] != '\0')
    {
        res = (res * 10) + (s[i] - '0');
        i++;
    }
    res *= sign;
    return (res);
}

The output is 98. Which is what the real atoi function returns.

What is the difference between the two functions that makes the latter ignore everything after the 8 (i.e. the - and the 94111)?

MikeCAT On

The loop condition of the first code is s[i] != '\0'. This means the loop runs until the end of the string, regardless of whether a non-digit character appears before that.

On the other hand, the loop condition of the last loop in the second code is s[i] <= '9' && s[i] >= '0' && s[i] != '\0'. This will make the loop stop at the first character that is not a digit.

Therefore, the first code still processes the 94111 and the - that come after the non-digit characters following 98, while the second code never reaches them.
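To make the difference concrete, here is a minimal sketch that reports the index at which each scanning strategy stops. The function names stop_whole_string and stop_first_number are hypothetical, introduced only for this illustration; the loop conditions are the ones from the two functions in the question.

```c
#include <stddef.h>

/* Mirrors the loop of the first function: the only exit condition is
   the terminator, so the scan always reaches the end of the string. */
size_t stop_whole_string(const char *s)
{
    size_t i;

    for (i = 0; s[i] != '\0'; i++)
        ;
    return i;
}

/* Mirrors the two loops of the second function: skip non-digits, then
   consume one run of digits and stop at the first non-digit. */
size_t stop_first_number(const char *s)
{
    size_t i = 0;

    while (!(s[i] >= '0' && s[i] <= '9') && s[i] != '\0')
        i++;
    while (s[i] >= '0' && s[i] <= '9')
        i++;
    return i;
}
```

For a string such as "-98 Battery 94111", the first scan stops at the terminator while the second stops right after the 8, so the later digits are never seen.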

chqrlie On

The first function iterates over the whole string, combining all digits present into a single number even if they are separated by other characters. It also interprets every - it encounters as a sign change applied to the resulting number. This is definitely not the behavior of atoi().

The second function first skips any non-digits, testing only for - signs, each of which flips the sign of the result. Then it iterates over the digits until it finds a non-digit or the end of the string. Thus it only interprets the first number present in the string, possibly negating it multiple times. This produces a different value for your input string, but is still not the behavior of atoi().

The Standard function atoi first skips any white space characters (as defined by isspace()), then accepts an optional sign (a single + or - character), then parses any immediately following digits and stops at the first non-digit. For the test string, it returns 0, because the first + is followed by a space rather than a digit.
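These rules can be checked against the library function itself. The sketch below is only an illustration: demo_atoi is a hypothetical helper name, and the inputs are examples chosen here rather than taken from the question.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print what the standard atoi() returns for a few inputs. */
void demo_atoi(void)
{
    static const char *const inputs[] = {
        "   -98 Battery Street",  /* white space, one sign, digits: -98 */
        "+42abc",                 /* parsing stops at 'a': 42 */
        " +  -98",                /* sign not followed by a digit: 0 */
        "abc",                    /* no digits at all: 0 */
    };
    size_t i;

    for (i = 0; i < sizeof inputs / sizeof inputs[0]; i++)
        printf("atoi(\"%s\") = %d\n", inputs[i], atoi(inputs[i]));
}
```

The third input is the interesting one: only a single sign is accepted, and it must be immediately followed by digits.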

It is recommended to use strtol() over atoi() to avoid undefined behavior on strings that contain representations of integers beyond the range of type int.
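As a rough sketch of what using strtol() directly can look like when you also want to detect errors: parse_int below is a hypothetical helper, not a standard function, and returning 0 for failure is just one possible convention.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 and stores the value in *out on success,
   0 if no digits were found or the value does not fit in an int. */
int parse_int(const char *s, int *out)
{
    char *end;
    long n;

    errno = 0;
    n = strtol(s, &end, 10);
    if (end == s)
        return 0;   /* no conversion was performed */
    if (errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return 0;   /* out of range for long or for int */
    *out = (int)n;
    return 1;
}
```

Unlike atoi(), this lets the caller tell a genuine 0 apart from a string that contains no number, and out-of-range input is reported instead of causing undefined behavior.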

Here is a simple implementation of atoi() with defined behavior:

#include <limits.h>
#include <stdlib.h>

int atoi(const char *s) {
    // strtol() has defined behavior on overflow; clamp the result to int
    long n = strtol(s, NULL, 10);
    return n < INT_MIN ? INT_MIN :
           n > INT_MAX ? INT_MAX : (int)n;
}

If you want to implement your own function without <stdlib.h>, you can use:

#include <ctype.h>
#include <limits.h>

int my_atoi(const char *s) {
    int res = 0;

    while (isspace((unsigned char)*s)) {
        s++;
    }
    if (*s == '-') {
        s++;
        // isdigit((unsigned char)*s) is guaranteed to be
        //   equivalent to `(*s >= '0' && *s <= '9')`
        // isdigit(*s) would have undefined behavior if `char`
        //   is signed and `*s` is negative.
        // I am using the explicit test for clarity
        while (*s >= '0' && *s <= '9') {
            int digit = *s++ - '0';
            if (res < INT_MIN / 10
            ||  (res == INT_MIN / 10 && -digit < INT_MIN % 10)) {
                res = INT_MIN;
            } else {
                res = res * 10 - digit;
            }
        }
    } else {
        if (*s == '+') {
            s++;
        }
        while (*s >= '0' && *s <= '9') {
            int digit = *s++ - '0';
            if (res > INT_MAX / 10
            ||  (res == INT_MAX / 10 && digit > INT_MAX % 10)) {
                res = INT_MAX;
            } else {
                res = res * 10 + digit;
            }
        }
    }
    return res;
}