Can fscanf buffer overflow when %d is used?

188 Views Asked by At

I ran the Fortify Static Code Analyzer on the ossec-hids repo and it reported the following Buffer Overflow: Format String finding for src/analysisd/stats.c:415:

The format string argument to fscanf() at stats.c line 415 does not properly limit the amount of data the function can write, which allows the program to write outside the bounds of allocated memory. This behavior could corrupt data, crash the program, or lead to the execution of malicious code

The line of code in question is like this:

if (fscanf(fp, "%d", &_RWHour[i][j]) <= 0) {

_RWHour is declared as

static int _RWHour[7][25];

on line 33 of the same file. I believe there is no shadowing of _RWHour between this declaration on line 33 and its use on line 415, as when I select the declaration on line 33, my IDE (Visual Studio 2019) highlights _RWHour in 415.

When I look at the cppreference documentation for fscanf, it says the following:

d matches a decimal integer. The format of the number is the same as expected by strtol with the value 10 for the base argument

The table I took the quote above from also shows that when no length modifier is used for %d (as is the case for the fscanf call in question), the argument type should be signed int* or unsigned int*.

My question is this:

Is it possible for the Fortify finding to be a false positive in this instance? Or is it possible to write to memory outside an int when you pass the address of an int to fscanf?

If it is possible to write outside the memory of an int when using %d with fscanf, how can this be safely avoided?

3

There are 3 best solutions below

12
On

Prevent conversion overflow

fscanf(fp, "%d", &_RWHour[i][j]) is undefined behavior (UB)*1 if the numeric text attempts to convert to a value outside the int range.

A quick fix to prevent UB is to limit the number of characters read with a width:

//fscanf(fp, "%d", &_RWHour[i][j])
fscanf(fp, "%4d", &_RWHour[i][j])  // Limit [-999 ... 9999].

A more robust solution reads in the text and uses strtol() to convert.

I recommend creating a helper function to handle reading in the int.

This level of checking kinda stinks, doesn't it?

i, j in range

Review of OP's referenced code looks OK, yet analysis tool may be whining about that.


*1

... or if the result of the conversion cannot be represented in the object, the behavior is undefined. C23dr § 7.23.6.2 10

With UB, it is "possible to write to memory outside an int".

0
On

Is it possible for the Fortify finding to be a false positive?

Of course it's possible, in general, that Fortify reports a false positive. And @chux's answer about possible undefined behavior notwithstanding, I think this is indeed best characterized as a false positive. At minimum, it is an inaccurate diagnosis. Formally, a program that exercises undefined behavior can do anything, but in practice, overrunning the bounds of the specified int object is a highly unlikely outcome in this case, and diagnosing that discounts all the other possible results.

Supposing that the other arguments are correctly type matched with respect to the format, the main risk for the scanf family functions to overflow a target object is with a %s or %[ conversion specification without a width, or a %s, %[, or %c conversion specification with too large a width. I think Fortify is inappropriately generalizing that.

Or is it possible to write to memory outside an int when you pass the address of an int to fscanf?

Not without the involvement of undefined behavior, whether from the input being overlong or from some other source outside the fscanf() call.

If it is possible to write outside the memory of an int when using %d with fscanf, how can this be safely avoided?

Realistically, I don't think this is something to be concerned about.

Nevertheless, you can probably satisfy Fortify by adding an appropriate field width to the conversion specifier, though determining what "appropriate" means in this case could be tricky. This will also rule out any possibility of UB arising from the call, which is a worthy objective.

Alternatively, you can parse the number by some means other than fscanf().

0
On

The code does not seem to warrant the problem reported by the tool.

Furthermore, there is another call to fscanf in the same file that should report the same problem except for the fact that the destination int is an entry in a 1D array instead of a 2D array:

381                if (fscanf(fp, "%d", &_RHour[i]) <= 0) {

But in both cases, it is rather obvious and should be inferred by the tool that the index variable stay within the proper boundaries.

You can try and change the loop in lines 404..426 with this code to investigate if the tool is to blame:

       for (j = 0; j <= 24; j++) {
            _CWHour[i][j] = 0;
            _RWHour[i][j] = 0;
            snprintf(_weekly, 128, "%s/%d/%d", STATWQUEUE, i, j);
            if (File_DateofChange(_weekly) >= 0) {
                FILE *fp = fopen(_weekly, "r");
                if (fp != NULL) {
                    int hour;
                    if (fscanf(fp, "%d", &hour) == 1 && hour >= 0) {
                        _RWHour[i][j] = hour;
                    }
                    fclose(fp);
                }
            }
        }