How safe is using !feof in searching a file?


I read here that feof, or more precisely using !feof as a loop condition while searching for information in a file, is a bad habit.

What I understood is that it's bad because it reads information from the FILE pointer before the function is called, or something like that.

Wouldn't it be fine to have a do/while loop with fscanf inside and !feof as the exit condition?

This is a search function that I did:

typedef struct
{
    char lname[20] , fname[20];
    int nchildren;
}employee;
void searchemployee(char *filename , char *str)
{
    employee e;
    FILE *f;
    int c;
    f = fopen(filename, "r");
    if (f == NULL)
        printf("file couldn't be loaded\n");
    else {
        c = 0;
        do {
            fscanf(f, "%s %s %d\n", e.fname, e.lname, &e.nchildren);
            if (strcmp(e.fname, str) == 0)
                c = 1;
        } while (c == 0 && !feof(f));
        if (c != 1)
            printf("employee not found\n");
        else
            printf("employee : %s %s| children : %d\n", e.fname, e.lname, e.nchildren);
    }
    fclose(f);
}

There are 2 best solutions below

10
Andreas Wenzel (best answer)

The return value of the function feof specifies whether a previous input operation has already encountered the end of the file. This function does not specify whether the next input will encounter the end of the file.
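The timing described above can be observed with a small experiment. The following is only a sketch: the helper name feof_timing_demo is hypothetical, and it uses a temporary file from tmpfile so that it does not depend on any file on disk.

```c
#include <stdio.h>

/* Hypothetical helper: writes `text` to a temporary file, reads it back
   character by character, and records what feof reports at two points.
   Returns 0 on success, -1 if the temporary file cannot be created. */
int feof_timing_demo(const char *text, int *after_last_char, int *after_failed_read)
{
    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    fputs(text, f);
    rewind(f);

    /* Read exactly as many characters as were written. */
    for (const char *p = text; *p != '\0'; p++)
        fgetc(f);

    /* All data has been consumed, but no read has failed yet. */
    *after_last_char = feof(f) != 0;

    /* This read fails, and only now is the end-of-file indicator set. */
    int ch = fgetc(f);
    *after_failed_read = (ch == EOF) && feof(f) != 0;

    fclose(f);
    return 0;
}
```

Calling feof_timing_demo("abc", &a, &b) should leave a at 0 and b at 1: feof stays false after the last character has been read and becomes true only once a read has actually failed.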

The problem with

do{
    fscanf(f,"%s %s %d\n",e.fname,e.lname,&e.nchildren);
    if (strcmp(e.fname,str)==0)
        c=1;
}while(c==0 && !feof(f));

is that if fscanf fails and returns EOF due to encountering the end of the file, then it will write nothing to e.fname.

If this happens in the first iteration of the loop, then the content of e.fname will be indeterminate and the subsequent function call strcmp(e.fname,str) will invoke undefined behavior (i.e. your program may crash), unless e.fname happens to contain a terminating null character.

If this does not happen in the first iteration, but rather in a subsequent iteration of the loop, then the content of e.fname will contain the content of the previous loop iteration, so you will effectively be processing the last successful call of fscanf twice.

In this specific case, processing the last successful call of fscanf twice is harmless, except for being a slight waste of CPU and memory resources. However, in most other cases, processing the last input twice will result in the program not working as intended.
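To make the double processing visible, here is a small sketch that counts how often a loop body runs. It is not the code from the question: the function names are hypothetical, the format string is a single "%19s" without a trailing "\n", and the input comes from a temporary file.

```c
#include <stdio.h>

/* Hypothetical helper: returns a temporary file holding `contents`,
   positioned at the start, or NULL on failure. */
static FILE *open_with_contents(const char *contents)
{
    FILE *f = tmpfile();
    if (f != NULL) {
        fputs(contents, f);
        rewind(f);
    }
    return f;
}

/* Loop controlled by !feof: counts how many times the body sees a name. */
int count_with_feof_loop(const char *contents)
{
    char name[20] = "";
    int seen = 0;
    FILE *f = open_with_contents(contents);
    if (f == NULL)
        return -1;
    do {
        int ignored = fscanf(f, "%19s", name); /* result deliberately unused */
        (void)ignored;
        if (name[0] != '\0')
            seen++;            /* also runs after a failed read */
    } while (!feof(f));
    fclose(f);
    return seen;
}

/* Loop controlled by the return value of fscanf. */
int count_with_return_check(const char *contents)
{
    char name[20];
    int seen = 0;
    FILE *f = open_with_contents(contents);
    if (f == NULL)
        return -1;
    while (fscanf(f, "%19s", name) == 1)
        seen++;
    fclose(f);
    return seen;
}
```

For the input "alice\nbob\n", the !feof version should count 3 because the stale "bob" is seen a second time after the failed read, while the version that checks the return value counts 2.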

See the following question for further information:

Why is “while( !feof(file) )” always wrong?

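If you change the loop to

for (;;) {
    fscanf(f,"%s %s %d",e.fname,e.lname,&e.nchildren);
    if ( feof(f) )
        break;
    if (strcmp(e.fname,str)==0) {
        c=1;
        break;
    }
}

so that the end-of-file check happens in the middle of the loop, right after the read, and the loop stops as soon as a match is found, then the problem mentioned above will be gone. Note that the trailing "\n" has been dropped from the format string: a whitespace directive at the end of the format consumes all remaining whitespace, so after the last record it would reach the end of the file and set the end-of-file indicator before that record had been compared. This version still assumes that the file ends with a newline and contains no malformed records.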

However, it is generally better to check the return value of fscanf instead of calling feof, for example like this:

c = 0;

while ( c == 0 && fscanf(f,"%s %s %d\n",e.fname,e.lname,&e.nchildren) == 3 ) {
    if (strcmp(e.fname,str)==0)
        c=1;
}

Also, you don't need the flag variable c. I suggest that you incorporate the lines

if (c!=1)
    printf("employee not found\n");
else
    printf("employee : %s %s| children : %d\n",e.fname,e.lname,e.nchildren);

partially into the loop, like this:

void searchemployee( char *filename, char *str )
{
    employee e;
    FILE *f = NULL;

    //attempt to open file
    f = fopen( filename, "r" );
    if ( f == NULL )
    {
        printf( "file couldn't be loaded\n" );
        goto cleanup;
    }

    //process one employee record per loop iteration
    while ( fscanf( f, "%s %s %d\n", e.fname, e.lname, &e.nchildren ) == 3 )
    {
        //check whether we found the target record
        if ( strcmp(e.fname,str) == 0 )
        {
            printf(
                "employee : %s %s| children : %d\n",
                e.fname, e.lname, e.nchildren
            );
            goto cleanup;
        }
    }

    printf( "employee not found.\n");

cleanup:
    if ( f != NULL )
        fclose(f);
}

Another issue is that when using %s with scanf or fscanf, you should generally also add a width limit, to prevent a possible buffer overflow. For example, if e.fname has a size of 100 characters, you should use %99s to limit the number of bytes written to 99 plus the terminating null character.
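Applied to the struct from the question, whose name buffers hold 20 characters, the limit would be %19s. The following is a sketch rather than a drop-in replacement: find_employee is a hypothetical variant that takes an already open stream and returns a flag instead of printing, which makes it easier to test.

```c
#include <stdio.h>
#include <string.h>

typedef struct {
    char lname[20], fname[20];
    int nchildren;
} employee;

/* Hypothetical variant of searchemployee: scans records from `f` and
   returns 1 (filling *out) if a record with the given first name is
   found, 0 otherwise. %19s leaves room for the terminating null
   character in the 20-byte buffers. */
int find_employee(FILE *f, const char *fname, employee *out)
{
    employee e;

    while (fscanf(f, "%19s %19s %d", e.fname, e.lname, &e.nchildren) == 3) {
        if (strcmp(e.fname, fname) == 0) {
            *out = e;
            return 1;
        }
    }
    return 0;
}
```

With the width limit in place, an overlong name is split across fields instead of overflowing the buffer. The record then fails to parse as three values and the loop ends, so a real program may want to report that case separately.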

0
Eric Postpischil

Calling feof asks the question “Was end-of-file encountered in a previous operation on this stream?”

If you use feof to answer that question, that is fine. But if you use feof to predict that your next operation will read data from the file, that is wrong. The previous operation might have ended just before the end of the file, so feof says “no,” but there is nothing left in the file to read.

The file/stream functions in the standard C library are designed to tell you when they failed because end-of-file was reached. You should use the return value (or other indication) provided by each function to test for a problem:

if (3 != fscanf(f, "%s %s %d\n", e.fname, e.lname, &e.nchildren))
{
    // Handle fact that fscanf did not read and convert 3 values.
}

int x = getchar();
if (x == EOF)
{
    // Handle fact that getchar did not read a character.
}

Note that calling fscanf and then feof will tell you whether fscanf encountered end-of-file (ferror reports a read error), but it will not tell you whether fscanf read some input and assigned some values but then encountered end-of-file and did not finish. If you are reading only one thing, you might get away with fscanf followed by feof, but a more sophisticated program may need to distinguish partial inputs.
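One way to tell these cases apart is to classify the return value of fscanf directly. This is a minimal sketch, assuming the three-field record format from the question; the read_status enum and the read_record function are hypothetical names introduced for illustration.

```c
#include <stdio.h>

typedef enum { READ_OK, READ_END, READ_PARTIAL, READ_ERROR } read_status;

/* Hypothetical helper: reads one "fname lname nchildren" record and
   reports how the attempt ended. */
read_status read_record(FILE *f, char fname[20], char lname[20], int *nchildren)
{
    int n = fscanf(f, "%19s %19s %d", fname, lname, nchildren);

    if (n == 3)
        return READ_OK;       /* all three fields converted */
    if (ferror(f))
        return READ_ERROR;    /* the stream reported a read error */
    if (n == EOF)
        return READ_END;      /* end-of-file before any field was read */
    return READ_PARTIAL;      /* some fields read, then the record broke off */
}
```

For a file containing "john smith 2" followed by a truncated "mary jones", successive calls should yield READ_OK, READ_PARTIAL and then READ_END. A check of feof alone would not separate the last two.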