Why is the size of my string changing? (C)

    fseek(fp, 0L, SEEK_SET);
    int i; char c;
    i = 0;
    for (c = getc(fp); c != EOF; c = getc(fp)) {
        c = tolower(c);
        file_string[i] = c;
        i++;
    }

In this code, I read each character of a file, convert it to lower case, and store it in a string. Say I allocate 21 * sizeof(char) bytes for file_string. Occasionally, after the code shown here runs, strlen(file_string) returns 30 rather than the expected 20. Perhaps there is something wrong with my pointer arithmetic? Some things I have gathered:

1 - This only occurs some of the time.

2 - I have ensured that I am allocating the right number of bytes for file_string (this happens on the lines directly before the code above). The allocation code is as follows:

    fseek(fp, 0L, SEEK_END);
    file_len = ftell(fp);
    file_string = malloc( sizeof(char) * (file_len+1) );

Printing file_len outputs the expected length.

3 - I printed out the value of i to make sure the loop iterates exactly file_len times, and it does.

4 - Now, DIRECTLY after this code (after closing the file), when I print out the length of file_string, sometimes it will suddenly increase to some larger size. This has been causing me problems elsewhere in my code.

Now, I figure I can just stick the null terminal character in there and fix the problem (maybe that will cause errors further down the line), but I would prefer to know what's going on under the hood here.

Here is an example of my debugging, showing the change in size before and after. Bear in mind, the before length corresponds to the file_len variable.

>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> file_len: 24
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 0
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 1
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 2
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 3
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 4
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 5
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 6
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 7
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 8
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 9
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 10
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 11
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 12
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 13
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 14
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 15
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 16
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 17
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 18
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 19
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 20
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 21
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 22
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>i: 23
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> strlen(file_string): 30
Best answer:

"Now, I figure I can just stick the null terminal character in there and fix the problem (maybe that will cause errors further down the line), but I would prefer to know what's going on under the hood here."

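That's literally the solution. strlen searches the buffer for a null terminator (the '\0' character, not the NULL pointer) and does not find one, because you never wrote one. After your read loop, add the terminator explicitly, i.e. file_string[i] = '\0'. This will not cause errors further down the line: you already allocated file_len+1 bytes, so the extra byte is there for exactly this purpose.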
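As a rough sketch of that fix, the read loop could be wrapped up like this. The helper name read_lowercase is made up for illustration and is not from the question. It also declares c as int rather than char, an extra change beyond the terminator fix, because getc returns an int and EOF is only reliably distinguishable from real data at that width.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper (name is an assumption): reads all of fp into a
   freshly allocated, lowercase, null-terminated string.
   Returns NULL on failure; the caller frees the result. */
char *read_lowercase(FILE *fp)
{
    if (fseek(fp, 0L, SEEK_END) != 0)
        return NULL;
    long file_len = ftell(fp);
    if (file_len < 0 || fseek(fp, 0L, SEEK_SET) != 0)
        return NULL;

    char *file_string = malloc((size_t)file_len + 1);  /* +1 for '\0' */
    if (file_string == NULL)
        return NULL;

    long i = 0;
    int c;  /* int, not char, so EOF stays distinguishable from data */
    while (i < file_len && (c = getc(fp)) != EOF)
        file_string[i++] = (char)tolower(c);

    file_string[i] = '\0';  /* the missing step: terminate the string */
    return file_string;
}
```

With the terminator in place, strlen on the returned string stops at index i and reports the number of characters actually read.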

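Remember that the storage returned by malloc is not zeroed. Its contents are indeterminate, so in practice it holds whatever bytes were left there before. What's happening is that strlen runs off the end of your buffer into the memory that follows, happens to hit a zero byte somewhere further along, and takes that to be the end of the string. Where that zero byte sits varies from run to run, which is why you only see the problem some of the time. Reading past the end of the allocation is also undefined behavior, so a wrong length is not the only possible outcome.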
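To see this without relying on real uninitialized memory, here is a small deterministic sketch. Both function names are invented for this example, and memset is used to stand in for the leftover bytes malloc might return. The check uses memchr with an explicit size, so unlike strlen it never reads outside the buffer.

```c
#include <stdlib.h>
#include <string.h>

/* Returns 1 if buf contains a '\0' within its first size bytes, else 0.
   Unlike strlen, memchr never reads past the size it is given. */
int has_terminator(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Builds a 25-byte buffer holding 24 characters, matching file_len in the
   question's log. The memset simulates garbage left in the allocation. */
char *make_unterminated(void)
{
    size_t file_len = 24;
    char *buf = malloc(file_len + 1);
    if (buf == NULL)
        return NULL;
    memset(buf, 'x', file_len + 1);                     /* simulated garbage */
    memcpy(buf, "abcdefghijklmnopqrstuvwx", file_len);  /* 24 chars, no '\0' */
    return buf;
}
```

On the buffer from make_unterminated, has_terminator(buf, 25) reports that there is no terminator inside the allocation, so calling strlen on it would read past the end. After buf[24] = '\0', the same check succeeds and strlen is safe to call.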