I am studying from the K&R C Programming Edition and one of the exercises (1-13) is to print a word length histogram which I mostly succeeded in doing. However, there is one issue.
if (nc < MAXLEN)
++wlen[nc];
else if (nc > MAXLEN)
++overflow;
According to the above, this should increment the value of the array value N so long as nc is less than the MAXLEN substitute, but if it is more than that, then it should increment the overflow count, right? I have a print statement at the end of the code that is supposed to print the overflow, but on inputs that have words greater than MAXLEN, the overflow count still prints a 0. Note: I previously had a simple else in the latter clause.
Here is full code of the program. I need to understand the logic behind why this bug is happening.
#include <stdio.h>
/* Prints a histogram of the lengths of words. */
#define MAXHIST 20
#define MAXLEN 100
#define MINLEN 0
#define IN 1
#define OUT 0
int main()
{
int c, i, j, state;
int wlen[MAXHIST];
int nc, overflow;
state = OUT;
nc = overflow = 0;
for (i = 0; i < MAXHIST; ++i)
wlen[i] = 0;
while ((c = getchar()) != EOF) {
if (c == '\t' || c == '\n' || c == ' ') {
state = OUT;
if (nc < MAXLEN)
++wlen[nc];
else if (nc > MAXLEN)
++overflow;
}
else if (state == OUT) {
state = IN;
nc = 1;
}
else {
++nc;
}
}
for (i = 1; i < MAXHIST+1; ++i) {
printf("%3d - %3d ", i, wlen[i]);
for (j = wlen[i]; j > MINLEN && j < MAXLEN; --j) {
printf("*");
}
printf("\n");
}
printf("\nOverflow: %d\n", overflow);
return 0;
}
There are multiple problems with your code.
The most evident one is, that if
MAXHIST < MAXLENGTHit won't count words with a length betweenMAXHISTandMAXLENGTH. In the worst case, you could get a segmentation fault, because you try to access memory you didn't allocate. Get rid of one of those, and only use a single value to determine the size of the histogram-array. The size of this array will also determine the maximum length of words you can count (or vice versa)Furthermore, because of you condition while counting, it won't count words with a length of exactly
MAXLENGTH, because thenncis neither< MAXLENGTHnor> MAXLENGTH.And finally, your way of resetting the length is needlessly complicated. You don't need the
state, just reset thenc = 0on a whitespace instead of switching state.And as a matter of style: I personally prefer to make the scope of variable as narrow as possible. Especially for loopcounters (like your
iandj) I prefer to define them within the loop's scope, ie in the loop-declaration.See the following code which fixes those issues.