I'm getting a segmentation fault when running the code below.
It should read a .csv file with over 3M lines and do other stuff afterwards (not relevant to the problem), but after 207746 iterations it returns a segmentation fault. If I remove the p = strsep(&line,"|"); call and just print the whole line, it prints all >3M lines.
int ReadCSV (int argc, char *argv[]){
    char *line = NULL, *p;
    size_t len = 0;
    unsigned long count = 0;
    FILE *data;
    if (argc < 2) return 1;
    if((data = fopen(argv[1], "r")) == NULL){
        printf("the CSV file cannot be opened");
        exit(0);
    }
    while (getline(&line, &len, data)>0) {
        p = strsep(&line,"|");
        printf("Line number: %lu \t p: %s\n", count, p);
        count++;
    }
    free(line);
    fclose(data);
    return 0;
}
I guess it has to do with memory allocation, but I can't figure out how to fix it.
A combination of getline and strsep often causes confusion, because both functions modify the pointer whose address you pass as their first argument. If you pass a pointer that has been through strsep to getline again, you run the risk of undefined behavior on the second iteration.
Consider an example: getline allocates 101 bytes to line and reads a 100-character string into it. Note that len is now set to 101. You call strsep, which finds '|' in the middle of the string, so it points line to what used to be line+50. Now you call getline again. It sees another 100-character line and concludes that it is OK to copy it into the buffer, because len is still 101. However, since line now points to the middle of the buffer, writing 100 characters is undefined behavior.
Make a copy of line before calling strsep, and pass the address of the copy to strsep instead. The line that you pass to getline is then preserved between loop iterations.
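The fix described above might look roughly like the sketch below. The helper name print_first_fields and the scratch pointer cursor are made up for illustration and are not part of the original code. The _GNU_SOURCE define is an assumption for glibc builds, where it is one way to expose getline and strsep under strict compiler modes.

```c
#define _GNU_SOURCE   /* assumption: glibc; exposes getline and strsep */
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints the first '|'-separated field of every
   line read from data and returns the number of lines read. */
unsigned long print_first_fields(FILE *data)
{
    char *line = NULL;   /* buffer owned by getline; never handed to strsep */
    size_t len = 0;      /* capacity of that buffer, maintained by getline */
    unsigned long count = 0;

    assert(data != NULL);

    while (getline(&line, &len, data) > 0) {
        char *cursor = line;              /* scratch copy of the pointer */
        char *p = strsep(&cursor, "|");   /* strsep advances cursor, not line */
        printf("Line number: %lu \t p: %s\n", count, p);
        count++;
    }

    free(line);   /* still the pointer that getline allocated */
    return count;
}
```

In the original ReadCSV, the same change amounts to declaring a second pointer inside the loop and passing its address to strsep, so that line and len always describe the buffer getline allocated and the final free(line) releases that buffer.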