Best way to read a file line-by-line in C using mmap?

The following code shows how to read part of a file using the mmap() system call:

       addr = mmap(NULL, length + offset - pa_offset, PROT_READ,
                   MAP_PRIVATE, fd, pa_offset);
       if (addr == MAP_FAILED)
           handle_error("mmap");

       s = write(STDOUT_FILENO, addr + offset - pa_offset, length);
       if (s != length) {
           if (s == -1)
               handle_error("write");

           fprintf(stderr, "partial write");
           exit(EXIT_FAILURE);
       }

If addr is a char*, how would I split the result into lines? Or is there a better way to read lines from a text file using mmap?

There is 1 best solution below

It's unclear why you want to mmap the file in the first place. I suppose it is for performance, but unless you have determined through performance testing that your program does not run fast enough and that I/O on the file in question is a significant bottleneck for it, then such a step would be jumping the gun.

Nevertheless, if you are determined to mmap the file, and you must also perform some form of line-by-line processing on it, then your alternatives for identifying line breaks are:

  1. examine the bytes to see which ones are line terminators.

Details depend on exactly what you want to do. You can be more efficient if you can test for newlines as you scan the data, but if necessary, then you can scan ahead of the current processing position to find the next line terminator, so as to know ahead of time where it is. You can write that as a simple loop, or you might find it convenient to use the memchr() function.
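As a rough illustration of the memchr() approach, here is a minimal sketch. The helper names line_span() and count_lines() are made up for this example, and it assumes that lines end with '\n' and that the mapped region is described by a start pointer and a byte count. Treat it as one possible shape for the scan, not the only one.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: 'p' points at the start of a line and 'end' points
 * one past the last mapped byte (p < end). Stores the line's length,
 * excluding any '\n', in *len and returns the start of the next line,
 * or 'end' when the data is exhausted. The line is NOT NUL-terminated. */
const char *line_span(const char *p, const char *end, size_t *len)
{
    assert(p != NULL && p < end && len != NULL);

    const char *nl = memchr(p, '\n', (size_t)(end - p));
    if (nl == NULL) {
        /* Last line has no terminator: it runs to the end of the data. */
        *len = (size_t)(end - p);
        return end;
    }
    *len = (size_t)(nl - p);
    return nl + 1; /* step over the '\n' itself */
}

/* Example driver: count the lines in a buffer such as an mmap'ed file. */
size_t count_lines(const char *data, size_t size)
{
    size_t count = 0;
    if (size == 0)
        return 0;

    const char *end = data + size;
    for (const char *p = data; p < end; ) {
        size_t len;
        p = line_span(p, end, &len);
        /* The bytes of the current line could be processed here. */
        count++;
    }
    return count;
}
```

With the question's variables, the call might look like count_lines(addr + offset - pa_offset, length). A line can be printed without copying it by using printf("%.*s\n", (int)len, line).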

Do bear in mind, too, that you probably don't want to modify the data (and can't if you map it with PROT_READ, as you do), so you cannot expect to replace line terminators with string terminators unless you copy the data to a separate buffer. Also, the last line may or may not have a terminator. You will therefore need to exercise caution with the standard string functions.
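To make that caution concrete, here is a small sketch of two ways to handle a line that is known only by its start pointer and length. The helper names dup_line() and line_equals() are invented for this example. The first copies the line into a separate NUL-terminated buffer so that the ordinary string functions are safe to use on it; the second compares in place with length-bounded calls and leaves the mapping untouched.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy 'len' bytes starting at 'line' into a freshly
 * allocated buffer and append a NUL terminator. The caller must free()
 * the result. Returns NULL if the allocation fails. */
char *dup_line(const char *line, size_t len)
{
    assert(line != NULL);

    char *copy = malloc(len + 1);
    if (copy == NULL)
        return NULL;
    memcpy(copy, line, len);
    copy[len] = '\0';
    return copy;
}

/* Hypothetical helper: compare an unterminated line with a C string
 * without copying. Only the first 'len' bytes of 'line' are read, so the
 * comparison never runs past the end of the line or of the mapping. */
int line_equals(const char *line, size_t len, const char *s)
{
    assert(line != NULL && s != NULL);

    return strlen(s) == len && memcmp(line, s, len) == 0;
}
```

Copying costs an allocation per line, so the in-place style is probably preferable when the processing can be written in terms of a pointer and a length.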