How many minor faults is my process *really* taking?

I have the following simple program, which basically just mmaps a file and sums every byte in it:

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <string.h>

volatile uint64_t sink;

int main(int argc, char** argv) {

  if (argc < 3) {
    puts("Usage: mmap_test FILE populate|nopopulate");
    return EXIT_FAILURE;
  }

  const char *filename = argv[1];
  int populate = !strcmp(argv[2], "populate");
  uint8_t *memblock;
  int fd;
  struct stat sb;

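  fd = open(filename, O_RDONLY);
  if (fd < 0) {
    perror("open failed");
    return EXIT_FAILURE;
  }
  if (fstat(fd, &sb) != 0) {
    perror("fstat failed");
    return EXIT_FAILURE;
  }
  uint64_t size = sb.st_size;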

  memblock = mmap(NULL, size, PROT_READ, MAP_SHARED | (populate ? MAP_POPULATE : 0), fd, 0);

  if (memblock == MAP_FAILED) {
    perror("mmap failed");
    return EXIT_FAILURE;
  }

  //printf("Opened %s of size %lu bytes\n", filename, size);  

  uint64_t i;
  uint8_t result = 0;
  for (i = 0; i < size; i++) {
    result += memblock[i];
  }

  sink = result;

  puts("Press enter to exit...");
  getchar();

  return EXIT_SUCCESS;
}

I make it like this:

gcc -O2 -std=gnu99     mmap_test.c   -o mmap_test

You pass it a file name and either populate or nopopulate [1], which controls whether MAP_POPULATE is passed to mmap. It waits for you to press enter before exiting (giving you time to inspect /proc/<pid> or whatever).

I use a 1GB test file of random data, but you can really use anything:

dd bs=1MB count=1000 if=/dev/urandom of=/dev/shm/rand1g

When MAP_POPULATE is used, I expect zero major faults and only a small number of minor faults for a file that is already in the page cache. With perf stat I get the expected result:

perf stat -e major-faults,minor-faults ./mmap_test /dev/shm/rand1g populate
Press enter to exit...

 Performance counter stats for './mmap_test /dev/shm/rand1g populate':

                 0      major-faults                                                
                45      minor-faults                                                

       1.323418217 seconds time elapsed

The 45 faults just come from the runtime and process overhead (and don't depend on the size of the file mapped).

However, /usr/bin/time reports ~15,300 minor faults:

 /usr/bin/time ./mmap_test /dev/shm/rand1g populate
Press enter to exit...

0.05user 0.05system 0:00.54elapsed 20%CPU (0avgtext+0avgdata 977744maxresident)k
0inputs+0outputs (0major+15318minor)pagefaults 0swaps

The same ~15,300 minor faults are reported by top and by examining /proc/<pid>/stat.

Now if you don't use MAP_POPULATE, all the methods, including perf stat, agree that there are ~15,300 minor faults. For what it's worth, this number comes from 1,000,000,000 / 4096 / 16 = ~15,259: that is, 1 GB divided into 4 KiB pages, with an additional factor-of-16 reduction coming from a kernel feature ("faultaround") which, when a fault is taken, also maps up to 15 nearby pages that are already present in the page cache.

Who is right here? Based on the documented behavior of MAP_POPULATE, the figure returned by perf stat is the correct one - the single mmap call has already populated the page tables for the entire mapping, so there should be no more minor faults when touching it.


[1] Actually, any string other than populate works as nopopulate.
