I open a huge (11 GB) file, mmap it into memory, and fail to find a string in it.
My code is:
if ( (fd = open("l", O_RDONLY)) < 0 ) err_sys("Cant open file");
if ( fstat(fd, &statbuf) < 0 ) err_sys("Cant get file size");
printf("size is %ld\n", statbuf.st_size);
if ( (src = mmap(0, statbuf.st_size, PROT_READ, MAP_SHARED, fd, 0)) == MAP_FAILED ) err_sys("Cant mmap");
printf("src pointer is at %ld\n", src);
char * index = strstr(src, "bin/bash");
printf("needle is at %ld\n", index);
It works on small files, but on huge files strstr returns 0. What function should I use to search huge mmapped files?
The output is:
size is 11111745740
src pointer is at 140357526544384
needle is at 0
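You should not use strstr() to search for text in a memory-mapped file. strstr() operates on C strings: it stops at the first null byte, so on a file containing binary data it can report no match even though the text occurs later, and if the file contains no null byte at all it will keep scanning beyond the end of the file, invoking undefined behavior by attempting to read unmapped memory.

You could instead use a function with equivalent semantics but applied to raw memory instead of C strings, memmem(), available on Linux and BSD systems. It takes both lengths explicitly, for example memmem(src, statbuf.st_size, "bin/bash", strlen("bin/bash")).

Note that you also use the wrong printf formats: it should be %p for src and index (cast to void *), and you might prefer to print the offset as a ptrdiff_t (with %td) or an unsigned long long (with %llu).

If memmem is not available on your platform, a simple implementation is easy to write.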
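A minimal sketch of such a fallback, assuming only the standard C library. The names my_memmem and find_offset are hypothetical, chosen here to avoid clashing with a system memmem; this is one straightforward way to do it, not a tuned implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical fallback for platforms without memmem(): returns a pointer to
 * the first occurrence of needle[0..nlen) in haystack[0..hlen), or NULL. */
static void *my_memmem(const void *haystack, size_t hlen,
                       const void *needle, size_t nlen)
{
    const unsigned char *h = haystack;
    const unsigned char *n = needle;

    assert(h != NULL && n != NULL);
    if (nlen == 0)
        return (void *)h;           /* empty needle matches at the start */
    if (nlen > hlen)
        return NULL;                /* needle cannot fit in the haystack */

    const unsigned char *last = h + (hlen - nlen);  /* last possible start */
    while (h <= last) {
        /* Skip ahead to the next byte equal to the needle's first byte. */
        h = memchr(h, n[0], (size_t)(last - h) + 1);
        if (h == NULL)
            return NULL;
        if (memcmp(h, n, nlen) == 0)
            return (void *)h;
        h++;
    }
    return NULL;
}

/* Hypothetical helper: byte offset of a C-string needle inside a raw
 * buffer, or -1 if the needle does not occur. */
static ptrdiff_t find_offset(const void *buf, size_t len, const char *needle)
{
    const char *hit = my_memmem(buf, len, needle, strlen(needle));
    return hit ? hit - (const char *)buf : -1;
}
```

With the variables from the question, this could be called as find_offset(src, (size_t)statbuf.st_size, "bin/bash") and the result printed with printf("needle is at offset %td\n", off). Unlike strstr(), the search never reads past the given length and is not stopped by null bytes inside the file.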