I have a generic tool that dumps various information about a process at an arbitrary point in time. I have hit a bug where the process being dumped had mmap'd a region of memory backed by a file, and the file itself was later shrunk. So when my tool mmaps the same file using the segment's size, the mapping is bigger than the file. When I attempt to dump the mmap'd region to a file (via gzwrite), I get a SIGBUS.
I've considered several solutions, each of which seems to have flaws that I can't get past:
Write a SIGBUS handler:
The SIGBUS occurs deep within the gzwrite operation, and jumping out of the routine would corrupt the file being written. I also do not see any async-signal-safe way to supply the missing page(s) from inside the handler and simply return.
Use fstat to get the file size, and only read the populated part of the memory region:
This seems to suffer from a race condition -- namely, what happens if the file shrinks right after I do the fstat call? I could potentially lock the file against resizing, but I'm worried this might cause other insurmountable issues for other running processes currently using the file.
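A minimal sketch of the fstat option, for illustration only. `backed_bytes` is a hypothetical helper name, not part of the tool; it reports how many bytes of a mapping are backed by the file at the moment fstat runs. It does not remove the race: the file can still shrink between this call and the read.

```c
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper: number of bytes, out of a mapping of `want` bytes
 * starting at `file_offset`, that are backed by the file right now.
 * Returns 0 if fstat fails or the offset is at or past end of file.
 * The answer is only a snapshot; the file may shrink again afterwards. */
static size_t backed_bytes(int fd, off_t file_offset, size_t want)
{
    struct stat st;

    if (fstat(fd, &st) != 0 || st.st_size <= file_offset)
        return 0;

    off_t avail = st.st_size - file_offset;
    if ((unsigned long long)avail < (unsigned long long)want)
        return (size_t)avail;
    return want;
}
```

With the numbers from this case (a 500MB segment over a file that is now 100MB), the helper would return the 100MB figure, and only that many bytes would be handed to gzwrite.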
Is there any safe strategy for dumping the contents of the file?
----- EDIT -----
Here is some simplified code showing what the program is attempting to do:
// grab relevant segment from pid:
segment = get_prog_seg(pid);
if (segment->filename) {
    ECHO("segment: size:%zu, filename:%s",
         segment->size, segment->filename);
    int mmap_fd = open(segment->filename, O_RDWR);
    if (mmap_fd == -1) { ... }
    // at this point, segment is 500MB, but the
    // file size has already shrunk to 100MB
    mmap_addr = mmap(NULL,
                     segment->size,
                     PROT_READ | PROT_WRITE,
                     MAP_PRIVATE,
                     mmap_fd,
                     segment->file_offset);
    if (mmap_addr == MAP_FAILED) { ... }
    int elems_wrote = gzwrite(gzfile, mmap_addr, segment->size);
    // I get a SIGBUS here, as soon as zlib attempts to access
    // mmap_addr[100MB]
    ...
}