Will malloc round up to the nearest page size?

1.2k Views Asked by At

I'm not sure if I'm asking a noob question here, but here I go. I also searched a lot for a similar question, but I got nothing.

So, I know how mmap and brk work and that, regardless of the length you enter, it will round it up to the nearest page boundary. I also know malloc uses brk/sbrk or mmap (At least on Linux/Unix systems) but this raises the question: does malloc also round up to the nearest page size? For me, a page size is 4096 bytes, so if I want to allocate 16 bytes with malloc, 4096 bytes is... a lot more than I asked for.

2

There are 2 best solutions below

2
On BEST ANSWER
#include <stdio.h>
#include <stdlib.h>
#include <inttypes.h>
#include <unistd.h>


int main(void) {
    void *a = malloc(1);
    void *b = malloc(1);
    uintptr_t ua = (uintptr_t)a;
    uintptr_t ub = (uintptr_t)b;
    size_t page_size = getpagesize();

    printf("page size: %zu\n", page_size);
    printf("difference: %zd\n", (ssize_t)(ub - ua));
    printf("offsets from start of page: %zu, %zu\n",
        (size_t)ua % page_size, (size_t)ub % page_size);
}

prints

page_size: 4096
difference: 32
offsets from start of page: 672, 704

So clearly it is not rounded to page size in this case, which proves that it is not always rounded to page size.


It will hit mmap if you change allocation to some arbitrary large size. For example:

void *a = malloc(10000001);
void *b = malloc(10000003);

and I get:

page size: 4096
difference: -10002432
offsets from start of page: 16, 16

And clearly the starting address is still not page aligned; the bookkeeping must be stored below the pointer and the pointer needs to be sufficiently aligned for the largest alignment generally needed - you can reason this with free - if free is just given a pointer but it needs to figure out the size of the allocation, where could it look for it, and only two choices are feasible: in a separate data structure that lists all base pointers and their allocation sizes, or at some offset below the current pointer. And only one of them is sane.

1
On

The basic job of malloc and friends is to manage the fact that the OS can generally only (efficiently) deal with large allocations (whole pages and extents of pages), while programs often need smaller chunks and finer-grained management.

So what malloc (generally) does, is that the first time it is called, it allocates a larger amount of memory from the system (via mmap or sbrk -- maybe one page or maybe many pages), and uses a small amount of that for some data structures to track the heap use (where the heap is, what parts are in use and what parts are free) and then marks the rest of that space as free. It then allocates the memory you requested from that free space and keeps the rest available for subsequent malloc calls.

So the first time you call malloc for eg 16 bytes, it will uses mmap or sbrk to allocate a large chunk (maybe 4K or maybe 64K or maybe 16MB or even more) and initialize that as mostly free and return you a pointer to 16 bytes somewhere. A second call to malloc for another 16 bytes will just return you another 16 bytes from that pool -- no need to go back to the OS for more.

As your program goes ahead mallocing more memory it will just come from this pool, and free calls will return memory to the free pool. If it generally allocates more than it frees, eventually that free pool will run out, and at that point, malloc will call the system (mmap or sbrk) to get more memory to add to the free pool.

This is why if you monitor a process that is allocating and freeing memory with malloc/free with some sort of process monitor, you will generally just see the memory use go up (as the free pool runs out and more memory is requested from the system), and generally will not see it go down -- even though memory is being freed, it generally just goes back to the free pool and is not unmapped or returned to the system. There are some exceptions -- particularly if very large blocks are involved -- but generally you can't rely on any memory being returned to the system until the process exits.