What does "Private Data" define in VMMap?


I am using VMMap to analyse virtual/process address space utilisation in my mixed-mode (managed and unmanaged) application. I understand how the Windows VMM and the Virtual Memory API work, and I understand how the Heap Memory API works too. I have looked at the CRT implementation I am using (not in great detail) and I think (this could be my downfall) I understand how it uses the aforementioned Win32 APIs.

I'm looking to understand what this "Private Data" stat is showing me. My application makes no direct calls to any of the Win32 Memory API functions; it only ever uses "malloc/new" in native C++ and "new" in C# (which deep down will be using the Win32 Memory Management API).

The definition of "Private Data" given by VMMap is:

Private memory is memory allocated by VirtualAlloc and not suballocated either by the Heap Manager or the .NET run time. It cannot be shared with other processes, is charged against the system commit limit, and typically contains application data.
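For reference, the kind of direct allocation that definition describes could be sketched as below. This is only a sketch under assumptions: the helper names (round_up_to_pages, reserve_then_commit, release_region) are hypothetical, the VirtualAlloc part compiles only on Windows, and whether VMMap reports such a region as Private Data is taken from the quoted definition rather than verified here.

```cpp
#include <cassert>
#include <cstddef>
#ifdef _WIN32
#include <windows.h>
#endif

// Round a byte count up to a whole number of pages. VirtualAlloc rounds the
// requested size like this when no base address is supplied.
constexpr std::size_t round_up_to_pages(std::size_t bytes, std::size_t page_size) {
    return (bytes + page_size - 1) / page_size * page_size;
}

#ifdef _WIN32
// Reserve address space, then commit only the first part of it.
// Returns the base address of the reservation, or nullptr on failure.
inline void* reserve_then_commit(std::size_t reserve_bytes, std::size_t commit_bytes) {
    assert(commit_bytes <= reserve_bytes);
    // Reserve: claims address space only; nothing is charged to commit yet.
    void* base = VirtualAlloc(nullptr, reserve_bytes, MEM_RESERVE, PAGE_NOACCESS);
    if (base == nullptr) return nullptr;
    // Commit: these pages now count against the system commit limit.
    if (VirtualAlloc(base, commit_bytes, MEM_COMMIT, PAGE_READWRITE) == nullptr) {
        VirtualFree(base, 0, MEM_RELEASE);  // size must be 0 with MEM_RELEASE
        return nullptr;
    }
    return base;
}

// Release the whole reservation made by reserve_then_commit.
inline bool release_region(void* base) {
    return VirtualFree(base, 0, MEM_RELEASE) != 0;
}
#endif
```

Reserving a large range and committing only a small part of it would give the same shape as the numbers reported further down: a large total size with a much smaller committed portion.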

So this definition makes me ask: who is making the calls to VirtualAlloc? Is it the Heap Manager or the .NET runtime?

I could get an address of some of the committed private data and use WinDbg to find out... Well, it turns out Microsoft in their wisdom have nobbled the ntdll public symbols, so WinDbg doesn't work so nicely. I can provide more details on this if requested, but basically commands like !address -summary no longer work due to missing symbols.

Another way to word this question might be: what C++ or C# code can I write that will cause this private data statistic to increase or decrease? Or is this all managed by the OS, the C++ runtime or the .NET runtime, and therefore at the mercy of its whims?

I can deduce from the nature of VMMap (the memory types are EXCLUSIVE OF EACH OTHER) that this "private data" therefore cannot be any of the following types of address space:

  • Heap (note that this includes committed AND reserved heap space - reserved through a call to VirtualAlloc, as described in the description of Private Data above).
  • Managed Heap
  • Stack
  • Shareable
  • Mapped File
  • Image
  • Page Table
  • Unusable
  • Free

(I couldn't find an online help file that defines what VMMap thinks all the above types are, but here's a link to download the help file: https://technet.microsoft.com/en-us/library/dd535533.aspx)

I have noticed that in my application, the TOTAL (reserved and committed) size of private data remains fairly constant throughout my application's lifetime, despite the Heap/Managed Heap/Stack sizes changing as expected. I have also noticed that of the ~250 MB total used by private data, only ~33 MB is actually committed. Note that my method of measuring this is rather rudimentary, so the value may be changing between each of my measurements and I'm just not seeing it (if I knew what this was measuring, I could use DebugDiag to grab a dump of the process when the related counter hit a certain threshold; chicken and egg).

My current speculative theory is that this is space being reserved for the native (or managed, I suppose?) heaps to grow into as they reach their capacity, but I have nothing to prove this. So it remains firmly in the speculative pile.

Searching the internet for details on this can be painful. There are many, many posts/articles/blogs out there that confuse things, use self-referential definitions (the first sentence of Performance Monitor's definition for Working Set is a great example of this), are incomplete or are just plain wrong. Many places blur definitions or use inconsistent terminology (note that VMMap's definition of the field "Private Data" goes on to refer to it as "private memory"; maybe a pedantic complaint, but it is ambiguous).

Now that I've criticised the rest of the internet for getting things muddled and incorrect... if there is anything in the above that doesn't make sense, or you can show me documentation to the contrary, or you need a more explicit definition of something, let me know and I'll put myself on the offenders list as well! I think the first half of trying to explain a memory issue to someone online is making sure we're all talking about the same thing.

Finally this question: How does VMMap know a given memory region is Thread Stack, specifically? suggests I may never find out an answer :/

UPDATE/EDIT: I have discovered that by turning on gflags user stack tracing (gflags -i myapp.exe +ust), you can increase the size of the Private Data. I would assume this is the backtrace database, but even without gflags there is still Private Data which I am struggling to account for.


There are 2 best solutions below


The collapsed VMMap view for a process shows all the VAD entries. The VAD can be viewed using kd / windbg / livekd.

Let's take a look at calc.exe:

enter image description here

lkd> !process 0n12876
PROCESS fffffa802e058600
    SessionId: 1  Cid: 324c    Peb: 7fffffdf000  ParentCid: 0bec
    DirBase: 22211000  ObjectTable: fffff8a00e9b1310  HandleCount:  85.
    Image: calc.exe
    VadRoot fffffa8039b76500 Vads 176 Clone 0 Private 1852. Modified 1. Locked 0.
.
.

lkd> !vad fffffa8039b76500
VAD             level      start      end    commit
.
.
fffffa803c9da680 ( 6)      ff7b0    ff892         6 Mapped  Exe  EXECUTE_WRITECOPY  \Windows\System32\calc.exe
.
.

The VAD has only one range for the image and shows a bogus protection, EXECUTE_WRITECOPY, although I think this does generically describe all of the sections, in that the sections within the range are permitted to be CoW or read-only and executable. VMMap attempts to be more informative and shows not only the image subsection objects, but also the different protection ranges within those subsections. For instance, for the single .data subsection it shows all the protection ranges within it: it was originally 5 copy-on-write pages, and now 3 have been replaced with read/write PTEs and 2 are still untouched.

There is a private page in .rdata, and both the read/write and CoW pages of .data are private as well. The private part of .rdata is going to be the page that contains the import address table (IAT), because it is modified by the loader and is different for each process. This page was therefore made copy-on-write as well; it was then written to, and the loader has since made it read-only. It does not need to be executable, because it is accessed using a rip-relative memory-indirect jump or call.
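The start and end columns in the !vad listing are page numbers rather than addresses, so the size of the calc.exe image range can be worked out from them. A minimal sketch, assuming 4 KiB pages and that the end value is the last page of the range (inclusive); the type and function names are made up for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Assumption: 4 KiB pages, and !vad "start"/"end" are inclusive
// virtual page numbers (VPNs).
constexpr std::uint64_t kPageSize = 4096;

struct VadRange {
    std::uint64_t start_vpn;  // first page of the range
    std::uint64_t end_vpn;    // last page of the range (inclusive)
};

// Number of pages covered by the range.
constexpr std::uint64_t page_count(VadRange r) {
    return r.end_vpn - r.start_vpn + 1;
}

// Size of the range in bytes.
constexpr std::uint64_t size_bytes(VadRange r) {
    return page_count(r) * kPageSize;
}

// First virtual address of the range.
constexpr std::uint64_t base_address(VadRange r) {
    return r.start_vpn * kPageSize;
}

// The calc.exe image VAD from the listing above.
constexpr VadRange kCalcImage{0xff7b0, 0xff892};
static_assert(page_count(kCalcImage) == 227, "0xff892 - 0xff7b0 + 1 pages");
```

Under those assumptions the range is 227 pages, i.e. 908 KiB, starting at 0xff7b0000.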

All other sections are file-backed and are shared between processes. The shareable working set is the working set that can be mapped in by other processes, and the shared working set is the working set that is actually mapped in by at least one other process, and is a subset of the shareable WS.

Private means that the data is modified by the process and it is only relevant for that process to see and not others, and is therefore not written back to the shared mapped file for other processes to see. It therefore has to create a page which will be stored in the pagefile rather than written back to the file when it needs to be paged out. One example of this is the IAT, where each image will have different import addresses depending on where the loaded modules are in the process address space, which varies from process to process and means nothing in the context of another process.

Another example of a private page might be parts of the code section that require fix-ups when the image is not loaded at its preferred base and there are absolute addresses within the code. These pages will have to be allocated copy-on-write and then made Execute/Read. In this example there are none.

It's also important to note that size means the virtual memory that has been reserved by the process. For instance, HeapCreate reserves some memory for the heap, which will have a VAD entry, and the full size of the reservation marked by the VAD entry is added to size (the size of the blocks within the VAD is calculated through other means, like analysing the PTEs themselves). This memory is then committed when you call HeapAlloc, i.e. PTEs are actually allocated, meaning the PDEs are also allocated a physical page so that the PTEs can actually be written. The PTEs are made demand-zero for the specific range, and the range is now 'committed'. When you actually write to the address, the PTE is allocated a zeroed physical page and becomes a valid hardware PTE that points to that physical page. The page is then part of the process working set until it is trimmed.
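The three stages in that paragraph (reserved, committed as demand-zero, resident in the working set) can be pictured with a toy model. This is purely illustrative: ToyRegion and PageState are invented names, and nothing here reflects how the memory manager actually stores its state.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy model only: three states a virtual page can be in, as described above.
// The names are illustrative, not Windows kernel types.
enum class PageState { Reserved, Committed, Resident };

class ToyRegion {
public:
    // Reserving creates the range (the VAD entry) but no usable pages.
    explicit ToyRegion(std::size_t pages) : pages_(pages, PageState::Reserved) {}

    // Committing marks pages demand-zero: usable, but not yet backed by RAM.
    void commit(std::size_t first, std::size_t n) {
        assert(first + n <= pages_.size());
        for (std::size_t i = first; i < first + n; ++i)
            if (pages_[i] == PageState::Reserved) pages_[i] = PageState::Committed;
    }

    // Writing to a committed page faults it in, so it joins the working set.
    // Returns false for a page that was never committed (an access violation
    // in a real process).
    bool touch(std::size_t page) {
        assert(page < pages_.size());
        if (pages_[page] == PageState::Reserved) return false;
        pages_[page] = PageState::Resident;
        return true;
    }

    // Total pages in the range, reserved or not (the "size" in the text).
    std::size_t size() const { return pages_.size(); }
    // Pages that have been committed, whether or not they are resident.
    std::size_t committed() const {
        return count(PageState::Committed) + count(PageState::Resident);
    }
    // Pages that currently have a physical page behind them.
    std::size_t working_set() const { return count(PageState::Resident); }

private:
    std::size_t count(PageState s) const {
        std::size_t n = 0;
        for (PageState p : pages_)
            if (p == s) ++n;
        return n;
    }
    std::vector<PageState> pages_;
};
```

Reserving 16 pages, committing 4 and touching 2 of them gives a size of 16, a commit of 4 and a working set of 2, which is the same relationship as size, committed and WS in the paragraph above.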

Private is the virtual commit that is private (at page, i.e. 4 KiB, granularity), and the private WS then shows how many of those private committed pages have PTEs that actually have a physical page allocated for them as part of the process working set. It's a logical categorisation of the physical pages in the working set. The total WS is private WS + shareable WS, i.e. it's the working set of the process.

Mapped images always have the same size as their commit, but mapped files and sections don't always: they can have reserved blocks, like a heap. Reserved in the context of mapped files means that there are no PTEs for that region yet. You typically map a view of the whole image section, but with a data file I have seen cases where the full file was mapped in yet the VAD entry had a reserved space much larger than the file; I'm not sure how to do this. The size of the mapping is reserved in the VAD, and physical pages are then allocated to hold the PTEs so that the PTEs can be filled in at the same time to point to the correct prototype PTEs (PPTEs). The PPTEs were created (but not allocated physical pages) when the section was created in the case of image files, and when the view was mapped in the case of data files. When the file is actually accessed, a physical page is given to the PTE to point to, and it becomes a valid hardware PTE. This is simply a matter of copying the pointer to the physical page from the PPTE. If the PPTE doesn't point to one, i.e. it's an MMPTE_SUBSECTION, then a page is allocated and filled in with an I/O read from the file, or it's paged in if the prototype PTE is a pagefile PTE.

Reservation is just a reservation in the VAD, but a commit means there are now PTEs for it in software form (i.e. they are invalid). That brings some extra commit charge, because physical pages needed to be allocated for the page tables (the PTE pages referenced from the PDE / PDPT / PML4) so that the PTEs could actually be written. This particular commit charge does not show up in the working set of the process (the working set commit charge), and neither does the paged / nonpaged pool commit charge, the modified / standby list commit charge, or the pagefile commit charge.

A read-only section is by default a shared section (apart from the pages that need to be modified by the loader, which momentarily become writeable), but a write-enabled section in an image is by default not shared, so it is copy-on-write. Each process gets its own copy of a data-section page, which replaces the original when the page fault occurs (the page fault occurs because CoW pages are invalid), and this copy is paged out to the pagefile when it needs to be paged out. If you mark the section as shared in the section header characteristics of the image, then it is not allocated as copy-on-write, just read/write, and all writes to it go directly to the mapped image and are seen by all processes that have mapped it in. I believe this ends up writing back to the image, i.e. it is file-backed.

The fact that the CoW page virtual commit is listed as private is interesting. I would have thought this page itself is shared and file-backed, and only the replacement page is private. The CoW page is also not in the working set in this instance, but in my chrome.exe it is, and it's entirely on the shareable working set, open for reads, as you'd expect. The virtual commit is still listed as private, but at least the page is on the shareable working set and not on the private working set, which would have been an optimisation failure, because it should be, and is able to be, shared. Here is another example:

enter image description here

This raises the question of how the privateness of a virtual commit is determined. In this case it appears to classify the commit PTEs that point to CoW PPTEs as private, because the page will eventually be private once it is written to and swapped for the substitute copy. This is misleading, although not a tangible issue (showing it as part of a private working set would be a tangible issue). As for .rdata, VMMap knows the first page is private even though it wasn't private in the original commit (the original commit determines privateness from the protection of the PPTEs that the commit PTEs point to, which were filled out using the image sections). What's strange is that it doesn't include that 4K page in the private total for the image (it shows 20K and not 24K), but does include the CoW private pages in the total. You'd think it would read the PPTEs/PTEs to determine what is private commit and what isn't. The current commit differs from the original commit because the loader has changed the read-only (shared) PTE to a CoW PTE and PPTE, which it then writes to, and then makes the PTE read-only. The PPTE remains CoW and retains that protection even when the physical page it points to is discarded for being file-backed and read-only (which may now happen, as the reference count has decreased), i.e. it doesn't reacquire a protection from the image file. Once the page is written, the new page no longer involves or points to the PPTE: it is now pagefile-backed, and the PFN entry of the newly allocated physical page does not point to the PPTE. For this reason, a shared section is file-backed because it's associated with a PPTE, as is a CoW page, and as is a read-only page within the image, but a page that isn't related to a PPTE (the page's PFN entry doesn't point to a PPTE) is pagefile-backed, regardless of whether it is set to read/write or read-only (read-only in the case of the IAT).

In Task Manager, cached = standby + modified, and available = free + standby. The commit charge is the amount of physical memory (RAM + pagefile) that is used: either for a commit itself (on the working set, pagefile, modified, standby), for structures supporting the actual commit but not part of the commit range itself, or for the paged / nonpaged pool etc.
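Those two Task Manager identities are simple sums over the page lists. A small sketch, with hypothetical field names and with zeroed pages assumed to be counted as part of "free":

```cpp
#include <cassert>
#include <cstdint>

// Page counts on the memory manager's lists (hypothetical field names).
struct PageLists {
    std::uint64_t free_pages;
    std::uint64_t standby_pages;
    std::uint64_t modified_pages;
};

// Task Manager "Cached" = standby + modified.
constexpr std::uint64_t cached(const PageLists& p) {
    return p.standby_pages + p.modified_pages;
}

// Task Manager "Available" = free + standby.
constexpr std::uint64_t available(const PageLists& p) {
    return p.free_pages + p.standby_pages;
}
```

Standby pages appear in both totals, so cached and available overlap and cannot simply be added together.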


This question was mentioned by Sasha Goldstein in his talk about WinDbg at the DotNext conference - https://www.youtube.com/watch?v=8t1aTbnZ2CE. The point was that it can be answered easily with the help of WinDbg.

To answer whether the CLR uses VirtualAlloc for its heap, we'll set a breakpoint on this function with a command string that prints the current stack (native and managed).

bp kernelbase!VirtualAlloc ".printf \"allocating %d bytes of virtual memory\", dwo(@esp+8);.echo;  k 5; !clrstack; gc"

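Here, dwo(@esp+8) reads the second argument (the allocation size) from the stack, k 5 prints the last 5 frames of the native call stack, and !clrstack (from SOS) prints the managed stack. gc continues execution.

Please note that this script works only for x86 processes. For x64 you'll need a different one, because the registers and calling convention differ: there the second argument is passed in the rdx register rather than on the stack.

Then I created a simple program which allocates an array and adds it to a List.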

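    using System;
    using System.Collections.Generic;

    class Program
    {
        static void Main(string[] args)
        {
            // Every array stays reachable through the list, so the GC heap has to grow.
            var list = new List<string[]>();
            while (true) {
                var a = Console.ReadLine();
                // ReadLine returns null at end of input; treat that like "q".
                if (a == null || a == "q" || a == "Q") break;
                var arr = new string[100];
                list.Add(arr);
            }
        }
    }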

I ran it under WinDbg and started pressing Enter. At some point the breakpoint was hit, on the List expanding and allocating additional memory in the heap:

enter image description here

So obviously the CLR uses VirtualAlloc for allocating memory for its heap.