In a virtual machine monitor such as VMware's ESXi Server, how are shadow page tables implemented?

Question

1.8k Views Asked by ali01 At 27 July 2025 at 19:22

My understanding is that VMMs such as VMware's ESXi Server maintain shadow page tables to map virtual page addresses of guest operating systems directly to machine (hardware) addresses. I've been told that shadow page tables are then used directly by the processor's paging hardware to allow memory accesses in the VM to execute without translation overhead.

I would like to understand a bit more about how the shadow page table mechanism works in a VMM. Is my high level understanding above correct? If so,

What kind of data structures are used in the implementation of shadow page tables?
What is the flow of control from the guest operating system to the hardware?
Short of straight up reading the source code of an open source VMM, what resources can I look into to learn more about hardware virtualization?

There are 3 best solutions below

**Ratana Ty** · Answer 1

Here is what I can tell; please correct me if I am wrong. The shadow page table is created and maintained by the hypervisor/VMM. It maps guest virtual addresses directly to machine physical addresses. Without a shadow page table, reaching a machine physical address takes two steps: first walk the guest OS page table to turn the guest virtual address into a guest physical address, then convert that guest physical address into a machine physical address. Here is how one guest virtual address gets translated into a machine physical address when a shadow page table is in use:

First, the physical processor sees the virtual address and needs the machine physical address. It starts by looking in the TLB (translation lookaside buffer). If the entry is there, we have the machine address immediately. This is the simplest case, a TLB hit. There is no performance penalty; the access runs at native speed.

What happens if there is no entry in the TLB (a TLB miss)?
If there is no entry in the TLB, the processor does a page table walk in the shadow page table. Assuming a corresponding mapping exists (guest VA to machine physical address), the processor inserts it into the TLB and restarts the access, and we are good to go. This is the other good case. A lookup in the shadow page table may take on the order of 10 cycles, so we don't have to worry much about performance.

What happens if there is no entry in the shadow page table?
The processor walks the shadow page table and cannot find the entry. In this case the hardware raises a page fault, which traps to the VMM (virtual machine monitor). The VMM then walks the guest page table to resolve it. This case is a little complicated. When the VMM walks the guest page table there are two possibilities.
1. The lookup finds the entry: Walking the guest page table only gets us as far as the guest physical address, but our target is the machine physical address. To get there, the monitor takes the guest physical address and looks it up in its PMap table (or structure). If it finds the entry, it inserts the mapping (guest virtual address, machine physical address) into the shadow page table. Now that the shadow page table has the entry, the processor finds the mapping there when it restarts the instruction. This case is called a hidden page fault: the monitor resolves it on its own, using the PMap (or PhysMap) to get the corresponding machine physical address, and the guest never sees it.
2. The lookup does not find the entry: The monitor (VMM) injects a virtual page fault into the guest. The guest now sees a page fault, and the guest OS steps in to resolve it. This can take thousands to hundreds of thousands of cycles, or more if the guest had swapped the page out to disk. Once the guest OS has resolved the fault, the access is retried and we end up in case 1 above.
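The three cases above (TLB hit, shadow page table walk, shadow miss) can be sketched as a small simulation. This is only an illustrative model, not how any real VMM is written: the `Vmm` class, the dictionaries standing in for page tables, and the `GuestPageFault` exception are all hypothetical names, and it assumes every guest physical page already has a PMap entry.

```python
PAGE_SIZE = 4096


class GuestPageFault(Exception):
    """Models the fault the VMM injects into the guest (a 'true' page fault)."""


class Vmm:
    def __init__(self, guest_page_table, pmap):
        self.guest_page_table = guest_page_table  # guest virtual page -> guest physical page
        self.pmap = pmap                          # guest physical page -> machine page
        self.shadow = {}                          # guest virtual page -> machine page
        self.tlb = {}                             # cached copies of shadow entries
        self.log = []                             # records which path each access took

    def translate(self, gva):
        vpn, offset = divmod(gva, PAGE_SIZE)
        while True:
            if vpn in self.tlb:                   # TLB hit: native speed
                self.log.append("tlb hit")
                return self.tlb[vpn] * PAGE_SIZE + offset
            if vpn in self.shadow:                # TLB miss: hardware walks the shadow table
                self.log.append("shadow walk")
                self.tlb[vpn] = self.shadow[vpn]
                return self.tlb[vpn] * PAGE_SIZE + offset
            self.handle_shadow_miss(vpn)          # fault traps to the VMM, then the access restarts

    def handle_shadow_miss(self, vpn):
        gppn = self.guest_page_table.get(vpn)
        if gppn is None:                          # case 2: the guest has no mapping either
            self.log.append("guest fault")
            raise GuestPageFault(vpn)
        self.log.append("hidden fault")           # case 1: the VMM fills the shadow entry itself
        self.shadow[vpn] = self.pmap[gppn]        # assumes the PMap covers this guest page
```

With a guest mapping of virtual page `0x10` to guest physical page `0x2`, and a PMap entry sending `0x2` to machine page `0x9A`, the first call to `translate(0x10008)` takes the hidden fault path and then a shadow walk, and a second access to the same page is a TLB hit. Translating an address on an unmapped page raises `GuestPageFault`; once the guest adds the mapping, a retry goes through the hidden fault path.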

The whole flow is a little complicated; I hope the process is clear. Note: shadow page tables are implemented in software by VMMs such as those from VMware and Microsoft. They are used in binary translation (BT) mode. With nested page tables we don't need a shadow page table at all.

There are some issues with shadow page tables.

We rely on the guest to invalidate the TLB. The point is that we want to keep the guest page table and the shadow page table consistent. Consider what happens when the guest updates its page table, or when the guest switches processes and therefore has to switch page tables. In these cases the guest has to tell the hardware that it updated a page table entry and invalidate it, which gives the VMM its chance to bring the shadow page table up to date.
Aggressive shadow page table caching is necessary: we need to cache shadow page tables. Consider a guest that context switches among many processes. On each switch it tells the hardware to change the page table pointer, so the VMM has to change the shadow page table pointer, and every switch flushes the TLB. Ideally we would have a shadow page table for every running guest process, but in practice we keep fewer shadow page tables than the guest has process page tables.
Write-protecting the guest page tables (also called tracing) lets us see changes the guest makes to them; for example, if the operating system locks a page for some reason, we have to be informed.
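As a rough sketch of the caching and tracing points, the model below keeps a bounded set of shadow tables, one per guest page table root, and drops a shadow entry when a traced guest page table write is reported. Everything here is hypothetical: the `ShadowCache` class, the least-recently-used eviction policy, and the idea of identifying a guest page table by its root value are assumptions made for illustration, not a description of any particular VMM.

```python
from collections import OrderedDict


class ShadowCache:
    """Keeps a bounded number of shadow tables, one per guest page table root."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tables = OrderedDict()   # guest root -> {guest virtual page: machine page}
        self.current_root = None
        self.tlb_flushes = 0

    def switch_to(self, guest_root):
        """Called when the guest changes its page table pointer (a context switch)."""
        if guest_root in self.tables:
            self.tables.move_to_end(guest_root)   # reuse the cached shadow table
        else:
            if len(self.tables) >= self.capacity:
                self.tables.popitem(last=False)   # evict the least recently used table
            self.tables[guest_root] = {}          # starts empty, filled by hidden faults
        self.current_root = guest_root
        self.tlb_flushes += 1                     # every switch flushes the TLB in this model
        return self.tables[guest_root]

    def on_traced_write(self, guest_root, vpn):
        """Called when the guest writes to a write-protected page table entry."""
        table = self.tables.get(guest_root)
        if table is not None:
            table.pop(vpn, None)                  # drop the stale entry; refilled on next fault
```

Switching back to a root that is still cached returns its shadow table with the entries already filled in, which is the benefit caching is meant to provide: without it, every context switch would start from an empty shadow table and pay for a series of hidden faults.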

**ali01** · Answer 2

over 50 views and nothing?

I spoke to Vidar Holen over IRC on ##linux on freenode.net. He suggested that I take a look at this AMD technical report. It has proven to be a great resource. Anyone else have other suggestions?

**istudy0** · Answer 3

Basically, the guest OS translates virtual addresses to physical addresses, but these seemingly physical addresses are not real physical addresses. They are handed out by the VMM/hypervisor, so they need not be contiguous in machine memory the way they would be for a regular OS running without a VM. One more translation is therefore required, from guest physical addresses to real machine addresses. To accomplish this, the VMM/hypervisor keeps shadow page tables, which fold that second translation in and map guest virtual addresses straight to machine physical addresses.
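To make the extra translation step concrete, here is a minimal sketch that composes a guest mapping with a VMM mapping. The page numbers and the names `guest_page_table`, `pmap`, and `build_shadow` are made up for illustration, and flat dictionaries stand in for what are multi-level tables on real hardware.

```python
PAGE_SIZE = 4096

# Hypothetical guest page table: guest virtual page -> guest physical page.
guest_page_table = {0x10: 0x2, 0x11: 0x3}

# Hypothetical VMM map: guest physical page -> machine page.
# Adjacent guest physical pages need not be adjacent in machine memory.
pmap = {0x2: 0x9A, 0x3: 0x41}


def translate_two_step(gva):
    """Translate with two lookups: guest VA -> guest PA -> machine PA."""
    vpn, offset = divmod(gva, PAGE_SIZE)
    gppn = guest_page_table[vpn]
    mpn = pmap[gppn]
    return mpn * PAGE_SIZE + offset


def build_shadow(guest_pt, vmm_map):
    """Precompose both mappings: guest virtual page -> machine page."""
    return {vpn: vmm_map[gppn] for vpn, gppn in guest_pt.items() if gppn in vmm_map}


shadow_page_table = build_shadow(guest_page_table, pmap)


def translate_shadow(gva):
    """Translate with a single lookup in the composed table."""
    vpn, offset = divmod(gva, PAGE_SIZE)
    return shadow_page_table[vpn] * PAGE_SIZE + offset
```

Both functions return the same machine address for any mapped guest virtual address; the composed table simply does the second lookup ahead of time so the paging hardware only has one table to walk.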

In addition, the hardware provides a TLB to avoid page table walks. But the TLB the guest believes it is managing cannot simply be the real hardware TLB, so the VMM/hypervisor has to virtualize it somehow as well. At the same time, shadow page tables can act as a TLB for the guest.

That is the basic idea of shadow page tables, but it is probably the most complicated piece of hardware virtualization technology. I have left out a lot of details and catches that I do not completely understand myself.

The following link discusses some of the issues with simplified shadow page tables and how KVM tries to avoid them.

http://lwn.net/Articles/216794/

One more thing: there is also hardware support for this mechanism, called EPT on Intel processors and NPT on AMD processors.

HTH.
