To implement efficient spinlocks in a VM environment, the KVM documentation says that a vCPU waiting for a spinlock can execute the HLT instruction, which gives the vCPU holding the lock a chance to run. The lock holder can then issue the KVM_HC_KICK_CPU hypercall to wake the waiting vCPU.
Now here is my question.
Imagine the following sequence of instructions:
CHECK_SPIN_LOCK_FLAG
// <------------ waiting vCPU gets scheduled out right here, just before executing hlt
hlt
Now, when the lock-holder vCPU runs, it releases the spinlock and then tries to wake the waiting vCPU, but there is nothing to wake because the waiting vCPU is not halted. However, when the waiting vCPU is scheduled again, it executes the hlt instruction and stays halted.
Is this a race condition in the design of this hypercall?
The following is an excerpt from Documentation/virt/kvm/x86/hypercalls.rst:
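To make the interleaving concrete, here is a minimal sketch in C that replays it on a toy model. Everything in it (`struct vcpu_model`, `kick`, `hlt`, the `latch` switch) is hypothetical and is not KVM's code. It only shows that the outcome depends on whether a kick sent to a non-halted vCPU is remembered or dropped, which is exactly what I am asking about.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one waiting vCPU; not KVM's actual data structures. */
struct vcpu_model {
    bool halted;        /* true while the vCPU sits in HLT */
    bool kick_pending;  /* a kick that arrived while the vCPU was not halted */
};

/* Kick the target. If latch is false, a kick to a non-halted vCPU is dropped. */
static void kick(struct vcpu_model *v, bool latch)
{
    if (v->halted)
        v->halted = false;       /* normal case: wake the sleeper */
    else if (latch)
        v->kick_pending = true;  /* remember the kick for the next HLT */
}

/* Execute HLT. A pending kick, if any, is consumed and HLT returns at once. */
static void hlt(struct vcpu_model *v)
{
    assert(!v->halted);          /* a halted vCPU cannot execute HLT again */
    if (v->kick_pending)
        v->kick_pending = false;
    else
        v->halted = true;
}

/*
 * Replay the interleaving from the question:
 *   1. the waiter checks the lock flag and sees the lock held
 *   2. the waiter is scheduled out just before HLT
 *   3. the holder releases the lock and kicks the waiter
 *   4. the waiter is scheduled back in and executes HLT
 * Returns true if the waiter ends up stuck in HLT.
 */
static bool waiter_stuck_after_race(bool latch)
{
    struct vcpu_model waiter = { .halted = false, .kick_pending = false };
    bool lock_held = true;

    bool saw_held = lock_held;   /* step 1 */
    /* step 2: preempted here, nothing changes in the model */
    lock_held = false;           /* step 3: release ... */
    kick(&waiter, latch);        /* ... and kick */
    if (saw_held)
        hlt(&waiter);            /* step 4 */
    return waiter.halted;
}
```

In this model, `waiter_stuck_after_race(false)` reports a stuck waiter and `waiter_stuck_after_race(true)` does not, so the race exists only if the kick is edge-triggered rather than latched.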
5. KVM_HC_KICK_CPU
------------------
:Architecture: x86
:Status: active
:Purpose: Hypercall used to wakeup a vcpu from HLT state
:Usage example:
A vcpu of a paravirtualized guest that is busywaiting in guest
kernel mode for an event to occur (ex: a spinlock to become available) can
execute HLT instruction once it has busy-waited for more than a threshold
time-interval. Execution of HLT instruction would cause the hypervisor to put
the vcpu to sleep until occurrence of an appropriate event. Another vcpu of the
same guest can wakeup the sleeping vcpu by issuing KVM_HC_KICK_CPU hypercall,
specifying APIC ID (a1) of the vcpu to be woken up. An additional argument (a0)
is used in the hypercall for future use.
Okay, I am not sure whether I have solved the problem, but it seems there is one more hypercall.
We can use that hypercall before waking up the vCPU, to make sure the target vCPU is indeed halted. If it was preempted instead, yield and resume once the target is halted rather than preempted.
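The idea above could be sketched roughly as follows. This is only a toy model under my own assumptions: `struct target_model`, `yield_to_target`, and `kick_when_halted` are hypothetical names, the target's state is simulated with a counter rather than read from the hypervisor, and the sketch ignores any window between reading the state and sending the kick.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical view of the target vCPU as seen by the kicker. */
enum target_state { TARGET_RUNNING, TARGET_PREEMPTED, TARGET_HALTED };

struct target_model {
    enum target_state state;
    int steps_until_halt;  /* simulated: yields needed before it reaches HLT */
};

/*
 * Hypothetical stand-in for giving up the kicker's time slice so the
 * target can make progress. Here it just advances the simulated counter.
 */
static void yield_to_target(struct target_model *t)
{
    if (t->steps_until_halt > 0)
        t->steps_until_halt--;
    t->state = (t->steps_until_halt == 0) ? TARGET_HALTED : TARGET_RUNNING;
}

/*
 * Kick only once the target is observed halted; yield while it is not.
 * Returns the number of yields performed before the kick, or -1 if the
 * target did not halt within max_yields attempts.
 */
static int kick_when_halted(struct target_model *t, int max_yields)
{
    int yields = 0;

    assert(max_yields >= 0);
    while (t->state != TARGET_HALTED) {
        if (yields == max_yields)
            return -1;          /* give up instead of spinning forever */
        yield_to_target(t);
        yields++;
    }
    t->state = TARGET_RUNNING;  /* the kick: wake the halted target */
    return yields;
}
```

The `max_yields` bound is my own addition so the kicker cannot loop forever if the target never halts; whether such a bound is appropriate in a real guest is an open design choice.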
Solution as of now: