If I set the presentation interval in Direct3D9 to D3DPRESENT_INTERVAL_ONE, when I call Present it waits until the monitor updates. It always waits the correct amount and (presumably) doesn't use a spinlock.
I'd like to be able to do the same "waiting" that Present does in Direct3D9, but I don't want to use Direct3D. How exactly does it wait for vsync so precisely without using a spinlock? Can just the waiting be programmed without Direct3D?
Synchronization with the vertical retrace is handled by the driver in a device-dependent manner. It's not inconceivable that there exists some implementation that just busy-waits, polling a device register until it detects the beginning of the retrace interval. The alternative would be to sleep waiting on a device interrupt, which frees up the CPU for other tasks but increases the latency because of the necessary kernel-mode/user-mode transitions. It's also possible for a driver to implement a hybrid approach: estimate the time to the retrace, sleep for a bit less than that, and then busy-wait for the remainder.
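To make the hybrid approach concrete, here is a minimal sketch in portable C++. It is not how any particular driver works, and it does not touch real hardware: the retrace is simulated as a fixed-period event measured on std::chrono::steady_clock. The names next_retrace, wait_for_retrace, epoch, period and spin_margin are all made up for illustration.

```cpp
#include <cassert>
#include <chrono>
#include <thread>

using Clock = std::chrono::steady_clock;

// Hypothetical helper: assumes retraces happen at epoch + k * period and
// returns the first one strictly after `now`. A real driver would get this
// from the hardware instead of computing it.
Clock::time_point next_retrace(Clock::time_point now,
                               Clock::time_point epoch,
                               Clock::duration period) {
    assert(period > Clock::duration::zero() && now >= epoch);
    const auto k = (now - epoch) / period + 1;  // index of the next retrace
    return epoch + k * period;
}

// Hybrid wait: sleep through most of the interval, then busy-wait for the
// last `spin_margin` so the wake-up lands close to the target.
// Returns the time at which the wait actually ended.
Clock::time_point wait_for_retrace(Clock::time_point epoch,
                                   Clock::duration period,
                                   Clock::duration spin_margin) {
    const auto target = next_retrace(Clock::now(), epoch, period);
    const auto wake = target - spin_margin;

    // Coarse phase: give the CPU back to the scheduler.
    if (Clock::now() < wake) {
        std::this_thread::sleep_until(wake);
    }

    // Fine phase: spin until the (simulated) retrace time is reached.
    Clock::time_point now = Clock::now();
    while (now < target) {
        now = Clock::now();
    }
    return now;
}
```

For a simulated 60 Hz display you might call wait_for_retrace(epoch, std::chrono::microseconds(16667), std::chrono::milliseconds(2)). The margin is a trade-off: if the sleep overshoots by more than the margin, the spin loop exits immediately and the wake-up is late, while a larger margin burns more CPU.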
I don't know which of these three possible implementations is typical, but it doesn't really matter. Windows doesn't provide any device-independent means for an application to synchronize with the vertical retrace outside of DirectX (and, I guess, OpenGL). Unlike a video card driver, applications don't have direct access to the hardware, so they can't read the device registers or request or handle interrupts.