Is it possible to make parallel rendering with Vulkan?


My question is fairly simple and may look naive, but I haven't seen much discussion of it, mainly because the articles and posts I've found deal with real-time rendering.

But since a GPU can have up to 8 or 16 graphics queues, I was wondering: can we launch as many renders as the GPU has queues? I'm deeply interested in this for a fully offscreen rendering application where the renders are totally unrelated, except for sharing geometry and shaders.
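For context, the number of queues a GPU exposes per queue family is something you can query directly. A minimal sketch (assuming `physicalDevice` is a valid `VkPhysicalDevice` handle obtained during instance setup):

```cpp
// Enumerate queue families and print how many queues each one exposes.
#include <vulkan/vulkan.h>
#include <vector>
#include <cstdio>

void printQueueFamilies(VkPhysicalDevice physicalDevice) {
    uint32_t familyCount = 0;
    vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &familyCount, nullptr);

    std::vector<VkQueueFamilyProperties> families(familyCount);
    vkGetPhysicalDeviceQueueFamilyProperties(physicalDevice, &familyCount, families.data());

    for (uint32_t i = 0; i < familyCount; ++i) {
        bool graphics = (families[i].queueFlags & VK_QUEUE_GRAPHICS_BIT) != 0;
        std::printf("family %u: %u queue(s)%s\n",
                    i, families[i].queueCount,
                    graphics ? " (graphics-capable)" : "");
    }
}
```

The `queueCount` reported here is the upper bound on how many queues you can request from that family at device creation.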

2 Answers

Answer by vandench:

The GPU is free to do whatever it wants, however it wants, as long as it follows the constraints set by the Vulkan spec. The only real constraints in Vulkan queues are synchronization primitives. As long as everything ends up in the right order according to the semaphores, everything in between semaphores can happen in any order. This can happen within a command buffer, within a queue, within a queue family, within a device, or across devices (a device being the virtual context represented as a VkDevice, not a physical device).
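The ordering guarantee described above can be sketched as two submissions on different queues linked by a semaphore; everything within each command buffer may be reordered by the implementation, but the semaphore pins the order between the two. This is a hypothetical sketch assuming `queueA`, `queueB`, `cmdA`, `cmdB`, and `semaphore` are valid, already-created handles:

```cpp
// Work submitted to queueB waits on a semaphore signaled by queueA,
// so the GPU must respect that ordering, regardless of how it
// schedules the work internally.
#include <vulkan/vulkan.h>

void submitOrdered(VkQueue queueA, VkQueue queueB,
                   VkCommandBuffer cmdA, VkCommandBuffer cmdB,
                   VkSemaphore semaphore) {
    // First submission: signal the semaphore when cmdA's work completes.
    VkSubmitInfo submitA{};
    submitA.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitA.commandBufferCount = 1;
    submitA.pCommandBuffers = &cmdA;
    submitA.signalSemaphoreCount = 1;
    submitA.pSignalSemaphores = &semaphore;
    vkQueueSubmit(queueA, 1, &submitA, VK_NULL_HANDLE);

    // Second submission: wait on that semaphore before executing cmdB.
    VkPipelineStageFlags waitStage = VK_PIPELINE_STAGE_TOP_OF_PIPE_BIT;
    VkSubmitInfo submitB{};
    submitB.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
    submitB.waitSemaphoreCount = 1;
    submitB.pWaitSemaphores = &semaphore;
    submitB.pWaitDstStageMask = &waitStage;
    submitB.commandBufferCount = 1;
    submitB.pCommandBuffers = &cmdB;
    vkQueueSubmit(queueB, 1, &submitB, VK_NULL_HANDLE);
}
```

If the two workloads are truly unrelated, as in the question, you would simply omit the semaphore and let both submissions proceed independently.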

Going by NVidia's explanation of the rendering process on their GPUs: within a single Graphics Processing Cluster there is a single rasterizer, plus a lot of cores and dispatch units to handle the shaders. Most of their GPUs have multiple GPCs, so each one can presumably be working on rendering a different triangle. In practice, things are wildly more complex than what I've described.

So can you render things in parallel? Sure, why not. Will you notice? Assuming you set up your synchronization primitives correctly, probably not.

Pragmatically speaking, this would be something you would ask your support engineer for the various GPU manufacturers you work with, and they would be able to go over how to best optimize your renderer.

Answer by Nicol Bolas:

You can shove as much stuff down however many queues the implementation allows you to have.
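For reference, requesting multiple queues happens at device creation time. A hypothetical sketch, assuming `familyIndex` is a valid graphics-capable family index and `queueCount` does not exceed that family's reported `queueCount`:

```cpp
// Request several queues from one queue family when creating the device.
#include <vulkan/vulkan.h>
#include <vector>

VkDeviceQueueCreateInfo makeQueueInfo(uint32_t familyIndex,
                                      uint32_t queueCount,
                                      const std::vector<float>& priorities) {
    // One priority value (0.0..1.0) is required per requested queue.
    VkDeviceQueueCreateInfo info{};
    info.sType = VK_STRUCTURE_TYPE_DEVICE_QUEUE_CREATE_INFO;
    info.queueFamilyIndex = familyIndex;
    info.queueCount = queueCount;   // e.g. 8, if the family exposes 8
    info.pQueuePriorities = priorities.data();
    return info;
}

// After vkCreateDevice, each individual queue handle is retrieved with:
//   vkGetDeviceQueue(device, familyIndex, queueIndex, &queue);
```

This is purely about how many submission interfaces you get; as explained below, it says nothing about how much hardware each queue can occupy.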

Will you get any meaningful performance improvement out of doing so? That is highly unlikely.

On a CPU, if you don't use a thread, that thread goes unused. That's just how threads work.

GPUs aren't like that. Queues are not like CPU threads. They are interfaces for dispatching work to the various execution units of the GPU. As such, one of the 8 queues does not limit itself to merely 1/8th of the available hardware. It will fill up however many execution units and hardware components are available to be filled up with the work it has.

Submitting two pieces of work at the same time will therefore cause them both to contend for the same resources. The results will still take the same total amount of time to be completed because they're ultimately using the same resources.

This article and its companions are from 2018, but they should still be broadly applicable.