DirectX 11 DrawInstanced render order


I'm just experimenting with rendering 2D sprites with DirectX11 using instancing. It seems that the primitive order matters when using "DrawInstanced".

On the first try I tested with a couple of sprites (each with 4 vertices + texture data with alpha).

The input layout looks like:

D3D11_INPUT_ELEMENT_DESC ied[] =
    {
        // vertex buffer
        {"POSITION", 0, DXGI_FORMAT_R32G32B32_FLOAT, 0, D3D11_APPEND_ALIGNED_ELEMENT, D3D11_INPUT_PER_VERTEX_DATA, 0},
        {"TEXCOORD", 0, DXGI_FORMAT_R32G32_FLOAT, 0, D3D11_APPEND_ALIGNED_ELEMENT, D3D11_INPUT_PER_VERTEX_DATA, 0},

        // instance buffer
        { "INSTANCEPOS", 0, DXGI_FORMAT_R32G32B32_FLOAT, 1, D3D11_APPEND_ALIGNED_ELEMENT, D3D11_INPUT_PER_INSTANCE_DATA, 1},        
        { "TEXTUREID", 0, DXGI_FORMAT_R32_FLOAT, 1, D3D11_APPEND_ALIGNED_ELEMENT, D3D11_INPUT_PER_INSTANCE_DATA, 1}
    };
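
For reference, the per-instance elements in slot 1 imply a 16-byte instance record. Here is a minimal sketch of a matching CPU-side struct; the names `Float3`, `InstanceData` and `kInstanceStride` are made up for illustration and are not from my actual code:

```cpp
#include <cstddef>   // offsetof
#include <cstdint>

// Hypothetical plain 3-float vector, standing in for whatever math type is used.
struct Float3 { float x, y, z; };

// Hypothetical CPU-side mirror of the two per-instance elements (input slot 1).
struct InstanceData {
    Float3 Position;   // INSTANCEPOS, DXGI_FORMAT_R32G32B32_FLOAT, byte offset 0
    float  TextureID;  // TEXTUREID,   DXGI_FORMAT_R32_FLOAT,       byte offset 12
};

// D3D11_APPEND_ALIGNED_ELEMENT packs elements back to back,
// so the struct must have the same offsets and no padding.
static_assert(offsetof(InstanceData, Position) == 0, "INSTANCEPOS must start at 0");
static_assert(offsetof(InstanceData, TextureID) == 12, "TEXTUREID must follow the float3");
static_assert(sizeof(InstanceData) == 16, "unexpected padding in InstanceData");

// Stride to pass for the instance buffer when binding vertex buffers.
constexpr std::uint32_t kInstanceStride = sizeof(InstanceData);
```

If the struct and the layout ever disagree, the static_asserts fail at compile time instead of producing garbled instance data at runtime.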

In the vertex shader the position and texture ID are set per instance.

cbuffer CB_Matrix : register(b0) {
    matrix g_matrix;
};

cbuffer CB_Position : register(b1){
    float2 cb_position;
};

struct VOut {
    float4 position  : SV_POSITION;
    float2 uv        : TEXCOORD0;
    float  textureID : TEXTUREID;
};

VOut VShader(float4 position : POSITION, float2 uv : TEXCOORD0, float3 instancePos : INSTANCEPOS, float textureID : TEXTUREID) {
    VOut output;

    float4x4 translate = { 1, 0, 0, cb_position.x,
                           0, 1, 0, cb_position.y,
                           0, 0, 1, 0,
                           0, 0, 0, 1 };

    position += float4(instancePos, 0.0f);

    output.position = mul(translate, position);
    output.position = mul(g_matrix, output.position);
    output.uv = uv;
    output.textureID = textureID;

    return output;
}

The initialization looks like:

for (uint32_t i = 0; i < NUM_INSTANCES; i++) {  
    instances[i].Position.x = spriteData[i].Position.x;
    instances[i].Position.y = spriteData[i].Position.y;
    instances[i].Position.z = 0.0f;    
    instances[i].TextureID  = spriteData[i].TextureID;
}

The sprites were rendered (using DrawInstanced), but where they overlapped the alpha blending wasn't correct: (image: sprites with wrong alpha values)

Then I changed the initialization to sort the instances back-to-front by starting with the maximum z value and decreasing it for each instance:

float z = 1.0f;
for (uint32_t i = 0; i < NUM_INSTANCES; i++) {
    z -= 0.0001f;
    instances[i].Position.x = spriteData[i].Position.x;
    instances[i].Position.y = spriteData[i].Position.y;
    instances[i].Position.z = z;    
    instances[i].TextureID  = spriteData[i].TextureID;
}
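
The loop above relies on the array already being in the desired draw order. If the sprites carry their own depth, the same idea could be sketched as an explicit sort. This is only an illustration: the `Sprite` and `Instance` types, the `depth` field (larger means farther away) and `buildInstances` are assumptions, not part of my real code:

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sprite record; `depth` is an assumed field (larger = farther away).
struct Sprite {
    float x, y;
    float depth;
    float textureID;
};

// Hypothetical per-instance record written to the instance buffer.
struct Instance {
    float x, y, z;
    float textureID;
};

// Returns the instances in back-to-front order (farthest sprite first).
std::vector<Instance> buildInstances(std::vector<Sprite> sprites) {
    // stable_sort keeps the submission order of sprites that share a depth.
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite& a, const Sprite& b) { return a.depth > b.depth; });

    std::vector<Instance> instances;
    instances.reserve(sprites.size());
    for (const Sprite& s : sprites) {
        instances.push_back({s.x, s.y, s.depth, s.textureID});
    }
    assert(instances.size() == sprites.size());
    return instances;
}
```

The result would then be copied into the instance buffer before calling DrawInstanced, so that instance 0 is the farthest sprite.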

Then the sprites were rendered with the correct alpha values: (image: sprites with correct alpha)

It's nice that this works but this raises a few questions:

  • Is it guaranteed that back-to-front ordering fixes the alpha problem on all hardware? I couldn't find any DirectX documentation on instancing that mentions that the instance order matters.
  • If the order matters, does that mean "DrawInstanced" is a sequential task, with each instance rendered after the previous one? I imagine it as the GPU issuing a "Draw" call for each instance.

Best answer

Is it guaranteed that back-to-front ordering fixes the alpha problem on all hardware? I couldn't find any DirectX documentation on instancing that mentions that the instance order matters.

Yes, it is guaranteed that primitives are rendered in order. Otherwise any order-dependent blending operation would be nearly impossible (there are some advanced order-independent techniques, but they can be really heavy and don't fit this use case).

Some vendors (such as AMD) provide an extension to disable this ordering guarantee on a per-draw basis, which is useful when you have a depth buffer and don't need ordered blending. But this is an opt-out feature: ordering is guaranteed unless you explicitly turn it off.
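
To see why blending needs a defined order, here is a small CPU-side sketch of the common "source over" blend. It assumes a blend state equivalent to SrcBlend = SRC_ALPHA and DestBlend = INV_SRC_ALPHA, which the question does not actually show; `Color` and `blendOver` are illustrative names only:

```cpp
#include <cassert>

struct Color { float r, g, b, a; };

// "Source over" blend, assuming SRC_ALPHA / INV_SRC_ALPHA for color:
//   out.rgb = src.rgb * src.a + dst.rgb * (1 - src.a)
// The alpha channel is accumulated the same way here purely for illustration.
Color blendOver(const Color& src, const Color& dst) {
    assert(src.a >= 0.0f && src.a <= 1.0f);
    const float inv = 1.0f - src.a;
    return { src.r * src.a + dst.r * inv,
             src.g * src.a + dst.g * inv,
             src.b * src.a + dst.b * inv,
             src.a + dst.a * inv };
}
```

Blending a half-transparent red sprite and then a half-transparent blue one onto a black background gives a different pixel than blue followed by red, so the result is only well defined if the hardware applies the blends in submission order.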

If the order matters, does that mean "DrawInstanced" is a sequential task, with each instance rendered after the previous one? I imagine it as the GPU issuing a "Draw" call for each instance.

It depends on the architecture, but DrawInstanced is not "serialized" into a list of individual draws. GPUs tend to process workloads in batches/streams (and the vertex and pixel stages of the same draw can of course overlap). There are also various ways to process primitives in a different order while still guaranteeing pixel ordering at the output merger level (post pixel shader, pre blending).