I need help with organizing shaders in my game.
The game uses vertex and pixel shaders for texturing and lighting. Some objects are textured, others are just colored, and there are several lighting algorithms: lightmaps, diffuse and specular lighting, some shadow generation which only some objects get, etc.
My first idea was to have one big, nice shader technique which can handle it all based on parameters. This was great to maintain (pseudocode to give the idea):
float4 PixelShaderFunction(input)
{
    if (UseTexture)
        objectColor = tex2D(...);
    else
        objectColor = ObjectColor;
    if (UseAmbient)
        ...
    if (UseDiffuseLight)
        ...
    // etc...
}
However the performance was poor.
Then I created multiple techniques to avoid if branches, so every technique only uses what a particular object needs. If the object doesn't accept shadows, there's no code for it there. If it doesn't have specular light, no code. Plus I grouped common functionality into functions. Like this (real code now):
float4 Textured_PixelShader(Textured_VsOut input) : COLOR0
{
    float4 lightLevel = 0;
    float4 objectColor = GetObjColorTex(input.TexUV);
    // calculate main table lighting
    PsAddTableLight(input.WorldPosition, input.Normal, lightLevel);
    // calculate cloth lighting
    PsAddClothLight(input.WorldPosition, input.Normal, lightLevel);
    // calculate fill light - do we need it?
    PsAddFillLight(input.Normal, lightLevel);
    // add ambient light
    PsAddAmbientLight(lightLevel);
    // apply light
    PsApplyDiffuseLight(lightLevel, objectColor);
    // add speculars
    PsAddSpecularBlurred(input.WorldPosition, input.Normal, objectColor);
    // final work
    return PsFinalize(objectColor);
}
The improvement in performance was massive.
However, I'm getting lost in maintaining this shader. Every second day I need to add a new technique because there isn't yet a combination which does texture + lightmap + this_kind_of_shadow + specular or whatever I need. Some of them get names like this; others get named after the object they are used for, because there is only one object with such a combination. And it's becoming a mess.
So I have two questions:
- Why can't I have those if statements? I have read a lot about how conditional execution hurts GPUs, but my ifs only depend on shader parameters which have the same value for all rendered pixels (or verts). Why can't they be fast? I really miss them.
- What is the best way to divide this code into different shaders / techniques / files? Are there any good standards or rules?
Thoughts:
The main problem here is that the compiler only sees a boolean expression, not the semantic information that your variable is a shader constant. That is only a special case, and it gets complicated quickly, for example when a more elaborate boolean expression calls functions that themselves only use shader constants. I think the shader compiler developers chose to keep if as general as it is to spare themselves a lot of headaches.
Facts:
As stated in the HLSL documentation for if, there are two modes: flatten and branch. With flatten, the compiler unrolls both sides of the if, so both are computed first and the result is then taken from the side the condition selects. With branch, the boolean is evaluated first and only the selected side is executed. However, this mode can only be used if the branch contains no gradient functions such as tex2D: they depend on neighbouring fragments, so they have to be executed for every fragment and must not be skipped. In your case I'm pretty sure you are using such functions, so the compiler chooses flatten for your ifs, which results in a fully executed and slow shader.
As far as I know there are no good standards, but engines like Unity or Unreal are a good place to take a closer look at how they manage their shaders. The last time I looked into Unreal, each shader was generated dynamically when it was first needed: the shader builder generates the shader code, which is then compiled.
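Coming back to the two if modes for a moment, here is a rough sketch of how they can be requested explicitly with the [flatten] and [branch] attributes. It is not your shader: UseTexture and ObjectColor are borrowed from the pseudocode in the question, while UseSpecular, DiffuseSampler, HalfVector and SpecularPower are made-up names, and I'm assuming a ps_3_0 (or later) target. The texture sample is kept outside the branch so that no gradient function sits inside it.

```hlsl
// Sketch only. UseSpecular, DiffuseSampler, HalfVector and SpecularPower
// are hypothetical names introduced for this example.
bool      UseTexture;      // uniform: same value for every pixel of the draw call
bool      UseSpecular;     // uniform: same value for every pixel of the draw call
float4    ObjectColor;
float3    HalfVector;
float     SpecularPower;
sampler2D DiffuseSampler;

float4 IfModes_PixelShader(float2 texUV  : TEXCOORD0,
                           float3 normal : TEXCOORD1) : COLOR0
{
    // tex2D needs derivatives from neighbouring pixels, so it is sampled
    // unconditionally, outside of any real branch.
    float4 texColor = tex2D(DiffuseSampler, texUV);

    // [flatten]: both values are available, the condition only selects one.
    float4 objectColor = ObjectColor;
    [flatten]
    if (UseTexture)
    {
        objectColor = texColor;
    }

    // [branch]: asks for real flow control, so the body is skipped when
    // UseSpecular is false. It contains only arithmetic, no gradient functions.
    [branch]
    if (UseSpecular)
    {
        float nDotH = saturate(dot(normalize(normal), normalize(HalfVector)));
        objectColor.rgb += pow(nDotH, SpecularPower);
    }

    return objectColor;
}
```

Whether the [branch] version is actually faster than a dedicated technique depends on the hardware and the shader model, so it is worth measuring before relying on it.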
In my own little engine I used an approach similar to yours, but instead of dynamic branching I use the preprocessor directives #if, #elif, #else, and #endif. When the engine encounters a shader it needs, it sets the right defines and compiles it on the fly with D3DCompile. To prevent stuttering, I save the compiled shaders to disk, and before compiling I check whether the shader has already been compiled.
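A minimal sketch of what such a preprocessor-driven shader could look like, assuming two feature macros. USE_TEXTURE, USE_SPECULAR, DiffuseSampler, AmbientLight, HalfVector and SpecularPower are names invented for this example, not taken from my engine or from your code. The engine would supply the macro values for each variant, for instance through the defines argument of D3DCompile.

```hlsl
// Sketch only: one source file, many compiled variants.
// The defaults below apply when the engine does not define a macro.
#ifndef USE_TEXTURE
#define USE_TEXTURE 0
#endif
#ifndef USE_SPECULAR
#define USE_SPECULAR 0
#endif

float4 ObjectColor;
float4 AmbientLight;

#if USE_TEXTURE
sampler2D DiffuseSampler;   // only declared in textured variants
#endif

#if USE_SPECULAR
float3 HalfVector;          // only declared in specular variants
float  SpecularPower;
#endif

float4 Variant_PixelShader(float2 texUV  : TEXCOORD0,
                           float3 normal : TEXCOORD1) : COLOR0
{
#if USE_TEXTURE
    float4 objectColor = tex2D(DiffuseSampler, texUV);
#else
    float4 objectColor = ObjectColor;
#endif

    // Simplified lighting: scale the base color by the ambient term.
    objectColor.rgb *= AmbientLight.rgb;

#if USE_SPECULAR
    // This code does not exist at all in variants compiled with USE_SPECULAR=0.
    float nDotH = saturate(dot(normalize(normal), normalize(HalfVector)));
    objectColor.rgb += pow(nDotH, SpecularPower);
#endif

    return objectColor;
}
```

Compiling this file once with USE_TEXTURE=1 and USE_SPECULAR=0, and once with both set to 1, gives two shaders that each contain only the code they need, while the source stays as easy to maintain as the single big shader from the question.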