I'm trying to organize my code to use the cache as efficiently as possible using data-oriented design. It's my first time thinking about such things. I've worked out a way to loop over the same instructions that draw a sprite on screen; the vectors passed to the function hold the positions and sprites for all game entities.
My question is: does the conditional statement evict the draw function from the instruction cache and therefore ruin my plan? Or is what I'm doing just generally insane?
struct position
{
    position(int x_, int y_) : x(x_), y(y_) {}
    int x, y;
};
vector<position> thePositions;
vector<sprite> theSprites;
vector<int> theNoOfEntities; //eg 3 things, 4 thingies, 36 dodahs
int noOfEntitesTotal;
//invoking the draw function
draw(&thePositions[0], &theSprites[0], &theNoOfEntities[0], noOfEntitesTotal);
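void draw(position* thepos, sprite* thesp, int* theints, int totalsize)
{
    // j indexes the current sprite, i the current entity
    for (int j = 0, i = 0; i < totalsize; i++)
    {
        j += i % theints[j] ? 1 : 0; // the conditional in question
        thesp[j].draw(thepos[i]);
    }
}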
Did you verify that the conditional stays a conditional in the assembly? Generally, with simple conditionals such as the one presented above, the expression can be optimized into a branchless sequence (either at the machine level using machine-specific instructions, or at the IR level using some fancy bit math).
In your case, your conditional gets folded down very nicely on x86 to a flat sequence (and AFAIK this will occur on most non-x86 platforms too, as it's a mathematical optimization, not a machine-specific one).
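To make the folding concrete at source level, here is a minimal C++ sketch (it is not compiler output). The helper names `step_ternary` and `step_branchless` are hypothetical, and `count` is assumed to be non-zero. Both functions return the same 0 or 1, which is why an optimizer is free to emit the comparison result directly instead of a jump. Whether it actually does so for a given compiler and set of flags still has to be checked in the generated assembly.

```cpp
// Ternary form, as written in the question (count is assumed non-zero).
inline int step_ternary(int i, int count)
{
    return i % count ? 1 : 0;
}

// Equivalent arithmetic form: the comparison itself yields 0 or 1,
// so there is no value left to branch on.
inline int step_branchless(int i, int count)
{
    return static_cast<int>(i % count != 0);
}
```

Swapping one for the other in the loop should not change its behaviour. The sketch only illustrates that the `?1:0` part carries no control flow of its own.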
So this means there aren't any branches to predict other than your outer loop, which follows a pattern, meaning it won't cause any misprediction (it might mispredict on exit, depending on the generated assembly, but the loop has finished by then, so it doesn't matter).
This brings up a second point: never assume, always profile and test (one of the cases where assembly knowledge helps a lot). That way you can spend time optimizing where it really matters, and you can better understand the inner workings of your code on your target platform.
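As a rough starting point for the "test" part, here is a minimal timing sketch in C++. The names `time_us` and `walk_groups` are hypothetical, and the workload is only a stand-in that walks per-group counts the way the question's loop is meant to. A wall-clock timer like this is no substitute for a real profiler or hardware counters, but it is enough to compare two variants of a loop.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical helper: elapsed wall-clock time of one call to `work`, in microseconds.
template <typename F>
long long time_us(F&& work)
{
    auto start = std::chrono::steady_clock::now();
    work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Stand-in workload: visits every entity group by group and folds the
// (group index, entity index) pairs into a checksum so the loop is not optimized away.
long long walk_groups(const std::vector<int>& counts)
{
    long long checksum = 0;
    int i = 0; // running entity index across all groups
    for (std::size_t j = 0; j < counts.size(); ++j)
        for (int k = 0; k < counts[j]; ++k, ++i)
            checksum += static_cast<long long>(j) * i;
    return checksum;
}
```

A call such as `time_us([&] { result = walk_groups(counts); })` would time one pass. Repeating the measurement and building with optimizations enabled gives more trustworthy numbers.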
If you really are concerned about branch misprediction and the penalties incurred, use the resources provided by your target architecture's manufacturer (different architectures behave very differently for branch misprediction), such as this and this from Intel. AMD's CodeAnalyst is a great tool for checking branch misprediction and the penalties it may be causing.