From docs: seq
set-if-equal destination = source1 == source2 ? 1 : 0, component-wise
I haven't tested it thoroughly yet, but so far my fragment shader has worked on both machines (desktop PCs) where Context3D initialization succeeded with DirectX. It doesn't work on machines where Flash falls back to software rendering.
seq ft2.x, ft0.x, fc0.x
ft2.x is set to 1 on hardware when the current pixel's red value, stored in ft0.x, is equal to the constant fc0.x, which stores 50/255. So what I want to happen does happen for a #32???? (50 == 0x32) colored pixel on hardware, but not on software.
I already tested a workaround: I can replace the seq opcodes with a more complex algorithm involving slt (set if less than) or sge (set if greater or equal).
So it seems the problem lies in the comparison between a constant I supply to the GPU (50/255) and the actual red value (which is 50 in the texture). If it were anything else (e.g. RGBA values in a different order), slt and sge would fail as well.
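To make the idea of replacing seq concrete, here is a minimal Python model (not AGAL) of how an equality test might be rebuilt from sge and slt as a narrow band around the constant. The helper names and the half-step tolerance of 0.5/255 are assumptions for illustration; the exact algorithm used in the shader is not shown in this post.

```python
def seq(a, b):
    """Model of AGAL seq: 1.0 if a == b, else 0.0."""
    return 1.0 if a == b else 0.0


def sge(a, b):
    """Model of AGAL sge: 1.0 if a >= b, else 0.0."""
    return 1.0 if a >= b else 0.0


def slt(a, b):
    """Model of AGAL slt: 1.0 if a < b, else 0.0."""
    return 1.0 if a < b else 0.0


# Assumed tolerance: half of one 8-bit color step.
HALF_STEP = 0.5 / 255.0


def band_equal(value, target, tol=HALF_STEP):
    """Approximate equality: 1.0 only if target - tol <= value < target + tol.

    Multiplying the two 0/1 results acts as a logical AND.
    """
    return sge(value, target - tol) * slt(value, target + tol)


target = 50 / 255.0
slightly_off = target + 1e-6  # hypothetical sampled value with a tiny error

print(seq(slightly_off, target))         # exact comparison misses
print(band_equal(slightly_off, target))  # band comparison still matches
```

The band test costs two comparisons and a multiply per constant, but it tolerates small differences between the sampled value and the supplied constant, which an exact seq does not.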
Am I doing something wrong here? Should I somehow round the compared values (e.g. multiply by 255, then remove the fractional part) to be sure it will work on all devices and in all modes?
Update: One of the machines with software rendering fallback was set to 16-bit graphics; however, changing it to 32-bit didn't fix the issue. I also tried blindly dividing the color value by 256, 128 and 127 instead of 255, hoping that might fix the issue if the float had a different precision (higher and lower numbers would work as long as they equaled one of the pixels inside a 256px-long gradient), but that didn't work either.
Then I tried the workaround of storing the constant as an integer and, inside the shader, multiplying the value by 255 and removing the fractional part. To my surprise, while it worked on the GPU, it failed in software rendering:
mul ft0.x, ft0.x, fc0.y
convert ft0.x (red channel) to integer by multiplying it by the constant 255
frc ft4.x, ft0.x
get a fractional
sub ft0.x, ft0.x, ft4.x
remove fractional, to truncate the integer
Now do the comparisons, e.g. seq ft2.x, ft0.x, fc0.x
add ft0.x, ft0.x, ft4.x
add fractional back, this step is probably not necessary
div ft0.x, ft0.x, fc0.y
divide the integer value by 255 to convert it back to float (by this I mean a number in 0..1 range)
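As a rough illustration of the mul/frc/sub steps above, the following Python sketch models frc as x - floor(x) and applies the same truncation. The round_channel variant, which adds 0.5 before truncating, is an assumption that does not appear in the shader; it only shows that a value landing slightly below an integer truncates to the next lower integer, while rounding recovers the intended one. Whether that would help in software rendering is untested here.

```python
import math


def frc(x):
    """Model of AGAL frc: fractional part, x - floor(x)."""
    return x - math.floor(x)


def truncate_channel(value, scale=255.0):
    """mul by scale, then subtract the fractional part (as in the steps above)."""
    scaled = value * scale
    return scaled - frc(scaled)


def round_channel(value, scale=255.0):
    """Assumed variant: add 0.5 before truncating, so the result is rounded."""
    scaled = value * scale + 0.5
    return scaled - frc(scaled)


# Hypothetical sampled red value that is a hair below 50/255.
almost_fifty = 49.9999 / 255.0

print(truncate_channel(almost_fifty))
print(round_channel(almost_fifty))
```

In AGAL terms the rounding variant would need one extra add with a constant holding 0.5 before the frc instruction.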
The next thing I'm going to try as a workaround is simply to make a series of less-than comparisons, each setting a temp register to 1 that is added to another temp register (a counter), so that by checking the counter I can see which range the value falls in.
Here's the workaround that finally did the trick for me.
I had 4 colors on the red alpha channel that told the shader what to do. If the red value was 50, the shader would take the left pixel as a source; if it was 100, it would take the top pixel, and so on. So all I had to do was issue 4 seq commands to set 0 or 1 offsets in the 4 components of a register, which I could later add to or subtract from the register holding the position for the sampler.
Because seq failed to compare the red value of the pixel from the first sampling with the supplied constant, I made a 'ladder' of set-if-greater-or-equal opcodes.
Now I had a register ft2 that stored:
0 for red below 49 (actually all of these red values are divided by 255, as in the comments in the code above)
1 for red between 49 and 98
2 for red between 98 and 147
3 for red between 147 and 196
4 for red above 196
Then, instead of comparing a pixel color with a constant, I would compare the ft2.x counter state with a constant (and the constants would be 1, 2, 3, 4 instead of 50, 100, 150, 200).
Unfortunately, this means all of the code above is additional overhead that I could spare the GPU but can't avoid on the CPU, unless I can find a solution to the seq opcode always returning 0 on the CPU when comparing a pixel color and a constant.
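The ladder described above can be sketched in Python as a sum of sge results, one per threshold. This is a model of the logic only, not the AGAL from the shader; the function and constant names are hypothetical, and the thresholds are the ones listed above, divided by 255.

```python
# Thresholds from the list above, expressed in the 0..1 range.
THRESHOLDS = [49 / 255.0, 98 / 255.0, 147 / 255.0, 196 / 255.0]


def sge(a, b):
    """Model of AGAL sge: 1.0 if a >= b, else 0.0."""
    return 1.0 if a >= b else 0.0


def ladder_counter(red):
    """Add up one sge result per threshold; the sum identifies the range."""
    counter = 0.0
    for threshold in THRESHOLDS:
        counter += sge(red, threshold)
    return counter


# The four marker colors map to counters 1..4.
for marker in (50, 100, 150, 200):
    print(marker, ladder_counter(marker / 255.0))
```

Because the counter is built only from sums of exact 0 and 1 values, comparing it with the constants 1, 2, 3 and 4 does not depend on how 50/255 is represented.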