GLSL Atomic Image Access

My other post was meant to collect general information on the kinds of GLSL spinlocks, but unfortunately nothing has come of it, and it has not solved my problem. So here is a specific question. I have reduced my problem to a minimal example, presented below:

The example creates a screen-sized texture of locks and a screen-sized texture of colors. In pass one, the colors are all set to zero (shader 1). In pass two, two triangles are drawn, which the geometry shader quadruples and slightly offsets (shader 2). The fragment shader is meant to atomically increment the color texture, using the lock texture as a per-pixel spinlock. In pass three, the color is visualized (shader 3).

Shader 1:

//Vertex
#version 440
uniform mat4 mat_P;
in vec4 _vec_vert_a;
void main(void) {
    gl_Position = mat_P*_vec_vert_a;
}

//Fragment
#version 440
layout(rgba32f) coherent uniform image2D img0;
void main(void) {
    imageStore(img0,ivec2(gl_FragCoord.xy),vec4(0.0,0.0,0.0,1.0));
    discard;
}

Shader 2:

//Vertex
#version 440
in vec4 _vec_vert_a;
out vec4 vert_vg;
void main(void) {
    vert_vg = _vec_vert_a;
}

//Geometry
#version 440
#define REPS 4
layout(triangles) in;
layout(triangle_strip,max_vertices=3*REPS) out;
uniform mat4 mat_P;
in vec4 vert_vg[3];
void main(void) {
    for (int rep=0;rep<REPS;++rep) {
        for (int i=0;i<3;++i) {
            vec4 vert = vert_vg[i];
            vert.xy += vec2(5.0*rep);
            gl_Position = mat_P*vert; EmitVertex();
        }
        EndPrimitive();
    }
}

//Fragment
#version 440
layout(rgba32f) coherent uniform image2D img0;
layout(r32ui) coherent uniform uimage2D img1;
void main(void) {
    ivec2 coord = ivec2(gl_FragCoord.xy);
    bool have_written = false;
    do {
        bool can_write = (imageAtomicExchange(img1,coord,1u)!=1u);
        if (can_write) {
            vec4 data = imageLoad(img0,coord);
            data.xyz += vec3(1.0,0.0,0.0);
            imageStore(img0,coord,data);
            memoryBarrier();
            imageAtomicExchange(img1,coord,0);
            have_written = true;
        }
    } while (!have_written);
    discard;
}

Shader 3:

//Vertex
#version 440
uniform mat4 mat_P;
in vec4 _vec_vert_a;
void main(void) {
    gl_Position = mat_P*_vec_vert_a;
}

//Fragment
#version 440
layout(rgba32f) coherent uniform image2D img0;
void main(void) {
    vec4 data = imageLoad(img0,ivec2(gl_FragCoord.xy));
    gl_FragData[0] = vec4(data.rgb/4.0, 1.0); //tonemap
}

Main Loop:

  1. Enable Shader 1
  2. Render fullscreen quad
  3. glMemoryBarrier(GL_ALL_BARRIER_BITS);
  4. Enable Shader 2
  5. Render two small triangles
  6. glMemoryBarrier(GL_ALL_BARRIER_BITS);
  7. Enable Shader 3
  8. Render fullscreen quad

Note that in steps 3 and 6 I think I could have used GL_SHADER_IMAGE_ACCESS_BARRIER_BIT instead. I'm using GL_ALL_BARRIER_BITS just to be conservative.

The visualized colors jitter over time and are mostly fairly small, which suggests that the increments are not actually atomic. Can someone sanity check this procedure? Am I missing anything?

EDIT: From this page, I found that using discard can make image load/store undefined in the fragment. I removed discards, but the problem still occurs. I also found layout(early_fragment_tests) in;, which forces early fragment tests (it didn't help either).

There are 1 best solutions below

Another related link:
https://www.opengl.org/discussion_boards/showthread.php/182715-Image-load-store-mutex-problem?p=1255935#post1255935

Some spin lock code that worked last time I tested it (granted a few years ago):
http://blog.icare3d.org/2010/07/opengl-40-abuffer-v20-linked-lists-of.html

Another implementation of the same application:
https://github.com/OpenGLInsights/OpenGLInsightsCode/blob/master/Chapter%2020%20Efficient%20Layered%20Fragment%20Buffer%20Techniques/lfbPages.glsl

In the links above, a canary is used, which was definitely important at the time and possibly still is. coherent is important, but you already have that. A few years ago memoryBarrier() simply wasn't implemented and did nothing. I hope that is no longer the case; however, it may be that the spin lock works just fine and the writes to img0 simply aren't ordered with respect to the reads that follow.
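
As a rough illustration of the canary idea, here is a sketch of the question's shader 2 fragment stage with a bounded spin. It assumes that "canary" means an iteration cap on the spin loop; MAX_SPINS and its value of 1000 are hypothetical and not taken from the linked code. This is untested and is not claimed to fix the problem:

```glsl
#version 440
// Hypothetical cap on spin iterations; tune or remove as needed.
#define MAX_SPINS 1000
layout(rgba32f) coherent uniform image2D img0;
layout(r32ui) coherent uniform uimage2D img1;
void main(void) {
    ivec2 coord = ivec2(gl_FragCoord.xy);
    bool have_written = false;
    // The canary bounds the loop so a lock that is never released
    // cannot spin forever and hang the GPU.
    for (int canary = 0; !have_written && canary < MAX_SPINS; ++canary) {
        // Acquire: the lock is ours only if its previous value was 0.
        if (imageAtomicExchange(img1, coord, 1u) == 0u) {
            vec4 data = imageLoad(img0, coord);
            data.xyz += vec3(1.0, 0.0, 0.0);
            imageStore(img0, coord, data);
            // Ask for the store to be visible before the lock is released.
            memoryBarrier();
            // Release the lock.
            imageAtomicExchange(img1, coord, 0u);
            have_written = true;
        }
    }
}
```

The trade-off is that a fragment which hits the cap silently drops its increment, so with a cap in place a too-dark result can mean either a lost update or an exhausted canary.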

GLSL compilers can be a bit buggy sometimes. Here are a few examples that show just how strange GLSL coding can get. The point is that trying several different ways of writing the same code can help. I've seen function calls used as conditions inside while loops simply fail. For all I know, do..while compiles very differently from a typical while. Combining as many conditions as possible into the loop condition can sometimes help too. Sometimes else break doesn't behave as expected, or doesn't let the compiler unroll certain loops, and you have to use the following:

for (...)
{
    if (...)
    {
        ...
        continue;
    }
    break;
}
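
One way that loop shape might be applied to the spin lock from the question is sketched below. The image names and the critical section are the question's; the waiting flag is introduced here for illustration only. The critical section stays inside the conditional, as in the original do..while, and break is reached only once the write has happened. Whether this form compiles any differently depends on the driver, so treat it as one more variant to try rather than a known fix:

```glsl
#version 440
layout(rgba32f) coherent uniform image2D img0;
layout(r32ui) coherent uniform uimage2D img1;
void main(void) {
    ivec2 coord = ivec2(gl_FragCoord.xy);
    // Illustrative flag: true until this invocation has done its write.
    bool waiting = true;
    for (;;) {
        if (waiting) {
            // Try to take the lock; a previous value of 0 means it was free.
            if (imageAtomicExchange(img1, coord, 1u) == 0u) {
                vec4 data = imageLoad(img0, coord);
                data.xyz += vec3(1.0, 0.0, 0.0);
                imageStore(img0, coord, data);
                memoryBarrier();
                // Release the lock and mark this invocation as done.
                imageAtomicExchange(img1, coord, 0u);
                waiting = false;
            }
            // Go around again: either to retry the lock or to fall
            // through to the break below.
            continue;
        }
        break;
    }
}
```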