Write to multiple 3D textures in a fragment shader (OpenGL)

I have a 3D texture where I write data and use it as voxels in the fragment shader in this way:

#extension GL_ARB_shader_image_size : enable
...
layout (binding = 0, rgba8) coherent uniform image3D volumeTexture;
...
void main() {
    vec4 fragmentColor = ...
    vec3 coords = ...
    imageStore(volumeTexture, ivec3(coords), fragmentColor);
}

The texture is defined in this way:

glGenTextures(1, &volumeTexture);
glBindTexture(GL_TEXTURE_3D, volumeTexture);
glTexImage3D(GL_TEXTURE_3D, 0, GL_RGBA8, volumeDimensions, volumeDimensions, volumeDimensions, 0, GL_RGBA, GL_UNSIGNED_BYTE, 0);

and I bind it like this when I need to use it:

glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_3D, volumeTexture);

My issue is that I would like a mipmapped version of this texture without using the OpenGL mipmap generation function, because I noticed that it is extremely slow. So I was thinking of writing to all levels of the 3D texture at the same time. For instance, if the maximum resolution is 512^3, then whenever I write a voxel VALUE into that 3D texture I would also write 0.125*VALUE to the corresponding voxel at 256^3, 0.015625*VALUE at 128^3, and so on. Since I am using imageStore, which uses atomicity, all values will be written, and with these weights I would automatically get the average value (not exactly like an interpolation, but I might get a pleasing result anyway). So my question is: what is the best way to have multiple 3D textures and write to all of them at the same time?
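
The weighting described above can be sketched in plain C to check the arithmetic. This is only an illustration: the helper names mip_weight and mip_coord are hypothetical, and the sketch assumes each coarser voxel covers 2x2x2 voxels of the level below. It says nothing about how the writes are combined on the GPU.

```c
#include <assert.h>

/* Hypothetical helper: weight applied to a level-0 voxel value when it is
   accumulated into mip level `level`. A voxel at that level covers
   8^level level-0 voxels, so the weight is 1 / 8^level. */
double mip_weight(int level) {
    assert(level >= 0 && level < 10);
    return 1.0 / (double)(1u << (3 * level));
}

/* Hypothetical helper: coordinate (along one axis) of the voxel at mip
   level `level` that contains level-0 coordinate `coord`. */
int mip_coord(int coord, int level) {
    assert(coord >= 0 && level >= 0 && level < 31);
    return coord >> level;
}
```

For a 512^3 base level, mip_weight(1) gives 0.125 and mip_weight(2) gives 0.015625, matching the factors in the question, and a voxel at x = 301 maps to x = 150 at 256^3 and x = 75 at 128^3.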

There are 1 best solutions below

I believe hardware mipmapping is about as fast as you'll get. I've always assumed attempting custom mipmapping would be slower given you have to bind and rasterize to each layer manually in turn. Atomics will give huge contention and it'll be amazingly slow. Even without atomics you'd be negating the nice O(log n) construction of mipmaps.

You have to be really careful with imageStore with regard to access order and cache behaviour. I'd start here and try some different indexing (e.g. row/column vs column/row).

You could try drawing to the texture the older way, by binding it to an FBO and drawing a full screen triangle (one big triangle that covers the viewport) with glDrawElementsInstanced, one instance per layer. In the geometry shader, set gl_Layer to the instance ID (gl_InstanceID is only available in the vertex shader, so pass it through from there). The rasterizer creates fragments for x/y and the layer gives z.

Lastly, 512^3 is simply a huge texture even by today's standards. Maybe find out your theoretical max GPU bandwidth to get an idea of how far away you are. For example, let's say your GPU can do 200GB/s. You'll probably only get 100 in a good case anyway. Your 512^3 texture is 512MB (at 4 bytes per RGBA8 texel), so you might be able to write to it in ~5ms (imo this seems awfully fast, maybe I made a mistake). Expect some overhead and latency from the rest of the pipeline, spawning and executing threads, etc. If you're writing complex stuff then memory bandwidth isn't the bottleneck and my estimation goes out the window. So try just writing zeroes first. Then try changing the coords xyz order.
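
The back-of-the-envelope estimate above can be checked with a few lines of C. This is just a sketch of the arithmetic: the function names are hypothetical, the 100 GB/s figure is the assumed "good case" from the paragraph above, and 1 GB is taken as 10^9 bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Size in bytes of a cubic volume texture with `dim` texels per side. */
uint64_t volume_bytes(uint64_t dim, uint64_t bytes_per_texel) {
    return dim * dim * dim * bytes_per_texel;
}

/* Lower bound, in milliseconds, on the time needed to write `bytes` once
   at a sustained bandwidth of `gb_per_s` (assumed: 1 GB = 1e9 bytes).
   Ignores all pipeline overhead and latency. */
double min_write_ms(uint64_t bytes, double gb_per_s) {
    assert(gb_per_s > 0.0);
    return (double)bytes / (gb_per_s * 1e9) * 1e3;
}
```

volume_bytes(512, 4) is 536870912 bytes (512 MiB), and min_write_ms of that at 100 GB/s comes out at roughly 5.4 ms, so the ~5ms figure holds as a best-case lower bound for a single pass over the texture.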


Update: Instead of using the fragment shader to create your threads, the vertex shader can be used, which in theory avoids rasterizer overhead, though I've seen cases where it doesn't perform as well. Call glEnable(GL_RASTERIZER_DISCARD), then glDrawArrays(GL_POINTS, 0, numThreads), and use gl_VertexID as your thread index.
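
If you go this route, each thread has to turn its linear index into a voxel coordinate. A minimal sketch of that mapping in C follows; the same integer arithmetic would apply to gl_VertexID in GLSL. The Voxel type and voxel_from_index are hypothetical names, and the layout (x varying fastest, then y, then z) is an assumption.

```c
#include <assert.h>

/* Hypothetical voxel coordinate type. */
typedef struct { int x, y, z; } Voxel;

/* Decode a linear thread index into a voxel coordinate inside a dim^3
   volume, with x varying fastest, then y, then z (assumed layout). */
Voxel voxel_from_index(int index, int dim) {
    assert(dim > 0 && index >= 0);
    Voxel v;
    v.x = index % dim;
    v.y = (index / dim) % dim;
    v.z = index / (dim * dim);
    return v;
}
```

Swapping which component takes the `index % dim` term is one way to try the different xyz orders mentioned above, since it changes which neighbouring threads touch neighbouring texels.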