What's the best way to share data among arbitrary numbers of instances in OpenGL (ES)?


In my app, I have a transform hierarchy of objects. All the objects are instances of a single object, and there might be a large number of them. Any group in the hierarchy can have any number of instances or sub-groups.

Aside from the transforms, the instances can potentially be batched into a single draw call, and per-instance attributes can live in an attribute array with an instance stride.

I would like to specify the transforms for the objects in an efficient way, but I'm not sure of the best approach. Here's what I've considered:

  • One matrix per instance, in an attribute array with a divisor of 1.
    • This seems wasteful. If I have (say) 1000 objects in a single group, then I'll specify the same matrix 1000 times. When the parent transform changes, I'll also have to go update 1000 attributes instead of 1.
  • Break into multiple draw calls, one for each leaf group, and specify the transform as a uniform at the start of each call.
    • This is less wasteful, but seems like it might be slower due to a larger number of draw calls. Groups with small numbers of instances might be inefficient due to poor batching.
    • This isn't possible in OpenGL ES anyway (WebGL specifically), since I'd need to either (a) offset the starting instance for each call, which AFAICT requires a call to glDrawElementsInstancedBaseInstance(), or (b) build entirely separate attribute arrays for each group in the hierarchy, which also has more overhead + complexity.
  • Use a uniform array of transforms, and specify a per-instance index into it
    • This seems untenable because it looks like the maximum size of uniform arrays is very small.
  • Use a uniform buffer object to hold an array of transforms, and specify a per-instance index into the array
    • This seems better, though it still looks like the maximum array size isn't very large
  • Use an SSBO to do the above
    • Looks like this isn't available in OpenGL ES 3.0.
  • Something else?

What would be considered best practice for this task? Is there perhaps a method I haven't considered?

Answer by solidpixel:

> If I have (say) 1000 objects in a single group, then I'll specify the same matrix 1000 times.

If you are repeating the same matrix you already have problems. Ultimately you need a unique matrix per instance because you need each instance to be transformed to a unique location on screen.

> When the parent transform changes, I'll also have to go update 1000 attributes instead of 1.

Yes, so? Matrices are small and very cheap to update.

It's a lot more efficient to update 1 matrix per instance on the CPU than perform a redundant tree of matrix fetches and calculations per vertex on the GPU.
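As a rough illustration of the CPU-side update described above, here is a minimal sketch that walks the hierarchy once and writes one flattened world matrix per instance into a single array. The `GroupNode` shape and all function names are hypothetical, and matrices are assumed to be column-major `Float32Array`s of 16 floats.

```typescript
// Column-major 4x4 matrix stored as 16 floats, the layout WebGL expects.
type Mat4 = Float32Array;

// Hypothetical hierarchy node; the field names are illustrative only.
interface GroupNode {
  local: Mat4;           // transform relative to the parent group
  children: GroupNode[]; // sub-groups
  instances: Mat4[];     // local transforms of instances directly in this group
}

function identity(): Mat4 {
  const m = new Float32Array(16);
  m[0] = m[5] = m[10] = m[15] = 1;
  return m;
}

function translation(x: number, y: number, z: number): Mat4 {
  const m = identity();
  m[12] = x; m[13] = y; m[14] = z;
  return m;
}

// out = a * b (column-major). `out` must not alias `a` or `b`.
function multiply(a: Mat4, b: Mat4, out: Mat4 = new Float32Array(16)): Mat4 {
  for (let col = 0; col < 4; col++) {
    for (let row = 0; row < 4; row++) {
      let sum = 0;
      for (let k = 0; k < 4; k++) {
        sum += a[k * 4 + row] * b[col * 4 + k];
      }
      out[col * 4 + row] = sum;
    }
  }
  return out;
}

function countInstances(node: GroupNode): number {
  let n = node.instances.length;
  for (const child of node.children) n += countInstances(child);
  return n;
}

// Depth-first walk: each group's world matrix is computed once, then
// combined with every instance in that group. Returns instanceCount * 16 floats.
function flattenWorldMatrices(root: GroupNode): Float32Array {
  const out = new Float32Array(countInstances(root) * 16);
  let next = 0;
  const visit = (node: GroupNode, parentWorld: Mat4): void => {
    const world = multiply(parentWorld, node.local);
    for (const inst of node.instances) {
      multiply(world, inst, out.subarray(next * 16, next * 16 + 16));
      next++;
    }
    for (const child of node.children) visit(child, world);
  };
  visit(root, identity());
  return out;
}
```

The resulting array can be uploaded in one buffer update per frame. Only subtrees whose transforms changed actually need recomputing, but that bookkeeping is left out of this sketch.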

> One matrix per instance, in an attribute array with a divisor of 1

If possible, I'd try to avoid attributes and instance divisors for per-instance data. On a lot of GPUs this forces per-vertex refetches, because the fact that the attribute has a per-instance divisor isn't known at shader compile time.
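For reference when benchmarking, the attribute-based layout under discussion looks roughly like the sketch below. A `mat4` attribute occupies four consecutive attribute locations, one `vec4` column each. The `InstancingGL` interface is a hypothetical minimal subset of `WebGL2RenderingContext` so the snippet stands alone, and it assumes the buffer holding tightly packed matrices is already bound to `ARRAY_BUFFER`.

```typescript
// Hypothetical minimal subset of WebGL2RenderingContext used by this sketch.
interface InstancingGL {
  readonly FLOAT: number;
  enableVertexAttribArray(index: number): void;
  vertexAttribPointer(
    index: number, size: number, type: number,
    normalized: boolean, stride: number, offset: number): void;
  vertexAttribDivisor(index: number, divisor: number): void;
}

const MATRIX_STRIDE_BYTES = 64; // 16 floats per matrix
const COLUMN_BYTES = 16;        // one vec4 column

// Configures a per-instance mat4 attribute starting at `baseLocation`.
// Assumes the instance matrix buffer is currently bound to ARRAY_BUFFER.
function bindInstanceMatrixAttribute(gl: InstancingGL, baseLocation: number): void {
  for (let col = 0; col < 4; col++) {
    const location = baseLocation + col;
    gl.enableVertexAttribArray(location);
    gl.vertexAttribPointer(
      location, 4, gl.FLOAT, false, MATRIX_STRIDE_BYTES, col * COLUMN_BYTES);
    // Divisor 1: advance to the next matrix once per instance, not per vertex.
    gl.vertexAttribDivisor(location, 1);
  }
}
```

A real `WebGL2RenderingContext` can be passed directly as `gl`.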

If possible, store what you need in an instance-count-sized array in a uniform buffer or storage buffer, and index into that array using gl_InstanceID to fetch the matrix you need. UBO limits can be a problem, so if you need a lot of objects you'll need to partition the draws into chunks. (That said, benchmark this; it's going to be vendor-dependent.)
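A minimal sketch of the chunked uniform-buffer approach, under a few assumptions: the block holds only `mat4` model matrices, the limits come from `gl.getParameter(gl.MAX_UNIFORM_BLOCK_SIZE)` and `gl.getParameter(gl.UNIFORM_BUFFER_OFFSET_ALIGNMENT)`, the alignment is a power of two, and the buffer is padded to a whole number of chunks so every bound range covers the full block. `GL2Subset`, the uniform names, and the helper functions are all hypothetical.

```typescript
const MAT4_BYTES = 64; // a std140 mat4 is four vec4 columns of 16 bytes each

// Vertex shader that fetches its matrix from a uniform block by instance ID.
function vertexShaderSource(maxInstances: number): string {
  return `#version 300 es
#define MAX_INSTANCES ${maxInstances}
layout(std140) uniform InstanceBlock {
  mat4 uModel[MAX_INSTANCES];
};
uniform mat4 uViewProj;
in vec3 aPosition;
void main() {
  // gl_InstanceID restarts at 0 for each draw call, so it indexes the bound chunk.
  gl_Position = uViewProj * uModel[gl_InstanceID] * vec4(aPosition, 1.0);
}
`;
}

// Largest matrix count per chunk whose byte size is a multiple of the offset alignment.
function instancesPerChunk(maxBlockBytes: number, offsetAlignment: number): number {
  const raw = Math.floor(maxBlockBytes / MAT4_BYTES);
  const step = Math.max(1, Math.ceil(offsetAlignment / MAT4_BYTES));
  return Math.floor(raw / step) * step;
}

interface Chunk { byteOffset: number; count: number; }

// Splits the instances into draws. The uniform buffer should be allocated with
// chunks.length * perChunk * MAT4_BYTES bytes so the last range is full-sized.
function planChunks(totalInstances: number, perChunk: number): Chunk[] {
  if (perChunk < 1) throw new Error("uniform block too small for one matrix");
  const chunks: Chunk[] = [];
  for (let first = 0; first < totalInstances; first += perChunk) {
    chunks.push({
      byteOffset: first * MAT4_BYTES,
      count: Math.min(perChunk, totalInstances - first),
    });
  }
  return chunks;
}

// Hypothetical minimal subset of WebGL2RenderingContext used by this sketch.
interface GL2Subset {
  readonly UNIFORM_BUFFER: number;
  readonly TRIANGLES: number;
  readonly UNSIGNED_SHORT: number;
  bindBufferRange(target: number, index: number, buffer: unknown,
                  offset: number, size: number): void;
  drawElementsInstanced(mode: number, count: number, type: number,
                        offset: number, instanceCount: number): void;
}

// One instanced draw per chunk; always binds a full-sized range of the padded buffer.
function drawInChunks(gl: GL2Subset, instanceBuffer: unknown, bindingIndex: number,
                      indexCount: number, perChunk: number, chunks: Chunk[]): void {
  for (const chunk of chunks) {
    gl.bindBufferRange(gl.UNIFORM_BUFFER, bindingIndex, instanceBuffer,
                       chunk.byteOffset, perChunk * MAT4_BYTES);
    gl.drawElementsInstanced(gl.TRIANGLES, indexCount, gl.UNSIGNED_SHORT, 0, chunk.count);
  }
}
```

With the OpenGL ES 3.0 minimum block size of 16384 bytes, each chunk holds 256 matrices, so 1000 instances would take four draws. Whether this beats the attribute-divisor layout is something to measure on the target hardware.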