Indirect Instances Help

Discussion in 'Shaders' started by VictorKs, Jan 5, 2022.

  1. VictorKs

    Joined: Jun 2, 2013
    Posts: 242
    So I implemented InstancingIndirect with a surface shader to render tens of thousands of meshes, but I have some questions.
    1) What is the best way to pass the transform matrix?
    2) I pass several pieces of data through buffers (a Matrix4x4 for the transform and some Vector4 values). Should I pack them into one buffer as a struct, or use several buffers?
    3) Can material properties be combined with this, or should I pass everything through the buffers?
    4) How should I approach efficient GPU frustum culling?
     
  2. bgolus

    Joined: Dec 7, 2012
    Posts: 12,352
    1) With as little data as possible. If you just need position, just pass the float3 position. If you need position and rotation, pass the float3 position and a float3 euler rotation (or maybe a float4 quaternion). If you need position, rotation, and uniform scale, pass a float4 with position and scale and a float3 euler rotation (or a float4 quaternion). If you need position, rotation, and non-uniform scale, pass just a 4x3 matrix (three float4 rows), as the fourth row will always be float4(0,0,0,1).

    2) Either is fine. One buffer might be easier to manage. Multiple buffers might be needed if you're drawing a lot of instances and hitting buffer size limits.

    3) If they're constant for all instances, use material properties. No need to waste space on redundant data.

    4)
     
    VictorKs likes this.
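    To illustrate point 1) above, here is a minimal sketch of the "float4 position + scale, float4 quaternion" packing. The thread contains no code, so this is written as CPU-side C++ that can be compiled and checked; the same arithmetic should port to HLSL more or less line for line. The names InstanceData, Mat4 and toMatrix are made up for this example, and the matrix is assumed to be row-major with column vectors (world = M * p).

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical per-instance record: two float4 values (32 bytes)
// instead of a full 4x4 matrix (64 bytes).
struct InstanceData {
    float px, py, pz, scale;  // xyz = position, w = uniform scale
    float qx, qy, qz, qw;     // unit quaternion rotation
};
static_assert(sizeof(InstanceData) == 32, "two float4 values per instance");

using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major, m[row][col]

// Rebuild the object-to-world matrix from the packed data.
Mat4 toMatrix(const InstanceData& d) {
    const float x = d.qx, y = d.qy, z = d.qz, w = d.qw, s = d.scale;
    // The rotation formula below is only valid for a unit quaternion.
    assert(std::fabs(x * x + y * y + z * z + w * w - 1.0f) < 1e-3f);
    Mat4 m{};
    m[0] = {s * (1 - 2 * (y * y + z * z)), s * (2 * (x * y - z * w)), s * (2 * (x * z + y * w)), d.px};
    m[1] = {s * (2 * (x * y + z * w)), s * (1 - 2 * (x * x + z * z)), s * (2 * (y * z - x * w)), d.py};
    m[2] = {s * (2 * (x * z - y * w)), s * (2 * (y * z + x * w)), s * (1 - 2 * (x * x + y * y)), d.pz};
    m[3] = {0.0f, 0.0f, 0.0f, 1.0f};  // constant last row, which is why a 4x3 matrix is enough
    return m;
}
```

    This halves the per-instance data compared with a Matrix4x4, at the cost of a few extra multiplies per vertex. Whether that trade is a win depends on the GPU and the instance count, so it is worth measuring.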
  3. bgolus

    Joined: Dec 7, 2012
    Posts: 12,352
    Some additional thoughts:

    You can get better performance on some GPUs if you ensure your buffer elements are 128-bit (16-byte) aligned. That means if you have a struct with a float3 and a float4, using two float4 values or adding a dummy extra float can be faster, even though you're sending more data.
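    As a rough sketch of the padding idea, the two layouts can be compared with compile-time checks. This is plain C++ standing in for the C# struct and its HLSL counterpart; Unpadded, Padded and paddedStride are hypothetical names, and the sizes assume 32-bit floats with no compiler-inserted padding.

```cpp
#include <cstddef>

static_assert(sizeof(float) == 4, "sketch assumes 32-bit floats");

// float3 + float4: 28 bytes, so consecutive elements straddle 16-byte boundaries.
struct Unpadded {
    float position[3];
    float color[4];
};

// float3 + dummy float + float4: 32 bytes, every element starts on a 16-byte boundary.
struct Padded {
    float position[3];
    float pad;       // unused, only there to round the stride up
    float color[4];
};

// Round a stride up to the next multiple of 16 bytes (128 bits).
constexpr std::size_t paddedStride(std::size_t bytes) {
    return (bytes + 15) / 16 * 16;
}

static_assert(sizeof(Unpadded) == 28, "unpadded stride");
static_assert(sizeof(Padded) == 32, "padded stride");
static_assert(offsetof(Padded, color) == 16, "float4 member starts on a 16-byte boundary");
static_assert(paddedStride(sizeof(Unpadded)) == sizeof(Padded), "padding rounds 28 up to 32");
```

    On the Unity side, the stride passed to the ComputeBuffer constructor has to match whichever struct size is actually uploaded, and the struct declared in the shader has to use the same layout.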

    Ideally frustum culling should happen before the vertex shader runs, though you can still get some benefit from doing it in the vertex shader if most of your instances are off screen. It's usually better to do it beforehand in a compute shader, or even on the CPU. You can either construct a new buffer that only has the data for the visible instances, or create a buffer that is a list of the visible indices (which is used to access the matrix buffer rather than the instance ID itself), or add an extra bit to the buffer data that you check when running the vertex shader to exit early and output an all-zero vertex position. There isn't really any special trick to the actual frustum culling itself, though: test bounding spheres or boxes against the 6 frustum planes, which in itself is tricky. There are a ton of tutorials out there on the topic; try one and see if it's faster than not doing it at all.
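    A minimal CPU-side sketch of the "list of visible indices" option, testing bounding spheres against six planes. It is plain C++ rather than a compute shader, and Plane, Sphere, sphereVisible and visibleIndices are hypothetical names. It assumes plane normals are unit length and point into the frustum, so a point is inside a plane when dot(n, p) + d >= 0.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <vector>

// Plane in the form dot(n, p) + d = 0, with the unit normal pointing into the frustum.
struct Plane  { float nx, ny, nz, d; };
struct Sphere { float cx, cy, cz, radius; };

// A sphere is culled only if it lies entirely behind at least one plane.
bool sphereVisible(const std::array<Plane, 6>& frustum, const Sphere& s) {
    assert(s.radius >= 0.0f);
    for (const Plane& p : frustum) {
        const float dist = p.nx * s.cx + p.ny * s.cy + p.nz * s.cz + p.d;
        if (dist < -s.radius) return false;
    }
    return true;
}

// Build the list of instance indices that survive culling. In the vertex
// shader this list would be indexed by the instance ID, and the result used
// to look up the matrix buffer.
std::vector<std::uint32_t> visibleIndices(const std::array<Plane, 6>& frustum,
                                          const std::vector<Sphere>& bounds) {
    std::vector<std::uint32_t> visible;
    for (std::uint32_t i = 0; i < bounds.size(); ++i) {
        if (sphereVisible(frustum, bounds[i])) visible.push_back(i);
    }
    return visible;
}
```

    The test is conservative: a sphere near a frustum corner can pass all six plane checks while still being off screen, which costs a little wasted work but never culls something visible. The instance count in the indirect arguments buffer would then need to match the size of the visible list.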
     
    VictorKs likes this.
  4. JSmithIR

    Joined: Apr 13, 2023
    Posts: 113
    Do you know how to use Unity's built-in shader variables such as unity_WorldToObject with indirect instanced objects? I am using the same approach as the OP, passing in transformation matrices via a ComputeBuffer. There seems to be no data in these built-in float4x4 variables, though. Using them does absolutely nothing with instanced meshes.
     
  5. bgolus

    Joined: Dec 7, 2012
    Posts: 12,352
    For indirect instanced objects, the built-in matrices are effectively blank, because with indirect instancing it's your job to handle the transforms. If you want the world-to-object transforms, you need to provide them yourself, either as another buffer or, ideally, by calculating the inverse in the shader. If the scale is uniform (x, y and z scale are the same), you can use the transposed matrix for the rotation and scale, divided by the squared scale so that the scaling is undone rather than applied twice. If the scale is non-uniform, Unity's Particle Instancing shader has some example code.

    https://github.com/TwoTailsGames/Un...des/UnityStandardParticleInstancing.cginc#L27
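    For the uniform-scale case, the transpose trick can be sketched as follows. This is CPU-side C++ rather than shader code, and it is not taken from the linked Unity file; inverseUniform and transformPoint are hypothetical names. It assumes a row-major matrix with column vectors (world = M * p) whose upper 3x3 is a rotation times a single scale factor s, so that the inverse of the 3x3 is its transpose divided by s squared.

```cpp
#include <array>
#include <cassert>

using Mat4 = std::array<std::array<float, 4>, 4>;  // row-major, m[row][col]
using Vec3 = std::array<float, 3>;

// Invert an object-to-world matrix whose upper 3x3 is rotation * uniform scale.
Mat4 inverseUniform(const Mat4& m) {
    // Squared scale = squared length of the first column of the upper 3x3.
    const float s2 = m[0][0] * m[0][0] + m[1][0] * m[1][0] + m[2][0] * m[2][0];
    assert(s2 > 0.0f);
    Mat4 inv{};
    for (int r = 0; r < 3; ++r) {
        // Transposed 3x3, divided by s^2 to undo the scale.
        for (int c = 0; c < 3; ++c) inv[r][c] = m[c][r] / s2;
        // Inverse translation: -(inverse 3x3) * original translation.
        inv[r][3] = -(inv[r][0] * m[0][3] + inv[r][1] * m[1][3] + inv[r][2] * m[2][3]);
    }
    inv[3] = {0.0f, 0.0f, 0.0f, 1.0f};
    return inv;
}

// Transform a point (w = 1) by a 4x4 matrix with a (0,0,0,1) last row.
Vec3 transformPoint(const Mat4& m, const Vec3& p) {
    Vec3 out{};
    for (int r = 0; r < 3; ++r) {
        out[r] = m[r][0] * p[0] + m[r][1] * p[1] + m[r][2] * p[2] + m[r][3];
    }
    return out;
}
```

    Transforming a point to world space with the original matrix and back with the result should return the starting point. With non-uniform scale the shortcut no longer holds and a general inverse is needed.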
     
    JSmithIR likes this.