Question Buffers and instancing

Discussion in 'Shaders' started by VictorKs, Dec 13, 2022.

  1. VictorKs

    VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
I've been reading the Scripting API to get a better understanding of drawing and buffers, but some things are a little confusing.

1) What is the difference between a GraphicsBuffer and a ComputeBuffer?

2) Can we use indirect rendering without compute shader support? I mean, can't we pass the data with a constant buffer?

3) Why are we limited to 1023 instances with DrawMeshInstanced()? I guess it must be the 64KB constant buffer limit for the object-to-world matrix of each instance. But is it possible to pass transform data in a custom CBuffer? In my case I want uniform scale and rotation only on the Z axis, so I could simplify the data sent and get more instances into a batch.
    EDIT: Microsoft's docs say we should use a second vertex buffer for instanced data. But I guess 1023 is good enough, and the Unity devs must have their reasons.

4) What is MaterialPropertyBlock under the hood? I guess it is a class which sends buffers per instance; am I wrong? Also, does a MaterialPropertyBlock itself have memory limits?

5) I noticed that we can send textures to the shader via the MaterialPropertyBlock, so that way we can customize textures per instance. But I can't understand how these textures get mapped per instance. I mean, if I send colors through the property block, they get packed into a float4 array inside the shader and I can fetch them through the instance ID.

6) What is the instancing support on mobile platforms (OpenGL ES 3+)? I guess compute buffers are out of the question.
     
    Last edited: Dec 13, 2022
  2. burningmime

    burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
All ComputeBuffers are GraphicsBuffers, but not all GraphicsBuffers are ComputeBuffers. A GraphicsBuffer can also be a vertex or index buffer.

    AFAIK, all platforms that support indirect also support compute buffers.

    It's not a Unity limitation; 1023 is the maximum supported by DX10. You'd have to ask Microsoft/NVIDIA back in 2004 or whenever they were coming up with that spec. Back then, the hardware probably packed instance IDs into 10 bits somewhere. You can draw a lot more than that with indirect, though.
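    If you need more than 1023 without going indirect, you can also just split the draw into chunks. A minimal sketch (the mesh/material fields are assumptions, and the material needs "Enable GPU Instancing" checked):

    ```csharp
    using System.Collections.Generic;
    using UnityEngine;

    public class ChunkedInstancing : MonoBehaviour
    {
        public Mesh mesh;                  // assumed assigned in the inspector
        public Material material;          // needs "Enable GPU Instancing" on
        public List<Matrix4x4> transforms = new List<Matrix4x4>();

        const int kMaxPerBatch = 1023;     // per-call limit for DrawMeshInstanced

        void Update()
        {
            var batch = new Matrix4x4[kMaxPerBatch];
            for (int i = 0; i < transforms.Count; i += kMaxPerBatch)
            {
                // Copy up to 1023 matrices and issue one instanced draw per chunk.
                int count = Mathf.Min(kMaxPerBatch, transforms.Count - i);
                transforms.CopyTo(i, batch, 0, count);
                Graphics.DrawMeshInstanced(mesh, 0, material, batch, count);
            }
        }
    }
    ```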

I think it's basically just a bunch of key-value pairs. It's a Unity abstraction, not part of any graphics API. You can set variables from multiple different CBuffers with it (e.g. something in the global scope and something else in UnityPerMaterial or whatever).
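    As a sketch of that usage (the property names here are assumptions about what the shader declares):

    ```csharp
    using UnityEngine;

    public class PropBlockExample : MonoBehaviour
    {
        public Renderer target; // assumed assigned in the inspector

        void Start()
        {
            // A MaterialPropertyBlock is a CPU-side bag of name->value pairs;
            // Unity routes each value to whichever CBuffer the shader declares it in.
            var props = new MaterialPropertyBlock();
            props.SetColor("_BaseColor", Color.red); // assumed shader property
            props.SetFloat("_Wobble", 0.5f);         // assumed shader property
            target.SetPropertyBlock(props);          // per-renderer override, no material copy
        }
    }
    ```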

    You'll need to use a texture array. Give each instance an index of which texture slice to use.
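    A sketch of that approach, assuming a shader that samples a texture array by a per-instance `_SliceIndex` property (both names are assumptions):

    ```csharp
    using UnityEngine;

    public class TextureArrayInstancing : MonoBehaviour
    {
        public Texture2D[] sources;   // must all share size and format
        public Renderer[] instances;

        void Start()
        {
            // Pack the source textures into one Texture2DArray.
            var array = new Texture2DArray(sources[0].width, sources[0].height,
                                           sources.Length, sources[0].format, false);
            for (int i = 0; i < sources.Length; i++)
                Graphics.CopyTexture(sources[i], 0, 0, array, i, 0);

            // Give each instance the index of its slice via a property block.
            for (int i = 0; i < instances.Length; i++)
            {
                var props = new MaterialPropertyBlock();
                props.SetFloat("_SliceIndex", i % sources.Length); // assumed property
                instances[i].SetPropertyBlock(props);
                instances[i].sharedMaterial.SetTexture("_Textures", array); // assumed property
            }
        }
    }
    ```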

GLES 3.1+ requires support for both of those, AFAIK, and Vulkan/Metal certainly do. I don't know about 3.0. You can check at runtime...

    https://docs.unity3d.com/ScriptReference/SystemInfo-supportsInstancing.html
    https://docs.unity3d.com/ScriptReference/SystemInfo-supportsComputeShaders.html
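    e.g. a tiny runtime check before choosing a code path:

    ```csharp
    using UnityEngine;

    public static class InstancingSupport
    {
        // Pick the fanciest path the device supports; fall back otherwise.
        public static bool CanInstance   => SystemInfo.supportsInstancing;
        public static bool CanGoIndirect => SystemInfo.supportsComputeShaders;
    }
    ```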
     
    VictorKs likes this.
  3. VictorKs

    VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
1) Seems like the ComputeBuffer class will eventually be deprecated, then, if the GraphicsBuffer class can cover compute buffers.
    2) Yes, I read Instancing.cginc and now have a better understanding. I now realize that the point of indirect drawing is to source the draw arguments from compute buffers, so draws can be driven directly from the GPU. I was confusing DrawMeshInstancedProcedural with Indirect.
    3) Turns out it is actually the constant buffer that imposes the limit. But by assuming uniform scale we can get 1023; without it, only 512 is possible on DX11.

Yes, I've been using that for years, actually. I was just intrigued that we could set textures through MaterialPropertyBlocks, but I guess they can't be instanced without a texture array.

So after reading Instancing.cginc I realized that there are two instancing types:
    *DrawMeshInstanced just sends all the world matrices of the instances in a CBuffer and sets up an instance ID.
    *DrawMeshInstancedProcedural allows us to send a custom buffer ourselves and also set up the data ourselves, in our own procedure inside the vertex shader.
    So Instanced is actually a preconfigured version of InstancedProcedural. So I guess I can send StructuredBuffers with InstancedProcedural and bypass the 1023 limit.
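    A sketch of that procedural path, assuming the shader declares a `StructuredBuffer<float4x4> _Transforms` (an assumed name) and indexes it with the instance ID:

    ```csharp
    using UnityEngine;

    public class ProceduralInstancing : MonoBehaviour
    {
        public Mesh mesh;
        public Material material;   // shader must read _Transforms[instanceID] itself
        public int count = 100000;  // well past the 1023 limit

        ComputeBuffer transforms;

        void Start()
        {
            transforms = new ComputeBuffer(count, 16 * sizeof(float)); // one float4x4 each
            var data = new Matrix4x4[count];
            for (int i = 0; i < count; i++)
                data[i] = Matrix4x4.Translate(Random.insideUnitSphere * 100f);
            transforms.SetData(data);
            material.SetBuffer("_Transforms", transforms);
        }

        void Update()
        {
            // Bounds are needed because Unity can't cull what it can't see on the CPU.
            var bounds = new Bounds(Vector3.zero, Vector3.one * 200f);
            Graphics.DrawMeshInstancedProcedural(mesh, 0, material, bounds, count);
        }

        void OnDestroy() => transforms?.Release();
    }
    ```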

And then there is indirect instancing, which uses compute buffers to control positions without CPU->GPU data flow on each draw.

But can anyone explain why we need both InstancedIndirect and InstancedProcedural? I mean, can't we achieve the same effect with either; couldn't I just send a buffer with args to Procedural? And how are they implemented behind the scenes?
     
  4. Neto_Kokku

    Neto_Kokku

    Joined:
    Feb 15, 2018
    Posts:
    1,751
The *Procedural functions are for rendering without a vertex buffer (and optionally without an index buffer at all). They're for when you can generate vertex positions, UVs, etc. procedurally, directly in the shader, or by using custom logic for reading from buffers/textures in the shader itself. For example, it's trivial to build a quad based solely on the vertex index, without reading from an index or vertex buffer.
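    That quad trick can be sketched in HLSL like this (drawn with six vertices and nothing bound; this is an illustrative vertex function, not Unity library code):

    ```hlsl
    // Builds a unit quad purely from SV_VertexID: two triangles, six vertices.
    float4 QuadVert(uint id : SV_VertexID) : SV_Position
    {
        // Corner lookup for triangles (0,0)(1,0)(0,1) and (1,0)(1,1)(0,1).
        const float2 corners[6] = { float2(0,0), float2(1,0), float2(0,1),
                                    float2(1,0), float2(1,1), float2(0,1) };
        float2 uv = corners[id];
        return float4(uv * 2 - 1, 0, 1); // map [0,1] UVs to clip space
    }
    ```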

The *Indirect functions read the draw arguments (number of instances, number of indices, etc.) from a compute/graphics buffer. This allows these parameters to be filled in on the GPU by a compute shader. For example, you can have a compute shader that determines the visible instances, writes their transforms to a buffer, then writes the number of visible instances into the indirect arguments.
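    A sketch of the CPU side of that culling pattern; the compute kernel name and buffer names are assumptions:

    ```csharp
    using UnityEngine;

    public class IndirectCulling : MonoBehaviour
    {
        public Mesh mesh;
        public Material material;      // assumed to read _VisibleTransforms[instanceID]
        public ComputeShader culling;  // assumed kernel "CSMain" appends visible transforms
        public int maxInstances = 100000;

        ComputeBuffer visible;
        ComputeBuffer args;

        void Start()
        {
            visible = new ComputeBuffer(maxInstances, 16 * sizeof(float),
                                        ComputeBufferType.Append);

            // Indirect args layout: index count, instance count, start index,
            // base vertex, start instance.
            args = new ComputeBuffer(5, sizeof(uint), ComputeBufferType.IndirectArguments);
            args.SetData(new uint[] { mesh.GetIndexCount(0), 0, mesh.GetIndexStart(0),
                                      mesh.GetBaseVertex(0), 0 });
        }

        void Update()
        {
            int kernel = culling.FindKernel("CSMain");
            visible.SetCounterValue(0);
            culling.SetBuffer(kernel, "_VisibleTransforms", visible);
            culling.Dispatch(kernel, maxInstances / 64, 1, 1);

            // Copy the append counter into slot 1 (instance count) of the args buffer.
            ComputeBuffer.CopyCount(visible, args, sizeof(uint));

            material.SetBuffer("_VisibleTransforms", visible);
            var bounds = new Bounds(Vector3.zero, Vector3.one * 1000f);
            Graphics.DrawMeshInstancedIndirect(mesh, 0, material, bounds, args);
        }

        void OnDestroy() { visible?.Release(); args?.Release(); }
    }
    ```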
     
    VictorKs likes this.
  5. VictorKs

    VictorKs

    Joined:
    Jun 2, 2013
    Posts:
    242
I thought so at first, but after looking at Instancing.cginc and the API I realized that DrawMeshInstancedProcedural uses the vertex buffer from the Mesh parameter. It is different from DrawProcedural and DrawProceduralIndirect.

The naming is confusing and so is the implementation. DrawMeshInstancedProcedural looks more like the generic version of instancing: Mesh means vertex/index buffers, Instanced means instanced draw calls, and Procedural means that we set up the instancing data inside a procedure in the shaders. Can someone who understands their implementation explain it a little more?
     
  6. opaqe

    opaqe

    Joined:
    Nov 22, 2014
    Posts:
    8
Bumping this. I have similar confusion now that I'm getting into the GPU instancing options.

What I would like to know specifically is: what approach do I need if I want to avoid GPU readback?

For example: I want to draw a vector field, and I have a mesh asset for an arrow representing each vector. I initialize a structured buffer of Matrix4x4 transforms, one for each vector in my field. I set this buffer 1) on a compute shader that performs physics-based updates (say, magnitude decays to 0 over time) and 2) for GPU-instanced drawing, referenced in my shader code.

    I'd like for my updates in the compute shader to write directly into the GPU buffer and for that data to be read by the GPU instance drawing method without readback to the CPU.
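    A sketch of that setup, binding the same buffer to both the compute and drawing sides; the kernel name `Decay` and buffer name `_Vectors` are assumptions:

    ```csharp
    using UnityEngine;

    public class VectorField : MonoBehaviour
    {
        public Mesh arrowMesh;
        public Material arrowMaterial; // assumed to read _Vectors[instanceID] in the vertex shader
        public ComputeShader physics;  // assumed kernel "Decay" updates _Vectors in place
        public int count = 4096;

        ComputeBuffer vectors;

        void Start()
        {
            vectors = new ComputeBuffer(count, 16 * sizeof(float)); // one Matrix4x4 each
            // ... fill initial transforms here with vectors.SetData(...) ...

            // Bind the SAME buffer to both pipelines; the data never leaves the GPU.
            physics.SetBuffer(physics.FindKernel("Decay"), "_Vectors", vectors);
            arrowMaterial.SetBuffer("_Vectors", vectors);
        }

        void Update()
        {
            physics.Dispatch(physics.FindKernel("Decay"), count / 64, 1, 1);
            Graphics.DrawMeshInstancedProcedural(arrowMesh, 0, arrowMaterial,
                new Bounds(Vector3.zero, Vector3.one * 100f), count);
        }

        void OnDestroy() => vectors?.Release();
    }
    ```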
     
  7. burningmime

    burningmime

    Joined:
    Jan 25, 2014
    Posts:
    845
    I'm not sure what you mean. What happens on the GPU stays on the GPU unless you manually read it back. If you write stuff to a buffer and bind that same buffer to a different shader, it won't ever be copied to the CPU.
     
  8. opaqe

    opaqe

    Joined:
    Nov 22, 2014
    Posts:
    8
Thanks, that was helpful as a sanity check. I figured out my bug was related to something else.