
Discussion SRP Batcher not using GPU Instancing

Discussion in 'General Graphics' started by VirtualYeti, Jul 14, 2022.

  1. VirtualYeti

    Joined: Jul 14, 2022
    Posts: 3
    So far, I've created a very basic terrain with Gaia, and the result is a whopping 25k+ draw calls and an API call count of 250k, which is obviously way too high.

    Analyzing with RenderDoc showed that the draw calls come from the SRP Batcher, which always calls DrawIndexedInstanced with an instance count of 1 from inside DrawLoop#DrawSRPBatch.

    My research turned up the following:
    From https://docs.unity3d.com/Manual/GPUInstancing.html: The SRP Batcher takes priority over GPU instancing.
    From https://docs.unity3d.com/Manual/SRPBatcher.html#intentionally-removing-compatibility: If you want to render many identical meshes with the exact same material, GPU instancing can be more efficient than the SRP Batcher.

    Why doesn't the batcher use instancing automatically?
    I get that the batcher avoids the overhead of rebinding uniforms (or whatever SetPass does). But why doesn't the batcher itself render all the batched meshes using instancing? Is it because you typically have only a few meshes per batch and instanced rendering in general has more overhead than a regular draw call? (But then, why does it use the instanced API at all?)

    Asked differently: why is instancing worse than the SRP Batcher? Shouldn't the whole instanced call share the very same constants anyway? Or is it only that the SRP Batcher applies to more cases than regular instancing does? In this particular use case we could instance the batch's contents, but are there cases where the batcher can batch more geometry than instancing could?


    Besides that, how would I approach replacing the Terrain's trees (though they use MaterialPropertyBlocks and should bypass the SRP Batcher) and especially the details in a way that retains as much HDRP compatibility as possible?
    Basically, unchecking the "draw" checkbox on the Terrain component and writing my own component that uses the Graphics.DrawMeshInstanced and Graphics.DrawMeshInstancedIndirect APIs (the latter if I have compute shaders doing culling and maybe even LODing and mesh emitting)?
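
One practical detail of that approach: Graphics.DrawMeshInstanced accepts at most 1023 instances per call, so a custom renderer has to split its instance list into chunks. The sketch below is plain Python rather than Unity code and only illustrates the chunking arithmetic; `split_into_instanced_calls` and the placeholder transform list are hypothetical names, and the 25,000 count is an assumption that simply mirrors the draw call count mentioned above.

```python
# Per-call instance limit documented for Graphics.DrawMeshInstanced.
MAX_INSTANCES_PER_CALL = 1023


def split_into_instanced_calls(transforms, limit=MAX_INSTANCES_PER_CALL):
    """Split a flat list of per-instance transforms into chunks of at most `limit`.

    Hypothetical helper: each returned chunk stands for one instanced draw call.
    """
    if limit <= 0:
        raise ValueError("limit must be positive")
    return [transforms[i:i + limit] for i in range(0, len(transforms), limit)]


# Assumed scenario: 25,000 identical detail meshes sharing one material.
# Integers stand in for the real transform matrices.
transforms = list(range(25_000))
calls = split_into_instanced_calls(transforms)
print(len(calls), "instanced calls instead of", len(transforms), "single draws")
```

Under that assumption the per-frame submission count drops by roughly three orders of magnitude. Whether this translates into a real frame time gain still has to be measured on the target hardware.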

    Is that a big gain over using Unity's built-in GPU instancing in the mid-term, though? I wonder what else could reduce CPU overhead on the terrain specifically (i.e. going down to bare textures and some constant buffers containing tree positions).
     
  2. VirtualYeti

    Joined: Jul 14, 2022
    Posts: 3
    Actually, when using the DX11 backend instead of DX12, performance is good enough to keep using SRP batching despite the high draw count.
     
  3. Sluggy

    Joined: Nov 27, 2012
    Posts: 989
    Draw calls are a bit of an aging concept these days in general, and especially in the context of the SRP Batcher. SetPass calls are what you should be more concerned with. The best measure, though, is actual frame time on the target hardware. That's obviously not always possible, so SetPass calls plus a general comparison of relative performance changes on available hardware are probably your next best bet.
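
To make the SetPass versus draw call distinction concrete, here is a toy cost model in plain Python (not Unity code). It rests on simplifying assumptions: the SRP Batcher is modeled as one SetPass per run of consecutive renderers sharing a shader variant and one draw call per renderer, while GPU instancing is modeled as one SetPass and one draw call per distinct (material, mesh) pair. The `Renderer` tuple, both cost functions, and the scene contents are hypothetical.

```python
from collections import namedtuple

# Toy description of a renderer: only the keys that matter for batching.
Renderer = namedtuple("Renderer", ["shader_variant", "material", "mesh"])


def srp_batcher_cost(renderers):
    """Assumed model: a new SetPass only when the shader variant changes,
    but still one draw call per renderer."""
    set_pass = 0
    previous = None
    for r in renderers:
        if r.shader_variant != previous:
            set_pass += 1
            previous = r.shader_variant
    return {"set_pass": set_pass, "draw_calls": len(renderers)}


def instancing_cost(renderers):
    """Assumed model: renderers collapse into one instanced draw only when
    both material and mesh are identical."""
    groups = {(r.material, r.mesh) for r in renderers}
    return {"set_pass": len(groups), "draw_calls": len(groups)}


# Hypothetical scene: 1000 identical grass tufts, plus 100 rocks that share
# the shader variant but each use their own material.
grass = [Renderer("Lit", "GrassMat", "GrassMesh")] * 1000
rocks = [Renderer("Lit", f"RockMat{i}", f"RockMesh{i % 10}") for i in range(100)]
scene = grass + rocks

print("SRP Batcher:", srp_batcher_cost(scene))
print("Instancing: ", instancing_cost(scene))
```

In this model the batcher keeps SetPass calls low even across 101 different materials, which instancing cannot do, while instancing collapses the identical grass into a single draw. That matches the documentation's point that instancing can win for many identical meshes with the exact same material.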