Question: When to use which render/draw technique?

Discussion in 'Graphics Dev Blitz Day 2023 - Q&A' started by KamilVDono, May 24, 2023.

  1. KamilVDono

    Joined:
    Jun 2, 2016
    Posts:
    12
    Hey,

    Currently in Unity there are a few ways of rendering non-skinned meshes, namely:
    1. SRP batched mesh renderer
    2. Custom BatchRendererGroup
    3. ECS BatchRendererGroup
    4. DrawMeshInstancedIndirect
    5. DrawMeshInstanced
    So could you tell us, from your experience, when to use which method? Please provide some heuristics for which technique is most performant in which situation (if possible, split by CPU performance, GPU performance, RAM usage, and VRAM usage).

    Thanks in advance
     
  2. arnaud-carre

    Unity Technologies

    Joined:
    Jun 23, 2016
    Posts:
    97
    Hi,

    This is quite a broad question, but let's try to answer.

    1. SRP batched mesh renderer

    SRP Batcher is the most generic and versatile option. It is not the fastest one, but it doesn't require any effort and makes any scene run at a reasonable speed. It's enabled by default in URP and HDRP.

    2. Custom BatchRendererGroup

    A low-level API to submit DrawInstanced commands. This is probably the fastest way to render tons of meshes. You have to handle GPU memory allocation and upload yourself. You can also provide your own custom culling function. It requires a lot of work on your side.

    3. ECS BatchRendererGroup

    Also known as the "entities.graphics" package. This is the easiest way to get the BatchRendererGroup speed benefit without effort. The package handles GPU memory allocation and upload for you, and it also manages culling and draw command generation. Just create a sub-scene, put any mesh into it, and you get automatic rendering through BRG. Please note it's also the only way to render ECS entities.

    4. DrawMeshInstancedIndirect
    5. DrawMeshInstanced

    The historical way of doing explicit GPU instanced rendering in Unity. You have to provide a matrix array yourself, plus any custom per-instance properties through a MaterialPropertyBlock. This data has to be uploaded to GPU memory at each draw call.
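
    To make options 4/5 concrete, here is a minimal sketch of the matrix array plus MaterialPropertyBlock pattern using Graphics.DrawMeshInstanced. The class name, field names, and the "_Color" property are assumptions for illustration only: your shader must actually declare that property as a per-instance property, and the material needs GPU instancing enabled. It is not taken from the posts above.

    ```csharp
    using UnityEngine;

    // Hypothetical example component; names are illustrative, not from the thread.
    public class InstancedDrawExample : MonoBehaviour
    {
        public Mesh mesh;                 // mesh to draw many times
        public Material material;         // assumed to have GPU instancing enabled
        public int instanceCount = 500;   // DrawMeshInstanced accepts at most 1023 per call

        private Matrix4x4[] matrices;
        private MaterialPropertyBlock properties;

        void Start()
        {
            instanceCount = Mathf.Clamp(instanceCount, 1, 1023);
            matrices = new Matrix4x4[instanceCount];
            var colors = new Vector4[instanceCount];

            for (int i = 0; i < instanceCount; i++)
            {
                // One object-to-world matrix per instance.
                Vector3 position = Random.insideUnitSphere * 20f;
                matrices[i] = Matrix4x4.TRS(position, Quaternion.identity, Vector3.one);

                // One custom per-instance value (assumed shader property "_Color").
                colors[i] = new Vector4(Random.value, Random.value, Random.value, 1f);
            }

            properties = new MaterialPropertyBlock();
            properties.SetVectorArray("_Color", colors);
        }

        void Update()
        {
            // The call must be issued every frame; the arrays are sent to the GPU each time.
            Graphics.DrawMeshInstanced(mesh, 0, material, matrices, instanceCount, properties);
        }
    }
    ```

    Note that the draw has to be re-issued in every Update, which is where the per-call upload cost mentioned above comes from.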

    Hope it helps
     
  3. sqallpl

    Joined:
    Oct 22, 2013
    Posts:
    383
    @arnaud-carre ,

    What about RenderMesh functions like RenderMeshInstanced or RenderMeshIndirect?

    Thanks.
     
  4. KamilVDono

    Joined:
    Jun 2, 2016
    Posts:
    12
    Hey @arnaud-carre

    So 2/3 should always be used as newer replacements for 4/5, yes? Even if that's not true, let's assume it for the questions below.

    Also, I was hoping for some heuristic for when it's more performant to use 2/3 instead of 1.
    I'm not sure whether, for example, a mesh with 10 instances in a scene should be "promoted" to batch rendering.
    Or maybe it's not efficient to draw fewer than X instances with batch rendering because the memory overhead will outweigh the CPU benefit of a single draw call, and so on.

    And lastly, about 2 and 3: do you think a custom BRG would be a few percent faster, or a few hundred percent?

    Thanks
     
  5. arnaud-carre

    Unity Technologies

    Joined:
    Jun 23, 2016
    Posts:
    97
    2/3 are always faster than 4/5 because the instance data stays persistent in GPU memory (no upload required at each draw). But 2 needs more work (you have to deal with low-level details yourself, such as GPU memory allocation and upload), and 3 requires an ECS project.
    So there are still use cases where you'd prefer the simplicity of 4 or 5.

    Regarding the speed of SRP Batcher vs BRG, it really depends on the scene content. As a rule of thumb, BRG is always faster, or runs at the same speed in the worst-case scenario (the worst case being a single instance per BRG DrawInstanced). If your scene content is highly instanceable, BRG is way faster than SRP Batcher.

    It's hard to give a number of instances at which it's better to "promote" a mesh to BRG. I would say 2 or more instances of a mesh in a view is already a win (CPU performance-wise). And even a single instance will probably run at the same speed as SRP Batcher.
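
    One rough way to see how "instanceable" a scene is, sketched under assumptions: count how many MeshRenderers share the same mesh and material, and list the pairs that reach the "2 or more" rule of thumb. The class and method names below are hypothetical, and the count covers the whole loaded scene rather than a single view, so treat the output only as a starting point for profiling.

    ```csharp
    using System.Collections.Generic;
    using UnityEngine;

    // Hypothetical helper; not part of any Unity package.
    public static class InstanceCountReport
    {
        // Counts active MeshRenderers per (mesh, material) pair in the loaded scenes.
        public static Dictionary<(Mesh, Material), int> CountInstances()
        {
            var counts = new Dictionary<(Mesh, Material), int>();
            // Newer Unity versions offer FindObjectsByType as the replacement for this call.
            foreach (MeshRenderer renderer in Object.FindObjectsOfType<MeshRenderer>())
            {
                MeshFilter filter = renderer.GetComponent<MeshFilter>();
                if (filter == null || filter.sharedMesh == null) continue;

                var key = (filter.sharedMesh, renderer.sharedMaterial);
                counts.TryGetValue(key, out int current); // current is 0 when the key is new
                counts[key] = current + 1;
            }
            return counts;
        }

        // Logs every pair that appears at least minInstances times.
        public static void LogCandidates(int minInstances = 2)
        {
            foreach (var pair in CountInstances())
            {
                if (pair.Value < minInstances) continue;
                Material material = pair.Key.Item2;
                string materialName = material != null ? material.name : "(no material)";
                Debug.Log($"{pair.Key.Item1.name} / {materialName}: {pair.Value} instances");
            }
        }
    }
    ```

    Calling InstanceCountReport.LogCandidates() from any script or editor utility prints the candidate pairs; only the first material of each renderer is considered in this sketch.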
     
  6. KamilVDono

    Joined:
    Jun 2, 2016
    Posts:
    12
    Thanks, that is exactly what I wanted to know. Great stuff with BRG then. Keep making such awesome tech, and once again thanks for the response.