Confused about performance of SRP batching vs. GPU instancing.

Discussion in 'Universal Render Pipeline' started by KiddUniverse, Aug 10, 2020.

  1. KiddUniverse

    KiddUniverse

    Joined:
    Oct 13, 2016
    Posts:
    115
    Let's say I have an emissive material that is shared among hundreds of objects, but the emissive value is scaled dynamically and individually for each of them at various times. Is using MaterialPropertyBlocks with GPU instancing (and forcing the shader to be incompatible with the SRP Batcher so that it actually does GPU instancing) more or less performant than just using the SRP Batcher and having hundreds of materials that are practically the same, save for a varying emissive value?

    My understanding is that the SRP Batcher batches by shader, not by material, so you'll still have the same number of SetPass calls either way, which is where my confusion arises.
     
  2. Steven_Cannavan

    Steven_Cannavan

    Unity Technologies

    Joined:
    Jan 13, 2020
    Posts:
    4
    Hello,

    In general the answer is that it depends on the situation.

    GPU Instancing requires every object in the batch to use the same mesh. Each instanced batch can be thought of as "draw this mesh x times with this material" using an instanced batcher. When we execute the batch, we gather all the information needed (transforms, material property blocks, etc.) and tell the GPU to draw them all in one draw call. This is cheaper on the CPU, as we don't have to issue a draw call for each instance of the mesh.

    SRP batches are based around shaders, so we don't need the same material or even the same mesh. Each mesh is still a separate draw call, but we reduce the cost on the CPU by cutting the number of SetPass calls between materials, which are quite expensive.

    So when you have a large number of objects with the same mesh and material on screen (hundreds, thousands, etc.), GPU Instancing makes sense and should perform better.

    When you have a large number of different materials but few shader variants then SRP Batching is the best for performance overall.
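
As a rough illustration of the two rules of thumb above, here is a toy counting model in Python. It is not Unity code and not how the engine actually schedules work. The `Renderer` tuple, the function names, and the assumption that one instanced batch costs exactly one draw call and one SetPass call are all simplifications made up for this sketch.

```python
from collections import namedtuple

# Hypothetical stand-in for a renderer: its mesh, material and shader variant.
Renderer = namedtuple("Renderer", ["mesh", "material", "shader_variant"])


def instancing_cost(renderers):
    """Toy model: one instanced draw (and one SetPass) per unique (mesh, material) pair."""
    batches = {(r.mesh, r.material) for r in renderers}
    return {"draw_calls": len(batches), "set_pass_calls": len(batches)}


def srp_batcher_cost(renderers):
    """Toy model: one draw per renderer, one SetPass per unique shader variant."""
    variants = {r.shader_variant for r in renderers}
    return {"draw_calls": len(renderers), "set_pass_calls": len(variants)}


# 500 copies of the same mesh and material: instancing collapses them into one draw.
same = [Renderer("rock", "rock_mat", "Lit") for _ in range(500)]

# 500 different meshes and materials sharing one shader variant:
# instancing cannot merge anything, but the SRP Batcher needs only one SetPass.
varied = [Renderer(f"mesh{i}", f"mat{i}", "Lit") for i in range(500)]

print(instancing_cost(same), srp_batcher_cost(same))
print(instancing_cost(varied), srp_batcher_cost(varied))
```

In this model the identical rocks cost 1 draw call with instancing versus 500 draw calls (but a single SetPass) with the SRP Batcher, while the varied set costs 500 draws and 500 SetPass calls with instancing versus 500 draws and 1 SetPass with the SRP Batcher. Real numbers depend on sorting, lighting data and platform, so treat it only as a way to reason about which case you are in.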

    For more details on SRP Batching and GPU Instancing you may want to look at the SRP Batching blog post and the GPU Instancing documentation in the manual.

    Your particular example is more complicated, as you want to dynamically update the emissive values. Using MaterialPropertyBlocks could be a more convenient way to update these values across multiple renderers without having to create hundreds of dynamic materials. Depending on what's actually needed, you may be able to get by with only a few dynamic materials, or animate the whole effect using texture data in the material so you can avoid updating these properties on the CPU.

    The difference in performance will depend on how many objects with the same mesh and material are on screen. If you have hundreds of the same mesh and material, I would expect GPU Instancing to perform favorably. If you only have a few on screen, or lots of different material/mesh variants that keep the batches small, then I would expect no difference or a small performance decrease compared to the SRP Batcher, just from the extra overhead of batching them into an instanced draw.

    Hopefully this answers your question?
     
    Last edited: Aug 11, 2020
  3. elJoel

    elJoel

    Joined:
    Sep 7, 2016
    Posts:
    125
    So what happens when I have the SRP Batcher enabled and enable GPU instancing on the materials? Does this lead to problems? Can the two coexist? And what does the Dynamic Batching checkbox do in the URP settings?
     
  4. Steven_Cannavan

    Steven_Cannavan

    Unity Technologies

    Joined:
    Jan 13, 2020
    Posts:
    4
    During rendering there are two paths an object can take: the SRP Batcher route, which shows up as RenderLoopNewBatcher.Draw in the Frame Debugger, or the original route, which shows up as RenderLoop.Draw. The SRP Batcher doesn't currently support instancing*, so GPU-instanced materials will just be added to a normal SRP batch when used on MeshRenderers. Things like particle systems are rendered in the original path, and if you make an explicit call such as DrawMeshInstanced from a command buffer, for example, this will be instanced.

    If you disable SRP Batching then everything goes through the original path, so instancing will work again assuming you have enabled it.

    If you have a shader that doesn't support SRP Batching, then it will go through the original path regardless of whether SRP Batching is enabled, and can therefore use instancing.
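
The routing rules described in this post can be summarized as a small decision function. This is a Python sketch of the logic as stated here for MeshRenderers only, not engine code. The function name and parameters are hypothetical, and it deliberately ignores particle systems, explicit instanced draw calls, the Hybrid Renderer and XR.

```python
def render_path(srp_batcher_enabled, shader_srp_compatible, instancing_enabled):
    """Return (Frame Debugger label, whether a MeshRenderer is GPU instanced).

    Sketch of the rules in this post; parameter names are made up for illustration.
    """
    if srp_batcher_enabled and shader_srp_compatible:
        # SRP Batcher path: the material's instancing flag has no effect here.
        return ("RenderLoopNewBatcher.Draw", False)
    # Original path: instancing works if it is enabled on the material.
    return ("RenderLoop.Draw", instancing_enabled)


# Batcher on, compatible shader, instancing ticked: still a normal SRP batch.
print(render_path(True, True, True))
# Batcher on, but the shader is not SRP Batcher compatible: original path, instanced.
print(render_path(True, False, True))
# Batcher disabled entirely: original path, instanced.
print(render_path(False, True, True))
```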

    Depending on the project, it may make sense to disable the SRP Batcher so you can use instancing extensively, but in general it's better to just use the SRP Batcher, as it can batch objects with similar materials. Remember that other things will break an instanced batch, such as lightmaps, reflections and light probes. Depending on your scene setup, and whether the material needs any of these, instancing may produce a lot of small batches that would have been handled more efficiently by the SRP Batcher.
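
To see why per-object lighting data can shrink instanced batches, here is a toy Python grouping model. It assumes, purely for illustration, that objects can only share an instanced batch when their mesh, material, lightmap and reflection probe all match. The dictionary fields and the function name are hypothetical, and the real engine's batch-breaking conditions are more involved.

```python
from collections import Counter


def instanced_batch_sizes(objects):
    """Toy model: objects share an instanced batch only if every key field matches."""
    keys = (
        (o["mesh"], o["material"], o["lightmap"], o["reflection_probe"])
        for o in objects
    )
    return sorted(Counter(keys).values(), reverse=True)


# 12 identical crates with no per-object lighting data: one batch of 12.
plain = [
    dict(mesh="crate", material="wood", lightmap=None, reflection_probe=None)
    for _ in range(12)
]

# The same crates spread over 3 lightmaps and 2 reflection probes.
lit = [
    dict(mesh="crate", material="wood", lightmap=i % 3, reflection_probe=i % 2)
    for i in range(12)
]

print(instanced_batch_sizes(plain))
print(instanced_batch_sizes(lit))
```

Under these assumptions the plain crates form a single batch of 12, while the lit crates fall apart into six batches of 2, which is the "lots of small batches" situation where the SRP Batcher tends to be the better choice.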

    You can always enable instancing on the materials, switch between enabling and disabling the batcher, and profile to see what works best for your project setup. If you need instancing for a couple of cases alongside the SRP Batcher, you can use explicit instancing (DrawMeshInstanced) or a material that doesn't support the SRP Batcher, though the material approach has the caveat that it may not remain valid down the line and may cause unforeseen issues.

    As I said previously, instancing only really makes sense if you have hundreds or thousands of the same mesh and material. Small batches are generally more efficient via SRP Batching, as other meshes with the same shader variant can be included. It always depends on the specific use case, so if you decide to try it out: profile, profile, profile.

    The previous example by Anthony0506 uses it in one place, which means that if it ends up in a situation where it's suboptimal, it will probably be a relatively small blip compared to the hundreds-of-materials approach. It is also more convenient to use and update the MaterialPropertyBlocks via a manager.

    As for the Dynamic Batching checkbox it enables or disables the use of the dynamic batcher for that pipeline.

    * There is an instanced path for the DOTS/Hybrid Renderer and for XR SinglePassInstanced.
     
  5. elJoel

    elJoel

    Joined:
    Sep 7, 2016
    Posts:
    125
    Thanks for the detailed reply!

    I am using Hybrid Renderer V2, but in the documentation I only see instancing mentioned for V1. I am using simple URP/Lit materials and have a lot of the same ones in the scene. Will setting "Enable GPU Instancing" do anything in this case, or what do I have to do to get instancing in this scenario? What if I use URP/Simple Lit, which is not compatible with the SRP Batcher? Never mind, I played around with SRP versions and now it's compatible.

    Do you have any more info or could point me to documentation on what that means and how it interacts with the SRP Batcher?
     
  6. Steven_Cannavan

    Steven_Cannavan

    Unity Technologies

    Joined:
    Jan 13, 2020
    Posts:
    4
    Hybrid Renderer V1 and V2 are both built around instancing. Hybrid Renderer V2 has a lot of improvements and has its own shader variant, so if you're using a default URP shader or one created with Shader Graph, it will be instanced correctly. Enabling GPU Instancing on the material is therefore no longer required. Using the SRP Batcher is a requirement for the Hybrid Renderer.

    If you're interested, you can confirm that instancing is working from a capture, provided you develop on a Windows machine and have RenderDoc installed. Documentation on using RenderDoc with Unity can be found here. Fair warning, though: in my test project, which has over 0.25 million boxes, using it can sometimes crash the project :( so make sure you have saved before loading or making a capture. If you do a capture, you should see a number of DrawIndexedInstanced draw calls under the RenderLoopNewBatcher.Draw labels.

    Having dynamic batching enabled in the pipeline asset is the same as enabling it in the project settings of a Built-in Render Pipeline project.

    When you have both enabled, SRP Batching takes precedence over dynamic batching. Historically, dynamic batching has only been a saving on older mobile devices, as it will almost always fail to produce decent batches unless the content is very carefully authored to comply with the dynamic batching constraints. So you will only see dynamic batching occur in non-SRP batches, and this shouldn't be a performance issue.

    When using the Hybrid Renderer, however, the old batching systems can interfere with it, so you should disable them. As everything is instanced in the Hybrid Renderer, there is no need to statically batch meshes together, nor would dynamic batching make any sense. Static batching doesn't gel well with the conversion of GameObjects to Entities: if it's enabled on objects that are converted into Entities and rendered with Hybrid, they will render incorrectly.
     
  7. imaewyn

    imaewyn

    Joined:
    Apr 23, 2016
    Posts:
    211
    @Steven_Cannavan can you advise on the best way to batch big chunks for a custom landscape? The vertices in the chunks differ only on the y axis. I also have two variants of the shader: one where the chunks share the same material and use a global buffer, and one with different materials. I don't see any performance difference between them, and I always see "Saved by batching: 0". The chunks are placed dynamically, and I've tried StaticBatchingUtility.Combine, but that gave no result either.
     
  8. cassius

    cassius

    Joined:
    Aug 5, 2012
    Posts:
    125
    Am I understanding correctly that when doing GPU instancing (Graphics.DrawMeshInstanced) it is not possible to use reflection probes?

    Interesting thread!
     
  9. ishangill

    ishangill

    Joined:
    Nov 28, 2017
    Posts:
    36
    Somebody please help me.
    I am confused about the SRP Batcher.
    I am using URP for my Android project.
    I want to know about the SRP Batcher. If I enable the SRP Batcher, the Stats window shows 5.6k verts and 2.5k tris.
    If I disable the SRP Batcher, it shows 85k verts and 115k tris.
    Is the SRP Batcher good for Android performance or not?
     
  10. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Use SRP batching on all platforms if you can; it is intended to be the replacement, after all.
     
  11. ishangill

    ishangill

    Joined:
    Nov 28, 2017
    Posts:
    36
    Thanks a lot.
    Now I will use SRP batching.
    The tris and verts are reduced from 85k to 2.5k
    with SRP batching.
    Thanks again.