Question Static batching support for static entities using Entities Graphics?

Discussion in 'Graphics for ECS' started by zenbin3d, Sep 20, 2023.

  1. zenbin3d

    zenbin3d

    Joined:
    Jul 7, 2023
    Posts:
    25
    Is there any way to have static batching support for static mesh entities in a subscene?

    I cannot use GPU instancing efficiently or rely on the SRP Batcher in my situation, because each entity uses up to 4 LODs with different meshes per entity, and the number of draw calls grows very quickly.
    With static batching I can keep the draw call count very low, considering that all the different meshes share the same material.

    Manually combining the meshes is not a solution either, because I would no longer benefit from frustum culling.
     
  2. JussiKnuuttila

    JussiKnuuttila

    Unity Technologies

    Joined:
    Jun 7, 2019
    Posts:
    351
    Static batching is not supported for Entities Graphics, and we have no plans to support it at the moment.

    If you are looking to improve your per draw call cost, you could investigate combining the meshes as separate submeshes in a single mesh. This user reported a speedup from doing so: https://forum.unity.com/threads/how-to-toggle-rendering-of-worlds-systems.1483371/

    Under the hood, this should be relatively similar to using static batching in terms of performance, since in both cases Unity does not have to switch vertex buffers between the draw calls (static batching packs objects inside a vertex buffer similarly to how submeshes are packed inside a mesh). This way, you still get full visibility culling. In cases where a lot of the objects are visible, you might have more draw calls compared to static batching, but the extra draw calls will be the cheapest kind of draw where very little state has to change between the draws.
     
  3. zenbin3d

    zenbin3d

    Joined:
    Jul 7, 2023
    Posts:
    25
    First, I have to admit I'm a bit surprised. I thought Entities Graphics was trying to achieve parity with built-in rendering features and/or improve on them in various areas, as it has done with highly efficient GPU instancing.

    Regarding your suggestion, I apologize for reiterating it below; I just want to make sure I understood correctly.

    Creating a mesh with dozens of submeshes, each one with the same material and LODGroup, would benefit from frustum culling per submesh and at the same time would be as cheap as having them statically batched, even if the number of draw calls could reach the dozens?!
    With those dozens of meshes statically batched, I would get maybe 2 draw calls at most, if I'm correct.

    How can these 2 approaches be equivalent? What am I missing?!
     
  4. zenbin3d

    zenbin3d

    Joined:
    Jul 7, 2023
    Posts:
    25
    Oh, maybe because with static batching, updating the index buffer to partially render a static batch could take up more CPU than extra cheaper draw calls you mentioned above?
     
  5. JussiKnuuttila

    JussiKnuuttila

    Unity Technologies

    Joined:
    Jun 7, 2019
    Posts:
    351
    Static batching is a feature where we feel like the technical problems it causes internally are worse for Entities Graphics than the benefit it brings. We prefer to try to optimize the BatchRendererGroup path instead, so you can get good performance without static batching.

    Packing meshes into submeshes will not be exactly the same as static batching. Let me try to elaborate on this a bit.

    Static batching works by essentially packing the objects into submeshes like I suggested, such that the transforms of the objects will be baked into the vertex buffer. This uses a lot of memory, because every object needs its own vertices, even if many objects would otherwise be sharing a mesh. At runtime, Unity will cull objects normally, and if after culling and sorting it discovers that it's rendering consecutive submeshes (let's say submeshes 3, 4, 5, 6) that have exactly the same properties, then it will render those with a single draw call. If the submeshes are not consecutive (let's say submeshes 3, 5, 7, 9), then you will get 4 draw calls, but they will be cheap draw calls. As far as I know, there is no index buffer updating happening, so the draw call count depends on what is visible, and which visible objects happened to be next to each other.

    If you have the same situation with manually packed submeshes using Entities Graphics, you will always get 4 cheap draw calls, and it will be similar to the non-consecutive submeshes situation. It is possible that this is still slower than static batching, but it should be faster than having 4 completely separate meshes, so it should hopefully give you a performance increase.
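
    The draw call counting described above can be sketched as a small model. This is only an illustration, not Unity's actual implementation: it assumes all submeshes have identical properties, that draw order follows submesh index order, and the function names are hypothetical.

```python
def static_batch_draw_calls(visible_submeshes):
    """Model of static batching: each run of consecutive visible
    submesh indices is merged into a single draw call."""
    draws = 0
    previous = None
    for index in sorted(set(visible_submeshes)):
        # A new draw call starts whenever the run of consecutive indices breaks.
        if previous is None or index != previous + 1:
            draws += 1
        previous = index
    return draws


def packed_submesh_draw_calls(visible_submeshes):
    """Model of manually packed submeshes in Entities Graphics:
    one (cheap) draw call per visible submesh, with no merging."""
    return len(set(visible_submeshes))


print(static_batch_draw_calls([3, 4, 5, 6]))    # consecutive run
print(static_batch_draw_calls([3, 5, 7, 9]))    # no two indices adjacent
print(packed_submesh_draw_calls([3, 4, 5, 6]))  # never merged
```

    In this model, the two approaches only differ when visible submeshes happen to sit next to each other in the buffer.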

    Regarding LODs, those should only hurt instancing if the actual meshes are unique. Entities Graphics is capable of batching together instances across many LOD levels in cases where the LOD meshes are the same, provided that the instances also have the same Material and belong to the same Entities Graphics batch. For example, if you have 8 objects with the same mesh and 4 LODs each, and you have two visible LOD0, one visible LOD1, no visible LOD2 and five visible LOD3, you should get three instanced draw calls with instance counts 2, 1, and 5.
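
    The LOD example above can be checked with a few lines of counting. This is a simplified sketch that assumes all instances sit in the same Entities Graphics batch and groups them only by (LOD mesh, material); the mesh and material names are made up for illustration.

```python
from collections import Counter


def instanced_draw_calls(visible_instances):
    """Group visible instances by (mesh, material) and return the
    instance count of each resulting instanced draw call."""
    counts = Counter(visible_instances)
    # One instanced draw per distinct (mesh, material) pair.
    return list(counts.values())


# 8 objects sharing the same LOD meshes and material:
# two at LOD0, one at LOD1, none at LOD2, five at LOD3.
visible = (
    [("rock_LOD0", "stone")] * 2
    + [("rock_LOD1", "stone")] * 1
    + [("rock_LOD3", "stone")] * 5
)
print(instanced_draw_calls(visible))
```

    If every object had its own unique LOD meshes instead, each (mesh, material) pair would appear only once and every visible instance would cost its own draw call.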
     
  6. zenbin3d

    zenbin3d

    Joined:
    Jul 7, 2023
    Posts:
    25
    Thank you so much for the extra details Jussi! Understood!

    It always bugged me that Unity's static batching system is incredibly opaque / black-boxed. No one outside the engine team can wisely pick the right tool for the job without guesswork, or by reading only high-level documentation tips that leave out the low-level details, which can hide a lot of cons.

    It sounds to me like Entities Graphics is becoming a specialized GPU instancing beast of a framework. It could also use a secondary feature for situations where different meshes (with the same materials) are used multiple times: bindless vertex data?
    I'm guessing, of course, that this would open up some powerful combinations of GPU instancing and bindless vertex data, allowing fewer draw calls even for meshes with LODs, which might then not generate different draw calls. Not to mention fewer rendering state changes. Right?
     
  7. zenbin3d

    zenbin3d

    Joined:
    Jul 7, 2023
    Posts:
    25
    Hey @JussiKnuuttila! I did some tests with mesh renderers with submeshes in the GameObject workflow, and apparently there is no frustum culling done per submesh. It is only done per mesh renderer, which kind of makes sense.
    This means that combining multiple meshes into a bigger mesh object with submeshes is not feasible, because frustum culling is lost.
    I did notice that it is a tiny bit cheaper to draw multiple submeshes than separate mesh renderers, probably because there is no per-submesh frustum culling check and also because, like you said, if all submeshes have the same material there seem to be fewer state changes when they are rendered than if they were separate renderers.

    It seems that using submeshes as an alternative to static batching in an ECS project is not a viable solution, because there is no frustum culling per submesh.
    I really think that Entities Graphics badly needs an extra rendering system, besides the GPU instancing based one, that would also support multiple different meshes with the same materials / shaders.
    Am I missing something here?

    Is there maybe a decent way (as in not requiring a month of work) to implement a custom static batching system with frustum culling support using ECS?
     
  8. JussiKnuuttila

    JussiKnuuttila

    Unity Technologies

    Joined:
    Jun 7, 2019
    Posts:
    351
    Entities Graphics represents submeshes as separate entities in 1.0, and will do it in 1.1 as well unless the ENABLE_MESH_RENDERER_SUBMESH_DATA_SHARING scripting define is set (the define does not affect 1.0). In that situation, it should cull those entities separately.

    However, it is probably using the bounds of the entire Mesh as the default bounding box to cull with, so in order to get submesh-specific culling you have to manually write the correct submesh bounds into the RenderBounds component, as Unity does not automatically compute submesh-specific bounds (they are not usually used like this).
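
    Computing submesh-specific bounds by hand amounts to taking the min/max of only the vertices that the submesh's index range references. The following is a minimal, engine-independent sketch of that idea; the function name and data layout are hypothetical, and writing the result into RenderBounds would still have to be done in your own baking or system code.

```python
def submesh_aabb(vertices, indices, index_start, index_count):
    """Return (center, extents) of the axis-aligned box enclosing only
    the vertices referenced by one submesh's slice of the index buffer."""
    used = [vertices[i] for i in indices[index_start:index_start + index_count]]
    if not used:
        raise ValueError("submesh references no vertices")
    mins = tuple(min(v[axis] for v in used) for axis in range(3))
    maxs = tuple(max(v[axis] for v in used) for axis in range(3))
    # Center/extents form: extents are half the box size on each axis.
    center = tuple((lo + hi) / 2 for lo, hi in zip(mins, maxs))
    extents = tuple((hi - lo) / 2 for lo, hi in zip(mins, maxs))
    return center, extents


# Two triangles packed into one mesh, far apart on the x axis.
vertices = [(0, 0, 0), (1, 0, 0), (0, 1, 0), (10, 0, 0), (12, 0, 0), (10, 2, 0)]
indices = [0, 1, 2, 3, 4, 5]
print(submesh_aabb(vertices, indices, 0, 3))  # bounds of the first submesh only
print(submesh_aabb(vertices, indices, 3, 3))  # bounds of the second submesh only
```

    With whole-mesh bounds, both submeshes would be culled against a box spanning x from 0 to 12; with per-submesh bounds, each one can be culled on its own.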
     
  9. zenbin3d

    zenbin3d

    Joined:
    Jul 7, 2023
    Posts:
    25
    Ooooh! OK! Got it! So it's like using a hack / workaround to squeeze out somewhat cheaper draw calls using Entities Graphics.
    A bit of a tedious workaround... but... I guess...

    Thank you very much for the extra insight @JussiKnuuttila.