Resolved Reduce the number of batches in RenderMesh without GPU instancing

Discussion in 'Graphics for ECS' started by hojjat-reyhane, Apr 24, 2021.

  1. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Hi,
    I want to know if there is a way to reduce the number of batches when using RenderMesh without enabling GPU instancing in the shader.
    I published a 2D RTS game that uses Graphics.DrawMeshInstanced(), but about 20% of players' Android devices don't support instancing.
    I think the only option for me is RenderMesh, but the high number of batches is the barrier.
    Also, some shaders have no option for GPU instancing.
     
    Last edited: Apr 24, 2021
  2. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    You can enable dynamic batching. Unity will then automatically combine meshes of up to 300 vertices into one mesh, reducing draw calls.

    You can also combine meshes manually in your code. This costs CPU time, but if you do it manually you can multithread it.
     
  3. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    In my tests, enabling dynamic batching doesn't work with ECS RenderMesh.
    Are you talking about combining meshes using RenderMesh?
     
  4. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Every frame, you can manually combine some meshes into larger ones, then use RenderMesh on the new meshes you create. This reduces batches and works with RenderMesh. Doing this manually is faster than dynamic batching anyway, because you can combine different meshes on different threads.

    If you are using RenderMesh, automatic dynamic batching won't work, but you can do it manually.

    Also, if the device does not support instancing, the hardware is likely not that powerful anyway. So on those devices you can reduce the quality of your meshes to reduce the CPU load of manually combining them.
     
  5. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Thanks, I should test it.
    Is there any limit on combining meshes this way?
    I have thousands of 2D units in the scene, and they are animated using 30 materials, each with a 1024 sprite sheet texture. Can I combine meshes with different materials, or with the same material but different UVs?
     
  6. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    I am not an expert, but if they have different materials it is very difficult to combine them into the same mesh. If two meshes each have multiple materials, but the materials are the same, I think you can combine them. Basically you would create a new mesh, fill it with the vertices of mesh 1, then with the vertices of mesh 2, and pass this new mesh to RenderMesh.

    You can achieve this in two ways. Remove the RenderMesh component from each entity and have separate entities for the combined meshes, each with a RenderMesh component containing a combined mesh.

    Or do not use ECS RenderMesh and use Graphics.DrawMesh instead, manually issuing the draw yourself.

    I believe they can have different UVs and vertices, but must share materials. There is no limit to how many you can combine, but be aware that combining too many into one mesh may lead to poor CPU performance, so you must test to see how it performs.

    I have done this by combining dozens of 3D meshes into one on mobile devices, and it worked well.
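
The append approach described above can be sketched in a few lines. This is a minimal, language-neutral sketch in Python rather than Unity C#; `MeshData` and `combine` are hypothetical names, and it assumes all input meshes share one material and that their vertices are already expressed in a common space. The detail that is easy to miss is that triangle indices must be shifted by the number of vertices already appended.

```python
from typing import List, Tuple

Vec3 = Tuple[float, float, float]
Vec2 = Tuple[float, float]


class MeshData:
    """Plain container for the three arrays a mesh needs (hypothetical helper)."""

    def __init__(self, vertices: List[Vec3], uvs: List[Vec2], triangles: List[int]):
        self.vertices = vertices
        self.uvs = uvs
        self.triangles = triangles


def combine(meshes: List[MeshData]) -> MeshData:
    """Append every mesh into one, shifting triangle indices by the vertex offset."""
    vertices: List[Vec3] = []
    uvs: List[Vec2] = []
    triangles: List[int] = []
    for mesh in meshes:
        offset = len(vertices)  # index of this mesh's first vertex in the combined array
        vertices.extend(mesh.vertices)
        uvs.extend(mesh.uvs)  # each mesh keeps its own UVs
        triangles.extend(index + offset for index in mesh.triangles)
    return MeshData(vertices, uvs, triangles)
```

In Unity, the three resulting arrays would correspond to `Mesh.vertices`, `Mesh.uv` and `Mesh.triangles`.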
     
    Last edited: Apr 24, 2021
  7. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Another solution is to give meshes LODs. Meshes far away can all use the same mesh and material, which makes batching easy. It depends on your game, though.
     
  8. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Thanks again.
    I tested Graphics.DrawMesh without combining and it had some problems.
    My game is a 2d RTS in one view, so I can't use LOD.
    So I have to test combining the meshes. But I'm concerned about layering objects correctly using this method.
     
  9. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    You can see the history of my problem Here and Here.
     
  10. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    The game has an isometric view. So the lower unit or object should be in front.
     
    Last edited: May 9, 2021
  11. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    1. Sort all your meshes into ones which overlap and ones which don't. You should use spatial partitioning such as a grid, hash map, or tree. Any meshes that don't overlap you can simply combine and batch.

    2. For all meshes which do overlap, you will need to test. I believe vertices are drawn in the order they appear in the array, but I am not 100% sure. If you test and find this is true, then when combining meshes which overlap, add the bottom meshes first, up through to the top. I am not too familiar with 2D, but this may be the solution.

    Edit: By bottom meshes I mean the ones in the back.
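
The two steps above can be sketched as follows, as a rough Python illustration rather than Unity code. It assumes every unit is an axis-aligned square of side `size` centered on its position, and that a larger `y` means "further back" (matching an isometric view where the lower unit is in front). `split_by_overlap` and `back_to_front` are hypothetical names. Whether append order actually controls draw order is the untested assumption from the post, so treat the sort as something to verify.

```python
import math
from collections import defaultdict


def split_by_overlap(positions, size):
    """Return (isolated, overlapping) index lists using a uniform grid of cell width `size`."""
    cells = defaultdict(list)
    for i, (x, y) in enumerate(positions):
        cells[(math.floor(x / size), math.floor(y / size))].append(i)

    isolated, overlapping = [], []
    for i, (x, y) in enumerate(positions):
        cx, cy = math.floor(x / size), math.floor(y / size)
        hit = False
        # Two squares of side `size` can only overlap if their cells are neighbours.
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                for j in cells.get((cx + dx, cy + dy), ()):
                    if j != i and abs(positions[j][0] - x) < size and abs(positions[j][1] - y) < size:
                        hit = True
        (overlapping if hit else isolated).append(i)
    return isolated, overlapping


def back_to_front(indices, positions):
    """Order indices so units further back (larger y) are appended first."""
    return sorted(indices, key=lambda i: positions[i][1], reverse=True)
```

The isolated units can be appended in any order, while the overlapping ones would be appended in the order returned by `back_to_front`.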
     
  12. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Ok. Thanks a lot, I will give it a try and post the result.
     
  13. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Awesome! Do let me know, as I have other ideas which may help.
     
  14. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Well, I tested simple mesh combining with 2 materials.
    I managed the layering by setting the z position of the meshes.
    But I have one problem: I don't know how to set different UVs for the combined meshes!
    Right now all the combined meshes show the same tile of the texture.
    Each soldier texture contains 128 frames, which I use to animate them.
     
  15. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    You can set the UV the same way as vertices: Mesh.uv. Are you having trouble with this, or are you doing this but they are still showing the same tile? Forgive me if you already know this, I just want to make sure.

    https://docs.unity3d.com/ScriptReference/Mesh-uv.html
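
For a sprite sheet, the four UVs of each quad simply select one cell of the texture. The sketch below is in Python for illustration only; `frame_uvs` is a hypothetical helper, and the layout is an assumption: frames are stored row by row starting at the top-left of the sheet, UV (0, 0) is the bottom-left of the texture, and the quad's vertices are ordered bottom-left, top-left, top-right, bottom-right. A 16 x 8 grid is used only as one possible way to arrange 128 frames.

```python
def frame_uvs(frame, columns, rows):
    """Return the four UV corners for one frame of a grid-shaped sprite sheet."""
    col = frame % columns
    row = frame // columns        # row 0 is the top row of the sheet (assumption)
    width, height = 1.0 / columns, 1.0 / rows
    u0 = col * width              # left edge of the cell
    u1 = u0 + width               # right edge
    v1 = 1.0 - row * height       # top edge
    v0 = v1 - height              # bottom edge
    # Order must match the quad's vertex order: BL, TL, TR, BR.
    return [(u0, v0), (u0, v1), (u1, v1), (u1, v0)]


# Example: second row, second column of a 16 x 8 sheet.
uvs = frame_uvs(17, 16, 8)
```

When combining, each unit's four UVs are appended to the combined UV array in the same order as its four vertices.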
     
  16. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Thank you very much.
    Yes, I knew about Mesh.uv, but I didn't know I could use it with mesh combining. Also, I was not curious enough to test it.
    I managed to combine meshes with different UVs.
    But right now it has a serious performance issue with 1000 units, because I'm testing with MonoBehaviour Update.
    I should move it to ECS and do lots of optimizations to see whether it works or not.
     
  17. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Good to hear you got it working! Multithreading and using Burst should fix any performance issues. Try to keep combined meshes below 300 vertices, because around that point the CPU cost of combining outweighs the performance gain from batching. For example, if you are combining all 1000 meshes into 1, it is much better to create more batches with smaller combined meshes (the number of batches you choose depends on how many vertices your meshes have).
     
  18. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    About keeping the combined meshes below 300 vertices: as I'm combining quads, the limit would be 75 meshes, right?
    But in my tests, splitting the similar meshes into groups of 75 is much heavier than combining all of them into one mesh.
    Test result with 2000 units and 12 materials:
    If I combine all of the same units into one mesh => 12 batches, 12 SetPass calls, 70 fps, each mesh up to 600 vertices
    If I limit the combining to 75 meshes => ~22 batches, ~22 SetPass calls, 60 fps, each mesh limited to 300 vertices
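
The arithmetic behind the 75-quad figure is 300 vertices / 4 vertices per quad = 75 quads per combined mesh. A small Python sketch of the splitting step, with the hypothetical helper `chunk_by_vertex_limit`, shows how the batch count grows once a limit is imposed:

```python
QUAD_VERTICES = 4


def chunk_by_vertex_limit(unit_ids, vertex_limit=300):
    """Split one material's units into groups whose combined quads stay within the limit."""
    per_chunk = vertex_limit // QUAD_VERTICES  # 300 // 4 == 75 quads
    return [unit_ids[i:i + per_chunk] for i in range(0, len(unit_ids), per_chunk)]


# Example: 150 units sharing one material need two combined meshes instead of one.
groups = chunk_by_vertex_limit(list(range(150)))
```

Every extra group is one more batch for that material, which is the trade-off measured in the test results above.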
     
  19. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Interesting. If it works for you, then keep doing it. I was just quoting the Unity documentation. It could be that on your mobile device draw calls and SetPass calls are more significant than the CPU overhead of combining meshes.

    It is also because you are using 12 materials. I suspect that if you only had 1 or 2 materials, smaller meshes would be more performant. But since each material is a draw call and a SetPass call, it's more performant for you to combine into large meshes.
     
  20. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Thanks to you @martygrof3708, I managed to make it work.
    The bottleneck, though, is the mesh combining, which I have to do on the main thread. It also allocates too much each frame (about 120 KB).
    If I could do that inside a job, it would be great. Any suggestions?
     
  21. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    You can sort entities into "thread groups", then combine the entities' meshes in each group on a different thread. Sort them into separate lists, then run an IJob on each list on a different thread.
     
  22. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    The problem is that I can't put a mesh in my entity's component.
    Let me explain the way I'm doing it.
    Each entity should have its own mesh. So I have a mesh pool (a list of meshes in a MonoBehaviour), and for each entity I store its mesh index.
    Then I use the mesh index to combine them.
    If I want to put the mesh in my component, I have to change it to ISharedComponentData.
    As the meshes are not shared, I think this is not ideal.
    I read this somewhere: "every entity with a different mesh will get its own chunk, using up a lot of space unnecessarily."
    Another problem is that I can't put components containing a mesh into native collections:

    InvalidOperationException: Unity.Rendering.RenderMesh used in NativeArray<Unity.Rendering.RenderMesh> must be unmanaged (contain no managed types) and cannot itself be a native container type.


    So how can I send meshes to the job?
     
  23. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Sort entities into a native list by mesh index, and do that on different threads. Don't use ECS for this.

    But why do you have to use a shared component?

    You can pass mesh.vertices and mesh.uv into a job, then copy them back into a new mesh on the main thread.

    If you want to use ECS, store a dynamic buffer of vertices and UVs on each entity, combine them into a native list in jobs, then create the combined meshes from the lists of verts and UVs on the main thread. You can also use C# threading to put the main-thread part on another thread and then do audio or something on the main thread.
     
    Last edited: May 9, 2021
  24. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    If I can combine different lists of verts and UVs into one list of verts and UVs, I think that's going to solve my problem.
    But due to my lack of knowledge about meshes, verts and UVs, I'm not sure how to do that.
    I think I first have to apply localToWorldMatrix to the verts and UVs of each entity, then append them into one list, then create a new mesh with these verts and UVs on the main thread.
    Is appending enough, or have I missed something?
     
  25. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    OK, thanks again.
    Now I only have UVs in my components.
    As my game is 2D and the mesh is a quad, there is no need to store vertices and triangles.
    So in the job I calculate the quad verts and apply localToWorldMatrix to them, then calculate the triangles array and append the entity's UVs array.
    Finally I assign the arrays to the mesh, and that's it. 50% more efficient than combining meshes on the main thread. :)
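
The per-entity work described here can be sketched roughly as below. This is a Python illustration of the data flow, not the poster's actual job code: `UNIT_QUAD`, `transform_point` and `build_combined_quads` are hypothetical names, the matrix is assumed to be a 4x4 row-major local-to-world matrix, and the quad is assumed to be a unit square centered on the origin. Note that only positions go through the matrix; UVs are appended unchanged, and triangle indices are offset by the vertices already written.

```python
# Corners ordered bottom-left, top-left, top-right, bottom-right.
UNIT_QUAD = [(-0.5, -0.5, 0.0), (-0.5, 0.5, 0.0), (0.5, 0.5, 0.0), (0.5, -0.5, 0.0)]
# Two triangles per quad, clockwise when x points right and y points up.
QUAD_TRIANGLES = (0, 1, 2, 0, 2, 3)


def transform_point(matrix, point):
    """Apply a 4x4 row-major matrix to a 3D point (w is taken as 1)."""
    x, y, z = point
    return tuple(
        matrix[r][0] * x + matrix[r][1] * y + matrix[r][2] * z + matrix[r][3]
        for r in range(3)
    )


def build_combined_quads(matrices, uvs_per_entity):
    """Build combined vertex, UV and triangle arrays for one quad per entity."""
    vertices, uvs, triangles = [], [], []
    for matrix, entity_uvs in zip(matrices, uvs_per_entity):
        base = len(vertices)                      # first vertex index of this quad
        vertices.extend(transform_point(matrix, corner) for corner in UNIT_QUAD)
        uvs.extend(entity_uvs)                    # UVs are not transformed
        triangles.extend(base + i for i in QUAD_TRIANGLES)
    return vertices, uvs, triangles
```

Because each entity writes a fixed 4 vertices, 4 UVs and 6 indices, the output offsets are known in advance, which is what makes this kind of loop easy to split across jobs.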
     
  26. Guedez

    Guedez

    Joined:
    Jun 1, 2012
    Posts:
    831
    Does Android support ComputeBuffer?
    I would have a huge mesh that is just a bunch of quads centered around 0,0,0, with UV.x being the vertex index. Then, in a shader, read from a ComputeBuffer which sprite it needs to draw from an atlas and where, and let the vertex shader handle it all: the position is read from the ComputeBuffer, and the UVs are inferred from the vertex index in UV.x. Just collapse any leftover quads (if spriteID == -1 then pos.xyz = 0,0,0).
    It will be a whole lot of work and I can't guarantee it will be faster, but it's an idea.
    You can even use Burst and parallelism to generate the data array that will be fed to the ComputeBuffer.
    Just be sure to use ComputeBuffer.SetData rather than ComputeBuffer.BeginWrite/EndWrite. I am not entirely sure when/why/how the second pair of functions is meant to be used, but they sure as heck are not as consistent as SetData, even though the NativeArray they return seems so inviting to feed into a Bursted job.
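
The per-vertex logic of this idea can be simulated on the CPU to check the indexing. The Python sketch below is not shader code; `vertex_stage` is a hypothetical name, and the record layout `(sprite_id, x, y)`, the grid-shaped atlas, and the top-left frame ordering are all assumptions made for illustration.

```python
# Corner offsets for vertices 0..3 of each quad: BL, TL, TR, BR.
CORNERS = [(-0.5, -0.5), (-0.5, 0.5), (0.5, 0.5), (0.5, -0.5)]


def vertex_stage(vertex_index, sprite_buffer, columns, rows):
    """Return (position, uv) for one vertex of the shared all-quads mesh."""
    quad, corner = divmod(vertex_index, 4)   # which quad, and which of its corners
    sprite_id, x, y = sprite_buffer[quad]
    if sprite_id == -1:
        # Unused quad: collapse it to a single point so nothing is rasterized.
        return (0.0, 0.0, 0.0), (0.0, 0.0)
    cx, cy = CORNERS[corner]
    col, row = sprite_id % columns, sprite_id // columns
    u = (col + cx + 0.5) / columns
    v = 1.0 - (row + 0.5 - cy) / rows        # row 0 is the top row of the atlas
    return (x + cx, y + cy, 0.0), (u, v)
```

Filling `sprite_buffer` is the part that would be generated each frame and uploaded, while the quad mesh itself never changes.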
     
  27. MintTree117

    MintTree117

    Joined:
    Dec 2, 2018
    Posts:
    340
    Awesome! I'm so glad you got it working :) I am curious, what are the before and after benchmarks with this method?
     
  28. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    Some Android devices don't support ComputeBuffer, so maybe it's not the best option for Android.
    I tested the method below on one Xiaomi device, and it didn't work.
    https://forum.unity.com/threads/1-million-animated-sprites-at-60-fps.811116/
    Also, this method was not efficient enough for multiple materials or huge sprite sheets.
     
  29. hojjat-reyhane

    hojjat-reyhane

    Joined:
    Feb 4, 2014
    Posts:
    49
    I tested my game with 1200 units (moving and animated).
    Combining meshes on the main thread ~ 80 fps
    Combining vertices and ... with job ~ 115 fps