Even poorer performance for ECS Hybrid Renderer compared to traditional GameObjects?

Discussion in 'Graphics for ECS' started by Seto, Apr 16, 2019.

  1. Seto

    Joined: Oct 10, 2010
    Posts: 243
    https://github.com/SetoKaiba/TestECS
    Here's a GitHub repository to test with.
    I made a simplified project as a benchmark.
    The traditional GameObjects approach performs better.
    The ECS Hybrid Renderer approach performs worse.

    Both instantiate 1000 different meshes (clones, so each one is a distinct mesh) for rendering,
    so the rendering is not instanced.
    I just want to compare the baseline performance without instancing.

    The current ECS approach performs worse with non-instanced rendering.
    Is that intended?
    Or am I missing something?
    RenderMeshSystemV2 costs too much CPU.
    Attached images: 1.png, 2.png, 3.png
     
    Last edited: Apr 17, 2019
  2. runner78

    Joined: Mar 14, 2015
    Posts: 792
    Try activating GPU Instancing on the material.
    I recently tested rendering 10,000 cubes with ECS (HDRP on 2019.1). With an old Radeon HD 7850 I get 80-100 fps in the Editor.
     
  3. Seto

    Joined: Oct 10, 2010
    Posts: 243
    @runner78 What I'm discussing in this thread is a comparison of basic rendering performance.
    I mean non-instanced rendering.
    Is the poorer performance intended?
    Or am I missing something?
    RenderMeshSystemV2 costs too much CPU.
     
    Last edited: Apr 17, 2019
  4. runner78

    Joined: Mar 14, 2015
    Posts: 792
    Can you try spawning the cubes at different positions instead of all in the same place?
    GPU Instancing on the material reduces draw calls and saves CPU time. In one of my tests I also simulated and displayed 200,000 simple meshes at the same time at 30 fps, which is not possible with the old GameObject approach.
     
  5. james_unity988

    Joined: Aug 10, 2018
    Posts: 71
    ECS was designed for parallelism. When you try to parallelize a task which cannot run in parallel, you inherit all the overhead of the architecture without any of the benefits. You are showing us a contrived example of having 1000 objects managed by the same system which cannot be batched together. This isn't really what ECS was designed for. Also, the CPU overhead for this worst-case scenario could be improved in the future as the API matures.
     
  6. Zuntatos

    Joined: Nov 18, 2012
    Posts: 612
    1000 meshes shouldn't take that long, even though this is the worst case for it, right? Did you look at a build's performance instead of editor performance?
     
  7. That's a misconception. Although it greatly benefits from parallelism, ECS is not inherently about parallel processing. It's about linear processing without a lot of extra data in between. When they talk about tight loops, this is what they mean.
    What is designed for parallel processing is the Job system.
     
    Deleted User and Ryiah like this.
  8. james_unity988

    Joined: Aug 10, 2018
    Posts: 71
    There would really be no point in taking all the time and effort in migrating over to ECS if you're not taking advantage of, at least to some extent, the Job system and the Burst compiler. I realize that these are separate components and can be used independently. But to suggest they weren't intended and designed to work together is a misconception. Even the term "Pure ECS" implies use of the job system.
     
  9. Chris-Herold

    Joined: Nov 14, 2011
    Posts: 116
    RenderMeshSystemV2 is built around the concept of instancing.
    Testing it against the legacy system with no instancing means you're looking at it with the wrong expectations to begin with.
    Of course it will perform worse. It is simply not built for that purpose, because it goes to great lengths to batch draw calls.

    IMHO, when you have a scenario where you need 1000 unique meshes generating 1000 draw calls, then arguably you're simply doing things wrong.
    And I'm not even going into the memory implications of keeping 1000 unique meshes in memory, or the terrible memory fragmentation due to 1000 unique SharedComponents, and many more bad things.

    So in my opinion your test (1) isn't a fair comparison in the first place, and (2) artificially supposes a scenario that should never exist in a real application if things have been put together properly.
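
    The batching argument above can be illustrated with a toy model. This is a hedged sketch, not Unity's implementation or API: it assumes a renderer that issues one draw call per distinct (mesh, material) pair, and `Renderable` and `count_draw_calls` are hypothetical names made up for the illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Renderable:
    """Hypothetical stand-in for one object to draw."""
    mesh_id: int
    material_id: int


def count_draw_calls(renderables):
    """Group objects by (mesh, material); assume one draw call per group."""
    batches = defaultdict(list)
    for r in renderables:
        batches[(r.mesh_id, r.material_id)].append(r)
    return len(batches)


# 1000 objects sharing one mesh collapse into a single batch.
shared = [Renderable(mesh_id=0, material_id=0) for _ in range(1000)]
# 1000 cloned (unique) meshes cannot be grouped: one batch each.
unique = [Renderable(mesh_id=i, material_id=0) for i in range(1000)]

print(count_draw_calls(shared))  # 1
print(count_draw_calls(unique))  # 1000
```

    In this model the grouping step buys nothing when every mesh is unique, which is the case the benchmark in the first post exercises.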
     
    Last edited: Apr 18, 2019
  10. Straw-man says hello.
     
    Ryiah likes this.
  11. Zuntatos

    Joined: Nov 18, 2012
    Posts: 612
    My use case is a voxel terrain system, which generates unique meshes per chunk. It handles about 10k unique mesh draw calls at 60 fps using the standard MeshFilter/MeshRenderer combo. That's 10x the performance of what the OP has posted, while rendering actual meshes in an actual project instead of this test.

    On topic: is there a chance this is an edge case with the data structures used by the renderer? Like some grouping by world position for batching/culling running into issues because all those meshes are in the same place?
     
    Lahcene, Seto and james_unity988 like this.
  12. runner78

    Joined: Mar 14, 2015
    Posts: 792
    I ran a test with 20,000 cubes without instancing, on two monitors: one showing the Game view, the other the Scene view.

    GameObject: 25-30 FPS
    Entities: 15 FPS
    Entities with Job leak detection deactivated and Burst safety checks off: 55-60 FPS
     
    Seto and Zuntatos like this.
  13. Seto

    Joined: Oct 10, 2010
    Posts: 243
    @james_unity988 @Chris-Herold No, that's not correct. As @Zuntatos says, I'm trying to migrate my voxel terrain system to ECS. I can take advantage of the Job system to generate the terrain asynchronously. Unity has officially declared ECS a replacement for the traditional GameObject system rather than a complement. So I think that, at least for non-instanced rendering, it should have equal performance. At the very least, it should not be worse.

    @Zuntatos Yes. That's what I'm trying to express. I'm migrating my voxel terrain system to ECS as well.

    @runner78 Thank you. I found that out as well. I tested with standalone players instead. The performance is better, but ECS performance is still unstable: ECS takes 14-22 ms while GameObjects take 16 ms. Could you please share your benchmark on GitHub? Thank you.
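
    For comparison with the FPS figures quoted earlier in the thread, the timings above can be converted as follows. This is a minimal sketch that assumes the millisecond values are per-frame times; `fps_from_frame_time` is a hypothetical helper name.

```python
def fps_from_frame_time(ms):
    """Convert a frame time in milliseconds to frames per second."""
    if ms <= 0:
        raise ValueError("frame time must be positive")
    return 1000.0 / ms


for label, ms in [("ECS best", 14), ("ECS worst", 22), ("GameObjects", 16)]:
    print(f"{label}: {ms} ms -> {fps_from_frame_time(ms):.1f} fps")
# ECS best: 14 ms -> 71.4 fps
# ECS worst: 22 ms -> 45.5 fps
# GameObjects: 16 ms -> 62.5 fps
```

    Under that assumption, ECS ranges from somewhat faster to noticeably slower than the GameObjects figure, which matches the "unstable" description.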
     
  14. iamarugin

    Joined: Dec 17, 2014
    Posts: 883
    Did you find any solutions? I am facing the same issue with my planetary terrain system. Due to its nature it consists of only unique meshes.
     
  15. B1QQ

    Joined: Aug 16, 2018
    Posts: 10
    Still looking for a solution.
     
  16. Radu392

    Joined: Jan 6, 2016
    Posts: 210
    Avoid RenderMesh altogether and build your own rendering system. There are literally dozens of threads in this subforum on this topic. Use the search bar if you want ‘inspiration’ on that.
     
  17. B1QQ

    Joined: Aug 16, 2018
    Posts: 10
    Exactly what I did.