Trying to optimize - Why do I get GC calls, and an entity I can't get rid of

Discussion in 'Entity Component System' started by avvie, May 17, 2018.

  1. avvie

    Joined: Jan 26, 2014 | Posts: 74
    So I made a small project to explore the ECS.
    I was looking at the profiler and saw some GC calls under the MeshInstanceRenderer (I am not familiar with profilers, so I might be drawing the wrong conclusions).
    I put the project on GitHub here in case someone wants to take a look and clarify. The bottleneck in the project is the CPU (I am on a Ryzen 1700).
    In a boid example with fish swimming around I could get 250K boids on the screen at 24 fps; again, the bottleneck was the CPU.
    The project on GitHub has a settings GameObject for setting values. At 200K I am at 17 fps with just cubes.
    I feel like I am definitely doing something wrong. Might it be the Lightweight Rendering Pipeline, which I used to play around with Shader Graph? No idea yet.

    I was wondering if someone can help me figure out where I lose the juice.

    Thanks in advance
     
  2. avvie

    I just want to add the profiler pic for convenience.
     
  3. LazyGameDevZA

    Joined: Nov 10, 2016 | Posts: 143
    @avvie I'm not sure I'm understanding this correctly, but you seem to be concerned that GC.Alloc in the MeshInstanceRendererSystem is what's bottlenecking the CPU? Looking at the screenshot, I'm not seeing large amounts of time being spent in there. It's also a tad difficult to see what else could be causing the issue from the Hierarchy view. Just under the graph view there's a drop-down (click on Hierarchy) where you can set the view to Timeline. That should give you a better understanding of which systems are eating up time on the processor and why.

    I've also looked at your project on GitHub. What's been done there doesn't quite look like the best approach. Firstly, you're creating the Cube Archetype on every frame. This isn't necessary, since it's not something that changes at all throughout the lifecycle of the system. You can instead do this setup by overriding the OnCreateManager method in your system and creating the Archetype there. You could even go as far as doing the Archetype setup in your Bootstrap methods, which would only be called once per scene, and exposing the results as static variables. Also note that you have an EntityManager available from inside the System (it lives in the base classes, but it is exposed as protected so that inheritors can access it).

    Also note you're doing a lookup for the MeshInstanceRenderer on a GameObject every frame, which would also be eating some chunk of your performance. On top of that, GameObject.Find is likely to be quite a performance hit as well. I might be completely incorrect on this, but if the CubeSpawnSystem is causing a slowdown it should show up quite quickly when you view the profiler in its Timeline view.
     
  4. avvie

    I was just about to mention most of that stuff, after 2 days of digging around.
    I realize I don't need the spawner to run every frame, as it consumes 4-5 ms at 200K cubes.
    Instead of doing all the optimizations you describe, I wanted to see what the outcome would be if it just runs once.

    Code (CSharp):
    public struct CubeSp
    {
        public int Length;
        [ReadOnly] public ComponentDataArray<Position> Position;
        [ReadOnly] public ComponentDataArray<Radius> RadiusComponent;
        public EntityArray Entities;
    }

    protected override void OnUpdate()
    {
        Debug.Log(_Cubesp.Length);
        // Early exit once localCount has reached nbOfCubes - 1.
        if (localCount >= Bootstrap.Settings.nbOfCubes - 1)
        {
            return;
        }
        // ...
        // Played with both independently; now I just left both lines in the code.
        PostUpdateCommands.RemoveComponent<Radius>(_Cubesp.Entities[0]);
        PostUpdateCommands.DestroyEntity(_Cubesp.Entities[0]);
    }
    I checked, and it does return on the localCount check every frame, but I don't recreate the entity anywhere. I even moved the line that creates the CubeSpawner from afterSceneLoad to before, just in case.

    Shouldn't that delete the entity, so that there is nothing to inject and the system shouldn't run at all?

    PS: I also commented out the injection for the actual cubes since I never used it, but that changed nothing in the execution time in the inspector either.
    Edit2: I get "InvalidOperationException: The NativeArray has been deallocated, it is not allowed to access it" on either the RemoveComponent or the DestroyEntity call. So that's the reason it doesn't get destroyed. Am I missing the right way to destroy it?

    EDIT: Is there an attribute like [AlwaysUpdateSystem] that forces it to run only once? Or a way to set the system's running state to false?

    EDIT3:
    That I did, but the actual logic only runs in the first frame because of the check I do at the start. Yet invoking the system's OnUpdate and anything under it takes 3-4 ms, even if all it does is an if statement and a return.

    EDIT4: I changed the settings in the Lightweight Rendering Pipeline to 1 light, no cascades, no shadows, no MSAA, and got 10 fps out of that as well, although I don't like the solution. It doesn't resolve my other problem, but I managed to get some extra juice out of it.
     
    Last edited: May 18, 2018
  5. avvie

    So I got 33-36 ms per frame on an i7 with 90% CPU utilization.
    I managed to stop the system with
    Code (CSharp):
    World.Active.GetOrCreateManager<CubeSpawnSystem>().Enabled = false;
    No idea if it's good practice or not. If anyone has any ideas, or knows whether it's even possible: I am trying to see how much more I can squeeze out of this example.

    So now it's just the rendering and the CPU. On the CPU I use about 17-18 ms, so the bottleneck is now on the GTX 1060.

    If someone can see how to make it even better, I am all ears.
     
  6. LazyGameDevZA

    ~facepalm~ Sorry, I missed that you're exiting the system early. Have you tried building this and running it as a standalone EXE? I've looked at your profiler screenshot again and realized that EditorOverhead alone already seems to be impacting your performance quite badly. What graphics card are you running?

    Edit: Sorry, I only noticed your follow-up post after I posted my response.
     
  7. LazyGameDevZA

    I'm honestly not sure why this is causing so much overhead. As for whether it's best practice or not: you don't have to go through the active world to turn off the system. You can just use `this.Enabled = false`. Since the system is currently responsible for turning itself off, I feel this would be the more correct way to do so.
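
    A minimal sketch of a system switching itself off, assuming the preview Unity.Entities API used in this thread (a ComponentSystem with an OnUpdate override). SpawnCubes() is a hypothetical placeholder for the project's actual spawning code, not something taken from the repository:

    ```csharp
    using Unity.Entities;

    public class CubeSpawnSystem : ComponentSystem
    {
        protected override void OnUpdate()
        {
            // Hypothetical placeholder for the one-off spawning work.
            SpawnCubes();

            // The system disables itself, so OnUpdate is not invoked on later frames.
            Enabled = false;
        }

        // Hypothetical helper; the real spawning logic lives in the project.
        void SpawnCubes()
        {
        }
    }
    ```

    Setting the flag from inside the system keeps the "run once" decision next to the code it affects, so nothing outside has to look the system up through the world.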
     
  8. siggigg

    Joined: Apr 11, 2018 | Posts: 247
    Did you turn off the debugging, leak detection, etc.? If not, open the Jobs menu in Unity and turn off JobsDebugger, Leak Detection, and Burst Safety Checks.
     
  9. avvie

    @siggigg Turning off the checks did not change anything, as far as I can see.

    @Cyberwiz15 this.Enabled = false; worked as well, and I prefer it. Accessing the world was a solution, but I like this one better.
    Making a standalone build yielded an extra 5 fps. I put a text element on the UI to print deltaTime, and the standalone build averages 0.036 s.

    The bottleneck now is the GPU, which I can see in Task Manager is at 25% utilization, while the CPU is at 90% across all cores. Why the system that returned so early each frame took 3 ms is still a mystery to me, though.
    Why would the GPU be the bottleneck while sitting at such low utilization?

    BTW I updated the GitHub example in case anyone wants to check in more depth. Maybe different hardware sheds some light on the situation. Although we are talking about 200K cubes.
     
  10. Shinao

    Joined: Mar 7, 2014 | Posts: 36
    @avvie Your bottleneck is still the CPU. The current version of the MeshInstanceRendererSystem is quite slow and is waiting on some changes to the Graphics.DrawMesh methods that will allow NativeArrays as input. If I remember correctly from my tests, once the changes in Graphics are made we should see roughly an 80% reduction in CPU usage in this system.
     
    Last edited: May 18, 2018
  11. avvie

    At first it didn't make sense. Then I saw the profile on a Ryzen 1700 with a 1080 Ti and the output was exactly the same, fps-wise.