Question Stress Testing of game - Optimisation

Discussion in 'Entity Component System' started by mk1987, May 8, 2022.

  1. mk1987

    Joined:
    Mar 18, 2019
    Posts:
    53
    Hi all,

    I am trying to stress test my game to see where I currently have bottlenecks. I've put around 60 space ships close together in a scene, so lots is going on, with thousands of missiles going off.

    Obviously that has driven the frame rate into the floor, but I think with some optimisation it should be fine. One bottleneck is my thermal voxel damage model. For a single fixed simulation frame it takes around 6ms, spread over multiple lambdas, of which two are of interest and act on all elements of the damage system. There are roughly 100,000 cube entities.

    The first takes around 2ms and updates the current frame's temperature based on the adjacent cubes: it writes a variable called HealthCubeVar_Next as a function of the cube's own HealthCubeVar and the adjacent cubes' HealthCubeVar.

    The second takes 2.7ms and is essentially just HealthCubeVar = HealthCubeVar_Next, so that the state is up to date for the next frame. There are some extra steps in there that check when values go over certain thresholds, which I can optimise by using tags and doing the checks elsewhere on far fewer elements.

    What's interesting is that I reduced the second system as far as possible, to just Var = Var_Next, and it still runs in 1.6ms. For such a simple process, that suggests I'm close to the limit of what's possible and that the cost is just a function of the number of entities.

    First System
    Code (CSharp):
    [ReadOnly] public ComponentDataFromEntity<HealthCube> ColCube;
    [ReadOnly] public ComponentDataFromEntity<HealthCubeVar> ColCubeVar;
    [ReadOnly] public ComponentDataFromEntity<CubeLinkedComponent> CoLink;
    [ReadOnly] public BufferFromEntity<HitTargetBuffer> HTB;
    // public ComponentDataFromEntity<IsAlive> LaserMissileAlive;
    public float deltaTime;
    public float OverheatT;
    public float CrticialT;

    [BurstCompile(OptimizeFor = OptimizeFor.Performance)]
    public void Execute(Entity entity, ref HealthCubeVar_Next HCV1, ref DynamicBuffer<KillBuffer> KB, in DynamicBuffer<HealthCubeLink> HCL)

    ///some process

    Second System..
    Code (CSharp):
    Entities.WithBurst().ForEach((ref HealthCubeVar HCV, in HealthCubeVar_Next HCV1) =>
    {
        HCV.Damage = HCV1.Damage;
        HCV.Temperature = HCV1.nextTemperature;
    }).ScheduleParallel();
    Does anyone have a suggestion how to effectively eliminate the time taken up by the process above?

    My thoughts are..
    A) In my first system, could I efficiently update HealthCubeVar directly, without changing the values being read through ComponentDataFromEntity<HealthCubeVar>? As I'd be updating something that's in the ComponentDataFromEntity array, to my knowledge it won't work.
    B) Is there an efficient way to copy data from one component to another identical one with limited performance cost?

    I'm not expecting anything positive, since for that number of elements a system that does barely anything takes 1.6ms. But if I can eliminate that and implement the further optimisation ideas I have, I think I can get the whole system down to around 2-2.5ms, at which point, once I sort out audio, my frame rate should stabilise and be fine :).

    Thanks for any help or suggestions anyone has!
     
  2. DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,983
    Unfortunately you didn't give us the right information to help you. Here's what I need to start with if I were to suggest anything useful:
    1) Profiler timeline view screenshot (or saved session I can load into Unity myself) showing the operation you don't like taking so long
    2) The full system code of the system that is taking too long
    3) The definition of all custom types used
    4) Assurance that the dominant archetypes have a chunk capacity of 16 or larger
    5) Some explanation for why you are performing the data transform, as well as frequency of change on the inputs.
     
  3. CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    4,011
    You could do a test replacing the Entities.ForEach with regular job code. The Entities manual has some examples. That may give you more options. And check the Burst compiler window for any hints regarding vectorization etc.
     
  4. mk1987

    Joined:
    Mar 18, 2019
    Posts:
    53
    Thanks for the replies.

    I think I managed to fix it: it looks like it's working and the entire system now runs in 2.7ms, 0.7ms of which I know I can eliminate from another lambda.

    In essence, rather than having HealthCubeVar and HealthCubeVar_Next, where the former is updated from the latter in a separate system, I now have a bool 'tick' in HealthCubeVar, and instead of separate damage and thermal damage variables I have a float4 where xy and zw hold Var and VarNext. The tick is flipped each frame, which swaps which pair is Var and which is VarNext, doing away with the need for the second 'update' system. This uses the [NativeDisableParallelForRestriction] attribute on my ComponentDataFromEntity<HealthCubeVar> field. Could be a useful technique for anyone running a live interconnected system.

    For further improvement I'll check those out, thanks @SteffenItterheim. For now I think it's playable even with an obscene amount of bullets/missiles going on. I'll post a video tomorrow on my showcase for anyone who might be interested :)

    EDIT: To summarise how I improved things, the concept went from:

    • Input - HealthCubeVar component with Temperature and Damage floats
    • ForEach to compare adjacent cubes' temperatures and derive its own via thermal transfer (2ms)
    • Output - HealthCubeVar_Next component (clone of HealthCubeVar)
    • ForEach where HealthCubeVar = HealthCubeVar_Next, ready for the next iteration (1.6ms+)

    The new approach gets rid of the second system and eliminates the need for the Var_Next component. It still reads from and writes to the ComponentDataFromEntity array, and each entity is written to once per frame.

    • Input - HealthCubeVar component with a float4 used for current/future temperature and damage, and a bool to indicate whether the current input is xy or zw (e.g. for this pass the bool is true and xy holds the inputs)
    • ForEach to compare adjacent cubes' temperatures and derive its own via thermal transfer, with the bool flipped each frame (1.6ms)
    • Output - Same HealthCubeVar (in the current example the output goes to zw and the bool is now false, ready for the next frame)

    Since I've reduced the number of components, I got a double whammy: I eliminated an unnecessary system and improved the runtime of the main system. This is a full thermal simulation for 100,000 adjacently connected entities (probably 3-4 adjacent cubes per cube), and the whole simulation needed to run before I could update the thermal state. My only concern is whether writing the new data at the same time as it's being read causes a problem.
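
    For anyone wanting to try the tick idea outside Unity, here is a minimal sketch in plain C#. It is not the actual system from this thread: the Cell struct, the 1D chain of neighbours, the mirrored edges, and the diffusion constant k are all assumptions made for illustration, and Parallel.For merely stands in for a parallel job. Slot A plays the role of xy and slot B the role of zw.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for HealthCubeVar: two temperature slots in one struct.
// Which slot is "current" is decided by a tick flag shared by all cells.
public struct Cell
{
    public float A; // current when tick is true  (like xy)
    public float B; // current when tick is false (like zw)
}

public static class PingPong
{
    // Read only the slot that is current for this pass.
    static float Current(ref Cell c, bool tick) => tick ? c.A : c.B;

    // One diffusion step over a 1D chain (assumed topology).
    // Each cell reads current slots and writes only its own "next" slot.
    public static void Step(Cell[] cells, bool tick, float k)
    {
        int last = cells.Length - 1;
        Parallel.For(0, cells.Length, i =>
        {
            float self = Current(ref cells[i], tick);
            // Edge cells mirror themselves, so no heat leaves the chain.
            float left = i > 0 ? Current(ref cells[i - 1], tick) : self;
            float right = i < last ? Current(ref cells[i + 1], tick) : self;
            float next = self + k * (left + right - 2f * self);
            if (tick) cells[i].B = next; else cells[i].A = next;
        });
    }

    public static void Main()
    {
        var cells = new Cell[3];
        cells[1].A = 100f; // one hot cell in the middle
        bool tick = true;
        for (int frame = 0; frame < 2; frame++)
        {
            Step(cells, tick, 0.25f);
            tick = !tick; // flip once per frame instead of copying Next into Var
        }
        // After an even number of frames, slot A is current again.
        foreach (var c in cells) Console.WriteLine(c.A);
    }
}
```

    In this sketch every pass reads only the current slot and writes only the other one, so no field is read and written during the same pass, as long as all cells use the same tick value for that pass. Whether the same holds in the ECS version depends on how the component is written back, which this thread does not show.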
     
    Last edited: May 9, 2022