Help me with a performance bottleneck; lots of calls to a single function

Discussion in 'Scripting' started by Nanako, Dec 3, 2014.

  1. Nanako

    Joined: Sep 24, 2014 | Posts: 1,047

    Hi all, I'm currently hitting a bottleneck. Some context:

    I'm presently writing a culling manager. Its purpose is to track hundreds (maybe thousands) of small physics objects and take actions against them to ensure the game's framerate remains consistent: stuff like altering their collision filter or collider shape, or just plain destroying them if needed. I'm not that far in yet though.

    Right now I'm at the registration stage. All these little objects register themselves with the culling manager whenever they become dynamic, and I'm hitting a bottleneck here: when a few hundred objects are made dynamic simultaneously, the game freezes for almost 3 seconds, which isn't acceptable.

    The registration function is a public method on the culling manager. It is as follows:

    Code (CSharp):
    public void Register(FracturedChunk chunk)
    {
        timer.Reset();
        timer.Start();

        ChunkRecord temp = new ChunkRecord(0, chunk);
        temp.priority = CalculatePriority(temp);
        debris.Add(temp);

        timer.Stop();
        totalTimer.Stop();
        //print("Registration done, time taken: " + timer.Elapsed);
    }

    debris is a generic List, nothing special there. ChunkRecord is a struct, not a class; I chose a struct in the belief that it would help performance. It holds an integer and a reference to a script (of the FracturedChunk class).

    CalculatePriority is currently a stub. Some math is done here to calculate an initial priority value. Right now it's very simple, but I'll be adding some more number crunching to it in the future, which will only make this problem worse, not better:

    Code (CSharp):
    int CalculatePriority(ChunkRecord input)
    {
        float temp;
        temp = input.chunk.Volume * multiplierVolume;
        int newPriority = Mathf.FloorToInt(temp);
        return newPriority;
    }

    To be honest I'm not entirely sure why I made it an int; that's another assumption about speed that may or may not be true. It's calculated internally as a float and floored to an int before being returned. I'll be adding a fair bit more math before that return step in the future.

    Maybe these functions can be further optimised, but I'm thinking some broader solution is needed here. The problem isn't that it's slow, it's that I'm calling it 500 times almost simultaneously. I need to redesign my code to fix that, but I DO still need to call it a whole lot in a short space of time.


    I think the ideal solution, if possible, would be to allow all the Register calls to pile up, like an inbox, and let the cullingManager work through them at its own pace. That plus maybe limiting it to some maximum amount of time spent processing them per frame, before yielding and waiting a frame?
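
    A minimal sketch of the inbox idea described above, written as plain C# so it runs outside Unity. Everything here is an assumption for illustration: `CullingInbox`, `ProcessPending`, `budgetMilliseconds` and the stand-in `FracturedChunk` and `ChunkRecord` types are hypothetical, not the thread's real classes. `Register` only enqueues, and `ProcessPending` drains the queue until a per-frame time budget runs out.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-in: the real FracturedChunk is a Unity script.
public class FracturedChunk
{
    public float Volume;
}

// Hypothetical stand-in for the thread's ChunkRecord struct.
public struct ChunkRecord
{
    public int priority;
    public FracturedChunk chunk;

    public ChunkRecord(int priority, FracturedChunk chunk)
    {
        this.priority = priority;
        this.chunk = chunk;
    }
}

public class CullingInbox
{
    private readonly Queue<FracturedChunk> pending = new Queue<FracturedChunk>();
    private readonly List<ChunkRecord> debris = new List<ChunkRecord>();
    private readonly Stopwatch budgetTimer = new Stopwatch();

    public float multiplierVolume = 1f;      // assumed tuning value
    public double budgetMilliseconds = 2.0;  // assumed per-frame time budget

    public int PendingCount { get { return pending.Count; } }
    public int RegisteredCount { get { return debris.Count; } }

    // Cheap for the caller: the chunk is only queued, not processed.
    public void Register(FracturedChunk chunk)
    {
        pending.Enqueue(chunk);
    }

    // Intended to be called once per frame (e.g. from Update()).
    // Always handles at least one queued chunk, then stops once the
    // budget is used up. Returns how many chunks were processed.
    public int ProcessPending()
    {
        int processed = 0;
        budgetTimer.Reset();
        budgetTimer.Start();

        while (pending.Count > 0)
        {
            FracturedChunk chunk = pending.Dequeue();
            int priority = (int)Math.Floor(chunk.Volume * multiplierVolume);
            debris.Add(new ChunkRecord(priority, chunk));
            processed++;

            if (budgetTimer.Elapsed.TotalMilliseconds >= budgetMilliseconds)
            {
                break;
            }
        }

        budgetTimer.Stop();
        return processed;
    }
}
```

    In a MonoBehaviour this would presumably be driven by calling `ProcessPending()` from `Update()`. Processing at least one item per call is a deliberate choice so the queue always makes progress even with a very small budget.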

    Performance is the primary concern here. Whatever solution I implement, its first and foremost goal has to be to not make the game freeze up; this is happening at runtime during an action game. Processing all the registrations quickly is a high priority too, but it takes second place to maintaining a stable framerate.
     
    Last edited: Dec 3, 2014
  2. Nanako

    I'm presently looking into threads; this is a bit new to me. I'm not sure if it will help though, since there still seems to be a single point of load in the culling manager. I need some guidance here.
     
  3. Niiro

    Joined: Dec 3, 2014 | Posts: 1

    I think part of the performance cost comes from redundant work and the overhead of the extra method call.

    I'd also recommend changing your struct to a class. I really don't think the struct gives you a performance gain, and in my experience object references inside structs tend to do weird things and aren't recommended.

    Code (CSharp):
    public static void RegisterOpt(FracturedChunk chunk)
    {
        //timer.Reset();
        //timer.Start();

        debris.Add(new ChunkRecord(Mathf.FloorToInt(chunk.Mass * multiplierVolume), chunk));

        //timer.Stop();
    }

    I haven't been able to recreate any bottleneck issues on pure primitive objects. (I don't know what a Fractured Chunk contains, so I just filled it with primitives with randomly generated data) When running the code, I don't hit a second in the timer until I hit 25,000 Fractured Chunks with 400+ primitives inside each chunk.

    I don't think anything is wrong with your code, because even shortening it to one line only optimizes it to maybe ~25-33% faster processing times.

    I think the bottleneck is more related to when objects are made dynamic and not the registration process. Unless using structs is causing an issue somewhere.

    I don't know Unity very well, but from a C# standpoint there's nothing super crazy wrong with the code you're using currently.
     
  4. MakeCodeNow

    Joined: Feb 14, 2014 | Posts: 1,246

    Just wanted to suggest that while you can probably make your code go faster, I'm really skeptical that this path will be a net performance improvement. Much of what you are doing is already done within the core of the PhysX engine. Even once you get the main registration faster, you'll find that calls to change layers and activate and deactivate objects cost quite a lot of CPU.

    Threads probably won't save you either, for a few reasons. The first is that there are scheduling issues in Unity where threads may be suspended for ~100ms without warning. That's ok for background workers but really not good for systems that need to complete within a frame. Even without that, you can't call any Unity API from threaded code, which makes things even more difficult to write in a way that's performant.
     
  5. Stoven

    Joined: Jul 28, 2014 | Posts: 171

    What type of collider(s) are attached to the chunks? Maybe the physics engine is recalculating each Rigidbody's colliders separately as it notices each one has changed. That could result in 500 * CollidersInScene recalculations: the objects already in the scene need to know whether they're colliding with the newly dynamic objects, and the dynamic colliders need to know whether they're colliding with anything else. The physics engine isn't aware of any of this until it is asked to watch and control each Rigidbody and the colliders it contains.

    I don't know how many Colliders are in your scene, but I'm assuming this is the potential problem and that you may have to split up setting the isKinematic value of fragments across different Updates.

    I don't think that using a different thread will be a valid solution if setting a ton of Rigidbody's isKinematic during runtime is causing the slowdown. Using a Coroutine or simply using a timer in Update and splitting up the job to a small number of isKinematic = false assignments per Cycle might be better.
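
    The "small number per cycle" suggestion above can be sketched with a plain C# iterator, which is the same shape a Unity coroutine takes. `BatchedActivation` and `ActivateInBatches` are hypothetical names, and the `activate` callback stands in for whatever per-object work is being spread out (for example setting `isKinematic = false`). This is an illustration under those assumptions, not code from the thread.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class BatchedActivation
{
    // Applies 'activate' to every item, pausing after each full batch.
    // When run as a Unity coroutine, each 'yield return null' resumes
    // on the next frame, so at most 'batchSize' items are handled per frame.
    // Note: as an iterator, the argument check runs on the first MoveNext().
    public static IEnumerator ActivateInBatches<T>(IList<T> items, int batchSize, Action<T> activate)
    {
        if (batchSize < 1)
        {
            throw new ArgumentOutOfRangeException("batchSize");
        }

        for (int i = 0; i < items.Count; i++)
        {
            activate(items[i]);

            bool endOfBatch = (i + 1) % batchSize == 0;
            bool moreToDo = i + 1 < items.Count;
            if (endOfBatch && moreToDo)
            {
                yield return null;
            }
        }
    }
}
```

    In Unity, something like `StartCoroutine(BatchedActivation.ActivateInBatches(bodies, 20, rb => rb.isKinematic = false))` would be the assumed usage, where `bodies` is a list of Rigidbody references and 20 is an arbitrary batch size to tune.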
     
  6. Nanako

    I swear the registration is the problem here. Things run pretty flawlessly if I just comment out the call to Register (well, as smooth as they can with lots of objects, which is to say slow, but not frozen). There is No Freezing without the registration call.

    Not more than the CPU cost of thousands of intercollisions, or the cost of leaving an object for several minutes, I'd wager. Besides that, this is a tool; I will tweak it to suit the purpose. It's for rendering load and player mobility around the level as much as for reducing physics load. I don't need design advice in that regard. I am doing this.
    Just help me make it work :p
     
  7. Nanako

    none of this is the problem.

    With 100 objects and no registration, for example, there's a noticeable dip in the framerate: that frame takes 0.1 seconds to process, which is kind of slow but mostly acceptable.

    With the registration code turned back on and those same 100 objects, the game freezes for just over 1 second. That's from commenting and uncommenting the call to Register, and nothing else.
    This is completely reproducible; I just tried it with 3 separate sets of objects in the same run session.
     
  8. Nanako

    Is there anything in my code that might trigger aggressive garbage collection? That's one possibility I could think of.
     
  9. Stoven

    Have you tried increasing the Capacity of the List before adding elements to it?

    I have my doubts that the resizing is the problem, but that's one optimization you can perform (setting the List Capacity so it doesn't need to resize very often initially).
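
    To illustrate the capacity point, here is a small plain C# sketch that counts how often a `List<int>` grows its backing array while items are added. `CapacityDemo` and `CountResizes` are hypothetical helper names made up for this example. A list constructed with enough capacity up front should not need to grow at all.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds 'count' items and returns how many times the list's
    // Capacity changed along the way (each change is a reallocation
    // and copy of the backing array).
    public static int CountResizes(List<int> list, int count)
    {
        int resizes = 0;
        int lastCapacity = list.Capacity;

        for (int i = 0; i < count; i++)
        {
            list.Add(i);
            if (list.Capacity != lastCapacity)
            {
                resizes++;
                lastCapacity = list.Capacity;
            }
        }

        return resizes;
    }
}
```

    Comparing `CapacityDemo.CountResizes(new List<int>(), 500)` with `CapacityDemo.CountResizes(new List<int>(500), 500)` shows the difference: the second call never resizes. For a few hundred small records the saving is tiny, which fits the doubt expressed above.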
     
  10. Nanako

    this is interesting. did you use a struct in this example?

    i'll try changing it to a class and see if that does anything
     
  11. Nanako

    So far I've tried turning ChunkRecord into a class and setting the list capacity to 500 in Awake; neither seems to have affected anything.
     
  12. Nanako

    Some more findings; i'm running some benchmarks with the stopwatch class, and they don't seem to show the freezing at all

    Code (CSharp):
    public void Register(FracturedChunk chunk)
    {
        timer.Reset();
        timer.Start();
        totalTimer.Start();

        ChunkRecord temp = new ChunkRecord(0, chunk);
        temp.priority = CalculatePriority(temp);
        debris.Add(temp);

        timer.Stop();
        totalTimer.Stop();
        print("Registration done, time taken: " + timer.Elapsed);
    }

    I don't reset the totalTimer between registrations, and I have it read out once per second in a coroutine.




    The very first registration takes some 50 times as long as all the following ones; what's going on there?

    However in any case, the total time doesn't even approach a thousandth of a second. And yet the thing is clearly freezing for over a second just from the registration process. What gives ?


    I've also tried outputting deltaTime every frame in Update(), and that didn't show much apparent slowdown at all. Does that mean the entire engine is freezing up, or otherwise just not running, for that period, if it's apparently not recording times?
     
    Last edited: Dec 3, 2014
  13. Stoven

    Where in code is Register being used? How is the List being used? Are you doing something to the List each Register call, for example (besides what's in the function*)?
     
  14. Nanako

    Oh my god. You guys are useless :p

    and so am i. :(

    I've finally figured it out. It was the PRINT statement. Seriously, the outputting of debug messages is what's slowing it down. I remove that and it works perfectly. How did nobody catch that?
     
    Niiro likes this.
  15. Stoven

    I assumed you were commenting it out each test. Your original post does show the print statement commented out.
     
    Nanako likes this.
  16. Nanako

    You're right, it does. I edited what I posted, compared to what I had in code, for neatness >.<

    For some reason I assumed that debug messages were irrelevant in terms of speed.
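
    One way to keep the timing without paying for a log line on every call is to accumulate measurements and report a single summary, and to put any per-call message behind a compile-time switch. The sketch below is plain C# with hypothetical names (`RegistrationStats`, `LogVerbose`, and the `VERBOSE_REGISTRATION` symbol are all made up for illustration). In Unity the `Console.WriteLine` call would presumably be `Debug.Log` instead.

```csharp
using System;
using System.Diagnostics;

public class RegistrationStats
{
    private readonly Stopwatch totalTimer = new Stopwatch();
    private int count;

    public int Count { get { return count; } }
    public TimeSpan TotalElapsed { get { return totalTimer.Elapsed; } }

    // Times one unit of work and adds it to the running total.
    // No output happens here, so the measurement itself stays cheap.
    public void Measure(Action work)
    {
        totalTimer.Start();
        work();
        totalTimer.Stop();
        count++;
    }

    // Builds one summary line; log this occasionally, not per call.
    public string Summary()
    {
        return count + " registrations, total time: " + totalTimer.Elapsed;
    }

    // Calls to this method (including evaluation of their arguments)
    // are removed by the compiler unless VERBOSE_REGISTRATION is defined.
    [Conditional("VERBOSE_REGISTRATION")]
    public static void LogVerbose(string message)
    {
        Console.WriteLine(message);
    }
}
```

    The design choice here is simply to separate measuring from reporting, so a burst of 500 registrations produces one log line rather than 500.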
     
    Niiro and Stoven like this.