Search Unity

LINQ vs for(...) loop performance

Discussion in 'Scripting' started by mainhuochoa, Jul 19, 2019.

  1. mainhuochoa

    mainhuochoa

    Joined:
    Jun 29, 2015
    Posts:
    7
    I have a terrain with 30k tree. I want to copy from one to another. I've used for(...) loop and it took ~21s. When i use LINQ, it only took ~7ms.
    My code:
    Code (csharp):
    1.  
    2. void LINQFind()
    3.     {
    4.         Debug.Log("linq find start.");
    5.         sw.Stop();
    6.         sw.Start();
    7.         var trees = terrain.terrainData.treeInstances.Where(v => v.position.x > x0 && v.position.z > z0);
    8.         newTerrain.terrainData.treeInstances = trees.ToArray();
    9.         sw.Stop();
    10.         Debug.Log("linq find time: " + sw.ElapsedMilliseconds);
    11.     }
    12.  
    13.     void ForFind()
    14.     {
    15.         Debug.Log("for find start");
    16.         sw.Stop();
    17.         sw.Start();
    18.         for (int i = 0; i < terrain.terrainData.treeInstanceCount; i++)
    19.         {
    20.             TreeInstance tree = terrain.terrainData.treeInstances[i];
    21.             if (tree.position.x > x0 && tree.position.z > z0)
    22.             {
    23.  
    24.             }
    25.         }
    26.         sw.Stop();
    27.         Debug.Log("for find time: " + sw.ElapsedMilliseconds);
    28.     }
    29.  
    And the result:
    upload_2019-7-19_9-47-59.png
    I'm surprised that LINQ is much faster than for(...) loop. Am i wrong?
    Sorry for my bad English.
     
    Last edited: Jul 19, 2019
  2. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    38,751
    Is that code really equivalent? I don't Linq but I do for(), and as far as I can tell you're not even indexing anything with that i value, and all you are doing is repeatedly asking for the entire set of tree instances, which is probably where your time is being spent.

    What kind of timing do you get when you move this line out of the for loop, since it seems invariant:

    Code (csharp):
    1. TreeInstance tree = terrain.terrainData.treeInstances;
    and do it before the for loop?

    From my reading of the for() code, you are 30k times asking for the terrain subsystem to make you a huge list of 30k trees.

    Edit: Actually, how does the above code even compile? That .treeInstances property returns an array of TreeInstances, not one TreeInstance... if you mean to dereference it, then GET the list once into an array, then dereference the array, don't bang on the engine asking it to make the same 30k list again and again. That's just basic caching smarts.

    Also, please format your code according to the first top post in this forum.
     
    mainhuochoa likes this.
  3. mainhuochoa

    mainhuochoa

    Joined:
    Jun 29, 2015
    Posts:
    7
    Code (csharp):
    1. TreeInstance tree = terrain.terrainData.treeInstances;
    It's my copy/paste error.
    When i use
    Code (csharp):
    1. TreeInstance tree = terrain.terrainData.treeInstances[i];
    it still took ~20s, but when i change to:
    Code (csharp):
    1.  
    2. var trees = terrain.terrainData.treeInstances;
    3.         for (int i = 0; i < terrain.terrainData.treeInstanceCount; i++)
    4.         {
    5.             TreeInstance tree = trees[i];
    6.             if (tree.position.x > x0 && tree.position.z > z0)
    7.             {
    8.  
    9.             }
    10.         }
    11.  
    it only took 9ms.

    Thank you very much. Now it's equal.
     
    Last edited: Jul 19, 2019
  4. mainhuochoa likes this.
  5. Nope.
    Code (CSharp):
    1. TreeInstance tree = terrain.terrainData.treeInstances;
    ->
    Code (CSharp):
    1. TreeInstance tree = terrain.terrainData.treeInstances[i];
     
  6. mainhuochoa

    mainhuochoa

    Joined:
    Jun 29, 2015
    Posts:
    7
    Please see my edited.
     
  7. And you still get the 7ms vs 21 ms and you get the same results in both cases?
     
    mainhuochoa likes this.
  8. mainhuochoa

    mainhuochoa

    Joined:
    Jun 29, 2015
    Posts:
    7
    Yes, it's only change to ~9ms when i declare an array outside for(...) loop and use it instead of direct access to treeInstances.
     
  9. bobisgod234

    bobisgod234

    Joined:
    Nov 15, 2016
    Posts:
    1,042
    I would assume then that treeInstances is a getter that is creating a new array each time you access it.
     
    Art-Leaping and Lurking-Ninja like this.
  10. What happens if you do this?
    Code (CSharp):
    1. var trees = terrain.terrainData.treeInstances;
    2. int length = terrain.terrainData.treeInstanceCount;
    3. TreeInstance tree;
    4.         for (int i = 0; i < length; i++)
    5.         {
    6.             tree = trees[i];
    7.             if (tree.position.x > x0 && tree.position.z > z0)
    8.             {
    9.             }
    10.         }
     
  11. mainhuochoa

    mainhuochoa

    Joined:
    Jun 29, 2015
    Posts:
    7
    Seems like that. I'm quite surprised to this.
     
  12. mainhuochoa

    mainhuochoa

    Joined:
    Jun 29, 2015
    Posts:
    7
    7ms. Great!
    But if i use:
    Code (csharp):
    1.  
    2. var trees = terrain.terrainData.treeInstances;
    3.         int length = terrain.terrainData.treeInstanceCount;
    4.         TreeInstance tree;
    5.         for (int i = 0; i < terrain.terrainData.treeInstances.Length; i++)
    6.         {
    7.             tree = trees[i];
    8.             if (tree.position.x > x0 && tree.position.z > z0)
    9.             {
    10.             }
    11.         }
    12.  
    It took ~20s to complete.
     
  13. bobisgod234

    bobisgod234

    Joined:
    Nov 15, 2016
    Posts:
    1,042
    That's because each loop iteration, the length of treeInstances is reevaluated, and to do that, it needs to call the getter.
     
    mainhuochoa likes this.
  14. It's a good habit to not to use arbitrary length functions in a tight for loop.
    In this case I suspect (I do not know for sure, I'm not an expert in terrain at all), that either the terrain or the terrainData either copies over data every access or reaching to the C++ side for the data.
    This means that the Linq solution copies your array once and work on that, but your for cycle is reaching into the C++ side on every iteration.

    Or something like that. As I said, what I wrote here take with a grain of salt. But something like this happens. And honestly right now I don't feel inclined to dig into the nitty-gritty of the terrain system and data-structures. :)
     
    MadeFromPolygons and mainhuochoa like this.
  15. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    38,751
    This is all just a case of "lift as much stuff all out of the inner loop as possible," ie. Optimization 101.

    This is especially important when you are calling an API whose contract does not specify how performant it is. Basic engineering tells you if you don't know how expensive something is, then you call/use it as little as possible, because while it might be "performant enough" in your testing, if the API specification doesn't tell you "it will always be this fast," (and it doesn't), then you don't know if in the future it will suddenly become even less performant, or perform poorly on one platform or another, for instance.
     
    MadeFromPolygons and Ryiah like this.
  16. Baste

    Baste

    Joined:
    Jan 24, 2013
    Posts:
    6,338
    Yeah, no, Unity's API is full of traps like this. If you're asking Unity for any array, you're pretty much guaranteed that it's creating a new array, and filling it with all of the data, every time.

    There's usually a workaround available. In this case, you're meant to use treeInstanceCount and GetTreeInstance.
     
    MadeFromPolygons and halley like this.
  17. halley

    halley

    Joined:
    Aug 26, 2013
    Posts:
    2,445
    The entire whole POINT of C# "properties" is to hide the dead bodies. It's a getter which you didn't even know was a getter. It's a setter which can spend O(n^3+log n) to set many other things in parallel. I shudder at the work that's hidden whenever you change a
    Transform.localPosition
    value; a whole transform stack can get updated or flagged for lazy updates.
     
  18. MadeFromPolygons

    MadeFromPolygons

    Joined:
    Oct 5, 2013
    Posts:
    3,983
    ECS can alieviate some of that, at the cost of prototyping speed and general usability though
     
  19. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,620
    Unity Technologies and Microsoft, among others, recommend to avoid LINQ where performance is a concern.

    https://learn.unity.com/tutorial/fixing-performance-problems

    https://docs.microsoft.com/en-us/windows/mixed-reality/performance-recommendations-for-unity
     
    marcospgp, AnaLuyza, Novack and 4 others like this.
  20. MadeFromPolygons

    MadeFromPolygons

    Joined:
    Oct 5, 2013
    Posts:
    3,983
    Building on what @Peter77 has said, I tend to only use LINQ for stuff that isnt constantly running, such as setting up stuff at the startup of program or initial setup of a scene. I would never use it in the place of an iterating structure in anything that will be happening once the scene is properly running and updating.

    Generally though you never really need LINQ, it mostly just makes writing the code easier at cost of perf.
     
    Kurt-Dekker likes this.
  21. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    38,751
    The other massive and sometimes hidden cost of Linq is readability/maintainability.

    Now you might be clever enough to read your own Linq constructs, that is if you use it a lot, but when I receive some other engineer's "Look! I'm clever and I put everything in one giant Linq statement chain!" code nightmare and there is a bug filed against it, I can't put a breakpoint anywhere in there and see WTF is going on.

    I have wasted so much time rewriting and unrolling Linq code just to debug someone else's mis-implementation or incorrect underlying data assumption, Linq is by FAR a net negative in my workspace. It has cost our company a fortune in lost time and productivity due to allowing truly abysmal code discipline.

    On the other hand, if the code was just a series of loops with intermediate containers, I could instantly breakpoint at each stage and identify where the problem lies.

    Okay, rant mode is OFF now! Good luck with your Linq-ing!
     
  22. Vryken

    Vryken

    Joined:
    Jan 23, 2018
    Posts:
    2,106
    Totally agree. If i'm going to use LINQ, I typically aim for using it in cases where I only need to run the statement once or infrequently, and I can get the result I need in only one statement.
    If I must use multiple LINQ statements to get the result I need, I separate each statement onto a new line.