Vector3 Operations performance

Discussion in 'Editor & General Support' started by Stephan-B, Sep 7, 2011.

  1. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    I did some testing on Vector3 operations (subtraction and addition) and found something interesting that I don't understand. (BTW: I have a bad toothache and didn't feel like doing any serious work, so I was just poking around.)

    Code (csharp):
    IEnumerator VectorOperations()
    {
        Vector3 position = new Vector3(1, 1, 1);
        Vector3 origin = new Vector3(0, 0, 0);

        while (true)
        {
            for (int i = 0; i < 25000; i++)
            {
                Vector3 addition = origin + position;
                Vector3 subtraction = origin - position;

                Vector3 addition2 = new Vector3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
                Vector3 subtraction2 = new Vector3(origin.x - position.x, origin.y - position.y, origin.z - position.z);
            }

            yield return null;
        }
    }
    The addition and subtraction operations each took about 8.5 ms to complete in the loop, while the longer-form versions took only about 1.85 ms each. Why are simple vector operations almost 5 times slower in the first form?

    I realize that I have to loop 25,000 times to see these numbers, which is unrealistic. Still, it shows a huge performance difference between these two ways of doing simple vector operations.
     
    Last edited: Sep 7, 2011
  2. npsf3000

    npsf3000

    Joined:
    Sep 19, 2010
    Posts:
    3,830
    At 25k iterations there was no measurable difference, but at 250k there is one.

    I'm thinking that there are checks for invalid operations? Or maybe it's something to do with the way structs work?
     
    Last edited: Sep 7, 2011
  3. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    On my PC, at 25K iterations, the short form is about 4.5 times slower. For subtraction it is 8.5 ms vs. 1.85 ms, which is a pretty big gap.

    Q6600 running at 2.8 GHz
     
  4. hippocoder

    hippocoder

    Digital Ape

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    On mobile, these numbers need to be considered for larger collections too.
     
  5. MrBurns

    MrBurns

    Joined:
    Aug 16, 2011
    Posts:
    378
    Unity 3.4, Intel Core i5 Mobile (slower than Q6600):

    The script below, attached to the main camera of an empty scene, yields 120 ms for the first two operations (operator form) and 60 ms for the second two (long form).
    With 5,000,000 operations per run, that makes 12-24 nanoseconds for adding two vectors... (quite a bit slow)
    Frankly, I am not sure what the reason is, and I don't have time to investigate this. But usually this should be up to 3 orders of magnitude faster.

    using UnityEngine;
    using System.Collections;
    using System.Diagnostics;

    public class NewBehaviourScript : MonoBehaviour {

        // Use this for initialization
        void Start () {
            Stopwatch watch = new Stopwatch();
            watch.Start();
            Vector3 position = new Vector3(1, 1, 1);
            Vector3 origin = new Vector3(0, 0, 0);

            for (int i = 0; i < 2500000; i++)
            {
                //Vector3 addition = origin + position;
                //Vector3 subtraction = origin - position;

                Vector3 addition2 = new Vector3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
                Vector3 subtraction2 = new Vector3(origin.x - position.x, origin.y - position.y, origin.z - position.z);
            }

            watch.Stop();
            UnityEngine.Debug.Log(watch.Elapsed.TotalMilliseconds);
        }

        // Update is called once per frame
        void Update () {
        }
    }
     
    Last edited: Sep 7, 2011
  6. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    I just tested a manual dot product vs. Vector3.Dot in the same coroutine and got 0.18 ms vs. 5.0 ms...

    Code (csharp):
    dotProduct = (forward.x * direction.x) + (forward.y * direction.y) + (forward.z * direction.z);
    //dotProduct = Vector3.Dot(forward, direction);
    BTW: all my numbers come from Deep Profiling, which obviously makes them all look worse overall. However, their relative differences should remain constant (correct?)
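For anyone who wants to try the two dot product forms outside Unity, here is a minimal sketch. Vec3 and DotDemo are made-up stand-ins (not Unity types), and this only checks that both forms compute the same value. It says nothing about which one is faster on a given runtime.

```csharp
using System;

// Hypothetical stand-in for UnityEngine.Vector3 so the sketch runs outside Unity.
public struct Vec3
{
    public float x, y, z;

    public Vec3(float x, float y, float z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Method form: one static call that receives both structs by value.
    public static float Dot(Vec3 a, Vec3 b)
    {
        return a.x * b.x + a.y * b.y + a.z * b.z;
    }
}

public static class DotDemo
{
    // Computes the dot product both ways and reports whether the results match.
    public static bool FormsAgree(Vec3 forward, Vec3 direction)
    {
        // Method form.
        float viaMethod = Vec3.Dot(forward, direction);

        // Manual form: the same arithmetic written out at the call site.
        float manual = (forward.x * direction.x) + (forward.y * direction.y) + (forward.z * direction.z);

        return viaMethod == manual;
    }
}
```

Calling DotDemo.FormsAgree(new Vec3(0, 0, 1), new Vec3(1, 2, 3)) returns true, so any timing difference between the two forms comes from how the call is made, not from the math.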
     
    Last edited: Sep 7, 2011
  7. MrBurns

    MrBurns

    Joined:
    Aug 16, 2011
    Posts:
    378
    Use the code I wrote ^^. It will measure with around nanosecond precision (at least on a recent computer, and hopefully on Mono too)...

    And never use a profiler for such measurements... You won't get any useful information. A profiler is meant to detect hotspots and takes measures to neutralize its own impact on performance measurements. If you do your own measurements in profiling sessions, they will most likely be useless... In this case I suspect that the profiler "hooks" into the struct constructor for memory allocation, and this is why it takes so much longer for you. For me, the way through constructors is twice as fast as the way without them...
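A slightly more defensive variant of the Stopwatch approach could look like the sketch below. MicroBench and BestOfMs are names invented for this example, and the warm-up run plus "take the fastest of several runs" policy is just one common convention, not something taken from the script above.

```csharp
using System;
using System.Diagnostics;

public static class MicroBench
{
    // Runs `body` once untimed as a warm-up (so first-call JIT compilation is not measured),
    // then returns the fastest of `runs` timed executions, in milliseconds.
    public static double BestOfMs(Action body, int runs)
    {
        if (body == null) throw new ArgumentNullException("body");
        if (runs < 1) throw new ArgumentOutOfRangeException("runs");

        body(); // warm-up, not timed

        double best = double.MaxValue;
        for (int r = 0; r < runs; r++)
        {
            Stopwatch watch = Stopwatch.StartNew();
            body();
            watch.Stop();
            best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
        }
        return best;
    }
}
```

The whole 2,500,000-iteration loop would go inside the delegate, so the delegate call is paid once per run rather than once per iteration. It is probably also worth accumulating the results into a variable that is read after the loop, since a compiler or JIT is in principle free to drop work whose result is never used.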
     
    Last edited: Sep 7, 2011
  8. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    I am using that now :)

    I now get 52.5094 ms using (long form) addition and subtraction vs. 114.5767 ms using the (short form), so the gap is not as bad as it was with deep profiling, but it is still half the speed. Granted, it takes a lot of iterations to get that, which puts things back in perspective.

    I knew the numbers from Deep Profiling would be worse, but I never expected the relationship between them to be different. I always thought that if the profiler showed one function running at half the speed of another, this would be consistent.
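To put the two sets of numbers from this thread side by side, here is a small arithmetic sketch. TimingMath and its methods are invented for illustration. It assumes the deep profiling figures (8.5 ms and 1.85 ms) are per operation over 25,000 iterations, and that each Stopwatch run covers 2,500,000 iterations of two operations, so 5,000,000 operations.

```csharp
using System;

public static class TimingMath
{
    // Converts a total time in milliseconds into nanoseconds per single operation.
    public static double NanosecondsPerOp(double totalMs, long operations)
    {
        return totalMs * 1000000.0 / operations;
    }

    // How many times slower the slow measurement is than the fast one.
    public static double Ratio(double slowMs, double fastMs)
    {
        return slowMs / fastMs;
    }
}
```

With the numbers quoted above, deep profiling works out to about 340 ns vs. 74 ns per operation (a ratio of roughly 4.6), while the Stopwatch runs come to about 22.9 ns vs. 10.5 ns (a ratio of roughly 2.2). So the absolute cost and the ratio both changed between the two measurement methods.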
     
  9. MrBurns

    MrBurns

    Joined:
    Aug 16, 2011
    Posts:
    378
    Profilers are usually bad at measuring the speed of fast functions. This is why the more advanced ones will filter them out at runtime and suggest that you add them to an ignore list. Profiling fast functions might, as you have now discovered, not only lead to wrong results but also heavily reduce performance during profiling (as you can notice in Unity), and potentially invalidate the results of slower functions that call these profiled fast functions...

    So the bottom line is that profiling is intended for larger functions, where the profiling overhead is far outweighed by the execution time of the function itself.
     
  10. npsf3000

    npsf3000

    Joined:
    Sep 19, 2010
    Posts:
    3,830
    Q6600 @ 2.4 GHz and both ran at ~1 ms.
     
  12. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    Using MrBurns' Stopwatch method, I still see a big difference, although not as significant...

    For instance:

    Code (csharp):
    for (int i = 0; i < 2500000; i++)
    {
        // These two get done in 116.8739ms
        //Vector3 addition = origin + position;
        //Vector3 subtraction = origin - position;

        // These two get done in 53.2564ms
        Vector3 addition2 = new Vector3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
        Vector3 subtraction2 = new Vector3(origin.x - position.x, origin.y - position.y, origin.z - position.z);
    }
    So manually adding or subtracting vectors is still twice as fast. The same is true for dot products and every other vector operation I have tested.
     
  13. adaman

    adaman

    Joined:
    Jul 5, 2011
    Posts:
    97
    Calling a function or using overloaded operators is going to be slower than using inline code.
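For what it's worth, the C# compiler does turn an overloaded operator into an ordinary static method named op_Addition, which can be seen through reflection. The sketch below uses a made-up SketchVector struct (not Unity's Vector3) and a made-up OperatorDemo class. Whether a particular runtime then inlines that call is up to its JIT, and this sketch does not show that.

```csharp
using System;
using System.Reflection;

// Hypothetical stand-in for UnityEngine.Vector3 so the sketch runs outside Unity.
public struct SketchVector
{
    public float x, y, z;

    public SketchVector(float x, float y, float z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Short form: the C# compiler emits this as a static method named op_Addition.
    public static SketchVector operator +(SketchVector a, SketchVector b)
    {
        return new SketchVector(a.x + b.x, a.y + b.y, a.z + b.z);
    }
}

public static class OperatorDemo
{
    // True if the overloaded + is visible through reflection as a static method.
    public static bool OperatorIsStaticMethod()
    {
        MethodInfo method = typeof(SketchVector).GetMethod(
            "op_Addition", BindingFlags.Public | BindingFlags.Static);
        return method != null && method.IsSpecialName;
    }

    // Checks that the short form and the long form produce the same components.
    public static bool FormsAgree(SketchVector a, SketchVector b)
    {
        SketchVector shortForm = a + b;
        SketchVector longForm = new SketchVector(a.x + b.x, a.y + b.y, a.z + b.z);
        return shortForm.x == longForm.x
            && shortForm.y == longForm.y
            && shortForm.z == longForm.z;
    }
}
```

So the short form goes through one extra method call (the operator) before it reaches the constructor, while the long form calls the constructor directly. If that call is not inlined, it is at least a plausible source of the difference measured in this thread.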
     
  14. MrBurns

    MrBurns

    Joined:
    Aug 16, 2011
    Posts:
    378
    This is a managed environment ^^. That is a very vague assumption without knowing how Mono does it. It is possible to optimize this; that is one great thing about managed environments, called "Runtime Optimization", but I am not quite up to date on how much of it they already use.
     
  15. npsf3000

    npsf3000

    Joined:
    Sep 19, 2010
    Posts:
    3,830
    Read my earlier post. I also noted a difference - and FYI my code was better than Mr. Burns' :p
     
  16. Stephan-B

    Stephan-B

    Joined:
    Feb 23, 2011
    Posts:
    2,269
    My apologies, but I only referenced Mr. Burns because I am employed at his nuclear plant in Springfield and didn't want to suffer his wrath. And yes, you did note those differences as well.
     
  17. npsf3000

    npsf3000

    Joined:
    Sep 19, 2010
    Posts:
    3,830
    Really? I've heard a lot about your safety inspector.
     
  18. MrBurns

    MrBurns

    Joined:
    Aug 16, 2011
    Posts:
    378
    @NPSF3000: Not that it bothers me much but which improvements do you have for the stopwatch ;)?
     
  20. Aka_ToolBuddy

    Aka_ToolBuddy

    Joined:
    Feb 25, 2014
    Posts:
    536
    Someone who was looking for such optimizations sent me a list of all the threads he looked at before finding mine.