# Vector3 Operations performance

Discussion in 'Editor & General Support' started by Stephan-B, Sep 7, 2011.

### Unity Technologies

Joined:
Feb 23, 2011
Posts:
2,269
I did some testing on Vector3 operations (subtraction and addition) and found something interesting that I don't understand. (By the way, I have a bad toothache and didn't feel like doing any serious work, so I was just poking around.)

Code (csharp):

    IEnumerator VectorOperations()
    {
        Vector3 position = new Vector3(1, 1, 1);
        Vector3 origin = new Vector3(0, 0, 0);

        while (true)
        {
            for (int i = 0; i < 25000; i++)
            {
                // Short form: overloaded Vector3 operators.
                Vector3 addition = origin + position;
                Vector3 subtraction = origin - position;

                // Long form: component-wise arithmetic passed to the constructor.
                Vector3 addition2 = new Vector3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
                Vector3 subtraction2 = new Vector3(origin.x - position.x, origin.y - position.y, origin.z - position.z);
            }

            yield return null;
        }
    }

The addition and subtraction operations each took about 8.5 ms to complete in the loop, while the longer-form versions only took about 1.85 ms each. Why are simple vector operations almost 5 times slower in the first form?

I realize that looping 25,000 times just to see these numbers is unrealistic. However, it does show a huge performance difference between these two ways of doing simple vector operations.

Last edited: Sep 7, 2011
2. ### npsf3000

Joined:
Sep 19, 2010
Posts:
3,830
At 25k iterations I saw no measurable difference, but at 250k there is one.

I'm thinking that there are checks for invalid operations? Or maybe it's something to do with the way structs work?

Last edited: Sep 7, 2011

### Unity Technologies

Joined:
Feb 23, 2011
Posts:
2,269
On my PC, at 25k iterations, the operator form is about 4.5 times slower. For subtraction it is 8.5 ms vs. 1.85 ms... that is a pretty big gap.

Q6600 running at 2.8 GHz

### Digital Ape (Moderator)

Joined:
Apr 11, 2010
Posts:
29,723
mobile needs to consider these numbers in larger collections too.

5. ### MrBurns

Joined:
Aug 16, 2011
Posts:
378
Unity 3.4, Intel Core i5 Mobile (slower than Q6600):

The script below, attached to the main camera of an empty scene, yields 120 ms for the first two operations and 60 ms for the second two.
That makes 12-24 microseconds for adding two vectors... (quite a bit slow)
Frankly, I am not sure what the reason is, and I don't have time to investigate this. But usually this should be up to 3 orders of magnitude faster.

Code (csharp):

    using UnityEngine;
    using System.Collections;
    using System.Diagnostics;

    public class NewBehaviourScript : MonoBehaviour {

        // Use this for initialization
        void Start () {
            Stopwatch watch = new Stopwatch();
            watch.Start();
            Vector3 position = new Vector3(1, 1, 1);
            Vector3 origin = new Vector3(0, 0, 0);

            for (int i = 0; i < 2500000; i++)
            {
                //Vector3 addition = origin + position;
                //Vector3 subtraction = origin - position;

                Vector3 addition2 = new Vector3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
                Vector3 subtraction2 = new Vector3(origin.x - position.x, origin.y - position.y, origin.z - position.z);
            }

            watch.Stop();
            UnityEngine.Debug.Log(watch.Elapsed.TotalMilliseconds);
        }

        // Update is called once per frame
        void Update () {

        }
    }
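
As a sanity check on the per-operation figure, here is the arithmetic under one assumption: that the 120 ms and 60 ms timings were taken with the 2,500,000-iteration loop shown in the script above, with each iteration performing two vector operations. With total time $T$ and iteration count $N$:

$$
t_{\text{op}} = \frac{T}{2N}, \qquad \frac{120\ \text{ms}}{2 \times 2{,}500{,}000} = 24\ \text{ns}, \qquad \frac{60\ \text{ms}}{2 \times 2{,}500{,}000} = 12\ \text{ns}
$$

If the timings came from a different iteration count, the per-operation figure scales accordingly.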

Last edited: Sep 7, 2011

### Unity Technologies

Joined:
Feb 23, 2011
Posts:
2,269
I just tested a manual dot product against Vector3.Dot in the same coroutine and got 0.18 ms vs. 5.0 ms...

Code (csharp):

    dotProduct = (forward.x * direction.x) + (forward.y * direction.y) + (forward.z * direction.z);
    //dotProduct = Vector3.Dot(forward, direction);

By the way, all my numbers come from Deep Profiling, which obviously makes them look worse overall. However, their relative differences should remain constant (correct?)

Last edited: Sep 7, 2011
7. ### MrBurns

Joined:
Aug 16, 2011
Posts:
378
Use the code I wrote ^^. It will measure with roughly nanosecond precision (at least on a recent computer, and hopefully on Mono too)...

And never use a profiler for such measurements... You won't get any useful information. A profiler is meant to detect hotspots and takes measures to neutralize its own impact on performance measurements. If you do your own measurements during a profiling session, they will most likely be useless... In this case I suspect that the profiler "hooks" into the struct constructor for memory allocation, which is why it takes much longer for you, whereas for me the path through the constructors is twice as fast as the path without them...

Last edited: Sep 7, 2011

### Unity Technologies

Joined:
Feb 23, 2011
Posts:
2,269
I am using that now

I now get 52.5094 ms using the long-form addition and subtraction vs. 114.5767 ms using the short form, so the gap is not as bad as it was with deep profiling, but the short form is still half the speed. Granted, it takes a lot of iterations to get there, which puts things back in perspective.

I knew the numbers from Deep Profiling would be worse, but I never expected the relationship between them to change. I always thought that if the profiler showed one function running at half the speed of another, that ratio would be consistent.

9. ### MrBurns

Joined:
Aug 16, 2011
Posts:
378
Profilers are usually bad at measuring the speed of fast functions. This is why the more advanced ones filter them out at runtime and suggest adding them to an ignore list. Profiling fast functions can, as you have now discovered, not only lead to wrong results but also heavily reduce performance during profiling (as you can notice in Unity), and it can potentially invalidate the results for slower functions that call those profiled fast functions...

So the bottom line is that profiling is intended for larger functions, where the profiling overhead is far outweighed by the execution time of the function itself.
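
A minimal sketch of the kind of measurement discussed here: timing both forms with `System.Diagnostics.Stopwatch` outside the profiler. `Vec3` is a hypothetical stand-in for `UnityEngine.Vector3` so the code runs as a plain console program. The warm-up pass and the checksum are additions that do not come from this thread, and absolute numbers will differ from those measured inside Unity.

```csharp
using System;
using System.Diagnostics;

// Hypothetical stand-in for UnityEngine.Vector3 (not Unity's actual implementation).
public struct Vec3
{
    public float x, y, z;

    public Vec3(float x, float y, float z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    public static Vec3 operator +(Vec3 a, Vec3 b)
    {
        return new Vec3(a.x + b.x, a.y + b.y, a.z + b.z);
    }
}

public static class VectorBench
{
    // Short form: every iteration goes through the overloaded operator.
    public static double TimeOperator(int iterations, out float checksum)
    {
        Vec3 origin = new Vec3(0, 0, 0);
        Vec3 position = new Vec3(1, 1, 1);
        float sum = 0f;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Vec3 result = origin + position;
            sum += result.x; // use the result so the work is less likely to be discarded
        }
        watch.Stop();

        checksum = sum;
        return watch.Elapsed.TotalMilliseconds;
    }

    // Long form: component-wise arithmetic passed straight to the constructor.
    public static double TimeManual(int iterations, out float checksum)
    {
        Vec3 origin = new Vec3(0, 0, 0);
        Vec3 position = new Vec3(1, 1, 1);
        float sum = 0f;

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            Vec3 result = new Vec3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
            sum += result.x;
        }
        watch.Stop();

        checksum = sum;
        return watch.Elapsed.TotalMilliseconds;
    }

    public static void Main()
    {
        const int iterations = 2500000;
        float check;

        // Warm-up pass so JIT compilation is not included in the timings.
        TimeOperator(1000, out check);
        TimeManual(1000, out check);

        double operatorMs = TimeOperator(iterations, out check);
        double manualMs = TimeManual(iterations, out check);

        Console.WriteLine("operator form: " + operatorMs + " ms");
        Console.WriteLine("long form:     " + manualMs + " ms");
    }
}
```

Running each timing several times and comparing the fastest runs would make the comparison less sensitive to background noise. A single pass like this is only a rough indication.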

10. ### npsf3000

Joined:
Sep 19, 2010
Posts:
3,830
Q6600 @ 2.4 GHz and both ran at ~1 ms.

### Unity Technologies

Joined:
Feb 23, 2011
Posts:
2,269
Using MrBurns' Stopwatch method, I still see a big difference, although not as significant...

For instance:

Code (csharp):

    for (int i = 0; i < 2500000; i++)
    {
        // These two get done in 116.8739ms
        //Vector3 addition = origin + position;
        //Vector3 subtraction = origin - position;

        // These two get done in 53.2564ms
        Vector3 addition2 = new Vector3(origin.x + position.x, origin.y + position.y, origin.z + position.z);
        Vector3 subtraction2 = new Vector3(origin.x - position.x, origin.y - position.y, origin.z - position.z);
    }

So manually adding or subtracting vectors is still about twice as fast. The same holds for dot products and every other vector operation I have tested.


Joined:
Jul 5, 2011
Posts:
97
Calling a function or using overloaded operators is going to be slower than using inline code.

14. ### MrBurns

Joined:
Aug 16, 2011
Posts:
378
This is a managed environment ^^. It's a very vague assumption without knowing how Mono does it. It is possible to optimize this; that's one great thing about managed environments, called "Runtime Optimization", but I am not quite up to date on how much of it Mono already uses.
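
One way a managed runtime can remove the call overhead discussed here is by inlining the operator method. As an illustration only: newer .NET and Mono versions expose `MethodImplOptions.AggressiveInlining` as a hint to the JIT compiler. `Vec3Inl` below is a hypothetical struct, not Unity's `Vector3`, and whether the Mono runtime shipped with a given Unity version honors the hint is an assumption that would have to be measured.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical vector struct used only to illustrate the inlining hint.
public struct Vec3Inl
{
    public float x, y, z;

    public Vec3Inl(float x, float y, float z)
    {
        this.x = x;
        this.y = y;
        this.z = z;
    }

    // Asks the JIT to inline the operator body at the call site where it can.
    // This is a hint, not a guarantee.
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static Vec3Inl operator -(Vec3Inl a, Vec3Inl b)
    {
        return new Vec3Inl(a.x - b.x, a.y - b.y, a.z - b.z);
    }
}

public static class InlineDemo
{
    public static void Main()
    {
        Vec3Inl origin = new Vec3Inl(0, 0, 0);
        Vec3Inl position = new Vec3Inl(1, 1, 1);

        // Both forms compute the same components; only the call path differs.
        Vec3Inl viaOperator = origin - position;
        Vec3Inl viaLongForm = new Vec3Inl(origin.x - position.x, origin.y - position.y, origin.z - position.z);

        bool same = viaOperator.x == viaLongForm.x
                 && viaOperator.y == viaLongForm.y
                 && viaOperator.z == viaLongForm.z;
        Console.WriteLine(same);
    }
}
```

If the hint is honored, the operator form should compile to roughly the same machine code as the long form, but that can only be confirmed by timing it on the target runtime.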

15. ### npsf3000

Joined:
Sep 19, 2010
Posts:
3,830
Read my earlier post. I also noted a difference - and FYI my code was better than Mr. Burns'.

### Unity Technologies

Joined:
Feb 23, 2011
Posts:
2,269
My apologies, but I only referenced Mr. Burns because I am employed at his nuclear plant in Springfield and didn't want to suffer his wrath. And yes, you did note those differences as well.

18. ### MrBurns

Joined:
Aug 16, 2011
Posts:
378
@NPSF3000: Not that it bothers me much, but which improvements do you have for the stopwatch?

21. ### Aka_ToolBuddy

Joined:
Feb 25, 2014
Posts:
480
Someone who was looking for such optimizations sent me a list of all the threads he looked at before finding mine.