Hello everyone,

Decided to start this thread to write some scripting optimization tips. Hopefully someone will find this useful.

Results:

Spoiler: Calculating Distance
Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public static class ExtensionsManager
    {
        public static float Distance (Vector3 a, Vector3 b)
        {
            Vector3 vector;
            float distanceSquared;

            vector.x = a.x - b.x;
            vector.y = a.y - b.y;
            vector.z = a.z - b.z;

            distanceSquared = vector.x * vector.x + vector.y * vector.y + vector.z * vector.z;

            return (float)System.Math.Sqrt(distanceSquared);
        }
    }

    public class DistancePerformanceTest : MonoBehaviour
    {
        Vector3 v1 = new Vector3 ( 15, 20, 52 );
        Vector3 v2 = new Vector3 (-100, 10, 80);

        // Update is called once per frame
        void Update ()
        {
            UnityEngine.Profiling.Profiler.BeginSample("Test 1");
            Test1();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 2");
            Test2();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 3");
            Test3();
            UnityEngine.Profiling.Profiler.EndSample();
        }

        void Test1 ()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = Vector3.Distance(v1, v2);
            }
        }

        void Test2 ()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = ExtensionsManager.Distance(v1, v2);
            }
        }

        void Test3 ()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = Vector3.SqrMagnitude(v2 - v1);
            }
        }
    }

Spoiler: Result (Times ms)
Test1: 10.75
Test2: 5.51
Test3: 8.20

Spoiler: Mathf abs
Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public class MathfPerformance : MonoBehaviour
    {
        // Update is called once per frame
        void Update()
        {
            UnityEngine.Profiling.Profiler.BeginSample("Test 1");
            Test1();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 2");
            Test2();
            UnityEngine.Profiling.Profiler.EndSample();
        }

        void Test1 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                float value = Mathf.Abs(-10.5f);
            }
        }

        void Test2 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                float value = System.Math.Abs(-10.5f);
            }
        }
    }

Spoiler: Result (Times ms)
Test1: 0.76
Test2: 0.40

Spoiler: Quaternion Euler
Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public class QuaternionPerformance : MonoBehaviour
    {
        // Update is called once per frame
        void Update()
        {
            UnityEngine.Profiling.Profiler.BeginSample("Test 1");
            Test1();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 2");
            Test2();
            UnityEngine.Profiling.Profiler.EndSample();
        }

        void Test1 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                Quaternion rot = Quaternion.Euler(new Vector3(1, 6, 3));
            }
        }

        void Test2 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                Vector3 v;
                v.x = 1;
                v.y = 6;
                v.z = 3;
                Quaternion rot = Quaternion.Euler(v);
            }
        }
    }

Spoiler: Result (Times ms)
Test1: 2.93
Test2: 1.89

Spoiler: Comparing Tags
Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public class TagsPerformance : MonoBehaviour
    {
        bool yes = false;

        // Update is called once per frame
        void Update()
        {
            UnityEngine.Profiling.Profiler.BeginSample("Test 1");
            Test1();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 2");
            Test2();
            UnityEngine.Profiling.Profiler.EndSample();
        }

        void Test1 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                if (gameObject.CompareTag("Player"))
                {
                    yes = true;
                }
            }
        }

        void Test2 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                if (gameObject.tag == "Player")
                {
                    yes = true;
                }
            }
        }
    }

Spoiler: Result (Times ms)
Test1: 2.44
Test2: 3.54

(GC Alloc)
Test1: 0
Test2: 410.2 KB

Spoiler: Change Position
Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public class ChangePosition : MonoBehaviour
    {
        // Update is called once per frame
        void Update()
        {
            UnityEngine.Profiling.Profiler.BeginSample("Test 1");
            Test1();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 2");
            Test2();
            UnityEngine.Profiling.Profiler.EndSample();
        }

        void Test1 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                Vector3 b = new Vector3(1, 2, 3);
            }
        }

        void Test2 ()
        {
            for (int i = 0; i < 10000; i++)
            {
                Vector3 a;
                a.x = 1;
                a.y = 2;
                a.z = 3;
            }
        }
    }

Spoiler: Result (Times ms)
Test1: 0.24
Test2: 0.13
Of these, the only one I found useful or interesting was the first one. I'd be interested to see what the source code of Vector3.sqrMagnitude looks like, given that it can actually be slower than an operation that includes a square root. Or, I suspect, maybe that additional time has to do with the fact that it's bridging into native code? In either case, it will make me think twice about using the sqrMagnitude optimization that is taken as common wisdom.

I guess the GC alloc of the .tag comparison is good to know about, for those not aware of it.

As for the other examples, though, were I to review this code in a project I was working on, I would reject it outright.

1) When you're dealing with amounts of time this tiny, the profiler is not a particularly reliable measure of performance. CPUs and compilers do so many optimizations under the hood that any minuscule change to the way a method is called may hugely affect the results, which means these results cannot be relied on in other situations.

2) The amount of time you're saving on these is inconsequential in the context of C# code. If you're doing enough of these operations to matter (millions of times per frame), you need to rethink your entire approach to whatever you're working on, not shave a picosecond off each operation. Either use a different technique entirely, or offload the processing to the video card or something.

3) Making your code less readable and maintainable to save 0.000000104 seconds on a Euler operation (yes, that's how it actually maths out on that one) is not good coding practice. Code maintainability is almost always preferable to micro-optimizations.
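For anyone who wants to check that 0.000000104 figure, here is a minimal sketch of the arithmetic in plain C# (no Unity needed). The inputs are the OP's Quaternion Euler results; the class and method names are made up for illustration.

```csharp
using System;

public static class PerCallCost
{
    // Hypothetical helper: turns a timing difference measured over many
    // iterations into the number of seconds saved per single call.
    public static double SecondsSavedPerCall(double slowMs, double fastMs, int iterations)
    {
        double savedMs = slowMs - fastMs;      // total milliseconds saved
        return savedMs / iterations / 1000.0;  // per call, converted to seconds
    }

    public static void Main()
    {
        // OP's Quaternion Euler results: 2.93 ms vs 1.89 ms over 10000 iterations.
        double perCall = SecondsSavedPerCall(2.93, 1.89, 10000);

        // Roughly 1.04e-7 seconds, i.e. about a tenth of a microsecond per call.
        Console.WriteLine(perCall);
    }
}
```

The same helper can be pointed at any of the other result pairs in the first post to see how small the per-call differences are.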
Most of it basically boils down to something that has been reported multiple times already. Using the overloaded constructors of the struct types appears to have a higher performance impact than declaring a default vector and setting its values one by one. Many of the built-in operations for vectors use the overloaded constructors and therefore inherently suffer from this impact. This appears to be the reason why sqrMagnitude does not win the race here. Internally, it's simple math, but the argument you provide (v2 - v1) calls the - operator of Vector3, which in turn calls an overloaded constructor...
That is an odd performance hit for a 3D real-time engine to have. Seems like you can vote to get this looked at, but it currently only has 152 votes. [Edit: 153 now that I've added mine]
For Math.Abs, it could be even faster to do it manually (at least it was with Mathf.Min/Max):

Code (CSharp):
    float a = -10.5f;
    float b = 5f;
    float r = a < b ? a : b; // Mathf.Min(a, b)

For replacing bounds.IntersectRay(), I think this was faster with lots of bounds to check (I don't have the results here right now): https://gist.github.com/unitycoder/8d1c2905f2e9be693c78db7d9d03a102
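The same trick applied to the absolute value itself might look like the sketch below. FastMath is a hypothetical name, not a Unity or .NET class, and whether this actually beats System.Math.Abs depends on the runtime, so it would need profiling on the target platform.

```csharp
public static class FastMath
{
    // Hypothetical helper: absolute value with a single comparison.
    // Edge case: negative zero is returned unchanged (as -0f), whereas
    // System.Math.Abs returns +0f. NaN passes through as NaN.
    public static float Abs(float value)
    {
        return value < 0f ? -value : value;
    }
}
```

Usage would be something like `float r = FastMath.Abs(-10.5f);`, which gives 10.5.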
I'm with StarManta and Suddoha on this one.

CompareTag is something newbies should definitely be introduced to!

And the 'distance' one is a bit unintuitive, but if you think about it (like Suddoha points out), it's related to the fact that you used the "-" operator, which uses a struct constructor overload, which can be sort of slow.

Note that the Unity Vector3 does not call into native code internally; it just does the calculation in C#. It's definitely the struct constructor slowing it down:

Code (csharp):
    public static float Distance(Vector3 a, Vector3 b)
    {
        Vector3 vector3 = new Vector3(a.x - b.x, a.y - b.y, a.z - b.z);
        return Mathf.Sqrt((float) ((double) vector3.x * (double) vector3.x
                                 + (double) vector3.y * (double) vector3.y
                                 + (double) vector3.z * (double) vector3.z));
    }

    public static float SqrMagnitude(Vector3 vector)
    {
        return (float) ((double) vector.x * (double) vector.x
                      + (double) vector.y * (double) vector.y
                      + (double) vector.z * (double) vector.z);
    }

But demonstrating that your Distance implementation is technically faster doesn't really point out why.
And a custom implementation of SqrMagnitude can really show why:

Code (csharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;

    public static class ExtensionsManager
    {
        public static float Distance(Vector3 a, Vector3 b)
        {
            Vector3 vector;
            float distanceSquared;

            vector.x = a.x - b.x;
            vector.y = a.y - b.y;
            vector.z = a.z - b.z;

            distanceSquared = vector.x * vector.x + vector.y * vector.y + vector.z * vector.z;

            return (float)System.Math.Sqrt(distanceSquared);
        }

        public static float SqrDistance(Vector3 a, Vector3 b)
        {
            float x = a.x - b.x;
            float y = a.y - b.y;
            float z = a.z - b.z;
            return x * x + y * y + z * z;
        }
    }

    public class DistancePerformanceTest : MonoBehaviour
    {
        Vector3 v1 = new Vector3(15, 20, 52);
        Vector3 v2 = new Vector3(-100, 10, 80);

        // Update is called once per frame
        void Update()
        {
            UnityEngine.Profiling.Profiler.BeginSample("Test 1");
            Test1();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 2");
            Test2();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 3");
            Test3();
            UnityEngine.Profiling.Profiler.EndSample();

            UnityEngine.Profiling.Profiler.BeginSample("Test 4");
            Test4();
            UnityEngine.Profiling.Profiler.EndSample();
        }

        void Test1()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = Vector3.Distance(v1, v2);
            }
        }

        void Test2()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = ExtensionsManager.Distance(v1, v2);
            }
        }

        void Test3()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = Vector3.SqrMagnitude(v2 - v1);
            }
        }

        void Test4()
        {
            for (int i = 0; i < 100000; i++)
            {
                float distance = ExtensionsManager.SqrDistance(v1, v2);
            }
        }
    }

Results:

Code (csharp):
    Test1 - 5.68
    Test2 - 3.83
    Test3 - 4.04
    Test4 - 1.95 <-- there's the SqrDistance pulling out ahead

This was on my machine, a Core i7 2700K (a bit older... but mid to higher end for when it came out).

You might notice though that the speeds were actually comparable for the Distance vs SqrMagnitude.
And that the custom SqrDistance really pulled ahead here.

Thing is, the reason SqrMagnitude and the custom Distance were so closely matched on my machine is that I ran it with the newer .NET 4.6 compatibility. When I bring it back to .NET 2.0, it starts to bounce around a lot, getting pretty extreme at times, but averaging about:

Code (csharp):
    Test1 - 6.25
    Test2 - 4.20
    Test3 - 5.35
    Test4 - 3.00

This is probably mostly because the newer .NET 4.6 is somewhat faster.

It's also telling, though, that my platform still does pretty well compared to OP's machine. The speed differences aren't as pronounced... and this is on a 7-year-old computer that I currently have like 4 virtual machines humming away on. It goes to show that performance doesn't scale linearly across different rigs.

... And let's not forget that on other platforms, especially those that use IL2CPP, a lot of these micro-optimizations sort of just go away. Which is again why I point out the 'SqrDistance' thing. In the end, avoiding a square root that you don't need is always advantageous, no matter the platform you're on.
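The usual place where the square root can be skipped entirely is a range check: compare the squared distance against the squared range instead. A minimal sketch, assuming a non-negative range; RangeCheck and IsWithinRange are hypothetical names, not part of Unity.

```csharp
using UnityEngine;

public static class RangeCheck
{
    // Hypothetical helper: true when b lies within 'range' units of a.
    // Both sides of the comparison are squared, so no square root is taken.
    // Assumes range is non-negative.
    public static bool IsWithinRange(Vector3 a, Vector3 b, float range)
    {
        // Subtract per component to avoid the Vector3 '-' operator
        // and the constructor call it makes.
        float x = a.x - b.x;
        float y = a.y - b.y;
        float z = a.z - b.z;

        return x * x + y * y + z * z <= range * range;
    }
}
```

With the v1 and v2 from the tests above, the squared distance is 115*115 + 10*10 + 28*28 = 14109, so a range of 120 (squared: 14400) passes and a range of 100 (squared: 10000) does not.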
I implemented those optimizations and others in this free asset https://www.assetstore.unity3d.com/#!/content/120660?aid=1101l3N9P It is easy to use, just import it and rebuild your game. To be completely aware of its limitations, please read the asset description.
Slightly off-topic, am I the only one put off by extension methods being put in a class named "ExtensionsManager"? This class does not "manage" extensions, it contains them. I'm particularly sensitive when any class gets named "Manager" because it doesn't add anything to the name. It could just as well be "ExtensionsSomething". Plainly naming the class "Extensions" would be a much better fit and isn't misleading.

Btw, my rule for extension classes is: <ClassNameExtensionIsFor>Extensions, in the same namespace as the class. For example:

    namespace UnityEngine
    {
        static class TransformExtensions {..}
    }

"Extensions" could be shortened to "Ext" if verbosity is a concern.
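Filled in, that naming rule might look like the sketch below. ResetLocal is a made-up method purely for illustration; the `this` modifier on the first parameter is what makes it an actual extension method (the Distance helpers earlier in the thread are plain static methods).

```csharp
namespace UnityEngine
{
    // Named after the type it extends and placed in that type's namespace.
    public static class TransformExtensions
    {
        // Hypothetical extension method: resets the local position,
        // rotation and scale of the transform to their defaults.
        public static void ResetLocal(this Transform transform)
        {
            transform.localPosition = Vector3.zero;
            transform.localRotation = Quaternion.identity;
            transform.localScale = Vector3.one;
        }
    }
}
```

Inside a MonoBehaviour this would be called as `transform.ResetLocal();`, with no extra using directive because the class lives in the UnityEngine namespace.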
I definitely agree about the overuse of 'Manager', along with plenty of other coding sludge I've seen over the years I've been doing this. I seldom have the energy to chat about it, though, since it often gets into such a subjective realm.
I assume these tips were gathered AFTER you've profiled your game to find where the real bottlenecks were. Right?