Question About advanced programming on Unity and game architecture

Discussion in 'Scripting' started by mwss1996, Nov 26, 2022.

  1. mwss1996

    mwss1996

    Joined:
    May 27, 2018
    Posts:
    5
    Do you have recommendations for finding advanced Unity coding information? For example assets or open source projects to study, YouTube channels, articles, books, courses, etc.

    I am a professional software developer, have worked with databases, frontend and backend for 6 years, and recently decided to focus on game development. But after trying to go deep on Unity I found it really difficult to find advanced content on game architecture and on how to properly manage things for scalable games.
     
    Last edited: Nov 27, 2022
    TzuriTeshuba likes this.
  2. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    31,343
    With interactive software it's all just iteration and refinement.

    There's almost always ten billion different ways to do things.

    Usually the simplest approach is the best one to start with.

    In other words, just do-do-do and pay attention to what you are doing.
     
    Ryiah and mwss1996 like this.
  3. _geo__

    _geo__

    Joined:
    Feb 26, 2014
    Posts:
    737
    If you want some code to read then maybe the unity source could be a starting point, though I am not sure if it's a great one. I have picked up one or two tricks browsing it: https://github.com/Unity-Technologies/UnityCsReference

    There are lots of YouTube channels out there, but I feel most are targeted at beginners to maximize audience. The Unite talks are a nice source though.
     
    mwss1996 likes this.
  4. mwss1996

    mwss1996

    Joined:
    May 27, 2018
    Posts:
    5
    Thanks for the recommendation, I saved some talks to watch later ^^
     
  5. jvo3dc

    jvo3dc

    Joined:
    Oct 11, 2013
    Posts:
    1,520
    For me, beyond knowing the basics of the Unity api, it's about knowing what your options are in C# and what common design patterns are. Then whenever you are selecting from those many approaches to a problem, you are at least familiar with the approaches and can make an informed decision on which way to go (first.)
    https://www.tutorialsteacher.com/csharp
    (delegates vs interfaces is interesting, I do miss generics and linq here as options)
    https://refactoring.guru/design-patterns/csharp

    The unity specific code only goes so far, beyond that, it's just C# and design patterns, nothing else. In fact, I'd recommend splitting your unity specific code (view) from the general code (model) for larger projects. It's common practice.
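    A minimal sketch of that model/view split, assuming a simple health value as the example. All names here (Health, IHealthView, HealthPresenter, TextHealthView) are hypothetical and not taken from any Unity API; in a real project the view would be a MonoBehaviour that updates a slider or label.

```csharp
using System;

// Plain C# model: no UnityEngine references, so it can be tested outside Unity.
public sealed class Health
{
    public int Max { get; }
    public int Current { get; private set; }

    // Raised after every change so that any view can refresh itself.
    public event Action<int, int> Changed;

    public Health(int max)
    {
        if (max <= 0) throw new ArgumentOutOfRangeException(nameof(max));
        Max = max;
        Current = max;
    }

    public void Damage(int amount)
    {
        if (amount < 0) throw new ArgumentOutOfRangeException(nameof(amount));
        Current = Math.Max(0, Current - amount);
        Changed?.Invoke(Current, Max);
    }
}

// The only contract the Unity side has to implement (hypothetical interface).
public interface IHealthView
{
    void Show(int current, int max);
}

// Glue that connects a model to a view without either knowing the other's concrete type.
public sealed class HealthPresenter
{
    public HealthPresenter(Health model, IHealthView view)
    {
        model.Changed += view.Show;
        view.Show(model.Current, model.Max); // push the initial state
    }
}

// Stand-in view for use outside Unity; it only remembers the last text it was given.
public sealed class TextHealthView : IHealthView
{
    public string LastText { get; private set; } = "";
    public void Show(int current, int max) => LastText = $"{current}/{max}";
}
```

    The model never references the view, so swapping the text view for a Unity component should not require touching the game logic.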
     
    mwss1996 likes this.
  6. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    The first thing you will notice is how much slower the Unity CLR (Mono) is than what you are used to at work (.NET 6 and 7) :)

    Edit: and if you decompile the MSIL code you will see that Mono is typecasting floats to doubles before operating on them and then back again for no apparent reason
     
    Last edited: Nov 29, 2022
    mwss1996 likes this.
  7. Owen-Reynolds

    Owen-Reynolds

    Joined:
    Feb 15, 2012
    Posts:
    1,777
    I've got a 3 or 4 page intro to Unity for programmers at taxesforcatses-dot-com/codeNotes/UforP.shtml. One thing from there -- everything in Unity wants to run all at once through their own Update()'s. That's fine, but if you like a traditional top-down structure where one top-level chunk runs (or skips) everything, that works just fine, too.
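    As a rough sketch of the top-down structure mentioned above: one top-level object owns an ordered list of systems and decides whether they run at all. ITickable, GameLoop and Clock are hypothetical names for illustration; in Unity, a single MonoBehaviour's Update() would presumably call GameLoop.Tick with Time.deltaTime.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical interface: anything that wants per-frame work implements it.
public interface ITickable
{
    void Tick(float deltaTime);
}

// One top-level object decides what runs, in which order, and whether it runs at all.
public sealed class GameLoop
{
    private readonly List<ITickable> systems = new List<ITickable>();

    public bool Paused { get; set; }

    public void Add(ITickable system) => systems.Add(system);

    // Plain C#, so it runs anywhere; the caller supplies the frame time.
    public void Tick(float deltaTime)
    {
        if (Paused) return;                 // skip everything from one place
        foreach (var system in systems)
            system.Tick(deltaTime);         // explicit order: the order of Add() calls
    }
}

// Trivial example system that accumulates elapsed time.
public sealed class Clock : ITickable
{
    public float Elapsed { get; private set; }
    public void Tick(float deltaTime) => Elapsed += deltaTime;
}
```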

    About jvo3dc's mentions of "design patterns", that can be confusing. Some people say that to mean just any cool programming trick. But the official things called "Design Patterns" aren't useful. Sure, you'll find lots of old internet stuff about how great they are, but we finally have lots of newer internet stuff about what time-wasters they turned out to be.
     
    mwss1996 likes this.
  8. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    I will again point out, and this is the 3rd time this week alone, that the historical reason was that C# didn't have a dedicated 32-bit floating point math library until .net Core 2.0 which was a) relatively recent, and b) unavailable in Unity. Unity made their own Mathf library (very early on) to support 32-bit math, however they've abandoned the idea of actually implementing their own low-level computations, because that's just ludicrous for many reasons, and this kind of worked for 97% of games out there.

    (Edit: Also I think it gets optimized by IL2CPP anyway.)

    Because of this UnityEngine.Mathf is mostly just a 32-bit wrapper for the System.Math class which is 64-bit, this is why floats get cast to doubles and back, and this is especially painful for square roots and trigonometric functions. Some other methods such as System.Math.Abs do support 32-bit floats natively. And the rest of it are Unity's own useful compounds or math functions, so not everything is bad about it.

    Nowadays, we have an official 32-bit floating math library since .net 5 because .net core got reintegrated: MathF. According to my simple benchmarks they offer a significantly better performance and everyone and their grandma is encouraged to switch from UnityEngine.Mathf to System.MathF.
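    To make the two call styles concrete, here is a small sketch in plain C#. SinViaDouble mirrors the shape of the wrapper described above (it is not Unity's actual source), SinViaMathF uses System.MathF, and MaxDifference is a hypothetical helper that checks how closely the two agree. It assumes a runtime that ships System.MathF.

```csharp
using System;

public static class SineComparison
{
    // Wrapper style: widen to double, call System.Math, narrow back to float.
    public static float SinViaDouble(float radians) => (float)Math.Sin((double)radians);

    // Single-precision API from System.MathF.
    public static float SinViaMathF(float radians) => MathF.Sin(radians);

    // Largest absolute difference between the two over a sweep of [0, 2*pi].
    // Assumes samples > 0.
    public static float MaxDifference(int samples)
    {
        float worst = 0f;
        for (int i = 0; i <= samples; i++)
        {
            float x = 2f * MathF.PI * i / samples;
            float diff = MathF.Abs(SinViaDouble(x) - SinViaMathF(x));
            if (diff > worst) worst = diff;
        }
        return worst;
    }
}
```

    The results should differ only by float rounding, so switching is about speed rather than accuracy. Whether it is actually faster depends on the runtime and is worth measuring on the target platform.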
     
    dlorre and mwss1996 like this.
  9. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    That's true and untrue at the same time. There are "Design Patterns" and there are design patterns. Some of the patterns are so ubiquitous and evergreen, that they work in every environment and in any context, regardless of whether Unity is object- or composition- or data-oriented (as with DOTS), the language itself is vast and open to various approaches. Obviously all of the patterns are situational, but there is nothing black & white about them. Observer pattern, strategy pattern, factory pattern, singleton pattern, composition pattern, and many many more are perfectly valid or even beneficial for Unity projects.

    That said, I never actually stress myself with remembering them, let alone applying them to the letter, but there is value in appreciating the reasons why they exist and what they solve.
     
    mwss1996 likes this.
  10. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Write a Vector3 * float, or worse, a Quaternion * Vector3, compile it with Unity and look at the decompiled result.
     
    mwss1996 likes this.
  11. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Design patterns are one thing, but more important are design principles like SOLID. Though, always approach principles with a bit of pragmatism.
     
  12. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    For the OP:

    Try to get a good grasp on what Unity is -- it's a holistic engine, and your code is a mere guest to the party (quote by @Kurt-Dekker). Pay special attention to how you can build things that are self-contained, but then you want to strategically connect them to the underlying mechanism, to make a cohesive whole. Where juniors would over-connect and lose all control, you can funnel your logic with complex solutions and intermittent machines that you write on top of it. You can also make everything as simple as possible, but the longevity of that approach depends on the scale of your particular project.

    Learn about serialization in Unity, that's a huge topic.

    After you learn what makes Unity tick, learn C# as much as possible. This is easier to do than learning Unity. After almost 15 years I still haven't managed to go through everything that exists in Unity, because it constantly evolves, yet I'm pretty much convinced I've seen nearly everything C# has to offer (at least on a fundamental level), because its design is much more deliberate and stable. And well-documented.

    Spend some time researching the features you'd like to implement, especially if you don't have a graphics programming background. Learn the jargon, learn about the GPU and the techniques involved. Learn where the bottlenecks lie, what Unity does on its own, and what is beyond it but must still be maintained by Unity.

    Spend some time learning how to automate editors and make your own dev tools; this is a large part of being in control of your dev time in Unity. Nearly everything is programmable and extensible, but it's shifty, sometimes not so well documented, and very different from actually working on code that will be deployed.

    These are the most important things I can think of.
     
    mwss1996 likes this.
  13. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    If you're bothered by that, use the Mathematics package instead.
     
    mwss1996 likes this.
  14. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    I'll bite...

    Compiled this method:
    Code (csharp):
    private void DoAThing()
    {
      Vector3 vector3 = Vector3.one * 5f;
    }
    Well technically it was this, but it got trimmed down in compiling:
    Code (csharp):
    private void DoAThing()
    {
        Vector3 v = Vector3.one;
        float s = 5f;
        var v2 = v * s;
    }
    Resulted in exactly what I expected, a call to the Vector3.op_Multiply method with the scalar passed in as a float32:
    Code (csharp):
    .method private hidebysig instance void
      DoAThing() cil managed
    {
      .maxstack 2
      .locals init (
        [0] float32 V_0
      )

      // [36 5 - 36 39]
      IL_0000: call         valuetype [UnityEngine.CoreModule]UnityEngine.Vector3 [UnityEngine.CoreModule]UnityEngine.Vector3::get_one()
      IL_0005: ldc.r4       5
      IL_000a: stloc.0      // V_0
      IL_000b: ldloc.0      // V_0
      IL_000c: call         valuetype [UnityEngine.CoreModule]UnityEngine.Vector3 [UnityEngine.CoreModule]UnityEngine.Vector3::op_Multiply(valuetype [UnityEngine.CoreModule]UnityEngine.Vector3, float32)
      IL_0011: pop
      IL_0012: ret

    } // end of method zTest01::DoAThing
    And going to that operator:
    Code (csharp):
    // 256 is the numeric value of MethodImplOptions.AggressiveInlining
    [MethodImpl((MethodImplOptions) 256)]
    public static Vector3 operator *(Vector3 a, float d)
    {
      return new Vector3(a.x * d, a.y * d, a.z * d);
    }
    The resulting IL is:
    Code (csharp):
    .method public hidebysig static specialname valuetype UnityEngine.Vector3
      op_Multiply(
        valuetype UnityEngine.Vector3 a,
        float32 d
      ) cil managed
    {
      .maxstack 4
      .locals init (
        [0] valuetype UnityEngine.Vector3 V_0
      )

      IL_0000: nop

      // [543 7 - 543 53]
      IL_0001: ldarg.0      // a
      IL_0002: ldfld        float32 UnityEngine.Vector3::x
      IL_0007: ldarg.1      // d
      IL_0008: mul
      IL_0009: ldarg.0      // a
      IL_000a: ldfld        float32 UnityEngine.Vector3::y
      IL_000f: ldarg.1      // d
      IL_0010: mul
      IL_0011: ldarg.0      // a
      IL_0012: ldfld        float32 UnityEngine.Vector3::z
      IL_0017: ldarg.1      // d
      IL_0018: mul
      IL_0019: newobj       instance void UnityEngine.Vector3::.ctor(float32, float32, float32)
      IL_001e: stloc.0      // V_0
      IL_001f: br.s         IL_0021
      IL_0021: ldloc.0      // V_0
      IL_0022: ret

    } // end of method Vector3::op_Multiply
    I'm not seeing anything overtly out of place here. You could make an argument about the Vector3 constructor adding an extra stack frame that could be avoided. But I'm not seeing anything about doubles here. Everything is a float32.
     
    Last edited: Nov 29, 2022
    Ryiah, mwss1996 and orionsyndrome like this.
  15. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Now check the JITed code
     
    mwss1996 likes this.
  16. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    You said:
    That's what I did.

    The JIT'd code is very platform dependent.
     
    Ryiah, mwss1996 and orionsyndrome like this.
  17. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Yeah, my mistake, sorry. It's the JITed code that will be typecasting.
     
    mwss1996 likes this.
  18. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    Question... how are you getting a peek at the JIT'd code?

    The CLR will compile the code in memory... you'd have to either A) read the memory in place, or B) use the debugger api to observe the state of the application at the time. Not exactly a scenario that I have setup ready to just do on a whim. And I have the sneaking suspicion this is not what you're referring to... and that just like you mistyped about the MSIL, you're mistyping again here.

    Do you mean the compiled code generated by IL2CPP?
     
    mwss1996 likes this.
  19. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Cheat engine
     
  20. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    The reason why System.MathF is faster than UnityEngine.Mathf is that Unity calls System.Math under the hood, which means an extra typecast and method call.

    Both libraries will typecast back and forth between floats and doubles because of how the Mono JIT works.

    This is UnityEngine.Mathf.Abs

    Code (CSharp):
    public static float Abs(float f)
    {
        float a = f;
        a = (float)(double)a;
        a = (float)(double)(float)(double)System.Math.Abs(a);
        return (float)(double)a;
    }
     
    mwss1996 likes this.
  21. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    Yeah, Orion already said that when they were talking about Mathf. I never talked about Mathf, so I assume you're talking to them.

    Well I already did my leg work, how about you do some as well and demonstrate some proof?

    ...

    This isn't to say I don't believe you. The mono CLR used by unity isn't exactly the most top of the line thing. This is also where IL2CPP can step in and help.

    But... float/double processing speeds are pretty negligible on modern x86-64 CPUs, meaning the casting is the only thing that might incur a speed hit. And really... it's not that huge of a concern since really the biggest performance hitters are performed under the hood like physics, and video processing.

    And I'd like to repeat... JIT'n is very platform dependent.
     
    mwss1996 and orionsyndrome like this.
  22. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    @lordofduct Thanks for that. I had my suspicions because it made no sense, but I was lazy to try this for myself.
    @Max-om Read my post again with more attention.
     
    mwss1996 likes this.
  23. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    I have. My point is that even with System.MathF the values will be typecast by the JIT; it's only faster because it does not wrap System.Math like UnityEngine.Mathf does. It will still round-trip through double when it's actually running on the CPU.
     
    mwss1996 likes this.
  24. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    UnityEngine.Mathf is actually implemented that way. Sin is literally implemented like this
    Code (csharp):
    static public float Sin(float r) => (float)System.Math.Sin((double)r);
    You're perhaps needlessly extending the speculation that every computation in Unity roundtrips to double, even when that's not true. As I explained, this is only the case for trigonometric functions, square root, logarithm and such functions which dwell in UnityEngine.Mathf, not the rest of the system, definitely not the case with simple matrix and vector multiplications.

    HOWEVER many quaternion operations do hinge on trigonometry, so these methods are probably doing the roundtrip to double as well, because they rely on Mathf. This is maybe what you're referring to. As I said, IL2CPP probably irons that out anyway. You can also write your own quaternion math if you know quaternions. I wrote around 40% of quaternions from scratch. I recently did both slerps for my codebase: Vector3.Slerp and Quaternion.Slerp, I think both of these you can find in that thread (I have to add MelvMay helped me a lot).

    On the other hand, System.MathF is a native 32-bit library. If you can circumvent Unity doing the math, even better. I do that all the time, and my benchmarks show significant gains in very hot paths, but not that much universally.
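    For anyone who wants to check this on their own machine, a rough timing harness might look like the sketch below. MathBench, Time and Run are hypothetical names, the iteration count is arbitrary, and the delegate call adds the same overhead to both variants, so treat the numbers as relative at best. Results will vary with the runtime (Mono, IL2CPP, CoreCLR) and platform.

```csharp
using System;
using System.Diagnostics;

public static class MathBench
{
    // Times `iterations` calls of `op` and returns the elapsed milliseconds.
    // The accumulated sum is handed back through `sink` so the calls cannot be discarded.
    public static double Time(Func<float, float> op, int iterations, out float sink)
    {
        float sum = 0f;
        var sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            sum += op(i * 0.001f);
        sw.Stop();
        sink = sum;
        return sw.Elapsed.TotalMilliseconds;
    }

    public static void Run()
    {
        const int n = 1000000;

        // Warm-up calls so that JIT compilation is not part of the measurement.
        Time(x => (float)Math.Sqrt(x), 1000, out _);
        Time(x => MathF.Sqrt(x), 1000, out _);

        double viaDouble = Time(x => (float)Math.Sqrt(x), n, out float a);
        double viaSingle = Time(x => MathF.Sqrt(x), n, out float b);

        Console.WriteLine($"double round-trip: {viaDouble:F2} ms, MathF: {viaSingle:F2} ms (sinks: {a}, {b})");
    }
}
```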

    Edit:
    Btw, feel free to benchmark my SLerpUnit (last post in that thread) vs Unity's Vector3.Slerp between two unit directions. The difference is MASSIVE (and it is obviously situational). This is why we (senior cretins like myself) do this, but this is admittedly an overkill for many typical programmers.
     
    Last edited: Nov 29, 2022
    mwss1996 likes this.
  25. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    It doesn't matter how it's implemented when the Mono JIT is fubar and still keeps typecasting single precision floats to double precision floats and back again. The performance gains you see are because you skip one typecast and then a method call on top of that.

    I mistyped in my initial post. I meant JIT code, not MSIL.
     
    mwss1996 likes this.
  26. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Here is a quick example.

    Code (CSharp):
    public static class TestTestTestTest
    {
        public static float A;
        public static float B;
        public static float C;

        public static float DoIt()
        {
            C = A + B;
            return C;
        }
    }
    Gets compiled to
    Code (CSharp):
    public static class TestTestTestTest
    {
        public static float A;
        public static float B;
        public static float C;

        public static float DoIt()
        {
            double a = (double)A;
            double b = (double)B;
            a += b;
            C = (float)a;
            return (float)(double)C;
        }
    }
    Disassembly

    Code (asm):
    TestTestTestTest:DoIt+00- 48 83 EC 08            - sub rsp,08
    TestTestTestTest:DoIt+04- 48 B8 688C035AC3010000 - mov rax,000001C35A038C68; A
    TestTestTestTest:DoIt+0e- F3 0F10 00             - movss xmm0,[rax]
    TestTestTestTest:DoIt+12- F3 0F5A C0             - cvtss2sd xmm0,xmm0
    TestTestTestTest:DoIt+16- 48 B8 6C8C035AC3010000 - mov rax,000001C35A038C6C; B
    TestTestTestTest:DoIt+20- F3 0F10 08             - movss xmm1,[rax]
    TestTestTestTest:DoIt+24- F3 0F5A C9             - cvtss2sd xmm1,xmm1
    TestTestTestTest:DoIt+28- F2 0F58 C1             - addsd xmm0,xmm1
    TestTestTestTest:DoIt+2c- 48 B8 708C035AC3010000 - mov rax,000001C35A038C70; C
    TestTestTestTest:DoIt+36- F2 0F5A E8             - cvtsd2ss xmm5,xmm0
    TestTestTestTest:DoIt+3a- F3 0F11 28             - movss [rax],xmm5
    TestTestTestTest:DoIt+3e- 48 B8 708C035AC3010000 - mov rax,000001C35A038C70; C
    TestTestTestTest:DoIt+48- F3 0F10 00             - movss xmm0,[rax]
    TestTestTestTest:DoIt+4c- F3 0F5A C0             - cvtss2sd xmm0,xmm0
    TestTestTestTest:DoIt+50- F2 0F5A C0             - cvtsd2ss xmm0,xmm0
    TestTestTestTest:DoIt+54- 48 83 C4 08            - add rsp,08
    TestTestTestTest:DoIt+58- C3                     - ret
     
    mwss1996 likes this.
  27. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    So a little while ago I found time to confirm what Max-om had said before they posted this. I now see they finally did show their proof... at least now no one has to just take them at their word.

    Regardless I too have confirmed this in regards to the mono runtime:
    upload_2022-11-29_18-35-59.png
    (cvtss2sd is the conversion, since they rely on mulsd... ss = scalar single, sd = scalar double)

    And even found some people talking about it over the years in terms of mono (note this is a mono thing, not a unity thing):
    https://tirania.org/blog/archive/2018/Apr-11.html

    Funnily enough I had a post touching on these very topics, but decided not to post it as I often get long-winded when I post...

    Regardless, as I said in my last post when I determined what they were actually talking about. I stated I wouldn't be surprised if this is true (mono is old + deleted comments mirroring topics found in this quote), but that I would still consider this less offensive than the overhead incurred by Mathf calling through to System.Math since the extra stack frame is far more costly than a float->double conversion during the operation. This point is what Max-om is asserting is the true origin of the gains... which is true! But as far as I could tell that's what Orion was also saying... so it's still repeating the point.

    And since most of the heavy maths is done under the hood, outside of the Mono runtime, the performance lost to Mono's JIT compiler really doesn't matter unless you just happen to be doing hardcore maths in your scripts (and I mean hardcore).

    And of course IL2CPP would help remedy this as it doesn't rely on the mono runtime.

    Oh, and I wouldn't necessarily call this "mono is fubar"... that really undermines the meaning of fubar. It was a design choice made by the mono team done so without foresight of where CPUs were necessarily going in the future since well... the 2000s were a bit crazy when it came to CPU designs.

    ...

    2 other points though that I find funny about the System.MathF...

    1) Ironically it too passes through to System.Math for a couple methods like Sign/Abs/Max/Min/some more.

    2) Most of the methods aren't actually direct arithmetic operations and instead are complex operations like trig and the sort. These are generally passed through and aren't actually implemented in IL, and therefore aren't necessarily affected by the Mono runtime's JIT preferring doubles. I ran the same memory analyzer on that as well and noted there were still some calls to cvtss2sd for various random things (mainly debugger related stuff), but overall they aren't necessarily impacted in the same way as, say, Vector3.op_Multiply would be.

    Code (csharp):
    // Decompiled with JetBrains decompiler
    // Type: System.MathF
    // Assembly: mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089
    // MVID: 9AAD1B3A-4748-4D63-BA2B-3985692D80E9
    // Assembly location: C:\zTemp\VectorMulTestBuild\TestVectorMul_Data\Managed\mscorlib.dll

    using System.Runtime.CompilerServices;

    namespace System
    {
      public static class MathF
      {
        private static float[] roundPower10Single = new float[7]
        {
          1f,
          10f,
          100f,
          1000f,
          10000f,
          100000f,
          1000000f
        };
        private static float singleRoundLimit = 1E+08f;
        public const float E = 2.718282f;
        public const float PI = 3.141593f;
        private const int maxRoundingDigits = 6;

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static float Abs(float x)
        {
          return Math.Abs(x);
        }

        public static float IEEERemainder(float x, float y)
        {
          if (float.IsNaN(x))
            return x;
          if (float.IsNaN(y))
            return y;
          float num = x % y;
          if (float.IsNaN(num))
            return float.NaN;
          if ((double) num == 0.0 && float.IsNegative(x))
            return -0.0f;
          float x1 = num - MathF.Abs(y) * (float) MathF.Sign(x);
          if ((double) MathF.Abs(x1) == (double) MathF.Abs(num))
          {
            float x2 = x / y;
            if ((double) MathF.Abs(MathF.Round(x2)) > (double) MathF.Abs(x2))
              return x1;
            return num;
          }
          if ((double) MathF.Abs(x1) < (double) MathF.Abs(num))
            return x1;
          return num;
        }

        public static float Log(float x, float y)
        {
          if (float.IsNaN(x))
            return x;
          if (float.IsNaN(y))
            return y;
          if ((double) y == 1.0 || (double) x != 1.0 && ((double) y == 0.0 || float.IsPositiveInfinity(y)))
            return float.NaN;
          return MathF.Log(x) / MathF.Log(y);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static float Max(float x, float y)
        {
          return Math.Max(x, y);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static float Min(float x, float y)
        {
          return Math.Min(x, y);
        }

        [Intrinsic]
        public static float Round(float x)
        {
          if ((double) x == (double) (int) x)
            return x;
          float x1 = MathF.Floor(x + 0.5f);
          if ((double) x == (double) MathF.Floor(x) + 0.5 && (double) MathF.FMod(x1, 2f) != 0.0)
            --x1;
          return MathF.CopySign(x1, x);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static float Round(float x, int digits)
        {
          return MathF.Round(x, digits, MidpointRounding.ToEven);
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static float Round(float x, MidpointRounding mode)
        {
          return MathF.Round(x, 0, mode);
        }

        public static unsafe float Round(float x, int digits, MidpointRounding mode)
        {
          switch (digits)
          {
            case 0:
            case 1:
            case 2:
            case 3:
            case 4:
            case 5:
            case 6:
              switch (mode)
              {
                case MidpointRounding.ToEven:
                case MidpointRounding.AwayFromZero:
                  if ((double) MathF.Abs(x) < (double) MathF.singleRoundLimit)
                  {
                    float num = MathF.roundPower10Single[digits];
                    x *= num;
                    if (mode == MidpointRounding.AwayFromZero)
                    {
                      float x1 = MathF.ModF(x, &x);
                      if ((double) MathF.Abs(x1) >= 0.5)
                        x += (float) MathF.Sign(x1);
                    }
                    else
                      x = MathF.Round(x);
                    x /= num;
                  }
                  return x;
                default:
                  throw new ArgumentException(SR.Format("The Enum type should contain one and only one instance field.", (object) mode, (object) "MidpointRounding"), nameof (mode));
              }
            default:
              throw new ArgumentOutOfRangeException(nameof (digits), "Rounding digits must be between 0 and 15, inclusive.");
          }
        }

        [MethodImpl(MethodImplOptions.AggressiveInlining)]
        public static int Sign(float x)
        {
          return Math.Sign(x);
        }

        public static unsafe float Truncate(float x)
        {
          double num = (double) MathF.ModF(x, &x);
          return x;
        }

        private static float CopySign(float x, float y)
        {
          int int32Bits1 = BitConverter.SingleToInt32Bits(x);
          int int32Bits2 = BitConverter.SingleToInt32Bits(y);
          if ((int32Bits1 ^ int32Bits2) >> 31 != 0)
            return BitConverter.Int32BitsToSingle(int32Bits1 ^ int.MinValue);
          return x;
        }

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Acos(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Acosh(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Asin(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Asinh(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Atan(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Atan2(float y, float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Atanh(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Cbrt(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Ceiling(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Cos(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Cosh(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Exp(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Floor(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Log(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Log10(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Pow(float x, float y);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Sin(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Sinh(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Sqrt(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Tan(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        public static extern float Tanh(float x);

        [MethodImpl(MethodImplOptions.InternalCall)]
        private static extern float FMod(float x, float y);

        [MethodImpl(MethodImplOptions.InternalCall)]
        private static extern unsafe float ModF(float x, float* intptr);
      }
    }
    ...

    TLDR;

    eh, so what?
     
    Last edited: Nov 29, 2022
  28. Owen-Reynolds

    Owen-Reynolds

    Joined:
    Feb 15, 2012
    Posts:
    1,777
    Maybe you write code and think to yourself "factory pattern" and it really helps you out. But my point is that it won't help others. Say someone asks about making a very adaptable spawner (where we can't just change the attached prefab). We could say "replace Instantiate with a delegate spawn function". We give an example or we don't, but they can look up how delegates work and see how it lets them "plug in" different functions.

    If we instead say "use the Factory pattern" they get a mess. They're told to write an abstract class as a holder for their plug-in subclass which overrides one function. No one does that anymore, and it's also a rough place to try to learn polymorphism. Likewise for Strategy: we could say "remember those delegates for spawning? You can also use them to plug in your different logic". Or we could tell them to read Strategy, which explains it as if it's a whole new thing (and also with an abstract class).
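
    A minimal sketch of the "delegate spawn function" idea described above, written as plain C# so it runs outside Unity. The names Enemy, Spawner and SetSpawnFunction are hypothetical; in a Unity project the delegate body would typically wrap a call to Instantiate.

```csharp
using System;

// Hypothetical stand-in for whatever gets spawned (a GameObject in Unity).
public sealed class Enemy
{
    public string Kind { get; }
    public Enemy(string kind) { Kind = kind; }
}

public sealed class Spawner
{
    // The plug-in point: any function that produces an Enemy.
    private Func<Enemy> spawn;

    public Spawner(Func<Enemy> spawn)
    {
        this.spawn = spawn ?? throw new ArgumentNullException(nameof(spawn));
    }

    // Swap the spawning behaviour at runtime without subclassing.
    public void SetSpawnFunction(Func<Enemy> next)
    {
        spawn = next ?? throw new ArgumentNullException(nameof(next));
    }

    public Enemy SpawnOne() => spawn();
}
```

    Usage would look like `var s = new Spawner(() => new Enemy("grunt"));` followed later by `s.SetSpawnFunction(() => new Enemy("boss"));`. The same field-holding-a-delegate shape covers the Strategy case as well.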
     
  29. orionsyndrome

    orionsyndrome

    Joined:
    May 4, 2014
    Posts:
    2,178
    Ah but I never talked about Mono. If you want performance and can use a different runtime, don't bother with it. There are at least half a dozen Unite videos I watched where the speaker would suggest to everyone to simply ignore Mono when a chart would include it for a performance comparison. It's literally off the chart for most of the things we would today consider mandatory, like math performance. It simply didn't age well, because other technologies managed to squeeze the extra juice in the meantime, and it stagnated. I guess everybody thought hardware would keep progressing forever, when in fact it began to parallelize.

    That's because Math already included some native support for 32-bit values.

    Here's a complete list:
    Abs(single)
    Clamp(single, single, single)
    Max(single, single)
    Min(single, single)
    Sign(single)

    But no trigonometry, no sqrt, no log, no exp, no rounding, etc.
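
    To illustrate the difference in practice, here is a small sketch (the class and method names are made up for the example). Math.Abs has a float overload, while Math.Sqrt only accepts and returns double, so a float argument is widened and the result has to be cast back. MathF.Sqrt is the float-typed counterpart, assuming a runtime profile that ships MathF.

```csharp
using System;

public static class FloatMathDemo
{
    // Math.Abs has a float overload, so no cast is needed.
    public static float AbsExample(float x) => Math.Abs(x);

    // Math.Sqrt only exists for double: x is widened implicitly,
    // and the double result must be cast back to float.
    public static float SqrtViaDouble(float x) => (float)Math.Sqrt(x);

    // MathF.Sqrt takes and returns float directly.
    public static float SqrtViaMathF(float x) => MathF.Sqrt(x);
}
```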
     
    Last edited: Nov 30, 2022
    mwss1996 likes this.
  30. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    I know, I just think it's funny from a semantics perspective.
     
  31. mwss1996

    mwss1996

    Joined:
    May 27, 2018
    Posts:
    5
    Just loved the refactoring.guru site, thank you!
     
    orionsyndrome likes this.
  32. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    It's not so easy if you have an existing large complex game. For example, we have written our own net code and it uses System.Collections.Concurrent, which last time I tried didn't compile under il2cpp. Plus il2cpp is not a silver bullet. For example, the .NET 6 JIT produces more optimized code than il2cpp.
     
    orionsyndrome likes this.
  33. MulleDK19

    MulleDK19

    Joined:
    Sep 23, 2016
    Posts:
    1
    Mono in general isn't just not exactly top of the line, it's so far from the line, the line is a dot..
    Mono was never meant for gaming.
    I mean, they officially call it a "fast JIT" JIT-compiler.

    Well, this is what Mono says:
    Whether Unity is using the "fast mini JIT" engine or the "heavy duty engine", I don't know.
    However, I can tell you that the Mono implementation in Unity 2022 isn't great at optimizing. Other than inlining, quite the opposite. More instructions are generated than necessary.

    At the IL level, yes; not at the JIT level.

    The Mono implementation used by Unity does not support 32-bit floating point operations at all.
    This was fixed many years ago in Mono, but like with everything else, Unity has stayed behind.

    The implementation used by Unity converts every single value to double precision before doing any operation, and then converts it back to single precision. Even a simple copy (eg. a return).

    Code (csharp):
    private static float x;
    public static void Bla()
    {
        x = 15f;
    }

    public static float GetX()
    {
        return x;
    }
    Code (csharp):
    TestTestTestTest:Bla+00 - 48 83 EC 08            - sub rsp,08
    TestTestTestTest:Bla+04 - F3 0F10 05 24000000    - movss xmm0,[FloatValue15] ; Load 15f
    TestTestTestTest:Bla+0C - F3 0F5A C0             - cvtss2sd xmm0,xmm0        ; Convert it to double
    TestTestTestTest:Bla+10 - 48 B8 CCB52CD30B010000 - mov rax,0000010BD32CB5CC  ; Load address of x
    TestTestTestTest:Bla+1A - F2 0F5A E8             - cvtsd2ss xmm5,xmm0        ; Convert it back to float
    TestTestTestTest:Bla+1E - F3 0F11 28             - movss [rax],xmm5          ; Store it in x
    TestTestTestTest:Bla+22 - 48 83 C4 08            - add rsp,08
    TestTestTestTest:Bla+26 - C3                     - ret

    TestTestTestTest:GetX+00 - 48 83 EC 08            - sub rsp,08
    TestTestTestTest:GetX+04 - 48 B8 CCB52CD30B010000 - mov rax,0000010BD32CB5CC  ; Load address of x
    TestTestTestTest:GetX+0E - F3 0F10 00             - movss xmm0,[rax]          ; Load x
    TestTestTestTest:GetX+12 - F3 0F5A C0             - cvtss2sd xmm0,xmm0        ; Convert it to double
    TestTestTestTest:GetX+16 - F2 0F5A C0             - cvtsd2ss xmm0,xmm0        ; Convert it back to float
    TestTestTestTest:GetX+1A - 48 83 C4 08            - add rsp,08
    TestTestTestTest:GetX+1E - C3                     - ret
    The above translates to
    Code (csharp):
    private static float x;
    public static void Bla()
    {
        x = (float)(double)15f;
    }

    public static float GetX()
    {
        return (float)(double)x;
    }
    IL2CPP should not be the solution to a S***ty JIT compiler. Nor can you just switch every project to IL2CPP.
    They're finally switching to .NET, but that should have happened a long time ago.

    It F***ing should be. A cast here and there may not be a big issue, but we aren't talking a few casts here and there, we're talking casts between every single F***ing thing. Sometimes even several times. I.e., instead of just (float)(double) it's (float)(double)(float)(double).

    Code (csharp):
    private static void Bla()
    {
        float a = 1;
        float b = 2;
        float c = a + b;
    }
    Compiles to:
    Code (csharp):
    TestTestTestTest:Bla+00 - sub rsp,08
    TestTestTestTest:Bla+04 - movss xmm0,[FloatValue1] ; Load 1f.
    TestTestTestTest:Bla+0C - cvtss2sd xmm0,xmm0       ; Convert it to double and discard the result.
    TestTestTestTest:Bla+10 - movss xmm0,[FloatValue2] ; Load 2f.
    TestTestTestTest:Bla+18 - cvtss2sd xmm0,xmm0       ; Convert it to double.
    TestTestTestTest:Bla+1C - cvtsd2ss xmm5,xmm0       ; Convert it back to float.
    TestTestTestTest:Bla+20 - movss [rsp],xmm5         ; Store it in local variable.
    TestTestTestTest:Bla+25 - add rsp,08
    TestTestTestTest:Bla+29 - ret
    Essentially:
    Code (csharp):
    private static void Bla()
    {
        float x = 1f;
        _ = (double)x;              // Converted to double, result discarded.
        x = 2f;
        float y = (float)(double)x; // Converted to double and back to float.
    }
    or
    Code (csharp):
    private static void Bla()
    {
        _ = (double)1f;              // Converted to double, result discarded.
        float x = (float)(double)2f; // Converted to double and back to float.
    }
    Basically, every read or write from or to a float is converted to double and back.
    And since Mono doesn't exactly optimize much, you get a lot of redundancy. Unity doesn't even perform elisions. This method SHOULD have been compiled to:
    Code (csharp):
    TestTestTestTest:Bla+00 - ret
    To top it off, internal calls (e.g. a call to Math.Abs(float)) have enormous overhead. A simple Math.Abs(float) call executes around 60 instructions, of which 2 are the actual Math.Abs(float). That is, 3.33% of the executed instructions are the actual code; the rest is the call to it.

    This is a trace of a simple Math.Abs(float) call (Note, this is not the disassembly of the function, this is a trace, showing each executed instruction, including the initial call (but excluding the argument passing)):
    Code (csharp):
    00 | 49:BB 400FB81754020000  | mov r11,25417B80F40                     ; Address of Math.Abs (or rather, the code responsible for transitioning to and from the internal call).
    01 | 41:FFD3                 | call r11                                ; Call it.
    02 | 55                      | push rbp                                ; Everything from now on till the end, other than what I've commented as Abs() is the transitioning code.
    03 | 48:8BEC                 | mov rbp,rsp                             ; All this occurs for every single call to Math.Abs.
    04 | 48:81EC 90000000        | sub rsp,90
    05 | 48:8965 C0              | mov qword ptr ss:[rbp-40],rsp
    06 | 48:896D B8              | mov qword ptr ss:[rbp-48],rbp
    07 | 48:895D C8              | mov qword ptr ss:[rbp-38],rbx
    08 | 48:8975 D0              | mov qword ptr ss:[rbp-30],rsi
    09 | 48:897D D8              | mov qword ptr ss:[rbp-28],rdi
    0A | 4C:8965 E0              | mov qword ptr ss:[rbp-20],r12
    0B | 4C:896D E8              | mov qword ptr ss:[rbp-18],r13
    0C | 4C:8975 F0              | mov qword ptr ss:[rbp-10],r14
    0D | 4C:897D F8              | mov qword ptr ss:[rbp-8],r15
    0E | F3:0F1145 A0            | movss dword ptr ss:[rbp-60],xmm0
    0F | 48:8965 C0              | mov qword ptr ss:[rbp-40],rsp
    10 | 48:8D6424 00            | lea rsp,qword ptr ss:[rsp]
    11 | 90                      | nop
    12 | 49:BB 908F72F2F87F0000  | mov r11,mono-2.0-bdwgc.7FF8F2728F90
    13 | 41:FFD3                 | call r11
    14 | 8B0D 06C66A00           | mov ecx,dword ptr ds:[7FF8F2DD559C]
    15 | 6548:8B0425 58000000    | mov rax,qword ptr gs:[58]
    16 | BA 18000000             | mov edx,18
    17 | 48:8B04C8               | mov rax,qword ptr ds:[rax+rcx*8]
    18 | 48:8B0402               | mov rax,qword ptr ds:[rdx+rax]
    19 | C3                      | ret
    1A | 48:8BF0                 | mov rsi,rax
    1B | 48:8BC5                 | mov rax,rbp
    1C | 48:83C0 B0              | add rax,FFFFFFFFFFFFFFB0
    1D | 48:8B0E                 | mov rcx,qword ptr ds:[rsi]
    1E | 48:894D B0              | mov qword ptr ss:[rbp-50],rcx
    1F | 48:8906                 | mov qword ptr ds:[rsi],rax
    20 | F3:0F1045 A0            | movss xmm0,dword ptr ss:[rbp-60]
    21 | F3:0F5AC0               | cvtss2sd xmm0,xmm0
    22 | 48:B8 90A588F2F87F0000  | mov rax,mono-2.0-bdwgc.7FF8F288A590
    23 | F2:0F5AC0               | cvtsd2ss xmm0,xmm0
    24 | 48:8965 C0              | mov qword ptr ss:[rbp-40],rsp
    25 | FFD0                    | call rax
    26 | 0F5405 B9E14B00         | andps xmm0,xmmword ptr ds:[7FF8F2D48750] ; Actual Math.Abs() code
    27 | C3                      | ret                                      ; Actual Math.Abs() code
    28 | F3:0F5AC0               | cvtss2sd xmm0,xmm0
    29 | 48:B8 000909F3F87F0000  | mov rax,mono-2.0-bdwgc.7FF8F3090900
    2A | 8B00                    | mov eax,dword ptr ds:[rax]
    2B | F2:0F1145 A8            | movsd qword ptr ss:[rbp-58],xmm0
    2C | 85C0                    | test eax,eax
    2D | 0F85 62000000           | jne 25417B8103B
    2E | E9 0A000000             | jmp 25417B80FE8
    2F | F2:0F1045 A8            | movsd xmm0,qword ptr ss:[rbp-58]
    30 | 48:8B45 B0              | mov rax,qword ptr ss:[rbp-50]
    31 | 48:8906                 | mov qword ptr ds:[rsi],rax
    32 | F2:0F5AC0               | cvtsd2ss xmm0,xmm0
    33 | 48:8B5D C8              | mov rbx,qword ptr ss:[rbp-38]
    34 | 48:8B75 D0              | mov rsi,qword ptr ss:[rbp-30]
    35 | 48:8B7D D8              | mov rdi,qword ptr ss:[rbp-28]
    36 | 4C:8B65 E0              | mov r12,qword ptr ss:[rbp-20]
    37 | 4C:8B6D E8              | mov r13,qword ptr ss:[rbp-18]
    38 | 4C:8B75 F0              | mov r14,qword ptr ss:[rbp-10]
    39 | 4C:8B7D F8              | mov r15,qword ptr ss:[rbp-8]
    3A | 48:8D65 00              | lea rsp,qword ptr ss:[rbp]
    3B | 5D                      | pop rbp
    3C | C3                      | ret

    Despite Math.Abs(float) being a single instruction, Mono does not inline the internal call.

    In comparison, here's "A = Math.Abs(B);" in .NET 6, with all optimizations disabled, including the call to Math.Abs(float) with argument passing, and writing the returned value back to A:
    Code (csharp):
    0 | 00007FF872C371B | C5FA1005 A3FB0B00       | vmovss xmm0,dword ptr ds:[7FF872CF6D64]
    1 | 00007FF872C371C | E8 BA5BDD5F             | call coreclr.7FF8D2A0CD80
    2 | 00007FF8D2A0CD8 | 48:83EC 28              | sub rsp,28
    3 | 00007FF8D2A0CD8 | E8 07020000             | call coreclr.7FF8D2A0CF90
    4 | 00007FF8D2A0CF9 | 0F5405 89000600         | andps xmm0,xmmword ptr ds:[7FF8D2A6D020]
    5 | 00007FF8D2A0CF9 | C3                      | ret
    6 | 00007FF8D2A0CD8 | 48:83C4 28              | add rsp,28
    7 | 00007FF8D2A0CD8 | C3                      | ret
    8 | 00007FF872C371C | C5FA1105 96FB0B00       | vmovss dword ptr ds:[7FF872CF6D64],xmm0
    To top it off, .NET can inline the call, making the whole thing just:
    Code (csharp):
    00007FF872C2715E | C5FA1005 FAFB0B00 | vmovss xmm0, dword ptr [0x7FF872CE6D60]
    00007FF872C27166 | C5F85405 52000000 | vandps xmm0, xmm0, xmmword ptr [0x7FF872C271C0]
    00007FF872C2716E | C5FA1105 EAFB0B00 | vmovss dword ptr [0x7FF872CE6D60], xmm0
    3 vs Mono's ~70 instructions.

    HOWEVER, the Math.Abs(double) overload IS inlined, as well as other methods taking doubles.
    It seems if the internal call takes floats, it triggers the transitioning code.
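
    For anyone who wants to check this on their own runtime, here is a rough timing sketch. It is a simple Stopwatch harness under assumed names (AbsTiming and its methods are hypothetical), not a rigorous benchmark. It compares the float overload of Math.Abs against widening to double first, which by the observation above would take the inlined double path on Unity's Mono. Both variants return the same values, since a float's absolute value is exactly representable after the round trip through double.

```csharp
using System;
using System.Diagnostics;

public static class AbsTiming
{
    // Sums |x| using the float overload of Math.Abs.
    public static float SumAbsFloat(float[] values)
    {
        float sum = 0f;
        for (int i = 0; i < values.Length; i++) sum += Math.Abs(values[i]);
        return sum;
    }

    // Sums |x| by widening to double first, so the double overload is used.
    public static float SumAbsViaDouble(float[] values)
    {
        float sum = 0f;
        for (int i = 0; i < values.Length; i++) sum += (float)Math.Abs((double)values[i]);
        return sum;
    }

    // Times a single call of f over the array, in milliseconds.
    public static double TimeMs(Func<float[], float> f, float[] values, out float result)
    {
        var sw = Stopwatch.StartNew();
        result = f(values);
        sw.Stop();
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

    Calling `AbsTiming.TimeMs(AbsTiming.SumAbsFloat, data, out var r1)` and the same with SumAbsViaDouble on a large array gives a first impression; results will vary by runtime, and a warm-up run is advisable so JIT compilation time is not counted.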

    And that's just internal calls. Everything Mono in Unity does, it does as complex as possible..
    Here's a simple "quat = Quaternion.identity;" (where quat is a static field):

    Mono in Unity 2022.2.0b16:
    Code (csharp):
    48:B8 508468FED0020000   | mov rax, 0x2D0FE688450           ; Load address of static readonly Quaternion.identityQuaternion.
    48:6308                  | movsxd rcx, dword ptr [rax]      ; Sign extend Quaternion.identityQuaternion.X 32-bit to 64-bits.
    894D E0                  | mov dword ptr [rbp-0x20], ecx    ; Store lower 32-bits in local variable.
    48:6348 04               | movsxd rcx, dword ptr [rax+0x4]  ; Sign extend Quaternion.identityQuaternion.Y 32-bit to 64-bits.
    894D E4                  | mov dword ptr [rbp-0x1C], ecx    ; Store lower 32-bits in local variable.
    48:6348 08               | movsxd rcx, dword ptr [rax+0x8]  ; Sign extend Quaternion.identityQuaternion.Z 32-bit to 64-bits.
    894D E8                  | mov dword ptr [rbp-0x18], ecx    ; Store lower 32-bits in local variable.
    48:6340 0C               | movsxd rax, dword ptr [rax+0xC]  ; Sign extend Quaternion.identityQuaternion.W 32-bit to 64-bits.
    8945 EC                  | mov dword ptr [rbp-0x14], eax    ; Store lower 32-bits in local variable.
    48:B8 FC8168FED0020000   | mov rax, 0x2D0FE6881FC           ; Load address of static field quat.
    48:634D E0               | movsxd rcx, dword ptr [rbp-0x20] ; Read Quaternion.identityQuaternion.X from local variable and sign extend to 64-bit.
    8908                     | mov dword ptr [rax], ecx         ; Write to quat.X.
    48:634D E4               | movsxd rcx, dword ptr [rbp-0x1C] ; Read Quaternion.identityQuaternion.Y from local variable and sign extend to 64-bit.
    8948 04                  | mov dword ptr [rax+0x4], ecx     ; Write to quat.Y.
    48:634D E8               | movsxd rcx, dword ptr [rbp-0x18] ; Read Quaternion.identityQuaternion.Z from local variable and sign extend to 64-bit.
    8948 08                  | mov dword ptr [rax+0x8], ecx     ; Write to quat.Z.
    48:634D EC               | movsxd rcx, dword ptr [rbp-0x14] ; Read Quaternion.identityQuaternion.W from local variable and sign extend to 64-bit.
    8948 0C                  | mov dword ptr [rax+0xC], ecx     ; Write to quat.W.
    When it should be just (.NET Framework):
    Code (csharp):
    C5 F8 77                        vzeroupper                         ; Zero upper 128+ bits of YMM/ZMM to avoid transition penalties.
    48 B8 E0 5A B3 70 F8 01 00 00   mov         rax,1F870B35AE0h       ; Load address of static readonly Quaternion.identityQuaternion.
    48 8B 00                        mov         rax,qword ptr [rax]    ; All user-defined structs in .NET are boxed in static fields, so read it from the heap.
    48 83 C0 08                     add         rax,8                  ; Skip the object header (unbox it).
    48 BA E8 5A B3 70 F8 01 00 00   mov         rdx,1F870B35AE8h       ; Load address of static field 'quat'.
    48 8B 12                        mov         rdx,qword ptr [rdx]    ; Dereference heap.
    48 83 C2 08                     add         rdx,8                  ; Unbox.
    C4 E1 7A 6F 00                  vmovdqu     xmm0,xmmword ptr [rax] ; Read 128 bits from Quaternion.identityQuaternion.
    C4 E1 7A 7F 02                  vmovdqu     xmmword ptr [rdx],xmm0 ; Write the 128 bits to quat.

    Except, as I've shown above, most internal calls have massive overhead in terms of transitioning.
    Even if Unity implemented everything in C++, it would still be slower than pure C++, because of the overhead of transitioning.

    But sure, if your game has no logic at all, Unity is plenty fast..


    Mono is not meant for games; it never was. The "Quaternion * Quaternion" operator alone is almost 400 instructions in Mono, and used to be almost 500; while just 70 in .NET Framework (and a few less in pure C++). And not just a lot more instructions than there should be; less efficient instructions.

    Here's a benchmark I did at the end of 2020 ("Latest Mono" obviously refers to the latest version at the time):
    Code (csharp):
    Quaternion * Quaternion operator benchmark

    RUNTIME/COMPILER   INSTRUCTIONS   AVERAGE TIME   SPEED COMPARED TO C++
                 C++             62     22.362 ns.               (100.00%)
      .NET Framework             70     22.554 ns.                (99.15%)
              IL2CPP             86     24.685 ns.                (90.59%)
           .NET Core             79     25.703 ns.                (87.00%)
         Latest Mono            160     38.267 ns.                (58.44%)
     Unity 2020 Mono            373     58.295 ns.                (38.36%)
     Unity 2019 Mono            463     67.868 ns.                (32.95%)
    Double arithmetic and float arithmetic are not identical. It isn't just a design choice; it changes the result of the calculations.
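
    A small worked example of how the results can diverge. This is a general illustration of float versus double intermediates, not a reproduction of what Unity's Mono emits, and the class and method names are invented for the sketch. Near 2^24 a float can only represent every second integer, so adding 1 twice with float rounding after each step loses both additions, while keeping the intermediate in double does not.

```csharp
using System;

public static class FloatVsDouble
{
    // Adds b to a twice, rounding to float after every addition.
    public static float AddTwiceInFloat(float a, float b)
    {
        float step = (float)(a + b); // explicit cast forces rounding to float precision
        return (float)(step + b);
    }

    // Same sum, but the intermediate stays in double and is rounded to float once at the end.
    public static float AddTwiceInDouble(float a, float b)
    {
        double step = (double)a + b;
        return (float)(step + b);
    }
}
```

    With a = 16777216f (2^24) and b = 1f, AddTwiceInFloat returns 16777216 and AddTwiceInDouble returns 16777218.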

    And Mono fixed this mistake EIGHT YEARS ago, but Unity is still using some very old Frankenstein version of Mono.

    Yeah, that isn't just "some people". That's Miguel de Icaza, Mono's founder, and according to him and that very post, the performance difference is far from negligible..
    Simply doing all the calculations in floats instead of converting everything to double and back, alone is a 20% performance gain; at least.

    While I did not benchmark your function, I can tell you that Mono in Unity 2022.2.0b16 produces 583 instructions with 94 casts from float to double, and 59 casts from double to float.
    And that's just SLerpUnit, ignoring any function it calls, including your own.

    The call to SLerpUnit below executed 961 instructions (of which 152 are the calls to onUnitSphere and argument passing), cast float to double 96 times, and double to float 63 times. (Note: the functions contain more casts than that; these are just the number of casts executed in this test run.)
    Code (csharp):
    Vector3 a = UnityEngine.Random.onUnitSphere.normalized;
    Vector3 b = UnityEngine.Random.onUnitSphere.normalized;
    Vector3 c = SLerpUnit(a, b, 0.5f);
    In comparison, excluding the callsite, the .NET 6 JIT'ter produced just 153 instructions for the main SLerpUnit method, and the runtime executed just 256 (and far more efficient) instructions, with only 3 casts from float to double, and 3 casts from double to float.
    Note: Since you can't run .NET 6 on Unity, I made a simple stub version of Vector3 with just the used methods for the .NET test run.


    Which ones did you benchmark specifically?

    Because calling MathF.Abs(float) produces 100% the same code as calling Math.Abs(float).

    And MathF.Ceiling(float) is much slower than Math.Ceiling(double) because MathF.Ceiling(float) takes a float, and thus invokes the transitioning code.
    IE. MathF.Ceiling(float) executes 61 instructions, while Math.Ceiling(double) executes 1 instruction.

    MathF being float is not the reason to use it, since, as we've already established, Mono converts everything to double anyway, and then back to float.
    Eg. a MathF.Ceiling(float) call is converted to essentially: "MathF.Ceiling((float)(double)floatField);"

    And with MathF.Ceiling taking a float, it invokes the transitioning call, whereas Math.Ceiling is inlined.

    IE:
    Code (csharp):
    0000019244221BCB | 48:B8 E830F20092010000  | mov rax, 0x19200F230E8      ; Load address of floatField.
    0000019244221BD5 | F3:0F1000               | movss xmm0, dword ptr [rax] ; Load float.
    0000019244221BD9 | F3:0F5AC0               | cvtss2sd xmm0, xmm0         ; Convert it to double.
    0000019244221BDD | F2:0F5AC0               | cvtsd2ss xmm0, xmm0         ; Convert it back to float.
    0000019244221BE1 | 48:8D6424 00            | lea rsp, qword ptr [rsp]    ; 5 byte NOP.
    0000019244221BE6 | 49:BB 701C224492010000  | mov r11, 0x19244221C70      ; Load address of MathF.Ceiling(float); or rather, the transition code.
    0000019244221BF0 | 41:FFD3                 | call r11                    ; Call MathF.Ceiling(float) through the 60+ instruction transitioner.
    Versus Math.Ceiling(double) which is just:
    Code (csharp):
    00000226C7631BCB | 48:B8 48FF368426020000  | mov rax, 0x2268436FF48      ; Load address of floatField.
    00000226C7631BD5 | F3:0F1000               | movss xmm0, dword ptr [rax] ; Load float.
    00000226C7631BD9 | F3:0F5AC0               | cvtss2sd xmm0, xmm0         ; Convert it to double.
    00000226C7631BDD | F2:0F10C0               | movsd xmm0, xmm0            ; 4 byte NOP.
    00000226C7631BE1 | 66:0F3A09C0 02          | roundpd xmm0, xmm0, 0x2     ; Math.Ceiling(double) inlined.
    00000226C7631BE7 | F2:0F10C0               | movsd xmm0, xmm0            ; 4 byte NOP.
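
    If those measurements hold for your Unity version, one possible workaround is to route float ceilings through the double overload. The helper below is only a sketch with a made-up name (CeilingViaDouble). It should return the same values as MathF.Ceiling, because every float converts exactly to double and the ceiling of a float-representable value is itself representable as a float. Whether it is actually faster depends on the runtime and should be measured.

```csharp
using System;

public static class CeilingHelpers
{
    // Widens to double so Math.Ceiling(double) is used, then narrows the result back to float.
    public static float CeilingViaDouble(float x) => (float)Math.Ceiling((double)x);
}
```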

    If they'd even just switched to the latest build of Mono, instead of staying on these 10 year old Frankenstein builds, that alone, would be a massive improvement..
     
    Max-om likes this.
  34. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    There is a long history as to why Unity is on such an outdated version of Mono.

    Back before they had backing from Microsoft, they had a special license for Mono (being open source doesn't exempt it from stipulations on commercial use, depending on the open-source license applied). This was very early on in Unity's life. They had done, as you refer to, a 'frankenstein' on the thing to integrate it into Unity.

    Thing is, after Mono matured more they attempted to update it internally. But Xamarin refused to give them the rights to do so without a hefty increase in their license agreement, which Unity didn't want to pay. Cause they're a company and their bottom line matters to them.

    This is why for years we were stuck on a very outdated version of C#/.net support.

    Unity started work on IL2CPP as a workaround then... main part being so that they could get away from Xamarin's AOT (which most of the license was held up in, and was the major expense of the license), but it also would facilitate getting their C# version pushed forward some more.

    But then Microsoft stepped in and facilitated further remedying the license situation... but Unity had already sunk a lot of effort into IL2CPP at this point, so there was no reason to toss that out.

    And thusly we are in the boat we are now.

    ...

    Or at least that's the most concise history I could do of it... there is definitely a lot missing from what I'm saying.

    But in the end, the general thing is... long term software development is chaotic and can not always rely on always staying on the bleeding edge. Instead it's full of compromises.

    Or to put it shortly...
    I was making a tangential nod to a topic that many of us legacy people here are very well aware of without having to crack that tangent open all the way.

    ...

    As to massive improvements.

    Eh... I honestly don't see the need for improving the efficiency of the mono runtime used in Unity.

    1) Unity has focused their efforts, which are limited, on things that can give even better improvements overall, such as DOTS, IL2CPP, and the garbage collection (I'll tell you what, that thing was a huge stinker years ago, less so today... even if you think today's GC is bad, you have no idea what it was like 10 years ago).

    2) IL2CPP has huge performance benefits. For instance the link I supplied to the blog post talking about the float32 thing in the mono runtime having lower performance. They list numbers for the IL2CPP and demonstrate that it doesn't have these performance issues.

    And sure... maybe some parts of the framework can't be accessed due to incompatibility. But they're minor parts, and you can usually find workarounds.

    And even if you can't... the mono runtime isn't horrendous. It works.

    I don't know what games you guys are writing... are y'all triple-A bleeding edge studios?

    Because I have yet to drown my CPU with the code I write in C# in Unity on the mono runtime. The GPU is almost always my bottleneck.
     
    Last edited: Nov 30, 2022
    Ryiah and mwss1996 like this.
  35. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    Here's my TLDR for me in regards to my opinion of Unity.

    Is Unity the best? Nope.

    Is Unity the most performant? Nope.

    Does Unity have cracks in its design? Oh, for certain!

    ...

    But what engine doesn't?

    What I love about Unity... despite its warts... is that I can write in a fairly simple yet robust language like C#, in an engine that's pretty user friendly from a beginner perspective, but also has enough oomph to go beyond beginner.

    If I was making the next triple-A title... Unity wouldn't be my choice. It will never be my choice. The amount of PR Unity would have to overcome to break the stigma it holds in the professional world is so immense it isn't going to happen any time soon.

    And honestly... I ain't ever going to be there either... I don't want to. I like the indie space way more.

    People like to talk about "don't fix what ain't broke"... well, the mono runtime works! Is it the best? Nope. But it works.

    I don't fret over these minor optimizations. I've been writing Unity games for 12 years now and the "oh, did you know that internally the Unity mono CLR actually treats all floats as doubles" thing has never ONCE come up until this thread, because it never once impacted me in any meaningful way.

    I like what @Kurt-Dekker often has to say about this... and I'm paraphrasing here. But he generally nods out of these conversations with the comment "I'm here to make games".
     
  36. Nad_B

    Nad_B

    Joined:
    Aug 1, 2021
    Posts:
    151
    mwss1996 and Kurt-Dekker like this.
  37. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    For small trivial games it doesn't matter, of course. But it adds up fast. And like I said, moving to il2cpp isn't easy for a large game.

    Interesting that mono performance hasn't impacted you and at the same time you advocate il2cpp.
     
  38. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    I don't advocate IL2CPP from a perspective of it's sped things up for me over poor performance I theoretically had on mono.

    I advocate it because it does speed things up, demonstrably it does, and if you truly need it, it's there. It's like saying "I've never needed to go 180mph, but if you need to, a Lotus exists for you." My knowing that such things exist doesn't mean I necessarily need them.

    Personally I've gotten into IL2CPP because recently I've been writing more games that target platforms where mono isn't available (webgl and mobile), and splitting my work across mono AND il2cpp actually complicates it more than just sticking to il2cpp.

    And that's kind of to my point though... my games aren't small/trivial. It's just that game logic usually isn't using hardcore maths. Simulations use a lot of math... but in most video games (not all, but most) the simulations are handled by the physics engine which is NOT run in mono. Same goes for graphics, which again is seldom run in mono. If you need special graphics you usually write a shader that runs on the GPU. Furthermore if you want to do some heavy duty maths there are compute shaders that allow you to tap into that as well.

    But game logic tends to be some reading inputs, translating that to movement, and done with it. The math is mostly trivial. A handful of trig methods and a couple square roots in the most extreme of situations.

    Other things might be AI graph solving like A*. But that's not really that math heavy either.

    You said "it adds up", but I have yet to see it "add up" in any of my own games. And that's why I ask... what games are you making?

    Cause like... I'm thinking of games like Cities: Skylines which is simulation heavy (it has to simulate the entire city), and runs in Unity. I've been playing it for years and never really felt any issues on the compute side of things (again, gpu/rendering, sure... but not compute).
     
    Last edited: Nov 30, 2022
    orionsyndrome and Nad_B like this.
  39. Nad_B

    Nad_B

    Joined:
    Aug 1, 2021
    Posts:
    151
    I know some big, professional, successful games running on Unity/Mono with no problem. The game I'm making is far from being "small/trivial", and runs on Mono without breaking a sweat on 5-year-old hardware (90+ FPS on a 6-year-old GTX 950). I don't plan on switching to IL2CPP at all, as I enjoy my Reflection/DI too much.

    The secret? Good architecture + good code + (most importantly) reasonable expectations from a managed, JIT compiled language.
     
    orionsyndrome likes this.
  40. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    Agreed!

    I use so much reflection and DI as well! (again could be argued it's over engineered)

    These are some major performance hitters in terms of CPU! I've run the benchmarks and I'm throwing away massive amounts of performance by using them.

    But it's convenient to our game design.

    For example I have a data binding library for connecting in game objects to UI. It's trivial to display various information on screen. And it all relies on reflection. This is great because a designer with no coding knowledge can go in and create screens worth of UI that display an abundance of information to the player without me writing a single line of code for them. I wrote the databinder, they just point it at targets and run with it.
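
    A minimal sketch of that kind of name-based binding, written in Python for brevity. `resolve_path`, `Binding`, `Player` and `Stats` are hypothetical names, not the actual library, and Python's `getattr` stands in for .NET reflection (in C# the lookup would go through something like `Type.GetProperty` and `PropertyInfo.GetValue`). The designer supplies only a target object and a member path as text:

```python
from dataclasses import dataclass, field


@dataclass
class Stats:
    health: int = 100
    ammo: int = 30


@dataclass
class Player:
    name: str = "Player 1"
    stats: Stats = field(default_factory=Stats)


def resolve_path(target, path):
    """Walk a dotted member path (e.g. "stats.health") by name at runtime."""
    value = target
    for member in path.split("."):
        # Name-based lookup: this is the reflection-style step, and also
        # the part that costs more than a direct field access.
        value = getattr(value, member)
    return value


@dataclass
class Binding:
    """Hypothetical UI binding: a target, a member path and a label format."""
    target: object
    path: str
    label: str = "{}"

    def render(self):
        return self.label.format(resolve_path(self.target, self.path))


player = Player()
hud = [
    Binding(player, "name", "Name: {}"),
    Binding(player, "stats.health", "HP: {}"),
    Binding(player, "stats.ammo", "Ammo: {}"),
]

player.stats.health -= 25  # game code changes state; the bindings need no edits
lines = [b.render() for b in hud]
print(lines)
```

    The trade-off is the one described above: every `render` pays for a lookup by name, but adding a new readout is a data change rather than a code change.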

    Still getting good framerates/performance.

    My last released game does this all over the place. And it targets mobile. We wanted it to be able to run on hardware dating back to 2015... so I made sure it ran on an old Android I had lying around. And the sole hiccup I had performance wise?

    Post-processing FX. Just cause most mid/low-tier mobile GPUs suck at post-processing fx.

    ...

    Oh and I should point out. Reflection does work in IL2CPP. The only thing that doesn't work is any dynamically generated code... i.e. System.Reflection.Emit:
    https://learn.microsoft.com/en-us/dotnet/api/system.reflection.emit?view=net-7.0

    There are also little gotchas with things like generics and the sort, as Unity needs to make everything concrete in the end when converting to C++, and if it can't, it'll fail.

    But general old reflection of type members works fine.
     
    orionsyndrome, mwss1996 and Nad_B like this.
  41. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    My game
    Even though physics is handled by native code, you often need to control it by adding forces, etc., and that is managed code. I agree though: our game, which granted has a complex domain, only spends a small portion of its frame time in our code. Most of the time is spent in the render thread and rendering.

    Though faster execution of our code means more time can be spent rendering the game. They should have moved away from Mono a long time ago in favour of .NET Core, at least for supported platforms like Windows.
     
  42. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Our domain takes about 0.15 ms to execute, zero allocation. That's 0.15 ms that could be spent rendering the game. Edit: that's on my 5950X, which by now has low single-core perf.

    Edit: our netcode uses reflection to invoke RPCs, so it's also hard to move to IL2CPP without making the RPC API worse.
     
    Nad_B likes this.
  43. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
  44. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    Sure, but again... that's not heavy duty maths. That's generally something like "read input, if user presses A, apply an impulse". The maths involved on the managed side is mostly trivial... maybe calculate a directional vector or something. We're talking all of 10 math operations run once per event that requires the calculation.

    Let's say you're even making a spawner that shoots projectiles using an impulse. In any given frame you're doing what... 100 of these in an extreme situation?
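
    A back-of-envelope sketch of that worst case, written in Python for brevity. The function name, the 100-projectile count and the impulse strength are assumptions for illustration. In Unity the resulting vector would be handed to the native physics engine (for example via `Rigidbody.AddForce` with `ForceMode.Impulse`), and the managed side only does the few operations below:

```python
import math


def impulse_toward(origin, target, strength):
    """Return an impulse vector of magnitude `strength` pointing from origin to target."""
    dx, dy, dz = (t - o for o, t in zip(origin, target))   # 3 subtractions
    length = math.sqrt(dx * dx + dy * dy + dz * dz)         # 3 mul, 2 add, 1 sqrt
    if length == 0.0:
        return (0.0, 0.0, 0.0)                              # no direction to push in
    scale = strength / length                               # 1 division
    return (dx * scale, dy * scale, dz * scale)             # 3 multiplications


OPS_PER_IMPULSE = 3 + 3 + 2 + 1 + 1 + 3   # 13 arithmetic operations, as counted above
PROJECTILES_PER_FRAME = 100                # the "extreme situation" from the post

ops_per_frame = OPS_PER_IMPULSE * PROJECTILES_PER_FRAME
impulse = impulse_toward((0.0, 0.0, 0.0), (3.0, 0.0, 4.0), 10.0)
print(impulse, ops_per_frame)
```

    That comes to roughly a thousand simple floating-point operations per frame on the managed side, which is the scale being argued about here.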

    To go back to my car analogy... most of these calculations would be like driving my car to the end of my street to the corner store. Sure, I could get there faster with a Lotus... but it's inconsequential the time difference on such small scales.

    Whereas if I raced a Lotus vs a Yaris from Boston to NYC... well yeah, the Lotus is going to perform leagues better, because it has the space to stretch its legs and really show you what it's got.

    Shoulda... woulda... coulda.

    They can only do so much in the time they have. Will they? Yeah... they're working on getting to .net, @Nad_B already linked to that.

    But I mean... get a tiny performance boost from moving off of mono... or give us a better version of IL2CPP for mobile builds, implement features like DOTS, Pathfinding, Physical Shaders, etc.

    I'll take things that make making games more awesome for 500 Alex!
     
    Ryiah and Nad_B like this.
  45. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    OK... but how heavily did it impact you?

    In that thread you note that there's an alloc, and you want to know where you can find it exactly occurring. But like... did this break your game?

    The images in the thread show immense framerates! Probably cause you have it quarantined into an empty project cause you're trying to hunt it down.

    But... like when you said:
    Well... you inferred that maybe I have had issues in mono.

    OK... well I'm going to infer from this... that you're one of the performance chasers constantly looking to squash every last tiny speck of operations you deem unnecessary. The kind of person who wants a 0 GC game.

    Whether you are or aren't... what I generally say to people like that: don't write in Unity. If you want to squash every alloc, write your game in C++ where you control every single alloc.
     
    Nad_B likes this.
  46. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    We have a ballistics engine for our game. It uses structs, tightly packed for the CPU cache. If all players empty a magazine, then because of time of flight potentially a thousand projectiles can be active. That's a few float to double casts for you. Before you tell us to move to Burst: async raycasting is still wonky, plus it's harder to implement lag compensation. With the vanilla physics library you can move transforms, call Physics.SyncTransforms and raycast, for each player in the same frame. That's not currently possible with async raycasting.
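
    A rough sketch of the data layout idea, written in Python for brevity. This is not the engine described above: `ProjectilePool`, the muzzle velocity and the integration scheme are assumptions for illustration. It only shows projectile state kept in flat, contiguous float arrays and stepped in one pass, with no per-projectile objects:

```python
from array import array

GRAVITY = -9.81  # m/s^2, acting on the y axis


class ProjectilePool:
    """Hypothetical pool storing projectile state in flat, contiguous float arrays."""

    def __init__(self):
        self.pos = array("f")   # x, y, z per projectile
        self.vel = array("f")   # vx, vy, vz per projectile

    def spawn(self, position, velocity):
        self.pos.extend(position)
        self.vel.extend(velocity)

    def __len__(self):
        return len(self.pos) // 3

    def step(self, dt):
        """Advance every live projectile by dt seconds (semi-implicit Euler)."""
        pos, vel = self.pos, self.vel
        for i in range(0, len(pos), 3):
            vel[i + 1] += GRAVITY * dt          # gravity changes vertical speed first
            pos[i] += vel[i] * dt
            pos[i + 1] += vel[i + 1] * dt
            pos[i + 2] += vel[i + 2] * dt


pool = ProjectilePool()
for n in range(1000):                           # "potentially a thousand projectiles"
    pool.spawn((0.0, 1.5, 0.0), (0.0, 0.0, 800.0))

pool.step(1.0 / 90.0)                           # one frame at 90 FPS
print(len(pool), pool.pos[2], pool.pos[1])
```

    Hit detection and lag compensation are left out; in Unity those would be the raycasts against rewound transforms described in the post.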
     
  47. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    We get some GC spikes, yes. Remember our game is VR and can never go below 90 fps.
     
  48. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,138
    Remember what?

    :hits ctrl+f and types VR... no results appear aside from this most recent post:

    This is the first time I heard your game is VR! I asked you before what kind of game you're working on so I can understand why you're so concerned about performance to this degree!

    I'm not you... I don't know you or your game. I've conceded time and time again that there are potential niche scenarios where this may be a concern... it's why I asked you what kind of game you're making. And I've said this very well may be something you need to concern yourself with.

    But in the grand scheme of things... it's not.

    And I wouldn't call it "fubar'd" or anything egregious like that on the hand of Unity. They have bigger fish to fry. They'll get to it though... which they are. It's already been linked to.

    This isn't a "when your game gets big enough" scenario... which I should add is a bit demeaning to say as it implies that everyone else is making baby games unlike your big bad-ass game... but rather this is a "my game is niche from most games". Cause VR is niche!
     
    Last edited: Nov 30, 2022
    Ryiah, mwss1996 and Nad_B like this.
  49. Nad_B

    Nad_B

    Joined:
    Aug 1, 2021
    Posts:
    151
    Sorry but how would an additional 0.15ms make your game better? 0.15ms is literally nothing, it's less than 1% of a 60 FPS frame time, and if all your game logic runs in that time I'd say kudos to you and to Unity for such an amazing performance!
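
    The arithmetic behind that figure, as a small Python sketch. The function names are made up for illustration; the 0.15 ms cost and the 60 and 90 FPS targets come from the thread, and the 1 ms cost is added only for comparison:

```python
def frame_budget_ms(target_fps):
    """Time available per frame, in milliseconds, at a given frame rate."""
    return 1000.0 / target_fps


def budget_share(cost_ms, target_fps):
    """Fraction of one frame's budget consumed by cost_ms of work."""
    return cost_ms / frame_budget_ms(target_fps)


for fps in (60, 90):
    for cost_ms in (0.15, 1.0):
        share = budget_share(cost_ms, fps)
        print(f"{cost_ms} ms at {fps} FPS: {share:.1%} of {frame_budget_ms(fps):.2f} ms")
```

    At 60 FPS the budget is about 16.67 ms, so 0.15 ms is 0.9% of it. At the 90 FPS a VR title has to hold, the budget shrinks to about 11.11 ms and the same 0.15 ms is still under 1.5%.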
     
    lordofduct and mwss1996 like this.
  50. Max-om

    Max-om

    Joined:
    Aug 9, 2017
    Posts:
    470
    Sure, it's fine I guess. But that's when players are just running around doing nothing. If they start shooting we need lag comp, ballistics simulation, etc. In co-op we have AI. It all adds up.

    I have seen the update loop reach 1 ms when a lot is going on. 1 ms in an 11 ms render time is almost 10 percent.