Question float performance can be terrible compared to double in mono?!

Discussion in 'Scripting' started by andyz, Aug 24, 2023.

  1. andyz

    Joined:
    Jan 5, 2010
    Posts:
    2,132
    While doing some speed tests recently I found that Mono is generally rubbish compared to IL2CPP for simple maths!
    But when compiling for Mono I also found that floats are just very slow where doubles are not, and I have since read some snippets suggesting .NET may be better optimised for doubles.
    Is there any truth to this, or have I just messed up a test somewhere?!
     
    Last edited: Aug 24, 2023
  2. CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    4,019
    Like what? You need to be specific and back it up with measurements or sources, otherwise you're just spreading myths, half-wisdom, plain misunderstandings, and such.

    If you want to measure math performance and want to have the fastest math in general, do use the Unity Mathematics and Burst packages.
     
  3. andyz

    I can run any line of maths like:

    Code (CSharp):
    // double version
    double div = 1.9999;
    double tot = .01;
    for (int i = 0; i < 200000; i++)
    {
        tot *= 2.0001;
        tot /= div;
    }

    // float version of the same loop
    float fdiv = 1.9999f;
    float ftot = .01f;
    for (int i = 0; i < 200000; i++)
    {
        ftot *= 2.0001f;
        ftot /= fdiv;
    }

    The float version takes nearly twice as long to run in a Mono build (the two are similar in speed with IL2CPP, though double is slightly slower).
    This is on 64-bit Windows.
     
  4. CodeSmile

    Measured how? To what accuracy? And the actual time with units please.

    Does the measurement still give the same result if you switch the float and double blocks around? Perhaps the initial cache misses cause the float part to be slower.
     
  5. wideeyenow_unity

    Joined:
    Oct 7, 2020
    Posts:
    728
    Are you using Stopwatch? If so, the first return of ticks/milliseconds is always a false number, so it's best to switch the order when comparing two methods.
     
  6. andyz

    Yes, Stopwatch. You can repeat it multiple times and switch the order; later runs are much the same as the first runs, and ticks and milliseconds give the same result.
     
  7. andyz

    But please do try yourself:

    Code (CSharp):
    using System.Collections;
    using UnityEngine;
    using System.Diagnostics;

    public class MinTest : MonoBehaviour
    {
        IEnumerator Start()
        {
            // Give the player a moment to settle before timing anything.
            yield return new WaitForSeconds( 1f );

            DoubleTest();
            FloatTest();

            yield return new WaitForSeconds( .1f );

            // Second pass with the order swapped.
            FloatTest();
            DoubleTest();
        }

        void DoubleTest()
        {
            var timer = new Stopwatch();
            timer.Start();
            double div = 1.9999;
            double tot = .01;
            for (int i = 0; i < 200000; i++)
            {
                tot *= 2.0001;
                tot /= div;
            }
            timer.Stop();
            UnityEngine.Debug.Log( "double: " + timer.Elapsed.TotalMilliseconds + "ms" );
        }

        void FloatTest()
        {
            var timer = new Stopwatch();
            timer.Start();
            float fdiv = 1.9999f;
            float ftot = .01f;
            for (int i = 0; i < 200000; i++)
            {
                ftot *= 2.0001f;
                ftot /= fdiv;
            }
            timer.Stop();
            UnityEngine.Debug.Log( "float: " + timer.Elapsed.TotalMilliseconds + "ms" );
        }
    }
     
  8. wideeyenow_unity

    That's what I mean: the first Start() is always off, so it's best to use Restart(). Test them swapped in order and you'll see what I mean. The first Start() always gathers up Unity's Editor handles; the first readout would be 234876 ticks, and every average tick count after that would be 34... it's crazy.
     
  9. wideeyenow_unity

    So I ran the test myself:
    Code (CSharp):
    using System.Diagnostics;
    using UnityEngine;

    public class Test01 : MonoBehaviour
    {
        Stopwatch stopwatch = new Stopwatch();
        public int iterations = 999999999; // highest I could go, without using long
        void Start()
        {
            stopwatch.Start();
            DoubleTest();
            stopwatch.Stop();
            print($"double 1 : {stopwatch.ElapsedTicks}");
            // Restart() zeroes the elapsed time and starts timing again.
            stopwatch.Restart();
            FloatTest();
            stopwatch.Stop();
            print($"float 1 : {stopwatch.ElapsedTicks}");
            stopwatch.Restart();
            DoubleTest();
            stopwatch.Stop();
            print($"double 2 : {stopwatch.ElapsedTicks}");
            stopwatch.Restart();
            FloatTest();
            stopwatch.Stop();
            print($"float 2 : {stopwatch.ElapsedTicks}");
        }

        void DoubleTest()
        {
            double div = 1.9999;
            double tot = .01;
            for (int i = 0; i < iterations; i++)
            {
                tot *= 2.0001;
                tot /= div;
            }
        }

        void FloatTest()
        {
            float fdiv = 1.9999f;
            float ftot = .01f;
            for (int i = 0; i < iterations; i++)
            {
                ftot *= 2.0001f;
                ftot /= fdiv;
            }
        }
    }
    EDIT: Typo, results further down
     

    Last edited: Aug 24, 2023
  10. andyz

    I have run it standalone and it is not 0. I have also done it without Start() (using Restart()).

    Change your Reset() to Restart().
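
    For reference, the three Stopwatch calls being mixed up here behave differently: Reset() stops the timer and zeroes it, Restart() zeroes it and keeps it running, and Start() resumes without clearing the accumulated time. A minimal sketch in plain C# (no Unity dependency; the class name StopwatchSemantics is made up for illustration):

```csharp
using System.Diagnostics;
using System.Threading;

// Hypothetical helper that only demonstrates Stopwatch state after Reset() and Restart().
public static class StopwatchSemantics
{
    public static (long ticksBeforeReset, long ticksAfterReset,
                   bool runningAfterReset, bool runningAfterRestart) Demo()
    {
        var sw = Stopwatch.StartNew();
        Thread.Sleep(5);                  // let some time accumulate
        sw.Stop();
        long ticksBeforeReset = sw.ElapsedTicks;

        sw.Reset();                       // stops AND zeroes: nothing is being timed now
        long ticksAfterReset = sw.ElapsedTicks;
        bool runningAfterReset = sw.IsRunning;

        sw.Restart();                     // zeroes and starts timing again
        bool runningAfterRestart = sw.IsRunning;
        sw.Stop();

        return (ticksBeforeReset, ticksAfterReset, runningAfterReset, runningAfterRestart);
    }
}
```

    So a test that calls Reset() between measurements and then reads ElapsedTicks without starting the timer again will read 0, which matches the typo described in this post.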
     
    Last edited: Aug 24, 2023
  11. wideeyenow_unity

    Oof, good catch! I typed quickly and thought it said Restart. Accurate readout:
    [Image: SW_floatDouble02.jpg]
    Okay then, yeah, that's crazy ^ ^ ^
     
  12. Neto_Kokku

    Joined:
    Feb 15, 2018
    Posts:
    1,751
    I remember reading the Mono version used in Unity does float math operations using doubles. There's a thread about that somewhere here on the forums.

    This means Mono will cast the floats to doubles, perform the operations, then cast the results back to float. When doing the math directly with doubles no conversions are done and thus it runs faster.
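
    To make that concrete, here is a rough sketch of the difference at the source level. This is only an illustration of "widen, operate, narrow", not the code Mono's JIT actually emits; the class and method names are made up:

```csharp
using System;

// Hypothetical illustration: one loop step done natively in float versus routed through double.
public static class FloatViaDouble
{
    // Single-precision multiply and divide, as written in the benchmark.
    public static float StepFloat(float total, float divide)
    {
        total *= 2.0001f;
        total /= divide;
        return total;
    }

    // The same step with every operand widened to double and every result narrowed back.
    // The arithmetic itself is not slower; the extra work is the four conversions per step.
    public static float StepViaDouble(float total, float divide)
    {
        total = (float)((double)total * (double)2.0001f);
        total = (float)((double)total / (double)divide);
        return total;
    }
}
```

    On a runtime with native single-precision arithmetic both methods should return essentially the same value, so the cost being discussed is purely the conversion overhead, which the pure double loop never pays.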

    -- EDIT --
    Found the thread:
    https://forum.unity.com/threads/uni...-issues-easy-solved-unity-please-read.830811/

    This is supposed to have been fixed in Unity 2021.2. What version are you testing on?
     
    Last edited: Aug 24, 2023
  13. andyz

    Ah yes, I think that is one of the two things I found (the other was about .NET), but it seems wildly underreported.
    I mean, IL2CPP does not seem to suffer from such issues...

    But I tested in 2022 LTS and there is still no fix, I guess.
     
  14. icauroboros

    Joined:
    Apr 30, 2021
    Posts:
    99
  15. AcidArrow

    Joined:
    May 20, 2010
    Posts:
    11,001
    Is it? In that thread there is a dev saying it will land in 2021.2 or later, so I guess it was "later".
     
  16. icauroboros

    That video (thanks to @s_schoener) explains exactly what happens under the hood of the different compilers.
     
  17. wideeyenow_unity

    I'm still on 2021.1.10f1... I'm always afraid to update in case things change (for the worse), lol :confused:
     
  18. Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    20,124
    Reading this thread gave me the urge to check on the Burst compiler. I know the first results have the compile time included, but I don't know why either of them are as low as they are. I didn't think there was that much difference between Burst and Mono.

    Code (CSharp):
    using System.Diagnostics;
    using UnityEngine;
    using Unity.Burst;
    using Unity.Jobs;

    public class PerfTest : MonoBehaviour
    {
        private void Start()
        {
            var stopwatch = new Stopwatch();

            // First pass: the job timings include Burst compile time.
            stopwatch.Start();
            DoubleTest();
            stopwatch.Stop();
            print($"<color=#FFAAAA>Double 1: {stopwatch.ElapsedTicks}</color>");

            stopwatch.Restart();
            FloatTest();
            stopwatch.Stop();
            print($"<color=#FFAAAA>Float 1: {stopwatch.ElapsedTicks}</color>");

            stopwatch.Restart();
            new DoubleJob().Schedule().Complete();
            stopwatch.Stop();
            print($"<color=#FFAAAA>DoubleJob 1: {stopwatch.ElapsedTicks}</color>");

            stopwatch.Restart();
            new FloatJob().Schedule().Complete();
            stopwatch.Stop();
            print($"<color=#FFAAAA>FloatJob 1: {stopwatch.ElapsedTicks}</color>");

            // Second pass: same measurements again.
            stopwatch.Restart();
            DoubleTest();
            stopwatch.Stop();
            print($"<color=#AAAAFF>Double 2: {stopwatch.ElapsedTicks}</color>");

            stopwatch.Restart();
            FloatTest();
            stopwatch.Stop();
            print($"<color=#AAAAFF>Float 2: {stopwatch.ElapsedTicks}</color>");

            stopwatch.Restart();
            new DoubleJob().Schedule().Complete();
            stopwatch.Stop();
            print($"<color=#AAAAFF>DoubleJob 2: {stopwatch.ElapsedTicks}</color>");

            stopwatch.Restart();
            new FloatJob().Schedule().Complete();
            stopwatch.Stop();
            print($"<color=#AAAAFF>FloatJob 2: {stopwatch.ElapsedTicks}</color>");
        }

        private void FloatTest()
        {
            float divide = 1.9999f;
            float total = 0.01f;

            for (int i = 0; i < 999999999; i++)
            {
                total *= 2.0001f;
                total /= divide;
            }
        }

        private void DoubleTest()
        {
            double divide = 1.9999;
            double total = 0.01;

            for (int i = 0; i < 999999999; i++)
            {
                total *= 2.0001;
                total /= divide;
            }
        }
    }

    [BurstCompile]
    public struct FloatJob : IJob
    {
        public void Execute()
        {
            float divide = 1.9999f;
            float total = 0.01f;

            // Note: total is never read afterwards, so an optimizing compiler
            // may be able to drop this loop entirely.
            for (int i = 0; i < 999999999; i++)
            {
                total *= 2.0001f;
                total /= divide;
            }
        }
    }

    [BurstCompile]
    public struct DoubleJob : IJob
    {
        public void Execute()
        {
            double divide = 1.9999;
            double total = 0.01;

            for (int i = 0; i < 999999999; i++)
            {
                total *= 2.0001;
                total /= divide;
            }
        }
    }

    [Image: upload_2023-8-24_19-49-18.png]
     
    Last edited: Aug 25, 2023
  19. wideeyenow_unity

    True, I was going to say: floats are 4 bytes and doubles are 8 bytes, so logic should point to the obvious.

    However, if Unity treats floats as doubles in calculations, it's doing twice the work. Or even if it's using some form of Mathf.Round() to stay accurate within its limits, that will cause it to do more work.
     
  20. Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    840
    I had always had it in the back of my mind that doubles were usually as fast and sometimes even faster than floats but those numbers are quite surprising. That being said, using floats can be useful for the sake of memory bandwidth. Perhaps a more 'real world' scenario where cache thrashing comes into play would have different results?

    Either way a 50% difference is kinda shocking to me lol
     
  21. icauroboros

    Code (CSharp):
    using Unity.Burst;
    using Unity.Collections;

    [BurstCompile]
    public static class Math_Float_vs_Double
    {
        // Plain managed versions (no Burst), used as the baseline.
        public static void FloatMul_NonBursted(ref NativeArray<float> inArr, in float val)
        {
            for (int i = 0; i < inArr.Length; i++)
            {
                inArr[i] *= val;
            }
        }

        public static void DoubleMul_NonBursted(ref NativeArray<double> inArr, in double val)
        {
            for (int i = 0; i < inArr.Length; i++)
            {
                inArr[i] *= val;
            }
        }

        // Burst-compiled versions of the same loops.
        [BurstCompile]
        public static void FloatMul([NoAlias] ref NativeArray<float> inArr, [NoAlias] in float val)
        {
            for (int i = 0; i < inArr.Length; i++)
            {
                inArr[i] *= val;
            }
        }

        [BurstCompile]
        public static void DoubleMul([NoAlias] ref NativeArray<double> inArr, [NoAlias] in double val)
        {
            for (int i = 0; i < inArr.Length; i++)
            {
                inArr[i] *= val;
            }
        }
    }
    Code (CSharp):
    // Needs: using NUnit.Framework; using Unity.Collections; using Unity.PerformanceTesting;
    [Test, Performance]
    public void Float_vs_Double_Mul()
    {
        var floats = new NativeArray<float>(1_000_000, Allocator.Temp);
        var doubles = new NativeArray<double>(1_000_000, Allocator.Temp);
        RandomFiller.FillArrayFloat(ref floats);
        RandomFiller.FillArrayDouble(ref doubles);

        // Each method is sampled 1000 times and reported in microseconds.
        SampleGroup FloatMul_NonBursted = new SampleGroup("FloatMul_NonBursted", SampleUnit.Microsecond);
        Measure.Method(() => Math_Float_vs_Double.FloatMul_NonBursted(ref floats, 36.247f))
            .SampleGroup(FloatMul_NonBursted).MeasurementCount(1000).Run();

        SampleGroup DoubleMul_NonBursted = new SampleGroup("DoubleMul_NonBursted", SampleUnit.Microsecond);
        Measure.Method(() => Math_Float_vs_Double.DoubleMul_NonBursted(ref doubles, 36.247f))
            .SampleGroup(DoubleMul_NonBursted).MeasurementCount(1000).Run();

        SampleGroup FloatMul = new SampleGroup("FloatMul", SampleUnit.Microsecond);
        Measure.Method(() => Math_Float_vs_Double.FloatMul(ref floats, 36.247f))
            .SampleGroup(FloatMul).MeasurementCount(1000).Run();

        SampleGroup DoubleMul = new SampleGroup("DoubleMul", SampleUnit.Microsecond);
        Measure.Method(() => Math_Float_vs_Double.DoubleMul(ref doubles, 36.247f))
            .SampleGroup(DoubleMul).MeasurementCount(1000).Run();
    }
    (Sorry for the crayon-like drawing.)

    [Image: -Unity_Performance_Tests - Unity 2022.3.7f1 _DX.png]


    My guess is that float is 2x faster than double in basic scalar (non-SIMD) operations like this: since this test was memory bound (like the majority of math programs), packing 2x more variables into a cache line means double the performance.
    And when SIMD kicks in (Burst compiled), more data can be packed into a SIMD register (AVX2 = 256 bit in my case), so it also becomes faster to calculate.
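
    The packing argument is just arithmetic, sketched below. The 64-byte cache line is an assumption (typical for x86-64, not stated in this thread), and the class name LaneMath is made up:

```csharp
// Hypothetical helper: how many floats vs doubles fit in a cache line and an AVX2 register.
public static class LaneMath
{
    public const int CacheLineBytes = 64;      // assumption: common x86-64 cache line size
    public const int Avx2RegisterBytes = 32;   // 256 bits / 8

    public const int FloatsPerCacheLine = CacheLineBytes / sizeof(float);    // 64 / 4
    public const int DoublesPerCacheLine = CacheLineBytes / sizeof(double);  // 64 / 8

    public const int FloatsPerAvx2Register = Avx2RegisterBytes / sizeof(float);    // 32 / 4
    public const int DoublesPerAvx2Register = Avx2RegisterBytes / sizeof(double);  // 32 / 8
}
```

    Either way floats get twice the elements per unit (16 vs 8 per cache line, 8 vs 4 per register), which is where the expected 2x comes from when the loop is limited by memory traffic or by vector width.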

    On Mono it looks like converting floats to doubles causes a huge mess. It's a total joke tbh. I hope Mono dies as soon as possible.
     
  22. icauroboros

    The Job system is immune to cache thrashing, false sharing and other bad things, since it copies data for each thread and enforces some other safety rules:
    https://docs.unity3d.com/2021.2/Documentation/Manual/JobSystemSafetySystem.html
     
  23. Sluggy

  24. icauroboros

    How could any form of cache invalidation happen without multithreading? As far as I know it's a multithreading-only problem. Also, I haven't seen anyone use C# Tasks for compute-heavy code since v2019. And even if it occurs, how could a float perform worse than a double in that situation?
     
  25. andyz

  26. icauroboros

    It's up to you how much precision / automation you need for your benchmarks. If you want to use Stopwatch, I would suggest at least trying to record the minimum time instead of the summed time, otherwise you will end up gathering a lot of noise in the sum. I would also suggest implementing pre-warm runs.
    In Development mode frames are slower because data has to be sent to the profiler every frame, but I don't see any reason for that to make individual method runs slower.
    Maybe some allocation speed differences can occur, because every allocation has some overhead in dev mode. See https://docs.unity3d.com/Manual/memory-allocator-customization.html
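
    A minimal sketch of the min-time plus pre-warm idea with plain Stopwatch, in case it helps. The class name, method name and default run counts are made up for illustration, and it has no Unity dependency:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: warm up first, then report the fastest of several timed runs.
public static class MinTimeBenchmark
{
    public static double MeasureMinMs(Action action, int warmupRuns = 3, int runs = 10)
    {
        // Untimed warm-up runs so JIT compilation and other first-call costs are not measured.
        for (int i = 0; i < warmupRuns; i++)
            action();

        var stopwatch = new Stopwatch();
        double best = double.MaxValue;
        for (int i = 0; i < runs; i++)
        {
            stopwatch.Restart();   // zero and start for each run
            action();
            stopwatch.Stop();
            // Keep the minimum: noise from the OS or editor only ever adds time.
            best = Math.Min(best, stopwatch.Elapsed.TotalMilliseconds);
        }
        return best;
    }
}
```

    Usage would be something like MinTimeBenchmark.MeasureMinMs(FloatTest) versus MinTimeBenchmark.MeasureMinMs(DoubleTest), logging both values. It is probably also worth having the test methods return or store their result so the loop cannot be optimized away.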