Question If SOA is faster than AOS then do we need a SOA VectorArray for highest 3d performance?

Discussion in 'Entity Component System' started by Arowx, Aug 12, 2022.

  1. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Benchmark: soa vs aos - MeasureThat.net

    So, in a simple benchmark, storing the vector components as SOA gets 3.2 ops/sec vs 2.2 ops/sec for Vector3 AOS.

    Tried the same x2 vector operation in Unity (code below).
    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.Mathematics;

    using System.Diagnostics;
    using RND = UnityEngine.Random;
    using DBG = UnityEngine.Debug;

    public class VectorArraySOAvsAOS : MonoBehaviour
    {
        void Start()
        {
            // SOA: one array per component
            float[] x = new float[1000000];
            float[] y = new float[1000000];
            float[] z = new float[1000000];

            // AOS: one array of float3
            float3[] vectors = new float3[1000000];

            // Fill both layouts with the same random data
            for (var i = 0; i < 1000000; i++)
            {
                x[i] = RND.value * 100;
                y[i] = RND.value * 100;
                z[i] = RND.value * 100;

                vectors[i].x = x[i];
                vectors[i].y = y[i];
                vectors[i].z = z[i];
            }

            Stopwatch stopwatch = new Stopwatch();

            stopwatch.Start();
            for (int i = 0; i < x.Length; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
            stopwatch.Stop();

            DBG.Log("SOA ticks " + stopwatch.Elapsed.Ticks);

            stopwatch.Restart();
            for (int i = 0; i < vectors.Length; i++)
            {
                vectors[i] *= 2f;
            }
            stopwatch.Stop();

            DBG.Log("AOS ticks " + stopwatch.Elapsed.Ticks);
        }

        // Update is called once per frame
        void Update()
        {

        }
    }

    Results:
    SOA ticks 34,632
    AOS ticks 142,167

    OK, this is only an x2 operation, but there is a staggering 4.105x performance difference on my PC.

    So could there be a case for SOA VectorArrays in Unity?
     
  2. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Try it for yourself; different hardware is probably more of a factor in the actual numbers.
     
  3. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,574
    Your benchmark is poor, whatever you are testing.

    You should be running it in Update, not in Start.
    Let the benchmark run a few times and record multiple numbers.
    You need to account for warmup when the game starts up.
    Also repeat the test with the order of the for loops swapped, just in case it matters to the compiler.

    Then repeat the test with Burst.

    Can you show the Burst assembly code?
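
    The advice above (warm up, repeat, record several numbers) could be sketched roughly like this. It is plain .NET, not Unity-specific, and the names `BenchHarness`, `Measure` and `Median` are made up for illustration:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper, not part of Unity or of the code posted in this thread.
public static class BenchHarness
{
    // Calls `body` `warmup` times without timing it, then `runs` more times,
    // returning the elapsed ticks of each timed call.
    public static long[] Measure(Action body, int warmup, int runs)
    {
        for (int i = 0; i < warmup; i++)
            body();

        var ticks = new long[runs];
        var stopwatch = new Stopwatch();
        for (int i = 0; i < runs; i++)
        {
            stopwatch.Restart();
            body();
            stopwatch.Stop();
            ticks[i] = stopwatch.Elapsed.Ticks;
        }
        return ticks;
    }

    // Middle sample after sorting (the upper middle one for even counts).
    // Less sensitive to a single slow run than the mean.
    public static long Median(long[] samples)
    {
        var sorted = (long[])samples.Clone();
        Array.Sort(sorted);
        return sorted[sorted.Length / 2];
    }
}
```

    Swapping the loop order then just means calling `Measure` on the two loops in both orders and comparing the medians.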
     
  4. Enzi

    Enzi

    Joined:
    Jan 28, 2013
    Posts:
    908
    Entities is SOA.
    So, we already have it.
    /thread
     
    Kmsxkuse and Anthiese like this.
  5. scottjdaley

    scottjdaley

    Joined:
    Aug 1, 2013
    Posts:
    152
    It's SOA at the component level, but if a component has a float3 in it, the xyz components are stored alongside each other. IIUC, the suggestion here is to separate an array of float3s into 3 arrays of floats for certain use cases.

    I thought the whole advantage of SOA was that you can iterate and load into memory only the components you care about instead of loading the entire object/entity and all of its components. In the case of a float3, I feel like the vast majority of the time you want to do something with all three components, so there isn't value to storing them separately and it would just be annoying to deal with. Obviously, if the game is 2D or takes place on a flat world, I could see someone only caring about two of the components.
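
    To make the two layouts concrete, here is a minimal sketch in plain C#. `Float3` is a hypothetical stand-in for Unity.Mathematics.float3 so it runs outside Unity, and the `Layouts` helpers are made up for illustration:

```csharp
using System;

// Hypothetical stand-in for Unity.Mathematics.float3.
public struct Float3
{
    public float x, y, z;
}

public static class Layouts
{
    // AOS: memory order is x0 y0 z0 x1 y1 z1 ...
    // A pass that only reads x still walks over y and z.
    public static float SumXAos(Float3[] vectors)
    {
        float sum = 0f;
        for (int i = 0; i < vectors.Length; i++)
            sum += vectors[i].x;
        return sum;
    }

    // SOA: x0 x1 x2 ... live in their own array,
    // so a pass over x touches only x data.
    public static float SumXSoa(float[] x)
    {
        float sum = 0f;
        for (int i = 0; i < x.Length; i++)
            sum += x[i];
        return sum;
    }

    // Splits an AOS array into three per-component arrays.
    public static (float[] x, float[] y, float[] z) ToSoa(Float3[] vectors)
    {
        var x = new float[vectors.Length];
        var y = new float[vectors.Length];
        var z = new float[vectors.Length];
        for (int i = 0; i < vectors.Length; i++)
        {
            x[i] = vectors[i].x;
            y[i] = vectors[i].y;
            z[i] = vectors[i].z;
        }
        return (x, y, z);
    }
}
```

    When all three components are used on every element, both versions read the same number of bytes, which is why the layout alone does not obviously favour either one.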
     
  6. apkdev

    apkdev

    Joined:
    Dec 12, 2015
    Posts:
    263
    Guys, what's with the negativity? We fight benchmarks with benchmarks!

    I've put this together, but rather hastily, so can someone improve it?

    Code (CSharp):
    using System;
    using UnityEngine;
    using Unity.Mathematics;
    using System.Diagnostics;
    using Unity.Burst;
    using Unity.Collections.LowLevel.Unsafe;
    using static Unity.Collections.LowLevel.Unsafe.UnsafeUtility;
    using RND = UnityEngine.Random;
    using DBG = UnityEngine.Debug;

    [BurstCompile(CompileSynchronously = true)]
    public sealed class VectorArraySOAvsAOS : MonoBehaviour
    {
        readonly float[] x = new float[1000000];
        readonly float[] y = new float[1000000];
        readonly float[] z = new float[1000000];
        readonly float3[] vectors = new float3[1000000];

        [ContextMenu(nameof(RunWithWarmup))]
        void RunWithWarmup()
        {
            for (var i = 0; i < 1000000; i++)
            {
                x[i] = RND.value * 100;
                y[i] = RND.value * 100;
                z[i] = RND.value * 100;

                vectors[i].x = x[i];
                vectors[i].y = y[i];
                vectors[i].z = z[i];
            }

            for (int i = 0; i < 10; ++i)
                Run();
        }

        void Run()
        {
            // yes, storing fields in variables actually speeds up access. sometimes.
            // (i dunno man, ask the mono team about it)
            var x = this.x;
            var y = this.y;
            var z = this.z;
            var vectors = this.vectors;

            var stopwatch = new Stopwatch();

            {
                stopwatch.Restart();
                for (int i = 0; i < x.Length; i++)
                {
                    x[i] *= 2f;
                    y[i] *= 2f;
                    z[i] *= 2f;
                }

                stopwatch.Stop();
                DBG.Log($"SOA1 ticks {stopwatch.Elapsed.Ticks}");
            }

            {
                // Restart (not Start), so the SOA1 time is not added to this measurement
                stopwatch.Restart();
                for (int i = 0; i < x.Length; i++)
                    x[i] *= 2f;
                for (int i = 0; i < y.Length; i++)
                    y[i] *= 2f;
                for (int i = 0; i < z.Length; i++)
                    z[i] *= 2f;
                stopwatch.Stop();
                DBG.Log($"SOA2 ticks {stopwatch.Elapsed.Ticks}");
            }

            unsafe
            {
                // Pin the managed arrays so Burst code can work on their memory directly
                var xUnsafeList = new UnsafeList<float>((float*)PinGCArrayAndGetDataAddress(x, out ulong gch1), x.Length);
                var yUnsafeList = new UnsafeList<float>((float*)PinGCArrayAndGetDataAddress(y, out ulong gch2), y.Length);
                var zUnsafeList = new UnsafeList<float>((float*)PinGCArrayAndGetDataAddress(z, out ulong gch3), z.Length);

                stopwatch.Restart();
                SOABursted1(xUnsafeList, yUnsafeList, zUnsafeList);
                stopwatch.Stop();

                ReleaseGCObject(gch1);
                ReleaseGCObject(gch2);
                ReleaseGCObject(gch3);

                DBG.Log($"SOABurst1 ticks {stopwatch.Elapsed.Ticks}");
            }

            unsafe
            {
                var xUnsafeList = new UnsafeList<float>((float*)PinGCArrayAndGetDataAddress(x, out ulong gch1), x.Length);
                var yUnsafeList = new UnsafeList<float>((float*)PinGCArrayAndGetDataAddress(y, out ulong gch2), y.Length);
                var zUnsafeList = new UnsafeList<float>((float*)PinGCArrayAndGetDataAddress(z, out ulong gch3), z.Length);

                stopwatch.Restart();
                SOABursted2(xUnsafeList, yUnsafeList, zUnsafeList);
                stopwatch.Stop();

                ReleaseGCObject(gch1);
                ReleaseGCObject(gch2);
                ReleaseGCObject(gch3);

                DBG.Log($"SOABurst2 ticks {stopwatch.Elapsed.Ticks}");
            }

            {
                stopwatch.Restart();

                for (int i = 0; i < vectors.Length; i++)
                    vectors[i] *= 2f;

                stopwatch.Stop();

                DBG.Log($"AOS1 ticks {stopwatch.Elapsed.Ticks}");
            }

            {
                stopwatch.Restart();
                for (int i = 0; i < vectors.Length; i++)
                {
                    ref var v = ref vectors[i];
                    v.x *= 2f;
                    v.y *= 2f;
                    v.z *= 2f;
                }

                stopwatch.Stop();

                DBG.Log($"AOS2 ticks {stopwatch.Elapsed.Ticks}");
            }

            unsafe
            {
                var vectorsUnsafeList = new UnsafeList<float3>((float3*)PinGCArrayAndGetDataAddress(vectors, out ulong gch), length: vectors.Length);

                stopwatch.Restart();
                AOSBursted(vectorsUnsafeList);
                stopwatch.Stop();

                ReleaseGCObject(gch);

                DBG.Log($"AOSBurst ticks {stopwatch.Elapsed.Ticks}");
            }

            DBG.Log("===");
            GC.Collect();
        }

        [BurstCompile(CompileSynchronously = true)]
        static void SOABursted1(in UnsafeList<float> vx, in UnsafeList<float> vy, in UnsafeList<float> vz)
        {
            for (int i = 0; i < vx.Length; i++)
                vx.ElementAt(i) *= 2f;
            for (int i = 0; i < vy.Length; i++)
                vy.ElementAt(i) *= 2f;
            for (int i = 0; i < vz.Length; i++)
                vz.ElementAt(i) *= 2f;
        }

        // Same as SOABursted1, but tells Burst the three lists do not alias each other
        [BurstCompile(CompileSynchronously = true)]
        static void SOABursted2([NoAlias] in UnsafeList<float> vx, [NoAlias] in UnsafeList<float> vy, [NoAlias] in UnsafeList<float> vz)
        {
            for (int i = 0; i < vx.Length; i++)
                vx.ElementAt(i) *= 2f;
            for (int i = 0; i < vy.Length; i++)
                vy.ElementAt(i) *= 2f;
            for (int i = 0; i < vz.Length; i++)
                vz.ElementAt(i) *= 2f;
        }


        [BurstCompile(CompileSynchronously = true)]
        static void AOSBursted(in UnsafeList<float3> vectors)
        {
            for (int i = 0; i < vectors.Length; i++)
                vectors.ElementAt(i) *= 2f;
        }
    }
    [Attached image: upload_2022-8-12_21-30-30.png]

    To sum up, yes! SOA is faster (when correctly used), but what's even faster is a decent compiler.

    Keep in mind this whole thing is waaaay deep in micro-optimization territory. I think it's rather unlikely that rewriting half of your game to SOA will result in any real FPS gain, but it's a nice tool to have for the really blazing hot code paths.
     
    Ryiah, Arowx, Antypodish and 3 others like this.
  7. scottjdaley

    scottjdaley

    Joined:
    Aug 1, 2013
    Posts:
    152
    Sorry, I really wasn't trying to be negative. I'm actually quite surprised by the results and appreciate them being shared.

    Just to clarify my point, I always thought that when people said "SOA is faster" it was mostly due to the access pattern better aligning with the storage. Specifically, in cases where you don't need the entire data structure, the speed up would come from only loading (and prefetching) the data you are going to use.

    In these benchmarks, all of the data is stored and accessed in a linear fashion, so it's surprising to see that SOA comes out considerably faster. I would actually expect the AOS version to do slightly better because it doesn't need to prefetch from 3 different places in memory.

    Does anyone have any intuition as to why SOA is performing better in these benchmarks?
     
  8. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,574
    Perhaps SIMD kicks in.
     
    scottjdaley likes this.
  9. koirat

    koirat

    Joined:
    Jul 7, 2012
    Posts:
    2,008
    How about loop unrolling the AOS version?
     
  10. officialfonee

    officialfonee

    Joined:
    May 22, 2018
    Posts:
    42
    I don't believe this is the main reason why SOA is faster (although, like you say, if you don't use 2 of the three floats, you can lose 2/3 of your cache line).

    Now assuming your compiler vectorizes the AOS, you lose operations pulling apart your Vector3, doing your v-instructions, reconstructing the Vector3 and then storing it. (It's possible the compilers decided to do it another way, but this is the simplest way for explanations).

    Now compare this to SOA: load floats from array "X" into a register, multiply them all by a constant register, store them back into the array. When using SOA, your compiler easily sees the potential in using SIMD, especially in the tests provided, and implements it.

    When you are doing such a small number of operations, having to manage a Vector3 instead of loading into registers, ready to operate on, can become a very large waste of instructions.
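
    A rough sketch of that load / multiply / store pattern on one SOA component array, written by hand. Note this uses System.Numerics.Vector<float> from plain .NET rather than Burst, purely as an illustration; whether it maps to real SIMD instructions depends on the runtime (see Vector.IsHardwareAccelerated), and `SimdScale` is a made-up name:

```csharp
using System;
using System.Numerics;

// Hypothetical helper for illustration; not what Burst generates.
public static class SimdScale
{
    // Multiplies every element of `data` by `factor`, Vector<float>.Count at a time.
    public static void ScaleInPlace(float[] data, float factor)
    {
        int width = Vector<float>.Count;
        int i = 0;

        // Full-width chunks: load, multiply, store back. No shuffling needed
        // because neighbouring floats all belong to the same component.
        for (; i <= data.Length - width; i += width)
        {
            var v = new Vector<float>(data, i);
            (v * factor).CopyTo(data, i);
        }

        // Scalar tail for the leftover elements.
        for (; i < data.Length; i++)
            data[i] *= factor;
    }
}
```

    With an x0 y0 z0 x1 y1 z1 layout the same chunked load still works for a uniform scale, but as soon as x, y and z need different treatment the lanes have to be pulled apart first, which is the extra work described above.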
     
    Antypodish and scottjdaley like this.
  11. officialfonee

    officialfonee

    Joined:
    May 22, 2018
    Posts:
    42
    At the end of the day, you spend fewer instructions in SOA than in AOS, so unless your compiler is very dumb, SOA should end up faster.

    I implore anyone to spend some time trying to write some SIMD code with Burst. It has built-in intrinsics: Unity.Burst.Intrinsics | Burst | 1.8.0-pre.2 (unity3d.com), which I think are some of the best when it comes to writing manual vectorization. You can look into every instruction, and Unity writes out exactly what it does under the hood. Plus, it gives you great insight into the difficulty of AOS vs SOA and what caters best to SIMD.
     
    Antypodish and scottjdaley like this.
  12. scottjdaley

    scottjdaley

    Joined:
    Aug 1, 2013
    Posts:
    152
    Oh yeah, that could definitely explain it.

    These benchmarks use float3 not Vector3, so the overhead isn't as bad but I can still see that contributing to the difference. And like you said, it is much easier for the compiler to vectorize the SOA version even if both could benefit from it.
     
  13. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    I think the speed of SOA is just that the CPU is dealing with contiguous array blocks in memory, so it can pull the next set of data into cache more easily than if it's a struct with three interleaved chunks of data, e.g. x1, y1, z1, x2, y2, z2 vs x1, x2, x3 ... y1, y2, y3 ... z1, z2, z3.

    And consider that nearly everything in a game processes a lot of Vector3, Quaternion or Matrix4x4 data.
     
  14. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,983
    I pretty regularly dabble in this space and have quite a bit of technical insight.

    First, which is faster varies a lot based on the problem. There is not a one-size-fits-all. Not even close.

    SOA is more efficient as it can process more elements with fewer instructions. But that matters only if memory is not a problem. Once you reach a certain number of input components per element (usually some number above 8), you run out of hardware prefetch streams to keep track of all your inputs. At that point, you need to start interleaving your inputs into something like this:
    XXXXYYYY_ZZZZXXXX_YYYYZZZZ
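
    For anyone who wants to see that interleaved layout as code, here is a minimal sketch of the indexing, assuming three components and a block width of 4 as drawn above. The `Interleaved` class and its method names are hypothetical:

```csharp
using System;

// Hypothetical helper for illustration only.
public static class Interleaved
{
    // Four X values, then four Y, then four Z per block, as in the diagram.
    public const int BlockWidth = 4;
    public const int Components = 3;

    // component: 0 = x, 1 = y, 2 = z
    public static int IndexOf(int element, int component)
    {
        int block = element / BlockWidth;
        int lane = element % BlockWidth;
        return block * BlockWidth * Components + component * BlockWidth + lane;
    }

    // Packs three equally sized SOA arrays into one interleaved buffer.
    // The last block is zero-padded when the length is not a multiple of BlockWidth.
    public static float[] Pack(float[] x, float[] y, float[] z)
    {
        int blocks = (x.Length + BlockWidth - 1) / BlockWidth;
        var packed = new float[blocks * BlockWidth * Components];
        for (int i = 0; i < x.Length; i++)
        {
            packed[IndexOf(i, 0)] = x[i];
            packed[IndexOf(i, 1)] = y[i];
            packed[IndexOf(i, 2)] = z[i];
        }
        return packed;
    }
}
```

    Everything lives in one linear stream, while each group of four same-component values can still be loaded into a register in one go.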

    Another benefit occurs when particular components in a vector go unused, but in practice, that rarely happens. In the cases where it does, usually it isn't an all-or-nothing. I store AABBs with the min x and max x as separate arrays and the rest in a float4 array. The x values are used for interval searches. But for the other components, instead of having a transformation of 32 elements with 4 outputs in a single register that I have to unpack (full SOA), I instead have 8 elements and 1 output. That 1 output can be immediately handled by scalar integer hardware while the SIMD hardware loads up the next batch of elements.

    And that's for linear memory access. As soon as you start getting into random access or even enabled masks, your cache miss count multiplies by the number of separate arrays.

    If one of your transformations is random access, and the step before or after is linear access, that's usually a good opportunity to switch paradigms.

    And sometimes, the nature of the problem just makes everything weird. For animation blending and hierarchy handling, I thought for sure SOA was going to be faster. But then I discovered that if I stored things in AOS, each final matrix element could actually be type punned to store the local space TRS. I was able to accumulate all poses in the buffer, and then transform them to their root-space matrices completely in place. This drastically reduced the amount of memory required. And most of the operations could still make use of SIMD instructions, even if they were slightly less efficient. Given the memory savings, more readable code, and Burst being incredibly smart, AOS ended up being the better choice.
     
    Egad_McDad, Ryiah, xVergilx and 5 others like this.
  15. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    I think there is an issue in your SOABurst1 code:
    Code (CSharp):
    [BurstCompile(CompileSynchronously = true)]
    static void SOABursted1(in UnsafeList<float> vx, in UnsafeList<float> vy, in UnsafeList<float> vz)
    {
        for (int i = 0; i < vx.Length; i++)
            vx.ElementAt(i) *= 2f;
        for (int i = 0; i < vy.Length; i++)
            vy.ElementAt(i) *= 2f;
        for (int i = 0; i < vz.Length; i++)
            vz.ElementAt(i) *= 2f;
    }
    [BurstCompile(CompileSynchronously = true)]
    static void SOABursted2([NoAlias] in UnsafeList<float> vx, [NoAlias] in UnsafeList<float> vy, [NoAlias] in UnsafeList<float> vz)
    {
        for (int i = 0; i < vx.Length; i++)
            vx.ElementAt(i) *= 2f;
        for (int i = 0; i < vy.Length; i++)
            vy.ElementAt(i) *= 2f;
        for (int i = 0; i < vz.Length; i++)
            vz.ElementAt(i) *= 2f;
    }
    Shouldn't it be a single loop like SOA1?

    Code (CSharp):
    for (int i = 0; i < x.Length; i++)
    {
        x[i] *= 2f;
        y[i] *= 2f;
        z[i] *= 2f;
    }
    stopwatch.Stop();
    DBG.Log($"SOA1 ticks {stopwatch.Elapsed.Ticks}");
    e.g.
    Code (CSharp):
    static void SOABursted1(in UnsafeList<float> vx, in UnsafeList<float> vy, in UnsafeList<float> vz)
    {
        for (int i = 0; i < vx.Length; i++)
        {
            vx.ElementAt(i) *= 2f;
            vy.ElementAt(i) *= 2f;
            vz.ElementAt(i) *= 2f;
        }
    }
    Then [NoAlias] could be applied to both, as it attempts to boost vectorisation by batching the data of several loop iterations into one SIMD operation.

    Not sure why you wrote SOA2 as 3 loops when 1 can do. CPU cores have multiple registers for data, so making them loop three separate times in sequence is just a waste of time unless the compiler/CPU can run all 3 at the same time.
     
    apkdev likes this.
  16. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK, added NativeArray Burst functions and Jobs (single threaded).

    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.Mathematics;
    using Unity.Burst;
    using Unity.Collections;

    using System.Diagnostics;
    using RND = UnityEngine.Random;
    using DBG = UnityEngine.Debug;
    using TMPro;
    using Unity.Jobs;

    public class VectorArraySOAvsAOS : MonoBehaviour
    {
        public TextMeshProUGUI text;

        void Start()
        {
            const int size = 1000000;
            float[] x = new float[size];
            float[] y = new float[size];
            float[] z = new float[size];

            float3[] vectors = new float3[size];
            float3[] vectors1 = new float3[size];

            float[] x1 = new float[size];
            float[] y1 = new float[size];
            float[] z1 = new float[size];

            NativeArray<float3> navectors = new NativeArray<float3>(size, Allocator.Persistent);

            NativeArray<float> nax = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> nay = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> naz = new NativeArray<float>(size, Allocator.Persistent);

            NativeArray<float3> jobvectors = new NativeArray<float3>(size, Allocator.Persistent);

            NativeArray<float> jobx = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> joby = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> jobz = new NativeArray<float>(size, Allocator.Persistent);


            // Fill every test's input with the same random data
            for (var i = 0; i < size; i++)
            {
                x[i] = RND.value * 100;
                y[i] = RND.value * 100;
                z[i] = RND.value * 100;

                vectors[i].x = x[i];
                vectors[i].y = y[i];
                vectors[i].z = z[i];

                // Was "vectors[i] = vectors[i]", which left vectors1 all zeros
                vectors1[i] = vectors[i];

                x1[i] = x[i];
                y1[i] = y[i];
                z1[i] = z[i];

                navectors[i] = vectors[i];

                nax[i] = x[i];
                nay[i] = y[i];
                naz[i] = z[i];

                jobvectors[i] = vectors[i];

                jobx[i] = x[i];
                joby[i] = y[i];
                jobz[i] = z[i];

            }

            Stopwatch stopwatch = new Stopwatch();

            stopwatch.Start();
            for (int i = 0; i < x.Length; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
            stopwatch.Stop();

            DBG.Log("SOA ticks " + stopwatch.Elapsed.Ticks);

            text.text = "SOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            stopwatch.Restart();
            for (int i = 0; i < vectors.Length; i++)
            {
                vectors[i] *= 2f;
            }
            stopwatch.Stop();

            DBG.Log("AOS ticks " + stopwatch.Elapsed.Ticks);

            text.text += "AOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            stopwatch.Restart();
            BurstAOS(vectors1);
            stopwatch.Stop();

            DBG.Log("BurstAOS ticks " + stopwatch.Elapsed.Ticks);

            text.text += "BurstAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            stopwatch.Restart();
            BurstSOA(x1, y1, z1);
            stopwatch.Stop();

            DBG.Log("BurstSOA ticks " + stopwatch.Elapsed.Ticks);

            text.text += "BurstSOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";


            stopwatch.Restart();
            BurstNAAOS(navectors);
            stopwatch.Stop();

            navectors.Dispose();

            DBG.Log("BurstNAAOS ticks " + stopwatch.Elapsed.Ticks);

            text.text += "BurstNAAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            stopwatch.Restart();
            BurstNASOA(nax, nay, naz);
            stopwatch.Stop();

            DBG.Log("BurstNASOA ticks " + stopwatch.Elapsed.Ticks);

            text.text += "BurstNASOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            nax.Dispose();
            nay.Dispose();
            naz.Dispose();

            // Jobs versions
            stopwatch.Restart();
            BurstNAAOSJob job1 = new BurstNAAOSJob { vectors = jobvectors };
            job1.Schedule().Complete();
            stopwatch.Stop();

            DBG.Log("Burst AOS Job ticks " + stopwatch.Elapsed.Ticks);

            text.text += "Burst AOS Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            jobvectors.Dispose();


            stopwatch.Restart();
            BurstNASOAJob job2 = new BurstNASOAJob { x = jobx, y = joby, z = jobz };
            job2.Schedule().Complete();
            stopwatch.Stop();

            DBG.Log("Burst SOA Job ticks " + stopwatch.Elapsed.Ticks);

            text.text += "Burst SOA Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            jobx.Dispose();
            joby.Dispose();
            jobz.Dispose();

        }

        // Update is called once per frame
        void Update()
        {

        }

        // Note: as far as I know, Burst cannot compile methods that take managed
        // arrays, so BurstAOS and BurstSOA most likely run as regular managed code.
        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        public static void BurstAOS(float3[] vectors)
        {
            for (int i = 0; i < vectors.Length; i++)
            {
                vectors[i] *= 2f;
            }
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        public static void BurstSOA(float[] x, float[] y, float[] z)
        {
            for (int i = 0; i < x.Length; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        public static void BurstNAAOS(NativeArray<float3> vectors)
        {
            for (int i = 0; i < vectors.Length; i++)
            {
                vectors[i] *= 2f;
            }
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        public static void BurstNASOA(NativeArray<float> x, NativeArray<float> y, NativeArray<float> z)
        {
            for (int i = 0; i < x.Length; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        private struct BurstNAAOSJob : IJob
        {
            public NativeArray<float3> vectors;

            public void Execute()
            {
                for (int i = 0; i < vectors.Length; i++)
                {
                    vectors[i] *= 2f;
                }
            }
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        private struct BurstNASOAJob : IJob
        {
            public NativeArray<float> x;
            public NativeArray<float> y;
            public NativeArray<float> z;

            public void Execute()
            {
                for (int i = 0; i < x.Length; i++)
                {
                    x[i] *= 2f;
                    y[i] *= 2f;
                    z[i] *= 2f;
                }
            }
        }
    }

    Scores from x86 build (performance descending):
    1. Burst Job SOA 11,754 ticks
    2. Burst NativeArray SOA 11,824 ticks
    3. SOA 14,671 ticks
    4. Burst SOA 14,813 ticks
    5. AOS 15,098 ticks
    6. Burst AOS 15,855 ticks
    7. Burst NativeArray AOS 17,443 ticks
    8. Burst Job AOS 32,763 ticks
    Note the AOS with Burst, Jobs and NativeArrays gets worse than vanilla!
     
  17. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    BurstCompile(CompileSynchronously = true) -> to make sure that the method is compiled on the first schedule.

    Using Burst Jobs as per examples and building to x86 IL2CPP??

    OK so I should use Run then?

    Scores from x86 build (performance descending):
    1. Burst Job SOA 10,915 ticks
    2. Burst NativeArray SOA 12,211 ticks
    3. Burst SOA 14,574 ticks
    4. SOA 14,644 ticks
    5. AOS 14,992 ticks
    6. Burst AOS 16,863 ticks
    7. Burst NativeArray AOS 13,931 ticks
    8. Burst Job AOS 21,994 ticks
    Improved AOS a touch and slight differences in numbers but roughly the same outcome.
     
    Last edited: Aug 14, 2022
  18. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    1000 tests

    Scores from x86 build (performance descending):
    1. Burst NativeArray SOA 10,577.00 ticks
    2. Burst Job SOA 11,609.00 ticks
    3. Burst Job AOS 14,009.00 ticks
    4. SOA 15,063.00 ticks
    5. Burst NativeArray AOS 14,831.00 ticks
    6. Burst SOA 16,475.00 ticks
    7. Burst AOS 16,488.00 ticks
    8. AOS 16,542.00 ticks
    Burst Job AOS works quite well making it to 3rd place.

    Odd that Burst NativeArray SOA beats Burst Job SOA by a small fraction.

    Would a DOTS version be faster?

    Code (CSharp):
    1. using System.Collections;
    2. using System.Collections.Generic;
    3. using UnityEngine;
    4. using Unity.Mathematics;
    5. using Unity.Burst;
    6. using Unity.Collections;
    7.  
    8. using System.Diagnostics;
    9. using RND = UnityEngine.Random;
    10. using DBG = UnityEngine.Debug;
    11. using TMPro;
    12. using Unity.Jobs;
    13.  
    14. public class VectorArraySOAvsAOS : MonoBehaviour
    15. {
    16.     public TMP_InputField text;
    17.  
    18.     public int numberOfTestsToRun = 1000;
    19.  
    20.     private int testNo = 0;
    21.  
    22.     public const int numOfTests = 8;
    23.  
    24.     public long[] testTotalTicks = new long[numOfTests];
    25.  
    26.     public void RunTests()
    27.     {
    28.         const int size = 1000000;
    29.         float[] x = new float[size];
    30.         float[] y = new float[size];
    31.         float[] z = new float[size];
    32.  
    33.         float3[] vectors = new float3[size];
    34.         float3[] vectors1 = new float3[size];
    35.  
    36.         float[] x1 = new float[size];
    37.         float[] y1 = new float[size];
    38.         float[] z1 = new float[size];
    39.  
    40.         NativeArray<float3> navectors = new NativeArray<float3>(size, Allocator.Persistent);
    41.  
    42.         NativeArray<float> nax = new NativeArray<float>(size, Allocator.Persistent);
    43.         NativeArray<float> nay = new NativeArray<float>(size, Allocator.Persistent);
    44.         NativeArray<float> naz = new NativeArray<float>(size, Allocator.Persistent);
    45.  
    46.         NativeArray<float3> jobvectors = new NativeArray<float3>(size, Allocator.Persistent);
    47.  
    48.         NativeArray<float> jobx = new NativeArray<float>(size, Allocator.Persistent);
    49.         NativeArray<float> joby = new NativeArray<float>(size, Allocator.Persistent);
    50.         NativeArray<float> jobz = new NativeArray<float>(size, Allocator.Persistent);
    51.  
    52.  
    53.         for (var i = 0; i < size; i++)
    54.         {
    55.             x[i] = RND.value * 100;
    56.             y[i] = RND.value * 100;
    57.             z[i] = RND.value * 100;
    58.  
    59.             vectors[i].x = x[i];
    60.             vectors[i].y = y[i];
    61.             vectors[i].z = z[i];
    62.  
    63.             vectors1[i] = vectors[i]; // copy into the array that is passed to BurstAOS
    64.  
    65.             x1[i] = x[i];
    66.             y1[i] = y[i];
    67.             z1[i] = z[i];
    68.  
    69.             navectors[i] = vectors[i];
    70.  
    71.             nax[i] = x[i];
    72.             nay[i] = y[i];
    73.             naz[i] = z[i];
    74.  
    75.             jobvectors[i] = vectors[i];
    76.  
    77.             jobx[i] = x[i];
    78.             joby[i] = y[i];
    79.             jobz[i] = z[i];
    80.  
    81.         }
    82.  
    83.         Stopwatch stopwatch = new Stopwatch();
    84.  
    85.         stopwatch.Start();
    86.         for (int i = 0; i < x.Length; i++)
    87.         {
    88.             x[i] *= 2f;
    89.             y[i] *= 2f;
    90.             z[i] *= 2f;
    91.         }
    92.         stopwatch.Stop();
    93.  
    94.         testTotalTicks[0] += stopwatch.Elapsed.Ticks;
    95.         //DBG.Log("SOA ticks "+ stopwatch.Elapsed.Ticks);
    96.         text.text += "SOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    97.  
    98.         stopwatch.Restart();    
    99.         for (int i = 0; i < vectors.Length; i++)
    100.         {
    101.             vectors[i] *= 2f;
    102.         }
    103.         stopwatch.Stop();
    104.  
    105.         testTotalTicks[1] += stopwatch.Elapsed.Ticks;
    106.         //DBG.Log("AOS ticks " + stopwatch.Elapsed.Ticks);
    107.         text.text += "AOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    108.  
    109.         stopwatch.Restart();
    110.             BurstAOS(vectors1);
    111.         stopwatch.Stop();
    112.  
    113.         testTotalTicks[2] += stopwatch.Elapsed.Ticks;
    114.         //DBG.Log("BurstAOS ticks " + stopwatch.Elapsed.Ticks);
    115.         text.text += "BurstAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    116.  
    117.         stopwatch.Restart();
    118.             BurstSOA(x1,y1,z1);
    119.         stopwatch.Stop();
    120.  
    121.         testTotalTicks[3] += stopwatch.Elapsed.Ticks;
    122.         //DBG.Log("BurstSOA ticks " + stopwatch.Elapsed.Ticks);
    123.         text.text += "BurstSOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    124.  
    125.  
    126.         stopwatch.Restart();
    127.         BurstNAAOS(navectors);
    128.         stopwatch.Stop();
    129.  
    130.         navectors.Dispose();
    131.  
    132.         testTotalTicks[4] += stopwatch.Elapsed.Ticks;
    133.         //DBG.Log("BurstNAAOS ticks " + stopwatch.Elapsed.Ticks);
    134.         text.text += "BurstNAAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    135.  
    136.         stopwatch.Restart();
    137.         BurstNASOA(nax, nay, naz);
    138.         stopwatch.Stop();
    139.  
    140.         testTotalTicks[5] += stopwatch.Elapsed.Ticks;
    141.         //DBG.Log("BurstNASOA ticks " + stopwatch.Elapsed.Ticks);
    142.         text.text += "BurstNASOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    143.    
    144.         nax.Dispose();
    145.         nay.Dispose();
    146.         naz.Dispose();
    147.  
    148.         // Jobs versions
    149.         stopwatch.Restart();
    150.         BurstNAAOSJob job1 = new BurstNAAOSJob { vectors = jobvectors };
    151.         job1.Run();
    152.         stopwatch.Stop();
    153.  
    154.         testTotalTicks[6] += stopwatch.Elapsed.Ticks;
    155.         //DBG.Log("Burst AOS Job ticks " + stopwatch.Elapsed.Ticks);
    156.         text.text += "Burst AOS Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    157.  
    158.         jobvectors.Dispose();
    159.  
    160.  
    161.         stopwatch.Restart();
    162.         BurstNASOAJob job2 = new BurstNASOAJob { x = jobx, y = joby, z = jobz };
    163.         job2.Run();
    164.         stopwatch.Stop();
    165.  
    166.         testTotalTicks[7] += stopwatch.Elapsed.Ticks;
    167.         //DBG.Log("Burst SOA Job ticks " + stopwatch.Elapsed.Ticks);
    168.         text.text += "Burst SOA Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    169.  
    170.         jobx.Dispose();
    171.         joby.Dispose();
    172.         jobz.Dispose();
    173.  
    174.     }
    175.  
    176.     // Update is called once per frame
    177.     void Update()
    178.     {
    179.         if (testNo < numberOfTestsToRun)
    180.         {
    181.             testNo++; // count this run first so exactly numberOfTestsToRun runs are totalled
    182.             text.text = "Test " + testNo + " of "+ numberOfTestsToRun + "\n";
    183.             RunTests();
    184.  
    185.             if (testNo == numberOfTestsToRun) CalcResults();
    186.         }
    187.     }
    188.  
    189.     public void CalcResults()
    190.     {
    191.         text.text += "\nFinal Results\n";
    192.  
    193.         float[] results = new float[numOfTests];
    194.  
    195.         string[] testNames = { "SOA", "AOS", "Burst AOS", "Burst SOA",
    196.                                 "Burst NativeArray AOS", "Burst NativeArray SOA",
    197.                                 "Burst Job AOS", "Burst Job SOA" };
    198.  
    199.         int[] resultOrder = new int[numOfTests];
    200.  
    201.         for (int i = 0; i < numOfTests; i++)
    202.         {
    203.             results[i] = (float)testTotalTicks[i] / numberOfTestsToRun; // cast before dividing to avoid integer truncation
    204.             resultOrder[i] = i;
    205.         }
    206.  
    207.         // sort the resultOrder
    208.    
    209.         int minIndex = 0;
    210.         int maxIndex = numOfTests - 1;
    211.  
    212.         int temp;
    213.  
    214.         for (minIndex = 0; minIndex <= numOfTests/2; minIndex++)
    215.         {
    216.             //if (minIndex >= maxIndex) break;
    217.  
    218.             int foundMinIndex = minIndex;
    219.             int foundMaxIndex = maxIndex;
    220.  
    221.             for (int i = minIndex; i <= maxIndex; i++)
    222.             {
    223.                 if (results[resultOrder[i]] < results[resultOrder[foundMinIndex]]) foundMinIndex = i;
    224.                 if (results[resultOrder[i]] > results[resultOrder[foundMaxIndex]]) foundMaxIndex = i;
    225.             }
    226.  
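    227.             temp = resultOrder[minIndex];
    228.             resultOrder[minIndex] = resultOrder[foundMinIndex];
    229.             resultOrder[foundMinIndex] = temp;
    230.             if (foundMaxIndex == minIndex) foundMaxIndex = foundMinIndex; // the max entry was just moved by the swap above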
    231.             temp = resultOrder[maxIndex];
    232.             resultOrder[maxIndex] = resultOrder[foundMaxIndex];
    233.             resultOrder[foundMaxIndex] = temp;
    234.        
    235.             maxIndex--;
    236.         }
    237.  
    238.         foreach (int i in resultOrder)
    239.         {
    240.             text.text += i + " " + testNames[i] + " " + results[i].ToString("N") + " ticks\n";
    241.         }
    242.     }
    243.  
    244.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    245.     public static void BurstAOS(float3[] vectors)
    246.     {
    247.         for (int i = 0; i < vectors.Length; i++)
    248.         {
    249.             vectors[i] *= 2f;
    250.         }
    251.     }
    252.  
    253.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    254.     public static void BurstSOA(float[] x, float[] y, float[] z)
    255.     {
    256.         for (int i = 0; i < x.Length; i++)
    257.         {
    258.             x[i] *= 2f;
    259.             y[i] *= 2f;
    260.             z[i] *= 2f;
    261.         }
    262.     }
    263.  
    264.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    265.     public static void BurstNAAOS(NativeArray<float3> vectors)
    266.     {
    267.         for (int i = 0; i < vectors.Length; i++)
    268.         {
    269.             vectors[i] *= 2f;
    270.         }
    271.     }
    272.  
    273.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    274.     public static void BurstNASOA(NativeArray<float> x, NativeArray<float> y, NativeArray<float> z)
    275.     {
    276.         for (int i = 0; i < x.Length; i++)
    277.         {
    278.             x[i] *= 2f;
    279.             y[i] *= 2f;
    280.             z[i] *= 2f;
    281.         }
    282.     }
    283.  
    284.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    285.     private struct BurstNAAOSJob : IJob
    286.     {
    287.         public NativeArray<float3> vectors;
    288.  
    289.         public void Execute()
    290.         {
    291.             for (int i = 0; i < vectors.Length; i++)
    292.             {
    293.                 vectors[i] *= 2f;
    294.             }
    295.         }
    296.     }
    297.  
    298.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    299.     private struct BurstNASOAJob : IJob
    300.     {
    301.         public NativeArray<float> x;
    302.         public NativeArray<float> y;
    303.         public NativeArray<float> z;
    304.  
    305.         public void Execute()
    306.         {
    307.             for (int i = 0; i < x.Length; i++)
    308.             {
    309.                 x[i] *= 2f;
    310.                 y[i] *= 2f;
    311.                 z[i] *= 2f;
    312.             }
    313.         }
    314.     }
    315. }
    316.  
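The ranking step in CalcResults can also lean on the standard library instead of a hand-written sort. This is a minimal sketch in plain .NET (no Unity types), not the benchmark's own code: `ResultRanking` and `RankByTicks` are hypothetical names, and the sample values are the 1000-run averages quoted in this post, in the benchmark's own test order.

```csharp
using System;

public static class ResultRanking
{
    // Returns test indices ordered from fewest to most average ticks.
    // Array.Sort(keys, items) reorders both arrays by the keys, so a copy of
    // the averages is sorted and the caller's array is left untouched.
    public static int[] RankByTicks(float[] averageTicks)
    {
        float[] keys = (float[])averageTicks.Clone();
        int[] order = new int[averageTicks.Length];
        for (int i = 0; i < order.Length; i++) order[i] = i;

        Array.Sort(keys, order);
        return order;
    }

    public static void Main()
    {
        string[] names = { "SOA", "AOS", "Burst AOS", "Burst SOA",
                           "Burst NativeArray AOS", "Burst NativeArray SOA",
                           "Burst Job AOS", "Burst Job SOA" };
        float[] ticks = { 15063f, 16542f, 16488f, 16475f,
                          14831f, 10577f, 14009f, 11609f };

        int rank = 1;
        foreach (int i in RankByTicks(ticks))
        {
            Console.WriteLine($"{rank++}. {names[i]} {ticks[i]:N0} ticks");
        }
    }
}
```

Sorting an index array rather than the results themselves keeps the test names and averages addressable by their original index, which is what the printing loop in CalcResults relies on.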
     
    Last edited: Aug 14, 2022
  19. apkdev

    apkdev

    Joined:
    Dec 12, 2015
    Posts:
    263
    Note that you can't pass a managed array into a bursted function (unless something's changed very recently?), so I think you're still running on Mono. It's why I used UnsafeLists (the alternative is pointers, but I'm not a huge fan). NativeArrays won't work either, they only work in jobs. Make sure your functions look OK in the Burst Inspector.
     
  20. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    10,000 Runs
    Code (Boo):
    1.  SOA Burst NativeArray        10,369.63 ticks
    2.  SOA Burst NativeArray Job    11,623.89 ticks
    3.  AOS Burst NativeArray Job    14,067.96 ticks
    4.  AOS Burst NativeArray        14,911.34 ticks
    5.  SOA                          15,081.29 ticks
    6.  AOS Burst                    16,419.43 ticks
    7.  AOS                          16,605.74 ticks
    8.  SOA Burst                    17,234.52 ticks
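For scale, these tick counts can be converted to milliseconds. The benchmark code in this thread reads `stopwatch.Elapsed.Ticks`, which is a `TimeSpan` tick of 100 ns, not the raw `Stopwatch.ElapsedTicks` counter. A small sketch in plain .NET follows. `TickConversion` and `TicksToMilliseconds` are hypothetical helper names, not part of the benchmark.

```csharp
using System;

public static class TickConversion
{
    // Stopwatch.Elapsed is a TimeSpan, and TimeSpan ticks are 100 ns each,
    // so TimeSpan.TicksPerMillisecond (10,000) ticks make one millisecond.
    public static double TicksToMilliseconds(double averageTicks)
    {
        return averageTicks / TimeSpan.TicksPerMillisecond;
    }

    public static void Main()
    {
        // Fastest and slowest averages from the 10,000-run table.
        Console.WriteLine(TicksToMilliseconds(10369.63));
        Console.WriteLine(TicksToMilliseconds(17234.52));
    }
}
```

On that reading, the whole table spans roughly 1.0 ms to 1.7 ms per pass over one million vectors.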
    Updated Code
    Code (CSharp):
    1. using System.Collections;
    2. using System.Collections.Generic;
    3. using UnityEngine;
    4. using Unity.Mathematics;
    5. using Unity.Burst;
    6. using Unity.Collections;
    7.  
    8. using System.Diagnostics;
    9. using RND = UnityEngine.Random;
    10. using DBG = UnityEngine.Debug;
    11. using TMPro;
    12. using Unity.Jobs;
    13.  
    14. public class VectorArraySOAvsAOS : MonoBehaviour
    15. {
    16.     public TMP_InputField text;
    17.  
    18.     public int numberOfTestsToRun = 1000;
    19.  
    20.     private int testNo = 0;
    21.  
    22.     public const int numOfTests = 8;
    23.  
    24.     public long[] testTotalTicks = new long[numOfTests];
    25.  
    26.     public void RunTests()
    27.     {
    28.         const int size = 1000000;
    29.         float[] x = new float[size];
    30.         float[] y = new float[size];
    31.         float[] z = new float[size];
    32.  
    33.         float3[] vectors = new float3[size];
    34.         float3[] vectors1 = new float3[size];
    35.  
    36.         float[] x1 = new float[size];
    37.         float[] y1 = new float[size];
    38.         float[] z1 = new float[size];
    39.  
    40.         NativeArray<float3> navectors = new NativeArray<float3>(size, Allocator.Persistent);
    41.  
    42.         NativeArray<float> nax = new NativeArray<float>(size, Allocator.Persistent);
    43.         NativeArray<float> nay = new NativeArray<float>(size, Allocator.Persistent);
    44.         NativeArray<float> naz = new NativeArray<float>(size, Allocator.Persistent);
    45.  
    46.         NativeArray<float3> jobvectors = new NativeArray<float3>(size, Allocator.Persistent);
    47.  
    48.         NativeArray<float> jobx = new NativeArray<float>(size, Allocator.Persistent);
    49.         NativeArray<float> joby = new NativeArray<float>(size, Allocator.Persistent);
    50.         NativeArray<float> jobz = new NativeArray<float>(size, Allocator.Persistent);
    51.  
    52.  
    53.         for (var i = 0; i < size; i++)
    54.         {
    55.             x[i] = RND.value * 100;
    56.             y[i] = RND.value * 100;
    57.             z[i] = RND.value * 100;
    58.  
    59.             vectors[i].x = x[i];
    60.             vectors[i].y = y[i];
    61.             vectors[i].z = z[i];
    62.  
    63.             vectors1[i] = vectors[i]; // copy into the array that is passed to BurstAOS
    64.  
    65.             x1[i] = x[i];
    66.             y1[i] = y[i];
    67.             z1[i] = z[i];
    68.  
    69.             navectors[i] = vectors[i];
    70.  
    71.             nax[i] = x[i];
    72.             nay[i] = y[i];
    73.             naz[i] = z[i];
    74.  
    75.             jobvectors[i] = vectors[i];
    76.  
    77.             jobx[i] = x[i];
    78.             joby[i] = y[i];
    79.             jobz[i] = z[i];
    80.  
    81.         }
    82.  
    83.         Stopwatch stopwatch = new Stopwatch();
    84.  
    85.         stopwatch.Start();
    86.         for (int i = 0; i < x.Length; i++)
    87.         {
    88.             x[i] *= 2f;
    89.             y[i] *= 2f;
    90.             z[i] *= 2f;
    91.         }
    92.         stopwatch.Stop();
    93.  
    94.         testTotalTicks[0] += stopwatch.Elapsed.Ticks;
    95.         //DBG.Log("SOA ticks "+ stopwatch.Elapsed.Ticks);
    96.         //text.text += "SOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    97.  
    98.         stopwatch.Restart();  
    99.         for (int i = 0; i < vectors.Length; i++)
    100.         {
    101.             vectors[i] *= 2f;
    102.         }
    103.         stopwatch.Stop();
    104.  
    105.         testTotalTicks[1] += stopwatch.Elapsed.Ticks;
    106.         //DBG.Log("AOS ticks " + stopwatch.Elapsed.Ticks);
    107.         //text.text += "AOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    108.  
    109.         stopwatch.Restart();
    110.             BurstAOS(vectors1);
    111.         stopwatch.Stop();
    112.  
    113.         testTotalTicks[2] += stopwatch.Elapsed.Ticks;
    114.         //DBG.Log("BurstAOS ticks " + stopwatch.Elapsed.Ticks);
    115.         //text.text += "BurstAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    116.  
    117.         stopwatch.Restart();
    118.             BurstSOA(x1,y1,z1);
    119.         stopwatch.Stop();
    120.  
    121.         testTotalTicks[3] += stopwatch.Elapsed.Ticks;
    122.         //DBG.Log("BurstSOA ticks " + stopwatch.Elapsed.Ticks);
    123.         //text.text += "BurstSOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    124.  
    125.  
    126.         stopwatch.Restart();
    127.         BurstNAAOS(navectors);
    128.         stopwatch.Stop();
    129.  
    130.         navectors.Dispose();
    131.  
    132.         testTotalTicks[4] += stopwatch.Elapsed.Ticks;
    133.         //DBG.Log("BurstNAAOS ticks " + stopwatch.Elapsed.Ticks);
    134.         //text.text += "BurstNAAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    135.  
    136.         stopwatch.Restart();
    137.         BurstNASOA(nax, nay, naz);
    138.         stopwatch.Stop();
    139.  
    140.         testTotalTicks[5] += stopwatch.Elapsed.Ticks;
    141.         //DBG.Log("BurstNASOA ticks " + stopwatch.Elapsed.Ticks);
    142.         //text.text += "BurstNASOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    143.    
    144.         nax.Dispose();
    145.         nay.Dispose();
    146.         naz.Dispose();
    147.  
    148.         // Jobs versions
    149.         stopwatch.Restart();
    150.         BurstNAAOSJob job1 = new BurstNAAOSJob { vectors = jobvectors };
    151.         job1.Run();
    152.         stopwatch.Stop();
    153.  
    154.         testTotalTicks[6] += stopwatch.Elapsed.Ticks;
    155.         //DBG.Log("Burst AOS Job ticks " + stopwatch.Elapsed.Ticks);
    156.         //text.text += "Burst AOS Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    157.  
    158.         jobvectors.Dispose();
    159.  
    160.  
    161.         stopwatch.Restart();
    162.         BurstNASOAJob job2 = new BurstNASOAJob { x = jobx, y = joby, z = jobz };
    163.         job2.Run();
    164.         stopwatch.Stop();
    165.  
    166.         testTotalTicks[7] += stopwatch.Elapsed.Ticks;
    167.         //DBG.Log("Burst SOA Job ticks " + stopwatch.Elapsed.Ticks);
    168.         //text.text += "Burst SOA Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    169.  
    170.         jobx.Dispose();
    171.         joby.Dispose();
    172.         jobz.Dispose();
    173.  
    174.         CalcResults();
    175.  
    176.     }
    177.  
    178.     // Update is called once per frame
    179.     void Update()
    180.     {
    181.         if (testNo < numberOfTestsToRun) // run exactly numberOfTestsToRun times
    182.         {
    183.             text.text = "Test " + (testNo + 1) + " of "+ numberOfTestsToRun + "\n";
    184.             RunTests();
    185.  
    186.             //if (testNo == numberOfTestsToRun) CalcResults();
    187.             testNo++;      
    188.         }
    189.     }
    190.  
    191.     public void CalcResults()
    192.     {
    193.         text.text += "Results\n";
    194.  
    195.         float[] results = new float[numOfTests];
    196.  
    197.         string[] testNames = { "SOA\t\t\t", "AOS\t\t\t", "AOS Burst\t\t", "SOA Burst\t\t",
    198.                                 "AOS Burst NativeArray\t", "SOA Burst NativeArray\t",
    199.                                 "AOS Burst NativeArray Job\t", "SOA Burst NativeArray Job\t" };
    200.  
    201.         int[] resultOrder = new int[numOfTests];
    202.  
    203.         for (int i = 0; i < numOfTests; i++)
    204.         {
    205.             results[i] = (float)testTotalTicks[i] / (testNo + 1); // testNo is incremented after RunTests, so testNo + 1 runs are in the total
    206.             resultOrder[i] = i;
    207.         }
    208.  
    209.         foreach (int i in resultOrder)
    210.         {
    211.             text.text += i + " " + testNames[i] + " " + System.String.Format("{0,11:0,0.00}",results[i]) + " ticks\n";
    212.         }
    213.  
    214.         text.text += "\n";
    215.  
    216.         // sort the resultOrder
    217.  
    218.         int minIndex = 0;
    219.  
    220.         int temp;  
    221.  
    222.         for (minIndex = 0; minIndex < numOfTests-1; minIndex++)
    223.         {
    224.             //if (minIndex >= maxIndex) break;
    225.  
    226.             int foundMinIndex = minIndex;
    227.        
    228.             int mini = minIndex;
    229.                  
    230.             for (int i = mini; i < numOfTests; i++)
    231.             {
    232.                 if (results[resultOrder[i]] < results[resultOrder[foundMinIndex]]) foundMinIndex = i;          
    233.             }
    234.  
    235.             temp = resultOrder[minIndex];
    236.             resultOrder[minIndex] = resultOrder[foundMinIndex];
    237.             resultOrder[foundMinIndex] = temp;                                    
    238.         }
    239.  
    240.  
    241.         text.text += "\nSorted\n";
    242.  
    243.         int order = 1;
    244.         foreach (int i in resultOrder)
    245.         {
    246.             text.text += order++ + " " + testNames[i] + " " + System.String.Format("{0,11:0,0.00}", results[i]) + " ticks\n";
    247.         }
    248.     }
    249.  
    250.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    251.     public static void BurstAOS(float3[] vectors)
    252.     {
    253.         for (int i = 0; i < vectors.Length; i++)
    254.         {
    255.             vectors[i] *= 2f;
    256.         }
    257.     }
    258.  
    259.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    260.     public static void BurstSOA(float[] x, float[] y, float[] z)
    261.     {
    262.         for (int i = 0; i < x.Length; i++)
    263.         {
    264.             x[i] *= 2f;
    265.             y[i] *= 2f;
    266.             z[i] *= 2f;
    267.         }
    268.     }
    269.  
    270.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    271.     public static void BurstNAAOS(NativeArray<float3> vectors)
    272.     {
    273.         for (int i = 0; i < vectors.Length; i++)
    274.         {
    275.             vectors[i] *= 2f;
    276.         }
    277.     }
    278.  
    279.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    280.     public static void BurstNASOA(NativeArray<float> x, NativeArray<float> y, NativeArray<float> z)
    281.     {
    282.         for (int i = 0; i < x.Length; i++)
    283.         {
    284.             x[i] *= 2f;
    285.             y[i] *= 2f;
    286.             z[i] *= 2f;
    287.         }
    288.     }
    289.  
    290.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    291.     private struct BurstNAAOSJob : IJob
    292.     {
    293.         public NativeArray<float3> vectors;
    294.  
    295.         public void Execute()
    296.         {
    297.             for (int i = 0; i < vectors.Length; i++)
    298.             {
    299.                 vectors[i] *= 2f;
    300.             }
    301.         }
    302.     }
    303.  
    304.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    305.     private struct BurstNASOAJob : IJob
    306.     {
    307.         public NativeArray<float> x;
    308.         public NativeArray<float> y;
    309.         public NativeArray<float> z;
    310.  
    311.         public void Execute()
    312.         {
    313.             for (int i = 0; i < x.Length; i++)
    314.             {
    315.                 x[i] *= 2f;
    316.                 y[i] *= 2f;
    317.                 z[i] *= 2f;
    318.             }
    319.         }
    320.     }
    321. }
    322.  
    Code (Boo):
    1. Vs AOS
    2. SOA BN  160%
    3. SOA BNJ 142%
    4. AOS BNJ 118%
    5. AOS BN  111%
    6. SOA     110%
    7. AOS B   101%
    8. AOS     100%
    9. SOA B    96%
    Note: I need to add percentages to the program results.
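The percentages above appear to be the plain AOS average divided by each variant's average, taking plain AOS as 100%. A minimal sketch of that calculation in plain .NET follows. `RelativeSpeed` and `PercentVsBaseline` are hypothetical names, and truncating to a whole percent is an assumption chosen because it reproduces the 142% figure (16,605.74 / 11,623.89 is about 1.4286).

```csharp
using System;

public static class RelativeSpeed
{
    // Speed relative to a baseline as a whole percentage, truncated.
    // Fewer ticks than the baseline gives a value above 100.
    public static int PercentVsBaseline(double baselineTicks, double ticks)
    {
        return (int)(baselineTicks / ticks * 100.0);
    }

    public static void Main()
    {
        const double aos = 16605.74; // plain AOS average from the 10,000-run table

        Console.WriteLine(PercentVsBaseline(aos, 10369.63)); // SOA Burst NativeArray
        Console.WriteLine(PercentVsBaseline(aos, 17234.52)); // SOA Burst
    }
}
```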
     
    Last edited: Aug 14, 2022
  21. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Building to IL2CPP exe and obtaining results from 10k runs on my old CPU (only takes a couple of minutes).

    Please try my version of the benchmark on your machine and post your results.

    PS: Add a TMP_InputField to the scene and assign it to the script's text field so you can grab the output.
     
    Last edited: Aug 14, 2022
  22. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Added percentage vs AOS to results...
    Code (CSharp):
    1. using System.Collections;
    2. using System.Collections.Generic;
    3. using UnityEngine;
    4. using Unity.Mathematics;
    5. using Unity.Burst;
    6. using Unity.Collections;
    7.  
    8. using System.Diagnostics;
    9. using RND = UnityEngine.Random;
    10. using DBG = UnityEngine.Debug;
    11. using TMPro;
    12. using Unity.Jobs;
    13.  
    14. public class VectorArraySOAvsAOS : MonoBehaviour
    15. {
    16.     public TMP_InputField text;
    17.  
    18.     public int numberOfTestsToRun = 1000;
    19.  
    20.     private int testNo = 0;
    21.  
    22.     public const int numOfTests = 8;
    23.  
    24.     public long[] testTotalTicks = new long[numOfTests];
    25.  
    26.     public void RunTests()
    27.     {
    28.         const int size = 1000000;
    29.         float[] x = new float[size];
    30.         float[] y = new float[size];
    31.         float[] z = new float[size];
    32.  
    33.         float3[] vectors = new float3[size];
    34.         float3[] vectors1 = new float3[size];
    35.  
    36.         float[] x1 = new float[size];
    37.         float[] y1 = new float[size];
    38.         float[] z1 = new float[size];
    39.  
    40.         NativeArray<float3> navectors = new NativeArray<float3>(size, Allocator.Persistent);
    41.  
    42.         NativeArray<float> nax = new NativeArray<float>(size, Allocator.Persistent);
    43.         NativeArray<float> nay = new NativeArray<float>(size, Allocator.Persistent);
    44.         NativeArray<float> naz = new NativeArray<float>(size, Allocator.Persistent);
    45.  
    46.         NativeArray<float3> jobvectors = new NativeArray<float3>(size, Allocator.Persistent);
    47.  
    48.         NativeArray<float> jobx = new NativeArray<float>(size, Allocator.Persistent);
    49.         NativeArray<float> joby = new NativeArray<float>(size, Allocator.Persistent);
    50.         NativeArray<float> jobz = new NativeArray<float>(size, Allocator.Persistent);
    51.  
    52.  
    53.         for (var i = 0; i < size; i++)
    54.         {
    55.             x[i] = RND.value * 100;
    56.             y[i] = RND.value * 100;
    57.             z[i] = RND.value * 100;
    58.  
    59.             vectors[i].x = x[i];
    60.             vectors[i].y = y[i];
    61.             vectors[i].z = z[i];
    62.  
    63.             vectors1[i] = vectors[i]; // copy into the array that is passed to BurstAOS
    64.  
    65.             x1[i] = x[i];
    66.             y1[i] = y[i];
    67.             z1[i] = z[i];
    68.  
    69.             navectors[i] = vectors[i];
    70.  
    71.             nax[i] = x[i];
    72.             nay[i] = y[i];
    73.             naz[i] = z[i];
    74.  
    75.             jobvectors[i] = vectors[i];
    76.  
    77.             jobx[i] = x[i];
    78.             joby[i] = y[i];
    79.             jobz[i] = z[i];
    80.  
    81.         }
    82.  
    83.         Stopwatch stopwatch = new Stopwatch();
    84.  
    85.         stopwatch.Start();
    86.         for (int i = 0; i < x.Length; i++)
    87.         {
    88.             x[i] *= 2f;
    89.             y[i] *= 2f;
    90.             z[i] *= 2f;
    91.         }
    92.         stopwatch.Stop();
    93.  
    94.         testTotalTicks[0] += stopwatch.Elapsed.Ticks;
    95.         //DBG.Log("SOA ticks "+ stopwatch.Elapsed.Ticks);
    96.         //text.text += "SOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    97.  
    98.         stopwatch.Restart();      
    99.         for (int i = 0; i < vectors.Length; i++)
    100.         {
    101.             vectors[i] *= 2f;
    102.         }
    103.         stopwatch.Stop();
    104.  
    105.         testTotalTicks[1] += stopwatch.Elapsed.Ticks;
    106.         //DBG.Log("AOS ticks " + stopwatch.Elapsed.Ticks);
    107.         //text.text += "AOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    108.  
    109.         stopwatch.Restart();
    110.             BurstAOS(vectors1);
    111.         stopwatch.Stop();
    112.  
    113.         testTotalTicks[2] += stopwatch.Elapsed.Ticks;
    114.         //DBG.Log("BurstAOS ticks " + stopwatch.Elapsed.Ticks);
    115.         //text.text += "BurstAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    116.  
    117.         stopwatch.Restart();
    118.             BurstSOA(x1,y1,z1);
    119.         stopwatch.Stop();
    120.  
    121.         testTotalTicks[3] += stopwatch.Elapsed.Ticks;
    122.         //DBG.Log("BurstSOA ticks " + stopwatch.Elapsed.Ticks);
    123.         //text.text += "BurstSOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    124.  
    125.  
    126.         stopwatch.Restart();
    127.         BurstNAAOS(navectors);
    128.         stopwatch.Stop();
    129.  
    130.         navectors.Dispose();
    131.  
    132.         testTotalTicks[4] += stopwatch.Elapsed.Ticks;
    133.         //DBG.Log("BurstNAAOS ticks " + stopwatch.Elapsed.Ticks);
    134.         //text.text += "BurstNAAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    135.  
    136.         stopwatch.Restart();
    137.         BurstNASOA(nax, nay, naz);
    138.         stopwatch.Stop();
    139.  
    140.         testTotalTicks[5] += stopwatch.Elapsed.Ticks;
    141.         //DBG.Log("BurstNASOA ticks " + stopwatch.Elapsed.Ticks);
    142.         //text.text += "BurstNASOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    143.        
    144.         nax.Dispose();
    145.         nay.Dispose();
    146.         naz.Dispose();
    147.  
    148.         // Jobs versions
    149.         stopwatch.Restart();
    150.         BurstNAAOSJob job1 = new BurstNAAOSJob { vectors = jobvectors };
    151.         job1.Run();
    152.         stopwatch.Stop();
    153.  
    154.         testTotalTicks[6] += stopwatch.Elapsed.Ticks;
    155.         //DBG.Log("Burst AOS Job ticks " + stopwatch.Elapsed.Ticks);
    156.         //text.text += "Burst AOS Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    157.  
    158.         jobvectors.Dispose();
    159.  
    160.  
    161.         stopwatch.Restart();
    162.         BurstNASOAJob job2 = new BurstNASOAJob { x = jobx, y = joby, z = jobz };
    163.         job2.Run();
    164.         stopwatch.Stop();
    165.  
    166.         testTotalTicks[7] += stopwatch.Elapsed.Ticks;
    167.         //DBG.Log("Burst SOA Job ticks " + stopwatch.Elapsed.Ticks);
    168.         //text.text += "Burst SOA Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";
    169.  
    170.         jobx.Dispose();
    171.         joby.Dispose();
    172.         jobz.Dispose();
    173.  
    174.         CalcResults();
    175.  
    176.     }
    177.  
    178.     // Update is called once per frame
    179.     void Update()
    180.     {
    181.         if (testNo <= numberOfTestsToRun)
    182.         {
    183.             text.text = "Test " + testNo + " of "+ numberOfTestsToRun + "\n";
    184.             RunTests();
    185.  
    186.             //if (testNo == numberOfTestsToRun) CalcResults();
    187.             testNo++;          
    188.         }
    189.     }
    190.  
    191.     public void CalcResults()
    192.     {
    193.         text.text += "Results\n";
    194.  
    195.         float[] results = new float[numOfTests];
    196.  
    197.         string[] testNames = { "SOA\t\t\t", "AOS\t\t\t", "AOS Burst\t\t", "SOA Burst\t\t",
    198.                                 "AOS Burst NativeArray\t", "SOA Burst NativeArray\t",
    199.                                 "AOS Burst NativeArray Job\t", "SOA Burst NativeArray Job\t" };
    200.  
    201.         int[] resultOrder = new int[numOfTests];
    202.  
    203.         for (int i = 0; i < numOfTests; i++)
    204.         {
    205.             results[i] = (float)testTotalTicks[i] / (float)(testNo + 1); // testNo is 0-based, so testNo + 1 runs have completed
    206.             resultOrder[i] = i;
    207.         }
    208.  
    209.         foreach (int i in resultOrder)
    210.         {
    211.             text.text += i + " " + testNames[i] + " " + System.String.Format("{0,11:0,0.00}",results[i]) + " ticks\n";
    212.         }
    213.  
    214.         text.text += "\n";
    215.  
    216.         // sort the resultOrder
    217.  
    218.         int minIndex = 0;
    219.  
    220.         int temp;      
    221.  
    222.         for (minIndex = 0; minIndex < numOfTests-1; minIndex++)
    223.         {
    224.             //if (minIndex >= maxIndex) break;
    225.  
    226.             int foundMinIndex = minIndex;
    227.            
    228.             int mini = minIndex;
    229.                      
    230.             for (int i = mini; i < numOfTests; i++)
    231.             {
    232.                 if (results[resultOrder[i]] < results[resultOrder[foundMinIndex]]) foundMinIndex = i;              
    233.             }
    234.  
    235.             temp = resultOrder[minIndex];
    236.             resultOrder[minIndex] = resultOrder[foundMinIndex];
    237.             resultOrder[foundMinIndex] = temp;                                        
    238.         }
    239.  
    240.  
    241.         text.text += "\nSorted\n";
    242.  
    243.         int order = 1;
    244.         foreach (int i in resultOrder)
    245.         {
    246.             float percentageOfAOS = results[1] / results[i];
    247.             text.text += order++ + " " + testNames[i] + " " + System.String.Format("{0,11:0,0.00}", results[i]) + " ticks "+percentageOfAOS.ToString("P")+ "\n";
    248.         }
    249.     }
    250.  
    251.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    252.     public static void BurstAOS(float3[] vectors)
    253.     {
    254.         for (int i = 0; i < vectors.Length; i++)
    255.         {
    256.             vectors[i] *= 2f;
    257.         }
    258.     }
    259.  
    260.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    261.     public static void BurstSOA(float[] x, float[] y, float[] z)
    262.     {
    263.         for (int i = 0; i < x.Length; i++)
    264.         {
    265.             x[i] *= 2f;
    266.             y[i] *= 2f;
    267.             z[i] *= 2f;
    268.         }
    269.     }
    270.  
    271.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    272.     public static void BurstNAAOS(NativeArray<float3> vectors)
    273.     {
    274.         for (int i = 0; i < vectors.Length; i++)
    275.         {
    276.             vectors[i] *= 2f;
    277.         }
    278.     }
    279.  
    280.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    281.     public static void BurstNASOA(NativeArray<float> x, NativeArray<float> y, NativeArray<float> z)
    282.     {
    283.         for (int i = 0; i < x.Length; i++)
    284.         {
    285.             x[i] *= 2f;
    286.             y[i] *= 2f;
    287.             z[i] *= 2f;
    288.         }
    289.     }
    290.  
    291.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    292.     private struct BurstNAAOSJob : IJob
    293.     {
    294.         public NativeArray<float3> vectors;
    295.  
    296.         public void Execute()
    297.         {
    298.             for (int i = 0; i < vectors.Length; i++)
    299.             {
    300.                 vectors[i] *= 2f;
    301.             }
    302.         }
    303.     }
    304.  
    305.     [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
    306.     private struct BurstNASOAJob : IJob
    307.     {
    308.         public NativeArray<float> x;
    309.         public NativeArray<float> y;
    310.         public NativeArray<float> z;
    311.  
    312.         public void Execute()
    313.         {
    314.             for (int i = 0; i < x.Length; i++)
    315.             {
    316.                 x[i] *= 2f;
    317.                 y[i] *= 2f;
    318.                 z[i] *= 2f;
    319.             }
    320.         }
    321.     }
    322. }
     
  23. apkdev

    apkdev

    Joined:
    Dec 12, 2015
    Posts:
    263
    Code (csharp):
    Final Results
    7 Burst Job SOA 6,561.00 ticks
    5 Burst NativeArray SOA 7,934.00 ticks
    6 Burst Job AOS 8,981.00 ticks
    3 Burst SOA 12,953.00 ticks
    0 SOA 11,384.00 ticks
    1 AOS 21,986.00 ticks
    2 Burst AOS 24,133.00 ticks
    4 Burst NativeArray AOS 24,288.00 ticks
    I noticed that you don't have [BurstCompile] added on top of the class that contains the bursted functions. I fixed that, and changed the functions to use UnsafeList because they wouldn't compile otherwise. Results:

    Code (csharp):
    2 Burst AOS 9,598.00 ticks
    3 Burst SOA 12,760.00 ticks
    With [NoAlias] applied:
    Code (csharp):
    3 Burst SOA 5,943.00 ticks
    This makes bursted functions just slightly faster than jobs, which is expected, since you're including the job scheduling/completion overhead in measurements.

    I am getting very large differences between runs, but that could be my CPU's clocks going crazy. Still, I think it would be wise to use the performance testing package: https://docs.unity3d.com/Packages/com.unity.test-framework.performance@1.0/manual/index.html
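
    Short of pulling in the package, a rough sketch of a steadier measurement could look like the following: warm up first, time several runs, and report the median instead of a running average, so a few slow outliers don't skew the figure. This is plain .NET with no Unity APIs; `BenchSketch` and `MedianTicks` are made-up names for illustration, and the warm-up and sample counts are arbitrary assumptions.

```csharp
using System;
using System.Diagnostics;

public static class BenchSketch
{
    // Hypothetical helper (not part of any Unity package):
    // runs `action` untimed `warmup` times, then returns the median
    // of `samples` timed runs, in TimeSpan ticks (100 ns units).
    public static long MedianTicks(Action action, int warmup = 5, int samples = 31)
    {
        for (int i = 0; i < warmup; i++)
            action();

        var ticks = new long[samples];
        var stopwatch = new Stopwatch();

        for (int i = 0; i < samples; i++)
        {
            stopwatch.Restart();
            action();
            stopwatch.Stop();
            ticks[i] = stopwatch.Elapsed.Ticks;
        }

        Array.Sort(ticks);
        return ticks[samples / 2]; // middle element of the sorted samples
    }

    // Example use: time the managed SOA-style loop from this thread.
    public static long MeasureManagedSoa(int size)
    {
        var x = new float[size];
        var y = new float[size];
        var z = new float[size];

        return MedianTicks(() =>
        {
            for (int i = 0; i < x.Length; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
        });
    }
}
```

    Calling `BenchSketch.MeasureManagedSoa(1000000)` gives one median figure for the managed loop; the same helper can wrap the Burst calls or `job.Run()`.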
     
  24. Per-Morten

    Per-Morten

    Joined:
    Aug 23, 2019
    Posts:
    109
    Arowx, have you opened, for instance, the BurstNASOA function in the Burst Inspector and verified that it's actually Burst-compiled? I'm pretty certain that you can't pass a NativeArray into a Burst function; you need to pass pointers, so what you have there should be a compile error. You also need [BurstCompile] at the top of your class; that's a requirement for using Burst-compiled C# methods.
     
    Last edited: Aug 15, 2022
  25. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    And the funny thing is SOA Burst Native Array runs the fastest, about 150% faster than AOS when compiled.
     
    Last edited: Aug 15, 2022
  26. Per-Morten

    Per-Morten

    Joined:
    Aug 23, 2019
    Posts:
    109
    Code (CSharp):
    using System.Collections;
    using System.Collections.Generic;
    using UnityEngine;
    using Unity.Mathematics;
    using Unity.Burst;
    using Unity.Collections;

    using System.Diagnostics;
    using RND = UnityEngine.Random;
    using DBG = UnityEngine.Debug;
    using TMPro;
    using Unity.Jobs;
    using Unity.Collections.LowLevel.Unsafe;

    // [BurstCompile] on the class is required for Burst to find the static methods below.
    [BurstCompile]
    public class VectorArraySOAvsAOS : MonoBehaviour
    {
        public TMPro.TMP_InputField text;

        public int numberOfTestsToRun = 1000;

        private int testNo = 0;

        public const int numOfTests = 8;

        public long[] testTotalTicks = new long[numOfTests];

        public void RunTests()
        {
            const int size = 1000000;
            float[] x = new float[size];
            float[] y = new float[size];
            float[] z = new float[size];

            float3[] vectors = new float3[size];
            float3[] vectors1 = new float3[size];

            float[] x1 = new float[size];
            float[] y1 = new float[size];
            float[] z1 = new float[size];

            NativeArray<float3> navectors = new NativeArray<float3>(size, Allocator.Persistent);

            NativeArray<float> nax = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> nay = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> naz = new NativeArray<float>(size, Allocator.Persistent);

            NativeArray<float3> jobvectors = new NativeArray<float3>(size, Allocator.Persistent);

            NativeArray<float> jobx = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> joby = new NativeArray<float>(size, Allocator.Persistent);
            NativeArray<float> jobz = new NativeArray<float>(size, Allocator.Persistent);


            // Fill every test's input with the same random data.
            for (var i = 0; i < size; i++)
            {
                x[i] = RND.value * 100;
                y[i] = RND.value * 100;
                z[i] = RND.value * 100;

                vectors[i].x = x[i];
                vectors[i].y = y[i];
                vectors[i].z = z[i];

                // Was "vectors[i] = vectors[i]", which left vectors1 (the Burst AOS input) all zeros.
                vectors1[i] = vectors[i];

                x1[i] = x[i];
                y1[i] = y[i];
                z1[i] = z[i];

                navectors[i] = vectors[i];

                nax[i] = x[i];
                nay[i] = y[i];
                naz[i] = z[i];

                jobvectors[i] = vectors[i];

                jobx[i] = x[i];
                joby[i] = y[i];
                jobz[i] = z[i];

            }

            Stopwatch stopwatch = new Stopwatch();

            // Test 0: managed SOA
            stopwatch.Start();
            for (int i = 0; i < x.Length; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
            stopwatch.Stop();

            testTotalTicks[0] += stopwatch.Elapsed.Ticks;
            //DBG.Log("SOA ticks "+ stopwatch.Elapsed.Ticks);
            //text.text += "SOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            // Test 1: managed AOS
            stopwatch.Restart();
            for (int i = 0; i < vectors.Length; i++)
            {
                vectors[i] *= 2f;
            }
            stopwatch.Stop();

            testTotalTicks[1] += stopwatch.Elapsed.Ticks;
            //DBG.Log("AOS ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "AOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            // Test 2: Burst AOS on a managed array
            stopwatch.Restart();
            BurstAOS(vectors1);
            stopwatch.Stop();

            testTotalTicks[2] += stopwatch.Elapsed.Ticks;
            //DBG.Log("BurstAOS ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "BurstAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            // Test 3: Burst SOA on managed arrays
            stopwatch.Restart();
            BurstSOA(x1, y1, z1);
            stopwatch.Stop();

            testTotalTicks[3] += stopwatch.Elapsed.Ticks;
            //DBG.Log("BurstSOA ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "BurstSOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";


            // Test 4: Burst AOS on a NativeArray
            stopwatch.Restart();
            BurstNAAOS(navectors);
            stopwatch.Stop();

            navectors.Dispose();

            testTotalTicks[4] += stopwatch.Elapsed.Ticks;
            //DBG.Log("BurstNAAOS ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "BurstNAAOS ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            // Test 5: Burst SOA on NativeArrays
            stopwatch.Restart();
            BurstNASOA(nax, nay, naz);
            stopwatch.Stop();

            testTotalTicks[5] += stopwatch.Elapsed.Ticks;
            //DBG.Log("BurstNASOA ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "BurstNASOA ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            nax.Dispose();
            nay.Dispose();
            naz.Dispose();

            // Jobs versions
            // Test 6: Burst AOS job
            stopwatch.Restart();
            BurstNAAOSJob job1 = new BurstNAAOSJob { vectors = jobvectors };
            job1.Run();
            stopwatch.Stop();

            testTotalTicks[6] += stopwatch.Elapsed.Ticks;
            //DBG.Log("Burst AOS Job ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "Burst AOS Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            jobvectors.Dispose();


            // Test 7: Burst SOA job
            stopwatch.Restart();
            BurstNASOAJob job2 = new BurstNASOAJob { x = jobx, y = joby, z = jobz };
            job2.Run();
            stopwatch.Stop();

            testTotalTicks[7] += stopwatch.Elapsed.Ticks;
            //DBG.Log("Burst SOA Job ticks " + stopwatch.Elapsed.Ticks);
            //text.text += "Burst SOA Job ticks " + stopwatch.Elapsed.Ticks.ToString("N0") + "\n";

            jobx.Dispose();
            joby.Dispose();
            jobz.Dispose();

            CalcResults();

        }

        // Update is called once per frame
        void Update()
        {
            if (testNo <= numberOfTestsToRun)
            {
                text.text = "Test " + testNo + " of " + numberOfTestsToRun + "\n";
                RunTests();

                // if (testNo == numberOfTestsToRun) CalcResults();
                testNo++;
            }
        }

        public void CalcResults()
        {
            text.text += "Results\n";

            float[] results = new float[numOfTests];

            string[] testNames = { "SOA\t\t\t", "AOS\t\t\t", "AOS Burst\t\t", "SOA Burst\t\t",
                                    "AOS Burst NativeArray\t", "SOA Burst NativeArray\t",
                                    "AOS Burst NativeArray Job\t", "SOA Burst NativeArray Job\t" };

            int[] resultOrder = new int[numOfTests];

            for (int i = 0; i < numOfTests; i++)
            {
                // testNo is 0-based when this runs, so testNo + 1 runs have completed.
                // Dividing by testNo gave Infinity on the first run and inflated averages afterwards.
                results[i] = (float)testTotalTicks[i] / (float)(testNo + 1);
                resultOrder[i] = i;
            }

            foreach (int i in resultOrder)
            {
                text.text += i + " " + testNames[i] + " " + System.String.Format("{0,11:0,0.00}", results[i]) + " ticks\n";
            }

            text.text += "\n";

            // sort the resultOrder (selection sort, fastest first)

            int minIndex = 0;

            int temp;

            for (minIndex = 0; minIndex < numOfTests - 1; minIndex++)
            {
                //if (minIndex >= maxIndex) break;

                int foundMinIndex = minIndex;

                int mini = minIndex;

                for (int i = mini; i < numOfTests; i++)
                {
                    if (results[resultOrder[i]] < results[resultOrder[foundMinIndex]]) foundMinIndex = i;
                }

                temp = resultOrder[minIndex];
                resultOrder[minIndex] = resultOrder[foundMinIndex];
                resultOrder[foundMinIndex] = temp;
            }


            text.text += "\nSorted\n";

            int order = 1;
            foreach (int i in resultOrder)
            {
                // Speed relative to the managed AOS baseline (test 1).
                float percentageOfAOS = results[1] / results[i];
                text.text += order++ + " " + testNames[i] + " " + System.String.Format("{0,11:0,0.00}", results[i]) + " ticks " + percentageOfAOS.ToString("P") + "\n";
            }
        }

        // Managed wrapper: pins the array and hands a raw pointer to the Burst-compiled overload.
        public static unsafe void BurstAOS(float3[] vectors)
        {
            fixed (float3* ptr = vectors)
                BurstAOS(ptr, vectors.Length);
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        public static unsafe void BurstAOS(float3* vectors, int count)
        {
            for (int i = 0; i < count; i++)
            {
                vectors[i] *= 2f;
            }
        }

        // Managed wrapper: pins all three arrays before calling the Burst-compiled overload.
        public static unsafe void BurstSOA(float[] x, float[] y, float[] z)
        {
            fixed (float* xPtr = x)
            fixed (float* yPtr = y)
            fixed (float* zPtr = z)
                BurstSOA(xPtr, yPtr, zPtr, x.Length);
        }

        // [NoAlias] tells Burst that x, y and z do not overlap.
        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        public static unsafe void BurstSOA([NoAlias] float* x, [NoAlias] float* y, [NoAlias] float* z, int count)
        {
            for (int i = 0; i < count; i++)
            {
                x[i] *= 2f;
                y[i] *= 2f;
                z[i] *= 2f;
            }
        }

        public static unsafe void BurstNAAOS(NativeArray<float3> vectors)
        {
            BurstAOS((float3*)vectors.GetUnsafePtr(), vectors.Length);
        }

        public static unsafe void BurstNASOA(NativeArray<float> x, NativeArray<float> y, NativeArray<float> z)
        {
            BurstSOA((float*)x.GetUnsafePtr(), (float*)y.GetUnsafePtr(), (float*)z.GetUnsafePtr(), x.Length);
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        private struct BurstNAAOSJob : IJob
        {
            public NativeArray<float3> vectors;

            public void Execute()
            {
                for (int i = 0; i < vectors.Length; i++)
                {
                    vectors[i] *= 2f;
                }
            }
        }

        [BurstCompile(CompileSynchronously = true, FloatMode = FloatMode.Fast, OptimizeFor = OptimizeFor.Performance)]
        private struct BurstNASOAJob : IJob
        {
            public NativeArray<float> x;
            public NativeArray<float> y;
            public NativeArray<float> z;

            public void Execute()
            {
                for (int i = 0; i < x.Length; i++)
                {
                    x[i] *= 2f;
                    y[i] *= 2f;
                    z[i] *= 2f;
                }
            }
        }
    }
    Here, now all the functions are actually Burst-compiled :). They weren't before because you didn't specify [BurstCompile] on the class, which is required for Burst to recognize Burst-compiled static C# methods. Since it was missing, Burst never found the functions, and so it also never told you that you're not allowed to pass NativeArray and managed arrays directly into Burst functions.
     
  27. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Nice one! New scores on the doors, and I'm definitely seeing the functions in the Burst Inspector now.
    Code (Boo):
    Sorted
    1 SOA Burst NativeArray            8,797.44 ticks 190.51%
    2 SOA Burst                        9,868.87 ticks 169.82%
    3 SOA Burst NativeArray Job       11,507.84 ticks 145.64%
    4 AOS Burst                       13,764.39 ticks 121.76%
    5 AOS Burst NativeArray           13,852.77 ticks 120.98%
    6 AOS Burst NativeArray Job       13,973.93 ticks 119.94%
    7 SOA                             15,239.35 ticks 109.98%
    8 AOS                             16,759.74 ticks 100.00%
    That's 190% for SOA Burst NativeArray vs plain AOS, i.e. about 1.9x the speed (16,759.74 / 8,797.44 ticks).

    So is there a good case for SOA Vector Native Array Structs instead of Vector3?
     
  28. Per-Morten

    Per-Morten

    Joined:
    Aug 23, 2019
    Posts:
    109
    If it fits your problem and access patterns, then an SoA approach is quite reasonable. I usually start with SoA structures when I write code these days because the perf tends to be better and I find the code more malleable. As DreamImLatios said, it's also possible to do SoAoS, i.e. xxxxyyyyzzzzxxxxyyyyzzzz. However, if you have lots of components, access them all at the same time, and don't have any way to break up your processing, then an AoS approach might be better. Everything kinda depends on the problem at hand. I also agree with officialfonee that it might be a good idea to experiment with writing code with intrinsics; it really gives you a new perspective on organizing data, and makes it pretty obvious why SIMD code tends to favor an SoA or SoAoS approach :)
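
    To make the three layouts concrete, here is a small index-math sketch in plain C# (no Unity or Burst APIs). The xxxxyyyyzzzz layout is called SoAoS above and is often written AoSoA elsewhere. `LayoutSketch`, its method names, and the block width of 4 floats are assumptions for illustration only; the width would normally match the SIMD width you target.

```csharp
using System;

public static class LayoutSketch
{
    // Assumed block width: 4 floats, matching the xxxxyyyyzzzz pattern above.
    public const int Lanes = 4;

    // AoS: xyzxyzxyz... (component is 0 = x, 1 = y, 2 = z)
    public static int AosIndex(int element, int component)
        => element * 3 + component;

    // SoA packed into one buffer: all x, then all y, then all z.
    public static int SoaIndex(int element, int component, int count)
        => component * count + element;

    // Blocked layout: xxxxyyyyzzzz xxxxyyyyzzzz ...
    public static int BlockedIndex(int element, int component)
    {
        int block = element / Lanes; // which group of 4 vectors
        int lane = element % Lanes;  // position inside that group
        return block * 3 * Lanes + component * Lanes + lane;
    }

    // Converts an AoS buffer (length 3 * count, count a multiple of Lanes)
    // into the blocked layout.
    public static float[] AosToBlocked(float[] aos)
    {
        int count = aos.Length / 3;
        if (aos.Length % 3 != 0 || count % Lanes != 0)
            throw new ArgumentException("Expected 3 * count floats with count a multiple of Lanes.");

        var blocked = new float[aos.Length];
        for (int e = 0; e < count; e++)
            for (int c = 0; c < 3; c++)
                blocked[BlockedIndex(e, c)] = aos[AosIndex(e, c)];
        return blocked;
    }
}
```

    In the blocked layout each run of 4 consecutive floats holds the same component of 4 neighbouring vectors, so it can be loaded as one 4-wide SIMD register, while all three components of those vectors still sit close together in memory.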