Burst and vectorization of loop

Discussion in 'Burst' started by e199, Aug 25, 2019.

  1. e199 (Joined: Mar 24, 2015; Posts: 101)

    Hi, I tried a lot of things to vectorize this loop, but without success:
    Code (CSharp):
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Entities;

    [BurstCompile]
    public struct ChangeDetectionChunk1 : IJobChunk
    {
        [ReadOnly] public ArchetypeChunkComponentType<TestComponent>         TestType;
        public            ArchetypeChunkComponentType<Shadow<TestComponent>> STestType;

        public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
        {
            var testAr  = chunk.GetNativeArray(TestType);
            var sTestAr = chunk.GetNativeArray(STestType);
            for (var i = 0; i < chunk.Count; i++)
            {
                var component    = testAr[i];
                var shadow       = sTestAr[i];
                var componentDup = shadow.Value;

                // Only write the shadow back when at least one field differs.
                if (component.Value != componentDup.Value || component.Value2 != componentDup.Value2 ||
                    component.Value3 != componentDup.Value3 || component.Value4 != componentDup.Value4)
                {
                    shadow.Value   = component;
                    shadow.Changed = true;
                    sTestAr[i]     = shadow;
                }
            }
        }
    }
    Another attempt using pointers and [NoAlias]:
    Code (CSharp):
    [BurstCompile]
    public unsafe struct ChangeDetectionChunk2 : IJobChunk // unsafe: Do() takes raw pointers
    {
        [ReadOnly] public ArchetypeChunkComponentType<TestComponent2>         TestType;
        public            ArchetypeChunkComponentType<Shadow<TestComponent2>> STestType;

        public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
        {
            var testAr   = chunk.GetNativeArray(TestType);
            var sTestAr  = chunk.GetNativeArray(STestType);
            var length   = testAr.Length;
            var testPtr  = testAr.GetUnsafeReadOnlyPtr();
            var sTestPtr = sTestAr.GetUnsafePtr();

            Do(testPtr, sTestPtr, length);
        }

        private void Do([NoAlias] void* test, [NoAlias] void* sTest, int length)
        {
            for (int i = 0; i < length; i++)
            {
                var testOffset  = Unsafe.Add<TestComponent2>(test, i);
                var sTestOffset = Unsafe.Add<Shadow<TestComponent2>>(sTest, i);

                var component = Unsafe.Read<TestComponent2>(testOffset);
                var shadow    = Unsafe.Read<Shadow<TestComponent2>>(sTestOffset);

                // Unlike the first version, this overwrites Changed and Value on every iteration.
                shadow.Changed = !component.Equals(shadow.Value);
                shadow.Value = component;
                Unsafe.Write(sTestOffset, shadow);
            }
        }
    }

    The code should compare two structs with each other and, if they are not equal, record that in the shadow's bool Changed field.

    Components:
    Code (CSharp):
    using System;
    using Unity.Entities;
    [assembly: RegisterGenericComponentType(typeof(Shadow<TestComponent>))]
    [assembly: RegisterGenericComponentType(typeof(Shadow<TestComponent2>))]
    public struct Shadow<T> : IComponentData where T : unmanaged, IComponentData
    {
        public T    Value;
        public bool Changed;
    }

    public partial struct TestComponent2 : IComponentData
    {
        public int Val1;
        public int Val2;
        public int Val3;
        public int Val4;
    }

    public partial struct TestComponent2
    {
        public bool Equals(TestComponent2 other)
        {
            return Val1 == other.Val1 && Val2 == other.Val2 && Val3 == other.Val3 && Val4 == other.Val4;
        }
    }

    If you have any ideas on how to make it vectorizable, I would be very glad to test them.
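
    For checking any rewritten loop against the intended behaviour, a plain scalar reference can help. The following is a minimal sketch in ordinary C# with no Unity dependency; Payload, ShadowOf and ChangeDetectionReference are hypothetical stand-ins for TestComponent2, Shadow<T> and the job, and the sketch says nothing about what Burst will or will not vectorize.

    ```csharp
    using System;

    // Hypothetical stand-in for TestComponent2: four ints compared field by field.
    public struct Payload
    {
        public int Val1, Val2, Val3, Val4;

        public bool Equals(Payload other)
        {
            return Val1 == other.Val1 && Val2 == other.Val2 &&
                   Val3 == other.Val3 && Val4 == other.Val4;
        }
    }

    // Hypothetical stand-in for Shadow<T>: last seen value plus a change flag.
    public struct ShadowOf
    {
        public Payload Value;
        public bool Changed;
    }

    public static class ChangeDetectionReference
    {
        // Updates each shadow whose stored value differs from the live value
        // and returns how many shadows were updated.
        public static int Run(Payload[] live, ShadowOf[] shadows)
        {
            int length = Math.Min(live.Length, shadows.Length);
            int updated = 0;
            for (int i = 0; i < length; i++)
            {
                if (!live[i].Equals(shadows[i].Value))
                {
                    shadows[i].Value = live[i];
                    shadows[i].Changed = true;
                    updated++;
                }
            }
            return updated;
        }
    }
    ```

    Running a candidate loop and this reference over the same input and comparing the resulting shadows is one way to confirm that a rewrite kept the semantics, in particular that Changed stays set once it has been set.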
     
    Last edited: Aug 25, 2019
  2. e199

    @Lee_Hammerton @xoofx I think you can give me the best insight into how I can push Burst to vectorize it.
     
  3. sheredom (Unity Technologies; Joined: Jul 15, 2019; Posts: 300)

    @e199 I'm going to put this on my list to look at - I'll get back to you as soon as I know more :)
     
  4. sheredom

    Can I just check that your TestComponent and Shadow definitions are something like:

    Code (CSharp):
    public struct TestComponent : IComponentData
    {
        public int Value;
        public int Value2;
        public int Value3;
        public int Value4;
    }

    public struct Shadow<T> : IComponentData
    {
        public T Value;
        public bool Changed;
    }
    What I think is happening is that the compiler turns the load of component into a <4 x i32>; in other words, it loads a vector from memory. The problem is that the LLVM loop vectorizer then comes along, sees the already-vectorized type inside the loop, and says "Ok cool! The code is already vectorized so I'm not gonna do anything!".
     
  5. sheredom

    So I also tried lots of different types for TestComponent:

    Code (CSharp):
    public struct TestComponent : IComponentData
    {
        public int Value;
        public short Value2;
        public byte Value3;
        public long Value4;
    }
    The compiler then loads a struct in the loop, and the LLVM loop vectorizer looks at the struct and thinks "No idea how to vectorize that!".

    This is a fundamentally hard problem for the compiler. Generally, if all the data is being used, it will try to load it all together (so you get one bigger contiguous load). The problem in this case is that doing so affects the ability to vectorize the loop.
     
  6. e199

    Thanks a lot for the information!

    If I reduce the field count to 1-2, should it then be able to vectorize the loop?
     
  7. sheredom

    Unfortunately not. LLVM's loop vectorizer still trips up even if you do that. Basically, the default LLVM alias analysis is not smart enough to realise that the store to the boolean Changed field cannot alias with the next loop iteration's load from Value.

    What is even more frustrating is that even if I separate out the data streams entirely like:

    Code (CSharp):
    public struct TestComponent1 : IComponentData
    {
        public int Value;
    }

    public struct Changed1 : IComponentData
    {
        public bool Changed;
    }

    [BurstCompile]
    public unsafe struct ChangeDetectionChunk1 : IJobChunk
    {
        [ReadOnly] public ArchetypeChunkComponentType<TestComponent1> TestType;
        public ArchetypeChunkComponentType<TestComponent1> SValueType;
        public ArchetypeChunkComponentType<Changed1> SChangedType;

        public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
        {
            var testAr = chunk.GetNativeArray(TestType);
            var sValueAr = chunk.GetNativeArray(SValueType);
            var sChangedAr = chunk.GetNativeArray(SChangedType);
            var length = testAr.Length;
            var testPtr = testAr.GetUnsafeReadOnlyPtr();
            var sValuePtr = sValueAr.GetUnsafePtr();
            var sChangedPtr = sChangedAr.GetUnsafePtr();

            Do(testPtr, sValuePtr, sChangedPtr, length);
        }

        private void Do([NoAlias] void* test, [NoAlias] void* sValue, [NoAlias] void* sChanged, int length)
        {
            for (int i = 0; i < length; i++)
            {
                var testOffset = Unsafe.Add<TestComponent1>(test, i);
                var sValueOffset = Unsafe.Add<TestComponent1>(sValue, i);
                var sChangedOffset = Unsafe.Add<Changed1>(sChanged, i);

                var component = Unsafe.Read<TestComponent1>(testOffset);
                var shadow = Unsafe.Read<TestComponent1>(sValueOffset);

                if (component.Value != shadow.Value)
                {
                    Unsafe.Write(sValueOffset, component);
                    Unsafe.Write(sChangedOffset, true);
                }
            }
        }
    }
    LLVM decides that yup, the loop is vectorizable, but that it's not worth doing: "LV: Vectorization is possible but not beneficial."

    I've not got a great workaround for you at present unfortunately - but I'm going to keep this issue on my list because this is something that we really want users to be able to control :)
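
    One variation that could be tried on the split-stream layout is to remove the conditional stores, so that every iteration does the same loads and stores. The following is only a sketch in plain C# over managed arrays (BranchFreeChangeDetection and its parameter names are hypothetical, and no Unity or Burst API is used); nothing in this thread confirms that Burst or LLVM would vectorize it, and the unconditional stores add memory traffic.

    ```csharp
    using System;

    public static class BranchFreeChangeDetection
    {
        // Same observable result as the conditional version: the shadow value
        // ends up equal to the live value, and changed[i] becomes true when the
        // two differed (it is never reset to false here).
        public static void Run(int[] live, int[] shadowValues, bool[] changed)
        {
            int length = Math.Min(live.Length, Math.Min(shadowValues.Length, changed.Length));
            for (int i = 0; i < length; i++)
            {
                int current = live[i];
                changed[i] |= current != shadowValues[i]; // sticky flag, no branch
                shadowValues[i] = current;                // unconditional store
            }
        }
    }
    ```

    The design choice here is to trade extra writes for a loop body without control flow; whether that trade pays off would have to be checked in the Burst inspector on the actual job.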
     
  8. e199

    Awesome, thanks for taking the time.
    I will keep an eye on the Burst changelogs.
     
  9. sheredom

    Just as some follow-up: I tried using our already vectorized Unity.Mathematics.int4 instead of the four Value fields you had in your original component, and got much better code for the loop out of the compiler:

    Code (CSharp):
    using Unity.Mathematics; // int4

    public struct TestComponent2 : IComponentData
    {
        public int4 Value;

        public bool Equals(TestComponent2 other)
        {
            // int4.Equals is true only when all four lanes match.
            return Value.Equals(other.Value);
        }
    }

    public struct Shadow2<T> : IComponentData
    {
        public T Value;
        public bool Changed;
    }

    [BurstCompile]
    public struct ChangeDetectionChunk2 : IJobChunk
    {
        [ReadOnly] public ArchetypeChunkComponentType<TestComponent2> TestType;
        public ArchetypeChunkComponentType<Shadow<TestComponent2>> STestType;

        public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
        {
            var testAr = chunk.GetNativeArray(TestType);
            var sTestAr = chunk.GetNativeArray(STestType);

            for (var i = 0; i < chunk.Count; i++)
            {
                var component = testAr[i];
                var shadow = sTestAr[i];
                var componentDup = shadow.Value;

                if (!component.Equals(componentDup))
                {
                    shadow.Value = component;
                    shadow.Changed = true;
                    sTestAr[i] = shadow;
                }
            }
        }
    }
    If you are still looking at this, then in this instance I'd advise just using the already vectorized type (since the payload within your TestComponent was already effectively a vector!).
     
  10. e199

    Hi, thanks for the input!
    This is just a test component; I doubt I will have 4 fields of the same type in the same component in a real project.