Jobs system performance on Android?

Discussion in 'Entity Component System' started by Warmcher, Feb 3, 2018.

  1. Warmcher

    Joined:
    Feb 24, 2015
    Posts:
    2
    My nearly empty project shows a performance decrease on Android when using the Job System.
    In the Editor it also performs terribly (10x worse), but in a Windows build it is even faster than the plain code without a job.
    Code (csharp):
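    // Plain loop, no job. Note: it starts at index 1, so element 0 is never written,
    // whereas the job version starts at index 0.
    private void Test()
    {
         for (var i = 1; i < Points.Length; i++)
         {
              Points[i] = (byte) GroundKind.None;
         }
    }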
    vs
    Code (csharp):
    // The same fill, wrapped in a single IJob (not split across cores).
    struct TestJob : IJob
    {
         public NativeArray<byte> Array;
         public int Length;

         public void Execute()
         {
             for (int i = 0; i < Length; i++)
             {
                 Array[i] = (byte) GroundKind.None;
             }
         }
    }
    Length is 2049 * 8193.
    Stopwatch results (plain loop vs. job):
    Android:
    100 ms vs 1200 ms
    Windows:
    11 ms vs 9 ms

    Windows was built using Mono.
    Android was built using ARM64 IL2CPP (I can't build with Mono because of a "Couldnt load mono" error on the device, and the 32-bit IL2CPP build crashes too).

    Is it supposed to be like this (for now), or is something wrong?
     
    Last edited: Feb 3, 2018
  2. recursive

    Joined:
    Jul 12, 2012
    Posts:
    669
    Try using IJobParallelFor and, as a starting point, size the batches based on the number of processor cores. You can use System.Environment.ProcessorCount to get the number of CPU cores, and the workload will be automatically divided between them (and I suspect the underlying loop itself is much faster).

    I've noticed IJob is less optimized for large array processing in several of my tests, although I wouldn't be surprised if they haven't optimized the C# jobs for Android as thoroughly yet.

    EDIT:
    One more thing I thought of is that byte operation performance may not be optimal, since on many platforms (including the CLR and C++ environments) byte operations are actually converted to integers and then back to bytes. There may or may not be some performance issues with this; I'd need someone with more recent experience dealing with low-level stuff to chime in.
     
    Last edited: Feb 4, 2018
  3. recursive

    Joined:
    Jul 12, 2012
    Posts:
    669
    Code (CSharp):
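    using Unity.Collections;
    using Unity.Jobs;

    struct TestJob : IJobParallelFor
    {
            public NativeArray<byte> Array;

            // Called once per index; indices are handed to worker threads in batches.
            public void Execute(int i)
            {
                    Array[i] = (byte) GroundKind.None;
            }
    }

    public void TestJobFunc()
    {
            var processorCount = System.Environment.ProcessorCount;

            // NativeArray needs an explicit allocator; TempJob suits short-lived job data.
            var array = new NativeArray<byte>(100000, Allocator.TempJob);

            var testJob = new TestJob
            {
                    Array = array,
            };

            // The second argument is the number of iterations per batch, not the number
            // of batches, so aim for roughly one batch per core as a starting point.
            var batchSize = System.Math.Max(1, array.Length / processorCount);
            var testJobHandle = testJob.Schedule(array.Length, batchSize);

            //...later...

            testJobHandle.Complete();

            // Native containers are not garbage collected and must be disposed.
            array.Dispose();
    }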
    Added an example.
     
  4. Warmcher

    Joined:
    Feb 24, 2015
    Posts:
    2
    I did not mention it, but I did test the parallel version. Actually, parallelism is the first thing I want Jobs for. Parallel.For and Parallel.ForEach (with a partitioner) are much faster than IJobParallelFor on Android. The Windows results with IJobParallelFor are amazing.
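
    For readers unfamiliar with the partitioned Parallel.ForEach variant mentioned here, the following is a minimal sketch of what such a fill might look like on a managed byte array. It is not the poster's actual test code: the class name ParallelFillSketch, the Fill method, and the GroundKind members are hypothetical stand-ins, and only Partitioner.Create and Parallel.ForEach come from the .NET base class library.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical stand-in for the GroundKind enum used in the thread.
enum GroundKind : byte { None = 0, Grass = 1 }

static class ParallelFillSketch
{
    // Fills a managed byte array using range partitioning, so each worker
    // processes a contiguous chunk instead of invoking a delegate per element.
    // Assumes length > 0, because Partitioner.Create rejects an empty range.
    public static byte[] Fill(int length, GroundKind kind)
    {
        var points = new byte[length];

        // Partitioner.Create(from, to) yields [Item1, Item2) index ranges.
        var ranges = Partitioner.Create(0, length);

        Parallel.ForEach(ranges, range =>
        {
            for (int i = range.Item1; i < range.Item2; i++)
            {
                points[i] = (byte) kind;
            }
        });

        return points;
    }
}
```

    Calling ParallelFillSketch.Fill(2049 * 8193, GroundKind.None) would mirror the array size from the first post. Range partitioning keeps the inner loop tight, which is likely why this variant is usually preferred over a per-element Parallel.For body for work this small per index.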
     
  5. recursive

    Joined:
    Jul 12, 2012
    Posts:
    669
    Sounds like it may be a performance regression bug then. I'd go ahead and report it.
     
  6. interpol_kun

    Joined:
    Jul 28, 2016
    Posts:
    134
    Reading Points.Length in the for-loop condition can decrease performance. Rewrite the non-job example so it does not do that and see if there is a real difference in the Windows build.
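
    One way to check the suggestion above is to time both loop shapes side by side. This is a rough sketch in plain C#, not the original poster's benchmark: LengthHoistSketch and its method names are hypothetical, and the points array stands in for the Points member from the first post. Whether the two versions differ at all depends on the runtime (Mono, IL2CPP, or another JIT) and on whether Points is a local, a field, or a property.

```csharp
using System;
using System.Diagnostics;

static class LengthHoistSketch
{
    // Reads points.Length in the loop condition on every iteration.
    public static void FillReadingLength(byte[] points, byte value)
    {
        for (var i = 0; i < points.Length; i++)
        {
            points[i] = value;
        }
    }

    // Copies the length into a local once, before the loop starts.
    public static void FillCachedLength(byte[] points, byte value)
    {
        var length = points.Length;
        for (var i = 0; i < length; i++)
        {
            points[i] = value;
        }
    }

    // Times a single call of the given fill and returns elapsed milliseconds.
    public static long TimeMs(Action<byte[], byte> fill, byte[] points, byte value)
    {
        var stopwatch = Stopwatch.StartNew();
        fill(points, value);
        stopwatch.Stop();
        return stopwatch.ElapsedMilliseconds;
    }
}
```

    To compare, allocate new byte[2049 * 8193] and pass each fill method to TimeMs in turn. A single run is noisy, so repeating each measurement several times and discarding the first (warm-up) run gives a fairer picture.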

    As for Android, I don't think the byte-to-int conversion is heavy enough to cause a 10x slowdown. It seems that the Job System is currently not optimized for Android.
     
  7. Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    We have been working on perf speedups for NativeArray in IL2CPP. So far we have mostly focused internally on measuring performance with Burst, where we have incredible performance with NativeArray.

    But we are now also making sure that we are at a minimum on par with builtin arrays in IL2CPP. We have that working locally now, just need to get it reviewed and into one of the next beta builds. We'll post some numbers soon.


    In Mono / the Editor we also do extensive race-condition checks, out-of-bounds checks, etc., and that costs performance. We can probably get a bit of a speedup there, but nothing major. For NativeArray speed in the Editor, our idea is that anyone who needs code that also runs fast in the Editor will use Burst on their jobs; Burst will solve those use cases since it can run in the Editor. But that's not in 18.1, of course.

    If there is an absolute need to write performant NativeArray code right now, you can use NativeArray.GetUnsafePtr() to get the pointer to the data. Naturally, this removes all out-of-bounds checks, and you can very easily crash Unity that way. So I recommend just waiting for the IL2CPP changes to land to make it fast.
     
    Last edited: Feb 9, 2018
  8. MadeFromPolygons

    Joined:
    Oct 5, 2013
    Posts:
    3,980
    Thanks @Joachim_Ante, I really appreciate how active you are on the forums despite being a CTO. Very refreshing!
     
  9. Astiolo

    Joined:
    Dec 18, 2014
    Posts:
    27
    Now that 2018 is out of beta, does anyone have stats on Job System performance on Android?

    When googling, this thread was just about the only thing that came up for me.