Parallel memory allocation

Discussion in 'Data Oriented Technology Stack' started by vitautart, Jan 24, 2019.

  1. vitautart

    vitautart

    Joined:
    May 3, 2018
    Posts:
    29
    As a newbie in concurrent programming, and because there currently isn't any tool for in-job NativeArray allocation, I'm trying to write terrifying code like this.

    Code (CSharp):
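    using Unity.Collections;
    using Unity.Collections.LowLevel.Unsafe;
    using Unity.Jobs;

    unsafe struct TestJob : IJobParallelFor
    {
        public int someLength; // can't be 0

        public void Execute(int index)
        {
            // Small buffers live on the stack, larger ones on the heap.
            bool useHeap = someLength > 8;

            // A pointer-typed stackalloc is only valid as a local variable initializer,
            // so the stack buffer is declared up front with the maximum small size.
            int* stackBuffer = stackalloc int[8];
            int* ptr = useHeap
                ? (int*)UnsafeUtility.Malloc(UnsafeUtility.SizeOf<int>() * someLength, UnsafeUtility.AlignOf<int>(), Allocator.TempJob)
                : stackBuffer;

            // do some reads and writes to allocated buffer

            // Only the heap allocation has to be released explicitly.
            if (useHeap)
            {
                UnsafeUtility.Free(ptr, Allocator.TempJob);
            }
        }
    }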
    It works most of the time, but in rare cases where more than one agent uses this kind of code, I get hard crashes, especially when I allocate on the stack, but also when the allocation was on the heap only. As I understand it, I have a data race here: each thread doesn't know about the other threads' pointers, so they can point to the same memory, or their memory regions can overlap.

    Do I understand this right?
     
    wobes likes this.
  2. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,139
    This code, while unsafe, is legit and should cause no issues.

    I'd expect you are overwriting memory somewhere (writing to stale memory, or out-of-bounds writes). This is easy to do with unsafe code...

    It would be better to use Allocator.Temp, however, for perf and memory consumption reasons, because the concept is that the memory is only accessible by the job itself anyway.
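
    One way to hunt for the out-of-bounds writes mentioned above is to route every buffer access through a checked helper while debugging. The following is only a minimal sketch in plain C# (compiled with unsafe code enabled); `CheckedBuffer` and its `Write` method are hypothetical names invented for this illustration and are not part of Unity.

    ```csharp
    using System;

    // Hypothetical debugging helper, not a Unity API: checked writes to a raw int buffer.
    static unsafe class CheckedBuffer
    {
        public static void Write(int* ptr, int length, int i, int value)
        {
            // The unsigned comparison rejects negative indices and i >= length in one test.
            if ((uint)i >= (uint)length)
                throw new IndexOutOfRangeException($"Index {i} is outside [0, {length})");
            ptr[i] = value;
        }
    }

    static class CheckedBufferDemo
    {
        static unsafe void Main()
        {
            const int length = 8;
            int* buffer = stackalloc int[length];

            // In-range writes go through unchanged.
            for (int i = 0; i < length; i++)
                CheckedBuffer.Write(buffer, length, i, i * i);

            try
            {
                // One past the end: this is the kind of write that silently corrupts memory.
                CheckedBuffer.Write(buffer, length, length, 123);
            }
            catch (IndexOutOfRangeException e)
            {
                Console.WriteLine(e.Message); // Index 8 is outside [0, 8)
            }
        }
    }
    ```

    A check like this costs a comparison per access, so it would normally be compiled out of release builds.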
     
  3. vitautart

    vitautart

    Joined:
    May 3, 2018
    Posts:
    29
    Thanks for the reply. Yes, I also started to think that TempJob is a bad choice here.

    So in theory this can work, but in my code, between allocation and deallocation, I'm doing something bad?

    P.S. Anyway, I plan to rewrite this to avoid in-job allocation, but that can cause some unnecessary overhead.
     
  4. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,139
    You didn't paste the code in between, so it's impossible to say, but given that the allocation and deallocation look good, that seems to be the logical conclusion.

    Do note that in 19.1 it is possible to use Allocator.Temp on NativeArray in a job.
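
    As a rough sketch of what that could look like, assuming Unity 2019.1 or later with the Collections and Jobs APIs available: the job name `TempArrayJob` and the placeholder work in the loop are made up for illustration, and the snippet only compiles inside a Unity project.

    ```csharp
    using Unity.Collections;
    using Unity.Jobs;

    // Sketch only: assumes Unity 2019.1+, where Allocator.Temp may be used inside a job.
    struct TempArrayJob : IJobParallelFor
    {
        public int someLength; // must be greater than 0

        public void Execute(int index)
        {
            // Scratch buffer that is only ever visible to this Execute call.
            var buffer = new NativeArray<int>(someLength, Allocator.Temp);

            for (int i = 0; i < buffer.Length; i++)
            {
                buffer[i] = index + i; // placeholder work
            }

            // Release the scratch buffer before leaving the job.
            buffer.Dispose();
        }
    }
    ```

    Compared with a raw pointer, indexing a NativeArray is bounds-checked when the collections safety checks are enabled, which helps catch the kind of overwrite suspected earlier in the thread.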
     
    Mr-Mechanical, Abbrew and vitautart like this.
  5. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,139
    > P.S. Anyway, I plan to rewrite this to avoid in-job allocation, but that can cause some unnecessary overhead.
    There is nothing wrong with doing in-job allocation, assuming you use stackalloc or Allocator.Temp.
     
    FROS7 and vitautart like this.
  6. vitautart

    vitautart

    Joined:
    May 3, 2018
    Posts:
    29
    OK, for now I've decided to keep two variants, one with in-job allocation and one without, and I will try to find the bugs in the in-job version. When 19.1 is stable, I will switch to NativeArray with Allocator.Temp.

    Thanks!
     
    FROS7 likes this.
  7. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    761
    Good day,

    What would be the difference in performance between Marshal.AllocHGlobal() and UnsafeUtility.Malloc()?
     
  8. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,139
    UnsafeUtility.Malloc has different allocators with massively different performance characteristics.

    Temp is a per-thread stack allocator. TempJob is reused on a per-frame basis across jobs. (Both of those are very fast and meant for allocations made every frame.)

    Persistent is a TLSF allocator, for when the lifetime is unknown.
    All of them are significantly faster than system allocation.
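
    To give an intuition for why a per-thread stack allocator is cheap, here is a toy bump allocator in plain C#. It is an illustration under my own assumptions and not Unity's implementation: `BumpArena`, its capacity, and the offset-based API are all invented for this sketch.

    ```csharp
    using System;
    using System.Threading;

    // Illustration only: a toy bump ("stack") allocator, not Unity's Temp allocator.
    sealed class BumpArena
    {
        private readonly byte[] _block;
        private int _offset;

        public BumpArena(int capacity) { _block = new byte[capacity]; }

        // Returns the start offset of the allocation, or -1 when the arena is full.
        // alignment must be a power of two.
        public int Allocate(int size, int alignment)
        {
            int aligned = (_offset + alignment - 1) & ~(alignment - 1);
            if (aligned + size > _block.Length) return -1;
            _offset = aligned + size;
            return aligned;
        }

        // Frees everything at once by rewinding the offset.
        public void Reset() { _offset = 0; }
    }

    static class ArenaDemo
    {
        // One arena per thread, so Allocate needs neither a lock nor an atomic.
        private static readonly ThreadLocal<BumpArena> PerThread =
            new ThreadLocal<BumpArena>(() => new BumpArena(16 * 1024));

        static void Main()
        {
            BumpArena arena = PerThread.Value;
            int a = arena.Allocate(64, 4);  // starts at 0
            int b = arena.Allocate(10, 4);  // starts at 64
            int c = arena.Allocate(64, 16); // 74 rounded up to a multiple of 16
            Console.WriteLine($"{a} {b} {c}"); // 0 64 80
            arena.Reset();
        }
    }
    ```

    Each allocation is an add and a compare, and releasing everything is a single assignment, which is the general reason this style suits short-lived per-frame scratch memory.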

    Essentially using Marshal.AllocHGlobal is always a bad idea.
     
    FROS7 and wobes like this.
  9. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    568
    @Joachim_Ante For the persistent memory allocator, what was the key factor in choosing TLSF over, let's say, tcmalloc/jemalloc/smmalloc (if you compared it to any of them)? Lowest memory fragmentation? And another question: are any lock mechanisms used in the allocator implementations?
     
    Last edited: Jan 29, 2019
  10. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    761
    I much appreciate your answer.
     
  11. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,139
    The memory allocator has been improved over several years based on performance in games.

    Temp is per thread so has no contention.
    TempJob can have contention on an atomic int but has no locks.
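
    The idea of contending on an atomic int instead of taking a lock can be sketched in plain C# as follows. This is only a generic lock-free reservation pattern written for illustration; it makes no claim about how TempJob is actually implemented, and `Reserve` is a hypothetical name.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    static class AtomicBumpDemo
    {
        private static int _offset; // shared by every thread

        // Reserves size bytes and returns the start offset of the reservation.
        static int Reserve(int size)
        {
            // Interlocked.Add returns the new value, so subtract size to get the start.
            return Interlocked.Add(ref _offset, size) - size;
        }

        static void Main()
        {
            var starts = new int[1000];

            // Many threads reserve concurrently; the atomic add is the only shared step.
            Parallel.For(0, starts.Length, i => starts[i] = Reserve(64));

            // Every reservation must be a distinct 64-byte slot: 0, 64, 128, ...
            Array.Sort(starts);
            bool disjoint = true;
            for (int i = 0; i < starts.Length; i++)
            {
                if (starts[i] != i * 64) disjoint = false;
            }

            Console.WriteLine(disjoint); // True
        }
    }
    ```

    Threads may still slow each other down when they hit the same counter at once, but no thread can ever block another, which is the distinction drawn above between contention and locking.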
     
    FROS7 likes this.
  12. ReadyPlayGames

    ReadyPlayGames

    Joined:
    Jan 24, 2015
    Posts:
    46
    I've always been a tad confused between Temp and TempJob, I think I got it now!
     
  13. e199

    e199

    Joined:
    Mar 24, 2015
    Posts:
    100
    I did a few tests with allocators.
    For each allocator type, I allocate and deallocate 64 bytes 1000 times per frame, starting after 5 seconds of runtime.
    I tested in both a build and the editor, with almost no difference. The results are from the build with the profiler attached.

    First allocation: (profiler screenshot missing)

    Each subsequent allocation: (profiler screenshot missing)

    Code (CSharp):
    using System;
    using System.Runtime.InteropServices;
    using Smmalloc;
    using Unity.Collections;
    using Unity.Collections.LowLevel.Unsafe;
    using UnityEngine;
    using UnityEngine.Profiling;

    public class AllocTest : MonoBehaviour
    {
        public float StartDelay        = 5f;
        public int   FramesToRun       = 10;
        public int   AllocationsPerRun = 1000;

        private        int              _currentFrame = 0;
        private        SmmallocInstance smmalloc;
        private        IntPtr[]         _pointers;
        private unsafe void*[]          _rawPointers;

        unsafe void Start()
        {
            smmalloc = new SmmallocInstance(8, 16 * 1024 * 1024);
            smmalloc.CreateThreadCache(4 * 1024, CacheWarmupOptions.Hot);
            _pointers    = new IntPtr[AllocationsPerRun];
            _rawPointers = new void*[AllocationsPerRun];
        }

        unsafe void Update()
        {
            // Skip the first seconds so startup work does not pollute the samples.
            if (Time.realtimeSinceStartup < StartDelay) return;

            if (_currentFrame < FramesToRun)
            {
                // Each sample allocates AllocationsPerRun blocks of 64 bytes, then frees them all.
                Profiler.BeginSample("SMMALLOC ALLOCATOR");

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    var memory = smmalloc.Malloc(64);
                    _pointers[i] = memory;
                }

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    smmalloc.Free(_pointers[i]);
                }

                Profiler.EndSample();

                Profiler.BeginSample("UNITY TEMP ALLOCATOR");

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    var memory = UnsafeUtility.Malloc(64, 4, Allocator.Temp);
                    _rawPointers[i] = memory;
                }

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    UnsafeUtility.Free(_rawPointers[i], Allocator.Temp);
                }

                Profiler.EndSample();

                Profiler.BeginSample("UNITY TEMPJOB ALLOCATOR");

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    var memory = UnsafeUtility.Malloc(64, 4, Allocator.TempJob);
                    _rawPointers[i] = memory;
                }

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    UnsafeUtility.Free(_rawPointers[i], Allocator.TempJob);
                }

                Profiler.EndSample();

                Profiler.BeginSample("UNITY PERSISTENT ALLOCATOR");

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    var memory = UnsafeUtility.Malloc(64, 4, Allocator.Persistent);
                    _rawPointers[i] = memory;
                }

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    UnsafeUtility.Free(_rawPointers[i], Allocator.Persistent);
                }

                Profiler.EndSample();

                Profiler.BeginSample("MARSHAL ALLOCATOR");

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    var memory = Marshal.AllocHGlobal(64);
                    _pointers[i] = memory;
                }

                for (int i = 0; i < AllocationsPerRun; i++)
                {
                    Marshal.FreeHGlobal(_pointers[i]);
                }

                Profiler.EndSample();

                _currentFrame++;
            }
        }

        private void OnDestroy()
        {
            smmalloc.DestroyThreadCache();
            smmalloc.Dispose();
        }
    }

    Both smmalloc and Marshal can be used when the lifetime is unknown, so they should be compared with Unity's Persistent allocator.
     