Request: allow parallel writing to NativeList when [WriteOnly] is used, or add .Concurrent

Discussion in 'Entity Component System' started by DreamPower, Mar 20, 2018.

  1. DreamPower

    DreamPower

    Joined:
    Apr 2, 2017
    Posts:
    103
    I have a simple IJobParallelFor job which takes a couple [ReadOnly] NativeArrays of mesh vertices and triangles, and returns a smaller section of that mesh as a list of vertices and triangles (specifically, I'm taking a large mesh scanned from an AR headset, and trying to find flat-ish surfaces in it).

    I had thought NativeList would be the solution, but even when setting [WriteOnly], I get the "Container does not support parallel writing" error. I read that NativeQueue<>.Concurrent can be used for this, but when I'm done with the job I want to use random-access for the contents of the list (I'm comparing the resulting triangles to each other), and it seems a waste of processing time to copy the results to another container first.

    So my request is to either just allow NativeList in IJobParallelFor if it's WriteOnly, or add a Concurrent struct to NativeList (similar to how NativeQueue.Concurrent works) that limits NativeList to write-only. I'd bet a lot of people will try using NativeList the way I did.

    Here's what I'd like to do, my current job which produces the error:
    Code (CSharp):
    // Job to find all upward-facing triangles at (roughly) the same height, in front of the user
    struct FindSurfaceTriangles : IJobParallelFor
    {
        // Input
        [Unity.Collections.ReadOnly]
        public NativeArray<int> triangles;
        [Unity.Collections.ReadOnly]
        public NativeArray<Vector3> normals;
        [Unity.Collections.ReadOnly]
        public NativeArray<Vector3> vertices;
        public Vector3 meshFoundPosition;
        public Vector3 headsetPosition;
        public Vector3 headsetDirection;

        // Output
        [Unity.Collections.WriteOnly]
        public NativeList<Vector3> outputTriangleCenters;
        [Unity.Collections.WriteOnly]
        public NativeList<Vector3> outputVertices;

        public void Execute(int index)
        {
            int triIndex = index * 3;
            Vector3 norm0 = normals[triangles[triIndex]];
            Vector3 norm1 = normals[triangles[triIndex + 1]];
            Vector3 norm2 = normals[triangles[triIndex + 2]];

            Vector3 triangleFaceNormal = (norm0 + norm1 + norm2) / 3;
            if (Vector3.Angle(triangleFaceNormal, Vector3.up) < 25)     // Upwards within 25 degrees
            {// This triangle is facing roughly upwards
                Vector3 center = (vertices[triangles[triIndex]] + vertices[triangles[triIndex + 1]] + vertices[triangles[triIndex + 2]]) / 3;
                Vector3 vectorToTri = (center - headsetPosition).normalized;
                if (Mathf.Abs(center.y - meshFoundPosition.y) < 0.10f && Vector3.Angle(headsetDirection, vectorToTri) < 45)     // Within 10 cm height of mesh-found point, within 45 degrees of the direction the headset is pointing (90 degrees total FOV)
                {
                    outputTriangleCenters.Add(center);
                    outputVertices.Add(vertices[triangles[triIndex]]);
                    outputVertices.Add(vertices[triangles[triIndex + 1]]);
                    outputVertices.Add(vertices[triangles[triIndex + 2]]);
                }
            }
        }
    }
     
    Last edited: Mar 20, 2018
  2. one_one

    one_one

    Joined:
    May 20, 2013
    Posts:
    615
    How would making it write-only solve all those concurrency issues you get when having multiple threads write to the same data set; especially collections that may change element count such as lists?
     
  3. DreamPower

    DreamPower

    Joined:
    Apr 2, 2017
    Posts:
    103
    If Queue can do it when Enqueueing, List should be able to do it when Adding. I'm really not up on writing my own containers, but here's an early doc on writing your own, including making a constantly-incrementing list that works in IJobParallelFor:

    https://gist.github.com/joeante/3f6b75c738fe0a1be19207e7e4294578

    That doc has a couple solutions for the problem, including having each thread cache its own counter value, and when the main thread (outside the job) accesses the counter, it adds up all the cached values.
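The per-thread counter idea described above can be sketched in plain .NET, outside Unity. This is only an illustration under assumptions: `PerThreadCounterSketch` and `CountMultiplesOfThree` are hypothetical names, and `Parallel.For` with thread-local state stands in for the job system; it is not the code from the linked gist.

```csharp
using System;
using System.Threading.Tasks;

public static class PerThreadCounterSketch
{
    // Counts multiples of three in [0, count) without a contended shared counter:
    // each worker accumulates into its own local value, and the partial
    // counts are merged once per worker when that worker finishes.
    public static int CountMultiplesOfThree(int count)
    {
        int total = 0;
        object gate = new object();

        Parallel.For<int>(0, count,
            () => 0,                                               // per-worker cached counter
            (i, state, local) => i % 3 == 0 ? local + 1 : local,   // no shared writes here
            local => { lock (gate) { total += local; } });         // merge once per worker

        return total;
    }
}
```

The shared total is touched only once per worker rather than once per element, which is the point of caching the counter per thread.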
     
    Last edited: Mar 20, 2018
  4. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    NativeList allows parallel writing. The pattern is optimized for the best performance.

    Some background. NativeArray is a direct pointer to the data, NativeList is a pointer to the pointer, since it is resizable...

    So first of all you can cast a NativeList -> NativeArray. (Using implicit operator) And then you can simply use the NativeArray in the job. This works just fine.


    Now, there is one more really cool pattern we added recently: deferred conversion to arrays.
    The problem it solves is that you often have some jobs that change the list size before the parallel-for job is supposed to process it. If you cast NativeList to NativeArray, the length is locked in, and Unity will correctly give you an error message if you try to resize the list in a job in parallel.

    But that still leaves the question: how can I resize a list in a job, and then use a NativeArray to process the resized array efficiently? This is why we added the ability to
    a) Make the iteration count of IJobParallelFor dependent on the Length of the list
    b) Do a deferred conversion of NativeList -> NativeArray

    I attached the Unit Tests for the deferred conversion to arrays.
     


  5. DreamPower

    DreamPower

    Joined:
    Apr 2, 2017
    Posts:
    103
    I think I see what you are suggesting, and those solutions sound great for anyone wanting to pass a pre-existing List into a job. But what I need is for the IJobParallelFor to create the list in the first place, starting with a completely empty list, without knowing how long the resulting list will be. As you can see in my job code above, my IJobParallelFor iterates through every single triangle in a large mesh (those triangles and vertices passed in via read-only NativeArrays), and returns just a few triangles and their vertices, something that cannot be done with an array without returning a very large list with a lot of empty entries that I have to then (outside the job) iterate through and create yet another list of exactly the length needed to get the results to be used later.

    And of course, when doing a NativeList.Add in an IJobParallelFor, the error is given, even if the Capacity is set in the List construction. This seems like something a lot of people will try, and run into the same problems I have.
     
  6. jselstad

    jselstad

    Joined:
    Apr 12, 2016
    Posts:
    6
    I don't believe there's a way around allocating a NativeArray that's as big as the number of values you expect to receive, but if I recall, there's a Native atomically incrementable counter script out there which you can use to atomically increment the write index on this shared array, so multiple threads can "append" elements into the array in a thread-safe fashion.

    When the job completes, you can use the value of that counter to determine how many elements within the resulting NativeArray you'll need to actually iterate through (a similar workflow to OverlapSphereNonAlloc). Actually doing this might require using the unsafe attributes to disable the parallel read/write checks on NativeArrays...

    There's an example of a thread-safe native counter here:
    https://github.com/Unity-Technologi...les/Assets/NativeCounterDemo/NativeCounter.cs

    and Keijiro has an ECS example using a thread-safe counter here:
    https://github.com/keijiro/Voxelman/blob/master/Assets/ECS/ScannerSystem.cs
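As a rough illustration of the counter-plus-preallocated-array idea above, here is a minimal sketch in plain .NET rather than Unity. It assumes `Interlocked.Increment` on a shared index in place of a native counter and `Parallel.For` in place of the job system; `AtomicAppendSketch` and `CollectEvens` are hypothetical names, not part of any linked repository.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AtomicAppendSketch
{
    // Appends every even number in [0, count) to a pre-sized buffer from many threads.
    // The buffer must be large enough for the worst case (here: count elements).
    // Returns the number of valid entries; only output[0..returned) is meaningful.
    public static int CollectEvens(int count, int[] output)
    {
        int next = 0; // shared write index, only ever touched through Interlocked

        Parallel.For(0, count, i =>
        {
            if (i % 2 != 0) return;
            // Increment returns the new value, so subtract 1 to get the slot this call owns.
            int slot = Interlocked.Increment(ref next) - 1;
            output[slot] = i;
        });

        return next;
    }
}
```

The order of the appended elements depends on thread scheduling, so the caller should sort them or treat the result as an unordered set.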
     
  7. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    So the point here is that calling Add on a list from multiple threads is not thread-safe, so the job debugger tells you that it's not.

    We try to expose safe multithreading patterns that solve these problems. Potentially what you are looking for is IJobParallelForFilter? Not sure. It would be great if you could describe how you want to use the list and paste some code or info about what exactly you are trying to solve.
     
  8. DreamPower

    DreamPower

    Joined:
    Apr 2, 2017
    Posts:
    103
    Yeah, an IJobParallelForFilter would be exactly what I need. Here’s exactly what I’m doing: I’ve got an AR headset which scans the room and creates one giant mesh for everything in the user’s real life room (or multiple meshes with 65,536 vertices each). I’m trying to detect and extract a mostly-flat surface from that mesh, a table or desk (or bed) in front of the user, which will become the play-area for a game. So what I’m trying to accomplish is to run through a few steps, each one breaking that huge room-sized mesh into smaller more manageable submeshes until I’m left with only the flat surface in front of the user.

    The first step is to create a list of all the triangles (and their vertices) in the room that face upwards, are at the same height, and are in front of the headset; that is what I’m trying to do here in the IJobParallelFor, and my code for that step is the job in the first post in this thread. The second step is to search through that smaller list and compare the triangles with each other, finding groups of triangles that are touching and separating each group into its own list; those are the various surfaces in front of the user. The largest contiguous group of triangles at that point is assumed to be the play surface, and the final step is to find/trim the edges of that group to create the useable area for gameplay.

    I *have* written a version of the IJobParallelFor which works for the first step, but what it does is output a giant NativeArray the same size as the original room-sized mesh, with the triangles flagged as good or bad, so I have to then add an extra step scanning through that list to extract only the good triangles/vertices into new lists, which seems wasteful.
     
  9. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    Try using IJobParallelForFilter; it is part of the entities component system package.

    (Do note that IJobParallelForFilter doesn't go wide yet... it is to do, we will get to that very soon)
     
  10. DreamPower

    DreamPower

    Joined:
    Apr 2, 2017
    Posts:
    103
    Thanks a lot, got IJobParallelForFilter working, that definitely seems to be less wasteful than iterating over the entire array after the job, instead using a List pointing at the filtered entries.
     
  11. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,448
    Did the Schedule(list, innerloopBatchCount, jobhandle) extension get removed at some point?

    I can't seem to use the ToDeferredJobArray pattern, as I can't pass a list to the scheduler.

    -edit-

    Did some snooping through the source code; it needed

    ENABLE_MORE_CONTAINER_SUPPORT
     
    Last edited: Jul 22, 2018
  12. Tony_Max

    Tony_Max

    Joined:
    Feb 7, 2017
    Posts:
    274
    Where can I read about this (IJobParallelForFilter) and a couple of other things, for example NativeQueue<T>.Concurrent? I just can't find any documentation about them.
     
  13. TLRMatthew

    TLRMatthew

    Joined:
    Apr 10, 2019
    Posts:
    65
    Hi @Joachim_Ante, it seems like IJobParallelForFilter is still running on the main thread - is making it run wide still on the todo list? Thanks!
     
  14. 8bitgoose

    8bitgoose

    Joined:
    Dec 28, 2014
    Posts:
    437
    Just confirming that you can write to a list in parallel by using
    Code (CSharp):
    [NativeDisableParallelForRestriction]
    but doing a NativeList.Add is not concurrent when I have that attribute added.
     
  15. Nwappz

    Nwappz

    Joined:
    Jul 29, 2019
    Posts:
    3
    Let's put it in simple terms: I have many entities, and I'd like each entity to hold a collection of the entities closer than, say, 40 units (m). I am using the IJobForEachWithEntity interface, and I am literally unable to add close entities to each entity; I'm facing the same error. What is going on?
     
  16. chadfranklin47

    chadfranklin47

    Joined:
    Aug 11, 2015
    Posts:
    204
    Know if this has been updated?
     
  17. DragonCoder

    DragonCoder

    Joined:
    Jul 3, 2015
    Posts:
    1,007
    Sorry for necroing this thread, but this is outdated information, right? Trying this results in the following error:
    Code (CSharp):
    InvalidOperationException: SwarmMigrationCheck.swarms_in_range [[[a NativeList<>]]] is not declared [ReadOnly] in a IJobParallelFor job. The container does not support parallel writing. Please use a more suitable container type.
    Unity.Jobs.LowLevel.Unsafe.JobsUtility.ScheduleParallelFor
    EDIT: Or did "writing" in that quote refer to something like
    my_list[5] = 20
    but not
    my_list.Add(20)
    ?

    In that case, what is the solution nowadays to retrieve an arbitrary (at schedule time unknown) number of elements out of an IJobParallelFor?
    There is no concurrent variant of NativeList as far as I can see sadly.
     
  18. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,467
    It is called NativeList<>.ParallelWriter now. But it is not always optimal.

    What exactly are you trying to do?
     
  19. chadfranklin47

    chadfranklin47

    Joined:
    Aug 11, 2015
    Posts:
    204
    Just curious, when is NativeList<>.ParallelWriter not optimal?
     
  20. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,467
    If you are writing frequently to the list from a bunch of threads at once, things will slow down drastically as there is a sort of per-thread lock on the next free index in the list that the threads will be fighting over.

    However, if your writes are buried behind some rare condition and you don't need determinism, then you are probably fine.
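For readers looking for the shape of the ParallelWriter pattern mentioned above, here is a minimal sketch. It assumes a version of the Unity Collections package that provides `NativeList<T>.AsParallelWriter()` and `ParallelWriter.AddNoResize`; the job and class names (`CollectEvenIndices`, `ParallelWriterSketch`) are hypothetical, and the filter condition is a placeholder for a real one.

```csharp
using Unity.Collections;
using Unity.Jobs;

// Hypothetical job: collects the even indices of a range into a shared list.
public struct CollectEvenIndices : IJobParallelFor
{
    public NativeList<int>.ParallelWriter output;

    public void Execute(int index)
    {
        // AddNoResize never grows the list, so capacity must be reserved up front.
        if (index % 2 == 0)
            output.AddNoResize(index);
    }
}

public static class ParallelWriterSketch
{
    public static int Run(int count)
    {
        // Worst case every index passes the filter, so reserve `count` slots.
        var results = new NativeList<int>(count, Allocator.TempJob);
        var job = new CollectEvenIndices { output = results.AsParallelWriter() };
        job.Schedule(count, 64).Complete();

        int written = results.Length; // element order is not deterministic
        results.Dispose();
        return written;
    }
}
```

Because the writer cannot resize the list, the capacity passed to the constructor has to cover the worst case, and the resulting order varies between runs.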
     
  21. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,448
    If you know the amount you need to write to a native list per thread (a common case is that you have X entities in a chunk), I have this extension for native list parallel writing that will make writing to a native list in parallel orders of magnitude faster.

    Code (CSharp):
    // Requires System.Threading, Unity.Collections and Unity.Collections.LowLevel.Unsafe.
    // Atomically reserves `length` contiguous elements; it does not resize the list.
    public static unsafe void ReserveNoResize<T>(this NativeList<T>.ParallelWriter nativeList, int length, out T* ptr, out int idx)
        where T : unmanaged
    {
        idx = Interlocked.Add(ref nativeList.ListData->m_length, length) - length;
        ptr = (T*)((byte*)nativeList.Ptr + (idx * UnsafeUtility.SizeOf<T>()));
    }
    Basically you just reserve the chunk of memory you need for your thread, then populate it afterwards. This significantly reduces the amount of interlocks and waiting your threads need to do.

    I'm sure someone will probably enlighten me, but I can't personally think of a faster way to write in parallel to the same contiguous chunk of memory, outside of potentially pre-calculating your indices, though I'm not sure that overhead would even be faster (obviously there are alternatives like native stream that write to blocks of memory).

    (You will need to give yourself internal access to use the above code)
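The reserve-then-fill idea can be tried without Unity internals. The following is a minimal sketch in plain .NET, assuming a managed `int[]` in place of the list's buffer and a local `length` variable in place of the list's length field; `RangeReserveSketch` and `FillInBatches` are hypothetical names.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class RangeReserveSketch
{
    // Each batch reserves `batchSize` consecutive slots with a single atomic add,
    // then fills them without any further synchronization.
    // `output` must hold at least batchCount * batchSize elements.
    public static int FillInBatches(int batchCount, int batchSize, int[] output)
    {
        int length = 0; // shared element count, standing in for the list's length

        Parallel.For(0, batchCount, batch =>
        {
            // Interlocked.Add returns the new total; subtracting the batch size
            // gives the first index of the range this worker now owns.
            int start = Interlocked.Add(ref length, batchSize) - batchSize;
            for (int k = 0; k < batchSize; k++)
                output[start + k] = batch;
        });

        return length;
    }
}
```

One atomic operation per batch instead of one per element is where the reduction in contention comes from.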
     
  22. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    3,467
    This heavily depends on how you get your initial NativeList capacity in the first place. If the algorithm involves a CalculateEntityCount(), then you are iterating chunks anyways so you might as well build a prefix sum and pre-allocate the NativeList then and there. But if you do something more conservative, like if you know a total entity count but not necessarily whether or not a chunk qualifies, then I can't think of anything better than your extensions for this purpose.
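The prefix-sum approach mentioned above can be sketched as follows, assuming the per-chunk counts are already available as a plain array. `PrefixSumSketch` and `StartOffsets` are hypothetical names; in a real system the counts would come from iterating the chunks.

```csharp
public static class PrefixSumSketch
{
    // Turns per-chunk element counts into per-chunk start offsets
    // (an exclusive prefix sum) and reports the total element count.
    public static int[] StartOffsets(int[] countsPerChunk, out int total)
    {
        var offsets = new int[countsPerChunk.Length];
        int running = 0;
        for (int i = 0; i < countsPerChunk.Length; i++)
        {
            offsets[i] = running;          // where chunk i starts writing
            running += countsPerChunk[i];  // space consumed so far
        }
        total = running;                   // exact capacity to allocate
        return offsets;
    }
}
```

With the offsets known, each chunk can write into its own disjoint range with no atomics at all, and the output order is deterministic.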
     
  23. calabi

    calabi

    Joined:
    Oct 29, 2009
    Posts:
    230
    This is interesting. I just did something similar to this, but I just used a preallocated NativeArray. I'm also adding to a buffer in parallel, which doesn't seem to slow things down much, if at all. I'm wondering if that's a special case or something.
     
  24. mikaelK

    mikaelK

    Joined:
    Oct 2, 2013
    Posts:
    278
    Thanks :)
    What do you mean by internal access? Do you mean modifying some package and writing an extension to NativeList?
    Could this work with hash sets and native multi hash maps?