Pull vs Push data in parallel jobs

Discussion in 'Data Oriented Technology Stack' started by NanushTol, Aug 20, 2019.

  1. NanushTol


    Jan 9, 2018
    Hi friends, I've been playing around with ECS and the job system for two weeks now, and I'm trying to understand the preferred way of "pushing" data from one entity to another.

    I'm trying to model a simplistic wind and atmosphere system with a grid of entities. Each frame, the system calculates how much of each cell's content moves from one cell to another.

    [Attached image: Gird copy Small.jpg]

    The system I came up with currently pulls data:
    Job 1: (let's say the system is working on cell #5)
    • calculate how much to transfer and to whom
    • store the result in a "ToTransfer" component

    Job 2: (when cells 1, 4, 6, 9 are being processed)
    • check the adjacent cells' "ToTransfer" components to see whether any of them has transferred something to "me" (the cell currently being processed)
    • if so, add that data to "my" data

    Code (CSharp):
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Entities;
    using Unity.Mathematics;

    // Here the system transfers "MotionVector".
    public struct TransferMotionVectorJob : IJobForEach<Cell, AdjacentCells, WindData, ToTransfer>
    {
        public float Drag;

        public void Execute(ref Cell cell, ref AdjacentCells adjacent, ref WindData windData, ref ToTransfer toTransfer)
        {
            float ratio;
            float2 cellTransferRatio;

            // Some calculations to determine how much to transfer
            // (omitted in this post; presumably they set passedMV and cellTransferRatio).

            // Determine who to transfer to.
            if (windData.MotionVector.x > 0) // Right movement
            {
                toTransfer.TransferXCellId = cell.RightCellId;
                toTransfer.TransferMotionVectorX = passedMV * cellTransferRatio.x;
            }
            else if (windData.MotionVector.x < 0) // Left movement
            {
                toTransfer.TransferXCellId = cell.LeftCellId;
                toTransfer.TransferMotionVectorX = passedMV * cellTransferRatio.x;
            }

            if (windData.MotionVector.y > 0) // Up movement
            {
                toTransfer.TransferYCellId = cell.UpCellId;
                toTransfer.TransferMotionVectorY = passedMV * cellTransferRatio.y;
            }
            else if (windData.MotionVector.y < 0) // Down movement
            {
                toTransfer.TransferYCellId = cell.DownCellId;
                toTransfer.TransferMotionVectorY = passedMV * cellTransferRatio.y;
            }
        }
    }

    [BurstCompile]
    public struct PullMotionVector : IJobForEach<Cell, AdjacentCells, WindData>
    {
        // Read-only lookup, so random access to neighbours is safe in parallel.
        [ReadOnly] public ComponentDataFromEntity<ToTransfer> TransferFromEntity;

        public void Execute(ref Cell cell, ref AdjacentCells adjacent, ref WindData windData)
        {
            ToTransfer up = TransferFromEntity[adjacent.Up];
            ToTransfer right = TransferFromEntity[adjacent.Right];
            ToTransfer down = TransferFromEntity[adjacent.Down];
            ToTransfer left = TransferFromEntity[adjacent.Left];

            // Horizontal neighbours transfer along X.
            if (right.TransferXCellId == cell.ID)
                windData.RecivedMotionVector += right.TransferMotionVectorX;
            if (left.TransferXCellId == cell.ID)
                windData.RecivedMotionVector += left.TransferMotionVectorX;

            // Vertical neighbours transfer along Y, so check TransferYCellId.
            if (up.TransferYCellId == cell.ID)
                windData.RecivedMotionVector += up.TransferMotionVectorY;
            if (down.TransferYCellId == cell.ID)
                windData.RecivedMotionVector += down.TransferMotionVectorY;
        }
    }

    This works fine for now, but it seems a bit convoluted to me, and it will become a problem once I want to add other entities that interact with the atmosphere, for example a burning tree.

    What I'm looking for is a way to "push" data, for example:
    Job 1: (cell 5 is currently being processed)
    • calculate how much to transfer and to whom
    • write to the "ReceivedContent" components of cells 6 and 9

    I haven't found a way of doing that because of the parallel write safety checks.
    I'm not worried about race conditions because I only use the received data in the next frame.

    When I try to write in parallel, I get an error saying that I have to use the job index to write to a NativeArray,
    and if I use "AsParallelWriter" (the new Concurrent) I can't access the array values by index.

    Can anyone point me in the right direction on how to "push" data from one entity to another in a parallel job?
  2. DreamingImLatios


    Jun 3, 2017
    Pulling is usually better than pushing for the exact reason you discovered. You have reading and writing where only one of the two will ever have a "known" memory destination. Reading can do random access safely in parallel. Writing can't. Even though you probably know in advance the general location an element will land in (in cell 5 for example), you don't know the exact memory address it should land in because your data is dynamically sized. I'm going to offer you 2 push methods and one pull method that might be more appealing to you.

    Push 1) NativeMultiHashMap
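    Every cell adds its outgoing data to the map under the receiving cell's key, and each cell then reads back the entries stored under its own key. It's slow compared to what you are currently doing, but it gives you the behavior you want.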
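
    A minimal sketch of this two-phase push, written in plain C# so that it runs outside Unity. `ConcurrentDictionary<int, ConcurrentBag<float>>` stands in for the NativeMultiHashMap, and the names `HashMapPush` and `Step` as well as the "push everything to the right-hand neighbour on a ring" rule are assumptions made for illustration only.

    ```csharp
    using System.Collections.Concurrent;
    using System.Threading.Tasks;

    public static class HashMapPush
    {
        // Every cell pushes amount[i] to its right-hand neighbour on a ring
        // (hypothetical rule), then every cell sums what was sent to it.
        public static float[] Step(float[] amount)
        {
            int n = amount.Length;
            // Key = receiving cell id, values = everything pushed to that cell.
            var inbox = new ConcurrentDictionary<int, ConcurrentBag<float>>();

            // Phase 1: parallel writers only ever add entries under the target's key.
            Parallel.For(0, n, i =>
            {
                int target = (i + 1) % n;
                inbox.GetOrAdd(target, _ => new ConcurrentBag<float>()).Add(amount[i]);
            });

            // Phase 2: each cell reads only its own key and writes only its own slot.
            var received = new float[n];
            Parallel.For(0, n, i =>
            {
                if (inbox.TryGetValue(i, out var bag))
                    foreach (float v in bag) received[i] += v;
            });
            return received;
        }
    }
    ```

    In Unity the same two phases would map to `AsParallelWriter()` and `Add` on the NativeMultiHashMap for the writes, and `TryGetFirstValue` / `TryGetNextValue` for the reads. The extra hashing and the second pass are where the cost compared to hardcoded pulling comes from.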

    Push 2) Calendar Method
    I call this the Calendar Method because if you swap hours and minutes with memory addresses this analogy holds up pretty well. The idea is that each advisor (cell) has a calendar. And all these advisors have pretty busy lives and don't want everyone all coming to them at once because fights break out. (Let your imagination run wild with this.) So to avoid that problem, everyone schedules meetings with advisors on their calendars. The advisors block out times for these meetings. Then people show up for their meetings with their advisors at the right time. They don't always use the full time, but they at least don't show up at other people's time.

    Instead of meetings on a calendar, you have index ranges into NativeArrays or Dynamic Buffers or whatever data structure you are using.
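
    A minimal sketch of the simplest form of this idea, in plain C# rather than Unity containers. It assumes a row-major grid where each cell can receive at most one contribution per direction, so every cell reserves exactly four slots. The names `CalendarPush`, `Step`, `pushX` and `pushY` are hypothetical.

    ```csharp
    using System.Threading.Tasks;

    public static class CalendarPush
    {
        // One reserved slot per direction the content can arrive from.
        const int FromLeft = 0, FromRight = 1, FromBelow = 2, FromAbove = 3;
        const int SlotsPerCell = 4;

        // pushX[i] > 0 sends to the right neighbour, < 0 to the left one.
        // pushY[i] > 0 sends to the cell at y + 1, < 0 to the cell at y - 1.
        public static float[] Step(int width, int height, float[] pushX, float[] pushY)
        {
            int n = width * height;
            var slots = new float[n * SlotsPerCell]; // zeroed, so a "no-show" reads as 0

            // Push pass: each slot has exactly one possible writer, so no two
            // iterations ever touch the same memory address.
            Parallel.For(0, n, i =>
            {
                int x = i % width, y = i / width;
                if (pushX[i] > 0 && x + 1 < width)
                    slots[(i + 1) * SlotsPerCell + FromLeft] = pushX[i];
                else if (pushX[i] < 0 && x > 0)
                    slots[(i - 1) * SlotsPerCell + FromRight] = -pushX[i];

                if (pushY[i] > 0 && y + 1 < height)
                    slots[(i + width) * SlotsPerCell + FromBelow] = pushY[i];
                else if (pushY[i] < 0 && y > 0)
                    slots[(i - width) * SlotsPerCell + FromAbove] = -pushY[i];
            });

            // Gather pass: each cell sums its own reserved range.
            var received = new float[n];
            Parallel.For(0, n, i =>
            {
                float sum = 0f;
                for (int s = 0; s < SlotsPerCell; s++)
                    sum += slots[i * SlotsPerCell + s];
                received[i] = sum;
            });
            return received;
        }
    }
    ```

    The fixed four-slot layout only works because the set of possible senders is known in advance. Once arbitrary entities can write to a cell, the ranges have to be scheduled dynamically.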

    Pull) Pull Streams
    The idea is exactly as you have it already, except you create a general purpose bucket inside each cell for all the different types of objects inside the cell to pile up. Then the adjacent cells pull from the general purpose buckets rather than the specific entities inside the cell.
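
    A minimal sketch of such a bucket in plain C#. It assumes the buckets were filled in an earlier pass (by the wind logic, a burning tree, or anything else living in the cell) and are only read here. `Outgoing`, `PullStreams`, `outbox` and `neighbours` are hypothetical names.

    ```csharp
    using System.Collections.Generic;
    using System.Threading.Tasks;

    // One entry in a cell's general-purpose bucket.
    public struct Outgoing
    {
        public int TargetCell;
        public float Amount;
    }

    public static class PullStreams
    {
        // outbox[c] is cell c's bucket; neighbours[c] lists the cells c pulls from.
        public static float[] Pull(List<Outgoing>[] outbox, int[][] neighbours)
        {
            var received = new float[outbox.Length];
            Parallel.For(0, outbox.Length, c =>
            {
                float sum = 0f;
                foreach (int n in neighbours[c])
                    foreach (Outgoing o in outbox[n])
                        if (o.TargetCell == c) sum += o.Amount;
                received[c] = sum; // each iteration writes only its own index
            });
            return received;
        }
    }
    ```

    The pulling cell never needs to know what kind of object produced an entry, only that the entry is addressed to it.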

    And lastly I should mention there's the [NativeDisableParallelForRestriction] attribute in case that's giving you problems. But make sure what you are doing ensures you are always writing to a unique memory address per frame (or dependency block) or else while you say you don't care about race conditions, two writers racing each other tends to end up with lots of plot holes. :p
  3. NanushTol


    Jan 9, 2018
    Wow! amazing, thanks for the detailed answer :D

    Push option 2 sounds good. I understand the metaphor, but I'm not sure I understand the implementation.
    Do you mean that each cell has a NativeArray whose length is the cell count, and each cell writes to its own slot in the other cells' arrays?
  4. DreamingImLatios


    Jun 3, 2017
    I would use a dynamic buffer for each advisor (cell) calendar. That way the advisor can grow their calendar if they need to and the clients can write to it from an entity lookup. The advisor doesn't really know or need to care how many clients he or she will have or where they come from. This solves your "burning tree" problem. I would also have a second dynamic buffer for each advisor stating how many entries were actually written during the client's meeting. The advisors need to zero this out before every meeting in case a client no-shows.

    The tricky part is the scheduling of appointments. There are two ways I can think of to go about it.

    The first is to have a NativeMultiHashMap, which you can think of as all the advisors' shared email exchange server. The clients send appointment requests to the advisors' email addresses. Each request holds an entity reference back to the client (this client entity is probably a child entity of the real client, unless you want to use dynamic buffers here too, in which case the request also holds an index into such a buffer). Each advisor then goes through all their emails and writes the appointment times back to a component (or to a dynamic buffer at the prescribed index) on the entity given in the request. It is also important that these appointments are SystemState and can reactively be "canceled".
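
    A rough plain C# sketch of the request, assign, and write phases, under simplifying assumptions: clients and advisors are plain integer ids, a `ConcurrentDictionary` of bags stands in for the NativeMultiHashMap, and cancellation through SystemState components is left out entirely. `CalendarRequests`, `Deliver`, `target` and `payload` are hypothetical names.

    ```csharp
    using System.Collections.Concurrent;
    using System.Linq;
    using System.Threading.Tasks;

    public static class CalendarRequests
    {
        // payload[c] holds the values client c wants to deliver to advisor target[c].
        // Returns each advisor's filled calendar.
        public static float[][] Deliver(int advisorCount, int[] target, float[][] payload)
        {
            int clients = target.Length;

            // "Email server": advisor id -> ids of the clients requesting time.
            var requests = new ConcurrentDictionary<int, ConcurrentBag<int>>();
            Parallel.For(0, clients, c =>
                requests.GetOrAdd(target[c], _ => new ConcurrentBag<int>()).Add(c));

            // Each advisor answers its own mail and assigns start indices.
            // start[c] is written only by the single advisor that client c targets.
            var start = new int[clients];
            var calendar = new float[advisorCount][];
            Parallel.For(0, advisorCount, a =>
            {
                int next = 0;
                if (requests.TryGetValue(a, out var bag))
                {
                    foreach (int c in bag.OrderBy(id => id)) // sorted for a repeatable layout
                    {
                        start[c] = next;
                        next += payload[c].Length;
                    }
                }
                calendar[a] = new float[next];
            });

            // Clients show up at their reserved times; the ranges never overlap.
            Parallel.For(0, clients, c =>
                payload[c].CopyTo(calendar[target[c]], start[c]));
            return calendar;
        }
    }
    ```

    Sorting the requests is not required for safety. It only makes the layout of each calendar the same from run to run.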

    The second way is more like an auto-ticketing appointment system in which each client grabs a dynamic buffer of length one, converts to an unsafe pointer, and then atomically increments to get an appointment block. If the client needs a longer appointment, the client grabs multiple blocks. You'd have to schedule a reset system every minute or so to clear all the calendars and make everyone reschedule because cancellations don't work with this approach.
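
    A minimal sketch of the ticketing idea in plain C#, with `Interlocked.Add` standing in for the atomic increment on an unsafe pointer. The fixed `capacity` per calendar is an assumption made to keep the example short, and `TicketCalendar`, `Deliver`, `target` and `payload` are hypothetical names.

    ```csharp
    using System;
    using System.Threading;
    using System.Threading.Tasks;

    public static class TicketCalendar
    {
        // Returns the calendars plus how many entries of each were actually used.
        public static (float[][] calendar, int[] used) Deliver(
            int advisorCount, int capacity, int[] target, float[][] payload)
        {
            var calendar = new float[advisorCount][];
            for (int a = 0; a < advisorCount; a++) calendar[a] = new float[capacity];
            var used = new int[advisorCount]; // ticket counters, zero at the start of every run

            Parallel.For(0, target.Length, c =>
            {
                int len = payload[c].Length;
                // Atomically reserve the block [begin, begin + len) on the target's calendar.
                int begin = Interlocked.Add(ref used[target[c]], len) - len;
                if (begin + len > capacity)
                    throw new InvalidOperationException("calendar full");
                payload[c].CopyTo(calendar[target[c]], begin);
            });
            return (calendar, used);
        }
    }
    ```

    The order of the blocks within a calendar depends on which client takes its ticket first, so it can change from run to run. Only the absence of overlap is guaranteed.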
  5. NanushTol


    Jan 9, 2018
    Thank you!
    I am wondering now whether I should stick to the pulling method, because it has a lot less overhead and I am starting to get confused :oops: (I'm an artist learning to code, so I lean towards technically simpler solutions). I think I can find a way to solve the "burning tree" problem, maybe with a position check: each cell asks "who occupies my space?" and checks the occupant's "ToTransfer" component.

    I did try a "brute force" push approach today with [NativeDisableParallelForRestriction], and I'm getting some weird results, as you predicted. The unexpected part of today's experiment is that I get those weird results even when I use Run() instead of Schedule(). As I understand it, Run() executes on a single thread, so there shouldn't be any race conditions.
    That makes me think I created another bug, which I can't find yet, in the transition from my original single-threaded logic to the new ECS logic. o_O

    I do wonder whether the advisors' calendar approach would eat into the system's performance.
    In any case, I need to read some more about dynamic buffers and SystemState.

    Thank you very much for your help! :)
  6. DreamingImLatios


    Jun 3, 2017
    Hardcoded pulling is the least overhead, but lacks dynamic flexibility. But if that's all you need, then call it good. Definitely post again if you run into any more issues. Your simulation sounds pretty awesome!
  7. Antypodish


    Apr 29, 2014
    Make sure your algorithm works on the main thread first. Since you are learning, move it into a single-threaded job after that. More important for you than multithreading is making it Burst compatible, which gives an immense performance boost.

    Regarding cells, yes, keep them as buffer(s). Depending on the amount of data, you can 'move' content by simply changing an index reference. If a cell contains just a few values, just move them to the next cell.