[Please Help] Adding to a non-repeating set from multiple jobs

Discussion in 'Entity Component System' started by PublicEnumE, Sep 1, 2019.

  1. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    I've been trying different ways to solve this problem for a few weeks, with no luck so far. Please help if you know solutions in this area, and thank you:

    - - -

    This game is a spy management sim. You control a secret government agency, which sends its agents to perform covert operations in cities across the globe.

    Different cities are eligible for different types of missions. Part of the game is choosing which cities you'll send your agents to next.

    - - -

    In ECS, I have systems which schedule jobs to find eligible cities, based on mission type and other criteria. Some of these "FindCities" jobs are more expensive than others, but the most expensive ones don't run very often. That's why I have these different types of "find" operations broken out into different systems: so I don't do the most expensive main thread work (prepping the data I'll need to send into these jobs) unless I need to.

    For instance, here's my system order breakdown:

    1. FindCitiesCheapSystem <--(the most common type of query - happens frequently)
    2. FindCitiesMediumSystem <--(slightly more expensive / happens less frequently)
    3. FindCitiesExpensiveSystem <--(the slowest type of query, but happens rarely)

    Each of these systems needs to add the cities they find to a single collection. But that collection shouldn't have any duplicate entries, because...

    4. ProcessFoundCitiesSystem

    ^ In a 4th system, I want to schedule an IJobForEach and process each of the cities from the collection in parallel. If there are duplicate city entries in the list, this work can't be done safely in parallel (multiple worker threads might try to write to Components on the same city).

    - - -

    I have yet to find a good way to write to the same non-repeating collection from multiple jobs.

    And if I turn off safety checks on the collection, then I don't know how to make sure the 4th system waits for all the "FindCities" systems to complete.

    What's the best approach here? Should I just not be trying to write to a single collection in parallel to begin with? Please help - I could really use your advice!

    Thank you,
    E
     
  2. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
    What does the definition of your city that you write to a collection look like?
     
  3. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Right now it’s just an Entity, since that 4th system will need to access a DynamicBuffer on the city.

    So rather than store a collection of some component type, I’m storing the city’s Entity, which I’ll use to access a BufferFromEntity.
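
    Something like this is the plan (all type and field names here are just placeholders for illustration):

    Code (CSharp):

    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;

    // Placeholder buffer element that lives on each city entity.
    struct CityMissionElement : IBufferElementData
    {
        public int missionId;
    }

    struct ProcessFoundCitiesJob : IJob
    {
        [ReadOnly] public NativeArray<Entity> foundCities;

        // Filled from GetBufferFromEntity<CityMissionElement>() by the
        // system when this job is scheduled.
        public BufferFromEntity<CityMissionElement> missionBuffers;

        public void Execute()
        {
            for (int i = 0; i < foundCities.Length; i++)
            {
                DynamicBuffer<CityMissionElement> buffer = missionBuffers[foundCities[i]];
                // ... read/write the city's mission data here.
            }
        }
    }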
     
  4. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,761
    Most parallel containers work fine from different systems; it's just the safety system that will trigger a warning, which you can turn off with [NativeDisableContainerSafetyRestriction]
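
    For example, something like this (the job and container names are placeholders; depending on your Collections version the parallel writer may be called .Concurrent and created with ToConcurrent() instead of AsParallelWriter()):

    Code (CSharp):

    using Unity.Collections;
    using Unity.Collections.LowLevel.Unsafe;
    using Unity.Entities;
    using Unity.Jobs;

    struct FindCitiesJob : IJob
    {
        // Tells the safety system to stop complaining that another
        // scheduled job also holds a writer to this container. You then
        // become responsible for making sure the jobs don't actually race.
        [NativeDisableContainerSafetyRestriction]
        public NativeQueue<Entity>.ParallelWriter foundCities;

        public void Execute()
        {
            // ... enqueue eligible city entities here.
        }
    }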

    I personally don't like this, but people do it. I like to stick to the rule of not writing to a collection from more than one system, and I see no reason these have to be different systems.

    Just early out on the expensive stuff:

    Code (CSharp):

    protected override JobHandle OnUpdate(JobHandle handle)
    {
        handle = FindCitiesCheapSystem(handle);
        handle = FindCitiesMediumSystem(handle);
        handle = FindCitiesExpensiveSystem(handle);
        handle = ProcessFoundCitiesSystem(handle);

        return handle;
    }

    private JobHandle FindCitiesCheapSystem(JobHandle handle)
    {
        // Early out: nothing matched the cheap query this frame.
        if (this.cheapQuery.CalculateEntityCount() == 0)
        {
            return handle;
        }

        return new EasyJob().Schedule(handle);
    }

    private JobHandle FindCitiesMediumSystem(JobHandle handle)
    {
        if (this.mediumQuery.CalculateEntityCount() == 0)
        {
            return handle;
        }

        return new MediumJob().Schedule(handle);
    }

    private JobHandle FindCitiesExpensiveSystem(JobHandle handle)
    {
        if (this.expensiveQuery.CalculateEntityCount() == 0)
        {
            return handle;
        }

        // Set up expensive main thread stuff here.

        return new ExpensiveJob().Schedule(handle);
    }
    I feel like this will result in a lot fewer issues in the long run.
     
    Last edited: Sep 1, 2019
    PublicEnumE likes this.
  5. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Thanks @tertle!

    In your example, you give up parallel writing - the jobs all run sequentially. I was thinking the parallelism could save time. Did you exclude it intentionally?
     
    Last edited: Sep 1, 2019
  6. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,154
    My assumption would also be to give up parallel writing. Just write to one dynamic buffer per job, then filter out duplicates and write to the final list. It could be faster than parallel writing, and it gives you the flexibility to run the jobs over multiple frames - I hope they provide better support for this (discussed somewhere else...)
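
    Something like this (all names are placeholders; I'm sketching it with one NativeList per find job instead of a DynamicBuffer, but the idea is the same):

    Code (CSharp):

    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;

    // Scheduled with a dependency on all of the find jobs (e.g. via
    // JobHandle.CombineDependencies), so nothing ever writes in parallel.
    struct CombineUniqueJob : IJob
    {
        [ReadOnly] public NativeList<Entity> cheapResults;
        [ReadOnly] public NativeList<Entity> mediumResults;
        [ReadOnly] public NativeList<Entity> expensiveResults;

        public NativeHashMap<Entity, byte> seen; // used as a set, value unused
        public NativeList<Entity> uniqueCities;  // the final list

        public void Execute()
        {
            Append(cheapResults);
            Append(mediumResults);
            Append(expensiveResults);
        }

        private void Append(NativeList<Entity> source)
        {
            for (int i = 0; i < source.Length; i++)
            {
                // TryAdd fails if the key already exists, so each city
                // ends up in uniqueCities exactly once.
                if (seen.TryAdd(source[i], 0))
                {
                    uniqueCities.Add(source[i]);
                }
            }
        }
    }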
     
    PublicEnumE likes this.
  7. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Thank you! Would you mind explaining to this dummy why this would be?
     
    Last edited: Sep 1, 2019
  8. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,154
    Because parallel writing comes at a cost. It's more efficient to run various independent things in parallel than to try to parallelize one thing.

    You will have to test it out ;) but my assumption is that what you are doing could be such a case, or that the benefit of parallel writing would be minimal.
     
  9. PublicEnumE

    PublicEnumE

    Joined:
    Feb 3, 2019
    Posts:
    729
    Thank you for the explanation. I seriously appreciate your help. Would you mind explaining what the difference between the two things in your quote is? "Running various independent things in parallel" vs. "Parallelizing one thing" - How are these two things different?

    Are you referring to using a Job type in a single System that parallelizes its work across multiple threads, vs spreading out a single "task" across multiple, parallel jobs?

    I'm not familiar with these costs, besides the limits this puts on how Unity can schedule its other jobs. Is that what you're referring to here? And if not, where can I learn more about the costs of parallel writing?
     
  10. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
    Like many others are suggesting, get the algorithm working single-threaded first.

    As for extracting a set of all unique results from multiple parallel producers, there are two techniques I know of.

    First technique: Write the city entities as keys into a NativeHashMap.Concurrent with a dummy value, then copy the keys to a NativeArray. There's also a NativeHashSet floating around in the forums which may do the same thing a little more elegantly.
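
    A minimal sketch of the first technique (all names are placeholders; depending on your Collections version the writer may be called .Concurrent and created with ToConcurrent() instead of AsParallelWriter()):

    Code (CSharp):

    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;

    struct FindCitiesJob : IJobParallelFor
    {
        [ReadOnly] public NativeArray<Entity> candidates;

        // The map is used as a set; the byte value is the dummy. TryAdd
        // simply returns false on a duplicate key, so each city entity is
        // recorded at most once no matter how many producers find it.
        public NativeHashMap<Entity, byte>.ParallelWriter uniqueCities;

        public void Execute(int index)
        {
            // ... eligibility test here ...
            uniqueCities.TryAdd(candidates[index], 0);
        }
    }

    Once all the producing jobs have completed, GetKeyArray(Allocator.TempJob) on the map hands back the deduplicated entities as a NativeArray<Entity>.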

    Second technique: Write the city entities to a NativeStream. Then convert the NativeStream to a NativeArray, sort the array, and lastly remove duplicates (which is easy since the duplicates will all be next to each other).
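
    A sketch of the sort-and-dedupe end of the second technique (the stream writing and the ToNativeArray conversion are omitted; names are placeholders, and this version does the final pass single-threaded for simplicity):

    Code (CSharp):

    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;

    struct SortAndDeduplicateJob : IJob
    {
        // Raw results converted from the NativeStream; may contain duplicates.
        public NativeArray<Entity> cities;

        // Deduplicated output.
        public NativeList<Entity> uniqueCities;

        public void Execute()
        {
            // Entity is IComparable<Entity>, so the Collections sort works.
            cities.Sort();

            // After sorting, duplicates are adjacent: keep an element only
            // if it differs from the previous one.
            for (int i = 0; i < cities.Length; i++)
            {
                if (i == 0 || cities[i] != cities[i - 1])
                {
                    uniqueCities.Add(cities[i]);
                }
            }
        }
    }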

    The first technique is easier and probably meets your needs. But I mention the second technique because it gets around a couple of nasty issues. First, there's the issue that shows up if your multiple producers like to produce a lot of duplicates: that generates a lot of atomics and a lot of potential false sharing for writes which don't actually accomplish anything. Second, at very, very large scales the first method doesn't scale all that well, whereas the second method, while a fair amount of work to implement, can be almost completely parallelized (only 2 list resizes have to happen single-threaded in IJobs).
     
    PublicEnumE likes this.
  11. RecursiveEclipse

    RecursiveEclipse

    Joined:
    Sep 6, 2018
    Posts:
    298
    Last edited: Sep 2, 2019
  12. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,154
    1. What I meant is that you likely have many other things going on in your game that could happen in parallel to "find cities" - even if they are all single threaded, you are still parallelizing the work (see the sketch at the end of this post).

    2. I am not an expert in this either, so I won't be able to give you the best explanation. But for parallel writing you either lock, or keep a copy for each thread - it seems @DreamingImLatios could explain it better.

    PS: I tried a lot of this in a collision system & profiled it a long while back - you will find old threads here around this parallel writing topic.
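
    For example (placeholder job names), two jobs that only depend on the frame's input handle are free to run at the same time on different worker threads, even though each one is single threaded internally:

    Code (CSharp):

    protected override JobHandle OnUpdate(JobHandle inputDeps)
    {
        // Neither job depends on the other, so the scheduler is free to
        // run them concurrently on different worker threads.
        JobHandle findCities = new FindCitiesJob().Schedule(inputDeps);
        JobHandle updateAgents = new UpdateAgentsJob().Schedule(inputDeps);

        // Later systems wait on both.
        return JobHandle.CombineDependencies(findCities, updateAgents);
    }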
     
    PublicEnumE likes this.
  13. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,264
    Parallel writing by itself is not expensive. But common attempts at it do expensive things.

    First is false sharing (and, later on, true sharing), where multiple threads write to the same cache line. This causes the CPU cores to juggle the cache line around like a game of catch. Unless you do something like keep a counter per thread with all the counters packed into the same cache line, this one by itself is pretty easy to avoid. You won't avoid it completely, but a couple of "tosses" is pretty cheap.
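
    A generic illustration of that per-thread-counter trap (not Unity-specific; the 64-byte line size is an assumption about typical hardware):

    Code (CSharp):

    using System.Runtime.InteropServices;

    // Bad: the two counters share a 64-byte cache line, so two threads
    // incrementing "their own" counter still fight over the same line.
    struct PackedCounters
    {
        public long thread0;
        public long thread1;
    }

    // Better: pad each counter out to a full cache line so writes from
    // different threads never touch the same line.
    [StructLayout(LayoutKind.Explicit, Size = 64)]
    struct PaddedCounter
    {
        [FieldOffset(0)] public long Value;
    }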

    Second is atomics. This is what you need when you have multiple threads writing with random access, and possibly also reading if they check the size of a resizable container - basically race condition city without some mechanism like atomics to maintain coordination. Not only do the CPU cores have to use special operations to coordinate the shared variables that track who is writing to what and when, but such algorithms can also be heavily prone to the first issue. NativeHashMap and friends use atomics. They work well for sparse outputs, but have their pitfalls.
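
    As a generic sketch of why that coordination costs something (this is not the actual NativeHashMap implementation): reserving a slot with Interlocked.Increment is race-free, but every caller does an atomic read-modify-write on the same hot counter, which is exactly the kind of cache line ping-pong described above.

    Code (CSharp):

    using System.Threading;

    class AtomicAppendBuffer
    {
        private readonly int[] results = new int[1024];
        private int count = -1;

        // Safe from any number of threads: each caller atomically reserves
        // a distinct index. The price is that every call contends on the
        // same shared variable.
        public void Add(int value)
        {
            int index = Interlocked.Increment(ref count);
            results[index] = value; // assumes capacity is never exceeded
        }
    }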

    I've been spending a lot of time lately perfecting my multibox pruner so these concepts are pretty fresh in my head.
     
    PublicEnumE likes this.
  14. sngdan

    sngdan

    Joined:
    Feb 7, 2014
    Posts:
    1,154