
Help understand System and Job dependencies and implicit/explicit sync points

Discussion in 'Entity Component System' started by The5, Jan 20, 2019.

  1. The5

    The5

    Joined:
    May 26, 2017
    Posts:
    19
    For Future Readers:
    The primary issue was that I had a component holding a reference to another Entity.
    In the system that uses this component, I did not check whether the referenced Entity still exists.
    Code (CSharp):
    someComponentFromEntity.Exists(referenceEntity)
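
    A minimal sketch of that check inside a job, assuming the preview Entities API used elsewhere in this thread (IJobProcessComponentData, ComponentDataFromEntity.Exists, Position from Unity.Transforms). The Target and TargetDistance components and the job name are hypothetical and only serve the illustration:

    ```csharp
    using Unity.Collections;
    using Unity.Entities;
    using Unity.Mathematics;
    using Unity.Transforms;

    // Hypothetical component that stores a reference to another entity.
    public struct Target : IComponentData { public Entity Value; }

    // Hypothetical component that receives the computed result.
    public struct TargetDistance : IComponentData { public float Value; }

    public struct UpdateTargetDistanceJob : IJobProcessComponentData<Position, Target, TargetDistance>
    {
        // Random, read-only access to the positions of other entities.
        [ReadOnly] public ComponentDataFromEntity<Position> PositionsFromEntity;

        public void Execute([ReadOnly] ref Position position, [ReadOnly] ref Target target, ref TargetDistance distance)
        {
            // The referenced entity may have been destroyed since the reference was stored.
            if (!PositionsFromEntity.Exists(target.Value))
            {
                distance.Value = float.MaxValue; // sentinel chosen for this sketch
                return;
            }

            distance.Value = math.distance(position.Value, PositionsFromEntity[target.Value].Value);
        }
    }
    ```

    Without the Exists() guard, the indexer would be asked for data of an entity that is already gone, which matches the kind of exception described further down.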
    ---

    I think my entire understanding of implicit and explicit dependency management is wrong.
    Yesterday I tried to destroy some entities (at runtime) for the first time and everything broke down.
    Now, I cannot imagine my "Apparent Insights" below are actually correct, but here is how I interpreted the errors I came across.

    Let's assume a desired execution order SystemA -> SystemB -> SystemC.
    All these systems are truly dependent on all calculations of the previous Systems' Jobs.
    If anything is not required by a subsequent system, it's in a different system.

    Example:
    SystemA = Attack System adding "DamageEvents" to a queue provided by SystemB
    SystemB = Health System dequeuing Damage Events and adding "DestroyTag" if health == 0
    SystemC = CleanupSystem deleting all tagged Entities via CommandBuffer

    With this, all my systems so far were just using the [UpdateBefore(typeof(SystemC))] and [UpdateAfter(typeof(SystemA))] tags.
    I assumed these implicitly introduce sync points, e.g. SystemB would make sure that all of SystemA's jobs are done before running its own Update. A small number of these attributes would then implicitly build an efficient dependency graph.

    Anyway, I tried passing a MultiHashMap between systems (by injecting SystemB into SystemA) and got an error saying that I would need to complete the jobs in SystemA before being able to use the Queue within jobs from SystemB.

    Apparent Insight #1:
    The [UpdateBefore] [UpdateAfter] attributes do not actually introduce any sync points to Jobs.
    It appears they truly just define the order of the initial Update() method calls.

    It is just the order in which OnUpdate() is called.
    ComponentData automatically allows systems to deduce dependencies between them.
    For NativeContainers you have to pass the job dependencies between systems yourself.

    Consider that a non-job system runs its whole OnUpdate() on the main thread, blocking all systems after it for the time being. This means work that has to finish before another system runs could be forced to happen by using a non-job system. But this is just a dirty hack; dependencies are the way to go.

    Now, I had another look at the Nordeus demo and the TwinStick example and noticed that they in fact manually call JobHandle.Complete() quite a few times.

    So I passed a JobHandle from SystemB's final job to SystemC.
    Now I was unsure:
    In SystemC I could call Complete() on systemB's handle before scheduling SystemC's own jobs, like the samples do.
    But .Schedule() also takes a handle as an argument. Shouldn't this input dependency handle mean that a job waits until all its handles are complete?
    So I tried just passing SystemB's JobHandle along via
    Code (CSharp):
    var outsideDeps = JobHandle.CombineDependencies(SystemBDeps, inputDeps);
    var handle = SystemCJob.Schedule(outsideDeps);
    This did not work either, the error persisted.

    Apparent Insight #2:
    Passing a JobHandle from JobA to JobB.Schedule() does not actually imply any dependency between those jobs.
    JobB happily begins running even though JobA is still in progress. It just seems to implicitly do .CombineDependencies().
    I wanted to verify this, but I could not find the source of "JobsUtility.ScheduleParallelFor()", which ultimately takes the dependency JobHandle.

    No, handles do exactly that, they build dependencies.
    However, you need to manually pass dependencies for Native containers.
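
    To make the corrected statement concrete, here is a small sketch using only the Job System (no ECS), assuming the Unity.Collections preview package for NativeQueue. The MonoBehaviour and job names are made up for this example:

    ```csharp
    using Unity.Collections;
    using Unity.Jobs;
    using UnityEngine;

    public class QueueDependencyExample : MonoBehaviour
    {
        // Fills the queue with the values 0..9.
        struct ProduceJob : IJob
        {
            public NativeQueue<int> Queue;

            public void Execute()
            {
                for (int i = 0; i < 10; i++)
                    Queue.Enqueue(i);
            }
        }

        // Drains the queue and stores the sum in a one-element array.
        struct ConsumeJob : IJob
        {
            public NativeQueue<int> Queue;
            public NativeArray<int> Sum;

            public void Execute()
            {
                int value;
                int sum = 0;
                while (Queue.TryDequeue(out value))
                    sum += value;
                Sum[0] = sum;
            }
        }

        void Start()
        {
            var queue = new NativeQueue<int>(Allocator.TempJob);
            var sum = new NativeArray<int>(1, Allocator.TempJob);

            JobHandle produceHandle = new ProduceJob { Queue = queue }.Schedule();

            // Passing produceHandle makes ConsumeJob wait until ProduceJob has finished.
            JobHandle consumeHandle = new ConsumeJob { Queue = queue, Sum = sum }.Schedule(produceHandle);

            // Complete() is only needed because the main thread reads the result right away.
            consumeHandle.Complete();
            Debug.Log(sum[0]); // 0 + 1 + ... + 9 = 45

            queue.Dispose();
            sum.Dispose();
        }
    }
    ```

    If ConsumeJob were scheduled without produceHandle, the safety system in the editor should report that two jobs access the same queue without a dependency between them, which is the kind of error described above.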

    So I added the JobHandle.Complete() and immediately got a new error:
    This time SystemB's jobs were trying to access the Queue while SystemC's Jobs are still using it?
    But SystemC has [UpdateAfter(typeof(SystemB))], how could SystemB be blocked by SystemC's execution?

    Apparent Insight #3:
    Jobs do execute beyond the end of an Update/Frame, they keep executing into the next frame.
    I need to manually introduce some .Complete() that finishes all my jobs across all systems before the end of the frame.

    JobComponentSystem makes sure to complete its own still-running jobs from the previous frame before calling OnUpdate() the next frame!
    Code (CSharp):
    unsafe JobHandle BeforeOnUpdate()
    {   ...
        m_PreviousFrameDependency.Complete(); // first, complete (blocking) any jobs from last frame that are not done yet
        return GetDependency();
    }

    So I added another .Complete() immediately after SystemC scheduled its own job, to make sure the queue is available again by the time other systems use it.
    This worked.
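
    One possible arrangement of both fixes for a queue shared between two systems, as a sketch and not the exact code from this project. It assumes the preview API of that time ([Inject], OnCreateManager, NativeQueue<T>.Concurrent via ToConcurrent()); the Attack, Health and DamageEvent types and both handle fields are hypothetical. Instead of completing right after scheduling, the producer completes the consumer's handle from the previous frame before it touches the queue again:

    ```csharp
    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;

    // Hypothetical data types for this sketch.
    public struct Health : IComponentData { public int Value; }
    public struct Attack : IComponentData { public Entity Target; public int Damage; }
    public struct DamageEvent { public Entity Target; public int Amount; }

    public class AttackSystem : JobComponentSystem
    {
        [Inject] HealthSystem healthSystem;

        struct EnqueueDamageJob : IJobProcessComponentData<Attack>
        {
            public NativeQueue<DamageEvent>.Concurrent Events;

            public void Execute([ReadOnly] ref Attack attack)
            {
                Events.Enqueue(new DamageEvent { Target = attack.Target, Amount = attack.Damage });
            }
        }

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            // Last frame's consumer job may still be running; finish it before reusing the queue.
            healthSystem.ConsumerHandle.Complete();

            var job = new EnqueueDamageJob { Events = healthSystem.DamageEvents.ToConcurrent() };
            healthSystem.ProducerHandle = job.Schedule(this, inputDeps);
            return healthSystem.ProducerHandle;
        }
    }

    [UpdateAfter(typeof(AttackSystem))]
    public class HealthSystem : JobComponentSystem
    {
        public NativeQueue<DamageEvent> DamageEvents;
        public JobHandle ProducerHandle; // set by AttackSystem
        public JobHandle ConsumerHandle; // set by this system

        struct ApplyDamageJob : IJob
        {
            public NativeQueue<DamageEvent> Events;
            public ComponentDataFromEntity<Health> HealthFromEntity;

            public void Execute()
            {
                DamageEvent e;
                while (Events.TryDequeue(out e))
                {
                    if (!HealthFromEntity.Exists(e.Target)) continue; // target already destroyed
                    var health = HealthFromEntity[e.Target];
                    health.Value -= e.Amount;
                    HealthFromEntity[e.Target] = health;
                }
            }
        }

        protected override void OnCreateManager()
        {
            DamageEvents = new NativeQueue<DamageEvent>(Allocator.Persistent);
        }

        protected override void OnDestroyManager()
        {
            // No job may still use the queue when it is disposed.
            ProducerHandle.Complete();
            ConsumerHandle.Complete();
            DamageEvents.Dispose();
        }

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            // The queue is not ECS data, so the producer's handle is combined in by hand.
            var deps = JobHandle.CombineDependencies(inputDeps, ProducerHandle);
            var job = new ApplyDamageJob
            {
                Events = DamageEvents,
                HealthFromEntity = GetComponentDataFromEntity<Health>(false)
            };
            ConsumerHandle = job.Schedule(deps);
            return ConsumerHandle;
        }
    }
    ```

    The Complete() in AttackSystem usually costs little, because the consumer job from the previous frame has typically finished long before the next frame starts.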

    Lastly, I tried Deleting Entities via command buffer.
    This threw exceptions within the Burst Jobs, as it seems those entities are destroyed while other Jobs still use them.
    Again a case of Apparent Insight #3:
    I need to manually .Complete() all jobs across all systems before doing something like destroying entities.
    However, how do I do that for the built-in systems? Isn't there a predefined barrier?
    Getting a CommandBuffer from EndFrameBarrier did not prevent the exceptions.

    This would lead me to the question:
    Should we not be able to define dependencies on a per-system level?
    Why do we have to pass dependencies on a per-job-level?

    Sure, we need to have the option of control on a per-job-level.
     
    Last edited: Jan 25, 2019
  2. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,269
    This took me a while to understand too.

    You are correct in that all the attributes do is set up the order in which systems have Update() called on the main thread. You can disable automatic system creation and manually create your systems and call Update() on them to achieve the same effect.

    You should only ever have to call Complete() on a JobHandle if you need to work with the results on the main thread after scheduling the job in the same system, or if you have multiple jobs handling a non-ECS NativeContainer. The latter can be avoided with some clever hackery.

    So an important distinction to make is that the ECS automatic dependency management is built on top of the Job System, not built into the Job System. They have fundamentally different strategies. The Job System only cares about what previous jobs need to be completed before a particular job is allowed to run. It does not care about what containers are being used with each job (with the exception of the safety system hooks, which do not apply in a release build). When using the Job System by itself, you manually have to keep track of these dependencies and pass them between jobs.

    So for the ECS to perform automatic dependency management, it needs to keep track of all the JobHandles and reason about which jobs depend on which. This is impossible to do for non-ECS NativeContainers. But all ECS data access is declared through APIs such as IJobProcessComponentData.Schedule() or GetComponentDataFromEntity(). But because ECS can't know in advance which entities these jobs will process, it can only make conclusions based on the types.

    So if system A touches the types Position and Rotation, and system B touches the types Scale and Health, then in no way could these systems be touching the same data, even if they are touching the same entities. And consequently, they can be scheduled and processed concurrently.

    However, if system A touches (writes to) Position but only for entities in chunk A, and system B also touches (reads or writes) Position but only for entities in chunk B, system B's jobs will still have to wait for system A's jobs to finish since ECS only cares about dependencies by type. This does not apply for multiple jobs scheduled within a single system. There only Job rules apply, which let you optimize a little more aggressively if you desire.
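
    A rough sketch of what dependency by type means in practice, assuming the preview JobComponentSystem API discussed in this thread. The Speed, Distance and Energy components and the three systems are invented for the illustration:

    ```csharp
    using Unity.Collections;
    using Unity.Entities;
    using Unity.Jobs;
    using UnityEngine;

    // Hypothetical components for this sketch.
    public struct Speed : IComponentData { public float Value; }
    public struct Distance : IComponentData { public float Value; }
    public struct Energy : IComponentData { public float Value; }

    // Writes Distance, reads Speed.
    public class MoveSystem : JobComponentSystem
    {
        struct MoveJob : IJobProcessComponentData<Distance, Speed>
        {
            public float DeltaTime;

            public void Execute(ref Distance distance, [ReadOnly] ref Speed speed)
            {
                distance.Value += speed.Value * DeltaTime;
            }
        }

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            return new MoveJob { DeltaTime = Time.deltaTime }.Schedule(this, inputDeps);
        }
    }

    // Touches only Energy: it shares no type with MoveSystem,
    // so its job does not have to wait for MoveJob.
    [UpdateAfter(typeof(MoveSystem))]
    public class RegenSystem : JobComponentSystem
    {
        struct RegenJob : IJobProcessComponentData<Energy>
        {
            public float DeltaTime;

            public void Execute(ref Energy energy)
            {
                energy.Value += DeltaTime;
            }
        }

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            return new RegenJob { DeltaTime = Time.deltaTime }.Schedule(this, inputDeps);
        }
    }

    // Reads Distance: inputDeps includes MoveJob because it writes Distance,
    // even if both jobs end up processing different entities.
    [UpdateAfter(typeof(MoveSystem))]
    public class DragSystem : JobComponentSystem
    {
        struct DragJob : IJobProcessComponentData<Speed, Distance>
        {
            public void Execute(ref Speed speed, [ReadOnly] ref Distance distance)
            {
                // Arbitrary rule for the sketch: stop after 100 units.
                if (distance.Value > 100f)
                    speed.Value = 0f;
            }
        }

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            return new DragJob().Schedule(this, inputDeps);
        }
    }
    ```

    The attributes only fix the order of the OnUpdate() calls on the main thread. Which jobs actually wait for each other follows from the component types each system declares.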

    And then the other catch is that the main difference between ComponentSystem and JobComponentSystem is that ComponentSystem completes the inputDeps (all the jobs previously scheduled that touch the types needed by the system) before calling OnUpdate().

    If a new type is introduced during OnUpdate() that wasn't registered as a dependency in previous frames for the system, then all jobs for that type are completed on the spot. Therefore it is normal for the first frame to be a little bit slower due to suboptimal job scheduling.

    Destroying entities using a CommandBuffer retrieved from EndFrameBarrier should work. We would have to see code to figure out the issue. But you probably want to get your dependency management with your NativeQueue worked out first, since a NativeQueue isn't accessed through ECS and consequently requires manual dependency management. How you manage that dependency is up to you. Each person here on the forums seems to have a different approach.
     
  3. The5

    The5

    Joined:
    May 26, 2017
    Posts:
    19
    Thanks a lot @DreamingImLatios!
    I was absolutely not aware of the distinction between IComponentData and NativeContainer dependency management, though it makes sense.
    Interestingly, I have a NativeMultiHashMap that hashes all my positions (like the Nordeus demo).
    I use this Map as [ReadOnly] all over the place but never had to use .Complete() after writing to it initially.
    I wonder if the systems using it just coincidentally depend on the same Components and thus are synced properly, could this be?

    As for deleting Entities, I have this non-job component system:

    Code (CSharp):
    public class DestroyEntitySystem : ComponentSystem
    {
        // Injected group: all entities that carry a DestroyTag.
        struct Destroy
        {
            public readonly int Length;
            public EntityArray Entity;
            public ComponentDataArray<DestroyTag> DestroyTag;
        }
        [Inject] private Destroy DestroyEntities;

        [Inject] EndFrameBarrier EndFrameBarrier;

        protected override void OnUpdate()
        {
            // Destruction is deferred: PostUpdateCommands is played back after OnUpdate() returns.
            for (int i = 0; i < DestroyEntities.Length; i++)
            {
                PostUpdateCommands.DestroyEntity(DestroyEntities.Entity[i]);
            }
        }
    }

    It currently causes an exception while my AttackSystem.AttackUpdateTargetDistanceJob is running:
    Code (CSharp):
    [UpdateAfter(typeof(GridSystem))]
    class AttackTargetSystem : JobComponentSystem
    {
        [Inject][ReadOnly] GridSystem gridSystem;

        ComponentDataFromEntity<Position> PositionsFromEntity;
        ComponentDataFromEntity<AttackData> AttackDataFromEntity;
        ComponentDataFromEntity<Faction> FactionFromEntity;
        ComponentDataFromEntity<FactionMember> FactionMemberFromEntity;
        ComponentDataFromEntity<HeroUnitTag> HeroUnitTagFromEntity;
        ComponentDataFromEntity<ColliderLarge> SpecialCollisionsFromEntity;
        ComponentDataFromEntity<DeadTag> DeadTagFromEntity;

        protected override void OnCreateManager()
        {
        }

        protected override void OnDestroyManager()
        {
            //base.OnDestroyManager();
        }

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            PositionsFromEntity = this.GetComponentDataFromEntity<Position>(true);
            AttackDataFromEntity = this.GetComponentDataFromEntity<AttackData>(false);
            FactionFromEntity = this.GetComponentDataFromEntity<Faction>(true);
            FactionMemberFromEntity = this.GetComponentDataFromEntity<FactionMember>(true);
            HeroUnitTagFromEntity = this.GetComponentDataFromEntity<HeroUnitTag>(true);
            SpecialCollisionsFromEntity = this.GetComponentDataFromEntity<ColliderLarge>(true);
            DeadTagFromEntity = this.GetComponentDataFromEntity<DeadTag>(true);

            var autoAttackJob = new AttackFindTargetAutoattackJob {
                CellSize = gridSystem.GridCellsLvl1_Size,
                GridCells = gridSystem.GridCellsLvl1, // ToConcurrent() does not support iterators, so we can't use it here
                PositionsFromEntity = this.PositionsFromEntity,
                AttackDataFromEntity = this.AttackDataFromEntity,
                FactionFromEntity = this.FactionFromEntity,
                FactionMemberFromEntity = this.FactionMemberFromEntity,
                HeroUnitTagFromEntity = this.HeroUnitTagFromEntity,
                DeadTagFromEntity = this.DeadTagFromEntity
            };
            var handleFindAutoAttack = autoAttackJob.Schedule(this, inputDeps);

            var handleFindHeroAttack = inputDeps;

            var handle = JobHandle.CombineDependencies(handleFindAutoAttack, handleFindHeroAttack);

            var updateDistanceJob = new AttackUpdateTargetDistanceJob {
                PositionsFromEntity = this.PositionsFromEntity,
                SpecialCollisionsFromEntity = this.SpecialCollisionsFromEntity,
            };
            handle = updateDistanceJob.Schedule(this, handle);

            return handle;
        }
    }

    An alternative DestroyEntitySystem I tried:

    Code (CSharp):
    [UpdateAfter(typeof(AttackTargetSystem))]
    [UpdateBefore(typeof(EndFrameBarrier))]
    public class DestroyEntitySystem : JobComponentSystem
    {
        [Inject] EndFrameBarrier EndFrameBarrier;

        protected override JobHandle OnUpdate(JobHandle inputDeps)
        {
            return new DestroyEntityJob {
                CommandBuffer = this.EndFrameBarrier.CreateCommandBuffer().ToConcurrent()
            }.Schedule(this, inputDeps);
        }
    }

    public struct DestroyEntityJob : IJobProcessComponentDataWithEntity<DestroyTag>
    {
        public EntityCommandBuffer.Concurrent CommandBuffer;

        public void Execute(Entity entity, int index, ref DestroyTag destroy)
        {
            CommandBuffer.DestroyEntity(index, entity);
        }
    }
     
    Last edited: Jan 21, 2019
  4. anihilnine

    anihilnine

    Joined:
    Jan 29, 2016
    Posts:
    27
    Useful post
     