Figuring out State Machines in DOTS

Discussion in 'Entity Component System' started by PhilSA, Nov 24, 2020.

  1. PhilSA

    I see a lot of people in forums/discord/twitter having a hard time figuring out a good State Machine implementation in DOTS, one that can deal with more complex scenarios. I'm curious whether people here have any good ideas regarding this.

    Solutions
    We typically hear about two approaches:
    1. state is a component, and each state has its own job
    2. state is an enum, and we do a switch case in one job to determine what logic to do based on current state
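
    A minimal sketch of approach 2 in plain C#, with no Unity types so that it stands on its own. All names here (CharacterState, Character, CharacterUpdate) are hypothetical; in an actual DOTS project the struct would presumably be a component and Update would run inside a job.

```csharp
using System;

// Hypothetical names throughout; plain C# stand-ins for ECS components.
public enum CharacterState : byte { Walking, Falling, Swimming }

public struct Character
{
    public CharacterState State;
    public float Height;   // distance above the ground
    public float Speed;
}

public static class CharacterUpdate
{
    // Approach 2: one update routine that selects its logic from the enum value.
    public static void Update(ref Character c, float dt)
    {
        switch (c.State)
        {
            case CharacterState.Walking:
                c.Speed = 2f;
                if (c.Height > 0f) c.State = CharacterState.Falling;
                break;
            case CharacterState.Falling:
                c.Height = Math.Max(0f, c.Height - 9.81f * dt);
                if (c.Height == 0f) c.State = CharacterState.Walking;
                break;
            case CharacterState.Swimming:
                c.Speed = 1f;
                break;
        }
    }
}
```

    Approach 1 would instead split each case into its own component type and its own job, so that a state change becomes a structural change on the entity.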
    Problems
    Solution 1 seems non-ideal when there are lots of states (50 states = potentially 50 jobs scheduled), and when there are frequent state changes.

    Solution 2 seems non-ideal because of the performance of the switch case when there are lots of states, and also I would assume there's unavoidable cache-unfriendliness.

    Both solutions seem to scale very badly when state-dependent logic must be called at multiple different points in a frame. It multiplies the number of jobs to schedule (which is terrible for solution 1, where you'd quickly end up with hundreds of jobs), and multiplies the number of times we must do the switch case to select the appropriate logic for solution 2.

    Also, both solutions require enough boilerplate to make you feel like you'd need to write some kind of codegen tool to help you create new states. It's not an unsolvable problem, but it's worth considering in our search for the best solution. And if the best solution sadly involves codegen, it's worth considering for Unity if they want to provide built-in tools to simplify that common use case.

    Final words
    Has anyone found a solution that they are happy with, and that would deal well with having lots of states + state-dependent logic called multiple times in a frame? And am I making any false assumptions here, especially regarding the cost of the switch case approach?
     
  2. PhilSA

    There's also the option of thinking of an approach that is not a state machine, but still solves our problem. That's trickier to discuss, though, because we need a very specific use case to reason about. So I would suggest the following:
    • You're making a mario64-style game and your character has many different "states": Walking, Running, Falling, LedgeGrab, Sliding, Climbing, Stunned, Frozen, Swimming, Attacking, SwingingOnRope, etc.... there could easily be 30+ of those
    • each conceptual "state" has different logic to do for input handling, movement/rotation handling, animation handling, and handling of various other gameplay features
    • The state-dependent logic will need to be called at various different points in a frame: pre-physics update, post-physics update, input update, animation update, etc....
    • We need all that logic to not be a big mess of if/else statements in the code, and to be easily manageable & scalable for programmers & designers
    So far I really haven't found a better alternative to state machines for that problem. "Solution 2" above is the best fit I could find for that level of complexity
     
    Last edited: Nov 24, 2020
  3. Antypodish

    If switch statements cannot be used, I would consider an approach that uses an ECB (EntityCommandBuffer) and entities to trigger the relevant system.
    Let's say you have 10k agents, and they can jump at various times.

    So for example, when the condition is met, you create a CallJump entity with a reference to the agent entity and a component that the jump system queries for.

    Then JumpSystem executes the jump for the agent, based on the referenced entity, and destroys the CallJump entity.
    You could also try the event systems proposed here on the forum.
    Alternatively, if the behaviour runs for longer than a single frame, I would probably just set a tag component on the agent entity, e.g. one for seeking targets.
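
    The request-entity idea can be sketched in plain C# roughly as follows. This is only an illustration under stated assumptions: Agent, JumpRequest and JumpSystem are hypothetical stand-ins, an agent is referenced by array index instead of by Entity, and a List plays the role that an ECB and a query would play in DOTS.

```csharp
using System.Collections.Generic;

// Hypothetical stand-ins: an "agent" is an index into an array, and a
// JumpRequest plays the role of the short-lived CallJump entity.
public struct Agent
{
    public float VerticalVelocity;
}

public struct JumpRequest
{
    public int AgentIndex;   // reference to the agent that should jump
    public float Impulse;
}

public static class JumpSystem
{
    // Consumes every pending request, applies it to the referenced agent,
    // then clears the list (the equivalent of destroying the request entities).
    public static int Execute(Agent[] agents, List<JumpRequest> requests)
    {
        int processed = 0;
        foreach (var request in requests)
        {
            agents[request.AgentIndex].VerticalVelocity += request.Impulse;
            processed++;
        }
        requests.Clear();
        return processed;
    }
}
```

    The system only does work when requests exist, which is the appeal of this approach when jumps are rare relative to the number of agents.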
     
  4. DreamingImLatios

    There's no silver bullet here. I would be asking these questions:
    1) What is the frequency of state changes?
    2) What is the worst case number of states to be processed in a single frame (accounting for all entities at once)?
    3) Can the state changes be performed during sync points?
    4) How much main thread headroom do you have for sync points?
    5) How much worker thread headroom do you have for additional bool checks, switch cases, and random access?
    6) How rigid are your state definitions? Are your states' data unionizable (not Unity-serialized or do not contain Entity or BlobAssetReference fields)?

    A third option is to make each state an entity. This basically gives you (1) except with cheaper sync points and more expensive worker threads. The code can look pretty clean though, especially regarding state changes (you can instantiate and destroy states).

    Remember, you only pay the price of the job if that state actually exists on an entity. But in this case, I think the real problem is you have too many interleaved dependencies between your state machine and other operations. You may want to try and reduce that rather than make a state machine that scales well in such an environment (which I'm not even sure is possible).
     
  5. tertle

    You can use write groups to create an interesting state machine setup. I think a lot of people forget this even exists.
     
  6. Timboc

    There's a massive conceptual utility of states. In practice however... do they ever work well at scale & high complexity? In reality I've always experienced large issues with hidden or mixed states.
    I think any attempt to conceptually simplify things pays off massively. One approach is making a hierarchy of states and potentially using different techniques based on e.g. the factors DreamingImLatios laid out.

    DOD seems to constantly ask the question of what data is there actually and how is it changing exactly. Is it possible to have a good general solution? Maybe?
    Taking the example:
    What states are *actually* mutually exclusive? You can be Frozen and Falling, Stunned and Falling but not Frozen and Stunned? That matrix feels important.
    Maybe there's a Moving and a Fighting ICD (IComponentData)? Perhaps the Moving ICD has a bool for user-controlled, a flags enum for pose, and a speed for walk/run/swim?
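
    A rough sketch of what such a Moving component and its exclusivity matrix might look like, in plain C#. The names (Pose, Moving, PoseRules) and the single rule encoded here are assumptions taken from the example above, not an established design.

```csharp
using System;

[Flags]
public enum Pose : byte
{
    None    = 0,
    Falling = 1 << 0,
    Frozen  = 1 << 1,
    Stunned = 1 << 2,
}

// Hypothetical shape of the "Moving" data; in DOTS this would be an IComponentData struct.
public struct Moving
{
    public bool UserControlled;
    public Pose Pose;
    public float Speed;   // covers walk/run/swim as a continuous value
}

public static class PoseRules
{
    // Assumed rule from the example: Frozen and Stunned exclude each other,
    // while either one may be combined with Falling.
    public static bool IsValid(Pose pose)
    {
        const Pose exclusive = Pose.Frozen | Pose.Stunned;
        return (pose & exclusive) != exclusive;
    }
}
```

    Writing the matrix down as a validation function makes hidden or mixed states detectable at the point where they are set.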

    I think it's a good post / question to ask. Look forward to seeing other responses.
     
  7. Sarkahn

    Just a reminder: if someone wants to see a practical example of method #1 from the original post, Unity has a state machine example in the ECS Samples. At the bottom of the readme they go into great detail about the tradeoffs they made based on their requirements for the example, as well as some possible alternatives.
     
  8. PhilSA

    I'm thinking function pointers could be a very interesting alternative here. No idea if this would work because I haven't used function pointers much, but it could look like this:
    • State is an enum
    • Remember the function pointers of all the functions of all states in a "NativeArray<StateFunctions> MyStateFunctions" that gets passed to jobs
    • calling a function on the current state in a job would look like "MyStateFunctions[(int)myCurrentState].SomeFunction.Invoke(......)"
    However, the manual says there are lots of downsides to function pointers (poorer performance, limitations on what you can pass to them, etc....) so I'm not totally sure if this really is an attractive solution. Specifically, I'm worried about the point on not being able to pass native containers, because that would be a dealbreaker in my case. Lack of generics support is also a potential dealbreaker for me, although it's not totally clear to me just by reading the manual what that means exactly
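
    To make the table-of-functions idea concrete, here is a minimal sketch using ordinary managed C# delegates. This is an assumption-laden illustration only: Burst-compiled jobs would need Burst function pointers rather than managed delegates, a plain array stands in for the NativeArray, and MoveState, SpeedFunction and StateTable are hypothetical names.

```csharp
using System;

public enum MoveState { Walking = 0, Running = 1 }

public delegate float SpeedFunction(float input);

// One entry per state; a fuller version would hold several delegates per state.
public struct StateFunctions
{
    public SpeedFunction GetSpeed;
}

public static class StateTable
{
    static float WalkSpeed(float input) => input * 2f;
    static float RunSpeed(float input) => input * 6f;

    // Indexed by (int)state, so the order must match the enum values.
    public static readonly StateFunctions[] Functions =
    {
        new StateFunctions { GetSpeed = WalkSpeed },
        new StateFunctions { GetSpeed = RunSpeed },
    };

    public static float Speed(MoveState state, float input) =>
        Functions[(int)state].GetSpeed.Invoke(input);
}
```

    The call site never branches on the state; it only indexes the table, which is the property that makes this pattern attractive when there are many states.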
     
    Last edited: Nov 24, 2020
  9. DreamingImLatios

    While there are ways to work around the function pointer limitations, would you really rather write the code to compile function pointers into a NativeArray and pass them around than write a switch case that directly invokes the static methods, with no function pointer workarounds needed?
     
  10. PhilSA

    yeah I think I would have more peace of mind with the switch case. I'm a fan of using simple things when possible, to avoid whatever risks/gotchas/limitations that might come with something like burst-compiled function pointers. I don't have a very good idea of how heavy switch cases are for performance, though. So I'll need to do some tests

    In the actual project I'm working on right now, each of my character states has 5 different functions (called at 5 different times in the frame). And each time we call one of those, we do a switch case over 25 states. So every frame, for each of the 100-200 characters in the scene, I do 5 switch statements of 25 cases. Right now I can't really tell how bad this is, but it does sound at least a little bit bad to me
     
    Last edited: Nov 25, 2020
  11. DreamingImLatios

    So I can give you some insight into how Burst handles switch cases. It is fast.
    For reference, I'll be looking at the x86 assembly generated from this code: https://github.com/Dreaming381/Lati...l/Builders/BuildCollisionLayerInternal.cs#L65
    Code (CSharp):
    === BuildCollisionLayerInternal.cs(68, 1)                mask     += math.select(0, 0x2, r);
            or        r11d, edx
    === BuildCollisionLayerInternal.cs(69, 1)                mask     += math.select(0, 0x1, s);
            or        r11d, esi
            movabs        rax, offset .LJTI0_0
            mov        rcx, rax
            movsxd        rax, dword ptr [rax + 4*r11]
            add        rax, rcx
    .Ltmp26:
    === BuildCollisionLayerInternal.cs(124, 1)                    ProcessEntity(firstEntityIndex + i, chunkEntities[i], chunkColliders[i], RigidTransform.identity);
            movabs        r11, offset "Latios.PhysicsEngine.BuildCollisionLayerInternal.Part1FromQueryJob.ProcessEntity(Latios.PhysicsEngine.BuildCollisionLayerInternal.Part1FromQueryJob* this, int index, Unity.Entities.Entity entity, Latios.PhysicsEngine.Collider collider, Unity.Mathematics.RigidTransform rigidTransform)_9975CD292A285859"
            jmp        rax
    Alright, let's break this down. The mask is stored in r11 initially. Then rax is loaded with the address of the label .LJTI0_0 (I'll get to what that is in a sec), and that address is also backed up to rcx. Next, the 32-bit value at rax + 4*r11 (whatever .LJTI0_0 is, it is being treated like an array of 4-byte entries indexed by the mask) gets loaded into rax and sign-extended to 64 bits. Then rcx, the address of .LJTI0_0, is added to rax. I'm not sure why the function address of ProcessEntity gets loaded into r11 here and not later, but let's ignore that. The last step is the jmp rax, which jumps to the address of .LJTI0_0 plus the offset stored in the array entry for the mask.
    What is in that array?
    Code (CSharp):
    .LJTI0_0:
            .long        .LBB0_34-.LJTI0_0
            .long        .LBB0_73-.LJTI0_0
            .long        .LBB0_70-.LJTI0_0
            .long        .LBB0_71-.LJTI0_0
            .long        .LBB0_58-.LJTI0_0
            .long        .LBB0_74-.LJTI0_0
            .long        .LBB0_76-.LJTI0_0
            .long        .LBB0_72-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_139-.LJTI0_0
            .long        .LBB0_60-.LJTI0_0
            .long        .LBB0_69-.LJTI0_0
            .long        .LBB0_53-.LJTI0_0
            .long        .LBB0_78-.LJTI0_0
            .long        .LBB0_55-.LJTI0_0
            .long        .LBB0_75-.LJTI0_0
            .long        .LBB0_52-.LJTI0_0
            .long        .LBB0_77-.LJTI0_0
            .long        .LBB0_57-.LJTI0_0
            .long        .LBB0_48-.LJTI0_0
            .long        .LBB0_51-.LJTI0_0
            .long        .LBB0_56-.LJTI0_0
            .long        .LBB0_49-.LJTI0_0
            .long        .LBB0_50-.LJTI0_0
            .long        .LBB0_54-.LJTI0_0
            .long        .LBB0_59-.LJTI0_0
    It is a bunch of offsets from .LJTI0_0 (it is done this way to avoid storing full 64-bit addresses). Some of these implementations are inlined, some are converted to functions. Oddly enough, the ProcessParent and ProcessParentScale all lead to nearly identical blocks with function calls, except that a couple of offsets are different. I'm not sure what's up with that.

    Anyways, the short story here is that with a switch case, Burst is indexing a static array of offsets which either invoke functions or are inlined completely. So it is pretty similar to the array of function pointers in that regard.
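
    As a toy model of that lookup, the arithmetic can be written out in plain C#. The base address and offsets below are made-up values for illustration only; JumpTableModel is a hypothetical name, and nothing here reflects the actual addresses Burst emits.

```csharp
public static class JumpTableModel
{
    // Hypothetical address of the table label, chosen only for illustration.
    public const long TableBase = 0x1000;

    // 32-bit signed offsets relative to the table, one per switch value.
    // Several entries may share a target, as with the repeated default case.
    static readonly int[] Offsets = { 0x40, -0x20, 0x80, 0x80 };

    // Mirrors the assembly: load the 32-bit entry at table + 4*index,
    // sign-extend it to 64 bits, add the table address, then jump there.
    public static long Target(int index) => TableBase + Offsets[index];
}
```

    The cost of selecting a case is one indexed load and one add, independent of how many cases the switch has, provided the compiler chooses a jump table for it.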
     
  12. PhilSA

    wow awesome, thanks for that explanation!
     
  13. charleshendry

    Would a BitField32 be a sensible approach to define multiple states?
     
  14. PhilSA

    I think once we figure out how a regular state machine should be implemented, this problem of multiple parallel states can be solved by simply declaring multiple independent state machines. A bit like "layers" of a mecanim animation controller. Or sometimes, a state machine that lives inside another state

    For example:
    • There would be a MovementStateMachine for Walk, Run, Falling, RopeSwing, etc...
    • There would be an ActionStateMachine for None, Attacking, CastingSpell, etc...
    • "Status effects" that can stack up, like Stunned, Frozen, etc... would just be bools or components added to the character entity
    • Inside the "Attacking" state, there could be a nested state machine for the 3 parts of an attack: Preparation, Attack, Recovery
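
    The layering described in this list could be sketched in plain C# along these lines. All type and member names are hypothetical, and the attack progression is reduced to a single AdvanceAttack call per phase purely for illustration.

```csharp
using System;

public enum MovementState : byte { Walk, Run, Falling, RopeSwing }
public enum ActionState : byte { None, Attacking, CastingSpell }
public enum AttackPhase : byte { Preparation, Attack, Recovery }

[Flags]
public enum StatusEffects : byte { None = 0, Stunned = 1, Frozen = 2 }

// Each field is an independent "layer"; changing one never touches the others.
public struct CharacterStates
{
    public MovementState Movement;
    public ActionState Action;
    public AttackPhase Phase;      // only meaningful while Action == Attacking
    public StatusEffects Status;   // stackable effects, not a state machine
}

public static class ActionMachine
{
    public static void StartAttack(ref CharacterStates s)
    {
        s.Action = ActionState.Attacking;
        s.Phase = AttackPhase.Preparation;
    }

    // Advances the nested attack machine and leaves the outer state when it finishes.
    public static void AdvanceAttack(ref CharacterStates s)
    {
        if (s.Action != ActionState.Attacking) return;
        if (s.Phase == AttackPhase.Recovery) s.Action = ActionState.None;
        else s.Phase++;
    }
}
```

    A character can then be Falling and Attacking at once without needing a combined FallingAttack state.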

    It definitely isn't an exact science, though, and I think there will always be aspects of this separation into states that'll feel not totally perfect. I think the most "pure" solution to a problem like this would be that instead of state machines, our character's state would be a composite of potentially hundreds of very granular bools and values, such as "hasJumpHandling", "currentMoveSpeed", "currentMoveFunction", "currentRotationMethod", "detectCollisions", "isAttacking", "attackingPhase", "canAttack", etc... and there would be just one single character update function that would take all of these factors into account. But I think the problem with this would be that the complexity and cognitive load of it would quickly become unmanageable for human programmers; hence why we end up trying to simplify (or "over-simplify"?) the problem to "states"
     
    Last edited: Nov 25, 2020
  15. davenirline

    I'm still waiting for enable/disable components. It should be able to help here.
     
  16. Kolyasisan

    I'm still inclined to believe that DOD and FSM don't get along well at all, not with Unity's ECS at least.

    Consider a hypothetical situation: you have a platformer game and states can decide to transition to some other states on certain conditions. But let's say that it's an incredibly fast-paced game, like Sonic Mania times 2, running at 30 fps, so to keep it all accurate, the state machine needs to be run in multiple steps (let's say that your character has a climbing ability: if it moves too great a distance in one step, sometimes it won't be able to grab ledges).

    And this is where DOD in Unity's ECS falls apart tremendously.
    -States can change on a whim. If one system = one state processing, then situations like "do state A -> do state B -> do state A, all in one frame" are not possible (if your character hits some spikes and goes into the hurt state on the next frame then it's wrong in this scenario).
    -If you decide to fix this by running all of the systems several times, then you'll need to consider situations where there are several characters with different speeds. And no, adding tag components to character entities that should no longer update during the frame is not a solution due to performance.
    -Fixed Update-like behaviour is a hack, doesn't resolve the previous issue.
    -If you decide to move all of it to enums, then it's much more boilerplate for you. Not cool. And you shouldn't use enums in the first place for generic FSMs.

    That's why I'll never stop advocating for bringing some basic monobehaviour-like functionality to ECS. You can't do everything in pure DOD, some OOP is necessary. For games with smaller scales that require great precision/steps (fighting games, fast-paced platforming games), it'll be a godsend. You can already do it with Unity ECS to some degree, but GetComponentData not working with subclasses/superclasses and the necessity to generate IEquatable are a big hassle.

    If you need insane performance for a ton of entities, you'll naturally design a brain-dead simple FSM that won't require such stuff. It's like with mass-scale NPC crowds whose animations are baked into textures and their behaviours kept simple. For such stuff, ECS is great, just like for rendering, particles, culling, physics, etc... For everything else regarding gameplay, getting away without some abstract functionality is, imo, not possible to do elegantly.

    For a ton of gameplay elements, it's not about simply processing data, but deciding how to process it, and that's why the common way of doing slightly more complex FSM with OOP will always, imo, be better in virtually every aspect except performance (and even *that* is debatable in a lot of scenarios).
     
    Last edited: Nov 26, 2020
  17. PhilSA

    This is a pretty spot-on description of the sort of scenario that I've been facing, and what led me away from using states-as-components approaches. Especially cases where an OnStateExit of a current state and an OnStateEnter of a next state must happen in the same frame

    However, I do have a fully functional implementation right now that solves this problem. Using states-as-enums, I can fit my entire character update in one job, and I can call state-dependent logic of any number of different states as many times as necessary in a frame. The caveat is, of course, boilerplate. But if this use case gets the same kind of attention as Entities.ForEach in terms of using codegen to make it easier, I think it could be made much simpler. And potentially it could be flexible enough to handle most cases where you'd need some kind of basic level of polymorphism. The implementation wouldn't be super cache-friendly, but then again, neither is OOP

    But I'm definitely interested in the possibility of Unity adding OOP-like behaviour to DOTS if possible. I have to say I'm not sure what that would look like in terms of making the safety of it guaranteed (I don't know enough to have an opinion on it), but I'd be curious. Or maybe it can simply be restricted to non-multithreaded, unless you intentionally disable safeties. It would at the very least give users an easy-yet-suboptimal fix to these sorts of situations, while still allowing them to benefit from DOTS for the rest of the game
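
    A minimal sketch of how an enum-based machine can run an exit and an enter, or several chained transitions, within a single update. This is not the implementation described above, only an illustration under assumptions: HeroState, Hero and HeroMachine are hypothetical names, and the managed List<string> log exists purely to make the call order visible (it would not be usable in a Burst job).

```csharp
using System.Collections.Generic;

public enum HeroState { Running, Hurt, Idle }

public struct Hero
{
    public HeroState State;
    public bool TouchingSpikes;
    public float HurtTimer;
}

public static class HeroMachine
{
    const int MaxTransitionsPerUpdate = 4;   // arbitrary guard against transition loops

    // Returns the state the hero should be in next, or the current one to stay.
    static HeroState Evaluate(Hero h)
    {
        switch (h.State)
        {
            case HeroState.Running: return h.TouchingSpikes ? HeroState.Hurt : HeroState.Running;
            case HeroState.Hurt:    return h.HurtTimer <= 0f ? HeroState.Idle : HeroState.Hurt;
            default:                return h.State;
        }
    }

    static void OnExit(ref Hero h, List<string> log) => log.Add("exit " + h.State);

    static void OnEnter(ref Hero h, List<string> log)
    {
        log.Add("enter " + h.State);
        if (h.State == HeroState.Hurt) { h.HurtTimer = 0.5f; h.TouchingSpikes = false; }
    }

    // Keeps transitioning until the state settles, so exit and enter run in the same update.
    public static void Update(ref Hero h, List<string> log)
    {
        for (int i = 0; i < MaxTransitionsPerUpdate; i++)
        {
            HeroState next = Evaluate(h);
            if (next == h.State) break;
            OnExit(ref h, log);
            h.State = next;
            OnEnter(ref h, log);
        }
    }
}
```

    Because everything happens inside one function, the number of transitions per frame is bounded only by the guard constant, not by how many systems are scheduled.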
     
    Last edited: Nov 26, 2020
  18. Joachim_Ante

    Unity Technologies

    What prevents you from writing MonoBehaviour style code inside of an Entities.ForEach in DOTS today?

    GetComponent / SetComponent allows access to any other referenced component.

    It would be great if you can write down where you think writing MonoBehaviour style code in DOTS breaks down for you with a concrete code example.
     
  19. Kolyasisan

    Personally, for me it's 3 things.

    First, the necessity to implement the IEquatable functionality for a class component. As stated in the docs (https://docs.unity3d.com/Packages/com.unity.entities@0.16/manual/component_data.html), "Managed IComponentData must implement the IEquatable<T> interface and override for Object.GetHashCode()". Each new class will require it and further modifications will force us to go back and re-generate those functions. It's overall a hassle, MonoBehaviours and ScriptableObjects didn't need this.

    Second: pretty slow access to struct components. An FSM state that lives in a class component, for example, may want to change the Translation component, and for that we need to call GetComponentData, change the struct and then call SetComponentData. GetComponentData and SetComponentData copy entire structs and have quite a few safety checks to boot. I don't think there is any faster way of modifying components in the unmodified Unity ECS package. Maybe the ability to modify structs directly through pointers would help?

    Third, the most crucial one: GetComponentData doesn't work with inheritance/polymorphism. If I have a base class StateBase and a class, say, RunningState that inherits from it, GetComponentData<StateBase>(entity) won't return an instance of RunningState (as StateBase), while GetComponentData<RunningState>(entity) will work (provided, of course, that the entity "has" it).

    Please correct me if I'm wrong about any of those.
     
    Last edited: Nov 27, 2020
  20. PhilSA

    I'd say that's mostly it for me as well

    Although, just to clarify, I see "OOP in DOTS" as a bit of an unsatisfactory plan B in the context of this thread and I am much more interested in figuring out better pure DOTS implementations of FSMs

    In my particular case, since the logic of my state updates involves lots of costly physics queries, I definitely want this multithreaded & bursted
     
    Last edited: Nov 28, 2020
  21. Krajca

    There might not be a better pure DOTS implementation of some things. I wonder why you won't use a hybrid approach?
    Do you want lots of physics queries or other calculations? Do them in DOTS and feed results to hybrid script to figure out the next state using polymorphism and stuff.
     
  22. PhilSA

    The type of physics queries we do must depend on what state we are in, and the queries happen at various points in-between various steps of the update of a state. Sometimes there is even a for loop that involves physics queries, and at each iteration of the loop, we must do state-specific logic based on the results of the physics query of that iteration, which then determines the parameters that must be given to the physics query of the following iteration. Doing this with a hybrid approach would be technically possible, but it would involve so many jobs & sync points that it would just feel unmanageable to me, from a QoL point of view

    The least I can say is that I would still highly prefer my current switch case approach. The ability to call state-specific logic at any point inside any job gives me a lot of reassurance about the flexibility and future-proofness of this approach. I just know that I'm unlikely to reach a point where I'll think "oh crap, my architecture wouldn't properly support X new feature that I would've wanted to implement", and then have to do a huge refactor late in production

    But I'll keep thinking about a hybrid approach, maybe there could be a way to make this work that I'm not seeing currently
     
    Last edited: Nov 27, 2020
  23. thelebaron

    How appropriate or inappropriate is DataFlowGraph for tackling this?
     
  24. DreamingImLatios

    I don't think there's ever a good reason to transition from a working pure solution to a hybrid solution. I can think of a few ways to get more polymorphic Burst-compatible components, but they require changes to Entities. Probably the easiest change would be exposing CleanupEntity in the EntityQueryOptions. This would open the door for extendible states without having to hack an enum. Although the boilerplate for that would still be ugly without codegen, so I won't elaborate further.

    DataFlowGraph has a major flaw with regards to sync points. Unless there's some mechanism added to the job system to allow for populating a job batch with a predefined dependency from another job (AKA scheduling jobs from a job), it is really difficult to use in practice without a significant performance hit. Trust me, I have tried to make it work. But it does not scale and is really only useful for authored graphs.
     
  25. UsmanMemon

    There is a pattern I use to avoid calling GetComponentData many times, in this case for "Translation":

    Make a component called TranslationValueUpdateComponent, like so:

    Code (CSharp):
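    using Unity.Entities;
    using Unity.Mathematics;

    // Pending operation to apply to an entity's Translation.
    public struct TranslationValueUpdateComponent : IComponentData
    {
        public float3 Value;                  // operand of the operation
        public ValueOperation ValueOperation; // how Value is combined with Translation
    }

    public enum ValueOperation
    {
        None,
        Add,
        Sub,
        Div,
        Mul
    }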
    and attach it to an entity. Then, from a job, I set this value for the entity using an ECB, and once the ECB has played back, a system applies the operation to the Translation component. I still have to wait for the ECB and the system to execute, though. To avoid that wait even more, add more specific ValueOperations, e.g. "MultiplyByLocalToWorldMatrixAndAddThisValueAndTakeSquareRootAndDoThisAndThis..................AndConvertItBackToLocalSpaceAndApplyBack". I know this has a bigger cost than one SetAndGetComponent, but for many entities performance will add up.
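
    The "system that applies the operation" step might look roughly like this in plain C#. This is a sketch under assumptions: System.Numerics.Vector3 stands in for float3 so the example runs outside Unity, and ValueUpdate and ValueUpdateSystem are hypothetical names.

```csharp
using System.Numerics;

public enum ValueOperation { None, Add, Sub, Div, Mul }

// Stand-in for the update component; Vector3 replaces float3 here.
public struct ValueUpdate
{
    public Vector3 Value;
    public ValueOperation Operation;
}

public static class ValueUpdateSystem
{
    // Applies the recorded operation component-wise and returns the new translation.
    public static Vector3 Apply(Vector3 translation, ValueUpdate update)
    {
        switch (update.Operation)
        {
            case ValueOperation.Add: return translation + update.Value;
            case ValueOperation.Sub: return translation - update.Value;
            case ValueOperation.Mul: return translation * update.Value;
            case ValueOperation.Div: return translation / update.Value;
            default:                 return translation;   // None: leave unchanged
        }
    }
}
```

    After applying an update, the system would presumably reset Operation to None so the same operation is not applied again on the next frame.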
     
    Last edited: Dec 7, 2020
  26. UsmanMemon

    I hope this feature never becomes available, since using it carelessly might, IMO, reduce performance by a lot (not sure).
    This feature should become easy to implement though, so less-experienced users can't use it and say performance is bad. Or add an automatic defragmentation at the end of the frame to tightly pack the entities into their respective chunks again.
     
    Last edited: Dec 7, 2020