Official Introducing the DOTS Best Practices guide (0.16, 0.17)

Discussion in 'Entity Component System' started by SteveM_Unity, Feb 9, 2021.

Thread Status:
Not open for further replies.
  1. jasons-novaleaf

    jasons-novaleaf

    Joined:
    Sep 13, 2012
    Posts:
    181
    If you say so, OtherMonarch...
     
  2. TheOtherMonarch

    TheOtherMonarch

    Joined:
    Jul 28, 2012
    Posts:
    866
    My point was that New Math is clearly easier: it does not require knowing subtraction, multiplication, etc.
     
    Last edited: Feb 25, 2021
    MINORLIFE likes this.
  3. I never ever thought I would witness this video on a coding forum. Ever.
     
  4. Micz84

    Micz84

    Joined:
    Jul 21, 2012
    Posts:
    451
    Yep, but it is also slower because it requires many more steps ;)
     
    MNNoxMortem likes this.
  5. calabi

    calabi

    Joined:
    Oct 29, 2009
    Posts:
    232
    One thing I'm curious about is the best way to use the same data in several places when you can't use classes. I know you use structs; for instance, I used a GetNeighbour struct to do just that in several places, but I'm not sure I'm using it correctly. What other ways are there?
     
  6. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    I'm sorry, I don't really understand the question. What's a GetNeighbour struct, and what do you mean by "use the same data in several places"?
     
  7. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    Thank you so much for this talk! I'm only at the beginning of part 2 (watching ALL the videos) and have already learned so much about CPUs and caches! I had programmed a bit in ECS before (blockout, nothing fancy) but never with a real understanding of how I should 'think' about my code. And I would never have thought to look at the hardware, so thank you thank you thank you!

    M̶y̶ ̶l̶a̶t̶e̶s̶t̶ ̶s̶h̶o̶c̶k̶e̶r̶ ̶w̶a̶s̶ ̶h̶o̶w̶ ̶y̶o̶u̶ ̶s̶h̶o̶u̶l̶d̶ ̶n̶e̶v̶e̶r̶ ̶u̶s̶e̶ ̶a̶ ̶b̶o̶o̶l̶ ̶i̶n̶s̶i̶d̶e̶ ̶y̶o̶u̶r̶ ̶I̶C̶o̶m̶p̶o̶n̶e̶n̶t̶D̶a̶t̶a̶,̶ ̶s̶i̶n̶c̶e̶ ̶t̶h̶e̶ ̶s̶a̶m̶e̶ ̶c̶a̶n̶ ̶b̶e̶ ̶a̶c̶h̶i̶e̶v̶e̶d̶ ̶w̶i̶t̶h̶ ̶m̶o̶r̶e̶ ̶I̶C̶o̶m̶p̶o̶n̶e̶n̶t̶D̶a̶t̶a̶,̶ ̶a̶n̶d̶ ̶s̶i̶n̶c̶e̶ ̶a̶n̶ ̶i̶f̶ ̶s̶t̶a̶t̶e̶m̶e̶n̶t̶ ̶i̶s̶ ̶t̶h̶e̶ ̶q̶u̶i̶c̶k̶e̶s̶t̶ ̶w̶a̶y̶ ̶t̶o̶ ̶m̶a̶k̶e̶ ̶y̶o̶u̶r̶ ̶c̶o̶d̶e̶ ̶u̶n̶-̶v̶e̶c̶t̶o̶r̶i̶z̶a̶b̶l̶e̶. It's little details like that, along with all the understanding about RAM being slow as bricks, that makes me truly appreciate this amazing resource!

    EDIT:
    The author was kind enough to reply and explain, nay, prove that this is use-case dependent (and in most use cases bools are probably the right call). You may scroll down to read the full discussion, and read this post to find out how you can vectorize your code even when working with bools.
     
    Last edited: Apr 1, 2021
    SteveM_Unity likes this.
  8. AdrielCodeops

    AdrielCodeops

    Joined:
    Jul 25, 2017
    Posts:
    54
    How is that easy? Is it easy to memorize such a huge procedure instead of understanding some basic concepts?
    Am I missing a joke?
     
    amisner2k and JediNizar like this.
  9. jasons-novaleaf

    jasons-novaleaf

    Joined:
    Sep 13, 2012
    Posts:
    181
    it did seem like a troll to me.
     
    JediNizar likes this.
  10. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    The general consensus is that people are far quicker at addition than subtraction (scientific fact). Its relevance, however, escapes me as well.
     
  11. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,776
    In England's primary schools they teach these methods. I don't know if that's the case everywhere, though. I'm not fond of that method, to be honest. But there is nothing to be memorised there, just adding until you reach the value. It is simple in execution, but may take longer to get the answer. However, it's probably easier for large numbers when trying to do math in your head.
     
  12. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    Hah, never say never! I'm beginning to think that calling this guide "Best Practices" has given people the impression that there's always One Best Way to do things, but I tried to be careful in the guide to explain that many things are situational and dependent on the specific problem you're trying to solve with your data and its transformations.

    A bool in an IComponentData might be bad for vectorization, but removing the bool and using a tag component and queries instead might be worse if it results in a lot of chunk fragmentation or frequent structural changes. I'm told that there are plans to address some of this in a future version of the Entities package, but for now (well, in general) you have to consider your expected data access patterns and performance requirements before making a decision.

    I'm glad you're finding the guide useful, though. :)
     
  13. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    I wanna poke at this a little bit though: even if you had 5000 entities you wanted to call “add tag” on and 5000 to call “remove tag”, you can group these into 1 or 2 entity command buffers.

    https://docs.unity3d.com/Packages/com.unity.entities@0.17/manual/sync_points.html

    Grouped into such sync points (whatever they are), shouldn’t that address those concerns?

    (btw sorry if my terminology is abysmal, I’m still learning. The link should explain what I’m trying to convey though.)
     
    Last edited: Mar 30, 2021
  14. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    It depends a lot on how often you'd want to add or remove those tags, and exactly how you'd do it. Presumably you wouldn't already have an EntityQuery which encompasses all of the entities you want to add the tag to (if you did have such a query, you wouldn't need the tag), so your best option if using an ECB would be adding/removing the tags in an Entities.ForEach().ScheduleParallel(). As the table in section 3.4.2 of the guide shows, on my machine adding a tag component for a million Entities this way would take about 203ms, so roughly scaling that down to 5000 Entities (203ms x 5000 / 1,000,000) you might be looking at something like 1ms. Double those numbers if you want to remove a tag on the same frame. Maybe that's acceptable to you, or maybe that's an enormous amount of time - it depends a lot on your project, target platform/device, your use-case and your performance requirements.
     
    Haneferd likes this.
  15. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    Wow, I haven’t seen this list yet (just finished step 2) but it gets me excited for what’s about to come!

    May I ask how you got the timestamps? Did you use a simple stopwatch? It might be a stupid question, but I’m curious how you time these things since asynchronous and all that, do you simply call Complete() in your update? (Removed afterwards of course).

    After looking long and hard at your table and thinking about if I could beat it, I realized the following. Say you have a Multiplier component that multiplies a number by 2 until it reaches 1,000,000 and then starts dividing until it reaches below 1. Even if you did this with an ECB you’d still need to make that if-check inside your job, killing the vectorization anyway ^_^’

    Also interesting to know that this approach potentially ‘could’ work if you ‘had’ an entity query and the vectorization gain was big enough.

    Thank you for being so helpful on this! Scott Meyers had somewhere in his presentation a very brief comment stating “bools are wrong” and I took it as gospel, but the data is king!
     
  16. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    I just put a ProfilerMarker begin/end around the code that makes the structural change, run the code (which just repeatedly adds a component to 1 million entities on one frame and removes it again the next frame, using whichever method I want to test), grab a full set of data in the Unity Profiler and then aggregate it together in Profile Analyzer. I don't call Complete(), but I do add an AddJobHandleForProducer() for the EndSimulationEntityCommandBufferSystem, which in a simple test project with nothing else going on is kind of the same thing.

    Your Multiplier example is a bit different to what we're discussing with tag components because multiplying a number isn't a structural change. Something as simple as you describe should be blazingly fast even with a fairly naive implementation, and there are plenty of tricks to make it faster if you still need to. For example, you could split the checks into a separate job that performs the check and sets a number to multiply by (either 2 or 0.5 depending on whether we're going up or down) and then a second job that just multiplies numbers together. Burst should vectorize that pretty well :)
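
    A minimal sketch of the two-pass split described above, in plain C# with arrays standing in for component data. The names TwoPassMultiplier, ChooseFactors and ApplyFactors are hypothetical, no Unity API is used, and whether a given compiler actually vectorizes the second loop is an assumption to verify in your own project:

```csharp
// Hypothetical illustration: plain arrays stand in for ECS component data.
public static class TwoPassMultiplier
{
    // Pass 1: decide the direction for each element and store the factor to apply.
    // goingUp[i] persists between frames, so a value keeps doubling until it
    // reaches 1,000,000 and then keeps halving until it drops below 1.
    public static void ChooseFactors(float[] values, bool[] goingUp, float[] factors)
    {
        for (int i = 0; i < values.Length; i++)
        {
            if (values[i] < 1f) goingUp[i] = true;
            else if (values[i] >= 1000000f) goingUp[i] = false;
            factors[i] = goingUp[i] ? 2f : 0.5f;
        }
    }

    // Pass 2: a multiply over contiguous data with no checks in the loop body.
    public static void ApplyFactors(float[] values, float[] factors)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] *= factors[i];
        }
    }
}
```

    Calling ChooseFactors and then ApplyFactors once per frame reproduces the up-then-down behaviour; all the decision logic lives in the first pass, so the second pass is a plain element-wise multiply.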
     
    Haneferd likes this.
  17. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    Alright, I think I have found the holy grail I was looking for: https://m.youtube.com/watch?v=bVJ-mWWL7cE

    Completely vectorizable code using boolean fields.

    Code (CSharp):
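    // Keep doubling until value exceeds 1,000,000, then keep halving until it drops below 1.
    float value;
    bool multiply;

    multiply = (value < 1 || (multiply && value <= 1000000));
    // C# has no bool-to-int cast, so convert explicitly: the factor is 2 when multiply is true, 0.5 otherwise.
    value *= (0.5f + 1.5f * (multiply ? 1 : 0));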
    WHY WAS I NEVER TAUGHT TO THINK LIKE THIS IN SCHOOL?! It’s just such perfection and completely parallelizable. It's :eek:, it's :D, it's just :cool:
     
    Last edited: Apr 4, 2021
  18. DrBoum

    DrBoum

    Joined:
    Apr 19, 2020
    Posts:
    26
    If you want to avoid booleans (and branches), you can use a union struct in C# without resorting to unsafe code. With a little help from mathematics, a branchless (and therefore vectorizable) sign method would look like the following:

    Code (CSharp):
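    using System.Runtime.InteropServices;

    // Overlays a bool and a byte at the same offset so the bool can be read back as 0 or 1.
    [StructLayout(LayoutKind.Explicit)]
    public struct BoolToByte {
        [FieldOffset(0)]
        public bool condition;

        [FieldOffset(0)]
        public byte tmp;
    }

    // Writes +1 to result when number >= 0 and -1 otherwise, without branching.
    public static void BranchlessSign(in float number, out float result, ref BoolToByte boolToByte)
    {
        boolToByte.condition = number >= 0;
        // tmp is 1 or 0, so (tmp - 0.5) * 2 maps to +1 or -1.
        result = (boolToByte.tmp - 0.5f) * 2;
    }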
     
    Last edited: Apr 3, 2021
    Kmsxkuse, CaseyHofland and Shinyclef like this.
  19. JediNizar

    JediNizar

    Joined:
    Nov 13, 2016
    Posts:
    111
    Sorry to bring back old off Topic..
    But apart from that this "new math" is dumb as hell.. It fits pretty well on a coding forum. It's just a regular if/else condition with a for loop.

    Imagine that in your code, instead of doing c = a - b, you ran a for loop until a reaches c, etc...
    Really performant code :D
     
  20. PhilSA

    PhilSA

    Joined:
    Jul 11, 2013
    Posts:
    1,926
    The main sort of challenge I keep facing in DOTS is how to implement polymorphic-type behaviour in a way that seems both efficient and non-tedious, and I do see tons of people starting threads about it pretty often. I think providing insight/examples for these things should be considered a priority, since there are countless common scenarios in games where you'd face these kinds of problems: AI, state machines, event systems, ability systems, inventory systems, weapons, etc...

    For example:

    1- Ability system
    Imagine a game like Diablo where you have equipped abilities (fireball, heal, etc...) and you can launch those abilities by pressing buttons. And imagine there could be 100s of different Abilities.

    One solution would be to have one entity per instance of an ability. "Equipped" abilities are remembered with their Entity ref. Each ability has its own system/component which looks for either a bool on a common "Ability" component, or a tag component to see if it should launch the ability. But, would 100s of systems each scheduling their jobs for abilities be too much? I suppose chances are only a small fraction of them would actually schedule their jobs every frame, since not all possible abilities will be fired at the same time, but what about the cost of just 100s of systems seeing if their entityQuery finds any match every frame?

    Alternatively, we could go with an approach where we assign an int ID to each ability, remember the IDs of the abilities we have equipped, and then when we want to launch an ability, we do a switch case over all possible ability IDs to find the right LaunchAbilityX() function. Again, would a switch case over 100s of cases be efficient (especially if there are many actors using abilities constantly), compared to some kind of polymorphism equivalent? Can you avoid having to pass the data of all possible skills in the game to the LaunchAbility() function? Etc...
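
    As a rough sketch of the ID-plus-switch approach, assuming just two abilities and plain structs standing in for component data (AbilityId, ActorState, AbilityDispatch and the mana/health numbers are all hypothetical, and nothing here is Unity-specific):

```csharp
// Hypothetical ability IDs; a real game would have many more.
public enum AbilityId { Fireball = 0, Heal = 1 }

// Hypothetical per-actor data, standing in for component data.
public struct ActorState
{
    public int Health;
    public int Mana;
}

public static class AbilityDispatch
{
    // One switch selects the ability-specific code by ID.
    // Returns false when the ID is unknown or the caster lacks mana.
    public static bool Launch(AbilityId id, ref ActorState caster, ref ActorState target)
    {
        switch (id)
        {
            case AbilityId.Fireball:
                return LaunchFireball(ref caster, ref target);
            case AbilityId.Heal:
                return LaunchHeal(ref caster);
            default:
                return false;
        }
    }

    static bool LaunchFireball(ref ActorState caster, ref ActorState target)
    {
        if (caster.Mana < 10) return false;
        caster.Mana -= 10;
        target.Health -= 25;
        return true;
    }

    static bool LaunchHeal(ref ActorState caster)
    {
        if (caster.Mana < 5) return false;
        caster.Mana -= 5;
        caster.Health += 20;
        return true;
    }
}
```

    Each LaunchX function only receives the data it needs, but Launch itself still has to accept the union of everything any ability might touch, which is the concern raised above.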

    And then there is the function pointer approach, but the doc says we can't pass any native collections to it unless we pass the whole job struct instead. But then what if we need to be able to launch abilities from all kinds of different jobs? We need a new type of delegate for each type of job?

    It would be great to have some benchmarks regarding the performance of different approaches at different scales. Or see if someone can think of better alternatives

    2- Character movement state machine
    Imagine a state machine for a character where states can be for example Walk, Sprint, InAir, Crouched, LedgeHang, Climbing, etc... The states would have an OnStateExit, OnStateEnter, OnUpdate, OnPrePhysicsUpdate, OnPostPhysicsUpdate, OnPostAnimationUpdate, etc...

    This poses similar challenges to the Ability System example above, but also has extra challenges. You will often want stateA's OnStateExit(), stateB's OnStateEnter(), stateB's OnUpdate(), etc... to happen all in the same frame, and in that specific order. That means that if you go with the approach of "each state has its system(s)", you'll end up with about 6 systems per state; one per function of a state. In a prototype game I made recently, I had about 25 of such states, and that means I would need 150 systems (6x25) for my character states with an approach like this. Feels like too much.

    I ended up going with the switch case approach instead. But with the switch case approach, we have to do the switch case over all 25 possible states 4-6 times per frame per character. (one time for each function of a state). Is that too heavy? Is it too tedious to maintain all those switch cases? Could codegen help? etc...

    And finally, in all of these situations there would be the "do something with regular classes on the main thread in a system's Update" approach. But in the case of the character state machine, being constrained to the main thread is a very heavy price to pay. The character states will almost always contain expensive physics queries done every frame, and this is the sort of thing that would hugely benefit from multithreading. It would also cause tons of sync points since these updates must happen at all kinds of different points in the frame.

    I'm ready to accept the reality that maybe there simply isn't a magic easy solution to these problems, but I'm asking anyway just in case someone more experienced can think of something better than the solutions I described. Also, I'm not even asking for a one-size-fits-all solution here; even when trying to find specific solutions to specific scenarios, I'm still having difficulties
     
    Last edited: Apr 28, 2021
    MINORLIFE, Haneferd, pcg and 2 others like this.
  21. desertGhost_

    desertGhost_

    Joined:
    Apr 12, 2018
    Posts:
    260
    For my ability / character state machine I just use individual entities. Each state has its own entity and each ability has its own entity. Abilities are explicitly tied to states. I have a series of entities that are defined for each state that the character can be in. Each state defines which abilities can be used. These states include things like grounded movement, falling, jumping, in vehicle, swimming, climbing, interacting with another entity, etc.

    When this state is activated I add an active state tag and an initialized state tag (this allows for cleanup / initialization for a state). Only states with these tags are processed by systems.

    Each ability is defined by its own entity and has a tag added to it when its owner state is active. The ability has a boolean that defines if it is actively in use. This limits the number of abilities that a system has to check for being active, but also keeps memory swapping to a minimum. Key bindings / NPC inputs are mapped to an input bitmask and each state has a buffer defining which ability is used for each bit being set.

    This is by no means a perfect solution, but it simplifies state / ability management and keeps system overhead down. A big disadvantage of this approach is that due to memory fragmentation between different entities some data needs to be either copied or accessed / updated via GetComponent / SetComponent. This inevitably leads to having to manage the synchronization of data or prevents the use of ScheduleParallel in systems that need to write data across entities.

    A visual state machine system would make for a huge improvement (at least for authoring and debugging purposes).

    I would also like to hear what Unity thinks the best approach for this would be.
     
    Haneferd likes this.
  22. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    Either way, it is good to note that having 6 systems for the StateMachine isn't bad code. It would have been bad in a hierarchical design, but we're past that.

    I have by no means experience with pointers whatsoever so this may be complete rubbish, but if I'm thinking of solutions off the top of my head then I'm thinking of a ComponentData with a pointer to a state and an array of transition callbacks. Transition fired = pointer changed and there's no need to add and remove components like crazy. But again, no idea if this works and no idea how to implement it.
     
    Last edited: Apr 30, 2021
  23. TheOtherMonarch

    TheOtherMonarch

    Joined:
    Jul 28, 2012
    Posts:
    866
    Basically, polymorphism as used in OOP is not possible. You can kind of replicate it by breaking your data into smaller components and using enum metadata. This works because encapsulation is removed.
     
    Last edited: Apr 29, 2021
  24. PhilSA

    PhilSA

    Joined:
    Jul 11, 2013
    Posts:
    1,926
    Yeah, I think I should've mentioned that I'm not necessarily looking for a solution that looks like polymorphism in terms of usage, but rather *any* solution that solves the problem while being efficient & not tedious to set up.

    For the 2 specific examples I've given, I have yet to think of a satisfactory solution, aside from trying to convince myself to just do it on the main thread with classes & managed components (I need to keep reminding myself that it's at least not worse than what we'd get in OOP MonoBehaviour, performance-wise. Especially with Burst now being main-thread compatible). Maybe there is such a thing as a specific problem where the best approach is OOP, which means OOP would actually qualify as "data-oriented" in that case.

    But... since I have very little multithreaded programming experience outside of DOTS, I kinda often wonder how these problems are typically solved in regular non-ECS multithreading. And I wonder if those approaches could be used with the job system by disabling certain safeties?
     
    Last edited: Apr 29, 2021
  25. TieSKey

    TieSKey

    Joined:
    Apr 14, 2011
    Posts:
    225
    Sorry if u already went through this but:
    Do you already know you will be doing something that ECS can actually speed up?
    Are you sure u need that performance?

    OOP, as far from perfect as anything is, exists for a reason. ECS isn't a de facto superior choice for any game.
     
    MINORLIFE likes this.
  26. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    In terms of CPU utilization and memory usage they are though. Yes, even state machines. Just different state machines.

    I was curious, so I looked up whether any data-oriented state machines have been written in the past. Unfortunately, I could only find this. It has no pictures, nor even much boilerplate code.
    https://www.dataorienteddesign.com/dodmain/node8.html

    But the approach is quite interesting, and most certainly "different". If someone makes it work one day, I'd love to see it in action.
     
    desertGhost_ likes this.
  27. TheOtherMonarch

    TheOtherMonarch

    Joined:
    Jul 28, 2012
    Posts:
    866
    That way does not make a lot of sense. You would need to be adding and removing IComponentData. Like I said, and as that link also mentions, you can get the same results by maintaining “state variables” / type-field enum metadata. It is not dissimilar from how custom-protocol low-level serialization works to get around reflection.

    Just like your custom protocol serialization knows what to serialize to bytes better than a generic solution would, because you know the length of your data, here you should also know the set of states of your finite state machine. You are not going to get a generic finite state machine. But you can definitely have a custom finite state machine.
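
    A minimal sketch of such a custom state machine, assuming three movement states and a plain struct in place of an IComponentData (MoveState, CharacterFsm, CharacterFsmLogic and the speed values are hypothetical names and numbers chosen for illustration):

```csharp
using System.Collections.Generic;

// The full set of states is known up front, so a small enum is enough.
public enum MoveState : byte { Walk, Sprint, InAir }

// Hypothetical per-character data; the enum field is the "state variable".
public struct CharacterFsm
{
    public MoveState State;
    public float Speed;
}

public static class CharacterFsmLogic
{
    // Changes state by writing the enum field; no components are added or removed.
    // Exit logic runs before enter logic, in the same call.
    public static void Transition(ref CharacterFsm c, MoveState next, List<string> log)
    {
        if (c.State == next) return;
        log.Add("Exit " + c.State);
        c.State = next;
        log.Add("Enter " + c.State);
        c.Speed = SpeedFor(next);
    }

    // Per-frame update: switch on the state variable and return the new position.
    public static float Update(CharacterFsm c, float position, float dt)
    {
        switch (c.State)
        {
            case MoveState.Walk:
            case MoveState.Sprint:
                return position + c.Speed * dt;
            default:
                return position; // InAir: no ground movement in this sketch
        }
    }

    static float SpeedFor(MoveState s)
    {
        switch (s)
        {
            case MoveState.Walk: return 2f;
            case MoveState.Sprint: return 6f;
            default: return 0f;
        }
    }
}
```

    The log parameter exists only to make the exit/enter ordering visible; a real implementation would run the state-specific code at those two points instead.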
     
    Last edited: Apr 30, 2021
  28. TieSKey

    TieSKey

    Joined:
    Apr 14, 2011
    Posts:
    225
    Well, no, that's the point. Nothing in engineering is an automatic superior choice to anything else. There's always a tradeoff between several variables.

    Unless u have several hundreds of certain things (characters, bullets, platforms, whatever) following the same logic and updating each frame, any potential benefit of ECS will be negligible.
     
  29. desertGhost_

    desertGhost_

    Joined:
    Apr 12, 2018
    Posts:
    260
    I mostly agree with the caveat that Burst can make a measurable performance difference in some instances where you have very few entities (i.e. a multiple step rollback on a few entities over many systems each frame).
     
  30. CaseyHofland

    CaseyHofland

    Joined:
    Mar 18, 2016
    Posts:
    613
    That’s my point. ECS scales much better, thus I consider it superior. That’s not to say there aren’t other priorities to consider like deadlines and necessity, but from a pure CPU-utilization-and-memory-usage standpoint it can’t be beat: performance literally scales linearly with your project.

    Regarding this part of the problem, it might be too much all at once. It’s easier to reason just about OnUpdate than about everything at once.

    I also had thoughts on a good implementation of ISharedComponentData. Our game may have 1000 orcs all with the same FSM graph: this graph could be an ISharedComponentData, with all its transitions, states and possibly even layers laid out, much like the AnimatorController.
     
  31. Micz84

    Micz84

    Joined:
    Jul 21, 2012
    Posts:
    451
    You do not have to have several hundred things to benefit from Burst and jobs. I am currently playing with creating my own influence map system, where there can be several types of influence maps. Each unit can add an influence of some types to those maps. Currently my stress test paints on 64 maps of 256x256. Each map has 3 types of influence, and there are 512 units per map painting influences of size 16x16. It takes about 20 milliseconds to calculate all of this. So there are more than 90k influences painted over 192 maps. Imagine you have a smaller map with just 100 units: you will spend a small fraction of those 20 ms calculating the influence maps, and you will have plenty of time for some sophisticated logic that uses data from those maps.
     
  32. SenseEater

    SenseEater

    Joined:
    Nov 28, 2014
    Posts:
    84
    512 Units x 64 Maps x 3 Types = 98,304 Influences => ~90K as you said?
    I don't understand where the 192 maps you mention came from. Am I missing something in the math? Just curious..
     
  33. Micz84

    Micz84

    Joined:
    Jul 21, 2012
    Posts:
    451
    Maybe I didn't write it clearly enough. There are 64 tiles of 256x256. Each of those tiles has 3 influence maps (64 x 3 = 192 maps). Right now it does nothing but splat a random influence of size 16x16 512 times for each influence map. But the speed is insane, and I am using only jobs and Burst; it could probably be optimised further using full DOTS.
     
  34. SenseEater

    SenseEater

    Joined:
    Nov 28, 2014
    Posts:
    84
    I believe that TieSKey's statement that you need 'many of something' to benefit from ECS is technically valid.

    'Many' is basically a problem of scale.

    Looking at the core of DOTS as of today, I see the following:

    Burst: performance gains from SIMD + native optimizations.
    Jobs: performance gains from multi-threading; useful if you've got a problem worth solving at scale.
    ECS: performance gains from leveraging cache coherence. Also architectural advantages if you are in for it.

    For a 1-to-1 comparison between OOP (MB, GO) and ECS, if a problem is not one of scale, I seriously doubt there are any significant performance gains to be had.

    @Micz84 do mind that what you have shared is a problem of scale, solved using Burst and Jobs as you said yourself, which are really not the same as the ECS that @TieSKey was talking about.
     
    Last edited: May 1, 2021
    MINORLIFE and eatbuckshot like this.
  35. Micz84

    Micz84

    Joined:
    Jul 21, 2012
    Posts:
    451
    As I said, this is just a stress test of jobs that write to maps. But to make it usable in a real application, ECS will be helpful for assigning units to the correct tile maps and detecting units that influence other maps. I know that my test is an example of usage at large scale, but my point is that even with a much smaller unit count you will benefit from calculating it very fast. For example, 4 tiles with 10 influence maps per tile and just 30 units per tile could be calculated in a fraction of a millisecond, so you have a lot of time for the actual AI logic. And an influence may be anything, e.g. fire on the map. Of course, there are some game types that benefit from ECS less than others.
     
  36. TieSKey

    TieSKey

    Joined:
    Apr 14, 2011
    Posts:
    225
    Again, no, that's unfortunately not true. Try making some game with only 100 entities but 999 systems that need to query for different things each frame and add/remove components or check state on all 100 entities cuz your problem (game) is a perfect fit for inheritance and polymorphism, but not so much for an ECS paradigm. Guess what, it will probably perform worse in terms of CPU/memory than OOP.

    A bus can scale better in terms of fuel/person but if your bus only carries 4 people, a car will be multiple times better.

    Burst is a tool, it grants a performance burst (thus the name) where u need it if u can compromise and fulfil its restrictions.
    Jobs are a tool, they allow for some limited MT (wow, we finally reached the 21st century). Really good if your problem (game) can benefit from it.
    ECS is not a tool, it is more of a paradigm: it means doing things in a completely different way than OOP. As a paradigm it is good for some things and bad for others, but u can't simply change or stop using it midway. Choosing the right paradigm for your particular game is fundamental (and unfortunately something a lot of companies overlook, just following trends/fashions).

    That being said there's nothing wrong with using an ECS paradigm only on part of a game. Personally I'm using HR to render the terrain for my game (thousands of blocks) while everything else remains in OOP.
     
  37. desertGhost_

    desertGhost_

    Joined:
    Apr 12, 2018
    Posts:
    260
    The nice thing about Unity ECS is that you can have class component data and hybrid monobehaviour components. These class component data and hybrid monobehaviour components let you program in OOP where needed, but still use DOD where it is beneficial. You can use Unity / Havok Physics with multithreaded queries and script AI / state logic using an OOP design.

    Doing more work (even with better data layout and code compilation / optimization) is still doing more work. Unless that code runs significantly faster (it is really well optimized and has few to no cache misses) it will be slower than simply doing less work. The key to good performance with DOTS is designing for data. If you simply don't have a lot of data to process then you don't need to approach that particular problem with DOD.

    That being said, I have found that my state machine systems processing 100 owner entities (each of these entities had 10 or more state entities linked to them) using ECS and Entities.ForEach loops with .Run() (bursted, no multithreading) were still able to see some level of speedup compared to a comparable GameObject and MonoBehaviour solution.
     
    AmarIbr, CaseyHofland and JesOb like this.
  38. Goldseeker

    Goldseeker

    Joined:
    Apr 2, 2013
    Posts:
    42
    Sorry for the necro, but I always thought that polymorphic behaviour is the best solution for a wide variety of problems, and I thought this is as good a place as any to write up the following approach, which is ECS friendly, albeit maybe not Burst or cache friendly.

    So imagine we have an Ability entity, a System that "casts" the equipped ability when the user provides input (and to increase the complexity of the example, let's say that every ability is "cast" with the same common code before the ability-specific code and some common code after it), and a wide variety of functions that actually execute the ability cast based on each individual ability's type and data.

    So the approach without polymorphism would be to create a PreAbilityCastSystem that would place a tag or toggle a boolean on the ability to trigger the appropriate specific ability system (ProjectileCastSystem, ConeCastSystem, InstantCastSystem and many more), and then finish up with a PostAbilityCastSystem to run the common code.
    I find it VERY annoying. First of all, we have practically a guarantee that any of the specific cast systems will run on a very small subset of entities, and most of the time they will not need to run at all, as casting happens relatively rarely. Moreover, from a code organisation standpoint it makes reasoning about the system as a whole very hard, as the logic is spread across 2 common classes and an undetermined number of specific systems. Each ability subtype will need a unique tag component, or at least a combination of components, to be used only by the intended specific system. I can go on and on about how clunky and non-performant it is, but let's move on.

    The next option is to have one big AbilityCastSystem and, when the time comes, just switch by type (type id, whatever index you have) and execute the specific code as a private method of the system. If you ask me it's a far better approach: there are no systems that almost never run, the logic is contained in one place, and there is no entity modification just for the sake of calling a function (which effectively happened in the previous example when we added a tag to trigger the specific cast system).
    But if we look at the switch-by-type statement we will surely notice that it is exactly the problem that OOP polymorphism is designed to solve. At this point there are two ways to implement polymorphism while still staying within the ECS framework. The first one is simple: carry with your ability component a reference to a class implementing an IAbilityOperations interface, with a method whose signature looks something like void Cast(EntityManager, Entity ability, Entity caster/* etc */), and just call it from AbilityCastSystem; it will have all the access to your ECS world it needs to correctly execute the specific ability cast logic. It is important to note here that this class instance can be shared by all the abilities of the same "type" in relation to this operation. It is also important to note that those classes should not have any internal state and should be used only as a way to get polymorphic behaviour, not to store any data related to implementing this behaviour; a very important part of ECS design is that all the state is stored within entities, and this principle is worth preserving. If you're not able to use references, as in Unity ECS, you can instead have what is basically a manual virtual function call table: your Ability component can contain an int typeid that can be used as an index to find a function in an array of functions.
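
    A small sketch of that manual function table, in plain C# (AbilityData, AbilityTable, CastFn and the two cast functions are hypothetical names). It uses ordinary managed delegates, which would not be Burst-compatible, so it only illustrates the lookup itself:

```csharp
using System;

// Hypothetical stand-in for per-ability component data.
public struct AbilityData
{
    public int TypeId; // index into the function table
    public int Power;
}

public static class AbilityTable
{
    // Signature shared by every ability-specific cast function.
    public delegate int CastFn(AbilityData ability, int targetHealth);

    // The "manual vtable": TypeId selects the entry. The functions hold no state.
    static readonly CastFn[] Table = { CastDamage, CastHeal };

    // Returns the target's health after the cast.
    public static int Cast(AbilityData ability, int targetHealth)
    {
        // Common pre-cast code would go here.
        if ((uint)ability.TypeId >= (uint)Table.Length)
            throw new ArgumentOutOfRangeException(nameof(ability), "Unknown ability TypeId");

        int result = Table[ability.TypeId](ability, targetHealth);

        // Common post-cast code would go here.
        return result;
    }

    static int CastDamage(AbilityData ability, int targetHealth)
    {
        return targetHealth - ability.Power;
    }

    static int CastHeal(AbilityData ability, int targetHealth)
    {
        return targetHealth + ability.Power;
    }
}
```

    Adding a new ability type means appending one function to the table rather than editing a switch, at the cost of an indirect call per cast.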

    As to this thread as a whole, I think @TieSKey has a lot of great points, and selecting the right paradigm for the job is very important, so that we don't get carried away by slogans but make informed decisions.
     
    Last edited: Aug 8, 2021
    spacepluk likes this.
  39. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    Any attempt to recreate polymorphism in ECS will come down to hacks/workarounds which, at best, perform no better than OOP. The examples you listed here have different issues depending on which one we're talking about.

    Using booleans or bitfields is, as you say, kind of annoying, and also means you have poor cache line utilisation and no real chance to take advantage of SIMD optimizations.

    A big switch statement can be okay from a performance standpoint, but does make the code look messy - as you say, it's a bit like trying to implement OOP features in something like standard C.

    Your solution involving class components with references would mean that those jobs can only be run on the main thread and can't be Burst compiled, plus the references would mean a lot of cache misses. I have seen (and implemented) similar ideas that use generics in the systems, which run into similar problems.

    The "recommended" approach in the documentation might be to use tag components so that each ability-specific system is only processing archetypes that are relevant to that ability, but that potentially introduces chunk fragmentation, which may or may not be a problem.

    If I was looking at an ability system like the one described, my first thought would be to try to find a way to do it without polymorphism. Even in OOP I prefer composition to polymorphism for solving a lot of problems, but that's a personal preference that comes from having dealt with several codebases that descended into class hierarchy hell.

    If polymorphism is absolutely the best way to solve a given problem, perhaps the best approach is to accept that that system will always be main thread bound and not Burst compilable, and perhaps it should be just written as standard OOP code if that's the case. There's no rule that says that every single part of your codebase has to use ECS, and sometimes it's best to mix and match.
     
  40. Goldseeker

    Goldseeker

    Joined:
    Apr 2, 2013
    Posts:
    42
    The idea here is to find a standard way to use OOP-like features when they are needed and are the best tools for the job, without taking an additional performance hit compared to just using OOP everywhere, while still staying within the ECS framework to use the good parts of it. Performance is not the core issue here; coherence, readability and maintainability of the code base are.

    It cannot be free, can it? We'd still have to run all the specific systems every update, just to check that there is nothing to process. Get enough of the specific systems and it will be more performant to just have a regular OOP-style call when something actually happens, instead of maintaining queries and taking the hit of changing the entity's structure (by adding a tag component) when an ability needs to be cast.


    There are definitely ways to do it without polymorphism, but in most use cases (when using an ability is a relatively rare thing) they will be harder to maintain and most likely will not be more performant, as there are usually not enough opportunities to use SIMD in casting one ability. But if there are, isn't it theoretically possible to Burst compile a static function that doesn't use references internally, and just invoke the Burst-compiled function? Again, I want to point out that for this approach to work we only need polymorphic behaviour, not polymorphic data (ECS and components can handle that).

    As to composition vs polymorphism:
    I too like to use aggregation in favor of inheritance, but it's not because of the polymorphism. It's because it produces more maintainable code, with more freedom for the child (or aggregator, in the case of composition) to implement the required functionality (ahem, interface -> polymorphism) in a cleaner way, at the small price of a little more boilerplate in some cases.

    Composition is not exactly opposed to polymorphism. In the end, polymorphism is just syntactic sugar on top of function lookup, and it has lots of uses. The way I describe it, you basically only look up the reference to a static function, and all the data is within ECS.

    It makes it sound like there is no MT in OOP land, but there is. It's harder to guarantee correctness in that case. Still, in specific cases, as long as the problem itself is parallelizable, one can both use OOP and utilize multiple cores.


    Finding a good pattern to mix and match in a not convoluted way (no offence, but current monobehaviour and ECS interop is anything but easy to understand) is very important to making bigger projects using ECS.

    I personally dislike how Unity ECS is currently built very deeply into the Unity core. It's hard to have it as an aggregate of an OOP framework and call update and other stuff on it manually; instead it insists on existing as a parallel subsystem.
     
  41. davenirline

    davenirline

    Joined:
    Jul 7, 2010
    Posts:
    987
    I have some sort of a dilemma when you want to use some kind of OOP. If your data is in ECS, the way to access data is to use EntityManager.GetComponent()/SetComponent(), which are not performant. Because of this, it's already sort of discouraged to use OOP when data resides in ECS. What I'm trying to say is that Burst and Jobs can be used in OOP code with no problems, but ECS data used with OOP is just bad in terms of performance. The other way around is untenable. What's your take on this?
     
  42. JesOb

    JesOb

    Joined:
    Sep 3, 2012
    Posts:
    1,109
    MINORLIFE and SteveM_Unity like this.
  43. Goldseeker

    Goldseeker

    Joined:
    Apr 2, 2013
    Posts:
    42
    Oh, that's a good one. I'll test my approaches against this one performance-wise. It is basically the switch-case approach from my post, but with great syntactic sugar on top of it.
     
  44. xVergilx

    xVergilx

    Joined:
    Dec 22, 2014
    Posts:
    3,296
    Not really OOP advice, but rather hybrid ECS one:

    Access these methods / data when your jobs aren't running. Create a special group somewhere, and place your main-thread groups there, at a point when no jobs are executing. GetComponent would create a "sync point" only if you're accessing data that has some kind of job dependency running on that type of data, which is the main source of performance loss.

    SetComponent is a bit trickier, but it can still be used quite fast via EntityCommandBuffers without stalling the running jobs. That is, if your design allows data changes to not be instant. Bonus points for Playback being burstable, so all structural changes occur at once.

    The main key to the solution here is to make a custom EntityCommandBufferSystem that runs before simulation, and to access the buffer generated by that system to perform structural changes from the MonoBehaviour side instead of using EntityManager directly.

    In both of these cases a custom update manager for MonoBehaviours helps immensely,
    as it can be converted to a SystemBase and simply ordered wherever you wish.


    I've been using this kind of approach to manage hybrid connections from MonoBehaviours to Entities without any conversion for quite some time (authoring entities from MonoBehaviours), and it works quite well from a performance perspective. It sure won't beat the conversion flow, but the ECB's bursted playback is blazing fast.
     
    Last edited: Aug 9, 2021
    davenirline likes this.
  45. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    Fair enough. The Best Practice Guide doesn't really cover these sorts of cases because it's already pretty long, and because a big part of my focus with it was to try to wean OOP developers without prior experience of DOD off trying to view all problems/solutions through an OOP lens. I have no problem with OOP (used it very happily for most of my career, and still do for most situations) but I've seen too many DOTS projects run into problems by trying to mix the approaches without fully understanding what they're trying to achieve in DOD in the first place.

    That said, I'd be interested to see some code examples of your suggestions, to see how the code gets put together and where the performance bottlenecks come from. I think I understand the approaches you're describing (class references in components, or components that index into an array of function pointers), but the devil is in the details.

    Systems checking to see whether there are any entities which match their queries is... Not free, but certainly very, very cheap. Cheap enough that all of Unity's profiling tools round the cost for each system that uses an empty query down to 0.00ms. The cost comes when the archetype changes actually happen, because that's when the housekeeping is performed that makes the queries so fast. So in practical terms you might see a performance hit from running thousands of systems with empty queries, but it's likely that you'd experience bigger problems with memory usage from chunk fragmentation, long build times, or poorly-performing structural changes before you got to that point.

    Sure, there's multithreading - I oversimplified there, apologies. But if you're using the job system to do it, you can't create scheduled jobs that (for example) process components that contain a class reference, so that would have to somehow be separated out to ensure you're only processing blittable data. And whilst you could theoretically use a ThreadPool or something to run multithreaded code, that tends not to play particularly nicely with the job system worker threads, unfortunately. It's possible to run MT OOP code in Unity, but it's not easy to get it right, and in practical terms I have only seen multithreaded C# scripts in Unity projects very, very rarely.
     
  46. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    This looks interesting. I wasn't aware of these, but I want to try to find some time to look at them.
     
  47. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    In my mobile DOTS project, it's very expensive on low/mid-range mobile hardware. I know that struct-based systems will improve this issue, but I still have a lot of systems that need to be class-based systems. Is there any plan to improve the performance of class-based systems too, for example by code-generating better-performing system query and execution code? I have used another 3rd-party ECS solution before. It also uses class-type systems, but it's much faster than the official Unity class-based systems.
     
  48. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    I'm not part of the DOTS R&D team or product management team, so I'm not the person to ask about what might be coming in the future.

    That said, I'm surprised that using class components would make EntityQuery slower than using standard components. How many systems do you have running empty queries, and how long do the empty queries take?
     
  49. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    Oh, I think you misunderstood what I just said. I mean class-based systems, not class-based components. Anyway, I believe struct-based components and class-based components should get almost the same EntityQuery speed. I also believe that EntityQuery is implemented in a generic way, which makes it very slow. Replacing the generic implementation with code-generated, better-performing EntityQuery code should improve performance significantly. By empty queries, do you mean a system that queries no components, meaning one with no EntityQuery and no Entities.ForEach?
     
  50. SteveM_Unity

    SteveM_Unity

    Unity Technologies

    Joined:
    Nov 21, 2017
    Posts:
    41
    I'm not sure what you mean here. SystemBase is a class. All Unity ECS systems are classes. What is the problem, exactly?

    Again, I don't understand. Can you give an example of a query that runs slowly, and tell me how long it takes to run on your target device?

    I mean systems that contain an EntityQuery which don't find any Entities which match it. For example, a system which does Entities.WithAll<Foo, Bar, Baz>().ForEach() ... If there are no entities on a particular frame that have all 3 components (Foo, Bar and Baz) then that system should detect that and not run its OnUpdate() method.
     
    MINORLIFE likes this.