ECS Memory Layout

Discussion in 'Data Oriented Technology Stack' started by jooleanlogic, May 18, 2018.

  1. jooleanlogic

    Joined:
    Mar 1, 2018
    Posts:
    327
    I've read the "ECS features in detail" section of the documentation and want to check whether my understanding of the data layout for entities and components is correct.

    Chunks
    Data is stored per entity archetype in 16 KB chunks.
    A chunk is arranged in component streams: all of component A, followed by all of component B, and so on.

    Is the chunk split up on creation such that the space for all component streams is already reserved? Like so:
    [A, A, A, A, A, A][B, B, B, B, B, B]
    even if there's only one Entity? When you add an Entity, you just copy each component's data straight to its relative index position. This is pretty neat, as allocating n entities of an archetype is virtually a no-op.
    Or do you compact the streams such that they occupy the memory like so
    [A, A][B, B]
    for two entities. If you add an Entity to this structure, you then have to shift the later component streams along in memory to get [A, A, A][B, B, B]. I can't imagine this would work anyway, as it would involve re-indexing all the entities.

    Entities
    All entities are stored in a single EntityData struct array. Entity.index is the index into this array, and EntityData provides a direct address to the entity's components. Is an Entity struct also stored in the chunk so it can refer back to the entities array? Is this what EntityArray is generated from?
    As a user can store Entity, am I right in assuming that the items in the entities array never change position? If you add 1000 entities and remove the first 999, that last entity is still going to be at the 1000th index?

    Archetypes
    If you add a new component to an Entity, it moves that Entity from its current chunk to a new chunk matching the new archetype. So there'll be a chunk of memory for every possible archetype. If the user doesn't specify a full archetype ahead of time, Unity will create one on demand along with a chunk for it. So adding a unique component to one Entity creates a new chunk just for that one entity.

    ComponentDataArray
    When we access the components via ComponentDataArray, are these direct pointers to the chunk data or are components copied into temp storage at the start of a system and back again at the end?
    Looking at the source code for ComponentDataArray, the iterator jumps from chunk to chunk instead of being contiguous so I'd assume they're direct pointers.

    SharedComponentData (SCD)
    An SCD is part of the archetype and each unique (by value, not type) instance of an SCD requires its own chunk. So an entity archetype will be split over as many chunks as there are unique SCDs.
    The SCDs are stored in their own type arrays somewhere, not in the archetype chunk; the chunk just contains an index into that array.

    Filtering
    Does the SCD include metadata with references back to the chunks that match it?
    So filtering on an SCD should be super quick depending on the Entity to unique SCD ratio.
    If you had 1000 entities split up into units of 100 by SCD, then the filter would just search the 10 items in the SCD array and from that, can directly locate the relevant archetype chunks?

    If I'm right on most of the above, then this structure is pretty damn awesome and I can see why creating and processing thousands of entities is so fast. I think I cleared up a lot of my own misunderstanding in the process of writing this out. Unless I'm completely wrong. :)
     
  2. Sgrueling

    Joined:
    Nov 7, 2016
    Posts:
    13
    Chunks
    All components are preallocated. No need to move the components in a chunk.

    Entities
    Yes, every chunk stores all of its entities, and yes, that's where EntityArray comes from. To a system they look much like any other component.

    Archetypes
    Yes, that's correct.

    ComponentDataArray
    Direct pointer into the chunk.

    I haven't looked deeply into SCD and filtering so far, but that is correct afaik.
     
  3. jooleanlogic

    Joined:
    Mar 1, 2018
    Posts:
    327
    Thanks Sgrueling
     
  4. Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    4,649
    We know upfront, based on the components in the archetype, how many entities fit into one chunk. Adding an entity therefore simply involves writing to the end of the stream until the chunk is full, at which point we allocate or reuse another chunk and fill that up.

    Yes. In fact we have essentially an Entity as the 0 component. This is what EntityArray is using internally.
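    To make the "capacity is known upfront" point concrete, here is a minimal C# sketch of how a chunk could be carved into streams, with the Entity as stream 0. It is not Unity's implementation: ChunkLayoutSketch and its members are hypothetical names, the Entity is assumed to be 8 bytes (two ints), and the chunk header and alignment padding that a real chunk needs are ignored.

```csharp
using System;

// Hypothetical helper, not Unity's code. Assumes the whole 16 KB is usable
// (no chunk header) and that streams need no alignment padding.
public static class ChunkLayoutSketch
{
    public const int ChunkSizeBytes = 16 * 1024;
    public const int EntitySizeBytes = 8; // assumed: int Index + int Version

    // How many entities fit when every entity needs one slot in every stream.
    public static int Capacity(int[] componentSizes)
    {
        int bytesPerEntity = EntitySizeBytes;
        foreach (int size in componentSizes) bytesPerEntity += size;
        return ChunkSizeBytes / bytesPerEntity;
    }

    // Byte offset at which each stream starts. offsets[0] is the Entity
    // stream, offsets[i + 1] is the stream for componentSizes[i].
    public static int[] StreamOffsets(int[] componentSizes)
    {
        int capacity = Capacity(componentSizes);
        var offsets = new int[componentSizes.Length + 1];
        int cursor = capacity * EntitySizeBytes; // offsets[0] stays 0
        for (int i = 0; i < componentSizes.Length; i++)
        {
            offsets[i + 1] = cursor;
            cursor += capacity * componentSizes[i];
        }
        return offsets;
    }

    // Byte offset inside the chunk of one element of one stream.
    public static int ElementOffset(int[] componentSizes, int streamIndex, int indexInChunk)
    {
        int elementSize = streamIndex == 0 ? EntitySizeBytes : componentSizes[streamIndex - 1];
        return StreamOffsets(componentSizes)[streamIndex] + indexInChunk * elementSize;
    }
}
```

    For an archetype with two 12-byte components this gives 32 bytes per entity, so 512 entities per chunk, with the three streams starting at byte offsets 0, 4096 and 10240. Adding an entity is then just a write at ElementOffset(sizes, stream, count) for each stream.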

    Correct. In the future we will probably experiment with allocating smaller chunks first and, depending on how many chunks are in play for one archetype, switching to larger chunk sizes. But so far it has not been a problem.

    They are direct pointers. Data is contiguous within one chunk; when the index moves past the end of the current chunk, we calculate the next chunk's base address and continue iterating from there until we hit the end of that chunk.

    IJobProcessComponentData is the most incredible iterator, since it's literally the same speed as working with pointers directly. You could have two components with floats and copy a to b, and it would result in the same code as a batched memcpy.

    SCDs have their own manager with a freelist array of shared component data, plus hashtables to quickly look them up by value.
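    As a rough illustration of "a freelist array plus a hashtable keyed by value", here is a minimal sketch. SharedValueStoreSketch is a hypothetical class, not Unity's shared component manager, and the reference counting used to decide when a slot goes back on the freelist is an assumption of this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical store: one slot per unique value, looked up by value,
// with freed slots recycled through a freelist.
public sealed class SharedValueStoreSketch<T> where T : struct, IEquatable<T>
{
    private readonly List<T> values = new List<T>();
    private readonly List<int> refCounts = new List<int>();      // assumption, see lead-in
    private readonly Stack<int> freeSlots = new Stack<int>();    // the freelist
    private readonly Dictionary<T, int> indexByValue = new Dictionary<T, int>();

    // Returns the slot index for value, sharing the slot of an equal value.
    public int Add(T value)
    {
        if (indexByValue.TryGetValue(value, out int index))
        {
            refCounts[index]++;
            return index;
        }
        if (freeSlots.Count > 0)
        {
            index = freeSlots.Pop();
            values[index] = value;
            refCounts[index] = 1;
        }
        else
        {
            index = values.Count;
            values.Add(value);
            refCounts.Add(1);
        }
        indexByValue[value] = index;
        return index;
    }

    // Drops one reference; the slot returns to the freelist at zero.
    public void Release(int index)
    {
        if (refCounts[index] <= 0) throw new InvalidOperationException("slot is already free");
        if (--refCounts[index] == 0)
        {
            indexByValue.Remove(values[index]);
            freeSlots.Push(index);
        }
    }

    public T Get(int index) => values[index];
}
```

    Adding the same value twice returns the same index, which is the int a chunk would store; releasing the last reference makes the slot available for the next new value.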

    SetFilter internally just resolves to an int index. And each chunk internally has an array of shared component indices used by this chunk. As a result filtering is insanely cheap and done per chunk. There is no cost per entity when using filtering.

    It also means that SCD are close to zero memory cost on a per entity basis (Just one int per chunk), since there is literally zero data for each entity.
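    A minimal sketch of why filtering has no per-entity cost, assuming a simplified chunk that carries a single shared value index (as described above, a real chunk holds an array of them). ChunkFilterSketch and its members are hypothetical names, not Unity API.

```csharp
using System;
using System.Collections.Generic;

public static class ChunkFilterSketch
{
    // Simplified chunk: only the fields the filter needs.
    public sealed class Chunk
    {
        public int SharedValueIndex; // one int per chunk, nothing per entity
        public int EntityCount;
    }

    // Compares one int per chunk; the entity data itself is never read.
    public static List<Chunk> Filter(IEnumerable<Chunk> chunks, int wantedSharedValueIndex)
    {
        var matches = new List<Chunk>();
        foreach (Chunk chunk in chunks)
        {
            if (chunk.SharedValueIndex == wantedSharedValueIndex)
                matches.Add(chunk);
        }
        return matches;
    }

    // Total number of entities in a set of chunks.
    public static int CountEntities(IEnumerable<Chunk> chunks)
    {
        int total = 0;
        foreach (Chunk chunk in chunks) total += chunk.EntityCount;
        return total;
    }
}
```

    The work done by Filter grows with the number of chunks, not with the number of entities inside them.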
     
  5. sebas77

    Joined:
    Nov 4, 2011
    Posts:
    864
    Where can I learn more about these concepts? Something I can't figure out about Unity ECS is how it internally stores the filtered arrays and how it keeps the data between arrays in sync (intuitively, I would guess that data is duplicated between arrays sharing the same components).
     
  6. jooleanlogic

    Joined:
    Mar 1, 2018
    Posts:
    327
    Not sure what you mean by keeping the arrays in sync? There's no duplication of data under the hood.
    The docs I linked to in my first post do describe the architecture, but I didn't really understand it until I went through some source code. I'll try to explain, to the best of my knowledge, how the component data is stored and iterated, and then how filtering is just a simple extension of that process.
    It's very clever in its simplicity how they've done it in my view.

    ComponentDataArray Iteration
    Data is stored primarily by its archetype. Say you have three entities with just the components Position and Heading. That's one archetype. The three entities will be stored in the same chunk of memory like so:
    Code (CSharp):
    ArchetypeA Chunk (Position, Heading)
    [ [Entity, Entity, Entity, ..n][Position, Position, Position, ..n][Heading, Heading, Heading, ..n] ]
    where n is the total number of entities a single chunk (16 KB) can store.
    If you have another two entities that also contain a Movement component, then they are stored in a different chunk of memory as it's a different Archetype.
    Code (CSharp):
    ArchetypeB Chunk (Position, Heading, Movement)
    [ [Entity, Entity, ..n][Position, Position, ..n][Heading, Heading, ..n][Movement, Movement, ..n] ]
    In your System, let's say you request a group for just Position and Heading like so:
    Code (CSharp):
    struct Group{
        // Fields are public so the system can read group.Length, group.position, etc.
        public int Length;
        public EntityArray entity;
        public ComponentDataArray<Position> position;
        public ComponentDataArray<Heading> heading;
    }
    [Inject] Group group;
    This group matches entities in both archetypes. Three from ArchetypeA and two from ArchetypeB.
    group.Length will equal 5.
    ComponentDataArray is an iterator, so group.position[n] is actually a function call. When you go
    group.position[0..n], it results in the following.
    Code (CSharp):
    position[0] = ArchetypeA.chunk[0].position[0];
    position[1] = ArchetypeA.chunk[0].position[1];
    position[2] = ArchetypeA.chunk[0].position[2];
    position[3] = ArchetypeB.chunk[0].position[0]; // Note the change here
    position[4] = ArchetypeB.chunk[0].position[1];
    Internally, ComponentDataArray[] iterates the archetypes and chunks based on the array index passed in. Were you dealing with thousands of entities, then you'd see ArchetypeA.chunk[0..n].position[0..n] before it got to ArchetypeB.
    To us, it just looks like one contiguous array.
    That's how it steps you through all the entities that match a particular component group. It iterates by Archetype, then chunk, then Component array, which is what it says in the docs. There's no alignment or syncing required.
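    The index mapping walked through above can be sketched in a few lines. This is only an illustration of the idea, not the ComponentDataArray source: GroupIndexSketch and Resolve are hypothetical names, and the matching chunks are reduced to a list of entity counts.

```csharp
using System;

public static class GroupIndexSketch
{
    // chunkCounts[i] is the number of entities in the i-th matching chunk,
    // listed archetype by archetype. Returns which chunk a group-wide index
    // lands in and the index inside that chunk.
    public static (int Chunk, int IndexInChunk) Resolve(int[] chunkCounts, int index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        int firstIndexInChunk = 0;
        for (int chunk = 0; chunk < chunkCounts.Length; chunk++)
        {
            if (index < firstIndexInChunk + chunkCounts[chunk])
                return (chunk, index - firstIndexInChunk);
            firstIndexInChunk += chunkCounts[chunk];
        }
        throw new ArgumentOutOfRangeException(nameof(index));
    }
}
```

    With counts { 3, 2 } (three entities in the ArchetypeA chunk, two in the ArchetypeB chunk), index 3 resolves to chunk 1, element 0, which matches the "note the change here" line in the listing above.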

    Filtering
    I haven't looked through the SharedComponentData filter side of the source code yet but I think it works like this, in principle at least.
    If we add a SharedComponentData to ArchetypeA above, then for each unique value of that SharedComponentData, you get a new chunk.
    Let's say we add a MySharedComponent with a value of 1 to the first two entities and a MySharedComponent with a value of 2 to the third one; it would break the ArchetypeA chunk into two chunks like so.
    Code (CSharp):
    ArchetypeA (Position, Heading, MySharedComponent=1)
    [ [Entity, Entity, ..n][Position, Position, ..n][Heading, Heading, ..n] SharedComponentIndex ] // 2 entities in this chunk

    ArchetypeA (Position, Heading, MySharedComponent=2)
    [ [Entity, ..n][Position, ..n][Heading, ..n] SharedComponentIndex ] // 1 entity in here
    (As Joachim said, all it stores for the SharedComponentData in each chunk is a single index value used to look up the single MySharedComponent for that chunk)
    So the data is already stored in filtered chunks by virtue of having a SharedComponent, whether you filter it or not in the ComponentSystem. When you go:
    ComponentGroup.SetFilter(MySharedComponent=1)
    it doesn't reorder any data. It just applies a few extra steps to the first process outlined above so that you only get chunks that match the filter. Your filtered data is already stored in contiguous chunks of memory.
    Pretty awesome.
    Hopefully Joachim can correct me if I'm wrong on any of the above.
    I don't know what effect the job system has on the above process. Presumably a parallel-for job could even assign individual chunks to threads, but I have no knowledge of that side.

    Ideally someone with better graphic skills, or even Unity, will put together an image of the ECS memory architecture and add it to the docs, as I think it would greatly help everyone with their initial understanding of how ECS works. The architecture is so clean that a single graphic could explain it all at a glance.
     
  7. sebas77

    Joined:
    Nov 4, 2011
    Posts:
    864
    @jooleanlogic thanks a lot for the time you spent on this post! It makes things much clearer. I will need to spend a bit more time on it, because I want to understand better how this archetype structure can affect the cache in the worst cases (or rather, how to avoid breaking cache locality), but otherwise it makes more sense to me now!
     
  8. jooleanlogic

    Joined:
    Mar 1, 2018
    Posts:
    327
    No probs sebas. Writing this stuff out helps me understand it better myself. There are still a lot of details I'm unsure of.

    If you're around, Joachim, could you explain how index maps to m_Cache.CachedPtr below in ComponentDataArray?
    Code (CSharp):
    public T this[int index]{
        get{
            ...
            return UnsafeUtility.ReadArrayElement<T>(m_Cache.CachedPtr, index);
        }
    }
    ReadArrayElement just takes a void* and an index, and I presume it just returns the T at that address offset, but I don't see where in the above function index gets clamped to the begin/end range of the cached chunk. Am I missing it, or is my understanding of this part wrong?

    If I go componentDataArray[10000] and that index is somewhere in chunk 5 of ArchetypeB, how is that index of 10000 mapped to m_Cache.CachedPtr? Does m_Cache.CachedPtr not point to the start of the Component array in the chunk? Wouldn't index overflow?
     
  9. jooleanlogic

    Joined:
    Mar 1, 2018
    Posts:
    327
    Ah never mind. I should've just stepped through the source code first.
    m_Cache.CachedPtr is negatively offset so that CachedPtr + index gives the correct address of the element in the current chunk.
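    A small model of that trick, for anyone following along. It is a sketch under assumptions, not the real source: ShiftedBaseSketch and all of its members are hypothetical, a float[] stands in for raw memory, and an int offset (which may go negative) stands in for the shifted pointer.

```csharp
using System;

// Hypothetical model: "memory" stands in for raw memory and each chunk is a
// (start, count) slice of it. None of these names come from Unity's source.
public sealed class ShiftedBaseSketch
{
    private readonly float[] memory;
    private readonly int[] chunkStarts;
    private readonly int[] chunkCounts;

    private int cachedBegin;       // first group-wide index served by the cache
    private int cachedEnd;         // one past the last served index (0 == empty)
    private int cachedShiftedBase; // chunk start minus cachedBegin, may be negative

    public int CacheRefreshes { get; private set; }

    public ShiftedBaseSketch(float[] memory, int[] chunkStarts, int[] chunkCounts)
    {
        if (chunkStarts.Length != chunkCounts.Length)
            throw new ArgumentException("one start and one count per chunk");
        this.memory = memory;
        this.chunkStarts = chunkStarts;
        this.chunkCounts = chunkCounts;
    }

    public float this[int index]
    {
        get
        {
            // Only leave the fast path when the index falls outside the cached chunk.
            if (index < cachedBegin || index >= cachedEnd) UpdateCache(index);
            // The group-wide index is used directly; the shift is baked into the base.
            return memory[cachedShiftedBase + index];
        }
    }

    private void UpdateCache(int index)
    {
        if (index < 0) throw new IndexOutOfRangeException();
        int begin = 0;
        for (int c = 0; c < chunkStarts.Length; c++)
        {
            int end = begin + chunkCounts[c];
            if (index < end)
            {
                cachedBegin = begin;
                cachedEnd = end;
                cachedShiftedBase = chunkStarts[c] - begin;
                CacheRefreshes++;
                return;
            }
            begin = end;
        }
        throw new IndexOutOfRangeException();
    }
}
```

    If the second chunk starts at offset 2 and serves indices 3 and 4, its shifted base is 2 - 3 = -1, so index 3 reads memory[-1 + 3] = memory[2]. Iterating all five indices in order refreshes the cache only twice, once per chunk.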
     
  10. Mrb83

    Joined:
    May 12, 2013
    Posts:
    8
    Given the following (pseudo, hypothetical) code:

    Code (CSharp):
    using System;
    using Unity.Entities;

    // Pseudo, hypothetical code: the ".." and "..." elisions are kept from the original post.
    public struct MySharedComponent : ISharedComponentData // shared components are structs
    {
      public enum PossibleValues {
        One,
        Two,
        Three,
        ...,
        Ten
      }
      public PossibleValues Value;
    }
    class MySystem : ComponentSystem
    {
      protected override void OnCreateManager(int capacity)
      {
        var values = Enum.GetValues(typeof(MySharedComponent.PossibleValues));
        foreach (MySharedComponent.PossibleValues v in values)
        {
          var entityManager = ..;
          var entity = entityManager.CreateEntity(typeof(MySharedComponent));
          // Shared component data is set through the EntityManager, not on the Entity itself.
          entityManager.SetSharedComponentData(entity, new MySharedComponent {Value = v});
        }
      }
      protected override void OnUpdate() { }
    }
    what memory layout would I end up with?

    In my current understanding, each shared component instance with a unique value will result in a 16kb chunk allocation. In the example above I create 10 entities each with a shared component with a different value. Does this result in 10 * 16 = 160kb of memory being allocated for just 10 entities with a unique MySharedComponent each?

    If so, does Unity plan to optimize this later with variable chunk sizes? Not all data types require big values of n, but it's still useful having them in the ECS data structure.

    I am implementing an event system where an event is represented as a single entity with an Event shared component on it. If there are 100 possible different Event values, I don't want to end up with 100*16kb of memory allocated.

    Thanks!
     
  11. simonm_unity

    Unity Technologies

    Joined:
    Mar 21, 2018
    Posts:
    13
    Yes, you would end up with 10 mostly empty chunks; that's a consequence of how shared components work. Supporting variable chunk sizes is something we are looking into.
    It's very convenient and potentially fast to query for entities with specific shared components but it can be faster to use IComponentData and just search through the components.
    It depends on how many different shared component values and entities you end up with.
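    To put rough numbers on that trade-off, here is a small sketch of the arithmetic, assuming fixed 16 KB chunks and that each unique shared value needs its own chunks. ChunkCostSketch is a hypothetical helper, and the per-chunk capacity passed in is an illustrative number, not something Unity reports.

```csharp
using System;

public static class ChunkCostSketch
{
    public const int ChunkSizeBytes = 16 * 1024;

    // entitiesPerSharedValue[i] is how many entities carry the i-th unique value.
    public static int ChunksNeeded(int[] entitiesPerSharedValue, int capacityPerChunk)
    {
        if (capacityPerChunk <= 0) throw new ArgumentOutOfRangeException(nameof(capacityPerChunk));
        int chunks = 0;
        foreach (int count in entitiesPerSharedValue)
            chunks += (count + capacityPerChunk - 1) / capacityPerChunk; // round up per value
        return chunks;
    }

    // Memory reserved for those chunks, whether or not they are full.
    public static int ReservedBytes(int[] entitiesPerSharedValue, int capacityPerChunk)
    {
        return ChunksNeeded(entitiesPerSharedValue, capacityPerChunk) * ChunkSizeBytes;
    }
}
```

    Ten unique values with one entity each reserve 10 chunks, i.e. 163,840 bytes (160 KB), while 1000 entities that all share one value need only 2 chunks if a chunk holds 512 of them.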
     
  12. Mrb83

    Joined:
    May 12, 2013
    Posts:
    8
    Thanks for clarifying.
     
  13. DrabaL

    Joined:
    Jan 4, 2018
    Posts:
    6
    My understanding is that entities are stored entirely within one of the chunks belonging to their archetype.
    What happens when an entity is larger than a single chunk?
    How are entities with lots of data/arrays handled in general (or is having them a sign of a fundamental problem)?
    Just starting off with ECS so excuse me if I'm missing something obvious :)

    Does anyone know of a relatively new pure ECS demo with a bit more complexity than boids?
     
  14. Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    4,649
    >What happens when an entity is larger than a single chunk?
    Unity throws an exception. I am not sure how you would create a single entity with more than 16kb of data on it.

    Do note that a DynamicBuffer<> allocates memory outside of the chunk when its elements exceed the default capacity (which is generally recommended to be quite small).
    Thus most of such large array data will not be located in the chunk itself.
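    A minimal sketch of that "small inline capacity, then spill outside the chunk" behaviour. InlineBufferSketch is a hypothetical class, not DynamicBuffer<>: a fixed int[] stands in for the slots reserved inside the chunk, and moving the whole contents to a doubled heap array on overflow is an assumption of the sketch.

```csharp
using System;

public sealed class InlineBufferSketch
{
    private readonly int[] inlineStorage; // stands in for slots reserved inside the chunk
    private int[] heapStorage;            // stays null until the inline slots run out

    public int Length { get; private set; }
    public bool IsStoredInChunk => heapStorage == null;

    public InlineBufferSketch(int inlineCapacity)
    {
        if (inlineCapacity < 0) throw new ArgumentOutOfRangeException(nameof(inlineCapacity));
        inlineStorage = new int[inlineCapacity];
    }

    public void Add(int value)
    {
        int[] storage = heapStorage ?? inlineStorage;
        if (Length == storage.Length)
        {
            // Out of room: move everything to a larger block outside the chunk.
            var bigger = new int[Math.Max(4, storage.Length * 2)];
            Array.Copy(storage, bigger, Length);
            heapStorage = bigger;
            storage = bigger;
        }
        storage[Length++] = value;
    }

    public int this[int index]
    {
        get
        {
            if (index < 0 || index >= Length) throw new IndexOutOfRangeException();
            return (heapStorage ?? inlineStorage)[index];
        }
    }
}
```

    With an inline capacity of 2, the first two Add calls stay "in the chunk"; the third moves all elements out, after which the chunk only needs to hold a small fixed-size header for the buffer.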
     
  15. sngdan

    Joined:
    Feb 7, 2014
    Posts:
    903
    Are or will we be able to set a custom chunk size for a particular archetype?
     
  16. AndesSunset

    Joined:
    Jan 28, 2019
    Posts:
    60
    I believe Unity has said no. Chunks are always the minimum standard size of an L1 or L2 cache (can’t remember which atm).

    Sizing up a chunk could make it unable to entirely fit in the cache on some CPUs. Sizing it down...probably wouldn’t be a problem, but I can imagine Unity not considering this a common case. What are you aiming to do?
     
  17. sngdan

    Joined:
    Feb 7, 2014
    Posts:
    903
  18. Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    4,649
    A cache line is 64 bytes on most platforms. So no, that is not the case.

    We are looking at supporting different chunk sizes at some point. We haven't decided exactly how yet.
     
  19. gebbiz

    Joined:
    May 23, 2018
    Posts:
    10
    It would be great to be able to somehow specify a modifier at IComponentData level so that any archetype with the component would increase/decrease the chunk size. I have a use case where entities with a specific component will be quite large but depending on some other components there will either be thousands or just a few entities per archetype.
     
  20. old_man_willow

    Joined:
    Feb 22, 2019
    Posts:
    1
    Could this DynamicBuffer be used to store joint transform matrices for a pose component, for example? Where in memory is the allocation outside of the chunk, and what are the performance implications of this? Is there a more efficient way to compute poses?

    Sorry, I'm new to Unity :(.
     
  21. justaguygames

    Joined:
    Mar 4, 2015
    Posts:
    9
    Apologies for reviving an old thread but I'm very interested in understanding the internals. I've tried to follow as much as I can through the source but a lot of it is somewhat low level.

    Chunks are created in memory for a set number of entity archetypes. When you create a new entity it gets an id, but it also has some other int that defines its index in the array. I'm just wondering how you pack the data and keep it free of gaps (so that Entities.ForEach() iteration is fast), while also making it possible to get the data for a specific entity by id.

    I was considering sparse sets to map ids to dense arrays in my own ECS implementation (just trying to understand the concepts by doing), but it seems you have no need for sparse sets.
     
  22. DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    381
    EntityComponentStore.cs and EntityInChunk.cs are where to look for this information. There are essentially two different data structures. First, there are the chunks, which hold all the components of the entities. The order in which entities show up in a chunk has no relationship to the index field of an entity. Second, there is a set of parallel arrays where, for each entity index, there is a version, a chunk pointer, and an index in the chunk. This is effectively the most perfect of perfect hashmaps: the key is an Entity whose hashcode (its index in this context) is the exact slot in the hashmap, and the value is the location of the entity's data inside the chunk data structure.
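    A minimal sketch of those two structures working together, reduced to a single chunk with one float component. All names here (EntityIdSketch, EntityTableSketch) are hypothetical. Filling the hole with the last entity on destroy and recycling indices through a free list are assumptions of this sketch, not something confirmed in this thread.

```csharp
using System;
using System.Collections.Generic;

public struct EntityIdSketch { public int Index; public int Version; }

public sealed class EntityTableSketch
{
    // Parallel arrays addressed directly by EntityIdSketch.Index.
    private readonly List<int> versions = new List<int>();
    private readonly List<int> indexInChunk = new List<int>();
    private readonly Stack<int> freeIndices = new Stack<int>();

    // One chunk: dense streams with no gaps.
    private readonly List<EntityIdSketch> chunkEntities = new List<EntityIdSketch>();
    private readonly List<float> chunkValues = new List<float>();

    public int Count => chunkEntities.Count;

    public EntityIdSketch Create(float value)
    {
        int index;
        if (freeIndices.Count > 0) index = freeIndices.Pop();
        else { index = versions.Count; versions.Add(1); indexInChunk.Add(-1); }
        var id = new EntityIdSketch { Index = index, Version = versions[index] };
        indexInChunk[index] = chunkEntities.Count; // new entities go at the end
        chunkEntities.Add(id);
        chunkValues.Add(value);
        return id;
    }

    // A stale id fails because its version no longer matches the table.
    public bool Exists(EntityIdSketch id) =>
        id.Index >= 0 && id.Index < versions.Count && versions[id.Index] == id.Version;

    public float Get(EntityIdSketch id)
    {
        if (!Exists(id)) throw new ArgumentException("stale or unknown entity");
        return chunkValues[indexInChunk[id.Index]];
    }

    public void Destroy(EntityIdSketch id)
    {
        if (!Exists(id)) throw new ArgumentException("stale or unknown entity");
        int hole = indexInChunk[id.Index];
        int last = chunkEntities.Count - 1;
        // Move the last entity's data into the hole so the streams stay dense.
        EntityIdSketch moved = chunkEntities[last];
        chunkEntities[hole] = moved;
        chunkValues[hole] = chunkValues[last];
        indexInChunk[moved.Index] = hole;
        chunkEntities.RemoveAt(last);
        chunkValues.RemoveAt(last);
        versions[id.Index]++;          // invalidates every outstanding copy of id
        indexInChunk[id.Index] = -1;
        freeIndices.Push(id.Index);
    }
}
```

    Lookup by id is two array reads, iteration walks the dense chunk streams, and no sparse set is needed because the entity index is itself the slot in the table.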
     