ECS Memory Layout

Discussion in 'Entity Component System and C# Job system' started by jooleanlogic, May 18, 2018.

  1. jooleanlogic

    I've read the ECS features in detail section of the documentation and want to see if my understanding of the data layout for entities/components is correct.

    Chunks
    Data is stored by Entity Archetype in 16 KB chunks.
    A chunk is arranged by component streams. So all of component A, followed by all of component B etc.

    Is the chunk split up on creation such that the space for all component streams is already reserved? Like so:
    [A, A, A, A, A, A][B, B, B, B, B, B]
    even if there's only one Entity? When you add an Entity, you just copy its component data straight to the matching index in each stream. This is pretty neat, as allocating n entities of an archetype is virtually a no-op.
    Or do you compact the streams such that they occupy the memory like so
    [A, A][B, B]
    for two entities. If you add an Entity to this structure, then you have to move all component streams down the memory to get this [A, A, A][B, B, B]. I can't imagine this would work anyway as it would involve re-indexing all the entities?

    Entities
    All entities are stored in a single EntityData struct array. Entity.index is the index into this array and EntityData provides a direct address to its Components. Is an Entity struct also stored in the chunk so it can refer back to the entities array? Is this what EntityArray is generated from?
    As a user can store Entity, am I right in assuming that the items in the entities array never change position? If you add 1000 entities and remove the first 999, that last entity is still going to be at the 1000th index?

    Archetypes
    If you add a new component to an Entity, it moves that Entity from its current chunk to a new chunk matching the new archetype. So there'll be a chunk of memory for every possible archetype. If the user doesn't specify a full archetype ahead of time, Unity will create one on demand along with a chunk for it. So adding a unique component to one Entity creates a new chunk just for that one entity.

    ComponentDataArray
    When we access the components via ComponentDataArray, are these direct pointers to the chunk data or are components copied into temp storage at the start of a system and back again at the end?
    Looking at the source code for ComponentDataArray, the iterator jumps from chunk to chunk instead of being contiguous so I'd assume they're direct pointers.

    SharedComponentData (SCD)
    An SCD is part of the archetype and each unique (by value, not type) instance of an SCD requires its own chunk. So an entity archetype will be split over as many chunks as there are unique SCDs.
    The SCDs are stored in their own type arrays somewhere, not in the archetype chunk; the chunk just contains an index into that array.

    Filtering
    Does the SCD include metadata with references back to the chunks that match it?
    So filtering on an SCD should be super quick depending on the Entity to unique SCD ratio.
    If you had 1000 entities split up into units of 100 by SCD, then the filter would just search the 10 items in the SCD array and from that, can directly locate the relevant archetype chunks?

    If I'm right on most of the above, then this structure is pretty damn awesome and I can see why creating and processing thousands of entities is so fast. I think I cleared up a lot of my own misunderstanding in the process of writing this out. Unless I'm completely wrong. :)
     
  2. Sgrueling

    Chunks
    All components are preallocated. No need to move the components in a chunk.

    Entities
    Yes, every chunk stores the Entity values for its entities, and yes, that's where EntityArray comes from. To the system they look very much like components.

    Archetypes
    Yes, that's correct.

    ComponentDataArray
    Direct pointer into the chunk.

    I haven't looked deeply into SCD and filtering so far, but that is correct afaik.
     
  3. jooleanlogic

    Thanks Sgrueling
     
  4. Joachim_Ante

    Unity Technologies
    We know upfront how many entities fit into one chunk, based on the components in the archetype. Thus adding an entity simply involves writing to the end of each stream until the chunk is full, at which point we allocate or reuse another chunk and fill that up.
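As a rough illustration of "knowing the capacity upfront", here is a minimal sketch in plain C# (no Unity types). The class and method names are hypothetical, and it assumes the whole 16 KB is available for component data, with no chunk header and no alignment padding, which the real implementation may well have.

```csharp
using System;

// Hypothetical helper, not a Unity type: splits a fixed-size chunk into
// one stream per component for a single archetype.
public static class ChunkLayoutSketch
{
    public const int ChunkSizeBytes = 16 * 1024;

    // Entities per chunk. Assumption: no chunk header, no alignment padding.
    public static int Capacity(int[] componentSizes)
    {
        int bytesPerEntity = 0;
        foreach (int size in componentSizes)
            bytesPerEntity += size;
        if (bytesPerEntity <= 0)
            throw new ArgumentException("Archetype needs at least one non-empty component.");
        return ChunkSizeBytes / bytesPerEntity;
    }

    // Byte offset at which each component stream starts inside the chunk.
    public static int[] StreamOffsets(int[] componentSizes)
    {
        int capacity = Capacity(componentSizes);
        var offsets = new int[componentSizes.Length];
        int offset = 0;
        for (int i = 0; i < componentSizes.Length; i++)
        {
            offsets[i] = offset;
            offset += capacity * componentSizes[i]; // space for the full stream is reserved
        }
        return offsets;
    }

    // Byte offset of one component of the entity stored at indexInChunk.
    public static int ElementOffset(int[] componentSizes, int stream, int indexInChunk)
    {
        return StreamOffsets(componentSizes)[stream] + indexInChunk * componentSizes[stream];
    }
}
```

With assumed sizes of 8 bytes for Entity and 12 bytes each for Position and Heading, `Capacity(new[] { 8, 12, 12 })` gives 512 entities per chunk and the streams start at byte offsets 0, 4096 and 10240. Adding an entity then only needs the current count: each component is written at `ElementOffset(sizes, stream, count)`.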

    Yes. In fact we have essentially an Entity as the 0 component. This is what EntityArray is using internally.

    Correct. In the future we will probably experiment with allocating smaller chunks first and, depending on how many chunks are in play for one archetype, switching to larger chunk sizes. But so far it has not been a problem.

    They are direct pointers. Data is contiguous within one chunk; when the index moves past the end of the current chunk, we calculate the next base address and continue iterating from there until we hit the end of that chunk.

    IJobProcessComponentData is the most incredible iterator, since it's literally the same speed as just working with pointers directly. So you could have two components with floats and copy a to b, and it would result in the same code as a batched memcpy, etc.

    SCDs have their own manager with a freelist array of shared component data, plus hashtables to quickly look them up by value.
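To make the "freelist array plus hashtable" idea concrete, here is a minimal sketch in plain C#. It is not Unity's shared component manager: the class name is hypothetical, and the reference counting used to decide when a slot can be reused is an assumption that is not described in this thread.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for a shared component store; not Unity's implementation.
public sealed class SharedValueStoreSketch<T>
{
    private readonly List<T> values = new List<T>();          // the shared data, one slot per unique value
    private readonly List<int> refCounts = new List<int>();   // assumption: slots are reference counted
    private readonly Stack<int> freeSlots = new Stack<int>(); // the free list of reusable slots
    private readonly Dictionary<T, int> indexByValue = new Dictionary<T, int>(); // lookup by value

    // Returns the index for value, storing it first if it is not present yet.
    public int Acquire(T value)
    {
        int index;
        if (indexByValue.TryGetValue(value, out index))
        {
            refCounts[index]++;
            return index;
        }
        if (freeSlots.Count > 0)
        {
            index = freeSlots.Pop(); // reuse a released slot
            values[index] = value;
            refCounts[index] = 1;
        }
        else
        {
            index = values.Count;    // grow the array
            values.Add(value);
            refCounts.Add(1);
        }
        indexByValue[value] = index;
        return index;
    }

    // Drops one reference; the slot goes back on the free list when unused.
    public void Release(int index)
    {
        refCounts[index]--;
        if (refCounts[index] == 0)
        {
            indexByValue.Remove(values[index]);
            values[index] = default(T);
            freeSlots.Push(index);
        }
    }

    public T Get(int index)
    {
        return values[index];
    }
}
```

Acquiring the same value twice returns the same index, so a chunk only ever needs to hold that int, not the data itself.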

    SetFilter internally just resolves to an int index. And each chunk internally has an array of shared component indices used by this chunk. As a result filtering is insanely cheap and done per chunk. There is no cost per entity when using filtering.

    It also means that SCD are close to zero memory cost on a per entity basis (Just one int per chunk), since there is literally zero data for each entity.
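A small sketch of why the filter costs nothing per entity, written in plain C# with hypothetical type names. For brevity it models a single shared component index per chunk, whereas the description above has an array of indices per chunk (presumably one per shared component type).

```csharp
using System.Collections.Generic;

// Hypothetical chunk record, not a Unity type. Only what filtering needs is modelled.
public sealed class FilterChunkSketch
{
    public readonly int SharedIndex; // index of the shared value this chunk belongs to
    public readonly int EntityCount; // entities currently stored in the chunk

    public FilterChunkSketch(int sharedIndex, int entityCount)
    {
        SharedIndex = sharedIndex;
        EntityCount = entityCount;
    }
}

public static class ChunkFilterSketch
{
    // One int comparison per chunk; no entity data is touched.
    public static List<FilterChunkSketch> Filter(IEnumerable<FilterChunkSketch> chunks, int sharedIndex)
    {
        var matches = new List<FilterChunkSketch>();
        foreach (var chunk in chunks)
        {
            if (chunk.SharedIndex == sharedIndex)
                matches.Add(chunk);
        }
        return matches;
    }

    // Total entities across a set of chunks.
    public static int CountEntities(IEnumerable<FilterChunkSketch> chunks)
    {
        int total = 0;
        foreach (var chunk in chunks)
            total += chunk.EntityCount;
        return total;
    }
}
```

For the earlier example of 1000 entities split into units of 100 by shared value, the filter compares 10 ints and hands back a single chunk of 100 entities.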
     
  5. sebas77

    Where can I learn more about these concepts? Something I can't figure out about Unity ECS is how it stores the filtered arrays internally and how it keeps the data in sync between arrays (intuitively, I'd guess that data is duplicated between arrays sharing the same components).
     
  6. jooleanlogic

    Not sure what you mean by keeping the arrays in sync? There's no duplication of data under the hood.
    The docs I linked to in my first post do describe the architecture, but I didn't really understand it until I went through some source code. I'll try to explain, to my knowledge anyway, how the component data is stored and iterated, and then how filtering is just a simple extension of that process.
    In my view, it's very clever in its simplicity.

    ComponentDataArray Iteration
    Data is stored primarily by its archetype. Say you have three entities with just the components Position and Heading. That's one archetype. The three entities will be stored in the same chunk of memory like so:
    Code (CSharp):
    ArchetypeA Chunk (Position, Heading)
    [ [Entity, Entity, Entity, ..n][Position, Position, Position, ..n][Heading, Heading, Heading, ..n] ]
    where n is the total number of entities a single chunk (16 KB) can store.
    If you have another two entities that also contain a Movement component, then they are stored in a different chunk of memory as it's a different Archetype.
    Code (CSharp):
    ArchetypeB Chunk (Position, Heading, Movement)
    [ [Entity, Entity, ..n][Position, Position, ..n][Heading, Heading, ..n][Movement, Movement, ..n] ]
    In your System, let's say you request a group for just Position and Heading like so:
    Code (CSharp):
    struct Group{
        // Fields are public so the owning system can read them after injection.
        public int Length;
        public EntityArray entity;
        public ComponentDataArray<Position> position;
        public ComponentDataArray<Heading> heading;
    }
    [Inject] Group group;
    This group matches entities in both archetypes. Three from ArchetypeA and two from ArchetypeB.
    group.Length will equal 5.
    ComponentDataArray is an iterator, so group.position[n] is actually a function call. When you access group.position[0..n], it resolves as follows.
    Code (CSharp):
    position[0] =    ArchetypeA.chunk[0].position[0];
    position[1] =    ArchetypeA.chunk[0].position[1];
    position[2] =    ArchetypeA.chunk[0].position[2];
    position[3] =    ArchetypeB.chunk[0].position[0]; // Note the change here
    position[4] =    ArchetypeB.chunk[0].position[1];
    Internally, ComponentDataArray[] iterates the archetypes and chunks based on the array index passed in. Were you dealing with thousands of entities, then you'd see ArchetypeA.chunk[0..n].position[0..n] before it got to ArchetypeB.
    To us, it just looks like one contiguous array.
    That's how it steps you through all the entities that match a particular component group. It iterates by Archetype, then chunk, then Component array, which is what it says in the docs. There's no alignment or syncing required.
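The matching step can be sketched in a few lines of plain C#. The types here are hypothetical and use component names as strings purely for illustration; the point is only that a group matches every archetype containing all of the requested components, and its Length is the sum of their entity counts.

```csharp
using System.Collections.Generic;

// Hypothetical archetype record, not a Unity type.
public sealed class ArchetypeSketch
{
    public readonly HashSet<string> ComponentTypes;
    public readonly int EntityCount;

    public ArchetypeSketch(int entityCount, params string[] componentTypes)
    {
        EntityCount = entityCount;
        ComponentTypes = new HashSet<string>(componentTypes);
    }
}

public static class GroupMatchSketch
{
    // Sums the entities of every archetype that has all required components.
    public static int Length(IEnumerable<ArchetypeSketch> archetypes, params string[] required)
    {
        int total = 0;
        foreach (var archetype in archetypes)
        {
            if (archetype.ComponentTypes.IsSupersetOf(required))
                total += archetype.EntityCount;
        }
        return total;
    }
}
```

With ArchetypeA (Position, Heading) holding three entities and ArchetypeB (Position, Heading, Movement) holding two, a group asking for Position and Heading gets a Length of 5, while a group asking for Movement only sees the two entities in ArchetypeB.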

    Filtering
    I haven't looked through the SharedComponentData filter side of the source code yet but I think it works like this, in principle at least.
    If we add a SharedComponentData to ArchetypeA above, then for each unique value of that SharedComponentData, you get a new chunk.
    Let's say we add a MySharedComponent with a value of 1 to the first two entities and a MySharedComponent with a value of 2 to the third one. It would break the ArchetypeA chunk into two chunks like so.
    Code (CSharp):
    ArchetypeA (Position, Heading, MySharedComponent=1)
    [ [Entity, Entity, ..n][Position, Position, ..n][Heading, Heading, ..n] SharedComponentIndex ] // 2 entities in this chunk

    ArchetypeA (Position, Heading, MySharedComponent=2)
    [ [Entity, ..n][Position, ..n][Heading, ..n] SharedComponentIndex ] // 1 entity in here
    (As Joachim said, all it stores for the SharedComponentData in each chunk is a single index value used to look up the single MySharedComponent for that chunk)
    So the data is already stored in filtered chunks by virtue of having a SharedComponent, whether you filter it or not in the ComponentSystem. When you go:
    ComponentGroup.SetFilter(MySharedComponent=1)
    it doesn't reorder any data. It just applies a few extra steps to the first process outlined above so that you only get chunks that match the filter. Your filtered data is already stored in contiguous chunks of memory.
    Pretty awesome.
    Hopefully Joachim can correct me if I'm wrong on any of the above.
    I don't know what effect the job system has on the above process. Presumably a parallel-for job could even assign individual chunks to threads, but I've no knowledge of that side.

    Ideally someone with better graphic skills, or even Unity, will put together an image of the ECS memory architecture and add it to the docs, as I think it would greatly help everyone's initial understanding of how ECS works. The architecture is so clean that a single graphic could explain it all at a glance.
     
  7. sebas77

    @jooleanlogic thanks a lot for the time you spent on this post! It makes things much clearer. I will need to spend a bit more time on it, because I want to understand better how this archetype structure affects the cache in the worst cases (or rather, how to avoid breaking the cache), but otherwise it makes more sense to me now!
     
  8. jooleanlogic

    No probs sebas. Writing this stuff out helps me understand it better myself. There are still a lot of details I'm unsure of.

    If you're about, Joachim, could you explain how index maps to m_Cache.CachedPtr below in ComponentDataArray?
    Code (CSharp):
    public T this[int index]{
        get{
            ...
            return UnsafeUtility.ReadArrayElement<T>(m_Cache.CachedPtr, index);
        }
    }
    ReadArrayElement just takes a void* and an index, and I presume it just returns type T from the address offset, but I don't see where in the above function index gets clamped to the begin/end range of the cached chunk. Am I missing it, or is my understanding wrong on this part?

    If I go componentDataArray[10000] and that index is somewhere in chunk 5 of ArchetypeB, how is that index of 10000 mapped to m_Cache.CachedPtr? Does m_Cache.CachedPtr not point to the start of the Component array in the chunk? Wouldn't index overflow?
     
  9. jooleanlogic

    Ah, never mind. I should've just stepped through the source code first.
    m_Cache.CachedPtr is negatively offset such that CachedPtr + index gives the correct address of the element in the current chunk.
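Here is a minimal sketch of that trick in safe C#, using managed arrays instead of raw pointers. The class and its fields are hypothetical, not the actual ComponentDataArray source: `cachedOffset` stands in for the negative pointer offset, so the global index can be added directly without first subtracting the chunk's start.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical chunked array, not Unity's ComponentDataArray.
public sealed class ChunkedArraySketch<T>
{
    private readonly List<T[]> chunks = new List<T[]>();
    private readonly List<int> chunkBegin = new List<int>(); // global index of each chunk's first element
    private int length;

    // Cache describing the chunk that served the last lookup.
    private T[] cachedChunk;
    private int cachedBegin;
    private int cachedEnd;    // exclusive; 0 initially, so the first access refreshes the cache
    private int cachedOffset; // always -cachedBegin, the analogue of the negatively offset pointer

    public int Length { get { return length; } }

    public void AddChunk(T[] data)
    {
        chunks.Add(data);
        chunkBegin.Add(length);
        length += data.Length;
    }

    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= length)
                throw new IndexOutOfRangeException();
            if (index < cachedBegin || index >= cachedEnd)
                UpdateCache(index);                   // only runs when crossing a chunk boundary
            return cachedChunk[index + cachedOffset]; // global index used as-is
        }
    }

    // Finds the chunk containing index and records its range and offset.
    private void UpdateCache(int index)
    {
        for (int i = 0; i < chunks.Count; i++)
        {
            int begin = chunkBegin[i];
            int end = begin + chunks[i].Length;
            if (index >= begin && index < end)
            {
                cachedChunk = chunks[i];
                cachedBegin = begin;
                cachedEnd = end;
                cachedOffset = -begin;
                return;
            }
        }
    }
}
```

After adding a chunk of three elements and a chunk of two, index 3 resolves to element 0 of the second chunk, matching the position[3] line in the earlier walkthrough. In a pointer-based version the same effect would come from caching the chunk's base address minus begin times the element size.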