ECS Memory Layout

JooleanLogic · May 18, 2018

I've read the ECS features in detail section of the documentation and want to see if my understanding of the data layout for entities/components is correct.

Chunks
Data is stored by Entity Archetype in 16kb chunks.
A chunk is arranged by component streams. So all of component A, followed by all of component B etc.

Is the chunk split up on creation such that the space for all component streams is already reserved? Like so:
[A, A, A, A, A, A][B, B, B, B, B, B]
even if there's only one Entity? When you add an Entity, you just copy the component data straight to their relative index positions. This is pretty neat as allocating n entities of an archetype is virtually a no op.
Or do you compact the streams such that they occupy the memory like so
[A, A][B, B]
for two entities. If you add an Entity to this structure, then you have to move all component streams down the memory to get this [A, A, A][B, B, B]. I can't imagine this would work anyway as it would involve re-indexing all the entities?

Entities
All entities are stored in a single EntityData struct array. Entity.index is the index into this array and EntityData provides a direct address to its Components. Is an Entity struct also stored in the chunk so it can refer back to the entities array? This is what EntityArray is generated from?
As a user can store Entity, am I right in assuming that the items in the entities array never change position? If you add 1000 entities and remove the first 999, that last entity is still going to be at the 1000th index?

Archetypes
If you add a new component to an Entity, it moves that Entity from its current chunk to a new chunk matching the new archetype. So there'll be a chunk of memory for every possible archetype. If the user doesn't specify a full archetype ahead of time, Unity will create one on demand along with a chunk for it. So adding a unique component to one Entity creates a new chunk just for that one entity.

ComponentDataArray
When we access the components via ComponentDataArray, are these direct pointers to the chunk data or are components copied into temp storage at the start of a system and back again at the end?
Looking at the source code for ComponentDataArray, the iterator jumps from chunk to chunk instead of being contiguous so I'd assume they're direct pointers.

SharedComponentData (SCD)
An SCD is part of the archetype and each unique (by value, not type) instance of an SCD requires its own chunk. So an entity archetype will be split over as many chunks as there are unique SCDs.
The SCDs are stored in their own type arrays somewhere, not in the archetype chunk; the chunk just contains an index into that array.

Filtering
Does the SCD include metadata with references back to the chunks that match it?
So filtering on an SCD should be super quick depending on the Entity to unique SCD ratio.
If you had 1000 entities split up into units of 100 by SCD, then the filter would just search the 10 items in the SCD array and from that, can directly locate the relevant archetype chunks?

If I'm right on most of the above, then this structure is pretty damn awesome and I can see why creating and processing thousands of entities is so fast. I think I cleared up a lot of my own misunderstanding in the process of writing this out. Unless I'm completely wrong.

Deleted User · May 18, 2018

Chunks
All components are preallocated. No need to move the components in a chunk.

Entities
Yes, every chunk stores its entities, and yes, that's where EntityArray comes from. To the system they look very much like components.

Archetypes
Yes, that's correct.

ComponentDataArray
Direct pointer into the chunk.

I haven't looked deeply into SCD and filtering so far, but that is correct AFAIK.

JooleanLogic · May 19, 2018

Thanks Sgrueling

Joachim_Ante · May 29, 2018

jooleanlogic said:

I've read the ECS features in detail section of the documentation and want to see if my understanding of the data layout for entities/components is correct.

Chunks

Data is stored by Entity Archetype in 16kb chunks.
A chunk is arranged by component streams. So all of component A, followed by all of component B etc.

Is the chunk split up on creation such that the space for all component streams is already reserved? Like so:
[A, A, A, A, A, A][B, B, B, B, B, B]
even if there's only one Entity? When you add an Entity, you just copy the component data straight to their relative index positions. This is pretty neat as allocating n entities of an archetype is virtually a no op.
Or do you compact the streams such that they occupy the memory like so
[A, A][B, B]
for two entities. If you add an Entity to this structure, then you have to move all component streams down the memory to get this [A, A, A][B, B, B]. I can't imagine this would work anyway as it would involve re-indexing all the entities?

We know up front, from the components in the archetype, how many entities fit into one chunk. Adding an entity therefore just means writing to the end of each stream until the chunk is full, at which point we allocate or reuse another chunk and fill that up.
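A minimal sketch of that bookkeeping in plain C#, not Unity's actual code. `ChunkLayout` and its methods are hypothetical names, and the sketch assumes the whole 16 KB is available for component data (no chunk header, no alignment padding), which the real implementation may not do.

```csharp
using System;

// Hypothetical helper, not part of Unity: lays out one chunk as component streams.
public static class ChunkLayout
{
    public const int ChunkSize = 16 * 1024; // 16 KB, as stated in the thread

    // How many entities fit: chunk bytes divided by the bytes one entity needs.
    public static int Capacity(int[] componentSizes)
    {
        int bytesPerEntity = 0;
        foreach (int size in componentSizes) bytesPerEntity += size;
        return ChunkSize / bytesPerEntity;
    }

    // Byte offset at which each component stream starts. Every stream
    // reserves room for Capacity elements, even if the chunk is nearly empty.
    public static int[] StreamOffsets(int[] componentSizes)
    {
        int capacity = Capacity(componentSizes);
        var offsets = new int[componentSizes.Length];
        int offset = 0;
        for (int i = 0; i < componentSizes.Length; i++)
        {
            offsets[i] = offset;
            offset += componentSizes[i] * capacity;
        }
        return offsets;
    }

    // Byte offset of one element. Adding an entity writes each component
    // at entityIndex == current entity count; nothing else moves.
    public static int ElementOffset(int[] componentSizes, int stream, int entityIndex)
    {
        return StreamOffsets(componentSizes)[stream] + componentSizes[stream] * entityIndex;
    }
}
```

With assumed sizes of 8 bytes for Entity and 12 bytes each for Position and Heading, one entity needs 32 bytes, so a chunk holds 512 entities and the three streams start at byte offsets 0, 4096 and 10240.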

Entities
All entities are stored in a single EntityData struct array. Entity.index is the index into this array and EntityData provides a direct address to its Components. Is an Entity struct also stored in the chunk so it can refer back to the entities array? This is what EntityArray is generated from?
As a user can store Entity, am I right in assuming that the items in the entities array never change position? If you add 1000 entities and remove the first 999, that last entity is still going to be at the 1000th index?

Yes. In fact, we essentially store the Entity as component 0. This is what EntityArray uses internally.

Archetypes
If you add a new component to an Entity, it moves that Entity from its current chunk to a new chunk matching the new archetype. So there'll be a chunk of memory for every possible archetype. If the user doesn't specify a full archetype ahead of time, Unity will create one on demand along with a chunk for it. So adding a unique component to one Entity creates a new chunk just for that one entity.

Correct. In the future we will probably experiment with allocating smaller chunks first and, depending on how many chunks are in play for one archetype, switching to larger chunk sizes. But so far it has not been a problem.

ComponentDataArray
When we access the components via ComponentDataArray, are these direct pointers to the chunk data or are components copied into temp storage at the start of a system and back again at the end?
Looking at the source code for ComponentDataArray, the iterator jumps from chunk to chunk instead of being contiguous so I'd assume they're direct pointers.

They are direct pointers. Data is contiguous within one chunk; when the index moves past the end of the current chunk, we calculate the next base address and continue iterating from there until we hit the end of that chunk.

IJobProcessComponentData is the most incredible iterator, since it's literally the same speed as working with pointers directly. You could have two components containing floats and copy a to b, and it would result in the same code as a batched memcpy.

SharedComponentData (SCD)
An SCD is part of the archetype and each unique (by value, not type) instance of an SCD requires its own chunk. So an entity archetype will be split over as many chunks as there are unique SCDs.
The SCDs are stored in their own type arrays somewhere, not in the archetype chunk; the chunk just contains an index into that array.

Filtering
Does the SCD include metadata with references back to the chunks that match it?
So filtering on an SCD should be super quick depending on the Entity to unique SCD ratio.
If you had 1000 entities split up into units of 100 by SCD, then the filter would just search the 10 items in the SCD array and from that, can directly locate the relevant archetype chunks?

SCDs have their own manager with a free-list array of shared component data, plus hash tables to quickly look up by value.

SetFilter internally just resolves to an int index. And each chunk internally has an array of shared component indices used by this chunk. As a result filtering is insanely cheap and done per chunk. There is no cost per entity when using filtering.

It also means that SCD are close to zero memory cost on a per entity basis (Just one int per chunk), since there is literally zero data for each entity.
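A rough sketch of those two pieces in plain C#, under stated assumptions. `SharedValueStore`, `SharedChunk` and `SharedFilter` are hypothetical names, the free list is left out, and each chunk carries a single shared index rather than an array of them.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for the shared component manager:
// each distinct value is stored once and identified by an int index.
public sealed class SharedValueStore<T> where T : struct
{
    private readonly List<T> values = new List<T>();
    private readonly Dictionary<T, int> indexByValue = new Dictionary<T, int>();

    // Look up by value; add the value only if it has not been seen before.
    public int GetOrAdd(T value)
    {
        if (indexByValue.TryGetValue(value, out int index)) return index;
        index = values.Count;
        values.Add(value);
        indexByValue.Add(value, index);
        return index;
    }
}

// Hypothetical chunk header: one shared index for the whole chunk,
// so there is no per-entity cost.
public sealed class SharedChunk
{
    public int SharedIndex;
    public int EntityCount;
}

public static class SharedFilter
{
    // The filter is one int comparison per chunk, never per entity.
    public static List<SharedChunk> Filter(IEnumerable<SharedChunk> chunks, int wantedIndex)
    {
        var result = new List<SharedChunk>();
        foreach (SharedChunk chunk in chunks)
            if (chunk.SharedIndex == wantedIndex) result.Add(chunk);
        return result;
    }
}
```

The cost of `Filter` grows with the number of chunks, not the number of entities, which is the property described above.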

sebas77 · May 29, 2018

Where can I learn more about these concepts? Something I can't figure out about Unity ECS is how it stores internally the filtered arrays and how it keeps the data between arrays in sync (intuitively I could say that data is duplicated between arrays sharing the same components).

JooleanLogic · May 30, 2018

Not sure what you mean by keeping the arrays in sync? There's no duplication of data under the hood.
The docs I linked to in my first post do describe the architecture, but I didn't really understand it until I went through some source code. I'll try to explain, to my knowledge anyway, how the component data is stored and iterated, and then how filtering is just a simple extension of that process.
It's very clever in its simplicity how they've done it in my view.

ComponentDataArray Iteration
Data is stored primarily by its archetype. Say you have three entities with just the components Position and Heading. That's one archetype. The three entities will be stored in the same chunk of memory like so:

Code (CSharp):

ArchetypeA Chunk (Position, Heading)

[ [Entity, Entity, Entity, ..n][Position, Position, Position, ..n][Heading, Heading, Heading, ..n] ]

where n is the total number of entities a single chunk(16kb) can store.

If you have another two entities that also contain a Movement component, then they are stored in a different chunk of memory as it's a different Archetype.

Code (CSharp):

ArchetypeB Chunk (Position, Heading, Movement)

[ [Entity, Entity, ..n][Position, Position, ..n][Heading, Heading, ..n][Movement, Movement, ..n] ]

In your System, let's say you request a group for just Position and Heading like so:

Code (CSharp):

struct Group {
    int Length;
    EntityArray entity;
    ComponentDataArray<Position> position;
    ComponentDataArray<Heading> heading;
}

[Inject] Group group;

This group matches entities in both archetypes. Three from ArchetypeA and two from ArchetypeB.
group.Length will equal 5.
ComponentDataArray is an iterator, so group.position[n] is actually a function call. When you access group.position[0..n], it results in the following.

Code (CSharp):

position[0] = ArchetypeA.chunk[0].position[0];
position[1] = ArchetypeA.chunk[0].position[1];
position[2] = ArchetypeA.chunk[0].position[2];
position[3] = ArchetypeB.chunk[0].position[0]; // Note the change here
position[4] = ArchetypeB.chunk[0].position[1];

Internally, ComponentDataArray[] iterates the archetypes and chunks based on the array index passed in. Were you dealing with thousands of entities, then you'd see ArchetypeA.chunk[0..n].position[0..n] before it got to ArchetypeB.
To us, it just looks like one contiguous array.
That's how it steps you through all the entities that match a particular component group. It iterates by Archetype, then chunk, then Component array, which is what it says in the docs. There's no alignment or syncing required.
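The index mapping described above can be sketched in a few lines of plain C#. This is only an illustration: `GroupIndexer.Locate` is a hypothetical name, and it takes the entity count of each matching chunk, in iteration order, rather than real chunk data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical illustration, not Unity's implementation.
public static class GroupIndexer
{
    // Walks the matching chunks in order, subtracting each chunk's entity
    // count until the index falls inside a chunk. Returns which chunk was
    // hit and the element's position within it.
    public static (int chunk, int local) Locate(IReadOnlyList<int> chunkCounts, int index)
    {
        if (index < 0) throw new ArgumentOutOfRangeException(nameof(index));
        for (int c = 0; c < chunkCounts.Count; c++)
        {
            if (index < chunkCounts[c]) return (c, index);
            index -= chunkCounts[c];
        }
        throw new ArgumentOutOfRangeException(nameof(index));
    }
}
```

For the example above, with three entities in the ArchetypeA chunk and two in the ArchetypeB chunk, `Locate(new[] { 3, 2 }, 3)` yields chunk 1, local index 0, matching the position[3] line.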

Filtering
I haven't looked through the SharedComponentData filter side of the source code yet but I think it works like this, in principle at least.
If we add a SharedComponentData to ArchetypeA above, then for each unique value of that SharedComponentData, you get a new chunk.
Let's say we add a MySharedComponent with a value of 1 to the first two entities and a MySharedComponent with a value of 2 to the third one. It would break the ArchetypeA chunk into two chunks like so.

Code (CSharp):

ArchetypeA (Position, Heading, MySharedComponent=1)

[ [Entity, Entity, ..n][Position, Position, ..n][Heading, Heading, ..n] SharedComponentIndex ] // 2 entities in this chunk

ArchetypeA (Position, Heading, MySharedComponent=2)

[ [Entity, ..n][Position, ..n][Heading, ..n] SharedComponentIndex ] // 1 entity in here

(As Joachim said, all it stores for the SharedComponentData in each chunk is a single index value used to look up the single MySharedComponent for that chunk)
So the data is already stored in filtered chunks by virtue of having a SharedComponent, whether you filter it or not in the ComponentSystem. When you go:
ComponentGroup.SetFilter(MySharedComponent=1)
it doesn't reorder any data. It just applies a few extra steps to the first process outlined above so that you only get chunks that match the filter. Your filtered data is already stored in contiguous chunks of memory.
Pretty awesome.
Hopefully Joachim can correct me if I'm wrong on any of the above.
I don't know what effect the job system has on the above process. Presumably a parallel-for job could even assign individual chunks to threads, but I've no knowledge of that side.

Ideally someone with better graphic skills, or even Unity, will put together an image of the ECS memory architecture and add it to the docs, as I think it would greatly help everyone with an initial understanding of how ECS works. The architecture is so clean that a single graphic could explain it all at a glance.

sebas77 · May 30, 2018

@jooleanlogic thanks a lot for the time you spent on this post! It makes things much clearer. I will need to spend a bit more time on it, because I want to better understand how this archetype structure can affect the cache in the worst cases (or rather, how to avoid breaking the cache), but otherwise it makes more sense to me now!

JooleanLogic · May 30, 2018

No probs sebas. Writing this stuff out helps me understand it better myself. There's still a lot of details I'm unsure of.

If you're about, Joachim, could you explain how index maps to m_Cache.CachedPtr below in ComponentDataArray?

Code (CSharp):

public T this[int index] {
    get {
        ...
        return UnsafeUtility.ReadArrayElement<T>(m_Cache.CachedPtr, index);
    }
}

ReadArrayElement just takes a void* and an index and I presume just returns type T from the address offset, but I don't see where in the above function index gets clamped to the begin/end range of the cache chunk? Am I missing it or perhaps my understanding is wrong on this part of it.

If I go componentDataArray[10000] and that index is somewhere in chunk 5 of ArchetypeB, how is that index of 10000 mapped to m_Cache.CachedPtr? Does m_Cache.CachedPtr not point to the start of the Component array in the chunk? Wouldn't index overflow?

JooleanLogic · May 31, 2018

Ah never mind. I should've just stepped through the source code first.
m_Cache.CachedPtr is negatively offset such that CachedPtr + index gives the correct address of the element in the current chunk.
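A small sketch of the same idea in safe C#, as an assumption-laden illustration rather than the real ComponentDataArray. All names here (`CachedChunkArray`, `cachedBegin`, `cachedEnd`, `CacheMisses`) are hypothetical. Safe code cannot hold a pointer to before the start of an array, so the sketch keeps the first global index of the cached chunk and subtracts it on each access; with raw pointers the subtraction is folded into the pointer once, when the cache is updated, so that base + index lands on the right element.

```csharp
using System;

// Hypothetical illustration: a global index over several chunk streams,
// with the current chunk cached between accesses.
public sealed class CachedChunkArray
{
    private readonly float[][] chunks;                   // one stream per chunk
    private float[] cachedChunk = Array.Empty<float>();
    private int cachedBegin;                             // first global index in cached chunk
    private int cachedEnd;                               // one past the last global index

    public int CacheMisses { get; private set; }

    public CachedChunkArray(float[][] chunks) { this.chunks = chunks; }

    public float this[int index]
    {
        get
        {
            // Only look for another chunk when the index leaves the cached range.
            if (index < cachedBegin || index >= cachedEnd) UpdateCache(index);
            // Safe-code equivalent of reading at "CachedPtr + index".
            return cachedChunk[index - cachedBegin];
        }
    }

    private void UpdateCache(int index)
    {
        CacheMisses++;
        int begin = 0;
        foreach (float[] chunk in chunks)
        {
            int end = begin + chunk.Length;
            if (index >= begin && index < end)
            {
                cachedChunk = chunk;
                cachedBegin = begin;
                cachedEnd = end;
                return;
            }
            begin = end;
        }
        throw new IndexOutOfRangeException();
    }
}
```

Iterating indices 0 to 4 over chunks of three and two elements touches the cache-update path only twice, once per chunk; every other access is a range check plus one array read.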

Mrb83 · Aug 24, 2018

Given the following (pseudo, hypothetical) code:

Code (CSharp):

// Still pseudocode: the elisions ("...", "..") are the original poster's.
struct MySharedComponent : ISharedComponentData
{
    public enum PossibleValues {
        One,
        Two,
        Three,
        ...,
        Ten
    }
    public PossibleValues Value;
}

class MySystem : ComponentSystem
{
    void OnCreateManager(int capacity)
    {
        var values = Enum.GetValues(typeof(MySharedComponent.PossibleValues));
        foreach (MySharedComponent.PossibleValues v in values)
        {
            var entityManager = ..;
            var entity = entityManager.CreateEntity(typeof(MySharedComponent));
            // Shared component data is set through the EntityManager, not on the Entity.
            entityManager.SetSharedComponentData(entity, new MySharedComponent {Value = v});
        }
    }
}

what memory layout would I end up with?

In my current understanding, each shared component instance with a unique value will result in a 16kb chunk allocation. In the example above I create 10 entities each with a shared component with a different value. Does this result in 10 * 16 = 160kb of memory being allocated for just 10 entities with a unique MySharedComponent each?

If so, does Unity plan to optimize this later with variable chunk sizes? Not all data types require big values of n, but it's still useful having them in the ECS data structure.

I am implementing an event system where an event is represented as a single entity with an Event shared component on it. If there are 100 possible different Event values, I don't want to end up with 100*16kb of memory allocated.

Thanks!

simonm_unity · Aug 24, 2018

Yes, you would end up with 10 mostly empty chunks; that's a consequence of how shared components work. Supporting variable chunk sizes is something we are looking into.
It's very convenient and potentially fast to query for entities with specific shared components but it can be faster to use IComponentData and just search through the components.
It depends on how many different shared component values and entities you end up with.
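To make the trade-off concrete, here is a back-of-the-envelope sketch in plain C#. `ChunkCost` is a hypothetical helper, and it assumes fixed 16 KB chunks with every distinct shared value needing its own chunk or chunks, as described in this thread.

```csharp
// Hypothetical helper for estimating chunk memory, not a Unity API.
public static class ChunkCost
{
    public const int ChunkSize = 16 * 1024;

    // entitiesPerSharedValue[i] is the number of entities that carry the
    // i-th distinct shared value. Each value fills its own chunks.
    public static int ChunksNeeded(int[] entitiesPerSharedValue, int capacityPerChunk)
    {
        int chunks = 0;
        foreach (int n in entitiesPerSharedValue)
            chunks += (n + capacityPerChunk - 1) / capacityPerChunk; // round up
        return chunks;
    }

    public static int BytesAllocated(int[] entitiesPerSharedValue, int capacityPerChunk)
    {
        return ChunksNeeded(entitiesPerSharedValue, capacityPerChunk) * ChunkSize;
    }
}
```

Ten entities with ten different shared values need ten chunks (163,840 bytes), while ten entities sharing one value fit in a single chunk (16,384 bytes), whatever the per-chunk capacity is, as long as it is at least ten.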

Mrb83 · Aug 24, 2018

Thanks for clarifying.

DrabaL · Feb 14, 2019

My understanding is that entities are stored entirely within one of the chunks belonging to their archetype.
What happens when an entity is larger than a single chunk?
How are entities with lots of data/arrays handled in general (or is having them a sign of a fundamental problem)?
Just starting off with ECS so excuse me if I'm missing something obvious

Does anyone know of a relatively new pure ECS demo with a bit more complexity than boids?

Joachim_Ante · Feb 14, 2019

>What happens when an entity is larger than a single chunk?
Unity throws an exception. I am not sure how you would create a single entity with more than 16kb of data on it.

Do note that DynamicBuffer<> elements allocate memory outside of the chunk when it exceeds the default capacity. (Which is generally recommended to be quite small)
Thus most of such large array data will not be located in the chunk itself.

sngdan · Feb 14, 2019

Are or will we be able to set a custom chunk size for a particular archetype?

AndesSunset · Feb 14, 2019

I believe Unity has said no. Chunks are always the minimum standard size of an L1 or L2 cache (can’t remember which atm).

Sizing up a chunk could make it unable to entirely fit in the cache on some CPUs. Sizing it down...probably wouldn’t be a problem, but I can imagine Unity not considering this a common case. What are you aiming to do?

sngdan · Feb 15, 2019

Interesting, where did you hear this? Source?

The only thing I know says they might consider it. https://forum.unity.com/threads/ok-...uction-size-limit-in-ecs.577894/#post-4051645

I would like to allow bigger buffers to be stored in the chunk, so ideally I could change the chunk size for a specific archetype.

Joachim_Ante · Feb 15, 2019

AndesSunset said:

I believe Unity has said no. Chunks are always the minimum standard size of an L1 or L2 cache (can’t remember which atm).

Sizing up a chunk could make it unable to entirely fit in the cache on some CPUs. Sizing it down...probably wouldn’t be a problem, but I can imagine Unity not considering this a common case. What are you aiming to do?

A cache line is 64 bytes large on most platforms. So no that is not the case.

We are looking at supporting different sized chunk sizes at some point. We haven't decided on exactly how yet.

gebbiz · Feb 15, 2019

Joachim_Ante said:

We are looking at supporting different sized chunk sizes at some point. We haven't decided on exactly how yet.

It would be great to be able to somehow specify a modifier at IComponentData level so that any archetype with the component would increase/decrease the chunk size. I have a use case where entities with a specific component will be quite large but depending on some other components there will either be thousands or just a few entities per archetype.

old_man_willow · Feb 22, 2019

Joachim_Ante said:

Do note that DynamicBuffer<> elements allocate memory outside of the chunk when it exceeds the default capacity. (Which is generally recommended to be quite small)
Thus most of such large array data will not be located in the chunk itself.

Could this DynamicBuffer be used to store joint transform matrices for a pose component, for example? Where in memory is the allocation outside of the chunk, and what are the performance implications of this? Is there a more efficient way to compute poses?

Sorry, I'm new to Unity.

justaguygames · Aug 11, 2019

Apologies for reviving an old thread but I'm very interested in understanding the internals. I've tried to follow as much as I can through the source but a lot of it is somewhat low level.

Chunks are created in memory for a set amount of entity archetypes. When you create a new entity it gets an id, but it also has some other int that defines its index in the array. I'm just wondering how you pack the data and keep it from having gaps (so that Entities.ForEach() iteration is fast), while also making it possible to get data for specific entities based on id.

I was considering sparse sets to map ids to dense arrays in my own ECS implementation (just trying to understand the concepts by doing), but it seems you have no need for sparse sets.

DreamingImLatios · Aug 11, 2019

EntityComponentStore.cs and EntityInChunk.cs are where to look for this information. There are essentially two different data structures going on. There are the chunks, which hold all the components of the entities. The order in which entities show up in a chunk has no relationship to the index field of an entity. Then there is a set of parallel arrays where, for each entity index, there is a version, a chunk pointer, and an index in the chunk. This is effectively the most perfect of perfect hashmaps: the key is an Entity whose hashcode (index in this context) is the exact index in the hashmap, and the value is the location of the entity's data inside the chunks data structure.
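A minimal sketch of such parallel arrays in plain C#, not the contents of EntityComponentStore.cs. `EntityId` and `EntityTable` are hypothetical names, an int chunk id stands in for the chunk pointer, and the free list plus the version bump on destroy are assumptions borrowed from the usual generational-index pattern.

```csharp
using System.Collections.Generic;

// Hypothetical stand-in for Unity's Entity struct.
public struct EntityId
{
    public int Index;
    public int Version;
}

// Parallel arrays indexed by EntityId.Index. An entity's slot in these
// arrays never moves, even when its data moves between chunks.
public sealed class EntityTable
{
    private readonly List<int> versions = new List<int>();
    private readonly List<int> chunkIds = new List<int>();      // stand-in for a chunk pointer
    private readonly List<int> indexInChunk = new List<int>();
    private readonly Stack<int> freeIndices = new Stack<int>(); // assumption: indices are reused

    public EntityId Create(int chunkId, int slot)
    {
        int index;
        if (freeIndices.Count > 0)
        {
            index = freeIndices.Pop();
        }
        else
        {
            index = versions.Count;
            versions.Add(1);
            chunkIds.Add(0);
            indexInChunk.Add(0);
        }
        chunkIds[index] = chunkId;
        indexInChunk[index] = slot;
        return new EntityId { Index = index, Version = versions[index] };
    }

    public bool Exists(EntityId e)
    {
        return e.Index >= 0 && e.Index < versions.Count && versions[e.Index] == e.Version;
    }

    // Bumping the version invalidates every stored copy of the old handle.
    public void Destroy(EntityId e)
    {
        if (!Exists(e)) return;
        versions[e.Index]++;
        freeIndices.Push(e.Index);
    }

    // Called when the entity's data is moved, e.g. to fill a gap in a chunk.
    public bool Move(EntityId e, int chunkId, int slot)
    {
        if (!Exists(e)) return false;
        chunkIds[e.Index] = chunkId;
        indexInChunk[e.Index] = slot;
        return true;
    }

    public bool TryLocate(EntityId e, out int chunkId, out int slot)
    {
        if (!Exists(e)) { chunkId = -1; slot = -1; return false; }
        chunkId = chunkIds[e.Index];
        slot = indexInChunk[e.Index];
        return true;
    }
}
```

Because lookup is a direct array read at `Index`, no sparse set or hashing is needed, and chunks are free to stay densely packed by moving entities around and updating only the table entry.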

jmdejoanelli · Nov 5, 2019

I really like the idea of this ECS; it seems super powerful to think about these things from a data-driven design perspective.

One thing I'm having trouble understanding though:

If two or more different archetypes share a like component, how is it determined where the shared component is stored in memory? For instance, if I have two different entities, each with a transform component, in which memory chunk are the transform components stored?

learc83 · Nov 5, 2019

jmdejoanelli said:

Two or more different archetypes share a like component, how is it determined where the shared component is stored in memory? For instance, I have two different entities, each with a transform component, in which memory chunk are the transform components stored?

Each archetype is stored separately in one or more chunks. When you loop over the transform component, you're really doing a nested loop where you loop over all chunks that have a transform component. Then for each chunk you loop over the transform components within that chunk.

You can do this explicitly with chunk iteration or use the built in abstractions to hide this.
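The nested loop can be sketched in plain C# as follows. This is not Unity's chunk iteration API: `QueryChunk` and `ChunkQuery` are hypothetical, component types are identified by name, and each stream holds a single float per entity to keep the example short.

```csharp
using System.Collections.Generic;

// Hypothetical chunk: one array per component type present in its archetype.
public sealed class QueryChunk
{
    public readonly Dictionary<string, float[]> Streams = new Dictionary<string, float[]>();
}

public static class ChunkQuery
{
    // Outer loop over chunks, skipping those whose archetype lacks the
    // component; inner loop over the contiguous stream inside each chunk.
    public static float Sum(IEnumerable<QueryChunk> chunks, string componentType)
    {
        float sum = 0f;
        foreach (QueryChunk chunk in chunks)
        {
            if (!chunk.Streams.TryGetValue(componentType, out float[] stream)) continue;
            for (int i = 0; i < stream.Length; i++)
                sum += stream[i];
        }
        return sum;
    }
}
```

Each archetype keeps its own copy of the stream in its own chunks, so the transform data for two entities of different archetypes lives in two different chunks and is only brought together by the loop.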

Abbrew · Jan 6, 2020

Joachim_Ante said:

Correct. In the future will probably experiment with first allocating smaller chunk sizes and depending on how many chunks are in play for one archetype we can start using large chunk sizes. But so far it has not been a problem.

What's the progress on this feature? A messaging system that creates potentially dozens of different entity archetypes would waste a lot of chunk space, unless what you described is already in Unity.

Goularou · May 16, 2022

Joachim_Ante said:

A cache line is 64 bytes large on most platforms. So no that is not the case.

We are looking at supporting different sized chunk sizes at some point. We haven't decided on exactly how yet.

I guess it is the same for compound colliders: they are stored outside the chunk? Thanks. (Mine can be up to 2 MB apiece, sometimes more.)
