How do archetypes work?

Discussion in 'Entity Component System' started by Leonidas85, Sep 21, 2019.

  1. Leonidas85 (Joined: Mar 11, 2015; Posts: 14)

    Edit: The answer to this question can be found here: https://docs.unity3d.com/Packages/com.unity.entities@0.1/manual/ecs_core.html

    Just out of curiosity: I've been immersing myself in data-oriented programming (DoP) and am wondering how archetypes are implemented in terms of optimal memory and cache line usage.

    The following is from the ECS samples package, the HelloCube->3. IJobChunk example. An archetype is created to execute a job that requires a pair of components, namely the rotation and rotationSpeed:

    Code (CSharp):
        [BurstCompile]
        struct RotationSpeedJob : IJobChunk
        {
            public float DeltaTime;
            public ArchetypeChunkComponentType<Rotation> RotationType;
            [ReadOnly] public ArchetypeChunkComponentType<RotationSpeed_IJobChunk> RotationSpeedType;

            public void Execute(ArchetypeChunk chunk, int chunkIndex, int firstEntityIndex)
            {
                var chunkRotations = chunk.GetNativeArray(RotationType); // what happens in memory when you do this? how does this work with the cache line?
                var chunkRotationSpeeds = chunk.GetNativeArray(RotationSpeedType);
                for (var i = 0; i < chunk.Count; i++)
                {
                    var rotation = chunkRotations[i];
                    var rotationSpeed = chunkRotationSpeeds[i];

                    // Rotate something about its up vector at the speed given by RotationSpeed_IJobChunk.
                    chunkRotations[i] = new Rotation
                    {
                        Value = math.mul(math.normalize(rotation.Value),
                            quaternion.AxisAngle(math.up(), rotationSpeed.RadiansPerSecond * DeltaTime))
                    };
                }
            }
        }
    When executing over a large set of component pairs, it makes sense to arrange the pairs contiguously in memory so that each cache line the job loads holds as many complete pairs as possible. In this case, the ideal memory layout would be:
    chunkRotations[0] - chunkRotationSpeeds[0] - chunkRotations[1] - chunkRotationSpeeds[1] - chunkRotations[2] - chunkRotationSpeeds[2] - etc.

    Am I correct in assuming that defining an Archetype arranges the components in memory in such a way?
    The two consecutive chunk.GetNativeArray() calls at the start of Execute() seem to imply that chunkRotations and chunkRotationSpeeds are separate arrays, which would cause a lot of cache misses when iterating over them, especially as the number of components grows.
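
    To make the two layouts in question concrete, here is a minimal sketch in plain C# (no Unity dependency). Rot, RotSpeed, RotPair and LayoutSketch are hypothetical stand-ins invented for illustration; they are not the Unity types, and this is not how Unity stores chunks internally. It only shows that the interleaved layout and the separate-array layout are both walked front to back and yield the same result.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical stand-ins for the two components (not the Unity types).
struct Rot { public float X, Y, Z, W; }             // four floats, quaternion-sized
struct RotSpeed { public float RadiansPerSecond; }  // one float

// The interleaved layout from the question: one element per component pair.
struct RotPair { public Rot Rotation; public RotSpeed Speed; }

static class LayoutSketch
{
    // Interleaved: a single stream of pairs, read front to back.
    public static float SumInterleaved(RotPair[] pairs)
    {
        float sum = 0f;
        for (int i = 0; i < pairs.Length; i++)
            sum += pairs[i].Rotation.W * pairs[i].Speed.RadiansPerSecond;
        return sum;
    }

    // Separate arrays: two streams, each read front to back with the same index.
    public static float SumSeparate(Rot[] rotations, RotSpeed[] speeds)
    {
        float sum = 0f;
        for (int i = 0; i < rotations.Length; i++)
            sum += rotations[i].W * speeds[i].RadiansPerSecond;
        return sum;
    }

    public static void Main()
    {
        const int count = 4;
        var pairs = new RotPair[count];
        var rotations = new Rot[count];
        var speeds = new RotSpeed[count];
        for (int i = 0; i < count; i++)
        {
            var rot = new Rot { X = 0f, Y = 0f, Z = 0f, W = 1f };
            var speed = new RotSpeed { RadiansPerSecond = i };
            pairs[i] = new RotPair { Rotation = rot, Speed = speed };
            rotations[i] = rot;
            speeds[i] = speed;
        }

        // Distance in bytes between consecutive elements of each array.
        Console.WriteLine("pair stride:     " + Marshal.SizeOf<RotPair>());
        Console.WriteLine("rotation stride: " + Marshal.SizeOf<Rot>());
        Console.WriteLine("speed stride:    " + Marshal.SizeOf<RotSpeed>());

        // Both layouts produce the same sum.
        Console.WriteLine(SumInterleaved(pairs) == SumSeparate(rotations, speeds));
    }
}
```

    In both versions every array is read in increasing address order, so neither layout jumps around in memory; the difference is only whether the loop follows one stream or two.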

    My understanding of memory layouts and cache utilization is very basic at this point, but any insight would help me understand how to use these systems optimally and would be much appreciated.
     
    Last edited: Sep 24, 2019
  2. Singtaa (Joined: Dec 14, 2010; Posts: 492)

    I am also not an expert on CPU caches, but my understanding is that the two arrays will be loaded into separate cache lines. Since the access pattern is linear, cache misses should be minimal (roughly one miss per 64-byte cache line in each array). Temporal locality will keep the data in cache for a while.
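
    The "one miss per 64 bytes" estimate can be worked through with a little arithmetic. The sketch below is plain C# and rests on stated assumptions: a 64-byte cache line, a 16-byte rotation (four floats), a 4-byte speed (one float), arrays that start on a cache line boundary, and every newly touched line counted as one miss (no prefetching). CacheLineSketch and LinesTouched are hypothetical names made up for this example.

```csharp
using System;

static class CacheLineSketch
{
    // Assumption: 64-byte cache lines, as on most current desktop CPUs.
    public const int CacheLineBytes = 64;

    // Distinct cache lines touched when reading `count` elements of
    // `elementBytes` each, stored back to back from a line-aligned address.
    public static long LinesTouched(long count, int elementBytes)
    {
        long totalBytes = count * elementBytes;
        return (totalBytes + CacheLineBytes - 1) / CacheLineBytes; // round up
    }

    public static void Main()
    {
        const long entities = 1000;
        const int rotationBytes = 16; // assumed: quaternion of four floats
        const int speedBytes = 4;     // assumed: a single float

        // Separate arrays, both read by the job.
        long separate = LinesTouched(entities, rotationBytes)
                      + LinesTouched(entities, speedBytes);

        // Interleaved pairs, both fields read by the job.
        long interleaved = LinesTouched(entities, rotationBytes + speedBytes);

        // A job that only reads the speed component.
        long speedOnlySeparate = LinesTouched(entities, speedBytes);
        long speedOnlyInterleaved = interleaved; // every pair is still pulled in

        Console.WriteLine("separate, both components:    " + separate);
        Console.WriteLine("interleaved, both components: " + interleaved);
        Console.WriteLine("separate, speed only:         " + speedOnlySeparate);
        Console.WriteLine("interleaved, speed only:      " + speedOnlyInterleaved);
    }
}
```

    Under these assumptions the two layouts touch the same number of lines (313 each for 1000 entities) when both components are read, while a job that reads only the speed touches 63 lines with separate arrays versus 313 with interleaved pairs.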
     
    Last edited: Sep 21, 2019
  3. Leonidas85 (Joined: Mar 11, 2015; Posts: 14)

    Digging into this further, my understanding is that an archetype is nothing more than a unique combination of component types: every distinct combination is its own archetype. When you create an entity prefab, you are effectively defining an archetype (and another entity prefab with the same combination of components belongs to the same archetype).
    The entities of an archetype are divided into chunks, each of which is a contiguous block of memory.
    I suppose the compiler could optimize the memory arrangement further, based on what it can derive from the implementation of the job and system.
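
    As a rough illustration of how one chunk could hold a separate array per component type, here is a small plain C# sketch. It assumes a 16 KiB chunk with no header and no alignment padding, and component sizes of 8 bytes (entity ID), 16 bytes (rotation) and 4 bytes (rotation speed). The real chunk reserves space for bookkeeping, so actual capacities and offsets will differ. ChunkLayoutSketch, Capacity and ArrayOffsets are hypothetical names, not part of the Entities API.

```csharp
using System;

static class ChunkLayoutSketch
{
    // Assumption: fixed 16 KiB chunks, header and padding ignored.
    public const int ChunkBytes = 16 * 1024;

    // How many entities fit if each one needs one slot in every component array.
    public static int Capacity(int[] componentSizes)
    {
        int bytesPerEntity = 0;
        foreach (int size in componentSizes)
            bytesPerEntity += size;
        return ChunkBytes / bytesPerEntity;
    }

    // Byte offset of each component array inside the chunk, with the arrays
    // placed back to back and each sized for `Capacity` entities.
    public static int[] ArrayOffsets(int[] componentSizes)
    {
        int capacity = Capacity(componentSizes);
        var offsets = new int[componentSizes.Length];
        int offset = 0;
        for (int i = 0; i < componentSizes.Length; i++)
        {
            offsets[i] = offset;
            offset += capacity * componentSizes[i];
        }
        return offsets;
    }

    public static void Main()
    {
        // Assumed sizes in bytes: entity ID, rotation, rotation speed.
        int[] sizes = { 8, 16, 4 };
        int[] offsets = ArrayOffsets(sizes);

        Console.WriteLine("entities per chunk: " + Capacity(sizes));
        for (int i = 0; i < sizes.Length; i++)
            Console.WriteLine("array " + i + " starts at byte " + offsets[i]);
    }
}
```

    In this simplified model, calling something like chunk.GetNativeArray() would amount to taking a view that starts at one of these offsets, so the rotations and the rotation speeds are two contiguous runs inside the same block rather than interleaved pairs.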

    The process is described here: https://docs.unity3d.com/Packages/com.unity.entities@0.1/manual/ecs_core.html
     
    Last edited: Sep 24, 2019