Data integrity and primary keys

Discussion in 'Entity Component System' started by snacktime, Jun 22, 2018.

  1. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    3,356
    The challenge is that while the relation between entities and components is implicit, data integrity demands it be explicit on some level.

    I was thinking maybe extend archetypes with a notion of a primary key, i.e. be able to set which component is the primary key component.

    Implementing this in a general way isn't terribly difficult; I just got done with a simple implementation. It would be cleaner if it was baked in, but I chose to just add it on top. I think it's a useful enough thing to provide with the base system in some way.
     
  2. LazyGameDevZA

    LazyGameDevZA

    Joined:
    Nov 10, 2016
    Posts:
    143
    The Entity itself IS the primary key. Thinking of each component as a table with its properties as the columns and its primary key being a foreign key to the Entity table - which is just a table with a single primary EntityId column - would be a more accurate description of how the data is related to one another.
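
The relational picture described above can be sketched roughly as follows. This is only an illustration in Python with plain dicts standing in for tables; the names `entities`, `position`, `health` and `components_of` are hypothetical, and none of this reflects how Unity actually stores components.

```python
# Illustrative only: plain dicts standing in for tables, not Unity's chunk storage.

# "Entity table": a single EntityId column.
entities = {1, 2}

# One table per component type; each key is a foreign key into `entities`.
position = {1: (0.0, 0.0), 2: (5.0, 1.0)}
health = {1: 100}


def components_of(entity_id):
    """Join every component table on EntityId, like a relational lookup."""
    if entity_id not in entities:
        raise KeyError(entity_id)
    tables = {"position": position, "health": health}
    return {name: rows[entity_id] for name, rows in tables.items() if entity_id in rows}
```

In this view the entity id is the only thing tying the rows together, which is the sense in which it acts as the primary key.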
     
  3. snacktime

    Entity as primary key is only useful for local ephemeral data.
     
  4. LazyGameDevZA

    I'm not sure what you're trying to achieve then. What would systems stand to gain through this? Entity is serving as the identifier of the grouping of components. Adding yet another component to do this just doesn't make sense to me. The great thing for me about ECS is that in most cases I don't actually care about the entity, I just care that the data that's grouped together will be processed correctly.
     
  5. snacktime

    Well, pretty much any data coming in externally more than likely already has a unique id.
     
  6. LazyGameDevZA

    From the sounds of it you're trying to enforce the relationship to an external source? This feels more like it's specific to a use case and would change depending on what format the external data is in. It's also feeling a little dangerous to try and enforce a primary key which isn't actually the primary key in the current space the data is living in.

    My view would be to build your own systems etc. that can take care of this management and not try to shoehorn it into the systems provided by Unity. Once an entity is created it's not guaranteed that the entity will always conform to the archetype. As your game runs the data will transform and either lose some components or have more added. This would result in the archetype being updated and having to be managed by us as well. From my understanding an archetype is merely serving as a sort of hotpath that pre-allocates the memory in an efficient way for the entity and won't require rebuilding the archetype with every component that's added.
     
  7. snacktime

    The context is data that already has a unique id. You need to be able to verify *that* id is not duplicated in ECS. The entity id doesn't do anything for you there.
     
  8. Necromantic

    Necromantic

    Joined:
    Feb 11, 2013
    Posts:
    116
    Is there any problem with just adding a Map/Dictionary or whatever fits best that maps from your UID to an Entity?
    I doubt the whole efficient memory layout thing would work with arbitrary external data, so you'd really need that extra layer of abstraction anyway.
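
A minimal sketch of the suggested UID-to-Entity map might look like this. It is written in Python purely for illustration; `Entity` and `UidRegistry` are hypothetical names, not Unity types, and the string UID is an assumption about what the external id looks like.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Entity:
    """Stand-in for an ECS entity handle (hypothetical, not Unity's Entity)."""
    index: int


class UidRegistry:
    """Maps an external unique id to the entity that represents it."""

    def __init__(self) -> None:
        self._by_uid: Dict[str, Entity] = {}

    def register(self, uid: str, entity: Entity) -> bool:
        """Record the mapping; return False if the uid is already taken."""
        if uid in self._by_uid:
            return False
        self._by_uid[uid] = entity
        return True

    def lookup(self, uid: str) -> Optional[Entity]:
        """Return the entity for this uid, or None if it is unknown."""
        return self._by_uid.get(uid)

    def unregister(self, uid: str) -> None:
        """Forget the uid, e.g. when its entity is destroyed."""
        self._by_uid.pop(uid, None)
```

The registry lives entirely outside the ECS data, so the component layout is untouched; the cost is that callers have to keep it in sync with entity creation and destruction themselves.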
     
    Last edited: Jun 22, 2018
  9. snacktime

    The common use case here is avoiding duplicates in multiplayer games. That can easily be generalized, and it should be as close to ECS as possible, not out in some other layer.

    Structs can be compound keys via IEquatable. Have an interface PrimaryKeyComponent. Create an extension method for EntityCommandBuffer that has a CreateEntity/DestroyEntity that takes your primary key component. Keep a HashSet of the key components. That's pretty much the basics. The API could probably be made a bit more elegant.
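
The design outlined above could be sketched along these lines. This is a Python stand-in, not the poster's actual implementation and not Unity's EntityCommandBuffer API: `NpcKey`, `KeyedCommandBuffer` and the recorded command tuples are all hypothetical. A frozen dataclass plays the role of a struct with value equality, and a set plays the role of the HashSet.

```python
from dataclasses import dataclass
from typing import Hashable, List, Set, Tuple


@dataclass(frozen=True)
class NpcKey:
    """Compound primary key; frozen dataclasses compare and hash by value."""
    zone_id: int
    npc_id: int


class KeyedCommandBuffer:
    """Wraps command recording so every create/destroy passes one key check."""

    def __init__(self) -> None:
        self._keys: Set[Hashable] = set()                # keys queued or alive
        self.commands: List[Tuple[str, Hashable]] = []   # recorded commands

    def create_entity(self, key: Hashable) -> bool:
        """Queue a create unless the key is already in use."""
        if key in self._keys:
            return False
        self._keys.add(key)
        self.commands.append(("create", key))
        return True

    def destroy_entity(self, key: Hashable) -> bool:
        """Queue a destroy and free the key; False if the key is unknown."""
        if key not in self._keys:
            return False
        self._keys.remove(key)
        self.commands.append(("destroy", key))
        return True
```

Because the key is claimed at the moment the command is queued, a second create with the same key is rejected even though no entity exists yet.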
     
  10. Necromantic

    That is adding a layer on top of it and not really that much different to what I said, just that it's trying to force it into the ECS core.
    You basically want to replace the Entity as an index, that is core to the whole efficient ECS approach, with arbitrary data.
    You can easily build that functionality around it if you really need it.

    If you just map your UID to an Entity you get all the advantages of the ECS with no real extra work.
     
    Last edited: Jun 23, 2018
  11. snacktime

    Mapping to an entity isn't that simple. You can only do it after the fact, at which point you could already have duplicates.
    Which is why I approached it more like a validation layer. But that layer needs to be close to ECS and using the same approach system wide. Having multiple approaches to this, and/or having separation between where you validate and tell ECS to create an entity, will have a higher chance of causing bugs.
     
  12. Necromantic

    If you build your own abstraction around it how is the chance any higher to have bugs? If you have your data with UID etc. you can validate at any level even before creation of Entities.
    All those validations etc. built in to the core ECS system would only drain performance and the whole structure seems non-linear and would therefore invalidate everything the core ECS is supposed to do.

    I really don't see what the big problem with mapping the UID to an Entity and validating yourself wherever it's needed is. Aside from having to rely on your own API abstraction.
     
    JPrzemieniecki likes this.
  13. If you create a job with the new valid UIDs as input, create the entities inside that job (with the EntityCommandBuffer) and attach the UID as a small data component, why would it create duplicates?
    If you cannot otherwise know which entities were already created outside of this creation job, you can maintain a hashmap of the already created UIDs and only run the entity creation when the UID is not in the hashmap.

    Although it is a possibility that I'm missing something.
     
  14. snacktime

    I said the further away from entity creation that abstraction is, the more likely it is to have bugs. For the same reason database validations are close to the database.

    How exactly would the approach I describe affect performance? It has a negligible hit only when creating/destroying entities via the extension or whatever other method you choose. You only use it when you need the feature. Otherwise it has zero impact.

    But the main thing is something a bit more subtle that I referenced but I think got missed. You can't map your id to an entity. You don't have an entity at the point of putting it into the command buffer. The entity is only created at the sync point, by which time you could already have duplicates in the buffer.
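
The timing issue described here can be illustrated with a toy deferred buffer. This is a Python sketch under simplifying assumptions: `World`, `DeferredBuffer` and `handle_message` are hypothetical, and real command buffer playback is considerably more involved.

```python
class World:
    """Toy world: entity index -> external uid, plus a uid -> entity map."""

    def __init__(self):
        self.entities = []
        self.uid_to_entity = {}


class DeferredBuffer:
    """Records create requests; entities only exist after playback()."""

    def __init__(self):
        self._pending = []

    def create_entity(self, uid):
        self._pending.append(uid)

    def playback(self, world):
        # The "sync point": entities are created and mapped only here.
        for uid in self._pending:
            world.entities.append(uid)
            world.uid_to_entity[uid] = len(world.entities) - 1
        self._pending.clear()


def handle_message(world, buffer, uid):
    """Naive validation: checks only entities that already exist."""
    if uid not in world.uid_to_entity:
        buffer.create_entity(uid)
```

Calling `handle_message` twice with the same uid before `playback` queues two creates, because the map is still empty when the second check runs, and the world ends up with two entities carrying that uid. Validating against keys at enqueue time, rather than against entities after playback, would close that gap.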
     
  15. snacktime

    You could do that but it implies having an uncomfortable amount of separation between the validation and when you tell ECS to create the entity. If you need this to begin with, I think the best approach is a wrapper around command buffers. So you force a single validation point with a singular piece of logic for handling it.
     
  16. Since the entire system is deterministic, the EntityCommandBuffer is also deterministic, so I don't see the problem you're describing. There is no way to have duplicates when the jobs are done. And you are not supposed to run any jobs which use the entities you're creating in the command buffers before the buffers have run, so IDK, I would be comfortable with this.

    It's the same "problem" as with the reactive/event system they're creating. Basically they are building a custom entity-based reactive system, if you have seen the livestream about it from Unite Berlin. It has the same separation.
    Since you don't skip frames between these, let alone arbitrary time, I think it works properly.
     
  17. snacktime

    Not duplicates of ECS entities but duplicates of whatever your data is, as identified by an id external to ECS. Like network messages with commands to create things like NPCs, items, etc.

    The command buffer extension doesn't actually do anything new; it was just a way to make it more generic and remove possible failure points between asking for something to be created and its actual creation.
     
  18. Necromantic

    I don't think we'll reach a consensus here. I still think this would work perfectly fine on an abstraction layer outside of the ECS and putting it into the core of the ECS would - in my opinion - invalidate a lot of the principles the ECS works on.
     
    LazyGameDevZA likes this.