
Resolved Crash in Build only: EntityCommandBuffer Playback

Discussion in 'Entity Component System' started by Vaarz, Oct 18, 2020.

  1. Vaarz

    Joined: Mar 19, 2016; Posts: 5
    Everything in-editor seems to work fine, but when I make a build I'm getting a crash that I can trace back to EntityCommandBuffer.PlaybackUnmanagedCommandInternal, specifically, for case "AddBuffer".

    I'm using 2020.1 and the latest version of Entities packages, but I was also getting this issue before when using 2019.4 LTS. (I was really hoping upgrading would resolve it :()

    It's really hard to find out exactly what's happening since the ECB playback doesn't occur until a good while after the point where I create the entity, but from what I can tell it involves a bit of my code where:
    1. I create an "Event" entity with a DynamicBuffer of bytes via an ECB. I create the entity, add a couple of components, add a buffer, then populate the buffer from a DataStreamReader (Unity.Transport). This all goes through an ECB and happens in a scheduled job.
    2. I'm not sure if it's relevant yet, but between creating the "event" above and the ECB system running, I have another system that uses WithStructuralChanges().Run(). It can query the same event entities as above.
    3. In build only, I get a crash when the ECB.Playback method is invoked. Specifically:
      • Case ECBCommand.AddBuffer
      • calls mgr->EntityComponentStore->SetBufferRawWithValidation(...)
      • calls var ptr = GetComponentDataWithTypeRW(...)
      • calls ChunkDataUtility.GetComponentDataWithTypeRW(...)
      • calls GetIndexInTypeArray(chunk->Archetype, typeIndex)
      • which calls var types = archetype->Types, and causes the crash with "Access violation reading location"
    I guess my questions are: is this a bug, or am I doing something that's not allowed? I'm not sure if step 2 above is allowed, since I am making a structural change, which I know causes issues for DynamicBuffers, but since everything is just in an ECB I thought it'd be fine. And if it's not allowed, why does it work in the editor?
    I could try to refactor to avoid the WithStructuralChanges job somehow, but I really don't want to invest that effort if it's not actually the issue. I might also just be way off the mark and it could be something completely unrelated o_O

    Any insight would be greatly appreciated!
     
  2. eizenhorn

    Joined: Oct 17, 2016; Posts: 2,685
    Do you have JobsDebugger enabled in the editor?
    upload_2020-10-18_22-35-41.png

    And Burst -> SafetyChecks
    upload_2020-10-18_22-36-20.png
     
  3. Vaarz

    When I used Burst, yes, I had safety checks on. I didn't use the JobsDebugger though since it was working in the editor just fine.

    Instead, since Burst was making everything hard to debug, I just turned Burst compilation off. It crashed in the build regardless, though. I also tested Mono vs. IL2CPP and that didn't make a difference.
     
  4. eizenhorn

    That's exactly what it is for: showing exceptions in the editor. If you don't enable it, some of the errors inside jobs will be hidden from you in the editor. Also enable LeakDetection if it's disabled.
     
  5. Vaarz

    D'oh! I don't know what I was thinking. Thank you. I had been using the JobsDebugger earlier, but must have turned it off at some point. I turned it back on and found an unrelated error I had recently introduced. With that resolved and the JobsDebugger on, the editor still runs fine and the builds still crash in the same way.

    I've even added a new ECB System to avoid the gap of systems between where I queue the command buffer commands and execute them. So now, I have a scheduled job that creates the entity and adds the buffer, and the next system is the ECB System that plays back the commands (and crashes in build).

    (I also do have leak detection on and I use safety checks when running with Burst Compilation, but I've disabled Burst for now.)
     
  6. eizenhorn

    Huh, funny thing, we have the same issue in a build now (which works fine in the editor), but for memcpy in SetComponentDataRawEntityHasComponent for SetComponent playback.
     
  7. cort_of_unity

    Unity Technologies; Joined: Aug 15, 2018; Posts: 98
    Thank you for the reports, we'll investigate!
     
  8. eizenhorn

    Well, today we tracked down our issue together with the DOTS team. It was our own fault, but because of the circumstances we only hit it in builds.
    The cause was our asset reference counter, used while loading assets through Addressables. We request our ScriptableObjects by asset reference and subscribe to each async request. In every callback handler we increment the count of loaded SOs and compare it with the total expected (we have a list of all the asset references that get requested). The issue was one damn line that we forgot to remove after extracting the increment-and-check logic into a separate method: a leftover totalLoaded++ for one specific SO that holds game configs. If this config loads before the last prefab, then by the time the last loading callback fires, the check already sees totalLoaded + 1 because of the leftover increment. As a result, a key is added to the prefabs map with the default Entity struct value, but the loaded prefab is not added to the conversion list, so the value for that key in the dictionary stays at its default, Entity.Null.
    In the editor this doesn't happen, because in fast mode all Addressables loading is close to synchronous and follows the order defined in the preloader script, and the config with the unnecessary increment is last in call order. That's why it is suppressed in the editor: all prefabs have already loaded successfully by then. In a build, we end up with an Entity.Null prefab: we instantiate from the empty value looked up by object key in the dictionary, SetComponent then fails on this Entity.Null instantiated prefab, and the build crashes because of a pointer to the null entity. A clever interweaving of conditions made this tricky to track down in the first place: Addressables loading order, safety checks, a crash log pointing to SetComponent rather than to Instantiate playback (reasonable, since SetComponent is what crashes in the build; the editor case is covered by safety checks, but in the editor we had a proper prefab entity anyway), and one hidden line in our code :)
    @Vaarz, I see you have Transport involved, which means some async requests to an API. You may well have an Entity.Null prefab, or Entity.Null passed to AddBuffer, which happens only in the build because of this async stuff that can run synchronously in the editor.
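    A minimal sketch of how a leftover increment like this can stay hidden in one callback order and bite in another. This is plain C# with no Unity dependency, and every name in it (FakeEntity, PrefabPreloader, PreloaderDemo) is hypothetical. It is a simplified model of the pattern described above, not the actual project code.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for Unity's Entity; the default value plays the role of Entity.Null.
public struct FakeEntity
{
    public int Index;
    public bool IsNull => Index == 0;
}

public class PrefabPreloader
{
    readonly int totalExpected;
    int totalLoaded;
    bool converted;
    readonly List<string> conversionList = new List<string>();

    public readonly Dictionary<string, FakeEntity> PrefabMap = new Dictionary<string, FakeEntity>();

    public PrefabPreloader(int totalExpected)
    {
        this.totalExpected = totalExpected;
    }

    public void OnConfigLoaded()
    {
        totalLoaded++;       // leftover increment that should have been removed
        IncrementAndCheck(); // the extracted method increments again
    }

    public void OnPrefabLoaded(string key)
    {
        PrefabMap[key] = default(FakeEntity); // key registered with the null value
        if (converted) return;                // too late: conversion already ran
        conversionList.Add(key);
        IncrementAndCheck();
    }

    void IncrementAndCheck()
    {
        totalLoaded++;
        if (converted || totalLoaded < totalExpected) return;
        // "Convert" everything collected so far into non-null entities.
        for (int i = 0; i < conversionList.Count; i++)
            PrefabMap[conversionList[i]] = new FakeEntity { Index = i + 1 };
        converted = true;
    }
}

public static class PreloaderDemo
{
    // Three assets are expected: two prefabs and one config.
    public static bool LastPrefabIsNull(bool configBeforeLastPrefab)
    {
        var loader = new PrefabPreloader(3);
        loader.OnPrefabLoaded("PrefabA");
        if (configBeforeLastPrefab)
        {
            loader.OnConfigLoaded();          // build-like order: config arrives early
            loader.OnPrefabLoaded("PrefabB");
        }
        else
        {
            loader.OnPrefabLoaded("PrefabB"); // editor-like order: config arrives last
            loader.OnConfigLoaded();
        }
        return loader.PrefabMap["PrefabB"].IsNull;
    }
}
```

    In this model, PreloaderDemo.LastPrefabIsNull(false) returns false (config last, as in the editor), while PreloaderDemo.LastPrefabIsNull(true) returns true: the extra increment makes the counter reach the total one callback early, so the last prefab's key keeps its null default.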
    BTW, to get an error message in the build instead of a crash, follow these steps:
    1. Copy the Entities package from YourProject/Library/PackageCache/ to the YourProject/Packages folder.
    2. Go to EntityComponentStore.cs:796, where you'll see AssertEntityHasComponent. It is defined in EntityComponentStoreDebug.cs:284, and you'll see it is under a Conditional attribute, which means it is excluded from builds. Comment out [Conditional("ENABLE_UNITY_COLLECTIONS_CHECKS")] at lines 272 and 284 and save. Because the package was copied into the Packages folder, it becomes local and Unity won't overwrite it.
    3. Then make a Development build (with Burst disabled in ProjectSettings->Burst AOT Settings to simplify debugging) and you'll see (I suppose) something like `The entity does not exist` in the build console/logs.
    upload_2020-10-21_17-14-53.png
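    To illustrate why commenting out the attribute matters, here is a small sketch of how [Conditional] behaves in plain C#. The class and method names are hypothetical and this is not the Entities source; only the attribute, the symbol name, and the message text come from the steps above.

```csharp
using System;
using System.Diagnostics;

public static class ConditionalDemo
{
    // Counts how many times the check body actually ran.
    public static int ChecksRun;

    // The compiler removes every call to this method unless
    // ENABLE_UNITY_COLLECTIONS_CHECKS is defined at compile time.
    [Conditional("ENABLE_UNITY_COLLECTIONS_CHECKS")]
    static void AssertEntityExists(bool exists)
    {
        ChecksRun++;
        if (!exists)
            throw new ArgumentException("The entity does not exist");
    }

    // Hypothetical stand-in for a command playback step.
    public static string SetComponent(bool entityExists)
    {
        AssertEntityExists(entityExists); // stripped when the symbol is undefined
        return entityExists
            ? "ok"
            : "continued with an invalid entity (real code would touch bad memory here)";
    }
}
```

    Compiled without the symbol, ConditionalDemo.SetComponent(false) does not throw and ChecksRun stays at 0, which mirrors a build running past the point where the editor would have reported the bad entity. Compiled with -define:ENABLE_UNITY_COLLECTIONS_CHECKS, the same call throws with the readable message.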

    For deeper debugging you can attach VS/Rider to the process and debug the code in the build:
    1. Rename the package (com.unity.entities@version) in the Packages folder to just com.unity.entities.
    2. In Edit->Preferences->External Tools (in the case of Rider), enable .csproj generation for packages.
    upload_2020-10-21_17-15-14.png
    3. And then build your game with this:
    upload_2020-10-21_17-15-45.png
    4. Then run the standalone debugger in Rider/VS. It will launch the build automatically, and you can put breakpoints in AssertEntityHasComponent and see which entity arrives there.
    upload_2020-10-21_17-18-15.png

    GetComponentDataWithTypeRW crashes as a cascading result of the wrong entity: it tries to get the Type from the Archetype of the Chunk, and that chunk is wrong because the wrong entity was passed to SetBufferRawWithValidation to look the chunk up.
    upload_2020-10-21_17-47-42.png
     
    Last edited: Oct 21, 2020
  9. Vaarz

    THANK YOU! I was basically at a dead end until this:
    I did not realize that existed and the extra, more accurate messages helped me solve the issue.

    It was totally my fault. I had systems that register a ComponentType on the Client and Server into a List with an ID. The ID is used to "serialize" the ComponentType across the network so I can add it as a tag onto the "Event" I mentioned earlier.
    If you're familiar with the NetCode package, it's very similar to the RPC pattern and is heavily based on that.
    The problem was that I copied a bit too much from there. As part of registering each type, at the point where I assigned the type, I had accidentally defined it out using ENABLE_UNITY_COLLECTIONS_CHECKS, so the assignment was compiled out of builds :oops:
    This meant that in builds, even though my serialized "ID" came through correctly, when I tried to look up the ComponentType for that ID, it was the default value since it had never actually been assigned.
    That default ComponentType is obviously not a valid component to add to an Entity!
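    A rough sketch of the kind of mistake described above, in plain C# with no Unity dependency. The registry, FakeComponentType, and all method names are hypothetical stand-ins, and the exact shape of the original code is an assumption.

```csharp
using System;
using System.Collections.Generic;

public static class EventTypeRegistry
{
    // Hypothetical stand-in for Unity's ComponentType; TypeIndex 0 means "never assigned".
    public struct FakeComponentType
    {
        public int TypeIndex;
    }

    struct Entry
    {
        public int Id;
        public FakeComponentType Type;
    }

    static readonly List<Entry> entries = new List<Entry>();

    // Registers a type and returns the ID that would be sent over the network.
    public static int Register(int typeIndex)
    {
        var entry = new Entry { Id = entries.Count };
#if ENABLE_UNITY_COLLECTIONS_CHECKS
        // BUG: the assignment sits inside a checks-only block,
        // so it disappears wherever the symbol is not defined.
        entry.Type = new FakeComponentType { TypeIndex = typeIndex };
#endif
        entries.Add(entry);
        return entry.Id;
    }

    // Looks up the type for a received ID.
    public static FakeComponentType Lookup(int id)
    {
        return entries[id].Type;
    }
}
```

    Compiled without ENABLE_UNITY_COLLECTIONS_CHECKS, EventTypeRegistry.Lookup(EventTypeRegistry.Register(42)).TypeIndex is 0: the ID round-trips correctly but the type behind it is still the default. Moving the assignment outside the #if block is the fix in this model.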

    In hindsight, there were some clues in some of the things I checked that I misinterpreted, and some additional things I could/should have checked, but your tip above helped me get there!
    Also, kind of funny you had such a similar issue! A bit serendipitous for me!
     