Bug Cache error when upgrading packages results in invalid scripts, which then causes asset corruption.

Discussion in 'Unity Accelerator' started by Tim2021, Nov 14, 2023.

  1. Tim2021

    Tim2021

    Joined:
    Jan 19, 2021
    Posts:
    22
    I ran into a very weird error, and I think I have tracked it back to a caching error in Accelerator.

    It has taken me a week of debugging and experiments to get to this point, and I've essentially run out of leads to follow as well as time to work on this, so I'm going to dump everything I've discovered here in the hope that it helps someone.

    TL;DR: Updating the package manifest while Unity is closed (when you pull from source control) causes packages to import incorrectly from the Accelerator cache, breaking the project and then causing data corruption in serialised assets.

    Wall of Text:

    After upgrading the com.unity.addressables package from 1.21.17 to 1.21.19, it seemed to work fine on my machine. It was just a simple change to the package manifest.json and I was able to build the game locally along with addressables. However, when I merged the change back into the main branch our CI build started spitting out builds very fast. The builds were tiny compared to what they should have been, and on inspection it turned out they had no addressable/asset bundles.

    This then started happening randomly on other team members' machines.

    After some digging, what I found was that certain scripts within the addressables package had not imported correctly. The source code was there, and it was compiling correctly, but for some reason Unity's MonoImporter hadn't recognized them as containing valid classes for serialization.

    If I look at one of the broken scripts (BundledAssetGroupSchema.cs is the one that I first noticed was bad) in the inspector (in Debug mode), it looks like this:

    upload_2023-11-14_15-52-44.png

    Note the missing "Class Name" property and so on when compared to a working script that looks like this:

    upload_2023-11-14_15-53-28.png

    Because the serialization for these classes was now missing, all the serialized assets of that type were now missing too (sort of): selecting one of them in the project shows an empty object. Interestingly, the script does show up as correct, not missing, but all the other contents of the asset are not there. All my BundledAssetGroupSchema scriptable objects were empty and looked like this:

    upload_2023-11-14_15-58-37.png


    Whereas they should look like this (full of lovely data):
    upload_2023-11-14_16-3-23.png

    More images and discussion in thread...
     


  2. Tim2021

    If I look at the Addressable Groups that reference these schemas, they should have two entries in their schema set (one for the ContentUpdate schema, and one for the BundledAsset schema). However, we only see the one schema:

    upload_2023-11-14_16-9-42.png

    Interestingly, the serialized asset still has the guids for both schemas; it is just that one doesn't even show up. It is also interesting that it is completely missing: it is not that the list has two entries and one of them is null or missing. In the inspector it looks like the list is only one element long, even though in the actual .asset on disk the list has two entries:

    upload_2023-11-14_16-13-21.png

    These missing schemas were what was causing our builds to fail. There were probably other subtle errors caused by the other missing scripts, but this one was the first we noticed and is where I've been focusing my diagnostics.


    If I locally force reimport the Addressables package, everything does go back to working: the MonoScripts import correctly, the Schema objects reappear, the lists of schemas (sometimes) go back to having two entries, and builds start working again.

    I do however see the following warnings in the log, which all correspond to the scripts that had had trouble importing.

    upload_2023-11-14_16-17-47.png



    The big trouble is that if something causes the addressable groups to reserialize while the project is in this weird inconsistent state, then anything referencing one of the broken assets (like the Addressable Group Schemas) will permanently serialize with either a null entry or a shortened list, meaning it stays broken even after the package is manually reimported, until we go back and manually recover the data from source control.
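
A minimal sketch of how such silently shortened lists could be caught before they are checked in, assuming assets are serialized as text and that references appear in the usual `guid: <32 hex characters>` form. The helper names `guid_refs` and `dropped_refs` are hypothetical and not part of Unity or the Addressables package; the sketch only compares two versions of an asset's text.

```python
import re

# Assumption: object references in a text-serialized asset contain
# "guid: <32 hex chars>"; only that part of the reference is inspected.
GUID_REF = re.compile(r"guid:\s*([0-9a-fA-F]{32})")


def guid_refs(asset_text: str) -> set[str]:
    """Return every GUID referenced in a text-serialized asset."""
    return {guid.lower() for guid in GUID_REF.findall(asset_text)}


def dropped_refs(committed_text: str, working_text: str) -> set[str]:
    """GUIDs present in the committed version but absent from the working copy."""
    return guid_refs(committed_text) - guid_refs(working_text)
```

Feeding it the committed and locally modified text of a group asset returns the GUIDs that a reserialization dropped; an empty set means no reference was lost.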

    Unpicking that mess has consumed a large amount of my time since this first hit us last week.
     
  3. Tim2021

    I've been working on recreating the issue in as minimal a project as possible in preparation for submitting a formal bug report, but I think at this point the actual issue is that at some point in the past our cache server ended up with corrupt data in it for some unknown reason, and all of the weird behaviour I described above is actually a symptom of that rather than the root cause.
    I can recreate the symptoms of the issue using the following steps:
    1. Start with the package manifest.json pointing to com.unity.addressables 1.21.19
    2. Force reimport the Addressables package so that everything is nominally correct.
    3. Edit manifest.json to point to addressables version 1.21.17. (This simulates switching to the main branch in git in preparation for merging the upgrade to main.) It doesn't matter whether Unity is open or closed when this happens; it seems to work fine and the package successfully downgrades a version.
    4. Close Unity.
    5. Edit manifest.json to point to addressables version 1.21.19. (This simulates the git operation of someone merging the package update from main to their branch.)
    6. Open Unity and observe that the addressables package has failed to import as described above.
    Interestingly, the breakage only happens if Unity is closed when the manifest changes. If you have Unity open and change from 1.21.17 to 1.21.19 then the package imports correctly. I suspect that this is why the problem seemed to crop up on random people's PCs, it all depended on whether

    I can recreate these symptoms in our full project, our project with ALL of the assets deleted and only the package manifest left, and in a completely empty, brand new project.
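
The manifest edits in repro steps 3 and 5 could be scripted along these lines. This is a sketch that assumes the standard manifest layout with a top-level "dependencies" object; the function name `set_package_version` is made up for illustration, and the script only rewrites the file, it does not open or close the Editor.

```python
import json
from pathlib import Path
from typing import Optional


def set_package_version(manifest_path: Path, package: str, version: str) -> Optional[str]:
    """Point `package` at `version` in a package manifest.json.

    Returns the previous version string, or None if the package was not listed.
    """
    manifest = json.loads(manifest_path.read_text(encoding="utf-8"))
    dependencies = manifest.setdefault("dependencies", {})
    previous = dependencies.get(package)
    dependencies[package] = version
    # Write the manifest back with a trailing newline.
    manifest_path.write_text(json.dumps(manifest, indent=2) + "\n", encoding="utf-8")
    return previous
```

For step 5, with Unity closed, the call would be `set_package_version(Path("Packages/manifest.json"), "com.unity.addressables", "1.21.19")`.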

    The symptoms seem to go away if I do any of the following:
    1. Disable the cache server in the project settings
    2. Disable "Download" in the Cache server settings
    3. Change the "Namespace prefix" in the Cache server settings.
    This is why I now suspect that this is a caching bug and our cache server has somehow ended up with corrupt data in the original "default" namespace.

    I have not managed to recreate the circumstances that created the corruption in a new namespace, which is why I'm out of leads to follow at this point.

    Disabling downloading from the cache server is not a viable workaround as it completely negates the point of having a cache server.

    Changing the namespace prefix whenever we upgrade a package version is painful: it takes a couple of hours to completely reimport the project and rebuild the cache in the new namespace. It also isn't foolproof: if someone forgets to do it, we get back into the situation where objects are invalid and any reserialisation of data around them will result in semi-permanent lost data getting checked back in to source control.

    I have actually seen this problem once before. It occurred when we were last updating the Unity version (and a whole bunch of packages). That time it impacted the com.unity.inputsystem package, but I wrote it off as weirdness around the Unity upgrade, as there were a lot of upgrade issues we had to work through.

    I'm happy to submit this via the bug reporter, but if I just submit the empty project the bug probably won't be recreatable without also using the same cache server. Currently the cache server data is well over 100GB.


    All of this is in Unity 2022.3.7 (LTS) with Accelerator version v1.0.941+g6b39b61 both running on Windows.
     
  4. unity_Jonny

    unity_Jonny

    Unity Technologies

    Joined:
    Feb 11, 2020
    Posts:
    24
    Firstly, thanks for the amazing write up! That's really helpful.

    There's a lot going on there, but yes, it does seem like somehow in the package upgrade process on launching Unity, the cache receives some invalid data.
    It is strange that it works if the Editor is open when the manifest is edited, but fails if it's closed. Perhaps there is some different code path related to package management on first startup of the Editor, and if the import happens in stages, a cache event could happen on partial or invalid imports.
    I wonder if the actual poisoned cache event happens in step 3 you described above - the Editor inadvertently caches invalid data during the downgrade, but the import eventually succeeds locally and does not update the cached data, then on the next launch of the Editor, it downloads the corrupted data?
    I'd like to monitor the download events that occur to find out when the local files get updated.

    If you're happy to submit a bug report that would be awesome. You don't need to supply a repro project if you have a set of repro steps that reliably show the issue on a fresh project. Just detail the steps as much as you can.
    If you use the Bug Reporter in the Editor it'll prompt for the details and it'll also grab the current Unity version etc.
    Thanks!
     
  5. Tim2021

    I can only recreate the issue in a clean project if I have that project pointed to our instance of the cache server, and the namespace prefix set to be the old "default" value that our main project was using when all this happened. If I'm running on a clean cache namespace everything works fine.

    Like I said somewhere in the wall of text, I suspect the actual bug happened a while ago, that cache namespace now contains bad data and what we are seeing as I go back and forth between versions is just a symptom of that original corruption that never seems to get flushed out.

    I don't know the recreation steps that corrupted the cache.

    I can probably zip up and include the cache server data folder so that you could set up a local instance with the same data, but at this point it is ~120GB. Will the bug reporter even cope with that?
     
  6. Tim2021

    Bug report submitted. The bug reporter really struggled with the size of the files I needed to attach so that you could replicate the cache server but I think we got there in the end.
     
  7. unity_Jonny

    That's great, thanks.
     
  8. dKleinTriCAT

    dKleinTriCAT

    Joined:
    Jul 2, 2019
    Posts:
    21
    @unity_Jonny @Tim2021
    Some information from our side, as I think we are suffering from the same issue.

    We believe by now that this issue is not caused by the accelerator, it just gets worsened by it because the accelerator accepts and spreads the corrupted files.

    We disabled the accelerator and deleted the library folder on all of our machines, but after a couple of package updates the problem keeps reappearing. It's always scripts that are not imported properly/corrupted and then cause all kinds of other things to fail. We are using many custom packages to modularize our internal framework and so we actually do several package updates a week. Things don't break with every package update we do, but often enough that we currently do a full reimport for every nightly build that we make. Also every 1-2 weeks our entire dev department has to stop what they're doing and reimport their library for 2 hours because things will be broken that prohibit even entering playmode.

    We have tried to create a small repro project and have managed to create those broken imports on package update even there once, but sadly it seems far more unreliable than in our actual project and we don't quite know exactly what is causing it.
     
  9. jamie_xr

    jamie_xr

    Joined:
    Feb 28, 2020
    Posts:
    67
    I'll add to this. We have indeed seen the same problems and have disabled accelerator in order to circumvent it.

    Package upgrades for us would more often than not require the entire cache server to be purged.
    @Tim2021 This would be similar to you changing the namespace (you essentially have cleared the cache there as you are pointing at a namespace with 0 cache). We however use a namespace prefix other than "default" as we have multiple projects using the cache server (well we did until this issue derailed the whole thing).

    We were also unable to repro it in a small project - even in our project it was quite intermittent. I've wanted to supply a small project for a bug report also for nearly a year now, but we've not been able to, and unable to figure out what caused it.

    It's sad that we had to disable the cache server. Our projects get bigger and bigger, and as the team grows there is such a high velocity of changes that you can fall behind in even a few hours, and it then takes at least an hour to import the newest stuff.

    @unity_Jonny Can you provide any update?
    @Tim2021 Can you link the bug report? I'd like to monitor progress and upvote it.
     
  10. gsylvain

    gsylvain

    Joined:
    Aug 6, 2014
    Posts:
    104
    I'm investigating a very similar issue as well!
    I'm connected to a cache server. Pulling a working state from Perforce into a clean workspace, no Library folder. The first Unity boot is chaotic: AddressableAssetSettings not found, some packages are slightly corrupted (cinemachine, textmeshpro, addressables, probably more). The only way to recover is to manually reimport individual packages. Like OP, I see pretty much this warning for every script:

    Code (CSharp):
    ConsistencyChecker - guid: ac62285cba7c64612b59f2c0c4124c96, dependenciesHash.value: 0e80f8fbec09af5f0214050fa1a5e0f8, artifactid: f65177e44ff2f3a97f5253752e0fca2b, producedFiles[0].extension: , producedFiles[0].contentHash: 3f2b79d5f32a6dd5ef6dbf5399d67cd5
    ConsistencyChecker - guid: ac62285cba7c64612b59f2c0c4124c96, dependenciesHash.value: 0e80f8fbec09af5f0214050fa1a5e0f8, artifactid: 097293a8de91960b90c10de9266a9b4f, producedFiles[0].extension: , producedFiles[0].contentHash: 5e4d7145c5aa2b0879c7722bc06c6312
    Importer(MonoImporter) generated inconsistent result for asset(guid:ac62285cba7c64612b59f2c0c4124c96) "Packages/com.unity.addressables/Editor/Build/Layout/BuildLayoutHelpers.cs"
    So I guess it is not really related to switching version as in my case, I'm booting on a single known clean state.
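
When there are hundreds of these warnings, a small script can list which assets produced differing results. The sketch below assumes the log lines have exactly the shape quoted above; the function name `inconsistent_guids` is hypothetical, and it only captures the asset guid and the first produced file's content hash.

```python
import re
from collections import defaultdict

# Matches ConsistencyChecker lines of the shape quoted in the post above.
CHECK_LINE = re.compile(
    r"ConsistencyChecker - guid: (?P<guid>[0-9a-f]{32}),"
    r".*?producedFiles\[0\]\.contentHash: (?P<hash>[0-9a-f]{32})"
)


def inconsistent_guids(log_text: str) -> dict[str, set[str]]:
    """Map asset GUIDs to their content hashes, keeping only GUIDs with more than one hash."""
    hashes = defaultdict(set)
    for match in CHECK_LINE.finditer(log_text):
        hashes[match["guid"]].add(match["hash"])
    return {guid: found for guid, found in hashes.items() if len(found) > 1}
```

Run against the two ConsistencyChecker lines above, it reports guid ac62285cba7c64612b59f2c0c4124c96 with both content hashes, which is the same asset the MonoImporter warning names.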

    While digging around, I've found this page: https://docs.unity3d.com/Manual/ImporterConsistency.html
    I deleted my Library and tried this: Unity Hub -> my project -> 3 dots menu -> add cmd line args -> paste this: -consistencyCheck. Booting the project was a success, although it took way longer (duh). Note that I've also tried booting with the arg -consistencyCheckSourceMode cacheserver / local but it didn't work.

    Next, I went into Project Settings -> Editor -> Cache Server and set the mode to Enabled instead of using the global settings. From there, I did some tests with the "Content validation" dropdown + delete Library + fresh boot... nothing worked. The only thing that finally resulted in a working state was disabling "Download", which is pretty much deactivating the cache server.
     
  11. dKleinTriCAT

    I do believe the cache server is just a bystander in this issue. Somehow the packages get corrupted and the cache server doesn't seem to be able to detect that corruption, so it accepts the upload of the corrupted files. From that moment on of course you will always download corrupted files until you wipe the cache server. This makes the problem worse, however I don't believe the cache server is what causes it, as we have also seen that library/package corruption appear with a disabled cache server.

    I have recently (after 6 months) received the first comment to my unity bug report ticket, so I will try again to get a repro project. They did however not provide me with any pointers on the import process or how to more successfully diagnose or reproduce this issue whatsoever, so I'll be flying blind again.

    I'll probably create a bunch of test packages and see if installing/updating them via archive or local server will eventually break a small project. I've already managed to do this in the past, but as others have stated, it has been very unreliable and I am likely missing something about what makes the library break.
     
  12. gsylvain

    Interesting! I'd like to add that I ended up re-enabling the cache server + set a project specific namespace in the cache server settings... something like "MyProjectUnity2022". Seems like it prevented artefacts generated with 2022 from being corrupted by our other unity project, or even our own project still running in 2021 (previous release streams).
     
  13. unity_Jonny

    dKleinTriCAT you're right, the cache server is just caching data it believes is valid.
    Without ConsistencyChecking, it will cache whatever it is handed.
    Using the 'cacheServer' mode of ConsistencyChecking forces the Editor to check the meta data hash on the cacheserver before deciding whether to download.
    To use those options you need to use a command line with both:
    -consistencyCheck -consistencyCheckSourceMode cacheserver

    The first arg enables it, the second sets the mode.
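
For CI or a launcher script, the two arguments could be assembled like this. It is a sketch only: `consistency_check_command` is a hypothetical helper, the Editor executable path is whatever your installation uses, and the "local" and "cacheserver" mode names are taken from this thread.

```python
def consistency_check_command(editor_path: str, project_path: str,
                              source_mode: str = "cacheserver") -> list[str]:
    """Build an Editor launch command with importer consistency checking enabled."""
    # Mode names as mentioned in this thread.
    if source_mode not in ("local", "cacheserver"):
        raise ValueError(f"unsupported source mode: {source_mode}")
    return [
        editor_path,
        "-projectPath", project_path,
        "-consistencyCheck",  # enables the check
        "-consistencyCheckSourceMode", source_mode,  # selects what to compare against
    ]
```

The returned list can be passed directly to `subprocess.run(...)`; building it as a list avoids shell quoting problems with project paths that contain spaces.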

    @gsylvain the namespace prefix segregates all data on the cache server internally, so you can treat different projects with different caches - the same asset imported with different namespace prefixes set, will get cached twice, into different cache namespaces. This can be useful, and is recommended, for separating not only different projects, but also different Unity versions.
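
One way to keep prefixes consistent across a team is to derive them from the project itself rather than typing them by hand. The sketch below assumes the project records its Editor version as an `m_EditorVersion:` line in ProjectSettings/ProjectVersion.txt; the `namespace_prefix` helper and the "MyProjectUnity2022" naming scheme (borrowed from the example earlier in the thread) are illustrative, and applying the resulting string to the cache server settings is left to the reader.

```python
import re
from pathlib import Path


def namespace_prefix(project_root: Path, project_name: str) -> str:
    """Derive a prefix such as 'MyProjectUnity2022' from the project's Editor version."""
    version_file = project_root / "ProjectSettings" / "ProjectVersion.txt"
    text = version_file.read_text(encoding="utf-8")
    # Only the major version is used, mirroring the example prefix in the thread.
    match = re.search(r"^m_EditorVersion:\s*(\d+)\.", text, re.MULTILINE)
    if match is None:
        raise ValueError(f"no m_EditorVersion entry in {version_file}")
    return f"{project_name}Unity{match.group(1)}"
```

Including the minor version as well would separate caches more finely, at the cost of rebuilding the cache more often.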

    One of the biggest sources of nondeterminism in asset import artifacts, and often of cache corruption, is ScriptedImporters. If you suspect your importer is not deterministic (it cannot reliably produce binary-identical results) or is otherwise somehow causing a corruption, then you could consider disabling the caching for that importer only; see https://docs.unity3d.com/ScriptReference/AssetImporters.ScriptedImporterAttribute.html and set
    AllowCaching = false
     
  14. unity_Jonny

    For the most strict setting of cache artifact content checking, use the Content Validation setting in the Editor Project Settings -> Editor -> Content Validation and set it to Required.
    I think this first appeared in Unity 2022.1, and got backported to Unity 2021.3
     
  15. dKleinTriCAT

    @unity_Jonny
    Thanks for the info, but we are already using those consistency modes. Sadly they don't seem able to detect this. What we did is disable the upload of artifacts for everyone except the CI and completely purge the CI Library every time it runs. This is of course not ideal, but it has greatly reduced how often we get a broken cache state on the server. If we were able to just ban any scripts from being cached, that might take the cache server entirely out of this problem, but we haven't found a way to do this.
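
The purge step of that CI workaround might look like the following. This is a sketch, not the poster's actual CI script: `purge_library` is a hypothetical helper, and the check for sibling Assets and ProjectSettings folders is an added safety guard so the function refuses to delete a Library folder outside a Unity project.

```python
import shutil
from pathlib import Path


def purge_library(project_root: Path) -> bool:
    """Delete <project_root>/Library so the next Editor launch reimports everything.

    Returns True if a Library folder was removed, False if there was none.
    """
    # Guard against pointing at the wrong directory: a Unity project root
    # contains Assets and ProjectSettings next to Library.
    for required in ("Assets", "ProjectSettings"):
        if not (project_root / required).is_dir():
            raise ValueError(f"{project_root} does not look like a Unity project")
    library = project_root / "Library"
    if not library.is_dir():
        return False
    shutil.rmtree(library)
    return True
```

Calling it at the start of each CI job trades a full reimport for the guarantee that nothing from a previous run's Library is uploaded to the cache.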

    But as stated previously, I don't think the cache server is the issue. The problem is that the Unity MonoImporter is unable to consistently import scripts (and maybe other assets) in packages. There is no special ScriptedImporter at work here; the library just gets more likely to corrupt the more custom packages you have and the more often you update. When enough scripts/assets have been corrupted for you to notice, the only way to get your project back into a working condition is to purge the entire library (as reimporting the affected scripts only works until you close and open Unity again).
     
    Last edited: Apr 12, 2024