Question The patch size and asset bundle problem

Discussion in 'Content Pipeline Dev Blitz Day 2023 - Q&A' started by Baste, Jun 8, 2023.

  1. Baste

    Baste

    Joined:
    Jan 24, 2013
    Posts:
    6,195
    This has been a big problem for a very long time, and all the solutions have been inadequate. My impression is also that Unity doesn't really understand the size of the problem.

    The short version is:
    1: Asset Bundles/Addressables are tools made for swapping/adding assets to an existing build without rebuilding
    2: Unity's patch sizes are very large if you don't put things in asset bundles


    So there's a problem (2), and the required solution (1) is built to solve a completely different problem. As a result, it's a bad fit. The overhead of pretending that all your assets have to be swapped at runtime when they won't be is really bad.

    What we really want is a way to modify the basic packing of assets in Unity in order to have acceptable patch sizes.

    The long version:
    One of the popular console platforms (NDA lol) has a limit of 512 MB for a patch for a shipped game. If the patch is larger than that, you fail certification, and you have to spend time and energy convincing them to let you publish the patch. This means that bugs are around longer, so it makes our players sad, and it makes us sad, so we want to be under that limit!
    There's also the fact that not every part of the world has fast internet, and certain third world countries like the United States Of America have data caps and limits, so forcing our customers to download a lot more data than they should have to is really bad. It's also really sad to sit down after a hard day of work and then have your console go "lol no wait an hour there's some gigs I need here". So we want small patch sizes for very many very good reasons.

    In a Unity game, if you don't use Asset Bundles, patch sizes are very, very large. It probably varies by game, but what we have experienced is that changes to scenes, no matter how small, mean a binary patch size of roughly 20-25% of the total game size. This means that if you have a 2 GB game, and you add a single default cube to a single level, you might already have reached the patch size limit!
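
As a rough check of the arithmetic above, the sketch below plugs in the figures quoted in the post (a 2 GB game, 20-25% of the build changing, a 512 MB limit). The function name and the 1024 MB-per-GB conversion are assumptions made for illustration only.

```python
# Rough patch-size estimate using the figures quoted in the post.
# These are illustrative inputs, not measurements of any real build.
MB_PER_GB = 1024          # assumption: binary gigabytes
PATCH_LIMIT_MB = 512      # limit mentioned in the post


def estimated_patch_mb(game_size_gb, changed_fraction):
    """Patch size in MB if `changed_fraction` of the build differs."""
    return game_size_gb * MB_PER_GB * changed_fraction


low = estimated_patch_mb(2, 0.20)
high = estimated_patch_mb(2, 0.25)
print(f"{low:.1f} MB to {high:.1f} MB against a {PATCH_LIMIT_MB} MB limit")
# prints "409.6 MB to 512.0 MB against a 512 MB limit"
```

Under these assumptions the upper end of the range lands exactly on the limit, which is why a trivial scene change can already be a certification problem.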

    Exactly what's going on to cause this is a completely undocumented black box, but the big difference is in the sharedassets files in the Data folder in a built game. It seems like the compression on the files causes all the bytes to get shuffled around such that very small changes cause very large binary diff sizes.

    Using Asset Bundles sidesteps this by forcing data to be copied instead of shared, and by forcing sets of data to be compressed in certain ways. For example, if each scene is in its own asset bundle, you can be sure that updating (only) the scene does not change any compressed data for a different scene. This is a side-effect of the asset bundles being designed to be added to a game.

    But asset bundles are really annoying to work with, and Addressables doesn't really help at all. The APIs are longer and stranger to use, you can't use direct references to assets outside the bundle, you have bad string-based identifiers everywhere, things are async or very badly made sync hacks built on async backends by reluctant programmers (sync addressables suuuuck), and generally everything takes a lot longer and you get a lot more bugs. You also have to remember to build asset bundles for your build to function, and that is just a horribly annoying and confusing process.

    So we'd really rather not have to deal with Asset Bundles unless we actually want the features that Asset Bundles are made to support - runtime addition and replacement of assets! Ideally, we should be able to put 3 new enemies in a scene without that requiring a patch that's larger than most PS2 games. It also seems like loading stuff from AB is just straight up slower than loading them through a direct reference, but I haven't tested this too much. So, another potential downside.

    In order to not have to deal with Asset Bundles and still have sane patch sizes, we'd need some way to decide how the assets are packed for normal Unity scenes. I don't know exactly what that would look like, since how data is packed for scenes isn't public information afaik. But things like "please don't share assets between these scenes, duplicate them instead" would at times be a worthwhile tradeoff. Other things like managing which things are compressed together would help a ton with the same issue, and probably help with things like optimization in general. The current one size fits all approach is never going to be optimal at all.

    Is there any chance you could consider something like this?
     
  2. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    78
    Thanks Baste for all your thoughts.

    One clarifying question - you mention compression of the sharedAssets for player builds? Does that mean you are compressing the entire player content, e.g. with BuildOptions.CompressWithLz4HC? That compression algorithm does break up the data into individual chunks that are compressed separately, but it can still result in widespread changes in the overall Archive file when a player is rebuilt. Uncompressed player data is larger but I expect it would patch better.
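
To illustrate the chunking point, here is a minimal sketch of independently compressed chunks. It uses Python's `zlib` as a stand-in for LZ4HC, and the chunk size, the fake data and the helper names are all assumptions for the demo. It is not how Unity's archive format is actually laid out.

```python
import zlib

CHUNK_SIZE = 4096  # hypothetical chunk size, chosen only for this demo


def compress_in_chunks(data, chunk_size=CHUNK_SIZE):
    """Compress each fixed-size chunk independently of the others."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


def chunk_offsets(chunks):
    """Start offset of each compressed chunk once they are concatenated."""
    offsets, position = [], 0
    for chunk in chunks:
        offsets.append(position)
        position += len(chunk)
    return offsets


# Fake "player data": 16 chunks of arbitrary bytes.
original = bytes((i * 31 + i // 97) % 256 for i in range(16 * CHUNK_SIZE))
edited = bytearray(original)
edited[5000] ^= 0xFF  # flip one byte; offset 5000 falls inside chunk 1
modified = bytes(edited)

old_chunks = compress_in_chunks(original)
new_chunks = compress_in_chunks(modified)

# Chunks whose compressed bytes differ between the two builds.
changed = [i for i, (a, b) in enumerate(zip(old_chunks, new_chunks)) if a != b]
# Chunks whose position inside the concatenated archive moved.
shifted = [i for i, (a, b) in enumerate(
    zip(chunk_offsets(old_chunks), chunk_offsets(new_chunks))) if a != b]

print("chunks with different compressed bytes:", changed)
print("chunks whose archive offset moved:", shifted)
```

Only the chunk containing the edit has different compressed bytes. If that chunk's compressed length changes, though, every later chunk moves within the archive, and a diff that compares bytes position by position may treat all of them as changed.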

    It is true that AssetBundles are the current way offered to split data up with more control than what a Player build will allow. There is more control but that can easily result in some more duplicated data, unless the layout is specified quite carefully for shared assets. We are quite interested in improving the layout mechanism and making the concept of bundle work better in multiple uses, as we know there are plenty of cases where they can be shipped along with the player and not downloaded or updated independently of the player.

    The way the player works with Shared Assets is not really meant to be a secret black box. The details of what objects end up inside each output file can be viewed if you enable typetrees on your build (Diagnostics->Editor->ForceAlwaysWriteTypeTrees) and then send the files through Binary2Text (or UnityDataTools). And maybe I can come up with a brief high level summary of how the sharedAssets get populated if that would be helpful.
     
  3. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    78
    Summary of the Player Content Build calculation:
    1. Gather dependencies from special singleton objects that make it into the build (called "GlobalGameManagers")
    2. Build the first scene (level0). This is a special case, processed before Resources, because we want the first Scene to be particularly fast to load. Hence we want all its dependent data to be in a single sharedAsset file.
    The level0 file contains the hierarchy of the scene, closely matching what is in the Scene file in the editor. The sharedAssets0.assets file contains objects from Assets that are referenced by the scene.
    3. Gather dependencies for Resources folders and build the resources.assets. This file can reference content already assigned to sharedAssets0.assets instead of duplicating it.
    4. Build the rest of the scenes (level1 to N). Each level file can have its own sharedAsset file, but it can also refer to objects already assigned in any of the already generated sharedAsset files.
    For example level2 can potentially use content from sharedAssets2.assets, sharedAssets1.assets, and sharedAssets0.assets.
    5. Finalize the global data that is actually referenced to strip out unnecessary content. The result is written to globalgamemanagers.assets.

    Of course that is a very high level summary that skips through a bunch of details, but I hope it helps demystify how this works. One implication is that the order of scenes can have a big impact on how data is assigned to the sharedAsset files. For example, if an Asset is originally only referenced by level5 and later gets referenced from level1, then on the next build that data moves to level1's sharedAsset file. Possibly that explains some of the changes in content that you observe when preparing patches.
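
The ordering effect described above can be sketched with a toy model in which each asset lands in the sharedAsset file of the first scene (in build order) that references it. This deliberately ignores the Resources and GlobalGameManagers steps, and the function, scene and asset names are hypothetical. It is a simplification of the summary, not Unity's actual implementation.

```python
def assign_shared_assets(scenes):
    """Toy model: map each asset to the index of the first scene that uses it.

    `scenes` is an ordered list of (scene_name, [asset names]).
    The returned index stands for the N in sharedAssetsN.assets.
    """
    placement = {}
    for index, (_name, assets) in enumerate(scenes):
        for asset in assets:
            # The first scene in build order to reference an asset "owns" it.
            placement.setdefault(asset, index)
    return placement


before = [
    ("level0", ["Logo"]),
    ("level1", ["Tree"]),
    ("level2", []),
    ("level3", []),
    ("level4", []),
    ("level5", ["Boss"]),
]
# Later, level1 also starts referencing the Boss asset.
after = [(name, assets + ["Boss"]) if name == "level1" else (name, assets)
         for name, assets in before]

placement_before = assign_shared_assets(before)
placement_after = assign_shared_assets(after)
print("Boss before:", placement_before["Boss"])  # prints "Boss before: 5"
print("Boss after:", placement_after["Boss"])    # prints "Boss after: 1"
```

In this model a single new reference changes two sharedAsset files (the one the asset leaves and the one it joins), even though the asset itself is unchanged.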
     
  4. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,029
    As of 2022.3.1f1, the latest version at the time of writing, I still keep getting the following error when trying to build the player. It seems easy to reproduce by making a code change and recompiling. With the same project on 2021.3.x LTS I never saw this error, so it looks like a major regression in 2022.3.x that has not yet been fully fixed. Since I switched to SpriteAtlasV2 it has become much better and the error is less likely, but I still get it sometimes. You can see in this thread that quite a lot of people get the same error: https://forum.unity.com/threads/202...-player-to-globalgamemanagers-assets.1343288/

    Asset has disappeared while building player to 'globalgamemanagers.assets' - path '', instancedID '-161430'
    UnityEditor.BuildPipeline:BuildPlayer (UnityEditor.BuildPlayerOptions)
    BuildMenu:BuildPlayer (UnityEditor.AddressableAssets.Build.AddressableAssetBuildResult,GameBuildConfig/RuntimeBuildType,UnityEditor.BuildTargetGroup,UnityEditor.ScriptingImplementation,UnityEditor.BuildTarget,UnityEditor.BuildOptions) (at Assets/_Build/Editor/xxx)
    xxx:xxx () (at Assets/_Build/Editor/xxxx)
     
  5. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    78
    Hi optimise, I don't have a clear explanation for that error, but I did post something in the thread you reference.
     
  6. Baste

    Baste

    Joined:
    Jan 24, 2013
    Posts:
    6,195
    No, we're just on "default compression". I assume that that includes some kind of compression, due to the name!

    That sounds likely! This also implies that if we, say, add or remove scene N, every sharedAsset file from N to the last scene will be completely different. So if we used to just have MainMenu as scene 0, and then introduced a BootScene instead, we'd get a binary diff of pretty much every single file in the entire game, unless the patcher was smart enough to understand that kind of renaming.
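
A small sketch of the renaming concern: files are named by scene index, so inserting a scene at index 0 shifts every name. The model below assumes, purely for illustration, that a scene's file content stays byte-identical when its index changes (real builds may not behave this way), and the scene names other than MainMenu and BootScene are made up.

```python
import hashlib


def build_files(scene_order, scene_content):
    """Name each output file by scene index, like level0, level1, ..."""
    return {f"level{index}": scene_content[scene]
            for index, scene in enumerate(scene_order)}


# Hypothetical scene payloads standing in for built scene data.
content = {
    "BootScene": b"boot data",
    "MainMenu": b"menu data",
    "Forest": b"forest data",
    "Castle": b"castle data",
}

old_build = build_files(["MainMenu", "Forest", "Castle"], content)
new_build = build_files(["BootScene", "MainMenu", "Forest", "Castle"], content)

# A patcher that compares files by name sees every file as new or changed.
changed_by_name = sorted(name for name, data in new_build.items()
                         if old_build.get(name) != data)

# A patcher that matches files by content hash only sees the new scene.
old_hashes = {hashlib.sha256(data).hexdigest() for data in old_build.values()}
changed_by_content = sorted(
    name for name, data in new_build.items()
    if hashlib.sha256(data).hexdigest() not in old_hashes)

print(changed_by_name)     # prints "['level0', 'level1', 'level2', 'level3']"
print(changed_by_content)  # prints "['level0']"
```

Whether a given platform's patch tool matches by name or by content is not something this thread establishes, so the sketch only shows why the distinction matters.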


    It's probably not the case, but it does sound like this may also cause later levels to potentially load slower? Since level 100 potentially needs to extract some data from sharedAssets 0-100, that's a lot more IO for that scene than if that scene was moved to level 0, and all the assets for that scene were together in a single file. Or is this not a problem?

    Another question; when you're saying that:
    Does that mean that level X for X != 0 can have references to the resources.assets file? I assume so, since otherwise there would be duplication if an asset was both in Resources and directly in a scene, but I wanted to make sure.


    Your explanation does leave room for an improvement, so my suggestion is roughly this:
    Since levels can know that they need assets from sharedAssets files, I assume that there's a list somewhere in scenes that says "these are the sharedAssets that contain assets that you can end up caring about". What would be very helpful for patch sizes would be if we could define custom sharedAssets, and decide that certain things go into those, instead of the sharedAssets of the first level loaded.

    That would allow the convenience of using direct references to prefabs and BuildSettings to define which levels exist (aka. the convenience of not using AssetBundles), while still being able to control how assets are packed for patch sizes.

    This could still be done through the Addressables system, kinda, ish? Or at least similar windows. The idea of tagging assets with "this asset is packed into this thing" already exists, we just want the thing it goes into to be a sharedAsset file or something similar (ie. a part of a build, auto-referenced) instead of an Asset Bundle (built separately, manually loaded)
     
  7. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    78
    Hi Baste, a few follow up comments:

    The default compression for Player is no compression at all, e.g. the level0, level1... and other files are visible individually in the build output instead of combined in a single compressed archive file. Meanwhile the default compression for AssetBundles is LZMA.

    You are correct in pointing out that the later a scene appears in the list, the more dependencies it might have on other sharedAssets files, and the more files might be loaded when loading that scene. I double checked with another developer and got confirmation that the resources.assets file can be referenced by level1 and beyond. He also mentioned that the GlobalGameManagers.assets file can be referenced. The end result is no duplication in player builds. But there is some potential for churn in the content, in the sort of scenarios we are discussing, as the scene list changes.

    AssetBundles are the only way to control the sharing of data and layout of Assets in the output files. In fact, unless you put Scenes into the same AssetBundle, they will have completely independent sharedAssets files. That can sometimes mean a lot of data is duplicated. When multiple scenes are together in the same AssetBundle, a sharing calculation much like the one I described for player builds is performed to build the sharedAssets files inside the AssetBundle.
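
The duplication trade-off can be put in numbers with a toy example. The scene names, asset names and sizes below are made up, and the model assumes no asset is explicitly assigned to a bundle of its own.

```python
# Hypothetical scenes and asset sizes, for illustration only.
scene_assets = {
    "Forest": {"Tree", "Rock", "Player"},
    "Castle": {"Wall", "Rock", "Player"},
}
asset_size_mb = {"Tree": 40, "Rock": 10, "Player": 25, "Wall": 30}


def size_with_one_bundle_per_scene(scenes, sizes):
    """Each scene's bundle carries its own copy of everything it references."""
    return sum(sizes[asset] for assets in scenes.values() for asset in assets)


def size_with_single_bundle(scenes, sizes):
    """Scenes in one bundle share assets, so each asset is stored once."""
    unique_assets = set().union(*scenes.values())
    return sum(sizes[asset] for asset in unique_assets)


separate = size_with_one_bundle_per_scene(scene_assets, asset_size_mb)
shared = size_with_single_bundle(scene_assets, asset_size_mb)
print(separate, shared, separate - shared)  # prints "140 105 35"
```

Here Rock and Player are stored twice when the scenes are bundled separately, which costs 35 MB of duplication in exchange for the two bundles being fully independent of each other.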

    The Addressables package is a convenience layer built on top of AssetBundles, so largely the same layout behavior applies there as well. But you won't need to manually load AssetBundles when you use it.

    Also thanks for your thoughts about how this can improve in the future, these ideas are helpful as we plan future improvements in the build area.
     
  8. Apex_Dev_03

    Apex_Dev_03

    Joined:
    Aug 18, 2023
    Posts:
    3
    I have the same problem as @Baste and have come to the same conclusions about Asset Bundles and Addressables. The number of bugs that could be introduced by referencing everything by string and manually managing dependencies / manually loading from disk far outweighs the benefits (because we would be using them to make our builds more deterministic rather than to swap/load assets at runtime, which we have no use for).

    Sometimes we need to release a hot fix that is a simple code change, and after rebuilding the diff is as much as 3 GB (~1/5 the project build size). It would be really nice to be able to recompile Assembly-CSharp in release mode and skip the asset portion of the build.