Official About Incremental Build and Asset Bundle Hashes

Discussion in 'Asset Bundles' started by AndrewSkow, May 12, 2023.

  1. AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    91
    The BuildPipeline.BuildAssetBundles() API is widely used in currently supported versions of Unity to build AssetBundles. This post describes how the AssetBundle hash and Incremental Build work in the context of that API, and gives some recommendations based on known limitations.

    In particular we recommend:
    • Doing clean builds when building official releases.
    • When calling UnityWebRequestAssetBundle.GetAssetBundle(), avoid using the AssetBundle hash as the version.
    • (Updated) Starting with 2022.3.8f1 you can use the BuildAssetBundleOptions.UseContentHash flag when building bundles, which makes the AssetBundle hash safer for use with UnityWebRequestAssetBundle.GetAssetBundle().
    The rest of this post explains why these recommendations arise, by diving into some details of how the build, hashing and caching work.

    How does Incremental Build work?
    In the BuildPipeline.BuildAssetBundles() implementation, the AssetBundle hash is used to capture the contents and dependencies of the AssetBundle.

    The flow is as follows:

    • Calculate the AssetBundle hash based on the current inputs (see later section for details)

    • If there is a .manifest file from a previous build of the AssetBundle then load the hash and compare

    • If the hashes match then calculate the TypeTreeHash

    • If the AssetBundle Hash and TypeTreeHash both match then do not rebuild the AssetBundle (unless the ForceRebuildAssetBundle flag is specified)

    • If the AssetBundle is built then the new AssetBundle and TypeTree hashes are serialized in the newly generated .manifest file.
    This is an example Manifest, showing the data that supports the incremental build.

    Code (CSharp):
    ManifestFileVersion: 0
    CRC: 2088487739
    Hashes:
      AssetFileHash:
        serializedVersion: 2
        Hash: 01c155e080f1c19eab7eacdf3723cae7
      TypeTreeHash:
        serializedVersion: 2
        Hash: 55501093163a37cf23c863ea4050548f
    HashAppended: 0
    ClassTypes:
    - Class: 114
      Script: {fileID: 11500000, guid: 0652a8087db49d248a409593b2b5624b, type: 3}
    - Class: 115
      Script: {instanceID: 0}
    SerializeReferenceClassIdentifiers: []
    Assets:
    - Assets/MyScriptableObject.asset
    Dependencies: []
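    To make the flow concrete, here is a minimal sketch of the rebuild decision written as plain C#. It is only an illustration of the steps listed above, not the actual engine implementation: the ManifestHashes type and the NeedsRebuild method are hypothetical names, and the hashes are passed in as strings rather than calculated.

```csharp
using System;

// Hypothetical container for the two hashes stored in a bundle's .manifest file.
public sealed class ManifestHashes
{
    public string AssetFileHash;
    public string TypeTreeHash;
}

public static class IncrementalBuildSketch
{
    // Returns true when the bundle would be rebuilt under the flow described in the post.
    // 'previous' is null when no .manifest file exists from an earlier build.
    // 'computeTypeTreeHash' is only invoked when the AssetBundle hash matches,
    // mirroring the order of the checks in the post.
    public static bool NeedsRebuild(
        ManifestHashes previous,
        string currentAssetFileHash,
        Func<string> computeTypeTreeHash,
        bool forceRebuild)
    {
        if (forceRebuild || previous == null)
            return true;

        if (previous.AssetFileHash != currentAssetFileHash)
            return true;

        return previous.TypeTreeHash != computeTypeTreeHash();
    }
}
```

    Note that the function only ever compares inputs. Nothing in it looks at the built file, which is why a gap in the input hash can leave stale content in place.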
    How is the AssetBundle Hash calculated?
    The AssetBundle hash is based on the inputs to the AssetBundle build, not the output of the build.

    This value is calculated by hashing a series of inputs that include:
    • TargetPlatform, subtarget

    • Explicitly and implicitly included assets (Specifically the artifactID from the AssetDatabase)

    • Names of the AssetBundles that it depends on

    • Mesh stripping setting

    • Certain BuildAssetBundleOptions

    • For scene bundles: certain global lighting settings (e.g. lightmap mode, fog mode, etc)

    • Shader platform / graphics APIs

    • For bundles with shaders: certain Render Pipeline assets
    Although this is a fairly exhaustive calculation, it does not capture every possible influence on the build. Based on bug reports we are aware of some limitations. For example, the AssetBundle hash does not change in the following cases:

    • If an asset is moved to a new path (fixed in Unity 2022 and later)

    • If a MonoBehaviour or ScriptableObject keeps the same class name, but moves to a new assembly or namespace

    • If an asset inside a dependent AssetBundle moves to another dependent AssetBundle

    What is the TypeTreeHash?
    The Native AssetBundle build implementation uses a second hash value during Incremental Build calculations, called the TypeTreeHash. This value is visible in the .manifest file, so this section briefly explains how that second value works.

    This hash is derived from all the types involved in the AssetBundle. These types are listed in the ClassTypes section of the manifest file. For each type, the hash of its TypeTree is fed into the TypeTreeHash.

    For Script types the ClassName, Namespace and Assembly are also hashed.

    The purpose of this hash is to detect whether any objects used in the AssetBundle have newer serialization formats, for example after adding new fields to a MonoScript or updating to a new version of Unity that changes some built-in objects. A change in serialization format means that the AssetBundle should be regenerated to reflect the latest serialized schema for those objects. Unity makes a best effort to stay backward compatible with older serialized schemas, e.g. when TypeTrees are included in the AssetBundle, but for performance and compatibility it is normally best to rebuild content whenever a type changes.

    Warning: this TypeTree hash is not part of the AssetBundle hash. So while a change in this hash can force a rebuild during an incremental build, it doesn’t force a change in the overall hash value for the AssetBundle. This is one of the reasons that the AssetBundle hash is not an ideal value for tracking file versions.

    This check can be disabled by specifying the BuildAssetBundleOptions.IgnoreTypeTreeChanges flag.

    Can the known limitations be fixed?
    The cases of incomplete hashing can have serious impact, with the potential of null references, crashes or other unexpected failures on end user devices. That is because an older AssetBundle, with out-of-date content, might have the same hash as a newly built AssetBundle that has correct content. We can call this a “hash conflict”.

    As mentioned above we are aware of some limitations in the hashing algorithm, so a logical step would be to fix these known limitations.

    However, because the input calculation and the visible Hash are the same thing, there are backward compatibility challenges for improving the incremental build calculation.

    For example, if Unity starts to incorporate more information about script types into the hash, then this would change the hash for all existing AssetBundles that have MonoBehaviours and ScriptableObjects. Existing projects that do a minor upgrade of Unity version might suddenly see all their stable AssetBundles requiring a new build, even when the resulting content is actually unchanged. Rebuilding can take a long time, and deploying new AssetBundles can result in large usage of bandwidth or excessive downloads to devices. So we try to keep things quite stable in the area of AssetBundles on our Long Term Support versions of Unity. Because of that concern, fixes to the Incremental build calculation are not normally backported, and require the introduction of new flags.

    For new releases of Unity we are able to improve the AssetBundle pipeline code and reduce these limitations. That is because AssetBundle content will practically always change, at least a little bit, when doing a major upgrade of Unity for an existing project, so it is a good opportunity to introduce code changes that effectively force a clean build.

    Clean Builds
    The risk of incremental builds is that Unity might decide that an AssetBundle from a previous build is valid, based on all the checks that it performs as it calculates the AssetBundle hash and TypeTree hash. Because those checks are not 100% exhaustive, Unity may leave an AssetBundle alone that would actually have different content if it had been rebuilt.

    The ForceRebuildAssetBundle flag can be used to force each AssetBundle to rebuild, even if the input hash and TypeTree hash have not changed. Because incremental builds rely on the .manifest files, it is also possible to force the rebuild of AssetBundles by erasing the build folder. In fact, erasing the output folder prior to a build can be a good approach, because it clears out any obsolete or renamed AssetBundles before the fresh build.
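    As a rough sketch, a clean build that combines both ideas (erase the output folder, then pass ForceRebuildAssetBundle) could look like this editor script. The output folder "Build/AssetBundles", the menu path and the class name are placeholders chosen for the example, and it assumes the bundles are defined through the Inspector / AssetDatabase.

```csharp
using System.IO;
using UnityEditor;
using UnityEngine;

public static class CleanBundleBuild
{
    // Hypothetical output folder, relative to the project root. Adjust to your layout.
    const string OutputPath = "Build/AssetBundles";

    [MenuItem("Build/Clean AssetBundle Build")]
    public static void Build()
    {
        // Erasing the folder removes old .manifest files and any obsolete bundles.
        if (Directory.Exists(OutputPath))
            Directory.Delete(OutputPath, recursive: true);
        Directory.CreateDirectory(OutputPath);

        AssetBundleManifest manifest = BuildPipeline.BuildAssetBundles(
            OutputPath,
            BuildAssetBundleOptions.ForceRebuildAssetBundle |
            BuildAssetBundleOptions.AssetBundleStripUnityVersion,
            EditorUserBuildSettings.activeBuildTarget);

        if (manifest == null)
            Debug.LogError("Clean AssetBundle build failed");
    }
}
```

    Once the folder has been erased the flag is redundant, but keeping it makes the intent explicit if someone later removes the delete step.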

    Of course the downside of clean builds is the performance cost of repeating unnecessary build work, potentially adding many hours to the build time. In more advanced situations, where users have a very precise idea of which AssetBundles need to be rebuilt, it could be feasible to erase individual .manifest files instead of using ForceRebuildAssetBundle. That would be a way to force certain AssetBundles to rebuild while leaving others that have predictable content to be handled by the regular Incremental Build calculation.
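    Erasing individual .manifest files needs nothing beyond System.IO. The helper below is a small sketch of that idea. ManifestEraser and its parameters are hypothetical names, and it assumes each bundle's manifest sits next to the bundle as "<bundle name>.manifest", which is the layout BuildPipeline.BuildAssetBundles() produces.

```csharp
using System.Collections.Generic;
using System.IO;

public static class ManifestEraser
{
    // Deletes "<bundle>.manifest" for each named bundle so that the next
    // incremental build has no previous hash to compare against for it.
    // Returns the number of manifest files that were actually deleted.
    public static int EraseManifests(string buildFolder, IEnumerable<string> bundleNames)
    {
        int erased = 0;
        foreach (string bundleName in bundleNames)
        {
            string manifestPath = Path.Combine(buildFolder, bundleName + ".manifest");
            if (File.Exists(manifestPath))
            {
                File.Delete(manifestPath);
                erased++;
            }
        }
        return erased;
    }
}
```

    A build script would call this just before BuildPipeline.BuildAssetBundles(), for example ManifestEraser.EraseManifests(buildPath, new[] { "bundle_prefab" }).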

    Incremental builds may make sense for internal builds, e.g. testing builds during production, to help with iteration time. The risk of a hash conflict can exist but with less serious impact. And in fact hash conflicts can be detected with some extra code running as part of the build script.

    Doing a clean build doesn’t prevent the possibility that multiple versions of an AssetBundle can have the same AssetBundle hash. Instead, it just ensures that the “correct” current version is generated.

    Overview of UnityWebRequestAssetBundle and the AssetBundle Cache
    The UnityWebRequestAssetBundle API makes it easy to incorporate AssetBundle downloading into a player build, especially because it is available on all supported platforms. On most platforms this includes caching support.

    In order to use the AssetBundle cache a version must be specified, otherwise the same bundle can be downloaded over and over again, every time it is requested.

    It is up to the user to provide any 128-bit (hash) or 32-bit (uint) value they like to distinguish the “version” of the AssetBundle. This could be the hash value calculated by the AssetBundle build, or a regular numeric version number (1,2,3…) or some other value that fits into the customer’s build and release system.

    For example the second argument to this signature is the version hash:

    Code (CSharp):
    UnityWebRequest UnityWebRequestAssetBundle.GetAssetBundle(Uri uri, Hash128 hash, uint crc);
    The specified version is recorded in the cache, along with the downloaded AssetBundle.

    If, at a later time, the code attempts to download the same bundle again and specifies the exact same version hash, then the cached version is reused rather than downloaded again. If the version hash does not match then the AssetBundle is downloaded again, and becomes the newly cached version.

    Note: this check simply compares the provided 128-bit value with the 128-bit value that was recorded when the AssetBundle was put into the cache. No hashing is performed at that point.

    So, to successfully use this design, it is important to update the version when a new AssetBundle build is released for download. That way devices will download and use the newer version instead of a previous version that might be cached locally.

    Note: If a downloaded file is using LZMA format (which is the default) then it is recompressed on the device to LZ4 and put into the cache. The AssetBundles can also be cached in uncompressed format by setting Caching.enableCompression to false. These compression transformations can have implications if the full file hash is being used as a unique AssetBundle version identifier.
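    For reference, a cached download with an explicit version might look like the coroutine below. This is a sketch rather than production code: BundleDownloader is a hypothetical component, the url, version and crc values are expected to come from your own release data, and the request.result check assumes Unity 2020.2 or newer.

```csharp
using System.Collections;
using UnityEngine;
using UnityEngine.Networking;

public class BundleDownloader : MonoBehaviour
{
    // 'version' is whatever 128-bit value your release process assigns to this build
    // of the bundle. 'crc' may be 0 to skip the CRC check at download time.
    public IEnumerator Download(string url, Hash128 version, uint crc)
    {
        using (UnityWebRequest request = UnityWebRequestAssetBundle.GetAssetBundle(url, version, crc))
        {
            yield return request.SendWebRequest();

            if (request.result != UnityWebRequest.Result.Success)
            {
                Debug.LogError($"Download failed for {url}: {request.error}");
                yield break;
            }

            // Served from the cache if this version was downloaded before.
            AssetBundle bundle = DownloadHandlerAssetBundle.GetContent(request);
            if (bundle == null)
            {
                Debug.LogError($"Could not load AssetBundle from {url}");
                yield break;
            }

            Debug.Log($"Loaded {bundle.name} at version {version}");
        }
    }
}
```

    It would be started with StartCoroutine(Download(url, version, crc)) from a MonoBehaviour.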

    Should the AssetBundle Hash be used as a version?
    It can be tempting to use the AssetBundle hash reported in the Manifest files as a version hash for the AssetBundle cache. For example a user may distribute the Manifest AssetBundle after doing a new build, and then run code in the player to enumerate AssetBundles and get their hashes, e.g. with AssetBundleManifest.GetAssetBundleHash. That hash might then be provided when calling GetAssetBundle() as a way to enable caching.

    However, as mentioned previously, there are limitations of the AssetBundle hash. So it is possible that an AssetBundle is rebuilt with new content, but the AssetBundle hash is exactly the same, which is a “hash conflict”. That means that a device may have an older, incompatible version of the AssetBundle cached locally, and it will be stuck using the old one instead of downloading the new one because it thinks it already has that version. This could show up as unexpected behavior, like missing content or crashes on end user devices, that are not reproduced when using a fresh install.

    Recovery from such a situation might require some extra coding. If there are two bundles in circulation with the same version hash then the CRC can be useful. There is support in the AssetBundle.LoadFromFile() API to check the CRC and fail the load if the content doesn’t match the expected CRC. This would detect an incompatible AssetBundle, so that it can be discarded. However, doing a CRC check on each AssetBundle load can really slow things down, so normally we only recommend it for use with UnityWebRequestAssetBundle.GetAssetBundle(). That API only checks the CRC at the time of download, which is efficient but doesn’t help if an AssetBundle with the wrong CRC is already cached! Users could potentially write their own code to check CRCs more efficiently, e.g. only checking the CRC of cached AssetBundles when a new release of a game has occurred. The Caching API can be used to enumerate the cache and clear individual items.
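    As an illustration of that last point, the sketch below uses Caching.GetCachedVersions() and Caching.ClearCachedVersion() to list the cached versions of one bundle and evict a suspect one. BundleCacheTools is a hypothetical helper, and bundleName is an assumption: it has to match the name under which the bundle was stored in the cache, which depends on how it was requested.

```csharp
using System.Collections.Generic;
using UnityEngine;

public static class BundleCacheTools
{
    // Evicts one cached version of a bundle so that the next request downloads it again.
    // Returns true if the version was found in the cache and cleared.
    public static bool EvictVersion(string bundleName, Hash128 suspectVersion)
    {
        var cachedVersions = new List<Hash128>();
        Caching.GetCachedVersions(bundleName, cachedVersions);

        foreach (Hash128 cached in cachedVersions)
            Debug.Log($"Cached version of {bundleName}: {cached}");

        if (!cachedVersions.Contains(suspectVersion))
            return false;

        // Fails (returns false) if the bundle is currently in use.
        return Caching.ClearCachedVersion(bundleName, suspectVersion);
    }
}
```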

    It is also possible to detect hashing conflicts at the time of the build, and avoid releasing a new bundle if its hash has not changed. Resolving this situation might require renaming bundles or making small changes to the content to bypass the problem.

    Alternative Version Approaches
    Rather than facing the challenge of recovering from a “hash conflict”, it seems better to use a more reliably unique value as the AssetBundle version.

    For example the CRC itself can serve as a version identifier (widened into a 128-bit value). Because it is calculated based on the uncompressed content it is resilient to compression changes, and it is based on the actual built content, so it does not have the flaws of the Native AssetBundle hash calculation. The .manifest file itself could be hashed, because it contains the CRC along with other distinguishing content. Or the build pipeline of a production may have its own version counting or time stamp available that can serve as a unique version identifier.

    It is also possible to hash the bytes of the AssetBundle file. Hashing file content is a common and robust way to assign a unique version identifier to a file. However when using this approach it is recommended to use the BuildAssetBundleOptions.AssetBundleStripUnityVersion flag when building the AssetBundles. And it is important to be aware of the compression changes that can occur if AssetBundles are built with LZMA, instead of the compression used in the Cache (LZ4 or Uncompressed), because any recompression will change the file’s content.

    No matter what method is used, there needs to be a mechanism to distribute these versions as a new build is uploaded. This could be a simple JSON file that is generated as a post-build step, and the player build will download this file as a first step of checking for updated AssetBundles.
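    A post-build step along those lines could be sketched as follows, using the CRC as the version. The JSON layout and the names BundleVersionEntry, BundleVersionList and BundleVersionWriter are made up for this example. JsonUtility cannot serialize dictionaries, hence the list of entries.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using UnityEditor;
using UnityEngine;

[Serializable]
public class BundleVersionEntry
{
    public string name;    // bundle path relative to the build folder
    public string version; // 128-bit version as a 32 character hex string
    public uint crc;       // CRC reported in the bundle's .manifest file
}

[Serializable]
public class BundleVersionList
{
    public List<BundleVersionEntry> bundles = new List<BundleVersionEntry>();
}

public static class BundleVersionWriter
{
    // Call after BuildPipeline.BuildAssetBundles() with the manifest it returned.
    public static void Write(string buildFolder, AssetBundleManifest manifest, string jsonPath)
    {
        var list = new BundleVersionList();
        foreach (string bundleName in manifest.GetAllAssetBundles())
        {
            string bundlePath = Path.Combine(buildFolder, bundleName);
            if (!BuildPipeline.GetCRCForAssetBundle(bundlePath, out uint crc))
            {
                Debug.LogWarning("Could not read CRC for " + bundlePath);
                continue;
            }

            // Widen the 32-bit CRC into a 128-bit value by zero-filling the rest.
            var version = new Hash128(crc, 0, 0, 0);
            list.bundles.Add(new BundleVersionEntry
            {
                name = bundleName,
                version = version.ToString(),
                crc = crc
            });
        }

        File.WriteAllText(jsonPath, JsonUtility.ToJson(list, true));
    }
}
```

    On the player side the file can be read back with JsonUtility.FromJson<BundleVersionList>() and each version string turned into a Hash128 with Hash128.Parse() before it is passed to GetAssetBundle().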

    Update: BuildAssetBundleOptions.UseContentHash

    Starting with 2022.3.8f1 we have introduced a new flag, BuildAssetBundleOptions.UseContentHash. When specified Unity will use the content for the AssetBundle hash, instead of the input hash described previously. The decision for whether to rebuild an AssetBundle is still dependent on the input hash, so that will also be tracked as an additional value in the .manifest file. Using the flag means the AssetBundle hash is safe to use for UnityWebRequestAssetBundle and "hash conflicts" should not occur.

    There can still be bugs in edge cases that impact incremental builds, where an AssetBundle does not rebuild even though some input that influences the build results has changed. So it remains a recommendation to use clean builds for official releases.

    It is strongly recommended to also use the BuildAssetBundleOptions.AssetBundleStripUnityVersion flag when UseContentHash is used, so that the content does not change after minor Unity upgrades.

    Note also the content-based AssetBundle hash cannot be calculated without actually building the AssetBundle. So it is no longer available when BuildAssetBundleOptions.DryRunBuild is specified.
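    A build call that follows these recommendations might look like the sketch below. It requires 2022.3.8f1 or later, because the flag does not exist in earlier versions. The output folder, menu path and class name are placeholders.

```csharp
using System.IO;
using UnityEditor;
using UnityEngine;

public static class ContentHashBuild
{
    [MenuItem("Build/AssetBundles (Content Hash)")]
    public static void Build()
    {
        // Hypothetical output folder, relative to the project root.
        const string outputPath = "Build/AssetBundles";
        Directory.CreateDirectory(outputPath);

        // UseContentHash makes the reported AssetBundle hash depend on the built content.
        // AssetBundleStripUnityVersion keeps that content stable across minor Unity upgrades.
        BuildAssetBundleOptions options =
            BuildAssetBundleOptions.UseContentHash |
            BuildAssetBundleOptions.AssetBundleStripUnityVersion;

        AssetBundleManifest manifest = BuildPipeline.BuildAssetBundles(
            outputPath, options, EditorUserBuildSettings.activeBuildTarget);

        if (manifest == null)
        {
            Debug.LogError("AssetBundle build failed");
            return;
        }

        foreach (string bundleName in manifest.GetAllAssetBundles())
            Debug.Log($"{bundleName}: {manifest.GetAssetBundleHash(bundleName)}");
    }
}
```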

    Other AssetBundle APIs
    The Scriptable Build Pipeline (used by Addressables) and the Multi-process form of BuildPipeline.BuildAssetBundles (introduced in 2023.1) use content inside the bundle to calculate the AssetBundle hash. That makes them much more resilient to the risk of a “hash conflict”, and safer to use with UnityWebRequestAssetBundle.GetAssetBundle().

    The BuildAssetBundleOptions.AssetBundleStripUnityVersion flag is recommended (or the equivalent flag in Addressables) so that doing a minor upgrade in Unity does not force a change to the AssetBundle hash.

    Doing a clean build for official releases is always recommended, including when using Addressables. While our newer approaches have better input calculations, there still can be some cases where a global setting, build callback or other factor can influence the content of the AssetBundle in a way that is not predicted by the incremental build calculation.

    Conclusion
    Hopefully this deep dive into the details of the AssetBundle incremental build support is helpful for managing AssetBundles using BuildPipeline.BuildAssetBundles(). The implementations available through Addressables and improvements for 2023 have addressed the risk of “hash conflicts”. And BuildAssetBundleOptions.UseContentHash is available in more recent versions of 2022. But older versions of Unity and BuildPipeline.BuildAssetBundles() are still widely used. This older API has been successfully used by many projects to deploy content, but being aware of the risk of hash conflicts can help avoid some potential pitfalls.

    We also hope that, by posting this to the forum, the community will chime in and share some techniques and best practices for dealing with Incremental Builds and the AssetBundle cache.
     
    Last edited: Aug 14, 2023
  2. AndrewSkow

    I've also prepared a demonstration script to help make the rather technical text a bit more concrete.

    It has only had light testing, so the usual disclaimers apply: putting it into production is at your own risk. If you find any bugs or make improvements I would like to hear about it in this thread, and I hope it proves helpful.

    Code (CSharp):
    using System;
    using System.Collections.Generic;
    using System.IO;
    using System.Security.Cryptography;
    using System.Text;
    using UnityEditor;
    using UnityEditor.Build.Content; // Only needed for the ContentBuildInterface alternative below
    using UnityEngine;

    // Sample code showing how to use existing Unity and System APIs to monitor the output of an AssetBundle Incremental Build,
    // including detection of "hash conflicts".  It was tested with 2021.3, please report any bugs to the Unity forums where
    // this was posted.  Feel free to incorporate into your own build scripts if it is useful.

    // Use: Update the script according to your own build layout.  Then run the menu item "Test Repro/Build Asset Bundles".
    // The output is written to the Console window.

    public class BuildBundles
    {
        static AssetBundleBuild[] GetBundleDefinitions()
        {
            // Adjust this method according to your own Assets and desired build layout

            // Change "true" to "false" to build the bundles defined via Inspector / AssetDatabase instead
    #if true
            AssetBundleBuild[] bundleDefinitions = new AssetBundleBuild[2];

            bundleDefinitions[0].assetBundleName = "bundle_prefab";
            bundleDefinitions[0].assetNames = new string[]
            {
                "Assets/MyPrefab.prefab",
            };

            bundleDefinitions[1].assetBundleName = "bundle_sobject";
            bundleDefinitions[1].assetNames = new string[]
            {
                "Assets/MyScriptableObject.asset"
            };

            return bundleDefinitions;
    #else
            // This call can be used to build AssetBundles as defined via Inspector / AssetDatabase
            return ContentBuildInterface.GenerateAssetBundleBuilds();
    #endif
        }

        [MenuItem("Test Repro/Build Asset Bundles")]
        static void BuildAssetBundles()
        {
            string buildPath = Application.streamingAssetsPath;

            // Build the AssetBundles
            // To test it out you can experiment with calling it again, e.g. after making changes to the ScriptableObject code, and seeing what rebuilds
            // A new player build should happen each time the bundles change

            var bundleDefinitions = GetBundleDefinitions();

            // The constructor records the state of the previous build (if any) before the new build overwrites it
            var reporter = new IncrementalBuildReporter(buildPath);
            reporter.ReportToConsole();

            Directory.CreateDirectory(buildPath);

            // Note: BuildAssetBundleOptions.ForceRebuildAssetBundle is intentionally not specified, so that the effect of Incremental Build can be viewed
            var manifest = BuildPipeline.BuildAssetBundles(buildPath,
                bundleDefinitions,
                BuildAssetBundleOptions.AssetBundleStripUnityVersion,
                BuildTarget.StandaloneWindows64);

            if (manifest != null)
            {
                reporter.DetectBuildResults();
                reporter.ReportToConsole();
            }
            else
            {
                Debug.LogError("Build failed");
            }
        }
    }

    struct BundleBuildInfo
    {
        public Hash128 bundleHash; // As calculated by Unity
        public uint crc;           // As calculated by Unity
        public DateTime timeStamp; // Last write time, used to detect if the bundle was rebuilt or the previous file was reused
        public string contentHash; // MD5, expected to always change when the CRC changes and vice versa

        public override string ToString()
        {
            return $"Unity hash: {bundleHash} Content MD5: {contentHash} CRC: {crc.ToString("X8")} Write time: {timeStamp}";
        }
    }

    public class IncrementalBuildReporter
    {
        Dictionary<string, BundleBuildInfo> m_previousBuildInfo; // map from path to BuildInfo
        string m_buildPath; // Typically a relative path within the Unity project
        string m_manifestAssetBundlePath;

        StringBuilder m_report;

        public IncrementalBuildReporter(string buildPath)
        {
            m_report = new StringBuilder();
            m_buildPath = buildPath;
            m_previousBuildInfo = new();

            var directoryName = Path.GetFileName(buildPath);

            // Special AssetBundle that stores the AssetBundleManifest follows this naming convention
            m_manifestAssetBundlePath = buildPath + "/" + directoryName;

            if (!File.Exists(m_manifestAssetBundlePath))
            {
                // Expected on the first build
                m_report.AppendLine("No Previous Build Found");
                return;
            }

            m_report.AppendLine("Collecting info from previous build");
            CollectBuildInfo(m_previousBuildInfo);
        }

        public void ReportToConsole()
        {
            Debug.Log(m_report.ToString());
            m_report.Clear();
        }

        private void CollectBuildInfo(Dictionary<string, BundleBuildInfo> bundleInfos)
        {
            var manifestAssetBundle = AssetBundle.LoadFromFile(m_manifestAssetBundlePath);
            if (manifestAssetBundle == null)
            {
                // LoadFromFile returns null on failure, e.g. if the file is corrupt or already loaded
                m_report.AppendLine("Failed to load manifest AssetBundle " + m_manifestAssetBundlePath);
                return;
            }

            try
            {
                var assetBundleManifest = manifestAssetBundle.LoadAsset<AssetBundleManifest>("AssetBundleManifest");

                var allBundles = assetBundleManifest.GetAllAssetBundles();
                foreach (var bundleRelativePath in allBundles)
                {
                    // bundleRelativePath is the AssetBundle's path relative to the root build folder
                    var bundlePath = m_buildPath + "/" + bundleRelativePath;

                    if (!File.Exists(bundlePath))
                    {
                        // Bundles may have been manually erased or moved after the build
                        m_report.AppendLine("AssetBundle " + bundlePath + " is missing from disk");
                        continue;
                    }

                    // Get the CRC from the bundle's .manifest file
                    if (!BuildPipeline.GetCRCForAssetBundle(bundlePath, out uint crc))
                    {
                        m_report.AppendLine("Failed to read CRC from manifest file of " + bundlePath);
                        continue;
                    }

                    var fileInfo = new FileInfo(bundlePath);

                    var bundleInfo = new BundleBuildInfo()
                    {
                        bundleHash = assetBundleManifest.GetAssetBundleHash(bundleRelativePath),
                        crc = crc,
                        // The last write time changes whenever the file is written again.
                        // The creation time can stay the same when a file is overwritten in place.
                        timeStamp = fileInfo.LastWriteTime,
                        contentHash = GetMD5HashFromAssetBundle(bundlePath)
                    };

                    bundleInfos.Add(bundleRelativePath, bundleInfo);
                }
            }
            finally
            {
                manifestAssetBundle.Unload(true);
            }
        }

        public void DetectBuildResults()
        {
            m_report.AppendLine().AppendLine("Collecting results of new Build:");

            var newBuildInfo = new Dictionary<string, BundleBuildInfo>();
            CollectBuildInfo(newBuildInfo);

            foreach (KeyValuePair<string, BundleBuildInfo> dictionaryEntry in newBuildInfo)
            {
                string bundlePath = dictionaryEntry.Key;
                BundleBuildInfo newBundleInfo = dictionaryEntry.Value;

                if (m_previousBuildInfo.TryGetValue(bundlePath, out BundleBuildInfo previousBundleInfo))
                {
                    if (previousBundleInfo.timeStamp == newBundleInfo.timeStamp)
                    {
                        // Bundle was not rebuilt.  Do some sanity checking just in case the timestamp is misleading
                        if (previousBundleInfo.crc != newBundleInfo.crc ||
                            previousBundleInfo.contentHash != newBundleInfo.contentHash)
                        {
                            m_report.AppendLine($"*UNEXPECTED* [Timestamp match with new content]: {bundlePath}\n\tNow: {newBundleInfo}\n\tWas: {previousBundleInfo}");
                        }
                        else
                        {
                            // Incremental build decided not to build this bundle
                            m_report.AppendLine($"[Not rebuilt]: {bundlePath}\n\t{newBundleInfo}");
                        }
                    }
                    else if (previousBundleInfo.bundleHash == newBundleInfo.bundleHash)
                    {
                        if (previousBundleInfo.crc != newBundleInfo.crc)
                        {
                            // Hash is the same, but according to the CRC, the bundle has new content, so this is a "hash conflict".
                            // This is problematic if the hash is used to distinguish different versions of the AssetBundle (e.g. along with the AssetBundle cache)
                            // If this occurs be wary of releasing this build.
                            m_report.AppendLine($"*WARNING* [New CRC content, but unchanged hash]: {bundlePath}\n\tNow: {newBundleInfo}\n\tWas: {previousBundleInfo}");
                        }
                        else if (previousBundleInfo.contentHash != newBundleInfo.contentHash)
                        {
                            // Normally shouldn't happen, because the CRC check above should also trigger
                            m_report.AppendLine($"*WARNING* [New file content, unchanged hash]: {bundlePath}");
                        }
                        else
                        {
                            // Expected with ForceRebuildAssetBundle or if Unity is being conservative and rebuilding something that might have changed
                            m_report.AppendLine($"[Rebuilt, identical content]: {bundlePath}\n\tNow: {newBundleInfo}\n\tWas: {previousBundleInfo}");
                        }
                    }
                    else
                    {
                        if (previousBundleInfo.contentHash == newBundleInfo.contentHash)
                        {
                            // Expected if the incremental build heuristic has changed, e.g. when upgrading Unity
                            m_report.AppendLine($"[Rebuilt, new hash produced identical content]: {bundlePath}\n\tNow: {newBundleInfo}\n\tWas: {previousBundleInfo}");
                        }
                        else
                        {
                            // The normal case for an AssetBundle that required a rebuild
                            m_report.AppendLine($"[Rebuilt, new content]: {bundlePath}\n\tNow: {newBundleInfo}\n\tWas: {previousBundleInfo}");
                        }
                    }

                    // Clear it out so that we can detect obsolete AssetBundles
                    m_previousBuildInfo.Remove(bundlePath);
                }
                else
                {
                    m_report.AppendLine($"[Brand new]: {bundlePath}\n\t{newBundleInfo}");
                }
            }

            // Anything left in this structure was not matched with an AssetBundle from the new build (so potentially can be erased)
            foreach (KeyValuePair<string, BundleBuildInfo> dictionaryEntry in m_previousBuildInfo)
            {
                m_report.AppendLine($"[Obsolete bundle]: {dictionaryEntry.Key}\n\t{dictionaryEntry.Value}");
            }
        }

        private static string GetMD5HashFromAssetBundle(string fileName)
        {
            // Note: The file content will change if the compression is changed (e.g. the LZMA -> LZ4 conversion done for AssetBundle cache)
            // Tip: the AssetBundleStripUnityVersion flag is recommended if you use hashing to track AssetBundle versions.

            // Open read-only, and dispose the stream and hasher even if hashing throws
            using (FileStream file = File.OpenRead(fileName))
            using (var md5 = MD5.Create())
            {
                byte[] hash = md5.ComputeHash(file);

                // Convert to a lowercase hex string
                var sb = new StringBuilder();
                for (int i = 0; i < hash.Length; i++)
                    sb.Append(hash[i].ToString("x2"));
                return sb.ToString();
            }
        }
    }
     
  3. YuriGrachev

    YuriGrachev

    Joined:
    Jul 12, 2016
    Posts:
    6
    Hi @AndrewSkow,

    Thank you for such a deep dive.

    We are among those using Legacy Asset Bundles. We try to keep DLC as stable as we can between game updates to avoid wasting bandwidth, and I believe my conversation with Unity support was one of the reasons this thread was started. Our studio ran into all of these hash conflicts at once not so long ago, and they suddenly became one of our most painful problems: first asset path issues, followed a couple of days later by namespace/assembly issues, all accompanied by erroneous bundle downloads.

    Your suggestion to build asset bundles from scratch each time we prepare a release build is somewhat doubtful. The asset bundles for one of our projects total about 3GB at the moment. Each clean build takes about 3-4 hours to complete, while an incremental one takes 25 minutes. I can't say for sure what the reason for such a long build time is, but I suppose it is the shader compilation, even though we put all of the shaders in a single bundle along with a ShaderVariantCollection. Such long builds do not allow us to iterate when we need it the most, i.e. when we prepare a release. Nobody would wait half a working day to check that a specific minor change is good enough for a release. Preparing a release build usually takes us a few days with incremental asset bundle build iterations; it would grow to weeks if we used ForceRebuildAssetBundle.

    As for other problems we often encounter: I can't speak for others, but we quite often see inconsistent ASSET hash calculation, or assets being reimported for no apparent reason; I don't know for sure which it is. The only thing we have from our research is the hash we get from AssetDatabase.GetAssetDependencyHash(), and it gives us unstable results for assets that haven't been changed for years. For example, on two build agents building the same branch, the very same asset may get different hashes at a random moment. Then, after a full reimport using the Accelerator, we get stable hashes again, but they differ from the ones before. As a result, the asset bundle hashes change too, causing huge bandwidth waste for most of our users. We've even tried calling AssetDatabase.ForceReserializeAssets when updating the Unity version, but some asset hashes still change from time to time.

    I know that the Unity team is committed to solving such issues, but it is really hard to report a bug when the hashes of a subset of our assets just change for no apparent reason. Without knowing the source of the change, we can't switch the hashes back to reproduce the issue. The only thing I know for sure is that most of the time this happens with FBX files, and more rarely with textures.

    For example, I once caught a change in the m_MeshMetrics field (the only field that changed) and reported it, but the person on your side said he failed to reproduce anything like it, so it went no further.

    Previously we've been told that the assets' hashes are the ones taken into account while calculating the asset bundle hash. Now you say it is artifactId that is fed into the hasher. Whom should I believe? Or is it the same entity?

    I think it would be better for all of us if Unity gave us a method for getting an unquestionable unique identifier of the asset bundle contents, like the asset bundle hash at its best. We can indeed build systems to calculate such identifiers ourselves, but they would not be robust against internal changes to the Unity engine that may happen in the future.

    As for compatibility with LTS versions of Unity: if you add terms to the hash calculation but make them optional, nobody gets an issue. On the other hand, everybody with the same issues as ours will be happy, or will be able to become happy just by adding a flag.
     
  4. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    91
    Thanks YuriGrachev for the detailed summary of the challenges you face.

    Cases where Assets import differently on different machines are definitely something that we want to track down, and as you have experienced it is quite difficult to isolate them well enough to make individual fixes. I have seen a recent bug report in the bug system about some sort of rounding difference in the floating point numbers of an Animation, but multiple people have tried to reproduce it without success, so it's quite hard to address those. They are definitely serious issues when they hit, because they force changes in the build output as well. Any info about repro steps will be helpful.

    Regarding:
    "Previously we've been told that the assets' hashes are the ones taken into account while calculating the asset bundle hash. Now you say it is artifactId that is fed into the hasher."

    Inside the AssetDatabase the ArtifactID is the hash of the content of the Imported Asset, which is why someone could have described it as the Asset Hash. Overall there are lots of details in the AssetDatabase, so these attempts at high level descriptions can sometimes be somewhat vague. I hope they don't end up more confusing than enlightening.

    Also we totally agree that relying on clean builds is not a satisfactory workflow, and your description of your real life build times and pain helps support our motivation to keep improving it. The AssetBundle building support introduced in 2023.1 has better incremental build support, and uses output content for the AssetBundle hash. But that required quite a big change in how the build works, so it is not something that we can backport into stable releases of Unity.

    Regarding "I can't say for sure what the reason for such a long build-time is, but I suppose it is the shader compilation, though we put all of the shaders in a single bundle with ShaderVariantCollection along."
    Shader compilation is often the slowest part of an AssetBundle build, and retaining the Shader cache between builds can help with that. Some information about how long each step of the build takes is visible in the Build Report (e.g. with the Build Report Inspector). Shader compilation can happen any time a Shader is saved into an AssetBundle, but because you already put them together you should be able to get an idea of the Shaders' influence on your build times.

    It's not a really satisfactory workaround, but it should be technically possible to do a regular incremental build and "force" certain bundles to rebuild by erasing their manifest files prior to the build. E.g. if certain bundles, like the shader one, take most of the time to build, then maybe they can be left alone while other bundles are always rebuilt. It may not be worth the effort to explore that possibility, but I thought I should mention it just in case.
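
    A minimal sketch of that workaround, assuming the usual layout in which each bundle's manifest sits next to it as "<bundle path>.manifest". The class name, method name and the "keep" set are hypothetical, and whether removing a manifest is enough to trigger a rebuild should be verified on your Unity version. It deletes the manifests of every bundle except the ones you want the incremental build to leave alone:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: run it before BuildPipeline.BuildAssetBundles so that the
// build finds no previous hash for the bundles whose manifests were removed.
public static class ManifestPruner
{
    const string ManifestSuffix = ".manifest";

    // keepBundles holds bundle names relative to buildDir, using '/' separators.
    // Returns the full paths of the manifests that were deleted.
    public static List<string> DeleteManifestsExcept(string buildDir, ICollection<string> keepBundles)
    {
        var deleted = new List<string>();
        string root = Path.GetFullPath(buildDir)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        foreach (string manifestPath in Directory.GetFiles(root, "*" + ManifestSuffix, SearchOption.AllDirectories))
        {
            // "MyBuild/A.bundle.manifest" -> bundle "A.bundle"
            string bundlePath = manifestPath.Substring(0, manifestPath.Length - ManifestSuffix.Length);
            string bundleName = bundlePath.Substring(root.Length + 1).Replace('\\', '/');

            if (keepBundles.Contains(bundleName))
                continue; // leave this bundle to the normal incremental check

            File.Delete(manifestPath);
            deleted.Add(manifestPath);
        }
        return deleted;
    }
}
```

    Note that the folder-level manifest (named after the output folder) also matches this pattern, so add it to the keep set if you do not want it removed.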
     
  5. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    91
  6. YuriGrachev

    YuriGrachev

    Joined:
    Jul 12, 2016
    Posts:
    6
    We've had hash differences on a single machine, but for different branches/folders checked out simultaneously. We took several assets and tracked down their history in CVS, and we saw the same history, with no exclusive commits for these assets or for anything adjacent to them. We have no clue how to debug that :-/
    I may add that there is definitely something we don't know about that triggers an asset reimport at some point, resulting in new hashes for a set of assets.

    We do not delete the Shader cache, but still spend too much time building the asset bundles. Also, I've opened up the log once more and I see gaps of several minutes between the timestamps of the log messages, and these gaps look pretty random to me. It seems that the log output and whatever consumes the most time are running on different threads, so it is really difficult to say what causes this time-consuming "something". Most of the time these gaps occur when the logging thread outputs the shader compilation statistics, which is why I thought that shader compilation was the main reason. But the numbers the compilation process writes to the log say that there is no compilation at all (like, 0.00s in total).

    Sadly, the BuildReport along with the BuildReportInspector are only available for the builds of Player, but not of Asset Bundles. There is some sort of report available in SBP, but it is nothing compared to BuildReport. If only we could lay our hands on such a BuildReport but for Asset Bundles, it would be perfect!
     
  7. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    91
    Hi Yuri,

    I don't personally know enough about the shader compilation stage to guess at the cause of those random long time delays - it almost sounds like network or internet access glitches, but for something running on a single machine that wouldn't be a factor.

    But I do have some good news: AssetBundle builds do generate the BuildReport. In past months we fixed and backported a few bugs for AssetBundles, so the output should be more accurate (especially when there are Scenes in the bundle). It was a feature that was not well presented in the documentation or API.

    The in-memory BuildReport object is not returned by BuildPipeline.BuildAssetBundles the way that BuildPlayer returns it, but it is written to the same file (Library/LastBuild.buildreport). And the BuildReport Inspector Window, when opened after running the build, will find that file and copy it into the project (which also converts it from Binary to YAML).

    For convenience, Unity 2023 now includes the API Build.Reporting.BuildReport.GetLatestReport, but that is just a convenience function. In older versions the file can be loaded by path to access the BuildReport object.
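
    As a rough sketch of loading the file by path in older versions, the editor script below mirrors what the BuildReport Inspector is described as doing: it copies Library/LastBuild.buildreport into the project, imports it, and logs the duration of each build step. The class name, menu path and the Assets/BuildReports folder are hypothetical choices, and the script is assumed to live in an Editor folder. Treat it as a starting point rather than a definitive implementation.

```csharp
using System.IO;
using UnityEditor;
using UnityEditor.Build.Reporting;
using UnityEngine;

public static class LastBuildReportLoader
{
    const string SourcePath = "Library/LastBuild.buildreport";   // written by the build
    const string TargetDir = "Assets/BuildReports";              // hypothetical folder
    const string TargetPath = TargetDir + "/LastBuild.buildreport";

    [MenuItem("Tools/Log Last Build Report")]                    // hypothetical menu path
    public static void LogLastBuildReport()
    {
        if (!File.Exists(SourcePath))
        {
            Debug.LogWarning("No build report found at " + SourcePath);
            return;
        }

        // Copy the report into the project so the AssetDatabase can import it.
        Directory.CreateDirectory(TargetDir);
        File.Copy(SourcePath, TargetPath, true);
        AssetDatabase.ImportAsset(TargetPath);

        var report = AssetDatabase.LoadAssetAtPath<BuildReport>(TargetPath);
        if (report == null)
        {
            Debug.LogWarning("Could not load a BuildReport from " + TargetPath);
            return;
        }

        Debug.Log("Total build time: " + report.summary.totalTime);
        foreach (BuildStep step in report.steps)
        {
            // Indent by depth so nested steps read as a tree in the Console.
            Debug.Log(new string(' ', step.depth * 2) + step.name + ": " + step.duration);
        }
    }
}
```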
     
  8. Andrei-Sateanu

    Andrei-Sateanu

    Joined:
    May 5, 2021
    Posts:
    1
    Hey, firstly thanks for the lengthy post

    Secondly, I would like to share my experience as well with building asset bundles :)

    Some info about what we are using:
    • We started using Scriptable Build Pipeline with Caching & Accelerator to speed up building times.
    • We are using asset bundles not addressables
    • We also have CI/CD and assets are being built on different machines/nodes
    • Game is live and assets are constantly added/updated & downloaded so by no means we can do clean rebuilds
    Sadly, over time we encountered many issues related to hash differences, and as we were using the hash as the identifier, assets ended up being re-downloaded as mentioned above.
    • Builds made by different machines randomly had different hashes even though asset was the same
      • We limited for prod to a certain node
    • Builds using caching server on same node were generating different hashes
      • We disabled it
    • Then we started having "stable" hashes
    • Currently back at this subject and started by changing node that builds the assets
      • Tens of assets have changed hash even though it was built from same commit
      • I don't want to remove assets from 1st node (to check if they are rebuilt with the same hash as second)
      • On 1st machine we had consistent hashes when rebuilding same codebase (on same git commit)
    So taking into account what you mentioned i have some questions:
    • AssetBundleStripUnityVersion was recently made available in SBP, I think. This is painful enabled by default tbh; I see the recommendation is to remove it anyway. I assume once this is enabled all assets will be rebuilt with a different hash?
    • Since with SBP you can decide which bundles to build, is it safe to build only specific bundles and deploy them? Does this have any risk of hash collisions versus building everything?
      • I did this locally, and in tests I saw that the hash changes if you include dependencies vs. no dependencies, for example
    • Does the machine/project path/anything outside of the Unity project count towards the hash algorithm?
      • Several times it feels like it does, based on the issues I mentioned above
    • Is the SBP build algorithm different from BuildPipeline.BuildAssetBundles when it comes to asset bundles and not addressables?
    • Is SBP intended to be used for asset bundles or only for addressables?
    It feels like we had the same issues @YuriGrachev mentioned above, even when using SBP. And to make it "stable" we had to sacrifice every speed improvement implemented in it and resort to "just trust the local cache on the machine".

    And the proper solution would actually be to ditch using hashes and trust something else entirely (but this sounds like a real hassle tbh as it would almost mean detecting changes by ourselves instead of letting unity do it)

    I also can't find the build report for asset bundles that you mentioned

    Thanks again for the post :)
     
    Last edited: Jul 11, 2023
  9. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    91
    Hi Andrei-Sateanu,

    I don't work on the Scriptable Build Pipeline and am not an expert on its implementation, but I think I can explain some details that may be helpful.

    Some notes about the Scriptable Build Pipeline hash

    "The hash calculation done by the Scriptable Build Pipeline is not the same as that performed by BuildPipeline.BuildAssetBundles. The SBP algorithm hashes the uncompressed content, without the AssetBundle header, and without the SerializedFile header. So it is similar to the CRC calculation, but tries to avoid the UnityEditor version problem.

    An unfortunate side effect of skipping the SerializedFile header can occur if a change occurs in the headers (e.g. a padding change). If that happens the content hash is not changed even though the rebuilt AssetBundle file is different.

    The SerializedFile header also includes the external references. So moving an external object from one bundle to another might not impact the content hash. To fix this SBP will also add in the dependent AssetBundles into the hash calculation. Unfortunately the dependencies used are recursive, rather than the expected “direct dependency” list - this is being fixed.

    This hashing approach also does not include the bundle directory info in the hash, so two AssetBundles containing different files that had identical contents could produce the same hash. In practice that is very unlikely to be an issue, because Unity controls the internal file names inside an AssetBundle archive based on the AssetBundle name or name of Scene files."


    So to conclude: generally the Scriptable Build Pipeline is better than native AssetBundles for its hash, because it is content based. We exposed the feature to fully strip the Unity version to help people who are hashing the file themselves, but using that flag should not impact the Scriptable Build Pipeline hash.

    The Scriptable Build Pipeline does not write the Build Report file. When used with Addressables there is more information exposed, including a Build Layout user interface which can be nice for seeing the results of the build. Using Addressables is recommended over using the Scriptable Build Pipeline directly, but many people are using it directly with success as well.

    Regarding "And the proper solution would actually be to ditch using hashes and trust something else entirely (but this sounds like a real hassle tbh as it would almost mean detecting changes by ourselves instead of letting unity do it)": the script I posted is specific to the BuildPipeline.BuildAssetBundles API, but it gives some ideas about how to calculate full-file AssetBundle hashes directly and how to detect file changes, so it might be useful. I think that the Scriptable Build Pipeline outputs files similar to the .manifest files if you use the compatibility API.

    I wouldn't suggest trying to calculate which bundles need to be built with your own dependency tracking; it's best to build everything and let Unity decide what needs to rebuild. In some cases you can rebuild groups of AssetBundles in isolation from other AssetBundles, but that requires either that there is no dependency between the content of each build, or that it is ok in your situation for the shared dependencies to be duplicated. In other words the resulting builds might be larger or even broken. So it is a more advanced usage than always sending all the AssetBundle definitions through each incremental build. Overall the behaviour in this area is quite similar between the Scriptable Build Pipeline and the signature of BuildPipeline.BuildAssetBundles that takes a list of Assets and AssetBundle assignments.

    Finally, you mention trouble with different build machines producing different output. That is a separate issue from the incremental build issues we are talking about in this thread. The underlying system should produce the same output on any machine given the same project content, same Unity version, and same package versions. Sometimes there can be bugs in certain features or data types that break that rule. Or there can be things happening in scripts (user, package or Unity) that have a non-deterministic or machine-specific impact on the content, and thus produce different build output. Tracking that down can be tricky. One possibility is to expand the AssetBundles that are built on two different machines using UnityDataTools and then compare the text dump files to figure out exactly what is different between the bundles (e.g. using a Visual Diff tool). E.g. if the hashes are different then possibly some serialized data is different, or it might be something in the textures or mesh data. Sometimes that can lead to a bug report to Unity, or to finding an issue in a script or package.
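
    Before expanding bundles with UnityDataTools, it can help to narrow down which files actually differ between the two machines. The following is only a sketch under assumptions: both build outputs have been copied into two local folders with the same layout, and the class and method names are hypothetical. It compares files byte-for-byte via SHA-256 and skips the .manifest files:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Security.Cryptography;

public static class BuildOutputDiff
{
    // Hex-encoded SHA-256 of the whole file.
    static string HashFile(string path)
    {
        using (var stream = File.OpenRead(path))
        using (var sha = SHA256.Create())
            return BitConverter.ToString(sha.ComputeHash(stream));
    }

    // Returns the relative paths of files that exist in both folders
    // but whose bytes differ, sorted for stable output.
    public static List<string> FindDifferingFiles(string dirA, string dirB)
    {
        var differing = new List<string>();
        string rootA = Path.GetFullPath(dirA)
            .TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar);

        foreach (string fileA in Directory.GetFiles(rootA, "*", SearchOption.AllDirectories))
        {
            string relative = fileA.Substring(rootA.Length + 1);
            if (relative.EndsWith(".manifest", StringComparison.OrdinalIgnoreCase))
                continue; // manifests are text metadata, not bundle content

            string fileB = Path.Combine(dirB, relative);
            if (File.Exists(fileB) && HashFile(fileA) != HashFile(fileB))
                differing.Add(relative.Replace('\\', '/'));
        }

        differing.Sort(StringComparer.Ordinal);
        return differing;
    }
}
```

    The files it reports are the candidates worth dumping and diffing in detail. Files that exist on only one side are ignored by this sketch.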

    I hope this is helpful
     
  10. puzzlebox-patrik

    puzzlebox-patrik

    Joined:
    Oct 31, 2022
    Posts:
    6
    Hi @AndrewSkow,

    Thanks for this great post, lots of useful information.

    I'm curious about the new multi-process build method, as we are considering migrating to it. The documentation seems to imply that it is capable of building incrementally, like the current build method. However, on the documentation page, it also says
    I'm confused by this. How is the build method able to determine that a bundle does not need to be rebuilt if the hash cannot be known before building? I was under the impression that this was done by essentially performing a dry-run before the actual build and comparing hashes to determine which bundles need to be rebuilt.

    Thank you!
     
  11. puzzlebox-patrik

    puzzlebox-patrik

    Joined:
    Oct 31, 2022
    Posts:
    6
    More questions popped up, this time relating to the TypeTreeHash. As I understand it from your post, both the AssetFileHash and the TypeTreeHash are used to determine whether to rebuild an asset bundle. That sparks the following questions:
    • Can there be situations where the TypeTreeHash changes without a change in the AssetFileHash?
    • Does the same apply if the new UseContentHash flag is used?
    The reason I ask this is because, theoretically, the assets are a product of the type tree, which would imply that the asset hash is dependent on the type hash. On the other hand, I know that asset files in the project can remain unchanged even after adding/removing new fields to their classes. So, my expectation is that, under the old implementation, the TypeTreeHash can change without a change in the AssetFileHash, but I'm not quite sure the same goes if the new UseContentHash flag is enabled, since that hash seems to be based on the output of the build, which presumably maps 1-to-1 with the type tree (please correct me if I'm wrong).

    If the TypeTreeHash and AssetFileHash can change independently, I'd also like to know
    • Under what circumstances is it safe to ignore a change in the TypeTreeHash when AssetFileHash remains the same? When is it not safe?
    I would imagine that renaming a class would cause a TypeTreeHash change that is safe to ignore (since it does not affect the data layout).

    Ultimately, what I'm interested in, is whether or not we should use just the AssetFileHash (with UseContentHash enabled) to version our bundles, or if we should factor in the TypeTreeHash as well (such as by combining the two into a composite hash value). Using something like Hash(AssetFileHash + TypeTreeHash) seems like a safe bet: then we can guarantee consistency between types and data. On the other hand, this means that even small changes which don't affect the data layout might cause new bundles to be deployed to the client. If I understand things correctly.

    Thanks again!
     
  12. jonathanma_unity

    jonathanma_unity

    Unity Technologies

    Joined:
    Jan 7, 2019
    Posts:
    229
    Without going into too much detail, the multi-process build pipeline leverages the asset database to track changes to the assets that are included in a build and to their dependencies. The pipeline caches intermediate states of the build process. When an asset is modified, it compares the new state with the previous one, and if nothing has changed at the build level it will not rebuild the bundle.

    Yes, this can happen when you rename a script class. This doesn't apply with UseContentHash: when a class is renamed, both the TypeTreeHash and the AssetFileHash will change.

    It's recommended to not ignore the TypeTreeHash even if the AssetFileHash remains the same. Like you said using AssetFileHash + TypeTreeHash is a safe bet.
    Or you can simply use UseContentHash flag and look at AssetFileHash in the manifest which is what I recommend.
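
    For anyone who goes the composite route instead, here is a minimal sketch of combining the two values into one version string. It assumes you have already read the AssetFileHash and TypeTreeHash strings from the bundle's .manifest file; the class name, method name and the choice of SHA-256 are illustrative only and are not anything Unity itself does:

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class BundleVersion
{
    // Derives one version string from the two manifest hashes, so that a change
    // in either the asset data or the type layout produces a new version.
    public static string Combine(string assetFileHash, string typeTreeHash)
    {
        // The separator keeps ("ab", "c") distinct from ("a", "bc").
        byte[] input = Encoding.UTF8.GetBytes(assetFileHash + "|" + typeTreeHash);

        using (var sha = SHA256.Create())
        {
            byte[] digest = sha.ComputeHash(input);
            var sb = new StringBuilder(digest.Length * 2);
            foreach (byte b in digest)
                sb.Append(b.ToString("x2"));
            return sb.ToString();
        }
    }
}
```

    The trade-off is the one raised in the question above: any TypeTreeHash change yields a new version, even when the data layout is unaffected.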
     
  13. puzzlebox-patrik

    puzzlebox-patrik

    Joined:
    Oct 31, 2022
    Posts:
    6
    Thanks a lot! This makes it very clear.
     
  14. FT_yc

    FT_yc

    Joined:
    Jul 20, 2023
    Posts:
    2
    > If there is a .manifest file from a previous build of the AssetBundle then load the hash and compare

    How does Unity find the ".manifest" file? If I build two times and output to different directories, can I specify the ".manifest" file?
     
  15. AndrewSkow

    AndrewSkow

    Unity Technologies

    Joined:
    Nov 17, 2020
    Posts:
    91
    During an AssetBundle build Unity finds the .manifest simply by looking for a file that has the same file path as the AssetBundle plus ".manifest". E.g. Manifest for "MyBuild/A.bundle" is found at "MyBuild/A.bundle.manifest".

    You cannot specify the manifest, but hopefully with these details about how Unity works you have something to work with. For example, in one case people were moving the AssetBundles after a build, effectively deleting the ".manifest" files. That had an impact on code stripping when the player is built, and would also break incremental building.

    Note however that .manifest files are only needed at build time, they aren't needed for loading and don't need to be shipped in a player release.
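
    Given that lookup rule, a small pre-build check can catch the "bundles were moved, manifests got lost" situation. This is only a sketch, with hypothetical class and method names, that assumes you know the list of bundle names relative to the output folder:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class ManifestCheck
{
    // Returns the bundles that exist in buildDir but have no
    // "<bundle path>.manifest" file next to them.
    public static List<string> FindBundlesMissingManifest(string buildDir, IEnumerable<string> bundleNames)
    {
        var missing = new List<string>();
        foreach (string name in bundleNames)
        {
            string bundlePath = Path.Combine(buildDir, name);
            // e.g. manifest for "MyBuild/A.bundle" is "MyBuild/A.bundle.manifest"
            if (File.Exists(bundlePath) && !File.Exists(bundlePath + ".manifest"))
                missing.Add(name);
        }
        return missing;
    }
}
```

    Any bundle it reports has no previous hash available in that folder, so it cannot be skipped by the incremental check described earlier in this thread.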
     
  16. FT_yc

    FT_yc

    Joined:
    Jul 20, 2023
    Posts:
    2
    thanks a lot!