[Bug][1.7.5] Hash collision in ProcessBundledAssetSchema

Discussion in 'Addressables' started by simon575, Apr 4, 2020.

  1. simon575 (Joined: Dec 28, 2019; Posts: 3)

    This is the bottom part of the method (I added the Debug.Log call):
    Code (CSharp):
    protected virtual string ProcessBundledAssetSchema(
        BundledAssetGroupSchema schema,
        AddressableAssetGroup assetGroup,
        AddressableAssetsBuildContext aaContext)
    {
        // ... code omitted ...
        for (int i = 0; i < bundleInputDefs.Count; i++)
        {
            string assetBundleName = bundleInputDefs[i].assetBundleName;
            if (aaContext.bundleToAssetGroup.ContainsKey(assetBundleName))
            {
                int count = 1;
                var newName = assetBundleName;
                while (aaContext.bundleToAssetGroup.ContainsKey(newName) && count < 1000)
                    newName = assetBundleName.Replace(".bundle", string.Format("{0}.bundle", count++));
                assetBundleName = newName;
            }

            string hashedAssetBundleName = HashingMethods.Calculate(assetBundleName) + ".bundle";
            m_OutputAssetBundleNames.Add(assetBundleName);
            m_AllBundleInputDefs.Add(new AssetBundleBuild
            {
                addressableNames = bundleInputDefs[i].addressableNames,
                assetNames = bundleInputDefs[i].assetNames,
                assetBundleName = hashedAssetBundleName,
                assetBundleVariant = bundleInputDefs[i].assetBundleVariant
            });
            Debug.Log(string.Format("assetBundleName: {0} - hashedAssetBundleName: {1}", assetBundleName, hashedAssetBundleName));
            aaContext.bundleToAssetGroup.Add(hashedAssetBundleName, assetGroup.Guid);
        }

    In my project, this logs the following two lines:
    assetBundleName: f62b2ff391a3b4fa2820a6d79c4a567a_assets_assets/animations/weapons/[기본]양손무기/attack2.anim.bundle - hashedAssetBundleName: 2c98148b3848ee9eeffcbed620639777.bundle
    assetBundleName: f62b2ff391a3b4fa2820a6d79c4a567a_assets_assets/animations/weapons/[기본]한손무기/attack2.anim.bundle - hashedAssetBundleName: 2c98148b3848ee9eeffcbed620639777.bundle

    This is followed by the exception
    ArgumentException: An item with the same key has already been added. Key: 2c98148b3848ee9eeffcbed620639777.bundle
    which prevents me from building the Addressables groups.

    Note that the two strings [기본]양손무기 and [기본]한손무기 are NOT equal.

    I've constructed this little piece of code to test the hash function:
    Code (CSharp):
    string str1 = "[기본]양손무기";
    string str2 = "[기본]한손무기";
    string hash1 = HashingMethods.Calculate(str1).ToString();
    string hash2 = HashingMethods.Calculate(str2).ToString();

    Debug.Log("str1 == str2: " + (str1 == str2));
    Debug.Log("hash1 == hash2: " + (hash1 == hash2));
    I end up with the following output:
    str1 == str2: False
    hash1 == hash2: True
    This supports my theory that HashingMethods.Calculate does not handle non-ASCII strings correctly. In fact, even if ALL of the Korean characters are different, the hashes are still the same.

    Is this a problem? In my case, yes, because there are a lot of assets with Korean filenames in the project I'm working on. As soon as two addressable files in the same directory have Korean names with the same number of characters, the result is a hash collision.
    Unity itself doesn't have a problem with these files; everything except Addressables works fine.
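
    To find out ahead of a build which names are affected, something like the following could be used. This is only a rough sketch under assumptions: it presumes the hash input is the ASCII-encoded string, and CollisionCheck and FindAsciiCollisions are hypothetical names, not part of Addressables or the Scriptable Build Pipeline. The third path in Main is made up for contrast.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical helper, not part of any Unity package.
public static class CollisionCheck
{
    // Groups names by their ASCII-encoded form (non-ASCII chars become '?')
    // and returns every group that contains more than one distinct name.
    public static List<List<string>> FindAsciiCollisions(IEnumerable<string> names)
    {
        return names
            .Distinct()
            .GroupBy(n => Encoding.ASCII.GetString(Encoding.ASCII.GetBytes(n)))
            .Where(g => g.Count() > 1)
            .Select(g => g.ToList())
            .ToList();
    }

    public static void Main()
    {
        var names = new[]
        {
            "assets/animations/weapons/[기본]양손무기/attack2.anim",
            "assets/animations/weapons/[기본]한손무기/attack2.anim",
            "assets/animations/weapons/sword/attack2.anim", // made-up ASCII-only path
        };

        foreach (var group in FindAsciiCollisions(names))
            Console.WriteLine("Would collide: " + string.Join(" | ", group));
    }
}
```

    With these inputs, the two Korean-named paths are reported as one group, and the ASCII-only path is not reported.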
     
    Last edited: Apr 4, 2020
  2. TreyK-47 (Unity Technologies; Joined: Oct 22, 2019; Posts: 1,822)

    I'll forward this to the team for them to have a look. Which version of the editor are you running?
     
  3. simon575

    I'm running 2018.4.20f1
     
  4. simon575

    The issue is in com.unity.scriptablebuildpipeline@1.5.10\Editor\Utilities\HashingMethods.cs, line 147:
    Code (CSharp):
    else if (currObj is string)
    {
        var bytes = Encoding.ASCII.GetBytes((string)currObj);
        stream.Write(bytes, 0, bytes.Length);
    }
    Encoding.ASCII.GetBytes() converts every non-ASCII character to code 63 (?), so "[기본]양손무기" turns into "[??]????" before it is hashed.
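
    The effect can be reproduced outside Unity with plain .NET. The sketch below only compares the encoded bytes and does not call HashingMethods; AsciiEncodingDemo and ToHex are hypothetical names used for illustration. It also shows that Encoding.UTF8 keeps the two strings distinct, which is one possible direction for a fix, not necessarily the one Unity would choose.

```csharp
using System;
using System.Text;

// Hypothetical demo class, independent of Unity and the Scriptable Build Pipeline.
public static class AsciiEncodingDemo
{
    // Formats bytes as a lowercase hex string so two arrays are easy to compare.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "").ToLowerInvariant();
    }

    public static void Main()
    {
        string str1 = "[기본]양손무기";
        string str2 = "[기본]한손무기";

        // ASCII: each Korean character is replaced by '?' (0x3f),
        // so both strings encode to the same bytes.
        string ascii1 = ToHex(Encoding.ASCII.GetBytes(str1));
        string ascii2 = ToHex(Encoding.ASCII.GetBytes(str2));
        Console.WriteLine("ASCII bytes:  " + ascii1 + " / " + ascii2);
        Console.WriteLine("ASCII equal:  " + (ascii1 == ascii2)); // True

        // UTF-8: each character keeps its own byte sequence,
        // so the two encodings differ.
        string utf81 = ToHex(Encoding.UTF8.GetBytes(str1));
        string utf82 = ToHex(Encoding.UTF8.GetBytes(str2));
        Console.WriteLine("UTF-8 equal:  " + (utf81 == utf82)); // False
    }
}
```

    Since any hash function fed identical bytes returns identical output, the collision comes from the encoding step and not from the hash algorithm itself.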
     
  5. TreyK-47
