(Case 1192489) Building Asset Bundles decompresses textures

Discussion in 'Addressables' started by AlkisFortuneFish, Sep 10, 2019.

  1. AlkisFortuneFish


    Joined:
    Apr 26, 2013
    Posts:
    970
    I've been investigating some slow build issues and one thing I've been coming across for a while is console spam about decompressing unsupported PVRTC textures when targeting iOS.

    Now, for a while I assumed that to be harmless but I just profiled the Unity process to find 50% of its time spent in DecompressPVRTC during asset bundle builds.

    I guess I can only summarize this question as: Why?

    These textures are already in the AssetDatabase with the appropriate target format for the build, why would the build process have to decompress and recompress them?

    Code (csharp):
    Function Name    Total CPU [unit, %]    Self CPU [unit, %]    Module
     + Unity.exe (PID: 14548)    178079 (100.00%)    0 (0.00%)    Multiple modules
    | + ntdll.dll!0x007ff9ef16a271    177280 (99.55%)    0 (0.00%)    ntdll.dll
    || + kernel32.dll!0x007ff9ee267974    177280 (99.55%)    0 (0.00%)    kernel32.dll
    ||| + Thread::RunThreadWrapper    113653 (63.82%)    0 (0.00%)    Unity.exe
    |||| + GfxDeviceWorker::RunGfxDeviceWorker    113217 (63.58%)    0 (0.00%)    Unity.exe
    ||||| + GfxDeviceWorker::RunExt    113217 (63.58%)    0 (0.00%)    Unity.exe
    |||||| + GfxDeviceWorker::RunCommand    113217 (63.58%)    2 (0.00%)    Unity.exe
    ||||||| + GfxDeviceD3D11Base::UploadTexture2D    113109 (63.52%)    1 (0.00%)    Unity.exe
    |||||||| + TexturesD3D11Base::UploadTexture2D    113106 (63.51%)    5 (0.00%)    Unity.exe
    ||||||||| + TexturesD3D11Base::UploadAll2DData    112619 (63.24%)    0 (0.00%)    Unity.exe
    |||||||||| + TexturesD3D11Base::Upload2DData    112615 (63.24%)    5 (0.00%)    Unity.exe
    ||||||||||| + ConvertCompressedTextureUpload    92416 (51.90%)    3 (0.00%)    Unity.exe
    |||||||||||| + DecompressNativeTextureFormatWithMipLevel    92347 (51.86%)    0 (0.00%)    Unity.exe
    ||||||||||||| + DecompressNativeTextureFormat    92347 (51.86%)    3 (0.00%)    Unity.exe
    |||||||||||||| - DecompressPVRTC<0,1>    83299 (46.78%)    50252 (28.22%)    Unity.exe
    |||||||||||||| - DecompressETC2_RGBA8_RGBA8888    9034 (5.07%)    0 (0.00%)    Unity.exe
    I don't really get what business the build process for an unrelated platform has uploading all textures to the local GPU, let alone using the lossy native format to do so.
     
    Last edited: Sep 10, 2019
  2. AlkisFortuneFish

    Hm. I may be wrong but I think I found what is going on. Further time profiling came up with a significant amount of time being spent here:

    Code (csharp):
    Function Name    Total CPU [unit, %]    Self CPU [unit, %]    Module
    |||||||||||||||||||||||||||||||||| + ContentBuildInterface_CUSTOM_WriteSerializedFileAssetBundle_Injected    37947 (63.83%)    0 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||| + BuildPipeline::WriteSerializedFile    37919 (63.78%)    0 (0.00%)    Unity.exe
    |||||||||||||||||||||||||||||||||||| + BuildPipeline::BuildReferenceMap::ConvertToInstanceIDToBuildAsset    26731 (44.96%)    14 (0.02%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||| + AddBuildAssetInfo    26484 (44.55%)    4 (0.01%)    Unity.exe
    |||||||||||||||||||||||||||||||||||||| - CalculateSortIndex    19384 (32.61%)    2 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||||| + PPtr<Object>::operator Object * __ptr64    19382 (32.60%)    2 (0.00%)    Unity.exe
    |||||||||||||||||||||||||||||||||||||||| + PersistentManager::ReadObject    19380 (32.60%)    0 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||||||| - PersistentManager::ReadObjectThreaded    12791 (21.52%)    0 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||||||| - PersistentManager::LoadAndIntegrateAllPreallocatedObjects    6585 (11.08%)    1 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||||||| - PersistentManager::RegisterPartiallyLoadedObjectInternal    2 (0.00%)    1 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||||||| + PersistentManager::Lock    1 (0.00%)    1 (0.00%)    Unity.exe
    ||||||||||||||||||||||||||||||||||||||||| - PersistentManager::Unlock    1 (0.00%)    0 (0.00%)    Unity.exe
    |||||||||||||||||||||||||||||||||||||| - GetTypeWithoutLoadingObject    7037 (11.84%)    10 (0.02%)    Unity.exe
    Now, I can only guess what CalculateSortIndex does from its name. Is it supposed to dereference the object PPtr, which results in the object being loaded into memory? That would explain why Unity is spending 50% of its time decompressing assets it has no business decompressing.

    Any ideas, @Ryanc_unity?
     
  3. AlkisFortuneFish

    This could very well be the cause of this.
     
  4. AlkisFortuneFish

    Any updates on this, @Ryanc_unity? The affected project is now available to you to test (ask @unity_bill for it), but I cannot really reference that in an actual bug report. Looking at AddBuildAssetInfo, it seems to go to the trouble of calling GetTypeWithoutLoadingObject() only to call CalculateSortIndex() immediately afterwards, which loads the object anyway, so this does not seem intentional, to me at least.
     
  5. AlkisFortuneFish

    Ok, I have now traced this. The code path that triggers the PPtr load of the object in CalculateSortIndex() is taken when the object is a ScriptableObject/MonoBehaviour. Don't ask how I found out. With the new information, I created a repro project that triggers the offending code path and submitted a bug report (Case 1192489). The annoying thing is that the repro project is small enough that the build does not take very long. Profiling it, however, clearly shows the textures being decompressed from PVRTC into GPU memory.
     
    Last edited: Oct 18, 2019
  6. Ryanc_unity


    Unity Technologies

    Joined:
    Jul 22, 2015
    Posts:
    332
    Wow, I did not see this ping. Sorry about that.

    So, PVRTC, ya. That format only has hardware support on iOS devices themselves, so a slower software fallback has to be used in the editor to load/save that format, hence the message every time we load one of those textures from disk for a build.

    For a build, we have to have the object data loaded so it can be written to the final output location. Though in this case it sounds like it's doing it excessively in SBP, so this will need to be checked and fixed if this is true. I also just checked the source on latest trunk for CalculateSortIndex and don't see any reason it should be triggering an object load for a texture at this point, so will need to do a bit of debugging there.
     
  7. AlkisFortuneFish

    I suspect I'm missing some internal implementation detail here, but shouldn't data that the editor has already prepared for the target platform be loaded as is, rather than fully deserialized? Those assets are fully loaded into GPU memory, that is what causes the PVRTC decompression, which doesn't make much sense to me, especially considering it ends up spending quite literally the vast majority of the time doing just that rather than useful work.

    I dug into it with a native debugger and the disassembly, and it triggers a load if the object is MonoBehaviour/ScriptableObject. In this case, the textures are dependencies of that (specifically, our ScriptableObjects are UMA material overlays) and hence get loaded. With it doing this, I had a build fail out of memory after spending 27 hours mostly in that method.
     
  8. Ryanc_unity

    Texture data, and other renderable data, goes over to the GPU thread and is uploaded to the GPU during any load operation. This includes loading for a build, as there isn't a special loading path for this case.

    Ah, that makes a bit more sense then. MonoBehaviour/ScriptableObject don't have full type information in native unless they are loaded, as they are just a representation of a scripting object type. So in this case the scripting object loads, and loads its references recursively.

    If the scripting object in question that is being loaded is being used as a mapping or lookup table (For example, contains an array of textures you might swap out depending on some runtime constraint) I would suggest switching those direct references out for a weak reference type such as the AssetReference type in the Addressables package.
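
    A minimal sketch of that weak-reference approach, assuming the Addressables package is installed and the textures are marked as addressable. The class name `TextureLookup` and its members are hypothetical; `AssetReferenceTexture2D`, `LoadAssetAsync` and `ReleaseAsset` come from the Addressables API. The field stores only an asset GUID, so loading the ScriptableObject should not pull the textures in with it, at the cost of the load becoming asynchronous.

    ```csharp
    using System;
    using UnityEngine;
    using UnityEngine.AddressableAssets;
    using UnityEngine.ResourceManagement.AsyncOperations;

    // Hypothetical lookup table holding weak references instead of direct Texture2D references.
    [CreateAssetMenu(menuName = "Examples/Texture Lookup")]
    public class TextureLookup : ScriptableObject
    {
        // Serialized as asset GUIDs, so deserializing this object does not load the textures.
        public AssetReferenceTexture2D[] textures;

        // Loads one texture on demand and passes it to the callback once it is ready.
        public void LoadTexture(int index, Action<Texture2D> onLoaded)
        {
            AsyncOperationHandle<Texture2D> handle = textures[index].LoadAssetAsync();
            handle.Completed += op =>
            {
                if (op.Status == AsyncOperationStatus.Succeeded)
                    onLoaded(op.Result);
                else
                    Debug.LogError($"Failed to load texture at index {index}");
            };
        }

        // Releases a texture previously loaded through LoadTexture.
        public void ReleaseTexture(int index)
        {
            textures[index].ReleaseAsset();
        }
    }
    ```

    Each entry should be released before it is loaded again; this sketch does not guard against repeated loads of the same reference.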
     
  9. AlkisFortuneFish

    Ugh. Surely that cannot possibly scale to huge projects? Plus, it is exceedingly ugly by any measure.

    Sure, but why does CalculateSortIndex() need to load the object at all? Obviously, since Unity does not export the relevant symbols, I cannot tell exactly what it is that it is reading from that type, just a raw structure offset, but that does seem to be quite an excessive operation to carry out at that point.

    That would require:
    1) Loss of automatic dependency bundling.
    2) Loss of automatic loading of dependencies.
    3) Async loading where async loading is at the very least inconvenient.

    In short, at that point we would lose all the niceness of having Unity's dependency handling at all. I suspect this is also what is causing our Fast Mode addressables catalogue to take more than a minute and a half to build and to allocate astronomical (>16GB) amounts of memory.

    If you are curious about the layout of our project, your team should have access to it, all of the legal stuff has been sorted out as far as I am aware.
     
  10. AlkisFortuneFish

    At least for our CI, I'm currently testing building our bundles with -nographics. My theory is that if there is no GPU thread, GPU texture uploads and hence decompression from PVRTC won't occur.
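
    A sketch of a batch-mode entry point for that kind of CI build. The class `CiBundleBuild` and the method `BuildAll` are hypothetical names, and the editor and project paths in the comment are placeholders; `AddressableAssetSettings.BuildPlayerContent()` and the `-batchmode`, `-nographics`, `-quit`, `-projectPath`, `-buildTarget`, `-executeMethod` and `-logFile` editor arguments are existing Unity features. The log line only reports whether the null graphics device is in use. Whether that avoids the PVRTC decompression is the theory under test, not something this sketch proves.

    ```csharp
    // Place in an Editor folder. Example invocation (paths are placeholders):
    //   Unity -batchmode -nographics -quit -projectPath <project> -buildTarget iOS \
    //         -executeMethod CiBundleBuild.BuildAll -logFile build.log
    using UnityEditor.AddressableAssets.Settings;
    using UnityEngine;
    using UnityEngine.Rendering;

    public static class CiBundleBuild
    {
        public static void BuildAll()
        {
            // With -nographics the editor is expected to report the Null graphics device.
            bool noGraphics = SystemInfo.graphicsDeviceType == GraphicsDeviceType.Null;
            Debug.Log($"Graphics device: {SystemInfo.graphicsDeviceType} (nographics: {noGraphics})");

            // Builds the Addressables content with the project's active build script.
            AddressableAssetSettings.BuildPlayerContent();
        }
    }
    ```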
     
  11. Ryanc_unity

    @AlkisFortuneFish It took me a bit to hunt this down as I couldn't remember the class name. Not 100% sure if this will work for your situation, however there is this UnityEditor type added in 2019.3: `LazyLoadReference<T>`.

    This was added specifically so scripting types in the Editor did not have to immediately load the reference in that field; however, it still works just like a normal object reference for all other systems, and falls back at runtime to normal object reference behavior. Basically it was added to solve a very similar problem to what you have, but in the asset import pipeline. So if you are on 2019.3, try changing the Texture2D references in your scriptable object to LazyLoadReference<Texture2D> instead and see if that improves the decompression at build time as a result.
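
    A minimal sketch of that change, assuming Unity 2019.3 or newer. The `OverlayTextures` class and its members are hypothetical stand-ins for the real ScriptableObject; `LazyLoadReference<T>` is declared in the `UnityEngine` namespace, and `isSet` and `asset` are part of its public API.

    ```csharp
    using UnityEngine;

    // Hypothetical ScriptableObject that previously held direct Texture2D references.
    [CreateAssetMenu(menuName = "Examples/Overlay Textures")]
    public class OverlayTextures : ScriptableObject
    {
        // Before: public Texture2D[] textures;
        // The lazy reference defers loading the texture until it is accessed in the Editor.
        public LazyLoadReference<Texture2D>[] textures;

        // Returns the texture at the given index, loading it on first access, or null if unassigned.
        public Texture2D GetTexture(int index)
        {
            LazyLoadReference<Texture2D> reference = textures[index];
            return reference.isSet ? reference.asset : null;
        }
    }
    ```

    Callers go through `asset` synchronously, so code that used the direct reference only needs the extra property access.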
     
  12. AlkisFortuneFish

    I just ran a quick test in the repro project and it very much looks like this will work. Unlike AssetReference and friends, this ticks all the boxes: it is detected as a dependency in the AssetDatabase and is synchronously available for both editor and runtime deserialization use.

    I cannot test it on our actual project just this moment, since our 19.3 port branch is out of date and our mainline is currently 19.2, but I am optimistic this is going to improve both build times and, more importantly, our absurdly long times to enter play mode in Fast Mode.
     
  13. AlkisFortuneFish

    So, @Ryanc_unity @unity_bill, now we've had our release I've had the time to upgrade the project to 19.3 and addressables 1.6.0, reverting all my addressables customisation minus the `PackTogetherByPath` mode. As @Ryanc_unity suggested, I modified our UMA dependency references to use `LazyLoadReference<T>`. This has resulted in sub-10s Fast Mode enter play mode times, with domain reloads now being the largest cost, rather than addressable catalog generation.

    There is currently an issue where the Groups window slows play mode times by 50s by recalculating its tree view as the catalog is being generated, which I would treat as a bug, but it's a massive QoL improvement already.
     
  14. Ryanc_unity

    Glad this hit the mark. As for the second issue you mention, the Groups window slowdown, I think @unity_bill and co know about it and have fixes either getting ready to go out or in the works. I'll poke them about this thread and have them follow up.
     
  15. unity_bill


    Joined:
    Apr 11, 2017
    Posts:
    1,053
    Yes, we know about it. We have an improvement coming in the next release (this week-ish, 1.7.something) that will cause the tree to only build the visible nodes. After that (1.8.? 1.7.more?) we've got some plans to expose options that can make it even faster.
     
  16. Camarent


    Joined:
    Feb 19, 2014
    Posts:
    168
    @AlkisFortuneFish Hi! Can you help me with the native debugger? What do you use to profile?
    I may have the same problem, but I want to check.
     
  17. jehovah0121qq


    Joined:
    Nov 14, 2013
    Posts:
    68
    One related question: is it possible to bypass or accelerate the texture compression procedures? Every time I build for iOS on a Mac, there are loads of

    WARNING: ASTC texture format is not supported, decompressing texture

    showing in the log. Although this log is harmless and the images will be properly compressed, it takes a lot of time. What I want to do is to bypass this or force it to use a faster compression method when I build test app packages. Unity used to invoke PVRTexTool for texture compression, but now (2018.4.x) it doesn't! Does it still call some external tool (exe) which I can hack a little bit?
     
  18. AlkisFortuneFish

    The reason you are getting those is probably what Ryan said above. The engine has already compressed the assets to the target texture format, so when it loads them in order to write them into the asset bundles, it actually goes through the same code path that loads assets in general and loads them into VRAM, having to decompress them to do so. Try to batch build with -nographics and see if it helps.
     
  19. AlkisFortuneFish

    From the Unity 2020.2 changelog:
    • Build Pipeline: Added: Added ContentBuildInterface.GetPlayerAssetRepresentations API to return the asset representations without triggering a load of the asset itself. Improving performance for certain build cases.
    From the SBP 1.8.4 changelog:
    • Updated CalculateAssetDependencyData to use a new fast path API for working with Asset Representations in 2020.2 and onward.
    Is this what I think it is, @Ryanc_unity?
     
  20. Ryanc_unity

    @AlkisFortuneFish maybe? The short version is that the new API allows us to gather the asset representations a LOT faster without triggering asset loads in most cases. On a 40GB project with 1204 FBX files (the most notorious asset for large asset representation counts), gathering the representations on a just-opened project took 5378ms with the old approach, and 135ms with this new API.

    This still doesn't resolve having to write out a bunch of (imo mostly useless) entries into asset bundles for those representations. I do have some other ideas for that which are on my list after the current loading performance improvements being worked on.
     
  21. AlkisFortuneFish

    That's a yes then! :) The Editor loading texture asset representations built for the player triggered entirely unnecessary texture decompressions and GPU uploads with very hefty performance penalties on builds, especially where the texture formats were proprietary and unsupported by the hardware running the Editor. I'll have to give this a try and see how much it improves performance in our use case.
     
  22. Ryanc_unity

    Ya this should reduce most cases where we are triggering extra asset loads / unloads outside of the Write build task (obviously we still need to load to write, would love to figure out a way to just copy the on disk platform specific data).
     
  23. jamessnow


    Joined:
    Jul 19, 2022
    Posts:
    3
    @AlkisFortuneFish Do you have any conclusion now? I googled a lot for 'format is not supported, decompressing texture', but found no solution to this.

    It has no visual side effects, but it slows down our CI process a lot.
     
  24. belokurenkonexters


    Joined:
    Nov 17, 2021
    Posts:
    2
    ping
     
  25. AlkisFortuneFish

    We did a couple of things: firstly, we build with -nographics, and secondly, we put a lot of our native dependencies behind LazyLoadReference, which solved this issue for us. Our bundle builds still take a pretty long time, but we have quite an (intentionally) extreme number of bundles because of the structure of our game. It takes about an hour to extract dependencies into their own group and then an hour for a fresh bundle build (20-ish minutes cached).