[CFH] Shader variants memory footprint

Discussion in 'Shaders' started by Altair4Ru, Feb 25, 2020.

  1. Altair4Ru

    Altair4Ru

    Joined:
    Aug 21, 2018
    Posts:
    2
    Hi there.

    I've been struggling for a while with our shaders taking up too much memory on iOS (though not only there).
    At some point I realized that I lack instrumentation, so I want to summarize what I know and ask for help filling in the blind spots.

    We use custom "uber" shaders that have lots of variants. Collecting those variants is a tough task. Unity can automatically collect the variants used while playing the game in the Editor, but for us this method costs several man-hours (or even days) per build. We have shader LODs with different sets of variants, dynamic gameplay-specific variants and so on. The game has to be replayed several times from start to finish with different quality settings to collect all the possible variants, and some of them could still be missed...

    As an alternative, I've managed to gather all of the possibly used variants from the build using the IPreprocessShaders interface. I build the game without any ShaderVariantCollections (SVCs) and collect those variants, then generate an SVC, add it to the build (into an asset bundle, specifically) and run the build process once more. Doubling the build time helps to get rid of possible shader duplicates in asset bundles. In the end I get shader assets with all the variants used throughout the game. This set is indeed much smaller than the set of all possible variants, but it is still a bit bigger than I would have expected. If I understand correctly, Unity is a bit greedy here and generates some strange variants to cover all of the possible cases.

    Let's assume I've gathered all the required variants and put them into an SVC. If I then load any shader from this SVC, Unity loads its internal representation of the shader asset into system memory using malloc. This memory then ends up dirty and adds to the application's memory footprint from the iOS point of view. This internal representation is the full representation of the shader asset, meaning all of the variants with all their properties and so on. This memory is then visible in Unity Profiler/Memory/Detailed view/Other/Rendering/ShaderLab, isn't it?
    When I load a gameplay scene, ShaderLab memory consumption jumps to 160 MB. That is huge for a mobile project. In that particular project there are not many asset bundles, so duplication is not really a problem, but I'd still expect to save a couple of megabytes by gathering all the shader assets in one bundle. In reality I see the opposite effect: the same scene loaded after the gathering takes up to 256 MB of ShaderLab memory. I assume that is because all of the variants provided in the build are loaded into memory simultaneously in Unity's representation.

    The documentation says (I can't remember the exact place) that Unity's representation is required to compile specific variants on the fly when the pipeline needs them. Once a variant is compiled, its source data is discarded, leaving the compiled program in GPU memory for the app's lifetime.
    Knowing that, I tried warming up the SVC. This took an enormous amount of time, but in the end I got only 3.5 MB of ShaderLab memory and around 50 MB of memory taken by shader assets under Unity Profiler/Memory/Detailed view/Assets/Shader. That's a lot better than 256 MB of ShaderLab memory from my point of view, but it is not shippable because of the time the WarmUp process takes. I can't make a player wait another 15 minutes before playing.

    I think this time consumption should be treated as a bug, because 80% of the CPU time is spent in Shader::SRPBatcherInfoSetup() even though we use neither the SRP Batcher nor SRP itself.

    And finally, here come the questions:
    1. What can we or Unity do to lower the ShaderLab memory (other than minimizing the variant count)?
    2. If I strip certain graphics tiers from the build, would this only affect the build size? In other words, are the variants for all graphics tiers loaded into memory at runtime, or only those for the current one? What happens if the tier is switched at runtime?
    3. Isn't there a better way to handle shader loading? For example, dump all of Unity's representation memory into a file and mmap it, making it clean memory?
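
    To make question 3 concrete, here is a minimal POSIX sketch of the idea in C. It is not how Unity loads shaders; it only assumes that the representation could be serialized into a flat, read-only blob. The names dump_blob, map_blob_readonly, unmap_blob and MappedBlob are hypothetical. Pages of a read-only, file-backed mapping stay clean, so the OS can drop them under memory pressure and re-read them from the file, whereas a malloc'd buffer that has been written to is dirty.

```c
/* Sketch only: a file-backed, read-only mapping instead of malloc + read.
   All names here are hypothetical, not Unity or iOS APIs. */
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

typedef struct {
    const unsigned char *data; /* NULL if the mapping failed */
    size_t size;               /* size of the mapped file in bytes */
} MappedBlob;

/* Stand-in for a build-time step that serializes the shader data to disk.
   Returns 0 on success, -1 on failure. */
int dump_blob(const char *path, const void *bytes, size_t size) {
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    ssize_t written = write(fd, bytes, size);
    close(fd);
    return (written == (ssize_t)size) ? 0 : -1;
}

/* Map the whole file read-only. Nothing is copied into the heap;
   pages are faulted in from the file on first access. */
MappedBlob map_blob_readonly(const char *path) {
    MappedBlob blob = { NULL, 0 };
    int fd = open(path, O_RDONLY);
    if (fd < 0) return blob;

    struct stat st;
    if (fstat(fd, &st) == 0 && st.st_size > 0) {
        void *p = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            blob.data = (const unsigned char *)p;
            blob.size = (size_t)st.st_size;
        }
    }
    close(fd); /* the mapping remains valid after the descriptor is closed */
    return blob;
}

/* Release the mapping and reset the handle. */
void unmap_blob(MappedBlob *blob) {
    if (blob->data != NULL) munmap((void *)blob->data, blob->size);
    blob->data = NULL;
    blob->size = 0;
}
```

    The catch is that the blob has to be usable in place: any pointer fix-ups or other writes into the mapped pages would make them dirty again, which is why this is posed as a question about the engine's internal format rather than something a project can do on its own.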

    If some of my inferences here are wrong, please correct me. Let's make this topic a vault of useful information.

    Thank you for your patience with this long read.

    P.S. If that matters, we're on Unity 2019.2.8 and 2019.2.19 (two projects on different versions). No specific packages in use. No SRP, built-in RP only.
     
    Last edited: Feb 26, 2020
  2. richardkettlewell

    richardkettlewell

    Unity Technologies

    Joined:
    Sep 9, 2015
    Posts:
    2,285
    Hey, can you submit a bug report for this? I agree we really need to improve this. Feel free to reference this forum post in the bug report (and replying here with the case number will help too).

    Thanks!
     
  3. YuriGrachev

    YuriGrachev

    Joined:
    Jul 12, 2016
    Posts:
    6
    Hi @richardkettlewell,

    I've prepared a repro and submitted a bug (Case 1223610).

    Please, take a look.

    P.S. My initial post was mistakenly made from a personal account instead of the corp one.
     
    richardkettlewell likes this.
  4. YuriGrachev

    YuriGrachev

    Joined:
    Jul 12, 2016
    Posts:
    6
    While investigating further I think I've noticed another lower-level issue.

    Each variant is compiled at runtime for the exact platform/GAPI. For Metal, Unity uses newLibraryWithSource:options:error: from the MTLDevice protocol. This creates an object that conforms to the MTLLibrary protocol. I don't know the exact type of that object, but I'm sure it is allocated in dirty memory. Also, it is not trackable by Unity: it is not counted in GfxMemory or in any other section of the Memory profiler.

    In Instruments there's a template named Metal System Trace that has Metal Shader Compiler activity among its tracks. That instrument shows that my test project has a lot of MTLLibrary creations along with a comparable number of shader compilations.

    Is there a reason for such aggressive creation of MTLLibrary objects? AFAIK, each compiled shader is represented as an MTLFunction, and an MTLLibrary can hold a lot of MTLFunction objects at once.

    Also, there's a noticeable pattern in the time spent on MTLLibrary creation (while compiling a large number of shaders in a row). The creations seem to be synced to the framerate and/or vsync (look at the screenshot).
    Screenshot 2020-03-20 at 14.22.02.png
    If Unity switched to using a single library, we could skip those long-lasting and memory-consuming multiple library creations and win both memory and performance.

    What do you think? @richardkettlewell @martonekler
     
  5. VictorChow_K

    VictorChow_K

    Joined:
    Jan 16, 2019
    Posts:
    9
    The ShaderLab memory optimization issue is listed as fixed in 2020.2.0a9 (or alpha 8) on 29 Apr 2020.
    Will this be backported to a 2019 release?

    Also interested in a response to the excessive MTLLibrary creation post above (though perhaps for a different thread).
     
  6. richardkettlewell

    richardkettlewell

    Unity Technologies

    Joined:
    Sep 9, 2015
    Posts:
    2,285
    Hey I just checked, and we evaluated it for 2019 but decided not to proceed as the code around the fix had changed significantly and we decided the risk of breaking stuff was too high.
     
  7. VictorChow_K

    VictorChow_K

    Joined:
    Jan 16, 2019
    Posts:
    9
    Thank you for the prompt reply -- sad to read it won't be in 2019. Our ShaderLab data increased from 250 to 400MB from 2018.4.14 to 2019.3.10 with no changes or explanation. After aggressive shader variant stripping, it is back down to ~250MB but it leads me to believe there is a hefty chunk of memory to reclaim in what is otherwise a black box.
     
    Peter77 likes this.
  8. richardkettlewell

    richardkettlewell

    Unity Technologies

    Joined:
    Sep 9, 2015
    Posts:
    2,285
    I've passed your feedback along to the folks involved
     
    VictorChow_K and Peter77 like this.
  9. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    727
    Any updates here? It is taking up to 25% of our memory usage (170 MB for shaders) right now
     
  10. joshuacwilde

    joshuacwilde

    Joined:
    Feb 4, 2018
    Posts:
    727
    I'm finding more references to shaders, and the total appears to be closer to 240 MB. We are on 2019.3.14f1.
     
  11. sunmachine

    sunmachine

    Joined:
    Jun 4, 2014
    Posts:
    9
    Even with total shader stripping (I must emphasize this: absolutely preventing shader compilation via the preprocessor at build time), we're seeing 170 MB at runtime. This is down from ~260 MB without stripping. ShaderLab constitutes ~38% of my peak mobile memory budget. `2019.2.21f` and `2019.2.14f`.

    Our project never went over 50 MB before. 2019.3 is a possible upgrade path, but I'm advising against it unless there is a clear benefit in doing it.
     
  12. chrismarch

    chrismarch

    Joined:
    Jul 24, 2013
    Posts:
    472
    If you're having issues with shaderlab memory usage, it may be helpful to list if you are using Addressables and/or URP, and the platform profiled, as well as all the version numbers.

    We use Addressables 1.16, URP 7.5 and Unity 2019.4.10f1. On iOS, ShaderLab was using around 150 MB across various profiler categories before stripping, and maybe 10 or 20 MB after, although these numbers are rough. Our custom stripping basically did what the built-in stripping advertises to do with unused keywords, but the built-in stripping does not seem to work with this combination of tech.
     
    pokruchin likes this.
  13. matheus_inmotionvr

    matheus_inmotionvr

    Joined:
    Oct 3, 2018
    Posts:
    63
    Any updates on this issue? We're reaching 350MB of memory consumption and 2 URP shaders are to blame for most of it. Unity 2019.4.33

     
  14. aleksandrk

    aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    3,014
    This means you either use a lot of variants or don't strip the ones you don't need.
     
  15. MikeDemone

    MikeDemone

    Joined:
    Nov 30, 2020
    Posts:
    6
    It seems like in more recent versions Unity is loading all available variants into memory rather than just the ones you are actively using. I have a sample project with an uber shader that takes up 29.4KB in 2019.4.5f1 but 57.1MB in Unity 2020.3.13f1, 2020.3.27f1, and 2021.2.10f1.

    My uber shader does have a lot of variants which are kept in the build with a ShaderVariantCollection, but only 1 is in use so I'd expect the memory footprint to be around 30KB like in 2019.4.5f1. Is there any way we can prevent all the shader variants from being loaded into memory whenever the shader is used?

    Edit for some more context - I have 2 scenes in my test project, one which references a ShaderVariantCollection, and one which has a cube with a material that references my uber shader on it. I include the ShaderVariantCollection so that when I build the project, my shader will have all the variants that it has in my main project (~8000). I then load into the scene with the cube in it, which should only use 1 of those variants, but I have 50+ MB of shaders loaded. I am assuming this is because all of the variants in the build are loaded into memory rather than just the ones I am using.
     
    Last edited: Feb 8, 2022
  16. aleksandrk

    aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    3,014
    @MikeDemone Before 2019.4.30f1 got a fix, the memory was not reported individually per shader, but rather for all shaders together, under "ShaderLab". I don't think anything changed in the way variants are handled.
    Can you please double-check this on 2019.4?
     
  17. MikeDemone

    MikeDemone

    Joined:
    Nov 30, 2020
    Posts:
    6
    Oh yeah, you're totally right! All that memory was hiding in ShaderLab. I was only looking at the memory using the Memory Profiler (rather than the built-in Profiler), which doesn't show that info. Thank you for the prompt reply!
     
    aleksandrk likes this.
  18. vuthang

    vuthang

    Joined:
    Mar 7, 2017
    Posts:
    50
    Hi, I got this crash on google play console: Shader::SRPBatcherInfoSetup(). Please help me fix this! Thanks!
     
