Search Unity

  1. Good news ✨ We have more Unite Now videos available for you to watch on-demand! Come check them out and ask our experts any questions!
    Dismiss Notice

Unity Profiler doesnt show where the GC allocations are coming from

Discussion in 'Editor & General Support' started by mbitzos, Feb 17, 2019.

  1. mbitzos

    mbitzos

    Joined:
    Jun 8, 2018
    Posts:
    16
    I have been experiencing loss of frames randomly in my project, so I have looked toward the unity profiler to see what could be the issue. I noticed there was quite a lot of GC allocations happening each frame (~16kb) with random spikes every few ms (~100kb). When I look at the memory overview, it says that GC allocations per frame are as low as 300 and as high as 2000 during spikes. Now when I look at the CPU usage overview, I am not getting nearly as much memory being allocated (17b /frame). Where is the rest of this being allocated? Is this amount of garbage acceptable? How can I find where this allocations are happening?

    I have tried deep profiling and everything to see where this could be happening.

    I also understand the general practices when it comes to GC optimizations and while i'm pretty sure I have been following these practices I can't be sure entirely as my codebase as gotten quite large. Having the profiler not pointing me in any direction is making me feel really lost in trying to figure out what the issue is.

    The only thing I am able to be certain of is that upon start, I am using a library that I am aware of loads several very large dictionaries into memory (100mb+) and the random FPS drops are directly related to this (have tracked down to simply removing one line to prevent FPS drop). What I dont understand is that why would this affect performance for the entire life span of playing and not just the start since it doesnt run on an update loop/coroutine. Could this simply be because the performance is taking a hit because of the large amount of data loaded into memory at the start?

    I apologize if I seem ignorant (especially when it comes to the GC stuff), I am just trying to wrap my head around this and figure out a solution so I can actually move on from this issue.

    upload_2019-2-17_2-26-27.png
    upload_2019-2-17_2-28-40.png
     
    alexeyzakharov likes this.
  2. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    You are profiling a playmode player running in the editor. My guess (after checking this behavior out myself) is that these allocations happen in the Editor Loop and get filtered before showing up in the CPU Profiler. You could try enabling "Profile Editor" to see them.
    To get reliable data on how your game will perform, you should prefer profiling a built player on your target device though.

    I am making a note to see if we can't improve upon the clarity of this particular behavior in the Memory Profiler Module, so thank you for the feedback :)
    if there still is a discrepancy when profiling a built player, that'd be a bug though.
     
    mbitzos and alexeyzakharov like this.
  3. alexeyzakharov

    alexeyzakharov

    Unity Technologies

    Joined:
    Jul 2, 2014
    Posts:
    397
    Another option you have to see managed allocation in game code is to enable allocation callstacks - https://docs.unity3d.com/Manual/ProfilerCPU.html
    You will see red samples in timeline view. Callstack is displayed when you click on a sample.
    For main thread allocations you can also use hierarchy view - callstack is displayed in detailed view when you select a specific sample.

    Also take a look at this nice writeup on how to avoid gc allocations - https://unity3d.com/learn/tutorials...ion/optimizing-garbage-collection-unity-games
     
    Last edited: Feb 18, 2019
    MarkTension and mbitzos like this.
  4. mbitzos

    mbitzos

    Joined:
    Jun 8, 2018
    Posts:
    16
    Thank you for the fast reply! That makes a lot of sense and clears up a lot!
     
    MartinTilo likes this.
  5. mbitzos

    mbitzos

    Joined:
    Jun 8, 2018
    Posts:
    16
    Thanks for the info on the profiler it helps! I've been living by that document to make sure im not doing any unnecessary GC allocs, its very useful!
     
    MartinTilo likes this.
  6. hungrybelome

    hungrybelome

    Joined:
    Dec 31, 2014
    Posts:
    317
    Hi, can you explain what setting Allocation Callstacks to "Managed Allocations" actually does in the profiler? I'm not seeing any more detailed data about allocations.

    I was hoping that it would make the opaque GC.Alloc callstack listings in the profiler more transparent. Right now I am trying to track the cause of some very large GC.Alloc listings, but all I can do is remove LOCs and test again since I can't pinpoint the exact cause in the profiler, only the parent method.

    *EDIT*
    Nevermind, found out from here that it enables the callstack in the timeline view!
     
    Last edited: Jun 30, 2019
  7. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    These call stacks are also visible in Hierarchy view, albeit admittedly a bit more hidden. The Nice thing here is that they can be selected and copied out
    2019-07-29_15-55-28.png
    1. Find the drop-down that reads "no Details" by default and switch to "Show Related Objects"
    2. Select one of the "N/A" Objects (yes, awkward name*)
    3. Find the call stack in the pane underneath that opens once you select an object
    (* The name usually would be something like "MainCamera" for e.g. a render sample and clicking one of these while profiling in the Editor highlights the corresponding object in the Hierarchy view, if it is still around.)
     
  8. T5Shared

    T5Shared

    Joined:
    Oct 19, 2018
    Posts:
    128
    Looks like those call stacks are only available when profiling the game running in the Unity Editor. When I profile a built player, I don't get any stacks. With people often saying 'don't profile the Editor, profile a player', it would be nice to get this kind of data when you do that.
     
  9. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    @T5Shared
    Agreed, which is why it was added for players build with 2019.3.0a12 or newer.
     
  10. T5Shared

    T5Shared

    Joined:
    Oct 19, 2018
    Posts:
    128
    Excellent! But you forgot to mention it will also be back-ported to 2018.4 ;)
     
    resequenced likes this.
  11. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    @T5Shared I'd personally love to be able to backport all those super useful Profiler features onto any LTS still supported or even earlier but, sadly, a lot of the time they are tied to too many changes deep in the code base and building upon quite some iterations of changes. Especially where it concerns the interplay between unity's own native code, Mono and IL2CPP code and all the permutations of platform dependent implementations. Also keeping in mind that 18.4 still has support for Mono 3.5 and 4.x Scripting back-end with multiple 3.5 Mono repositories/implementations per platform...
    Backporting this particular feature (which implicates changes to all these parts of the engine) would not really be practicable.
     
  12. T5Shared

    T5Shared

    Joined:
    Oct 19, 2018
    Posts:
    128
    Yeah, I thought you would come up with lots of valid technical reasons. Disgusting ;) I can *probably* get our project working in 2019 so I can get those features for testing. Might even be easier than you backporting to 2018...
     
  13. T5Shared

    T5Shared

    Joined:
    Oct 19, 2018
    Posts:
    128
    Hi @MartinTilo, is there a specific reason why the 'GC Alloc' column in the 'Show Related Objects' panel is showing the amounf of bytes in a formatted style (e.g. 20 B, 16 KB, 20.8 MB), while the 'Show Calls' view is showing raw byte counts? And could the 'Show Related Objects' panel show 'GC Alloc' counts in bytes?

    The reason I am asking is that sorting the panel on 'GC Alloc' seems to be alphabetical. Which works for byte counts, but not for the formatted stuff. I found it more difficult than necessary to find a cleverly hidden '15.6 MB' amongst 3000 entries:

    upload_2019-11-12_21-1-55.png


    Displaying large amounts of bytes as KB and MB can be nice. But it would be nice if they sort properly.

    Edit: I tried both Unity 2018.4.9f1 and 2019.2.12f1

    Thanks,
    Matt
     
    Last edited: Nov 12, 2019
    MartinTilo and alexeyzakharov like this.
  14. alexeyzakharov

    alexeyzakharov

    Unity Technologies

    Joined:
    Jul 2, 2014
    Posts:
    397
    Do you feel like the exact number is perceived better for you? We can have an option for the display style.
    (We added the KB/MB in order to simplify reading the high numbers).

    That is our oversight, thanks for spotting
     
    T5Shared and MartinTilo like this.
  15. T5Shared

    T5Shared

    Joined:
    Oct 19, 2018
    Posts:
    128
    Hi @alexeyzakharov,

    I would prefer the exact number. Interestingly enough not because it is exact, but because it gives a much better sense of 'size'. A big number stands out much more than 'KB' versus 'GB'. For example, a big allocation stands out much more like this:


    upload_2019-11-13_17-11-36.png

    than like this:

    upload_2019-11-13_17-10-32.png

    (it's the 15.6 MB one.) It also means that sorting of the GC column in the Related Object will be fixed automatically.

    Thanks,
    Matt
     
    MartinTilo and alexeyzakharov like this.
  16. andyz

    andyz

    Joined:
    Jan 5, 2010
    Posts:
    1,471
    Since I can only deep profile in editor (in 2018.4) I am doing but get weird looking gc allocs. What for instance is this (function name replaced)?
    Code (CSharp):
    1. Callstack:
    2. 0x00007FF968195064 (mono-2.0-bdwgc) [boehm-gc.c:805] mono_gc_alloc_obj
    3. 0x00007FF968138935 (mono-2.0-bdwgc) [object.c:5515] mono_object_new_alloc_specific_checked
    4. 0x00007FF968138E6A (mono-2.0-bdwgc) [object.c:5456] mono_object_new_specific_checked
    5. 0x00007FF96813E263 (mono-2.0-bdwgc) [object.c:5463] ves_icall_object_new_specific
    6. 0x00000000223217C8 ((<unknown>))
    7. 0x000000003CEBF903 (Mono JIT Code) [Sweep.cs:1070] FUNCTION_CALL()
    the function makes no allocs after the first time it is called
     
  17. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    Hi @andyz,
    That'd be the JIT (just in time) compilation compiling the method the first time it's called. That allocation can mostly be ignored.
     
  18. andyz

    andyz

    Joined:
    Jan 5, 2010
    Posts:
    1,471
    Ok thanks but it shows up in profile frame long after first time of calling. Also maybe an option to hide JIT compilations?
     
  19. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    which one is it? if the same method allocates because of JIT multiple times, it might do something non trivial and it might be interesting to see the method.

    agreed and added to our backlog. BTW, if the the target platform you're aiming for supports IL2CPP, you might want to profile on there with an IL2CPP build. That way you don't get any JIT allocations and performance in general will be way closer to the final build.
     
    andyz likes this.
  20. daxiongmao

    daxiongmao

    Joined:
    Feb 2, 2016
    Posts:
    291
    Looking at the call stacks didn't really help for the gc alloc i was trying to track down. So i will post what it was here.

    Lambda allocations.

    I was aware lambdas allocate memory but what i was not really aware of or thought about was when that happens. I assumed it happened at or right before the function was executed. But digging in more it seems, they are allocated at the method scope in most cases. So you will get allocations even if the lambda is never executed. And it also seems in my case not just method scope, but "for" loop scope as well. But not "if" level scope.

    I had a nested loop that had the lambda in it, that was never executed because of an if statement, but was seeing gc allocs for the number of nested loops iterations.

    The profiling tools etc, weren't really able to catch this correctly. Because wrapping the actual execution code did not seem to catch the correct location of the alloc. A much wider one would, but it was really hard to pinpoint. And i had to resort to commenting stuff out which eventually led me to question my knowledge on lambdas and dig into them more.

    The final solution was to rework stuff a bit and create and internal method outside the loops and use this in the nested loop. Then the lambda would be created only when actually executing it.
     
  21. Digika

    Digika

    Joined:
    Jan 7, 2018
    Posts:
    81
    Is there any hacky way to enable it in development build in versions below that?
    may be via boot.config or assembly patch or something? I need to profile it outside of editor
     
  22. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    No. It's an involved change in Mono/IL2CPP and the main codebase.

    Re:
    Ah yes. Those can be lovely confusing. Thanks for raising this feedback, we'll check if we can't get Mono/IL2CPP to drop some further hints into the call stacks to help identify the allocations for the Anonymous Functions, Lambdas and Closures.
     
  23. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    Sorry, I'll have to caveat my earlier "No." a bit. If you are running a Mono build on 2018.3-2019.2 *, you can however pass the "-deepprofiling" command line option to the built player, to enable deep profiling for the entire session. Maybe that can help you?

    (* that option is still there for newer builds but with the ability to switch Allocation Call Stacks on on the fly (and not just in Mono but in IL2CPP too), as well as the build option to build with deep profiling support, to then be able to toggle that on and off ad-hoc (and on IL2CPP too), this has become somewhat more of a hassle to do if via command line instead of UI or API. And it is also not your only option anymore).
     
  24. Digika

    Digika

    Joined:
    Jan 7, 2018
    Posts:
    81
    MartinTilo
    Thanks for the responses I now have more questions, if you dont mind.

    By `mono` in this context you mean mono itself? Because I can get debug compiled version of mono if that works?

    This is not directed at me but I will budge in: is there a reason I'm not getting allocation report for LINQs OrderByDescending in place? It should return with allocation but Profiler reports 0. Tested on 2020.x

    EDIT: this does not work, I still see all zeroes in GC.Alloc
    I will check, thanks!
    Any workaround for 2017.x? As I said I can get debug version of mono. May be replace some assemblies from 2018 playback engines and use `-deepprofiling`? I dont care about accuracy of some usage stats I just need to see what allocates and where inside Mono domain.

    Thanks, but ability to built with that is a bit limited rn.

    Additional question, not related to the context above.
    I'm profiling 2020 build outside of editor build with profiling support. I see PlayerLoop stats and total allocations at frame X is ~400B and hierarchy shows the base location where it occured. However, in memory stats the same frame has `GC Alloc` stats of 4.0KB. This makes no sense, that's almost 10x higher then what Playerloop reports. And there is no info on any other allocations anywhere. Is this some limitation of Profiler or I'm missing something?
    P.S. I just realized it is the same question OP asked but in my case I'm profiling standalone player, so no Editor caveats
     
    Last edited: Sep 18, 2020
    alexeyzakharov likes this.
  25. alexeyzakharov

    alexeyzakharov

    Unity Technologies

    Joined:
    Jul 2, 2014
    Posts:
    397
    The Memory stats chart and view displays allocations across all threads, while hierarchy view displays for now only Main Thread data. More likely there is a C# thread which allocates. You can spot those allocations on the timeline view - they are marked as red samples.
     
  26. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    934
    I.e. no. It's not just mono. You could have all the changes in mono but you'd need source code access to try and backport the main code base changes too.


    Do you have a code snippet for this? Because it allocated and it showed in the Profiler when I tried this just now.


    What GC.Alloc stat are you looking at?

    Do you mean limited as in: only available from 2019.3 and up?
     
  27. Digika

    Digika

    Joined:
    Jan 7, 2018
    Posts:
    81
    I mean calls to GC.Alloc. The stats always at 0 for that in 2017-2018 even with -deepproiling passed (where I can get detailed call stack data).

    Yes, that was my initial thought but the timeline only shows these allocations already reported by Playerloop onMain thread. I manually went through all dropdowns for threads and none of them had any data for GC.Alloc reported.

    Code (CSharp):
    1.  
    2. _sortedList.Clear();
    3. foreach (var kv in _averages)
    4. {
    5.     long tickSum = 0;
    6.     var len = kv.Value.Value.Count;
    7.     for (int i = 0; i < len; i++)
    8.         tickSum = tickSum + kv.Value.Value[i].ElapsedTicks;
    9.     var ma = kv.Value.Key;
    10.     ma.Sample(tickSum);
    11.     var av = ma.GetAverage();
    12.     if (av > cutoffTicks)
    13.         _sortedList.Add(new KeyValuePair<string, long>(kv.Key.GUID, av));
    14. }
    15. if (_sortedList.Count > 0)
    16. {
    17.     int c = 0;
    18.     // TODO: .OrderByDescending allocates (?)
    19.     foreach (var kvav in _sortedList.OrderByDescending(x => x.Value))
    20.     {
    21.         //do stuff
    22.     }
    23. }
    24.  
    Here, sorting in place via .OrderByDescending() query should allocated and Benchmark.Net does report on that for .NETFX/Core for example, but I see no GC.Alloc data in late 2019/2020 versions of Unity/profiler

    I mean as in dont have a way to download build with extra debug options enabled, only have basic development build.
    So you are saying there is absolutely no way (even if hacky) to get reports on GC.Alloc calls at site in the Playerloop call hierharchy in anything below 2019.2.3 in standalone player profiling mode?
     
    Last edited: Sep 23, 2020
unityunity