Feedback Unity 2021.1 Performance Overview

Discussion in '2021.1 Beta' started by Peter77, Oct 23, 2020.

  1. rz_0lento

    rz_0lento

    Joined:
    Oct 8, 2013
    Posts:
    2,017
    I think someone has to report the incremental GC dropping performance radically though. It's not what users would expect to happen.
     
  2. jdtec

    jdtec

    Joined:
    Oct 25, 2017
    Posts:
    281
    First thing I've done is disable it. Makes me wonder if Unity should have it disabled by default?

    Maybe it should have a disclaimer that it costs performance and really you should be eliminating garbage collection where possible?
     
    Ruslank100 and landonth like this.
  3. AcidArrow

    AcidArrow

    Joined:
    May 20, 2010
    Posts:
    7,577
    Wait, it's on by default? That sounds bad.
     
    joshcamas likes this.
  4. jdtec

    jdtec

    Joined:
    Oct 25, 2017
    Posts:
    281
    Sorry I don't know for sure, shouldn't have phrased it like that. I don't remember turning it on but haven't tried making a new project yet to confirm so maybe I did turn it on.
     
  5. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    If you create a new project (3D Template) in Unity 2021.1.0a9, then Incremental GC is turned on by default. It's probably turned on by default in older Unity versions too.

    I actually think it's not bad that it's turned on by default. Because then you have a chance to squeeze out a little more performance if you turn it off later in the development. That way it runs "smooth" by default and if you commit time to optimizing for GC, then you get some performance back in return.
     
    phobos2077 and dzamani like this.
  6. AcidArrow

    AcidArrow

    Joined:
    May 20, 2010
    Posts:
    7,577
    I just checked, it's on by default.

    If it has a performance overhead, I think it's a bad thing to have on by default.

    (but then again Unity needs to find what their target audience is, because parts of the engine seem to target large studios and other parts newbie users that don't know what they're doing, so defaults being good or bad depend on what the target audience is)
     
    landonth, phobos2077 and jdtec like this.
  7. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    It's definitely not what I expected. The documentation doesn't mention that it's a trade-off ("fewer GC spikes in exchange for steadily lower performance") either. I actually assumed Incremental GC would help performance, to be honest.

    However, I'm not sure if the drop in performance is even by design. It would be very helpful if a GC expert from Unity (@jonas-echterhoff ?) could tell if this is by design. If it's not, I can file a bug-report if it helps.

    What baffles me the most is that my profiling project is quite GCAlloc optimized, there is 0 per frame gcalloc most of the time. However, turning off Incremental GC is still giving back up to 0.5ms per frame. So even though Incremental GC should have nothing to do, it still is doing something.

    Again, could be totally by design.
     
    hippocoder and Prodigga like this.
  8. AcidArrow

    AcidArrow

    Joined:
    May 20, 2010
    Posts:
    7,577
    I'll be really glad to be proven wrong, but I'm pretty sure the overhead is by design, which is why it's baffling that it's on by default.
     
  9. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,647
    Hi,

    there is a _certain_ expected overhead from incremental GC, caused by the need to add "write barriers" to tell the GC whenever a reference in managed memory has changed. We have measured this, and found that it does not cause a measurable (beyond natural variations in frame rates) impact on performance in the complete Unity projects we tried - but it is certainly possible to create artificial worst-case scenarios where this would cause severe drops in performance. Since there is no such thing as a "typical" Unity project, we cannot rule out that there are projects negatively affected by this - and that is the main reason we still allow the option to turn it off. We do expect that for the majority of projects, incremental GC is beneficial for performance.

    The other performance difference is that GC collection is now distributed over multiple frames, generally looking at vsync to see how much time is available at the end of a frame and using that "free" time for GC. A full blocking collection, by contrast, takes no time on most frames but causes stutter on the frames where it does run.
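To make the difference between the two modes concrete, here is a toy simulation. It is only a sketch and not Unity's actual scheduler: the function names, the fixed per-frame workload and the 16 ms frame interval are all made up for illustration. It just shows how the same amount of GC work lands as one spike in blocking mode versus being spread over the idle time of several frames in incremental mode.

```python
def blocking_frame_times(base_ms, gc_work_ms, frames, gc_frame):
    """Frame times when the whole collection runs inside a single frame."""
    times = [base_ms] * frames
    times[gc_frame] += gc_work_ms
    return times


def incremental_frame_times(base_ms, gc_work_ms, frames, gc_frame, frame_interval_ms):
    """Frame times when collection is sliced into each frame's idle time.

    Assumption for this sketch: the idle time is simply the frame interval
    minus the time the frame's own work took.
    """
    times = [base_ms] * frames
    remaining = gc_work_ms
    for i in range(gc_frame, frames):
        if remaining <= 0:
            break
        idle = max(frame_interval_ms - base_ms, 0.0)
        work = min(idle, remaining)
        times[i] += work
        remaining -= work
    return times


# 10 ms of game work per frame, 12 ms of GC work triggered at frame 2.
blocking = blocking_frame_times(10.0, 12.0, 8, 2)
incremental = incremental_frame_times(10.0, 12.0, 8, 2, 16.0)

print("worst blocking frame:   ", max(blocking))
print("worst incremental frame:", max(incremental))
```

In this toy setup the total time spent is identical in both modes; only the worst single frame differs. The write-barrier overhead mentioned above is not modelled here at all.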

    It would be interesting to take a look at your test project to see what is causing the difference in this case.
     
  10. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,647
    Oh - I just realized you attached the project to a FogBugz case. Will take a look.
     
  11. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    I always ran/run performance tests with VSync turned off. Is this perhaps a reason why I see the Incremental GC cost?

    Yes, thank you! The project attached to the bug-report was submitted with Unity 2017 a few years ago. If it helps, I can also submit it again with 2021.1 and the options I used for the latest test.

    Otherwise here are the Player Settings that I used for the latest tests. Create a Win64 build with and without Incremental GC to see the difference. Also make sure to run it on similar hardware that I used to submit the bug-report (Case 1108597). You most likely can't reproduce it on higher-end hardware.
    upload_2020-12-17_10-30-18.png
     
    hippocoder and Deozaan like this.
  12. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,647
    If VSync is disabled, targetFrameRate is not used, and you are running on a platform that does not have any platform-specific frame timing, then the incremental GC will always assume it can use up to 1 millisecond per frame for its work (as it has nothing to use for a better estimate). So, yes, it is possible that what you are seeing is the incremental GC doing its work. A profiler sample would tell. But it should not need this time continuously, only when collection needs to happen. But I guess you are only sampling a few frames? Maybe in the non-incremental case, you have one big collection happening (before you start taking time samples), and in the incremental case, that time is spread over the first X frames, which you are seeing? What if you wait some time? Will you still see a difference? Again, comparing profiler output graphs could probably tell.
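The fallback described above could be sketched roughly like this. Only the 1 ms fallback comes from the post; the function `gc_budget_ms`, its parameters, and the "frame interval minus frame work" estimate for the other cases are assumptions made for illustration, and platform-specific frame timing is left out entirely. This is not Unity's implementation.

```python
from typing import Optional

FALLBACK_BUDGET_MS = 1.0  # from the post: used when there is nothing better to go on


def gc_budget_ms(vsync_interval_ms: Optional[float],
                 target_frame_rate: Optional[int],
                 frame_work_ms: float) -> float:
    """Hypothetical estimate of the time per frame available to incremental GC."""
    if vsync_interval_ms is not None:
        frame_interval_ms = vsync_interval_ms
    elif target_frame_rate is not None and target_frame_rate > 0:
        frame_interval_ms = 1000.0 / target_frame_rate
    else:
        # No vsync, no target frame rate: fall back to a fixed slice.
        return FALLBACK_BUDGET_MS
    # Assumed: whatever is left of the frame interval after the frame's own work.
    return max(frame_interval_ms - frame_work_ms, 0.0)


print(gc_budget_ms(None, None, 5.0))   # benchmark setup: no vsync, no target frame rate
print(gc_budget_ms(None, 50, 5.0))     # target frame rate of 50 fps
```

With VSync off and no target frame rate, which is how the tests in this thread were run, the sketch always returns the fixed 1 ms slice regardless of how fast the frame was.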

    I have not been able to load your project into a newer version of Unity easily (did not spend much time on it yet, either). If you can submit an updated version, that would indeed be helpful.
     
    phobos2077 and Neonlyte like this.
  13. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    It's more than a few. The 20 samples you see in the image below represent 20 seconds, where each sample is the average frame-time over the last second.

    If the game runs at 200 frames per second (5 ms per frame, since 1000 ms / 5 ms = 200), that would be 200 * 20 = 4000 frames measured in total.

    scene_6_8.png

    I repeated these tests three times and used the minimum average frame-time in the graph, this should remove a lot of noise.

    Since the test goes over 20 seconds and the average frame-time gets reset each second, I don't think a single GC spike at the beginning would explain the entire test to be slower.
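For anyone who wants to reproduce the measurement scheme, here is a minimal sketch of it. The function names are made up, and taking the minimum sample by sample across the three runs is one reading of the description above, not necessarily exactly what the test project does.

```python
def per_second_averages(frame_times_ms):
    """Average frame time for each elapsed second; the accumulator resets every second."""
    averages = []
    elapsed_ms = 0.0
    count = 0
    for frame_time in frame_times_ms:
        elapsed_ms += frame_time
        count += 1
        if elapsed_ms >= 1000.0:
            averages.append(elapsed_ms / count)
            elapsed_ms = 0.0
            count = 0
    return averages


def min_across_runs(runs):
    """Sample-by-sample minimum of the per-second averages from several runs."""
    return [min(values) for values in zip(*runs)]


# 20 seconds at a constant 5 ms per frame: 200 fps * 20 s = 4000 frames.
samples = per_second_averages([5.0] * 4000)
print(len(samples), samples[0])

# Three hypothetical runs with two samples each.
best = min_across_runs([[5.0, 5.2], [5.1, 5.0], [5.3, 5.1]])
print(best)
```

Because the accumulator resets every second, a one-off spike at the start can only raise the first sample; it cannot lift all 20 of them.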

    I can capture that.

    Alright, will do that!
     
    Last edited: Dec 17, 2020
  14. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    Here you go:
    (Case 1300104) 2021.1: "Incremental GC" cost

    Please make sure to watch the attached video(s) in the bug-report, where I explain how to build a player and how to run the test.
     
  15. jamespaterson

    jamespaterson

    Joined:
    Jun 19, 2018
    Posts:
    362
    For what it's worth, this thread and the associated work that tracked the performance issue down to incremental GC are making me think I might take the step of migrating from 2018.4 to a later version at some point. Thanks again Peter!
     
    phobos2077, Prodigga and Jes28 like this.
  16. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    This shouldn't stop you from upgrading. After all, you can turn it off when you notice it affects your game negatively and then it's the same behavior as in 2018.4.
     
    jamespaterson likes this.
  17. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,647
    Thanks. I took a look at that project - but on my machine, I could not see any difference between builds with and without incremental GC. Now, my frame times are generally higher than yours, and the project seems to be GPU bound on my computer (a 2017 MacBook Pro) - so it would be possible that performance differences are masked by the time spent waiting for the GPU.

    But: I also looked at samples from both builds using the profiler. I looked for two things where I'd suspect incremental GC to cause a difference.
    1. incremental collection taking time. But I did not see any time spent in the GC (it does not seem to kick in in either incremental or non-incremental mode, as the project seems to barely allocate any memory at runtime).
    2. differences in time spent in scripts. In incremental GC mode, scripts are compiled with different settings (the write barriers mentioned above), which can cause some overhead. But I could not see any difference in time spent in script code in either of the two builds.

    So - I'm not doubting your findings, but I cannot reproduce them. Can you get Unity Profiler graphs showing the differences in frame times? That would be very useful here.
     
  18. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    GPU bound? :eek: The test runs (should run) at 320x240 resolution to avoid being GPU bound, I'm more than surprised. Did you run it on Windows or OSX?

    I've uploaded Unity profiling data recorded with the Profiler.enableBinaryLog functionality, please see:
    (Case 1300171) 2021.1: Follow-up on (Case 1300104) "Incremental GC" cost

    Here are some screenshots so everybody can follow along...

    without.png

    with.png

    diff.png

    Please note that the Unity profiling data recording itself costs performance too, so these numbers don't necessarily match the graphs posted earlier.

    PS: The profiler data captures seem to contain "CPU Usage" only, all memory alloc etc seem to be missing? I've asked whether this is by design here: https://forum.unity.com/threads/profiling-data-is-missing-a-lot-of-information.1024171/
     
    Last edited: Dec 17, 2020
  19. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,647
    On a MacBook running OSX. I don't think the actual rendering was taking much time, but the blitting took some ms (possibly upscaling to the retina display?). Not much, but enough to mask the small differences in the rest of the frame (which is also very fast).

    Thanks. This is useful.

    So, what I see here: You have uploaded 4 different data sets, each with and without incremental GC. In each case, the non-incremental version is faster, but never by as much as the number sets you originally posted.

    Differences in frame times I see for the 4 data sets (unless I'm interpreting them wrong?): 0.04ms, 0.09ms, 0.19ms, 0.2ms. Drilling down, the difference seems to be completely in managed code. That is in line with the overhead from incremental GC requiring managed code to be generated with write barriers. And in this scope, I think the difference is not unexpected - there is _some_ cost related to incremental GC (or any more modern GC for that matter) for the write barriers.

    But what is strange is that the numbers you are seeing without profiling show higher differences. I have no explanation for this atm.

    I don't know this (I'm not involved directly with the profiler development). But just FWIW - I don't think that allocations play a role here. It does not seem to be the collecting part of the GC which is taking time here (there is barely any GC happening here), but the overhead from tracking references for later collection.

    Btw, thanks for sharing all this data. This is very useful (even though I may not have any answers on how to improve this, other than turning off incremental GC for your project, or optimizing the managed code which is slower to change fewer references in heap objects).
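As a rough illustration of "changing fewer references in heap objects", here is a toy model. It is not how Unity or Mono implement write barriers: the class `TrackedObject` and its counter are invented purely to make the cost visible. The idea is that every reference stored into a heap object pays a small extra cost, while work done in local variables does not.

```python
class TrackedObject:
    """Toy heap object that counts reference stores, standing in for a write barrier."""

    def __init__(self):
        self.fields = {}
        self.barrier_hits = 0

    def store_ref(self, name, value):
        self.barrier_hits += 1  # the extra bookkeeping a write barrier would add
        self.fields[name] = value


def update_every_iteration(obj, items):
    """Stores into the heap object on every loop iteration."""
    for item in items:
        obj.store_ref("current", item)


def update_once(obj, items):
    """Keeps the running value in a local and stores it into the object once."""
    current = None
    for item in items:
        current = item  # local variable: no store into a heap object
    obj.store_ref("current", current)


items = [object() for _ in range(1000)]
chatty, quiet = TrackedObject(), TrackedObject()
update_every_iteration(chatty, items)
update_once(quiet, items)
print(chatty.barrier_hits, quiet.barrier_hits)
```

Both objects end up holding the same reference, but the second variant touches the heap object once instead of a thousand times. Whether a rewrite like this pays off in a real project is something only the profiler can tell.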
     
  20. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
  21. 00christian00

    00christian00

    Joined:
    Jul 22, 2012
    Posts:
    945
  22. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    5,609
    Not that I know of. My understanding is this:
    • It's not exclusive to UI, but the UI Components were a trivial case to report for me. Not every UI Component causes gcalloc btw.
    • It's not specific to the MonoBehaviour/Component base classes, but to the specific implementation of that MonoBehaviour/Component. Not every MonoBehaviour/Component causes gcalloc.
     
    JamesArndt and 00christian00 like this.
  23. nasos_333

    nasos_333

    Joined:
    Feb 13, 2013
    Posts:
    9,499
    Nice, very useful to know this :) thanks
     
    Peter77 likes this.
  24. Quatum1000

    Quatum1000

    Joined:
    Oct 5, 2014
    Posts:
    855
    Hi,

    I've been following these version-comparison threads for a while.

    (I know Peter's tests avoid GPU load.)
    For my project I wrote a Windows-based tool to make changes to all Unity GPU core scripts at once. First, to make sure that there are no incompatibilities when updating GPU core scripts to a newer version. And second, to keep all changes to any Unity GPU core script outside of Unity itself, so I can apply them to any version at any time with one click.

    Untitled-1.jpg

    This thread reminds me to compare the GPU side of 2018.2.xx with 2020.x, based on all the shader & core stuff.

    Further, I definitely remember that when I first compared the .NET 4.x beta with .NET 3.x, all the scripting stuff was about 30%-70% faster on .NET 4.x in every situation... but about 2-3 builds later the speed advantage was completely gone. That was a bit annoying, because the .NET team kept shouting "it's so much faster".
     
  25. mcdenyer

    mcdenyer

    Joined:
    Mar 4, 2014
    Posts:
    9
    Now that the issue with Incremental GC has been discovered, is the mobile performance of 2019/2020 still considered poor compared to 2018? I want to upgrade my project to 2020 to use the visual shader tools, but only if I can get performance on old devices as good as I get on 2018.
     