Incremental GC feedback thread

Discussion in '2019.1 Alpha' started by jonas-echterhoff, Nov 26, 2018.

  1. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Unity 2019.1.0a10 has experimental support for incremental garbage collection. You can find more information about the feature in this blog post.

    I'm opening this forum thread as a place to discuss the feature and to collect any feedback. We are very interested in hearing from anyone trying this on projects (especially projects which are suffering from GC spikes), and to hear how incremental GC affects these projects - but any other type of feedback is very welcome as well of course.
     
    Last edited: Nov 26, 2018
  2. Dmitriy-Yukhanov

    Dmitriy-Yukhanov

    Joined:
    Jul 27, 2012
    Posts:
    1,129
    Looks really interesting, @jonas-echterhoff !

    Thanks for letting us try it at such an early stage.

    Here is a first simple experiment on Android:

    upload_2018-11-27_2-15-5.png

    upload_2018-11-27_2-16-4.png

    upload_2018-11-27_2-20-46.png

    The last screenshot reveals the nature of the incremental GC: the Profiler shows GarbageCollector.CollectIncremental taking up all of the frame's WaitForTargetFPS time (the wait for vsync), and GC.Collect runs a portion of its work within the CollectIncremental time.

    And if I understood it correctly, this picture is exactly right: the incremental GC does some work to set up the slice boundaries, then runs a chunk of the synchronous GC.Collect() in that frame, and then does some additional work to prepare for the next frame.

    And this is much, much better than a single 9ms spike with incremental GC turned off for the same scene:

    upload_2018-11-27_2-28-30.png

    I'm really happy to see this is coming and will be available in 19.1.

    Though I'm afraid this will relax the pressure on developers to avoid heap allocations, and it may increase ignorance of the GC allocation problem, leading to more GC issues in late project stages =D
     
  3. yoyobbi

    yoyobbi

    Joined:
    Nov 26, 2013
    Posts:
    15
    We see significant performance problems in any managed code that allocates memory, independent of garbage collection spikes - code that allocates just runs more slowly. My theory is that the Boehm GC approach means fresh allocations constantly spill into fresh cache lines, so code that allocates will almost always be hit with a performance-crippling cache miss.

    I had hoped that the rumoured "new garbage collector" would be a generational garbage collector with good cache utilization for short-lived allocations. Is there an initiative at Unity to support generational GC, or is incremental Boehm the best we can hope for? Reducing spikes is great, but if allocation continues to hurt performance then we will continue to avoid allocations as much as humanly possible.
     
    Sluggy likes this.
  4. liiir1985

    liiir1985

    Joined:
    Jul 30, 2014
    Posts:
    25
    That would be the benefit of a precise GC. Boehm is a conservative collector, which means it cannot tell the difference between a real pointer and an integer value. So compacting memory is not possible with Boehm, and neither is generational marking. Precise GCs (SGen, CoreCLR's GC, the JVM's GCs) compact memory, which means moving live objects together in order to eliminate memory fragmentation and improve cache locality.
    But currently it's very unlikely that Unity will adopt a precise GC, because none of them work with IL2CPP. It's difficult to get stack maps, which are crucial for a precise GC, out of a C++ compiler.
    Using a precise GC at this point would mean abandoning IL2CPP and switching to a JIT-based code generation system, like Mono AOT or CoreRT. CoreRT is currently not production-ready and doesn't support iOS.
     
    Last edited: Nov 27, 2018
    yoyobbi likes this.
  5. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Thanks for the testing! From your screenshots, it looks like you don't actually have vsync enabled, though, making the player run at >100fps? If you enable vsync, the GC should have a better idea of how much time it should use. If you don't, try changing the value of GarbageCollector.incrementalTimeSliceNanoseconds.
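
For readers who want to try the setting mentioned above, here is a minimal sketch of adjusting the time slice from a script. The property GarbageCollector.incrementalTimeSliceNanoseconds is the one named in the post; the component name and the 2 ms value are assumptions chosen purely for illustration, not recommended settings.

```csharp
using UnityEngine;
using UnityEngine.Scripting;

// Hypothetical helper component (the name is an assumption for illustration).
public class GcTimeSliceTuner : MonoBehaviour
{
    // Illustrative budget only; tune it against your own frame time goal.
    public float timeSliceMilliseconds = 2f;

    void Start()
    {
        // The property takes nanoseconds, so convert from milliseconds.
        ulong nanoseconds = (ulong)(timeSliceMilliseconds * 1000000f);
        GarbageCollector.incrementalTimeSliceNanoseconds = nanoseconds;

        Debug.Log("Incremental GC time slice set to " +
                  GarbageCollector.incrementalTimeSliceNanoseconds + " ns");
    }
}
```

Per the post, this should mostly matter when neither vsync nor a target frame rate gives the GC a frame time goal to work against.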

    Yes, this is a concern I share - people might make up for the better time distribution by writing less optimal code, and then not benefit in the end. Though you could argue that there is still benefit, if you can get to a similar result with less hard optimization work.
     
    Dmitriy-Yukhanov likes this.
  6. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Right now, no. But as I wrote in the linked blog post, incremental Boehm seemed like the smallest (and thus, safest) step to take towards a better GC, and should help solve the biggest problem people seem to have (spikes). Once this is shipping and stable, we are at a better point to switch to other GC solutions, as the write barrier part needed by pretty much any modern GC is solved then. We will continue to listen to feedback and consider future steps based on that.

    That said, no possible solution is a silver bullet. Unity's requirements don't necessarily match those of other software, so what works well somewhere else might not work well for Unity. E.g., users have repeatedly asked about switching to SGen. I have been testing with it, and it did not give overall better performance results in Unity content.
     
  7. Dmitriy-Yukhanov

    Dmitriy-Yukhanov

    Joined:
    Jul 27, 2012
    Posts:
    1,129
    Thanks for your reply, Jonas!

    It actually was built with the Every V Blank setting:

    upload_2018-11-27_11-40-15.png

    Though I agree the CPU graph looks unusual for a Player with VSync enabled.
     
  8. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    1,841
    What's the logic for sweeping with this? Does it have anything resembling generations or other knobs we can tweak?
     
  9. dadude123

    dadude123

    Joined:
    Feb 26, 2014
    Posts:
    718
    The only knob to tweak is the maximum time spent on scanning per frame.
    No big logic changes and no generational GC yet.

    As jonas-echterhoff explained in post#6 you can view it as a sort of preparation stage for coming changes that also already fixes the biggest issue we have with the GC (which is frame time spikes).
     
    r618 likes this.
  10. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    I think the profiler graph may be wrong here. Looking at the reported total frame time of ~42ms, that does not match the graphed frame rate between 100-200 fps. I think there were some bugs in profiler graph rendering in 19.1, I'll check with our profiler developers.
     
    Dmitriy-Yukhanov likes this.
  11. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Just to make sure I'm not overpromising: There are no specific "coming changes" planned after incremental GC. GC spikes are clearly the biggest user issue with GC today, so we are setting out to fix those. Once that has landed and is out of experimental, we will listen to feedback and evaluate what are the most pressing issues to work on, and plan further steps based on that.
     
    Peter77 likes this.
  12. Dmitriy-Yukhanov

    Dmitriy-Yukhanov

    Joined:
    Jul 27, 2012
    Posts:
    1,129
    Cromfeli likes this.
  13. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Cromfeli and Dmitriy-Yukhanov like this.
  14. yoyobbi

    yoyobbi

    Joined:
    Nov 26, 2013
    Posts:
    15
    Thanks for clarifying. Spike reduction is definitely a great step forward, so thank you for that.

    We will continue to avoid allocating memory in order to maintain decent cache performance. I guess the good news is that all the tricks we've learned and pooling mechanisms we've built aren't about to become obsolete after all. :)

    In future with ECS + jobs + Burst compilation - all premised on native arrays of value types - we should be writing more cache-friendly code with less allocation.
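
As a rough illustration of the kind of pooling mechanism mentioned above, here is a minimal generic pool sketch. It is not taken from the thread; the type name SimplePool and its members are hypothetical, and a production pool would likely also need size limits and checks against returning the same object twice.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical minimal pool: reuses instances instead of allocating new ones.
public sealed class SimplePool<T> where T : class
{
    private readonly Stack<T> free = new Stack<T>();
    private readonly Func<T> create;
    private readonly Action<T> reset;

    public SimplePool(Func<T> create, Action<T> reset = null, int prewarm = 0)
    {
        if (create == null) throw new ArgumentNullException("create");
        this.create = create;
        this.reset = reset;

        // Allocate up front so gameplay code does not allocate later.
        for (int i = 0; i < prewarm; i++) free.Push(create());
    }

    // Hands out a pooled instance, allocating only when the pool is empty.
    public T Rent()
    {
        return free.Count > 0 ? free.Pop() : create();
    }

    // Resets the instance (if a reset action was given) and stores it for reuse.
    public void Return(T item)
    {
        if (reset != null) reset(item);
        free.Push(item);
    }
}
```

A possible usage, again only as a sketch: `var lists = new SimplePool<List<int>>(() => new List<int>(64), l => l.Clear(), 8);` followed by `var list = lists.Rent();` and later `lists.Return(list);`.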
     
  15. nxrighthere

    nxrighthere

    Joined:
    Mar 2, 2014
    Posts:
    404
    In a managed environment they will never become obsolete, even with a generational GC. Even if Unity gets a modern GC someday, you will still have to pool almost everything.
     
  16. KillHour

    KillHour

    Joined:
    Oct 25, 2015
    Posts:
    14
    Any ideas why enabling Incremental GC doesn't seem to be doing anything? Even in a brand new project on 2019.1.0a11, with the only changes being setting Scripting Runtime to 4.x and enabling Incremental GC in Player Settings, with a simple test script, I'm still seeing the GC run in a single frame, and without the GarbageCollector.CollectIncremental call in my profiler.
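
The post does not show its test script. A minimal sketch of what such a garbage-generating script might look like is given below; the component name, field names, and all sizes are assumptions for illustration only.

```csharp
using UnityEngine;

// Hypothetical stress script: allocates managed garbage every frame so that
// collections show up in the Profiler.
public class GarbageGenerator : MonoBehaviour
{
    public int allocationsPerFrame = 1000;
    public int bytesPerAllocation = 1024;

    // Keep a rotating set of blocks alive so the collector has a
    // non-trivial live heap to mark, not only short-lived garbage.
    private readonly byte[][] retained = new byte[4096][];
    private int next;

    void Update()
    {
        for (int i = 0; i < allocationsPerFrame; i++)
        {
            byte[] block = new byte[bytesPerAllocation];

            // Retain every 16th block, overwriting the oldest retained one;
            // everything else becomes garbage immediately.
            if ((i & 15) == 0)
            {
                retained[next] = block;
                next = (next + 1) % retained.Length;
            }
        }
    }
}
```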
     
  17. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Which platform are you testing this on?
     
  18. KillHour

    KillHour

    Joined:
    Oct 25, 2015
    Posts:
    14
    Windows.
     
  19. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    Testing in editor or player? Incremental GC is only supported on players atm. Also, how long is your GC spike? If it is very short, there might not be a point in spreading it over multiple frames.
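
One way to sanity-check this from a script is sketched below. It assumes the Unity version in use exposes GarbageCollector.isIncremental in UnityEngine.Scripting; the component name is hypothetical.

```csharp
using UnityEngine;
using UnityEngine.Scripting;

// Hypothetical diagnostic component: logs whether incremental GC is active.
public class IncrementalGcCheck : MonoBehaviour
{
    void Start()
    {
        if (Application.isEditor)
        {
            // Per the thread, incremental GC is only supported in players
            // at this point, so editor profiling will not show it.
            Debug.LogWarning("Running in the editor; test incremental GC in a player build.");
        }

        Debug.Log("GarbageCollector.isIncremental = " + GarbageCollector.isIncremental);
    }
}
```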
     
  20. KillHour

    KillHour

    Joined:
    Oct 25, 2015
    Posts:
    14
    That explains it. I was testing in the editor.
     
  21. Peter77

    Peter77

    Joined:
    Jun 12, 2013
    Posts:
    3,023
    There is this great Unite talk, that explains why profiling in the editor might not be the best option.
     
  22. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    24,193
    +Don't even trust the hardware. Throttling after intensive development and testing is quite common on anything mobile, wired or not, and even sometimes on desktops.

    A little anecdote, and I'll use FPS here to be relatable as the hard data is not available any longer. I pushed the Vita so hard that initially it was >60fps but in repeated play, went <50fps, which really wouldn't do.

    So I actually ended up giving it more to do on the CPU, and the GPU took a little rest, bringing the thermal down and running at a steady 60 throughout.

    It's odd. I did more work. Subtracted no work. Ran faster because GPU wasn't trying to take off. The heck I know. Obviously it's not going to be something you'll be able to use but still a funny tale I thought I'd share.
     
  23. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    24,193
    @jonas-echterhoff - Hi just wanted to run by you if it's possible or even desirable to query if the Incremental GC is currently busy, as I have wiggle room when I can instantiate things. Here's my use case:

    1. Player moves to new area, and the scene is loaded async (streaming so there are no pauses for loading). The scene contains only geometry and textures, so in this case, it'll be usually loading in just meshes + associated things like colliders, navmesh stuff etc - and if I'm not mistaken, Unity handles this with ring buffers well these days...

    2. Once this is done I'd like to wait for the incremental GC to finish being busy, then move on to instantiating the objects that the area requires, nice and slowly, perhaps time-sliced so that the objects can have time to set themselves up without doing so while the incremental GC is running. I would include these in the scene rather than instantiating them myself, but it seems like that would just be slower as it would bloat the scene with a lot of repetition.
    I don't really need to have all these objects ready for at least a few seconds as there will be at least a few seconds of player travel time guaranteed, so I can delay object construction if the GC is busy sweating.

    Is this a good or bad scheme? Your thoughts are welcome, thank you.
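
The time-sliced instantiation part of this scheme could be sketched roughly as follows. This is an illustration under assumptions, not the poster's code: the component name, the per-frame budget, and the idea of passing in a prefab list are all hypothetical, and the "is the GC busy" check is deliberately left out because the thread has not yet established how to query it at this point.

```csharp
using System.Collections;
using System.Collections.Generic;
using System.Diagnostics;
using UnityEngine;

// Hypothetical spawner: instantiates objects over several frames,
// spending at most a fixed time budget per frame.
public class TimeSlicedSpawner : MonoBehaviour
{
    // Illustrative per-frame budget in milliseconds.
    public float budgetMilliseconds = 1f;

    public IEnumerator SpawnAll(IList<GameObject> prefabs, Transform parent)
    {
        Stopwatch watch = Stopwatch.StartNew();

        for (int i = 0; i < prefabs.Count; i++)
        {
            Instantiate(prefabs[i], parent);

            // Once the budget is used up, continue on the next frame.
            if (watch.Elapsed.TotalMilliseconds >= budgetMilliseconds)
            {
                yield return null;
                watch.Reset();
                watch.Start();
            }
        }
    }
}
```

It would be started with something like `StartCoroutine(spawner.SpawnAll(prefabs, areaRoot));` once the async scene load has completed.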
     
  24. snacktime

    snacktime

    Joined:
    Apr 15, 2013
    Posts:
    1,841
    It's generally considered a horrible practice outside of games, because second guessing what a VM will do is nearly impossible in the generic case.

    That said most apps have a single steady state and don't have the hard sync points that game loops do. So while I still cringe at the idea of making runtime logic decisions based on GC stats, I think it has a lot of merit here.

    So I think I would propose having access to concrete stats that are actually a thing and from those define what busy is for your context. For example if you had basic stats on what is allocated, what the promotion/tenuring thresholds were and how many objects were set to be promoted/tenured in the next pass.

    On top of that maybe a callback after each GC pass just signaling that the pass is over, to better coordinate with your code logic.

    I still cringe at the idea; I think it's a minefield no matter how good you are. Making runtime decisions based on the stats, that is. The stats themselves would be super useful regardless.
     
    SugoiDev and hippocoder like this.
  25. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    24,193
    Absolutely! it's totally a hack, which is kind of OK in game dev, depending. Most people will go "why aren't you using proper frame timings to predict this?" and that's fair for optional jobs like effects and so on.

    This is quite applicable to my use case though, for tasks that are mandatory but where you have some wiggle room on when to do them, so it's just about having a bit more knowledge of what Unity's doing to defer some potentially expensive operations. I mean, an incremental GC quite happily throws off your average frame timings, so you probably won't be able to tell when it's running.

    Generally my choices are about consistent frame times, not peaks and pits. Maybe there are better ideas, and I'd love to hear them. I'm just after a delicious smooth experience as levels are streamed in, and some object setup and some GC are going to be part of that.
     
    Prodigga and SugoiDev like this.
  26. jonas-echterhoff

    jonas-echterhoff

    Unity Technologies

    Joined:
    Aug 18, 2005
    Posts:
    1,513
    I have some doubts about whether such a setup would give you any real-life benefit. Assuming you have some frame time goal (either vsync or a target FPS), the incremental GC is designed to make an effort to fit into that. So if you do more work, the incremental GC will do less work per frame and just take a bit longer overall, which should not hurt. I guess what you suggest would ideally just cause the incremental GC to finish a bit earlier (because you delay your other work, giving more time to the GC), but I'm not sure that would make a difference to the player in the end.

    If you wanted to try it however: There is no direct way to query if the GC "is currently busy". I think you can kind of get that information indirectly, however, using a recorder, like
    UnityEngine.Profiling.Recorder.Get("GarbageCollector.CollectIncremental")
    - which should let you query how much time the GC spent the last frame (should be 0 if not busy). I think that this should not require a development build, and is supported in release builds, but I'm not 100% sure about that.
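
A minimal sketch of the recorder approach described above follows. The marker name comes from the post; the component and property names are hypothetical, and, as the post itself notes, it is not certain that this works in non-development builds, so the value should be treated as a hint and not a guarantee.

```csharp
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical probe: reports whether the incremental GC did any work
// during the previous frame, based on a profiler Recorder.
public class GcBusyProbe : MonoBehaviour
{
    private Recorder gcRecorder;

    // True if the recorder is usable and measured GC time last frame.
    public bool GcWasBusyLastFrame
    {
        get
        {
            return gcRecorder != null
                && gcRecorder.isValid
                && gcRecorder.elapsedNanoseconds > 0;
        }
    }

    void OnEnable()
    {
        // Marker name as given in the post above.
        gcRecorder = Recorder.Get("GarbageCollector.CollectIncremental");
        gcRecorder.enabled = true;
    }
}
```

Other code could then poll `probe.GcWasBusyLastFrame` before starting deferrable work. If the recorder is not valid on a given platform or build type, the property simply stays false.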
     
    LeonhardP and hippocoder like this.
  27. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    24,193
    Awesome, thank you. You're probably right though, will give it a whirl anyway.