Question Garbage collector poor idle performance with large objects in memory

Discussion in 'Scripting' started by ualogic, Aug 23, 2023.

  1. ualogic

    ualogic

    Joined:
    Oct 15, 2018
    Posts:
    16
    Hi, I am seeing some weird behavior and don't understand whether it is normal. I have a cubic voxel grid: an object with 2 bool arrays, 1 float array, 1 Vector3 array, and some other properties to hold all the data. If I set the grid size to, say, 256 (256x256x256, about 16.8 million voxels), the GC suddenly starts to spend 50ms each frame doing basically nothing. Allocations stay below 1 kB after the initial creation of the arrays, but collection takes 50-80ms each frame. With a 64x64x64 grid I can't notice any performance hit, but at 128x128x128 it's already 20ms. This does not happen when I create the arrays directly in the MonoBehaviour script from which I create the grid object. That doesn't seem right to me. Any suggestions on what might be happening here and how to manage/test/fix it?

    P.S. I am using Ubuntu and Unity 2021.3.22

    EDIT:
    -It happens only when a GameObject array is created, not with anything else. The GameObjects themselves are never created; the array holds only null references. No other kind of array has a big impact on GC performance.

    -The bad GC performance shows up because there is a manual GC.Collect in the project, but that call is probably needed and can't be avoided.

    -The question is: why do GameObject arrays degrade GC performance when arrays of vectors, floats, etc. do not?
     
    Last edited: Aug 23, 2023
  2. PraetorBlue

    PraetorBlue

    Joined:
    Dec 13, 2012
    Posts:
    7,718
    Objects sitting at rest in memory should not have any garbage collection impact at all. The garbage collector comes into play when objects are created and then no longer actively referenced by any objects in your program.

    Can you show your code?
     
    Yoreki likes this.
  3. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    3,899
    So many questions here.
    Just one tip: don't even attempt to approach voxel worlds of this size without using the native collections, mathematics, jobs and burst packages.
     
  4. MelvMay

    MelvMay

    Unity Technologies

    Joined:
    May 24, 2013
    Posts:
    10,468
    This. :)
     
  5. ualogic

    ualogic

    Joined:
    Oct 15, 2018
    Posts:
    16
    Oh, I feel a bit stupid now. It's not the grid, but an array of GameObject references (all still empty) that I use for testing.

    The code is literally:
    Code (CSharp):
    int len = 256*256*256;
    voxelsTest = new GameObject[len];
    The question still stands, but should I edit the context?
     
  6. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    3,495
    Is that array public / serialized in the inspector? If yes you get the full impact of Unity's fake null objects ^^
     
    Yoreki, CodeSmile and PraetorBlue like this.
  7. PraetorBlue

    PraetorBlue

    Joined:
    Dec 13, 2012
    Posts:
    7,718
    Not to mention the impact of the inspector trying to render 16 million rows of an array o_O

    The idea of using a whole GameObject for every voxel in a voxel game is also going to be disastrous performance-wise.
     
    Yoreki, CodeSmile and Bunny83 like this.
  8. Neto_Kokku

    Neto_Kokku

    Joined:
    Feb 15, 2018
    Posts:
    1,750
    Actually, in programming languages with garbage collection the number of objects sitting in memory does impact performance. The garbage collector eventually needs to check all managed objects to know if they need to be collected or not. The more objects there are, the longer this takes.
     
    CodeSmile likes this.
  9. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    3,899
    Not disastrous. Not even close. More like "end of the universe" kind of catastrophic. :D

    I'm surprised this doesn't even crash outright. In a serialized scene, you cannot even have one million game objects since it would crash the editor and bloat the scene to several gigabytes on disk.

    With 16 million GameObjects you can assume they will amount to 1-2 gigabytes in memory, plus the array itself: 16 million references at 4 bytes each (int-sized pointers) adds another 64 megabytes, or 128 megabytes with 8-byte references on a 64-bit platform.
     
  10. ualogic

    ualogic

    Joined:
    Oct 15, 2018
    Posts:
    16
    Performance does not matter as long as I understand what is happening and why, which I still don't completely.
    I tested with a new project and the behavior is different, so I searched a bit and unearthed a manual GC.Collect() in the original project. Without it the spikes are gone, but that still does not explain why this only occurs with GameObject arrays. Also, that manual GC.Collect() was there for a reason, so I might not be able to just disable it.

    In the new project, the local GameObject array makes no visible difference to performance.
    With the public array, even though it's hidden in the Inspector, the Editor starts to take 0.5s per frame while the object that has the script attached is selected. The Player has no visible performance hit.

    Tested with the code:

    Code (CSharp):
    [SerializeField] int lenToCreate = 256;
    // Public, so Unity serializes it; [HideInInspector] only hides it in the Inspector.
    [HideInInspector] public GameObject[] voxelTestPublic;
    // Private and not marked [SerializeField], so Unity does not serialize it.
    GameObject[] voxelTestLocal;

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.P)) {
            int len = lenToCreate * lenToCreate * lenToCreate;
            voxelTestPublic = new GameObject[len];
            Debug.Log("Created " + voxelTestPublic.Length + " objects");
        }

        if (Input.GetKeyDown(KeyCode.L)) {
            int len = lenToCreate * lenToCreate * lenToCreate;
            voxelTestLocal = new GameObject[len];
            Debug.Log("Created " + voxelTestLocal.Length + " objects");
        }
    }
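
    One way to check whether serialization is what slows the Editor down, as a rough sketch: [HideInInspector] only hides a public field while Unity still serializes it, whereas [System.NonSerialized] takes the field out of serialization entirely. The class name VoxelArrayTest is made up for illustration, and whether this removes the 0.5s Editor cost is an assumption to verify, not a measured result.

```csharp
using UnityEngine;

// Hypothetical test component; the class name is an assumption.
public class VoxelArrayTest : MonoBehaviour
{
    [SerializeField] int lenToCreate = 256;

    // Still public and usable from other scripts, but skipped by Unity's
    // serializer, so the Editor should have no huge array to process.
    [System.NonSerialized] public GameObject[] voxelTestPublic;

    void Update()
    {
        if (Input.GetKeyDown(KeyCode.P))
        {
            int len = lenToCreate * lenToCreate * lenToCreate;
            // Allocates only the reference slots (all null), no GameObjects.
            voxelTestPublic = new GameObject[len];
            Debug.Log("Created an array with " + voxelTestPublic.Length + " slots");
        }
    }
}
```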
     
  11. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    256*256*256 = 16,777,216 elements. At 8 bytes per object reference that is 134,217,728 bytes (128 MiB), which is far larger than the 85kB threshold for the array to go on the large object heap (LoH). Objects that end up on the LoH are automatically Gen 2 objects, and could definitely impact your GC performance if they are involved in garbage collection. Try to keep each object below 85kB in total size. This mainly affects really big arrays that you allocate. An object that is made up of other objects does not count as one allocation; they are all separate objects.
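
    To make the arithmetic above easy to re-run for other grid sizes, here is a small sketch in plain C#. The class and method names are made up, and the figures cover only the element payload, not the array header or any objects the references might point to.

```csharp
using System;

// Hypothetical helper for back-of-the-envelope array sizes.
public static class ArrayFootprint
{
    // Payload bytes of an array with `count` elements of `bytesPerElement` each.
    public static long PayloadBytes(long count, int bytesPerElement)
    {
        return count * bytesPerElement;
    }

    // Whole mebibytes (1 MiB = 1,048,576 bytes), rounded down.
    public static long ToMiB(long bytes)
    {
        return bytes / (1024L * 1024L);
    }

    public static void Report(int gridSize)
    {
        long count = (long)gridSize * gridSize * gridSize;
        Console.WriteLine(count + " elements");
        Console.WriteLine("references, 64-bit: " + ToMiB(PayloadBytes(count, 8)) + " MiB");
        Console.WriteLine("references, 32-bit: " + ToMiB(PayloadBytes(count, 4)) + " MiB");
        // A Vector3 is three floats, so 12 bytes per element.
        Console.WriteLine("Vector3 values:     " + ToMiB(PayloadBytes(count, 3 * sizeof(float))) + " MiB");
    }
}
```

    Calling ArrayFootprint.Report(256) covers the case from this thread: 128 MiB for 8-byte references, 64 MiB for 4-byte references, and 192 MiB for the Vector3 values.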

    https://learn.microsoft.com/en-us/dotnet/standard/garbage-collection/large-object-heap

    If you're going to work with voxels you will have to figure out how to compress all kinds of data, starting with taking those 8 byte object references and converting them to something like a ushort, that keys into a palette of voxel properties or something like that, and then deciding on reasonable limits for everything and how you structure the memory model for the whole voxel environment.
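
    As a minimal sketch of the palette idea, assuming a cubic grid and a made-up VoxelType with illustrative fields: each voxel stores a 2-byte key, and only the small palette holds actual property data.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical per-type voxel properties; the fields are illustrative only.
public readonly struct VoxelType
{
    public readonly string Name;
    public readonly bool IsSolid;

    public VoxelType(string name, bool isSolid)
    {
        Name = name;
        IsSolid = isSolid;
    }
}

// Hypothetical grid that stores a ushort palette key per voxel.
public sealed class PalettedVoxelGrid
{
    private readonly int size;
    private readonly ushort[] keys; // 2 bytes per voxel, no object references
    private readonly List<VoxelType> palette = new List<VoxelType>();

    public PalettedVoxelGrid(int size)
    {
        this.size = size;
        keys = new ushort[size * size * size];
        palette.Add(new VoxelType("air", false)); // key 0 means empty
    }

    // Adds a voxel type and returns the key that refers to it.
    public ushort Register(VoxelType type)
    {
        if (palette.Count > ushort.MaxValue)
            throw new InvalidOperationException("Palette is full.");
        palette.Add(type);
        return (ushort)(palette.Count - 1);
    }

    private int Index(int x, int y, int z)
    {
        return x + size * (y + size * z);
    }

    public void Set(int x, int y, int z, ushort key)
    {
        if (key >= palette.Count)
            throw new ArgumentOutOfRangeException(nameof(key));
        keys[Index(x, y, z)] = key;
    }

    public VoxelType Get(int x, int y, int z)
    {
        return palette[keys[Index(x, y, z)]];
    }
}
```

    At 256x256x256 the key array is 32 MiB, a quarter of the 8-byte reference array, and it contains nothing that could be a reference.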

    Minecraft uses a system of chunks to make up the entire world, and you'll almost certainly have to do something like that unless the whole game is confined to a single chunk that you can fit into memory comfortably. Otherwise, you will need to stream those chunks in and out.
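
    The bookkeeping behind chunking is mostly coordinate math. A small sketch, where the chunk edge length of 16 and all names are assumptions:

```csharp
using System;

// Hypothetical helper that maps a world voxel coordinate to a chunk and a local offset.
public static class ChunkMath
{
    public const int ChunkSize = 16; // assumed edge length in voxels

    // Floor division, so negative coordinates land in the correct chunk.
    public static int ChunkCoord(int world)
    {
        return (int)Math.Floor(world / (double)ChunkSize);
    }

    // Offset inside the chunk, always in the range 0..ChunkSize-1.
    public static int LocalCoord(int world)
    {
        return world - ChunkCoord(world) * ChunkSize;
    }
}
```

    Plain integer division would put world coordinate -1 into chunk 0; the floor keeps it in chunk -1 at local offset 15.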
     
    Last edited: Aug 23, 2023
    Bunny83 likes this.
  12. ualogic

    ualogic

    Joined:
    Oct 15, 2018
    Posts:
    16
    The GC performance does not degrade with creation of large Vector3, float, or bool arrays.

    Also it's not a game but a 3d reconstruction of the environment that does not even need to be rendered. Gameobjects are needed for collision detection.

    Optimizations only make sense if what I do fails, but to know it fails I need to know what's happening, which at the moment I don't fully understand.
     
  13. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    You can profile allocations that contribute to GC pressure in the profiler, and see when collections occur in the memory section. I'm not sure if there is anything in Unity to profile the actual garbage collection, but knowing what and when your software allocates in detail could provide you with clues. My strategy is to minimize GC allocations to begin with, so that's how I would start to tackle it.
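
    If you want a quick signal from script in addition to the Profiler window, something like this sketch could log whenever a collection has happened. The class name is made up, and it assumes GC.CollectionCount(0) reflects collections under Unity's GC; note that the Debug.Log call allocates a little itself.

```csharp
using System;
using UnityEngine;
using UnityEngine.Profiling;

// Hypothetical helper; attach to any GameObject in the scene.
public class GcWatcher : MonoBehaviour
{
    int lastCollections;

    void Start()
    {
        lastCollections = GC.CollectionCount(0);
    }

    void Update()
    {
        int collections = GC.CollectionCount(0);
        if (collections == lastCollections)
            return; // no collection since the previous frame

        long usedBytes = Profiler.GetMonoUsedSizeLong();
        long heapBytes = Profiler.GetMonoHeapSizeLong();
        Debug.Log("GC ran " + (collections - lastCollections) + "x since last frame, managed used "
                  + (usedBytes / 1024) + " kB of " + (heapBytes / 1024) + " kB heap");
        lastCollections = collections;
    }
}
```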

    I've only used PerfView once, so I'm still a bit fuzzy. I'm not sure if you'd have to isolate an example of your issue to a non-Unity, strictly C# project, or if you can just use PerfView on your Unity app. In any case, maybe it will help you with profiling garbage collection.

    https://github.com/Maoni0/mem-doc/blob/master/doc/.NETMemoryPerformanceAnalysis.md

    https://github.com/microsoft/perfview
     
  14. ualogic

    ualogic

    Joined:
    Oct 15, 2018
    Posts:
    16
    The problem is I don't allocate anything anymore. Allocations stay at default <300B.

    Yea, will have to check out profilers and info on GC, thx.
     
  15. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    3,495
    Wait a second, you do a manual GC.Collect? When and how often do you do that? Every frame? That would explain everything.... When you do a GC.Collect, ALL threads will be suspended and the GC thread will spin up and do its GC cycle. This will always be a costly operation, which is why the GC usually decides on its own when it's necessary. Calling Collect every frame is of course a performance killer.
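
    If the manual collection really has to stay, one option is to rate-limit it instead of running it every frame. A rough sketch, where the class name and the 30 second interval are assumptions to tune for your project:

```csharp
using System;
using UnityEngine;

// Hypothetical wrapper that limits how often a manual collection may run.
public class ThrottledCollector : MonoBehaviour
{
    [SerializeField] float minSecondsBetweenCollections = 30f; // assumed budget

    float lastCollectTime = float.NegativeInfinity;

    // Call this wherever the project currently calls GC.Collect() directly.
    public void RequestCollect()
    {
        float now = Time.realtimeSinceStartup;
        if (now - lastCollectTime < minSecondsBetweenCollections)
            return; // too soon since the last manual collection, skip it

        lastCollectTime = now;
        GC.Collect();
    }
}
```

    This keeps the blocking cost of a full collection, but pays it at most once per interval rather than once per frame.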

    Where do those ~300B per frame come from? Nothing should allocate memory "by default". Yes, certain things can't be avoided, but you usually can avoid all per frame allocations.
     
    Yoreki likes this.
  16. Yoreki

    Yoreki

    Joined:
    Apr 10, 2019
    Posts:
    2,588
    Others touched on this. But do you realise voxel worlds are not made of individual objects? A Minecraft chunk is not made of tens of thousands of cube objects. In fact, not a single one. The objects in voxel worlds are chunks, the meshes of which look like they are made of individual objects or shapes (such as cubes), but they really are not. It's a trick. The chunk mesh itself obviously can have collision, but you don't need individual objects for that.

    The voxel data itself is little more than an array with one or more entries per coordinate. That's it. That's the world. Everything else is its representation. Basically, you draw triangles where the world data meets "air". Those are the only parts you can see anyway. That's why, if you stick your camera into a wall in a voxel world, you can see through the ground, unless there are air pockets like caves. Keep in mind that voxel-based worlds are a highly advanced topic.
    What you are attempting would be silly, even if you did not need it to run in realtime.
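
    To illustrate "draw triangles where the world data meets air", here is a minimal sketch in plain C# that only counts the faces a mesher would emit. The flat bool array layout and all names are assumptions, and a real mesher would build vertices and triangles instead of a count.

```csharp
// Hypothetical helper: counts voxel faces that border air or the grid edge.
public static class VoxelSurface
{
    // Offsets to the six face neighbors of a voxel.
    static readonly int[,] Neighbors =
    {
        { 1, 0, 0 }, { -1, 0, 0 },
        { 0, 1, 0 }, { 0, -1, 0 },
        { 0, 0, 1 }, { 0, 0, -1 },
    };

    // solid is a flat array of size*size*size, indexed x + size * (y + size * z).
    public static int CountExposedFaces(bool[] solid, int size)
    {
        int faces = 0;
        for (int z = 0; z < size; z++)
        for (int y = 0; y < size; y++)
        for (int x = 0; x < size; x++)
        {
            if (!solid[x + size * (y + size * z)])
                continue; // air voxels produce no geometry

            for (int n = 0; n < 6; n++)
            {
                int nx = x + Neighbors[n, 0];
                int ny = y + Neighbors[n, 1];
                int nz = z + Neighbors[n, 2];
                bool outside = nx < 0 || ny < 0 || nz < 0
                            || nx >= size || ny >= size || nz >= size;
                // A face is visible if its neighbor is air or lies outside the grid.
                if (outside || !solid[nx + size * (ny + size * nz)])
                    faces++;
            }
        }
        return faces;
    }
}
```

    A completely solid 2x2x2 grid has 8 voxels with 48 faces in total, but only the 24 on the outside are counted. The 24 interior faces are never drawn, which is why a camera inside the solid sees nothing.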

    Bunny was faster, but I too wanted to mention that manual garbage collection is usually a bad idea, especially if you do it frequently and not just occasionally. Just let the collector do its job, unless you know exactly what you are doing. Why do you think that you "can't avoid" the manual garbage collection in the first place?
     
    MartinTilo and Bunny83 like this.
  17. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    2,140
    No. The Boehm GC currently used in Mono & IL2CPP Scripting Backends uses neither LoH nor generations.
    Also, ideally don't build assumptions about GC implementation details into your code. Ergo, chunking might not be bad but won't fix anything here.


    As for why a large reference type array of null pointers would slow down the GC: it'll look over all of these pointers to see if they keep anything alive in memory:
    - what are you holding on to? Oh, null, next!
    - also null? Next!
    - ...

    The GC might also interpret pointer-sized types (e.g. int on 32-bit platforms, long on 64-bit, void* or IntPtr on either, etc.) as potential pointers and do the same.
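
    A rough way to see this effect for yourself is to time a forced collection while a large array is still alive. The sketch below is plain C# with made-up names. The absolute numbers depend heavily on the runtime (Unity's Boehm GC versus a desktop .NET GC), so treat it as an experiment and not as a benchmark. Inside Unity you would call Run from a script and swap Console.WriteLine for Debug.Log.

```csharp
using System;
using System.Diagnostics;

// Hypothetical probe: how long does one forced collection take
// while a given object is still reachable?
public static class GcScanProbe
{
    public static double TimeFullCollectMs(object keepAlive)
    {
        var watch = Stopwatch.StartNew();
        GC.Collect(); // forced, blocking collection
        watch.Stop();
        GC.KeepAlive(keepAlive); // guarantees the object was live during the collection
        return watch.Elapsed.TotalMilliseconds;
    }

    public static void Run(int length)
    {
        double baseline = TimeFullCollectMs(null);

        var values = new float[length]; // value types: nothing inside to trace
        double withValues = TimeFullCollectMs(values);
        values = null;

        var references = new object[length]; // all null, but each slot is a potential pointer
        double withReferences = TimeFullCollectMs(references);

        Console.WriteLine("baseline:        " + baseline.ToString("F2") + " ms");
        Console.WriteLine("float[] alive:   " + withValues.ToString("F2") + " ms");
        Console.WriteLine("object[] alive:  " + withReferences.ToString("F2") + " ms");
    }
}
```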

    The tip to use NativeArrays for your grid helps not just with that, but also with keeping managed memory fragmentation and GC pressure down, in addition to other potential speedups like keeping the code Burst compatible if you need that later.
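
    A minimal sketch of what that could look like for the grid data described in the first post. The class and field names are made up, and byte flags stand in for the bool arrays. Native memory is not managed by the GC, so it has to be disposed explicitly.

```csharp
using Unity.Collections;
using UnityEngine;

// Hypothetical grid container; field names are illustrative.
public class NativeVoxelGrid : MonoBehaviour
{
    [SerializeField] int size = 256;

    NativeArray<byte> occupied;   // 0/1 flags in place of a bool array
    NativeArray<float> density;
    NativeArray<Vector3> normals;

    void OnEnable()
    {
        int count = size * size * size;
        // Persistent allocations live until Dispose is called.
        occupied = new NativeArray<byte>(count, Allocator.Persistent);
        density = new NativeArray<float>(count, Allocator.Persistent);
        normals = new NativeArray<Vector3>(count, Allocator.Persistent);
    }

    void OnDisable()
    {
        // Release the native memory; the GC will never do this for us.
        if (occupied.IsCreated) occupied.Dispose();
        if (density.IsCreated) density.Dispose();
        if (normals.IsCreated) normals.Dispose();
    }

    // Flat index for a voxel coordinate.
    public int Index(int x, int y, int z)
    {
        return x + size * (y + size * z);
    }
}
```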
     
    CodeRonnie, Yoreki and Bunny83 like this.
  18. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    That's good to know and thank you for pointing it out. Since you mention it, I am sure that I have already heard that a few times before, and will try to be more considerate of the current state of Unity's specific garbage collection implementation in the future.

    The reason that I have built those types of assumptions into my own code, personally, has been because that code does not assume that it will even be run in Unity. It is class library code that can be run in any C# application, and should conform as well as possible to all environments, including the many, many environments that do have issues with the LoH, just as .NET's System classes are also programmed.

    Also, as you agree, it's not a bad rule of thumb and certainly can't hurt. I'm no expert on how bits navigate through the CPU, but I would take a wild guess that a giant array that could easily be cut down to 1/4 the size of the more naive implementation may be more friendly to CPU caching and traversal.

    There could also be other external factors that, as you point out, can't always be predicted with 100% accuracy. As a non sequitur example, I'm learning that Linux kernels have built-in TCP segmentation offload for TCP packets, which is not as readily available and supported for UDP packets. Knowing every kind of external factor that may affect the performance of your code like that isn't always possible, and we should certainly all avoid making assumptions, while also making a best effort at performance.

    Yes, that information is presented in the guide that I linked to from Maoni Stephens, .NET GC Architect, in the section labeled "Understanding GC pauses, ie, when GCs are triggered and how long a GC lasts - How long an individual GC lasts" in the table of contents. It was my hope that the guide would offer the relevant answers from an expert for those willing to read it, rather than attempt to explain everything myself.

    https://github.com/Maoni0/mem-doc/b...ceAnalysis.md#How-long-an-individual-GC-lasts

    "The .NET GC is a tracing GC, which means GC needs to go trace through various kinds of roots (e.g., stack locals, GC handle table) to find out which objects should be live. So the amount of GC work is proportional to how much memory is live, i.e., the survivors."

    or

    "...this is what I always tell folks when talking about how often GCs are triggered and how long individual GCs last is: What survives usually determines how much work GC needs to do; what doesn't survive usually determines how often a GC is triggered."

    Also, my recommendation to turn the array of object references into ushort voxel keys would probably help with heap fragmentation, CPU caching, and traversing all of that data during GC sweeps.
     
    Last edited: Aug 24, 2023
    MartinTilo likes this.
  19. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    @ualogic I believe Martin Tilo's answer may provide the specific explanation you were looking for. The difference in GC sweep times with a large, live array of GameObjects versus an equal-size array of value types like Vector3 and float is probably because the garbage collector has to go through each object reference checking if it is null, and then go inspect the object and all of its references. It has to access all kinds of random places in memory as it jumps to where that GameObject is and checks all of its members to mark them, as well as the reference in the array. If it's an array of value types the CPU has better access to that contiguous memory, but that's not really the issue, just a general benefit.

    With an array of value types there's really nothing to mark as live other than the array itself. With the array of game objects it has to sweep through every element and every game object and all of its members checking for live references to mark.
     
    Last edited: Aug 24, 2023
    MartinTilo likes this.