Search Unity

  1. Welcome to the Unity Forums! Please take the time to read our Code of Conduct to familiarize yourself with the forum rules and how to post constructively.
  2. Voting for the Unity Awards are OPEN! We’re looking to celebrate creators across games, industry, film, and many more categories. Cast your vote now for all categories
    Dismiss Notice
  3. Dismiss Notice

Understanding boxing allocations & performance

Discussion in 'Scripting' started by MaskedMouse, Jul 19, 2018.

  1. MaskedMouse

    MaskedMouse

    Joined:
    Jul 8, 2014
    Posts:
    1,057
    I am trying to understand boxing allocations in terms of Memory and Performance and how to avoid it.

    I've got an abstract class that I extend which contains a generic type. See code at the bottom.
    Now I know that boxing is bad. I've got Resharper installed with the "Heap Allocations Viewer" extension.
    It tells me I am boxing T to object in the equals null comparison.
    I want to keep the code performant, even if it would mean that I would have to do micro optimizations.
    This piece of script is going to be used somewhat through out the project. Since it is a collection of shared "things". This shared collection could be a list of Units (GameObject), Players, NPCs etc.

    2018-07-19 23_27_32-Window.png

    Project Setup - Running Unity 2018.2 with .net 4.x, Mono backend (x64 build)
    I have tested the Contains method with the profiler by running it in a for loop 20 million times.
    There are no GC allocations. Of course with that amount of iteration it takes a little bit to execute.
    But I did not see any increase in memory usage either.

    1. Is the memory usage by the method just that little or is it actually 0 bytes because the GC allocation is 0?
    2. Or is GC allocation an other type of allocation than a boxing allocation?

    if I were to create a new non abstract class without generic setup, then boxing wouldn't exist. It does not need to box it as an System.object type. But in that case it would mean I have to copy pasta the same code again, again and again. Which means if there is one change, I need to change it in every extended class. That is troublesome. Hence I want a base class that I can extend.

    3.
    If it is bad for performance / memory, is there a better way of doing this where boxing would not occur?

    Boxing & unboxing references
    https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/types/boxing-and-unboxing
    https://forum.unity.com/threads/question-about-struct-boxing-performance-related.255647/

    Code (CSharp):
    1. public abstract class RuntimeList<T> : ScriptableObject
    2. {
    3.     public List<T> List = new List<T>();
    4.  
    5.     public virtual void Add(ref T item)
    6.     {
    7.         if (!Contains(ref item))
    8.         {
    9.             List.Add(item);
    10.         }
    11.     }
    12.  
    13.     public virtual void Remove(ref T item)
    14.     {
    15.         if (Contains(ref item))
    16.         {
    17.             RemoveItem(ref item);
    18.         }
    19.     }
    20.  
    21.     protected virtual bool Contains(ref T item)
    22.     {
    23.         for (var i = List.Count - 1; i >= 0; i--)
    24.         {
    25.             var listedItem = List[i];
    26.  
    27.             if (listedItem == null || !listedItem.Equals(item))
    28.                 continue;
    29.  
    30.             return true;
    31.         }
    32.  
    33.         return false;
    34.     }
    35.  
    36.     protected virtual void RemoveItem(ref T item)
    37.     {
    38.         for (var i = List.Count - 1; i >= 0; i--)
    39.         {
    40.             var listedItem = List[i];
    41.             if (listedItem == null) || !listedItem.Equals(item))
    42.                 continue;
    43.  
    44.             List.RemoveAt(i);
    45.             return;
    46.         }
    47.     }
    48. }
     
    HidetoKotomi likes this.
  2. villevli

    villevli

    Joined:
    Jan 19, 2016
    Posts:
    86
    Boxing can happen if a value type (like an int or any struct type) is provided as the type parameter T. However in your case T will commonly be a reference type (like UnityEngine.GameObject, UnityEngine.MonoBehaviour or any class type) and in this case the generic code will not cause boxing.

    To make sure that boxing will not occur, you can constrain the RuntimeList<T> to only work with reference types by replacing your first line with:

    Code (CSharp):
    1. public abstract class RuntimeList<T> : ScriptableObject where T : class
    This will likely remove the warning.

    If you want RuntimeList<T> list to work with reference types and value types without boxing, you have to remove the null check and use
    Code (CSharp):
    1. System.Collections.Generic.EqualityComparer<T>.Default.Equals(value1, value2)
    and make sure that any value type you use as T implements the System.IEquatable<T> interface.
     
    Last edited: Jul 20, 2018
    MaskedMouse likes this.
  3. MaskedMouse

    MaskedMouse

    Joined:
    Jul 8, 2014
    Posts:
    1,057
    So boxing only occurs with value types and not with reference types in this case.
    So what if I want it to be able to be either case? It could be RuntimeList<int> but also RuntimeList<GameObject>?
    Value types cannot be null but Reference types can. Should I create 2 seperate RuntimeLists? one for struct types and one for UnityEngine.Object types?
     
  4. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,378
    So lets first clarify what exactly boxing is and what it's a problem in regards to Unity. I'm going to start with a refresher on stack vs heap... I'll put that in spoiler tags if you want to read it.

    Unity uses Mono/.Net runtime for its scripting. Mono/.Net has automatic memory management. How this works is that there are 2 main portions of memory.

    The Stack
    The Heap

    The Stack is the memory for currently running code. It's called the stack because every time a new "frame" of code (usually a function/method) is called, the first thing the method does is allocate its stack frame, and that frame is placed onto the top of the stack (like a stack of plates)... if it calls a method/function, that method is then stacked on top of that. When the method/function returns, it is popped off the stack.

    The memory allocation for that stack frame is determined by the compiler and consists of enough memory to store all the variables defined in the method/function. If that variable is a simple struct/primitive type enough memory to store that entire value is allocated. So like this function here:

    Code (csharp):
    1.  
    2. public float Add(float a, float b)
    3. {
    4.     return a + b;
    5. }
    6.  
    This function will allocate 8 bytes of memory on the stack (and maybe a little extra filler if the compiler determines it necessary for header information or something). The first 4 bytes stores 'a' and the next 4 bytes store 'b'.

    Where as say this:
    Code (csharp):
    1.  
    2. public Vector3 Add(Vector3 a, Vector3 b)
    3. {
    4.     return a + b;
    5. }
    6.  
    This will take up 24 bytes since each Vector3 takes up 12 bytes (3 floats * 4 bytes per float)

    BUT for reference types... it doesn't need to store that object as a value, instead it stores it as a reference to the object. This reference pointer is usually 4 or 8 bytes (or whatever the platform determines its pointer word size to be). So for instance this method:
    Code (csharp):
    1.  
    2. public void PrintName(GameObject obj)
    3. {
    4.     string nm = obj.name;
    5.     Debug.Log(nm);
    6. }
    7.  
    This would take up 8 bytes; 4 bytes for the obj var, and 4 bytes for the string var (note technically the compiler might optimize out the string var... but that's not part of this conversation). The GameObject might actually take up 60+ bytes of memory, or more. Same goes for the string, the string could be 30 characters long taking up 240 bytes of memory, but the var is only 4 bytes long.

    ...

    Now you might have 2 questions here.

    1) Where does the GameObject and string exist if it's not on the stack?
    They're in the heap.

    2) So structs and primitives always live on the stack? I've heard that said before?
    No, they aren't always on the stack. You may have heard this before, but it is completely false.

    So to matter 1. A reference type (objects that are of a class type) are intended to be referenced rather than treated as values. A reference type has state and if you change the member of it, other variables that reference the same object should update.

    Code (csharp):
    1.  
    2. Vector3 a = new Vector3();
    3. Vector3 b = a;
    4. a.x = 5f;
    5. Debug.Log(a.x == b.x); //false
    6.  
    A value type is a value. If you change 1 value, the other value does not change.

    Code (csharp):
    1.  
    2. GameObject a = new GameObject("SomeName");
    3. GameObject b = a;
    4. a.name = "Bob";
    5. Debug.Log(a.name == b.name); //true
    6.  
    Where as a ref type is an object. If you modify the object, all other references to that same object should change as well.

    So how this is accomplished is that when you create an object a piece of memory large enough to house that object is allocated. And any variable that references that object stores the address of that object (we're starting simple here, we'll get to dynamic refs in a moment). When you say "a = b" with ref types, you're just copying the address of the object, and not the object itself.

    As for where this object is allocated... well if we were in an unmanaged language it'd just be any swath of memory you chose. When you were done with the object you'd have to make sure to deallocate said memory in some manner (depending the language the methods of doing this vary). This management of memory can get very annoying because it can get all fragmented, and you have to remember to clean yourself up. This is one of those places things like "memory leaks" come from... when you allocate memory for us but then never deallocate it. So instead of recycling memory every time you need new memory or just keep allocating new memory creating a run away effect where the system thinks you need memory you actually don't.

    How managed languages deal with this is that it just allocates a large portion of memory from the system. This region of memory is called the 'Heap'. When you need to create a referenced object, it is placed into this heap. The variable is then given a smart address to that object in the heap... this can be implemented many ways, but the general idea is this. The Heap usually has some TOC (table of contents) that stores the addresses of the objects, and the variables stores the index in the TOC where the address sits. Then if the memory manager needs to move the object around in memory to defragment it or make space for other objects (like giant arrays or something) it just updates this TOC and all references to the object the next time they look in the TOC they have the new address. Furthermore this TOC tracks how many references there are to a specific object, every time you take a reference to it (set some variable to the object), an counter is incremented, every time you dereference, the counter is decremented.

    If the heap of memory gets too full the program has 2 options.

    1) ask the system for more memory (bad)
    2) throw out objects that are no longer referenced and reuse that memory (recycling is good!)

    This pass is called 'Garbage Collection'. What it does is search through the system for objects that don't have references anymore (that counter ='s 0), and make that chunk of the heap available for reuse.

    Problem is that while this occurs... nothing can touch the heap! The heap is in a state of flux where accessing it might point to an invalid address and your program could potentially crash.

    To make things worse. This algorithm I'm describing... well there's naive versions (like the one I described), and there are robust versions.

    The version of Mono that Unity uses... yeah... it's one of the naive versions!

    This means it's slow as molasses flowing up hill.

    So where does boxing come in, and this fact that structs/prims can exist not in the stack but in the heap.

    Well there's the first obvious one.

    A class is really just a grouping of fields/variables and functions/methods. Those fields/variables are its 'state', and is was needs memory on the heap. The functions/methods exist over in the program memory (they're static, they don't need to be stored in dynamic memory like the stack or heap).

    So this is why I say it is wrong to assume structs/value types exist only on the stack. The object is made up of variables which may be value types, and they exist on the heap!

    But then there are these special other cases that are called 'boxing'.

    So lets say you have a variable typed as 'object', or as an interface like 'IEquatable'.

    You have a function like so basically:

    Code (csharp):
    1.  
    2. public void PrintObj(object obj)
    3. {
    4.     Debug.Log(obj.ToString());
    5. }
    6.  
    OK, so when we go and allocate a stack frame, how much memory should be allocated for that variable 'obj'???

    If obj is a Vector3, it's going to take up 12 bytes. But if it's a long it's only 8. If it's a string it only needs 4 bytes for the ref pointer. They point is that it could be any size... but the stack frame needs to allocate EXACTLY how much memory on the stack it needs for the function/method.

    This same issue arises for classes. If you have the class:

    Code (csharp):
    1.  
    2. public class Foo
    3. {
    4.      public int Value;
    5.      public string Name;
    6.      public object Something;
    7. }
    8.  
    Well we need 4 bytes for the int, 4 bytes to reference the string, and well... what for the 'Something'?

    If we stick any ref type into our object variable we know exactly what amount of memory we need... the standard ref word size (usually 4 bytes). The clincher is these darn structs/value types.

    So in comes boxing.

    Boxing is the idea of throwing a struct or value type onto the heap and treat it as a ref type. Then any reference to it is the usual 4 bytes needed to reference it. Now we know how much to allocate for our object variable, and at runtime when it comes time to box the value the runtime just allocates an entry on the heap for the boxed value the size the value is.

    This also goes for if you have a variable typed as an interface, since structs can implement interfaces and be assigned to variables typed as an interface. Boxing is also used here too.

    ...

    ...

    So now how this pertains to your code.

    When you say something like:

    Code (csharp):
    1. if (listedItem == null || !listedItem.Equals(item))
    Well, the concept of something being 'null' is a reference type thing. Values can't be null. You don't have a null int (unless you explicitly define the Nullable<int> type). ints always have value.

    So by checking equal to null, you're effectively casting it to 'object' and checking if it's null.

    If listedItem was a ref type no boxing would occur, it's already allocated on the heap.

    Problem is that when the compiler finds you defined a generic class it has to go with the broadest case, so it's going to assume a cast to object in this example.

    There are ways around this though... for example if you knew that T would ALWAYS be a ref type, you could but a generic constraint on your class like so:

    Code (csharp):
    1. public abstract class RuntimeList<T> : ScriptableObject where T : class
    Basically saying 'RuntimeList' should only ever be a list of ref types. Then the idea of checking if null makes sense.

    Vice versa you could constrain to 'struct' to say it'll always be a value type:
    Code (csharp):
    1. public abstract class RuntimeList<T> : ScriptableObject where T : struct
    And completely forego the == null, because there's no chance T would ever be null since value types can't be null.

    And of course the last option is to check if T is a value type or not before even bother checking if null. Only running that code if you must.
     
  5. MaskedMouse

    MaskedMouse

    Joined:
    Jul 8, 2014
    Posts:
    1,057
  6. Owen-Reynolds

    Owen-Reynolds

    Joined:
    Feb 15, 2012
    Posts:
    1,914
    Boxing started life as a Java term. I feel like C# tends to use another word for it, just to be different, but can't recall it. If you're really interested, you might find more by looking at Java references (just about everything in C#'s Reference/Value system was borrowed from Java, so it's equally applicable.)

    But for your particular situation, there's a 99% chance your general purpose utility will need a revamp. The only things which don't are ones you've used in a previous project. Go for a possible 1% speed increase after you're sure it's in the final form -- otherwise put in on an index card at the bottom of the pile.
     
  7. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,378
  8. Owen-Reynolds

    Owen-Reynolds

    Joined:
    Feb 15, 2012
    Posts:
    1,914
    But the OP listed the explanation on the microsoft site as part of "I still don't understand after reading this." Then two people here agreed and hand-wrote explanations from scratch. Then the OP re-agreed, replying one of those was more helpful than the official microsoft explanation.

    For such an old, common concept like boxing, there must be a bunch of good, edited explanations, with examples and analysis. And there are, mostly on Java sites. It's just more of a central thing in that language.
     
  9. lordofduct

    lordofduct

    Joined:
    Oct 3, 2011
    Posts:
    8,378
    My point wasn't that the example from MSDN was good.

    My point was that microsoft/C# calls it 'boxing'. Note what part of your post I quoted.

    Yes, OP may want to check out Java articles on boxing to get a more broad understanding of the concept.
     
  10. MaskedMouse

    MaskedMouse

    Joined:
    Jul 8, 2014
    Posts:
    1,057
    Alright, I thank you all for explaining, I learned quite a lot.

    Since I want the RuntimeList<T> to be implemented with either struct or class, I first tried to implement a new class
    RuntimeClassList<T>
    and
    RuntimeStructList<T>
    that extend RuntimeList<T>
    where T : class
    /
    where T : struct

    The RuntimeStructList would remain boxed because of
    Equals(object x)
    .
    I tested the code with the profiler and for an integer type it would indeed cause boxing, GC allocations were shown in the profiler. I tested a seperate line of code using the solution villevi suggested, using the
    EqualityComparer<T>
    , profiling that did not have any GC allocations.

    So I reverted back (using git) to just the
    RuntimeList<T>
    and then implemented
    EqualityComparer<T>.Default.Equals(listedItem, item)
    to compare the items of the list.

    What I then tried next, I created a new struct type with an int ID as field variable. I then created a runtime list of that new struct type, tested the contains method that uses the
    EqualityComparer<T>
    for GC allocations. Since it did not implement the interface
    IEquatable<T>
    , it generated GC allocations as expected. I then implemented the
    IEquatable<T>
    interface for the struct and tested it again, no GC allocation and performing well. Cool!