Question Why do strings create garbage?

Discussion in 'Scripting' started by GuirieSanchez, Jul 13, 2023.

  1. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    Hello everyone,

    I would like to discuss a question that has been on my mind regarding string handling and garbage collection in Unity. To provide some context, let me quote an excerpt from Unity's documentation:

    Now, here's my question: If strings in C# seem to hold the value of a string, why is it not possible to make them behave or act as any other value type? By allowing this, we could potentially have more control over garbage collection issues.

    I apologize if this question seems dumb or noobie, but I would appreciate any insights or explanations from the community. Thank you in advance for your help!
     
  2. Lurking-Ninja

    Lurking-Ninja

    Joined:
    Jan 20, 2015
    Posts:
    9,900
    In exchange, multiple string variables that refer to the same value don't take much memory: each one costs only the size of a reference, not the size of the string.
    Also there are FixedString types in Unity which more or less behave like value types.

    Also if you want mutable string handling you will have to use the StringBuilder class in C#.
    More info here: https://learn.microsoft.com/en-us/dotnet/api/system.string?view=netstandard-2.1#Immutability
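
    To make the reference-sharing and immutability points concrete, here is a minimal sketch. The class and method names are made up for illustration; only ReferenceEquals and the string constructor come from .NET itself.

```csharp
using System;

public static class StringSharingSketch
{
    // Returns true when both variables refer to the very same string object.
    public static bool SharesOneObject()
    {
        string original = new string('x', 1000); // one 1000-char object on the heap
        string alias = original;                 // copies only the reference, not the characters
        return ReferenceEquals(original, alias);
    }

    // "Modifying" a string really builds a new object; the old one is left untouched.
    public static bool ConcatCreatesNewObject()
    {
        string original = new string('x', 1000);
        string longer = original + "y";          // new allocation holding 1001 chars
        return !ReferenceEquals(original, longer)
            && original.Length == 1000
            && longer.Length == 1001;
    }
}
```

    Both methods should return true: assignment is cheap because only the reference is copied, while every change produces a fresh object that the GC eventually has to clean up.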

    C# was made so you don't have to care about garbage. The fact that we use it for "high performance" game development is not what Microsoft made it for. So their choice of not caring about the garbage immutable strings cause was made from that point of view.
     
  3. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    I tested the StringBuilder class a few days ago in a code snippet like the following:

    Code (CSharp):
    Profiler.BeginSample("Sample UpdatePanelText String");

    panelText.text = $"<size=65%><color=#FFFFFF>{ex.name}<br>({mod.name})</color></size><br><color=#FFE081>Sets</color>";
    panel3DText.text = $"<size=65%><color=#FFFFFF>{ex.name}<br>({mod.name})</color></size><br><color=#FFE081>Sets</color>";

    Profiler.EndSample();


    Profiler.BeginSample("Sample UpdatePanelText StringBuilder");

    stringBuilder.Clear();
    stringBuilder.Append("<size=65%><color=#FFFFFF>");
    stringBuilder.Append(ex.name);
    stringBuilder.Append("<br>(");
    stringBuilder.Append(mod.name);
    stringBuilder.Append(")</color></size><br><color=#FFE081>Sets</color>");

    panelText.text = stringBuilder.ToString();
    panel3DText.text = stringBuilder.ToString();

    Profiler.EndSample();

    The StringBuilder version took twice as much time to compute and generated almost twice as much garbage as the regular/concat string format, so I figured either I'm using it wrong or it doesn't work for the specific use case I had in mind.

    Also, I would like to ask how your suggested "FixedString types" could play a role in, for instance, the use case I showed.
     
  4. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    36,563
    I've also never seen issues with strings being a gc problem... I theorize it might be due to C# string interning?

    https://stackoverflow.com/questions...k-what-are-the-benefits-and-when-to-use-inter

    Either way, make sure it's worth making your code all ugly with StringBuilder noise... it probably isn't gonna make a big difference and I think the API is awful.

    EDIT:
    Oh no Lurk, what happened to your avatar image!?
     
  5. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    I agree
     
    Kurt-Dekker likes this.
  6. zulo3d

    zulo3d

    Joined:
    Feb 18, 2023
    Posts:
    510
    It's not possible to make strings truly behave like a value because strings need to exist in memory. Their value-like behavior in C# is just there for the programmer's convenience, whereas a true value often only needs to exist within a CPU register.

    This method requires no memory allocations:

    Code (CSharp):
    void blah()
    {
        float x=10;
        float y=20;
        x=y+30;
    }
     
  7. tleylan

    tleylan

    Joined:
    Jun 17, 2020
    Posts:
    521
    The key words are "seems to". They are reference types pointing to data allocated on the heap, not the stack. You don't have to define a string, it can be returned by a method, another class, a service, etc. If you want pointers you can use C or C++ and have maximum control. Most use cases don't require that level of control.

    As a thought experiment... you're given control over the string allocation and destruction that you don't presently have... what changes are you going to make to your code?
     
    halley, Bunny83 and Kurt-Dekker like this.
  8. Ryiah

    Ryiah

    Joined:
    Oct 11, 2012
    Posts:
    20,082
    From what I understand of String Builder (which admittedly is very little since I've never bothered using it) you're just not performing enough operations and/or working with large enough strings.

    Same, and I think part of the problem is that, while the advice that working with strings directly isn't ideal is constantly brought up, what's not brought up is how much garbage collection has improved.

    Also while they haven't landed in Unity yet there are optimizations in C# 10 for interpolated strings.

    https://learn.microsoft.com/en-us/d...als/csharp-10.0/constant_interpolated_strings
    https://learn.microsoft.com/en-us/d...als/csharp-10.0/improved-interpolated-strings
     
    Last edited: Jul 13, 2023
    ilmario and GuirieSanchez like this.
  9. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    3,899
    @GuirieSanchez You are calling ToString() on the StringBuilder twice, which is probably causing it to do double the work. Also try creating a new StringBuilder instead of clearing. And all the Append calls can be replaced by a string literal. StringBuilder shines when it runs in a loop that puts a string together piece by piece, which your example isn't doing.
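
    For illustration, a rough sketch of the loop case where StringBuilder tends to pay off, with a single ToString() at the end. PanelTextSketch and BuildList are hypothetical names, and reusing one static builder is an assumption of this sketch rather than something tested in the thread.

```csharp
using System;
using System.Text;

public static class PanelTextSketch
{
    // One shared builder; the initial capacity of 256 is an arbitrary guess.
    private static readonly StringBuilder Builder = new StringBuilder(256);

    // Appends one rich-text line per name and converts to a string once at the end.
    public static string BuildList(string[] names)
    {
        Builder.Clear();
        for (int i = 0; i < names.Length; i++)
        {
            Builder.Append("<color=#FFFFFF>").Append(names[i]).Append("</color><br>");
        }
        return Builder.ToString(); // the only new string created by this method
    }
}
```

    The result would then be stored in a local and assigned to both labels (panelText.text and panel3DText.text in the snippet above), so ToString() runs once instead of twice.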
     
  10. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    I just had a simplistic thought: being able to manipulate strings without having to worry about garbage allocations. I know it's not possible, I was just curious about why.

    Unity just provides some tips if your project heavily relies on strings:
    I find working with strings a little frustrating, knowing that almost every operation can potentially generate garbage. It often feels like the only solution is to constantly be vigilant about minimizing string manipulations.

    Consider how inconvenient it would be if value types were also allocated on the heap, requiring us to be cautious about manipulating and creating variables at all times.
     
    Last edited: Jul 13, 2023
  11. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    36,563
    I just really don't understand this mindset... I mean I appreciate that you feel this way but there is simply no concrete evidence you should even spend any brain cycles on these concerns at all, in at least 99.99% of games.

    I'll give you a really egregious example: my KurtMaster2D game contains MS-DOS and native mode C games running under Unity. I didn't want 57,000 different entrypoints so I made one and Unity communicates to the native code back and forth... entirely with strings passed in and out one of four functions:

    Code (csharp):
    [DllImport ("__Internal")]
    private static extern System.IntPtr dispatcher1_entrypoint1( int opcode1, int arg1);

    [DllImport ("__Internal")]
    private static extern System.IntPtr dispatcher1_entrypoint2( int opcode2, string arg2);

    [DllImport ("__Internal")]
    private static extern System.IntPtr dispatcher1_getkpworkbuf();

    [DllImport ("__Internal")]
    private static extern void dispatcher1_rendercolor( System.IntPtr pixels, int span, int flags);

    Yes, every frame there are about 8 to 10 transactions performed: update input bits, update mouse touch, select game, gain focus, pump one frame, play sound, emulate particular system instruction, render one frame, etc., and every one of those things sends back an itemized string that I parse in C#. The string might look like:

    ok;playsound=buttonpress1;change_button_label:1,okay;user_intent_location:A,300,180;


    Every frame I chop up hundreds of those strings and make business decisions in Unity as to what to do next, then I have it all blast over the graphics, essentially one big string, which I cram into a Texture2D on Unity and present it.

    EVERY FRAME. 30fps to 60fps... every frame makes that much string noise.

    I don't have any GC issues. Runs buttery smooth.

    Apple iTunes: https://itunes.apple.com/us/app/kurtmaster2d/id1015692678
    Google Play (including TV): https://play.google.com/store/apps/details?id=com.plbm.plbm1

    External Public TestFlight Link: https://testflight.apple.com/join/q5W6yzCD

    It even runs buttery smooth transcoded to Javascript in the WebGL builder!!!

    Itch.io: https://kurtdekker.itch.io/kurtmaster2d

    All with strings!
     
    GuirieSanchez and Ryiah like this.
  12. Adrian

    Adrian

    Joined:
    Apr 5, 2008
    Posts:
    1,051
    The important point is that the size of value types is fixed and known at compile time. If you allocate an array of structs or if the runtime allocates a stack frame, it can calculate exactly how much memory is needed ahead of time.

    Strings, however, are variable size. If you have a string variable or an array of strings, .Net cannot know ahead of time how much memory will be required. Therefore, each string needs to be allocated individually on the heap and GC-managed.

    @Lurking-Ninja mentioned FixedString, using a fixed amount of memory for a string. In which case it can be used as a value type, with the downside that you might waste memory for short strings or have a string that is too long to fit in the memory you've allocated.

    There's also string interning, where some strings are allocated in a special table and not garbage collected, complicating things further.
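
    As a small, hedged illustration of the interning point: string.IsInterned and string.Intern are real .NET methods, while the class and method names below are invented for this sketch.

```csharp
using System;

public static class InterningSketch
{
    // A string assembled at run time is not in the intern table unless someone put it there.
    public static bool RuntimeStringIsNotInterned()
    {
        // Built from chars so that no matching literal exists in the program.
        string built = new string(new[] { 'q', 'z', 'x', '7', '#', 'k' });
        return string.IsInterned(built) == null; // IsInterned returns null when not interned
    }

    // Interning two equal strings yields one shared object from the table.
    public static bool InternReturnsSharedObject()
    {
        string a = string.Intern(new string(new[] { 'a', 'b', 'c' }));
        string b = string.Intern(new string(new[] { 'a', 'b', 'c' }));
        return ReferenceEquals(a, b);
    }
}
```

    Interned strings stay alive for the lifetime of the runtime, so interning trades garbage for permanently held memory.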
     
  13. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    @Kurt-Dekker I get you. At first, I got a bit concerned when I saw around +14 KB of garbage being generated in just a single "DateTime.TryParse" operation and some +2KB of garbage generated in very frequent string concat operations that I do in almost every frame. Although to be honest, my concerns are mainly driven by curiosity and love for optimization (and also because I'm a bit of a performance freak) :). It is more about my personal enjoyment rather than thinking that any modern device would actually suffer from these concerns.

    I see, that makes sense.

    So, to sum up, no matter how big, say, an int you're operating with is, it will fit within the Int32, which .NET knows in advance and allocates a fixed amount of memory to store it on the stack (unless it's too large, for which you'd use long or any other suitable alternative).

    On the other hand, since a string length range can vary from a single word or letter to a full text, the only reasonable choice is to determine the memory required to store it after creating it and then allocate space on the heap accordingly.

    Just a quick follow-up question: when we modify an existing string, this sort of dynamic allocation of memory always goes on the heap. Would it be unfeasible to allocate it on the stack?
     
    Last edited: Jul 13, 2023
  14. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    36,563
    Nothing on the stack survives the return from the function containing it. The return statement literally frees / destroys it simply by stepping the Stack Pointer over all the data. As per ABI convention (eg, how the CPU works!), everything below / beyond the stack pointer is unallocated free-to-be-used RAM, so nothing could ever survive in it.

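    If references to a stack-based variable need to outlive the function, AFAIK the data never lives on the stack in the first place: C# either refuses to compile the code or, for locals captured by a lambda, the compiler puts them in a heap-allocated object up front rather than copying them at return time.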
     
    GuirieSanchez likes this.
  15. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    Others may have made similar points while I was typing this up, but I'm just going to add my bit to the conversation.

    Why do strings create garbage?

    Manipulating strings allocates new memory for later garbage collection because strings are immutable reference types. The string of characters (which are value types) in memory that the string object points to cannot be manipulated without allocating new memory for a new string of characters. You can think of a string as essentially a char[]. The new string of characters probably has a different length than the previous string. You wouldn't want your strings to take up more space in memory than the actual characters that make up the text. If you kept the same length and just changed which characters are in the string, theoretically you wouldn't need to allocate new memory, just change the value of the characters. However, that only applies to that one, less common case, and they wanted string objects to feel like value types when they were designed. So, one standard behavior for every case is to make them immutable reference types.

    On a quick side note, delegate objects, and hence events, are also immutable reference types. Every time you += or -= an event a brand new object is created behind the scenes just like with string manipulation. The MulticastDelegate class handles equality comparison even though they are technically different objects. It's a way of making things convenient for the user, but yes it comes with the new memory allocation caveat.
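
    Here is a minimal sketch of that delegate behavior, assuming nothing beyond System.Action. The class, the counter, and the two handlers are hypothetical names used only for the demonstration.

```csharp
using System;

public static class DelegateImmutabilitySketch
{
    public static int Calls;

    private static void First() { Calls += 1; }
    private static void Second() { Calls += 10; }

    // Returns true if += produced a new delegate object and left the old one untouched.
    public static bool CombineCreatesNewObject()
    {
        Action original = First;
        Action snapshot = original;   // same object as original at this point
        original += Second;           // Delegate.Combine builds a new multicast delegate

        Calls = 0;
        snapshot();                   // still invokes only First
        bool snapshotUnchanged = Calls == 1;

        Calls = 0;
        original();                   // invokes First, then Second
        bool combinedHasBoth = Calls == 11;

        return !ReferenceEquals(original, snapshot) && snapshotUnchanged && combinedHasBoth;
    }
}
```

    The snapshot keeps its old invocation list, which is the same "old object untouched, new object allocated" pattern as with strings.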

    I'm not sure if this was directly relevant in their design decisions, but immutable reference types may be relevant in multithreaded situations. Remember that strings are not unique to Unity. Swapping a reference is an atomic operation. There could still be race conditions where an unexpected thread has the final assignment of the value of a particular string, but you're not going to end up with a garbled string object where two threads were fighting over which particular characters need to be in it. If you had a StringBuilder that two threads were trying to manipulate (which you shouldn't ever do), then you could end up with a very mixed up string of characters, or just throwing exceptions.

    The amount of memory they need to occupy is not known at compile time, unless the strings are interned string literals, in which case they don't contribute to garbage memory allocation anyway. One thing to keep in mind here is that a lot of your value types are also stored on the heap. Only certain things like local variables that get declared within the scope of a method live on the stack. The stack is only 1MB or 4MB in size, depending on the CPU. Most of your other value types at run time are probably members of a reference type that is on the heap at run time. They are already trying to make strings resemble a value type. If you wanted to allocate a string of characters locally on the stack you can actually do that like so:
    Code (CSharp):
    using System;

    public class Foo
    {
        public void Bar()
        {
            // baz is a value type and it lives on the stack. I hope I don't want to change the length.
            int length = 12;
            Span<char> baz = stackalloc char[length];
            // But now I have to do this:
            baz[0] = 'H';
            baz[1] = 'e';
            baz[2] = 'l';
            baz[3] = 'l';
            baz[4] = 'o';
            baz[5] = ' ';
            baz[6] = 'w';
            baz[7] = 'o';
            baz[8] = 'r';
            baz[9] = 'l';
            baz[10] = 'd';
            baz[11] = '!';
            // And many APIs don't accept Span<char>. Now I have to do this to use it with the method I want.
            string message = new string(baz);
        }
    }

    I have a few thoughts about your benchmark. Firstly, the profiler, or any other way of measuring how long a bit of code takes, is going to have a maximum resolution of 100ns. So, if that one bit of code is taking less than 100ns, you're not going to get accurate results. It's better to test in BenchmarkDotNet, or if you need to compare assigning to a text field of a Unity component, as you are doing here, then you should make sure each test is as minimal and comparable as possible. Only do one thing at a time if you can, and run each test in a big loop that will guarantee the total time is well over 100ns. Then divide the total by the number of iterations to get the average time, and do that over and over and over again to get a more accurate result.
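
    The "big loop, then divide" idea could look roughly like this. It is only a sketch using System.Diagnostics.Stopwatch, with hypothetical names, and not a replacement for a proper benchmarking tool.

```csharp
using System;
using System.Diagnostics;

public static class MicroBenchmarkSketch
{
    // Runs the action many times and returns the average nanoseconds per call.
    public static double AverageNanoseconds(Action action, int iterations)
    {
        if (iterations <= 0) throw new ArgumentOutOfRangeException("iterations");

        action(); // warm-up call so one-time costs (JIT, first allocations) are not measured

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            action();
        }
        watch.Stop();

        // Total elapsed time is well above the timer resolution when iterations is large.
        double totalNanoseconds = watch.Elapsed.TotalMilliseconds * 1000000.0;
        return totalNanoseconds / iterations;
    }
}
```

    You would call it once per variant with a large iteration count (say a million), repeat the whole measurement several times, and compare the averages rather than any single run.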

    Also, I know you were trying to do a comparison to the two string interpolations in the other test, which is why you call .ToString() twice, but calling .ToString() will definitely allocate a brand new string object every time it is called. I'm not sure if there should be a big difference between the two tests at first glance, though. String interpolation and StringBuilder in that scenario aren't doing much that is different. Breaking down the specific memory allocations is another issue that I feel like I would have to look at myself to see exactly where it is coming from. That being said, be aware that StringBuilder is maintaining its own array or multiple arrays of characters behind the scenes. That's in addition to the new strings that are created when you call .ToString(). Also, depending on which version of Unity and .NET you are using, StringBuilder could also be allocating a brand new string when you append value types. You are appending string variables in your example, so this doesn't apply there; I just wanted to point out that older versions of StringBuilder allocate just from appending value types. The .NET Standard 2.1 version should not, but I haven't benchmarked the speed of the new append methods for comparison.

    As far as the StringBuilder API is concerned. I don't intend to argue in favor of the API or anything, but I do want to point out that it is designed to enable a fluent interface. So your code could be changed to this:
    Code (CSharp):
    stringBuilder.Clear().Append("<size=65%><color=#FFFFFF>").Append(ex.name).Append("<br>(").Append(mod.name).Append(")</color></size><br><color=#FFE081>Sets</color>");
    Allocating a new StringBuilder would also contribute to garbage memory allocation. It's essentially like a List<char> and you wouldn't want to be creating new instances all the time because it would somewhat defeat the purpose. The separate Append calls in the example cannot be reduced because the name values, while they are strings, are variable and not known what they will be at compile time.

    Oh, but it is possible, you just have to use alternate means. Here is an open source plugin for working with strings without allocating memory that should be compatible even with assigning text to text components without ever allocating any new memory along the way: https://github.com/Cysharp/ZString

    I know exactly how you feel. ;) However, this is where I'm actually going to agree with others that if you haven't profiled and found that garbage collection is actually having a definite and noticeable impact on performance, then you probably don't need to worry about it. Different applications use different garbage collectors, and determining if the time you save in garbage collection is actually greater than the time you spend working around allocating memory is VERY complex. The people who work on garbage collection have spent a lot of time to make sure it's good at what it does. You need to create really good test cases that run for a long time, not just little unit tests, because you are comparing an asynchronous scenario to a synchronous blocking scenario. It's apples and oranges. You have to compare the overall performance of the entire test scenario, profiling the actual garbage collection with PerfView, and see if one overall scenario is faster than the other. It's not simple at all to know if you're even saving anything, or actually making things worse.
     
  16. Kurt-Dekker

    Kurt-Dekker

    Joined:
    Mar 16, 2013
    Posts:
    36,563
    This is the key: first do no harm

    Almost all these shenanigans make code harder to read, IMNSHO

    F'r'instance, in 1979 on the TRS-80 Model 1 computer with Level II BASIC, here was string building:

    Code (csharp):
    s = "Kurt, please meet "
    s = s + "other dude.";
    That's why just doing this feels right to me, 44 years later in C#:

    Code (csharp):
    string s = "Kurt, please meet ";
    s += "other dude.";
    I'm also not above the occasional chains of strcat(3) usage in my native C programs!

    Code (csharp):
    output[0] = 0;
    strcat( output, "Kurt, please meet ");
    strcat( output, "other dude.");
     
  17. MaskedMouse

    MaskedMouse

    Joined:
    Jul 8, 2014
    Posts:
    1,057
    I like optimization as well.
    But I'd like to think of it differently. If you can remove many bolts from a car, it becomes lighter. Being lighter means less gas consumption. If the total weight loss means you have to spend 5% less gas on the car without consequences then yeah sure. If it means that the integrity of the car goes to S*** then don't. It's not worth it.

    The string optimization depends on how often you run the code, how large the strings are and how many appends you do. If you're running it every frame with large amounts of strings, then sure, the StringBuilder reduces some garbage.
    If it is only every now and then, don't bother and use string interpolation. The garbage collector will take care of it.

    If you really want to spend time optimizing then focus on the "hot paths".
    Heavy calculations that run every frame spending multiple ms.
    Or you know, look at the profiler and see what is taking most of the time in a build. (Editor profiling is not accurate)

    Here's a little experiment setup to test the garbage generation.
    I've profiled it in a development build (IL2CPP).
    I don't know what the ex / mod were so I replaced them with GameObjects.

    Used Script:
    Code (CSharp):
    using System.Collections;
    using System.Text;
    using TMPro;
    using UnityEngine;
    using UnityEngine.Profiling;

    public class GarbageTest : MonoBehaviour
    {
        [SerializeField]
        private TextMeshProUGUI InterpolatedPanel;

        [SerializeField]
        private TextMeshProUGUI StringBuilderPanel;

        public GameObject Ex;
        public GameObject Mod;
        private readonly StringBuilder stringBuilder = new();

        private IEnumerator Start()
        {
            // Yield .5 second before starting the test
            yield return new WaitForSeconds(.5f);

            // Perform the test 10 times with 2 frames in between
            for (var i = 0; i < 10; i++)
            {
                Test();

                yield return null;
                yield return null;
            }

            OtherTest();
            yield return null;

            Application.Quit();
        }

        private void OtherTest()
        {
            stringBuilder.Clear();
            Profiler.BeginSample("Sample StringBuilder append");
            stringBuilder.Append("<size=65%><color=#FFFFFF>").Append(Ex.name).Append("<br>(").Append(Mod.name).Append(")</color></size><br><color=#FFE081>Sets</color>");
            Profiler.EndSample();
        }

        private void Test()
        {
            Profiler.BeginSample("Sample Interpolation");
            var interpolateString = $"<size=65%><color=#FFFFFF>{Ex.name}<br>({Mod.name})</color></size><br><color=#FFE081>Sets</color>";
            Profiler.EndSample();

            Profiler.BeginSample("Set PanelText Interpolated string");
            InterpolatedPanel.text = interpolateString;
            Profiler.EndSample();

            Profiler.BeginSample("Sample StringBuilder");
            stringBuilder.Clear();
            stringBuilder.AppendFormat("<size=65%><color=#FFFFFF>{0}<br>({1})</color></size><br><color=#FFE081>Sets</color>", Ex.name, Mod.name);
            Profiler.EndSample();

            Profiler.BeginSample("Sample StringBuilder.ToString()");
            var stringBuilderString = stringBuilder.ToString();
            Profiler.EndSample();

            Profiler.BeginSample("Set PanelText StringBuilder");
            StringBuilderPanel.text = stringBuilderString;
            Profiler.EndSample();
        }
    }

    The first time, AppendFormat allocated 0.5 KB.
    The second time it allocated only 334 B.
    The third time and every time after that, it allocated only 82 B.

    That, together with the 210 B from the .ToString() call, results in 292 B, as opposed to string interpolation, which is 364 B.

    Using the StringBuilder does help reduce the garbage generated by string concatenation, but only after a few runs.
    But only after a few times of running.
     
    Last edited: Jul 19, 2023
  18. Sluggy

    Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    839
    I only glanced over the other comments so far, so maybe someone has already answered it, but the general answer has to do with how stacks and heaps work. If you fully understand what those two things are and how they work, then you pretty much have everything you need to know to understand why strings are the way they are. Strings are objects that essentially wrap a variable-sized amount of memory. Because of this they needed to be designed in the most general way possible, where many different, potentially unforeseen, use cases could occur, so making them heap-based objects was the only sensible way to go.

    As a person that learned to program in C using books written in the 80s (even though it was the early 2000s at the time lol) and still programs with a 'C accent' to this day, I can say that I will gladly give up that small bit of control and potential performance for the sake of not having to deal with character arrays and null-terminating bytes ;) It's true they can be a bit of a headache in very large-scale industrial-sized applications where you're dealing with hundreds-of-thousands if not millions of requests that all require tons of strings but thankfully this just doesn't come up much in video games. Usually you only *need* to update a couple of strings once a frame or so in most cases and the rest of the time internal data can actually just be indexed ints or hashed values.

    As for your test with StringBuilder, you are doing a LOT of string allocating in that test which is essentially defeating the purpose of using StringBuilder in the first place.
     
  19. Unifikation

    Unifikation

    Joined:
    Jan 4, 2023
    Posts:
    1,026
    Wonderful stuff, Ronnie!

    Thank you very much!!!
     
    CodeRonnie likes this.
  20. Neto_Kokku

    Neto_Kokku

    Joined:
    Feb 15, 2018
    Posts:
    1,750
    You only really need to worry about string garbage if you're doing thousands of string operations every frame, or operating on a large amount of strings at once.

    Also, literal strings are different: they don't become garbage since they are part of the assembly. So when you write this:

    Code (CSharp):
    public string Speak()
    {
      return "hello world!";
    }
    This will not allocate a new string on every call. Instead, it always returns a reference to the same string object. This is the main upside from C#'s immutable strings: since that string can't be modified, it can be reused and passed around by reference safely.

    This, however, will allocate a new dynamic string every time:
    Code (CSharp):
    public string SpeakTheNumber(int value)
    {
      return value.ToString();
    }
    Converting something to a string always requires the creation of a new string. Getting the name or tag of any Unity object also allocates a fresh new string every time you access those properties.
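
    A quick way to see the difference is to compare references. This is a sketch with hypothetical names; the large value 123456 is chosen on purpose, since some newer .NET runtimes cache the strings for small numbers, which would hide the allocation.

```csharp
using System;

public static class LiteralVsDynamicSketch
{
    public static string Speak() { return "hello world!"; }

    public static string SpeakTheNumber(int value) { return value.ToString(); }

    // The literal lives in the assembly, so every call hands back the same object.
    public static bool LiteralIsReused()
    {
        return ReferenceEquals(Speak(), Speak());
    }

    // Each conversion builds a fresh string, so the two results are distinct objects.
    public static bool NumberIsReallocated()
    {
        string first = SpeakTheNumber(123456);
        string second = SpeakTheNumber(123456);
        return first == second && !ReferenceEquals(first, second);
    }
}
```

    Both methods should return true: equal contents in each case, but only the literal is shared.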

    About StringBuilder and how it works: conceptually, it behaves like a resizable list of characters, and each .Append() call just adds the characters from the input strings to the list. Calling .ToString() builds a new string that contains all the characters from the list. The list will create garbage whenever it needs to be resized, but if you reuse the same StringBuilder it will recycle the largest reached capacity.
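
    The capacity-recycling part can be checked with a small sketch like the following. The names are hypothetical, and the exact capacity numbers depend on the runtime, so the check only looks at a lower bound.

```csharp
using System;
using System.Text;

public static class BuilderReuseSketch
{
    // Fills a builder, clears it, and reports whether the grown storage is still available.
    public static bool CapacitySurvivesClear()
    {
        var builder = new StringBuilder(16);
        builder.Append('x', 500);   // forces the internal storage to grow past 16 chars

        builder.Clear();            // Length drops to 0, capacity is kept for the next use

        return builder.Length == 0 && builder.Capacity >= 500;
    }
}
```

    This is why a long-lived, reused builder usually stops allocating for growth after its first few uses.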
     
  21. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    Well about that string immutability... it's kind of a social construct.
    I mean, ok, a language design ideal, but it's not like you can't mutate the content of a C# string, or even its length.

    It's good to keep in mind the ideas of why it is designed as immutable, but if you do, and have a concrete situation where these considerations don't apply, and when in a pinch and where this is necessary for performance critical improvements, I've gotten good results by breaking with that concept. Especially because StringBuilder doesn't actually help to avoid that many allocations when you do something other than concatenation of strings and chars as all number conversion and all formatting still needlessly allocates garbage.

    Code (CSharp):
    string GetNonInternedNonLiteralString(char functionParamToForceNonLiteral = 'a')
    {
        var s = $"My mutable string{functionParamToForceNonLiteral}   ";
        // string.IsInterned is a static method that returns null when the string is not interned
        Debug.Assert(string.IsInterned(s) == null, "If you modify interned strings, you're gonna have a bad time");
        unsafe
        {
            fixed (char* c = s)
            {
                // Feel free to modify those chars here
                c[2] = '\t';
                // Shorten the string by setting the length; here it is cut right after the first remaining space
                // The int value of a string's length is directly in front of its char array.
                var newLength = s.IndexOf(' ') + 1;
                *(((int*)c) - 1) = newLength;
                // Set a trailing null terminator for APIs that ignore the length
                c[newLength] = '\0';
            }
        }
        Debug.Log($"string length is {s.Length}"); // it is 11
        return s;
    }

    So if you know that no other thread is doing anything with that string, that it isn't a literal and you checked it is not interned, you can mutate it as you like. You can even set your uGUI/TextMeshPro labels to use it. You might just need to force them to realize that it is not the same string that they already built a mesh for (afaik they do some reference equals checks to avoid rebuilding the same text and, for some reason [/sarcasm], don't expect string contents to change).

    But say you have a timer or an FPS counter: just allocate a new string('0', 10) and set those digits yourself. (I know there are char array APIs in most places these days, buuut sometimes there aren't.)
    Or maybe the OS gave you a file path, you know ToLower wouldn't do anything weird with that charset (like making it longer), and you want to do it yourself char by char. Or you just want to shave off the file extension. Why waste a perfectly good, freshly minted string that is guaranteed not to be interned or a literal?

    And if you really need this, you can write an API that treats strings as mutable and does some nice number formatting, so you can use this optimization all over the place. It's pretty reasonable everywhere where you don't really care about weird Locale or Unicode shenanigans.
     
    Last edited: Jul 15, 2023
    Sluggy, GuirieSanchez and Bunny83 like this.
  22. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    While you technically can modify the internal characters of a string, it requires using unsafe code, and you cannot add length. If you shorten the length, you're really just moving the null terminating character and expecting the APIs that operate on that string to respect the null terminating character principle, which is not guaranteed, and you're still reserving that excess character space in memory. Although, holding onto extra space after shortening isn't any different from StringBuilder really, except that if you get the length value of your string it will return something that doesn't match where you've manually moved the null terminating character. It is noteworthy to point out though that you can technically do it.
     
  23. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    Correct, though if you know how much you'll need at max, you can pre-allocate that. You could also pool these in different sizes...

    No no, you missed the part where I apply a negative pointer offset of one int in front of the char* buffer and manipulate the length property directly.
    (Though the code formatting broke, making it harder to parse, which I fixed now)

    I add the null terminator as a safety hatch for the APIs that ignore the string.Length property, not the other way around ;).

    You might want to wrap the string in a struct that remembers what the actual length was so that later calls to your hypothetical unsafe stringbuilder API can reuse the extra length. If you don't, then yeah, you're forfeiting that space until the string is GCed
     
    Last edited: Jul 14, 2023
    Bunny83 and CodeRonnie like this.
  24. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    I've edited my code example a bit for more clarity.
     
  25. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    Interesting! I admit that unsafe code is not my specialty. That's a fancy trick!
     
  26. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    Come to the magic land of C# unsafe, it's got lots of candy ;)

    And really, the only cases where this is literally unsafe are, as I said, if the string is used by another thread (and maybe if something retained a reference to it and doesn't expect it to change when it continues its execution at some point), if it is interned, or if it's a literal.
    Well, and maybe if .Net ever decides to change the layout for strings/arrays but I kinda doubt that will happen. And even if, you'll notice it quickly because your strings would be all kinds of broken and then you can adjust the code to whatever the new layout is.
     
    Sluggy likes this.
  27. JoshuaMcKenzie

    JoshuaMcKenzie

    Joined:
    Jun 20, 2015
    Posts:
    897
    While it's not the primary point of the thread, I feel it's important to point out that you can avoid all this string GC by using char arrays and a TextMeshPro component.

    With Unity's Text components, it doesn't matter how you set up the string beforehand; it'll still cause GC at the end when you set the myTextComponent.text field, simply because of a string's inherent immutability. Some string manipulation methods generate less garbage than others, but all of them will generate garbage regardless, just because they are messing with strings in the first place. StringBuilder doesn't eliminate garbage; it just generates a little less if used correctly.

    But with TextMeshPro components you can instead pass a recycled char array, and the component will update the glyphs without generating any garbage. I do this for timers, which tend to change their text every frame and typically have a fixed character length anyway.
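
    To make the recycled-array idea concrete, here is a minimal sketch of a fixed-length timer buffer. The class name TimerTextBuffer and the "MM:SS.cc" layout are made up for illustration; the class itself is plain C# and doesn't touch TextMeshPro.

    ```csharp
    using System;

    // Hypothetical helper (name and layout are illustrative, not part of TextMeshPro):
    // formats a timer as "MM:SS.cc" into one reusable char[] so no string is created.
    public sealed class TimerTextBuffer
    {
        // Fixed layout: 2 minute digits, ':', 2 second digits, '.', 2 centisecond digits.
        private readonly char[] buffer = "00:00.00".ToCharArray();

        // Overwrites the digits in place and returns the same array on every call.
        public char[] Format(float seconds)
        {
            int totalCentis = (int)(Math.Max(0f, seconds) * 100f);
            int minutes = Math.Min(totalCentis / 6000, 99); // clamp to two digits
            int secs = (totalCentis / 100) % 60;
            int centis = totalCentis % 100;

            WriteTwoDigits(0, minutes);
            WriteTwoDigits(3, secs);
            WriteTwoDigits(6, centis);
            return buffer;
        }

        private void WriteTwoDigits(int offset, int value)
        {
            buffer[offset] = (char)('0' + value / 10);
            buffer[offset + 1] = (char)('0' + value % 10);
        }
    }
    ```

    In a MonoBehaviour you would then hand the returned array to the TextMeshPro component each frame, as far as I know via its SetCharArray method, instead of assigning the .text property.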
     
    GuirieSanchez and CodeRonnie like this.
  28. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    That's 100% the best time to construct your own strings, when the API accepts a raw text format, like char[], Span<char>, ReadOnlySpan<char>, or UTF-8 byte[] or Span<byte>. Then you can truly have 0 GC alloc.
     
  29. _geo__

    _geo__

    Joined:
    Feb 26, 2014
    Posts:
    1,111
    Just wanted to leave a thanks for this info. You made me feel like this (in order): :eek::oops:o_O:):D

    I had never thought about them that way. It's kinda obvious if you read the docs, but usually one does not use "+=" on them often enough to show up as significant while profiling. I had memorized them as GC free, but making 100,000 of them per frame convinced me that they do indeed create some (albeit minuscule) garbage.
     
    GuirieSanchez and CodeRonnie like this.
  30. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    @CodeRonnie Amazing post, I would like to thank you for the detailed explanation and for sharing your expertise with us. Also, thank you all for the contributions, now I have a much clearer perspective about why they chose to use strings the way they are. I 100% agree that my test wasn't optimal and didn't give the StringBuilder the credit it deserves. It just told me though that, for that very concrete operation (small and isolated operations), it is not worth it.

    Regarding the use of unsafe code, I've heard numerous strong opinions against it, particularly when working on multiplatform projects, as some platforms do not support it. So, I haven't really explored it thus far or considered it as a real option.

    I've come across it before, but I must admit that I didn't give it much attention. I had the assumption (which may be incorrect) that if it's not widely popular or commonly used, and if Unity hasn't officially incorporated it or considered it for string improvements, there might be certain drawbacks associated with it, possibly making it less appealing or aesthetically unpleasing. If anyone has used it though, I would very much appreciate it if you could share your experience.
    Would you mind elaborating a little bit on the specifics of how to do this exactly? I'm quite interested.


    If all of your event subscriptions are done either in the Awake or OnEnable methods, wouldn't the garbage generated be deallocated before the game/app starts? I have to admit that I use (and maybe even overuse) the observer pattern, so this information is quite valuable to me. Also, I'm curious if there should be any performance concerns related to unsubscribing in the OnDisable method, particularly when we're deactivating and activating a potentially large number of objects.
     
  31. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    3,495
    Though with unsafe code, as shown by Skiriki, you can actually mutate an existing string just fine. However, this trick may not work with every API due to potential "caching". If the API caches the last string that was set and compares the newly assigned string with the old one to decide whether or not it should update, it would of course fail, since you would have mutated and passed the same string object again. Since most APIs expect strings to be immutable, that's a reasonable assumption on their part. Of course there are workarounds, such as using two alternating string instances which you both mutate, so you can switch between the two each time. It's additional work, but it would eliminate all allocations.

    Vexe once made a "gstring" library which did cache strings and mutate them with unsafe code. For most use cases it's not worth going down that route. However, old mobile devices in particular were heavily impacted by the GC, so the solution there is to get your allocations to literally zero, which is quite difficult. When we talk about allocations, of course, we mean frequent allocations which produce garbage.
     
    Last edited: Jul 15, 2023
    Sluggy, Skiriki, CodeRonnie and 2 others like this.
  32. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    I would be happy to provide some example code for how to update the text value of TextMesh Pro components without GC allocation. However, I think it will have to wait until Monday or some time next week when I can give it my full attention.

    If the only time you're subscribing to observe an event is when an object is initialized, and unsubscribing when it is disabled or destroyed, which I would expect to be the typical pattern, then it probably will not have a significant impact on the performance due to garbage collection. I only pointed it out as more of an academic note, not something you should be overly concerned about. If you have hundreds or thousands of objects constantly spawning in and out, and each time they do they subscribe and unsubscribe from dozens of events, maybe that could have an impact, but it should be something that you would be able to identify in the profiler. As far as how you would lower GC pressure if that were a legitimate issue, you might have to change the way you handle those communications. Perhaps all of the observers could be changed to implement some interface, and then they could be added to a list of observers that the event provider iterates through, calling an appropriate method instead. However, then you might have some differences in the speed of those actual "events" firing, which could have different performance characteristics between platforms and between IL2CPP AOT vs Mono JIT compiled players. I'm not sure how much those performance differences matter, either, compared to the original GC issue. It would all be fairly categorized as micro-optimization unless you are able to profile and demonstrate different performance characteristics that are more or less ideal for the way your software is designed. If you don't see an issue in the profiler, you probably shouldn't worry about it. That being said, I am the type to worry about micro-optimizations for no rational reason, and I have been working on a way to replace typical .NET events that doesn't allocate, but it's only like halfway done, and I have to balance that kind of pet project versus what I am expected to be working on for my day job.
     
    GuirieSanchez likes this.
  33. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    Exactly, and with those text fields they do cache it, so you have to do something like assigning string.Empty in between to force a rebuild. It's not nice to have to use that intermediary step, but iirc there is no GC allocated, and if there is, it's for the glyph mesh and not the string, and that would then happen regardless of input.
    That's the school of optimization I went through, before Incremental GC was a thing. With incremental GC it's less pressing, though the write barriers that are necessary to get it to work can have up to a ~0.5-1ms impact on your typical mobile frame rate that is kind of hidden/spread out across your C# code. If you're not main-thread bound it's a non issue though and a reasonably low amount of GC is fine, until you hit fragmentation issues.

    Are you sure that's the case? Last time I checked, the allocation was for the ad hoc implicit conversion of functions to delegates, not for the assignment. So you can cache functions that you need to frequently subscribe and unsubscribe as delegates (e.g. Action<T>) and assign those instead at no GC cost.

    One exception is for events with loooots of subscribers, at which point subscribing does allocate again.

    As you mention later, you can create your own generic event property that handles all the delegates as a list, which can make it nicer to profile and debug as well (because it now looks and behaves like a flat iterative call structure instead of a super deep recursive looking callstack). Particularly the delegate caching can be annoying when navigating through the code base via "go to definition of".

    In the pre-Incremental GC days and for lower end phones, I did profile this and it did make sense performance-wise. I haven't had to do this in a while now though, so, yeah, always profile. Profile before, profile after, and also use Profile Analyzer to compare the impact of the change over multiple frames and function calls, particularly for that write barrier effect (turn incremental GC on for one build, then off for the next, and in Profile Analyzer exclude GC.Collect times for both profiler captures if you're curious what the perf impact is).

    As @CodeRonnie said, don't bother trying to optimize those; it's not worth the pain. Also, the assignment itself doesn't create "garbage" that needs to be cleared up immediately. Any GC.Alloc denotes that something was needed on the managed heap at that moment. It doesn't say that it was unused immediately after. And technically it's only "garbage" to be cleaned up if you stop using it, i.e. stop holding a reference to it.

    If you keep your reference to it, like it sounds like you do, it will now be part of the managed heap. That has an impact of its own, as the heap might have to grow (which takes time, even more than a single GC.Alloc allocation does), and the GC now has to check more items for cleanup each subsequent time the heap is so full that, for a new allocation to fit, it needs to collect and free unused memory, or expand the heap. With Incremental GC, that isn't an "or" question btw. Because collection is necessarily delayed till later, the heap HAS to expand immediately for the new allocation. But once everything is collected, it gives you more breathing room for some time with no further expansion needed. While I haven't profiled this yet, that should reasonably lead to a worsening of managed heap fragmentation with Incremental GC turned on though.

    If you don't notice the managed heap ballooning out of shape over a long period though (or don't see it balloon too much on tightly memory-constrained platforms like mobile or WebGL), just ignore the issue of fragmentation entirely. It's way too complex to analyze, and even harder to prevent, to be worth bothering about unless you absolutely have to.
     
    Last edited: Jul 16, 2023
    GuirieSanchez likes this.
  34. SisusCo

    SisusCo

    Joined:
    Jan 29, 2019
    Posts:
    1,104
    One could in theory also achieve allocation free string concatenations using caching:
    Code (CSharp):
    using System.Collections.Generic;
    using UnityEngine;

    namespace AllocationFreeStringOperations
    {
        public static class AppendExtension
        {
            private const int MAX_CACHED_ITEM_COUNT = 100_000;
            private const int MAX_CACHED_STRING_LENGTH = 64;

            public static string Append<T>(this string @this, T value)
            {
                // Only the first call for a given (string, value) pair concatenates and allocates.
                if(!Cached<T>.results.TryGetValue((@this, value), out string result))
                {
                    result = string.Concat(@this, value);
                    if(result.Length <= MAX_CACHED_STRING_LENGTH)
                    {
                        Cached<T>.results.Add((@this, value), result);
                        // The assert only flags runaway growth; nothing is evicted.
                        Debug.Assert(Cached<T>.results.Count <= MAX_CACHED_ITEM_COUNT);
                    }
                }

                return result;
            }

            // One dictionary per appended value type T.
            private static class Cached<T>
            {
                public static readonly Dictionary<(string, T), string> results = new(128);
            }
        }
    }
    Usage:
    Code (CSharp):
    string a = "123";
    string b = "456";
    int c = 789;
    string d = a.Append(b).Append(c);
    Debug.Log(d); // 123456789
     
    Last edited: Jul 15, 2023
    GuirieSanchez and Ryiah like this.
  35. Adrian

    Adrian

    Joined:
    Apr 5, 2008
    Posts:
    1,051
    Creating a new delegate from a method does allocate. But adding or removing delegates from a multicast delegate also creates a new delegate (delegates are reference types, but immutable ones). In Mono, this always allocates a new array to hold the new invocation list. CoreCLR seems smarter and handles it more like a list (though I'm unsure how it manages to keep invocation lists unique per delegate; maybe there's some more handling in the VM). This is done to make event subscription thread-safe and lock-free: it uses Interlocked.CompareExchange in a loop, repeatedly re-combining the delegate lists until it succeeds.

    For short invocation lists, this isn't a problem, also because no and single listeners are handled without arrays. But events with a lot of listeners reallocate their invocation list for each listener change, which can add up to a lot of garbage.
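
    A quick way to see that immutability for yourself is the little check below. It's only a sketch, with made-up names (DelegateCombineDemo, First, Second); it shows that += hands back a different delegate object and leaves the original untouched.

    ```csharp
    using System;

    public static class DelegateCombineDemo
    {
        private static void First() { }
        private static void Second() { }

        // Returns true when "+=" produced a different delegate object
        // and left the original one untouched.
        public static bool CombineCreatesNewInstance()
        {
            Action original = First;     // a delegate wrapping one method
            Action combined = original;  // same reference, nothing new yet
            combined += Second;          // Delegate.Combine returns a new instance

            return !ReferenceEquals(original, combined)
                && original.GetInvocationList().Length == 1
                && combined.GetInvocationList().Length == 2;
        }
    }
    ```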

    Aren't you basically reimplementing string interning at that point?

    The Input System does use string interning through their InternedString struct. But a comment at the top of the file floats the idea to create a custom table and then just use indexes instead.
     
    GuirieSanchez, Skiriki and CodeRonnie like this.
  36. SisusCo

    SisusCo

    Joined:
    Jan 29, 2019
    Posts:
    1,104
    Manual string interning can be useful for improving the performance of equality checks between two strings, and it can reduce overall memory usage by making multiple strings with identical values point to the same memory location. But it doesn't help with reducing garbage collector pressure, since you have to allocate a string object before you can pass it to string.Intern.
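
    As a small illustration (names like InternDemo and Build are invented for the example): both strings below have to exist on the heap before string.Intern can map them to the one pooled instance.

    ```csharp
    using System;

    public static class InternDemo
    {
        // Builds the text at runtime so it is a fresh heap object, not a compiled-in literal.
        private static string Build(int id) => "player_" + id.ToString();

        public static bool InternSharesInstances()
        {
            string first = Build(42);   // allocation #1
            string second = Build(42);  // allocation #2, same contents

            bool distinctBefore = !ReferenceEquals(first, second);

            // Both calls return the single pooled instance for this value...
            string a = string.Intern(first);
            string b = string.Intern(second);

            // ...but 'second' still had to be allocated before it could be looked up.
            return distinctBefore && ReferenceEquals(a, b);
        }
    }
    ```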
     
  37. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    That's the first I've heard of that and I doubt that platform non-support for unsafe is a thing, at least within the platforms supported by Unity. I'd double and triple check that rumor.
     
    Unifikation likes this.
  38. Stardog

    Stardog

    Joined:
    Jun 28, 2010
    Posts:
    1,886
    Legacy Text doesn't seem to create garbage if it's set to an already created string. Check this FPS counter:
    https://gist.github.com/st4rdog/80057b406bfd00f44c8ec8796a071a13
     
  39. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    Looks like CoreCLR will provide a new way to do the kind of string manipulation I mentioned, but with Span<T>/Memory<T>, though I have no idea how safe that's gonna be with its moving GC. (fixed or pinning feels safer)

    Then again, CoreCLR arrival to Unity would mean a general reassessment of the usefulness of such optimizations would be needed first (*more profiling is needed).
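
    For reference, a rough sketch of what the span-based variant could look like. I'm assuming here that MemoryMarshal.AsMemory is the kind of API meant; the helper name WriteNumber is made up. The same caveats as with the pointer version apply: only do this to strings you created yourself at runtime, never to literals or interned strings, and never while another thread reads them.

    ```csharp
    using System;
    using System.Runtime.InteropServices;

    public static class SpanStringMutation
    {
        // Overwrites a pre-allocated string in place with the digits of a
        // non-negative value, right-aligned and zero-padded to the string's length.
        public static void WriteNumber(string target, int value)
        {
            // MemoryMarshal.AsMemory drops the read-only guarantee of the string's memory.
            Span<char> chars = MemoryMarshal.AsMemory(target.AsMemory()).Span;
            for (int i = chars.Length - 1; i >= 0; i--)
            {
                chars[i] = (char)('0' + value % 10);
                value /= 10;
            }
        }
    }
    ```

    Usage would be something like allocating new string('0', 5) once and calling WriteNumber on it whenever the value changes. As far as I understand it, a Span is tracked by the GC like any other managed reference, so no fixed block is needed here.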
     
  40. Sluggy

    Sluggy

    Joined:
    Nov 27, 2012
    Posts:
    839
    Quoted for truth and emphasis.
     
    Skiriki likes this.
  41. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    Yes. I've just finished running some tests. Sometimes things change, like StringBuilder.Append() used to allocate because it would call .ToString(), but now it does not because it calls .TryFormat(). So, it's definitely important to test and know whether my information is dated or not. So I tested the following code in Unity 2021.3.26 on Windows with IL2CPP and .Net Standard 2.1. Basically, it creates a list of 1000 objects that have a method. Those 1000 objects are randomly added and removed, one at a time, one add or remove per frame. The profiler demonstrates that there is GC allocation every frame.

    Code (CSharp):
    using System;
    using System.Collections.Generic;
    using UnityEngine;

    public class TestScript : MonoBehaviour
    {
        private const int Count = 1000;
        private ExampleObserver[] Observers;
        private int Index;
        private int[] Indices;
        private int[] IndicesA;
        private int[] IndicesB;
        private bool Toggle;
        private bool Adding;
        private System.Random Random = new System.Random();

        public event Action BigEvent;

        private List<Action> Actions = new List<Action>();

        private void Start()
        {
            Observers = new ExampleObserver[Count];
            IndicesA = new int[Count];
            IndicesB = new int[Count];
            for(int i = 0; i < Count; i++)
            {
                Observers[i] = new ExampleObserver();
                IndicesA[i] = i;
                IndicesB[i] = i;
            }
            // Shuffle both index arrays so observers are added and removed in random order.
            for(int i = 0; i < Count; i++)
            {
                int random = Random.Next(0, Count);
                int swap = IndicesA[i];
                IndicesA[i] = IndicesA[random];
                IndicesA[random] = swap;

                random = Random.Next(0, Count);
                swap = IndicesB[i];
                IndicesB[i] = IndicesB[random];
                IndicesB[random] = swap;
            }
            Indices = IndicesA;
        }

        private void Update()
        {
            int index = Indices[Index];

            // Test 1: subscribe by method name (a new delegate is created each time).
            if(Adding)
                BigEvent += Observers[index].ExampleMethod;
            else
                BigEvent -= Observers[index].ExampleMethod;

            // Test 2: subscribe with the delegate cached on the observer.
            //if(Adding)
            //    BigEvent += Observers[index].ExampleAction;
            //else
            //    BigEvent -= Observers[index].ExampleAction;

            // Test 3 (control): add and remove the cached delegates on a plain list.
            //if(Adding)
            //    Actions.Add(Observers[index].ExampleAction);
            //else
            //    Actions.Remove(Observers[index].ExampleAction);

            ++Index;
            if(Index == Count)
            {
                Index = 0;
                if(Adding)
                {
                    Indices = Toggle ? IndicesA : IndicesB;
                }
                else
                {
                    Indices = Toggle ? IndicesB : IndicesA;
                    Toggle = !Toggle;
                }
                Adding = !Adding;
            }
        }
    }
    Code (CSharp):
    using System;

    public sealed class ExampleObserver
    {
        // Delegate created once and reused by the second and third tests.
        public Action ExampleAction;

        public ExampleObserver()
        {
            ExampleAction = ExampleMethod;
        }

        public void ExampleMethod()
        {

        }
    }
    The first test was run by simply assigning the method to the event by name. The second test caches the method as a delegate object on the observer object, so the local MulticastDelegate object is only created once in the constructor. The third test simply adds and removes those cached delegates to and from a List<Action> as a control example to prove that nothing else is happening and there is 0 GC alloc in that case.

    When a new delegate is combined with the delegate underlying the event, it reallocates, sometimes up to 8kB for 1000 observers. The amount that is allocated per frame climbs slowly from a very small number up to the maximum, then climbs back down as observers are removed. I highlighted the peaks. You can see the garbage collector doing its work in the Memory section, those ramps that rise, and then drop after a GC. So, when there are lots of observers it reallocates the full 8kB every frame. The amount allocated is based on how many observers there already are.

    I noticed a few things glancing back at my profiler images. You can see the rise and fall of how much memory is allocated per frame in the memory section as that maroon line that has a sort of rainbow curve to it. The frame I've highlighted is at the peak of that curve, showing that the number of existing observers is the main factor in how much memory is reallocated on add and remove. Also, for some reason you can see that the CPU performance in both tests is much worse on the other side of the rainbow, meaning it's more costly to remove observer delegates for whatever reason.
     

    Last edited: Jul 18, 2023
  42. Bunny83

    Bunny83

    Joined:
    Oct 18, 2010
    Posts:
    3,495
    Yes, as you said delegates like strings are reference types but are actually immutable in order to be thread safe. C# delegates and events have been designed for the usual UI binding in the Visual Studio form designer. Events and delegates aren't meant to be subscribed / unsubscribed often.

    Also note that an "event" is just a delegate with some additional compiler restrictions (who can call / invoke it and how you add / remove listeners). Though under the hood it's literally just a delegate and when no listener is subscribed, it's literally just null. Here's literally the "CombineImpl" method in the MS reference source which is used when two delegates are combined.

    That's actually why UnityEvents are a bit better, as they use a List internally and are mutable. They do create and cache a separate invocation list as well, but it is only updated when the event is invoked and something was added / removed. A more garbage-friendly solution would be to manually hold a List of delegates and, when you want to invoke the event, just iterate through the list and invoke the delegates one by one. MulticastDelegates do essentially the same thing behind the scenes.

    Of course the actual jitted code for the invocation is most likely much more straightforward, but in the end it is also just a list of pointers that are called one by one. Though the creation / recreation of the list when adding / removing listeners is essentially the same.
     
    SisusCo and CodeRonnie like this.
  43. Skiriki

    Skiriki

    Joined:
    Aug 30, 2013
    Posts:
    66
    Interesting. I at least remember that actually being different but maybe the difference was between subscribing / unsubscribing 1(!) Event vs multiple and that's where the caching helped? Or it was just Mono 3.5 things.

    I guess the other thing that an immutable/welded-together-invocation-list-of-delegates helps with, aside from threading, is unsubscribing from within the subscribed method, which requires extra bookkeeping when you do your own event list handling (or going in reverse order and hoping that no callback would ever unsubscribe a different callback that subscribed earlier... OK, I know now that was a stupid assumption, but I did feel smart about it back then until I had to debug it :confused:).
     
    CodeRonnie likes this.
  44. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    Totally. I can confirm that writing your own logic for what happens when observers are added and removed during an event is annoying. Immutable events don't have that problem.
     
  45. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    Also, the tests confirm what was said about not really worrying about it. These thousand objects are still gliding along at 10,000 FPS, and the garbage collector is chewing through whatever they spit out.
     
    SisusCo likes this.
  46. SisusCo

    SisusCo

    Joined:
    Jan 29, 2019
    Posts:
    1,104
    It's also worth noting that it's possible to implement custom add and remove accessors for an event, which makes it possible to avoid garbage being generated when the subscriber list is modified.
    Code (CSharp):
    private readonly List<Action> myEventListeners = new(1);

    public event Action MyEvent
    {
        add => myEventListeners.Add(value);
        remove => myEventListeners.Remove(value);
    }
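
    With custom accessors there is no delegate to invoke anymore, so raising the event has to walk the list by hand. A possible sketch follows; the class name, RaiseMyEvent and the snapshot list are illustrative additions, and it is neither thread-safe nor re-entrant.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Sketch of a list-backed event that tolerates handlers
    // unsubscribing themselves while the event is firing.
    public sealed class ListBackedEvent
    {
        private readonly List<Action> listeners = new List<Action>(4);
        private readonly List<Action> snapshot = new List<Action>(4);

        public event Action MyEvent
        {
            add => listeners.Add(value);
            remove => listeners.Remove(value);
        }

        public void RaiseMyEvent()
        {
            // Copy into a reused list so handlers may subscribe or unsubscribe
            // during the loop without invalidating the iteration.
            snapshot.Clear();
            snapshot.AddRange(listeners);
            for (int i = 0; i < snapshot.Count; i++)
            {
                snapshot[i]();
            }
            snapshot.Clear();
        }
    }
    ```

    Like a regular multicast delegate, a handler removed during a raise is still called in that round if it was in the snapshot; it only stops receiving calls from the next raise on.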
     
  47. Baste

    Baste

    Joined:
    Jan 24, 2013
    Posts:
    6,186
    There's a lot of good info in this thread, but I thought I'd shoot in an explanation on some of the things missed.

    The StringBuilder is not built to fix the slowness of an interpolated string (eg. $"foo {bar}"), it's built to fix the slowness of a manually concatenated string - doing += on the string a bunch of times. This has to do with C# history, the StringBuilder is a lot older than a $"".

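    So the slow version the StringBuilder is meant to fix is the first snippet here, where every += allocates a new string, 4 in total on top of the ones you see in the code. The second snippet reads the same, but the compiler folds a single chain of + into one string.Concat call, so it typically allocates only the final string (plus a temporary array for the arguments).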
    Code (csharp):
    var str = "<size=65%><color=#FFFFFF>";
    str += ex.name;
    str += "<br>(";
    str += mod.name;
    str += ")</color></size><br><color=#FFE081>Sets</color>";

    // or

    var str = "<size=65%><color=#FFFFFF>"
              + ex.name
              + "<br>("
              + mod.name
              + ")</color></size><br><color=#FFE081>Sets</color>";
    I'm pretty sure interpolated strings use a StringBuilder or something like it behind the scenes anyway, so your example is essentially comparing two StringBuilders that combine 3 elements with two StringBuilders that combine 5.


    These days, use a StringBuilder only when the number of things you need to combine varies.
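
    Something like this is the case I mean, as a minimal sketch (NameListFormatter and the markup are made up for the example; the shared builder is not thread-safe):

    ```csharp
    using System.Collections.Generic;
    using System.Text;

    public static class NameListFormatter
    {
        // One builder reused across calls; Clear() resets the length but keeps the buffer.
        private static readonly StringBuilder builder = new StringBuilder(256);

        // Joins a variable number of names into one rich-text string.
        public static string Format(IReadOnlyList<string> names)
        {
            builder.Clear();
            builder.Append("<color=#FFFFFF>");
            for (int i = 0; i < names.Count; i++)
            {
                if (i > 0) builder.Append("<br>");
                builder.Append(names[i]);
            }
            builder.Append("</color>");
            return builder.ToString(); // the only string created per call
        }
    }
    ```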
     
  48. SisusCo

    SisusCo

    Joined:
    Jan 29, 2019
    Posts:
    1,104
    Yep, DefaultInterpolatedStringHandler to be exact, which uses ArrayPool<char> when constructing the string to avoid unnecessary allocations (in C# 10 and later).
     
    Last edited: Jul 19, 2023
    Neto_Kokku, CodeRonnie and Bunny83 like this.
  49. GuirieSanchez

    GuirieSanchez

    Joined:
    Oct 12, 2021
    Posts:
    387
    Bump :D for anyone who may know the following:
     
  50. CodeRonnie

    CodeRonnie

    Joined:
    Oct 2, 2015
    Posts:
    280
    Sorry, I promise I am working on a comprehensive answer. It's just been a busy week. I'm trying to launch to the asset store. I have two examples (and learned that the StringBuilder example may actually still generate garbage in Unity but I need to do further testing to make sure), but I wanted to try to get four examples, showing all of the ways I know how to do it, and run performance benchmarks for a full comparison. The only working example I have uses my own library, and I wanted to also show free methods like with ZString and experiment with TryFormat. I haven't forgotten!