
Const vs static IL2CPP optimization

Discussion in 'Scripting' started by Chrisad, Aug 14, 2019.

  1. Chrisad

    Chrisad

    Joined:
    Mar 12, 2013
    Posts:
    55
    Hi everyone.

    In my work, I deal with a LOT of strings for localization and other purposes. In the code, the key strings are constant values that never change. Most of them are private, and I don't use other assemblies. In theory I should store them with the const keyword, but another programmer told me to prefer static readonly over const, because a const value is copied at compile time into every place where it is used.

    I was wondering if it is true in C#, so I searched on the internet.

    The official C# documentation says that the compiler substitutes each const with its literal value directly in the intermediate language code:

    https://docs.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/constants

    it substitutes the literal value directly into the intermediate language (IL) code that it produces. Because there is no variable address associated with a constant at run time, const fields cannot be passed by reference and cannot appear as an l-value in an expression.


    But I also found that the runtime keeps a table of string literals, so referencing the same literal string several times in the code doesn't duplicate its value.
    (Several people say this; here is one example.)

    https://stackoverflow.com/a/23529832.
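A minimal sketch of how that claim could be checked, assuming a standard .NET runtime. The class name InternDemo and the key "menu.title" are made up for illustration. It compares references rather than contents, so it should report True only if every use of the literal resolves to one shared String object. Behavior under IL2CPP is not verified here.

```csharp
using System;

public static class InternDemo
{
    // Hypothetical localization key, declared both ways with identical content.
    private const string ConstKey = "menu.title";
    private static readonly string StaticKey = "menu.title";

    // Returns true when both uses of the const and the static field
    // all refer to the same String object (not merely equal text).
    public static bool SameObject()
    {
        string first = ConstKey;
        string second = ConstKey;
        return ReferenceEquals(first, second) && ReferenceEquals(first, StaticKey);
    }

    public static void Main()
    {
        Console.WriteLine(SameObject());
    }
}
```

If this reports True, repeated uses of a const string share a single object at run time instead of creating copies.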

    From what I understood, static members are stored one time in a special space of the heap, and cannot be removed by the garbage collector.

    Since Unity feeds the IL code to IL2CPP, which translates it to C++, I have no information about what happens to static and const values.

    My final questions are:

    Does the Intermediate Language copy the value of a const string at each use?
    If so, does the size of a project increase when a lot of big strings are referenced many times in the IL code?
    How does IL2CPP handle const and static strings (especially on Android and iOS)?
     
  2. kaarloew

    kaarloew

    Joined:
    Nov 1, 2018
    Posts:
    360
    If you paste the following code into https://dotnetfiddle.net/
    Code (CSharp):
    using System;

    public class Program
    {
        public static readonly string staticString = "STAT";
        public const string constString = "CONS";

        public static void Main()
        {
            Console.WriteLine(staticString);
            Console.WriteLine(constString);
        }
    }
    and view the IL, the important part is:
    IL_0001: ldsfld string Program::staticString
    IL_0006: call void [mscorlib]System.Console::WriteLine(string)
    IL_000b: nop
    IL_000c: ldstr "CONS"
    IL_0011: call void [mscorlib]System.Console::WriteLine(string)


    So the compiler embeds the const value directly at the use site (ldstr), while static readonly requires a static field load (ldsfld) at run time.
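The same distinction is visible in the field metadata, without reading IL. The sketch below is illustrative only (FieldKindDemo and Describe are hypothetical names). It uses the standard reflection properties FieldInfo.IsLiteral and FieldInfo.IsInitOnly to show that a const is recorded as a compile-time literal, while a static readonly field is an ordinary static field that is assigned once.

```csharp
using System;
using System.Reflection;

public static class FieldKindDemo
{
    public const string ConstKey = "CONS";
    public static readonly string StaticKey = "STAT";

    // Classifies a public field of this class by how the compiler recorded it.
    public static string Describe(string fieldName)
    {
        FieldInfo field = typeof(FieldKindDemo).GetField(fieldName);
        if (field.IsLiteral) return "literal";                            // const: value fixed at compile time
        if (field.IsStatic && field.IsInitOnly) return "static readonly"; // real field, set by the type initializer
        return "other";
    }

    public static void Main()
    {
        Console.WriteLine(Describe(nameof(ConstKey)));
        Console.WriteLine(Describe(nameof(StaticKey)));
    }
}
```

On a standard .NET runtime this should classify ConstKey as "literal" and StaticKey as "static readonly".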
     
  3. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,938
    String literals are stored in the global-metadata.dat file. This file is mapped into read-only memory when the IL2CPP runtime is initialized. The data for each string literal is used to initialize a new String object, which is created in the same way as a String object created at run time.

    Whether or not the string literal is used in a const string or a static field, there is not too much difference in behavior.

    Note that you can debug the generated C++ code and check on all of this behavior. See this talk we did a few years ago about just that - it is still relevant:
     
    Chrisad and DonLoquacious like this.
  4. Chrisad

    Chrisad

    Joined:
    Mar 12, 2013
    Posts:
    55
    If you call Console.WriteLine several times with the const, "CONS" appears several times in the IL code. At first I wondered where the string interning happened. But looking at what the ldstr instruction does, the "CONS" we see is a token pointing into the metadata, not the actual string. The documentation at https://docs.microsoft.com/en-us/do...tion.emit.opcodes.ldstr?view=netframework-4.8 says: The Common Language Infrastructure (CLI) guarantees that the result of two ldstr instructions referring to two metadata tokens that have the same sequence of characters return precisely the same string object (a process known as "string interning").
    That was the thing that confused me.


    I watched the video, thank you for it.
    If I understood everything correctly, static strings are just methods that return a string interned in IL, so we pay the overhead of a method call each time a static is accessed. Does IL2CPP do exactly the same for const, or is a const a direct reference to the string object? (That would mean that const is faster than static.)

    Last question: does IL2CPP optimize other literals? Since they cannot be referenced, I know that some C++ compilers store some literals in the data segment, but is that still the case with IL2CPP?
     
  5. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,938
    The behavior is the same for const, because the raw data for the content of the string is stored in the metadata, but there is still a String object that is created, even for const strings.

    There is no specific optimization. For example, integer literals in C# will become integer literals in C++. I would expect the C++ compiler to do the "best" thing for integer literals (which is probably putting them in the data segment). However, IL2CPP does not do anything special in this respect.
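As a small illustration of the C# side of this, the sketch below (IntConstDemo and its field names are hypothetical) shows that a const integer expression is folded by the C# compiler, so the metadata already holds the final literal before IL2CPP or any C++ compiler sees it. It reads the stored value back with FieldInfo.GetRawConstantValue. What the C++ compiler then does with that literal is outside the scope of this example.

```csharp
using System;
using System.Reflection;

public static class IntConstDemo
{
    public const int SecondsPerMinute = 60;
    // Constant expression: the C# compiler evaluates it at compile time.
    public const int SecondsPerHour = SecondsPerMinute * 60;

    // Reads the literal value recorded in the assembly metadata for the const field.
    public static int StoredConstant()
    {
        FieldInfo field = typeof(IntConstDemo).GetField(nameof(SecondsPerHour));
        return (int)field.GetRawConstantValue();
    }

    public static void Main()
    {
        Console.WriteLine(SecondsPerHour);   // prints 3600
        Console.WriteLine(StoredConstant()); // prints 3600
    }
}
```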
     
    Chrisad likes this.
  6. Chrisad

    Chrisad

    Joined:
    Mar 12, 2013
    Posts:
    55
    So const and static both just access a simple C++ string object. In the video, the literal strings "Nightmares" and "Blacksmith" are accessed by calling il2cpp_codegen_type_info_from_index() (20:29). This method gives direct access to a literal string value, and since const and literal strings in C# are the same thing in IL code, I presume the C++ code would be the same for const. The only difference I see is that a static member in C# is a method returning a literal string in IL, so in C++ a static would be a method calling another method that points to a C++ string object.
    Both static and const retrieve a single object, with no duplication of data, since the string value comes from the metadata in C++.

    I probably confused the IL2CPP compiler with the per-platform C++ compiler on this point. I was talking about this: https://stackoverflow.com/a/349030.

    Thank you for the information, it really helps me understand how things work behind the scenes.
     
    JoshPeterson likes this.
  7. Chrisad

    Chrisad

    Joined:
    Mar 12, 2013
    Posts:
    55
    Sorry to resurrect the thread, but I recently found this note about const vs static readonly in the Unity documentation:

    Note: Because every reference to a const variable is replaced with its value, it is inadvisable to declare long strings or other large data types const. This unnecessarily bloats the size of the final binary due to all the duplicated data in the final instruction code.


    Wherever const isn’t appropriate, make a static readonly variable instead. In some projects, even Unity’s built-in trivial properties have been replaced with static readonly variables, resulting in small improvements in performance.


    I found this information in this page:
    https://docs.unity3d.com/Manual/BestPracticeUnderstandingPerformanceInUnity8.html

    According to our discussion, this part of the documentation is wrong. Since C# string literals are stored in the metadata, const and static strings don't create several instances of the same string. Using const cannot bloat the final binary any more than a static readonly string does. Different behaviour would imply that Unity breaks the string interning process, which is not the case.

    I think the documentation should be changed, @JoshPeterson.
     
  8. JoshPeterson

    JoshPeterson

    Unity Technologies

    Joined:
    Jul 21, 2014
    Posts:
    6,938
    We will investigate this - thanks!
     
    Chrisad likes this.