Memory usage of 0 length array

Discussion in 'Scripting' started by Shushustorm, Dec 6, 2018.

  1. Shushustorm

    Hey everyone!

    I was wondering what the overhead of an array with a length of 0 looks like.
    I have found some answers online stating it uses about 12 bytes in C#.
    But is this dependent on the array's type?

    For example, I check array lengths a lot, and if an array's length is 0, nothing happens. I do this extensively, and at 12 bytes per array this shouldn't be a problem. But if an empty array also inherits the memory footprint of its element type, it most certainly will be.

    For example:

    Code (CSharp):
    class Stuff : Things {
        int i1 = 0;
        int i2 = 0;
        int i3 = 0;
        int i4 = 0;
    }
    class Things {
        int i1 = 0;
        int i2 = 0;
        int i3 = 0;
        int i4 = 0;
    }
    Stuff[] stuff1;
    Stuff[] stuff2;
    [...]
    What do you think the memory footprint of stuff1 would look like?
    12 bytes?
    12 * 4 bytes?
    12 * 5 bytes?
    12 * 8 bytes?
    12 * 9 bytes?
    Or something completely different?
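
    One rough way to check this empirically is sketched below. It assumes plain C# outside of Unity, uses hypothetical field names, and relies on GC.GetTotalMemory, so the figure it returns will vary with runtime and bitness. An empty array of a class type stores no elements, so no Stuff instance is ever created here.

    ```csharp
    using System;

    class Things { public int i1, i2, i3, i4; }
    class Stuff : Things { public int j1, j2, j3, j4; }   // 8 ints per instance in total

    static class EmptyArrayCost
    {
        // Rough average managed-heap cost, in bytes, of one empty Stuff[].
        public static double BytesPerEmptyArray(int count)
        {
            var holder = new Stuff[count][];           // allocated before measuring
            long before = GC.GetTotalMemory(true);     // collect first for a stable baseline
            for (int i = 0; i < count; i++)
                holder[i] = new Stuff[0];              // zero elements, no Stuff objects
            long after = GC.GetTotalMemory(true);
            GC.KeepAlive(holder);                      // keep the arrays alive until measured
            return (after - before) / (double)count;
        }
    }
    ```

    Calling EmptyArrayCost.BytesPerEmptyArray(100000) gives an average per array; inside Unity, the Profiler is probably the more reliable tool.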

    Best wishes,
    Shu
     
  2. Antypodish

    I expect that if you never define the size of an array, the compiler will ignore it.
    But I assume you define it somewhere later at runtime?

    The Stuff class has 8 ints, so I would think 8 x 4 bytes x number of array elements. However, if there really is additional overhead per element, like 12 bytes, it may be (8 x 4 + 8 x 12) bytes x number of elements. But that is just my pure guess.

    May I ask why you are concerned about the footprint of empty arrays?
    Because if that becomes an issue, what happens when you start putting data into the arrays? :)
     
    Last edited: Dec 6, 2018
  3. LurkingNinjaDev

    It depends on whether you build for 32-bit or 64-bit platforms (references)... MS intentionally hides the managed sizes, so you should not care about those. Integers are always 4 bytes.
     
  4. Shushustorm

    Thanks for your replies, Antypodish and LurkingNinjaDev!

    @Antypodish

    I do make those variables public, if that makes a difference. (Which might cause the compiler to not ignore the array?)
    Also, I only define some of the arrays at runtime (partially on Awake) and some of them in the Inspector and ideally, the ones that I don't set to a length >0 should take up just a very small amount of memory.

    Basically, I am writing scripts that are very flexible and use a lot of empty arrays that could hold data as soon as an instance of the class needs that particular data. Most of the arrays will remain empty for most of the instances over the full lifecycle of the GameObject. I am doing it this way because it is easier to maintain one script that checks each array's length than to write a lot of different scripts that contain a lot of similar code. If I needed changes, I might have to change all of them, and that would be quite a headache. Also, there are a lot of instances of those scripts, which adds up quickly if memory usage is higher than 12 bytes per array. This would especially be a problem since I don't have 4 ints like in the example here, but rather 20 different variables, some of which also inherit from other stuff.

    @LurkingNinjaDev

    I am both targeting 32 bit as well as 64 bit systems. On the 64 bit systems, this probably won't be a RAM issue, though.

    EDIT:
    Just to give this some more context:
    If memory footprint is inherited, this would be about:
    25 * 8 bytes (maybe on average; some variables are strings, some Vector2, some Vector3) + 12 bytes of overhead for the array itself
    That would be 212 bytes per array, with maybe 100 arrays per script, which would already be about 20kB per instance.
    Maybe 10 such scripts on 5000 objects per scene, if I don't load assets dynamically.
    10*5000*20kB = 1.000.000kB
    That's almost a gigabyte? That's nowhere near possible. I have about 10MB for that. Maybe I should load assets dynamically after all. But that's another headache. Or split scenes up smaller, which will lead to loading screens.
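
    Spelled out as a small sketch, the worst-case estimate above looks like this. The class and constant names are made up for illustration; the numbers are the ones from this post, without rounding 21.2kB down to 20kB.

    ```csharp
    static class WorstCaseEstimate
    {
        public const long BytesPerArray = 25 * 8 + 12;   // 212 bytes, if the element footprint were inherited
        public const long ArraysPerScript = 100;
        public const long ScriptsPerObject = 10;
        public const long ObjectsPerScene = 5000;

        // 212 * 100 * 10 * 5000 bytes, i.e. a bit over one gigabyte.
        public const long TotalBytes =
            BytesPerArray * ArraysPerScript * ScriptsPerObject * ObjectsPerScene;
    }
    ```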
     
    Last edited: Dec 6, 2018
  5. Suddoha

    Jon Skeet wrote an article about it a long time ago; no guarantee that it's still up to date.

    Taking the information from that post, not only does the 32-bit or 64-bit architecture that you're building for contribute to the size, but the element type might as well (for instance, compare an empty object[] with an empty int[]).

    However, as mentioned, you rarely want to care about that implementation detail of the CLR, unless you're doing extreme optimizations, unsafe code, marshalling, research, coding competitions or the like.

    The thing is, whenever you need an empty array, you can actually use the same instance everywhere, i.e. new T[0] would technically only be necessary once per type T. That's because an empty array has no elements or state that you could benefit from and nothing to compare except its reference, which in turn does not make a lot of sense as there are better alternatives.

    In other words, whenever you require an empty array, create one and share it.

    (Unity's deserialization system could actually make use of that as well, but it doesn't and deserializes empty arrays by creating new T[0]s all the time - at least in the editor).
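
    A minimal sketch of the "create one and share it" idea, assuming a hypothetical helper named EmptyArray<T> (not part of Unity or .NET):

    ```csharp
    using System;

    static class EmptyArray<T>
    {
        // One zero-length array per element type T, created the first time it is used.
        public static readonly T[] Instance = new T[0];
    }
    ```

    Every field that should be "empty" can then point at EmptyArray<T>.Instance instead of allocating its own new T[0]. On .NET 4.6 and later, Array.Empty<T>() offers the same kind of shared instance out of the box.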
     
    Last edited: Dec 6, 2018
  6. Shushustorm

    @Suddoha

    Thanks for your reply! Very interesting! I thought about using the same instance for all empty arrays that use the same type as well. I'm unsure, though, if that works the way I intend to. For example, I still need to be able to set the value in the Inspector. Yet, it might work! Most certainly better than alternatives like splitting up scripts, scenes or loading assets dynamically.
     
  7. Suddoha

    I updated my post with some additional info that applies to Unity.

    From a technical perspective, you can always replace the empty ones (that were created by Unity during deserialization) with an empty T[] of your own, so that the generated ones could then be GC'ed.

    But that's just additional effort that might not be worth it, unless you have potentially thousands, tens of thousands or even more empty arrays at once.
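
    As a rough sketch of that replacement step: the helper below is hypothetical and written in plain C#. The assumption is that you would call it for each array field from somewhere like Awake or ISerializationCallbackReceiver.OnAfterDeserialize, and that nothing else relies on each field having its own array instance.

    ```csharp
    static class EmptyArrayUtil
    {
        // Holds the single shared zero-length array for each element type.
        static class Shared<T> { public static readonly T[] Value = new T[0]; }

        // Points a null or zero-length array field at the shared instance.
        // Returns true if the field was changed, false if it was left alone.
        public static bool Normalize<T>(ref T[] field)
        {
            if (field != null && field.Length > 0) return false;            // holds data, keep it
            if (object.ReferenceEquals(field, Shared<T>.Value)) return false; // already shared
            field = Shared<T>.Value;   // the old empty array becomes garbage
            return true;
        }
    }
    ```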
     
  8. Antypodish

    As @Suddoha said, I would really look into a way of sharing or reusing unused arrays.
    It should work something like pooling bullets.
    You shoot thousands of bullets per minute, but really you have at most 100 in the scene at any time.
    Once a bullet hits, put it back onto the pool stack, where it waits for reuse.
    This way you save a lot of memory.

    Or you may need to consider reworking your architecture.
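
    The bullet pooling idea could be sketched roughly like this. Pool<T> and Bullet are hypothetical names and this is plain C#; in Unity you would typically also deactivate the GameObject on release and reactivate it on reuse.

    ```csharp
    using System.Collections.Generic;

    class Bullet { public float X, Y; }

    class Pool<T> where T : class, new()
    {
        private readonly Stack<T> free = new Stack<T>();

        public int FreeCount { get { return free.Count; } }

        // Reuses a released item if one is waiting, otherwise creates a new one.
        public T Get()
        {
            return free.Count > 0 ? free.Pop() : new T();
        }

        // Puts an item back onto the stack so a later Get() can hand it out again.
        public void Release(T item)
        {
            free.Push(item);
        }
    }
    ```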
     
  9. LurkingNinjaDev

    That's a lot. I don't know if your system is powerful enough to pull this off if you don't have enough memory to store a bunch of empty arrays.
    I guess you're targeting mobile? I think you can benefit from breaking up scenes and loading them additively. The question is, do you have the bandwidth and the processor power to do that?

    BTW, you can try a hack and keep your "empty arrays" null. In this case you're only paying the 4 or 8 bytes per array (the reference). But then again, you can't edit them in the Inspector properly, or you need to touch them on deserialization, just as you would if you used a single shared empty array as the reference.
     
  10. Shushustorm

    Thanks for your replies! Quite interesting!

    @Suddoha

    If that would mean allocating about 1GB and garbage collecting it at the start of each scene, wouldn't that increase load time drastically? Or do you mean the ones that I do specify? I wouldn't worry about those. Those would be about 1 or 2 per script:
    1.5*10*5000*212bytes=15.900.000bytes
    About 15MB, that's more likely to work and I can still optimize that amount if needed later on.

    @Antypodish

    Yes, exactly, pooling is something that I need to use frequently as well. To manage memory like that seems like it could work!

    @LurkingNinjaDev

    Well, most of the GameObjects are deactivated. I am thinking of about 100 of them running simultaneously.
    Yes, I am targeting mobile. Unfortunately, though, loading scenes is generally quite slow if data is not persistent between them and the objects that use those scripts are scene specifics.
     
  11. Antypodish

    If you are really targeting mobile, I suggest dropping the whole flexibility matter.
    Often you can make things faster instead of trying to automate them.
    Do I have the right impression that you are over-engineering?
    It is easy to fall into such a trap.

    Maybe scrap this part and start over?
    Sometimes it is good to drop something and redo it, because you may realize there is a much better way.
     
  12. Shushustorm

    @Antypodish

    Well, I generally do try to not sacrifice performance for flexibility, but for some stuff, it would mean managing hundreds of scripts instead of maybe 5. So it would make a huge difference for the workflow. I doubt I could actually manage all of those scripts then. I'm not sure if I'm over engineering, but I don't see an efficient way to work with hundreds of scripts when, for example, I need to change certain aspects and that change affects dozens of other scripts as well, because of major similarities. Keeping an overview seems difficult.
    Maybe it would make sense to have a class for those scripts they can refer to in order to keep the most important methods in one place, but that may also overcomplicate things.
    Maybe I will try profiling when assigning the same instance of a variable to all of the empty arrays. If that works, it may be the reasonable way to do that.
     
  13. Antypodish

    You should not need so many different scripts for things like that.
    By the sound of it, the structure is pretty much the same across all objects.
    It is all just about data.

    I don't know if that helps, but you could look into JsonUtility.
    @Suddoha would probably suggest Scriptable Objects, which may also be a solution.
    But we don't know the exact problem.
    Just thoughts.
     
  14. Suddoha

    If you run into the situation of allocating 1GB of "useless arrays", I'd definitely look into alternatives. Perhaps there's a missing concept of abstraction or a certain design pattern, that would help to reduce the number of array fields in your components.

    But ye, technically you could replace it with that shared instance or with 'null', as @LurkingNinjaDev suggested. The benefit of sharing the instance over setting it to 'null' can be that you have less null-checks before iterating or passing it to other functions (as 'null' is often considered an invalid argument). However, that pretty much depends on your coding practices, error handling, overall usage and encapsulation.
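
    To illustrate the difference in null-checks, here is a small sketch with hypothetical names. The first method has to guard against null itself, while the second can iterate directly as long as callers only ever pass a real array, for example the shared empty one.

    ```csharp
    static class ArraySum
    {
        static readonly float[] NoValues = new float[0];

        // Variant for fields that may be null: the guard is required.
        public static float SumNullable(float[] values)
        {
            if (values == null) return 0f;
            float sum = 0f;
            foreach (float v in values) sum += v;
            return sum;
        }

        // Variant for fields that are never null: an empty array just skips the loop.
        public static float Sum(float[] values)
        {
            float sum = 0f;
            foreach (float v in values) sum += v;
            return sum;
        }

        // Maps null to the shared empty array, e.g. once after deserialization.
        public static float[] OrEmpty(float[] values)
        {
            return values ?? NoValues;
        }
    }
    ```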

    Can you add an example that demonstrates a use case for all these arrays, which are likely to be empty?
     
  15. Shushustorm

    Thanks for your replies!

    @Antypodish

    JSON creates files that can easily be altered by the users, though, right?

    @Suddoha

    An example would be extending the ParticleSystem's capabilities and animate gravity over time, giving the option for different input patterns in order to animate. This would only be a fraction of what one of those scripts would be able to do. Maybe I do need to split them up, though.

    I did do some testing and I'd say the memory footprint of 0 length arrays may even be within measurement inaccuracy. Considering that memory usage always varies by a few MB, the results for the different approaches were very similar. They were even similar to using GameObjects without any of the 20.000 scripts that were placed on the objects in the scene.

    However, there is a completely different bottleneck: startup time. It takes about 2 seconds to start a scene if no scripts or empty classes are attached to the objects. But with 20.000 scripts, it will take about 70 seconds to start a scene. The initializing process is a huge problem. 70 seconds on one of the machines I'm working on is way too high. I would aim for maybe 3 or 4 seconds, on the low end device. So there shouldn't be more than 1 second overhead on my development device.
    I guess I will need to cut down the scenes' sizes drastically and increase scene count.

    Also, here is what I measured (I also tested prefabs, but they even seem to perform way worse, since there is not a lot of impact from each object other than the script):

    Empty scene:
    startup time: 2 seconds
    RAM usage: 327.0MB

    1000 GameObjects:
    startup time: 2 seconds, using prefabs: 2 seconds
    RAM usage: 353.8MB, using prefabs: 357.0MB

    1000 GameObjects, 20 empty classes per GameObject
    startup time: 2 seconds, using prefabs: 2 seconds
    RAM usage: 366.8MB, using prefabs: 367.8MB

    1000 GameObjects, 20 Array0 per GameObject
    startup time: 70 seconds, using prefabs: 116 seconds
    RAM usage: 356.2MB, using prefabs: 369.7MB

    1000 GameObjects, 20 Array0_New per GameObject
    startup time: 67 seconds, using prefabs: 117 seconds
    RAM usage: 367.2MB, using prefabs: 370.3MB

    1000 GameObjects, 20 Array0_Null per GameObject
    startup time: 68 seconds, using prefabs: 117 seconds
    RAM usage: 365.5MB, using prefabs: 371.4MB

    1000 GameObjects, 20 Array0_Reference per GameObject
    startup time: 68 seconds, using prefabs: 117 seconds
    RAM usage: 363.3MB, using prefabs: 368.6MB

    Also, I attached the scripts I used if someone wants to test this as well.
     


  16. Suddoha

    It probably is negligible, as you'd have to create many more empty arrays. I thought you had really hit the 1GB.

    Even if an empty one took 100 bytes, you'd still need 10000 empty ones to get to 1MB (which is fairly acceptable if other optimizations would take too much effort). In other words, the 20000 in your scene would then "only" occupy 2MB (let's not talk about fragmentation and all of that stuff).
    And you'd need 10 million to reach the 1GB that you mentioned previously. If that was the case, I'd say it sounds like an attempt to make optional configuration available for some sort of detailed simulation.

    No idea how large your scenes have to be, however, if you can reduce scene size, try that first.
    Additionally, you could also build up the content over time, or think of some smart spatial subdivisions. More complex would be some streaming solutions. A way different but efficient approach could be a shader - but that depends what you're trying to visualize, whether or not that can be done in a shader.

    You can also try to add one more abstraction layer: Before declaring lots of arrays directly in a component, have a type that wraps around the optional content. Talk to that via the abstraction and let the implementation decide what's going to happen. That is, if there's a variety of data to consider / to operate on / to apply to objects in the scene, attach a component which handles it. If there's nothing, well, you'll just have nothing. And if there's only one thing you want to have, take an implementation that only cares about that one thing.
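
    A very rough sketch of what such an abstraction layer might look like, in plain C# with made-up names (IParticleModifier, GravityOverTime, ParticleExtension). The point is only that an object carries the modifiers it actually uses instead of dozens of mostly empty array fields.

    ```csharp
    using System;
    using System.Collections.Generic;

    // Hypothetical abstraction over one kind of optional data.
    interface IParticleModifier
    {
        void Apply(float time);
    }

    // One implementation that only cares about gravity keyframes (one key per second here).
    class GravityOverTime : IParticleModifier
    {
        private readonly float[] keys;
        public float Current;

        public GravityOverTime(float[] keys)
        {
            if (keys == null || keys.Length == 0)
                throw new ArgumentException("at least one key is required", "keys");
            this.keys = keys;
        }

        public void Apply(float time)
        {
            // Clamp the truncated time to a valid key index.
            int index = Math.Max(0, Math.Min((int)time, keys.Length - 1));
            Current = keys[index];
        }
    }

    // The host holds only the modifiers that were attached; with none, Tick does nothing.
    class ParticleExtension
    {
        private readonly List<IParticleModifier> modifiers = new List<IParticleModifier>();

        public int Count { get { return modifiers.Count; } }

        public void Add(IParticleModifier modifier) { modifiers.Add(modifier); }

        public void Tick(float time)
        {
            foreach (IParticleModifier m in modifiers) m.Apply(time);
        }
    }
    ```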
     
  17. Antypodish

    Yep. It is just a text file. So if it gets modified, you need to ensure that it still follows the format. Otherwise, it may not load correctly.
    It is not hard, but something to keep in mind.
     
  18. Shushustorm

    Thanks for your replies!

    @Suddoha

    The weird thing is, when originally testing, I did see memory usage rise to over 1GB, even continually increasing. But I was not able to reproduce this. Maybe that was just a bug.

    Also, after some further investigation, the only problem with a huge number of empty arrays* is that the Editor enters play mode very slowly. Scene startup is only a problem within the Editor. When doing a build, the scene almost starts instantly. So in builds, a huge amount of empty arrays is no problem. Also, I noticed that building takes quite some time.
    I'm unsure whether or not this is intended behaviour, but I assume, when building, there is some optimization going on, which is crucial in fast loading time, but takes a long time to process when building.

    *in my tests, 1000 GameObjects, each 20 Components, each 40 Arrays, of which each was of a type which included 10 ints, 20 Color32 and 20 Vector3, so this would be 800.000 empty arrays. If this means 12 bytes per array, about 9.15MB would be allocated, which seems like it is the case when looking at the results of my tests. Because otherwise, it would include 10 ints, 20 Color32 and 20 Vector3 with all of those 800.000 * 12 bytes, which I would estimate to about 800000 * (12 + (10*4 + 20*4*4 + 20*4*3)) bytes, about 467MB, which would be way too much.
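
    The two estimates in the footnote, written out as a sketch. Names are made up, and the per-type sizes are taken exactly as written above, including 4*4 bytes for Color32. As far as I know, Unity's Color32 is four single bytes, so the larger figure is, if anything, an overestimate.

    ```csharp
    static class TestSceneEstimate
    {
        // 1000 GameObjects * 20 components * 40 arrays
        public const long Arrays = 1000L * 20 * 40;              // 800,000

        // Case 1: only the 12 bytes of array overhead are paid.
        public const long HeaderOnlyBytes = Arrays * 12;         // 9,600,000

        // Case 2: every empty array also "inherits" the element footprint,
        // using the sizes from the post: 10 ints, 20 Color32, 20 Vector3.
        public const long PerElementBytes = 10 * 4 + 20 * 4 * 4 + 20 * 4 * 3;   // 600
        public const long InheritedBytes = Arrays * (12 + PerElementBytes);     // 489,600,000

        // Converts bytes to units of 1024 * 1024 bytes.
        public static double ToMiB(long bytes)
        {
            return bytes / (1024.0 * 1024.0);
        }
    }
    ```

    ToMiB gives roughly 9.15 for the first case and roughly 467 for the second, which matches the figures above.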

    What exactly do you mean by "optional configuration"? Is there a way to integrate optional public variables? I know there is method overloading, but that doesn't seem to help in this context, I guess.

    For some things like specific vertex animations I do use shaders, but for most things it's a much more straightforward solution to use C#.
    Unfortunately, adding a more abstract layer will not be useful in this case, since each of the arrays I am using is already the most abstract of what is happening with the data contained, unless abstracting it to a level that holds all the arrays, but then, I would need to include all of them when trying to access one, which in turn doesn't provide any better performance.

    @Antypodish

    Unfortunately, if that's the case, I don't want the game to contain easily changeable files, though. It's not meant to be an explicitly moddable game.
     
  19. Antypodish

    You can hard-code the JSON string into .cs files, if you like.
    But in that case, a Scriptable Object may be the better solution.
    @Suddoha would be a better person to comment on the second option.
     
  20. Suddoha

    The information about the final build is already good news.
    However, I can imagine entering the playmode is annoying as hell. Personally, build time would be my least concern.

    It's not that much for many of today's devices, and if this was a normal overhead I wouldn't even bother at all, especially not during development.

    Note however, that the minimum size still depends on the bitness of your target platform and the type of which you've got the arrays. So if these 40 arrays are of type T[], where T is a reference type that wraps around your ints, colors and vectors, you'd have (according to the article linked above) ~ 800k * 16 (32 bit) or 800k * 32 (64 bit).

    As stated above, it reads as if you had a wrapping reference type for the ints, colors and vectors, so the sizes of the value types should only be significant once you fill the arrays.

    I was just trying to imagine what your project looks like / is about on a more detailed level, since you have 800 arrays (20 components * 40 arrays) of a certain type (that contains a lot of additional info on top of that) on each GO. I'm just wondering how that's going to be configured...

    So, if this was about to be configured in a more automatic / generic manner, or at least through a more convenient user interface, I'd definitely only add what's supposed to be there "on demand", similar to what @Antypodish suggested (doesn't have to be JSON, you could as well save it as binary).


    It's a bit difficult to imagine. I don't quite understand why one optional data configuration has to be coupled to all the others.

    Save it in a different format, so that it's no longer obvious what one would be editing. In order to deserialize binary formats, you already need to know more about the data types that were used to serialize them, which means one has to dig into your code base.
    Starting from there you could try to implement additional hurdles, but honestly, that's usually not worth it at all. Let them do what they wanna do with their game.
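
    For reference, a minimal sketch of what "save it as binary" could mean for one float array, using BinaryWriter and BinaryReader from System.IO. The class name and the layout (a length followed by the values) are just assumptions for illustration.

    ```csharp
    using System.IO;

    static class GravityCurveFile
    {
        // Layout: an int32 element count, followed by that many float32 values.
        public static byte[] Save(float[] values)
        {
            using (var stream = new MemoryStream())
            using (var writer = new BinaryWriter(stream))
            {
                writer.Write(values.Length);
                foreach (float v in values) writer.Write(v);
                writer.Flush();
                return stream.ToArray();
            }
        }

        // A reader has to know this layout in advance, which is the "hurdle" mentioned above.
        public static float[] Load(byte[] data)
        {
            using (var reader = new BinaryReader(new MemoryStream(data)))
            {
                int count = reader.ReadInt32();
                var values = new float[count];
                for (int i = 0; i < count; i++) values[i] = reader.ReadSingle();
                return values;
            }
        }
    }
    ```

    To store it on disk, the byte array could be passed to File.WriteAllBytes and read back with File.ReadAllBytes.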
     
  21. Shushustorm

    Thanks for your replies!

    @Antypodish

    That sounds like a valid solution! Thanks for pointing that out! I will keep that in mind in case the current workflow will be infeasible.

    @Suddoha

    Of course, build time is less of a problem than having to wait before entering playmode. But even then, if a build mid-production takes 3-5 hours (I'd estimate that given the current situation), it may very well become a problem. Especially since I have to build quite a lot to make sure it runs well on consoles. Then again, building only one scene at a time may work.
    But yes, the biggest problem would be waiting a minute each time before entering playmode. I submitted this as a bug and I'm very interested if this is to be expected or if this could be fixed from Unity's side.

    Thanks for pointing out the different memory usages. On 64bit, however, memory should be the least problem. I should even have some room for additional upscaling of assets, even more than I originally planned.

    Exactly, the arrays' content only becomes allocated once the length != 0.

    The configuration isn't that much of a problem. Many objects will use similar or even exactly the same values, some of them I will probably start from another script that generates values and for some that use exactly the same values, I am thinking about writing a script that controls all of them so that there won't be too much unnecessary starting and stopping of coroutines.

    Also, I'm unsure if binary is the right thing here, since I need those values to be tweakable at runtime and reading from and writing to binary always has some overhead, especially on slow disks.

    Let's say I have

    ParticleAnimationGravityValues values1;
    ParticleAnimationColorValues values2;

    , I could use

    public class ParticleAnimationValues {
        ParticleAnimationGravityValues values1;
        ParticleAnimationColorValues values2;
    }

    , but as soon as I need one of those ParticleAnimationValues, there will, in addition to the memory of ParticleAnimationValues itself, also be the overhead of all the contained variables' memory footprint. I'm unsure if that's going to be useful in terms of memory.
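
    One possible way around that, sketched under the assumption that the contained types are classes (reference types) and plain C# objects: leave the parts null until something actually needs them, so an unused part only costs a reference. The members shown are hypothetical. Unity's Inspector serialization may instantiate serializable class fields instead of leaving them null, so this may not carry over to serialized fields as-is.

    ```csharp
    class ParticleAnimationGravityValues { public float[] keys; }
    class ParticleAnimationColorValues { public float[] keys; }

    class ParticleAnimationValues
    {
        // Both parts stay null, costing one reference each, until requested.
        public ParticleAnimationGravityValues gravity;
        public ParticleAnimationColorValues color;

        // Allocates the gravity part on first use and returns the same object afterwards.
        public ParticleAnimationGravityValues GetOrCreateGravity()
        {
            if (gravity == null) gravity = new ParticleAnimationGravityValues();
            return gravity;
        }

        // Allocates the color part on first use and returns the same object afterwards.
        public ParticleAnimationColorValues GetOrCreateColor()
        {
            if (color == null) color = new ParticleAnimationColorValues();
            return color;
        }
    }
    ```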

    Binary is generally something I would consider, but I'm unsure about its performance at that scale. But I will keep that in mind as well.