How To Optimize a Unity Game

Discussion in 'General Discussion' started by GameDevSA, Oct 24, 2021.

  1. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    This is great for single-threaded code, but when a lot of threads are all competing with each other, from all kinds of places in the source, it probably won't work as well without traffic management, plus the overhead that inevitably brings.
     
  2. JooleanLogic

    JooleanLogic

    Joined:
    Mar 1, 2018
    Posts:
    447
    Yes, this is absolutely only for single-threaded use, since in reality you would have to lock the component arrays against changes during system updates. It's not an alternative approach to ecs, just something for single-threaded GameObject/MonoBehaviour land.
     
  3. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    It's a good bit of code to demonstrate an idea. To be honest, I didn't think anyone would try :)
     
  4. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    You know, I'm pretty sure you can operate on the stored values directly without creating a temporary copy with get/set if you enable unsafe mode.

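    Code (csharp):
    // Requires unsafe code to be enabled for the project or assembly.
    static unsafe void update(SinWaveMotionComponent *comp){
        // operate on *comp in place; no temporary copy is created
    }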
    You'll need to use an array for storage and not a list this way, though.
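    As a rough sketch of the idea above, the following updates structs stored in an array through a raw pointer. The struct `SinWaveMotionData`, its fields, and the class `UnsafeSinWaveSystem` are hypothetical names for illustration and are not from the thread. The struct deliberately holds only plain floats, since the compiler restricts pointers to structs that contain managed references such as a Transform. It assumes unsafe code is enabled at compile time.

```csharp
using System;

// Hypothetical component holding only unmanaged fields (assumption for this sketch).
public struct SinWaveMotionData
{
    public float angle;              // degrees
    public float radius;
    public float rotationsPerSecond;
    public float y;                  // resulting height
}

public static class UnsafeSinWaveSystem
{
    // Mutates the struct through the pointer; no copy of the struct is made.
    static unsafe void Update(SinWaveMotionData* comp, float dt)
    {
        comp->angle = (comp->angle + 360f * comp->rotationsPerSecond * dt) % 360f;
        comp->y = MathF.Sin(comp->angle * (MathF.PI / 180f)) * comp->radius;
    }

    public static unsafe void UpdateAll(SinWaveMotionData[] components, float dt)
    {
        // 'fixed' pins the array so the garbage collector cannot move it
        // while a raw pointer into it is in use.
        fixed (SinWaveMotionData* first = components)
        {
            for (int i = 0; i < components.Length; i++)
                Update(first + i, dt);
        }
    }
}
```

    Calling `UnsafeSinWaveSystem.UpdateAll(array, dt)` once per frame would play the role of the system update. A `List<T>` cannot be pinned this way, which is why array storage is needed.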
     
  5. JooleanLogic

    JooleanLogic

    Joined:
    Mar 1, 2018
    Posts:
    447
    Blame Arowx. :)
    Yes you could but that wasn't a concern for the purpose of this demonstration.
    Are there any consequences to using unsafe in Unity? Does it play any role in terms of publishing your game?
     
    hippocoder likes this.
  6. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Thank you for this great example of my concept.

    unsafe should not make any difference, as Unity is written in C/C++, which is 'unsafe' by default.

    Think of it like taking the stabilisers off your first bike: it's fine as long as you don't fall off.

    I think you could improve this section by using the Burst and Mathematics features, e.g. float3 and [BurstCompile] attributes.

    Code (CSharp):
    for (int i = 0; i < components.Count; i++) {
        var c = components[i];

        c.angle += 360 * c.rotationsPerSecond * dt;
        c.angle %= 360;
        float sin = Mathf.Sin(c.angle * Mathf.Deg2Rad);

        Vector3 position = c.transform.position; // you access the transform twice; you should get a reference to it once and use it twice.
        position.y = sin * c.radius;
        c.transform.position = position; // ditto

        components[i] = c;
    }

    Also, you're using an Array of Structs, which is not as performant as a Struct of Arrays.

    Set it up as a Job and you would have basic Multi-threading.

    But as you're just using a sine function within a fixed range, you should really generate a sine array with sufficient resolution and then use the nearest value. You could calculate the resolution based on the display size in pixels and some upper threshold, e.g. 3600 entries for 360.0 degrees (0.1 degree steps).

    PS 360 degrees = 2 * PI radians, so you can just use 2 * PI in place of 360, remove the Mathf.Deg2Rad, and have a radians sine array lookup table.
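    A minimal sketch of the lookup table idea, assuming 3600 entries over one full turn and nearest-entry lookup. The class name `SineTable` and its members are made up for illustration. Whether this beats calling the sine function directly depends on the hardware, so it is only a starting point for measurement.

```csharp
using System;

public static class SineTable
{
    // Assumed resolution: 3600 entries, i.e. 0.1 degree per entry.
    public const int Size = 3600;
    static readonly float[] table = Build();

    static float[] Build()
    {
        var t = new float[Size];
        for (int i = 0; i < Size; i++)
            t[i] = MathF.Sin(i * (2f * MathF.PI / Size));
        return t;
    }

    // Nearest-entry lookup for an angle in radians (any sign, any magnitude).
    public static float Sin(float radians)
    {
        float turns = radians / (2f * MathF.PI);
        turns -= MathF.Floor(turns);                    // wrap into one turn
        int index = (int)(turns * Size + 0.5f) % Size;  // round to nearest, wrap Size back to 0
        return table[index];
    }
}
```

    With this resolution the angle is off by at most half a step (0.05 degrees), so the returned value differs from the true sine by less than about 0.001.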
     
    Last edited: Nov 2, 2021
    JooleanLogic likes this.
  7. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    The effects of "unsafe" code are undocumented. The manual simply doesn't talk about it, as far as I'm aware.
     
    JooleanLogic likes this.
  8. JooleanLogic

    JooleanLogic

    Joined:
    Mar 1, 2018
    Posts:
    447
    Good grief, my apologies to all for not optimising my conceptual code which isn't being used by anyone anywhere. :rolleyes::)
    What does that mean in the context of this design and how would you achieve it? I'm not familiar with SoA other than briefly looking it up now.
    The simplicity of this design is quite cool, but it only operates on individual components, i.e. it can't align multiple components in the same query like ecs can.
    E.g. CompA.Components and CompB.Components do not align.
    So you can't split components you want to operate on into more atomic types for SoA (if my understanding is correct). Not in any generic way that I can see anyway.
    Thanks. I'd steered clear of this due to a vague memory that it had publishing consequences on mobile for some reason. Good to know the gloves can come off.
     
  9. neginfinity

    neginfinity

    Joined:
    Jan 27, 2013
    Posts:
    13,321
    In the context of this design, it means you'll skip creating the temporary copy. The idea was to make things cacheable, as far as I can tell, and since it is a struct, the value is going to be on the stack. So the question is whether repeatedly pulling the value onto the stack is going to have any performance impact.

    If you use pointers, limited C++ style, you'll have less boilerplate. No accessors or anything. Basically, raw memory access allows you to CAST raw bytes into struct data and operate on that, without creating a temporary on the stack. It is very old school, though.

    If I were pursuing this sort of direction, in the end I'd likely kill off the generic base and classes completely and create one master function that operates on a raw memory block and calls the corresponding functions by itself. That's nearly C style, without the ++.

    You'd also likely be able to store multiple different types in the same block, although the question is how you'd parallelize those, as different types would have different footprints.

    But that, too, would be very far removed from normal C#.

    There are rumors that using unsafe causes problems when publishing on iOS. However, I can't find any hits in the documentation regarding unsafe code, aside from the assembly definition settings article. So it is undocumented.
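    For comparison, here is a sketch of a different technique from the pointers described above: a C# `ref` local also avoids the temporary copy when the storage is an array, and it needs no unsafe code. The struct `Mover` and the class `RefUpdate` are hypothetical names used only for this illustration.

```csharp
// Hypothetical component (assumption for this sketch).
public struct Mover
{
    public float angle;              // degrees
    public float rotationsPerSecond;
}

public static class RefUpdate
{
    // Copy the struct out, modify the copy, then write it back.
    public static void UpdateByCopy(Mover[] items, float dt)
    {
        for (int i = 0; i < items.Length; i++)
        {
            Mover c = items[i];
            c.angle = (c.angle + 360f * c.rotationsPerSecond * dt) % 360f;
            items[i] = c;
        }
    }

    // 'c' is an alias for the array slot, so the write lands in the array directly.
    public static void UpdateInPlace(Mover[] items, float dt)
    {
        for (int i = 0; i < items.Length; i++)
        {
            ref Mover c = ref items[i];
            c.angle = (c.angle + 360f * c.rotationsPerSecond * dt) % 360f;
        }
    }
}
```

    Both methods produce the same result. Whether the in-place version is measurably faster is an open question that would need profiling, as the post above notes.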
     
  10. GimmyDev

    GimmyDev

    Joined:
    Oct 9, 2021
    Posts:
    157
    Be careful of premature optimization. LUTs (i.e. Arowx's sine array) aren't necessarily faster, the instruction cache also has aliasing performance issues, and there is a whole bunch of unintended effects from the scheduler and so on. There are rules of thumb, but ultimately the hardware design might trump them all. I think generalizing about data cache performance is the only thing that is mostly portable today, though.
     
    neginfinity likes this.
  11. JooleanLogic

    JooleanLogic

    Joined:
    Mar 1, 2018
    Posts:
    447
    All these optimisations are fine but that just was not the point of this code.
    I used sin/cos tables 20+ years ago to get a 3D explosion effect working on the 486SX. Nothing's changed lol.
     
  12. JooleanLogic

    JooleanLogic

    Joined:
    Mar 1, 2018
    Posts:
    447
    Yes, that's exactly what I remembered reading somewhere, though researching it again I don't think there's any problem with it. It's still an option in the iOS player settings, so I'm sure it's fine.
     
  13. GimmyDev

    GimmyDev

    Joined:
    Oct 9, 2021
    Posts:
    157
    Memory speed has become the bottleneck. That's why chip makers cram extra hardware onto the chip (like schedulers) and craft complex memory hierarchies and layouts: to deal with the slowness of accessing memory. It's faster to iterate code on the ALU registers than to call a LUT (depending on the complexity of the algorithm). Sine is probably better left to the ALU than dealing with the memory-mapping logistics of accessing a sine LUT. On mobile you have the microjoule issue too, where memory access dominates the heat profile, so closer memory is best, and there is no closer memory than the registers. If you can define your algorithm to work on the closest memory (i.e. less electron travel, therefore less heat), you get more performance without hitting the throttling threshold.

    That's the theory, though. I haven't personally tested that use case, and I'm probably not competent to test it thoroughly. I'm merely aware of it due to the nerds around me constantly bitching about it... and watching videos to understand what the heck they are talking about. SIMD, oh! Everyone is racing the electron now.
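    Since the claim above is untested, one way to check it on a given device is a small timing harness like the sketch below. The class `SinBenchmark`, the 3600-entry table, and the helper names are all assumptions made for this illustration. No result is implied, and timings will differ between the editor, a player build, and Burst-compiled code.

```csharp
using System;
using System.Diagnostics;

public static class SinBenchmark
{
    const int TableSize = 3600;            // assumed resolution: 0.1 degree per entry
    const float TwoPi = 2f * MathF.PI;
    static readonly float[] table = BuildTable();

    static float[] BuildTable()
    {
        var t = new float[TableSize];
        for (int i = 0; i < TableSize; i++) t[i] = MathF.Sin(i * TwoPi / TableSize);
        return t;
    }

    // Test data: angles in radians, roughly uniform over one turn.
    public static float[] RandomAngles(int count, int seed)
    {
        var rng = new Random(seed);
        var a = new float[count];
        for (int i = 0; i < count; i++) a[i] = (float)(rng.NextDouble() * 2.0 * Math.PI);
        return a;
    }

    // Computes every sine with MathF.Sin.
    public static float SumDirect(float[] angles)
    {
        float sum = 0f;
        for (int i = 0; i < angles.Length; i++) sum += MathF.Sin(angles[i]);
        return sum;
    }

    // Reads every sine from the table (nearest entry); expects non-negative angles.
    public static float SumTable(float[] angles)
    {
        const float scale = TableSize / TwoPi;
        float sum = 0f;
        for (int i = 0; i < angles.Length; i++)
            sum += table[(int)(angles[i] * scale + 0.5f) % TableSize];
        return sum;
    }

    // Times 'repeats' passes of f; the sums go to 'sink' so the work cannot be skipped.
    public static TimeSpan Time(Func<float[], float> f, float[] angles, int repeats, out float sink)
    {
        sink = f(angles);                  // warm-up pass so JIT time is not measured
        var sw = Stopwatch.StartNew();
        for (int r = 0; r < repeats; r++) sink += f(angles);
        sw.Stop();
        return sw.Elapsed;
    }
}
```

    A run might compare `SinBenchmark.Time(SinBenchmark.SumDirect, angles, 10, out _)` against the same call with `SinBenchmark.SumTable`, using `angles = SinBenchmark.RandomAngles(1000000, 1)`. Random access is the harder case for the table; sequential angles would flatter it.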
     
    angrypenguin likes this.
  14. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
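    Code (CSharp):
    using Unity.Collections;  // NativeArray
    using Unity.Mathematics;  // float3

    // Struct of Arrays: one NativeArray per field, all indexed by the same slot.
    public struct SinWaveMotionComponentData {
        public NativeArray<float> angle;
        public NativeArray<float> radius;
        public NativeArray<float> rotationsPerSecond;
        public NativeArray<float3> transformPositions;
    }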
    With a Struct of Arrays you would just have one struct holding arrays of the data (NativeArrays for Jobs/Burst). Then use the Copy, Burst, Copy pattern to move the data from the transforms into the Burst job and back again.
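    To make the layout concrete, here is a sketch of the same struct-of-arrays idea with an update loop. It assumes plain `float[]` arrays in place of `NativeArray<float>` so that it runs outside Unity, and it tracks only the y position. The class name `SinWaveMotionSoA` and the field `positionY` are hypothetical.

```csharp
using System;

// Struct of Arrays: one array per field, all indexed by the same slot.
// Plain float[] stands in for NativeArray<float> (assumption for this sketch).
public sealed class SinWaveMotionSoA
{
    public readonly float[] angle;              // degrees
    public readonly float[] radius;
    public readonly float[] rotationsPerSecond;
    public readonly float[] positionY;

    public SinWaveMotionSoA(int count)
    {
        angle = new float[count];
        radius = new float[count];
        rotationsPerSecond = new float[count];
        positionY = new float[count];
    }

    // Each pass touches only the arrays it needs, each of them contiguous in memory.
    public void Update(float dt)
    {
        const float Deg2Rad = MathF.PI / 180f;
        for (int i = 0; i < angle.Length; i++)
        {
            float a = (angle[i] + 360f * rotationsPerSecond[i] * dt) % 360f;
            angle[i] = a;
            positionY[i] = MathF.Sin(a * Deg2Rad) * radius[i];
        }
    }
}
```

    In Unity, the copy steps would sit on either side of `Update`: read each transform's position into the arrays beforehand, and write `positionY` back to the transforms afterwards. That part is omitted here because it depends on how the transforms are stored.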
     
  15. JooleanLogic

    JooleanLogic

    Joined:
    Mar 1, 2018
    Posts:
    447
    OK, that's as I thought. However, you're back to showing how you want the final outcome to look, with no indication of how to get there or of the effect it will have on writing code, which you want to be easy.

    I think the only way you can get the advantages of ecs but also ease of coding would be if Unity wrote a custom language specifically for it which could hide all the complexity behind a compiler. Not a bad idea but I can't see it happening anytime soon.

    My only point was to rebut the argument that you can just struct your components into static arrays and it's hunky dory. Even in such a simple case, you immediately run into issues of indirection and non-alignment, amongst others. If you were to set about solving these issues, you'd eventually end up where Unity's ecs already is, which is their archetype/chunk based solution.
    I agree it's difficult (I'm looking at alternatives myself) but that's just the consequence of working in a data paradigm vs an object one.
    Careful with the N word around here. Lol.
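    The indirection and non-alignment issue mentioned above can be illustrated with a small sketch. It assumes each component type lives in its own densely packed store, so slot i in one store and slot i in another may belong to different entities. The names `ComponentStore`, `Position`, `Velocity`, and `MoveSystem` are hypothetical and not from the thread.

```csharp
using System.Collections.Generic;

// Hypothetical dense store: components packed in a list, plus maps between entity id and slot.
public sealed class ComponentStore<T> where T : struct
{
    readonly List<T> items = new List<T>();
    readonly List<int> owners = new List<int>();                        // slot -> entity id
    readonly Dictionary<int, int> slotOf = new Dictionary<int, int>();  // entity id -> slot

    public int Count => items.Count;
    public int OwnerAt(int slot) => owners[slot];
    public T this[int slot] { get => items[slot]; set => items[slot] = value; }

    public void Add(int entity, T value)
    {
        slotOf[entity] = items.Count;
        items.Add(value);
        owners.Add(entity);
    }

    public bool TryGetSlot(int entity, out int slot) => slotOf.TryGetValue(entity, out slot);
}

public struct Position { public float y; }
public struct Velocity { public float dy; }

public static class MoveSystem
{
    // The two stores are packed independently, so every velocity has to find
    // its matching position through the entity id: one extra lookup per element.
    public static void Update(ComponentStore<Position> positions,
                              ComponentStore<Velocity> velocities, float dt)
    {
        for (int v = 0; v < velocities.Count; v++)
        {
            int entity = velocities.OwnerAt(v);
            if (!positions.TryGetSlot(entity, out int p)) continue;  // entity has no Position
            Position pos = positions[p];
            pos.y += velocities[v].dy * dt;
            positions[p] = pos;
        }
    }
}
```

    The dictionary lookup per element is the cost that an archetype/chunk layout is designed to avoid, since it keeps the components of matching entities at the same index.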
     
  16. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Update on the benchmark I was working on...

    In summary, Update() is faster than Batch SoA until 4096 objects, and the overhead of Jobs made it slower than both Batch and Update().

    Note this is for a very simple move n objects system.

    Mind you, I did enable the BurstCompiler on all Update/Process methods, so maybe that has a few tricks up its sleeve when doing multiple Updates?
     
  17. unity-freestyle

    unity-freestyle

    Joined:
    Aug 26, 2015
    Posts:
    45