UI Optimisation Challenge?

Discussion in 'UGUI & TextMesh Pro' started by Arowx, Nov 29, 2014.

  1. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK I know I'm challenge mad, but what if Unity put out an open challenge to developers to improve and enhance the new UI system?

    They could provide a trial Pro code so people can use the profiler to improve and enhance the UI system.

    With tempting rewards or prizes for the best optimisations/improvements, free Pro versions of 5 maybe?

    And it could give the UI team a bit of a rest, at least until the deadline hits and they have to rate the entries!
     
    Gekigengar and MrEsquire like this.
  2. Mikeysee

    Mikeysee

    Joined:
    Oct 14, 2013
    Posts:
    155
    +1 I think this is a great idea
     
  3. MrEsquire

    MrEsquire

    Joined:
    Nov 5, 2013
    Posts:
    2,712
    + Good idea
     
  4. Gekigengar

    Gekigengar

    Joined:
    Jan 20, 2013
    Posts:
    706
    + Good idea.
     
  5. shkar-noori

    shkar-noori

    Joined:
    Jun 10, 2013
    Posts:
    833
    + Good idea. I've already made a lot of optimizations and new controls [ComboBox, TabPanel, ...].
     
  6. Demozo

    Demozo

    Joined:
    Aug 19, 2014
    Posts:
    20
    This sounds like a cool idea, might give people some motivation to develop new controls which might then get adapted into the engine?
     
    shkar-noori likes this.
  7. Nubz

    Nubz

    Joined:
    Sep 22, 2012
    Posts:
    553
    Haha
    People need to give up the idea of Unity giving away 5 pro.
    Seems to be a lot of it lately
     
  8. Zeblote

    Zeblote

    Joined:
    Feb 8, 2013
    Posts:
    1,102
    Makes you wonder why
     
  9. MrEsquire

    MrEsquire

    Joined:
    Nov 5, 2013
    Posts:
    2,712
    I thought the whole point of open source was exactly for things like this? It allows other people to add to and improve the code. Therefore I doubt they'd give away any licenses, and how much work would be needed to justify a 1500 dollar Pro license? The idea is great, but there's no incentive.
     
  10. Gekigengar

    Gekigengar

    Joined:
    Jan 20, 2013
    Posts:
    706
    With great reward comes great motivation.
     
    MrEsquire likes this.
  11. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Actually I think there could be a problem with optimising the UI with the code base provided.

    If you run a performance test using text with shadows and dig down in the profiler, I found that 15.5% of the time was taken up by calls to the List get-item (in this case fetching a UIVertex), which triggers a String.memcpy(???). That class does not appear to be in the open classes???

    But we could still do the following optimisations.

    Code (CSharp):
    protected void ApplyShadow(List<UIVertex> verts, Color32 color, int start, int end, float x, float y)
    {
        UIVertex vt;

        var neededCpacity = verts.Count * 2;
        if (verts.Capacity < neededCpacity)
            verts.Capacity = neededCpacity;

        for (int i = start; i < end; ++i)
        {
            vt = verts[i];
            verts.Add(vt);

            Vector3 v = vt.position; // Should be outside of loop
            v.x += x;
            v.y += y;
            vt.position = v;
            var newColor = color; // ditto
            if (m_UseGraphicAlpha) // Should be outside of loop; prevents branching
                newColor.a = (byte)((newColor.a * verts[i].color.a) / 255);
            vt.color = newColor;
            verts[i] = vt;
        }
    }
    So that would give us this ...

    Code (CSharp):
    protected void ApplyShadow(List<UIVertex> verts, Color32 color, int start, int end, float x, float y)
    {
        UIVertex vt;

        var neededCpacity = verts.Count * 2;
        if (verts.Capacity < neededCpacity)
            verts.Capacity = neededCpacity;

        // Declared once, outside both loops.
        Vector3 v;

        if (m_UseGraphicAlpha) // Tested once here instead of once per vertex
        {
            // "var" needs an initializer in C#, so the type is spelled out.
            Color32 newColor;

            for (int i = start; i < end; ++i)
            {
                vt = verts[i];
                verts.Add(vt); // the original vertex is appended to the end of the list

                v = vt.position;
                v.x += x;
                v.y += y;
                vt.position = v;

                newColor = color;
                newColor.a = (byte)((newColor.a * vt.color.a) / 255);

                vt.color = newColor;
                verts[i] = vt; // the offset, recoloured copy replaces slot i
            }
        }
        else
        {
            for (int i = start; i < end; ++i)
            {
                vt = verts[i];
                verts.Add(vt);

                v = vt.position;
                v.x += x;
                v.y += y;
                vt.position = v;

                vt.color = color;
                verts[i] = vt;
            }
        }
    }
    It should be a bit faster, but I don't have things set up to test it.
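For anyone who does want to test it, a minimal sketch of a timing harness follows. It runs outside Unity, so `Vert` and `Col` are hypothetical stand-ins for UIVertex and Color32, and `ShadowBench`, `MakeVerts` and `TimeMs` are names made up for this example. Both variants should produce identical vertex lists; only the timing may differ, and a desktop .NET result says nothing certain about Mono inside Unity.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical stand-ins for Color32 and UIVertex so this compiles without Unity.
public struct Col { public byte r, g, b, a; }
public struct Vert { public float x, y, z; public Col color; }

public static class ShadowBench
{
    // Variant 1: the useAlpha test is evaluated for every vertex.
    public static void BranchInLoop(List<Vert> verts, Col color, int start, int end,
                                    float dx, float dy, bool useAlpha)
    {
        for (int i = start; i < end; ++i)
        {
            Vert vt = verts[i];
            verts.Add(vt);          // original goes to the end of the list
            vt.x += dx;
            vt.y += dy;
            Col c = color;
            if (useAlpha)
                c.a = (byte)((c.a * vt.color.a) / 255);
            vt.color = c;
            verts[i] = vt;          // shadow copy replaces slot i
        }
    }

    // Variant 2: the test is hoisted, at the cost of two near-identical loops.
    public static void BranchHoisted(List<Vert> verts, Col color, int start, int end,
                                     float dx, float dy, bool useAlpha)
    {
        if (useAlpha)
        {
            for (int i = start; i < end; ++i)
            {
                Vert vt = verts[i];
                verts.Add(vt);
                vt.x += dx;
                vt.y += dy;
                Col c = color;
                c.a = (byte)((c.a * vt.color.a) / 255);
                vt.color = c;
                verts[i] = vt;
            }
        }
        else
        {
            for (int i = start; i < end; ++i)
            {
                Vert vt = verts[i];
                verts.Add(vt);
                vt.x += dx;
                vt.y += dy;
                vt.color = color;
                verts[i] = vt;
            }
        }
    }

    // Builds n test vertices with room for the n copies the shadow pass appends.
    public static List<Vert> MakeVerts(int n)
    {
        var verts = new List<Vert>(n * 2);
        for (int i = 0; i < n; ++i)
            verts.Add(new Vert { x = i, y = i, z = 0f,
                                 color = new Col { r = 255, g = 255, b = 255, a = (byte)(i % 256) } });
        return verts;
    }

    // Accumulated milliseconds spent inside body over `runs` fresh lists.
    public static double TimeMs(Action<List<Vert>> body, int count, int runs)
    {
        var sw = new Stopwatch();
        for (int r = 0; r < runs; ++r)
        {
            List<Vert> verts = MakeVerts(count); // setup is not timed
            sw.Start();
            body(verts);
            sw.Stop();
        }
        return sw.Elapsed.TotalMilliseconds;
    }
}
```

A call such as `ShadowBench.TimeMs(v => ShadowBench.BranchHoisted(v, shadow, 0, 100000, 1f, -1f, true), 100000, 100)` could then be compared against the same call with `BranchInLoop`, ideally in a release build and after a warm-up run.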

    And we could reverse the loop, as counting down in C# is apparently slightly faster than counting up, according to dotnetperls.
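One caveat if the loop in ApplyShadow were reversed: each iteration appends the original vertex to the end of the list, so the iteration order decides the order of the appended vertices. A minimal sketch with ints standing in for vertices (the class and method names are made up for this example):

```csharp
using System.Collections.Generic;

public static class LoopOrderDemo
{
    // Same shape as the shadow loop: append the original, then overwrite slot i.
    // Negation stands in for "offset and recolour".
    public static List<int> Forward(List<int> src)
    {
        var v = new List<int>(src);
        int end = v.Count;
        for (int i = 0; i < end; ++i)
        {
            v.Add(v[i]);
            v[i] = -v[i];
        }
        return v;
    }

    // Identical body, counting down instead of up.
    public static List<int> Reverse(List<int> src)
    {
        var v = new List<int>(src);
        int end = v.Count;
        for (int i = end - 1; i >= 0; --i)
        {
            v.Add(v[i]);
            v[i] = -v[i];
        }
        return v;
    }
}
```

For the input 1, 2, 3 the forward loop yields -1, -2, -3, 1, 2, 3 while the reversed loop yields -1, -2, -3, 3, 2, 1. The appended originals come out in reverse order, so a count-down version would need extra care to keep each quad's vertices in their original sequence.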

    The other big performance hit appearing in my benchmark is Text.OnFillVBO().

    Digging down, it's the same issue with UIVertex and String.memcpy???, then Vector3.op_Multiply() [which can be replaced by unrolling the multiplication to each float element].
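A minimal sketch of what unrolling the multiplication could look like. `Vec3` is a hypothetical stand-in for Vector3 (assumed here to be three public floats with a scalar `operator *`), so this shows only the technique, not the actual Text.OnFillVBO() code.

```csharp
using System;

// Hypothetical stand-in for Vector3: three public floats and a scalar operator*.
public struct Vec3
{
    public float x, y, z;

    public Vec3(float x, float y, float z) { this.x = x; this.y = y; this.z = z; }

    // Counterpart of op_Multiply: builds and returns a new struct on every call.
    public static Vec3 operator *(Vec3 v, float s)
    {
        return new Vec3(v.x * s, v.y * s, v.z * s);
    }
}

public static class ScaleDemo
{
    // Goes through the operator for every element.
    public static void ScaleWithOperator(Vec3[] points, float s)
    {
        for (int i = 0; i < points.Length; ++i)
            points[i] = points[i] * s;
    }

    // Unrolled: multiplies each float in place, with no operator call and no temporary.
    // In-place field writes work on arrays; a List<Vec3> would need a read-modify-write.
    public static void ScaleUnrolled(Vec3[] points, float s)
    {
        for (int i = 0; i < points.Length; ++i)
        {
            points[i].x *= s;
            points[i].y *= s;
            points[i].z *= s;
        }
    }
}
```

Whether the unrolled form is measurably faster depends on how well the runtime inlines the operator, so it is worth timing both on the target platform before committing to it.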

    Text.GenerationSettings() -> get_pixelsPerUnit() is a bit of a hog: 53 calls to the former end up as 212 calls to the latter (four each), when the value could be cached.
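A minimal sketch of the caching idea, assuming the value only changes on identifiable events (for example a canvas scale or font change). `CachedValue` is a hypothetical helper written for this example, not part of Unity's UI code.

```csharp
using System;

// Hypothetical helper: computes a float lazily and reuses it until invalidated.
public class CachedValue
{
    private readonly Func<float> m_Compute; // the expensive lookup being wrapped
    private float m_Value;
    private bool m_Dirty = true;

    public CachedValue(Func<float> compute)
    {
        m_Compute = compute;
    }

    // Recomputes only when the cache has been marked dirty.
    public float Value
    {
        get
        {
            if (m_Dirty)
            {
                m_Value = m_Compute();
                m_Dirty = false;
            }
            return m_Value;
        }
    }

    // Call whenever an input of the computation changes.
    public void Invalidate()
    {
        m_Dirty = true;
    }
}
```

Reading `Value` four times in a row then runs the wrapped computation once. The cost moves to correctness: every code path that changes an input has to call `Invalidate()`, otherwise a stale value is returned.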

    A bit more digging: the CanvasUpdateRegistry.InternalRegisterCanvasElementForGraphicRebuild() call does a linear search of all elements in the list. This could be improved with a Dictionary or array-based index id system for ICanvasElements. (Note this is only 1.5% of the performance issue.)
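A minimal sketch of replacing the linear search with a hashed lookup. It uses a HashSet next to the list as the simplest form of the Dictionary idea. `RebuildQueue` and its members are hypothetical names, and the real registry's internals are not reproduced here.

```csharp
using System.Collections.Generic;

// Hypothetical queue: a list for ordered iteration plus a set for duplicate checks.
public class RebuildQueue<T>
{
    private readonly List<T> m_Items = new List<T>();        // keeps registration order
    private readonly HashSet<T> m_Lookup = new HashSet<T>(); // average O(1) membership test

    public int Count
    {
        get { return m_Items.Count; }
    }

    // Returns false when the element is already queued, without scanning the list.
    public bool TryRegister(T element)
    {
        if (!m_Lookup.Add(element))
            return false;
        m_Items.Add(element);
        return true;
    }

    // Element at the given position in registration order.
    public T GetAt(int index)
    {
        return m_Items[index];
    }

    // Empties both containers, e.g. after a rebuild pass has run.
    public void Clear()
    {
        m_Items.Clear();
        m_Lookup.Clear();
    }
}
```

The trade-off is extra memory and a hash computation per registration, which may not pay off for short lists. Given that the search is only 1.5% of the measured time, it would need profiling to justify.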
     
    Last edited: Dec 1, 2014
    Gekigengar and MrEsquire like this.
  12. Dantus

    Dantus

    Joined:
    Oct 21, 2009
    Posts:
    5,667
    I used something similar in my Cloud System. A few hundred or thousand particles that need to be recalculated in each frame and the loop contained an "if" which could be removed. Getting rid of it showed almost no performance benefit, even though the code was executed a few hundred or thousand times per frame. It didn't even pay off on mobile devices, which was pretty surprising.

    Personally, I would never make that kind of optimization. "Slightly faster" means, in that context, that you need to measure it carefully and with enormous iteration counts, otherwise you won't be able to measure any difference.
    Often when you have a loop, you are also accessing array elements. They completely ignored any kind of array access in their performance considerations, which certainly takes longer than the comparison or the increment/decrement and as such is far more relevant when it comes to improving the performance.

    As the UI team will create a new text rendering implementation, it is not unlikely that they will go through it later on or that it even needs to be replaced.

    Edit: Just realized that I only have negative comments. Be assured it is not my intention to stop your efforts at all!
     
  13. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    I've found what is triggering the String.memcpy: it appears to be the Vector4 struct used as the tangent in UIVertex, although I'm not sure which element of Vector4 is causing it?

    Correction, nope, it appears to kick in when a struct's memory footprint increases, as soon as you have a couple of Vectors??
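A minimal sketch of why a larger struct means more copying. `FatVertex` is a hypothetical stand-in with a field list modelled on a vertex that carries position, normal, colour, two UV sets and a tangent. Whether this by-value copying is what the profiler reports as String.memcpy is an assumption that this example does not prove.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

// Hypothetical stand-in for a "fat" vertex struct.
public struct FatVertex
{
    public float px, py, pz;     // position
    public float nx, ny, nz;     // normal
    public byte r, g, b, a;      // colour
    public float u0, v0, u1, v1; // two UV sets
    public float tx, ty, tz, tw; // tangent
}

public static class StructCopyDemo
{
    // Marshalled size in bytes; every by-value copy has to move about this much data.
    public static int SizeOfVertex()
    {
        return Marshal.SizeOf(typeof(FatVertex));
    }

    // List<T>'s indexer returns a copy of the struct, so changing the local
    // leaves the list element untouched until it is written back.
    public static float MutateLocalCopy(List<FatVertex> verts)
    {
        FatVertex vt = verts[0]; // whole struct copied out of the list
        vt.px += 10f;            // modifies only the local copy
        return verts[0].px;      // the list still holds the original value
    }
}
```

Each `verts[i]` read and each `verts[i] = vt` write in a loop therefore moves the full struct, which is consistent with the cost growing as fields are added. A class would copy only a reference, at the price of heap allocations.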
     
    Last edited: Dec 1, 2014
  14. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    OK odd one but just tried changing a UIVertex struct I was testing to a class and the String.memcpy memory profile hits vanish and in deep profile mode the tests show the following:

    UIVertex as Struct - 698ms (called 100 times).
    UIVertex as Class - 171ms (ditto).

    But in non deep profile mode it reverses! LOL

    UIVertex as Struct - 0.02 seconds (using Time.realtimeSinceStartup as timer)
    UIVertex as Class - 0.04 seconds.
     
  15. Zeblote

    Zeblote

    Joined:
    Feb 8, 2013
    Posts:
    1,102
    Could the memcpy be caused by said deep profile mode?
     
    Arowx likes this.
  16. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    @Dantus Well in theory it depends what Mono/.Net does to the loop, it could turn a loop with an inner condition that is not dependent on a variable changed within the loop into a branch and two loops. But as far as the CPU is concerned it has to load up the next set of commands and a branch within a loop is a potential performance hiccup.

    80/20 rule: 80% of your code will not need optimisation as it's not called often enough, but the 20% that is tucked away in inner loops can provide significant improvements in performance.

    Of course the UI team have access to the Open and Closed Source so they should be able to make significant improvements in performance.
     
  17. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Good point, the String.memcpy() entries do have very high Self ms entries, but why does it only appear on structs and not on classes?
     
  18. Dantus

    Dantus

    Joined:
    Oct 21, 2009
    Posts:
    5,667
    In theory it doesn't matter what Mono/.Net does, because making optimizations to please a specific compiler is usually not a good idea, especially not if the whole architecture will be migrated to il2cpp.
    Of course, if something is highly performance sensitive, it needs to be optimized even like that. But in my opinion this is not one of those cases.
     
  19. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    Well my optimisation targets were derived from a stress testing program aimed to highlight areas ripe for performance improvements in high usage scenarios.

    The optimisations I made are generic: all CPUs can suffer when performing branch prediction, and if they get it wrong it can stall their command pipeline, slowing down performance.

    IL2CPP just takes the intermediate language, converts it to C++ and then compiles it. Ideally the C++ compiler will make optimisations of its own, depending on the build parameters you give it. But that is a Unity 5 technology and I'm talking about 4.x builds.
     
    Last edited: Dec 2, 2014
  20. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    Just to clarify:

    100 developers all respond to this challenge and each spends 20 hours designing new controls. One control is chosen and added to the new Unity.

    Thereby money lost by developers, assuming $20 per hour = 99*20*20 = $39,600. (Money which could otherwise have been spent in the asset store???)
    Money saved by Unity = 1*20*20 = $400.

    Total loss by Unity = $39,600 - $400 = $39,200.

    Better idea: Unity hires 5 top developers and designers for 1 month and creates lots of nice optimized controls. Everyone's happy.
     
  21. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    You're forgetting that even if your control is not chosen you could still release it on the asset store, then if it is useful and worth it people will pay for it.

    You're making the assumption that all work has equal value based on time. And a challenge by definition is not something you make money doing it is something that lets you grow and learn e.g. X-Prize, Climbing Everest, Slack Walking Angel Falls.

    Also a lot of companies have FedEx days or one day a week where developers / engineers / designers can work on their own pet projects. Often companies that do this find nuggets that would otherwise not have been considered time well spent, e.g. Unity's own Ninja camps.

    Why not take that creative challenge concept and allow people within the community to get creative.
     
  22. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    I'm not sure it's entirely legal though, since Unity could offset all its work as "challenges", thereby getting round minimum wage laws.

    In fact if Unity called all its staff "volunteers" then they wouldn't have to pay them anything at all! In fact, doesn't it already do this for some workers? What's the "summer of code" all about, if not cheap labour. But that's another story.

    It's kind of a grey area when a commercial company becomes half open source.

    Am I right comrades?
     
  23. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    What if said developers would rather work on new higher priority features of the engine?

    Just look at the asset store, in a way it has become a bit of a crutch for Unity, they might not have the best features in the game engine but it's OK as there are optional systems on the asset store.

    Where would Unity be if they had adopted a more open challenge based approach to improving their feature set???

    What would the community be like?
     
  24. Dantus

    Dantus

    Joined:
    Oct 21, 2009
    Posts:
    5,667
    As far as I have seen, you measured the performance in the editor, which doesn't give the most reliable data. It is necessary to create builds to test the performance. Optimally it needs to be done for different platforms. Only that will allow you to find actual bottlenecks and to test possible solutions for them.

    Ideally the C# compiler (or the jit/aot compiler) should take care of that already. If you want to improve the performance and you are doing code duplication by hand to achieve it, those changes should definitely not become unnecessary in Unity 5. This is clearly not a good way to develop reliable and maintainable code.
     
  25. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
  26. yoonitee

    yoonitee

    Joined:
    Jun 27, 2013
    Posts:
    2,363
    It might be like Blender, which is a hodgepodge of barely workable elements.

    In principle I don't see anything wrong with open sourcing parts of Unity, but then we should all be able to modify Unity and make our own "Pro" versions which we could customize and sell on as our own game engines. Much like how the Android OS is open source but all vendors can make and sell their own customized versions.

    In particular they should take out the clause that says you can't make a game engine using Unity.
     
  27. Dantus

    Dantus

    Joined:
    Oct 21, 2009
    Posts:
    5,667
    Again, the text rendering will be updated anyway. They most likely didn't have a close look at the performance of the text rendering because of that.
    I am still sure you can barely measure a performance difference between the branched and not branched variants. I would be surprised if this is the only part in the Unity UI code where such an optimization would be possible. For the sake of maintainability I believe it is often better not to use them.
     
  28. Arowx

    Arowx

    Joined:
    Nov 12, 2009
    Posts:
    8,194
    I stand corrected, it would appear that ARM RISC chips have a built-in feature that causes small logic branches to be folded into the command pipeline, reducing jumps in code.

    http://en.wikipedia.org/wiki/ARM_architecture (conditional execution).