Search Unity

  1. Unity Asset Manager is now available in public beta. Try it out now and join the conversation here in the forums.
    Dismiss Notice

Profiler Deep Profile in build shows bogus data?

Discussion in '2019.3 Beta' started by Lynxed, Dec 15, 2019.

  1. Lynxed

    Lynxed

    Joined:
    Dec 9, 2012
    Posts:
    121
    The version is 2019.3.f3
    I'm deep profiling performance of computationally intensive generation, which suppose to be done on the background. Some of it is done in .net Task, some is done with unity jobs system. When the generation starts, time values of methods, that have absolutely nothing to do with the generation itself scale up.

    For example:

    Here we have a method, called 27 times, that has nothing to do with the generation, that updates some positions. It costs 0.87ms.
    upload_2019-12-15_3-36-47.png

    And here we have absolutely the same method with absolutely the same 27 objects costing 40.53ms:

    upload_2019-12-15_3-39-1.png

    Is there a bug, or am i doing something wrong? Please help! I'm sure i successfully profiled my game with Tasks prior to version 2019.3.f1.

    upload_2019-12-15_3-51-59.png

    This is how timeline looks.
     
    Last edited: Dec 15, 2019
    alexeyzakharov likes this.
  2. Ng0ns

    Ng0ns

    Joined:
    Jun 21, 2016
    Posts:
    197
    Having issues with deep profiling in 2019.3.f3 as well (worked fine in the past). Fps drops to single digits when enabled. Tested on two phones, and it also happens while running game in editor.

    On a empty scene EditorLoop takes up 82ms.
     
    DeadCastles and alexeyzakharov like this.
  3. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,618
    Perhaps related to this known issue:
    https://issuetracker.unity3d.com/is...-deep-profiling-is-on-with-no-complex-scripts

    However, one way to speed it up a little is to close all profiler tabs but the one you actually need. For example, if you only want to profile CPU, just all the other ones like Memory, UI, etc.
     
  4. Ng0ns

    Ng0ns

    Joined:
    Jun 21, 2016
    Posts:
    197
    Closing down profiler modules seems to help a little while running in editor, but not much. Profiling the phone is not possible, it almost stalls.
     
  5. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Are you profiling the Player? Is it possible to file a bug with a project? I think slow down happens due to large amount of data generated by deep profiling and the pooling buffer in the player in not large enough to accommodate it. So we spend more time than usually on memory allocations.
    There is https://docs.unity3d.com/ScriptReference/Profiling.Profiler-maxUsedMemory.html api/cmdline parameter which might help in this situation if you set it to e.g. 256MB. By default the pooling buffer we use in Players is 4MB.
     
  6. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Is this happening when Deep Profiling is enabled in BuildWindow?
     
  7. Ng0ns

    Ng0ns

    Joined:
    Jun 21, 2016
    Posts:
    197
    Yes. "Development build" and "Deep profiling support" enabled.
     
    alexeyzakharov likes this.
  8. Lynxed

    Lynxed

    Joined:
    Dec 9, 2012
    Posts:
    121
    Yes, this is IL2CPP build with Deep Profile turned on.

    Sadly, i would not be able to file the project. Yes, the amount of stuff happening there is very big. Thank you, i will try the walkaround and let you know, if it fixes stuff.
     
    alexeyzakharov likes this.
  9. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Thanks!
    No worries, if I understand the issue correctly increasing the default buffer size for profiler should fix the slow down. I'll make a fix which increases default max allowed memory for profiler to 128MB in case of deep profiling.
     
    Peter77 likes this.
  10. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    @Lynxed one more question - if you disable deep profiling in the player using "Deep Profile" button, do you see better performance?
    (I can assume so from the shape of the screenshot you posted, but want to confirm my theory :))
     
  11. Lynxed

    Lynxed

    Joined:
    Dec 9, 2012
    Posts:
    121
    Hi! Tried this:
    upload_2019-12-20_20-5-3.png

    But in Deep Profile i still get the same result as before:

    upload_2019-12-20_20-16-55.png

    To answer your question: no. I measured the same method, which is not involved in generation process with non-Deep Profile (with Profiler.BeginSample-Profiler.EndSample) and here are the results:

    Before generation starts:
    upload_2019-12-20_20-19-9.png

    after generation started:
    upload_2019-12-20_20-22-30.png

    Although i have only 3 frames, where time is different, i think that is still a strange result.

    This method consistently does a number of of Vec3 and Quaterion operations per call, so nothing changes there during generation process.
     

    Attached Files:

    alexeyzakharov likes this.
  12. Lynxed

    Lynxed

    Joined:
    Dec 9, 2012
    Posts:
    121

    Attached Files:

    alexeyzakharov likes this.
  13. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Thank you for trying this out! The buffer limit has nothing to do with the slow downs then.

    This unfortunately points to the fundamental issue with deep profiling. I suspect there are a lot of calls to those methods.
    So my explanation would be that the slow down by design due to the nature of instrumentation profilers like Unity Profiler:
    1) Large amount of samples.
    2) Fixed cost per sample.
    The overhead from deep profiler is also significantly larger compared to the normal profiler beign/end api calls. It is usually 0.1-0.8us, so 10K calls can easily contribute to the 1-8ms perf frame. (Normal profiler begin/end api cost is ~0.01-0.02us).
    We have an idea how to decrease a fixed cost for deep profiling to be on pair with simple api, and have a dynamic instrumentation to skip profiling of high frequency loops. But ultimately instrumentation profilers fundamentally do not fit some use cases.

    I can suggest to use a sampling profiler to get a performance picture without an invasive instrumentation.
    For this particular case I've added an integration with Intel® VTune™ Amplifier of Unity Editor and Player x64 for Windows for 2019.2. There is a small guide available as a doc we are told we can share :). We use it mostly internally, but if you find it useful I can add improve it and add to the manual.
     
    Last edited: Dec 23, 2019
    Peter77 likes this.
  14. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Did it work better(faster) in the Editor in the previous versions of Unity when profiling Playmode?
    Perhaps we got a performance regression in deep profiling our tests didn't surface. I'm asking because we didn't change actually Deep Profiling mechanism itself in 2019.3, and only added support to players.
    If so, do you mind filing a bug?
     
  15. Lynxed

    Lynxed

    Joined:
    Dec 9, 2012
    Posts:
    121
    Thank you for all the help! This doc is very interesting.

    It's not about speed for me, it's about inconsistent evaluation of time for samples and inadequate sorting in Hierarchy.
    Sadly, i don't really have time to dig deeper right now, but what i noticed is Timeline chart makes more sense then Hierarchy mode. Can it be about Hierarchy mode making weird sorting and combining of samples?

    I would not be able to recompile the project in 2019.2 now, but if i'll have the time (or when the perf will become an unacceptable issue and profiling will be impossible), i'll try to file a bug report.
     
  16. Lynxed

    Lynxed

    Joined:
    Dec 9, 2012
    Posts:
    121
    A question: What are the best practices for profiling coroutunes and async methods? I use UniTask library for my stuff, where i can await UniTask.Yield(), which has similar effect to "yield return null" in an IEnumerator. Those things break Profiler.BeginSample(). Should we never have such things between Profiler.BeginSample() - Profiler.EndSample() or is there more convenient way? Oftentimes you would have yield return null in a loop, which you want to profile.
     
  17. Ng0ns

    Ng0ns

    Joined:
    Jun 21, 2016
    Posts:
    197
    I know there's a large overhead while running the deep profiler in editor mode, but I don't remember it being this bad. At the moment my project runs around 10 fps when launched for ~5 seconds before it becomes somewhat responsive again (most likely due to my game populating object pooler).

    I just don't remember it being this bad, like ever, but I guess it could be me.
     
  18. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    2,456
    How many of these scripting threads are you spinning up in those frames with the slow down. Maybe you're just hitting thread-contention and the main thread is forced off the core while one or more of these threads takes center stage? Your Job worker threads seem relatively idle and their count is tuned to the amount of available cores so if you have more scripting threads than job threads, maybe it's just too many.

    You seem to be occupying these for a similar amount of frames as the spoke on the main thread lasts too.