Search Unity

  1. Unity Asset Manager is now available in public beta. Try it out now and join the conversation here in the forums.
    Dismiss Notice

Profiling.Recorder?

Discussion in 'Experimental Scripting Previews' started by Vanamerax, Sep 24, 2016.

  1. Vanamerax

    Vanamerax

    Joined:
    Jan 12, 2012
    Posts:
    938
    Hi there!

    I just noticed that a new thing appeared on the roadmap, namely a new Profiling API:



    Can someone from UT explain this a bit more? Is there going to be an experimental build for this anytime soon? Very interested in this new feature!
     
  2. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Hi!

    We are exposing builtin profiler counters you can see in the Profiler Window to runtime - UnityEngine.Profiling namespace.

    Provided API is very simple in the first iteration and includes:

    Code (CSharp):
    1. class Sampler
    2. {
    3.     public string name;
    4.     public Recorder GetRecorder();
    5. }
    6.  
    7. // Accumulator, every time sample activated it adds its value to the recorder.
    8. public class Recorder
    9. {
    10.     public bool enabled;
    11.     public long elapsedNanoseconds; // Accumulated time for the previous frame
    12. }
    13.  
    14. class Profiler
    15. {
    16.     ...
    17.     // Get builtin Sampler
    18.     static public Sampler GetSampler(string name);
    19.     // Return all available sampler names
    20.     static public int GetSamplerNames(List<string> names);
    21. };
    Recorder allows to get accumulated time of Begin/End pairs of the specific profiler label for the previous frame.
    This information is equivalent to the one you can see in Hierarchy or Timeline view of Profiler Window.

    The usage example is:
    Code (CSharp):
    1. using UnityEngine;
    2. using UnityEngine.Profiling;
    3.  
    4. public class ExampleClass : MonoBehaviour
    5. {
    6.     Recorder behaviourUpdateRecorder;
    7.     ExampleClass()
    8.     {
    9.         var sampler = Profiler.GetSampler("BehaviourUpdate");
    10.         behaviourUpdateRecorder = sampler.GetRecorder();
    11.         behaviourUpdateRecorder.enabled = true;
    12.     }
    13.     void Update()
    14.     {
    15.         if (behaviourUpdateRecorder.isValid)
    16.             Debug.Log("BehaviourUpdate time: " + behaviourUpdateRecorder.elapsedNanoseconds);
    17.     }
    18. }

    For now only development player will be able to collect data, but further the plan is to enable high-level metrics like BehaviourUpdate, Physics/AI/.. update, loading later in non-development players.
    The feature is on the way to trunk and hopefully will be available in the alpha builds - we are targeting 5.6.
     
  3. Vanamerax

    Vanamerax

    Joined:
    Jan 12, 2012
    Posts:
    938
    Thanks a lot for your explanation! This looks like a very easy to use and useful API. I have one critical question for this though: Will this API be thread safe? I myself would really like to profile methods of background threads, however the profiler currently does not support this. Are there any plans for this or to change that?
     
  4. Abenthum

    Abenthum

    Joined:
    Mar 2, 2016
    Posts:
    18
    +1 for multithreaded profiling: currently we built our own system to profile background logic threads.

    Another use case where API improvement could help us: We currently capture snapshots in our benchmarking system, but to display the snapshot in the benchmarking result it requires a roundtrip through unity editor to load and process the snapshot. What we actually want is to capture a full snapshot (including user-defined sections) in a player and convert it into a human-readable format right away.
     
  5. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Thanks for the questions!
    Yes elapsedNanoseconds property is thread safe.
    But it returns the accumulated values (across all threads) of all Begin/End pairs for the specific profiler builtin label in the previous frame.
    E.g. in situation like:
    Code (CSharp):
    1. Main thread:     1           2 B-E 3
    2. Worker Thread 1: 1 B-E B---E 2     3
    3. Worker Thread 2: 1   B-E     2     3
    4. Worker Thread N: 2      B----2-E   3
    In frame 2 elapsedNanoseconds will be equal to the sum of 3 BE delta times and delta time between B and frame boundary. In frame 3 it is BE + frame 2 start to E time.

    User-defined sections (dynamic samplers generated by Profiler.BeginSample/EndSample) are not supported in the first iteration. Also this functionality will be deprecated and replaced by Begin()/End() methods in Sampler class (with Sampler<T1,T2,...>.Begin(T1, T2, ...) for custom data later on). We are currently working on it.

    It would be nice to learn about your use-case more - How do you use the data? Do you hot-tune level based on immediate profiler data? Do you really need full snapshot to be in the player?
     
    Last edited: Sep 27, 2016
  6. Vanamerax

    Vanamerax

    Joined:
    Jan 12, 2012
    Posts:
    938
    Thanks for your explanation Alex, really appreciate it! Another question related to multithreaded profiling (while we're at it):

    Last time I checked, Profiler.Begin/EndSample couldn't be called from background threads. Are there any plans to change that? Ideally, I would like to see the measurements pop up in the profiler timeline view like they already do for Unity's native Job scheduler. Like, additionally display the managed threads.
     
  7. alexzzzz

    alexzzzz

    Joined:
    Nov 20, 2010
    Posts:
    1,447
    As far as I understand, the built-in profiler is the only tool to profile code in Unity. It's great most of the time, but particularly for code profiling it's not. The UI/UX is far behind such tools as ANTS Performance Profiler or JetBrains dotTrace.

    Will "deep profile" be supported? Are there plans to add any extra information to the samplers, like corresponding source file names and line numbers? Perhaps, the general question is - are the planned profiler enhancements will make it possible to make a better viewer for code profiling results?
     
    Last edited: Sep 28, 2016
    Vanamerax likes this.
  8. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    We are working on a Profiler improvements which will allow capturing data from managed threads as well. However because of UI/UX challenges, you might need to express interest in the particular thread by calling API like Profiler.SetThreadName(string groupName, string threadName) in order to have view uncluttered (there might be dozens of threads).

    Yes, deep profiler will be still supported. It might also get an upgrade in form of lightweight mode where you can set specific methods you want to instrument - so far we have a prototype for Window/OSX+Mono configuration.
    Short-term plan is to extend generic samplers with an arbitrary (but strongly typed) data.
    Talking about hierarchy/timeline view improvements - yes, this feature is on our radar as "Improve Hierarchy and Timelime view interaction - double click or right click on an overview entry opens the script at the position".
    These improvements might make profiling experience better.
     
  9. Vanamerax

    Vanamerax

    Joined:
    Jan 12, 2012
    Posts:
    938
    Thank you! This sounds wonderful!

    As far as I know, the .net Thread class does have a Name property, isn't that an option to populate the displayed name field in the profiler view? Or do you mean to only select a few of the managed threads to display in this way? In the latter case, I would suggest to use a different naming scheme like Profiler.RegisterThread to make this intention more clear.

    Lastly, do you have any estimate of when we might expect an experimental build for this? I'd be glad to provide feedback once you guys got a proof of concept!:)
     
    alexeyzakharov likes this.
  10. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Yes, the purpose of this API is to let you explicitly express whether or not you want to have this thread being displayed in the UI. Profiler.RegisterThread/UnregisterThread makes it clearer.

    Recorder functionality will be available with an alpha build of 5.6. But other improvements are still in development and, unfortunately, we can't say exact day yet, as we have planned those for 5.7. We'll definitely keep you posted - once we have a stable experimental build :) we can definitely share it.

    Thanks for your interest!
     
  11. 00christian00

    00christian00

    Joined:
    Jul 22, 2012
    Posts:
    1,035
    Thanks! These will be a lifesaver.
    Can you also add a clear on play button just like the console? It's quite useless 99% of the time to have data from the previous run.
     
    alexeyzakharov likes this.
  12. liortal

    liortal

    Joined:
    Oct 17, 2012
    Posts:
    3,562
    Is this something that is already available in the beta right now ?
     
  13. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Added ClearOnPlay button to our todo list.

    Recorder API will be available in 5.6 alpha 1.
     
    Vanamerax and 00christian00 like this.
  14. ytrew

    ytrew

    Joined:
    Feb 22, 2013
    Posts:
    20
    Need a little guidance here. I am working with the new API here to collect and post process these metrics, but I am seeing labels that are not included in profiler window and data that does not match up quite as well as I would have hoped. As an example:

    I collect the metrics and then look for hotspots (samplers with the most time spent.), idle reports as 8.6%, while the profiler Window tells me that the most time spent is in “BehaviourUpdate”.

    I have a few questions regarding the metrics:
    • When running a workload (game, etc.), is it possible that a recorder becomes valid or invalid? If so is there a recommended approach to determine when this occurs?
    • I see “Idle” as a sampler metric, but do not see idle in the profiler window. Are there other samplers that are not available in the profiler that are in the Profiler?
    • Does the profiler capture editor metrics? I see samplers such as “GuiView RepaintAll” even with nothing in the scene.
    • On the heels of the previous question, is it possible to samplers into something more meaningful in order to profile only certain segments (i.e. CPU, GPU, etc.)?

    Thank you so much for the assistance, I look forward to seeing the API grow and mature
     
  15. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Hi,
    Thanks for the feedback - this is very valuable for us!

    Recorder is valid, when there is correspondent label created at the time of Recorder.Get call. There are 2 use cases for that to consider:
    1. Static labels which are available immediately after engine init. We are working on making high level subset of the labels available in non-dev release builds (such as BehaviourUpdate, Camera.Render, etc.). For that we need a fail-safe if API is used with dev-player label. If static label is not available at startup - it won't be available any further.
    2. Dynamic labels. Currently these are the ones created internally when calling C# methods (like MonoBehaviour.Update for the specific game class) or ones created with Profiler.BeginSample(string) or CustomSampler.Create calls. You can get recorder for such labels only after they were actually created/used. (Also Profiler.BeginSample/EndSample isn't useful for Recorders as it has no information about the label in the EndSample call).
    We are adding API to get label type so you can see which labels are "static" and which are "dynamic" in the future.

    "Idle" is an internal label which is used to measure Job Queue Worker Thread activity. Recorders collect information from all threads. Thus you can't see "Idle" labels in Hierarchy View yet, but you can see them in the Timeline View.
    We'll use a better name for it.

    Yes, recorders are intended to be independent from Profiler. In the Editor you can also capture Editor-only metrics, which again will be annotated as such later. "GuiView RepaintAll" is the Editor label.

    There are no GPU metrics exposed yet, because they introduce a significant overhead to the render pipeline. However, we might add some very high-level metrics in the future. For now all recorders measure CPU time.
    Making the labels more meaningful is definitely on our list - cleaning up names and providing tooltip with relevant information is very important.
     
  16. ytrew

    ytrew

    Joined:
    Feb 22, 2013
    Posts:
    20
    Thanks for the reply. I have a few more items :)

    Tooltip information would be great and subsets would really help when the desire is to profile only specific areas. Also, it would be nice to get some sort of functionality to determine when new dynamic labels become available, I’m not sure if polling for recorder validity makes the most sense. I guess when subsets are added it would be nice to simply query a specific subset such as “BehaviorUpdate” and retrieve all MonoBehaviour labels. Also, I have to keep around two lists (Labels and Recorders) to make the correlation between label and data and output anything meaningful. Would be better if the recorder kept a string of the label name. As it currently stands, my approach is to:
    1. Create a list of all available labels (and routinely check for new additions)
    2. Retrieve each labels recorder
    3. Query for recorder validity
    4. Get data for each valid recorder
    5. Output data using the list of labels and recorders
    If you have any tips are advice, it is much apricated.
     
  17. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Yes, polling doesn't do anything right now as if it was invalid during Get call, it stays invalid.
    We can potentially do precreation of labels when you query recorder.
    Another consideration is that we didn't plan recorders as a replacement for profiler. As they are independent, we don't want to make slower fast path when profiler is disabled. E.g. to get MonoBehaviour labels we have to get script name and actually create a Sampler class on a native side regardless of if we plan to use Recorder or not. That creates an overhead and memory overuse. 95% of dynamic labels are generated only when Profiler is enabled.
    I completely agree that ideally you should get Recorder and then if the data is available, just query it. But it is tricky to do it without performance impact right now.

    We tried this approach before and decided to remove name property from Recorder. The rationale behind that was that Recorder itself as well as all labels were stored on a native side. And every time we pass a string from native to managed side we do a managed allocation. And that's very noticeable when you have e.g. onscreen HUD which displays some metrics and store only Recorders querying name every frame for rendering. So in this case API forces to use Tuple or two lists to avoid GC allocations :)

    We are working on a sample Profiler HUD project that uses Recorders to display main performance in the game. We'll share it once ready.
    Generally main use case for this API is to get overview what's going on during the frame. How much time subsystems use and what are the bottlenecks when game is played. Especially release games.
     
  18. ytrew

    ytrew

    Joined:
    Feb 22, 2013
    Posts:
    20
    Thanks for the feedback, the performance considerations make sense :) Looking forward to see the Profiler Project and new changes as the become available. Is it possible top get a timeline on when these may be dropping (subsets for labels, high-level GPU stats, HUD project, etc.)?
     
  19. benblo

    benblo

    Joined:
    Aug 14, 2007
    Posts:
    476
    Just discovered the Recorder/Sampler and CustomSampler classes, nice!

    @alexeyzakharov One thing I very frequently find myself wanting is RAII samplers, for this kind of cases:
    Code (CSharp):
    1.     void bla()
    2.     {
    3.         Profiler.BeginSample("bla");
    4.  
    5.         // do something...
    6.  
    7.         if (a)
    8.         {
    9.             Profiler.EndSample();
    10.             return;
    11.         }
    12.  
    13.         // do something...
    14.  
    15.         if (b)
    16.         {
    17.             Profiler.EndSample();
    18.             return;
    19.         }
    20.  
    21.         // do something...
    22.  
    23.         Profiler.EndSample();
    24.     }
    Keeping track of all the EndSamples before each return is tedious.

    One (bad) way is to wrap the method body in another method, but that's awfully intrusive for profiling code.

    Ideally I'd like to simply do this:
    Code (CSharp):
    1.     void bla()
    2.     {
    3.         var sample = new ProfilerSample("bla");
    4.  
    5.         // do something...
    6.  
    7.         if (a)
    8.         {
    9.             return;
    10.         }
    11.  
    12.         // do something...
    13.  
    14.         if (b)
    15.         {
    16.             return;
    17.         }
    18.  
    19.         // do something...
    20.     }
    ... and sample would automatically EndSample when falling out of scope... but C# has no proper RAII (doesn't call Dispose automatically).

    The next best way (still intrusive) is to wrap the code in a using block:
    Code (CSharp):
    1.     struct ProfilerSample : IDisposable
    2.     {
    3.         bool disposed;
    4.         public ProfilerSample( string name )
    5.         {
    6.             disposed = false;
    7.             Profiler.BeginSample(name);
    8.         }
    9.         public void Dispose()
    10.         {
    11.             if (!disposed)
    12.             {
    13.                 Profiler.EndSample();
    14.                 disposed = true;
    15.             }
    16.         }
    17.         public void End()
    18.         {
    19.             Dispose();
    20.         }
    21.     }
    22.  
    23.     void bla()
    24.     {
    25.         using (new ProfilerSample("bla"))
    26.         {
    27.             // do something...
    28.  
    29.             if (a)
    30.             {
    31.                 return;
    32.             }
    33.  
    34.             // do something...
    35.  
    36.             if (b)
    37.             {
    38.                 return;
    39.             }
    40.  
    41.             // do something...
    42.         }
    43.     }
    I feel like something like this should be built-in... and if some UT wizard can find a magic trick to do it without a using block, that would be even better :) !
     
    alexeyzakharov likes this.
  20. superpig

    superpig

    Drink more water! Unity Technologies

    Joined:
    Jan 16, 2011
    Posts:
    4,659
    IIRC, the reason we don't already have this is that it makes it impossible to strip out the profiler instrumentation in release builds.
     
  21. liortal

    liortal

    Joined:
    Oct 17, 2012
    Posts:
    3,562
    In which Unity version is this new API included?
     
  22. superpig

    superpig

    Drink more water! Unity Technologies

    Joined:
    Jan 16, 2011
    Posts:
    4,659
    2017.1.
     
  23. nzhangaudio

    nzhangaudio

    Joined:
    May 26, 2015
    Posts:
    20
    It's been around two years. What's the plan regarding having this in non-development build? I'd like to do some scripting that's reactive to the performance.
     
    alexeyzakharov likes this.
  24. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    2019.1 there will be at least 2 markers available in release players - "GC.Collect" and "Gfx.WaitForPresent".
     
  25. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    Has anyone got code examples for using Recorder with CustomSampler in-editor?

    e.g. this page specifically states they should work together:
    https://docs.unity3d.com/ScriptReference/Profiling.CustomSampler.html

    ...but CustomSamplers that all work 100% as expected when viewed in the Profile window do ... nothing ... when I try to read their values in script using the Recorder class, e.g. like this:

    Code (CSharp):
    1.  
    2. CustomSampler s = CustomSampler.Create("FOO");
    3. ...
    4. s.Begin();
    5. ...
    6. s.End(); // shows up as expected in ProfileWindow
    7. ...
    8. Recorder.Get( "FOO" ); // this is ALWAYS dead, has no data.
    9.  
     
  26. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Yes, we use Recorder for performance testing of development builds internally - that is the primary function so far :)
    Did you try example provided in https://docs.unity3d.com/ScriptReference/Profiling.Sampler.GetRecorder.html ?
    This returns a Recorder object which should be enabled.
     
  27. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    Sorry, when I said "in-editor" I meant: fully in-editor, not runtime -- i.e. I am profiling editor scripts.

    Profiling editor scripts works fine with the Profiler window running and manually triggered, but does not work for me otherwise.
     
  28. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Ok, sorry, I think the scripting API example missing one important point - the recorder returns time for the previous frame (or immediately when disabled). This is because there are internal engine markers which might contribute to the accumulated value after any scripting method is executed in the current frame.

    If you modify your example to
    Code (CSharp):
    1.  
    2. CustomSampler s = CustomSampler.Create("FOO");
    3. Recorder r = s.GetRecorder();
    4. r.enabled = true;
    5. ...
    6. s.Begin();
    7. ...
    8. s.End(); // shows up as expected in ProfileWindow
    9. ...
    10.  
    11. r.enabled = false;
    12. // use r.elapsedNanoseconds;
    13. r.enabled = true;
    14.  
    or just use
    r.elapsedNanoseconds;
    next frame everything should work.
     
    Ardenian and a436t4ataf like this.
  29. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    Aha! That would certainly explain it. I'm in San Francisco until next week, but I will try this when I get home.

    I've added a suggestion to the docs page that it be updated to include your change - now that you've described it, it makes sense, but it's not obvious unless someone tells you.
     
    alexeyzakharov likes this.
  30. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    thank for the suggestion! I can see this being confusing - the main reason why this is the value for the previous frame is that some in-engine activities can be done after all scripts are executed, so there is no way to get the final result for the current frame.
    We are changing the Recorder api to accommodate other stats such as draw stats and add an explicit property "immediateValue" if you want to read the current value.

    Great, let me know if you have any troubles with using it.
     
  31. Ardenian

    Ardenian

    Joined:
    Dec 7, 2016
    Posts:
    313
    Hello, just a quick heads up that I have also run into the issue that confused @a436t4ataf, which I was not able to resolve until I read the posts in this thread.

    I would have greatly appreciated if the example that you posted here, @alexeyzakharov, was available in the documentation, as well as including the explanation that you gave in your previous post, therefore I hope that the documentation is going to be updated in the future.
     
    alexeyzakharov and a436t4ataf like this.
  32. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    @alexeyzakharov your correction worked fine. I am now able to test in-editor performance timings inside Unity unit-tests (currently using 2019.2 + UTF v 1.8). Because the API doesn't make much sense at the moment (very easy to misunderstand), I wrapped Sampler+Recorder into a new struct which works more logically. (I'll be refactoring it soon, but the essence is there: a create method, a begin, an end, and a "time elapsed in milliseconds").

    For instance:

    Code (CSharp):
    1. struct SampleRecorderPair
    2.     {
    3.         public CustomSampler sampler;
    4.         public Recorder recorder;
    5.         public long lastNanos;
    6.  
    7.         public static SampleRecorderPair Create( string s )
    8.         {
    9.             SampleRecorderPair srp = new SampleRecorderPair();
    10.            
    11.             srp.sampler = CustomSampler.Create( s );
    12.             srp.recorder = srp.sampler.GetRecorder();
    13.             srp.lastNanos = 0;
    14.            
    15.             return srp;
    16.         }
    17.  
    18.         public void Begin()
    19.         {
    20.             recorder.enabled = true;
    21.             sampler.Begin();
    22.         }
    23.  
    24.         public long End()
    25.         {
    26.             sampler.End();
    27.             recorder.enabled = false;
    28.  
    29.             lastNanos = recorder.elapsedNanoseconds;
    30.             return lastNanos;
    31.         }
    32.  
    33.         public long milliseconds
    34.         {
    35.             get { return lastNanos / 1000000; }
    36.         }
    37.     }
    38.  
    39. [UnityTest]
    40.         public IEnumerator Test_SlowOperation()
    41.         {
    42.             ... // setup my test here
    43.            
    44.             SampleRecorderPair srpStamp = SampleRecorderPair.Create( "Stamp" );
    45.             srpStamp.Begin();
    46.             slowComponent.LongRunningMethod(); // the thing we're testing
    47.             srpStamp.End();
    48.             Debug.Log( "stamp time millis = "+srpStamp.milliseconds );
    49.             Assert.Greater( 10, srpStamp.milliseconds, "This operation shouldn't take more than 10 milliseconds at worst" );          
    50.         }
     
  33. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    .. However, it doesn't work for the Unity Inspector :(.

    My guess is that this is only a weakness of Unity's current Unit Testing Framework - it will not render Editor controls (Inspectors, Scene view, etc), effectively running headless at all times - so that it cannot be used to test custom Inspectors. (this is great as an option - headless tests run faster! - but it's unfortunate that there's no option to turn it off, and run tests accurately).

    i.e. not an issue with the Samplers/Recorders -- I'm guessing that if UTF had an option to run identically to the Editor, then the Samplers would faithfully record the data without making any changes?

    For now, I'm planning to discard the UTF for these kind of Editor tests (and hope it gets more features / becomes open-source next year) and use our own proprietary unit-testing that Just Works (but cannot run headless! So runs a lot slower than UTF if you don't need to accurately measure Editor performance).
     
  34. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    I've finished integrating, and it's working great. I now have this setup:

    1. Important classes use CustomSampler.Create("special-name") and .Begin() / .End() to internally profile
    2. When working in Editor, I can use the Profiler window to analyse those
    3. When working in Editor and the Profiler window hangs (happens most of the time in Unity 2019 :(), I can run a script that does the profiling instead *without having to write new code*
    4. Profiling-Script uses the identical CustomSampler.Create("special-name"), which gives you a reference to the shared sampler, and lets you peek at the values (without touching the profiled class at all, which is perfect)
    5. I can write Unit Tests with UTF v1.8 that use the same process to not only test for correctness, but test for performance - again, without making any changes to the classes being tested! I do: "Assert.Greater( 1000000, recorder.elapsedTimeNanoseconds, "This segment shouldn't take more than 0.1ms to execute")"

    To be clear, the API has a poorly named method in CustomSampler (although the docs correctly explain what it does, in contrast to the name). CustomSampler.Create does NOT "create" at the API/conceptual level, it instead "attaches" to a shared sampler (which is created if it doesn't already exist? I guess that's why it got given the wrong name in the API) ... basically: Create implements Get/Fetch/Attach. This is why the above setup works: it doesn't matter which class calls Create first, or how many times it gets called, Unity correctly shares the data between all methods/classes that ever called Create with the same string argument.

    ...and if that doesn't seem to work, you've made a mistake in your code, like I did the first time around :). Because the Unity parts of this definitely work, everything is fine.
     
    alexeyzakharov likes this.
  35. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Thank you very much for the feedback! Yes, it is not really user friendly and I'm going to fix it.
    I've added it for a different use case - to extract internal engine timings for performance tests, but apparently it can be also used for just timing a code with pretty good accuracy and low overhead. (Originally I thought that if you want to measure time Stopwatch is what you can use.) Having a wrapper like you created is totally useful.

    AFAIK the headless mode is controlled by -batchmode - if you don't use it you should run tests fine. Also if you want to test UI performance you also have to ensure that the window is visible (
    wnd.Show()
    ), positioned correctly with a fixed size (
    wnd.position = new Rect(0, 0, ...)
    ) preferably not docked (wnd.RemoveFromDockArea()). We use this approach for the Profiler Window performance tests :)

    Yes, agree. The term we use in native code is GetOrCreate. In C# the issue is a bit complicated due to domain reload. "Create" part happens in the player of in a fresh scripting domain in the Editor. Because profiler markers are never deleted after domain is reloaded we actually use "Get" to retrive already created marker.
    Note that CustomSampler is going to be replaced by ProfilerMarker in the future providing simpler api and ability to do markup in burst code.

    So, I'm working on a replacement for the Recorder api you are using which would also provide runtime access to other stats (like draw stats, memory stats, etc.).
    Would it be easier to understand API if we had something like?
    Code (CSharp):
    1. unsafe struct ProfilerStat
    2. {
    3.   public ProfilerStat(string counterOrMarkerName);
    4.   public ProfilerDataType ValueType;
    5.   public void* Value { get; } // Value at the end of the last frame/reset
    6.   public void* ImmidiateValue { get; } // Current value
    7.   public T GetValue<T>() { return *(T*)Value; }
    8.   public int FlushCount;
    9.   static ProfilerStat StartNew(string counterOrMarkerName);
    10.   public void Start();
    11.   public void Stop();
    12. }
     
    a436t4ataf likes this.
  36. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    Stopwatch is OK, but ... I found it too inaccurate for 95% of cases. The times when you need to profile from a script, you usually are doing small tight loops (e.g. with unit testing), or you're doing complex cases where you need to understand where the time was spent in a fine granularity (e.g. integration test, e.g. code-optimization), or the Profiler window is crashing, and you need high accuracy.

    Also: Stopwatch gets included in production builds, if I remember correctly? That means lots of extra work in doing #define's to add/remove all your stopwatch code - whereas the Sampler/Recorder system I can leave in the codebase (cleaner, easier to maintain).

    If I understand correctly, usage would be something like:

    Code (CSharp):
    1.  
    2. public void SlowMethod()
    3. {
    4.  var totalTime = ProfilerStat.StartNew( "SlowMethod_Total");
    5.  ... // do stuff
    6.  totalTime.Stop();
    7.  }
    8.  
    9. public void ScriptedPerfTest()
    10. {
    11.  SlowMethod();
    12.  SlowMethod();
    13.  SlowMethod();
    14.  ...
    15.  Debug.Log( "Total time taken for 3 runs = " + (new ProfilerStat("SlowMethod_Total")).Value );
    16. }
    17.  
    ...where "SlowMethod_Total" now appears in the Profiler window automatically, and we can optionally call ScriptedPerfTest from an Editor test (or from any editor code) to separately measure + report results.

    If so, looks good. Two suggestions:

    1. It would be nice to have a static method for 3rd party codepaths to use to peek at the results. e.g. my final line of code in ScriptedPerfTest - it only wants to interact with the profiler in order to read the results (it's not going to start or stop anything; it's a read-only use-case).

    e.g. static ProfilerTest Get(string counterOrMarkerName); // possibly a singleton, integrated with the constructoror ... e.g. static T GetValue<T>( string counterOrMarkerName ); // this is more precise: the use-case only needs the Value, it doesn't need the full ProfilerTest struct!

    2. The naming of .Value and .ImmediateValue is non-obvious; I would prefer if the names were merged with the single-line comments you've inserted above, so that anyone reading code that uses these methods immediately understands which one is which, and why the code is using it.

    e.g. ValueForLastPeriod + ValueInstant
    or e.g. ValueCompleted + ValueSnapshot
    or e.g. ValueFullFrame + ValuePartialFrame

    (if I understand correctly: ValueImmediate MIGHT have the same value as Value, but will often have a different value, depending on what moment you invoke it. In most cases, you expect people to use Value, because that represents a stable, "completed frame" number, but there may be special cases where people need to use ValueImmediate, to look at a long-running profiled spot that hasn't completed yet)

    NB: I am only guessing at the meaning of ValueImmediate here, I'm imagining something like: you have a frame that takes more than a few seconds (!), and this enables the Profiler window to give a live display of where that time is going even while the main engine hasn't finished drawing the frame yet. That would be cool! I have had a lot of situations during development where things do actually take many seconds to do a single frame, and if I could profile them, it would have made life easier.

    (e.g. worst case was an algorithm that took 20-120 seconds *per frame*. Profiling it and optimizing it down to milliseconds per frame was a very slow process, it took 10-15 minutes to get a single profile)
     
    alexeyzakharov likes this.
  37. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    This is bad :oops: and I want to fix this!

    Yes, but you can wrap it into a struct/class with Conditional attribute. But I guess the main reason to not use it is that it costs depends on scripting runtime implementation.

    I would do it more like
    Code (CSharp):
    1. static readonly kSlowMethodMarker = new ProfilerMarker("SlowMethod_Total");
    2.  
    3. public void SlowMethod()
    4. {
    5. kSlowMethodMarker.Begin();
    6. ... // do stuff
    7. kSlowMethodMarker.End();
    8. }
    9.  
    10. public void ScriptedPerfTest()
    11. {
    12. var totalTime = ProfilerStat.StartNew(kSlowMethodMarker);
    13. SlowMethod();
    14. SlowMethod();
    15. SlowMethod();
    16. totalTime.Stop();
    17. ...
    18. Debug.Log( "Total time taken for 3 runs = " + totalTime.Value);
    19. }
    20.  
    The reason why is that the time counting has a minimal impact on the code itself (5-50ns) -
    ProfilerStat.StartNew(kSlowMethodMarker);
    is much slower to to getting the marker by string each time.

    The first suggestion is great! I'll consider generic
    ProfilerStat<T> where T : unmanaged
    . Though we need non generic version to be able to make something like in-game perf stats window where you can add different metrics and collect/visualize those.

    For the second one - what about ValueLastFrame + ValueValueImmediate? Immediate value becomes ValueLastFrame at the end of CPU frame. In the beginning of a frame some stats are reset to 0. E.g. timings of Camera.Render, GC.Alloc count, etc. And due to some of those can be incremented outside of your scripting code (after LateUpdate at very end of the frame) there is no way to know/use those and not impact the values itself.

    Yes, immediate value could give a hint if something takes too long and should be e.g. aborted or moved to the next frame.
     
  38. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    Sorry, to be clear: by "crashing" I mean "takes so long to render a single frame that I have to give up and force-quit Unity". It rarely crashes (I crashed it last week. Looking at windows Task Manager, the Profiler ran out of RAM (it was at approximately 40 GB of RAM usage when it ran out ... for recording approximately 5 frames of data, but one of those frames had a 20 second render time without the profiler :)).

    In general, the profiler window is currently very stable, but some parts are very inefficient, and some of its rendering is hanging the editor for 30 seconds up to multiple minutes. Compared to the 5.x/2017 profiler, it is 1000x better, but it feels like some small parts of the render loop are still using Unity 5.x style O(n^10) rendering ;) or similar.

    Oops, yes, that's what I would do too :). I wrote my example too quickly.

    Sounds good. Would be even better if it could capture the fact that ValueLastFrame is representing a full frame, e.g. - "ValueOfLastFrame".

    Or would it be accurate to have:

    "ValueLastFrameTotal / ValueIntermediate"
    "ValueLastFrameTotal / ValueThisFramePartial"
    "ValueLastFrameTotal / ValueThisFrameIntermediate"

    Or even omit the "Value", take that as implied from the return type:

    "LastFrameTotal / ThisFrameIntermediate"

    Reasoning:
    • I prefer Intermediate to Immediate, because it makes clear this is a not-quite-ready value, its a partial result
    • (Immediate often implies "good, better" in performance situations :), where Intermediate implies "immediate but incomplete")
    • Total vs Partial appears to summarise the fundamental difference between the values (if I understand correctly)
    • (it is then obvious to the reader that "this-frame-partial" is going to become "last-frame-total" at the end of the frame)
     
    alexeyzakharov and MartinTilo like this.
  39. Lockthav

    Lockthav

    Joined:
    Dec 6, 2018
    Posts:
    6
    Hello Alexey!

    Trying to make a tool (editor-only) to track my Particle System's performance.
    Doing something like:

    Code (CSharp):
    1. Recorder.Get("PreLateUpdate.ParticleSystemBeginUpdateAll")
    2. Recorder.Get("PostLateUpdate.ParticleSystemBeginUpdateAll")
    3. // And so on...
    But I feel like tracking all those hard-coded samplers is not the best approach.

    There's a way to do something like
    Code (CSharp):
    1. Recorder.Get("ParticleSystem")
    and retrieve all samplers related to Particle Systems?

    Thank you!
     
    alexeyzakharov likes this.
  40. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    No, but you can use Sampler.GetNames in a meantime to get all available names and create recorders for those which contain "ParticleSystem" substring.
     
    Lockthav and a436t4ataf like this.
  41. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    @alexeyzakharov incidentallly ... today I ran into the "unity profiler window is useless because performance is multiple orders of magnitude slower than expected" bugs again. I have now scheduled a week to write an editor plugin that replaces the Unity official Profiler, because it's so awful I can't bear it any longer (and I'm glad the Profiler API's exist so that I can do that!).



    Summary:
    1. Start a scene with 5 small meshes in it, and nothing happening -- editor runs at 100+ FPS (I wasn't counting to be specific)
    2. Enable deep profiling, wait a few seconds for the decoration to happen -- editor @ 100+ FPS
    3. Click button in custom Inspector to trigger blocking action that takes 3.4 seconds WITHOUT profiler -- editor hangs ---- this is the code I need to profile/trace. It's about 5,000 lines of code, but the execution time is short.
    4. ... blocking action takes approximately 5-6 seconds instead of 3-4 (because of profiling overhead, I assume)
    5. *** Unity Profiler Window takes 10 Gigabytes of RAM per second until the machine runs out of physical RAM **
    6. Unity editor is back to normal, responding normally. Profiler Window is unresponsinve
    7. Wait ...
    8. ... wait ...
    9. .... ... wait
    10. ... After a minute or so, the Profiler Window starts responding to mouse-clicks again (and I can finally hit the Stop button)
    11. Unity editor is still running at 100+ FPS
    12. Profiler is running at ** less than one frame per 10 seconds **
    13. Now try to view any frame in Profiler
    14. ...
    15. ... Profiler now kills the Unity Editor. Cannot get any response to anything ...
    16. ...
    17. ... 10 minutnes later, after crying, still nothing happening, merely the insane Profiler window having a private party in my CPU cores ...
    18. Force-close Unity Editor just to get control of my computer back
    19. OR sometimes: wait another 5 minutes ... 10 minutes ... or more ... for the Profiler to "allow" the Unity Editor to continue running again.
    This, sadly, is the same bad behaviour of the Unity Profiler window we've all been experiencing since Unity 5. Sometimes you get lucky and the profiler runs "very slowly but only a 5-30 second wait to do work", and then sometimes - like today, on a trivial processing load - it runs "TENS OF MINUTES wait for it to do work, every time you click a mouse button". When you're trying to trace a bug, it makes you want to scream :).

    To be clear: I'm not expecting Unity to fix this. And it's against policy - I'm on the stable releases (2019.1, 2019.2) not the unstable ones (2019.3, 2020.1) - so I know there'll be no fixes. But the problem has been so very bad for so very many years that I've given up, and I'm sure I'm not the only one.
     
    alexeyzakharov likes this.
  42. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Sorry for that - we are well aware of the problem and working on making UI part scalable to support amount of data we can transfer.

    Yes, the timeline rendering part requires significant changes. We are aiming for the future releases, but also exploring a simple memory reduction fixes that can help in a meantime.
     
    a436t4ataf likes this.
  43. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    @alexeyzakharov - using this in unit-tests in 2019.4 isn't working any more :(.

    If I use your technique to run tests one at a time (double click each one in TestRunner) == it works as expected.

    However, if I allow Unity TestRunner to run all tests at once (double-click the class name, or double-click the word "Tests"), and the tests run fast, then all results from Recorder / Sampler are now wrong.

    Background / concrete example:

    I have reduced the test case down to: I am testing a method that should be a no-op (it's coded to "return;") - the test is "measure the execution time of these samplers for 1 run of this method.

    I'm asserting that the Recorder.elapsedNanoseconds < 500000 (i.e. 0.5ms)

    ...when double-clicking each test one by one: that assertion dosnt trigger, everything is fine.

    But when running all tests at once I get a series of massively too-large numbers, e.g. for one run of 7 test cases in the same class:

    6088000 ns
    5145000 ns
    (nothing - assertion passed)
    5248000 ns
    5290000 ns
    (nothing - assertion passed)
    5374000 ns

    Is this a bug? It seems to be something wrong with Unity's TestRunner - it (correctly) runs individual tests, but something goes wrong when it runs batches of tests.
     
  44. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    2022 update: I can't get this code to run any more. Whatever was changed in Unity 2020 LTS appears to have permanently broken it - everything always returns 0 (even when the Profiler window is showing actual times).

    And of course there's no way to debug it :(.

    I tried the "new" API that supposedly replaces this in 2021 - it's so intensely confusing I gave up. The docs were incomprehensible, and the class names made no sense at all.
     
    alexeyzakharov likes this.
  45. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    thanks for the reply @a436t4ataf ! sorry to hear your code was broken by 2020 LTS and the new api doesn't suit your needs.

    may I clarify what is the api you are referring to and what was confusing so that we can fix it :)
     
  46. a436t4ataf

    a436t4ataf

    Joined:
    May 19, 2013
    Posts:
    1,933
    I'm trying again now, recording my observations as I go.

    1. Start with 'unity profiling' on Google, arrive at: https://docs.unity3d.com/Manual/Profiler.html
    2. That page first of all tells us about the 'low level native plugin API' -- yeah, no. I'm sure that's wrong. (maybe?)
    3. The only other thing listed that sounds like 'allow you to profile code' is "common profiling markers"
    4. ... which goes to a page which immediately starts by saying its incomplete ("following tables explain what some of the common markers do")
    5. Read the whole thing. But it seems it's all Unity-internal stuff - great for people who want to profile Unity, useless for developers writing new code.
    6. ... basically that page is a nightmare. It's just endless rabbit-holes of different things, with (I realised afterwards) no hope of finding anything useful for profiling user-code. But the only way to find out is to do a LOT of reading, and following links, and reading those, and trying to understand each of them, only to decide "no, this isn't relevant either - it's yet another 'unity employees only' feature; great if I ever want to profile the UnityEngine, but not relevant today"
    7. The other useful-sounding links on the main Profiler page (linked above) redirect you to : https://docs.unity3d.com/Packages/com.unity.profiling.core@1.0/manual/index.html - a page which makes it sound like you HAVE to install a separate package to do any profiling (do you? Is that the new requirement - profiling is now not included in Unity any more? This is not mentioned in the main Unity docs as far as I can tell)
    8. ... it doesn't have any documentation on the main page.
    9. The 'Upgrade guide' is particularly sad: https://docs.unity3d.com/Packages/com.unity.profiling.core@1.0/manual/upgrade-guide.html
    10. ... after you finish crying, in desperation: try reading the API. Maybe we can reverse-engineer the docs by reading API method call names etc.
    11. ..ah! Right! Now we're getting somewhere - there's a bunch of docs here.
    12. ProfilerMarker. WTF?
    13. ... the linked API page for that says "See Recorder"
    14. ......and Recorder's page says only to use CustomSampler's
    15. ........which are the old API and (as noted above) don't seem to work any more.
    16. Go back to ProfilerMarker and try to understand it.
    17. Give up. It's circular references to things like Recorder that themselves don't seem compatible with it and it's impossible to figure out what is actually required, what is optional, what is going on.
    18. Try to reverse-engineer a working setup using ProfilerMarker. Nothing works, can't figure out why, have no idea what's missing, what's correct.
    19. Finally: give up.

    TL;DR: it seems there are now 3 (I think?) simultaneous profiling API's in Unity, and they all are slightly-related in that they share a little bit of code. But nowhere does it say this. And nowhere does it say which code they share, and which is necessary. Nowhere does it tell you which to use when, or why. Or how.

    TL;TL;DR: There's no 'getting started guide'. That's all that's really needed: "If you want to profile your own code, do this". That should be linked from the main https://docs.unity3d.com/Manual/Profiler.html page, and it should be clearly signposted (Because it's a common use case that is separate from everything else on that page).
     
    alexeyzakharov likes this.