Profiling.Recorder?

Vanamerax · Sep 24, 2016

Hi there!

I just noticed that a new thing appeared on the roadmap, namely a new Profiling API:

Can someone from UT explain this a bit more? Is there going to be an experimental build for this anytime soon? Very interested in this new feature!

alexeyzakharov · Sep 26, 2016

Hi!

We are exposing builtin profiler counters you can see in the Profiler Window to runtime - UnityEngine.Profiling namespace.

Provided API is very simple in the first iteration and includes:

Code (CSharp):

class Sampler

{

public string name;

public Recorder GetRecorder();

}

// Accumulator, every time sample activated it adds its value to the recorder.

public class Recorder

{

public bool enabled;

public long elapsedNanoseconds; // Accumulated time for the previous frame

}

class Profiler

{

...

// Get builtin Sampler

static public Sampler GetSampler(string name);

// Return all available sampler names

static public int GetSamplerNames(List<string> names);

};

Recorder allows to get accumulated time of Begin/End pairs of the specific profiler label for the previous frame.
This information is equivalent to the one you can see in Hierarchy or Timeline view of Profiler Window.

The usage example is:

Code (CSharp):

using UnityEngine;

using UnityEngine.Profiling;

public class ExampleClass : MonoBehaviour

{

Recorder behaviourUpdateRecorder;

ExampleClass()

{

var sampler = Profiler.GetSampler("BehaviourUpdate");

behaviourUpdateRecorder = sampler.GetRecorder();

behaviourUpdateRecorder.enabled = true;

}

void Update()

{

if (behaviourUpdateRecorder.isValid)

Debug.Log("BehaviourUpdate time: " + behaviourUpdateRecorder.elapsedNanoseconds);

}

}

For now only development player will be able to collect data, but further the plan is to enable high-level metrics like BehaviourUpdate, Physics/AI/.. update, loading later in non-development players.
The feature is on the way to trunk and hopefully will be available in the alpha builds - we are targeting 5.6.

Vanamerax · Sep 26, 2016

alexeyzakharov said: ↑

For now only development player will be able to collect data, but further the plan is to enable high-level metrics like BehaviourUpdate, Physics/AI/.. update, loading later in non-development players.
The feature is on the way to trunk and hopefully will be available in the alpha builds - we are targeting 5.6.
Click to expand...

Thanks a lot for your explanation! This looks like a very easy to use and useful API. I have one critical question for this though: Will this API be thread safe? I myself would really like to profile methods of background threads, however the profiler currently does not support this. Are there any plans for this or to change that?

Abenthum · Sep 27, 2016

+1 for multithreaded profiling: currently we built our own system to profile background logic threads.

Another use case where API improvement could help us: We currently capture snapshots in our benchmarking system, but to display the snapshot in the benchmarking result it requires a roundtrip through unity editor to load and process the snapshot. What we actually want is to capture a full snapshot (including user-defined sections) in a player and convert it into a human-readable format right away.

alexeyzakharov · Sep 27, 2016

Thanks for the questions!

wasstraat65 said: ↑

Will this API be thread safe?
Click to expand...

Yes elapsedNanoseconds property is thread safe.
But it returns the accumulated values (across all threads) of all Begin/End pairs for the specific profiler builtin label in the previous frame.
E.g. in situation like:

Code (CSharp):

Main thread: 1 2 B-E 3

Worker Thread 1: 1 B-E B---E 2 3

Worker Thread 2: 1 B-E 2 3

Worker Thread N: 2 B----2-E 3

In frame 2 elapsedNanoseconds will be equal to the sum of 3 BE delta times and delta time between B and frame boundary. In frame 3 it is BE + frame 2 start to E time.

Abenthum said: ↑

What we actually want is to capture a full snapshot (including user-defined sections) in a player and convert it into a human-readable format right away
Click to expand...

User-defined sections (dynamic samplers generated by Profiler.BeginSample/EndSample) are not supported in the first iteration. Also this functionality will be deprecated and replaced by Begin()/End() methods in Sampler class (with Sampler<T1,T2,...>.Begin(T1, T2, ...) for custom data later on). We are currently working on it.

It would be nice to learn about your use-case more - How do you use the data? Do you hot-tune level based on immediate profiler data? Do you really need full snapshot to be in the player?

Vanamerax · Sep 27, 2016

Thanks for your explanation Alex, really appreciate it! Another question related to multithreaded profiling (while we're at it):

Last time I checked, Profiler.Begin/EndSample couldn't be called from background threads. Are there any plans to change that? Ideally, I would like to see the measurements pop up in the profiler timeline view like they already do for Unity's native Job scheduler. Like, additionally display the managed threads.

alexzzzz · Sep 28, 2016

As far as I understand, the built-in profiler is the only tool to profile code in Unity. It's great most of the time, but particularly for code profiling it's not. The UI/UX is far behind such tools as ANTS Performance Profiler or JetBrains dotTrace.

Will "deep profile" be supported? Are there plans to add any extra information to the samplers, like corresponding source file names and line numbers? Perhaps, the general question is - are the planned profiler enhancements will make it possible to make a better viewer for code profiling results?

alexeyzakharov · Sep 29, 2016

wasstraat65 said: ↑

Last time I checked, Profiler.Begin/EndSample couldn't be called from background threads. Are there any plans to change that? Ideally, I would like to see the measurements pop up in the profiler timeline view like they already do for Unity's native Job scheduler. Like, additionally display the managed threads.
Click to expand...

We are working on a Profiler improvements which will allow capturing data from managed threads as well. However because of UI/UX challenges, you might need to express interest in the particular thread by calling API like Profiler.SetThreadName(string groupName, string threadName) in order to have view uncluttered (there might be dozens of threads).

alexzzzz said: ↑

Will "deep profile" be supported?
Click to expand...

Yes, deep profiler will be still supported. It might also get an upgrade in form of lightweight mode where you can set specific methods you want to instrument - so far we have a prototype for Window/OSX+Mono configuration.

alexzzzz said: ↑

Are there plans to add any extra information to the samplers, like corresponding source file names and line numbers? Perhaps, the general question is - are the planned profiler enhancements will make it possible to make a better viewer for code profiling results?
Click to expand...

Short-term plan is to extend generic samplers with an arbitrary (but strongly typed) data.
Talking about hierarchy/timeline view improvements - yes, this feature is on our radar as "Improve Hierarchy and Timelime view interaction - double click or right click on an overview entry opens the script at the position".
These improvements might make profiling experience better.

Vanamerax · Sep 29, 2016

alexeyzakharov said: ↑

We are working on a Profiler improvements which will allow capturing data from managed threads as well. However because of UI/UX challenges, you might need to express interest in the particular thread by calling API like Profiler.SetThreadName(string groupName, string threadName) in order to have view uncluttered (there might be dozens of threads).
Click to expand...

Thank you! This sounds wonderful!

As far as I know, the .net Thread class does have a Name property, isn't that an option to populate the displayed name field in the profiler view? Or do you mean to only select a few of the managed threads to display in this way? In the latter case, I would suggest to use a different naming scheme like Profiler.RegisterThread to make this intention more clear.

Lastly, do you have any estimate of when we might expect an experimental build for this? I'd be glad to provide feedback once you guys got a proof of concept!

alexeyzakharov · Sep 30, 2016

wasstraat65 said: ↑

Or do you mean to only select a few of the managed threads to display in this way? In the latter case, I would suggest to use a different naming scheme like Profiler.RegisterThread to make this intention more clear.
Click to expand...

Yes, the purpose of this API is to let you explicitly express whether or not you want to have this thread being displayed in the UI. Profiler.RegisterThread/UnregisterThread makes it clearer.

wasstraat65 said: ↑

Lastly, do you have any estimate of when we might expect an experimental build for this? I'd be glad to provide feedback once you guys got a proof of concept!
Click to expand...

Recorder functionality will be available with an alpha build of 5.6. But other improvements are still in development and, unfortunately, we can't say exact day yet, as we have planned those for 5.7. We'll definitely keep you posted - once we have a stable experimental build we can definitely share it.

Thanks for your interest!

00christian00 · Oct 1, 2016

alexeyzakharov said: ↑

Short-term plan is to extend generic samplers with an arbitrary (but strongly typed) data.
Talking about hierarchy/timeline view improvements - yes, this feature is on our radar as "Improve Hierarchy and Timelime view interaction - double click or right click on an overview entry opens the script at the position".
These improvements might make profiling experience better.
Click to expand...

Thanks! These will be a lifesaver.
Can you also add a clear on play button just like the console? It's quite useless 99% of the time to have data from the previous run.

liortal · Oct 4, 2016

Is this something that is already available in the beta right now ?

alexeyzakharov · Oct 11, 2016

00christian00 said: ↑

Can you also add a clear on play button just like the console? It's quite useless 99% of the time to have data from the previous run.
Click to expand...

Added ClearOnPlay button to our todo list.

liortal said: ↑

Is this something that is already available in the beta right now ?
Click to expand...

Recorder API will be available in 5.6 alpha 1.

ytrew · Jun 23, 2017

Need a little guidance here. I am working with the new API here to collect and post process these metrics, but I am seeing labels that are not included in profiler window and data that does not match up quite as well as I would have hoped. As an example:

I collect the metrics and then look for hotspots (samplers with the most time spent.), idle reports as 8.6%, while the profiler Window tells me that the most time spent is in “BehaviourUpdate”.

I have a few questions regarding the metrics:

When running a workload (game, etc.), is it possible that a recorder becomes valid or invalid? If so is there a recommended approach to determine when this occurs?

I see “Idle” as a sampler metric, but do not see idle in the profiler window. Are there other samplers that are not available in the profiler that are in the Profiler?

Does the profiler capture editor metrics? I see samplers such as “GuiView RepaintAll” even with nothing in the scene.

On the heels of the previous question, is it possible to samplers into something more meaningful in order to profile only certain segments (i.e. CPU, GPU, etc.)?

Thank you so much for the assistance, I look forward to seeing the API grow and mature

alexeyzakharov · Jun 26, 2017

Hi,
Thanks for the feedback - this is very valuable for us!

ytrew said: ↑

When running a workload (game, etc.), is it possible that a recorder becomes valid or invalid? If so is there a recommended approach to determine when this occurs?
Click to expand...

Recorder is valid, when there is correspondent label created at the time of Recorder.Get call. There are 2 use cases for that to consider:
1. Static labels which are available immediately after engine init. We are working on making high level subset of the labels available in non-dev release builds (such as BehaviourUpdate, Camera.Render, etc.). For that we need a fail-safe if API is used with dev-player label. If static label is not available at startup - it won't be available any further.
2. Dynamic labels. Currently these are the ones created internally when calling C# methods (like MonoBehaviour.Update for the specific game class) or ones created with Profiler.BeginSample(string) or CustomSampler.Create calls. You can get recorder for such labels only after they were actually created/used. (Also Profiler.BeginSample/EndSample isn't useful for Recorders as it has no information about the label in the EndSample call).
We are adding API to get label type so you can see which labels are "static" and which are "dynamic" in the future.

ytrew said: ↑

I see “Idle” as a sampler metric, but do not see idle in the profiler window. Are there other samplers that are not available in the profiler that are in the Profiler?
Click to expand...

"Idle" is an internal label which is used to measure Job Queue Worker Thread activity. Recorders collect information from all threads. Thus you can't see "Idle" labels in Hierarchy View yet, but you can see them in the Timeline View.
We'll use a better name for it.

ytrew said: ↑

Does the profiler capture editor metrics? I see samplers such as “GuiView RepaintAll” even with nothing in the scene.
Click to expand...

Yes, recorders are intended to be independent from Profiler. In the Editor you can also capture Editor-only metrics, which again will be annotated as such later. "GuiView RepaintAll" is the Editor label.

ytrew said: ↑

On the heels of the previous question, is it possible to samplers into something more meaningful in order to profile only certain segments (i.e. CPU, GPU, etc.)?
Click to expand...

There are no GPU metrics exposed yet, because they introduce a significant overhead to the render pipeline. However, we might add some very high-level metrics in the future. For now all recorders measure CPU time.
Making the labels more meaningful is definitely on our list - cleaning up names and providing tooltip with relevant information is very important.

ytrew · Jun 26, 2017

Thanks for the reply. I have a few more items

Tooltip information would be great and subsets would really help when the desire is to profile only specific areas. Also, it would be nice to get some sort of functionality to determine when new dynamic labels become available, I’m not sure if polling for recorder validity makes the most sense. I guess when subsets are added it would be nice to simply query a specific subset such as “BehaviorUpdate” and retrieve all MonoBehaviour labels. Also, I have to keep around two lists (Labels and Recorders) to make the correlation between label and data and output anything meaningful. Would be better if the recorder kept a string of the label name. As it currently stands, my approach is to:

Create a list of all available labels (and routinely check for new additions)

Retrieve each labels recorder

Query for recorder validity

Get data for each valid recorder

Output data using the list of labels and recorders

If you have any tips are advice, it is much apricated.

alexeyzakharov · Jun 27, 2017

ytrew said: ↑

Also, it would be nice to get some sort of functionality to determine when new dynamic labels become available, I’m not sure if polling for recorder validity makes the most sense.
Click to expand...

Yes, polling doesn't do anything right now as if it was invalid during Get call, it stays invalid.
We can potentially do precreation of labels when you query recorder.
Another consideration is that we didn't plan recorders as a replacement for profiler. As they are independent, we don't want to make slower fast path when profiler is disabled. E.g. to get MonoBehaviour labels we have to get script name and actually create a Sampler class on a native side regardless of if we plan to use Recorder or not. That creates an overhead and memory overuse. 95% of dynamic labels are generated only when Profiler is enabled.
I completely agree that ideally you should get Recorder and then if the data is available, just query it. But it is tricky to do it without performance impact right now.

ytrew said: ↑

Also, I have to keep around two lists (Labels and Recorders) to make the correlation between label and data and output anything meaningful. Would be better if the recorder kept a string of the label name
Click to expand...

We tried this approach before and decided to remove name property from Recorder. The rationale behind that was that Recorder itself as well as all labels were stored on a native side. And every time we pass a string from native to managed side we do a managed allocation. And that's very noticeable when you have e.g. onscreen HUD which displays some metrics and store only Recorders querying name every frame for rendering. So in this case API forces to use Tuple or two lists to avoid GC allocations

We are working on a sample Profiler HUD project that uses Recorders to display main performance in the game. We'll share it once ready.
Generally main use case for this API is to get overview what's going on during the frame. How much time subsystems use and what are the bottlenecks when game is played. Especially release games.

ytrew · Jun 27, 2017

Thanks for the feedback, the performance considerations make sense Looking forward to see the Profiler Project and new changes as the become available. Is it possible top get a timeline on when these may be dropping (subsets for labels, high-level GPU stats, HUD project, etc.)?

benblo · Sep 22, 2017

Just discovered the Recorder/Sampler and CustomSampler classes, nice!

@alexeyzakharov One thing I very frequently find myself wanting is RAII samplers, for this kind of cases:

Code (CSharp):

void bla()

{

Profiler.BeginSample("bla");

// do something...

if (a)

{

Profiler.EndSample();

return;

}

// do something...

if (b)

{

Profiler.EndSample();

return;

}

// do something...

Profiler.EndSample();

}

Keeping track of all the EndSamples before each return is tedious.

One (bad) way is to wrap the method body in another method, but that's awfully intrusive for profiling code.

Ideally I'd like to simply do this:

Code (CSharp):

void bla()

{

var sample = new ProfilerSample("bla");

// do something...

if (a)

{

return;

}

// do something...

if (b)

{

return;

}

// do something...

}

... and sample would automatically EndSample when falling out of scope... but C# has no proper RAII (doesn't call Dispose automatically).

The next best way (still intrusive) is to wrap the code in a using block:

Code (CSharp):

struct ProfilerSample : IDisposable

{

bool disposed;

public ProfilerSample( string name )

{

disposed = false;

Profiler.BeginSample(name);

}

public void Dispose()

{

if (!disposed)

{

Profiler.EndSample();

disposed = true;

}

}

public void End()

{

Dispose();

}

}

void bla()

{

using (new ProfilerSample("bla"))

{

// do something...

if (a)

{

return;

}

// do something...

if (b)

{

return;

}

// do something...

}

}

I feel like something like this should be built-in... and if some UT wizard can find a magic trick to do it without a using block, that would be even better !

superpig · Sep 23, 2017

IIRC, the reason we don't already have this is that it makes it impossible to strip out the profiler instrumentation in release builds.

liortal · Sep 23, 2017

In which Unity version is this new API included?

superpig · Sep 25, 2017

2017.1.

nzhangaudio · Sep 21, 2018

superpig said: ↑

2017.1.
Click to expand...

alexeyzakharov said: ↑

For now only development player will be able to collect data, but further the plan is to enable high-level metrics like BehaviourUpdate, Physics/AI/.. update, loading later in non-development players.
The feature is on the way to trunk and hopefully will be available in the alpha builds - we are targeting 5.6.
Click to expand...

It's been around two years. What's the plan regarding having this in non-development build? I'd like to do some scripting that's reactive to the performance.

alexeyzakharov · Oct 3, 2018

nzhangaudio said: ↑

It's been around two years. What's the plan regarding having this in non-development build? I'd like to do some scripting that's reactive to the performance.
Click to expand...

2019.1 there will be at least 2 markers available in release players - "GC.Collect" and "Gfx.WaitForPresent".

a436t4ataf · Sep 29, 2019

Has anyone got code examples for using Recorder with CustomSampler in-editor?

e.g. this page specifically states they should work together:
https://docs.unity3d.com/ScriptReference/Profiling.CustomSampler.html

...but CustomSamplers that all work 100% as expected when viewed in the Profile window do ... nothing ... when I try to read their values in script using the Recorder class, e.g. like this:

Code (CSharp):

CustomSampler s = CustomSampler.Create("FOO");

...

s.Begin();

...

s.End(); // shows up as expected in ProfileWindow

...

Recorder.Get( "FOO" ); // this is ALWAYS dead, has no data.

alexeyzakharov · Oct 2, 2019

a436t4ataf said: ↑

Has anyone got code examples for using Recorder with CustomSampler in-editor?
Click to expand...

Yes, we use Recorder for performance testing of development builds internally - that is the primary function so far
Did you try example provided in https://docs.unity3d.com/ScriptReference/Profiling.Sampler.GetRecorder.html ?

a436t4ataf said: ↑

Recorder.Get( "FOO" ); // this is ALWAYS dead, has no data.
Click to expand...

This returns a Recorder object which should be enabled.

a436t4ataf · Oct 2, 2019

Sorry, when I said "in-editor" I meant: fully in-editor, not runtime -- i.e. I am profiling editor scripts.

Profiling editor scripts works fine with the Profiler window running and manually triggered, but does not work for me otherwise.

alexeyzakharov · Oct 18, 2019

a436t4ataf said: ↑

Profiling editor scripts works fine with the Profiler window running and manually triggered, but does not work for me otherwise.
Click to expand...

Ok, sorry, I think the scripting API example missing one important point - the recorder returns time for the previous frame (or immediately when disabled). This is because there are internal engine markers which might contribute to the accumulated value after any scripting method is executed in the current frame.

If you modify your example to

Code (CSharp):

CustomSampler s = CustomSampler.Create("FOO");

Recorder r = s.GetRecorder();

r.enabled = true;

...

s.Begin();

...

s.End(); // shows up as expected in ProfileWindow

...

r.enabled = false;

// use r.elapsedNanoseconds;

r.enabled = true;

or just use
r.elapsedNanoseconds;
next frame everything should work.

a436t4ataf · Oct 25, 2019

Aha! That would certainly explain it. I'm in San Francisco until next week, but I will try this when I get home.

I've added a suggestion to the docs page that it be updated to include your change - now that you've described it, it makes sense, but it's not obvious unless someone tells you.

alexeyzakharov · Oct 25, 2019

a436t4ataf said: ↑

I've added a suggestion to the docs page that it be updated to include your change - now that you've described it, it makes sense, but it's not obvious unless someone tells you.
Click to expand...

thank for the suggestion! I can see this being confusing - the main reason why this is the value for the previous frame is that some in-engine activities can be done after all scripts are executed, so there is no way to get the final result for the current frame.
We are changing the Recorder api to accommodate other stats such as draw stats and add an explicit property "immediateValue" if you want to read the current value.

a436t4ataf said: ↑

Aha! That would certainly explain it. I'm in San Francisco until next week, but I will try this when I get home.
Click to expand...

Great, let me know if you have any troubles with using it.

Ardenian · Dec 2, 2019

a436t4ataf said: ↑

(...)

I've added a suggestion to the docs page that it be updated to include your change - now that you've described it, it makes sense, but it's not obvious unless someone tells you.
Click to expand...

alexeyzakharov said: ↑

thank for the suggestion! I can see this being confusing - the main reason why this is the value for the previous frame is that some in-engine activities can be done after all scripts are executed, so there is no way to get the final result for the current frame.
We are changing the Recorder api to accommodate other stats such as draw stats and add an explicit property "immediateValue" if you want to read the current value. (...)
Click to expand...

Hello, just a quick heads up that I have also run into the issue that confused @a436t4ataf, which I was not able to resolve until I read the posts in this thread.

I would have greatly appreciated if the example that you posted here, @alexeyzakharov, was available in the documentation, as well as including the explanation that you gave in your previous post, therefore I hope that the documentation is going to be updated in the future.

a436t4ataf · Dec 7, 2019

@alexeyzakharov your correction worked fine. I am now able to test in-editor performance timings inside Unity unit-tests (currently using 2019.2 + UTF v 1.8). Because the API doesn't make much sense at the moment (very easy to misunderstand), I wrapped Sampler+Recorder into a new struct which works more logically. (I'll be refactoring it soon, but the essence is there: a create method, a begin, an end, and a "time elapsed in milliseconds").

For instance:

Code (CSharp):

struct SampleRecorderPair

{

public CustomSampler sampler;

public Recorder recorder;

public long lastNanos;

public static SampleRecorderPair Create( string s )

{

SampleRecorderPair srp = new SampleRecorderPair();

srp.sampler = CustomSampler.Create( s );

srp.recorder = srp.sampler.GetRecorder();

srp.lastNanos = 0;

return srp;

}

public void Begin()

{

recorder.enabled = true;

sampler.Begin();

}

public long End()

{

sampler.End();

recorder.enabled = false;

lastNanos = recorder.elapsedNanoseconds;

return lastNanos;

}

public long milliseconds

{

get { return lastNanos / 1000000; }

}

}

[UnityTest]

public IEnumerator Test_SlowOperation()

{

... // setup my test here

SampleRecorderPair srpStamp = SampleRecorderPair.Create( "Stamp" );

srpStamp.Begin();

slowComponent.LongRunningMethod(); // the thing we're testing

srpStamp.End();

Debug.Log( "stamp time millis = "+srpStamp.milliseconds );

Assert.Greater( 10, srpStamp.milliseconds, "This operation shouldn't take more than 10 milliseconds at worst" );

}

a436t4ataf · Dec 7, 2019

.. However, it doesn't work for the Unity Inspector .

My guess is that this is only a weakness of Unity's current Unit Testing Framework - it will not render Editor controls (Inspectors, Scene view, etc), effectively running headless at all times - so that it cannot be used to test custom Inspectors. (this is great as an option - headless tests run faster! - but it's unfortunate that there's no option to turn it off, and run tests accurately).

i.e. not an issue with the Samplers/Recorders -- I'm guessing that if UTF had an option to run identically to the Editor, then the Samplers would faithfully record the data without making any changes?

For now, I'm planning to discard the UTF for these kind of Editor tests (and hope it gets more features / becomes open-source next year) and use our own proprietary unit-testing that Just Works (but cannot run headless! So runs a lot slower than UTF if you don't need to accurately measure Editor performance).

a436t4ataf · Dec 8, 2019

I've finished integrating, and it's working great. I now have this setup:

Important classes use CustomSampler.Create("special-name") and .Begin() / .End() to internally profile

When working in Editor, I can use the Profiler window to analyse those

When working in Editor and the Profiler window hangs (happens most of the time in Unity 2019 ), I can run a script that does the profiling instead *without having to write new code*

Profiling-Script uses the identical CustomSampler.Create("special-name"), which gives you a reference to the shared sampler, and lets you peek at the values (without touching the profiled class at all, which is perfect)

I can write Unit Tests with UTF v1.8 that use the same process to not only test for correctness, but test for performance - again, without making any changes to the classes being tested! I do: "Assert.Greater( 1000000, recorder.elapsedTimeNanoseconds, "This segment shouldn't take more than 0.1ms to execute")"

To be clear, the API has a poorly named method in CustomSampler (although the docs correctly explain what it does, in contrast to the name). CustomSampler.Create does NOT "create" at the API/conceptual level, it instead "attaches" to a shared sampler (which is created if it doesn't already exist? I guess that's why it got given the wrong name in the API) ... basically: Create implements Get/Fetch/Attach. This is why the above setup works: it doesn't matter which class calls Create first, or how many times it gets called, Unity correctly shares the data between all methods/classes that ever called Create with the same string argument.

...and if that doesn't seem to work, you've made a mistake in your code, like I did the first time around . Because the Unity parts of this definitely work, everything is fine.

alexeyzakharov · Dec 19, 2019

a436t4ataf said: ↑

Because the API doesn't make much sense at the moment (very easy to misunderstand), I wrapped Sampler+Recorder into a new struct which works more logically. (I'll be refactoring it soon, but the essence is there: a create method, a begin, an end, and a "time elapsed in milliseconds").
Click to expand...

Thank you very much for the feedback! Yes, it is not really user friendly and I'm going to fix it.
I've added it for a different use case - to extract internal engine timings for performance tests, but apparently it can be also used for just timing a code with pretty good accuracy and low overhead. (Originally I thought that if you want to measure time Stopwatch is what you can use.) Having a wrapper like you created is totally useful.

a436t4ataf said: ↑

My guess is that this is only a weakness of Unity's current Unit Testing Framework - it will not render Editor controls (Inspectors, Scene view, etc), effectively running headless at all times - so that it cannot be used to test custom Inspectors. (this is great as an option - headless tests run faster! - but it's unfortunate that there's no option to turn it off, and run tests accurately).
Click to expand...

AFAIK the headless mode is controlled by -batchmode - if you don't use it you should run tests fine. Also if you want to test UI performance you also have to ensure that the window is visible (
wnd.Show()
), positioned correctly with a fixed size (
wnd.position = new Rect(0, 0, ...)
) preferably not docked (wnd.RemoveFromDockArea()). We use this approach for the Profiler Window performance tests

a436t4ataf said: ↑

To be clear, the API has a poorly named method in CustomSampler (although the docs correctly explain what it does, in contrast to the name). CustomSampler.Create does NOT "create" at the API/conceptual level, it instead "attaches" to a shared sampler (which is created if it doesn't already exist? I guess that's why it got given the wrong name in the API) ... basically: Create implements Get/Fetch/Attach.
Click to expand...

Yes, agree. The term we use in native code is GetOrCreate. In C# the issue is a bit complicated due to domain reload. "Create" part happens in the player of in a fresh scripting domain in the Editor. Because profiler markers are never deleted after domain is reloaded we actually use "Get" to retrive already created marker.
Note that CustomSampler is going to be replaced by ProfilerMarker in the future providing simpler api and ability to do markup in burst code.

So, I'm working on a replacement for the Recorder api you are using which would also provide runtime access to other stats (like draw stats, memory stats, etc.).
Would it be easier to understand API if we had something like?

Code (CSharp):

unsafe struct ProfilerStat

{

public ProfilerStat(string counterOrMarkerName);

public ProfilerDataType ValueType;

public void* Value { get; } // Value at the end of the last frame/reset

public void* ImmidiateValue { get; } // Current value

public T GetValue<T>() { return *(T*)Value; }

public int FlushCount;

static ProfilerStat StartNew(string counterOrMarkerName);

public void Start();

public void Stop();

}

a436t4ataf · Dec 19, 2019

alexeyzakharov said: ↑

(Originally I thought that if you want to measure time Stopwatch is what you can use.)
Click to expand...

Stopwatch is OK, but ... I found it too inaccurate for 95% of cases. The times when you need to profile from a script, you usually are doing small tight loops (e.g. with unit testing), or you're doing complex cases where you need to understand where the time was spent in a fine granularity (e.g. integration test, e.g. code-optimization), or the Profiler window is crashing, and you need high accuracy.

Also: Stopwatch gets included in production builds, if I remember correctly? That means lots of extra work in doing #define's to add/remove all your stopwatch code - whereas the Sampler/Recorder system I can leave in the codebase (cleaner, easier to maintain).

Would it be easier to understand API if we had something like?

Code (CSharp):

unsafe struct ProfilerStat

{

public ProfilerStat(string counterOrMarkerName);

public ProfilerDataType ValueType;

public void* Value { get; } // Value at the end of the last frame/reset

public void* ImmidiateValue { get; } // Current value

public T GetValue<T>() { return *(T*)Value; }

public int FlushCount;

static ProfilerStat StartNew(string counterOrMarkerName);

public void Start();

public void Stop();

}

Click to expand...

If I understand correctly, usage would be something like:

Code (CSharp):

public void SlowMethod()

{

var totalTime = ProfilerStat.StartNew( "SlowMethod_Total");

... // do stuff

totalTime.Stop();

}

public void ScriptedPerfTest()

{

SlowMethod();

SlowMethod();

SlowMethod();

...

Debug.Log( "Total time taken for 3 runs = " + (new ProfilerStat("SlowMethod_Total")).Value );

}

...where "SlowMethod_Total" now appears in the Profiler window automatically, and we can optionally call ScriptedPerfTest from an Editor test (or from any editor code) to separately measure + report results.

If so, looks good. Two suggestions:

1. It would be nice to have a static method for 3rd party codepaths to use to peek at the results. e.g. my final line of code in ScriptedPerfTest - it only wants to interact with the profiler in order to read the results (it's not going to start or stop anything; it's a read-only use-case).

e.g. static ProfilerTest Get(string counterOrMarkerName); // possibly a singleton, integrated with the constructoror ... e.g. static T GetValue<T>( string counterOrMarkerName ); // this is more precise: the use-case only needs the Value, it doesn't need the full ProfilerTest struct!

2. The naming of .Value and .ImmediateValue is non-obvious; I would prefer if the names were merged with the single-line comments you've inserted above, so that anyone reading code that uses these methods immediately understands which one is which, and why the code is using it.

e.g. ValueForLastPeriod + ValueInstant
or e.g. ValueCompleted + ValueSnapshot
or e.g. ValueFullFrame + ValuePartialFrame

(if I understand correctly: ValueImmediate MIGHT have the same value as Value, but will often have a different value, depending on what moment you invoke it. In most cases, you expect people to use Value, because that represents a stable, "completed frame" number, but there may be special cases where people need to use ValueImmediate, to look at a long-running profiled spot that hasn't completed yet)

NB: I am only guessing at the meaning of ValueImmediate here, I'm imagining something like: you have a frame that takes more than a few seconds (!), and this enables the Profiler window to give a live display of where that time is going even while the main engine hasn't finished drawing the frame yet. That would be cool! I have had a lot of situations during development where things do actually take many seconds to do a single frame, and if I could profile them, it would have made life easier.

(e.g. worst case was an algorithm that took 20-120 seconds *per frame*. Profiling it and optimizing it down to milliseconds per frame was a very slow process, it took 10-15 minutes to get a single profile)

alexeyzakharov · Dec 19, 2019

a436t4ataf said: ↑

the Profiler window is crashing
Click to expand...

This is bad and I want to fix this!

a436t4ataf said: ↑

Also: Stopwatch gets included in production builds, if I remember correctly?
Click to expand...

Yes, but you can wrap it into a struct/class with Conditional attribute. But I guess the main reason to not use it is that it costs depends on scripting runtime implementation.

a436t4ataf said: ↑

If I understand correctly, usage would be something like:
Click to expand...

I would do it more like

Code (CSharp):

static readonly kSlowMethodMarker = new ProfilerMarker("SlowMethod_Total");

public void SlowMethod()

{

kSlowMethodMarker.Begin();

... // do stuff

kSlowMethodMarker.End();

}

public void ScriptedPerfTest()

{

var totalTime = ProfilerStat.StartNew(kSlowMethodMarker);

SlowMethod();

SlowMethod();

SlowMethod();

totalTime.Stop();

...

Debug.Log( "Total time taken for 3 runs = " + totalTime.Value);

}

The reason why is that the time counting has a minimal impact on the code itself (5-50ns) -
ProfilerStat.StartNew(kSlowMethodMarker);
is much slower to to getting the marker by string each time.

The first suggestion is great! I'll consider generic
ProfilerStat<T> where T : unmanaged
. Though we need non generic version to be able to make something like in-game perf stats window where you can add different metrics and collect/visualize those.

For the second one - what about ValueLastFrame + ValueValueImmediate? Immediate value becomes ValueLastFrame at the end of CPU frame. In the beginning of a frame some stats are reset to 0. E.g. timings of Camera.Render, GC.Alloc count, etc. And due to some of those can be incremented outside of your scripting code (after LateUpdate at very end of the frame) there is no way to know/use those and not impact the values itself.

a436t4ataf said: ↑

I have had a lot of situations during development where things do actually take many seconds to do a single frame, and if I could profile them, it would have made life easier.
Click to expand...

Yes, immediate value could give a hint if something takes too long and should be e.g. aborted or moved to the next frame.

a436t4ataf · Dec 19, 2019

alexeyzakharov said: ↑

This is bad and I want to fix this!
Click to expand...

Sorry, to be clear: by "crashing" I mean "takes so long to render a single frame that I have to give up and force-quit Unity". It rarely crashes (I crashed it last week. Looking at windows Task Manager, the Profiler ran out of RAM (it was at approximately 40 GB of RAM usage when it ran out ... for recording approximately 5 frames of data, but one of those frames had a 20 second render time without the profiler ).

In general, the profiler window is currently very stable, but some parts are very inefficient, and some of its rendering is hanging the editor for 30 seconds up to multiple minutes. Compared to the 5.x/2017 profiler, it is 1000x better, but it feels like some small parts of the render loop are still using Unity 5.x style O(n^10) rendering or similar.

I would do it more like
Click to expand...

Oops, yes, that's what I would do too . I wrote my example too quickly.

For the second one - what about ValueLastFrame + ValueValueImmediate? Immediate value becomes ValueLastFrame at the end of CPU frame.
Click to expand...

Sounds good. Would be even better if it could capture the fact that ValueLastFrame is representing a full frame, e.g. - "ValueOfLastFrame".

Or would it be accurate to have:

"ValueLastFrameTotal / ValueIntermediate"
"ValueLastFrameTotal / ValueThisFramePartial"
"ValueLastFrameTotal / ValueThisFrameIntermediate"

Or even omit the "Value", take that as implied from the return type:

"LastFrameTotal / ThisFrameIntermediate"

Reasoning:

I prefer Intermediate to Immediate, because it makes clear this is a not-quite-ready value, its a partial result

(Immediate often implies "good, better" in performance situations , where Intermediate implies "immediate but incomplete")

Total vs Partial appears to summarise the fundamental difference between the values (if I understand correctly)

(it is then obvious to the reader that "this-frame-partial" is going to become "last-frame-total" at the end of the frame)

Lockthav · Jan 16, 2020

Hello Alexey!

Trying to make a tool (editor-only) to track my Particle System's performance.
Doing something like:

Code (CSharp):

Recorder.Get("PreLateUpdate.ParticleSystemBeginUpdateAll")

Recorder.Get("PostLateUpdate.ParticleSystemBeginUpdateAll")

// And so on...

But I feel like tracking all those hard-coded samplers is not the best approach.

There's a way to do something like

Code (CSharp):

Recorder.Get("ParticleSystem")

and retrieve all samplers related to Particle Systems?

Thank you!

alexeyzakharov · Jan 17, 2020

Lockthav said: ↑

There's a way to do something like
Click to expand...

No, but you can use Sampler.GetNames in a meantime to get all available names and create recorders for those which contain "ParticleSystem" substring.

a436t4ataf · Jan 25, 2020

@alexeyzakharov incidentallly ... today I ran into the "unity profiler window is useless because performance is multiple orders of magnitude slower than expected" bugs again. I have now scheduled a week to write an editor plugin that replaces the Unity official Profiler, because it's so awful I can't bear it any longer (and I'm glad the Profiler API's exist so that I can do that!).

Summary:

Start a scene with 5 small meshes in it, and nothing happening -- editor runs at 100+ FPS (I wasn't counting to be specific)

Enable deep profiling, wait a few seconds for the decoration to happen -- editor @ 100+ FPS

Click button in custom Inspector to trigger blocking action that takes 3.4 seconds WITHOUT profiler -- editor hangs ---- this is the code I need to profile/trace. It's about 5,000 lines of code, but the execution time is short.

... blocking action takes approximately 5-6 seconds instead of 3-4 (because of profiling overhead, I assume)

*** Unity Profiler Window takes 10 Gigabytes of RAM per second until the machine runs out of physical RAM **

Unity editor is back to normal, responding normally. Profiler Window is unresponsinve

Wait ...

... wait ...

.... ... wait

... After a minute or so, the Profiler Window starts responding to mouse-clicks again (and I can finally hit the Stop button)

Unity editor is still running at 100+ FPS

Profiler is running at ** less than one frame per 10 seconds **

Now try to view any frame in Profiler

...

... Profiler now kills the Unity Editor. Cannot get any response to anything ...

...

... 10 minutnes later, after crying, still nothing happening, merely the insane Profiler window having a private party in my CPU cores ...

Force-close Unity Editor just to get control of my computer back

OR sometimes: wait another 5 minutes ... 10 minutes ... or more ... for the Profiler to "allow" the Unity Editor to continue running again.

This, sadly, is the same bad behaviour of the Unity Profiler window we've all been experiencing since Unity 5. Sometimes you get lucky and the profiler runs "very slowly but only a 5-30 second wait to do work", and then sometimes - like today, on a trivial processing load - it runs "TENS OF MINUTES wait for it to do work, every time you click a mouse button". When you're trying to trace a bug, it makes you want to scream .

To be clear: I'm not expecting Unity to fix this. And it's against policy - I'm on the stable releases (2019.1, 2019.2) not the unstable ones (2019.3, 2020.1) - so I know there'll be no fixes. But the problem has been so very bad for so very many years that I've given up, and I'm sure I'm not the only one.

alexeyzakharov · Feb 12, 2020

a436t4ataf said: ↑

into the "unity profiler window is useless because performance is multiple orders of magnitude slower than expected" bugs again
Click to expand...

Sorry for that - we are well aware of the problem and working on making UI part scalable to support amount of data we can transfer.

a436t4ataf said: ↑

To be clear: I'm not expecting Unity to fix this. And it's against policy - I'm on the stable releases (2019.1, 2019.2) not the unstable ones (2019.3, 2020.1) - so I know there'll be no fixes.
Click to expand...

Yes, the timeline rendering part requires significant changes. We are aiming for the future releases, but also exploring a simple memory reduction fixes that can help in a meantime.

a436t4ataf · Jul 31, 2020

alexeyzakharov said: ↑

Ok, sorry, I think the scripting API example missing one important point - the recorder returns time for the previous frame (or immediately when disabled).
Click to expand...

@alexeyzakharov - using this in unit-tests in 2019.4 isn't working any more .

If I use your technique to run tests one at a time (double click each one in TestRunner) == it works as expected.

However, if I allow Unity TestRunner to run all tests at once (double-click the class name, or double-click the word "Tests"), and the tests run fast, then all results from Recorder / Sampler are now wrong.

Background / concrete example:

I have reduced the test case down to: I am testing a method that should be a no-op (it's coded to "return;") - the test is "measure the execution time of these samplers for 1 run of this method.

I'm asserting that the Recorder.elapsedNanoseconds < 500000 (i.e. 0.5ms)

...when double-clicking each test one by one: that assertion dosnt trigger, everything is fine.

But when running all tests at once I get a series of massively too-large numbers, e.g. for one run of 7 test cases in the same class:

6088000 ns
5145000 ns
(nothing - assertion passed)
5248000 ns
5290000 ns
(nothing - assertion passed)
5374000 ns

Is this a bug? It seems to be something wrong with Unity's TestRunner - it (correctly) runs individual tests, but something goes wrong when it runs batches of tests.

a436t4ataf · Jun 6, 2022

2022 update: I can't get this code to run any more. Whatever was changed in Unity 2020 LTS appears to have permanently broken it - everything always returns 0 (even when the Profiler window is showing actual times).

And of course there's no way to debug it .

I tried the "new" API that supposedly replaces this in 2021 - it's so intensely confusing I gave up. The docs were incomprehensible, and the class names made no sense at all.

alexeyzakharov · Jun 7, 2022

thanks for the reply @a436t4ataf ! sorry to hear your code was broken by 2020 LTS and the new api doesn't suit your needs.

a436t4ataf said: ↑

I tried the "new" API that supposedly replaces this in 2021 - it's so intensely confusing I gave up. The docs were incomprehensible, and the class names made no sense at all.
Click to expand...

may I clarify what is the api you are referring to and what was confusing so that we can fix it

a436t4ataf · Jun 10, 2022

I'm trying again now, recording my observations as I go.

1. Start with 'unity profiling' on Google, arrive at: https://docs.unity3d.com/Manual/Profiler.html
2. That page first of all tells us about the 'low level native plugin API' -- yeah, no. I'm sure that's wrong. (maybe?)
3. The only other thing listed that sounds like 'allow you to profile code' is "common profiling markers"
4. ... which goes to a page which immediately starts by saying its incomplete ("following tables explain what some of the common markers do")
5. Read the whole thing. But it seems it's all Unity-internal stuff - great for people who want to profile Unity, useless for developers writing new code.
6. ... basically that page is a nightmare. It's just endless rabbit-holes of different things, with (I realised afterwards) no hope of finding anything useful for profiling user-code. But the only way to find out is to do a LOT of reading, and following links, and reading those, and trying to understand each of them, only to decide "no, this isn't relevant either - it's yet another 'unity employees only' feature; great if I ever want to profile the UnityEngine, but not relevant today"
7. The other useful-sounding links on the main Profiler page (linked above) redirect you to : https://docs.unity3d.com/Packages/com.unity.profiling.core@1.0/manual/index.html - a page which makes it sound like you HAVE to install a separate package to do any profiling (do you? Is that the new requirement - profiling is now not included in Unity any more? This is not mentioned in the main Unity docs as far as I can tell)
8. ... it doesn't have any documentation on the main page.
9. The 'Upgrade guide' is particularly sad: https://docs.unity3d.com/Packages/com.unity.profiling.core@1.0/manual/upgrade-guide.html
10. ... after you finish crying, in desperation: try reading the API. Maybe we can reverse-engineer the docs by reading API method call names etc.
11. ..ah! Right! Now we're getting somewhere - there's a bunch of docs here.
12. ProfilerMarker. WTF?
13. ... the linked API page for that says "See Recorder"
14. ......and Recorder's page says only to use CustomSampler's
15. ........which are the old API and (as noted above) don't seem to work any more.
16. Go back to ProfilerMarker and try to understand it.
17. Give up. It's circular references to things like Recorder that themselves don't seem compatible with it and it's impossible to figure out what is actually required, what is optional, what is going on.
18. Try to reverse-engineer a working setup using ProfilerMarker. Nothing works, can't figure out why, have no idea what's missing, what's correct.
19. Finally: give up.

TL;DR: it seems there are now 3 (I think?) simultaneous profiling API's in Unity, and they all are slightly-related in that they share a little bit of code. But nowhere does it say this. And nowhere does it say which code they share, and which is necessary. Nowhere does it tell you which to use when, or why. Or how.

TL;TL;DR: There's no 'getting started guide'. That's all that's really needed: "If you want to profile your own code, do this". That should be linked from the main https://docs.unity3d.com/Manual/Profiler.html page, and it should be clearly signposted (Because it's a common use case that is separate from everything else on that page).

Search Unity

Unity ID

Useful Searches

Profiling.Recorder?

Drink more water! Unity Technologies

Drink more water! Unity Technologies