Resolved (IN-39547) [1.0.0-pre.65] Significantly slower than older version

Discussion in 'Physics for ECS' started by optimise, Apr 27, 2023.

  1. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    Performance is significantly worse than in the older version: 0.5.1-preview.2 is quite a lot faster than 1.0.0-pre.65.
     
  2. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    Thanks for bringing this to our attention. We will have a look at this.

    Did you notice any particular areas in the profiler which show where the change in time consumption originates from?

    Note that depending on the amount of content in your scene, there can be a significant overhead that is caused by an integrity check that is only done in the Unity Editor for validation.
    See here for more information.
     
  3. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    From my observation, the stall mostly comes from BuildPhysicsWorld, which looks like it is caused by jobs being forced to complete immediately and by the system running multiple times in one frame. Another huge stall is in EntitiesGraphicsSystem, which looks like it is slowed down by DOTS Physics and spends most of its time doing nothing.
     
    daniel-holz likes this.
  4. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    Thanks for the added information. Could you share a profiler screenshot that shows this?
    I have not observed more than one BuildPhysicsWorld run per frame except if you use Netcode for multiplayer which does predictive simulation (multiple steps per frame). Are you using Netcode here?

    This is what I am seeing (one time BuildPhysicsWorld system update). Example profiler run from the scene "Assets\Tests\Pyramids\Pyramids.unity" in the official Unity PhysicsSamples project.
    profiler.png

    And here is the hierarchy view for some other frame in the same run. As you can see the BuildPhysicsWorld time consumption is low and it is called only once.
    profiler_hierarchy.png
     
    Last edited: May 4, 2023
  5. Thygrrr

    Thygrrr

    Joined:
    Sep 23, 2013
    Posts:
    705
    My PhysicsInitialize / BuildPhysicsWorld step has become comically slow. I was looking for a reason, but it might be the new Preview version.

    Used to get 300+fps, now it's more between 3 and 30 fps. Almost all of the frame time is lost in that step. Unfortunately, I cannot drill down any lower, it seems. It's just "lost" somewhere on that system. (will try a deeper profiling session after work today)
     
    Last edited: May 5, 2023
  6. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    That's very odd.
    Can you try the scene I mentioned above, to make sure we get a baseline to work with? I tested all this with the latest version.
     
    Thygrrr likes this.
  7. Thygrrr

    Thygrrr

    Joined:
    Sep 23, 2013
    Posts:
    705
    upload_2023-5-6_12-8-29.png

    ECS Samples repo has a bunch of issues when imported from scratch into Unity 2022.2.17f1. The prefab mentioned here actually exists, too, so this message is doubly bewildering.

    I'll try to get it to work and do some profiling.
     
    daniel-holz likes this.
  8. Thygrrr

    Thygrrr

    Joined:
    Sep 23, 2013
    Posts:
    705
    Ok, the framerates are bad on ECS Samples changeset 02bbced477b0458a9e31279fdc942f5676ff4909, in Unity 2022.2.17f1, with the following peculiar details:

    There are no errors, but there's a weird warning I have never seen (I think we've all seen the "1 entities in the scene..." one; I mean the "2 settings provider" one):
    upload_2023-5-6_12-18-52.png


    Systems Window doesn't show the timings correctly, they always disappear after a few frames, then reappear.


    On a 60 Hz display at 2048x2048 (all other things as default when checking out the project, this is just my default editor layout), I get around 70-85 fps. It certainly feels slower than that, though. (I'd say about half)
    upload_2023-5-6_12-17-4.png

    Using Unity's Default layout on a 120 Hz display gives SLOWER framerates even though the times are... lower?
    upload_2023-5-6_12-22-14.png

    Profiling gives no super baffling results, other than a significant portion of time being lost on the editorloop:

    upload_2023-5-6_12-27-35.png
    It very much does not feel like 5 ms per frame, but if I look at the Editorloop, that takes up around 11.5 ms.
     
    Last edited: May 6, 2023
  9. Thygrrr

    Thygrrr

    Joined:
    Sep 23, 2013
    Posts:
    705
    The plot thickens - I have now DISABLED the Collider Integrity checks, and frame times more than doubled, with time lost exclusively in the playerloop. This now feels very sluggish.

    upload_2023-5-6_12-31-46.png

    The sluggishness went away after switching tasks. I suspected it was a Burst JIT issue or something; turning off Burst Compilation gives me pretty much identical performance.

    However, it didn't go away on its own the first time, so it felt more related to the Unity EditorApplication (even though the massive frame time "spike" does not occur in the EditorLoop), because it sped up only after I switched tasks away from Unity.

    Maybe some flip/flop bi-stable glitch in Burst that sometimes just hangs compilation, resulting in (even more) abysmal performance.

    This super slow performance is pretty much in line with what I see in my larger project since about 2022.2.15f or .16f - I'll try .18f in a little bit.

    On the 120 Hz display, when Burst seems to be doing its job, performance feels okay looking at the profiler (2.5ms, give or take)

    However, the EditorLoop still eats 12+ milliseconds out of every frame, even if the editor displays nothing: empty inspector, scene view hidden, just the default layout the way Unity intended. (OK, that's a lie, I run at 150% UI scale.)
     
    Last edited: May 6, 2023
  10. Thygrrr

    Thygrrr

    Joined:
    Sep 23, 2013
    Posts:
    705
    Overall performance is much better in 2022.2.18f1. o_O

    I can no longer reproduce the drastic performance loss in my larger project.

    (there's still a loss, but it's not in PhysicsSystemGroup, but 6:1 proportionally in RenderPipelineManager.DoRenderLoop_Internal(), and this is probably a per-camera optimization I must make thanks to some URP changes)

    If I create a lot of physics bodies (around 20k), capping my game's budget, I see an interesting pattern of small performance spikes where "BeforePhysicsSystemGroup" takes 6x as long as usual (but this could be a sync point in my own code that I need to work around).

    Overall, 1 ms average time for 20k bodies is very good (I'll look at the sync point separately, it's probably my pathfinding system)

    upload_2023-5-6_13-10-36.png


    (EDIT: Just compared to 2022.2.16f1 with all other things exactly equal, and .18's performance in physics is more than double [<1 ms per tick vs >2ms per tick], though the 10x to 20x loss is gone and I would ascribe that to a Burst issue, lacking another explanation).

    Edit: And one more, there was an issue in pre-2022.2.18f1 where Unity on editor closing would hang on CleanupMono about 2 out of 3 times, I haven't observed this even once since the new URP that came this week.
     
    Last edited: May 6, 2023
    daniel-holz likes this.
  11. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    Here's the screenshot. I didn't use DOTS Netcode. I just spawn around 20000 balls, and some time after they hit the ground it suddenly becomes extremely laggy. Have a look at the IN-39547 repo project.

    upload_2023-5-8_15-51-24.png
     
    daniel-holz likes this.
  12. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    @Thygrrr : Great stuff! Thanks for the thorough investigation.

    One thing I am wondering is how the 120 Hz display plays into all of that. Could V-Sync and/or the default ECS timestep of 60Hz play a role here?

    Btw, the case above where you noticed time lost in the editor loop, wouldn't that just be the loop enforcing real-time framerate with a 60Hz simulation time step (which is the default in ECS; see SystemAPI.Time.DeltaTime)? If there was no "pause" somewhere, the simulation would run faster than real-time.
     
    Last edited: May 8, 2023
  13. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    @optimise:

    After having had a closer look at your screenshot, I can see that the whole physics system group is run 5 times here per single fixed update.

    The reason for this is likely that the FixedRateCatchUpManager is trying to catch up with the real-time simulation rate, but needs to step the physics an excessive number of times since a single step time consumption is likely too high in your scene.

    The Entities framework employs simulation sub-stepping for cases in which the system needs to “catch up” due to excessive time consumption in the game loop.

    This is done in the FixedRateCatchUpManager, which implements system update semantics similar to UnityEngine.MonoBehaviour.FixedUpdate().
    The method can be configured in a way that a single "catch up" step does not exceed the World.MaximumDeltaTime value, which prevents simulation issues such as tunneling and the like.
    However, due to this clamping of the variable time step, sometimes the system needs to step multiple times to catch up, which then aggravates the issue in a self-reinforcing feedback loop, as is likely the case here.

    So, in summary, have a look at the stepping parameters used by the FixedRateCatchUpManager, and either configure it in an attempt to reduce this effect, or solve the issue at the source by optimizing your scene further.
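
A minimal sketch of the catch-up stepping described above, written as a simplified Python model and not as the actual Entities implementation. The function name `catch_up_steps`, the integer microsecond units, the 60 Hz fixed step and the 1/3 s maximum delta time are all assumptions made for illustration.

```python
def catch_up_steps(elapsed_us, last_fixed_us, fixed_dt_us=16_667, max_delta_us=333_333):
    """Model of a catch-up rate manager: return (steps, new_last_fixed_us)."""
    # The simulation may advance at most max_delta_us past the last
    # fixed update within a single frame (the clamp mentioned above).
    target_us = min(elapsed_us, last_fixed_us + max(max_delta_us, fixed_dt_us))
    steps = 0
    # Run fixed steps until the simulation is within one step of the target.
    while target_us - last_fixed_us >= fixed_dt_us:
        last_fixed_us += fixed_dt_us
        steps += 1
    return steps, last_fixed_us


# A 100 ms frame needs 5 fixed steps; a 1 s hitch is clamped to 19 steps.
steps_short, _ = catch_up_steps(elapsed_us=100_000, last_fixed_us=0)
steps_long, _ = catch_up_steps(elapsed_us=1_000_000, last_fixed_us=0)
print(steps_short, steps_long)  # 5 19
```

In this model a single slow frame of about 100 ms is already enough to trigger five physics steps in the next frame, which matches the kind of pattern visible in the screenshot.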
     
    Last edited: May 8, 2023
  14. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    I see, but the same number of balls performs much better on 0.5.1-preview.2 compared to 1.0.0-pre.65, which I believe is a performance regression. I would like to know why 0.5.1-preview.2 is so much faster. One performance regression I know of in 1.0.0-pre.65 is that when lots of physics entities are stacked together, performance regresses significantly.
     
  15. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    I totally understand that. We have your test project in hand and have filed an internal bug report for investigation and are going to look into this. Stay tuned!
     
    Thygrrr and optimise like this.
  16. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    Hi. Any new update? Btw, I think Unity really needs to completely eliminate the stall in BuildPhysicsWorld, since it causes an insane slowdown in DOTS Netcode prediction, where BuildPhysicsWorld runs multiple times in one frame, so in total it takes a really large number of ms to execute. Not sure how Unity is going to solve it, but it looks like the DOTS Physics pipeline needs further improvement.
     
  17. IsaacsUnity

    IsaacsUnity

    Unity Technologies

    Joined:
    Mar 1, 2022
    Posts:
    94
    Thank you for the feedback! We continue to investigate this issue as part of our overall backlog. We'll share more when we have additional details.
     
  18. CMarastoni

    CMarastoni

    Unity Technologies

    Joined:
    Mar 18, 2020
    Posts:
    912
    There are two major stalls in Physics 1.0 that cause BuildPhysicsWorld to wait for other jobs, and Daniel is investigating them. There are indeed a few more stalls now by "default" on the main thread in the physics loop.
    In particular, the internal BuildPhysicsWorldDependencyResolver system is usually the one that adds some of these stalls, since it introduces a sync point on the main thread: it will wait for all jobs that touch the PhysicsWorldSingleton, BuildPhysicsWorldData and/or PhysicSimulation singleton entities.

    That being said, the game just entered the classical death-loop: there was a long frame (maybe because of physics), which caused physics to run multiple times, which in turn produced an even longer frame, and so on, up to the point where it ends up like that.
    This is never-ending, unless the catch-up mechanism decides to say something like: "you know what, let's simulate bigger steps (imprecise) to exit this loop". But this is not the way it is implemented.
    I would suggest implementing your own FixedRateManager with some logic to determine how to clamp and handle cases where the elapsed time between frames may cause physics to run too slowly and enter this situation.
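
The death-loop and the effect of clamping can be illustrated with a small Python model. This is a sketch under stated assumptions, not Unity code: `simulate_frames`, the per-step cost of 20 ms, the 5 ms of other frame work and the 1/3 s backlog clamp are hypothetical values chosen so that one physics step costs more real time than the 1/60 s it simulates.

```python
def simulate_frames(frames, fixed_dt_us=16_667, step_cost_us=20_000,
                    base_cost_us=5_000, max_steps_per_frame=None,
                    max_backlog_us=333_333):
    """Return the number of fixed steps run in each frame of a toy game loop."""
    backlog_us = 0               # real time not yet covered by fixed steps
    frame_cost_us = base_cost_us # assume the first frame ran no physics
    steps_per_frame = []
    for _ in range(frames):
        # Time spent on the previous frame must now be simulated (clamped).
        backlog_us = min(backlog_us + frame_cost_us, max_backlog_us)
        steps = backlog_us // fixed_dt_us
        if max_steps_per_frame is not None and steps > max_steps_per_frame:
            # Hypothetical custom policy: cap the steps and forgive the rest,
            # so the simulation runs slower than real time instead of stalling.
            steps = max_steps_per_frame
            backlog_us = steps * fixed_dt_us
        backlog_us -= steps * fixed_dt_us
        frame_cost_us = base_cost_us + steps * step_cost_us
        steps_per_frame.append(steps)
    return steps_per_frame


uncapped = simulate_frames(60)
capped = simulate_frames(60, max_steps_per_frame=1)
print(uncapped[-1], max(capped))  # 19 1
```

Without a cap, the step count in this model climbs until it hits the backlog clamp and stays there. With a cap of one step per frame, frame time stays bounded, at the price of the simulation falling behind real time.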

    Another solution is to not rebuild the world at all, apart from the first time. That can be implemented by a custom system that selectively:
    - Disables the BuildPhysicsWorld
    - Asks instead to update the AABB tree (imprecise collision detection though)

    That requires some small additions at the moment to the BuildPhysicsWorldData to make this cleaner, but it is a viable solution that led to very good performance improvements.
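
One way to read "update the AABB tree" is a refit: keep the tree topology that was built once and only recompute the bounding boxes bottom-up. The following is a generic BVH sketch in Python, assuming a simple binary tree of 2D boxes; the `Node` type and the `refit` function are hypothetical and are not part of the Unity Physics API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (min_x, min_y, max_x, max_y)


@dataclass
class Node:
    box: Box
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def union(a: Box, b: Box) -> Box:
    """Smallest box enclosing both a and b."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))


def refit(node: Node) -> Box:
    """Recompute internal boxes from the current leaf boxes; topology is untouched."""
    if node.left is None and node.right is None:
        return node.box  # leaf: its box was already updated by the caller
    child_boxes = [refit(c) for c in (node.left, node.right) if c is not None]
    box = child_boxes[0]
    for other in child_boxes[1:]:
        box = union(box, other)
    node.box = box
    return box


leaf_a = Node((0.0, 0.0, 1.0, 1.0))
leaf_b = Node((2.0, 0.0, 3.0, 1.0))
root = Node(union(leaf_a.box, leaf_b.box), leaf_a, leaf_b)

leaf_b.box = (10.0, 5.0, 11.0, 6.0)  # the body moved far away
refit(root)
print(root.box)  # (0.0, 0.0, 11.0, 6.0)
```

A refit is much cheaper than a rebuild, but because bodies that have drifted apart stay grouped under the same parent, the internal boxes can become large and overlap more, so the tree tends to degrade in quality over time.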

    Because for netcode we had similar problems (with high latency the client needs to simulate a lot of physics steps in one frame to catch up), we experimented with a bunch of these.
    By applying both a sort of "adaptive physics-step rate" (which leads to misprediction and non-deterministic results, as I described) and by avoiding rebuilding the physics world, you can limit the problem.
     
    Thygrrr and optimise like this.
  19. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    I guess it's currently not possible to override the physics systems without modifying the DOTS Physics package, right? I would like Unity to make overriding DOTS Physics much easier, without requiring package modifications, and to provide utility methods and the like to make it easier still.

    Still, if Unity can expose all these options so that developers can toggle the required parts on and off, that would be even nicer. To be honest, I'm not sure how to implement these solutions. They look super advanced to me.
     
    Thygrrr likes this.
  20. Thygrrr

    Thygrrr

    Joined:
    Sep 23, 2013
    Posts:
    705
    For me, the issue is back with a vengeance in 1.0.10.

    I created a thread because I don't even know how to dig deeper into profiling, and it impacts the LTS version rather than pre65:
    https://forum.unity.com/threads/slow-physics-how-do-i-examine-this-performance-readout.1444288/

    It seems to compound drastically above 10k entities.

    All my other physics jobs are pretty simple, so I wonder what the stalling is all about. They should be long, long finished when BuildPhysicsWorld even starts. (maybe they don't, the scheduler is still pretty opaque to me - I read a "UpdateBefore" and "UpdateAfter" as "execution of the other system is guaranteed to finish before the other updates")

    Specifically, any jobs in BeforePhysicsSystemGroup should be completely done executing before BuildPhysicsWorld updates, right?

    Unfortunately, it seems beyond my capabilities to set breakpoints or reliably profile or inspect burst code to validate that assumption, or even to find out which SPECIFIC JOB it is waiting for here in "240 instances on the main thread" (maybe a good idea to add that to the profiler marker or make a profiler counter with a parameter for it?):
     
    Last edited: Jun 8, 2023
  21. n3b

    n3b

    Joined:
    Nov 16, 2014
    Posts:
    56
    The main bottleneck here is the constant back-and-forth copying of data between ECS and Physics. As I understand this was initially done to enable the use of physics data during the simulation process. However, when dealing with large quantities of data, as seen in this case, the time required for copying outweighs any benefits gained rather than using physics data before and after the simulation.

    To address this issue, one possible solution would be to store the rigid body index as component data and directly utilize physics collections instead.

    Another slowdown occurs during the BVH BuildFirstNLevels, where the sorting of all AABBs is performed in a single thread. This task could be parallelized across multiple threads, especially when working with clusters of bodies (which would also open up the opportunity for an open world with a single physical world). By making this change, I've observed BVH build times improve by roughly 4x to 10x, depending on how the bodies are distributed over clusters. Additionally, the points collection can be cached, so that only AABB expansion and repositioning are needed.
     
    Last edited: Jun 8, 2023
    daniel-holz and optimise like this.
  22. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    @n3b: Thanks for the pointers. Note that we are aware of some data duplication between ECS and core Physics and we are discussing this internally. We'll keep looking into this.

    @optimise : quick update on the data and project you provided us with.
    Looking at your data specifically for 1.0.0-pre65, I can confirm that in your scene physics is stepped twice per frame.
    That's why you see a "sync point" blocking execution within the BuildPhysicsWorldSystem.
    This is the second BuildPhysicsWorldSystem, which naturally must wait until the first step is completed (marked below with an X).
    So it only seems as if this system is wasting time, while the jobs are actively working on finishing the previous step, as you can see in the screenshot. The sync is obviously necessary since we cannot start a second physics step before the first one has completed.

    upload_2023-6-8_21-21-54.png

    I indicated with arrows to the right that the same sequence of jobs will be scheduled again here as part of the second step in this frame.
     
    optimise and n3b like this.
  23. n3b

    n3b

    Joined:
    Nov 16, 2014
    Posts:
    56
    @daniel-holz Sorry, I should have made it clear that what I mentioned earlier was for folks who want to boost performance in specific tricky situations. There's no one-size-fits-all solution, and the current system already does a great job in most cases.
     
    daniel-holz likes this.
  24. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    Ya. I would like Unity to completely eliminate the "sync point", so that no matter whether physics is stepped twice or more per frame, performance improves significantly, especially with DOTS Netcode, which will definitely step multiple times per frame.

    The next step would be to implement the "simulate bigger steps" idea suggested by CMarastoni to quickly exit the stalling loop, so that it doesn't get stuck at super low fps for a long time. Consider exposing the elapsed time between frames for tuning in the Physics Step authoring component. I would like this improvement implemented in the official package.

    I would also like the "don't rebuild the world" improvement implemented in the official package. I'm not sure how to implement it, but I'm thinking of something like an enableable tag component (say, BuildPhysicsWorldEnableableTagComponent) to enable/disable BuildPhysicsWorld. For "ask instead to update the AABB tree", I'm also not sure how it should be implemented, but I guess it could again be an enableable tag (say, UpdateAABBTreeEnableableTagComponent). Maybe even better, make the physics systems configurable so that developers can just toggle the required systems on and off, but this would need a system baker, like the current authoring bakers, to bake components into an entity.

    Btw, I'm curious why manually updating the AABB tree leads to imprecise collision detection?

    I guess in OOP land it's extremely hard to get a one-size-fits-all solution, but in DOTS land I believe it's possible to figure out a generic solution that solves most problems with great performance. If that's not possible, I think Unity could implement on/off toggles in authoring and in the physics systems for developers to mix and match, and go further by making it possible to override the default physics systems and add new custom physics systems without package modification.
     
  25. n3b

    n3b

    Joined:
    Nov 16, 2014
    Posts:
    56
    I'm afraid there's absolutely no way to implement a one-size-fits-all solution, regardless of the programming paradigm you choose. There are numerous cases where the solutions I proposed may perform worse than the default DOTS algorithms.

    Burst is the general solution that Unity provides to boost performance across various scenarios. However, it's up to the developers to adapt and optimize specific algorithms according to their needs. It would have taken Unity years to figure out how users would utilize their framework effectively and make improvements accordingly.

    I'm glad that we have access to the source code to customize it for our own requirements. Although, I would really appreciate it if Unity teams used fewer "private/sealed/internal" modifiers and more "partial" to facilitate customization.
     
    daniel-holz likes this.
  26. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    Your suggestions are all very valid. I think these are good general improvements and I took a note for us internally. Thanks again.
     
    optimise likes this.
  27. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    The performance would not automatically improve since the sync point is happening in the middle of 2 consecutive physics steps. The jobs launched by the physics systems for the first step need to be completed before the now waiting BuildPhysicsWorldSystem can launch step 2.

    If we were to remove the sync that now causes the wait in the second execution of BuildPhysicsWorldSystem, and if we could possibly schedule all the jobs for step 2 ahead of time without even having any information on what the result of step 1 is (we cannot easily do that), the BuildPhysicsWorldSystem would not block on the main thread, but the work scheduled by the systems from steps 1 and 2 would need to be completed anyway. So you wouldn't get any difference in the end in terms of job execution time for both sets of jobs (for steps 1 and 2 combined) whether there is a sync in the middle or not, since the sync is waiting on running jobs. So in terms of job processing, there is no gap. The processing is never halted.

    If however you want to do some extra work on the main thread after completion of the physics systems from step 1 (at the moment where now you have the BuildPhysicsWorldSystem waiting), that is perfectly fine. You would just need to add your systems to the correct group to benefit from that gap so that your systems are launched after the systems from the first physics step have finished (not necessarily the jobs though!) and before the BuildPhysicsWorldSystem starts execution of the second physics step. Just be aware of data dependencies here. You will not be able to read any data that the systems and jobs in the first physics step are still manipulating, like the velocities and transforms of rigid bodies, until these jobs have finished (which is exactly what the BuildPhysicsWorldSystem from the second step is waiting for also btw).

    That group that you could add your systems to could be the AfterPhysicsSystemGroup.

    I wonder if you do indeed want 2 consecutive physics steps happening within one FixedStepSimulationSystemGroup update. This could be the main issue here, which blocks everything up.
    The reason why this is happening is due to the activities of the FixedRateCatchUpManager, which decides that you need an additional physics step. For more details on how to control the behavior of this catch up manager (which is used by default in Entities), please refer to this other post on this topic.
     
  28. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    @n3b : Would you mind sharing the parallelized implementation of BuildFirstNLevels btw?
     
  29. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    @optimise : Quick update here. I am getting almost twice the FPS with 20000 spheres spawned in 1.0.0-pre65 compared to 0.5.1 with your stress test. In 0.5.1 it's almost coming to a halt while in 1.0.0 it's going at ~45 fps even after all spheres have dropped to the floor. Before, while the spheres are in the air, it runs at 60 FPS.

    I will look into this more now.

    Edit: The slow down in the old version might be due to the rendering (more than just the game tab open). I will confirm.
     
    Last edited: Jun 10, 2023
    optimise likes this.
  30. n3b

    n3b

    Joined:
    Nov 16, 2014
    Posts:
    56
    Instead of sorting both sides of the large range, the worker delegates one side to others using a ring queue with a spinlock. This approach allows me to process all large ranges, not just a specific number (N), meaning I have a dynamic number of branches that is not limited to 64. Subsequently, the next stage sorts small ranges in parallel, and the following stage builds nodes in parallel as well.

    The clusterization further improves performance by reducing the number of points per worker during the initial iteration and increasing the number of busy workers at the beginning.

    However, it's worth noting that the spinlock here doesn't perform well with relatively small sets with many workers. Thus, the number of workers should be chosen dynamically. Currently, I have kept it constant, but it may need to be adjusted in the future.
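
The delegation pattern described above can be sketched in a few lines of Python. This is a single-threaded illustration only: the `deque` stands in for the shared ring queue, there is no spinlock, points are plain 1D values, and the names `three_way_split` and `build_sorted` are hypothetical. It is not the implementation from this post.

```python
from collections import deque


def three_way_split(pts, lo, hi):
    """Partition pts[lo:hi] around its median value; return bounds of the 'equal' block."""
    seg = pts[lo:hi]
    pivot = sorted(seg)[len(seg) // 2]  # median found by sorting, for brevity only
    less = [p for p in seg if p < pivot]
    equal = [p for p in seg if p == pivot]
    greater = [p for p in seg if p > pivot]
    pts[lo:hi] = less + equal + greater
    return lo + len(less), lo + len(less) + len(equal)


def build_sorted(points, small_range=4):
    pts = list(points)
    queue = deque([(0, len(pts))])  # stands in for the shared ring queue
    small_ranges = []
    # Stage 1: split large ranges; keep one side, delegate the other.
    while queue:
        lo, hi = queue.popleft()
        while hi - lo > small_range:
            eq_lo, eq_hi = three_way_split(pts, lo, hi)
            queue.append((eq_hi, hi))  # side handed to whichever worker is free
            hi = eq_lo                 # side the current worker keeps splitting
        if hi > lo:
            small_ranges.append((lo, hi))
    # Stage 2: small ranges are independent, so they could be sorted in parallel.
    for lo, hi in small_ranges:
        pts[lo:hi] = sorted(pts[lo:hi])
    return pts


data = [5, 3, 9, 1, 7, 3, 8, 2, 6, 4, 3, 0]
print(build_sorted(data) == sorted(data))  # True
```

Because every large range that is split pushes its other half onto the queue, the number of outstanding branches grows with the data instead of being fixed up front, which is the property the post contrasts with a fixed number of levels.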
     
    Last edited: Jun 9, 2023
    daniel-holz likes this.
  31. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    Very helpful. Many thanks for the details.
     
    n3b likes this.
  32. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    If I understand correctly, the physics data was designed in an ECS-agnostic way a long time ago. I think now is the time to fully adopt ECS components, including creating components at runtime that have native collection fields such as NativeHashMap, to make it truly full DOTS Physics and much more integrated, and to boost performance further.
     
  33. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    Alright, I had a closer look and for 20000 spheres I am getting quite similar results between the two versions.

    Here is a common frame at the end of the drop in version 0.5.1:
    upload_2023-6-10_14-29-36.png

    B roughly marks the beginning of the physics step.
    E roughly marks the end of the physics step (the last ExportPhysicsBodies job ends there).

    So, you can see that the physics is done about at time 10.5 ms since the beginning of the frame.

    Here is the same situation in version 1.0.0-pre65:
    upload_2023-6-10_14-37-12.png

    As you can see, the frame layout on the physics side is very similar and here it also ends at about time 10.5ms since the beginning of the frame.

    One big difference here is that these regular spikes that you see on the timeline in 0.5.1 are gone in 1.0.0.

    Note that I had to make sure that Burst Safety Checks are disabled in both runs. Otherwise, I was getting worse performance.

    So, to conclude, in cases where there is a single step done per frame, the performance doesn't look much different between both versions with potentially a small improvement in 1.0.0 compared to 0.5.1 and more stable frame rates.

    Furthermore, as soon as the FixedRateCatchUpManager decides to do a second physics step within the FixedStepSimulationSystemGroup, you naturally get the sync point at the beginning of the second step, and you will get much higher time consumption in a single frame. This is just normal if this happens. If you want to avoid having more than one step per frame, you need to change the RateManager as previously mentioned.
     
  34. optimise

    optimise

    Joined:
    Jan 22, 2014
    Posts:
    2,129
    I think I know why you still get similar results for both 0.5.1 and 1.0: you are using quite a powerful machine. I tested 20000 spheres on an i7-7700HQ to reproduce the huge fps drop when the spheres touch the ground. Since your machine is quite powerful, you will need to increase the sphere count until you can reproduce the huge fps drop when the spheres touch the ground, while spawning spheres in mid-air still gives a solid 60 fps.
     
  35. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    Since the timing results with a single physics step per frame and 20000 spheres are very similar, while both being significant in time consumption and they were run on the same machine, we can conclude that there has not been any performance regression.
    More spheres might just get you into the "two steps per frame" territory, which might show as a regression, but could just be a different stepping pattern (1 step in old version vs 2 steps in new).

    I didn't run this on a particularly performant machine. I ran it on a laptop with i7-10850H, which is good but not the latest and greatest. That said it does compare positively against your CPU overall, with single threaded performance being comparable (rating of 2715 vs 2068). Obviously, specifically with DOTS, parallel performance matters most.

    What might be a big factor also is that I made sure that the jobs safety checks were off and I only render one view (the game tab). This made a big difference in performance in both versions.

    So, I would urge you to check if you do get into that "2 vs. 1 physics step" situation between the two versions and if you don't want it to ever do 2 physics steps, simply change the RateManager in your 1.0.0 version, as I had suggested above. Details for how to do this can be found in this other post.
    In a nutshell, changing the RateManager to the FixedRateSimpleManager might be the solution in your case.
     
    Last edited: Jun 13, 2023
    Thygrrr likes this.
  36. daniel-holz

    daniel-holz

    Unity Technologies

    Joined:
    Sep 17, 2021
    Posts:
    297
    After I had some unit issues with the data captured in the profiler in 2020.1 and in 2022.2 (the former used units of seconds while the latter used milliseconds, making it impossible to compare the two data sets in the Profile Analyzer), I have now successfully converted the units (by saving the 2020.1 profiler data as .pdata file in the profile analyzer in 2020.1 and loading it in the profile analyzer in 2022.2) and was able to create this apples to apples comparison:

    upload_2023-6-13_13-7-25.png

    Highlights:
    • 0.5.1 on 2020.1 shows some occasional spiking which can be seen in the analysis (see max in the 0.5.1 range at the bottom right). These spikes are gone in 1.0.0 on 2022.2.
      Note that these spikes are infrequent enough not to skew the comparison, as can be seen by the median and mean data points for 0.5.1 which are almost the same.
    • 1.0.0 is on average ~8% faster than 0.5.1, as can be seen in the median (and mean) data points (see bottom right).