XBOX Performance Revisited

Discussion in 'Windows' started by stonstad, Nov 2, 2018.

  1. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    I can't seem to figure out why I'm getting abysmal performance on an XBOX One X.

    I reviewed other helpful posts (https://forum.unity.com/threads/uwp...and-other-disasters-help.519008/#post-3503893), (https://forum.unity.com/threads/uwp-xbox-one-poor-performance-enhanced-access.504884/page-2), and (https://forum.unity.com/threads/cas...r-than-no-batching-at-all-uwp-xbox-one.491982) but I'm still at a loss.

    Just not sure what I'm missing. I know there are plenty of configuration options required to get it right. Here's what I've tried so far, please bear with me!

    Here's my setup
    1) Unity 2018.2.13f1.
    2) XBOX One X with OS version 10.0.17763.2023.
    3) The title is UWP with .NET scripting backend (required).
    4) XBOX development kit is October 2018.
    5) Resolution is 1920x1080.
    6) Test scenario is a straight UWP export from Unity without other integration.

    Things I am definitely doing
    1) Compiling in Release mode from both Unity and Visual Studio (Release x64).
    2) Setting app type to "game".
    3) No VS debugger attached.

    Other things I tried
    1) Tested with D3D11 and D3D12 (D3D12 7 FPS faster on XB1X).
    2) Tested with static batching, dynamic batching, and graphics jobs on/off (off=slower).
    3) Disabled all shadows (from quality settings)

    Things telemetry tells me that I don't understand.
    1) CPU utilization for the app is low (< 3%) which matches observed behavior on Windows/PC.
    2) GPU utilization is pegged at 20% of ONE core. I understand only one core may be dedicated to 3D rendering. No idea why this is happening.
    3) Memory allocation is normal, about 150MB.

    My baseline system (for comparison purposes) is an i7-8650U (1.9 GHz) with an NVIDIA GeForce GTX 1060. Outside the Unity editor, via a built UWP binary, I'm seeing 38 FPS @ 4K and 65 FPS @ Full HD with vsync off. XB1X @ Full HD is 12 FPS.

    Compiler Version and Build from Unity Log
    Built with Compiler Ver '190024218'
    Built from '2018.2/release' branch
    Version is '2018.2.13f1 (83fbdcd35118)'
    Release build
    Application type 'XAML'

    Here's a screenshot of the low GPU utilization I mentioned. I've read of other forum members seeing 100% of one core. I'm apparently stuck at 20% of one core, both with and without D3D12.

    [Screenshot: FPS & Memory on XB1X]

    [Screenshot: Graphics Details on XB1X]


    Any suggestions or tips are greatly appreciated. I'm hitting my head on a wall trying to move forward...
     
    Last edited: Nov 2, 2018
  2. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    I suggest using the profiler if you're able.
     
  3. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    OK, I'm trying to get Unity's profiler to connect. The editor and XB1X are on the same subnet and Client/Server is enabled in the capabilities file. No luck yet -- but hopefully I can learn something via profiling once I get a connection. So there are no known perf issues with GPU instancing or batching on XB1? That's good to know if true.
     
  4. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    The batching issues that we were made aware of were fixed a long time ago. I second the advice to use a profiler.

    Did you try using D3D build type? XAML can hurt performance (I don't know by how much, though).
     
  5. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    OK ...
    ... the tallest pole in the tent?



    Physics... I'm aware of the potential hidden cost of physics. I log collisions and I do not have mesh colliders. Due to logging I know there are at most two collisions per minute. Throughout PC development I was vigilant and routinely monitored the impact of physics. On my slower laptop enabling and disabling physics doesn't change framerate.

    But OK... Let's say that I have hidden collisions and I need to fix my physics -- we can move on. However, the particle system was the next-highest item in the profiler. I disabled particles and I am no longer CPU bound. I'm getting 70+ FPS and the GPU shows 95% utilization.

    So the original problem is that the game is extremely CPU bound on the XBOX. On the PC disabling physics or particles doesn't change framerate by a single frame. It feels like I am not getting enough CPU cycles on the XBOX -- is this possible?

    According to this MS article a UWP title is supposed to have 4 exclusive and 2 shared CPU cores (https://docs.microsoft.com/en-us/windows/uwp/xbox-apps/system-resource-allocation).

    Here's the alleged graph of XBOX CPU utilization -- which does NOT show the game as being CPU bound.

    When you run your games on XB1 is this what your CPU utilization looks like when the game is CPU bound?


    I'm just really confused by this behavior. How might I prove that the XBOX is heavily CPU throttled?
     
  6. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    I don't think you're getting throttled - you're main thread bound. That graph shows total CPU usage, which would be 100% if you utilized all 8 cores.

    Don't forget that the Xbox One CPU is very slow - your i7-8650U is probably 6 times faster. iPhones nowadays have faster CPUs.
     
    Mr-Mechanical likes this.
  7. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    You may be right. OK, so it's CPU bound on the XB1X. Is that because of my content or an inefficiency somewhere? i.e. are SIMD calculations slow because I'm using the .NET scripting backend instead of IL2CPP? Is the particle system not working in procedural mode? Even though my laptop certainly has the possibility of being faster, my gut tells me something is wrong here. I appreciate the help. I will think on this and see if I can figure something out.
     
    Last edited: Nov 3, 2018
  8. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    Honestly, I can't answer these questions - it all depends on content. SIMD isn't related to scripting backend unless you're using burst jobs - in that case, they will only run in fast mode on IL2CPP. Our code doesn't do anything special just because it's running on Xbox - you're getting the same code paths as on PC.

    You'll have to use the profiler to figure out how to optimize it.
     
  9. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    Fair enough. Thank you for your helpful responses. I respect your experience and knowledge!

    My goal is to understand why my game is slow so that I may fix it and release it upon the world. :)

    The first question I hope to answer is -- which system has better single-core performance? A Surface Book 2 i7-8650u or an XBOX One X with benchmarking limited to a single core (i.e. main thread CPU bound). We can partially answer this question through a synthetic benchmark.

    The Test: A CPU-bound Unity project with no rendering, no camera and a single script to execute prime number factorization to the Nth digit for each frame update. A sufficiently large number is chosen so that net FPS is around 30 FPS. Vsync is disabled so that we aren't capped at 60 FPS on a potentially fast machine. I'll test on both aforementioned systems and include tests for .NET and IL2CPP scripting backends.
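A minimal sketch of the kind of CPU-bound, once-per-frame workload described above. It is written in Python rather than as a Unity script so it can be run anywhere. The function names, the number being factored, and the frame count are illustrative assumptions, not the code behind the results in this thread.

```python
import time


def prime_factors(n):
    """Return the prime factors of n (with multiplicity) by trial division."""
    factors = []
    divisor = 2
    while divisor * divisor <= n:
        while n % divisor == 0:
            factors.append(divisor)
            n //= divisor
        divisor += 1
    if n > 1:
        factors.append(n)
    return factors


def measure_fps(work_item, frames=30):
    """Factor work_item once per simulated frame; return average frames per second."""
    start = time.perf_counter()
    for _ in range(frames):
        prime_factors(work_item)
    elapsed = time.perf_counter() - start
    return frames / elapsed


if __name__ == "__main__":
    # Hypothetical work item: make it larger or smaller until the reported
    # rate lands near the target (about 30 per second in the test above).
    n = 100_003 * 100_019
    print(f"{measure_fps(n):.1f} simulated frames per second")
```

Because the loop is single threaded and does no rendering, the reported rate depends only on how fast one core gets through the factorization, which is the quantity the test is trying to isolate.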

    Results:

    Single core performance on an XBOX One X is better than Surface Book 2 i7-8650u.



    When system CPU utilization is approximately 18%, a UWP title without threading is CPU bound. This is logical since we are allotted 6 processors (4 of which are fully available) and 1 / 6 is about 16.7%.
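The arithmetic above can be sketched as follows. One fully busy thread shows up as roughly 1/N of a total-utilization graph, where N is the number of cores the graph is normalized against. Which N the Xbox graph actually uses is an assumption here: 6 follows the allocation described above, 8 follows the earlier reply about all 8 cores.

```python
def single_thread_share(core_count):
    """Percent of total CPU reported when exactly one core is fully busy."""
    return 100.0 / core_count


for cores in (6, 8):
    share = single_thread_share(cores)
    print(f"{cores} cores: one saturated thread reads as {share:.1f}% total")
```

Either way, a reading in the 12-18% range is what a main-thread-bound title would be expected to show, so a low total percentage does not by itself rule out a CPU bottleneck.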



    Here is what I conclude. You're correct -- the XBOX One X is not CPU throttled or nerfed. I also conclude that if my Unity game is not GPU bound it shouldn't run drastically slower on an XBOX One X compared to a Surface Book 2 i7-8650u.

    Here is what physics looks like on Surface Book 2 i7-8650u:



    And here is what it looks like on an XBOX One X.



    edits: grammar
     
    nrXic likes this.
  10. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    In comparing these two CPU usage graphs... I see that physics isn't the only system that runs more slowly. Rendering and scripts are also 2X slower.
     
  11. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    Try lowering the rate of physics updates so it doesn't go into a spiral of degraded performance.
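A rough model of the "spiral" mentioned above, assuming a fixed-timestep loop that runs extra physics steps to catch up whenever the previous frame ran long. All costs are made-up numbers in microseconds, and the one-third-second clamp is a hypothetical stand-in for a maximum allowed timestep. This is a sketch of the general mechanism, not Unity's actual scheduler.

```python
def simulate_frames(other_us, step_cost_us, fixed_dt_us, max_delta_us, frames=200):
    """Return per-frame durations (microseconds) for a fixed-timestep loop.

    Each frame runs as many physics steps as needed to catch up with the
    time the previous frame took, with that time clamped to max_delta_us.
    """
    durations = []
    accumulator = 0
    last_frame = other_us
    for _ in range(frames):
        accumulator += min(last_frame, max_delta_us)
        steps = accumulator // fixed_dt_us
        accumulator -= steps * fixed_dt_us
        last_frame = other_us + steps * step_cost_us
        durations.append(last_frame)
    return durations


if __name__ == "__main__":
    # Hypothetical costs: 10 ms of non-physics work per frame.
    settled = simulate_frames(10_000, 15_000, 20_000, 333_333)
    slower_rate = simulate_frames(10_000, 15_000, 40_000, 333_333)
    runaway = simulate_frames(10_000, 25_000, 20_000, 333_333)
    print("50 Hz physics, 15 ms/step:", settled[-1], "us per frame")
    print("25 Hz physics, 15 ms/step:", max(slower_rate), "us worst frame")
    print("50 Hz physics, 25 ms/step:", runaway[-1], "us per frame")
```

In this model, once a single physics step costs more than the fixed timestep, every frame falls further behind and only the clamp stops the growth. Halving the physics rate cuts the number of steps per frame and keeps frame time bounded.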
     
    stonstad and Peter77 like this.
  12. Mullan7

    Mullan7

    Joined:
    May 23, 2013
    Posts:
    79
    Keep in mind that performance restrictions apply to UWP. To get full CPU power you have to join ID@Xbox and get access to the XDK.
     
    Behappy859, stonstad and Gametyme like this.
  13. Peter77

    Peter77

    QA Jesus

    Joined:
    Jun 12, 2013
    Posts:
    6,419
    Furthermore, you can try disabling Physics.autoSyncTransforms.
    https://docs.unity3d.com/ScriptReference/Physics-autoSyncTransforms.html

    Here are a few Physics optimization resources:
    https://unity3d.com/learn/tutorials/topics/physics/physics-best-practices
    https://docs.unity3d.com/Manual/iphone-Optimizing-Physics.html

    Also make sure that when you build with IL2CPP, you don't use the debug configuration while profiling performance.
     
    stonstad and hippocoder like this.
  14. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    This is partially true. You get access to 7 cores with XDK, rather than 6, but it doesn't help single threaded performance and wouldn't help in OP's case.
     
  15. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    Thanks for the feedback -- I do appreciate it. I think we're missing a key concept here.

    Why would physics cause the game to be CPU bound when total system XB1X CPU utilization is ~18%, or about one core? Isn't Unity physics multithreaded on all x64 platforms across six processors (4 dedicated)? Again, refer to the synthetic benchmarks which show XB1X compute to be faster than the reference system.

    I reran tests (for my game, not the benchmark) with just physics enabled and then a separate test with rendering enabled. Both physics and fill rate are 5-6x faster on XB1X compared to the reference system (i7-8650u) tested above. Synthetic testing reaffirms this concept. But put them both together and it's a slideshow.

    I think there is something broken in Unity physics. The "make UWP suck" flag needs to be turned off.
     
    Last edited: Nov 6, 2018
  16. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    @Peter77 I appreciate your thoughts on ways to address the behavior. Reducing complexity fails to address why Unity runs poorly on a system that should otherwise perform quite well. Maybe XB1X multithreaded compute is 5% slower, or 5% faster. I am not going to fix this problem -- but I am calling attention to the idea that something is wrong here. The numbers and performance disparity do not add up.

    *edited. I appreciate the thoughts/ideas on possible causes and ways to fix.
     
    Last edited: Nov 5, 2018
  17. hippocoder

    hippocoder

    Digital Ape Moderator

    Joined:
    Apr 11, 2010
    Posts:
    29,723
    It's a decision by Microsoft. The console is not allowed to allocate full resources because it is running alongside other things. That is what I understood so far. But if one core breaks your game, I would imagine you don't really appreciate just how limited the Xbox is vs a desktop. It's mostly bandwidth being the problem.

    Why not file a bug report and drop the case number here? Staff can then at least peep at it and verify :)
     
  18. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    I'm very confused by the "single threaded performance" numbers you posted. What exactly runs faster on Xbox One if scripting, rendering AND physics all run slower on Xbox One? From my testing, Xbox One single threaded performance was much, much slower than any PC I could get my hands on.

    Not all physics calculations are multithreaded by the way.
     
  19. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    Running just physics with all cameras turned off is faster on XB1X compared to SB2. Running just rendering with physics disabled is faster on XB1X compared to SB2. However, if both rendering and physics are enabled on XB1X it is slower than SB2. XB1X is 12FPS, SB2 is 65FPS.

    I keep coming back to the well on this because it doesn’t make sense.
     
  20. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    Just to add some color — if I run a script which does just compute — prime number factorization in update — which is main thread only — XB1X and SB2 desktop performance is quite close. So we have a metric that says compute *could* be close but then we all agree there are other factors at work. So then we also take into account the number of cores and processor speed, limiting factors caused by the OS, etc.

    What scenario is more likely:

    A) Synthetic benchmarks show equivalent compute performance. And then a real world test (game) runs faster with just physics enabled or just rendering enabled (faster fill rate on XB1X), but not both simultaneously. The difference is 12FPS XB1X and 65 FPS SB2. Because console.

    Or

    B) There is some kind of interaction defect (thread locking, platform code specific inefficiency, etc) in Unity which causes two otherwise identical x64 binaries to behave very differently.

    I don’t know and that’s why I am here... It just isn’t as obvious to me as it is to others here that performance on XB1X should be this slow.

    I feel like, to prove my point, I need to build a synthetic benchmark for CPU and GPU in both Unity and Unreal Engine, and then test both scenarios while CPU bound, GPU bound, and include a more blended scenario.
     
    Peter77 likes this.
  21. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    I completely agree. Part of the effort would be figuring out what a good test looks like. I think one possible test is to modify the prime number factorization to be multithreaded, and then compare both throughput and processor utilization telemetry on XB1X for single-threaded and multithreaded runs. Does it scale up some or is it flat?
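One way to put a number on "does it scale up or is it flat" is to compare measured throughput for the single-threaded and multithreaded runs. The sketch below only does the arithmetic. The function name and the sample rates are hypothetical, not measurements from either machine.

```python
def scaling_report(single_rate, multi_rate, workers):
    """Return (speedup, efficiency) for a multithreaded run.

    speedup    = multi-threaded throughput / single-threaded throughput
    efficiency = speedup / number of worker threads (1.0 is perfect scaling)
    """
    speedup = multi_rate / single_rate
    efficiency = speedup / workers
    return speedup, efficiency


if __name__ == "__main__":
    # Made-up throughput figures (work items per second) for illustration.
    for label, multi in (("scales well", 150.0), ("flat", 33.0)):
        speedup, efficiency = scaling_report(30.0, multi, workers=6)
        print(f"{label}: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

A speedup near 1x with six workers would suggest the extra threads are not getting real core time. A speedup near the worker count would suggest the cores are available and the bottleneck is elsewhere.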

    I'd like to build a test for physics, but how do I establish what is normal?
     
  22. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    Just to make it clear: you got those results from the exact same UWP app running on Surface and Xbox?

    Did you try looking at profiler "timeline" view and comparing them? I suggest looking at absolute timings (in ms), rather than framerate.
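As a quick illustration of why absolute timings are easier to compare than framerates, the sketch below converts the two Full HD figures quoted earlier in the thread (12 FPS on XB1X, 65 FPS on the Surface Book 2) into milliseconds per frame. The function name is illustrative.

```python
def frame_time_ms(fps):
    """Convert frames per second to milliseconds per frame."""
    return 1000.0 / fps


xb1x_ms = frame_time_ms(12)
sb2_ms = frame_time_ms(65)

print(f"XB1X: {xb1x_ms:.1f} ms per frame")
print(f"SB2:  {sb2_ms:.1f} ms per frame")
print(f"Extra time per frame on XB1X: {xb1x_ms - sb2_ms:.1f} ms")
```

Framerate is not linear in cost, so a per-system breakdown in ms (physics, scripts, rendering) shows directly where the extra time per frame goes.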

    You could also use a different profiler. I suggest windows performance analyzer - CPU sampled profile: https://files.unity3d.com/zilys/ETWPerfGuide/
     
  23. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
  24. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    Keep in mind that it works on Xbox too (since it is running Windows 10). I just thought it might be useful to compare. In either case, I asked our physics devs to read your other thread.
     
  25. Kriszo91

    Kriszo91

    Joined:
    Mar 26, 2015
    Posts:
    181
    Hello!

    I'm trying to get smooth 4K on the X, but the console doesn't seem to be fully utilized: device usage is low, yet performance is low too. I'm not using physics; all Physics settings are at their defaults. I have a main menu where only one character is shown with some planes around it, plus a canvas with the UI elements. The vertex count is around 150k to 300k, with no GPU instancing (because it isn't working), on DX12. CPU usage is around 10-20% and one GPU engine is around 50-60%, so I think something is not going well. I don't know whether it comes from the Unity side or from Microsoft, but the X has a 6 teraflop GPU, close to a GTX 1070 (6.46), while a GTX 1080 has 9. I'm using a 1080 and I get 200-300 FPS at 4K in this menu scene. Also, Everspace was released as UWP, but with Unreal Engine. I think the whole Unity UWP workflow needs to be revised somehow.
     
  26. Mullan7

    Mullan7

    Joined:
    May 23, 2013
    Posts:
    79
    Thanks that's good to know. So is that the only real performance difference? The Xbox creators program docs led me to believe that XDK was significantly more powerful.

    As for UWP performance troubles in general, the Unity alpha has a bug fix for multithreaded rendering on UWP DX12 which has helped me quite a bit.
     
  27. Kriszo91

    Kriszo91

    Joined:
    Mar 26, 2015
    Posts:
    181
  28. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    I backported that bug fix to 2018.3 beta too - it should show up soon.
     
  29. stonstad

    stonstad

    Joined:
    Jan 19, 2018
    Posts:
    596
    @Tautvydas-Zilys Do you have any detail you can share around the DX12 multithreaded rendering fix? How do you find time to respond to forums and fix code? Thank you for your efforts.
     
  30. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    It makes us build DX12 commands on multiple render threads, rather than a single one. It helps if you're render thread bound (you'd need to use the timeline profiler to see if that's the case).

    C++ code doesn't compile instantly :(.
     
  31. f0ff886f

    f0ff886f

    Joined:
    Nov 1, 2015
    Posts:
    201
    Will this get backported to Unity 2017 LTS?
     
  32. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    No, that feature was experimental at best in 2017.4 even on the standalone player.
     
  33. f0ff886f

    f0ff886f

    Joined:
    Nov 1, 2015
    Posts:
    201
    Do you have any rough idea what the % speedup may be for a heavily render-bound title? Is it meaningful, or just a ms or so at best?
     
  34. Tautvydas-Zilys

    Tautvydas-Zilys

    Unity Technologies

    Joined:
    Jul 25, 2013
    Posts:
    10,501
    I saw a 10% improvement on a project I tested. No idea what the upper limit is, though.
     
  35. nrXic

    nrXic

    Joined:
    Jun 9, 2013
    Posts:
    2
    I feel unqualified to speak on this because my Unity experience is minimal, but I also found the performance to be bizarre. While I am working on a 2009 AMD CPU that is fairly equivalent to the Xbox One CPU, I do realize that the restrictions (even the expanded restrictions that allow games to play better) imposed by MS here play a role in performance.

    I've tried many tests to see what could run well on the Xbox and felt that yes, things are very CPU bound here. I didn't document my tests and results, but I got the impression that the number of scripts, the physics, the number of objects, all of these were issues.

    In order to get the performance I'm looking for, I have resorted to vertex-based lighting with stencil shadows (which IMO doesn't look too bad). I'm actually fine with that as I'd like to make some retro-looking titles. Low poly setup (max 15K triangles on screen at once), as well as fairly aggressive occlusion to keep numbers down. I strove for efficiency in my scripts, with physics running at a reasonable clock rate (I tend to have it going at 240Hz out of personal preference but couldn't do so here). The restrictions do make things sensitive to garbage collection, with noticeable hiccups occurring every 15 or so seconds. I had to optimize my code further to try to eliminate that. I've removed collisions from 90% of the objects, really simplifying things.

    But I did want to confirm the idea that things don't seem to be adding up. I wanted performance similar to between the original Xbox and Xbox 360, and couldn't achieve it. But I'm the sort of guy who's "just happy to be here" and I am happy for this opportunity to release console games with little certification to allow for concept games, quirky titles, smaller scoped arcade titles.