Gfx.WaitForPresent 38 FPS Adreno 506, 50+ FPS Adreno 330 ???

Discussion in 'General Graphics' started by NEVER-SETTLE, Jan 27, 2019.

Do you like to WaitForPresent?

  1. Yes

  2. No

  3. Sometimes

  1. NEVER-SETTLE

    Joined:
    Apr 22, 2018
    Posts:
    30
    Scenario: I have a simple scene with around 15k tris (depending on where you look), using Legacy Diffuse shaders. I run the game on a Galaxy S5 running Android 6.0.1. From GSM Arena:

    • released 2014, April
    • Chipset Qualcomm MSM8974AC Snapdragon 801 (28 nm)
    • CPU Quad-core 2.5 GHz Krait 400
    • GPU Adreno 330
    Getting 50 FPS on average. Profiler screenshots:

    galaxy_s5_prof.png
    galaxy_s5_rend.png

    I run the game on a Lenovo P2 Android 7.0. GSM Arena:

    • Released 2016, November
    • Chipset Qualcomm MSM8953 Snapdragon 625 (14 nm)
    • CPU Octa-core 2.0 GHz Cortex-A53
    • GPU Adreno 506
    Getting 38 FPS on average. Profiler screenshots:
    {585B36A7-9226-42F6-8246-CA7081EE5D13}.png
    UnityBug.png

    What I've tried so far:

    • replace the Legacy Diffuse shader with Mobile Diffuse -> both phones gain 10 FPS, same performance gap
    • disable auto graphics api and force GLES2 - no difference
    • force GLES3 - no difference
    • tried using the Snapdragon profiler to make a snapshot capture but it keeps crashing when I try to take the snapshot
    • forcing the same resolution on both devices (1080), no change (I was thinking that Unity might do some magic and downscale on the S5), also note that both phones are 1080p
    Unity 2018.3.0f2

    Same game, same graphics profiles (preferences), same everything. How is this possible?

    This is just a stripped-down main menu of the game. The main scene of the game (a city with 80k tris on average and many Standard shaders) runs at over 40 FPS on the S5 and at 28-30 FPS on the P2 :/. I also tried the city scene on an Xperia XA2 (slightly better specs than the P2), and it runs at around 56-60 FPS.

    PS: Notice in the screenshots that the S5, because it is looking at a different area, renders more things on screen, yet still gets a much higher FPS than the P2.
     
  2. aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    3,014
    Looking at the info on Wikipedia about Adreno GPUs (https://en.wikipedia.org/wiki/Adreno):
    330 has more memory bandwidth than both 506 and 508. 508 has more than 506, so I would suppose that the results you're getting are somewhat correct.

    What is the size of the textures you're using? Do you have mipmaps enabled?
     
  3. NEVER-SETTLE

    Hi aleksandrk, thanks for the link, good info there.

    The textures you see in the profiler are:

    garage - 2048x2048 RGB Compressed ETC 4 bits - 2.7 MB
    garage floor - 1024x1024 RGBA Compressed ETC2 8 bits - 1.3 MB
    garage floor normal - 1024x1024 RGB Compressed ETC 4 bits - 0.7 MB
    garage normal - 2048x2048 RGB Compressed ETC 4 bits - 2.7 MB
    garage_props1 - 1024x1024 RGBA Compressed ETC2 8 bits - 1.3 MB
    garage_props2 - 1024x1024 RGBA Compressed ETC2 8 bits - 1.3 MB
    garage_props3 - 1024x1024 RGB Compressed ETC 4 bits - 0.7 MB

    All are atlases
    All have mipmaps
    All are used in the scene you see in the profiler, and are used just once.

    Other than that, there is the skybox: 6 x 1024x1024 RGBA Compressed ETC2 8 bits - 1.3 MB

    Do you think this is too much for the P2?
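
As a rough sanity check on the sizes listed above: a block-compressed texture takes about width x height x bits-per-pixel / 8 bytes for its base level, and a full mip chain adds roughly one third on top. The sketch below is only an approximation under those two assumptions, and the helper name `compressed_texture_mib` is made up for illustration, but it reproduces the figures in the list.

```python
def compressed_texture_mib(width, height, bits_per_pixel, mipmaps=True):
    """Approximate size in MiB of a block-compressed texture.

    Assumption: a full mip chain costs about 4/3 of the base level.
    """
    base_bytes = width * height * bits_per_pixel / 8
    total_bytes = base_bytes * 4 / 3 if mipmaps else base_bytes
    return total_bytes / (1024 * 1024)


# (name, width, height, bits per pixel) taken from the list in the post
textures = [
    ("garage", 2048, 2048, 4),
    ("garage floor", 1024, 1024, 8),
    ("garage floor normal", 1024, 1024, 4),
]

for name, width, height, bpp in textures:
    size = compressed_texture_mib(width, height, bpp)
    print(f"{name}: {size:.1f} MiB")
```

This gives about 2.7, 1.3 and 0.7 MiB respectively, matching the sizes shown for those textures, so the reported numbers already include the mip chain.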
     
    Last edited: Jan 28, 2019
  4. aleksandrk

    It's very easy to figure that out - just cap the resolution on each texture to 64x64 and see if there's perf improvement :)
     
  5. NEVER-SETTLE

    True. I overrode the default settings for Android and set the max size of all textures (besides the Skybox) to 64; there's no performance difference on either the S5 or the P2. The gap of over 15 FPS is still there.
     
  6. aleksandrk

    Can you try and set "BlitType" in Player settings (Android) -> Resolution and presentation to "Auto" or "Never" and see if the gap is smaller?
     
  7. NEVER-SETTLE

    Yes, I tried that and here are the results:

    Always (this was set as default)
    P2: 36 - 34
    S5: 44 - 42

    Never
    P2: 38 - 36
    S5: 56 - 54 (???)

    Auto - same as Never

    ON TOPIC:
    - the performance gap has now risen to 18 FPS
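
FPS gaps are easier to compare as frame times, because milliseconds add up linearly while FPS values do not. The sketch below converts the measurements above, using the midpoint of each reported range (an assumption made purely for illustration); `frame_time_ms` is a hypothetical helper, not a Unity API.

```python
def frame_time_ms(fps):
    """Convert a frame rate into the time spent on one frame, in milliseconds."""
    return 1000.0 / fps


# Midpoints of the FPS ranges reported for each Blit Type setting
results = {
    "Always": {"P2": 35, "S5": 43},
    "Never": {"P2": 37, "S5": 55},
}

# Milliseconds saved per frame when switching from Always to Never
saved = {}
for device in ("P2", "S5"):
    before = frame_time_ms(results["Always"][device])
    after = frame_time_ms(results["Never"][device])
    saved[device] = before - after
    print(f"{device}: {before:.1f} ms -> {after:.1f} ms (saved {saved[device]:.1f} ms)")
```

With these midpoints, switching from Always to Never saves roughly 5 ms per frame on the S5 but only about 1.5 ms on the P2.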

    OFF TOPIC
    - So I understood that if I keep this Blit option on Never, it will cause problems with Linear color space (the default for Android in PlayerSettings seems to be Gamma), post-process anti-aliasing (I don't use it; the only post-processing I use is ColorGrading), and/or non-native resolutions (I'm using native at the moment, but thinking of limiting to 1080 in the future). So, as far as I understand, I can leave this option on Never, or at worst Auto, to keep that big FPS gain on low-end phones like the S5?
     
    Last edited: Jan 30, 2019
  8. aleksandrk

    On topic: please submit a bug report :)
    Off topic: Auto will turn on blitting when it thinks it's necessary automatically, so you're right.
    It's weird that it didn't do anything on P2.
     
  9. NEVER-SETTLE

    Well, it did give 2 FPS, which is close to nothing compared to the S5 difference.
    Thanks for the help so far!
     
  10. WaqasGameDev

    Joined:
    Apr 17, 2020
    Posts:
    118
    Hi, I am just curious: how did you conclude from his screenshots and specs that the problem lies on the GPU side rather than the CPU side? In both shots it is CPU time that is increasing, and Gfx.WaitForPresent does not necessarily mean that the game is GPU bound. It can be CPU bound as well, as explained by Martintilo from Unity Technologies in another thread (check the underlined text). So he has not shown anything related to the render thread.

    • Gfx.WaitForPresent: When the main thread is ready to start rendering the next frame, but the render thread has not finished waiting on the GPU to Present the frame. This might indicate that your game is GPU bound. Look at the Timeline view to see if the render thread is simultaneously spending time in Gfx.PresentFrame. If the render thread is still spending time in Camera.Render, your game is CPU bound and e.g. spending too much time sending draw calls/textures to the GPU.
     
  11. aleksandrk

    The screenshot says that Camera.Render takes 1.57 ms.
     
  12. WaqasGameDev

    First of all, I think Gfx.WaitForPresent has now been renamed to Gfx.WaitForPresentOnGfxThread. Am I right?

    Regarding your comment: OK, so since there is very little time in Camera.Render in the CPU profiler's Hierarchy view, it is not CPU bound. But when I read the Unity docs' explanation of Gfx.WaitForPresentOnGfxThread with the attached image in view (profiling data from an Android device), it confuses me a lot.

    Documentation says,

    If the render thread spends time in Camera.Render, your application is CPU-bound and might be spending too much time sending draw calls or textures to the GPU.

    At point 2 of image highlighted, in render thread there is Camera.Render function of 6 second on render thread. And it is continuing even after player loop has finished. Does it mean my Game is CPU bound here? What should be the regular scenario? Should this Camera.Render be minimum or it should not even exist here?

    Also at the same time, Unity Doc says,

    If the render thread spends time in Gfx.PresentFrame, your application is GPU-bound, or it might be waiting for VSync on the GPU.

    And at point 1 highlighted in image, you can see render thread has Gfx.WaitForPresent. As per my understanding, this Gfx.WaitForPresent can mean either My game is GPU bound OR my game is just waiting for vsync.

    But the further explanation of the same marker says: "A WaitForTargetFPS sub-sample of Gfx.WaitForPresentOnGfxThread represents the portion of the Present phase that your application spent waiting for VSync."

    That is the case for me. So I assume that my game is CPU bound and not GPU bound. Am I assessing it right? Thanks.


    upload_2021-1-25_12-51-51.png
     
  13. aleksandrk

    In your case it's like this:
    You have two threads, the main thread and the render thread. The main thread waits for the render thread to report that rendering the previous frame has finished - that's where Gfx.WaitForPresentOnGfxThread happens. After it's done telling the render thread what to do for the current frame, the main thread can start preparing the next one.
    The render thread submits the work to the GPU. This is marked as "Camera.Render". Then it tells the window system "OK, I'm done with submitting stuff, show what I just asked the GPU to render". At this point the render thread waits for the GPU to finish rendering to the backbuffer. This is marked by "Gfx.PresentFrame". After this is finished, it tells the main thread "OK, you can start submitting work for the next frame, we're ready".

    In your case you spend 8.68ms waiting for VSync. If your target FPS is 30, you're good, because this means you're neither CPU nor GPU bound - both can finish the work in time and then just wait for VSync.
     
  14. WaqasGameDev

    Below is my understanding, and where my confusion comes from. Thanks a lot for your help so far.

    1) Gfx.PresentFrame is the time the render thread spends waiting for a response from the GPU. It represents the time the GPU takes either rendering/computing OR just waiting for the next VSync.

    2) To split Gfx.PresentFrame further into the time the GPU spent on actual rendering and the time it spent waiting for VSync, we can look at WaitForTargetFPS. This yellow portion of the total Gfx.PresentFrame time NECESSARILY represents the time the GPU spent waiting for VSync. The remaining time, outside the yellow portion, is where the GPU is actually computing. (Please note my emphasis on the word "necessarily"; the source of that emphasis is quoted below. The underlined text is the cause of my emphasis.)

    Reference from Unity Docs :

    Gfx.PresentFrame :
    Represents the time your application spent waiting for the GPU to render and present the frame, which includes waiting for VSync. Samples with the WaitForTargetFPS marker on the main thread show how much time is spent waiting for VSync.

    Gfx.WaitForPresentOnGfxThread :

    If the render thread spends time in Gfx.PresentFrame, your application is GPU-bound, or it might be waiting for VSync on the GPU. A WaitForTargetFPS sub-sample of Gfx.WaitForPresentOnGfxThread represents the portion of the Present phase that your application spent waiting for VSync.


    BUT BUT BUT

    Below is also from the same Unity docs, where it says that WaitForTargetFPS DOES NOT NECESSARILY mean time spent waiting for VSync. It can include time the GPU spends on actual computing as well.

    WaitForTargetFPS :
    To determine what is causing samples with this marker to use a lot of time, switch to the Timeline view in the CPU Profiler module. In this view, you can check what happened on the render thread and how much time passed between this sample ending in the current frame and the same sample ending in surrounding frames.

    If the duration is larger than your application’s frame time should be (based on the targeted frame rate or vSync) your frames are taking too long to render or compute. If that’s the case, investigate the render thread and see how much time it spent on Gfx.PresentFrame over other work it did to prepare and issue commands to the GPU. If the render thread spent a large amount of time in Gfx.PresentFrame, your rendering work is GPU-bound. If the render thread’s time was spent preparing commands, your application is CPU-bound.

    So so so

    This all confuses me a lot. How can I determine how much time is spent waiting for VSync and how much is spent on actual GPU rendering, so that I know when I am GPU bound and not just waiting for VSync? An image is attached in which my frame rate is lower than my 30 FPS target. How do I evaluate the VSync waiting time and the GPU computing time?

    upload_2021-1-26_12-52-4.png
     
  15. aleksandrk

    Right :)
    It may happen that it missed the VSync point by a small (or not so small) amount because of something. In this case it has to wait for the next VSync interval and the application will be either CPU or GPU bound.
    To understand if that's the case, check your target FPS/VSync settings. If, for example, you have a 30 FPS target and your frame takes less than that, then you're VSync bound. If it takes longer, you're either bound by the CPU or the GPU (depends on where it spends time before it starts waiting for VSync).
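
The rule of thumb described here can be written down as a small decision function. This is only a sketch of the reasoning in this thread, not something Unity provides: the function name, its parameters and the tolerance are hypothetical, and the timings would have to be read off the Profiler's Timeline view by hand.

```python
def classify_frame(frame_ms, target_fps, present_frame_ms, command_prep_ms,
                   tolerance_ms=0.5):
    """Guess what limits a frame, following the heuristic from the thread.

    frame_ms          total frame time
    target_fps        target frame rate (from VSync or target FPS settings)
    present_frame_ms  render-thread time spent in Gfx.PresentFrame
    command_prep_ms   render-thread time spent preparing and issuing commands
    tolerance_ms      slack allowed around the frame budget (an assumption)
    """
    budget_ms = 1000.0 / target_fps
    if frame_ms <= budget_ms + tolerance_ms:
        # The frame fits in the budget, so any waiting is just VSync.
        return "vsync-bound"
    if present_frame_ms > command_prep_ms:
        # Over budget and mostly waiting on the GPU to present.
        return "gpu-bound"
    # Over budget and mostly busy issuing draw calls and uploads.
    return "cpu-bound"


# Illustrative numbers only; apart from the 30 FPS target they are made up.
print(classify_frame(33.3, 30, 8.7, 6.0))   # fits the 33.3 ms budget
print(classify_frame(45.0, 30, 30.0, 5.0))  # over budget, waiting on the GPU
print(classify_frame(45.0, 30, 3.0, 35.0))  # over budget, busy on the CPU
```

The order of the checks matters: a long wait in Gfx.PresentFrame only points at the GPU once the frame has already been shown to exceed its budget.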
     
  16. WaqasGameDev

    Sorry, I mistakenly wrote that my target FPS is 30. My target FPS is 60 on 60 Hz devices, because I am not setting Application.targetFrameRate anywhere and my VSync Count setting is Every V Blank.

    Here you wrote that WaitForTargetFPS means "waiting for VSync". If that is the case, why is WaitForTargetFPS 35.73 ms in this image? The time between two VSyncs with the Every V Blank setting on a 60 Hz device should be 16.66 ms, so the wait for VSync cannot be longer than 16.66 ms. But in my case WaitForTargetFPS is reported as 35.73 ms. My confusion still stands: once I have established that I am not CPU bound, because the render thread is just spending time in Gfx.WaitForPresent, how do I identify whether I am VSync bound or GPU bound? What do you conclude from the same image?
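
The arithmetic behind this question can be spelled out as follows. The sketch assumes a 60 Hz display and treats the VSync count as a simple multiplier on the refresh interval; `vsync_interval_ms` is a hypothetical helper, not a Unity API.

```python
def vsync_interval_ms(refresh_hz, vsync_count=1):
    """Time between presented frames when syncing to every Nth vertical blank."""
    return 1000.0 / refresh_hz * vsync_count


wait_ms = 35.73  # WaitForTargetFPS duration reported in the post

for count in (1, 2):
    interval = vsync_interval_ms(60, count)
    ratio = wait_ms / interval
    print(f"count={count}: interval {interval:.2f} ms, wait spans {ratio:.2f} intervals")
```

At 60 Hz the interval is about 16.67 ms, so a 35.73 ms wait spans more than two of them and cannot be a single VSync wait. It is close to one 33.33 ms interval, which is what a 30 FPS target would produce.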
     
  17. aleksandrk

    What is the device you test this on?
    From the screenshot I would say that either the settings lead to 30 FPS instead of 60, or there's a bug somewhere :)
     
  18. WaqasGameDev

    Oh, I tested this on a Huawei Honor 9 Lite, Android 9. So should I conclude that WaitForTargetFPS under Gfx.WaitForPresentOnGfxThread NECESSARILY shows only the time spent waiting for VSync and nothing else? If so, I will recheck whether VSync on the device is set to Every V Blank or Every Second V Blank. But kindly confirm this for me. Thanks.
     
  19. aleksandrk

    On mobile the default settings target 30 FPS.
    I don't think it's possible to get only the time spent in VSync, so it shouldn't be only that.
     
  20. WaqasGameDev

    Thanks for sticking with me for so long. If I go with your comment that time spent for Vsync is not possible to be determined and WaitForTargetFPS can include CPU Render thread/GPU computing time, then the wording #1 and #2 is very misleading/confusing in the attached docs. upload_2021-1-26_16-57-48.png
     
  21. WaqasGameDev

    :confused:
    While struggling to understand these profiler markers, I profiled my new game on an Android device, this time targeting the Vulkan API as well. I came across another marker that is not in the docs: GfxDeviceVK.present. A Google search told me it has something to do with the Vulkan API, but I found no explanation in the Unity docs. Is it the same as Gfx.PresentFrame?


    upload_2021-1-26_18-19-58.png
     
  22. aleksandrk

    @GameHourStudio We'll update the docs about the markers to make them more clear.

    GfxDeviceVK.present is indeed something specific to Vulkan :)
    It's nested under Gfx.PresentFrame, so it's not the same.
     
  23. WaqasGameDev

    Thanks a lot for clearing up so many things that I had been struggling to understand for many months. I am now in a position to dig further on my own, based on our discussion. Have a wonderful time. :) Regards,
     