Insane performance downgrade unless application launched via Build-and-Run

Discussion in 'AR/VR (XR) Discussion' started by isidro02139, Apr 17, 2019.

  1. isidro02139

    isidro02139

    Joined:
    Jul 31, 2012
    Posts:
    72
    We are working on an education app that teaches communication skills in virtual reality. We are currently in the process of porting the project from Windows Mixed Reality, a desktop VR solution, to Google Daydream, and our device of choice is the Lenovo Mirage Solo.

    Obviously, one major part of the porting process is optimizing our scripts and geometry to perform well on a Daydream device. We are able to get the app to run at a stable 75FPS (the native refresh rate of the target device) - BUT ONLY IF WE RUN THE APPLICATION USING UNITY'S "BUILD AND RUN" COMMAND. Running the application by any other means - the device app menu, or via adb - results in an extremely slow, choppy experience, characterized by extremely large spikes (up to 100ms long) that show up in the profiler as "XR.WaitForGPU" and cannot be explained by simple VSync. Additionally, if the headset is taken off the head for at least 5 minutes (so the process is suspended / the device stops rendering and requires the controller to wake up), the performance spikes go away.

    This behaviour has been confirmed to NOT depend on device temperature or build settings. It occurs in development builds and production builds and on multiple different Lenovo Mirage Solo devices, built from several different computers and OSs. The issue is present in both 2017.4.25f1 and the latest 2018.3.11f1 builds.

    Details

    It is as if the application has two states: the normal state where everything works properly, and the state with performance spikes. Which state the app ends up in depends on how it was launched. If it was launched using Unity's Build and Run, it will be in the performant state: it runs at a stable 75FPS, provides a smooth experience and is fully functional (profiler image).

    If launched otherwise it will launch in the state with performance spikes. It will have a low framerate (about 50FPS) and, even worse, the frame drops will be staggered and irregular. This is incredibly disorienting. All the functionality is still there but it is completely unusable from a performance / subjective feel standpoint (profiler image).

    When profiling the problem, the "performance spike" state is characterized by multiple XR.WaitForGPU spikes, one spike every 2-10 frames. I have attached screenshots of the exact same scene with and without spikes.
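
    To put a number on "one spike every 2-10 frames", the per-frame wait times can be scanned for entries above the frame budget. This is only a rough sketch: it assumes the XR.WaitForGPU times have already been copied out of the profiler into a plain list, and the sample values are invented for illustration, not measurements from our app.

```python
def find_spikes(wait_ms, threshold_ms):
    """Return the indices of frames whose wait time exceeds the threshold."""
    return [i for i, t in enumerate(wait_ms) if t > threshold_ms]


def spike_gaps(spike_indices):
    """Return the number of frames between consecutive spikes."""
    return [b - a for a, b in zip(spike_indices, spike_indices[1:])]


if __name__ == "__main__":
    # Hypothetical per-frame wait times in milliseconds (illustrative only).
    sample = [0.4, 0.5, 62.0, 0.3, 0.4, 0.6, 0.5, 71.0, 0.4, 55.0]
    budget_ms = 1000.0 / 75  # one refresh interval at 75 Hz
    spikes = find_spikes(sample, budget_ms)
    print("spike frames:", spikes)
    print("frames between spikes:", spike_gaps(spikes))
```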

    Now, I am no stranger to XR.WaitForGPU and I am familiar with its purpose. The theoretical time to render a frame on this device is 13.333ms, and I would expect that if we fail to meet our target framerate we will get dropped frames and have to wait for an entire refresh cycle. Sad, but something I'd be able to live with and optimize. However, even waiting for a full refresh cycle would only make us wait around 13ms. You will notice that the spikes in question are not your regular waiting for the next frame - they are HUGE, routinely 50-80ms per spike. I've seen some that were over 100ms. There are NO corresponding CPU spikes in the preceding frames; as far as the profiler is concerned, the rest of the application is working perfectly and executing everything within the allotted time budget.
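
    For anyone who wants to check the arithmetic, here is a small sketch of the frame-budget calculation. The function names are made up for illustration; the only inputs are the 75 Hz refresh rate and the spike lengths quoted above.

```python
REFRESH_HZ = 75  # native refresh rate of the target device, as stated in this post


def frame_budget_ms(refresh_hz):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / refresh_hz


def refresh_cycles_spanned(stall_ms, refresh_hz):
    """How many refresh intervals a stall of the given length covers."""
    return stall_ms / frame_budget_ms(refresh_hz)


if __name__ == "__main__":
    print(f"frame budget: {frame_budget_ms(REFRESH_HZ):.3f} ms")
    for stall in (50, 80, 100):  # spike lengths reported in this post
        cycles = refresh_cycles_spanned(stall, REFRESH_HZ)
        print(f"{stall} ms stall spans {cycles:.2f} refresh cycles")
```

    A single missed VSync would cost roughly one cycle, so a stall spanning several cycles is not the ordinary dropped-frame pattern.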

    In other news, the profiler tells me that our app doesn't seem to be doing anything to lock up the device's renderer for such long periods (100ms!). Furthermore, the exact same binary is perfectly capable of running smoothly and without spikes - provided it is started with "build and run" (or put to sleep for 5min and woken).

    Once the app is rid of the spikes, it remains in its performant mode until the process is restarted. Even after multiple scene transitions the spikes do not return.

    Note that not only can two LAUNCHES of the same binary end up in different performance states (one with spikes, one without), but a SINGLE launch can also start with spikes and then, after the trick (losing focus for 5 minutes), remain spike-free for the rest of its run.

    So while both of these tricks let us obtain a smooth framerate, they are obviously not acceptable for a commercial product from a customer's perspective. Worse, if it were the other way around (Build and Run causes the spikes, normal on-device app starts do not), we would be able to shrug off this problem, as it would not affect the customer.

    Everything the profiler tells me corroborates my view that the scene is optimized sufficiently to run on this Android device at 75FPS.

    Additional info and takeaways

    (1) I know that with the above description, you may be tempted to blame it on thermal throttling of some sort. It was the first thing to occur to me as well, which is why I ran several thermal tests and I have completely eliminated temperature and thermal throttling as a possible cause. I repeat, I HAVE COMPLETELY ELIMINATED DEVICE TEMPERATURE AS A POSSIBLE CAUSE OF THESE PERFORMANCE SPIKES. I have had the device render the scene for two hours without pause and get quite hot, and still have no spikes. I have also had a cold device start in the morning and immediately enter the state with spikes (which I was able to convert to the no-spike state by taking the headset off).

    (2) I've spent a lot of time in VR including in apps that did not meet their FPS requirement so I definitely know what it looks like when you fail to render the frame in the required time. This is not just your regular judder where you get some double images and mild discomfort.

    (3) The Mirage Solo is a 6DoF device, which means asynchronous reprojection helps less, especially with lateral head movements. 6DoF on Android has not been around for all that long. This issue may have been present before but not as noticeable on a 3DoF device because reprojection would pick up the slack.

    (4) Both taking the device off the head (so that it pauses the app / stops rendering) AND waiting for some period of time (~5 minutes) are necessary to eliminate the performance spikes. Doing either one on its own will not eliminate the spikes.

    Conclusions

    - Performance spikes seem to be caused by something external to our scripts and geometry.
    - The GPU doesn't seem to be overloaded.
    - The problem doesn't seem to be in our Unity scene set up. We closely monitor the CPU expenditure as well as GPU load and there doesn't appear to be anything that would cause such massive spikes / delays. Indeed, it is perfectly possible - by using tricks that have nothing to do with our code and scene setup - to get our scene to behave perfectly fine on device.
    - The problem doesn't depend on the binary itself. The exact same binary is capable of running with both performance spikes and without performance spikes.
    - The problem is not a property of the launched process, either. A single process can go from having performance spikes to not having performance spikes during its lifetime.

    Any ideas are much, much appreciated:)

    A&G
     
    Last edited: Apr 17, 2019
    JoeStrout likes this.
  2. aleksandrk

    aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    2,983
    Hi!
    Do you know if this happens on an empty scene?
    Can you take a systrace of the app when it's exhibiting spiky behaviour?
     
  3. florianpenzkofer

    florianpenzkofer

    Unity Technologies

    Joined:
    Sep 2, 2014
    Posts:
    479
    Build-and-Run uses the following command line to start the app:
    "adb shell am start -a android.intent.action.MAIN -c android.intent.category.LAUNCHER -f 0x10200000 -S -n package/activity"

    You could also try Systrace in Snapdragon Profiler. It might give you some more info about rendering tasks.

    Then you could compare the output of "adb shell dumpsys SurfaceFlinger" between different runs of your app (fast and slow).
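
    A minimal sketch of that comparison, assuming Python and adb are available on the host machine. The helper names are hypothetical; the only device command used is the dumpsys call quoted above. Save one capture during a fast run and one during a slow run, then diff them.

```python
import difflib
import subprocess


def capture_surfaceflinger():
    """Run the adb command quoted above and return its text output.

    Requires adb on the PATH and a connected device.
    """
    result = subprocess.run(
        ["adb", "shell", "dumpsys", "SurfaceFlinger"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def diff_dumps(fast, slow):
    """Return unified-diff lines between a fast-run dump and a slow-run dump."""
    return list(
        difflib.unified_diff(
            fast.splitlines(),
            slow.splitlines(),
            fromfile="fast_run",
            tofile="slow_run",
            lineterm="",
            n=0,  # no context lines, only the differences
        )
    )
```

    Expect some noise from timestamps and counters that change on every run; the interesting differences would be in layer, buffer or display configuration.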
     
    isidro02139 likes this.
  4. isidro02139

    isidro02139

    Joined:
    Jul 31, 2012
    Posts:
    72
    Dear @florianpenzkofer – thanks for the ideas! I will try to get a systrace as you suggest as well @aleksandrk and report back posthaste :cool:

    Re: the launch parameters, I have tried to launch the apk with:

    adb shell am start -a android.intent.action.MAIN -c android.intent.category.LAUNCHER -f 0x10200000 -S -n com.package.name/com.unity3d.player.UnityPlayerActivity


    and

    adb shell am start -a android.intent.action.MAIN -c android.intent.category.LAUNCHER -f 0x10200000 -S -n com.package.name/com.unity3d.player.UnityPlayerActivity --ez android.intent.extra.VR_LAUNCH true 


    ...but the app still regularly starts in the spiky state. The only way to recover "smooth" performance is to use the trick – putting the device into deep sleep and waking it with the controller.
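
    For reference, the -f 0x10200000 value in the commands above is a bitmask of Intent flags, and -S asks am to force-stop the target app before starting the activity. The sketch below decodes the mask using the two constants from android.content.Intent that it is made of; the decode_flags helper is just an illustration.

```python
# Intent flag constants as defined in android.content.Intent.
KNOWN_FLAGS = {
    0x10000000: "FLAG_ACTIVITY_NEW_TASK",
    0x00200000: "FLAG_ACTIVITY_RESET_TASK_IF_NEEDED",
}


def decode_flags(value):
    """Split a flag mask into known flag names and any leftover unknown bits."""
    names = []
    leftover = value
    for bit, name in KNOWN_FLAGS.items():
        if value & bit:
            names.append(name)
            leftover &= ~bit
    return names, leftover


if __name__ == "__main__":
    names, leftover = decode_flags(0x10200000)
    print(names, hex(leftover))
```

    These are the same two flags a home-screen launcher normally sets, so the flags by themselves would not be expected to explain the difference between the two launch paths.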
     
  5. isidro02139

    isidro02139

    Joined:
    Jul 31, 2012
    Posts:
    72
    @aleksandrk @florianpenzkofer

    Update: After running some more tests, we have more or less managed to establish that the "spikes" problem is present in scenes with certain skinned mesh renderers. When these skinned mesh renderers and their animators are removed from the scene, the spikes do not appear. We are working on providing a minimal reproduction scene / build with detailed repro instructions.

    We have tried with a variety of settings including GPU skinning and graphics jobs and various combinations thereof.

    The animated meshes in question are nothing too fancy - humanoid characters with 6k vertices and 9k triangles, and the skeleton is not overly complex. We render up to 4 such characters at any one time. We were willing to entertain the idea that the skinning was a performance bottleneck that somehow causes all these skipped frames, in some way that is masked from the profiler. However, if the app is in the "no spikes mode" through the use of one of the tricks described above, the device is perfectly comfortable performing the skinning of these same meshes without any performance impact.

    Hoping to share the repro project asap :cool:
     
    JoeStrout likes this.
  6. isidro02139

    isidro02139

    Joined:
    Jul 31, 2012
    Posts:
    72
  7. mpgholden

    mpgholden

    Joined:
    Aug 21, 2014
    Posts:
    38
    @isidro02139 did you ever find a resolution to this? I see the Fogbugz ticket is still open. I'm also running into spikes while waiting for the GPU when skinned mesh renderers are enabled in the scene in a VR project for Android, but I haven't figured out why, and the "trick" you mentioned about letting the app idle or go into the background doesn't seem to work in my case.