Search Unity

Help me understand these profiler results

Discussion in 'Editor & General Support' started by JoeStrout, Jul 26, 2018.

  1. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    8,123
    I'm trying to optimize an Android (Oculus Go) app, but I don't fully understand what the profiler is telling me.

    While looking at a fairly busy part of the scene, it seems that I usually get frames in 16 ms, but occasionally there is a 1-frame doubling of that. I assume these are missed frames, where something just couldn't complete before the vsync, so it had to wait for the next one. I need to eliminate those skips, but I'm still not sure exactly what's causing them.

    Here's the frame before the spike:
    Frame5732.png

    Here's the frame of the spike, with what appears to be a whole lot of not much going on:

    Frame5733.png

    And here's the frame after the spike:

    Frame5734.png
    So I guess the problem is the Render thread, which isn't starting until 8.5 ms in, and I guess not finishing in time for the frame. But when you look at the graph at the top, the Rendering time plus everything else (except vsync) is always well below 16 ms. And when I compare the pre-hitch frame to any other frame in the middle where it's getting a solid 60 fps, I don't see any significant difference — the green bars in the render thread always start around 8-10 ms in, and last for 8 or 9 msec, except for Gfx.ProcessCommands. That one normally takes between 16 and 17 ms, but on the pre-skip frames it takes 32 ms and change. I don't know whether that's cause or effect, though.

    Do you people who understand these profiler results see any smoking gun here? What should I focus on to prevent these missed frames?
     

    Attached Files:

  2. samizzo

    samizzo

    Joined:
    Sep 7, 2011
    Posts:
    478
    WaitForPresent/WaitForGPU and similar usually means the CPU is waiting for the GPU to finish, so probably your previous frame submitted too much geometry or geometry with complex shaders that couldn't complete before the next frame wanted to start submitting geometry to the GPU. So the CPU has to wait for the GPU to finish.

    It could also mean that your CPU took more than 16ms and you're vsync locked. In this case, WaitForPresent is waiting for the next vsync and the GPU might not be doing a lot. That might be your problem - the frame right before the hitch looks like it took 17.75ms. When it's running at 60fps what is the average frame time? I would also check the breakdown of CPU usage in the hierarchy view on that previous frame.

    There are some tools listed here along with some suggestions. I don't know if the Remote Monitor can be used but it looks pretty handy. It looks like you can't do GPU profiling in Unity on Android sadly. Here's a blog post also that has some suggestions.

    Sam
     
  3. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    261
    Looks to me that what's happening is basically what samizzo said:
    1. the pre-spike frame went slightly above 16.66 ms which
    2. means you missed the refresh rate of the oculus screen/the VSync rate 0f 60 fps and need to wait for the next VSync frame

    So you're GPU bound and trying to render too much/uploading too many different commands to the GPU. You don't even need to improve it too much to be more stable under 16.66 and the issue should disappear. Try reducing texture sizes and improving batching? You could use the frame debugger to figure out why stuff might not get batched.
     
  4. JoeStrout

    JoeStrout

    Joined:
    Jan 14, 2011
    Posts:
    8,123
    Right, I gathered that the frame took slightly longer than 16.66 ms; what I don't get is how to see exactly how much time it's taking. I have the ability to turn large parts of the environment on or off, and I thought I'd be able to see that feature A takes x milliseconds, feature B adds y more milliseconds, etc. But I'm not seeing that at all, though probably I'm just not looking in the right place.

    Here's a profile with everything (the whole environment, mostly looking at a blank screen) turned off:
    NoFeatures.png
    As expected, you can see we're getting a steady 60 FPS, no trouble. Yet the CPU time for the above randomly-selected frame is still 16.78 ms, GPU 0.0 ms, just like in the spiking case in my first message.

    As I turn one one or two big parts of the environment, nothing much changes... here are thumbnails, but click for the full images:
    OneFeature.png TwoFeatures.png
    Well, I guess one thing that changes is the green region (rendering, on top of other stuff except vsync) in the top graph grows from about 5 msec to 10 msec or so. But I'm not sure what to make of that, because it's still stays at that same level in the frame-dropping case (with all environment features turned on). Note that the one-feature snapshot shows a frame with CPU listed as 18.42 ms, which is well beyond what we saw in the dropped-frame case, yet no frames here were dropped.

    Finally, this thumbnail shows the full-environment case, selecting a frame well between the spikes:
    ThreeFeaturesA.png
    ...and right before a spike, as in my original post:
    ThreeFeaturesB.png

    And even looking at the Camera.Render bar near the bottom, it jumps from 1.86 ms (no features) to 7.86 ms (one feature), but then pretty much holds there — 8.6 ms with two features (fine) and 7.28 ms with three features (frames dropping). What's the important difference between two and three, that makes us suddenly start dropping frames?

    So that's my question. Looking at all these bars, where can I see something increasing as I add more complex stuff to the environment? Right now all I know is that with one or two features I'm OK, but with all three features I start dropping frames. But somewhere in this data there ought to be what I need to say, feature A costs this, feature B costs that, etc. Right?
     
  5. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    261
    GPU time is usually not available unless you open the GPU profiler chart (probably needs to be open and connected to the app as you start it up) and GPU profiling would need to be available. We should probably try to figure out a way to more cleanly indicate whether you're CPU or GPU bound, independent of the GPU profiler. For now, the shorthand is to look at whether the Main thread or the Render thread are waiting and you'll know you're bound to the other.

    So the interesting area in timeline view to look at here is the Rendering thread. It spills over into the next frame which is fine as it also starts it's work for this frame usually a bit into the frame (in you're case somewhere between 5 and 11.2 ms). However in these last two thumbnailed images, you can see how it starts to oscillate out of control a bit. The Render thread picks up later in the frame and moves more into the next frame. It surpasses the point where the XR.WaitForGPU call starts and then the GPU misses the first VSync point at which to flip, upping the wait time from 4 ms to 8 ms. And then, as mentioned, it delays the start point of rendering until it slips even more.

    I'm not sure how to get more granular info as to what costs you too much in this, except for looking at it with the Frame Debugger though
     
    JoeStrout likes this.
  6. RMGK

    RMGK

    Joined:
    Sep 30, 2011
    Posts:
    56
    I have been having the same issue on the GO and GO only. I'm actually getting better perf on the GearVR S7 than on the GO. Unity 2018.3.0 OVR 1.32 / Unity 2018.3.4 OVR 1.34
    As of now, the sync issue here is caused by multithreaded rendering, not the GPU itself. See the 2 images attached:

    Multithreading Off
    Screen Shot 2019-02-07 at 15.39.22.png
    Multithreading On
    Screen Shot 2019-02-07 at 15.41.37.png

    Answer right now: Turn multithreaded rendering off on the GO.

    Also, Unity reports way better perf but the Oculus Monitor tools don't match up, which they do with GearVR which is leading me to think there is an issue there as well. Will report back with more info if I get it.
     
  7. jeromeWork

    jeromeWork

    Joined:
    Sep 1, 2015
    Posts:
    279
    Very helpful. Thanks @RMGK
     
  8. MartinTilo

    MartinTilo

    Unity Technologies

    Joined:
    Aug 16, 2017
    Posts:
    261
    yes please. The data shown by the profiler should be mostly reliable. If there is a mismatch, please also file a bug report :)
     
  9. NTDev4

    NTDev4

    Joined:
    Oct 12, 2014
    Posts:
    3
    You're my hero.
     
    RMGK likes this.