Search Unity

Gfx.PresentFrame taking 4x as long, but only for intermittent periods

Discussion in 'Editor & General Support' started by Nicolas1212, Jun 28, 2018.

  1. Nicolas1212

    Nicolas1212

    Joined:
    Dec 18, 2014
    Posts:
    139
    I'm trying to figure out the cause between intermittent frame stutters in our game. When profiling, on average, everything happens in <3ms on the CPU with a Gfx.PresentFrame at ~9ms on the GPU (Standalone Mac build), except on occasion, the GPU kicks over to the next frame with a PresentFrame of 30+ms.

    Here's a step-by-step of 4 frames showing the problem (attached below as well): https://imgur.com/a/OVWvYfy

    - 1: Normal frame, CPU finishes in 1.5ms, PresentFrame takes 9.5
    - 2: CPU takes under 2ms, PresentFrame explodes to 25.8ms
    - 3: Because of the previous frame, CPU needs to wait for ~10ms before continuing. PresentFrame takes another 32ms
    - 4: CPU needs to wait nearly 27ms for GPU to finish

    This only seems to happen intermittently - FPS will be normal for a second, then PresentFrame will stall for 10 frames, then back to normal for 20 frames, then PresentFrame will kick in for 4, etc

    I'm somewhat at a loss to figuring out what's causing this.

    Some points:
    • Standalone player is 488x864
    • Application.targetFrameRate is 60 (equates to VSync every frame, as it's enforced on certain platforms)
    • Game is 2D, static map taking up the screen (drawn via RenderImage on a canvas), with some UI on top. Total draw calls are ~30, with ~1k tris for certain 3D models
    • There are no GC allocations during the phases that this happens. The only memory allocations that arrive through play are from network messages (~1 every 2-3 seconds)
    • There's no culling as all objects are visible on screen
    • No Game or Scene tabs are open when profiling
    • There are no shadows, normals, reflections, HDR, etc
    What I've tried so far, which hasn't had an effect:
    • Turning off VSync
    • Setting all texture resolution to 1/8 quality (to check if it's an issue with memory bandwidth)
    • Removed all UI and canvases except the map, draw calls down to 5
    • Removed all 3D models
    • Removed all particles
    • Set all shaders to Mobile unlit
    • Decreased the display resolution to half the size (244x432)
    • Enabled experimental graphics jobs
    • Disabled Metal support to go to OpenGLCore
    • Changed from forward rendering to deferred to legacy (currently using forward)
    • Keeping loaded shaders alive
    I understand that PresentFrame is a sort of a catch all to showing the image on the screen, but I don't know why it would jump 3-4x at random intervals for random durations.

    Any ideas/suggestions to try would be very welcome, or if any Unity employee could explain a little what goes on inside the method so I could try to figure out what's triggering it.

    Using Unity 2018.1 with .Net 3.5 1.png 2.png 3.png 4.png

    Thanks
     
  2. Nicolas1212

    Nicolas1212

    Joined:
    Dec 18, 2014
    Posts:
    139
  3. aleksandrk

    aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    3,014
    Hi!
    Can you please try setting the Application.targetFrameRate to -1 and check if the behaviour is still the same?
    Also, which MacOS version are you on?
     
  4. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Hi!

    I would suggest setting it to 0 to make sure that simulation frame rate doesn't interfere with VSync.

    Also on OSX it would be best to use Instruments to dig into what's actually happening on GPU. It doesn't seem that builtin profiler can provide enough information to understand the problem. Present means many things - process uploaded gpu commands, wait for vsync...

    Another issue I see is that there is unnaconted time in profiler which I can't correlate with what we do there (looking at the code). There are cases when GC.Collect kicks in on another thread and pauses all other threads - might be a that.
    To find out what's in there I suggest to do sampling profiling with Instruments.
     
  5. Nicolas1212

    Nicolas1212

    Joined:
    Dec 18, 2014
    Posts:
    139
    Hi Alek & Alexey,

    Thanks for replying - sorry for the late reply on my part; I got pulled away on other tasks, and I've not used Instruments before, so it's taking a while to get into it.

    Doing a sample with Instructions, one of the things that stood out is that the CPU is very erratic - going from 0 to 40 to 0 to 120 etc

    https://imgur.com/a/QmxY73h

    Doing a Time Profile, and for specific time sequences, the Job Queue is taking up 50% or more of the CPU (disclaimer: I'm *far* from an expert in Instruments, so I could be reading this wrong)

    https://imgur.com/a/UndKFiy (image isn't embedding properly)

    I understand Unity breaks up a lot of the internals into Job-based stuff, but any ideas on what could be taking so long. The game itself doesn't use the Job system, so whatever's in there would be all Unity.

    Is there a way to disable the Job system so everything runs on the main thread? (to check if there's an issue with locks/semaphores)? In the main PlayerLoop, every time something is added to the queue using JobQueue::EnqueueAll(), it's taking 1ms

    This is a project that's been brought through a lot of different Unity versions, all the way from Unity 4, so it wouldn't surprise me if it was something like slow optimisations in order to retain backwards compatibility.

    Thanks
     
  6. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Thanks for providing more information!

    Could you please check kernel/system checkbox to see kernel data? Sometimes a lot of time is spend in waiting for semaphore. And taking into account self time (time - sum of child times) I suspect that JobQueue::processJobs is just sitting in semaphore most of the time.

    Also what is the behavior with Application.targetFrameRate set to -1?
     
  7. Nicolas1212

    Nicolas1212

    Joined:
    Dec 18, 2014
    Posts:
    139
    Hi Alexey,

    It looks like it's a semaphore issue all right. This:

    https://imgur.com/a/CKjRxKP

    Translates to this when showing the kernal calls:

    https://imgur.com/a/g97kbde

    I'm assuming this means that something's not freeing the semaphore as it should?

    In most instances that I could find the underlying job had to do with the Animator. We use the animator for our buttons, though there are only 3 active buttons in this scene. Does Unity use Animator internally?

    When setting Application.targetFrameRate to -1, the game is pretty much unplayable - pause, fluid, pause, fluid. Like watching stop motion :)

    Is it possible to disable the Jobs system to test?
     
  8. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    This is expected behavior - if there are no jobs to process worker threads sit idle in a semaphore. That means that the thread doesn't do anything and waits for system to wake it up.
    No, it is on on all platforms except WebGL, and autoconfigured based on CPU cores count. I don't think that would help - it doesn't look like the game is CPU bound.

    Do you have VSync on? VSync should stall main thread.

    I think in order to diagnose problem further it is important to see what's happening during Gfx.PresentFrame on GPU. It slips through 1 vsync according to sample time. And GPU capture might be helpful.
     
  9. Nicolas1212

    Nicolas1212

    Joined:
    Dec 18, 2014
    Posts:
    139
    You know, I'm not going to lie; I thought that Application.targetFrameRate was the "code" way of controlling VSync, so I had it disabled in Quality Settings. Reading up on it though, it's not case, as once VSync was set, the standalone performance is a *lot* smoother. Basically going from this

    https://imgur.com/a/lE0LedF

    to this

    https://imgur.com/a/EPq6q7T

    We're moving to 2018.2 and the lightweight render pipeline, which should lift off even more load from the CPU/GPU, so it should improve.

    That said, the editor still lags similar to what's shown in the first post: everything's fine, then bit hits with Gfx.WaitForPresent/PresentFrame. Even when nothing is moving on the screen, or nothing is being rendered (I turned off the camera)

    Is there something special in relation to the editor and VSync/Application.targetFrameRate? It doesn't seem to respect it at all:

    https://imgur.com/a/3uDR7GS

    (On a separate note: can you recommend a good resource in getting up to speed with Instruments & Metal capture? I'm feeling somewhat out of my depth here - I don't know how to see what's happening inside PresentFrame for a given frame)
     
  10. Nicolas1212

    Nicolas1212

    Joined:
    Dec 18, 2014
    Posts:
    139
  11. alexeyzakharov

    alexeyzakharov

    Joined:
    Jul 2, 2014
    Posts:
    507
    Application.targetFrameRate is a "soft" vsync and is useful to limit simulation rate without hardware vsync set (e.g. on servers). It is not desirable to use targegFrameRate with vSyncCount set, as CPU frame might drift and cause rendercommands to be scheduled later and miss next vsync.

    Yes, Editor doesn't use gfxdevice vsync for views afaik to avoid waiting for each window being vsync'ed.

    I would recommend to look at GPU counters available for Metal. There is some information at the official page https://developer.apple.com/documen...viewing_performance_metrics_with_gpu_counters which can help with analysis.
     
    Niukeh likes this.