UpdateAllSkinnedMeshes spikes unpredictably

Discussion in 'General Graphics' started by JLJac, Jul 16, 2019.

  1. JLJac

    JLJac

    Joined:
    Feb 18, 2014
    Posts:
    29
    Hello!

    I have issues with PostLateUpdate.UpdateAllSkinnedMeshes sometimes spiking up into the 20ms range, causing lag that's quite noticeable.



    Does anyone know what this might be?

    I have some reason to believe it might be a threading issue. I have some 10 skinned meshes in the scene, but I can see the calls to those finish up very quickly at the beginning of UpdateAllSkinnedMeshes. What takes the time is a sub-call to Semaphore.WaitForSignal, indicating that it is waiting for a thread to finish something up. I can't Deep Profile it, I suspect because the Deep Profiler slows the entire frame down to take more than 20ms regardless, so when PostLateUpdate is finally called this mystery thread is already finished.

    For context, I am running a few custom scripting threads. Could it be that my threads steal resources from Unity's worker threads and stall them? I have tried assigning my own threads low priority, to no avail.

    Thank you!
     
  2. Peter77

    Peter77

    Joined:
    Jun 12, 2013
    Posts:
    3,899
    What Unity version and platform did you run the test on? Did you perhaps profile the game running in the editor? In this case, do you see the same spikes when profiling a build on the target hardware?

     
  3. JLJac

    Thank you for your answer! I'm on 2019.1.8f1, Windows 10. And yep, I'm profiling in the editor - I'll try profiling the standalone and get back with the results.
     
  4. JLJac

    Okay, so I managed to reproduce the problem in a standalone build (kind of):

    [Attached image: culling_waitForSignal.PNG]

    It appears that this time it's not UpdateAllSkinnedMeshes that's causing it but Gfx.WaitForPresent instead. The problem looks very similar, though: the game spends a load of time waiting on Semaphore.WaitForSignal at random intervals.

    Can't deep profile the stand-alone, so again I'm kind of lost here... Very thankful for any clues.
     
    Last edited: Jul 18, 2019
  5. aleksandrk

    aleksandrk

    Unity Technologies

    Joined:
    Jul 3, 2017
    Posts:
    854
    Hi!
    Semaphore.WaitForSignal probably indicates that it's waiting for another thread (rendering thread?) to finish work. You should check what the other threads are doing at this time.
     
  6. JLJac

    It's difficult to know, because the issue doesn't occur during deep profiling. My theory is that my own custom scripting threads are causing the problem by stalling the Unity job threads, but during deep profiling everything is so slow that the custom threads have already finished by the time Camera.Render is called.

    Do I need to finish up all scripting threads before the end of the frame so as to not interfere with the jobs system at Camera.Render? Does this sound like a reasonable cause for the issue, or am I barking up the wrong tree?
     
  7. aleksandrk

    You can try using a system-wide profiler to figure if that's the case.
    And yes, this can be the root cause :)
     
  8. JLJac

    I'll look into system-wide profiling! Sorry about my lack of technical knowledge.

    Also I will try to finish up my parallel processes before Camera.Render is called and see if that solves it. To your knowledge, will threads that are waiting for a ManualResetEvent be available to Unity's render routine, or do they need to properly terminate?
     
  9. aleksandrk

    I don't know, unfortunately.
     
  10. JLJac

    Gotcha, I'll just have to try it :)
    Update - have been able to semi-reliably reproduce the problem by spawning 10 threads each frame and having them do busywork, swamping the thread pool. It doesn't look exactly like my problem (basically a whole bunch of processes get random long delays rather than just UpdateAllSkinnedMeshes in particular, but UpdateAllSkinnedMeshes is among them). Next I'll try to solve the problem by attempting to finish up my threads before the end of the frame - will report back on it.

    Edit: Threads that are waiting for a ManualResetEvent do NOT cause conflict with Unity's render routine, they are fine. Only if they are actually doing work do they become a problem.
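
    A minimal sketch of the repro described above, using only System.Threading. The class name, method name and the spin duration are assumptions for illustration; the post only says that 10 threads were spawned each frame and given busywork.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper, not from the thread: starts `count` threads that
// each spin for `milliseconds`, to oversubscribe the CPU on purpose.
public static class BusyworkRepro
{
    public static Thread[] SpawnBusyThreads(int count, int milliseconds)
    {
        var threads = new Thread[count];
        for (int i = 0; i < count; i++)
        {
            threads[i] = new Thread(() =>
            {
                var timer = Stopwatch.StartNew();
                // Spin instead of sleeping: a sleeping or waiting thread
                // does not occupy a core, so it would not compete with
                // Unity's worker threads.
                while (timer.ElapsedMilliseconds < milliseconds) { }
            });
            threads[i].IsBackground = true; // do not keep the process alive
            threads[i].Start();
        }
        return threads;
    }
}
```

    In Unity this would be called once per frame, for example `BusyworkRepro.SpawnBusyThreads(10, 10)` from an Update method, while watching the profiler for Semaphore.WaitForSignal spikes.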
     
    Last edited: Jul 18, 2019
  11. aleksandrk

    So, if I understand correctly, you're spawning some threads manually and giving them some work to do.
    Unity creates a worker thread per available core (so on an 8-core system it's going to create 6 or 7 workers, depending on whether gfx is multithreaded or not). If you create additional threads that do a lot of work, they are very likely to interfere with the built-in workers.
    Can you try using C# jobs instead?
     
  12. JLJac

    Yeah, correct! My game relies on some very heavy simulation (fluid sim) and working on the main thread doesn't cut it. I did try the job system, but was not able to get any performance gains out of it because NativeArrays are slow to access: https://forum.unity.com/threads/nat...order-of-magnitude-slower-than-arrays.535019/
    Instead I have found a method that uses standard C# threads, where for the heavier tasks I call a C++ DLL plugin from inside the thread. Admittedly it has been a while since I tried the job system (it was in beta at the time), but I'm hesitant to convert back because I found it somewhat limiting, and was ultimately getting slower results than just doing all the work in standard C# on the main thread!

    Q: Is there a way to manually tell Unity's job system to use fewer worker threads than it does by default? If I could have two threads to myself and be confident that the jobs will use the remaining ones, that would be great!

    I have done some work trying to make my threads terminate or pause before Unity's render routine, and it seems to be working - I have not seen the strange slowdowns since. Sometimes Unity will use threads outside the render block (such as when instantiating an object), and then I still occasionally get a bit of Semaphore.WaitForSignal slowdown, but I think I know how I might get around this as well.

    For future readers:
    Unity uses multithreading for rendering. If you have too many other threads working when Unity enters its render routine (the green stuff at the end of the frame in the profiler) it won't get access to the threads it needs as quickly as it needs them. What seems to be a working fix is to make sure to either terminate or pause your threads when Unity enters the rendering routine. The OnPreRender call can be used for this - a thread can be initiated at the start of the frame, work in parallel with the main game logic, and as long as you "catch" it in LateUpdate or OnPreRender it seems Unity's render routine can continue uninterrupted.

    Setting a low priority on your custom threads does not work, they need to be terminated or in an idle state.
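
    The "start at the beginning of the frame, catch before rendering" pattern could look roughly like this. It is a sketch under assumptions, not the poster's actual code: FrameWorker, Begin and Finish are hypothetical names, and the class is plain C# so it can be tried outside Unity.

```csharp
using System;
using System.Threading;

// Hypothetical wrapper: runs one unit of work on its own thread and lets
// the caller block until that thread has terminated.
public sealed class FrameWorker
{
    private Thread thread;

    // Call at the start of the frame (for example from Update).
    public void Begin(Action work)
    {
        if (work == null) throw new ArgumentNullException(nameof(work));
        Finish(); // never let work from a previous frame overlap
        thread = new Thread(() => work());
        thread.IsBackground = true;
        thread.Start();
    }

    // Call before rendering (for example from LateUpdate or OnPreRender).
    // Blocks the caller until the worker thread has terminated.
    public void Finish()
    {
        if (thread == null) return;
        thread.Join();
        thread = null;
    }

    public bool IsRunning => thread != null && thread.IsAlive;
}
```

    The trade-off is that Finish blocks the main thread if the work has not completed yet, so the work handed to Begin needs to fit inside the frame's game-logic time.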

    I'm trying another method now which I still haven't examined fully, but which seems to be working. I have a ManualResetEvent called "isNotRendering" that I reset in OnPreRender and set in OnPostRender, basically "closing" it for the duration of the rendering routine. Inside my worker threads I put an isNotRendering.WaitOne() in the main logic loop. This essentially makes the threads pause for the duration of the rendering, and seems to work okay. In order to be really sure though, I'd recommend properly terminating the threads before the end of the frame.
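
    A minimal sketch of that gate, assuming a small wrapper class around the event. RenderGate and its method names are hypothetical; only the ManualResetEvent called isNotRendering and the reset/set/wait pattern come from the post.

```csharp
using System.Threading;

// Hypothetical wrapper around the "isNotRendering" event described above.
public sealed class RenderGate
{
    // Starts signaled (open): workers may run until rendering begins.
    private readonly ManualResetEvent isNotRendering = new ManualResetEvent(true);

    // Close the gate; intended to be called from OnPreRender.
    public void BeginRendering() => isNotRendering.Reset();

    // Reopen the gate; intended to be called from OnPostRender.
    public void EndRendering() => isNotRendering.Set();

    // Worker threads call this at the top of each loop iteration.
    // Returns immediately while the gate is open, blocks while it is closed.
    public void WaitUntilNotRendering() => isNotRendering.WaitOne();

    // Non-blocking check of the gate state (zero timeout).
    public bool IsOpen => isNotRendering.WaitOne(0);
}
```

    A worker loop would then look something like `while (running) { gate.WaitUntilNotRendering(); DoOneSmallStep(); }`, where DoOneSmallStep is a placeholder. Note that a thread only pauses when it reaches the wait, so each step between checks has to be short, otherwise work already in progress keeps running into the render routine. Also, as far as I know, OnPreRender and OnPostRender are only invoked on scripts attached to a camera, so the component that drives the gate would need to live there.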
     
  13. aleksandrk

    Nope, there's no way to limit the job system's worker threads at the moment.
    As for the job system's performance: I used it in a project about a year ago (together with Burst) and it was very fast. Way faster than on the main thread in C#.
     
  14. JLJac

    Cool, thanks for the info.

    Yep, I also got it to be very fast in specific test cases! Problem is, if you have a huge amount of data that's intimately intertwined with all of your game logic (a big fluid sim) you have two options: Shuffle the data to and from a NativeArray each frame (very slow) or keep the data in a NativeArray (slow whenever you need to access/manipulate the data from outside a job). The latter might work if your entire project is only Job logic, but when I tried it Unity wasn't quite there yet. So, the simulation was incredibly fast, but accessing it from the rest of the game was too slow to be viable. End of the day, for my specific circumstances it ended up being slower than just churning through it all on the main thread :(

    Custom multithreading has been working fine though, as long as I keep it out of the rendering routine, it seems!