Burst and thread safe API outside Unity Jobs system

Discussion in 'Burst' started by sebas77, Mar 20, 2018.

  1. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    Hello,

    I would guess that Burst works in a similar way to how native DLLs work with Mono. If this is true, would it be feasible to make the Burst attribute available for code outside the Job system?

    Similarly, would it be possible to have a thread-safe API (transforms, raycasts) that works outside the job system?
     
  2. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    416
    Currently this mode is not available. A Burst call does not work magically like a `DllImport` call. Internally, it requires a delegate to call the generated native code. This delegate has to be created manually and has to match the signature of the original C# method it was compiled from. So the process is still a bit cumbersome, error-prone, and not really user-friendly in C#. I would also be more in favor of integrating it with the Mono JIT and IL2CPP, so that a method tagged with `[BurstCompile]` could be called directly without going through a delegate, but that ties the integration of Burst much more tightly to the way the JIT or IL2CPP marshal arguments, so it is neither easy to integrate nor portable... not sure we will go down this path. We may, though, expose the simple delegate method path for the 2018.2 preview, stay tuned!

    In which case would you like to do this?
     
    5argon and sebas77 like this.
  3. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    Thanks for the reply! It was a wild guess, but I still hope I can use it outside the job system.

    I have my own multi-threaded scheduling system, which is comparable to the Job System, so I would love not to be forced to use jobs if I don't strictly see the benefits. Currently the benefits I see come not from the Job interface, but from Burst and the thread-safe API, which I would like to use. I understand that you want to make performance accessible to everyone, but you should not assume that very experienced people don't use and love Unity (like I do). I believe C# is powerful enough, but I am limited by the Unity API. Burst is also awesome and it would be a shame to limit it to jobs only. What I am trying to say is: you are doing a great job, but don't limit it to your vision only, as that could be counterproductive.
     
    Ofx360 likes this.
  4. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    Hi Sebas,

    We have a preview API to compile delegates using:
    Code (CSharp):
    Burst.BurstDelegateCompiler.CompileDelegate
    It's just a prototype test API for now but we used it to optimize some main thread code.


    That said, there are a couple of reasons why using the C# job system fully makes sense:

    * The safety system is the key feature, and the close connection of Containers + C# Jobs is required to provide full race condition detection. I can't stress enough how important a full safety system is when writing multi-threaded code...
    * Having two job systems running, one for engine code and another for your own code, will result in a worse frame rate. Switching thread context is expensive, and two job systems fighting for the CPU will not help with that.
    * We are very invested in making the C# Job system the absolute fastest and most powerful job system.


    That said, I would really love to hear from you after trying the Unity Job system, to see what we need to improve to cover all the uses you feel you might not be getting from the C# Job system.
     
    Orimay, UnLogick and SugoiDev like this.
  5. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    Hello Joachim,

    it's great seeing you reply on the forum, wow! I tried it for my One Million Points on CPU demo and it works well. I noticed that IL2CPP needs some optimizations, but I think you are aware of it. I also 100% understand your goals for the Job System and I agree with the objectives; I am just saying that medium/big teams with custom solutions may already have alternatives. Our solution currently runs at the same speed as the Job System in my demo, and faster if compiled with IL2CPP.
    The only concern I wrote about the Job System in my article is that this solution would be optimal in a totally multi-threaded environment, but as long as a lot of stuff still happens on the main thread, triggering a burst of parallel tasks is less beneficial than running the tasks while the main thread does other things. Obviously I understand that stalling the main thread is necessary for safety reasons.
    Again, I love what you are doing, no criticism at all; I would just also love to be able to use all this stuff outside the job system. I will test the prototype API asap!
     
  6. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    >Our solutions currently runs at the same speed of the Job System in my demo, faster if compiled with IL2CPP.

    Is it possible to create a benchmark for it? If there is any situation where the Job System is not the fastest solution around we want to address that. Is it scheduling overhead or NativeArray vs builtin C# array performance?

    A benchmark comparing it would be really great to get access to and for us to review.

    I think in practice the most important thing to benchmark is using it via Burst. NativeArrays in particular are really a perfect fit for Burst, and ultimately I expect that most C# code that needs to run fast will run in Burst.
     
  7. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    416
    I'm not sure I follow the issue about the stalls exactly. With the Unity Job System you create a stall only if you explicitly call `job.Complete()`; otherwise, nothing stops you from working on something else on the main thread and calling `job.Complete()` later, when you really need the results.

    Somehow it seems that you imply that the way to use Unity Job System should look like this:


    _job.Schedule(...);
    // wait for complete - stalls the main thread!
    _job.Complete();
    // Perform other unrelated calculations
    ...


    While you could mitigate stalls on the main thread by postponing the job.Complete():


    _job.Schedule(...);
    // Perform other unrelated calculations
    ...
    // wait for the job to complete - may not stall the main thread
    _job.Complete();


    Or, even better, if you can afford it, you can double-buffer jobs (launch job0 on one frame, wait for the result of job1 from the previous frame, swap job0/job1 on the next frame, etc.).
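
    A minimal sketch of the double-buffering idea, assuming a MonoBehaviour drives the job once per frame. The names `DoubleBufferedJobs`, `FillJob` and `Consume` are hypothetical and not from the original sample; the job body is a placeholder for real work.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical example: the job scheduled on frame N is completed on frame N+1,
// so the main thread never waits on a job it has only just scheduled.
public class DoubleBufferedJobs : MonoBehaviour
{
    struct FillJob : IJobParallelFor
    {
        public float Seed;
        public NativeArray<float> Output;

        public void Execute(int index)
        {
            Output[index] = Seed + index; // placeholder work
        }
    }

    const int Count = 4096;

    NativeArray<float> _front; // read by the main thread this frame
    NativeArray<float> _back;  // written by the job in flight
    JobHandle _pending;
    bool _hasResults;
    float _lastValue;

    void OnEnable()
    {
        _front = new NativeArray<float>(Count, Allocator.Persistent);
        _back = new NativeArray<float>(Count, Allocator.Persistent);
        _hasResults = false;
    }

    void Update()
    {
        // Finish the job scheduled last frame (a no-op on the very first frame).
        _pending.Complete();

        // Swap buffers: last frame's output becomes this frame's input.
        NativeArray<float> swap = _front;
        _front = _back;
        _back = swap;

        // Schedule the next job into the other buffer and kick it off right away.
        var job = new FillJob { Seed = Time.time, Output = _back };
        _pending = job.Schedule(Count, 64);
        JobHandle.ScheduleBatchedJobs();

        // Consume the previous results while the new job runs on worker threads.
        if (_hasResults)
        {
            Consume(_front);
        }
        _hasResults = true;
    }

    void Consume(NativeArray<float> results)
    {
        _lastValue = results[Count - 1];
    }

    void OnDisable()
    {
        _pending.Complete();
        _pending = default;
        _front.Dispose();
        _back.Dispose();
    }
}
```

    The trade-off is one frame of latency: the results read on a given frame were computed from the previous frame's inputs.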

    Also, one potential reason for the Unity Job System being slower than a custom C# task system is that its scheduler is written in C++, so when calling the job function it has to perform a costly transition from unmanaged to managed and back to unmanaged.

    As Joachim suggested, where the Unity Job System shines is when it is used with Burst: in that case there are no managed/unmanaged transitions, the job threads are not processed by the Mono GC, unlike C# threads (so overall it will be better on GC pressure as well), and the codegen produced by Burst can give a significant boost.

    I converted your sample to Burst (mainly by using the new Unity.Mathematics and removing static variable access), and the Unity Job System with Burst is roughly 2x faster than your custom C# task system.
     
  8. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    Yes, you are absolutely right. Thanks for the enlightenment, I will profile it again.

    To be honest, there is no way to know from the function names when the jobs actually start. From the examples I had seen so far, it seemed more reasonable that Schedule was just preparing the jobs while Complete was starting and waiting for them.

    Obviously now it makes more sense, but if you hadn't told me, I wouldn't have guessed.

    About the performance, I also thought Unity Jobs could have been slower for that reason, but it is not. Currently Unity Jobs is slower only if compiled to native code through IL2CPP, but this is a problem you already know about.

    Of course Burst is what I am looking forward to; my concern is about how to use Burst with IL2CPP. Once we move to 2018, I don't see any reason not to use IL2CPP. I still expect Burst to do a better job than the Microsoft compiler, even with all the optimizations enabled.

    I will replicate your work for my new profiling. Why did you need to remove the static access? I guess because Burst doesn't compile it, right?
     
    Last edited: Mar 21, 2018
  9. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    416
    Indeed, static read access to readonly fields is not yet supported by Burst. We hope to be able to bring it by 2018.2. Note that static reads of non-readonly fields and static writes will not be possible (obviously also for thread-safety reasons), as they would require access to "managed" .NET memory, which is VM-dependent (e.g. Mono or .NET).
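
    One common way to live with this restriction is to copy the static value into a field of the job struct on the main thread, so the job itself never touches static memory. A minimal sketch, assuming a hypothetical `SimulationSettings` holder and `ApplyGravityJob` (neither comes from the sample discussed above):

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

// Hypothetical managed settings holder, readable only from the main thread.
public static class SimulationSettings
{
    public static float Gravity = -9.81f;
}

[BurstCompile]
public struct ApplyGravityJob : IJobParallelFor
{
    // Plain value fields are copied into the job, so Burst never reads the static.
    public float Gravity;
    public float DeltaTime;
    public NativeArray<float> VerticalVelocities;

    public void Execute(int index)
    {
        VerticalVelocities[index] += Gravity * DeltaTime;
    }
}

public static class GravityScheduler
{
    // Reads the static on the main thread and hands the value to the job.
    public static JobHandle Schedule(NativeArray<float> velocities, float deltaTime)
    {
        var job = new ApplyGravityJob
        {
            Gravity = SimulationSettings.Gravity,
            DeltaTime = deltaTime,
            VerticalVelocities = velocities
        };
        return job.Schedule(velocities.Length, 64);
    }
}
```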
     
  10. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    Hello Xoofx, I am doing some tests. I will test burst right after this, but without burst I did this:

    Code (CSharp):
    var jobSchedule = _job.Schedule(_particleCount, 32);

    //do something seriously slow
    #if DO_SOMETHING_SERIOUSLY_SLOW
    Thread.Sleep(10);
    #endif
    jobSchedule.Complete();

    and the timings didn't change compared to the ones in my article... While the main thread is stuck for 10ms, the jobs should start, but it seems they don't? According to what you said, they should. I mean, they could have started (I cannot verify it), but in that case the timing should be 10ms faster.
     
  11. LennartJohansen

    LennartJohansen

    Joined:
    Dec 1, 2014
    Posts:
    2,394
    call JobHandle.ScheduleBatchedJobs();

    then do your sleep.

    Lennart
     
  12. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    Thanks, can you explain the difference between Schedule and ScheduleBatchedJobs?

    Edit: NVM, Got it! I don't understand why there is Schedule at all then.

    Edit2: Oh I see, it's actually a static function which is not specifically related to the jobs I just scheduled...weird
     
    Last edited: Mar 25, 2018
    holo-krzysztof likes this.
  13. LennartJohansen

    LennartJohansen

    Joined:
    Dec 1, 2014
    Posts:
    2,394
    You can set up many jobs with Schedule, then tell them to start with JobHandle.ScheduleBatchedJobs();
    If you do not start them, they will only start when you call Complete.
     
  14. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
  15. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    I lost the link with the build that includes burst, can you link it again please?

    Edit: right I forgot Burst runs only inside the editor and that I have already decided to wait for the standalone compatible version to profile it.
     
    Last edited: Mar 25, 2018
  16. GabrieleUnity

    GabrieleUnity

    Unity Technologies

    Joined:
    Sep 4, 2012
    Posts:
    116
    @sebas77

    Schedule doesn't actually schedule the jobs immediately, but adds them to a queue. Jobs are scheduled when you call JobHandle.ScheduleBatchedJobs or JobHandle.Complete. This is done for performance reasons, since scheduling individual jobs results in expensive Semaphore.Signal calls. By delaying and scheduling many jobs at the same time, this cost is instead paid only once per ScheduleBatchedJobs call.

    In ECS, we schedule jobs automatically at given sync points (you can get a bit more information here: https://github.com/Unity-Technologi...entation/content/ecs_in_detail.md#sync-points)
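
    Outside ECS, the queue-then-kick behaviour described above can be sketched as follows. This is an illustration under assumptions, not the code from the thread: `KickJobsEarly` and `SquareJob` are hypothetical names, and `Thread.Sleep` only stands in for unrelated main-thread work.

```csharp
using System.Threading;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

public class KickJobsEarly : MonoBehaviour
{
    struct SquareJob : IJobParallelFor
    {
        public NativeArray<float> Values;

        public void Execute(int index)
        {
            Values[index] = Values[index] * Values[index];
        }
    }

    void Update()
    {
        var values = new NativeArray<float>(100000, Allocator.TempJob);
        var job = new SquareJob { Values = values };

        // Schedule only queues the job; worker threads may not have picked it up yet.
        JobHandle handle = job.Schedule(values.Length, 32);

        // Flush the queue so the workers can start while the main thread is busy.
        JobHandle.ScheduleBatchedJobs();

        // Stand-in for slow, unrelated main-thread work.
        Thread.Sleep(10);

        // Wait only for whatever part of the job is still outstanding.
        handle.Complete();
        values.Dispose();
    }
}
```

    Without the `ScheduleBatchedJobs` call, the job may not start until `Complete` is reached, which would match timings that do not improve despite the 10ms of main-thread work.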
     
    sebas77 likes this.
  17. mimimiprod

    mimimiprod

    Joined:
    Oct 2, 2014
    Posts:
    17
    Any news on `Burst.BurstDelegateCompiler.CompileDelegate`? Is this still in the pipeline?
     
  18. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    416
    Unfortunately, the feature `Burst.BurstDelegateCompiler.CompileDelegate` has been cancelled.

    We are going to release `BurstCompiler.CompileFunctionPointer<T>` as part of 1.1; these function pointers are supposed to be callable only inside a job.
     
    5argon and eizenhorn like this.
  19. Greenwar

    Greenwar

    Joined:
    Oct 11, 2014
    Posts:
    54
    So then you can call methods outside of the job after it has finished running, or am I understanding it wrong?
     
  20. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    416
    I'm not sure I follow your question either, but function pointers in Burst are only callable from HPC#/Burst jobs.
    You can't call any Burst-compiled HPC# code without going through a job.
     
  21. mimimiprod

    mimimiprod

    Joined:
    Oct 2, 2014
    Posts:
    17
    Hey,


    Thank you for the answer. That is unfortunate to hear though. So, there will be no feature that allows us to use the power of the burst compiler outside of a job? Sometimes we have systems that are not big enough to be scheduled for a job; because of the overhead of creating a job. Those systems could still benefit a lot from better compiled code.
     
  22. OlegPavlov

    OlegPavlov

    Joined:
    Jun 22, 2014
    Posts:
    13
    We found that your scheduling system is not yet good enough on Android. In some cases, overall performance suffers when scheduling jobs that take 2-3 ms.

    Can we execute a job specifically in the main thread? It would help us a lot in some cases.
     
  23. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    >Can we execute a job specifically in the main thread? It would help us a lot in some cases.
    Yes. You can use .Run() on any job.
     
  24. OlegPavlov

    OlegPavlov

    Joined:
    Jun 22, 2014
    Posts:
    13
    > Yes. You can use .Run() on any job.
    There's no such thing as .Run() on any job, neither in the 2019.2 nor in the 2019.3 documentation.
    Do you mean .Execute()? And will it run with Burst optimizations?
     
  25. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    Every job type has an extension method .Run() which can be used as an alternative to .Schedule()
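
    A minimal sketch of calling `.Run()` on an `IJob`, which executes the job immediately on the calling thread instead of queuing it for the workers. `SumJob` and `RunOnMainThread` are hypothetical names used only for illustration.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

[BurstCompile]
public struct SumJob : IJob
{
    [ReadOnly] public NativeArray<float> Values;
    public NativeArray<float> Result; // single-element array holding the sum

    public void Execute()
    {
        float sum = 0f;
        for (int i = 0; i < Values.Length; i++)
        {
            sum += Values[i];
        }
        Result[0] = sum;
    }
}

public class RunOnMainThread : MonoBehaviour
{
    void Start()
    {
        var values = new NativeArray<float>(256, Allocator.TempJob);
        var result = new NativeArray<float>(1, Allocator.TempJob);
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = 1f;
        }

        var job = new SumJob { Values = values, Result = result };

        // Run() is an extension method from IJobExtensions: no Schedule, no Complete,
        // the job has finished by the time the call returns.
        job.Run();

        Debug.Log("Sum: " + result[0]);

        values.Dispose();
        result.Dispose();
    }
}
```

    For `IJobParallelFor`, the corresponding extension takes the array length, for example `job.Run(values.Length)`.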
     
  26. Kamyker

    Kamyker

    Joined:
    May 14, 2013
    Posts:
    1,085
    Was this changed? I'm running function fine in MonoBehaviour.Update().
     
  27. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    416
    Yes, it has changed. Function pointers are now callable from C# code. You need to be careful to cache .Invoke (the underlying delegate), as it is re-created every time you access the property (unlike in Burst, where it boils down to a pure function call).
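
    A minimal sketch of the caching advice, assuming a Burst version in which function pointers can be invoked from managed code. `BurstMultiplier`, `MultiplyDelegate` and `Multiply` are hypothetical names.

```csharp
using Unity.Burst;
using UnityEngine;

// The enclosing type and the static method both carry [BurstCompile].
[BurstCompile]
public class BurstMultiplier : MonoBehaviour
{
    public delegate float MultiplyDelegate(float a, float b);

    [BurstCompile]
    public static float Multiply(float a, float b)
    {
        return a * b;
    }

    // Cached once, because FunctionPointer<T>.Invoke builds a new delegate on every access.
    MultiplyDelegate _multiply;
    float _lastResult;

    void Awake()
    {
        FunctionPointer<MultiplyDelegate> pointer =
            BurstCompiler.CompileFunctionPointer<MultiplyDelegate>(Multiply);
        _multiply = pointer.Invoke;
    }

    void Update()
    {
        // Calls the cached delegate rather than reading pointer.Invoke each frame.
        _lastResult = _multiply(Time.deltaTime, 2f);
    }
}
```

    For IL2CPP builds, the Burst documentation additionally asks for a `MonoPInvokeCallback` attribute on the static method; it is left out here to keep the sketch short.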
     
    Kamyker likes this.
  28. Kamyker

    Kamyker

    Joined:
    May 14, 2013
    Posts:
    1,085
    Unfortunately, it seems that NativeArrays don't work, and without them the overhead of running simple functions makes it not worth using.
     
  29. julian-moschuering

    julian-moschuering

    Joined:
    Apr 15, 2014
    Posts:
    529
    I'm sure that will work in the future. For the time being you may just wrap it:

    Code (CSharp):
    [MethodImpl(MethodImplOptions.AggressiveInlining)]
    public static unsafe bool ComputeSomeStuff(NativeArray<float> values)
    {
        // Pass the raw pointer and length instead of the NativeArray itself.
        return ComputeSomeStuffBurst((float*)values.GetUnsafePtr(), values.Length);
    }

    [BurstCompile]
    static unsafe bool ComputeSomeStuffBurst([NoAlias] float* values, int valueCount)
    {
        // hellishly fast code
    }
     
    PutridEx likes this.
  30. Kamyker

    Kamyker

    Joined:
    May 14, 2013
    Posts:
    1,085
    Thanks @julian-moschuering

    No idea if it's possible, but it would be amazing if those functions could be edited and recompiled during play mode without a domain reload. Even if it makes caching .Invoke impossible, it would significantly speed up my workflow.
     
    Last edited: Feb 17, 2020
  31. Nyanpas

    Nyanpas

    Joined:
    Dec 29, 2016
    Posts:
    406
    Will things like [MethodImpl(MethodImplOptions.AggressiveInlining)] and [BurstCompile] be applied implicitly based on certain criteria at some point, so we don't have to specify them explicitly?
     
  32. LaireonGames

    LaireonGames

    Joined:
    Nov 16, 2013
    Posts:
    705
    For me the job system has too big of a fundamental flaw that I'm trying to work around, and I'm starting to question if it's worth it at all. It looks like Unity uses jobs internally for things like recalculating bounding volumes and particle system modules.

    Well, now that I have some huge jobs (they can take around 500ms in the editor), those Unity jobs can be blocked by them, and that feels so fundamentally wrong. I'll be walking around and everything will freeze for a noticeable second; I check the profiler and there is a huge spike on WaitForJobGroupID and within it one of my jobs.

    So now I could do things like break that job down into many smaller ones, but the reason I didn't before was that in my initial tests this batch size was the most optimal; splitting it up will mean it takes much longer overall to process all my data.

    This, however, isn't a full solution, because now my FPS is limited by the biggest job I have. I see consistent time lost per frame when I'm processing lots of smaller jobs as well.

    As a note, I've also profiled in builds (Windows standalone) and I'm seeing the same behaviour. It's just less noticeable since everything runs faster without the safety system, but it's still a big bottleneck for me.

    I've also opened a ticket about this to make sure it's not just a bug, but it doesn't feel like one.
     
  33. LaireonGames

    LaireonGames

    Joined:
    Nov 16, 2013
    Posts:
    705
    Also, the documentation is just really confusing. I've been working with a custom multi-threaded solution for years, but jobs feel messy by comparison.

    I create my job and call Schedule. Then on the main thread I check each frame (in both Update and LateUpdate) whether the handle's .IsCompleted is true, and if so I call Complete().

    Well, the documentation says things like: "The job system intentionally delays job execution until you call ScheduleBatchedJobs manually"

    And when talking about .Complete(): "this method flushes the jobs from the memory cache and starts the process of execution".

    I never call ScheduleBatchedJobs anywhere (I'm about to try it and see if it helps), and it feels like in my code Complete should never be called until the job has finished, but that instantly raises the question: why are my jobs even running, and how are they getting started?
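
    The polling pattern described in this post can be sketched roughly as below, with an explicit `ScheduleBatchedJobs` call added so the start of the job does not depend on something else flushing the queue. `LongRunningJobPoller` and `HeightmapJob` are hypothetical names, and the job body is a placeholder for the real terrain work.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

public class LongRunningJobPoller : MonoBehaviour
{
    struct HeightmapJob : IJob
    {
        public NativeArray<float> Heights;

        public void Execute()
        {
            for (int i = 0; i < Heights.Length; i++)
            {
                Heights[i] = Mathf.Sin(i * 0.01f); // placeholder for expensive work
            }
        }
    }

    NativeArray<float> _heights;
    JobHandle _handle;
    bool _running;

    void Start()
    {
        _heights = new NativeArray<float>(1 << 20, Allocator.Persistent);

        var job = new HeightmapJob { Heights = _heights };
        _handle = job.Schedule();

        // Explicitly flush the queue so the job can start on a worker thread now.
        JobHandle.ScheduleBatchedJobs();
        _running = true;
    }

    void Update()
    {
        // Poll without blocking; return early while the job is still in flight.
        if (!_running || !_handle.IsCompleted)
        {
            return;
        }

        // Complete() is still required once IsCompleted is true: it hands
        // ownership of the NativeArray back to the main thread.
        _handle.Complete();
        _running = false;

        Debug.Log("First height: " + _heights[0]);
    }

    void OnDestroy()
    {
        _handle.Complete();
        if (_heights.IsCreated)
        {
            _heights.Dispose();
        }
    }
}
```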
     
    Orimay and Nyanpas like this.
  34. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    I had problems with it too. The design is completely different from what you would expect from other multi-threaded patterns, but it makes sense eventually. It's probably the simplest way to solve the data dependency problem. The only thing I don't like about the job system is that it has been partially written in C++ when this was not necessary (they reuse a framework that is also used by the engine at the C++ level).

    If you want to know more about how it works at a low level, you can look into the work-stealing technique. Regarding how to use it, I wouldn't know where to start explaining it, but you must not use .Complete(), as it would defeat the purpose.
     
    Nyanpas likes this.
  35. LaireonGames

    LaireonGames

    Joined:
    Nov 16, 2013
    Posts:
    705
    See, I felt I was used to it and happy; it's just that when an issue comes up it leaves you scratching your head, because the documentation feels conflicting.

    It feels like jobs are mainly made for one purpose, and that's to be used in quick bursts in sync with the main thread. Long-running jobs are the afterthought, which I find strange, since they feel like the more usual use case.

    My first thought isn't to multi-thread the updates of, say, 1000 enemies; I would think to multi-thread that big task of calculating a terrain. It feels like jobs focus more on those 1000-enemies type scenarios.
     
  36. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,641
    It really depends on a lot of factors, but if you use full ECS and SystemBase, most of it should be completely transparent to you. If you are not using ECS, then it's another story.