[Solved] C# Job System VS Managed threaded code

Discussion in 'C# Job System' started by Aithoneku, Aug 15, 2018.

  1. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    Short question
    What is the best way to execute expensive code that requires managed code while the game uses the C# Job System?

    A bit more elaborate
    Let's say our game needs to execute quite expensive calculations which are not (realistically) possible to write for the C# Job System (heavy cryptography work, image processing using 3rd-party libraries, generally any work which can be executed in a separate thread and is too expensive to rewrite using only blittable types). How can this be done while extensively using the job system?

    More specific example
    To be a little more specific: let's say I have a huge number of IJob instances and a similarly big number of IManagedJob (my own interface, sketched below) instances. Each of them can be executed in a separate thread; some of them must be completed in the current frame, others can take several frames. I can see multiple ways to handle this problem, but none seems optimal:

    • Use the C# Job System AND my custom managed job system at the same time.
      • Advantages: the code is simple to write.
      • Disadvantages: if my managed job system uses (core count - 1) threads, then there are (core count - 1)×2 threads in total, which makes the whole system inefficient (context switching).
    • Ignore the C# Job System and run IJob instances in my managed job system.
      • Advantages: optimal thread count.
      • Disadvantages: losing Burst optimizations; impossible to use ECS.
    • First schedule jobs from one system, then, when they complete, schedule jobs from the other.
      • Advantages: (almost) no overhead from context switching.
      • Disadvantages: requires precise timing on the main thread for when to schedule the other system's jobs, and there will always be some overhead from context switching due to jobs which last several frames.
    • Use a smaller thread count in my managed job system.
      • Advantages: less overhead from context switching.
      • Disadvantages: depending on the managed tasks, this might be the least effective variant (if there are too many managed jobs).
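    For concreteness, a minimal sketch of what my custom managed job system roughly looks like (heavily simplified and with hypothetical names; IManagedJob is the interface mentioned above, the rest is purely illustrative, not the project's actual code):

    Code (CSharp):
    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    interface IManagedJob
    {
        void Execute();
    }

    class ManagedJobSystem
    {
        // Managed jobs are queued and consumed by (core count - 1) plain threads,
        // completely outside the Unity C# Job System.
        readonly BlockingCollection<IManagedJob> _queue = new BlockingCollection<IManagedJob>();

        public ManagedJobSystem()
        {
            int workers = Math.Max(1, Environment.ProcessorCount - 1);
            for (int i = 0; i < workers; i++)
                new Thread(WorkerLoop) { IsBackground = true }.Start();
        }

        public void Schedule(IManagedJob job) => _queue.Add(job);

        void WorkerLoop()
        {
            foreach (IManagedJob job in _queue.GetConsumingEnumerable())
                job.Execute();
        }
    }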
    Real-world problem
    If you want to ask what real-world problem I'm solving: I joined a certain project which is in the middle of development and uses its own managed job system. My task is to implement something which could hugely benefit from the C# Job System (lots of low-level calculations on arrays), but rewriting their managed jobs for the C# Job System is not possible due to time and financial constraints.

    Hypothetical problem
    After my experience in the industry, I'm 99% sure that there will always be jobs which require managed code. I cannot imagine how to write such code while using ECS in new projects.
     
  2. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    You could use the low-level APIs to make your own job type.

    https://github.com/Unity-Technologi...ter/Documentation/content/custom_job_types.md

    A simple way would be to use a GCHandle to keep a reference to a managed object, then in the custom job cast it to the required interface and invoke whatever class you need to run your non-Burst C# code.

    In this setup you lose:
    * The safety system: you have full access to C# and any memory, and there is nothing to report race conditions.
    * Burst can't be used.

    But at least you can have jobs with dependencies on other jobs.

    So I would suggest keeping as much of the code as possible in NativeContainers + real C# jobs, and then keeping the code that can't be ported, because it depends on code you don't control, in those GCHandle-based custom managed job types.

    Optimally, of course, over time you find a way to convert all of the C# code to be Burst-compliant with good data layout, so that you actually get speedups beyond just running the code in parallel.
     
    Maeslezo and Aithoneku like this.
  3. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    I see, thank you very much, I'll check that. It seems like the solution I was looking for.

    Even if I don't take into account things I cannot control (in our development)... I really don't think I'm capable of doing that in reasonable time and quality for such a huge codebase. I mean, converting "all of the C# code to be Burst-compliant" means losing a lot of OOP features and almost feels like switching back to a very restricted subset of C.
     
  4. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    I tried GCHandle and found a working solution, so I'm posting it here for reference and (potential) review:

    Code (CSharp):
    using System.Runtime.InteropServices;
    using Unity.Jobs;

    interface ITask
    {
        void Execute();
    }

    struct Job : IJob
    {
        public GCHandle _Task;

        public void Execute()
        {
            ITask task = (ITask)_Task.Target;
            task.Execute();
        }
    }

    // Get the instance of the task to be executed.
    ITask task = GetTaskInstance();

    // Create a native reference to the task. Note: if there are problems with
    // the GCHandle instance in the future, it can be converted to/from IntPtr.
    GCHandle taskHandle = GCHandle.Alloc(task);

    // Schedule the job.
    Job job = new Job()
    {
        _Task = taskHandle,
    };

    JobHandle jobHandle = job.Schedule();

    // ...

    jobHandle.Complete();

    // Release the native reference to the object.
    taskHandle.Free();
    As I thought, in a real-world project, this is pretty much out of the question...
     
  5. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    I'm sorry for bothering you again. I would just like to ask/confirm: is using GCHandle and managed code in a job's Execute function (not in member fields), in jobs without the Burst attribute, going to be officially supported in the future? For example, AFAIK you're planning to block usage of static variables/methods from jobs with static analysis - is there a possibility that you are going to block managed code in jobs without the Burst attribute?
     
  6. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    We will perform static analysis, but as with all Burst-related features our principle is "performance and safety by default". It's on by default, but you can disable it where necessary.

    An example of the current approach is this:
    Code (CSharp):
    [NativeDisableParallelForRestriction]
    NativeArray<int> indexList;

    // or

    [NativeDisableUnsafePtrRestrictionAttribute]
    MyPointer* pointer;
    which allows you to use unsafe pointers in a job. Clearly that's unsafe and not a great default to allow. So our approach is that when you choose things that aren't provably safe, you have to manually specify that this is what you intended. At that point you are on your own to shoot yourself in the foot :)

    The same principles will apply to static analysis.
     
    Aithoneku likes this.
  7. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    Also do note that the GCHandle approach is used in ECS for streaming scenes right now. Turns out you need a string for that...
     
    Last edited: Aug 31, 2018
    Orimay and Aithoneku like this.
  8. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    Thank you for your answer.

    (I interpret this as: we will always be able to call managed code from jobs (without the Burst attribute), and if static analysis is used to block such code, we will be given an attribute to disable that safety check per case.)
     
  9. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
  10. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    In your context, without the Burst speedup, I don't see any advantage of the Unity Job System over a plain C# one - or am I wrong?
     
  11. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    There are advantages:
    • When I use the Unity C# Job System, I don't need to spawn my own threads. Thanks to that, there are always (C-1) running threads (C = core count), so the system isn't slowed down by context switching.
    • When I have a big task and part of it can be written as Burst-optimized code. In other words, there is a task T which can be separated into T1 (managed code), T2 (Burst-optimized code) and T3 (managed code).
      • If I don't use the C# Job System for T1 and T3, then I schedule T1 on my custom thread and wait until it's finished (which can be checked the next frame). When it's done, I take its results and schedule T2 in the job system. Again, I wait for the results - checking every frame - and when it's done, I finally schedule T3 on my custom thread.
      • If I do use the C# Job System for everything, I schedule T1, T2 and T3 at the same time as jobs using the dependency system (T3 depends on T2, which depends on T1). Then I just wait for the result of the whole T. Thanks to this approach, I avoid the delays before scheduling T2 and T3 (see the sketch below).
    So even though using managed C# code in the Unity C# Job System isn't faster by itself (because it doesn't use the Burst compiler), it's faster in the context of the whole application.
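    To make the second bullet concrete, a rough sketch of scheduling the whole chain up front (ManagedStepJob/BurstStepJob are illustrative names; the managed steps reuse the ITask/GCHandle pattern from my earlier post, and T2's loop is just placeholder number crunching):

    Code (CSharp):
    using System.Runtime.InteropServices;
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Jobs;

    // Managed step (T1 or T3): no Burst, just calls into managed code via a GCHandle.
    struct ManagedStepJob : IJob
    {
        public GCHandle Step;   // wraps an ITask instance

        public void Execute()
        {
            ((ITask)Step.Target).Execute();
        }
    }

    // Burst step (T2): plain blittable data in a NativeArray.
    [BurstCompile]
    struct BurstStepJob : IJob
    {
        public NativeArray<float> Data;

        public void Execute()
        {
            for (int i = 0; i < Data.Length; i++)
                Data[i] *= 2f;
        }
    }

    // The whole T1 -> T2 -> T3 chain is scheduled once; worker threads start each
    // step as soon as its dependency finishes, with no per-frame round trip
    // through the main thread.
    // JobHandle h1 = new ManagedStepJob { Step = t1Handle }.Schedule();
    // JobHandle h2 = new BurstStepJob   { Data = data     }.Schedule(h1);
    // JobHandle h3 = new ManagedStepJob { Step = t3Handle }.Schedule(h2);
    // ...
    // h3.Complete();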
     
  12. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    - If I'm not mistaken, you can configure the worker thread count.
    - In your example, T1 and T3 could be implemented with plain C# tasks. It seems to me that it would be faster and a lot easier to integrate with existing code.
     
  13. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    AFAIK I can't. AFAIK, that's handled internally by Unity. (If you mean the Unity C# Job System's worker threads.)

    Easier, yes. Faster, it depends. In terms of individual tasks, yes, it would be faster because there wouldn't be the overhead of the Unity C# Job System and of allocating the unmanaged reference. However, the time from the start of T1 to the end of T3 will be greater due to the delays between scheduling each sub-task.

    Example time. For simplicity, let's say we target 30 fps, so we have 33ms per frame. And let's say each sub-task (T1, T2, T3) takes 10ms.

    When managed code is executed on non-job threads:
    • Frame #1: schedule T1. It runs on my thread and takes 10ms; for the rest of the time the thread sleeps.
    • Frame #2: the main thread fetches the result of T1 and schedules T2. It runs on a Unity job system worker thread and takes 10ms; for the rest of the time the thread sleeps.
    • Frame #3: the main thread fetches the result of T2 and schedules T3. It runs on my thread and takes 10ms; for the rest of the time the thread sleeps.
    • Frame #4: the main thread fetches the result of T3.
    When managed code is executed on job threads:
    • Frame #1: schedule T1, T2 and T3 using the dependency system. A Unity job system worker thread executes T1 in 10ms, immediately starts T2 (another 10ms), and when that finishes, it starts T3. Together, that takes 30ms.
    • Frame #2: fetch the result of T3.
     
  14. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    If IJob.Schedule() can only be called from the main thread, then you are right. Once we start messing with Unity, we have to stick with it :)
     
  15. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    Exactly
     
  16. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    There is a workaround: you can use all the Unity hook points to schedule T2 and T3, such as Update, LateUpdate, OnRenderObject, OnPreRender and WaitForEndOfFrame. That provides finer granularity than a one-frame delay.
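    Roughly like this (just a sketch; ScheduleT1/ScheduleT2 are placeholders for however you actually schedule your jobs):

    Code (CSharp):
    using Unity.Jobs;
    using UnityEngine;

    class SplitScheduling : MonoBehaviour
    {
        JobHandle _t1Handle;
        bool _waitingForT1;

        void Update()
        {
            _t1Handle = ScheduleT1();
            _waitingForT1 = true;
        }

        void LateUpdate()
        {
            // If T1 happened to finish by LateUpdate, T2 starts later in the
            // same frame instead of waiting a whole frame.
            if (_waitingForT1 && _t1Handle.IsCompleted)
            {
                _t1Handle.Complete();   // fetch T1's results
                ScheduleT2();
                _waitingForT1 = false;
            }
        }

        JobHandle ScheduleT1() { /* schedule the managed T1 job here */ return default(JobHandle); }
        void ScheduleT2() { /* schedule the T2 job here */ }
    }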
     
  17. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    Or better: if you know T1 only runs within one frame, then after scheduling T1 in Update(), in OnRenderObject() you could call AutoResetEvent.WaitOne before scheduling T2, and at the end of T1 call AutoResetEvent.Set to unblock the main thread.
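    Again just a sketch with placeholder helpers (the managed T1 task is expected to call Set() on the event as its last step):

    Code (CSharp):
    using System.Threading;
    using Unity.Jobs;
    using UnityEngine;

    class SignalledScheduling : MonoBehaviour
    {
        readonly AutoResetEvent _t1Done = new AutoResetEvent(false);
        JobHandle _t1Handle;

        void Update()
        {
            _t1Handle = ScheduleT1(_t1Done);    // T1 calls done.Set() when it finishes
        }

        void OnRenderObject()
        {
            _t1Done.WaitOne();                  // block the main thread until T1 signals
            _t1Handle.Complete();               // returns immediately, T1 already finished
            ScheduleT2();                       // T2 still starts in the same frame
        }

        JobHandle ScheduleT1(AutoResetEvent done) { /* schedule managed T1 via GCHandle */ return default(JobHandle); }
        void ScheduleT2() { /* schedule T2 */ }
    }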
     
    Last edited: Sep 15, 2018
  18. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    For me, the GCHandle approach is the best one.
    • I can schedule all the steps at once, in one place, instead of tracking them all around the code.
    • The workaround using all the Unity hooks shortens the delay, but it's still bigger than when a worker thread starts the next step immediately. I don't think you can top that.
    • T1 running in one frame - that was just an example trying to explain that scheduling manually on the main thread will always, always add some delay. The duration of a job is really unpredictable and different on each machine, so you cannot make any kind of assumption (like "it will be finished next frame").
    • IMHO the workaround you suggest also adds more complexity to the code.
     
  19. Manufacture43

    Manufacture43

    Joined:
    Apr 21, 2017
    Posts:
    140
    I like the GCHandle approach too. The only drawback I see is the GC Alloc it causes.

    In Aithoneku's code, I see a mention of switching it to an IntPtr. How would you use it then?

    I'm trying to use UnsafeUtility.PinGCObjectAndGetAddress instead, which gives you a ulong handle instead of an actual GCHandle. I'm not sure how to convert either the pointer or the ulong handle back to the managed object...
     
  20. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    You can always use object pooling - the pool could use a structure which keeps both the managed object and its GCHandle (a rough sketch follows below). Of course, you still need to handle deallocating the memory at the appropriate time.
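    Something like this (a minimal sketch with hypothetical names, not tested): each pool entry keeps the managed task together with a pre-allocated GCHandle, so scheduling a job causes no GC allocation.

    Code (CSharp):
    using System;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;

    struct PooledTask<T> where T : class
    {
        public T Task;
        public GCHandle Handle;
    }

    class TaskPool<T> : IDisposable where T : class, new()
    {
        readonly Stack<PooledTask<T>> _free = new Stack<PooledTask<T>>();

        public PooledTask<T> Rent()
        {
            if (_free.Count > 0)
                return _free.Pop();
            T task = new T();
            return new PooledTask<T> { Task = task, Handle = GCHandle.Alloc(task) };
        }

        public void Return(PooledTask<T> entry) => _free.Push(entry);

        // Free the handles only when the pool itself is torn down.
        public void Dispose()
        {
            while (_free.Count > 0)
                _free.Pop().Handle.Free();
        }
    }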

    My comment says "if there are problems with the GCHandle instance in the future, it can be converted to/from IntPtr" - the point is that I wasn't sure whether there would be some problems with using GCHandle instances with the job system, and I noticed the methods ToIntPtr and FromIntPtr - in both cases you simply store an unsafe pointer in a structure.

    Well, I can see ways to convert between them, but I don't know whether the following is safe. I don't know the unsafe part of C# well enough to know the dangers of the following!

    IntPtr is just a container for a pointer. First, I can see there are IntPtr constructors taking a void pointer and a long integer. That's not ulong, and I don't know whether it's OK in this case to cast between them. Then you can convert it to a GCHandle. But as I wrote, I don't know whether that's safe. All I can do is recommend studying how these things work.
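    For the plain GCHandle route, the round trip through IntPtr would look roughly like this (a sketch; JobViaIntPtr is a made-up name, and ITask is the interface from my earlier post). Note it only uses GCHandle.ToIntPtr/FromIntPtr, not the pinned address from UnsafeUtility:

    Code (CSharp):
    using System;
    using System.Runtime.InteropServices;
    using Unity.Jobs;

    struct JobViaIntPtr : IJob
    {
        public IntPtr TaskPtr;  // produced by GCHandle.ToIntPtr

        public void Execute()
        {
            GCHandle handle = GCHandle.FromIntPtr(TaskPtr);
            ITask task = (ITask)handle.Target;
            task.Execute();
        }
    }

    // Usage:
    // GCHandle taskHandle = GCHandle.Alloc(task);
    // JobHandle jobHandle = new JobViaIntPtr { TaskPtr = GCHandle.ToIntPtr(taskHandle) }.Schedule();
    // jobHandle.Complete();
    // taskHandle.Free();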

    But what I can add is a very important rule: if you allocate something, anything, you should deallocate it through the same source (class, unit, etc.). So if you allocate a handle with UnsafeUtility.PinGCObjectAndGetAddress, deallocate it only with UnsafeUtility.ReleaseGCObject (and don't use, for example, GCHandle.Free). This rule should be kept even when you switch your system, programming/scripting language, etc. The behavior of mixing them is undefined (unless explicitly stated)! That means "it might work now, it might work next week, but nothing guarantees that it will work later, or when you change the configuration (debug/release), the system, or anything else".
     
  21. Manufacture43

    Manufacture43

    Joined:
    Apr 21, 2017
    Posts:
    140
    Of course I was going to use ReleaseGCObject... Anyway, I did go the pooling route, but it would be nice to have it garbage-free by default! The way I did it is that my wrapper is disposable, so I must not forget to dispose it...

    Code (CSharp):
    using System;
    using System.Runtime.InteropServices;
    using Unity.Jobs;

    // Usage:
    // class MyJob : ManagedJob<MyJob>.IWork
    // {
    //     public void Execute() { }
    // }
    // ...
    // ManagedJob<MyJob> job = new ManagedJob<MyJob> { Work = new MyJob() };
    // JobHandle handle = job.Schedule();
    // handle.Complete();
    // job.Dispose();
    struct ManagedJob<T> : IJob, IDisposable where T : class, ManagedJob<T>.IWork // T must be a class so that Work is mutable
    {
        public interface IWork
        {
            void Execute();
        }

        GCHandle handle;
        public T Work
        {
            set
            {
                handle = GCHandle.Alloc(value);
            }
            get
            {
                return (T)handle.Target;
            }
        }

        void IJob.Execute()
        {
            IWork task = (IWork)handle.Target;
            task.Execute();
        }

        public void Dispose()
        {
            handle.Free();
        }
    }

    struct ManagedParallelFor<T> : IJobParallelFor, IDisposable where T : class, ManagedParallelFor<T>.IWork
    {
        public interface IWork
        {
            void Execute(int index);
        }

        GCHandle handle;
        public T Work
        {
            set
            {
                handle = GCHandle.Alloc(value);
            }
            get
            {
                return (T)handle.Target;
            }
        }

        void IJobParallelFor.Execute(int index)
        {
            IWork task = (IWork)handle.Target;
            task.Execute(index);
        }

        public void Dispose()
        {
            handle.Free();
        }
    }
    I think my thing is maybe a bit overcomplicated. I can't remember why I went with that generic mess instead of just having my managed jobs plug into a non-generic ManagedJob (roughly like the sketch below).
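    For comparison, the non-generic variant would look roughly like this (just a sketch, same usings as the block above):

    Code (CSharp):
    struct SimpleManagedJob : IJob, IDisposable
    {
        public interface IWork
        {
            void Execute();
        }

        GCHandle handle;

        public static SimpleManagedJob Create(IWork work)
        {
            return new SimpleManagedJob { handle = GCHandle.Alloc(work) };
        }

        public void Execute()
        {
            ((IWork)handle.Target).Execute();
        }

        public void Dispose()
        {
            handle.Free();
        }
    }

    // Usage:
    // SimpleManagedJob job = SimpleManagedJob.Create(myWork);  // myWork implements SimpleManagedJob.IWork
    // JobHandle handle = job.Schedule();
    // handle.Complete();
    // job.Dispose();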
     
    forestrf likes this.
  22. funkyCoty

    funkyCoty

    Joined:
    May 22, 2018
    Posts:
    727
    Note: if any jobs in the dependency chain touch native containers directly, rather than through a GCHandle, the safety system breaks down in the editor.

    Example:
    Job A touches a native container (in a Burst'd job or otherwise).
    Job B depends on Job A, but only uses a GCHandle to also touch the same native container.

    In the editor, this will throw safety exceptions and not work. In standalone builds, however, it works as it should.
     
  23. PrimeAlly

    PrimeAlly

    Joined:
    Jan 2, 2013
    Posts:
    35
    There are a couple of gotchas to this:
    1. Future versions of Unity will prevent global variable access from jobs using static analysis
    2. Allocating managed memory in jobs is incredibly slow

    https://docs.unity3d.com/Manual/JobSystemTroubleshooting.html

    I'm not sure there is a way around this if you want to go for the managed approach? @Joachim_Ante @Aithoneku

    In my testing, for thousands of string allocations looped in parallel (IJobParallelFor) there is a huge penalty compared to running them on the main thread... OTOH that also seems to be true for the Task Parallel Library, so perhaps my string test isn't suited to multithreading in general.
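    The kind of test I mean is roughly this shape (a simplified sketch, not my exact code): a non-Burst IJobParallelFor where each iteration makes a managed string allocation.

    Code (CSharp):
    using Unity.Jobs;

    struct StringAllocJob : IJobParallelFor
    {
        public void Execute(int index)
        {
            // Managed allocation on a worker thread; the result is discarded,
            // so this only measures allocation cost.
            string s = "item " + index.ToString();
        }
    }

    // JobHandle handle = new StringAllocJob().Schedule(100000, 64);
    // handle.Complete();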
     
    Last edited: Apr 6, 2020
  24. XOuYang

    XOuYang

    Joined:
    Jun 17, 2021
    Posts:
    4
    Hi, I'm confused about the Unity main thread vs. job worker threads. Why does my battle logic take a normal amount of time when it runs on the Unity main thread, but cost much more when it runs on a job worker thread? I checked that the job thread is running on a big core. What is the difference between a job worker thread and the Unity main thread, given they are both running on big cores? Can you give me some tips?