[Solved] C# Job System VS Managed threaded code

Discussion in 'C# Job System' started by Aithoneku, Aug 15, 2018.

  1. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    Short question
    What is the best way to execute expensive code that requires managed code while the game uses the C# Job System?

    A bit more elaborate
    Let's say our game needs to execute quite expensive calculations which are not (realistically) possible to write for the C# Job System (heavy cryptography work, image processing using 3rd-party libraries, generally any work which can be executed in a separate thread and is too expensive to rewrite using only blittable types). How can this be done while extensively using the job system?

    More specific example
    To be a little more specific: let's say I have a huge number of IJob instances and a similarly big number of IManagedJob (my own interface, sketched below) instances. Each of them can be executed in a separate thread; some of them must be completed in the current frame, others can take several frames. I can see multiple ways to handle this problem, but none seems optimal:

    • Use the C# Job System AND my custom managed job system at the same time.
      • Advantages: the code is simple to write.
      • Disadvantages: if my managed job system uses (core count - 1) threads, then there are (core count - 1)×2 threads in total, which makes the whole system inefficient (context switching).
    • Ignore the C# Job System and run IJob instances in my managed job system.
      • Advantages: optimal thread count.
      • Disadvantages: losing Burst optimizations; impossible to use ECS.
    • First schedule jobs from one system, then, when they complete, schedule jobs from the other.
      • Advantages: (almost) no overhead from context switching.
      • Disadvantages: requires precise timing on the main thread for when to schedule the other system's jobs, and there will always be some overhead from context switching due to jobs which last several frames.
    • Use a smaller thread count in my managed job system.
      • Advantages: less overhead from context switching.
      • Disadvantages: depending on the managed tasks, this might be the least effective variant (if there are too many managed jobs).
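    For concreteness, a minimal sketch of what my custom managed job system roughly looks like (heavily simplified and with hypothetical names; IManagedJob is the interface mentioned above, the rest is purely illustrative, not the project's actual code):

    Code (CSharp):
    using System;
    using System.Collections.Concurrent;
    using System.Threading;

    interface IManagedJob
    {
        void Execute();
    }

    class ManagedJobSystem
    {
        // Managed jobs are queued and consumed by (core count - 1) plain threads,
        // completely outside the Unity C# Job System.
        readonly BlockingCollection<IManagedJob> _queue = new BlockingCollection<IManagedJob>();

        public ManagedJobSystem()
        {
            int workers = Math.Max(1, Environment.ProcessorCount - 1);
            for (int i = 0; i < workers; i++)
                new Thread(WorkerLoop) { IsBackground = true }.Start();
        }

        public void Schedule(IManagedJob job) => _queue.Add(job);

        void WorkerLoop()
        {
            foreach (IManagedJob job in _queue.GetConsumingEnumerable())
                job.Execute();
        }
    }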
    Real-world problem
    If you want to ask what real-world problem I'm solving: I joined a certain project which is in the middle of development and uses its own managed job system. My task is to implement something which could hugely benefit from the C# Job System (lots of low-level calculations on arrays), but rewriting their managed jobs for the C# Job System is not possible due to time and financial constraints.

    Hypothetical problem
    After my experience in the industry, I'm 99% sure that there will always be jobs which require managed code. I cannot imagine how to write such code while using ECS in new projects.
     
  2. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    You could use the low-level APIs to make your own job type.

    https://github.com/Unity-Technologi...ter/Documentation/content/custom_job_types.md

    A simple way would be to use a GCHandle to keep a reference to a managed object, then in the custom job cast it to the required interface and invoke whatever class you need to run your non-Burst C# code.

    In this setup you lose:
    * The safety system: you have full access to C# and any memory, and there is nothing to report race conditions.
    * Burst can't be used.

    But at least you can have jobs with dependencies on other jobs.

    So I would suggest keeping as much of the code as possible in NativeContainers + real C# jobs, and then keeping the code that can't be ported, because it depends on code you don't control, in those GCHandle-based custom managed job types.

    Optimally, of course, over time you find a way to convert all of the C# code to be Burst-compliant with good data layout, so that you actually get speedups beyond just running the code in parallel.
     
    Maeslezo and Aithoneku like this.
  3. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    I see, thank you very much, I'll check that. It seems like the solution I was looking for.

    Even if I don't take into account things I cannot control (in our development)... I really don't think I'm capable of doing that in reasonable time and quality for such a huge codebase. I mean, converting "all of the C# code to be Burst-compliant" means losing a lot of OOP features and almost feels like switching back to a very restricted subset of C.
     
  4. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    I tried GCHandle and found a working solution, so I'm posting it here for reference and (potential) review:

    Code (CSharp):
    using System.Runtime.InteropServices;
    using Unity.Jobs;

    interface ITask
    {
        void Execute();
    }

    struct Job : IJob
    {
        public GCHandle _Task;

        public void Execute()
        {
            ITask task = (ITask)_Task.Target;
            task.Execute();
        }
    }

    // Get the instance of the task to be executed.
    ITask task = GetTaskInstance();

    // Create a native reference to the task. Note: if there are problems with
    // the GCHandle instance in the future, it can be converted to/from IntPtr.
    GCHandle taskHandle = GCHandle.Alloc(task);

    // Schedule the job.
    Job job = new Job()
    {
        _Task = taskHandle,
    };

    JobHandle jobHandle = job.Schedule();

    // ...

    jobHandle.Complete();

    // Release the native reference to the object.
    taskHandle.Free();
    As I thought, in a real-world project, this is pretty much out of the question...
     
  5. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    I'm sorry for bothering you again. I would just like to ask/confirm: is using GCHandle and managed code in a job's Execute function (not in member fields), in jobs without the Burst attribute, going to be officially supported in the future? For example, AFAIK you're planning to block usage of static variables/methods from jobs with static analysis - is there a possibility that you are going to block managed code in jobs without the Burst attribute?
     
  6. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    We will perform static analysis, but as with all Burst-related features our principle is "performance and safety by default". It's on by default, but you can disable it where necessary.

    An example of the current approach is this:
    Code (CSharp):
    [NativeDisableParallelForRestriction]
    NativeArray<int> indexList;

    // or

    [NativeDisableUnsafePtrRestrictionAttribute]
    MyPointer* pointer;
    which allows you to use unsafe pointers in a job. Clearly that's unsafe and not a great default to allow. So our approach is that when you choose things that aren't provably safe, you have to manually specify that this is what you intended. At that point you are on your own to shoot yourself in the foot :)

    The same principles will apply to static analysis.
     
    Aithoneku likes this.
  7. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
    Also do note that the GCHandle approach is used in ECS for streaming scenes right now. Turns out you need a string for that...
     
    Last edited: Aug 31, 2018
    Orimay and Aithoneku like this.
  8. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    Thank you for your answer.

    (I interpret this as: we will always be able to call managed code from jobs (without the Burst attribute), and if static analysis is used to block such code, we will be given an attribute to disable that safety check per case.)
     
  9. Joachim_Ante

    Joachim_Ante

    Unity Technologies

    Joined:
    Mar 16, 2005
    Posts:
    5,203
  10. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    In your context, without the Burst speedup, I don't see any advantage of the Unity Job System over a plain C# one - or am I wrong?
     
  11. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    There are advantages:
    • When I use the Unity C# Job System, I don't need to spawn my own threads. Thanks to that, there are always (C-1) running threads (C = core count), so the system isn't slowed down by context switching.
    • When I have a big task and part of it can be written as Burst-optimized code. In other words, there is a task T which can be separated into T1 (managed code), T2 (Burst-optimized code) and T3 (managed code).
      • If I don't use the C# Job System for T1 and T3, then I schedule T1 on my custom thread and wait until it's finished (which can be checked the next frame). When it's done, I take its results and schedule T2 in the job system. Again, I wait for the results - checking every frame - and when it's done, I finally schedule T3 on my custom thread.
      • If I do use the C# Job System for everything, I schedule T1, T2 and T3 at the same time as jobs using the dependency system (T3 depends on T2, which depends on T1). Then I just wait for the result of the whole T. Thanks to this approach, I avoid the delays before scheduling T2 and T3 (see the sketch below).
    So even though using managed C# code in the Unity C# Job System isn't faster by itself (because it doesn't use the Burst compiler), it's faster in the context of the whole application.
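    To make the second bullet concrete, a rough sketch of scheduling the whole chain up front (ManagedStepJob/BurstStepJob are illustrative names; the managed steps reuse the ITask/GCHandle pattern from my earlier post, and T2's loop is just placeholder number crunching):

    Code (CSharp):
    using System.Runtime.InteropServices;
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Jobs;

    // Managed step (T1 or T3): no Burst, just calls into managed code via a GCHandle.
    struct ManagedStepJob : IJob
    {
        public GCHandle Step;   // wraps an ITask instance

        public void Execute()
        {
            ((ITask)Step.Target).Execute();
        }
    }

    // Burst step (T2): plain blittable data in a NativeArray.
    [BurstCompile]
    struct BurstStepJob : IJob
    {
        public NativeArray<float> Data;

        public void Execute()
        {
            for (int i = 0; i < Data.Length; i++)
                Data[i] *= 2f;
        }
    }

    // The whole T1 -> T2 -> T3 chain is scheduled once; worker threads start each
    // step as soon as its dependency finishes, with no per-frame round trip
    // through the main thread.
    // JobHandle h1 = new ManagedStepJob { Step = t1Handle }.Schedule();
    // JobHandle h2 = new BurstStepJob   { Data = data     }.Schedule(h1);
    // JobHandle h3 = new ManagedStepJob { Step = t3Handle }.Schedule(h2);
    // ...
    // h3.Complete();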
     
  12. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    - If I'm not mistaken, you can configure the worker thread count.
    - In your example, T1 and T3 could be implemented with plain C# tasks. It seems to me that it would be faster and a lot easier to integrate with existing code.
     
  13. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    AFAIK I can't. AFAIK, that's handled internally by Unity. (If you mean the Unity C# Job System's worker threads.)

    Easier, yes. Faster, it depends. In terms of individual tasks, yes, it would be faster because there wouldn't be the overhead of the Unity C# Job System and of allocating the unmanaged reference. However, the time from the start of T1 to the end of T3 will be greater due to the delays between scheduling each sub-task.

    Example time. For simplicity, let's say we target 30 fps, so we have 33ms per frame. And let's say each sub-task (T1, T2, T3) takes 10ms.

    When managed code is executed on non-job threads:
    • Frame #1: schedule T1. It runs on my thread and takes 10ms; for the rest of the time the thread sleeps.
    • Frame #2: the main thread fetches the result of T1 and schedules T2. It runs on a Unity job system worker thread and takes 10ms; for the rest of the time the thread sleeps.
    • Frame #3: the main thread fetches the result of T2 and schedules T3. It runs on my thread and takes 10ms; for the rest of the time the thread sleeps.
    • Frame #4: the main thread fetches the result of T3.
    When managed code is executed on job threads:
    • Frame #1: schedule T1, T2 and T3 using the dependency system. A Unity job system worker thread executes T1 in 10ms, immediately starts T2 (another 10ms), and when that finishes, it starts T3. Together, that takes 30ms.
    • Frame #2: fetch the result of T3.
     
  14. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    If IJob.Schedule() can only be called from the main thread, then you are right. Once we start messing with Unity, we have to stick with it :)
     
  15. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    Exactly
     
  16. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    There is a workaround: you can use all the Unity hook points to schedule T2 and T3, such as Update, LateUpdate, OnRenderObject, OnPreRender and WaitForEndOfFrame. That provides finer granularity than a one-frame delay.
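    Roughly like this (just a sketch; ScheduleT1/ScheduleT2 are placeholders for however you actually schedule your jobs):

    Code (CSharp):
    using Unity.Jobs;
    using UnityEngine;

    class SplitScheduling : MonoBehaviour
    {
        JobHandle _t1Handle;
        bool _waitingForT1;

        void Update()
        {
            _t1Handle = ScheduleT1();
            _waitingForT1 = true;
        }

        void LateUpdate()
        {
            // If T1 happened to finish by LateUpdate, T2 starts later in the
            // same frame instead of waiting a whole frame.
            if (_waitingForT1 && _t1Handle.IsCompleted)
            {
                _t1Handle.Complete();   // fetch T1's results
                ScheduleT2();
                _waitingForT1 = false;
            }
        }

        JobHandle ScheduleT1() { /* schedule the managed T1 job here */ return default(JobHandle); }
        void ScheduleT2() { /* schedule the T2 job here */ }
    }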
     
  17. vutruc80

    vutruc80

    Joined:
    Jun 28, 2013
    Posts:
    57
    Or better: if you know T1 only runs within one frame, then after scheduling T1 in Update(), in OnRenderObject() you could call AutoResetEvent.WaitOne before scheduling T2, and at the end of T1 call AutoResetEvent.Set to unblock the main thread.
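    Again just a sketch with placeholder helpers (the managed T1 task is expected to call Set() on the event as its last step):

    Code (CSharp):
    using System.Threading;
    using Unity.Jobs;
    using UnityEngine;

    class SignalledScheduling : MonoBehaviour
    {
        readonly AutoResetEvent _t1Done = new AutoResetEvent(false);
        JobHandle _t1Handle;

        void Update()
        {
            _t1Handle = ScheduleT1(_t1Done);    // T1 calls done.Set() when it finishes
        }

        void OnRenderObject()
        {
            _t1Done.WaitOne();                  // block the main thread until T1 signals
            _t1Handle.Complete();               // returns immediately, T1 already finished
            ScheduleT2();                       // T2 still starts in the same frame
        }

        JobHandle ScheduleT1(AutoResetEvent done) { /* schedule managed T1 via GCHandle */ return default(JobHandle); }
        void ScheduleT2() { /* schedule T2 */ }
    }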
     
    Last edited: Sep 15, 2018
  18. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    For me, the GCHandle approach is the best one.
    • I can schedule all the steps at once, in one place, instead of tracking them all around the code.
    • The workaround using all the Unity hooks shortens the delay, but it's still bigger than when a worker thread starts the next step immediately. I don't think you can top that.
    • T1 running in one frame - that was just an example trying to explain that scheduling manually on the main thread will always, always add some delay. The duration of a job is really unpredictable and different on each machine, so you cannot make any kind of assumption (like "it will be finished next frame").
    • IMHO the workaround you suggest also adds more complexity to the code.
     
  19. Manufacture43

    Manufacture43

    Joined:
    Apr 21, 2017
    Posts:
    140
    I like the GCHandle approach too. The only drawback I see is the GC Alloc it causes.

    In Aithoneku's code, I see a mention of switching it to an IntPtr. How would you use it then?

    I'm trying to use UnsafeUtility.PinGCObjectAndGetAddress instead, which gives you a ulong handle instead of an actual GCHandle. I'm not sure how to convert either the pointer or the ulong handle back to the managed object...
     
  20. Aithoneku

    Aithoneku

    Joined:
    Dec 16, 2013
    Posts:
    67
    You can always use object pooling - the pool could use a structure which keeps both the managed object and its GCHandle (a rough sketch follows below). Of course, you still need to handle deallocating the memory at the appropriate time.
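    Something like this (a minimal sketch with hypothetical names, not tested): each pool entry keeps the managed task together with a pre-allocated GCHandle, so scheduling a job causes no GC allocation.

    Code (CSharp):
    using System;
    using System.Collections.Generic;
    using System.Runtime.InteropServices;

    struct PooledTask<T> where T : class
    {
        public T Task;
        public GCHandle Handle;
    }

    class TaskPool<T> : IDisposable where T : class, new()
    {
        readonly Stack<PooledTask<T>> _free = new Stack<PooledTask<T>>();

        public PooledTask<T> Rent()
        {
            if (_free.Count > 0)
                return _free.Pop();
            T task = new T();
            return new PooledTask<T> { Task = task, Handle = GCHandle.Alloc(task) };
        }

        public void Return(PooledTask<T> entry) => _free.Push(entry);

        // Free the handles only when the pool itself is torn down.
        public void Dispose()
        {
            while (_free.Count > 0)
                _free.Pop().Handle.Free();
        }
    }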

    My comment says "if there are problems with the GCHandle instance in the future, it can be converted to/from IntPtr" - the point is that I wasn't sure whether there would be some problems with using GCHandle instances with the job system, and I noticed the methods ToIntPtr and FromIntPtr - in both cases you simply store an unsafe pointer in a structure.

    Well, I can see ways to convert between them, but I don't know whether the following is safe. I don't know the unsafe part of C# well enough to know the dangers of the following!

    IntPtr is just a container for a pointer. First, I can see there are IntPtr constructors taking a void pointer and a long integer. That's not ulong, and I don't know whether it's OK in this case to cast between them. Then you can convert it to a GCHandle. But as I wrote, I don't know whether that's safe. All I can do is recommend studying how these things work.
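    For the plain GCHandle route, the round trip through IntPtr would look roughly like this (a sketch; JobViaIntPtr is a made-up name, and ITask is the interface from my earlier post). Note it only uses GCHandle.ToIntPtr/FromIntPtr, not the pinned address from UnsafeUtility:

    Code (CSharp):
    using System;
    using System.Runtime.InteropServices;
    using Unity.Jobs;

    struct JobViaIntPtr : IJob
    {
        public IntPtr TaskPtr;  // produced by GCHandle.ToIntPtr

        public void Execute()
        {
            GCHandle handle = GCHandle.FromIntPtr(TaskPtr);
            ITask task = (ITask)handle.Target;
            task.Execute();
        }
    }

    // Usage:
    // GCHandle taskHandle = GCHandle.Alloc(task);
    // JobHandle jobHandle = new JobViaIntPtr { TaskPtr = GCHandle.ToIntPtr(taskHandle) }.Schedule();
    // jobHandle.Complete();
    // taskHandle.Free();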

    But what I can add is a very important rule: if you allocate something, anything, you should deallocate it through the same source (class, unit, etc.). So if you allocate a handle with UnsafeUtility.PinGCObjectAndGetAddress, deallocate it only with UnsafeUtility.ReleaseGCObject (and don't use, for example, GCHandle.Free). This rule should be kept even when you switch your system, programming/scripting language, etc. The behavior of mixing them is undefined (unless explicitly stated)! That means "it might work now, it might work next week, but nothing guarantees that it will work later, or when you change the configuration (debug/release), the system, or anything else".
     
  21. Manufacture43

    Manufacture43

    Joined:
    Apr 21, 2017
    Posts:
    140
    Of course I was going to use ReleaseGCObject... Anyway, I did go the pooling route, but it would be nice to have it garbage-free by default! The way I did it is that my wrapper is disposable, so I must not forget to dispose it...

    Code (CSharp):
    using System;
    using System.Runtime.InteropServices;
    using Unity.Jobs;

    // Usage:
    // class MyJob : ManagedJob<MyJob>.IWork
    // {
    //     public void Execute() { }
    // }
    // ...
    // ManagedJob<MyJob> job = new ManagedJob<MyJob> { Work = new MyJob() };
    // JobHandle handle = job.Schedule();
    // handle.Complete();
    // job.Dispose();
    struct ManagedJob<T> : IJob, IDisposable where T : class, ManagedJob<T>.IWork // T must be a class so that Work is mutable
    {
        public interface IWork
        {
            void Execute();
        }

        GCHandle handle;
        public T Work
        {
            set
            {
                handle = GCHandle.Alloc(value);
            }
            get
            {
                return (T)handle.Target;
            }
        }

        void IJob.Execute()
        {
            IWork task = (IWork)handle.Target;
            task.Execute();
        }

        public void Dispose()
        {
            handle.Free();
        }
    }

    struct ManagedParallelFor<T> : IJobParallelFor, IDisposable where T : class, ManagedParallelFor<T>.IWork
    {
        public interface IWork
        {
            void Execute(int index);
        }

        GCHandle handle;
        public T Work
        {
            set
            {
                handle = GCHandle.Alloc(value);
            }
            get
            {
                return (T)handle.Target;
            }
        }

        void IJobParallelFor.Execute(int index)
        {
            IWork task = (IWork)handle.Target;
            task.Execute(index);
        }

        public void Dispose()
        {
            handle.Free();
        }
    }
    I think my thing is maybe a bit overcomplicated. I can't remember why I went with that generic mess instead of just having my managed jobs plug into a non-generic ManagedJob (roughly like the sketch below).
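    For comparison, the non-generic variant would look roughly like this (just a sketch, same usings as the block above):

    Code (CSharp):
    struct SimpleManagedJob : IJob, IDisposable
    {
        public interface IWork
        {
            void Execute();
        }

        GCHandle handle;

        public static SimpleManagedJob Create(IWork work)
        {
            return new SimpleManagedJob { handle = GCHandle.Alloc(work) };
        }

        public void Execute()
        {
            ((IWork)handle.Target).Execute();
        }

        public void Dispose()
        {
            handle.Free();
        }
    }

    // Usage:
    // SimpleManagedJob job = SimpleManagedJob.Create(myWork);  // myWork implements SimpleManagedJob.IWork
    // JobHandle handle = job.Schedule();
    // handle.Complete();
    // job.Dispose();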
     
    forestrf likes this.
  22. funkyCoty

    funkyCoty

    Joined:
    May 22, 2018
    Posts:
    727
    Note: if any jobs in the dependency chain touch native containers directly, rather than through a GCHandle, the safety system breaks down in the editor.

    Example:
    Job A touches a native container (in a Burst'd job or otherwise).
    Job B depends on Job A, but only uses a GCHandle to also touch the same native container.

    In the editor, this will throw safety exceptions and not work. In standalone builds, however, it works as it should.
     
  23. PrimeAlly

    PrimeAlly

    Joined:
    Jan 2, 2013
    Posts:
    35
    There are a couple of gotchas to this:
    1. Future versions of Unity will prevent global variable access from jobs using static analysis
    2. Allocating managed memory in jobs is incredibly slow

    https://docs.unity3d.com/Manual/JobSystemTroubleshooting.html

    I'm not sure there is a way around this if you want to go for the managed approach? @Joachim_Ante @Aithoneku

    In my testing, for thousands of string allocations looped in parallel (IJobParallelFor) there is a huge penalty compared to running them on the main thread... OTOH that also seems to be true for the Task Parallel Library, so perhaps my string test isn't suited to multithreading in general.
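    The kind of test I mean is roughly this shape (a simplified sketch, not my exact code): a non-Burst IJobParallelFor where each iteration makes a managed string allocation.

    Code (CSharp):
    using Unity.Jobs;

    struct StringAllocJob : IJobParallelFor
    {
        public void Execute(int index)
        {
            // Managed allocation on a worker thread; the result is discarded,
            // so this only measures allocation cost.
            string s = "item " + index.ToString();
        }
    }

    // JobHandle handle = new StringAllocJob().Schedule(100000, 64);
    // handle.Complete();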
     
    Last edited: Apr 6, 2020
  24. XOuYang

    XOuYang

    Joined:
    Jun 17, 2021
    Posts:
    4
    Hi, I'm confused about the Unity main thread vs. job worker threads. Why does my battle logic take a normal amount of time when it runs on the Unity main thread, but cost much more when it runs on a job worker thread? I checked that the job thread is running on a big core. What is the difference between a job worker thread and the Unity main thread, given they are both running on big cores? Can you give me some tips?