Feature Request Add API to restrict the number of workers that can be occupied by groups of jobs

Discussion in 'C# Job System' started by pbhogan, Mar 15, 2023.

  1. pbhogan

    Joined: Aug 17, 2012
    Posts: 382
    I'm not sure exactly what the API for this would look like, but please create a way to add some kind of group when scheduling jobs, and let us set worker limits (and maybe priority) per group.

    At first I thought it could be attributes on the IJob struct that defines those limits, but I think being able to add multiple job types to the group is a requirement.

    Maybe something like:

    Code (CSharp):
    // Proposed API only: JobsUtility.CreateJobGroup and this Schedule overload do not exist.
    var jobGroupHandle = JobsUtility.CreateJobGroup( maxWorkerCount, priority );
    var jobHandle = new MyJob().Schedule( jobGroupHandle );
    Maybe group is not the right name. Layer? Queue? Naming things is hard. :rolleyes:

    I'm running into a situation where periodically flooding the job system with a large number of jobs causes stuttering and pauses, presumably because it starves Unity's own jobs (culling, rendering, physics, etc.) of workers. I would really like to deprioritize certain jobs and limit how many workers can be dedicated to them. I'd rather they take longer to complete than make the game stutter.

    Someone suggested grouping jobs into IJobParallelFor schedules, with batch sizes chosen to limit the worker count, and chaining the job handles of subsequent IJobParallelFor schedules to keep it consistent. I'm going to try this, but it greatly complicates things, and it also forces me to wait for the entire IJobParallelFor group to complete before I can get individual results out.
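
The batching workaround described above might look roughly like the following sketch. It assumes that each batch of an IJobParallelFor is processed by one thread at a time, so splitting the work into at most N batches should occupy at most about N threads; how strictly the scheduler honors that is not confirmed here. SquareJob, LimitedScheduling, and BatchSizeFor are hypothetical names used for illustration, not Unity APIs.

```csharp
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical job for illustration: squares each input value.
public struct SquareJob : IJobParallelFor
{
    [ReadOnly] public NativeArray<float> Input;
    public NativeArray<float> Output;

    public void Execute(int index)
    {
        Output[index] = Input[index] * Input[index];
    }
}

public static class LimitedScheduling
{
    // Pick a batch size so the work splits into at most maxWorkers batches
    // (ceiling division, clamped so the result is never below 1).
    public static int BatchSizeFor(int length, int maxWorkers)
    {
        maxWorkers = Mathf.Max(1, maxWorkers);
        return Mathf.Max(1, (length + maxWorkers - 1) / maxWorkers);
    }

    // Schedules the job with the computed batch size. Pass the handle of a
    // previous group as dependsOn to chain groups one after another.
    public static JobHandle Schedule(
        NativeArray<float> input,
        NativeArray<float> output,
        int maxWorkers,
        JobHandle dependsOn = default)
    {
        var job = new SquareJob { Input = input, Output = output };
        int batchSize = BatchSizeFor(input.Length, maxWorkers);
        return job.Schedule(input.Length, batchSize, dependsOn);
    }
}
```

Passing the returned handle as dependsOn for the next Schedule call chains the groups. The drawback noted above remains: results only become available once the whole group has completed.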
     
  2. metallibus

    Joined: Jun 1, 2019
    Posts: 12
    Yeah, this is quite likely your problem. It seems as though many of Unity's internal systems use job workers themselves, but have no way of raising their own priority. So if you keep all of the workers busy, Unity can't get its own work done, and rendering freezes while it waits for your jobs to finish before it can do its own. This is generally the same problem as this other thread as well as some issues I was having with NavMesh....

    How is a 5-year-old job system still able to take down the rendering pipeline? If the point of jobs is to run expensive work somewhere off the main thread, why is it acceptable that giving it too much work freezes the main thread? This is basically antithetical to the purpose of the system itself.

    This seems like such a reasonable ask, and is a feature of most multithreading/work pooling systems I've used. I understand with the way jobs are set up, this is possibly slightly trickier, but how this basic feature is still missing after 5 years of development is beyond me.

    If this isn't being built, Unity's internal systems should at the very least have their own queue that gets priority, or their own worker threads that can't be clogged by user-level work. The fact that I can lock up the rendering pipeline by submitting too much data to "async" NavMesh calls is just insane.
     
  3. pbhogan

    Joined: Aug 17, 2012
    Posts: 382
    Hopefully they get to it. There is definitely active work going into the job system.

    With the latest 2020.2 releases, the work they've done optimizing the job system scheduler seems to have paid off and I'm getting less stuttering. But that's with Burst enabled and a lot of CPU cores. As soon as the jobs get longer running, or Burst is disabled, the stuttering is back.
     
  4. metallibus

    Joined: Jun 1, 2019
    Posts: 12
    This is the thing... if your job count exceeds the number of active workers, which is based on the number of CPU cores, you're screwed. If your users have fewer CPU cores, you can submit fewer jobs simultaneously before the stuttering occurs.

    I know they've been working on jobs, and I know they've tried to improve scheduling etc., but this is a fundamental problem that needs to be addressed on its own. Until it is, the problem won't go away; it will just be slightly more or less likely to come up.
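
The worker count discussed here can be inspected at runtime. A minimal sketch, assuming a Unity version that exposes JobsUtility.JobWorkerCount and JobsUtility.JobWorkerMaximumCount (the WorkerCountLogger class name is made up for illustration):

```csharp
using Unity.Jobs.LowLevel.Unsafe;
using UnityEngine;

// Hypothetical helper component: logs how many job worker threads are available.
public class WorkerCountLogger : MonoBehaviour
{
    void Start()
    {
        // Logical processors reported by the platform.
        Debug.Log($"Logical processors: {SystemInfo.processorCount}");

        // Worker threads the job system currently uses.
        Debug.Log($"Job workers in use: {JobsUtility.JobWorkerCount}");

        // Upper bound on worker threads for this device.
        Debug.Log($"Maximum job workers: {JobsUtility.JobWorkerMaximumCount}");
    }
}
```

JobWorkerCount can also be assigned to shrink the pool, but that is a global setting that affects Unity's internal jobs as well, so it is not the per-group limit requested in this thread.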
     
  5. pbhogan

    Joined: Aug 17, 2012
    Posts: 382
    Well, it depends on the jobs. If they're tiny jobs, even hundreds if not thousands of them (as in my case) can still be okay. Or rather, it is much better with the current scheduler optimizations, so kudos to the team there. Praise where it is due.

    However, when there are longer running jobs, it becomes a problem. And obviously this is an issue on consumer hardware, which has lower core counts and lower clock speeds than our development machines, which tend to be more powerful. 30% of users have 4 cores or fewer. 60% have 6 cores or fewer. My machine has 10 and I'm running into issues. I can only imagine what would happen if I tried to run this on, say, a Nintendo Switch.

    I'd be happy just to hear Unity acknowledge the problem and say they intend to address it.
     
  6. MartinMa_

    Joined: Jan 3, 2021
    Posts: 455
    Why does this thread have zero answers from a Unity dev? You bother to create a thread and no one from Unity even replies. They would rather reply to a "why do I get a null reference exception" thread. This is really sad.
     
  7. kevinmv

    Unity Technologies

    Joined: Nov 15, 2018
    Posts: 51
    @pbhogan Indeed, we are aware of the issue and plan to address it, but sadly I don't have an exact timeline I can give. We do want to provide better control over grouping jobs and over how those groups are run across worker threads and in relation to other work in the job system. We have some work underway for this and will share more when we can.

    All the best,
    Kev
     
  8. pbhogan

    Joined: Aug 17, 2012
    Posts: 382
    @kevinmv Thanks for the note Kev! We appreciate knowing it's on your radar.