
Calling JobHandle.Complete() on completed jobs takes very long

Discussion in 'Scripting' started by davidus2010, May 4, 2021.

  1. davidus2010

    davidus2010

    Joined:
    Dec 4, 2017
    Posts:
    72
    So, I have a script which schedules a number of Burst-compiled jobs used to generate meshes. The jobs are scheduled using the .Schedule() method. After all jobs have been scheduled, I call
    JobHandle.ScheduleBatchedJobs() to ensure they all start. I store each job's data and its handle in a list.
    Because I do not need the jobs to finish immediately (they can take as many frames as they need), I want to wait for them to finish. I do that by iterating over all pending jobs and checking whether each is done using handle.IsCompleted. If a job is completed, I call handle.Complete() so I can access its result data on the main thread again. However, this takes up a significant amount of time. Checking the profiler reveals that the handle.Complete() call ends up in multiple Semaphore.WaitForSignal calls. If I understood the docs correctly, this means it is still waiting for some jobs to finish, correct? If so, how do I wait for the jobs to be finished and then access the result data they produce?
    Another thing that confused me is the way the jobs are distributed. I can see the jobs using multiple worker threads at the beginning, but after a short time only one worker is left with most of the remaining work, instead of the work being split up somewhat evenly. I am working on a 12-core system, so I expected more than one thread to stay active (as seen in the profiler screenshots).
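    In essence, the flow looks like this (a simplified, self-contained sketch with hypothetical job and field names; the real job has far more fields):

    //Simplified sketch of the schedule-then-poll pattern described above
    //(hypothetical names, not the actual mesh generation job):
    using System.Collections.Generic;
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Jobs;

    [BurstCompile]
    struct ExampleMeshJob : IJob {
        public NativeArray<float> output;

        public void Execute() {
            for (int i = 0; i < output.Length; i++)
                output[i] = i * 0.5f; //placeholder work
        }
    }

    class JobPoller {
        private readonly List<(JobHandle handle, NativeArray<float> data)> pending = new List<(JobHandle, NativeArray<float>)>();

        public void ScheduleWork(int count) {
            for (int i = 0; i < count; i++) {
                var data = new NativeArray<float>(1024, Allocator.Persistent);
                pending.Add((new ExampleMeshJob { output = data }.Schedule(), data));
            }
            JobHandle.ScheduleBatchedJobs(); //flush the batch so the workers actually start
        }

        public void PollFinished() {
            for (int i = pending.Count - 1; i >= 0; i--) {
                if (!pending[i].handle.IsCompleted) continue; //still running, check again next frame
                pending[i].handle.Complete(); //required before reading the data
                //...consume pending[i].data on the main thread...
                pending[i].data.Dispose();
                pending.RemoveAt(i);
            }
        }
    }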
    Profiler screenshots:
    [two profiler screenshots attached]
     
  2. dgoyette

    dgoyette

    Joined:
    Jul 1, 2016
    Posts:
    4,196
    Can you show some of the job code itself? And are you sure there isn't an issue with Burst compilation? I'd try to convince myself that it's actually compiling for Burst, as there are various things you can do in your code that prevent Burst compilation from working, even if you specify it on the job. The first thing to do would be to go to Jobs -> Burst -> Inspector..., find your job, and check that it shows properly compiled assembly rather than a Burst compilation error.

    The profiler view reminds me a lot of a job I made which wasn't a parallel job, and where Burst compilation wasn't even working due to some problems with my code. In that case, it basically just used a single thread and took a very long time to run.
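    For illustration: a job Burst can compile has only unmanaged fields, while a managed field (class reference, string, plain array) makes it silently fall back to non-Burst code. A made-up sketch:

    //Illustrative sketch (made-up names):
    using Unity.Burst;
    using Unity.Collections;
    using Unity.Jobs;

    [BurstCompile] //should show compiled assembly in Jobs -> Burst -> Inspector...
    struct BurstFriendlyJob : IJob {
        public NativeArray<float> values; //unmanaged container: fine
        //public float[] managedValues;   //managed array: would block Burst

        public void Execute() {
            for (int i = 0; i < values.Length; i++)
                values[i] *= 2f;
        }
    }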
     
  3. davidus2010

    davidus2010

    Joined:
    Dec 4, 2017
    Posts:
    72
    Hi, thanks for the quick response. I already checked the Burst Inspector, and the profiler also confirmed that the jobs are Burst compiled. However, even if they weren't, that's not the issue I'm having, since that would only affect the total runtime of the job. And since I'm waiting for each job to finish anyway before "completing" it, it still doesn't explain the huge spikes on the main thread.
     
  4. dgoyette

    dgoyette

    Joined:
    Jul 1, 2016
    Posts:
    4,196
    Well, I think the main thread wait time could just be waiting for the job to finish. So, if the job takes 23ms on a background thread, then the main thread will sit there waiting for the job to complete.

    What kind of job is it? IJobParallelFor? Something else? It just doesn't look like your job is being parallelized.
     
  5. davidus2010

    davidus2010

    Joined:
    Dec 4, 2017
    Posts:
    72
    It's a job of type IJob. I was trying to avoid having to wait those 23 ms on the main thread, hence the aforementioned IsCompleted check before calling Complete(). Maybe part of my code can help make things clearer (everything but the relevant parts removed for readability):

    //imports, class name, etc skipped

    private Dictionary<int, (JobHandle handle, meshGenerationJob job, Entity ent, Mesh.MeshDataArray dataArray, double startTime)> pendingJobs;

    //other variables are here

    protected override void OnUpdate() {

        //skipped

        Entities.WithoutBurst().ForEach((<whole tons of components here>) => {

            if (pendingJobs.ContainsKey(ent.Index)) return;

            //getting required variables for the job here

            var processJob = new meshGenerationJob {
                settings = nodeSettings,
                meshData = data,
                vertexOffset = vertexOffset,
                lodNode = lodNode,
                sphereConf = sphereConf,
                heightMapSettings = heightMapSettings,
                translation = originalTranslation,
                rotation = rotation,
                bounds = bounds,
                heightMapArray = heightMapArray,
                noiseSettings = noiseConf
            };

            var handle = processJob.Schedule();

            pendingJobs.Add(ent.Index, (handle, processJob, ent, dataArray, Time.ElapsedTime));
        }).Run();

        JobHandle.ScheduleBatchedJobs();

        //complete pending mesh creation jobs
        var toRemove = new List<int>();

        foreach (var elem in pendingJobs) {

            if (!elem.Value.handle.IsCompleted) continue;

            elem.Value.handle.Complete();

            toRemove.Add(elem.Key);

            //handle job results
        }

        foreach (var elem in toRemove) {
            pendingJobs.Remove(elem);
        }
    }
     
  6. dgoyette

    dgoyette

    Joined:
    Jul 1, 2016
    Posts:
    4,196
    Hmm. Without being able to run this myself, I'm probably just speculating. I do find it surprising that you're not seeing more parallelism out of this, given that you're scheduling many jobs. But perhaps the job system doesn't try to distribute multiple jobs among multiple threads?

    My experience with jobs is with IJobParallelFor, where I'll generally create a single job handle and provide the job with a large NativeArray of data. The data is then processed across all available threads, as hoped. I don't know whether that's necessarily better, or more performant, than your approach of creating a very large number of individual jobs. Have you tried populating a NativeArray with the data used by your job and using IJobParallelFor to process it? It seems like everything you're passing into the meshGenerationJob initializer could be placed into a struct, with your job operating on a NativeArray of those structs (rough sketch at the end of this post).

    Does the profiler give you any more detail if you turn on deep profiling and look at the hierarchy view? I know that, unfortunately, just doing the work to populate the data the job will use often takes much more time than the job itself takes to process it... I wonder if your "//getting required variables for the job here" section is what's taking a long time.
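    The NativeArray-of-structs idea, roughly (hypothetical struct and field names, not your actual data):

    using Unity.Burst;
    using Unity.Collections;
    using Unity.Jobs;

    struct MeshInput { //all per-mesh parameters bundled into one struct
        public float vertexOffset;
        //...settings, bounds, etc...
    }

    [BurstCompile]
    struct MeshBatchJob : IJobParallelFor {
        [ReadOnly] public NativeArray<MeshInput> inputs;
        public NativeArray<float> results;

        public void Execute(int index) {
            //each index is independent, so the job system can split the
            //range across all available worker threads
            results[index] = inputs[index].vertexOffset * 2f;
        }
    }

    //usage: one handle covers the whole batch
    //var handle = new MeshBatchJob { inputs = inputs, results = results }
    //    .Schedule(inputs.Length, 16); //16 = iterations per batch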
     
  7. davidus2010

    davidus2010

    Joined:
    Dec 4, 2017
    Posts:
    72
    Hm, I will try taking a look at IJobParallelFor, but I doubt that would solve the root cause of my issue. I also profiled with deep profiling turned on, but that also ended at the level of Semaphore.WaitForSignal, leading me to believe that the main thread is indeed waiting for the worker thread(s) to finish. That shouldn't happen, since I check IsCompleted before calling the Complete() method. So what's confusing me is that Complete() appears to wait for a job to finish, even though I only call it once the job's IsCompleted is true.
     
  8. dgoyette

    dgoyette

    Joined:
    Jul 1, 2016
    Posts:
    4,196
    I guess there's a fair bit going on here that's different than how I approach things. It seems like your initial ForEach is causing all the job creation itself to be performed on a single background thread? But you've disabled Burst for that, which is probably why it's doing that phase so slowly. I'm trying to understand if the OnUpdate_Lambda you see on Worker 3 in your profiler screenshot is 100% the ForEach running, just scheduling the jobs, and none of that is the execution of the jobs itself.

    Instead of .Run on your ForEach, have you tried .Schedule?

    As you explain this more, it does start to sound like one of those cases where you need to do so much work just to get the data onto the background thread that it doesn't save any time.
     
  9. davidus2010

    davidus2010

    Joined:
    Dec 4, 2017
    Posts:
    72
    Hi, the first Entities.ForEach job has to do a number of tasks on the main thread, and I also confirmed that it is not where the performance issues are coming from. I can't execute it using .Schedule(), since I am using that loop to create and schedule a number of jobs from within it. The Complete() call seen in the profiler screenshots, which is causing the main-thread freezes, is located in the foreach loop that handles the job results:

    foreach (var elem in pendingJobs) {

        if (!elem.Value.handle.IsCompleted) continue;

        //this here is causing the issue:
        elem.Value.handle.Complete();
        ...
    I am using this architecture to allow jobs to run over multiple frames, which wouldn't work with a normal Entities.ForEach().ScheduleParallel().
     
  10. nomadshiba

    nomadshiba

    Joined:
    Jun 22, 2015
    Posts:
    27
    Old question, but I just had the same issue.

    From my tests, I found the cause: when all of the worker threads are busy with jobs and you call JobHandle.Complete() for another job, even a very short one, the job system first waits for one of the worker threads to become free, and the profiler shows the job that is keeping that worker busy inside the Complete() call.

    I believe that's so the job doesn't affect the core the main thread is running on.

    As a solution, if your job is an IJobFor you can make the batch size something like arr.Length / 2, so it only uses two workers/cores instead of all of them. You can also use JobsUtility.JobWorkerCount to decide how many workers you want to use. And for IJob, you can limit the number of JobHandles you combine, so that you are not keeping all of the workers busy and other short-running jobs can still run within the frame (rough sketch below).
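    A sketch of both workarounds (hypothetical names; the batch size is the knob to tune):

    using System;
    using Unity.Collections;
    using Unity.Jobs;
    using Unity.Jobs.LowLevel.Unsafe;

    struct StepJob : IJobFor {
        public NativeArray<float> arr;
        public void Execute(int i) { arr[i] += 1f; }
    }

    static class ThrottledScheduling {
        //IJobFor: a large innerloopBatchCount means few batches, so only a
        //few workers get occupied; length / 2 gives roughly two batches.
        public static JobHandle ScheduleOnTwoWorkers(StepJob job, int length) {
            return job.ScheduleParallel(length, Math.Max(1, length / 2), default);
        }

        //IJob: cap how many handles you schedule/combine at once, e.g.
        //JobsUtility.JobWorkerCount - 2, so some workers stay free for
        //other short-running jobs within the frame.
        public static int MaxConcurrentJobs() {
            return Math.Max(1, JobsUtility.JobWorkerCount - 2);
        }
    }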
     
    Last edited: Aug 31, 2022