Jobs are burstified in Editor, but not on standalone x64 windows client

Discussion in 'Data Oriented Technology Stack' started by sebas77, Jan 12, 2020.

  1. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    Let me start by saying that this is the first time I have had this problem on our project. So far we have used the job system and Burst for only a few things, mostly related to Physics ECS and Havok. I spent the last 3 weeks jobifying/burstifying a considerable amount of code (much more remains to be done) and I learned a lot from it. What I didn't want to learn was that code that works fine in the editor is not burstified once the client is built. I checked the log and I don't see anything wrong with it, but when I profile the code, none of the jobs turn out to be burstified.

    Log:

    D:\RobocraftX\Library\PackageCache\com.unity.burst@1.2.0-preview.11\.Runtime\bcl.exe exited after 27630 ms.
    stdout:
    Creating library C:\\Users\\Seb-e-Sam\\AppData\\Local\\Temp\\burst-aototqbdch2.qf5\\lib_burst_generated.lib and object C:\\Users\\Seb-e-Sam\\AppData\\Local\\Temp\\burst-aototqbdch2.qf5\\lib_burst_generated.exp
    Compile and Link 442 methods successful in 26891ms: D:\RobocraftX\Temp\StagingArea\Data\Plugins\lib_burst_generated.dll

    Profiler:

    upload_2020-1-12_11-49-29.png

    Those jobs are burstified in the editor, and they were previously burstified on a standalone client too. I changed neither the Burst version nor that code during the last three weeks; they were previously working as expected in the client. I need some direction to understand what's going on. On Monday I will continue this investigation.
     
  2. eizenhorn

    eizenhorn

    Joined:
    Oct 17, 2016
    Posts:
    1,692
    AOT compilation enabled in preferences?
     
  3. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    754
    Are you using generic jobs?
     
  4. GilCat

    GilCat

    Joined:
    Sep 21, 2013
    Posts:
    548
    By the looks of the timeline profiler, it doesn't seem like it.

    Try deleting BurstAotSettings_StandaloneWindows.json and any other platform JSON files in the ProjectSettings folder. I've seen people having problems with those not resetting properly.
     
  5. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    Hi,

    I am back at the office and my colleague reminded me that this is actually an old problem. It's a bug that happens with some Havok jobs and arises ONLY when a dev client is built (which I need in order to profile). I forgot about it, as it works fine when a release client is built.
     
    Last edited: Jan 13, 2020
  6. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    Any way to ensure their compilation in a non-dev build? I am having the same problem.
     
  7. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    It should be a Burst bug, but I am not sure why it happens only with Havok jobs, so I am discussing it with the Havok guys (@steveeHavok).

    @wobes Do you have this problem with Havok too?
     
  8. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    I have this problem with explicitly typed generic jobs. They were working on Burst 1.1 as far as I remember, but not anymore, so I was wondering whether what you said about non-dev builds can be verified.
     
  9. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    OK then it's the case with Havok as well. @xoofx shall we open a bug report for this?
     
  10. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    It might be related to generic jobs if they use any in their sources.
     
  11. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    Yes, it matches the Havok case. The Havok jobs are only partially burstified; the ones that are not are actually generic, so good catch!
     
  12. sheredom

    sheredom

    Unity Technologies

    Joined:
    Jul 15, 2019
    Posts:
    43
    Is there a small repro of the issue that you could possibly create? That'll help us track down where the issue is.
     
  13. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    It's probably simple to make a repro; I will give it a go.
     
  14. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    I would also appreciate it if you could reproduce the issue. Glad that I am not alone.

    Thank you.
     
    Last edited: Jan 13, 2020
  15. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    It's not so simple, but I am doing it.
    The reason (as I remembered while doing it) is that generic structs are not considered unmanaged, so they cannot be used by Burst. What people do to solve this is a semi-ugly workaround, which is probably the culprit.
    They use a combination of CreateJobReflectionData and JobScheduleParameters.

    I created a test, but it has a slightly different side effect. The job is burstified and runs in the editor, but if I build a client it doesn't run at all. It runs when I stop using [BurstCompile]. Do you think that's good enough?

    Edit: I have a better understanding of what's going on now. First, it's not true that generic jobs cannot be burstified (I now wonder what problem I hit at the beginning of this test). Second, the pattern using CreateJobReflectionData and JobScheduleParameters is actually used to introduce a new kind of job interface (akin to IJob, IJobParallelFor and so on).
     
    Last edited: Jan 13, 2020
  16. sheredom

    sheredom

    Unity Technologies

    Joined:
    Jul 15, 2019
    Posts:
    43
    Good enough starting point at least :) If you can share that it'd be grand!
     
  17. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    I am spending more time on it because I am finding out that some of my previous assumptions were wrong (it's not true that generic jobs cannot be burstified per se). Can you show me the code of the jobs that are not working for you?
     
  18. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    For instance.

    The job itself.
    Code (CSharp):
    [BurstCompile]
    public unsafe struct FieldChangeDetectionJob<T1> : IJobChunk where T1 : unmanaged, IComponentData
    {
        // (snippet truncated in the original post)

    The job wrapper class, which is later instantiated via Activator.
    Code (CSharp):
    public class FieldChangeDetector<T1> : FieldChangeDetector where T1 : unmanaged, IComponentData
    {
        public unsafe override JobHandle Schedule(JobComponentSystem jobComponentSystem, JobHandle jobHandle)
        {
            return new FieldChangeDetectionJob<T1>
            {
                // (snippet truncated in the original post)

    An explicitly typed job, so that it is visible in the Burst Inspector and included during AOT compilation.

    Code (CSharp):
    public unsafe struct TestingEntityArchetype1 : IReplicatable<Health>
    {
        public ReplicationTypeModel ReplicationTypeModel => new ReplicationTypeModel
        {
            PriorityConstant = 1000,
            PriorityDistanceMultiplier = 3000
        };

        public FieldChangeDetectionJob<Health> FieldChangeDetectionJob => new FieldChangeDetectionJob<Health>();
    }

    The above works on the earlier Burst 1.0, but not on the current version.
    To check that Burst sees the job, I use the Burst Inspector to see whether it is compiled. If it is, then it should be Bursted in the build. At least that was the behavior with Burst 1.0.
     
  19. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    1,955
    I don't think that's true.
    Any job that is executed in the editor will be Bursted (if it can be), because compilation there is JIT, but only jobs that the AOT compiler can discover will be Bursted in a build. This is why generic jobs don't work unless explicitly specified: the AOT compiler is not smart enough to figure out which generic instantiations are required.
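
To make the JIT versus AOT point concrete, here is a minimal sketch of a generic job together with one call site that names a closed instantiation. `FillJob<T>` and `FillJobUsage` are hypothetical names invented for this illustration; only `[BurstCompile]`, `IJob`, `NativeArray<T>` and `Schedule()` come from the Unity packages. Whether the second method is picked up by AOT depends on the Burst version, so treat the comments as a reading of this thread rather than as a specification.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;

// Hypothetical generic job, used only for illustration.
[BurstCompile]
public struct FillJob<T> : IJob where T : struct
{
    public NativeArray<T> Output;
    public T Value;

    public void Execute()
    {
        // Write the same value into every element of the output array.
        for (int i = 0; i < Output.Length; i++)
            Output[i] = Value;
    }
}

public static class FillJobUsage
{
    // The closed type FillJob<float> is spelled out in this method, so it
    // appears in the compiled assembly and can be found by a build-time scan.
    public static JobHandle ScheduleFloatFill(NativeArray<float> output, float value)
    {
        var job = new FillJob<float> { Output = output, Value = value };
        return job.Schedule();
    }

    // Here the job is only ever named as FillJob<T> with T still open.
    // As described in this thread, an instantiation reached only through a
    // generic path like this may run as managed code in a player build,
    // even though the editor compiles it on demand.
    public static JobHandle ScheduleFill<T>(NativeArray<T> output, T value) where T : struct
    {
        var job = new FillJob<T> { Output = output, Value = value };
        return job.Schedule();
    }
}
```

In the editor both methods end up Bursted once they run, which is why the difference only shows in a built player.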
     
    Last edited: Jan 13, 2020
  20. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    They are explicitly typed in my property, hence they work. They do not work if they are not visible in the Burst Inspector. I tested it multiple times and got generics working. It broke with the Burst update. I tested it in a dev build on Burst 1.0 and they were Bursted, but they are no longer Bursted with the new Burst update and I do not know why. The jobs are even included in lib_burst_generated.dll; I can see their tokens.

    Code (CSharp):
    public FieldChangeDetectionJob<Health> FieldChangeDetectionJob => new FieldChangeDetectionJob<Health>();
     
    Last edited: Jan 13, 2020
  21. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    Tomorrow I will work more on it. Initially I thought the problem was just about generic jobs, but that's not true. Then I thought it was due to generic jobs implementing custom job interfaces, but that's not the case either. So now I am no longer sure that reproducing the Havok case is so simple.
     
  22. tim_jones

    tim_jones

    Unity Technologies

    Joined:
    May 2, 2019
    Posts:
    47
    Right at the bottom of the Burst User Guide it has this under Known Issues:

    @wobes it's interesting if your workaround worked in previous versions of Burst. Did it work for AOT compilation too? It's not officially supported behaviour, but this is something we hope to improve in future releases.
     
  23. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    246
    This should work. We haven't changed this behavior from 1.0 to 1.2. Burst is able to detect explicit generic instances. What Burst does not support is scheduling of generic jobs (I explained this a bit here) where the job is never explicitly used with actual generic arguments (open generics vs closed generics). If you still have the issue, then we have a regression, but I would be surprised, because it would break many existing complex systems that are already in use in Unity DOTS.

    If your job is not listed in the inspector, that means Burst is not able to detect it, either because it is missing the [BurstCompile] attribute or because it is hidden behind generic job scheduling.
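
The open versus closed generics distinction is a plain .NET concept and can be checked with reflection, without any Unity dependency. The sketch below uses hypothetical stand-in types (`DetectionJob<T>`, `Health`) that only mimic the shape of the types discussed in this thread.

```csharp
using System;

// Hypothetical stand-ins; they only mimic the shape of the types in the thread.
public struct Health { public float Value; }

public struct DetectionJob<T> where T : struct
{
    public T Data;
}

public static class GenericKinds
{
    // True for an open generic such as DetectionJob<>, whose type
    // arguments have not been supplied yet.
    public static bool IsOpen(Type type)
    {
        return type.ContainsGenericParameters;
    }

    // True for a closed generic such as DetectionJob<Health>, where every
    // type argument is a concrete type.
    public static bool IsClosed(Type type)
    {
        return type.IsConstructedGenericType && !type.ContainsGenericParameters;
    }
}
```

With these helpers, `GenericKinds.IsOpen(typeof(DetectionJob<>))` and `GenericKinds.IsClosed(typeof(DetectionJob<Health>))` both return true. Only the closed form identifies one concrete type for which native code could be generated ahead of time.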
     
    Last edited: Jan 14, 2020
  24. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    @xoofx It is visible in the inspector but no longer Bursted in the build. I will make a reproduction of it. Thank you.

    @tim_jones It worked in AOT yes.
     
  25. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    246
    Btw, is it a Mono build or an IL2CPP build? I'm wondering whether there was a change in the UnityLinker workflow that would run before Burst and could remove these generics from the assemblies we are supposed to compile...
     
  26. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    @xoofx Mono. IL2CPP never worked for me. But Mono worked with Burst 1.0.
     
  27. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    Update:

    The simplest way to reproduce the Havok problem is to use the Unity Physics examples. Since Havok/Physics use custom job interfaces, in this case ICollisionEventsJob, my intuition was that it had to do with how ICollisionEventsJob is registered/used, but reproducing the same steps didn't yield any unexpected result. Hence I guess the issue must be related to the code written in IHavokCollisionEventsJobExtensions, but Burst doesn't complain about it at all; it just doesn't compile/link/generate the code when a client is built.
    The simplest/most reliable way to prove it is to check whether the code is debuggable. In the editor the code cannot be debugged because it is burstified; in the client, once the debugger is attached, the code can be debugged.

    Bug report is here: https://fogbugz.unity3d.com/default.asp?1211429_8tan1sjskrsgc82s

    Nobody double-checked it, so I hope I haven't done something stupid.
     
    Last edited: Jan 14, 2020
  28. wobes

    wobes

    Joined:
    Mar 9, 2013
    Posts:
    731
    @xoofx, @tim_jones It turned out to be a bug on my side. My approach now fully works with the latest version of Burst.
    The problem was my ordinal sorting of components by name, per archetype, for caching purposes.

    So, for instance, even though a job like this is visible in the Burst Inspector:

    Code (CSharp):
    public FieldChangeDetectionJob<Health, Mana, Energy> FieldChangeDetectionJob => new FieldChangeDetectionJob<Health, Mana, Energy>();
    it will be scheduled as <Energy, Health, Mana> because I sorted it that way, and that is why it was not Bursted initially.

    Thank you for your attention and sorry for the false alarm.
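
The mismatch described in this post can be reproduced with plain reflection. The sketch below is an assumption about the general shape of the caching code, not the poster's actual implementation: the component structs and `FieldChangeDetectionJob<T1, T2, T3>` are empty stand-ins with no Unity dependencies, and `BuildSortedJobType` is a hypothetical helper.

```csharp
using System;
using System.Linq;

// Empty stand-ins for the component types named in the post.
public struct Health { }
public struct Mana { }
public struct Energy { }

// Stand-in with the same name as the job in the post; no Unity dependencies.
public struct FieldChangeDetectionJob<T1, T2, T3> { }

public static class SortedInstantiation
{
    // Hypothetical helper: closes the generic job over the component types
    // after sorting them by name with ordinal comparison.
    public static Type BuildSortedJobType(params Type[] componentTypes)
    {
        Type[] sorted = componentTypes
            .OrderBy(t => t.Name, StringComparer.Ordinal)
            .ToArray();
        return typeof(FieldChangeDetectionJob<,,>).MakeGenericType(sorted);
    }
}
```

Calling `BuildSortedJobType(typeof(Health), typeof(Mana), typeof(Energy))` yields `FieldChangeDetectionJob<Energy, Health, Mana>`, which is a different closed type from the explicitly declared `FieldChangeDetectionJob<Health, Mana, Energy>`. An ahead-of-time compiler that only saw the declared ordering would have no native code for the sorted one.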
     
  29. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    246
    Thanks for the repro. This is indeed an issue in Havok, which should actually redirect the JobProducerType to the Havok assembly, but this scenario is not supported today, so any Havok jobs will go through Mono instead.

    Fixing this requires adding support in Burst for a new kind of "JobProducerType" (allowing a job Execute entry point method to be defined in a different assembly by type name string and not by System.Type), and Physics/Havok will have to use this new system. We are tracking the issue. I'm not sure what the ETA is, but I'm worried that it won't come before 1.4-preview (after GDC and before June...).
     
  30. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    Thanks a lot. I just wonder why it is a debug-only problem, as in release it actually works.

    Edit: @xoofx I also want to highlight that a very similar mechanism is used for Unity Physics ECS and it doesn't cause any problem in that case. It's only when I enable Havok Physics that it breaks, and the JobProducerType code looks very similar to me.
     
    Last edited: Jan 16, 2020 at 1:01 PM
  31. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    246
    Even in release, the code running in Havok is not Burst compiled. Without Havok, Physics should run with Burst. The problem is that JobProducerType doesn't support redirecting the implementation to another assembly (Havok), which means that Burst cannot detect it, so the managed versions of the Havok jobs are going to run.
     
  32. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    @xoofx Let me check whether I got it right. Are you saying that my C# code (as seen in the screenshot in my first post) cannot be burstified because it's used in a callback inside the Havok package (using the JobProducer pattern), but it is correctly burstified in the normal Physics package (which uses the same JobProducer pattern)?

    Even if this is correct, our game is currently live with Havok, and if I build the same client in Debug the frame rate drops to 6 fps.
    If I compare the Editor (which burstifies the job) with the Debug client, I can see that all the ICollisionEventsJob and IContactsJob implementations we wrote take N times longer on the client. Something doesn't add up with what you say.
     
    Last edited: Jan 16, 2020 at 6:01 PM
  33. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    246
    If it is debug with Havok and Mono, debug mode with Mono is going to generate crappy code. I believe that what you see is the Mono JIT not performing well. When Havok is active, nothing will run with Burst (for jobs that are shared between Havok and Physics).
     
  34. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    I always thought that there wasn't much difference between debug jitted code and release jitted code in Unity (or at least I have never noticed it in 7 years). Anyway, I don't think that's the case here, because the huge difference in performance is only in some specific functions, the ones I listed. Now that I think about it, I must have misunderstood what you are saying, because obviously when I profile in the editor, that is debug code, and it is much faster than the code produced for the debug client for those specific jobs, while the rest of the code has similar performance.

    Take this into consideration: we support both Havok and Physics and we didn't notice much difference in performance between the two. If we had, we wouldn't have officially supported Havok. I am still not convinced; the only alternative explanation would be that Physics is not being burstified either.

    Edit: extra note, I never mentioned that as a workaround I am now actually using BurstCompiler.CompileFunctionPointer to burstify the code inside the jobs, which fixed the issue, in the sense that the debug client now runs as fast as the release one. So I am still convinced that what you are saying is not what we see.

    P.S.: all I wrote here can be reproduced in the bug example I uploaded.
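
For readers unfamiliar with the workaround mentioned above, this is a minimal sketch of how `BurstCompiler.CompileFunctionPointer` is typically used. It is not the poster's code: `CollisionMath`, `SquaredLengthDelegate` and `SquaredLength` are hypothetical names, and how the resulting pointer is wired into the Havok callback jobs is not shown in the thread.

```csharp
using Unity.Burst;

// Hypothetical container for a Burst-compiled static function.
[BurstCompile]
public static class CollisionMath
{
    // Delegate type describing the signature of the compiled function.
    public delegate float SquaredLengthDelegate(float x, float y, float z);

    // Static method marked for Burst so it can be compiled to a function pointer.
    [BurstCompile]
    public static float SquaredLength(float x, float y, float z)
    {
        return x * x + y * y + z * z;
    }

    // Asks Burst for a native function pointer to SquaredLength.
    // Call this once from the main thread and cache the result.
    public static FunctionPointer<SquaredLengthDelegate> Compile()
    {
        return BurstCompiler.CompileFunctionPointer<SquaredLengthDelegate>(SquaredLength);
    }
}
```

A caller would store the returned `FunctionPointer<SquaredLengthDelegate>` in a job field and call `pointer.Invoke(x, y, z)` inside `Execute`. That way the hot code runs as native code even when the job wrapper itself is executed by Mono, which would be consistent with the profiler not showing the (Burst) label on the job.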
     
    Last edited: Jan 17, 2020 at 2:10 PM
  35. xoofx

    xoofx

    Unity Technologies

    Joined:
    Nov 5, 2016
    Posts:
    246
    I don't know, but in the code sample you attached, I debugged the standalone player directly, and I could verify that the job it was trying to resolve at runtime was not the same one as at compile time. So the Havok job was not running with Burst but as managed code. You can put a Debug.Log in your job behind a [BurstDiscard] method, and you will see it immediately in your window.
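
The check suggested here can be sketched as follows. `ProbeJob` and `LogIfManaged` are hypothetical names; the technique relies on `[BurstDiscard]`, which removes calls to the marked method when the caller is compiled by Burst, so the message can only appear when the job runs as managed code.

```csharp
using Unity.Burst;
using Unity.Collections;
using Unity.Jobs;
using UnityEngine;

// Hypothetical job used only to demonstrate the check.
[BurstCompile]
public struct ProbeJob : IJob
{
    public NativeArray<int> Result;

    public void Execute()
    {
        // Stripped by Burst; executed only when this job runs through Mono/IL2CPP.
        LogIfManaged();

        // Placeholder for the real work of the job.
        Result[0] = 42;
    }

    // Methods marked [BurstDiscard] must return void.
    [BurstDiscard]
    private static void LogIfManaged()
    {
        Debug.Log("ProbeJob.Execute is running as managed code (not Burst-compiled).");
    }
}
```

If the message shows up in the player log of a build but not in the editor, the job is falling back to managed execution in the build.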
     
  36. sebas77

    sebas77

    Joined:
    Nov 4, 2011
    Posts:
    1,019
    This bewilders me. I have to say that using BurstCompiler.CompileFunctionPointer makes the debug client usable for profiling (meaning it runs at the same frame rate as the release client instead of 5-6 fps), but the Unity Profiler still doesn't show the (Burst) text next to the job name. However, this is what I see in the profiler when I profile the game in the editor (with similar results if I build a debug client using CompileFunctionPointer, and a more or less equivalent frame time in a release build with or without CompileFunctionPointer):

    upload_2020-1-17_17-9-10.png

    This is what the profiler shows when I run Havok:

    upload_2020-1-17_17-45-44.png

    Same level/configuration/code, of course. The Havok code is actually faster; I am not even sure why.

    From my first screenshot you can see that if I build a debug client without using CompileFunctionPointer, I get this instead:

    upload_2020-1-17_17-54-34.png

    If we had those times in a release build (which doesn't use CompileFunctionPointer), we would surely have noticed.
     
    Last edited: Jan 17, 2020 at 5:55 PM