Search Unity

Feature Request Allow to call managed methods from Burst without requiring so much boilerplate

Discussion in 'Burst' started by Enderlook, May 18, 2023.

  1. Enderlook

    Enderlook

    Joined:
    Dec 4, 2018
    Posts:
    52
    Sometimes you want to call managed functions from Burst, but there is no handy way of doing it.

    At the moment the current approach is:

    Code (CSharp):
    1. [UnmanagedFunctionPointer(CallingConvention.Cdecl)]
    2. private delegate bool ManagedFunction(int i);
    3.  
    4. private static object GcRoot;
    5. private static readonly SharedStatic<IntPtr> ManagedFunction = SharedStatic<IntPtr>.GetOrCreate<SomeType>();
    6.  
    7. [MonoPInvokeCallback(typeof(ManagedFunction))]
    8. public static void MyManagedFunction(int i)
    9. {
    10.     // Something managed.
    11. }
    12.  
    13. public static void EnsureFunctionPointersAreSet()
    14. {
    15.     // You must call this method before calling any of your Burst compiled methods to ensure the managed function pointers are initialized.
    16.  
    17.     if (GcRoot == null)
    18.     {
    19.          ManagedFunction del = MyManagedFunction;
    20.          ManagedFunction.Data = Marshal.GetFunctionPointerForDelegate(del);
    21.          GcRoot = del; // If the delegate is garbage, the pointer becomes invalid.
    22.     }
    23. }
    24.  
    25. [BurstCompile]
    26. public static UnmanagedMethod(int i)
    27. {
    28.     bool result = ((delegate* unmanaged[Cdecl]<int, bool>)ManagedFunction.Data)(i);
    29. }
    This approach is very verbose and error-prone. Also, since the `EnsureFunctionPointersAreSet` is actually managed, the Burst compiler can't execute it in a static constructor, so you can't store the function pointer itself in a readonly static field but must wrap it in `SharedStatic<T>`, meaning that calls inside Burst can't be direct, but must be indirect, and probably require to execute the special logic that `SharedStatic<T>` may use to store its content. Also, you require the allocate a delegate, since Unity doesn't support [UnmanagedCallersOnly], so you can't simply take the address of a managed method as it doesn't have the correct calling convention.

    It would be cool if Unity could do this or something similar automatically.
    I mean, Unity already patches calls to static methods with `[BurstCompiler]` to call the bursted implementation instead.
    So it's not too far stretched having the exact opposite: Burst compiler adding support of static managed functions (with unmanaged return type and unmanaged parameters) by some sort of patching like this one.
     
  2. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    5,975
    Calling managed methods from a Burst compiled method is IMO a code smell. You are best advised to make do without the managed method in one way or another even if that means 10 times the effort than what it currently takes you to make that managed call.

    Within Burst, Jobs, Entities the managed world is entirely separate and ought to be treated accordingly. Consider it like calling a web service via POST/GET where you cannot make direct calls either.

    Jumping through hoops to do this is actually a good thing, and it will likely remain rather complicated and convoluted because it requires a safe tunnel through to and back from the managed world. Making this part easier will actually make it too easy for devs to make those managed calls, with disappointing results (loss of performance) and ripe with issues.

    It‘s more often a mindset issue than it is an issue with Burst. Knowing the reason why you make that managed call would help. Can you convert the managed call to Burst? Do you even need to make it at this point, as in: Can you make it preemptively or delay it?

    Most commonly you can often make the managed calls before or after the burst method, and use a value type or native collection to pass values along. Sometimes it will require a redesign of surrounding managed code but the end result will be a much cleaner (and faster) codebase.

    Lastly, you can use your IDE to write/generate that boilerplate code for you and hide it in a separate (partial) file.
     
    chadfranklin47 likes this.
  3. Enderlook

    Enderlook

    Joined:
    Dec 4, 2018
    Posts:
    52
    But sometimes there is simply no API for doing that Burst.
    For example `Physics.ComputePenetration`, `Physics.OverlapCapsuleNonAlloc` or simply creating a `GameObject`.
    Either APIs that can't aren't threadsafe or require managed types to work.
    This is sometimes overlooked as Burst was intended to be used with the C# Job System, and since basically all managed APIs of Unity aren't threadsafe it doesn't make much sense.
    But there are still valid reasons to optimize code with Burst for methods that must be run in the main thread, but still require to call some managed APIs, math is still math regardless of where you call it.
     
    forestrf likes this.
  4. brunocoimbra

    brunocoimbra

    Joined:
    Sep 2, 2015
    Posts:
    679
    I don't want to invalidate the request, but you said
    However, you only gave examples of managed functions with managed parameters. If you are willing to create wrappers for each managed function you want to call, wouldn't the effort to do something like:
    Code (CSharp):
    1. void MyBurstedMethod()
    2. {
    3.     Vector3 position = /* some heavy math */;
    4.     NonBurstedMethod(position);
    5. }
    6.  
    7. void NonBurstedMethod(Vector3 position)
    8. {
    9.   new GameObject().transform.position = position.
    10. }
    Be equivalent to just:
    Code (CSharp):
    1. void NonBurstedMethod()
    2. {
    3.   new GameObject().transform.position = MyBurstedMethod().
    4. }
    5.  
    6. Vector3 MyBurstedMethod()
    7. {
    8.     return /* some heavy math */;
    9. }
    10.  
    That is supported today?

    I think the best of both worlds would be to Unity just give us burst-compatible call for existing APIs (like NativeArray overloads for Physics NonAlloc methods, a burst-compatible API for creating GameObjects, etc).

    For the threaded issue you mentioned, others have requests some way to schedule jobs on main thread that depends on jobs on other threads, this would also solve many issues that we have today when trying to use DOTS in its current state.
     
  5. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    5,975
    Could you show us a Bursted method where you make a managed call as an example?

    If I take the example of creating game objects, like spawning many in a loop, then i would run the burst code that determines what to spawn (probably jobified), it fills a native array of data for spawning, and when the method returns managed code loops over that array and calls new GameObject.

    Meaning you should double check if intermingling managed calls is actually necessary. And don‘t get fooled by thinking that iterating over a list twice or more will always be slower than doing it just once, the opposite may actually be true. It‘s the same with jobs. A single either/or job may actually be slower than two jobs, one for the either and another for the „or“ part of the code. Just by removing the conditional a job can be sped up significantly.
     
  6. Enderlook

    Enderlook

    Joined:
    Dec 4, 2018
    Posts:
    52
    Creating a wrapper for a method is much easier than rewriting the code logic so the outermost method is managed.
    For example, I have something like this:
    Code (CSharp):
    1. private unsafe bool Check(out Vector3 startPosition, out Vector3 middlePosition, out Vector3 finalPosition, out float height)
    2. {
    3.     // Managed code...
    4.  
    5.     bool result = Check(&wrapper, &cameraView, startPosition_, middlePosition_, finalPosition_, height_);
    6.  
    7.     // Managed code...
    8. }
    9.  
    10. [BurstCompile]
    11. private static unsafe bool Check(Wrapper* self, Vector2* cameraView, Vector3* startPosition, Vector3* middlePosition, [Vector3* finalPosition, float* height)
    12. {
    13.     // Unmanaged code...
    14.  
    15.     for (int x = 0; x < vertical; x++)
    16.     {
    17.         for (int y = 0; y < vertical; y++)
    18.         {
    19.             // Unmanaged code...
    20.             Set(x, y, floorCastMaxDistance);
    21.             if (self->Work(startPosition, middlePosition, finalPosition, height))
    22.                 return true;
    23.         }
    24.     }
    25.  
    26.     // Unmanaged code...
    27.     return false;  
    28. }
    29.  
    30. private bool Work(Vector3* startPosition, Vector3* middlePosition, Vector3* finalPosition, float* height)
    31. {
    32.     // Unmanaged code...
    33.  
    34. fallback:  
    35.     // Unmanaged code...
    36.  
    37.     if (ManagedCode1(0, endUpwardDepenetrateInflateHeight, &endPosition, &secondFloorCastPoint, &depenetrationPlaneNormal, &depenetrationScale, endUpwardDepenetrationFallback, &largestCollidersCount))
    38.         goto fallback;
    39.  
    40.     // Unmanaged code...
    41.  
    42.     Depenetration(0, endUpwardDepenetrateInflateHeight, &secondFloorCastHit, &endPosition, &secondFloorCastPoint, &depenetrationPlaneNormal, &depenetrationScale, endUpwardDepenetrationFallback);
    43.  
    44.     // Unmanaged code...
    45.  
    46.     ManagedCode2(&handle, &firstForwardSweepOrigin, &firstForwardSweepDirection, additionalFloorCastForwardOffset, firstForwardDepenetrateInflateRadius, 0, &newHorizontal);      
    47.  
    48.     // Unmanaged code...
    49.  
    50.     if (ManagedCode3(&firstUpwardSweepOrigin, &firstUpwardSweepDirection, firstUpwardSweepDistance, 0, firstUpwardDepenetrateInflateHeight, &playerForward, &scale))
    51.         goto fallback;
    52.              
    53.     // Unmanaged code...
    54.  
    55.     if (ManagedCode3(&secondForwardSweepOrigin, &secondForwardSweepDirection, secondForwardSweepDistance, secondForwardSweepDepenetrateInflateRadius, 0, &planeNormal, &playerForward))
    56.         goto fallback;
    57.      
    58.     // Unmanaged code...
    59. }
    60.  
    61. private bool Depenetration(float inflateRadius, float inflateHeight, RaycastHit* colliderB, Vector3* newEndPosition, Vector3* newSecondFloorCastPoint,
    62.     Vector3?* depenetrationPlaneNormal, Vector3* depenetrationScale, float fallbackDepenetrationDistance)
    63. {
    64.     // Unmanaged code...
    65.  
    66.     Managedcode4(&handle, inflateRadius, inflateHeight, colliderB, newEndPosition, &value);  
    67.  
    68.     // Unmanaged code...
    69. }
    Modifying this logic to prevent any Burst -> Managed call would be pretty messy.
     
  7. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    5,975
    Btw, you should be using float2, float3 etc in Burst compiled code to gain all the speed benefits of using Burst in the first place. That is unless you only pass those along, but as soon as you perform calculations with them use the Mathematics package types to (potentially) get vectorized code.

    What is the actual performance boost range (in percent) that you see from bursting these particular methods?
     
  8. Enderlook

    Enderlook

    Joined:
    Dec 4, 2018
    Posts:
    52
    I haven't used `float3` because the values move back and forth between managed/unmanaged, but I could try to cast the as soon as they enter the Burst land.

    With my current approach, I get a 3% performance improvement. This doesn't sound much, but since this code is run on almost every frame, even small improvements like this one accumulate with other systems also improved by Burst, and all together help us to get a 2 or 3 additional FPS, helping us to reach our target 60 FPS.
     
  9. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    5,975
    These types convert implicity, meaning this is perfectly legal:

    float3 pos = new Vector3();

    Sorry to say that, but you're simply nowhere near the benefits Burst is offering. A 3% increase is just not good for what is possible to achieve with properly bursted code. I have a hunch that this jumping back and forth between managed and bursted code is actually what's destroying any notable performance benefits, and if I remember correctly, that's also what the docs warn about that the transition to managed is costly.

    You could normally take any non-bursted calculation or loop and speed it up dozens to hundreds of times, and that's without even using jobs! I'm certain you could, by properly optimizing (ie vectorizing) those hot code paths, speed it up a lot more than what you currently measure.
     
  10. Modify your algorithm so your unmanaged and managed calls/data are separate and run separately. There is no other even marginally worthwhile way IMHO.
    The second you call into managed code you lose the meaning of burst compilation. I'm not even sure you can do that on any meaningful way. And why would you do that? You're better off to write your entire algo in managed code.
    But if you separate your data and algorithm so you can burst compile the meat of it and then apply changes in managed code (passing the crunched data into it), then you may (more likely) see some advantage of burst compilation.
     
  11. CodeSmile

    CodeSmile

    Joined:
    Apr 10, 2014
    Posts:
    5,975
    Not just some. I wouldn‘t be surprised to see the 3% improvement (if this refers to just the bursted sections) jumping up to 300% or more!
     
  12. Enderlook

    Enderlook

    Joined:
    Dec 4, 2018
    Posts:
    52
    The 3% of the increase was total execution, both managed and managed.

    I noticed that a part of my managed code could be turned into unmanaged and did, though I still have unavoidable managed calls and got a total improvement of 12% to 48% in two common user cases (the percentage includes managed code).

    The managed code can't be removed because it uses `Physics.CapsuleCastNonAlloc`, and `Physics.ComputePenetration`, as both methods do not have Burst compatible equivalents (yes, `CapsulecastCommand` does exits, but `CapsulecastCommand.ScheduleBatch(...).Complete()` is not Burst compatible, so I would have to either do a managed call or rewrite all the logic to have that outside the unmanaged method, as this code is not asynchronous so I can't use C# Job System)

    Interestingly, replacing all `Vector3` and `Quaternion` with `float3` and `quaternion` give a total improvement of 9% 45%, which is a bit less. I hypothesize this is due to the cost of casting them back and forth when travelling to managed land.
     
  13. forestrf

    forestrf

    Joined:
    Aug 28, 2010
    Posts:
    231
    Funnily enough, Physics.ComputePenetration and other physic queries that don't change the physic state are thread safe as stated by PhysX if the world doesn't change during their execution.
    In release builds, Physics.ComputePenetration can be called from another thread or a job without throwing errors because the "is in main thread" check is removed, 99% surely for performance reasons.

    With other unity API I wouldn't be safe, but this is something you may want to consider. At least it's a solution you can try if you end up needing that extra perf and rewriting half of your code for months is not an option, but you must make sure the physics world doesn't change.

    More info https://gameworksdocs.nvidia.com/PhysX/4.1/documentation/physxguide/Manual/Threading.html (afaik Unity uses PhysX 4.x)