I've spent a lot of time trying to figure out why we were seeing massive slowdowns when passing C# delegates to be invoked from a native plugin (using delegates + Marshal.GetFunctionPointerForDelegate). We were seeing a slowdown on the order of 6x when the work is executed from the native callback versus calling it directly from C#.

Initially I thought it had to do with P/Invoke overhead (i.e. when C# passes the delegate down to the native side), but that turned out to be a non-issue. After a lot of debugging and massive amounts of profiling, I managed to drill down: the ReversePInvokeWrapper that IL2CPP generates takes around 6x longer than the actual work that needs to be done. We're looking at roughly 0.1 ms per call consumed by the reverse P/Invoke wrapper versus roughly 0.02 ms of actual work done inside the callback.

I managed to instrument and profile a debug build running in IL2CPP; see the attached screenshot (this is using Orbit Profiler).

entry 0: RadixSAP_BucketJobProcessorEnki is the actual work being done (avg ~21us)
entry 1: ReversePInvokeWrapper_RadixSAP_BucketJobProcessorEnki is the reverse P/Invoke wrapper (avg ~138us)

This means the reverse P/Invoke wrapper takes around 6x the time of the actual method being called. To add to the confusion, this only seems to happen for some methods: if I call an empty method, there's no massive overhead. I'm at a loss as to how to figure out what's going wrong.

Edit: So far I have only tested this on Windows and OSX IL2CPP builds, and it happens on both of them. I haven't had time to dig into mobile/consoles yet.
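For anyone who wants to reproduce the measurement without a profiler, here is a minimal sketch of how the cost could be timed from the native side. It assumes the plugin receives the delegate as a plain function pointer taking a single user-data argument; the names JobCallback, CallTiming and timeCallback are hypothetical and not part of the actual plugin. The idea is to time the call as seen from native code, then compare that against the time measured inside the managed method itself (for example with a Stopwatch). The difference approximates what the wrapper and the native-to-managed transition cost.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical callback signature: the real delegate signature used by the
// plugin is not shown in the post, so a single void* argument is assumed.
using JobCallback = void (*)(void* userData);

// Accumulated wall-clock time spent in callback invocations, as seen from
// the native caller (this includes any wrapper/transition cost).
struct CallTiming {
    std::int64_t totalNs = 0;
    std::int64_t calls = 0;

    // Average time per call in microseconds; 0 if nothing was measured.
    double averageMicroseconds() const {
        if (calls == 0) return 0.0;
        return (static_cast<double>(totalNs) / static_cast<double>(calls)) / 1000.0;
    }
};

// Invokes `cb` `iterations` times and records how long each call takes
// when measured around the call site on the native side.
inline CallTiming timeCallback(JobCallback cb, void* userData, int iterations) {
    CallTiming timing;
    for (int i = 0; i < iterations; ++i) {
        const auto start = std::chrono::steady_clock::now();
        cb(userData);
        const auto end = std::chrono::steady_clock::now();
        timing.totalNs +=
            std::chrono::duration_cast<std::chrono::nanoseconds>(end - start).count();
        ++timing.calls;
    }
    return timing;
}
```

In the plugin, the pointer obtained from Marshal.GetFunctionPointerForDelegate would be passed as `cb`. Running the same loop with an empty managed method and with the real job should show whether the extra time scales with what the method does, which is what the profile above suggests.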