Background

I am evaluating what order of magnitude of objects I can expect my game to handle, so I have set up a barebones physics demo to test a large number of simultaneous collisions.

Setup:

- A floor, which is a large plane with a static Physics Body and a Plane Physics Shape.
- A single system which, at start, spawns 20,000 cubes from a prefab that has a dynamic Physics Body and a Sphere Physics Shape.
- The cubes are spawned at random locations within a set volume so that they all have some air time before they reach the floor. The spawning volume can be considered fairly packed.

I also have various settings to optimize performance:

- A Physics Step with Unity Physics, 1 iteration count, and Multi Threaded on. (I have tried Havok as well, with little to no difference.)
- Jobs with Use Job Threads on, no Jobs Debugger, and leak detection off.
- Burst compilation with Synchronous Compilation on and Safety Checks off.
- All shadows disabled.

Results

The spawning, combined with how packed the volume is, results in a high frame time initially. After things settle down, and while the cubes are in free fall, I get a CPU frame time of 20 – 40 ms.

This looks something like this:

And profiling it yields this hierarchy:

Then, once the majority of the cubes have reached the floor and are colliding very frequently with it and/or with other cubes, I get a CPU frame time of 400 – 500 ms.

This looks like so:

and produces this profile hierarchy:

In other words, my bottleneck seems to be in WaitForJobGroupID, the lion's share of which consists of:

- BroadPhase:DynamicVSDynamicFindOverlappingPairsJob (Burst)
- NarrowPhase:ParallelCreateContactsJob (Burst)
- Solver:ParallelBuildJacobiansJob (Burst)
- Semaphore.WaitForSignal
- Solver:ParallelSolverJob (Burst)

Question

My question is simply: is this result to be expected? And/or are there any more optimizations I can try?

Thank you!
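
To put those frame times in per-object terms, here is a minimal Python sketch of the arithmetic. It assumes the whole CPU frame time can be attributed to the 20,000 bodies, which overstates the physics cost since rendering and other systems are included. The names `BODY_COUNT` and `per_body_us` are made up for illustration and are not part of any Unity API.

```python
# Rough per-body cost implied by the frame times reported above.
# Assumption: the entire CPU frame time is spread evenly over all bodies.
BODY_COUNT = 20_000  # number of spawned cubes


def per_body_us(frame_ms, bodies=BODY_COUNT):
    """Average wall-clock microseconds per body for a frame time in ms."""
    return frame_ms * 1000.0 / bodies


# (low, high) CPU frame times in milliseconds for each phase.
phases = {
    "free fall": (20.0, 40.0),
    "settled on floor": (400.0, 500.0),
}

for name, (low_ms, high_ms) in phases.items():
    low_us = per_body_us(low_ms)
    high_us = per_body_us(high_ms)
    print(f"{name}: {low_us:.1f} to {high_us:.1f} us per body")

# Slowdown between the two phases, best case and worst case.
min_slowdown = phases["settled on floor"][0] / phases["free fall"][1]
max_slowdown = phases["settled on floor"][1] / phases["free fall"][0]
print(f"slowdown: {min_slowdown:.0f}x to {max_slowdown:.0f}x")
```

Under that assumption, free fall costs roughly 1 to 2 microseconds per body per frame, the settled pile costs roughly 20 to 25, and the slowdown between the two phases is somewhere between 10x and 25x.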