Search Unity

  1. Welcome to the Unity Forums! Please take the time to read our Code of Conduct to familiarize yourself with the forum rules and how to post constructively.

OverlapSphere vs. EntityQuery & distance check performance comparison

Discussion in 'Physics for ECS' started by Cramonky, Sep 27, 2021.

  1. Cramonky


    Apr 1, 2013
    Hi all,

    Often times you may have a situation where you need to find entities that are within a given radius. For example in my case, a turret AI that needs to search for targets within a specified radius.

    My first instinct was to use OverlapSphere, but another potential solution I thought of was to just check the distance of every entity with a certain tag using an EntityQuery. In addition for every entity I also have to do a raycast to ensure that the AI actually had line-of-sight to the target entity.

    I made a quick test of this and frankly the performance difference I am seeing is far more than I expected, see for yourself:

    Doing an entity query and checking the distance (square distance, really) is over 23 times faster than doing an overlap sphere. For reference, here is what the scene looks like, just a random splattering of objects:

    There are 50 spheres and 50 cubes. The spheres are the main entities. They are the origins of the overlap spheres and distance checks that check for all other spheres. The cubes are just there to simulate some kind of environment to raycast against.

    Finally, here is my code I am using for testing.erything.

    OverlapSphere system + Tag component and system group
    Code (CSharp):
    1. using Unity.Collections;
    2. using Unity.Entities;
    3. using Unity.Physics;
    4. using Unity.Physics.Systems;
    5. using Unity.Transforms;
    7. [GenerateAuthoringComponent]
    8. public struct TestOverlapSphereVsQuerySystemTag : IComponentData { }
    10. public class TestPerformanceSystemGroup : ComponentSystemGroup { }
    12. [UpdateInGroup(typeof(TestPerformanceSystemGroup))]
    13. public class TestPerformanceOverlapSphereSystem : SystemBase
    14. {
    15.     private const float testSphereRadius = 100f;
    17.     private BuildPhysicsWorld buildPhysicsWorldSystem;
    19.     protected override void OnCreate()
    20.     {
    21.         buildPhysicsWorldSystem = World.GetOrCreateSystem<BuildPhysicsWorld>();
    22.     }
    24.     protected override void OnUpdate()
    25.     {
    26.         //Using overlap sphere
    27.         PhysicsWorld physicsWorld = buildPhysicsWorldSystem.PhysicsWorld;
    28.         NativeList<DistanceHit> outHits = new NativeList<DistanceHit>(Allocator.TempJob);
    30.         Entities
    31.             .WithAll<TestOverlapSphereVsQuerySystemTag>()
    32.             .WithReadOnly(physicsWorld)
    33.             .WithDisposeOnCompletion(outHits)
    34.             .ForEach((in LocalToWorld localToWorld) =>
    35.             {
    36.                 CollisionFilter overlapSphereCollisionFilter = new CollisionFilter()
    37.                 {
    38.                     BelongsTo = 1u << 31,
    39.                     CollidesWith = 1 << 2,
    40.                     GroupIndex = 0
    41.                 };
    43.                 if (physicsWorld.OverlapSphere(localToWorld.Position, testSphereRadius, ref outHits, overlapSphereCollisionFilter))
    44.                 {
    45.                     for (int i = 0; i < outHits.Length; i++)
    46.                     {
    47.                         LocalToWorld l2w = GetComponent<LocalToWorld>(outHits[i].Entity);
    49.                         CollisionFilter raycastCollisionFilter = new CollisionFilter()
    50.                         {
    51.                             BelongsTo = 1u << 31,
    52.                             CollidesWith = 1 << 1,
    53.                             GroupIndex = 0
    54.                         };
    56.                         RaycastInput raycastInput = new RaycastInput()
    57.                         {
    58.                             Start = localToWorld.Position,
    59.                             End = l2w.Position,
    60.                             Filter = raycastCollisionFilter
    61.                         };
    63.                         if (physicsWorld.CastRay(raycastInput, out Unity.Physics.RaycastHit raycastHit))
    64.                         {
    66.                         }
    67.                     }
    68.                 }
    70.             }).Run();
    72.     }
    73. }
    EntityQuery system:
    Code (CSharp):
    1. [UpdateInGroup(typeof(TestPerformanceSystemGroup))]
    2. public class TestPerformanceEntityQuerySystem : SystemBase
    3. {
    4.     private const float testSphereRadius = 100f;
    6.     private BuildPhysicsWorld buildPhysicsWorldSystem;
    7.     private EntityQuery entityQuery;
    9.     protected override void OnCreate()
    10.     {
    11.         buildPhysicsWorldSystem = World.GetOrCreateSystem<BuildPhysicsWorld>();
    12.         entityQuery = GetEntityQuery(typeof(TestOverlapSphereVsQuerySystemTag));
    13.     }
    15.     protected override void OnUpdate()
    16.     {
    17.         //Using entity query
    18.         PhysicsWorld physicsWorld = buildPhysicsWorldSystem.PhysicsWorld;
    19.         NativeArray<Entity> entityQueryResults = entityQuery.ToEntityArray(Allocator.TempJob);
    21.         Entities
    22.             .WithAll<TestOverlapSphereVsQuerySystemTag>()
    23.             .WithDisposeOnCompletion(entityQueryResults)
    24.             .WithReadOnly(physicsWorld)
    25.             .ForEach((in LocalToWorld localToWorld) =>
    26.             {
    27.                 for (int i = 0; i < entityQueryResults.Length; i++)
    28.                 {
    29.                     LocalToWorld l2w = GetComponent<LocalToWorld>(entityQueryResults[i]);
    30.                     float sqrDist = Unity.Mathematics.math.lengthsq(l2w.Position - localToWorld.Position);
    32.                     if (sqrDist < testSphereRadius * testSphereRadius)
    33.                     {
    34.                         CollisionFilter raycastCollisionFilter = new CollisionFilter()
    35.                         {
    36.                             BelongsTo = 1u << 31,
    37.                             CollidesWith = 1 << 1,
    38.                             GroupIndex = 0
    39.                         };
    41.                         RaycastInput raycastInput = new RaycastInput()
    42.                         {
    43.                             Start = localToWorld.Position,
    44.                             End = l2w.Position,
    45.                             Filter = raycastCollisionFilter
    46.                         };
    48.                         if (physicsWorld.CastRay(raycastInput, out Unity.Physics.RaycastHit raycastHit))
    49.                         {
    51.                         }
    52.                     }
    53.                 }
    55.             }).Run();
    56.     }
    57. }
    MonoBehaviour that just spawns everything:
    Code (CSharp):
    1. using Unity.Entities;
    2. using Unity.Mathematics;
    3. using Unity.Transforms;
    4. using UnityEngine;
    6. public class TestOverlapSphereVsQueryController : MonoBehaviour
    7. {
    8.     public GameObject TargetPrefab;
    9.     public GameObject EnvironmentPrefab;
    11.     public int TargetCount;
    12.     public int EnvironmentCount;
    13.     public float Radius;
    15.     private Entity targetPrefabEntity;
    16.     private Entity environmentPrefabEntity;
    18.     private EntityManager entityManager;
    19.     private BlobAssetStore blobAssetStore;
    21.     private void Awake()
    22.     {
    23.         entityManager = World.DefaultGameObjectInjectionWorld.EntityManager;
    25.         blobAssetStore = new BlobAssetStore();
    26.         GameObjectConversionSettings settings = GameObjectConversionSettings.FromWorld(World.DefaultGameObjectInjectionWorld, blobAssetStore);
    28.         targetPrefabEntity = GameObjectConversionUtility.ConvertGameObjectHierarchy(TargetPrefab, settings);
    29.         environmentPrefabEntity = GameObjectConversionUtility.ConvertGameObjectHierarchy(EnvironmentPrefab, settings);
    30.     }
    32.     private void Start()
    33.     {
    34.         Unity.Mathematics.Random r = Unity.Mathematics.Random.CreateFromIndex(0);
    36.         for (int i = 0; i < TargetCount; i++)
    37.         {
    38.             Entity targetEntity = entityManager.Instantiate(targetPrefabEntity);
    40.             Translation randomPos = new Translation()
    41.             {
    42.                 Value = (float3)transform.position + r.NextFloat3Direction() * r.NextFloat(0, Radius),
    43.             };
    45.             entityManager.AddComponentData(targetEntity, randomPos);
    46.             entityManager.AddComponent<TestOverlapSphereVsQuerySystemTag>(targetEntity);
    47.         }
    49.         for (int i = 0; i < EnvironmentCount; i++)
    50.         {
    51.             Entity envEntity = entityManager.Instantiate(environmentPrefabEntity);
    53.             Translation randomPos = new Translation()
    54.             {
    55.                 Value = (float3)transform.position + r.NextFloat3Direction() * r.NextFloat(0, Radius),
    56.             };
    58.             entityManager.AddComponentData(envEntity, randomPos);
    59.         }
    60.     }
    62.     private void OnDestroy()
    63.     {
    64.         blobAssetStore.Dispose();
    65.     }
    66. }
    All this may be obvious, but I was just really surprised at how massive the difference apparently is. Especially since using OverlapSphere may be people's first choice when doing this kind of thing. I've tested using various numbers of objects spread over different sized areas and the EntityQuery always wins. I'd be interested to hear if anyone has any input.
  2. NocturnalWisp


    Oct 2, 2019

    That's quite interesting to know. I would be curious to know the affect of using spherecastnonalloc as it doesn't allocate any data which could reduce it, as well as checking against a single layer in the layer mask instead of all of them. There are ways to optimize the sphere cast.

    I think what may be a huge part of it is the number of objects in the search. The more objects, the more the search has to query against. With a sphere cast it is running a collider through the direction given, it doesn't try and check all objects. I think for smaller projects it's less efficient, but once you have 1000s even 100000ds of thousands you may start to see different results. (Not that unity is good numbers that large)

    I imagine the query could be optimized with partitions and other algorithms, it's all really fascinating the results of the various operations.
  3. CookieStealer2


    Jun 25, 2018
    Instead of using OverlapSphere() try OverlapAABB(). I believe it would be faster. Remember that OverlapSphere() does a bunch of trigonometry to calculate the closest distances and so on. Such calculations are not in OverlapAABB. Although, it will use a box volume, but that can be fixed afterwards by checking if the distance to the hit entity is not greater than the radius.

    Here's an extension method to do that:
    Code (CSharp):
    1.         /// <summary>
    2.         /// Adds all nearby rigidBodies positioned within the sphere.
    3.         /// </summary>
    4.         public static void GetNearbyRigidBodies(this ref CollisionWorld collisionWorld, float3 position, float radius, ref NativeList<RigidBody> rigidBodies, CollisionFilter collisionFilter)
    5.         {
    6.             var input = new OverlapAabbInput
    7.             {
    8.                 Aabb = new Aabb
    9.                 {
    10.                     Max = position + new float3(radius),
    11.                     Min = position - new float3(radius)
    12.                 },
    13.                 Filter = collisionFilter,
    14.             };
    15.             var indexList = new NativeList<int>(Allocator.Temp);
    16.             collisionWorld.OverlapAabb(input, ref indexList);
    17.             var bodies = collisionWorld.Bodies;
    18.             var sqrRadius = radius * radius;
    19.             for (int i = 0; i < indexList.Length; i++)
    20.             {
    21.                 int rigidBodyIndex = indexList[i];
    22.                 RigidBody rigidBody = bodies[rigidBodyIndex];
    23.                 if (math.lengthsq(rigidBody.WorldFromBody.pos - position) <= sqrRadius)
    24.                 {
    25.                     rigidBodies.Add(rigidBody);
    26.                 }
    27.             }
    28.         }
    lclemens likes this.
  4. Cramonky


    Apr 1, 2013
    I decided to try out all of the various Overlap methods and OverlapSphere was the fastest:

    Not being a physics expert in any way, this goes along with my impression that spheres are almost always the fastest kind of object to compute physics for (followed by capsule, box, convex hull, and non-convex mesh)

    One thing to remember of course (which I was kind of overlooking when I wrote the OP) is that the Overlap methods are testing against the actual colliders rather than just the origin point. An object could have part of its collider within the check radius but it's origin outside of the radius, so you may need to use an Overlap method in some cases. Luckily for my case that doesn't matter and I only care about the origin, so I've changed all of the places I used overlap sphere to use the EntityQuery method.
    Novack likes this.
  5. CookieStealer2


    Jun 25, 2018
    These results are very strange. I thought OverlapAABB would be fastest because like you said, the other overlap methods needs to find points on the colliders whereas OverlapAABB just checks if the AABB is in range.
  6. TheOtherMonarch


    Jul 28, 2012
    No Sphere is always the fastest test because you just need to test squared distance from center.
    Novack likes this.
  7. Thygrrr


    Sep 23, 2013
    That's still a bunch of additions and multiplications, AABB overlap is literally a handful of subtractions/additions.

    But it does use more conditionals. So it could be mildly slower per node.

    But if your bounding volume hierarchy is complex (many, many objects), AABB overlaps can be optimized further (you can wholesale reject entire partitions, you can't do that with spheres so more tests need to be calculated.
  8. TheOtherMonarch


    Jul 28, 2012
    I am not making this up if you read any source on computational geometry they will tell you spheres are faster. I don't know what special case you have in mind but it sounds wrong.
  9. Thygrrr


    Sep 23, 2013
    TL;DR - Unity internally wraps the sphere into an AABB, which makes the OverlapSphere volume hierarchy rejection exactly as fast as for OverlapAABB. However, sphere vs. sphere and capsule vs. sphere and capsule vs. capsule tests for the actual colliders is faster than AABB vs. sphere/capsule, so if the test case contains a majority of spheres or capsules (rather than other collider shapes), and if the test case is built in a way that partial intersections with the query volume are likely, OverlapSphere is going to be fastest, and OverlapCapsule will be second.

    I don't think so (and I loved computational geometry in uni).

    A single sphere overlap against a single 3D point is pretty fast (3 subs, 3 adds, 3 muls, 1 compare) but still slower than an AABB overlap (3 subs and 3 compares) with a single point.

    And we are not overlapping points here, we're overlapping bounding volumes.

    An AABB overlap in a BVH (which itself is comprised of AABBs) is faster than a sphere overlap in the same hierarchy.

    Sphere vs. AABB case needs to check 8 points (each corner of the AABB) for each candidate, that's 8 dot products with 3 coordinates each, plus one extra check for the sphere center, a simple AABB check to rule out the AABB is entirely inside the sphere or vice versa. That adds up to 27 subs, 48 adds, 32 muls, 11 compares in the worst case (you can reject/accept early whenever you hit one true case - so best case is still one sphere-point check to accept)

    AABB vs. AABB case needs to check 3 ranges, that's 6 planes checkable with 1 coordinate each, a combined 3 subs, 3 adds, 3 compares. In the best case, it can reject after just one of each.

    Furthermore, AABB benefits from the BVH's space partitioning, if something is outside a parent node's X, Y, or Z range (any one is enough), the entire subtree can be discarded. With spherical tests, this is not the case.

    We can ignore memory fetches here as there is the same volume to test against and the query volume is always in the cache. It's purely a computational race and AABBs are killing it.

    The sphere overlap can compete if you wrap the sphere into an AABB first. ;) And that's why Unity does that first, and only uses the sphere itself form the final tests. There are some scenarios if most or all the final colliders in the leaves of a BVH are capsules or spheres where then the sphere test can catch up, otherwise I don't see the sphere test being better here. Sphere vs. sphere, sphere vs. capsule, capsule vs. sphere and capsule vs. capsule are all faster than AABB vs. sphere and AABB vs. capsule in their respective worst cases. AABB vs. mesh or convex hull is faster than sphere vs. mesh or convex hull, that should include box colliders.

    This doesn't take into account the computer science (bounding volume hierarchies) and even in brute force overlap tests, spheres aren't the easiest as discussed above - but you can't do certain things with AABBs, like, you know, rotate them, have meaningful collision normals, etc. - spheres are better at that and as such a practical fundamental primitive in physics simulation.

    I would love to see the source code for your tests - chances are, some memory access/cache stall/other technicality caused the discrepancies. There are very few cases I can imagine where OverlapAABB doesn't smoke them all, and these are all somewhat degenerate. (but can be valid use cases, or course)

    For example, if your test case includes a majority of sphere colliders or capsule colliders, your performance gains for capsule and sphere queries are from the cheap final tests against the colliders, not from the hierarchy overlaps/rejections. The screenshot a the top of the thread seems to hint at a somewhat sparse, somewhat sphere collider heavy scenario.
    Last edited: Oct 6, 2021
    hippocoder and CookieStealer2 like this.
  10. Cramonky


    Apr 1, 2013
    Here's a package which hopefully has everything you need to just drop into any ECS project. Just open the OverlapTestingScene and play. You can modify values on the Test Overlap Controller object and each system has a constant value at the top for you to change the check distance. The systems are all identical except for whatever is needed for its specific method of overlapping/distance checking.

    Attached Files:

    Thygrrr likes this.
  11. Terdiman


    Apr 8, 2016
    There are better ways to do a sphere-vs-AABB test. See for example:

    Also this kind of counting doesn't take SIMD into account, so the comparisons are also questionable. It's going to depend on how the tests are implemented. For example a sphere-vs-AABB test in PhysX would be just this:

    i.e. a couple of math ops (each one of these V3XXX calls usually map to a single SIMD instruction):

    const Vec3V closest = V3Clamp(offset, V3Neg(boxExtents), boxExtents);
    const Vec3V d = V3Sub(offset, closest);
    return Ps::IntBool(BAllEqTTTT(FIsGrtrOrEq(mRadius2, V3Dot(d, d))));
    const Vec3V offset = V3Sub(mCenter, boxCenter);

    At the end of the day this doesn't matter anyway: the spheres are larger so they don't cull as many objects as boxes, leading to more tests overall. There's always a balance between the "pruning power" of a shape and the cost of its test against primitives. Traditionally spheres as fastest to test but lead to many more tests, while at the other end of the spectrum convex hulls are slowest to test but lead to the least amount of tests overall. Who's faster will depend on your implementation (how you actually wrote the code), how many objects you have in the scene, and how they are spatially organized.

    You cannot generally say what's going to be faster. Overall though, AABBs are "the best" because they're a good middle ground between "pruning power" and "number of tests", and the implementation match SIMD instructions quite well.

    For the number of objects involved in this thread though, it won't matter. 50 spheres & 50 cubes is basically nothing, so your perf difference is probably coming from somewhere else (maybe even the overhead coming from scripts, etc).
    MehO and hippocoder like this.
  12. Cramonky


    Apr 1, 2013
    Lots of interesting discussion going on, most of which is going over my head haha but still interesting nonetheless.

    I wanted to do some more testing, though. So I stripped out all extra code after the overlap check (i.e. iterating over the results and doing a raycast for each) and these are my results:
    You can ignore the Entity Query system. This aligns more with some of the above users' expectations of AABB being much faster than everything else.

    So the real culprit behind the times was simply the AABB overlapping more things and just doing more raycasts (go figure, I should have guessed that one).

    But anyways, my original point behind making this thread was the somewhat counter-intuitive result that brute-forcing a search over every entity was significantly faster than using one of the physics methods, as long as you only care about the origin point. One might naively think (as I did) that the physics methods have some black magic that makes them super fast and the best choice to use for some kind of AI targeting system, but that appears to not necessarily be the case.
    webik150 and CookieStealer2 like this.
  13. CookieStealer2


    Jun 25, 2018
    Maybe physics queries will be faster if there are more bodies in the scene?
  14. Cramonky


    Apr 1, 2013
    I thought about that too, but my testing shows that the Entity Query system always is faster than everything else, how much faster depends on the circumstances.

    Here are a few examples. I have disabled the raycasting portion of the code, but I am still looping over all results and just getting the LocalToWorld component of the result's entity so there is some work being done. The maximum distance for all checks is 100, so the OverlapSphere has a radius of 100, OverlapAABB has a min of -100 and max of 100, etc.

    400 objects, 200 overlap checks per frame, spaced over a 2000 radius:

    400 objects, 200 overlap checks per frame, spaced over a 200 radius:

    400 objects, 200 overlap checks per frame, spaced over a 20 radius:

    2000 objects, 1000 overlap checks per frame, spaced over a 1000 radius:

    Again feel free to download my package from above if you want to play with it yourself.
    Novack and jmcusack like this.
  15. Terdiman


    Apr 8, 2016
    That's because you don't have a lot of objects to process. It's completely normal that brute-force wins for a low number of objects. Put 10K or 100K objects in the scene, try to find a small subset of them overlapping a query sphere somewhere, and it's likely that methods using BVHs and/or more complex structures will become faster.

    With ECS and "data oriented" approaches the number of objects it takes for brute-force to start losing has increased (i.e. brute-force became more efficient), but ultimately it will still become slower than e.g. tree-based approaches.
    MehO and Cramonky like this.
  16. Terdiman


    Apr 8, 2016
    That might be a Unity-specific thing if this remains the case all the time. For pure collision tests with no overhead from anything else, trees are still the way to go. What's the unit for these time values you're reporting?
  17. Cramonky


    Apr 1, 2013
    You're definitely right. I should have clarified that when I said "every entity" I really meant "every entity that meets a certain criteria like having a specific tag". So If you have 100 AIs and 100,000 environment objects, you aren't doing a query of everything to get 100,100 total objects, you are only getting the 100 other AIs. I used collision filters for the physics methods to achieve a similar filtering of results.

    The times are in milliseconds, taken from the "Systems" window.
  18. Terdiman


    Apr 8, 2016
    Ok then there's something seriously eating up all the CPU time somewhere. It shouldn't take "65 ms" (?!?!?) to do 1000 overlap checks over 2000 objects for example. Maybe I'll try some benchmarks on my end tomorrow but this feels like some overhead coming from systems and glue code unrelated to the actual overlap code.
  19. Terdiman


    Apr 8, 2016
    Yeah ok a quick (already existing) PhysX test I had (purely C++) can do e.g. 16384 sphere overlap queries over 4096 box objects in ~5.6ms. Looks like this:


    I could try to repro exactly your setup if needed but it seems clear that the numbers you're getting aren't due to the collision code itself, but rather something else going on in the system. I don't remember how it works in Unity but I've seen that kind of stuff in other engines where in many cases the bottleneck was just the Python to C++ bindings, and in particular passing arrays of results between them....
  20. Cramonky


    Apr 1, 2013
    Yeah most of my test cases have included some work being done after the overlap tests to more closely simulate a real world scenario, cause you're probably never going to do an overlap check and do nothing with the results :p.

    Here's a test with 1,000 overlap checkers and 10,000 environment objects with no extra work being done after the overlap check:
    Last edited: Oct 7, 2021
  21. eurobax


    Nov 27, 2021
    And what about Entity Query System results? Should it win already?
    shamsfk likes this.
  22. shamsfk


    Nov 21, 2014
    What was the result for Entity Query System?) I've read a long topic, and in the final moment, when the question should have been answered, the EQS was checked off... That's like not getting to know who is a killer in a detective book man