Search Unity

Getting better performance out of AStar using burst

Discussion in 'Burst' started by Sarkahn, Nov 12, 2020.

  1. Sarkahn

    Sarkahn

    Joined:
    Jan 9, 2013
    Posts:
    440
    So I wrote an astar pathfinding package. I wanted something generic and simple that would work with jobs and burst. It's nothing fancy, just a straightforward implementation of the redblob version of AStar using native containers.

    It works well on small and medium maps, but performance gets *really* bad when the maps get very large. I have performance tests set up for profiling, and on a 1000x1000 map it takes 50+ ms to path from one end of the map to the other, even with burst.

    Any advice on how I can improve it?
     
    Last edited: Nov 12, 2020
    bb8_1 likes this.
  2. tertle

    tertle

    Joined:
    Jan 25, 2011
    Posts:
    3,761
    Don't except that you can path from 1 end of your map to the other (1k x 1k isn't even large).

    There are tons of ways around this problem.

    A couple of examples
    - You can pre-compute partial routes that go across your world
    - Use navmesh instead of a grid. This can greatly reduce the number of nodes you need to visit.
    - You can limit number of nodes visited (1000 for example) and if the algorithm doesn't find it in that time return the partial path and start walking along that then compute the rest of the path from the partial path.

    And you can do all of these things at once.
     
    bb8_1 and Sarkahn like this.
  3. Sarkahn

    Sarkahn

    Joined:
    Jan 9, 2013
    Posts:
    440
    I want to keep it as generic as possible so there aren't any restrictions on the kinds of maps you can use with it. And right now I plan on using it for a grid-based game so a navmesh wouldn't work unfortunately.

    Adding an option to limit the the number of nodes you can visit per frame is a good idea. It really just hides the actual performance issue but if that's what I have to do, so be it. I was kind of hoping I was maybe just doing something really dumb with my implementation and there's some simple way I could tweak it to make it much faster.
     
  4. Antypodish

    Antypodish

    Joined:
    Apr 29, 2014
    Posts:
    10,775
    Try divide your map into chunks. Let's say 10x10 or 20x20.
    Each chunk contains "gates" to neighbour chunks.
    Initially you lookup for path through chunks, for long distances.
    Then when you actually move, you lookup only for a path of nearest chunks, for which you know the path is valid.
     
    bb8_1, charleshendry and Sarkahn like this.
  5. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,266
    If you want something faster at the low-level, you may want to look into creating specializations. Unfortunately the sneaky optimizations I am aware of don't work in your case because you have interface method invocations and random container access in your inner loop. So vectorization, branchless operations, and bitmasking are all off the table. The only real place to improve would be the priority queue. For that, you could use a wider tree than a binary tree and use simd operations to compare against several children at once when cascading up or down. This also reduces depth significantly so you don't pay for the traversals nearly as much.

    Of course the high-level optimizations everyone else is suggesting are valid too.
     
    bb8_1 and Sarkahn like this.
  6. Sarkahn

    Sarkahn

    Joined:
    Jan 9, 2013
    Posts:
    440
    I was considering a hierarchical solution like that, I think most of that work would be on the map side though. It does make sense it could be a massive improvement for huge maps though!

    So it sounds like the only obvious way to improve the algorithm is to specialize and create restrictions on the inputs then?

    RE the Priority Queue - I did try a few other containers, including some I've seen on these forums, and profiling was showing the algorithm was running fastest with my NPQ. Of course that doesn't mean it can't be improved - I will do some reading and see if I can figure out how to implement your suggestions.

    Thanks for all the suggestions so far!
     
    bb8_1 likes this.
  7. DreamingImLatios

    DreamingImLatios

    Joined:
    Jun 3, 2017
    Posts:
    4,266
    Yes. That isn't to say that having a generalized solution isn't useful. Many use cases don't need maximum performance.
    I haven't come across a proper simd-optimized priority queue in C#. And the only one I have vetted (C++) is way too platform-specific to share. Outside of AStar, I haven't found a use case for one yet to motivate me to write one.
     
    bb8_1 and Sarkahn like this.
  8. eizenhorn

    eizenhorn

    Joined:
    Oct 17, 2016
    Posts:
    2,684
    Implement hierarchical A* (for reducing dead ends cost and wide spread cost), check your heuristic function (which really matters for search accuracy depends on map type) you can read a paper from this my post, just exclude potential fields part https://forum.unity.com/threads/unity-dots-case-study-in-production.561229/page-2#post-4412335. And one more important optimization - make it calculate not instantly but off the main thread, it's the main way for A* implementation for huge scale map (in addition to all described above in the thread) in most cases you can spread calculation across a couple of frames. For future optimizations use LOS check for agents for trace top-level node gates and traverse through this top-level A* nodes (regions) and in the case when N+1 node not in LOS (Brenham or grid traversal) go straight to N node and just check from time to time if N+1 visible and go straight to this and repeat until you arrive end region or to the region where you can't see next N+1 node and build low-level A* path for this small granular region. In addition, for early outs of path request, you can build one more top-level thing - Area - block of connected regions and every region belongs to some AreaID and in the case when you request path and point you requested in a different area, then path not reachable and you don't need to even try to calculate the path, the similar region-area approach was used in Rimworld and in our Diplomacy is Not an Option.
     
    Last edited: Nov 12, 2020