# Having trouble understanding how A* can be applied to action planner

Discussion in 'Scripting' started by Sendatsu_Yoshimitsu, Nov 22, 2018.

1. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
I'm trying to roll my own GOAP system, and I've read in multiple papers that A* is the go-to pathfinding system to build a plan, but I'm having a lot of trouble seeing how it's supposed to be implemented.

If I were using A* on a preconnected node graph, it would make a lot more sense: you'd just iterate through each connected node in turn, check its current score + heuristic to see which node scores the lowest, move the winner to your open list and start over with its neighbors. But with an action planner the nodes aren't connected to each other- you just have a huge list of actions with some List<KeyValuePair<int,bool>> of effects and a second list of conditions. So it seems like instead of pathfinding, you would have to brute force the route by iterating through every possible action at every step, and check if maybe this time that particular action brought your current world context closer to the target one.

So am I missing something? It seems like there's no way to narrow the search, but if that's the case you're not really using A*, you're just doing tons of list traversal.

2. ### Lurking-Ninja

Joined:
Jan 20, 2015
Posts:
4,645
Edit: Unfortunately I totally misunderstood the problem.

Well, path-finding arises when you have places to perform actions. In theory every action should determinate where it can be performed. Depending on the type of game you're building you can lock down the place on planning (usually in adventure games) or on execution (action games).
For the sake of simplicity let's get the action games: you have an action coming up in the plan. You, in theory, should have a list of places where the action can be performed. Now, you have to determinate where to go with your character (usually it's the nearest available place). At this point you really have to perform a path-finding operation to get the real path to the place where your action can be performed.
Now, A* usually works with predefined NavMesh or just connected nodes. It depends, but in reality these are the same from the path-finding point of view.
After you have the path, you insert an action to go to the place before the action and postpone the real action after the arrival to the point.

Using A* or using other type of path-finding has nothing to do if you're building a GOAP system or Behavior Tree or simple FSM. A* only means that you're using a specific algorithm to find your way to the target place.
A* is one of the most performance-effective way to find your way to the target.

So A* isn't building your plan, it's only for finding the way to places (one type of action).

Last edited: Nov 22, 2018
3. ### lordofduct

Joined:
Oct 3, 2011
Posts:
6,981
In a GOAP system you have goals you may want to accomplish. These goals require a set of steps to accomplish. Most steps rely on some precondition.

Goal: kill a target
To kill a target requires firing a weapon.
To fire a weapon requires that the weapon is loaded.
To load a weapon requires drawing the weapon.

You can view each action a 'node' in a graph. And each precondition as an 'edge' in the graph.

If you then perform a regressive search from the goal to the current state (which you can represent as a null precondition for instance). You can find the most efficient path to get to killing a target.

Your so called 'neighbours' are the nodes that facilitate the precondition.

Here is a document describing this in detail:
http://alumni.media.mit.edu/~jorkin/GOAP_draft_AIWisdom2_2003.pdf

In it you'll find this example action sequence just like the one I described:

Note how each action has an effect and a precondition.

So if you needed to determine the neighbours of 'Attack'. All nodes whose 'effect' satisfy 'Attack's precondition are nodes that can travel to 'Attack'.

Note these are one way connections.

So once you've created your list of actions. You can then build all connections based on these effects/preconditions. You'd do this up front, and not every time you calculated a path... front load that expensive task.

eses and Lurking-Ninja like this.
4. ### Antistone

Joined:
Feb 22, 2014
Posts:
1,779
I'm not familiar with "action planners" in the sense being used here, but this is not true of A* in general. A* is completely applicable to metaphorical distance, not just to physical distance.

The most basic way to use A* to decide your strategy in a game is to consider every possible state the game could be in as a node in a graph and the actions you can take in the game as edges that move you from one state to another.

For instance, maybe right now you're in the state {10 gold, 5 lumber, center of map}. If you move north, then now you're in the state {10 gold, 5 lumber, north of map}, and that could be a "move". If instead you decide to use your lumber to build a house, then now you're in the state {10 gold, 0 lumber, center of map, house in center of map}, and that is also a "move"; you didn't change your location in the game map, but you "moved" to a new state in the game graph.

If you need a tower, then A* can be used to find "paths" that lead to a tower existing, such as (purchase axe, chop tree, build foundation, build tower) or (buy sword, threaten wizard, demand wizard conjures a tower). The "distance" (or length) of these "paths" might be based on how much time they take or how many resources they consume.

Notice that for a non-trivial game, the full graph of all possible game-states and all actions you could take in all of those states is often really freakin' big and so you might not be able to examine the whole thing. But most of the things that you might do can be viewed as stripped-down versions of the above that sacrifice thoroughness for speed and/or memory.

Lurking-Ninja likes this.
5. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
First of all that's a great paper, so thank you so much for the link!

Now for precomputing the list of connections, I'm still a bit stymied picturing what that looks like- would the idea be that you store it in something like a Dictionary<effect,List<Action>>, so at each iteration the system can index which actions will fulfill a given effect?

6. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
Each node is a world state -- essentially a list of preconditions. Orkin used a big bit vector in FEAR, with meanings like:

The connections between nodes are the actions that are allowed in that world state.

As a first pass, instead of optimizing down to a vector of bits, you could just have a List<> of attributes about the world.

BTW, Quest Machine originally used A* to generate quest plans. I have yet to find an adequate heuristic for its particular case, so it now uses BFS with some optimizations and cheats. But the point of Quest Machine is to generate interesting quests; it doesn't necessarily have to always choose the most optimal plan. In fact, that might get boring. Your use case may be different, in which case it may be important to design a good heuristic.

Lurking-Ninja likes this.
7. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
Huh, off the top of my head I would've thought that the bitwise operations would've been more expensive than just storing an int ID for each effect/precondition, but the performance savings would be huge if you could get it working; that's definitely worth messing around with once I'm happy with everything else.

Not to be dense, but when it comes to world states the idea of making each node a unique permutation of the world state makes tons of sense, but I assume you don't want to pregenerate the entire graph, since 99% of the possible world states won't be helpful- is the idea that for each action possible out of the current state, you derive its heuristic by testing whether the resulting world state would be closer to the target (either by having more matching effects than you have right now, or by satisfying a precondition to an action that could produce one of those effects)?

Also, this is 100% tangential so don't feel obligated to weigh in, but I'd be genuinely interested to hear how BFS ties into your asset- it's so simple (compared to action planning/pathfinding) that debugging it must be way happier. For my particular use case, being boring is actually a bit of an advantage- this is to generate enemy behavior that the player reconstructs later (that's why I can afford to have such a huge decision space, it only runs once per quest). If it's too variable, the player can't reliably figure out what the AI is doing based on the traces it leaves behind, and I might as well be randomly generating a list of actions. One of the reasons GOAP is so attractive even though I'm a pretty novice developer is because it stores and executes logic in a way that is very easy to explain to the player, resulting in transparent behavior that I'm hoping feels intelligent to the player, even if it's consistent.

8. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
With bit vectors, you can just AND the world state vector and the preconditions vector to see if all of the preconditions are met. It's a single bitwise operation.

That might work as a heuristic. Or you could assign weights to actions, so "shoot target" might count as being closer to the goal than "reload weapon".

It is. And it's easier to read, which is good for a code asset. But BFS is action planning/pathfinding. A* without a heuristic is Dijkstra's algorithm, which in an unweighted graph is BFS. So A* is just BFS guided by a heuristic. Quest Machine can get away with it because it does some things to significantly pare down the search space prior to searching, and because quests are generally only a few steps long. A quest with 100 steps would be ridiculous, and frustrating for the player.

9. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
Huh, for some reason I always mentally partitioned BFS and Dijkstra's/A* into different baskets, even though you're completely right. (I also agree that a 100+ step quest would be ridiculously frustrating. You know who else I'll bet agrees? Anyone who did the epic shadow knight thing in Everquest- this was years ago, but I'm fairly sure that quest had in excess of 50 individual steps and something like 7% of the global server population who ran with that class ever completed it.)

10. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
I started that quest. Then I looked it up on Allakhazam and created an alt instead.

I think it's more interesting for a quest to have 1-3 steps -- but steps that are a reaction to the current state of the game world, since it makes the player(s) feel like they have actual agency to change things.

11. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
Yeah, it also depends a bit on the kind of game and quest that you're working on. Looking at Witcher 3 as an example, the less linear one of their quests was, the fewer steps they gave it- if there was gameplay involved they kept it to 1-3 steps (find monster, track monster, hunt monster), but if you were in a linear, story-driven sequence they'd often stack 5-7 stages or more, and use narrative sequences to break up the experience so it never felt tedious.

TonyLi likes this.
12. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
That's my thinking with Quest Machine. Designers can hand-write complex, non-linear quests with lots of branching decisions; and then let the procedural generator automatically fill out the world with shorter, reactive quests.

Lurking-Ninja likes this.
13. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
I know human memory tends to be limited to keeping track of 5-9 items in a discrete category before they start losing track, which is another decent reason to keep branching/stepwise advancement limited; otherwise a lot of quests end up feeling like more work than they are just by lieu of being hard to mentally track.

On my thing this is getting absurdly granular, but I'm working on building a set of actions that let the AI interact with an abstraction of the game world. Doing something like breaking into a building or robbing a person requires that all steps be relative to the entity they're targeting- if they aren't, the AI would just do weird things like break down a weak wooden door across town, then carry the openedDoor effect over to the bank to get into its highly-secured, expensive-to-destroy vault door.

So it seems like there are two ways of handling that: one would be to institute really aggressive hierarchies, where instead of the planner building one big, global plan, it would be asking locations/NPCs/objects for sub-plans relative to them; in my bank example, the bank itself would be the only object capable of seeing its actions, and it would simply advertise plans that higher-level planners could request. That has the drawback of being pretty expensive, since it isn't possible to gauge the cost of a plan without assembling it and AIs might end up requesting tons of multi-step plans from every building in town, just to find one that was inexpensive.

The alternative is keeping the planner global, giving it the ability to look inside of entity/buildings, and simply tag every action with a unique identifier, such that each bank's robbery plan wouldn't require doorOpen and locationCased, it would specifically require doorOpen (bldID #52), locationCased (bldID #52) or whatever.

My kneejerk reaction is to favor lots of sub-plans that are each made up of very generic effects, because planning on a global scale is going to become linearly more expensive relative to how large the world is, whereas even with the added expense of computing smaller plans only to throw them away, working in a hierarchy is going to remain roughly the same cost as the game world grows. Does that sound about right to you guys?

14. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
These locations/NPCs/objects are commonly called "smart objects," and the effects that they advertise are usually called "affordances." (Smart objects are one of the primary ways that Quest Machine trims down the search space.) The Sims relies heavily on smart objects. When a sim is hungry, it doesn't do any reasoning. It just iterates through all objects in the scene and chooses one that advertises that it will reduce the sim's hunger counter. Then it blindly follows that object's scripted animation. But to the player it looks like the sim is reasoning about the game world and making a plan to act in it.

Your bank wouldn't have to assemble its sub-plan right away. It could simply advertise that it can provide a sub-plan to put the bank's money into the agent's hands, and provide a heuristic value for what it guesses that plan might cost, without actually generating the plan right now. Only generate the plan if the planner chooses to pursue that path.

This is very similar to a hierarchical task network (HTN). When you have the ability to plan out the environment like this, HTN planning can outperform the STRIPS-style GOAP planning that we've been discussing in this thread. Killzone 2 was one of the first games to popularize HTN planning over GOAP. nucl.ai has a Killzone 2 presentation here. There's also a good presentation here on how and why the developers of Transformers: Fall of Cybertron switched from GOAP to HTN.

15. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
Ooh, I've never heard of HTNs before, thank you for the citations!

Glancing over quest machine's docs, it looks like we're both approaching the general architecture in similiar ways (worlds are full of entities, entities have actions for the planner to leverage). So I'm curious, how granular did you make your planning hierarchy? For my thing I'm planning on being pretty coarse, and probably only giving NPCs and entire scenes my equivalent of your entity class, since I'm thinking that the broader the search space (within sane limits), the more variety and interesting options the planner will have.

16. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
Since Quest Machine is a tool made for game designers, its granularity is defined by the designer. One designer could define an action as broad as "broker intergalactic peace with planet X" while another could define one as narrow as "pick a carrot from the field." It can get away with this because of another difference from your game. Actions in Quest Machine only specify the preconditions and effects. They don't actually specify how to do it. The action "Kill Orc in Cave" has a precondition {Orc in Cave} and an effect {Remove Orc from Cave}. The designer also specifies a message that the action should listen for, such as "Killed"+"Orc"+"Cave". But it assumes that a separate gameplay system, not Quest Machine, handles the combat and sends the message. (That said, Quest Machine does include helper components, such as one that sends a message when a GameObject is destroyed/killed.)

So you're right -- the broader search space produces more interesting results. The search space in Quest Machine ends up being broad but not too deep, which works well for the breadth-first search that it uses.

17. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
Thank makes a lot of sense, thank you for the detail! And yeah, in my experience breadth actually results in much more interesting behavior most of the time; I'm still in the shakedown phase with my thing, but I'm actually considering building a hard limit of 6-7 actions per plan, simply on the basis that longer plans are less interesting to investigate and much more likely to have weird, hard-to-understand reasoning behind them.

TonyLi likes this.
18. ### TonyLi

Joined:
Apr 10, 2012
Posts:
10,228
That sounds like something that will shake out in playtesting. You may even find that playtesters can't figure out plans longer than, say, 3-4 actions. Or they might find 6-7 to be too simple.

19. ### Sendatsu_Yoshimitsu

Joined:
May 19, 2014
Posts:
673
Oh yeah, for sure. A lot of the starting math on this is based on playtesting the mechanics with a pen & paper mockup, which we did to test the concept before any code got written, but this sort of thing is so subtle and pervasive to your game experience that I assume it'll be tweaked throughout testing and development. ^^

unityunity