July 16, 2019

Planning vs. plan recognition


Successful examples of Artificial Intelligence for game playing are chess and tic-tac-toe. In both cases the AI works by creating the game tree and searching for a node in the tree. Because this strategy works so well, many AI amateurs try to transfer the idea to robotics as well. The problem is that robotics has a much larger state space, so it's not possible to enumerate the entire state space.
To overcome this bottleneck we first have to describe what AI planning is in general. Planning means that a game engine is available in which actions are allowed. In the case of chess, the game engine allows the player to make different moves in the game. The underlying game engine determines the size of the state space: if the player has more possible actions and if the game takes longer, the state space becomes larger. The only way to reduce the state space is to modify the underlying game engine. If the game consists of only a small number of moves, the state space is much smaller.
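To make the growth concrete, here is a back-of-envelope illustration in Python (the numbers are rough orders of magnitude, not measurements): with b actions per step and a horizon of d steps, the game tree holds on the order of b^d nodes.

    # Rough illustration: a tree with b actions per step and depth d
    # has about b**d nodes, so both factors blow up the state space.
    for b, d in [(4, 5), (4, 10), (4, 20)]:
        print(f"b={b}, d={d}: {b ** d:,} nodes")
    # b=4, d=5:  1,024 nodes
    # b=4, d=10: 1,048,576 nodes
    # b=4, d=20: 1,099,511,627,776 nodes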
The question is not how to plan with the existing game engine; the more interesting question is how to create a new game on top of the old one which has an abstract state space. I want to explain this with a simple maze robot. In the normal game, the robot can walk in four directions: up, down, left, right. The result is a certain type of game. For example, if the robot wants to go 10 steps upward, it has to execute the sequence: up, up, up, up, up, up, up, up, up, up.
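As a sketch of what such a base game engine might look like (the class and method names here are invented for illustration, not taken from a real library):

    # Minimal sketch of a maze game engine with four primitive actions.
    class MazeEngine:
        ACTIONS = ("up", "down", "left", "right")

        def __init__(self, x=0, y=0):
            self.x, self.y = x, y   # robot position on the grid

        def step(self, action):
            # execute one primitive action
            dx, dy = {"up": (0, 1), "down": (0, -1),
                      "left": (-1, 0), "right": (1, 0)}[action]
            self.x, self.y = self.x + dx, self.y + dy

    robot = MazeEngine()
    for _ in range(10):          # going 10 steps upward takes ten actions
        robot.step("up")
    print(robot.x, robot.y)      # -> 0 10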
To simplify the game, we can add a macro-action called “5up”. In the newly created game, the robot can change its position with the 5up command directly 5 steps upwards. To reach the same goal position, it only has to execute the sequence: 5up, 5up. At first glance the idea of macro-actions doesn't look very powerful, but it is. It makes it possible to reduce the state space of any game, so the solver can plan longer sequences in a smaller number of steps. Because the tree size grows exponentially with the plan length, we are not talking about an improvement of 10% or 50%; the improvement can reach a million percent and more. It's possible to exploit macro-actions as the only problem-solving technique to play all kinds of games.
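A macro-action can be layered on top of the engine without touching its internals. Continuing the MazeEngine sketch from above (again, the names are only illustrative):

    # "5up" bundles five primitive "up" actions into one abstract command.
    # Assumes the MazeEngine class from the previous sketch.
    def macro_5up(engine):
        for _ in range(5):
            engine.step("up")

    robot = MazeEngine()
    macro_5up(robot)
    macro_5up(robot)        # the plan (5up, 5up) replaces ten primitive steps
    print(robot.y)          # -> 10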
The only difficulty is that most games don't provide macro-actions. They have to be invented by the programmer first. This is called domain knowledge, because it annotates the gameplay on a semantic level. But let us go back to the planning process. If the game engine is fixed, it's indeed hard or even impossible to plan for a complicated game. A game has a fixed set of actions, and if the solver can only use the normal actions, it has to search the game step by step. If the game engine is a physics simulation, the cost of executing a single action is much higher than in chess. That means, if the programmer is not allowed to improve the game engine with macro-actions, he will struggle with AI planning.
The interesting point is that it's ok to extend a game engine with new actions. It's not cheating to invent an action like “5up”. It's the same thing humans do when they want to play the game. They don't say that the player has to press the up key 10 times; they say “go to the top”. That's shorter and raises the abstraction level. The same strategy improves human-machine communication as well. If the game engine gets new abstract commands, it allows the human player to formulate more elaborate commands.
The only problem is that creating hierarchical abstract macro-actions for a given game is more complicated than merely traversing a given game engine. That means the overall concept is a bit harder to grasp. To make it easier to understand, it's important to know that in AI planning the search itself can be treated as a solved problem. If the STRIPS file is available, it's very easy to search the game tree. This can be realized with 20 lines of code, or with an existing solver. The more demanding task is to create the STRIPS file for a certain domain. This is the real bottleneck in AI planning. Let us explain why AI planning itself is easy.
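To back up the “20 lines of code” claim, here is a minimal sketch of a breadth-first forward search over a STRIPS-like domain (the toy domain is invented for illustration; a real planner would parse a PDDL/STRIPS file and add heuristics):

    from collections import deque

    # Toy STRIPS-like domain: action name -> (preconditions, add list, delete list).
    ACTIONS = {
        "pick": ({"hand_empty", "ball_on_floor"},
                 {"holding_ball"}, {"hand_empty", "ball_on_floor"}),
        "drop": ({"holding_ball"},
                 {"hand_empty", "ball_on_floor"}, {"holding_ball"}),
    }

    def plan(start, goal):
        # breadth-first search over the state space induced by the actions
        queue = deque([(frozenset(start), [])])
        visited = {frozenset(start)}
        while queue:
            state, path = queue.popleft()
            if goal <= state:
                return path
            for name, (pre, add, delete) in ACTIONS.items():
                if pre <= state:
                    nxt = frozenset((state - delete) | add)
                    if nxt not in visited:
                        visited.add(nxt)
                        queue.append((nxt, path + [name]))
        return None

    print(plan({"hand_empty", "ball_on_floor"}, {"holding_ball"}))  # -> ['pick']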
Suppose a game engine is available. The game engine provides to the outside world a set of possible actions: action0=left, action1=right, action2=up, action3=down. If the actions are executed in serial order, a game tree is the result. This is a graph of all possible combinations. A solver can send the sequence (left, left, up, up) or the sequence (up, up, right, down) to the game engine. Traversing the game tree means either sending random commands to the game engine, which takes a bit longer, or building a dedicated tree, which speeds up the search a bit. And now we can ask which issue limits the solver: it's the size of the state space. That means, if the total number of nodes is only 100 or maybe 1000, the solver will find the plan very fast. But if the number of nodes is higher, it will struggle.
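The first strategy, random shooting, can be sketched in a few lines (the grid world and the goal position are invented for illustration; the dedicated-tree alternative corresponds to the breadth-first search shown earlier):

    import random

    MOVES = {"left": (-1, 0), "right": (1, 0), "up": (0, 1), "down": (0, -1)}
    GOAL = (2, 1)   # reachable in 3 steps, e.g. (right, right, up)

    def rollout(length):
        # send one random action sequence to the engine, return it and the end position
        x, y = 0, 0
        seq = [random.choice(list(MOVES)) for _ in range(length)]
        for action in seq:
            dx, dy = MOVES[action]
            x, y = x + dx, y + dy
        return seq, (x, y)

    # random shooting: sample sequences until one of them reaches the goal
    while True:
        seq, pos = rollout(3)
        if pos == GOAL:
            print("plan found:", seq)
            break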
Instead of explaining what AI planning is, the better idea is to ask how large the game tree will be. If the size is too large, AI planning will fail and we have to modify the game engine with macro-actions.
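As a closing back-of-envelope check (the numbers are illustrative, not benchmarks): the maze example from above needs a plan of length 10 over 4 primitive actions, but only a plan of length 2 once the 5up macro exists, and the tree sizes differ accordingly.

    # Tree size up to the solution depth, with and without the macro-action.
    primitive_leaves = 4 ** 10   # 4 actions, plan length 10 -> 1,048,576 leaves
    macro_leaves = 5 ** 2        # 5 actions (incl. 5up), plan length 2 -> 25 leaves
    print(primitive_leaves / macro_leaves)   # roughly a 42,000x smaller search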