July 21, 2019

Production systems are language-based game engines


The history of AI knows production systems, general problem solvers and STRIPS-like solvers. What they have in common is that they use a language-based transformation of a system, which results in high-speed problem solving. To explain the idea in detail, I will give an example.
A robot is located in the center of a map. It can move in 4 directions (up, down, left, right). The numerical way of problem solving works with a game tree. All possible action sequences of the robot, e.g. "up, up, left, up, left", are stored in a graph, and a solver can search the graph for a goal node. Dedicated path-planning algorithms like A* or RRT modify this approach a bit and search the graph faster.
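To make this baseline concrete, here is a minimal breadth-first search over the grid positions. The function name bfs_plan, the map size and the start and goal cells are invented values for this sketch, not part of the original program.

from collections import deque

def bfs_plan(start, goal, width=10, height=10):
    # Baseline graph search: expand the four low-level actions
    # until the goal node is found in the game tree.
    actions = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}
    frontier = deque([(start, [])])
    visited = {start}
    while frontier:
        pos, plan = frontier.popleft()
        if pos == goal:
            return plan
        for name, (dx, dy) in actions.items():
            nxt = (pos[0] + dx, pos[1] + dy)
            if 0 <= nxt[0] < width and 0 <= nxt[1] < height and nxt not in visited:
                visited.add(nxt)
                frontier.append((nxt, plan + [name]))
    return None

print(bfs_plan((5, 5), (0, 0)))  # a shortest sequence of "up" and "left" steps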
A production system works differently from graph search. The basic idea is to describe the problem on a semantic level with the help of a robot language. This is equivalent to introducing macro-actions. The robot language consists of the following words:
- low-level actions: up, down, left, right
- mid-level actions: 5up, 5down, 5left, 5right
- high-level actions: movetolocation, moveincircle, resetgame
The words of the robot language can be combined into larger programs, very similar to a behavior tree. A production system uses the grammar to reach a goal. The advantage over a graph-search algorithm is that the computational effort is much lower. The reason why STRIPS-like solvers are not very common in robotics is that it is hard to invent the robot language from scratch. In most problems only the low-level commands (up, down, left, right) are known, but not the macro-actions.
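As a sketch, the mid-level words can be stored in a table that expands each word into low-level actions. The MACROS table, the expand function and the example program below are invented for illustration:

# Hypothetical macro-action table: each mid-level word of the robot language
# expands into a sequence of low-level actions.
MACROS = {
    "5up": ["up"] * 5,
    "5down": ["down"] * 5,
    "5left": ["left"] * 5,
    "5right": ["right"] * 5,
}

def expand(program):
    # Flatten a high-level program into low-level primitives.
    lowlevel = []
    for word in program:
        lowlevel += MACROS.get(word, [word])
    return lowlevel

print(expand(["5up", "left", "5right"]))  # 5 x "up", "left", 5 x "right"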
In the STRIPS notation, the domain grammar is stored in the STRIPS syntax, while in the SOAR production system it is written as SOAR production rules. In both cases the domain file is equal to a grammar which describes the problem on a higher level. The grammar allows searching the action space of the robot more efficiently than testing out only low-level sequences of actions.
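This is not the actual STRIPS or SOAR syntax, but a rough Python stand-in for what one entry of such a domain file encodes: an operator name together with preconditions and effects. The predicate names are made up for the robot example.

# Rough analogue of a STRIPS-style operator, written as a plain Python dict.
operator_5up = {
    "name": "5up",
    "preconditions": ["robot-at(x, y)", "free-path(x, y, x, y-5)"],
    "effects": ["robot-at(x, y-5)", "not robot-at(x, y)"],
}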
Symbolic AI
Symbolic AI means creating a higher abstraction layer on top of movement primitives. On the lower level, the robot can only move in four directions, one step at a time. It is not able to reach a position far away with a single command. This kind of super-action has to be invented by the programmer. It is possible to transform a super-action like "moveto(10,20)" into a sequence of low-level actions. In the literature the concept is described as hierarchical planning, and it improves the efficiency of problem solving.
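A minimal sketch of such a transformation, assuming the robot walks along the x-axis first and then along the y-axis; the function name moveto and the coordinate convention (down increases y, matching the (0,-5) example below) are assumptions of the sketch:

def moveto(current, target):
    # Expand the super-action "moveto" into low-level steps.
    x, y = current
    tx, ty = target
    steps = []
    steps += ["right"] * max(0, tx - x) + ["left"] * max(0, x - tx)
    steps += ["down"] * max(0, ty - y) + ["up"] * max(0, y - ty)
    return steps

print(moveto((0, 0), (10, 20)))  # 10 x "right" followed by 20 x "down"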
The major question is not how STRIPS or SOAR work internally, which amounts to planning in a hierarchical fashion; the bottleneck is the grammar which describes the macro-actions. Using an existing STRIPS domain description to determine the low-level actions of the robot is an easy problem, because the solver will search in the domain file, test out some alternatives and then find the correct action sequence. The more demanding challenge is to convert a game into the STRIPS domain file.
A language grammar allows executing high-level actions in a game. After running a macro-action the game will be in a new state. The transition is formalized in a symbolic game engine. Let me give an example. According to the grammar, the macro-action "5up" is available. The meaning of the command has to be formalized in source code:
newposition = oldpos + (0,-5)
defines the action precisely. The action modifies the underlying game: it changes the position of the robot. Now we can ask which low-level actions are needed to achieve the same result. The transformation from high-level actions into low-level actions can be realized with a solver.
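A minimal sketch of such a solver: it brute-forces short low-level sequences against a simplified state model (just the robot position, an assumption of the sketch) until one of them reproduces the effect of the macro-action. The names solve_macro and apply_sequence are invented.

from itertools import product

DELTAS = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def apply_sequence(pos, seq):
    # Execute a sequence of low-level actions on the game state.
    for name in seq:
        dx, dy = DELTAS[name]
        pos = (pos[0] + dx, pos[1] + dy)
    return pos

def solve_macro(oldpos, effect, max_len=6):
    # Test low-level sequences until one reproduces the macro-action's effect.
    target = (oldpos[0] + effect[0], oldpos[1] + effect[1])
    for length in range(1, max_len + 1):
        for seq in product(DELTAS, repeat=length):
            if apply_sequence(oldpos, seq) == target:
                return list(seq)

print(solve_macro((3, 10), (0, -5)))  # ['up', 'up', 'up', 'up', 'up']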
Language understanding machines
A software module which accepts natural language commands is easier to realize than it might look at first sight. All that is needed is an if-then statement:
if input=="5up" then ...
if input=="5down" then ...
We can feed the program with a string, for example "5up", and the program does something. From the perspective of a game programmer, such a software module is called a game engine. It specifies which actions can be sent to the game. The statements after the "then" section ground the input word: they formulate precisely how the game state gets changed. After the game engine has been programmed, it is possible to formulate a high-level program written in the domain-specific language:
5up
5up
5down
... is an example program. Each command is sent to the game engine and executed in a virtual machine.
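Here is a minimal sketch of such a game engine in Python. The class name GameEngine, its execute method and the reduced state model (only a robot position) are assumptions of the sketch, but the grounded commands are the ones used above.

class GameEngine:
    # Minimal game engine: accepts command strings and grounds them
    # by changing the game state (here only the robot position).
    def __init__(self):
        self.position = (0, 0)

    def execute(self, command):
        x, y = self.position
        if command == "5up":
            self.position = (x, y - 5)
        elif command == "5down":
            self.position = (x, y + 5)
        else:
            raise ValueError("unknown command: " + command)

engine = GameEngine()
program = ["5up", "5up", "5down"]
for command in program:
    engine.execute(command)
print(engine.position)  # (0, -5)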
A simple grammar for solving Sokoban-like puzzles

In the example picture the well-known game of Sokoban is shown. To make things easier there are no obstacles, but the robot has to push the boxes to the goal position at the lower right of the map. The low-level actions available to the robot are: left, right, up, down.
The first idea, to solve the problem with an RRT graph, failed. It is not possible to generate the entire action sequence, because the number of required actions is too high. The robot first has to reach box0, then it has to push the box to the goal, then it has to go to the next box, and so on.
The next better approach after RRT graph search is to implement a robot control language and use a symbolic solver for figuring out the low-level actions. This approach works much better. The implemented grammar has the following elements:
"reset","movetobox","pushboxtowaypoint","movesmall", "pushsmall"
All these words are powerful macro-actions. They modify the position of the player in absolute values. For example, the action "movesmall" is able to teleport the robot 2 fields with a single command. Let us observe how a potential high-level plan will look:
1. movetobox 0 # a parameter is given to the command to specify box0
2. pushboxtowaypoint 0
3. movetobox 1
4. pushboxtowaypoint 1
If the high-level planner is made more detailed, the improved plan will look like the following:
1. movetobox 0
1a movesmall
1b movesmall
2. pushboxtowaypoint 0
2a pushsmall
2b pushsmall
2c pushsmall
3. movetobox 1
3a movesmall
3b movesmall
4. pushboxtowaypoint 1
4a pushsmall
4b pushsmall
Even this detailed plan is not expanded fully. A low-level planner has to convert all the movesmall and pushsmall actions into low-level primitives. The resulting plan will contain only "left, right, up, down" actions. That is the basic idea of a hierarchical planner, and the idea is powerful enough to determine the actions of the robot. The proposed system consists of three elements (a small sketch of these three parts follows the list):
1. the Sokoban game itself
2. the grammar which includes lowlevel and highlevel actions
3. the planner which generates the plan and converts it into lowlevel action sequence
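The following heavily simplified sketch is not the program described in this post; it only illustrates how the high-level plan (movetobox, pushboxtowaypoint) is expanded into low-level actions. The map is assumed to be empty, each box only has to reach the rightmost column of its row, and the function names, GOAL_COLUMN and the start positions are invented.

GOAL_COLUMN = 4   # hypothetical 5x5 map, boxes are pushed to the rightmost column

def walk(src, dst):
    # Low-level expansion of "movesmall": horizontal steps first, then vertical.
    (x, y), (tx, ty) = src, dst
    seq = ["right"] * max(0, tx - x) + ["left"] * max(0, x - tx)
    seq += ["down"] * max(0, ty - y) + ["up"] * max(0, y - ty)
    return seq

def movetobox(robot, box):
    # Mid-level action: walk to the cell left of the box, ready to push.
    return walk(robot, (box[0] - 1, box[1]))

def pushboxtowaypoint(box):
    # Mid-level action: every "right" is one pushsmall step on the box.
    return ["right"] * (GOAL_COLUMN - box[0])

def plan(robot, boxes):
    # High-level plan: movetobox i, pushboxtowaypoint i, for every box.
    lowlevel = []
    for box in boxes:
        lowlevel += movetobox(robot, box)
        lowlevel += pushboxtowaypoint(box)
        robot = (GOAL_COLUMN - 1, box[1])   # robot ends up behind the pushed box
    return lowlevel

print(plan((0, 0), [(1, 1), (1, 3)]))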
Let us dive into potential pitfalls. There are two possible ways of generating the macro-actions. I have chosen the manual way. That means that in the program there is a subroutine for "movesmall" and one for "pushsmall". In the subroutine the angle to the goal and the new position are determined with normal computer code. I am unsure if it is possible to generate the macro-actions without human intervention. In the literature the concept is called "rule learning", but I doubt that this works for Sokoban. That means my program does not have learning capabilities; it is static. All the macro-actions are defined before runtime.
Solving a symbolic problem
Let us take a step back and describe what the planner has to do. Instead of trying out a sequence of low-level actions, the newly established environment consists of macro-actions:
"reset","movetobox","pushboxtowaypoint","movesmall", "pushsmall"
The goal state has to be reached using a sequence of these actions. Because the macro-actions are more powerful than the regular actions, the sequence becomes much shorter. The proposed plan can be described with a few commands. Before the planner can generate the actions autonomously, it is important to ground the macro-actions. Grounding means that the actions are not only given in a Python list, but that the game engine accepts each of them as a valid command and does something in return. That means, if the command movetobox is sent to the game engine, the game engine will modify the robot's position and the result is displayed on the screen. These are called grounded actions, because the resulting game engine state can be measured by the planner.
Let me give an example. We use a simple randomized planner which sends random macro-actions to the game engine. After each trial we test what the positions of the robot and the boxes are. If the boxes are at the goal position, the plan has been found. Before we can measure the robot's position, the action needs to have an influence on the game engine. The counterexample would be that we send the high-level action "movesmall" to the game engine but nothing happens. Then the action is not grounded and it is only a word in the program without a meaning.
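A minimal sketch of such a randomized planner, written against an abstract engine interface. The ToyEngine below is only a stand-in (a single counter) so the loop is runnable; the real grounded game engine would update the robot and box positions instead.

import random

class ToyEngine:
    # Stand-in for the grounded game engine: the macro-actions change the state,
    # so the planner can measure the result after each command.
    def __init__(self):
        self.state = 0

    def execute(self, action):
        if action == "pushboxtowaypoint":
            self.state += 1

    def goal_reached(self):
        return self.state >= 2   # pretend: both boxes were pushed to the goal

MACROS = ["reset", "movetobox", "pushboxtowaypoint", "movesmall", "pushsmall"]

def random_planner(engine, max_steps=1000):
    plan = []
    for _ in range(max_steps):
        action = random.choice(MACROS)
        engine.execute(action)       # grounded: the engine state actually changes
        plan.append(action)
        if engine.goal_reached():
            return plan
    return None

print(random_planner(ToyEngine()))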
