July 29, 2019

GOAP Textadventure


Even though GOAP was already introduced in the literature, it makes sense to give a simple example. In the following sourcecode a GOAP planner for solving a textadventure is presented. It consists of the action model itself and a randomized solver for determining the action sequence.
"""
description: GOAP Textadventure 2019-07-29
"""
import random, copy

class Actionmodel():
  def __init__(self):
    self.worldstate = None
    self.tasklist=("openbox","closebox","takekeyfromtable","putkeyinbox","takeball","placeballatgoal")
    self.reset()
  def reset(self):
    self.worldstate = {
      "ballpos": "inbox", # inbox, atrobot, atgoal
      "keypos": "attable", # attable, atrobot, atboxkeyhole
      "boxstatus": "close", # close, open
    }
  def task(self,name):
    result=True
    if name==self.tasklist[0]: # openbox
      if self.worldstate["keypos"]=="atboxkeyhole":
        self.worldstate["boxstatus"]="open"
      else: result=False
    if name==self.tasklist[1]: # closebox
      if self.worldstate["keypos"]=="atboxkeyhole":
        self.worldstate["boxstatus"]="close"
      else: result=False
    if name==self.tasklist[2]: # takekeyfromtable
      if self.worldstate["keypos"]=="attable":
        self.worldstate["keypos"]="atrobot"
      else: result=False
    if name==self.tasklist[3]: # putkeyinbox
      if self.worldstate["keypos"]=="atrobot":
        self.worldstate["keypos"]="atboxkeyhole"
      else: result=False
    if name==self.tasklist[4]: # takeball
      if self.worldstate["boxstatus"]=="open" and self.worldstate["ballpos"]=="inbox":
        self.worldstate["ballpos"]="atrobot"
      else: result=False
    if name==self.tasklist[5]: # placeballatgoal      
      if self.worldstate["ballpos"]=="atrobot":
        self.worldstate["ballpos"]="atgoal"
      else: result=False
    return result
  def show(self):
    print(self.worldstate)

class Solver():
  def __init__(self):
    self.mymodel = Actionmodel() 
    #self.plan1()
    self.randomsearch()
  def randomsearch(self):
    # sample random but precondition-valid plans until the goal is reached
    planlength=5 # the shortest solution needs 5 actions
    trialmax=100
    for trial in range(trialmax):
      engine=copy.deepcopy(self.mymodel)
      actionlist=[]
      for i in range(planlength):
        name=self.validaction(engine)
        actionlist.append(name)
        engine.task(name)
      if engine.worldstate["ballpos"]=="atgoal":
        print("trial",trial,actionlist)
        engine.show()
        break # stop after the first successful plan
  def validaction(self,engine):
    # draw random actions until one passes its precondition check;
    # after maxtrial failed draws the last name is returned anyway,
    # which the engine then simply rejects as a no-op
    maxtrial=16
    for trial in range(maxtrial):
      localengine=copy.deepcopy(engine)
      name=random.choice(localengine.tasklist)
      if localengine.task(name): break
    return name
  def plan1(self):
    m = Actionmodel() 
    print(m.worldstate)
    m.task("takekeyfromtable")
    m.task("putkeyinbox")
    m.task("openbox")
    m.task("takeball")
    m.task("placeballatgoal")
    print(m.worldstate)

class Game():
  def __init__(self):
    self.mysolver =Solver()   

mygame=Game()
The textadventure consists of a robot who has to place a ball at a goal position. Unfortunately, the ball is located in a box, and the box is secured with a key. To model such a game, a worldstate is created in the form of a Python dictionary. The worldstate holds the ball position, the key position and the status of the box. The user can send tasks to the textadventure like “openbox”, and the game engine analyzes whether the task is valid. For example, the box can only be opened if the key is in the keyhole.
To test whether the action model works, the method “plan1()” was created, which sends a sequence of actions manually to the game engine, similar to playing the game interactively. After “plan1()” worked, the solver could be implemented, which finds the correct sequence autonomously. The solver uses a random generator to send possible action sequences to the action model. If the desired goal state is reached, the solver stops.
Let us describe the system more abstractly. In contrast to a STRIPS solver, no dedicated planning language was used; instead, the check whether the preconditions are fulfilled is done within the class in ordinary computer code with if-then statements. Additionally, the solver is not very advanced, but because of the small state space it works fast enough. In less than 100 lines of code it was demonstrated how a GOAP action model can be realized in Python and how a solver plans the action sequence.
Let us take a look into the core function, which takes an action and calculates the next worldstate. The idea is that the user sends a predefined command like “openbox” or “takeball” to the task method. The first thing the method does is check whether the precondition is fulfilled. For example, the box can only be opened if the key is in the keyhole; otherwise the task fails. The inner working is based on an if-then statement plus direct manipulation of the worldstate. That means the function changes the worldstate to the new value. For the program flow the action names don't need to be formulated in natural language; the method would work the same if only an action id were sent. But for debugging it makes sense to use English descriptions. This allows the user to interact with the program in a pseudo-English language. That means the user can send a request like “putkeyinbox” to the game engine, and the program does something.
The GOAP model works similar to a textadventure. That means the worldstate and the allowed actions are formulated in a text-only language and no graphical representation is needed. The most interesting result is that the resulting state space of all allowed actions is very small. Overall the game consists of 6 actions plus 3 variables in the worldstate dictionary. That means it is very easy for the solver to find the desired goal state by simply testing out random actions.
Nobody cares about a performance increase of factor 1000
What if a technique were available which could run an Artificial Intelligence algorithm 1000 times faster? The surprising fact is that such a speedup is easy to achieve. All the programmer has to do is switch from Python to C++, which results in a speedup of factor 10, and then prefer the rapidly exploring random tree (RRT) search technique over a plain random solver, which makes the program run 100 times faster. In a direct comparison, an RRT graph search in C++ is around 1000 times faster than a randomized solver in Python.
But there are some reasons that speak against this kind of speedup. The problem is that RRT is harder to implement because a complicated graph has to be maintained, and the C++ version needs more time to type in than the Python version. Another surprising fact is that a speedup of factor 1000 is not big enough in Artificial Intelligence. In normal computing such an improvement is great, but for problems in robotics and AI the needed speedup is much greater. In AI, only a speedup on the order of a million times and more is worth mentioning. Or to explain it the other way around: if an AI search algorithm is too slow, it means the algorithm needs years instead of milliseconds. To overcome the issue it is not enough to switch from Python to C++, because even in C++ the algorithm would run too slow. Most AI problems can only be sped up with additional abstraction layers, which modify the problem itself.
Unsolved problems
At first glance, the Python program works great. It calculates the action sequence and can bring the system into the goal state. The open question is whether the simplified textadventure simulates reality accurately, that is, whether the real world and the GOAP model behave the same.
Storing the tasks in a dictionary?
From a technical point of view it would be interesting to store the actions in a dictionary. This would allow adding new actions at runtime. The same technique is used in the STRIPS language and in the ACT-R cognitive architecture as well. All these planners are based on the principle that the possible actions are not stored in sourcecode directly, but as data.
But in reality the advantage is small. Sure, some lines of code can be saved, but I doubt the concept would scale very well. If we take a look into more complicated textadventures, the game engine is usually created with normal Python code, not with code which is modified at runtime. Sure, the LISP language allows treating sourcecode as data as well, and it's interesting to describe the details, but the resulting system is very complicated to understand. The reason why the documentation of the ACT-R project doesn't make much sense is that self-modifying LISP code was used everywhere in the software. I think the more beginner-friendly idea is to use only static code which is modified with a version control system like git. That means if we need a new action, we have to add an if-then entry to the sourcecode.
Suppose the textadventure were able to parse not only single-word statements but two-word actions, for example “take ball” or “open box”. The resulting parser would work differently; the sourcecode would have to be modified. I think it's important that there is some sourcecode which can be adapted to new requirements. So I decided against storing the actions in a dictionary. The only thing that matters for the user is that he can send an action name to the game engine and the engine changes the worldstate. How this is done internally is up to the Python programmer.

The difference between computer science and artificial intelligence


Computer science is taught at most universities. The aim is to take an algorithm and convert it into a runnable program. The software is started on real hardware and solves a problem a normal user has. A typical example in which computer science can help a lot is searching for a node in a graph. Computer science describes which algorithms for graph search are available (for example depth-first search), it provides a programming language (for example Python) and it makes recommendations on how to run the software (with the help of the Linux operating system which runs on a laptop).
Unfortunately, computer science fails if the problem doesn't fit this structure. This is true for most Artificial Intelligence problems. AI has the goal of transforming a problem into a computer science problem. A typical example workflow is that AI invents a theory about a robot control system, this theory is converted by computer scientists into software, and then it can be executed on a real computer. If computer science is about algorithms, compiler design, operating systems and computer hardware, what is the topic of Artificial Intelligence? It's about observing humans and creating machine-readable models from the observations. AI experts call these models forward models or knowledge models, and they describe how humans solve a problem. The aim of AI is to create universal models which fit many situations.
Let me give an example to explain the difference between computer science and Artificial Intelligence. AI is about inventing neural networks, the STRIPS notation and expert systems, while computer scientists convert these ideas into runnable code.

Measuring Artificial Intelligence with the AI in the box experiment


The simple reason why the AI-in-a-box experiment is so effective is that it is able to hold down Artificial Intelligence. This feature is important because otherwise the human has no chance. To understand the issue we have to go back to the first computer chess matches in which human experts played against the computer. Usually such a match ends with the machine winning.
The same match can be played in any other domain, for example Starcraft or Pong. No matter how well the human plays, he will lose the match against the computer. That means there is no game available which can be played well by a human but which is too hard for a game AI. The AI-in-the-box experiment is the logical next step after the recognition that the human will lose any game. If the human is not able to win against the machine, because the AI can think much faster and try out more creative approaches, then the human needs a different option to hold down the AI. In a normal match this is called cheating, and exactly this is done in the AI-in-the-box experiment. The idea is not to let a human play against the AI under equal starting conditions; instead the AI starts on the Starcraft map with fewer resources and is encapsulated behind a big wall. That means the human is no longer a player in the game, but controls the resources and observes what the AI is doing with limited access to energy.
The simplest possible AI-box experiment can be realized within computer chess. All that is needed is a chessboard which allows starting from synthetic board positions. One side gets only a king and a pawn, and the other side has built a wall of queens around the king. Now the king has to win against the stronger opponent, and the king is played by the AI. Even if the AI is the most advanced chess engine ever created and runs on a supercomputer, it is not able to beat the human, because the human has more physical resources. The human player is in the social role of the environment. He controls all the pieces on the board, and even if his movements are not planned well, the AI has no chance to win the game. This gives the human the flexibility to study an advanced chess player in a controlled environment.
In contrast to a normal chess game between AI and human, such an experiment makes sense because the human isn't forced to win the game. He starts with a better position from the beginning and can relax.

GOAP planning for Sokoban


The game of Sokoban consists of a robot, a box and a goal. The robot has to execute some actions which push the box to the goal. Instead of creating the overall game tree, the better idea is to describe the problem from an abstract perspective. In the GOAP (goal oriented action planning) technique the robot has the following actions:
- movetobox (precondition: robot far from box, effect: robot is at box)
- adjustposatbox (precondition: robot is at box, effect: robot has push position)
- pushbox (precondition: robot has push position, effect: box is at goal)
The planner can generate an action sequence. A sample plan would look like: movetobox, adjustposatbox, pushbox.
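To make this concrete, here is a minimal sketch in Python. The three actions are encoded as precondition/effect pairs over boolean worldstate variables (the variable names are my own invention, not part of any GOAP standard), and a breadth-first search acts as the planner:
from collections import deque

# each action: name -> (precondition dict, effect dict)
ACTIONS = {
  "movetobox":      ({"robotatbox": False},  {"robotatbox": True}),
  "adjustposatbox": ({"robotatbox": True},   {"pushposition": True}),
  "pushbox":        ({"pushposition": True}, {"boxatgoal": True}),
}

def applicable(state, pre):
  return all(state.get(k, False) == v for k, v in pre.items())

def plan(start):
  # breadth-first search over the symbolic state space
  queue = deque([(start, [])])
  visited = set()
  while queue:
    state, seq = queue.popleft()
    if state.get("boxatgoal"):
      return seq
    key = frozenset(state.items())
    if key in visited: continue
    visited.add(key)
    for name, (pre, eff) in ACTIONS.items():
      if applicable(state, pre):
        queue.append(({**state, **eff}, seq + [name]))

start = {"robotatbox": False, "pushposition": False, "boxatgoal": False}
print(plan(start))  # ['movetobox', 'adjustposatbox', 'pushbox']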
This is the basic idea behind GOAP. But perhaps it's possible to simplify the overall pipeline with conditional planning. Conditional planning means creating a more abstract plan which works in different situations.
The main idea behind the GOAP problem solving technique is to transfer a domain into a language. In the Sokoban example, three actions are available plus some world descriptions like “robot is at box”. This new kind of problem is a textadventure: it is played interactively and the visual representation is no longer needed. A pleasant side effect is that the state space is reduced dramatically. Even if the Sokoban map has a size of 100x100 fields, the planning space for the solver is reduced to the textadventure.
The open question is whether the problem space can be reduced further. A possible idea would be to use a constructive grammar which produces the textadventure. In the literature this is called a "procedural grammar". A GOAP model has the syntax:
rulename, if feature then feature
Rule induction
If the GOAP action model is not known in advance, it makes sense to collect a dataset and try to determine the rules with a machine learning algorithm like the decision tree learner C4.5. First, we need recorded data:
case id | name           | precondition            | effect
--------|----------------|-------------------------|-------------------------
0       | movetobox      | robot far from box      | robot is at box
1       | movetobox      | robot far from box      | robot is at box
2       | adjustposatbox | robot is at box         | robot has push position
3       | adjustposatbox | robot is at box         | robot has push position
4       | pushbox        | robot has push position | box is at goal
5       | pushbox        | robot has push position | box is at goal
In the table some cases are recorded. They were acquired from plan traces in a learning from demonstration scenario. To make things simple, the precondition/effect pairs of identical actions are the same, which means every action was successful. The idea is to first record all the interactions with the Sokoban game, extract features, group these features into actions and then give the actions a name. The result is a GOAP forward model which predicts the outcome of an action in terms of feature values.
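A full C4.5 implementation is out of scope here, but the grouping step can be sketched in a few lines of Python. The data layout below is my own assumption; the recorded cases are simply grouped by action name and the distinct precondition/effect pairs are read off as rules:
# recorded cases: (case id, name, precondition, effect)
cases = [
  (0, "movetobox", "robot far from box", "robot is at box"),
  (1, "movetobox", "robot far from box", "robot is at box"),
  (2, "adjustposatbox", "robot is at box", "robot has push position"),
  (3, "adjustposatbox", "robot is at box", "robot has push position"),
  (4, "pushbox", "robot has push position", "box is at goal"),
  (5, "pushbox", "robot has push position", "box is at goal"),
]

model = {}
for _, name, pre, eff in cases:
  model.setdefault(name, set()).add((pre, eff))

# print the induced rules in the "rulename, if feature then feature" syntax
for name, rules in sorted(model.items()):
  for pre, eff in rules:
    print(name + ", if '" + pre + "' then '" + eff + "'")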

July 27, 2019

How to win against an AI


In the game OpenRA there is an option to play against an Artificial Intelligence. It works with a scripting AI, and the AI is cheating, not in the sense that it plays too well but the opposite: a built-in feature of the AI is that the programmer has reduced its performance. Otherwise the human player would always lose.
Somebody may think that it's easy to win against a computer-generated force, because the human player has the ability to think abstractly and the AI does not. The problem with modern AI systems is that they have a lot of modules built in which can plan ahead, and additionally the AI can press more keystrokes per minute, so overall the human player has no chance.
From a strategy point of view the question is how to win against an opponent who is stronger and nearly unbeatable. The first thing to do is measure the performance of the AI precisely. The newbie may think that the AI has an advantage of 30% or maybe 40% and, as a result, that it's possible to overcome the gap with some tricks. The sad fact is that the advantage of the AI is much greater. A normally programmed AI is at least 10 times stronger, and a well-realized AI is up to 100 times stronger. The advantage can't be measured in a challenge similar to the ELO score in a chess tournament; it's an absolute value which has to do with how many actions the AI can execute each minute and how few errors the AI makes at runtime. A single AI can easily play against 10 human players at the same time.
How likely is the chance to win against such a player? Right, the chance is zero. Even if the human player trains hard, he can't beat a computer program; it's a physical limit he has. A human player can press a mouse button around 60 times per minute and activate an action. In contrast, a computer-generated force can press the mouse button 600 times and more in the same time. Especially if the situation is more complex, the computer has an advantage, because the same algorithm can handle hundreds of detail problems in the game and can decide each issue with maximum efficiency.
Some years ago, during the last AI winter, there was a time in which philosophers were optimistic that computers can't do everything. They imagined that the strength of computers lies in repetitive tasks, while creative, complex tasks can't be handled by computers. At that time the philosophers were right, because in the early 1990s there was no example of a well-playing game AI available. If the programmers struggle to program such an automated player, it is easy to maintain the position that the human is always superior to a machine. Unfortunately, the gap is only a technical detail. If somebody programs an AI, then the AI is able to beat the human.
The only chance a human player has to win against a game AI playing OpenRA is if no such AI is available. That means the human acts in an environment in which game AIs are not highly developed, because the AI experts are not available or are not able to solve the issue. The programmer takes a short look at the problem, comes to the conclusion that the state space of OpenRA is too complex, and then admits that it's not possible to create a simulated force.
Putting the AI into a weaker position
A normal game of OpenRA starts with equal conditions. That means the AI starts with the same amount of resources as the human. In a short amount of time, the AI uses its strength to increase the gap to the human. That means it will build more units and conquer a larger part of the map. The result is a very strong AI which occupies more resources, while the human player has become much weaker.
But what will happen if the AI is put into a weaker position, somewhere in the middle? That means the AI has fewer units and fewer resources, while the human controls the map. Right, then the game is fair, which means the AI has to fight hard to beat the human. The AI needs all its power to overcome the weaker starting position. From a philosophical point of view, such experiments are called “AI in a box”, because the aim is not to play a match against the AI; the idea is to put the AI into a prison and then observe what the system is doing with the limited resources. The human player doesn't compete with the AI but forms an environment for the AI.
The situation is comparable to a computer chess match in which the AI has only the king and the human player dominates the match with the entire collection of powerful pieces like queen, bishop and so on. If the AI becomes too powerful, the experiment gets stopped.

Again, GOAP architecture explained


Goal oriented action planning is basically an explanation of how to implement the STRIPS planner in a computer game. The first step is to create a table with actions:
name, precondition, postcondition, costs
The table gets filled with actions like “walktolocation”, “pickupobject”, “opendoor”. The table can be seen as an abstract textadventure in which the robot can execute actions which bring the system into a new state. If the textadventure contains some possible actions, it's possible to start the simulation with an input state, for example “robot is in the middle”, and a goal state, “robot is at the exit”. And now the planner calculates the steps in between. The solver works with a graph search algorithm, similar to a path planner but for symbolic actions.
And now comes the funny part: what will happen if an action like “walktolocation” should be executed? Right, the details are not stored in the table. The action name has to be sent to the physics engine of the game. That means the game has the same table, enriched with the details of how to execute an action. In most cases this is realized with normal computer code. That means in the game engine there is a method called “walktolocation”, and this method consists of some statements.
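A possible sketch of this double bookkeeping looks as follows. The table entries and the engine methods share the action names, so a plan produced by the symbolic solver can be dispatched to the detailed code; all names here are illustrative, not taken from an existing GOAP framework:
# planner side: name -> (precondition, postcondition, costs)
actiontable = {
  "walktolocation": ("robot is in the middle", "robot is at door", 1),
  "opendoor":       ("robot is at door", "door is open", 1),
}

# game side: the same names, enriched with the executable details
class GameEngine:
  def walktolocation(self):
    print("running pathfinding, playing the walk animation ...")
  def opendoor(self):
    print("checking the lock, playing the door animation ...")

engine = GameEngine()
plan = ["walktolocation", "opendoor"]  # output of the symbolic planner
for name in plan:
  getattr(engine, name)()  # dispatch the abstract action to the engine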
Explaining the difference between GOAP and STRIPS is a bit difficult. From a technical perspective it's the same. I would go a step further and call GOAP a slimmed-down version of STRIPS. That means it's a step backward. The advantage of GOAP over STRIPS is that the surrounding documentation is easier to read. All the philosophical and mathematical parts of the STRIPS project are missing, and GOAP is a kind of tutorial on how to realize a non-player character for a game. It's not an algorithm or a framework, and it can be programmed in any language like Python or C++. GOAP is a kind of documentation on how to create a symbolic planner from scratch. The tutorial explains that the first thing needed is a table which contains actions; the preconditions, costs and postconditions are stored in the table as well. And then a planner figures out the next steps for the robot. GOAP can be seen as a software engineering pattern for creating a simple planning robot.
No algorithm available
Computer programmers are trained to search for a library, an algorithm or a framework which can solve a task. Unfortunately, GOAP isn't an algorithm, and even though some example sourcecode is available at GitHub, it makes no sense to use this code in your own project. The inner working of GOAP is to create a textadventure around a given problem. That means it won't solve an issue; it will make things more complicated. GOAP can be described as an abstract game engine which is traversed by a solver. It is not located within the domain of computer science and algorithm theory, but has to do with software engineering.
Let us observe how existing GOAP solvers were created in the past. In all cases, the programmers use an existing programming language like C# or Python and build their own GOAP datastructure. Then the table is filled with actions from the domain. The problem is that this process is highly individual and can't be reproduced very well. What we can say for sure is that an object-oriented programming language was used and that the GOAP framework fits into the normal version control contributions which are made with the git tool. The programming of a GOAP solver is motivated by the requirement to program a non-player character which has a certain feature set, and by the understanding of the AI programmer of how to realize such features in software.
It's interesting to know that GOAP was first described by practicing game developers. Perhaps one reason is that it can't be formalized in algorithmic notation and can't be described in mathematical terms. I would locate GOAP next to the software engineering discipline, which is a highly individual discipline dominated by personal stories from real projects. It makes sense to explain how a certain game was upgraded with a GOAP solver, and it makes sense to summarize and compare different software projects. But the attempt to explain GOAP from a theoretical perspective will fail.
Text adventure generator
A GOAP model is equal to a text adventure. Producing such models can be realized with a text adventure generator. This is a piece of software which produces a game. In the game it's possible to do actions like “opendoor” or “walktolocation”. The textadventure, which is the GOAP model, will figure out what happens next. The advantage of a textadventure is that a solver can determine the needed actions easily because the overall state space is small. It can be converted into a graph, and the solver will take under a second to find complex action sequences.
The open problem is how to convert a normal game into a textadventure. The term textadventure generator implies that the process can be done autonomously, but in reality most GOAP models are created in a software engineering process by human programmers. They take a look at the game and write a text interface for it. This results in an agent architecture which can solve the game autonomously.

Create an AI box experiment with computer games


The AI box experiment is the logical step after a human has recognized that he will lose any game against the AI. After a chess player has lost 10 matches against the computer player, he will come to the conclusion that he needs a controlled environment in which he can observe the behavior of the AI. This is possible by giving restricted resources to the AI. That means the human player has the normal chess pieces, while the game AI starts only with a king. Now the AI can use all its strength, and the human can't be beaten. The ability to think very fast and the ability to think 100 moves ahead won't help a single king win against a human-controlled setup.
The same AI-in-a-box experiment can be realized with the realtime strategy game OpenRA as well. The idea is that the game AI starts with fewer resources than the human, and then the AI has to overcome its restrictions. Even if the human player doesn't play the game very well, the AI can't beat him because the AI is in a weaker position, similar to being restricted to a prison.
What would happen if the AI were able to understand the situation; could it escape from the prison? Yes and no. Let us observe first the situation with computer chess. A single king, even if it's controlled by the strongest AI player in the world, is not able to win against a normally equipped side which consists of 16 pieces. No matter which kind of trick the AI tries to explore, the human is always stronger.
The problem is that not all prisons are secured very well. In some movies it was shown that the prisoner can escape. And if a human can do so, an AI perhaps has the same capabilities. The point is that in an escape game, the AI starts in a weaker position. The idea is not to figure out who plays a game better; the idea is that the opponent controls the game, and the AI has to overcome the restriction.

The secret behind game AI


Realtime strategy games like OpenRA have a built-in game AI. But how exactly does this computer-generated force work in reality? Sometimes the principle is described as a scripting AI, but this definition isn't clear enough. The more elaborate explanation has to do with abstraction. The transition from no AI to an autonomous system is a soft one.
Let us go into the details. A realtime strategy game is usually programmed with the human player in mind. The underlying physics engine provides some actions to the player. For example, he can select units and let them walk to a target position. This abstraction layer can be formalized as a grammar. The human player can interact with the game through a controlled vocabulary. He has some actions which are mapped to the GUI, and after activating the actions the physics engine does something.
This baseline abstraction has nothing to do with an ingame AI; it supports the normal game play. It defines the game itself and its rules. A side effect is that the game AI can use the same layer as a baseline. That means the AI doesn't move the cursor to a button and click with the mouse; the AI has access to the underlying API with all the allowed actions. The only thing that is missing is which of the around 100 actions should be activated next.
A naive approach would be to use a random generator, but such an AI would be very weak. The more interesting technique is to create a second layer which allows the AI engine to plan the movement. A CPU-efficient way of doing so is called GOAP (goal oriented action planning). This is a kind of textadventure which forms an additional layer on top of the physics engine. In the GOAP model all the actions are available, but they can be activated without running the normal physics engine.
Let me give an example. In every realtime strategy game the player can build a power plant. But why should somebody do so? Because the effect of the action is that the energy level gets increased. The relationship between the action and the effect can be stored in a textadventure in a compact representation. This allows the planner to execute the action, and without a delay he gets the result that the energy level is higher.
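As a sketch, such a compact representation can be a small dictionary plus a precondition/effect function; the numbers and names are invented for the example:
# headless worldstate: the planner can apply the effect instantly,
# without running the physics engine
worldstate = {"energy": 0, "money": 1000}

def buildpowerplant(state):
  if state["money"] >= 300:  # precondition
    state["money"] -= 300    # effect: costs money ...
    state["energy"] += 100   # ... and increases the energy level

buildpowerplant(worldstate)
print(worldstate)  # {'energy': 100, 'money': 700}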
Formalizing all the allowed actions into a textadventure is equal to building a goal model. It's a headless game which can be played in a much shorter time. In this textadventure a complicated game tree can be built. The solver tries out possible action sequences and sees the result.
The next step is the AI itself, which is not hard to explain. The solver gets a goal, for example “maximize the energy level”, and the AI will figure out on its own which actions are needed. The generated action sequence from the textadventure is executed in the real game, and for the human player a lifelike game AI is visible. Creating the GOAP model from scratch is a bit tricky, but with trial and error it's possible.
From a bird's eye perspective the AI consists of abstract actions which are searched by a solver to fulfill a goal. The resulting action sequences get executed in the game. The advantage of this technique is that the amount of needed CPU resources is low, and the AI can become very strong. Let us go back to the example with the power plant. If the GOAP model is available, it's possible to send a high-level request to the planner. For example, a goal can be: maximize the energy level. Another goal is to build as many units as possible. A more general goal would be to increase the overall strength. The interesting point is that from an algorithmic perspective it is very easy to fulfill such goals, because all the possible actions and their effects are formalized. A game tree can be generated, which means that the solver tests out an action and determines what the result is. If the solver does so many times, he can browse through the entire game.
In the AI literature it is sometimes written that realtime strategy games are hard to solve because the state space is too large. Exactly this is the reason why many abstraction levels are needed to simplify the search process. If the solver tried out random mouse actions, it would indeed not be possible to figure out the next action, because after a short amount of time the possible combinations are endless. But if the game is structured hierarchically and a GOAP model is used, it's not very hard to build the entire game tree. Sometimes a third layer is used which is called a goal manager. On this layer, a dedicated manager defines the next goal. The advantage is that such a goal definition can be realized on a very high abstraction level. It's possible to write down the following statement:
if the own base is under attack, play defensive.
else play offensive.
Such high-level goals are only possible because the lower layers are available. That means if the game is working well, the GOAP model is available and the solver accepts goals, it's possible to build on top of these layers a goal manager which consists of simple if-then statements, as the following sketch shows.
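This is a minimal sketch of such a goal manager; the solver interface and the plans are placeholders, not taken from a real game:
def goalmanager(worldstate):
  # highest layer: pick the next goal with a simple if-then statement
  if worldstate["base_under_attack"]:
    return "defend"   # play defensive
  return "attack"     # play offensive

def solve(goal):
  # placeholder for the GOAP solver which expands the goal into actions
  plans = {"defend": ["buildturret", "repairbase"],
           "attack": ["buildpowerplant", "buildtank", "moveout"]}
  return plans[goal]

worldstate = {"base_under_attack": True}
print(solve(goalmanager(worldstate)))  # ['buildturret', 'repairbase']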

Linux isn't recommended for gaming


Sometimes it is argued that Linux is a great gaming platform, because the STEAM client is available in the Ubuntu repository and the Mono environment allows playing C# games under Linux. For example, the OpenRA realtime strategy game is available as a Linux package and can be started under the open source operating system.
I have tested it, but the performance is poor. After joining a multiplayer match, the framerate for the other players drops below the limit. That means my Linux box was the reason why 3 other players were not able to play the game smoothly. Linux is an obstacle which prevents having fun with the OpenRA game. The problem is that it doesn't help to adjust the settings. Even with a so-called “frame limiter” the slow performance remains the same. Combined with occasional crashes of the game, Linux is a bad choice for serious gamers.
I'm using the latest Fedora version, which includes the Wayland environment, which according to the developers is optimized for modern graphics cards. My CPU has 4 virtual cores which can provide a lot of power, but not under Linux.

July 25, 2019

List of typographic inventions


The history of technology can't be told while ignoring the most revolutionary invention of all time, which was the powerhouse for realizing all other ideas. It's called book printing, and the most remarkable fact is that it has changed rapidly over time. What we today perceive as the information society is a very young development which will change rapidly within the next ten years.
But let us start with the early beginnings. In the 15th century, Gutenberg invented the first printing press which was superior to manual writing. In the 18th century his machine was upgraded with a steam engine. Later, photography was invented, which established in the 19th century the ability to create high-quality books. The technology didn't stop there; the next improvement in printing technology was the computer-controlled printer, which is also called digital printing. At the same time, in the 1980s, the Postscript document format was invented, which together with the HTML standard forms the backbone for distributing academic knowledge over the internet. This technology was improved by the first search engines and the access of private households to internet flatrates.
The latest improvement in book printing technology is the academic social network Academia.edu, which puts the normal user in the role of a scientist. That means anybody can create his own fake science with the help of LaTeX and MS Word, and with a bit of luck he will find his readers and students somewhere on the internet who download his manuscripts.
What the future will bring can be imagined today. If not humans but deep learning algorithms create the manuscripts, the amount of information will grow rapidly. It will result in an information explosion in which former boundaries no longer exist. Any human in the world can become an academic journal, and even robots can write their own newspaper.
The interesting feature of printing technology is that on top of this innovation all other disciplines can accumulate knowledge. No matter if the subject is literature, biology or economics, all disciplines depend on books and they profit from a faster Gutenberg galaxy. Another interesting feature is that newer technology doesn't replace older technology; all of them are used at the same time. In the year 2019 it makes total sense to buy a printed hardcover book in a book store for 100 US$, but it also makes sense to upload a Postscript document to the internet.

What is wrong with SUN Microsystems?


A short look into the history of computing of the early 1990s will come to the conclusion that a SUN workstation was the greatest machine on earth. It consisted of superior hardware and was shipped with the SUN operating system. In the early 1990s SUN was equal to the future. It had a revolutionary computer design, something not seen before.
If we analyze the same hardware and the same software with today's eyes, SUN has lost nearly all the magic. An outdated mainstream PC for 300 US$ plus a normal Ubuntu Linux operating system would clone the SUN workstation well. The resulting workstation would have greater power and more attractive software than the original SUN computer ever had. What went wrong with SUN Microsystems that it lost all the glory?
It wasn't a problem of SUN itself, because we can take any major technology company from the early 1990s and clone the idea with modern cheap hardware. It has to do with the computing industry in general. The idea of building, shipping and buying an advanced computing workstation is not something new and exciting; it has become a mainstream product. Each day, millions of high-performance computers are sold, and using a multi-tasking operating system which can draw windows and has an ethernet connection is not advanced technology, but the minimum requirement in the low-end PC market.
The difficulty is to identify an area which is the new SUN workstation. That would be a kind of technology which is revolutionary but isn't widely available. It's not clear which company or product has this status, but what we can say for sure is that former PC companies and software companies are not in that position. It is located outside the Silicon Valley bubble.

Creating cargo cult robotics projects


The main problem with robotics projects is that nobody knows how to do them right. So-called general Artificial Intelligence hasn't been invented yet, and the only technology which is available today is normal computers and programming languages like Python. The pathway from today's computer technology to modern robotics is blocked, and the only way to bypass the obstacle is to stage the desired behavior. That means the engineers can't do science; they have to do fake science, which is called pseudo science or cargo cult science.
It means pretending the desired behavior with today's available technology. In the original meaning from Richard Feynman, cargo cult science was a kind of theater play in which airplanes were made out of wood. The reason is that such wooden airplanes were the technology which was available to the people on the island. Transferring this idea to robotics means that the robot has to be controlled with a human operator in the loop, because telerobotics is the only technology which is known today.
The interesting point is that lots of technologies are available to realize remote-controlled robots. There are servo motors, joysticks, data gloves and even cameras available. That means with normal out-of-the-box technology it's possible to set up a teleoperated system. It's not a real robot; it has more in common with an airplane made of wood. The airplane is not able to fly, and the robot is not able to act autonomously. But it's better than nothing.
Let us describe the details. A teleoperated robot can be controlled with a joystick, a mouse or a data glove. It can be controlled by a single person, or by many persons at the same time. A typical low-tech example would be a robot arm which is controlled by two operators at the same time who are using a mouse to move the joints. In the optimal case, the human operators are hidden behind a folding screen, because this makes the illusion more realistic. The problem is that such a robot needs two human operators in the loop. That means its productivity is low and the overall setting is fake. But the hope is that such a teleoperated robot inspires the audience about what future technology will look like. It's not a demonstration of how well science works; it shows that theater playing makes sense.
Postpone the hard parts
If a telerobotic system gets activated, nothing will happen. The system has no representation of the environment, no planner and no neural networks. The engineers struggled to implement such features because their knowledge is not sufficient to solve the task. Instead, a human operator has to send keystrokes to the system, and it can't be called a robot, but a trivial mechanical machine.
From an abstract point of view, in a teleoperated robot the most demanding module is missing. It's the AI software which can control the robot on its own. This module wasn't realized, and the engineers don't care about it. They have postponed the step to a later moment. They are planning to program the AI in 10 years, when their knowledge is greater. Perhaps they will never figure out how to realize a robot. The robotics engineers have a relaxed understanding of their role. They recognize that the teleoperated robot isn't a real robot, but they do nothing to overcome the issue. Instead, they are done with the project.
Some of the engineers imagine how to program the AI which is able to control the robot. But they haven't accumulated enough wisdom for a detailed analysis. Instead of realizing the AI part, the aim is to deceive the audience and themselves and pretend that the robot is working right now. They do so by controlling the device manually. From the perspective of efficiency, cargo cult science is here to stay. Basically the idea is to solve only the easier parts of the project and ignore the harder ones. It's equal to a conservative approach in which energy is invested only where success is certain. Instead of failing often and standing up again, the idea is to never fail and to fear the harder problems. The problem is recognized as too big, and the logical consequence is to stay away from it. Fake robotics engineers stay in the comfort zone and plug the holes with irrational beliefs. A typical statement in a telerobotic project is that a higher force will control the robot, but in reality the higher force is only two human operators with a joystick in hand.

GOAP AI planning explained step by step


The goal oriented action planning system has the same disadvantages as the original STRIPS planner. The algorithm itself works great, but it's complicated to explain. I won't reinvent the wheel, but I will explain it in an easier way. At first, a GOAP tutorial needs two elements:
1. how the solver works
2. how to build the domain model
A GOAP planner contains low-level actions which can be executed on the robot directly, for example the action “left”. And it contains tasks which don't produce an action directly but result in a goal. Let us take an example. The robot should go to the middle of a map. This goal is defined by sending the goal to the game engine: task(“gotomiddle”)
If the game engine receives the task, it will put the robot in the middle by changing its absolute coordinates. The planner has to figure out which goals are needed in which sequence, and then the low-level actions are determined. The inner working of such a planner is remarkably easy, because it's a graph search algorithm which traverses the symbolic game tree and checks out potential paths.
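A sketch of the task mechanism could look like this; the engine receives the symbolic task and fulfills it by changing the coordinates directly (all names are hypothetical):
class Engine:
  def __init__(self, width=10, height=10):
    self.width, self.height = width, height
    self.robotpos = (0, 0)
  def task(self, name):
    if name == "gotomiddle":
      # the goal is fulfilled by setting absolute coordinates
      self.robotpos = (self.width // 2, self.height // 2)

engine = Engine()
engine.task("gotomiddle")
print(engine.robotpos)  # (5, 5)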
The second important aspect of GOAP is how to create the domain model. Without a model the planner won't work. The domain model is equal to a grammar which contains goals in natural language. It's a wordlist which contains a detailed description for each entry. Building such a grammar is possible with learning from demonstration. That means a human operator is observed while he solves a task, and according to the annotated actions the tasks are identified.
Combining teleoperation with symbolic planning
Every solver works with a goal in mind. The problem is that goals are not available in the software itself, because they reference requests from the outside. The high-level goal is an input which is fed into the planner. The source of the goal is not another module, but the human operator.
If the human operator defines the goal interactively, a semi-autonomous system is the result. An easy to grasp example is an inverse kinematics system in which the human operator points to a position on the screen and the robot arm follows the trajectory. The interesting aspect is that such a robot arm moves faster than a human could execute the actions. The human only needs to point the mouse to the goal; the low-level actions are determined by the system.
This interaction between human and robot can't be fully modeled in software. Every goal planner comes to a bottleneck because it's only partly possible to determine the goals themselves. That means the robot itself doesn't have meaning; it becomes a tool only if a higher instance asks the system to do a certain action. Let us give an example:
A plan is provided to the robot control system, which is a list of waypoints. Why should the robot system follow the waypoints? Nobody knows, because the plan is given by a human operator. He knows why the plan makes sense, but this knowledge isn't available to the robot. The communication between both separate systems needs a shared language. That means the robot control system should be able to parse the plan, and the human operator as well. What the robot can do is figure out the low-level primitives for realizing the plan.
It's the same idea as in modern compiled programming languages. The compiler needs a hello world program as input. For the compiler the program itself doesn't make sense, but it can convert the sourcecode into binary code.
Essence of GOAP
The problem with most AI paradigms is that it's hard to grasp their inner core. Is GOAP an algorithm, a piece of software or a programming language? Unfortunately none of them, and it's much harder to explain the overall concept. I would call it an extension to a physics engine which provides abstract actions. The user can send a command like “movecenter” to the physics engine, even if the game itself doesn't know such an action. The physics engine is put into a measurable state, and a solver can try to reach this state.
Or to explain it from a different viewpoint: the GOAP concept means that a plan notation is invented and grounded in the physics engine. This plan notation allows the human operator to provide a sequence of actions, and the GOAP planner can try to convert this plan into low-level actions. The term goal is equal to a macro action. For example, a goal might be “reach waypoint”. It's an option to raise the abstraction level. A plan which contains only 10 single steps can describe a complex, longer sequence of movements in the game. A comparison with functions in a normal programming language makes sense. For the programmer it's not important how the “isprime()” function works internally, but it delivers a return value back to the program. GOAP uses this paradigm for structuring the action space of a robot in a game.

July 21, 2019

Production systems are language-based game engines


The history of AI knows production systems, General Problem Solvers and STRIPS-like solvers. What they have in common is that they use a language-based transformation of a system, which results in high-speed problem solving. To explain the idea in detail, I give an example.
A robot is located at the center of a map. He can move in 4 directions (up, down, left, right). The numerical way of problem solving works with a game tree. All possible sequences of the robot, e.g. “up, up, left, up, left”, are stored in a graph, and a solver can search the graph for a goal node. Dedicated path planning algorithms like A* or RRT modify this approach a bit and search the graph faster.
A production system works differently from graph search. The basic idea is to describe the problem on a semantic level with the help of a robot language. This is equal to introducing macro actions. The robot language contains the following words:
- lowlevelactions: up, down, left, right
- midlevelactions: 5up, 5down, 5left, 5right
- highlevelaction: movetolocation, moveincircle, resetgame
The words of the robot language can be combined into larger programs, very similar to a behavior tree. A production system uses the grammar for reaching a goal. The advantage over a graph search algorithm is that the computational effort is much lower. The reason why STRIPS-like solvers are not very common in robotics is that it's hard to invent the robot language from scratch. In most problems only the low-level commands (up, down, left, right) are known, but not the macro actions.
In the STRIPS notation, the domain grammar is stored in the STRIPS syntax, while in the SOAR production system the syntax is based on SOAR production rules. In both cases, the domain file is equal to a grammar which describes a problem on a higher level. The grammar allows searching the action space of the robot more efficiently than testing out only low-level sequences of actions.
Symbolic AI
Symbolic AI is equal to creating a higher abstraction layer on top of movement primitives. On the lower level, the robot can only move in four directions with one step each. He is not able to reach a position far away with one command. This kind of super-action has to be invented by the programmer. It's possible to transform a super-action like “moveto(10,20)” into a sequence of low-level actions. In the literature the concept is described as hierarchical planning, and it improves the efficiency in problem solving.
The major question is not how STRIPS or SOAR work internally, which amounts to planning in a hierarchical fashion; the bottleneck is the grammar which describes the macro actions. Using an existing STRIPS domain description for determining the low-level actions of the robot is an easy to solve problem, because the solver will search the domain file, test out some alternatives and then find the correct action sequence. The more demanding challenge is to convert a game into the STRIPS domain file.
A language grammar allows executing high-level actions in a game. After running a macro action the game will be in a new state. The transition is formalized in a symbolic game engine. Let me give an example. According to the grammar, the macro action “5up” is available. The meaning of the command has to be formalized in sourcecode:
newposition = oldpos + (0,-5)
defines the action precisely. The action modifies the underlying game; it changes the position of the robot. Now we can ask which low-level actions are needed to achieve the same result. The transformation from high-level actions into low-level actions can be realized with a solver.
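Such a solver can be sketched with brute force. The snippet below grounds the macro action “5up” and then searches all low-level sequences until one produces the same position (a sketch, not an efficient planner):
from itertools import product

MOVES = {"up": (0, -1), "down": (0, 1), "left": (-1, 0), "right": (1, 0)}

def macro_5up(pos):
  # newposition = oldpos + (0,-5)
  return (pos[0], pos[1] - 5)

def run(pos, actions):
  for a in actions:
    dx, dy = MOVES[a]
    pos = (pos[0] + dx, pos[1] + dy)
  return pos

def solve(start, goal, maxlen=5):
  # test all low-level sequences up to maxlen, return the first match
  for length in range(1, maxlen + 1):
    for seq in product(MOVES, repeat=length):
      if run(start, seq) == goal:
        return seq

start = (10, 10)
print(solve(start, macro_5up(start)))  # ('up', 'up', 'up', 'up', 'up')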
Language understanding machines
A software which accepts natural language commands is easier to realize than it might look at first sight. All that is needed is an if-then statement:
if input==”5up” then ...
if input==”5down” then ...
We can feed the program with a string, for example “5up”, and the program does something. From the perspective of a game programmer, such a software module is called a game engine. It specifies which actions can be sent to the game. The statements after the “then” section ground the input word. They formulate precisely how the game state gets changed. After a game engine has been programmed, it's possible to formulate a high-level program written in the domain-specific language:
5up
5up
5down
... is an example program. Each command is sent to the game engine and is executed in a virtual machine.
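Written out as runnable Python, the idea looks like this minimal sketch:
# minimal language-based game engine: every input word is grounded
# by an if-then branch which changes the game state
state = {"robotpos": (10, 10)}

def engine(command):
  x, y = state["robotpos"]
  if command == "5up":
    state["robotpos"] = (x, y - 5)
  elif command == "5down":
    state["robotpos"] = (x, y + 5)
  else:
    print("unknown command:", command)

# the high level program from above, executed word by word
for word in ["5up", "5up", "5down"]:
  engine(word)
print(state)  # {'robotpos': (10, 5)}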
A simple grammar for solving Sokoban-like puzzles

In the example picture the well-known game of Sokoban is shown. To make things easier there are no obstacles, but the robot has to push the boxes to the goal position at the lower right of the map. The low-level actions available to the robot are: left, right, up, down.
The first idea, solving the problem with an RRT graph, failed. It's not possible to generate the entire action sequence, because the number of required actions is too high. The robot has to reach box0 first, then he has to push the box to the goal, then he has to go to the next box, and so on.
The next better approach after RRT graph search is implementing a robot control language and using a symbolic solver for figuring out the low-level actions. This approach works much better. The implemented grammar has the following elements:
"reset","movetobox","pushboxtowaypoint","movesmall", "pushsmall"
All these words are powerful macro actions. They modify the position of the player in absolute values. For example, the action “movesmall” is able to teleport the robot 2 fields with a single command. Let us observe how a potential high-level plan will look:
1. movetobox 0 # a parameter is given to the command to specify box0
2. pushboxtowaypoint 0
3. movetobox 1
4. pushboxtowaypoint 1
If the high-level plan is expanded further, the improved plan looks like the following:
1. movetobox 0
1a movesmall
1b movesmall
2. pushboxtowaypoint 0
2a pushsmall
2b pushsmall
2c pushsmall
3. movetobox 1
3a movesmall
3b movesmall
4. pushboxtowaypoint 1
4a pushsmall
4b pushsmall
Even this detailed plan is not fully expanded. A low-level planner has to convert all the movesmall and pushsmall actions into low-level primitives. The resulting plan will contain only “left, right, up, down” actions. That's the basic idea of a hierarchical planner, and the idea is powerful enough to determine the actions of the robot. The proposed system consists of three elements:
1. the Sokoban game itself
2. the grammar which includes low-level and high-level actions
3. the planner which generates the plan and converts it into a low-level action sequence
Let us dive into potential pitfalls. There are two possible pathways for generating the macro actions. I have chosen the manual way. That means in the program there is a subroutine for “movesmall” and for “pushsmall”. In the subroutine the angle to the goal and the new position are determined with normal computer code. I'm unsure if it's possible to generate the macro actions without human intervention. In the literature the concept is called “rule learning”, but I doubt this works for Sokoban. That means my program doesn't have learning capabilities; it is static. All the macro actions are defined before runtime.
Solving a symbolic problem
Let us go a step back and describe what the planner has to do. Instead of trying out a sequence of low-level actions, the newly established environment consists of macro actions:
"reset","movetobox","pushboxtowaypoint","movesmall", "pushsmall"
The goal state has to be reached using a sequence of these actions. Because the macro actions are more powerful than the regular actions, the sequence becomes much shorter. The proposed plan can be described with a few commands. Before the planner can generate the actions autonomously, it's important to ground the macro actions. Grounding means that the actions are not only given in a Python list, but that the game engine accepts the command as a valid command and does something in return. That means if the command movetobox is sent to the game engine, the game engine will modify the robot's position, and the result is displayed on the screen. These are called grounded actions, because the resulting game engine state can be measured by the planner.
Let me give an example. We are using a simple randomized planner which sends random macro actions to the game engine. After each trial we test what the position of the robot and the boxes is. If the boxes are at the goal position, the plan has been found. Before we can measure the robot's position, the action needs to have an influence on the game engine. The counterexample would be that we send the high-level action “movesmall” to the game engine but nothing happens. Then the action is not grounded and it's only a word in the program without a meaning.
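Such a randomized macro planner can be sketched in a few lines. The game engine interface (task(), boxesatgoal()) and the FakeEngine stand-in are assumptions for the example; in the real program the grounded Sokoban engine would take its place:
import random,copy

class FakeEngine():
  # stand-in for the grounded Sokoban engine; task() changes the
  # worldstate, so the planner can measure the effect of an action
  def __init__(self):
    self.state={"box0":"free","box1":"free"}
  def task(self,name):
    if name=="pushboxtowaypoint":
      for b in self.state:
        if self.state[b]=="free":
          self.state[b]="atgoal"
          break
  def boxesatgoal(self):
    return all(v=="atgoal" for v in self.state.values())

macroactions=("reset","movetobox","pushboxtowaypoint","movesmall","pushsmall")

def randomplan(engine,planlength=6,trialmax=500):
  for trial in range(trialmax):
    sim=copy.deepcopy(engine)
    actionlist=[random.choice(macroactions) for i in range(planlength)]
    for name in actionlist:
      sim.task(name)
    if sim.boxesatgoal(): # measure the worldstate after the trial
      return trial,actionlist
  return None

print(randomplan(FakeEngine()))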

Short history of the Gutenberg Galaxy


The Gutenberg Galaxy is a virtual space of printed books. The most important event was the invention of the Gutenberg press in the 15th century. This technology allowed for the first time to reduce the cost of copying a text. A later milestone was the invention of the motor-driven high-speed press by Koenig&Bauer in the 19th century, which revolutionized the printing of newspapers. At the same time the first public libraries were established in larger cities all over the world.
A recent event in the progression of the Gutenberg galaxy were the first softcover books in the 1960s. They replaced the former hardcover printings, and their main advantage was the lower price. Around the year 2000 a very important milestone was reached, which is the internet. From now on it's possible to store a digital copy of a book in the cloud and, very important, do a fulltext search in the content. This resulted in today's information age in which classical books have become obsolete.
All the inventions from 1500 until today have lowered the cost of distributing knowledge. It's easier than ever to write, print and reproduce information. This is equal to flooding the Gutenberg universe with more content. A look back into the past shows that neither the invention of the computer nor the electric light was the most important invention; rather, the supporting technologies of the Gutenberg age can be understood as the tipping point. Without the distribution of books, it would not be possible to discuss biology, literature, mathematics or engineering.
The interesting point is that all innovations in the Gutenberg-related domain were rejected at the beginning. In the 1960s, in which the first softcover books were published, the criticism was great. Most people didn't like the new book format and preferred to spend more money in exchange for a hardcover book of higher quality. The same techno-scepticism exists against the Academia.edu platform, which is a breakthrough in distributing academic knowledge. Most researchers don't like the idea of uploading their pdf paper to a server without paying the former article processing charge. They don't trust the idea of reducing the costs of knowledge. They fear this will change the world. And all the concerns are correct.
Let us research some behaviors related to the Gutenberg galaxy. How does it feel if somebody buys a softcover book for a cheaper price instead of the hardcover version? How does it feel if somebody buys a newspaper instead of visiting the church on Sunday? What is the emotion if somebody uploads a pdf paper to Academia.edu instead of sending the manuscript to a publisher? It always feels bad; it is nothing that looks right. It is something not wanted by society. In all of these cases, the individual user who does so stands out from the crowd. He is doing the opposite of what all others are doing. The reason is that the idea of reducing the costs is equal to making progress. And this is what Luddites are in fear of.
Industrial revolution
Some authors are surprised why the world has made such dramatic progress since the 17th century and why so many new inventions were made during the industrial revolution. The simple explanation is that the printing press was the basic technology which triggered all the other inventions. If it's possible to copy a book, it is possible to establish a university in which the books are archived in libraries. And if a university is available, students will attend the classes, get educated and can found new companies which bring the economy forward.
Let us imagine an industrial revolution without the printing press. It won't work. Even if a certain genius had invented something, it wouldn't be possible to spread the news to other people. The printing press is some kind of information highway which connects all the other developments in society. The role of the printing press has been taken over by the Internet. It provides the same service but at lower costs and much faster. To understand the power of Internet technology we have to focus on a technically simple workflow. An author has written a manuscript and uploads the pdf file to a server. Then the information is indexed by Google and all the other users can download the document. They don't have to pay anything, and they can get the content within seconds. That's the fastest printing press ever invented.
From today's point of view, this simple use case of the internet seems normal, but in the history of mankind it's rare. The first internet flatrates and the first search engines appeared around the year 2000. That means, the described workflow of uploading and downloading electronic manuscripts was invented only 19 years ago. What would happen if this innovation stays active for a longer period of time, for example for 100 years? Yes, it will revolutionize everything. The amount of information will grow, and the number of people who have access to the content as well. What exactly they are doing with this electronic press is unclear, but it will speed up the information exchange.

July 20, 2019

Will Rust replace C++?


The Rust programming language looks like a promising candidate for simplifying programming. In contrast to C++ the language was designed from scratch and has many powerful features. Even the performance is comparable with C++. Unfortunately these features are not enough to compete with C++. The problem with C++ is that the language is nearly perfect, which means there is no need for a new kind of language. The reason why C++ has pointers is that a computer works with pointers, and the reason why so many libraries are available in C++ is that so many programmers have written them. It's simply not possible to reinvent programming from scratch and make everything much better.
The future of C++ isn't Rust; the future is a layered software development workflow in which the prototype is written in Python and the production code in C/C++. That's the best-practice method for producing high-quality software which runs on major operating systems. It's true that there is a need for additional languages apart from C/C++, but not with the aim to replace it, rather to extend it.
The funny thing with programming is that writing code isn't very hard. I've never heard of a programming project in which the programmers were not able to create an array and fill it with values. The reasons why software projects fail have to do with coordinating programmers, deciding which kind of program has to be written, and users who don't like the application. Converting a specification into executable binary code is the easy part of a software project. If somebody is in the comfortable position that he can type C++ sourcecode into an IDE, the project is on the right track.
Software engineering can't be revolutionized with new programming languages but with new collaboration tools like Atlassian Jira, wikis, GitHub and so forth.

Is there a need for an alternative to SOAR?


The cognitive architecture SOAR is a well-known and intensively documented framework for creating Artificial Intelligence agents, and it contains lots of interesting features like subgoaling, chunking and a graph-based working memory. But something is wrong with SOAR which prevents the software from being recommended for serious applications. It has mainly to do with the documentation, which is always formulated in the style of “you can do something” and “you have to click this button” and so on. This kind of language works similar to the promotional language used in advertisement campaigns, in which the new car is simply great and all the user has to do is to get excited.
From an academic standpoint, this kind of description doesn't fulfill the minimum standard, which means the SOAR documentation can't be called scientific; reading it is a waste of time. Sure, SOAR itself is great, it combines most of the advanced features which are available for modern agent architectures, but it's possible to make the overall system much better.
Which means, there is a need to build a new cognitive architecture from scratch which is less powerful than SOAR but provides a better self-description of what the project is about. The first step in doing so is to explain what the idea behind SOAR is. In contrast to a normal solver, which searches the game tree for a goal node, SOAR works with heuristics. That means, the requirement for CPU resources is low and no heavy computational task is needed to solve a problem. Everything in the SOAR universe works reactively and with human-engineered knowledge. The concept is sometimes called symbolic AI because no number crunching is involved in the game.
A first approach to understand SOAR better is to compare it with a STRIPS planner. A STRIPS planner consists of actions which can be executed by the solver, and the precondition/postcondition for each action is given in the STRIPS file. A cognitive architecture is some kind of advanced STRIPS planner which has the ability to access a working memory and to learn new goals on the fly.
The STRIPS notation is a good starting point for developing a cognitive architecture from scratch, because the concept is described in mainstream gaming AI and it's not very complicated to grasp the basic idea. The game starts with an init state, the solver tries to find a sequence of actions, and this will guide the robot through the game.
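A STRIPS-like action can be written down as a plain data record. The following sketch shows a hypothetical “openbox” action with its precondition and postcondition (effect), checked against a worldstate dictionary; the record format is an assumption for illustration:
# a STRIPS-like action: name, precondition, effect
action={
  "name": "openbox",
  "precondition": {"keypos":"atboxkeyhole","boxstatus":"close"},
  "effect": {"boxstatus":"open"},
}

def applicable(worldstate,action):
  # the action is allowed if all precondition facts hold
  return all(worldstate.get(k)==v for k,v in action["precondition"].items())

def apply(worldstate,action):
  # returns the new worldstate after executing the action
  newstate=dict(worldstate)
  newstate.update(action["effect"])
  return newstate

state={"keypos":"atboxkeyhole","boxstatus":"close"}
if applicable(state,action):
  state=apply(state,action)
print(state) # {'keypos': 'atboxkeyhole', 'boxstatus': 'open'}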
In a paper [1] a mixture between a classical HTN planner and a memory architecture was described. The resulting architecture isn't as powerful as SOAR, but it explains in an easier way what the idea is. It starts with a vanilla HTN planner for figuring out the actions for the robot. Before the planning process gets started, some operations on the internal memory have to be done. The paper admits that it has reverse engineered the SOAR architecture, and the similarity is there. Exactly this approach makes sense, because inventing something twice is equal to learning something. If somebody isn't able to clone a piece of software, he hasn't understood it.
HTN planning plus short term memory
The idea of combining HTN planning with a memory was also discussed in another paper [2]. It describes a non-player character in a game which doesn't have full information about the game, but sees only a limited amount of information which is stored in the short-term memory. Similar to the previously referenced paper, the system starts with a normal HTN planner for generating actions which is extended by additional components like a subgoal generator and a short-term memory.
It's interesting that the concept of a working memory plus subgoaling is described both in the SOAR context and for game AI non-player characters. It seems that life-like characters in games have a natural demand for a working memory and for hierarchical planning capabilities.
Let us imagine how a non-player character can be created which is more flexible. The first thing to do is to realize the working memory not as a normal datastructure but as a semantic network. All kinds of variables can be stored there, and links between the items are possible, similar to a linked list. The second thing to do is to add a learning functionality, for example a decision tree trained with the ID3 learning algorithm. This allows the non-player character to adapt his behavior in realtime.
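As a sketch, such a working memory could be stored as a list of (node, relation, node) triples, which is one simple way to represent a semantic network. The triple format and the query helper are assumptions for illustration; the ID3 learner is left out:
# working memory as a semantic network of (node, relation, node) triples
workingmemory=[
  ("enemy1","isa","npc"),
  ("enemy1","position","room3"),
  ("player","position","room2"),
]

def query(memory,subject=None,relation=None):
  # return all triples which match the given pattern
  return [t for t in memory
          if (subject is None or t[0]==subject)
          and (relation is None or t[1]==relation)]

workingmemory.append(("enemy1","sees","player")) # store a new percept
print(query(workingmemory,subject="enemy1"))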
[1] Zhang, Jun, et al. "MBHP: A memory-based method on robot planning under uncertainties." 2011 IEEE International Conference on Robotics and Biomimetics. IEEE, 2011.
[2] Mahmoud, Ibrahim M., et al. "Believable NPCs in serious games: HTN planning approach based on visual perception." 2014 IEEE Conference on Computational Intelligence and Games. IEEE, 2014.

Measuring Artificial Intelligence


The problem with most AI-related projects is that newbies don't know what the goal is. Sure, Artificial Intelligence is the goal, but how exactly is this defined? A possible answer to the problem is to define Artificial Intelligence on a time scale. AI is equal to running a problem solver very fast.
Let us take a small example. The task is to solve the first level of the Sokoban game. For doing so, the robot has to push some boxes. He can do so by building a graph of all possible movements, and a common algorithm for doing so is RRT (Rapidly-exploring Random Tree). After starting the software, the CPU load will reach 100% and after a while the program has found the solution. In the case of Sokoban, the needed game tree is very large. Even though RRT is a nice algorithm, it will consume lots of CPU resources.
Can we increase the performance? Exactly here comes Artificial Intelligence into the game. RRT itself can be called AI, because it's a vanilla graph search algorithm. It works a bit faster than a brute-force solver, but not very much. If the Sokoban problem is slightly more complicated, the solver will fail. That means, for a standard problem the CPU resources are too little to build the tree in realtime. It will take many hours until the sequence of robot moves is generated. AI-related problem solving means identifying a faster technique. The aim is to solve the Sokoban puzzle but with less computational effort.
The definition has the advantage that it explains what the goal is. The goal is more than only to control a robot, because the RRT algorithm can do so very well. The aim is to use as little CPU resources as possible. That means, to invent an algorithm which uses the underlying hardware more efficiently. Somebody may argue that apart from graph search no alternative is available, which would mean that games like Sokoban or Lemmings can't be solved. But this ignores that some AI heuristics are available which can do so much faster. The number of potential speed-up techniques is large, and it's up to the programmer to identify the best one.
The interesting aspect of using the RRT algorithm is that the technique is in theory well suited to solve any kind of Sokoban level. But in reality it can't. The problem is that after some movements in the game, the game tree becomes very large and even the RRT algorithm isn't able to handle this complexity. That means, everything works fine, but the CPU of the underlying hardware needs too many resources. This brings the computer to its physical limit, and there is a need to invent something which works better. The question is not how to program an AI; the question is how to make the RRT algorithm faster.
Let us take a look into RRT. The algorithm itself works great. It uses a graph to store existing movements in the game, and it extends the graph in a tricky way. Programming an RRT algorithm in software is a bit more complicated than programming the Sokoban game itself, but it's possible in any programming language. The difficulty is that RRT won't solve the core problem, which is to find a path through the game tree. It's simply a scaling problem, which means that RRT works well only for problems which have 1,000 up to 10,000 nodes; then it becomes difficult. At first glance this insight sounds surprising, because RRT is one of the most efficient graph search algorithms available. Sometimes it is described as faster than A*. Unfortunately, this advantage is not enough, because Sokoban is more complicated to solve than what the RRT algorithm has to offer.
All serious AI techniques try to overcome the bottleneck of classical graph search algorithms, which try to find a path in a large state space. They do so with heuristics and domain knowledge. What we can say for sure is that normal graph search algorithms like A* or RRT are not able to solve harder problems. Hard means that the map on which the pathplanning problem lives is huge, or that the state space of the robot is huge. This is a surprising insight, because for the newbie RRT looks like a powerful tool for solving all kinds of planning problems. The bottleneck is not to improve RRT a bit, for example by using a multicore CPU; the problem is that solving Sokoban needs a speedup of 1 million percent. This is equal to a completely new algorithm apart from graph search.
What's wrong with RRT?
RRT is one of the most powerful search algorithms available. The reason is a combination of building a graph and extending the graph efficiently with new nodes. Creating a graph is important because this helps to avoid figuring out the same sequence of actions twice, and adding new nodes with the RRT technique saves many resources.
RRT outperforms A* a little bit. Not so much that A* becomes obsolete, but enough to use RRT as the standard search algorithm for pathplanning and action planning in games. The only problem is that RRT itself will fail on most problems. Graph search plus an optimized node-adding method isn't enough to handle a state space which consists of millions of nodes. If we like to use RRT for solving computer chess, Sokoban, Lemmings or a robotics task, we will notice that the algorithm fails. That means, it will take too long to find an answer. This has nothing to do with a certain programming language, for example Python vs. C++; the problem is that a speedup of around 1,000 up to 1 million is needed, which can't be provided by RRT.
The answer to the problem is to modify the problem itself. Instead of searching in the state space of low-level actions, the idea is to search in a symbolic state space. If RRT is used for sampling PDDL actions, it's a very powerful algorithm. The point is that searching over symbolic actions has nothing to do with plain graph search; it's a heuristic which goes beyond the RRT idea. The advantage of RRT is at the same time its bottleneck: searching in a graph works for all kinds of problems, but it results in poor performance.
Let us go back to the Sokoban problem. Suppose the RRT solver tries to analyze the state space. After a while, it's obvious that it takes too much time. We can cancel the operation and note that RRT won't solve the problem. That means, RRT is useless for solving Sokoban. We can switch back to the normal brute-force solver which has nearly the same performance. That means, a highly optimized RRT algorithm will need 1 year to search the game tree, and a normal brute-force algorithm will take 100 years. Both performance results are not practical, because the software has to find the next move in under a second.
What I want to explain is that for serious problems we can ignore RRT and use a normal brute-force solver without any graph-building strategy. The bottleneck isn't how to search fast in a graph; the problem is how to map a task to a graph. Suppose there is a graph given which consists of 2,000 nodes and the aim is to find the shortest path. Under these constraints, RRT makes total sense, because it's able to find the path very fast. Unfortunately, the Sokoban game doesn't provide a graph with 2,000 nodes; it provides no graph at all, and it's up to the programmer to identify a data structure in the game.
A well-known technique for speeding up the search in the state space is called “reactive planning”. These are non-sampling techniques which need no or only a small amount of computational effort. Usually they work in a hierarchical fashion. For example, it makes sense to divide a map into smaller maps and use a quadtree planner to find the path, as the sketch below shows. This is only the most common introduction into the subject of reactive planning; there are many more strategies available to reduce the state space drastically.
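A minimal sketch of the quadtree idea: a uniform region of the map is kept as one large cell, and only regions which mix free fields and obstacles are subdivided further. The grid format (0 = free, 1 = obstacle) is an assumption for illustration:
def quadtree(grid,x,y,size):
  # returns a list of (x, y, size) cells; a cell is subdivided
  # only if it contains both free fields and obstacles
  cells=[grid[y+dy][x+dx] for dy in range(size) for dx in range(size)]
  if size==1 or all(c==cells[0] for c in cells):
    return [(x,y,size)]
  half=size//2
  result=[]
  for (nx,ny) in [(x,y),(x+half,y),(x,y+half),(x+half,y+half)]:
    result+=quadtree(grid,nx,ny,half)
  return result

grid=[[0,0,0,0],
      [0,0,0,0],
      [0,0,1,0],
      [0,0,0,0]] # 1 = obstacle
print(len(quadtree(grid,0,0,4))) # 7 cells instead of 16 single fields
A path planner then searches over these few cells instead of every single field, which is where the drastic reduction of the state space comes from.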
Artificial Intelligence is grouped around alternatives to classical RRT planning and uses knowledge, reactive planning and heuristics for speeding up the search in the game tree. The goal is to reduce the computational load to a minimum, which allows solving very complex problems on mainstream CPUs.
At first glance, quadtree-like planners don't look very powerful. But compared to a vanilla RRT algorithm they are able to speed up the search by a factor of 10,000 and more. That means, after starting the algorithm the result is presented within milliseconds. The good news is that octree planners, reactive planning and cognitive architectures don't contain Artificial Intelligence in themselves; what they are doing is searching faster in the game tree. They produce the same action sequence for a robot as the vanilla RRT algorithm, but in a shorter amount of time. The initial problem is not “how to make the robot smart”; the robot is smart already, but the solver needs 100 days until he has determined the next move. AI is equal to reducing the calculation time to zero or almost zero.

Lessons learned from implementing RRT for Sokoban


RRT (Rapidly-exploring Random Tree) is sometimes presented as the preferred pathplanning algorithm which works well for larger maps and for realtime purposes. Implementing the algorithm in software is not that hard. It's mainly a class which consists of methods like “comparewithgoal”, “addnode”, “getparentid” and “showpath”, as the sketch below shows. If the user starts the algorithm, the software will generate a game tree and add new nodes carefully. This is equal to a fast sampling of the state space.
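A minimal sketch of such a class might look like the following. It is simplified: a true RRT extends the node nearest to a random sample, while here a random existing node is extended with a random action, which is an assumption to keep the example short:
import random

class RRT():
  def __init__(self,start,goal):
    # each node: (position, parentid); node 0 is the root
    self.nodes=[(start,None)]
    self.goal=goal
  def addnode(self,pos,parentid):
    self.nodes.append((pos,parentid))
    return len(self.nodes)-1
  def getparentid(self,nodeid):
    return self.nodes[nodeid][1]
  def comparewithgoal(self,nodeid):
    return self.nodes[nodeid][0]==self.goal
  def showpath(self,nodeid):
    # walk back from the goal node to the root
    path=[]
    while nodeid is not None:
      path.append(self.nodes[nodeid][0])
      nodeid=self.getparentid(nodeid)
    return list(reversed(path))
  def run(self,maxnodes=20000):
    actions={"left":(-1,0),"right":(1,0),"up":(0,-1),"down":(0,1)}
    for i in range(maxnodes):
      # simplified extension: random tree node plus random action
      parentid=random.randint(0,len(self.nodes)-1)
      dx,dy=actions[random.choice(list(actions))]
      x,y=self.nodes[parentid][0]
      newid=self.addnode((x+dx,y+dy),parentid)
      if self.comparewithgoal(newid):
        return self.showpath(newid)
    return None # tree budget exhausted without reaching the goal

print(RRT((0,0),(3,2)).run())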
For smaller problems it works great. A small problem is a map of 20x20 fields, in which the robot starts at the top left and has to find the path to a goal, and additionally some obstacles are in the map. Running the RRT algorithm on such kinds of problems results in a runtime of around 1 second, which is fast enough. If the programming language is carefully selected, for example C++, the speed is much better and it can be used under realtime constraints.
The problem begins if we modify the setup slightly. In the box-pushing domain, the robot not only has to move to a goal, he has to push a box. The goal function has to check if the box has reached the goal. If the RRT algorithm is used for such purposes, the performance becomes worse. On my computer it takes many minutes until a solution is found. A simple request to the solver like “put the box to goal position A” results in a larger processing task which occupies lots of CPU resources and takes 2 minutes. A more complicated task in which two boxes should be moved to a goal position results in a runtime of many hours, which means that the RRT algorithm doesn't find a solution.
The reason has to do with the state space. If the robot has to move to the box, he will need around 10 steps; for moving the box to the goal he needs an additional 15 steps; then he needs again 10 steps to move to box 2 and so on. The number of possible action sequences of this length is high: with 4 possible actions per step, a sequence of 35 steps already spans a state space of up to 4^35, which is roughly 10^21 nodes. The resulting graph grows quickly, which is too much for the RRT algorithm.
The problem is that it's not possible to improve the performance only a bit, because for solving the box-pushing task the algorithm has to run 1,000x faster. RRT is not the answer to the problem; it prevents the solver from finding a solution. Sure, RRT is one of the fastest sampling algorithms available, but state space sampling isn't able to solve complex problems. I think the problem is located within planning itself. The idea of providing a start position so that the solver has to find the goal position isn't working anymore. Only for simple toy problems will the algorithm find the steps in between.
The good news is that this understanding not only criticizes the RRT algorithm; similar techniques like PRM, A* and reinforcement learning will have the same poor performance. The only way to overcome the difficulty is to switch to a different kind of search strategy which doesn't sample the state space.
Alternatives
Possible alternatives to RRT which work faster are easy to name. The first one is a quadtree pathplanner, which means that the map is divided into submaps which have different sizes. This allows planning longer routes with a single step. The second technique is symbolic planning, in which the problem is converted into a STRIPS domain file, which is equal to hierarchical actions. In both cases the original problem is ignored; the normal state space is no longer relevant, and a different layer is built on top of the original game.