July 29, 2019

GOAP Textadventure


Even if GOAP was introduced in the literature already, it make sense to give a simple example. In the following sourcecode a GOAP planner for solving a textadventure is presented. It contains of the action model itself and a randomized solver for determine the action sequence.
"""
description: GOAP Textadventure 2019-07-29
"""
import random,copy

class Actionmodel():
  def __init__(self):
    self.worldstate = None
    self.tasklist=("openbox","closebox","takekeyfromtable","putkeyinbox","takeball","placeballatgoal")
    self.reset()
  def reset(self):
    self.worldstate = {
      "ballpos": "inbox", # inbox, atrobot, atgoal
      "keypos": "attable", # attable, atrobot, atboxkeyhole
      "boxstatus": "close", # close, open
    }
  def task(self,name):
    result=True
    if name==self.tasklist[0]: # openbox
      if self.worldstate["keypos"]=="atboxkeyhole":
        self.worldstate["boxstatus"]="open"
      else: result=False
    if name==self.tasklist[1]: # closebox
      if self.worldstate["keypos"]=="atboxkeyhole":
        self.worldstate["boxstatus"]="close"
      else: result=False
    if name==self.tasklist[2]: # takekey
      if self.worldstate["keypos"]=="attable":
        self.worldstate["keypos"]="atrobot"
      else: result=False
    if name==self.tasklist[3]: # putkeyinbox
      if self.worldstate["keypos"]=="atrobot":
        self.worldstate["keypos"]="atboxkeyhole"
      else: result=False
    if name==self.tasklist[4]: # takeball
      if self.worldstate["boxstatus"]=="open" and self.worldstate["ballpos"]=="inbox":
        self.worldstate["ballpos"]="atrobot"
      else: result=False
    if name==self.tasklist[5]: # placeballatgoal      
      if self.worldstate["ballpos"]=="atrobot":
        self.worldstate["ballpos"]="atgoal"
      else: result=False
    return result
  def show(self):
    print(self.worldstate)

class Solver():
  def __init__(self):
    self.mymodel = Actionmodel() 
    #self.plan1()
    self.randomsearch()
  def randomsearch(self):
    planlength=5
    trialmax=100
    for trial in range(trialmax):
      engine=copy.deepcopy(self.mymodel)
      actionlist=[]
      for i in range(planlength):
        name=self.validaction(engine)
        actionlist.append(name)
        engine.task(actionlist[-1])
      if engine.worldstate["ballpos"]=="atgoal":
        print("trial",trial,actionlist)
        engine.show()
  def validaction(self,engine):
    maxtrial=16
    for trial in range(maxtrial):
      localengine=copy.deepcopy(engine)
      r=random.randint(0,len(localengine.tasklist)-1)
      name=localengine.tasklist[r]
      temp=localengine.task(name)
      if temp==True: break
    return name
  def plan1(self):
    m = Actionmodel() 
    print(m.worldstate)
    m.task("takekeyfromtable")
    m.task("putkeyinbox")
    m.task("openbox")
    m.task("takeball")
    m.task("placeballatgoal")
    print(m.worldstate)

class Game():
  def __init__(self):
    self.mysolver =Solver()   

mygame=Game()
The textadventure contains of a robot who has to place a ball into a goal position. Unfortunately, the ball is located in a box, and the box is secured with a key. For modeling such a game, a worldstate is created in form of a Python dictionary. The worldstate holds the ballposition, the keyposition and the status of the box. The user can send tasks to the textadventure like “openbox” and the game engine analyzes if the task is valid. For example, the box can only be opened if the key is in the box.
For testing out, if the action model works the method “plan1()” was created which sends a sequence of actions manual to the game engine, similar to playing the game interactively. After “plan1()” works, the solver can be implemented which finds the correct sequence autonomously. The solver is using a random generator for sending all possible action sequences to the action model. If the desired goal states is reached the solver stops.
Let us describe the system more abstract. in contrast of a STRIPS solver no dedicated planning language was used, instead the check if the preconditions are fulfilled are done within the class in computercode with if-then-statements. Additionally, the solver is not very advanced but because of the small state space it's working fast enough. In less than 100 lines of code it was demonstrated how a GOAP action model can be realized in Python and how a solver plans the action sequence.
Let us take a look into the core function which takes an action and calculates the next worldstate. The idea is, that the user sends a predefined command like “openbox” or “takeball” to the task-subfunction. The first thing what the method is doing is to check if the precondition is fulfilled. For example, the box can only be opened, if the key is in the keyhole, otherwise the task fails. The inner working is based on a if-then-statement plus direct manipulation of the worldstate. That means, the function changes the worldstate to the new value. For the programflow the action names doesn't need to be formulated in natural language. The method would work the same, if only the actionid is send. But for reason of debugging it make sense to use English descriptions. This allows the user to interact with the program in a pseudo-engish language. That means, the user can send a request like “putkeyinbox” to the game engine, and the program is doing something.
The GOAP model works similar to a textadventure. That means, the worldstate and the allowed actions are formulated in a textonly language and no graphical representation is allowed. The most interesting result is, that the resulting statespace of all allowed actions is very little. Overall the game consists of 6 actions plus 3 variables in the worldstate dictionary. That means, for the solver it is very easy to find the desired goal state by simply testing out random actions.
Nobody cares of a performance increase of factor 1000
What if a technique would be available which is able to run a Artificial Intelligence algorithm 1000x times faster? The surprising fact is, that such speed up is easy to achieve. All what the programmer has to do is switch from the Python language to C++ language which results into a speedup with the factor 10, and then he prefers rapidly exploring random trees (RRT) search technique over a normal random solver which makes the program run 100 times faster. In a direct comparison, a RRT graphsearch in C++ is around 1000x faster than a Python randomized solver.
But there are some reasons who speaks against this kind of speedup. The problem is, that RRT is harder to implement because a complicated graph has to be maintained, and the C++ version needs more time to type in than the Pyhon language. Another surprising fact is, that a speedup of factor 1000 is not big enough in Artificial Intelligence. In normal computing such improvements is great, but for problems in Robotics and AI the needed performance speed up is much greater. In AI, only a speedup in the size of 1 million times faster and more is something which should be mentioned. Or to explain it the other way around. If an AI search algorithm is too slow, this is equal that the algorithm needs instead of milliseconds the amount of years. To overcome the issue it is not enough to switch from Python to C++, because the result is, that even in C++ the algorithm would run too slow. Most AI problems can only be fasten up with additional abstraction layers, which modifies the problem itself.
Unsolved problems
On the first look, the Python program works great. It calculates the action sequence and can bring the system into the goal state. The open question is, if the simplified textadventures simulates the reality accurately. That means, if the real world and the goap model are working the same.
Storing the task into a dictionary?
From a technical point of view it would be interesting in storing the actions as a dictionary. This would allow to add new actions during runtime. The same technique is used in the STRIPS language and in the ACT-R cognitive architecture as well. All these planner are based on the principle that the possible actions are not stored in sourcecode direct, but as data.
But in reality the advantage is only low. Sure, some lines of code can be safed, but I'm in doubt if the concept would scale very well. If we take a look into more complicated textadventure, the game engine is usually create with normal Python code, but not with code which run modify during runtime. Sure, the LISP language allows to treat sourcecode as data as well, and it's interesting to describe the details, but the resulting system is very complicated to understand. The reason why the documation of the ACT-R project doesn't make much sense, is because self-modifying LISP code was used everywhere in the software. I think the more beginner friendly idea is to use only static code which is modified with a version control system like git. That means, if we need a new action, we have to add a if-then-entry in the sourcecode.
Suppose, the textadventure is able to parse not only single word statements but two word actions for example “take ball” or “open box”. The resulting parser would work differently. The sourcecode has to be modified. I think it's important if some sourcecode is there which can be adapted to new requirements. So i decided against storing the code in a dictionary. For the user the only thing which is important is, that he can send an action name to the game engine and the engine is changing the worldstate. How this is done internally it's up to the Python programmer.

No comments:

Post a Comment