January 28, 2022

Top down Artificial Intelligence with evaluation functions for different games

Manuel Rodriguez, January 28, 2022

The classical understanding of robot programming is to imagine what the robot has to do next. The robot is embedded in a game and has to choose the best actions. This kind of bottom-up understanding prevents machines from understanding the game as a whole. The more elaborate approach to creating Artificial Intelligence works with virtual referees, which is also known as top-down AI. The main ingredient is an evaluation function that judges the current state of all the players in the game.

Tags: Artificial Intelligence, bottom up AI, game AI, scoring function

1 Artificial Intelligence Philosophy

1.1 Intelligence equal to winning a game?

In a colloquial sense, intelligence is very well defined: it is the ability to win a game. If somebody is able to win a game of chess, this is seen as a desirable skill. People play chess because they want to win the game. The same is true for other games like Scrabble or the Rubik's cube puzzle. Winning is the only purpose all these games have, and what humans like is reaching this goal.

The interesting but seldom explained fact is that "winning a game" is an easy task for computers. In the case of chess, the computer has to evaluate the current position and plan some steps ahead into the future, and this allows the machine to win every single game. Even if the computer starts from a weak position, it will beat the opponent for sure if a winning path is available.

That means the highly desired ability to win a game is something a computer provides by default. There is no debate about whether computers are able to win a game of chess. They are able to do so, and anyone who wants to prove the opposite can try to play a game against a chess engine. But if winning a game is so easy, what is Artificial Intelligence about?

Computers are able to win a game, but they aren't able to invent such games from scratch. Somebody has to program an evaluation function for chess first; only then can the computer use its computational power. It seems that the isolated goal of winning games is not sufficient to describe what Artificial Intelligence is about. It has to do with creating evaluation functions and selecting which games seem to be important. For example, some decades ago the game of chess was chosen as a relevant game for computers, and the chance is high that in the future other, more complex games will be played by computers.

1.2 Artificial Life

The term Artificial Life has been in use since the mid 1990s. Most newbies believe that Artificial Life has to do with creatures in a 2d world who are searching for food, and that the brains of these creatures then improve by themselves, so the Artificial Life robots become more intelligent on their own.

The main problem with Artificial Life is that the term isn't defined precisely. It can mean anything or nothing. In most cases it has to do with Artificial Intelligence in an ecosystem, and the question is how to program robots for such a setting.

A possible alternative interpretation is that Artificial Life doesn't provide the answers but tries to analyze certain problems. One of them is the relationship between robots and their environment. An interesting but seldom asked question is: "Where is intelligence located, in the single individual, in the swarm, or in the environment?"

Let us try to analyze the issue from a technical perspective. It's possible to write a simple video game in which some creatures can move and search for food resources, so it is some sort of Artificial Life simulator. What the creatures will do is simple to answer: they will do nothing, because they weren't programmed. To make this simulation more lifelike, some sort of energy, intelligence or something else is needed.

The first attempt works bottom up. The thesis is that intelligence is located in the individual, so each of the lifeforms in the game needs a brain, and the brain can make decisions. This principle is equivalent to behavior-based robotics, in which a single robot perceives the environment with a sensor, plans something, and then acts in a way that improves the situation.

The problem with bottom-up AI is that it remains unclear what the objective is. The individual robot has to make a decision, and the goal is also determined by that single robot. That means the AI problem is referenced back to itself, and it remains unclear what the robot has to do next or what the goal is. The more sensible approach works top down. Not the single robot is in focus, but the game, which is represented by the environment. Because this approach is only seldom discussed in the literature, it should be explained in detail.

Top-down Artificial Life means that the robot creatures in a game are less important. The assumption is that these creatures are acting randomly or are tele-operated by humans. What is important instead are the game mechanics of the environment: where food is located on the map, how the robots can act, which action results in a positive reward, and so on. Basically, the idea is to design a game which is fun to play.

If a game is invented with this principle in mind, it can be played by:

  • a random generator
  • a human player
  • an algorithm
  • a neural network

The most antagonistic approaches to playing a game are a random generator on the one hand, which has a low winning probability, and a super-human AI on the other hand, which will win every single game. For the game engine, both are equal. A player in the game is simply somebody who can make decisions, and the outcome gets simulated by the game engine. The problem of decision making in a game can be delegated to this external entity.
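
As a minimal sketch of this idea, the following Python snippet treats a player as nothing more than a callable that maps a state to an action. The tiny "walk to position 10" game, its step limit and both toy players are invented for illustration.

    import random

    def game_engine(player, steps=50):
        # tiny invented game: walk along a line, the goal is position 10
        state = 0
        for _ in range(steps):
            action = player(state)       # the player returns +1 or -1
            state += action
            if state == 10:
                return "won"
        return "lost"

    def random_player(state):
        return random.choice([-1, 1])    # a noise generator acting as a player

    def greedy_player(state):
        return 1 if state < 10 else -1   # a trivial algorithmic player

    print(game_engine(random_player))    # usually "lost"
    print(game_engine(greedy_player))    # "won"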

Let us go back to the subject of Artificial Life. In the past, Artificial Life was mostly defined as simulating characters, for example a robot in a maze. But the definition can easily be flipped into its opposite. The new approach is to define the maze as the artificial life, and the robot in the maze as an external stakeholder who isn't intelligent.

This definition may be surprising, because a common assumption is that players in a game are able to solve the game. But this is not a precondition. A player can be something as simple as a random generator. If the robot in a maze is controlled by a noise generator, it remains a player, even if its actions don't make much sense.

In a previous paper this problem was recognized as the difference between bottom-up and top-down robotics. Bottom-up robotics says that the single robot needs to be intelligent, while top-down robotics explains that the game engine needs to be intelligent. Instead of arguing which of the perspectives is the correct one, it should be mentioned that both are available and that they define the same situation differently.

There are many arguments why top-down robotics might be the better approach to explaining Artificial Intelligence. The reason is that from a technical perspective it is pretty easy to win a game, while it is hard to invent new games. Programming a robot that is able to win a game is a trivial case, because all that is needed is to search the game tree, and computers can do so very well. In contrast, creating a new sort of game engine which acts as an environment is much harder to realize.

Suppose bottom-up robotics is here to stay. Then the overall problem is how to program a single robot so that this robot will do something useful. The assumption of bottom-up robotics is that this problem is complicated. But is this really the case? Suppose there is a robot on a graph that should find the shortest path to the goal. Nearly every programmer in the world is able to write an algorithm that does so, as sketched below. Such an algorithm (the source code in Python) needs around 100 lines of code and runs on a desktop PC in under 1 second. So we can say that winning such a game is not an AI problem, but a trivial computer science problem.
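
To make the claim concrete, here is a minimal sketch of such a shortest-path search in Python, using breadth-first search on a tiny hand-made graph. The graph and its node names are invented for illustration; a real robot map would of course be larger.

    from collections import deque

    def shortest_path(graph, start, goal):
        # breadth-first search: returns the shortest list of nodes, or None
        queue = deque([[start]])
        visited = {start}
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node == goal:
                return path
            for neighbor in graph.get(node, []):
                if neighbor not in visited:
                    visited.add(neighbor)
                    queue.append(path + [neighbor])
        return None

    # hand-made toy graph: nodes and edges are invented for this example
    graph = {
        "start": ["A", "B"],
        "A": ["C"],
        "B": ["C"],
        "C": ["goal"],
    }
    print(shortest_path(graph, "start", "goal"))  # ['start', 'A', 'C', 'goal']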

In contrast, a problem which is hard to solve is writing a game engine, that is, a piece of software which holds the maze and defines which actions are allowed in the maze. This is colloquially called a robot simulator. Such a simulator can't be realized in 100 lines of code; it needs much more effort. And if the simulator is supposed to provide a more complex game, the needed amount of code grows exponentially.

1.3 From bottom up AI to top down models

A computer is a machine which can do anything. But before the machine can solve a problem, it has to be programmed. Programmers are able to create such programs, but they need to know what sort of software is needed.

In the domain of AI there are two major schools: bottom-up Artificial Intelligence and top-down AI. Bottom-up AI means that the intelligence is located within the robot. This paradigm is ignored in the following; the section tries to explain only what top-down AI is.

The first thing to know is that a computer can easily win any game if it knows the cost function of the game. A cost function is a mathematical model of what the goal is. The cost function grounds reality in numerical values. Most users are familiar with cost functions because in video games the current score is shown on the screen. If the player collides with an enemy, the energy level is reduced and this is shown in the status bar.

Suppose a robot knows in advance in which situations of the game it gets a positive reward and in which a negative reward is given. Then the robot's computer is able to plan its actions to maximize the overall reward. That means the robot has won the game when the score is maximized, so it is an optimization problem. In such a situation it's pretty easy to define what AI is about, because maximizing the overall reward is no longer an AI problem but a normal computational problem which can be solved with software. There are many algorithms available for this purpose and they can be implemented in a programming language, as the sketch below illustrates.
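
A minimal sketch of this idea in Python, with an invented state format and a toy transition model: the reward function maps a game state to a number, and the planner simply picks the action whose successor state scores best.

    # toy reward function: maps a game state to a single number
    def reward(state):
        return 10 * state["items"] - state["damage"]

    # toy transition model, invented for demonstration only
    def simulate(state, action):
        new = dict(state)
        if action == "collect":
            new["items"] += 1
        elif action == "fight":
            new["damage"] += 5
        return new

    # the planner picks the action whose successor state scores best
    def best_action(state, actions):
        return max(actions, key=lambda a: reward(simulate(state, a)))

    state = {"items": 0, "damage": 0}
    print(best_action(state, ["collect", "fight"]))  # collect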

The only bottleneck is a situation in which the cost function is missing. Without a cost function it is not possible to plan actions, and without a plan the robot isn't able to win the game. In such a case an AI problem exists, because it remains unclear how to program the robot.

Top-down robotics puts a strong emphasis on cost functions. The idea is that intelligence is never located within the robot; instead, intelligence is equal to grounding a domain in a cost function. That means it is not the decisions of the robot that are intelligent but the processing of the cost function. With this paradigm in mind it is possible to say where the Artificial Intelligence is located: it is the cost function and nothing else.

The art of creating cost functions is usually treated as game design. Game design means inventing a game and defining at which moment in the game the player gets rewarded. This is perhaps the most obvious difference from bottom-up robotics. Bottom-up AI tries to play an existing game, while top-down robotics assumes that the game has to be invented first. Games are usually located outside of human players and outside of robots. A game isn't defined in the player's brain; it is equal to the environment. For example, a staircase can form a game in which the robot has to walk up the stairs. If bottom-up robotics tries to understand embedded intelligence, the idea of top-down robotics is to define what environmental intelligence is.

The idea of a cost function is to translate between a game and the actions of the participants in this game. The costs define what a single player has to do within the game.

1.4 Playing robot soccer with a remote control

In the past, robot soccer was usually played with autonomous robots. The idea was to write software which allows the robot to play this game. Programs written by different teams have to compete against each other. This sort of task combines Artificial Intelligence with a concrete problem.

There is a small modification available to this problem. Here the robots are no longer controlled by the onboard AI but are remote controlled. That means human players are standing outside the arena and pressing buttons which allow the robots to catch the ball. At first glance it is hard to see what such a game has to do with Artificial Intelligence, because the software needed to control a remote-controlled robot is minimal.

But perhaps both ideas can be combined with each other? What about a robot soccer game in which the robots are controlled by humans, but the referee is controlled by an AI? The referee is hidden in the camera, and its task is to detect the ball position and determine which of the players has won. In such a setup, AI problems and normal remote-controlled robots are both available.[1][2]

To understand why this combination makes sense, let us investigate the bottleneck of robot soccer in the past. Suppose the teams and the match follow the strict rule that the robots are controlled by software. Then the situation is that the robots are activated, and the team members stand around and have nothing to do. They can observe whether the written code works fine, but they can't intervene in the game. So their role is passive by nature, and it depends on the program whether the robot is doing a useful task.

Somebody may argue that this interaction mode is not a problem but the very goal of a robot competition. No, it is not the goal. The goal is to let robots play soccer and to use Artificial Intelligence for this purpose. The open question is how exactly such a match is realized. The reason why the competition was sometimes modified slightly so that remote-controlled robots are allowed is that it is more fun. If the human players can press buttons, they can influence the robots on the playing field. The resulting behavior of the robots is more human-like; that means they act differently.

The only question left open is where the AI is located if the robots are controlled remotely. In the normal case no Artificial Intelligence at all is needed. This makes it hard to see such a game as a serious challenge for computer science. Computer science is about neural networks, autonomous robots and algorithms, but in a remote-controlled use case such things are missing. In the past both attempts to play the game were seen as opposites: the first idea is to build autonomous robots and program them, and the second idea is to control the robots with a remote device and avoid a fixed algorithm.

1.5 A virtual robot referee

In the past, Artificial Intelligence was not defined precisely enough. Sometimes it was assumed that a complex robot is equal to an artificial intelligence; for example, if a robot is able to walk on two legs or kick a ball into the goal, then the robot is intelligent. Another assumption was that chess-playing computers are intelligent because they are able to win against any player.

The problem with this understanding of Artificial Intelligence is that an important aspect of games is ignored. Every game contains a referee who judges the players. It depends on the referee whether it makes sense to kick a ball into the goal. If the referee recognizes that the goal was the wrong one, the robot can't be called intelligent anymore.

The interesting situation is that even advanced robot challenges like Micromouse, line-following problems and RoboCup work with no referees or only with human referees. That means there is a human with a stopwatch who determines whether the robot has done the task. But what if the task of the referee is automated as well? Such an instance is called a virtual referee, because it is a piece of software which determines whether a robot or a human player has done everything correctly. It is a scoring mechanism which judges the quality of a behavior.

The hypothesis is that a player in a game is not an example of artificial intelligence, but a referee realized in software is an Artificial Intelligence. The reason is that the referee is the highest possible instance in a game. The only way to judge the referee is by claiming that the game itself is boring, for example that the line-following problem in general doesn't make sense.

The amount of literature and projects about virtual referees is small. In the past the problem was simply not recognized as important. Instead, all the attention was put on robots that play and win games, not on judging them. Can we train a robot to become a referee? Of course, but this role is different from that of a normal player. A referee represents the game; he gives scores and punishments to the other players.

1.6 Realizing artificial intelligence with evaluation functions

In the past, Artificial Intelligence was mostly discussed from a theoretical perspective. It was asked what intelligence is and how machines can emulate such a property. The next logical step is to create a practical demonstration of artificial intelligence, and the best approach is a computer which is able to play and win a game. In such a case the topic of AI is discussed more realistically and, very importantly, it can be demonstrated which sorts of games are too hard for the AI.

In some discussions it is assumed that game AI is the state of the art in AI research, but it is possible to go a step further and investigate how to create a general game-playing AI which is able to win all existing games. The interesting situation is that it is possible to describe how an AI can win a game. What the AI has to do is score the current game state with a numerical value and then use a search algorithm like A* to explore the game tree. This strategy works surprisingly well, as the sketch below shows.
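
A minimal sketch of this strategy in Python: a best-first search expands the game tree with a priority queue, always visiting the state with the best evaluation first. The functions evaluate(), successors() and is_win() stand in for game-specific code and are assumptions of this sketch, not part of any particular engine.

    import heapq
    import itertools

    def best_first_search(start, evaluate, successors, is_win, limit=10000):
        counter = itertools.count()   # tie-breaker so states never get compared
        frontier = [(-evaluate(start), next(counter), start, [])]
        visited = set()
        while frontier and limit > 0:
            limit -= 1
            _, _, state, path = heapq.heappop(frontier)
            if is_win(state):
                return path
            if state in visited:
                continue
            visited.add(state)
            for move, nxt in successors(state):
                heapq.heappush(frontier,
                               (-evaluate(nxt), next(counter), nxt, path + [move]))
        return None

    # toy usage: reach the number 7 by adding 1 or 2 to a counter
    path = best_first_search(
        0,
        evaluate=lambda s: -abs(7 - s),
        successors=lambda s: [("+1", s + 1), ("+2", s + 2)],
        is_win=lambda s: s == 7,
    )
    print(path)  # ['+2', '+2', '+2', '+1']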

How the A* algorithm works is well known; the bottleneck in the description is the evaluation function. Every game can be evaluated, and the interesting situation is that most games don't provide a score, or the score is delayed. For example, in the game of Tetris the player can make a move which is not recommended without affecting the score on the screen. The player will see only three moves later that the move was wrong. What an AI has to do is create its own scoring function which determines precisely whether a certain situation is an improvement or not. This scoring function can be combined with a search algorithm into a winning strategy.
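
As an illustration, the following sketch shows a hand-crafted evaluation function for a Tetris-like board. The board format (a list of rows with 1 for a filled cell) and the penalty weights for stack height and holes are invented for this example.

    def evaluate_board(board):
        # board is a list of rows, 1 = filled cell, row 0 is the top
        width, height = len(board[0]), len(board)
        column_heights = []
        holes = 0
        for x in range(width):
            seen_block = False
            col_height = 0
            for y in range(height):
                if board[y][x]:
                    if not seen_block:
                        col_height = height - y
                        seen_block = True
                elif seen_block:
                    holes += 1          # empty cell below a filled one
            column_heights.append(col_height)
        # arbitrary illustration weights: punish stack height and holes
        return -2 * max(column_heights) - 5 * holes

    board = [
        [0, 0, 0, 0],
        [0, 1, 0, 0],
        [1, 1, 0, 1],
        [1, 0, 1, 1],
    ]
    print(evaluate_board(board))  # -11, lower means a worse position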

From a more abstract perspective, Artificial Intelligence is identical with finding a scoring function for a game. A strong player has a better evaluation function than a weak player. The weak player makes a move and doesn't notice that his situation is worse than before. At the same time, this definition allows us to identify which problem remains unsolved in AI research: it is an open problem how to find such an evaluation function. What is used today are handcrafted examples which were programmed by humans. What is hard-coded in the evaluation function is the heuristic knowledge of an expert, which results in a numerical score. This scoring mechanism is the core element of an AI player. The other parts, like the search algorithm, the GUI and the move generator, are programmed around the scoring function.

Sometimes the problem of creating an evaluation function is discussed under the term "grounding problem". Grounding means converting a world state into a numerical value which can be processed by a computer. The reason is that existing games are formulated on a higher level than computers can understand. The only language computers can process is based on numbers. If a problem is formulated in terms of integer values and floating-point numbers, it can be solved easily on a desktop PC.

Let me give an example. Suppose there is a list of values ranging from 0 to 100. The computer has to search the list for the minimum value and print out the entry. Such a task is not an AI problem but a classical programming problem. Any programmer, and of course any discussion forum in the world, knows how to solve this problem fast and efficiently in a major programming language. If a scoring function is there, the decision making in a game is reduced to such a min/max search problem. The only obstacle is that most games don't provide such an easy-to-solve problem, so the game has to be converted first. This transformation is not understood by today's researchers, and it is an open problem within Artificial Intelligence.
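
The toy task from the paragraph above, written out in Python; it is a one-line call to the standard library.

    values = [42, 7, 93, 15, 66]   # toy data between 0 and 100
    print(min(values))             # 7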

1.7 A cost aware AI pong player

The classical understanding of Artificial Intelligence is dominated by the bottom-up paradigm. What Rodney Brooks introduced in the late 1980s was never a paradigm shift; it was the default understanding of what AI is about. Suppose there is a game of Pong and the AI player has to act in this game. Then the player needs to figure out the next action, and everything is described from a bottom-up perspective. Bottom up means that the problem is solved from the robot's perspective. For example, the AI player is on the right side of the Pong table, there is a ball, and the player decides to move towards the ball.

A typical implementation of a bottom-up Pong player works with some if-then rules, for example: if the player is near the ball, then move to it. The problem with this approach is that it does not work well enough. The more elaborate attempt to play the game is to think from a top-down perspective. Top down ignores the action space of a single player and tries to judge the game from the referee's perspective. A referee has to decide who gets a reward in a game, and a referee has to decide who has won even if the game is not over.

A referee doesn't act in the game; he scores the players who are doing something. Top-down judgment and creating a virtual referee are the same thing. This approach makes it much easier to implement an AI. The AI is not the player on the right side and also not the player on the left side; the AI is the game engine which judges the game from a bird's-eye perspective, as the sketch below illustrates.
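
A minimal sketch of such a top-down Pong referee in Python: it does not act in the game, it only scores both players from the game state. The state fields (paddle_y, ball_y, score) and the weighting are invented for illustration.

    def judge(state):
        # score both players from a bird's-eye view: points already scored
        # count most, good positioning (paddle close to the ball) adds a bit
        ratings = {}
        for side in ("left", "right"):
            distance = abs(state[side]["paddle_y"] - state["ball_y"])
            ratings[side] = 100 * state[side]["score"] - distance
        return ratings

    state = {
        "ball_y": 40,
        "left": {"paddle_y": 35, "score": 2},
        "right": {"paddle_y": 80, "score": 1},
    }
    print(judge(state))  # {'left': 195, 'right': 60}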

2 NP-hard problems

2.1 What is the NP-hard problem in AI?

In the past, the so-called NP-hard problem was sometimes mentioned in the literature. The basic idea is that all robotics problems are NP-hard, which can be shown in a proof. It means that the problems are so complicated that a normal desktop PC can't solve them. For example, the pick&place problem in 3d space is NP-hard, driving a car is an NP-hard problem, and controlling a biped robot as well. NP-hard means the state space is exponential in size and it is not possible to search this space exhaustively. That means there is no efficient algorithm available for NP-hard problems.

Let me give an example of an NP-hard problem. Suppose a robot should grasp a ball and put it into a box. The initial situation is known and the goal situation is defined as a constraint. The movements in between are equal to the state space. For a duration of around 1 minute the robot has to control the arm so that the goal state is reached. In each second the robot can decide between different movements for the joints, and the overall amount of possible action sequences is larger than 1 million, larger than 1 billion and larger than 1 trillion. This makes the grasping problem an NP-hard problem.

The next step would be to investigate how many action sequences are available within a duration of 1 minute, and to determine how many years a supercomputer would need to calculate them all; a rough estimate is sketched below. But this doesn't change the situation drastically. The more interesting question is how to solve NP-hard problems.
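
A rough back-of-the-envelope estimate in Python, using invented numbers (5 movements per joint, 6 joints, 60 seconds, an exascale machine); the only point is that the count explodes.

    actions_per_joint = 5          # assumed choices per joint per second
    joints = 6
    seconds = 60
    ops_per_second = 10 ** 18      # roughly an exascale supercomputer

    actions_per_step = actions_per_joint ** joints        # 15625
    sequences = actions_per_step ** seconds               # astronomically large
    years = sequences // ops_per_second // (3600 * 24 * 365)
    print(f"about 10^{len(str(sequences)) - 1} possible action sequences")
    print(f"about 10^{len(str(years)) - 1} years to enumerate them all")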

In computer science, solving a problem is done with an algorithm. An algorithm is a program which can solve a problem. The only way to solve NP-hard problems is to reformulate them. That means only non-NP-hard problems can be solved by an algorithm in a small amount of time on a desktop PC. And if the grasping problem with the robot and the ball is NP-hard, something about the problem formulation is wrong.

So the question is how to transform an NP-hard problem into a non-NP-hard problem. This is done in game design. Game design is not about solving problems; game design is about inventing games. The game has to be invented in such a way that it has a small state space. A typical sign of this precondition is that a scoring function is available and that the problem has a hierarchy. For example, a reformulation can be that in step 1 the robot has to approach the ball, and the measurement of whether this action was successful is the distance between the robot arm and the ball.
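
A minimal sketch of this hierarchical reformulation in Python: the grasping task is split into subgoals, and each subgoal has its own simple distance-based score. The coordinates and the bonus value are invented for illustration.

    import math

    def score(arm_pos, ball_pos, box_pos, holding_ball):
        if not holding_ball:
            # subgoal 1: approach the ball, closer is better
            return -math.dist(arm_pos, ball_pos)
        # subgoal 2: carry the ball to the box, with a bonus for holding it
        return 100 - math.dist(arm_pos, box_pos)

    print(score((0, 0, 0), (2, 1, 0), (5, 5, 0), False))   # about -2.24
    print(score((2, 1, 0), (2, 1, 0), (5, 5, 0), True))    # 95.0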

Unfortunately, it is very hard to define in general how to create games which fit a certain domain. The easiest way of doing so is to use existing games which were invented by someone else and investigate whether these games fit the domain. For example, the question is whether the ball-grasping problem can be modeled as a graph traversal problem. That means the robot and the ball are in a graph, and a quest has to be solved there. Another possible attempt to increase the abstraction is to describe problems as text adventures. Most text adventures have a surprisingly small state space. The reason is that in a text adventure the amount of possible actions in each state is low.

What we can say in general is that only if a problem has been reformulated as a polynomial problem can it be solved by algorithms and computers. In contrast, if the game design wasn't done this way, it makes no sense to think about automated AI players.

3 Practical demonstration

3.1 How to program a cost aware line following robot

In the past, line-following robots were programmed with a certain bias. The idea was that there is a piece of robot hardware which has to be programmed, and then a certain algorithm or piece of software gets implemented on the machine. Typical terms for describing such a project have to do with the onboard CPU, the programming language and the sensor readings. So in general it was a robot-focused paradigm.

This sort of self-understanding is not very efficient, because it ignores the environment of the robot. The better idea is to prefer a top-down approach in which the robot gets ignored and the programming is focused on the game itself. At first the robot is controlled manually, either with a remote control device or by manually moving the robot forward. Then a cost function is added. That means a cost is calculated during teleoperation: if the robot misses the white line on the ground, a penalty is given. This cost function is calculated by the onboard computer of the robot. So the robot doesn't act as an individual player in the game, but it monitors what it is doing, including cases in which its own performance is low. A possible cost function is sketched below.
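
A minimal sketch of such a cost function in Python. The sensor reading format (a boolean "on the line" flag plus a speed value) and the penalty value are assumptions made for this example.

    def step_cost(on_line, speed):
        if not on_line:
            return -10      # penalty for missing the white line
        return speed        # small reward for making progress on the line

    # toy teleoperation log: (on_line, speed) per time step
    log = [(True, 1.0), (True, 1.0), (False, 0.5), (True, 1.0)]
    total = sum(step_cost(on_line, speed) for on_line, speed in log)
    print(total)  # -7.0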

In the third and last step, the cost function is connected to a solver which determines the next action for the robot. But this step is not very important and can be postponed or skipped entirely.

The resulting project is very different from line-following robots in the past. The first thing is that the robot isn't working autonomously but is controlled by a human. And second, the robot doesn't follow the line; it mostly moves away from the line to check whether the internal cost function is able to detect the situation.

3.2 How to model a production line

In a previous paper it was explained in depth that top-down models are important in robotics. The model allows a domain to be grounded in a machine-readable optimization problem. The only thing left open is how exactly a model has to be created so that it fits a domain.

A possible example is the case of a production line. The idea is that the production line works with many conveyors at the same time, because every conveyor is automatically something which increases productivity. A robot is used to sort the incoming boxes on these conveyors. The figure shows the model realized in Petri net notation. The robot can grasp 2 objects at the same time in a box. Its task is to take an object from one of the left conveyors and place it on the correctly colored conveyor on the right.

Production line

On the left side, the objects arrive in random order, and the goal of the game is to sort them on the right side. At first glance this task doesn't look very complicated, because the workspace is structured very well. Nevertheless, there is a need to model the situation. Modeling means drawing the Petri net on the screen and making sure that the model can be simulated. That means it is a computer game in which the robot can make decisions and see what happens if it puts an item on the wrong target conveyor.

In the model, such a situation is handled with a negative reward. That means there is a counter which recognizes whether the robot is doing what it should, so the goal of the game is to maximize one's own score. It is important to know that the reward function is made explicit, that is, it is written into the game description. So the question is not what is going on inside the brain of the robot or whether it is intelligent enough to take the correct actions; the question is how the game works: is a conveyor moving, does the system recognize whether an item was placed into a position? A minimal version of such a reward counter is sketched below.
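
A minimal sketch of such an explicit reward counter in Python: placing an item on the conveyor of the matching color increases the score, a wrong placement decreases it. The function and color names are illustrative only.

    def place_item(score, item_color, conveyor_color):
        if item_color == conveyor_color:
            return score + 1     # correct placement
        return score - 1         # negative reward for the wrong conveyor

    score = 0
    score = place_item(score, "red", "red")    # +1
    score = place_item(score, "blue", "red")   # -1
    print(score)  # 0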

The reason why the color sorting problem needs to be modeled in an abstract way is that this allows it to be solved. Suppose the model has been programmed; then it is possible to use this model for optimization purposes. That means it can be determined what the optimal actions for the robot are. In such a case the reward function is known and the model is defined, so it is a simple numerical optimization problem. The solver has to make sure that the robot is able to win the game, and this will produce a behavior pattern which makes sense in the game. That means the color sorting problem can be solved the same way a chess engine works.

It should be mentioned that the order of the modeling steps is important. The first step is always to create the model. Only if the model is available and can be played by a human player is it possible to automate the system with a robot. It is not possible to first determine the optimal action for the robot and then investigate what the problem is about. That means bottom-up approaches which start at the robot don't work. The only sensible order is to solve such problems top down: first the game is defined, then the game is simulated, and then it can be solved on the low level by a robot.

With such a strict notion of top-down modeling, the definition of Artificial Intelligence becomes very strict. AI doesn't mean that the robot is able to understand something; roughly speaking, there is no robot but only a game model. That means it is not important what is inside the brain of the robot; the only thing that counts is which color a certain item has, what the conveyor is about, and so on. The problem description is very important, while the robot itself is only an add-on in the system.

Let me describe the situation from a more abstract perspective. In bottom-up robotics the focus is on the robot. The idea is that the robot analyzes the environment and executes certain actions. In top-down, model-based robotics the focus is on the game. The game consists of a map, shown in the figure; it is a logical structure and there is a goal. So the idea is to design either a board game or a video game and make sure that this game has an internal logic. The interesting situation is that the game's logic is located neither in a human player nor in the robot. The game's logic is defined top down. It is written in the game description, and it includes many things like a metal conveyor, items on the conveyor, a movable cart, possible actions, a reward function and so on.

Creating reward functions

The screenshot with the production line displays, on the one hand, a complex computer science problem, because the optimal actions for the robot are needed so that the color-sorting pick&place problem can be solved. In the concrete example the robot has loaded the red and the blue item. So it has to place the blue item first and then move upwards to place the red item. Then it can be investigated what to do next.

But even if the problem sounds very complicated, computer software is able to calculate the actions easily. The reason is that the problem's specification is made clear. That means it is not a challenge for a computer to figure out the action sequence; it is a well-defined situation. The only true challenge in robotics are ill-defined problems, that is, cases in which no model is available and the robot is supposed to figure out by itself what the game is about. In other words, if no problem is given it is a hard problem, and if a problem is defined it is an easy challenge. This paradoxical situation has to do with the inner workings of computers and how to program them. A computer is a great tool for adding two numbers or searching a graph for a value. Such a task can be realized in a programming language with a few lines of code. The only challenge for a computer is a situation which can't be solved with this principle. That means if the problem has nothing to do with adding two numbers, then it is a hard problem.

Unfortunately, most robotics problems are located in this region, and this makes it hard to analyze them. A possible attempt to describe the gap between the capabilities of machines and the real world is the symbol grounding problem. It means that the robot doesn't understand natural language instructions, and this produces a semantic misconception. Another, more detailed attempt to describe the symbol grounding problem works with textual goal functions. For example, the human operator says to the robot "pick up the red item from the conveyor". The robot can execute the action only if a certain world understanding is there. That means the robot needs a model in which the terms red object, conveyor and pickup are defined.

Let us take a closer look at the example with the production line to understand how to solve the symbol grounding problem. In the example domain a simplified model of a production line was given. This model was implemented as a computer program. The model provides basic commands like "moveto(), pickup()" to the outside world. These commands can be parametrized; for example, pickup(2) is translated to "pick up the red item".
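
A minimal sketch in Python of such a command interface. The mapping of the parameter 2 to the red item is taken from the text; the class name, the other color codes and the print messages are assumptions of this sketch.

    COLORS = {1: "blue", 2: "red", 3: "green"}   # assumed color codes

    class ProductionLineModel:
        def __init__(self):
            self.gripper = []

        def moveto(self, conveyor_id):
            print(f"moving the cart to conveyor {conveyor_id}")

        def pickup(self, color_id):
            color = COLORS[color_id]
            self.gripper.append(color)
            print(f"pick up the {color} item")

    model = ProductionLineModel()
    model.moveto(1)
    model.pickup(2)   # prints "pick up the red item"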


References

  1. Arenas, Matías, et al.: "A robot referee for robot soccer", Robot Soccer World Cup, Springer, Berlin, Heidelberg, 2008.
  2. Zhu, Danny, and Manuela Veloso: "Event-based automated refereeing for robot soccer", Autonomous Robots 41.7 (2017): 1463-1485.
