December 10, 2019

System identification and automatic game playing

Most examples of Narrow Artificial Intelligence are about game playing. The domain is given in advance, and the AI software has to bring the model into the goal state. Typical examples are the TicTacToe game, the micromouse challenge or a line-following robot. All these problems have in common that the domain is defined precisely and the AI only has to solve the game.

In the case of the micromouse challenge, the problem is defined by the maze, the robot hardware and the rule that the robot has to travel through the maze. What the engineers do is solve this challenge: they combine path planning, motion planning and vision algorithms which allow the robot to drive autonomously through the maze.
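To make the path-planning part less abstract, here is a minimal sketch (my own Python, not the software of any actual micromouse team) of a breadth-first-search planner on a grid maze; the maze layout and the function name are purely illustrative.

```python
from collections import deque

def bfs_path(maze, start, goal):
    """Breadth-first search on a grid maze: 0 = free cell, 1 = wall.
    Returns a list of (row, col) cells from start to goal, or None."""
    rows, cols = len(maze), len(maze[0])
    queue = deque([start])
    came_from = {start: None}
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # reconstruct the path by walking the predecessor links backwards
            path = []
            while cell is not None:
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        r, c = cell
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if 0 <= nr < rows and 0 <= nc < cols and maze[nr][nc] == 0 \
                    and (nr, nc) not in came_from:
                came_from[(nr, nc)] = cell
                queue.append((nr, nc))
    return None

# toy 4x4 maze: the planner finds a route from the top-left to the bottom-right
maze = [[0, 0, 1, 0],
        [1, 0, 1, 0],
        [0, 0, 0, 0],
        [0, 1, 1, 0]]
print(bfs_path(maze, (0, 0), (3, 3)))
```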

The surprising fact is that all of these robots are useless for practical applications. It's not possible to use the micromouse robot for a different kind of task. This is a tragedy in the case of very complex robots. The paradoxical situation is that on the one hand the robotic system is highly developed, but from a practical point of view the robot can do nothing.

To overcome this bottleneck, a preliminary step has to be introduced before the robotic software is programmed. This step is called system identification. System identification tries to describe the domain. That means the rules of the micromouse challenge are not given in advance; the engineers first have to describe what the task for the robot is.

Or let me give a simpler example. A normal AI project is about programming software which can solve TicTacToe. The more elaborate AI project is one in which it's unclear which kind of game the software should play; the rules are not provided in advance. This is the case for real robotic applications, especially if the aim is to replace human work with robots. The problem is that it's unclear what exactly a human worker is doing. He is trying to solve a problem, but the domain is not described in advance.

System identification amounts to building a forward model, which is the same thing as a simulation or a physics engine. A physics engine is not the AI itself; it's the environment in which an AI controller gets activated. In the Artificial Intelligence community a physics engine is sometimes introduced as a reward structure, or learned through inverse reinforcement learning. The idea is that before a game can be solved, it's necessary to figure out what the game is about. This preliminary step, before a solver can bring the system into a goal state, is harder to solve than normal AI tasks. In most AI domains there is no need for this step because the model is trivial to check: a TicTacToe simulator provides an accurate simulation of the real TicTacToe game, and all the rules are known in advance.
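As a minimal illustration of what such a forward model looks like as code, the following Python sketch (the class names and the toy counter domain are my own invention, not a standard API) shows the what-if interface: the model maps a state and an action to a next state and a reward, and nothing more.

```python
class ForwardModel:
    """A forward model / physics engine: it only answers 'what happens if';
    it does not choose actions itself. Hypothetical interface for illustration."""

    def step(self, state, action):
        """Return (next_state, reward) for taking `action` in `state`."""
        raise NotImplementedError

class CounterGame(ForwardModel):
    """Toy domain: the state is an integer, actions add or subtract one,
    and the goal state is 10."""

    def step(self, state, action):
        next_state = state + (1 if action == "inc" else -1)
        reward = 1.0 if next_state == 10 else 0.0
        return next_state, reward

model = CounterGame()
print(model.step(9, "inc"))   # what-if query: (10, 1.0) -- goal reached
print(model.step(9, "dec"))   # what-if query: (8, 0.0)
```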

This kind of situation is missing for robotics applications. The typical robot project works without a domain model. That means it's unclear whether an action of the robot produced a positive or a negative reward.

Model predictive control

The rules of TicTacToe are well known: two players place pieces on the board, and the problem is to find the best move. But is this really the problem? No, because the optimal move can be determined by a game-tree search. The more interesting question is what the TicTacToe game is about. If a player puts a piece on the board, he brings the game into a new state. The action has an effect, and the number of possible actions for the next player is smaller. The game engine formalizes the rules: it determines under which conditions somebody has won and which actions are allowed.

The interesting problem in TicTacToe is not solving the game itself, but how to transfer the game rules into a game engine. As long as this question is unanswered, it's not possible to automate the game. In the case of TicTacToe the game engine is trivial to program; in most cases it takes less than 100 lines of code, as the sketch below suggests. In other domains, like micromouse or robotic grasping, the game rules are more complicated. In most projects, AI programmers ignore the task of formalizing the game rules. They assume that programming an AI is the same as solving a domain.
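A rough sketch of what such a sub-100-line game engine could look like is the following; representing the board as a flat list of nine cells is an assumption made for illustration.

```python
WIN_LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),   # rows
             (0, 3, 6), (1, 4, 7), (2, 5, 8),   # columns
             (0, 4, 8), (2, 4, 6)]              # diagonals

def legal_moves(board):
    """Return the indices of all empty cells."""
    return [i for i, cell in enumerate(board) if cell == " "]

def apply_move(board, move, player):
    """Return a new board with `player` ('X' or 'O') placed at `move`."""
    assert board[move] == " ", "cell is already occupied"
    new_board = list(board)
    new_board[move] = player
    return new_board

def winner(board):
    """Return 'X' or 'O' if one of them owns a full line, else None."""
    for a, b, c in WIN_LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

# the engine only formalizes the rules; it does not play the game itself
board = [" "] * 9
board = apply_move(board, 4, "X")
print(legal_moves(board))   # eight remaining cells
print(winner(board))        # None, nobody has won yet
```

A solver such as a game-tree search would be built on top of exactly these three functions; the engine itself never decides anything.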

An often cited example of model predictive control is steering a vehicle. The project consists of two parts: system identification and solving the model. In the system identification task, it is defined what will happen if the car steers to the left. This allows future game states to be predicted; it amounts to inventing a game whose rules describe a car that can steer to the left and to the right. Solving this game is done in the second part, the solver. Here the question is which action is needed next to win the game. Solving a game only makes sense if the game rules are available. That means a random action can be sent to the forward simulation, and the model provides the what-if feedback back to the controller.
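The following sketch shows this two-part structure in toy form; the kinematic forward model and the random-shooting solver are assumptions chosen for illustration, not a validated vehicle model or a production MPC implementation.

```python
import math, random

def forward_model(state, steer, dt=0.1, speed=1.0):
    """Part 1, system identification (assumed kinematics): predict the next
    (x, y, heading) if the car steers by `steer` radians per second."""
    x, y, heading = state
    heading += steer * dt
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return (x, y, heading)

def solve(state, target, horizon=20, candidates=200):
    """Part 2, the solver: sample random steering sequences, simulate each one
    with the forward model, and return the first action of the best sequence."""
    best_action, best_dist = 0.0, float("inf")
    for _ in range(candidates):
        seq = [random.uniform(-1.0, 1.0) for _ in range(horizon)]
        s = state
        for steer in seq:
            s = forward_model(s, steer)
        dist = math.hypot(s[0] - target[0], s[1] - target[1])
        if dist < best_dist:
            best_dist, best_action = dist, seq[0]
    return best_action

state, target = (0.0, 0.0, 0.0), (1.0, 1.0)
print(solve(state, target))   # steering command for the next time step
```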

It's important to know that a forward model can be controlled by a human operator. A forward model means only that the game rules are formalized in a simulation and it's possible to play with this simulation. Playing means that the human operator can try out different actions and observe the reaction of the system. For this reason, the system identification step is often ignored by the AI community: it has nothing to do with automatic game playing itself, but it's the preliminary step towards this goal.
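Playing with the forward model by hand can be as simple as the following loop, which reuses the same kind of assumed car kinematics as above: the operator types a steering command, and the simulation reports the resulting state; no controller is involved.

```python
import math

def forward_model(state, steer, dt=0.1, speed=1.0):
    """Assumed car kinematics, as in the earlier sketch."""
    x, y, heading = state
    heading += steer * dt
    x += speed * math.cos(heading) * dt
    y += speed * math.sin(heading) * dt
    return (x, y, heading)

# the human operator plays the game: type an action, observe the reaction
state = (0.0, 0.0, 0.0)
while True:
    cmd = input("steering in radians/s (or 'quit'): ")
    if cmd == "quit":
        break
    state = forward_model(state, float(cmd))
    print("car is now at x=%.2f y=%.2f heading=%.2f" % state)
```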