June 20, 2019

Stages towards autonomous robots


The steps to Artificial Intelligence are:
1. activity recognition of a human user
2. teleoperation by a human user
3. fully autonomous robot
The most interesting transition is from step 2 to step 3. A teleoperated robot is something that is available today and is not very complicated to build. All that is needed is a robot arm, a joystick and a human user. In the case of cranes, such systems are in practical use today. In contrast, fully autonomous robots which execute a task without a human in the loop are much harder to realize, and most projects fail. The problem is not only how to build a robot; the more detailed problem is how to improve a teleoperated system into a fully autonomous one. One promising idea in that direction is to identify the intermediate steps which are needed:
Step 2a: teleoperation with a time-delay
Step 2b: predictive teleoperation
Predictive teleoperation means that the operator sees the effect of an action before it is executed. The outcome is made visible as an info box which is presented as an overlay on the normal display. Perhaps there are more intermediate steps available, and it makes sense to focus on the missing steps, because this will answer the question of how to program a robot.
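To make the idea concrete, here is a minimal sketch of predictive teleoperation in Python. The forward model, the overlay text and all names are illustrative assumptions, not a real robot API; the point is only that the predicted outcome is shown before the command is committed.

# Minimal sketch of predictive teleoperation. The forward model and
# all names are illustrative assumptions, not a real robot API.

def predict_outcome(state, command, dt=1.0):
    """Toy forward model: integrate a velocity command over dt seconds."""
    x, y = state
    vx, vy = command
    return (x + vx * dt, y + vy * dt)

def teleoperation_step(state, command, confirm):
    """Show the predicted effect as an overlay before executing."""
    predicted = predict_outcome(state, command)
    print("overlay: the gripper would move to", predicted)
    if confirm:           # the operator accepts the previewed action
        state = predicted
    return state

state = (0.0, 0.0)
state = teleoperation_step(state, (0.5, -0.2), confirm=True)
print(state)  # (0.5, -0.2)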
The transition from teleoperation towards a fully autonomous system can be seen as similar to a bridge-building project. On the left side stands today's technology, which works but needs a lot of human intervention. On the right side stand future autonomous robots, a vision that is not available today. The question is how to overcome the gap in between.
Let us start the process slowly by analyzing teleoperation in detail. The interesting point is that with a dataglove in combination with a robot arm, very complicated tasks can be done. The human operator wears the dataglove and transmits his actions to the robot arm. Such a system is called a human-machine interface, and it works with human-level intelligence. That means the operator is in control of the system, and the robot arm can do the same as what the human can do. The only bottleneck is that such a system is more expensive than a human hand, and the control is slow. This makes it useful only for crane applications, in which the human cannot move the object himself but needs the machine. In the case of simple pick&place tasks on the assembly line, a teleoperated robot arm is not useful, because such a system would be much slower than a normal human employee.
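As a toy illustration of such a human-machine interface, the sketch below maps dataglove sensor readings directly to robot arm commands. The sensor names and joint names are invented for the example; a real dataglove driver would provide its own identifiers.

# Toy human-machine interface: dataglove sensor values are mapped
# one-to-one to robot arm commands. Sensor and joint names are invented.

GLOVE_TO_ARM = {
    "thumb_flex": "gripper",
    "wrist_pitch": "joint4",
    "wrist_yaw": "joint5",
}

def map_glove_to_arm(glove_reading):
    """Translate raw glove sensor values into arm joint targets."""
    return {GLOVE_TO_ARM[sensor]: value
            for sensor, value in glove_reading.items()
            if sensor in GLOVE_TO_ARM}

reading = {"thumb_flex": 0.8, "wrist_pitch": -0.1, "wrist_yaw": 0.3}
print(map_glove_to_arm(reading))
# {'gripper': 0.8, 'joint4': -0.1, 'joint5': 0.3}

The mapping itself contains no intelligence; it only forwards the human's decisions, which is exactly why the system works at human level.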
The open question is how to improve a teleoperated robot arm so that the interaction with the human becomes faster and more productive. Let us analyze how Artificial Intelligence is realized today. In most cases a solver is used, which scores alternatives and selects the next action. If a solver is available, the system runs autonomously. And exactly here is the problem: the teleoperated robot arm doesn't have a solver, and the constraints are unknown. It is not clear what will happen if the gripper opens its fingers. Without a solver it is not possible to plan longer sequences, which means that the AI isn't working.
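What such a solver does can be shown in a few lines: score every candidate action and pick the best one. The scoring function below is a toy heuristic invented for the example; in a real system the score would come from a model of the task and its constraints.

# Toy solver: score every candidate action, select the best one.
# The scoring function is a stand-in for a real task model.

def solver_step(state, actions, score):
    """Greedy one-step solver: return the highest-scoring action."""
    return max(actions, key=lambda a: score(state, a))

def score(state, action):
    # Toy heuristic: prefer actions that move the state towards a goal.
    goal = 10
    return -abs(goal - (state + action))

print(solver_step(4, [-1, 0, 1, 2], score))  # 2 brings the state closest to 10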
Here is a short summary. The steps towards a fully autonomous robot can be described as a transition process. The solver is the critical component:
1. activity recognition of a human user
2. teleoperation by a human user
2a: teleoperation with a time-delay
2b: predictive teleoperation
2c: solver for teleoperated robot
3. fully autonomous robot
What we can say is that step 2 (teleoperation by a human user) is available, and step 3 (fully autonomous robot) is the goal. The steps in between are not clearly defined, but they are needed to reach the goal.
Model-based teleoperation is one way of realizing a solver.[1] The term "model-based" refers to a formalized space in which actions take place. Teleoperation is possible without a model; this is the normal case, in which the signals are transmitted from the joystick to the robot arm. But teleoperation is also possible with a model in the loop which does some checks on the scene. In most cases, a model is equal to a physics engine. By definition, a physics engine is a model for predicting Newton's laws of motion. Real-time physics engines have the problem that they can only predict the near future (the next 1-2 seconds) but cannot predict longer horizons. To overcome this problem, the appropriate model for a teleoperated robot is a mixture of a physics engine plus a PDDL-like task model.
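The following sketch shows what such a combined model could look like: a physics check for the short horizon plus a PDDL-style precondition check for the symbolic task level. Both checks are toy stand-ins under assumed state and action encodings, not a real physics engine or PDDL planner.

# Toy hybrid model: a physics check for the next seconds plus a
# PDDL-style precondition check. State and action encodings are assumed.

def physics_ok(state, action, horizon=2.0):
    """Short-horizon check: the gripper must stay inside the workspace."""
    return abs(state["gripper_x"] + action["dx"] * horizon) < 1.0

def task_ok(state, action):
    """Symbolic precondition, PDDL style: 'grasp' needs an empty hand."""
    if action["name"] == "grasp":
        return state["hand_empty"]
    return True

def model_check(state, action):
    """An action is allowed only if both model layers accept it."""
    return physics_ok(state, action) and task_ok(state, action)

state = {"gripper_x": 0.2, "hand_empty": True}
print(model_check(state, {"name": "grasp", "dx": 0.1}))  # True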
From the perspective of the user, such a system will look like an "intelligent tutoring system". It simulates a task, but also provides help for the user. Let me give an example. In the following picture a simplified game of Lemmings is shown; the task is to bring all the lemmings to the goal at the bottom left. To make the game easier to play, the human user needs advice on what to do in this level. A possible way of doing so is to sketch the overall route of the swarm.
[Figure: a simplified Lemmings level; the goal is at the bottom left]
This can be done with some arrows which describe the trajectory through the level. The arrows make clear what the walk-through for the level is. With the overlaid path it becomes much easier to attach actions to the lemmings. The solver only has to find the sequence of actions which keeps the lemmings on the desired route. That means the overall game was divided into two layers: a high-level layer which is equal to a trajectory, and a low-level layer which fulfills the given path.
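Under these assumptions, the two layers can be sketched as follows: the high-level layer is a hand-drawn route of waypoints, and the low-level layer picks, per step, a Lemmings-style action that approaches the next waypoint. The grid coordinates, action names and waypoints are all invented for the example.

# Toy two-layer controller for the Lemmings example. The high-level
# layer is a sketched route of waypoints; the low-level layer picks
# an action that approaches the next waypoint. All names and
# coordinates are invented; y grows downwards as on a screen.

WAYPOINTS = [(0, 0), (4, 0), (4, 3), (8, 3)]  # sketched route to the goal

def low_level_action(pos, waypoint):
    """Choose the move that reduces the distance to the waypoint."""
    x, y = pos
    wx, wy = waypoint
    if x != wx:
        return "walk_right" if wx > x else "walk_left"
    if y != wy:
        return "dig_down" if wy > y else "climb_up"
    return "reached"

print(low_level_action((0, 0), WAYPOINTS[1]))  # 'walk_right' towards (4, 0)
print(low_level_action((4, 0), WAYPOINTS[2]))  # 'dig_down' towards (4, 3)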
Ironically, this concept will produce lots of new problems. The first one is what the trajectory looks like for a certain level, and the second problem is how to force the lemmings to follow the path. If the answers are known, the game AI can solve the level on its own.

[1] Passenberg, Carolina, Angelika Peer, and Martin Buss. "Model-mediated teleoperation for multi-operator multi-robot systems." 2010 IEEE/RSJ International Conference on Intelligent Robots and Systems. IEEE, 2010.