August 07, 2019

Creating a Task and motion planner


A so-called task and motion planner is very complicated to realize. By its own description, it is a mixture of a high-level text adventure and an underlying physics engine. The idea is that a solver determines in the text adventure which actions fulfill a goal, and the motion planner then converts these high-level tasks into concrete motions which are executed by the robot. The problem is implementing such an architecture in source code.
My project so far relies on the programming language Python. The easier part was creating the simulation itself, which was straightforward thanks to the libraries pygame, tkinter and Box2D. The resulting robot can be controlled with the keyboard by a human operator. The more complicated parts are the text adventure and the motion planner. The first idea was to use the STRIPS or Prolog syntax, which amounts to storing facts and rules. In the literature the concept is explained in detail, but in practice the resulting text adventure was hard to maintain. The problem was that the rules have access to all the facts and no modules are available.
The better idea is to realize the text adventure with object-oriented programming techniques. That means every item in the game, such as the robot, the box and the map, gets a separate class, and the methods of a class can only operate on its internal data structures. This time the source code was easier to read, because it follows a normal programming paradigm. Somebody who creates a standalone text adventure will almost certainly use an object-oriented language, not the STRIPS notation.
What is still open is combining all the modules into a runnable application. This makes it hard to predict whether the idea makes sense or not. Even though the example problem was a minimal one, the amount of source code needed is higher than usual. The concept of running two simulations in parallel in particular makes the code complicated. The problem is that the normal physics engine represents the game, while the text adventure calculates the same game in a different way.
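A minimal sketch of what such an object-oriented text adventure could look like is given below. The class names, attributes and method names (Box, Robot, pick_up, put_down and so on) are assumptions chosen for illustration; they are not the code of the actual project.

```python
class Box:
    """Symbolic box: knows only its own position and whether it is held."""

    def __init__(self, position):
        self.position = position
        self.held = False

    def pick_up(self):
        # A box that is already held cannot be picked up again.
        if self.held:
            return False
        self.held = True
        return True

    def put_down(self, position):
        self.held = False
        self.position = position


class Robot:
    """Symbolic robot: changes only its own state and talks to the box
    through the box's methods."""

    def __init__(self, position):
        self.position = position
        self.carrying = None

    def moveto(self, target):
        self.position = target

    def grasp(self, box):
        # Precondition: empty gripper and robot standing at the box.
        if self.carrying is None and box.position == self.position and box.pick_up():
            self.carrying = box
            return True
        return False

    def ungrasp(self):
        if self.carrying is None:
            return False
        self.carrying.put_down(self.position)
        self.carrying = None
        return True
```

Each precondition lives next to the data it checks, which is the modularity that a flat list of facts and rules does not offer.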
Is there a need to create the text adventure at all? The answer is yes, because without a text adventure the solver can't determine the next step. The precondition for searching a tree for a node is that a forward model is available which can produce the game tree. Let us go a step back and describe what a GOAP solver is doing. The idea is to try out some actions randomly in the model. A random generator executes an action and the result is stored in a graph. And exactly here is the problem: the action can only be executed inside a text adventure.
What happens if no text adventure is available? Then the solver has to send random actions to the normal physics engine. The problem with Box2D, ODE and Bullet is that their performance is low. They provide the future state of a system, but doing so needs a lot of CPU resources. It is not possible to plan longer sequences of around 1 minute with these engines. At 20 frames per second, 1 minute is equal to 60 seconds = 1200 frames. If 100 actions are calculated, the amount of CPU computation is enormous.
Perhaps the term “task and motion planning” provides the description itself. A task is a high-level action, for example “bring the box to the goal”, while a motion is a low-level action, e.g. “move 20 pixels forward”. The normal physics engine works on the motion level; it deals with a short time horizon of 1-2 seconds and detailed movements. In contrast, a task planner has to provide the long-term strategy, which includes the selection of waypoints and the definition of subgoals. On the task level a pick&place operation can be described in natural language:
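The size of the problem can be estimated with a few lines. The sketch assumes that “100 actions” means 100 candidate action sequences, each simulated over the full minute; this reading is an interpretation and is not stated explicitly in the text. The frame rate of 20 follows from 1200 frames per 60 seconds.

```python
FPS = 20          # implied by 1200 frames per 60 seconds
HORIZON_S = 60    # planning horizon of one minute

FRAMES_PER_ROLLOUT = FPS * HORIZON_S


def physics_steps(candidates):
    """Physics-engine steps needed to simulate every candidate once."""
    return candidates * FRAMES_PER_ROLLOUT


print(FRAMES_PER_ROLLOUT)   # 1200
print(physics_steps(100))   # 120000
```

Under this assumption a single planning cycle already costs 120000 engine steps, and each step includes collision detection and constraint solving.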
1. moveto object
2. grasp object
3. moveto goal
4. ungrasp object
This short plan doesn't provide any details. It's not possible to execute the plan directly on a physics engine. A physics engine needs a concrete command, for example “left(-20)”. That is the reason why task and motion planning are handled as different layers. The actions have to be planned on different levels of a hierarchy.
Practical example
For controlling a puck-collecting robot, the first thing to do is to create the motion planner. It works on a low level and acts on the underlying physics engine. The motion planner consists of two subfunctions, “reach angle” and “forward”. The first one controls the direction of the robot, while the second one produces the forward motion. The details of implementing the motion primitives are up to the programmer; in most cases a simple difference calculation is sufficient. After the source code is written, it's possible to send the following plan to the motion planner:
1. reachangle(45)
2. forward((100,200))
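The “simple difference calculation” behind the two primitives can be sketched like this. The example uses a plain kinematic robot instead of a Box2D body, and the class name, the step limits max_turn and max_step and the helper run are assumptions for illustration only.

```python
import math


class MotionPlanner:
    """Kinematic stand-in for the robot; heading is measured in degrees."""

    def __init__(self, x=0.0, y=0.0, heading=0.0):
        self.x, self.y, self.heading = x, y, heading

    def reachangle(self, target, max_turn=5.0):
        """Turn one bounded step toward target; True once it is reached."""
        # Signed difference in the range [-180, 180).
        diff = (target - self.heading + 180.0) % 360.0 - 180.0
        turn = max(-max_turn, min(max_turn, diff))
        self.heading = (self.heading + turn) % 360.0
        return abs(diff) <= max_turn

    def forward(self, goal, max_step=2.0):
        """Drive one bounded step along the heading; True once goal is reached."""
        dist = math.hypot(goal[0] - self.x, goal[1] - self.y)
        step = min(max_step, dist)
        rad = math.radians(self.heading)
        self.x += step * math.cos(rad)
        self.y += step * math.sin(rad)
        return dist <= max_step


def run(primitive, *args, limit=10000):
    """Call a primitive once per frame until it reports success."""
    for _ in range(limit):
        if primitive(*args):
            return True
    return False
```

In this sketch forward only drives along the current heading, so the robot has to be turned toward the goal point with reachangle first. The limit in run keeps a badly aimed robot from looping forever.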
The interaction with the robot works through these motion primitives. They provide an interface for controlling the robot's movements. It's not possible to control complicated tasks with these primitives, only problems with a short horizon. For longer plans a task planner is required. The task planner is equivalent to a text adventure and also provides some primitives. The task primitives are:
1. moveto(goal)
2. graspbox
3. ungraspbox
The task planner is not allowed to send commands directly to the robot; instead it sends commands to the motion planner. That means a high-level task like moveto() is decomposed into motion primitives like reachangle and forward.
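A minimal sketch of this decomposition is shown below. The function name decompose, the places dictionary and the tuple format of the commands are assumptions. How graspbox and ungraspbox map to motions is not described here, so the sketch simply passes them through unchanged as hypothetical gripper commands.

```python
import math


def decompose(task, robot_xy, places):
    """Translate one task primitive into a list of motion primitives."""
    name, *args = task
    if name == "moveto":
        gx, gy = places[args[0]]
        # Difference calculation: bearing from the robot to the target.
        angle = math.degrees(math.atan2(gy - robot_xy[1], gx - robot_xy[0]))
        return [("reachangle", angle), ("forward", (gx, gy))]
    if name in ("graspbox", "ungraspbox"):
        # Assumption: the gripper accepts these commands directly.
        return [(name,)]
    raise ValueError(f"unknown task: {name}")


places = {"box": (100.0, 100.0), "goal": (300.0, 50.0)}  # hypothetical map
motions = decompose(("moveto", "box"), (0.0, 0.0), places)
```

Here motions contains a reachangle command of about 45 degrees followed by forward((100.0, 100.0)), the same shape as the motion plan shown earlier.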
Avoiding the task planner?
If motion primitives are able to control the robot, and it's possible to write a longer program which contains a sequence of motion primitives, why is there a need for a high-level task planner? Suppose the plan for the robot is to drive to the box, grasp the object, move to the goal and place the object at that position. All the motion primitives are executed in a linear fashion, and then an interruption takes place: the robot loses the box in transit. The motion planner itself doesn't recognize the problem; only the higher instance will detect the issue.
A motion sequence should be tolerant of interruptions, and the task planner has to figure out the new motion sequence.
Semi autonomous control
Unfortunately, the number of frameworks and algorithms for implementing a task and motion planner is low. Creating such software is more an art than an engineering discipline. A good starting point is to focus on manual control. If the robot is controlled manually, it is certain that the task gets fulfilled. A planner should be understood as optional. The idea is to start with a teleoperated robot and slowly improve the system into an autonomous one. From the programmer's perspective the question is how to improve the control of the robot in such a way that the workload for the human gets lower.
A typical example of this transition is replacing keyboard control with mouse control. A normal robot arm, for example in an excavator, is controlled by different sliders. With slider1 the operator controls motor1, with slider2 motor2, and so on. The first step is to write software which takes the mouse position as input and calculates the servo signals as the result. In the literature the concept is known as inverse kinematics, and it helps a lot to reduce the workload. Inverse kinematics doesn't mean that the robot works autonomously; it means that the human operator points with the mouse at a target and the robot arm reaches that point.
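For a planar arm with two links the calculation from mouse position to joint angles has a closed form based on the law of cosines. The sketch below assumes such a two-link arm with its base at the origin and link lengths of 100 pixels each; these numbers and the function names are illustrative and not taken from a real excavator.

```python
import math


def inverse_kinematics(x, y, l1=100.0, l2=100.0):
    """Joint angles (shoulder, elbow) in radians that place the tip at (x, y).

    Returns None if the point is outside the reachable area.
    """
    d2 = x * x + y * y
    # Law of cosines gives the cosine of the elbow angle.
    c2 = (d2 - l1 * l1 - l2 * l2) / (2.0 * l1 * l2)
    if not -1.0 <= c2 <= 1.0:
        return None
    elbow = math.acos(c2)
    shoulder = math.atan2(y, x) - math.atan2(
        l2 * math.sin(elbow), l1 + l2 * math.cos(elbow)
    )
    return shoulder, elbow


def forward_kinematics(shoulder, elbow, l1=100.0, l2=100.0):
    """Tip position for given joint angles, used here to check the result."""
    x = l1 * math.cos(shoulder) + l2 * math.cos(shoulder + elbow)
    y = l1 * math.sin(shoulder) + l2 * math.sin(shoulder + elbow)
    return x, y


mouse = (120.0, 80.0)  # hypothetical mouse position in arm coordinates
angles = inverse_kinematics(*mouse)
```

The two returned angles replace slider1 and slider2. The operator only chooses the target point, and a point that is out of reach is reported as None instead of producing a servo signal.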