August 04, 2019

Pros and cons of the Shakey the robot project


A while ago a paper was published which revisits the Shakey the robot project and explains the advantages and disadvantages of the STRIPS planning system.[1] On page 1 it also mentions that Rodney Brooks wrote an anti-Shakey paper in which he argues that formalized planning is a dead end. It's important to focus first on the idea of a logical model of the environment. The Shakey robot has a preprogrammed environment model in which its own position, the allowed actions and other objects are formalized in the situation calculus. This model allows Shakey to plan from the current situation to any future goal state.
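To make the idea concrete, here is a minimal sketch of what such a formalized model can look like, written in Python rather than in Shakey's original representation. The predicates and the move action are invented for illustration; they are not Shakey's actual vocabulary.

```python
# A state is a set of ground facts; a STRIPS action lists
# preconditions, an add list and a delete list.
state = {("at", "robot", "room1"), ("at", "box", "room2")}

# One action, instantiated by hand for brevity (illustrative names).
move_room1_room2 = {
    "name": "move(room1, room2)",
    "pre":  {("at", "robot", "room1")},
    "add":  {("at", "robot", "room2")},
    "del":  {("at", "robot", "room1")},
}

def apply_action(state, action):
    """Apply a STRIPS action if its preconditions hold."""
    if not action["pre"] <= state:
        raise ValueError("preconditions not satisfied")
    return (state - action["del"]) | action["add"]

print(apply_action(state, move_room1_room2))
# -> the robot is now at room2, the box is unchanged
```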
The disadvantage of the concept, and the reason why Brooks wrote a STRIPS critique, is that such a logical model is hard to program and it doesn't fit the environment. In practice it's not possible to reuse an existing STRIPS model. The Shakey description can't be reused on a different robot, for example a modern Lego Mindstorms system. Instead the STRIPS model has to be programmed again from scratch, which takes a large amount of time.
To understand the problem we have to take a step back. The normal interaction between a robot and a human operator is teleoperation: the human moves the robot with a joystick. If the robot is supposed to drive autonomously, it needs a logical model. The basic question is how to get from a teleoperated robot to an autonomous robot system. The answer is plan recognition and learning from demonstration. This is the step in between: the human's interaction with the robot is tracked and converted into a model.
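As a rough sketch, the tracking could be as simple as logging timestamped joystick commands together with sensor readings. The Trace class and its fields below are assumptions made for illustration, not an existing API.

```python
from dataclasses import dataclass, field

@dataclass
class Trace:
    """Log of a teleoperation session; the raw material for plan recognition."""
    events: list = field(default_factory=list)

    def log(self, timestamp, command, sensor_reading):
        # One event: when, what the operator did, what the robot sensed.
        self.events.append((timestamp, command, sensor_reading))

trace = Trace()
trace.log(0.0, "forward", {"sonar": 2.4})
trace.log(0.5, "forward", {"sonar": 1.1})
trace.log(1.0, "stop",    {"sonar": 0.2})
```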
Plan recognition amounts to model tracking. The idea is not that Shakey should plan the next actions, but to check whether the logical model of the environment is right. STRIPS and Shakey go in the right direction; what is missing is the ability to analyze human interaction with a teleoperated robot.
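A minimal sketch of this model-tracking idea, assuming the model predicts a set of facts and the sensors report one, is simply to compare prediction against observation:

```python
def model_is_consistent(predicted_facts, observed_facts):
    """Return True if the logical model's prediction matches reality."""
    return predicted_facts == observed_facts

predicted = {("at", "robot", "room2")}   # what the model says
observed  = {("at", "robot", "room1")}   # what the sensors say
print(model_is_consistent(predicted, observed))  # False -> model is wrong
```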
In the literature the idea of plan recognition is a comparatively new development, because it's hard to explain why the technique is needed. From a practical standpoint it means controlling a robot with a joystick while the software recognizes the actions. For example, the human operator lets Shakey collide with an obstacle, and the screen shows “collision detected”. Because the human operator already knows this in advance, such a message seems unnecessary. But without working plan recognition it's not possible to verify or build a logical representation of the environment. That's the reason why most STRIPS-based projects have failed.
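A sketch of the collision example: a recognizer scans the recorded teleoperation trace and annotates it with symbolic events. The sonar field and the threshold value are invented for illustration.

```python
def recognize_events(trace_events, collision_threshold=0.3):
    """Annotate a teleoperation trace with symbolic events."""
    annotations = []
    for timestamp, command, sensors in trace_events:
        # Very close obstacle while driving -> label it as a collision.
        if sensors["sonar"] < collision_threshold:
            annotations.append((timestamp, "collision detected"))
    return annotations

events = [(0.0, "forward", {"sonar": 2.4}),
          (0.5, "forward", {"sonar": 1.1}),
          (1.0, "forward", {"sonar": 0.2})]
print(recognize_events(events))  # [(1.0, 'collision detected')]
```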
Shakey the robot and STRIPS work great if the logical representation is there. The planner can take the model and plan the next steps to reach a goal. It's not very complicated to write such a planner, and it will run at maximum performance. The bottleneck appears when the logical model isn't correct or no such model is available. In that case, the robot won't take any action.
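To show how simple the planner itself is, here is a sketch of a naive breadth-first STRIPS planner over the representation from the first example. Real planners add heuristics, but the point stands: once a correct model exists, the search is the easy part.

```python
from collections import deque

def plan(start, goal, actions):
    """Breadth-first search from start to any state containing the goal facts."""
    frontier = deque([(frozenset(start), [])])
    visited = {frozenset(start)}
    while frontier:
        state, path = frontier.popleft()
        if goal <= state:
            return path
        for a in actions:
            if a["pre"] <= state:  # action is applicable
                nxt = frozenset((state - a["del"]) | a["add"])
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, path + [a["name"]]))
    return None  # no plan exists under this model

# Illustrative two-room example (invented predicates).
actions = [
    {"name": "move(r1,r2)", "pre": {("at", "r1")},
     "add": {("at", "r2")}, "del": {("at", "r1")}},
    {"name": "move(r2,r3)", "pre": {("at", "r2")},
     "add": {("at", "r3")}, "del": {("at", "r2")}},
]
print(plan({("at", "r1")}, {("at", "r3")}, actions))
# ['move(r1,r2)', 'move(r2,r3)']
```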
Plan recognition amounts to human-machine communication.[2] The robot and the human operator speak the same language. The problem is not how the Shakey software works internally; the question is whether Shakey is able to understand the teleoperator.
The plan recognition problem is a relatively new development which was analyzed after Shakey was built:
“Schmidt, Sridharan and Goodson [1978, 1976] are the first to identify plan recognition as a problem in its own right” [3]
In contrast to robot control, plan recognition doesn't result in a working system. Instead the idea is to annotate the movements of a teleoperated robot. Somebody may argue that it has nothing to do with Artificial Intelligence because the robot is controlled by a human operator. Additionally, the detected events and activities are grounded in natural language and psychology, which lie outside of computer science.
Debugging
Plan recognition can be seen as a model debugger. It is only successful if the plan library contains predefined actions which are able to detect events in the environment.[4] This allows the programmer to implement and test new plan libraries, similar to writing computer code. He types in an action and uses the plan recognizer to verify whether the action makes sense.
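A sketch of this debugging loop, assuming a plan library is a set of named predicates over recorded events (the format is invented for illustration): an entry that never fires on real traces is suspicious, just like dead code.

```python
plan_library = {
    "approach_wall": lambda ev: ev[1] == "forward" and ev[2]["sonar"] < 1.5,
    "idle":          lambda ev: ev[1] == "stop",
    "turn_left":     lambda ev: ev[1] == "turn_left",  # never observed
}

def test_library(library, trace_events):
    """Report which library entries match at least one recorded event."""
    return {name: any(pred(ev) for ev in trace_events)
            for name, pred in library.items()}

events = [(0.5, "forward", {"sonar": 1.1}), (1.0, "stop", {"sonar": 0.2})]
print(test_library(plan_library, events))
# {'approach_wall': True, 'idle': True, 'turn_left': False}
```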
Plan corpus
To simplify the process of plan recognition it's useful to build a plan corpus. That is a large plan library which contains action primitives and events for detecting and annotating raw data. It's not possible to generate a plan corpus automatically; it's a manual task, similar to creating an English dictionary. A plan library is usually created by asking human participants to do a task, for example to walk on a line. A motion capture suit records all the information, which is then annotated manually. On top of the recorded trajectory a parser is programmed. The overall plan library project has to be provided as an Open Science project on the internet, which allows other researchers to participate.
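One possible shape of a single corpus entry, with invented field names, could look like this; the parser is then written against the annotation format.

```python
corpus_entry = {
    "task": "walk on a line",
    "frames": [  # (time, x, y) from the motion capture suit
        (0.0, 0.00, 0.0),
        (0.5, 0.25, 0.0),
        (1.0, 0.50, 0.0),
    ],
    "annotations": [  # added by hand, like dictionary entries
        (0.0, 1.0, "walking_straight"),  # (start, end, label)
    ],
}

def segments(entry, label):
    """Parser over the annotations: return the frames inside a labeled span."""
    spans = [(s, e) for s, e, l in entry["annotations"] if l == label]
    return [f for f in entry["frames"]
            if any(s <= f[0] <= e for s, e in spans)]

print(segments(corpus_entry, "walking_straight"))
```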
Examples of such corpora from the past are the HASC corpus, USC-HAD, HuGaDB[5], PRAXICON and other datasets for activity recognition. Most of these projects were realized in the last 10 years.
[1] Shanahan, Murray. "Reinventing shakey." Logic-based artificial intelligence. Springer, Boston, MA, 2000. 233-253.
[2] Pollack, Martha E. "The uses of plans." Artificial Intelligence 57.1 (1992): 43-68.
[3] Mao, Wenji, and Jonathan Gratch. Decision-theoretic approach to plan recognition. ICT Technical Report ICT-TR-01-2004, 2004.
[4] Goultiaeva, Alexandra, and Yves Lespérance. "Incremental plan recognition in an agent programming framework." Working Notes of the AAAI Workshop on Plan, Activity, and Intention Recognition (PAIR). 2007.
[5] Chereshnev, Roman, and Attila Kertész-Farkas. "Hugadb: Human gait database for activity recognition from wearable inertial sensor networks." International Conference on Analysis of Images, Social Networks and Texts. Springer, Cham, 2017.