December 18, 2025

From robotics algorithms to grounded language

From 1960 to 2010, Artificial Intelligence operated under a bias that prevented the development of powerful robots: the robot was treated as a closed system driven by numerical algorithms. A robot was understood as a kind of computer, or Turing machine, that executes a program step by step. The program controls the robot and thereby limits its ability to solve problems in the real world. Until 2010 it was unclear how to use a programming language like C/C++ for motion planning in robotics. Attempts such as the well-known RRT (Rapidly-exploring Random Tree) motion planner produced a high CPU load but struggled in real environments.

Until 2010, computers were limited to classical computing tasks rooted in mathematics. They could analyze statistical data or draw graphics on the screen. These classical tasks were implemented in software packages and operating systems. None of these libraries was able to control robots with artificial intelligence, and this slowed down the development of robotics.

The paradigm shift towards intelligent robots was initiated by natural language as a communication interface. The principle is not completely new: it has been available since the 1980s in adventure games such as Zork and Maniac Mansion. In both cases the human player communicates with the computer through a textual interface, typed in Zork and selected from an on-screen verb menu in Maniac Mansion. For example, the player can issue the command "move to door, open the door" and the avatar in the game will follow it.

What was unknown until 2010 was that such a text-based user interface works well for robot control. Advanced robotics can be imagined as a kind of point-and-click adventure that displays a vocabulary on the screen for activating commands. This point-and-click interface ensures that human-to-machine communication works smoothly: the robot follows the human's instructions.

Under this constraint, motion planning is no longer an obstacle in robotics, because the robot doesn't need to solve an optimization problem; it only has to listen to the human operator. The hard problems are delegated to the human, who is located outside the robot, and this makes it an open system. Open system means that natural language is fed into the robot and sent back from the robot to the operator.
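The open-system loop described above can be sketched in a few lines of Python. This is only a minimal illustration, not an actual robot controller: the `TextRobot` class, its small vocabulary, and the feedback strings are hypothetical names chosen for this example. The point is that the robot plans nothing itself; it parses a command from the operator and answers in natural language.

```python
class TextRobot:
    """Hypothetical robot that only follows operator commands; it plans nothing itself."""

    # Assumed vocabulary for illustration: verb -> feedback template.
    VOCABULARY = {
        "move to": "moved to {obj}",
        "open": "opened {obj}",
        "pick up": "picked up {obj}",
    }

    def __init__(self):
        self.log = []  # history of executed (verb, object) pairs

    def execute(self, command):
        """Parse one command and return natural-language feedback for the operator."""
        text = command.strip().lower()
        for verb, template in self.VOCABULARY.items():
            if text.startswith(verb + " "):
                obj = text[len(verb):].strip()
                if obj.startswith("the "):
                    obj = obj[4:]  # drop a leading article
                self.log.append((verb, obj))
                return template.format(obj=obj)
        # Unknown commands are reported back instead of being guessed at.
        return "I don't understand: " + command.strip()


def run(robot, instruction):
    """Split a comma-separated instruction into commands and collect the feedback."""
    return [robot.execute(part) for part in instruction.split(",") if part.strip()]


if __name__ == "__main__":
    for line in run(TextRobot(), "move to door, open the door"):
        print(line)
```

All decisions about which command comes next stay with the operator, which is what makes the system open in the sense used here.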

This single modification, text-based teleoperation, is the core technology in modern robotics after 2010. It allows all the former problems to be solved without inventing new hardware or new algorithms. Instead of designing a sophisticated artificial intelligence hidden inside a robot, the goal is to invent a user interface that connects a robot with a human. In the Maniac Mansion video game, 3x5=15 verb commands are located at the bottom of the screen, and this menu is what the player uses to control the game. Characters like Dave or Bernard can't be called intelligent in the computer science sense, but they are able to parse commands and give feedback in natural language, so they are good examples of grounded language in a video game.
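A verb menu of this kind is easy to model. The following Python sketch assumes a grid of 3 rows and 5 columns, matching the 3x5=15 count above. The verb list and the function names `verb_at` and `build_command` are illustrative placeholders, not the game's exact menu or code.

```python
ROWS, COLS = 3, 5  # assumed grid shape, taken from the 3x5=15 count in the text

# Illustrative verbs only; not the exact menu layout of the game.
VERBS = [
    "push", "pull", "give", "open", "close",
    "read", "walk to", "pick up", "use", "unlock",
    "turn on", "turn off", "fix", "look at", "talk to",
]


def verb_at(row, col):
    """Return the verb shown in the clicked grid cell."""
    if not (0 <= row < ROWS and 0 <= col < COLS):
        raise IndexError("click outside the verb grid")
    return VERBS[row * COLS + col]


def build_command(row, col, target):
    """Combine the clicked verb with the clicked object into a text command."""
    return f"{verb_at(row, col)} {target}"


if __name__ == "__main__":
    print(build_command(0, 3, "door"))
```

Because every possible command is a verb from a fixed list plus an object, the operator cannot produce an instruction that the machine fails to parse.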


