December 28, 2025

From Python to Forth

The advantages of the Python programming language need little explanation. It is the most beginner-friendly language currently available, supports both procedural and object-oriented concepts, produces readable code, offers an extensive selection of libraries, and the same software runs on both Linux and Windows (at least in theory).

Nevertheless, a few Python programmers are dissatisfied with the ecosystem and curious about other, possibly better programming languages. The most fascinating language outside the mainstream is undoubtedly Forth, which is surrounded by numerous legends, while very few programmers have gained practical experience with it.

The following blog post explains the Forth language for experienced Python programmers. Probably the most important difference is that Forth gets by with significantly less memory than any other language. Forth needs even less RAM than the highly efficient C language. If an operating system like Haiku, which was written in C++, needs around 500 MB of RAM to display the desktop, Forth gets by with only 5 MB for the same operating system, a factor of 100 less. The main reason for this efficiency advantage is doing without variables. Forth has no variables that are passed to functions; instead it has a stack that is only a few bytes in size and handles the communication between functions. In addition, Forth's instruction set is smaller, so programs occupy less RAM.

As a caveat, the same program written once in C and once in Forth runs equally fast in both cases. That is, a routine for drawing graphics or a game engine that produces 30 fps is more or less equally fast in both languages. To make Forth faster, or to run the same software with fewer joules of electrical energy, a specially adapted Forth CPU such as the GA144 is required, which is a highly efficient processor with 144 cores.

But back to the Forth programming language. Essentially, Forth is a subset of an assembly language. Instead of the usual assembler mnemonics such as mov, add, mul and and, Forth only offers instructions that access the stack, that is, push and pop. There are no registers like EAX, EBX or ECX, only this one stack plus a second one for the instruction pointer, and Forth programmers have to make do with this subset.
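
For readers coming from Python, the stack-based style can be imitated in a few lines. The following is only a minimal sketch and not a real Forth system: the function `run` and the small word table are assumptions made for illustration. It shows how each word takes its inputs from a shared data stack and leaves its result there, so no named parameters are passed.

```python
# Minimal sketch of Forth-style evaluation in Python (illustration only,
# not a real Forth system). Every word reads from and writes to one
# shared data stack instead of receiving named arguments.

def run(source, stack=None):
    """Evaluate a whitespace-separated program and return the data stack."""
    stack = [] if stack is None else stack
    words = {
        "+": lambda s: s.append(s.pop() + s.pop()),
        "*": lambda s: s.append(s.pop() * s.pop()),
        "dup": lambda s: s.append(s[-1]),                # duplicate top item
        "swap": lambda s: s.extend([s.pop(), s.pop()]),  # exchange top two
        "drop": lambda s: s.pop(),                       # discard top item
    }
    for token in source.split():
        if token in words:
            words[token](stack)       # a known word operates on the stack
        else:
            stack.append(int(token))  # anything else is pushed as a number
    return stack

print(run("2 3 + 4 *"))  # [20]
print(run("5 dup *"))    # [25]
```

The program `2 3 + 4 *` corresponds to `(2 + 3) * 4` in Python; the intermediate result 5 is never given a name, it simply stays on the stack until the next word consumes it.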

Unfortunately, this leads to a serious drawback that explains why Forth is so rarely used: the language is even more cumbersome to learn than assembly. In the 1980s, a fairly large number of games for home computers were written in assembly; on the 8-bit machines from Commodore and Atari in particular, assembly is still the only widely used language today. Compared to C, assembly is considered unreadable and close to the machine. Forth surpasses assembly in minimalism and is even less readable.

The main reason Forth never gained mainstream acceptance is that the language solves a problem that does not exist in this form technically, namely the problem of main memory. From the point of view of a Forth virtual machine, memory is extremely scarce. The assumption is that less than 10 KB of working memory is available, and the software has to make do with this scarce RAM. In reality, modern PCs have far more memory, 1 GB and considerably more. In such an environment, a Forth program has no advantage over code written in C.

December 24, 2025

Python for event detection of a warehouse robot

The Python programming language is recognized as a high-level language that makes it possible to write complex software in a small amount of code. The reason is that Python has many built-in libraries and an easy-to-understand syntax. It's much easier to create a prototype in Python than in other programming languages like C or Assembly.

Unfortunately, Python remains a classical imperative language, which means that the programmer has to define functions, classes and variables. This formal syntax ensures that a computer can execute the statements. To raise the abstraction level further, another notation is needed, which is presented next. The goal is to program an event recognition system for a warehouse robot. Instead of implementing a GUI prototype, the goal is to write only a list of [tags] which can be detected by the robot. The tags are described in a Python dictionary:

warehouse_robot_taxonomy = {
    # --- Navigation & Localization ---
    "NAV_WAYPOINT_REACHED": "Robot base has arrived at the specific coordinate destination.",
    "NAV_PATH_OBSTRUCTED": "LIDAR/Depth sensors detect an unexpected object in the path.",
    "NAV_RELOCALIZATION_REQ": "Robot uncertainty in pose exceeds threshold; seeking landmarks.",
    "NAV_FLOOR_HAZARD": "Detection of liquid spills, debris, or uneven surfaces.",

    # --- Manipulation & Payload ---
    "MAN_SHELF_ALIGNED": "End-effector is centered and parallel to the targeted rack slot.",
    "MAN_GRIP_SUCCESS": "Tactile/Force sensors confirm object acquisition.",
    "MAN_GRIP_SLIP": "Loss of contact or shifting weight detected during transport.",
    "MAN_LOAD_SHIFT": "Internal IMU detects payload instability while the robot is moving.",

    # --- Perception & Identification ---
    "PRC_SKU_VALIDATED": "Barcode, QR code, or RFID successfully read and matched to manifest.",
    "PRC_MISPLACED_ITEM": "Vision system identifies an object where the database expects a void.",
    "PRC_SHELF_FULL": "The destination bin has no available volume for placement.",

    # --- Safety & Human-Robot Interaction ---
    "SAF_ESTOP_ACTIVE": "Physical or software-based emergency stop has been engaged.",
    "SAF_HUMAN_NEAR": "Safety scanners detect a human worker within the 'Slowdown' zone.",
    "SAF_COLLISION_IMMINENT": "Time-to-collision calculation triggers immediate braking.",

    # --- System & Maintenance ---
    "SYS_BATTERY_LOW": "Charge level requires return to docking station.",
    "SYS_COMMS_LOST": "Loss of heartbeat or high latency with the Warehouse Management System (WMS)."
}


Such a list of tags doesn't have much in common with a software project; it's more similar to a database. The table has two columns: name and description. The dictionary stores the rows of the table.

The idea behind the project is to annotate the sensory perception with the tags. For example, if the battery is low the tag [SYS_BATTERY_LOW] gets activated, or if the robot has scanned an object the tag [PRC_SKU_VALIDATED] gets activated. The entire game state is projected onto a tag vector with 16 entries. Each tag can be true or false, so the state is encoded in 16 bits, which is a very compact representation. The Python dictionary ensures that the human operator has a better understanding of each tag: there is a name plus a description.
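
The 16-bit encoding can be sketched as follows. This is an illustration under assumptions: the helper names `encode` and `decode` and the bit order (the position of a tag in the tuple `TAGS`) are not part of the original project, only the 16 tag names are taken from the dictionary above.

```python
# Sketch: pack the 16 boolean tags into one 16-bit integer.
# The bit order (index in TAGS) is an assumption made for this example.
TAGS = (
    "NAV_WAYPOINT_REACHED", "NAV_PATH_OBSTRUCTED",
    "NAV_RELOCALIZATION_REQ", "NAV_FLOOR_HAZARD",
    "MAN_SHELF_ALIGNED", "MAN_GRIP_SUCCESS",
    "MAN_GRIP_SLIP", "MAN_LOAD_SHIFT",
    "PRC_SKU_VALIDATED", "PRC_MISPLACED_ITEM", "PRC_SHELF_FULL",
    "SAF_ESTOP_ACTIVE", "SAF_HUMAN_NEAR", "SAF_COLLISION_IMMINENT",
    "SYS_BATTERY_LOW", "SYS_COMMS_LOST",
)

def encode(active_tags):
    """Turn a collection of active tag names into a bit vector (int)."""
    state = 0
    for name in active_tags:
        state |= 1 << TAGS.index(name)  # set the bit that belongs to the tag
    return state

def decode(state):
    """Turn a bit vector back into the list of active tag names."""
    return [name for bit, name in enumerate(TAGS) if state & (1 << bit)]

state = encode({"SYS_BATTERY_LOW", "PRC_SKU_VALIDATED"})
print(f"{state:016b}")  # 0100000100000000
print(decode(state))    # ['PRC_SKU_VALIDATED', 'SYS_BATTERY_LOW']
```

The integer is what the robot would store or transmit, while the tag names remain available for the human operator through `decode`.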

December 20, 2025

Symbol grounding with tags

 

The screenshot shows a prototype for a maze game. The dominant feature is a camera which determines semantic tags for the mouse position. Such a camera is a basic demonstration of grounded language: it doesn't photograph the picture on a pixel level, but it captures the meaning of a cell. This meaning is encoded by tags like [wall], [junction], [robot] and [straight_right].

Without the tagging mechanism it's impossible to communicate with the robot in natural language, and it's also impossible to execute tasks like instruction following or visual question answering. The detected tags are the precondition for human-to-robot interaction in natural language.

From a software engineering perspective, the semantic camera is a GUI widget. It consists of a rectangle shown in the video game and a textual output shown at the bottom of the window.
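
How a cell could be mapped to tags is sketched below. The character grid, the function `cell_tags` and the rule set (counting free neighbours) are assumptions for illustration and not the code of the prototype; the sketch also assumes that the maze is fully enclosed by walls, and it uses the simplified tags [corridor] and [dead_end] instead of direction-specific ones like [straight_right].

```python
# Sketch: derive semantic tags for one maze cell from a character grid.
# '#' is a wall, '.' is free space, 'R' is the robot. The maze is assumed
# to be enclosed by walls, so free cells never lie on the border.
MAZE = [
    "#######",
    "#..R..#",
    "#.#.#.#",
    "#.....#",
    "#######",
]

def cell_tags(maze, row, col):
    """Return the semantic tags for the cell at (row, col)."""
    cell = maze[row][col]
    if cell == "#":
        return ["wall"]
    tags = ["robot"] if cell == "R" else []
    # Count the free neighbours (up, down, left, right).
    free = sum(
        maze[row + dr][col + dc] != "#"
        for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1))
    )
    if free >= 3:
        tags.append("junction")
    elif free == 2:
        tags.append("corridor")
    else:
        tags.append("dead_end")
    return tags

print(cell_tags(MAZE, 0, 0))  # ['wall']
print(cell_tags(MAZE, 1, 3))  # ['robot', 'junction']
print(cell_tags(MAZE, 1, 1))  # ['corridor']
```

In a GUI version the mouse position would first be converted into a row and a column, and the returned list would be printed in the status line.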

December 18, 2025

From robotics algorithms to grounded language

Artificial Intelligence from 1960 to 2010 worked under a bias which prevented the development of powerful robots. This bias was the closed system driven by numerical algorithms. The understanding was that a robot is some sort of computer or Turing machine which executes a computer program step by step. The computer program controls the robot and limits its ability to solve problems in the real world. Until 2010 it was unclear how to use a programming language like C/C++ for motion planning in robotics. Attempts like the famous RRT motion planner resulted in a high CPU load but struggled in reality.

Computers until 2010 were limited to classical computing tasks rooted in the mathematical domain. They were able to analyze statistical data or show graphics on the screen. These classical tasks were implemented in software packages and operating systems. None of these libraries was able to control robots with artificial intelligence, and this slowed down the development of robotics.

The paradigm shift towards intelligent robots was initiated by natural language as a communication interface. The principle is not completely new: it has been available since the 1980s in adventure games like Zork and Maniac Mansion. In both cases, the human player communicates with the computer over a textual interface. For example, the player can select the command "move to door, open the door" from the menu, and the avatar in the game will follow this command.

What was unknown until 2010 was that such a text-based user interface works great for robot control. Advanced robotics can be imagined as some sort of point&click adventure which provides a vocabulary on the screen to activate commands. This point&click interface ensures that the man-to-machine communication works smoothly, meaning the robot follows the human's instructions.

Under such a constraint, the former motion planning problems in robotics are no longer an obstacle, because the robot doesn't need to solve an optimization problem; it only has to listen to the human operator. The hard problems are delegated to the human located outside of the robot, and this is what makes the system open. Open system means that natural language is fed into the robot and is sent back from the robot to the operator.

The single modification of using text-based teleoperation is the core technology in modern robotics after the year 2010. It makes it possible to solve all the former problems without inventing new hardware or new algorithms. Instead of designing a sophisticated artificial intelligence hidden inside a robot, the goal is to invent a user interface which connects a robot with a human. In the case of the Maniac Mansion video game there are 3x5=15 commands located at the bottom of the screen. This interface is used to control the game. A character like Dave or Bernard can't be called intelligent in the computer science sense, but they are able to parse commands and they give feedback in natural language, so they are good examples of grounded language in a video game.



Grounded language for open system communication

Robotics in the past was imagined as a computer program in a closed system. The software has to solve a mathematical problem by itself and doesn't communicate with the outside world. The result of this self-contained process is an NP-hard algorithmic problem. Large motion planning problems overwhelm the processing power of the robot, and this results in failure.

The improved robot control paradigm works with an open system and grounded language as the interface to the outside world. The robot is able to execute commands from an operator and reports detected events back to the operator. For doing so the robot uses English, which consists of verbs, nouns and adjectives. A typical communication might be "operator: move_forward until junction", "robot: done I've reached junction".

Such an open system isn't affected by the NP-hard problem, because the robot has no planning algorithm of its own; it executes commands from the outside world. Each command is parsed by the robot in a few milliseconds and the CPU usage is very low. Possible problems like obstacles ahead won't produce a dead end, because they are reported back to the human and then the human operator is in charge of bypassing the obstacle.
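
The exchange between operator and robot can be sketched in a few lines. The one-dimensional corridor, the function `execute` and the exact wording of the error replies are assumptions made for this example; only the command "move_forward until junction" and the answer "done I've reached junction" come from the text.

```python
# Sketch of the open-system loop: the robot parses one operator command
# and reports back in words. It does not plan, it only executes and reports.
CORRIDOR = ["straight", "straight", "junction", "straight", "obstacle"]

def execute(command, position, corridor=CORRIDOR):
    """Run one command; return (new_position, reply)."""
    words = command.split()
    if len(words) != 3 or words[:2] != ["move_forward", "until"]:
        return position, "error: unknown command"
    target = words[2]
    for step in range(position + 1, len(corridor)):
        if corridor[step] == "obstacle" and target != "obstacle":
            # Report the problem to the operator instead of planning around it.
            return step - 1, "stopped: obstacle ahead"
        if corridor[step] == target:
            return step, f"done I've reached {target}"
    return len(corridor) - 1, f"failed: no {target} found"

position, reply = execute("move_forward until junction", 0)
print("robot:", reply)  # robot: done I've reached junction
position, reply = execute("move_forward until junction", position)
print("robot:", reply)  # robot: stopped: obstacle ahead
```

The second call shows the point of the paragraph above: the obstacle does not trigger a search inside the robot, it becomes a message for the human operator.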

The communication in an open system contradicts the existing understanding of Artificial Intelligence. AI in the past was mostly imagined with the previously mentioned closed system paradigm. The robot follows its own algorithms, which requires advanced software and a highly developed computer. It should be mentioned that closed systems in robotics are a dead end. They never worked in the past and it's unlikely that future algorithms can address the problems. Closed systems only work well for mathematical problems like sorting an array or searching in a database, and they are the wrong paradigm for complex problems like dexterous grasping or biped walking.



December 15, 2025

A gentle introduction to Artificial Intelligence

Existing tutorials about Artificial Intelligence were mostly written with a certain bias. In the first approach, the goal is to explain the social implications of intelligent robots, for example that they will improve society. In the second approach, AI is explained as an enabling technology for breakthrough innovation in healthcare, material science and logistics.

The problem with such an approach is that it redirects the reader away from AI towards non-AI subjects located in society, applications or marketing. In the long run it weakens the understanding and doesn't explain what AI is about.

This introductory text prefers a computer science focus in which AI is explained in the same discourse space that AI experts in the field have used for decades. AI was never imagined as a powerful technology for enabling automation; instead, computer scientists define AI in terms of unsolvable problems, namely NP-hard ones.

An NP-hard problem can't be solved in practice by existing algorithms or computer hardware, because no known algorithm finds the answer for large instances in a reasonable time. These complex problems are very fascinating for computer scientists and there is an endless number of papers available about the subject. The typical NP-hard problem is surprisingly easy to explain to non-experts; for example, the video game Lemmings is often referenced as NP-hard.[1] Another, more abstract problem is the traveling salesman problem, which has its roots in mathematics. The goal is to find the shortest round trip that visits every node of a graph.[2] Another important unsolvable problem in the context of robotics is motion planning. According to most experts it is unsolvable with current algorithms and hardware.[3]
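
Why such problems overwhelm a computer can be shown with a small example of the traveling salesman problem. The sketch below is an illustration only: the distance matrix is made up, and exhaustive search is just the most naive strategy, not a method taken from the cited survey. It tries every possible round trip, which works for 4 cities but not for large instances.

```python
from itertools import permutations
from math import factorial

# Sketch: brute-force traveling salesman on a tiny, made-up distance matrix.
DIST = [
    [0, 2, 9, 10],
    [2, 0, 6, 4],
    [9, 6, 0, 3],
    [10, 4, 3, 0],
]

def shortest_tour(dist):
    """Try every round trip starting and ending at city 0."""
    n = len(dist)
    best_len, best_tour = None, None
    for order in permutations(range(1, n)):
        tour = (0, *order, 0)
        length = sum(dist[a][b] for a, b in zip(tour, tour[1:]))
        if best_len is None or length < best_len:
            best_len, best_tour = length, tour
    return best_len, best_tour

print(shortest_tour(DIST))  # (18, (0, 1, 3, 2, 0))
print(factorial(19))        # number of tours to check for 20 cities
```

For 4 cities only 3! = 6 tours have to be compared. For 20 cities the same loop would have to check 19! tours, which is more than 10^17, so the naive approach stops being practical very quickly.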

The history of Artificial Intelligence is mostly the history of unsolvable problems. It is not an exaggeration to claim that an endless number of papers were written about why a certain problem is NP-hard, and why existing algorithms are not powerful enough to solve such a problem on a computer. The discourse space around PSPACE, NP-complete and NP-hard has been the center of gravity in the AI debate over the last decades. It shows the limitations of computer science.

AI in the past tried to solve these problems, but it wasn't successful. The overall workflow is similar to the myth of Sisyphus, in which a figure from Greek mythology struggles endlessly with a task he can never complete.

References:

* [1] Viglietta, Giovanni. "Lemmings is PSPACE-complete." Theoretical Computer Science 586 (2015): 120-134.
* [2] Zambito, Leonardo. "The traveling salesman problem: a comprehensive survey." Project for CSE 4080 (2006): 11.
* [3] Hoffmann, Michael. "Motion planning amidst movable square blocks: Push-* is NP-hard." Canadian Conference on Computational Geometry. 2000.

 

 

How robots understand language

 

The symbol grounding problem is about converting sensory perception into natural language. A concrete example is given in the screenshot. There is a virtual camera available which creates the tags for the mouse cursor. In the example the mouse is pointing at a yellow box located in roomA. These tags ensure that the robot will understand a command like "Moveto roomA and grasp the yellow box", because this command refers to detected tags.

A semantic camera doesn't need a special algorithm; it is mostly a GUI element. There is a rectangle on the screen, and in the background the found objects are added to the status line at the bottom. All the information about walls, objects and rooms is already stored in the game engine; the camera widget only formats this information as text.
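
The link between detected tags and an understandable command can be sketched like this. The vocabulary `KNOWN_TAGS` and the functions `referenced_tags` and `is_grounded` are assumptions for illustration; the sketch simply checks whether every tag mentioned in a command has been reported by the camera.

```python
# Sketch: check whether a command only refers to tags the camera detected.
# KNOWN_TAGS is a made-up vocabulary; tags are compared in lower case.
KNOWN_TAGS = {"yellow", "red", "box", "ball", "rooma", "roomb"}

def referenced_tags(command):
    """Return the known tags mentioned in a command (case-insensitive)."""
    return {word for word in command.lower().split() if word in KNOWN_TAGS}

def is_grounded(command, detected_tags):
    """A command is grounded if every tag it mentions was detected."""
    detected = {tag.lower() for tag in detected_tags}
    return referenced_tags(command) <= detected

detected = ["yellow", "box", "roomA"]
print(is_grounded("Moveto roomA and grasp the yellow box", detected))  # True
print(is_grounded("Moveto roomB and grasp the red ball", detected))    # False
```

Words like "Moveto" and "grasp" are ignored here because the sketch only covers the perception side; a full parser would also map them to actions.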

December 14, 2025

Semantic camera for grounded language

 

An entry-level demonstration of the symbol grounding problem is a semantic camera. The human operator points with the mouse at the screen and the algorithm shows the detected semantic tags. A tag might be "green, circle" or "blue, rectangle". So the algorithm describes in words what can be seen on the screen.

Implementing such a camera in Python is not very hard. The newly created class takes the game engine and the mouse position as input and generates a list of tags as output.
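
A minimal sketch of such a class might look as follows. The classes `Shape` and `GameEngine` are hypothetical stand-ins for a real game engine, and the method name `tags_at` is an assumption; the sketch only demonstrates the input (engine plus mouse position) and the output (list of tags).

```python
from dataclasses import dataclass

@dataclass
class Shape:
    """Hypothetical scene object stored in the game engine."""
    color: str
    kind: str    # "circle" or "rectangle"
    x: float     # center position
    y: float
    size: float  # radius of a circle, half side length of a rectangle

    def contains(self, px, py):
        if self.kind == "circle":
            return (px - self.x) ** 2 + (py - self.y) ** 2 <= self.size ** 2
        return abs(px - self.x) <= self.size and abs(py - self.y) <= self.size

class GameEngine:
    """Hypothetical engine that only holds the list of shapes."""
    def __init__(self, shapes):
        self.shapes = shapes

class SemanticCamera:
    """Translates a mouse position into a list of semantic tags."""
    def __init__(self, engine):
        self.engine = engine

    def tags_at(self, mouse_pos):
        px, py = mouse_pos
        tags = []
        for shape in self.engine.shapes:
            if shape.contains(px, py):
                tags.extend([shape.color, shape.kind])
        return tags

engine = GameEngine([
    Shape("green", "circle", 50, 50, 20),
    Shape("blue", "rectangle", 150, 80, 30),
])
camera = SemanticCamera(engine)
print(camera.tags_at((55, 45)))    # ['green', 'circle']
print(camera.tags_at((160, 100)))  # ['blue', 'rectangle']
print(camera.tags_at((300, 300)))  # []
```

The camera does no image processing. It only looks up objects that the engine already knows and formats their attributes as words.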

There are many applications for such a semantic camera. A robot which is equipped with such a device can navigate with ease in a maze, because the robot can communicate much better with a human operator. The robot generates words to describe its perception, and it can execute commands from a human also formulated in high level natural language.