April 10, 2026

AI powered by natural language

A frequently asked question during the last 70 years of robotics research was: where is the AI located in a robot? It took until around the year 2010 for researchers to answer this important question. The AI brain is located in natural language. A state-of-the-art robot converts natural language into servo motor movements and has access to a large database of natural language commands. This allows the machine to think, make decisions and act.

The principle of realizing artificial intelligence with natural language was anticipated before the year 2010 in projects like the Televox robot (1929), SHRDLU (1968), the Maniac Mansion game (1987) and the Vitra visual translator (1987). The innovation after the year 2010 was that the AI was located only in English sentences, and not in other potential sources like path planning algorithms, genetic algorithms, neural networks or expert systems.

The claim is that without a natural language interface, it is impossible to program a robot. Natural language is the only needed and most powerful tool to realize intelligent machines. There is no need to focus only on English, because other human languages like German or French work equally well. The only precondition is that it is a natural language with a large vocabulary to describe reality in nouns, adjectives and verbs. A programming language like Java or C++ cannot be used for this purpose; only natural languages are powerful enough to grasp the environment of a robot.

Modern robotics after the year 2010 can be described as a semiotic interpretation engine which uses grounded language to convert high-level symbols into low-level motor actions. In other words, modern robotics doesn't work with algorithms or computer programs; instead the AI is realized with an advanced user interface similar to what is known from the Maniac Mansion adventure game. The user interface at the bottom of the screen allows human-to-machine interaction, and this translation process can generate artificial intelligence.
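As a minimal sketch of such an interpretation engine, a lookup table can map English sentences to servo movements. All command names and servo values below are hypothetical and only illustrate the translation step from sentence to motor action.

```python
# Hypothetical sketch: a tiny "semiotic interpretation engine" that maps
# natural-language commands to low-level motor actions via a lookup table.
# Command names and servo values are invented for illustration.

COMMAND_TABLE = {
    "open gripper":  [("servo_gripper", 90)],
    "close gripper": [("servo_gripper", 0)],
    "turn left":     [("servo_base", -45)],
    "turn right":    [("servo_base", 45)],
}

def interpret(sentence):
    """Translate a natural-language command into servo motor movements."""
    actions = COMMAND_TABLE.get(sentence.lower().strip())
    if actions is None:
        return [("noop", 0)]  # unknown sentence: do nothing
    return actions

print(interpret("Turn left"))  # [('servo_base', -45)]
```

A real system would of course need a much larger command database, but the principle stays the same: the intelligence sits in the sentence-to-action mapping, not in a planning algorithm.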


April 09, 2026

Dataset for grounded language

A DIKW pyramid consists of 4 layers which describe a game state at multiple abstraction levels. In the context of a warehouse robot the following dataset is available. The task for the neural network is to translate between the layers, which allows human-to-machine communication. Humans express themselves in full English sentences, while machines measure the environment as numerical sensor values.

[
  {
    "id": 1,
    "data": {"pos": [40, 10], "rgb_color": [255, 0, 0], "dist": 2.0, "trajectory": "traj3", "battery": 0.7},
    "information": ["box", "red", "kitchen", "entrance"],
    "knowledge": "Pick up the red box in kitchen room"
  },
  {
    "id": 2,
    "data": {"pos": [15, 5], "rgb_color": [0, 0, 255], "dist": 1.5, "trajectory": "traj1", "battery": 0.65},
    "information": ["box", "blue", "corridor", "obstacle"],
    "knowledge": "Move the blue box out of the corridor to clear the obstacle"
  },
  {
    "id": 3,
    "data": {"pos": [60, 30], "rgb_color": [255, 255, 0], "dist": 5.0, "trajectory": "traj_return", "battery": 0.15},
    "information": ["charging_station", "yellow", "dining_room", "low_battery"],
    "knowledge": "Aborting task: Return to dining room for immediate charging"
  },
  {
    "id": 4,
    "data": {"pos": [45, 12], "rgb_color": [100, 70, 20], "dist": 0.5, "trajectory": "none", "battery": 0.58},
    "information": ["obstacle", "brown", "kitchen", "entrance", "blocked"],
    "knowledge": "Wait for 10 seconds: Entrance is blocked by a moving obstacle"
  },
  {
    "id": 5,
    "data": {"pos": [10, 80], "rgb_color": [200, 200, 200], "dist": 10.0, "trajectory": "traj_search", "battery": 0.9},
    "information": ["room", "corridor", "empty", "pickup_zone"],
    "knowledge": "Scanning corridor: No items found in designated pickup zone"
  },
  {
    "id": 6,
    "data": {"pos": [33, 44], "rgb_color": [0, 255, 0], "dist": 0.1, "trajectory": "traj_dock", "battery": 0.5},
    "information": ["box", "green", "dining_room", "drop"],
    "knowledge": "Drop the green box at the delivery point in the dining room"
  },
  {
    "id": 7,
    "data": {"pos": [5, 5], "rgb_color": [255, 0, 0], "dist": 2.2, "trajectory": "traj_safety", "battery": 0.45},
    "information": ["hazard", "red", "corridor", "liquid_spill"],
    "knowledge": "External command: Avoid red marked area due to a spill in the corridor"
  },
  {
    "id": 8,
    "data": {"pos": [50, 50], "rgb_color": [255, 255, 255], "dist": 0.0, "trajectory": "none", "battery": 0.4},
    "information": ["inventory_list", "mismatch", "kitchen", "room"],
    "knowledge": "Import external data: Re-scan kitchen room to update missing inventory"
  }
]
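A translation between the layers can be sketched as a simple keyword lookup over the dataset. This is a minimal illustration, not a trained neural network; the two records are copied from the dataset above, and the overlap-based matching rule is an assumption.

```python
# Sketch: translate between DIKW layers by retrieving the record whose
# information layer best matches a query. Records are taken from the
# warehouse dataset above (abbreviated).

DATASET = [
    {"id": 1,
     "data": {"pos": [40, 10], "trajectory": "traj3", "battery": 0.7},
     "information": ["box", "red", "kitchen", "entrance"],
     "knowledge": "Pick up the red box in kitchen room"},
    {"id": 2,
     "data": {"pos": [15, 5], "trajectory": "traj1", "battery": 0.65},
     "information": ["box", "blue", "corridor", "obstacle"],
     "knowledge": "Move the blue box out of the corridor to clear the obstacle"},
]

def information_to_data(tokens):
    """Information layer -> data layer: pick the record with the most overlap."""
    best = max(DATASET, key=lambda r: len(set(tokens) & set(r["information"])))
    return best["data"]

def data_to_knowledge(trajectory):
    """Data layer -> knowledge layer: explain a trajectory in English."""
    for record in DATASET:
        if record["data"]["trajectory"] == trajectory:
            return record["knowledge"]
    return "Unknown situation"

print(information_to_data(["red", "box"])["trajectory"])  # traj3
print(data_to_knowledge("traj1"))
```

In a real system a neural network would replace both lookup functions, so that unseen sensor values and unseen sentences can also be mapped between the layers.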

Why Python beats Rust in game development

The Rust community claims to have created a general-purpose programming language that is recommended for every task from low level to high level. Consequently, it is supposedly only a matter of time until outdated programming languages like C++, Java or JavaScript are replaced by Rust.

However, there is legitimate criticism of the Rust language, which shall be explained briefly. Modern software development inevitably relies on the help of a large language model, if only to find bugs in the code. The user posts a code snippet into the LLM of his choice and, unlike on Stack Overflow, receives the correct answer immediately instead of having to wait two days.

Large language models are technically able to generate code in any programming language, but their training data contains very different amounts of example code. Mainstream languages like Python or C# are understood best by LLMs because the amount of example code on GitHub and the number of tutorials are largest. The thesis is that the best programming language is the one with the widest adoption, much like English is the best world language because it is used by the most speakers.

In terms of numbers, Rust lives a niche existence. There is certainly example code on GitHub showing how to program a Tetris game including a graphical interface, but for the same problem there are hundreds of repositories or even more in Python. Another problem with Rust is that, because of its novelty, the syntax changes constantly, and the language itself runs the risk of being replaced by modern forks like Rue. That may be helpful for the further development of a modern general-purpose language with a built-in memory checker, but it makes existing source code meaningless overnight and further fragments the ecosystem. This cannot happen with Python and C#. The amount of example code is huge and grows constantly.

Here are some numbers.

- Number of Stack Overflow questions for Rust: 45k
- Number of Stack Overflow questions for Python: 2.2 million

Automating teleoperation in robotics

There is a rational reason why remote-controlled robotics led a shadow existence for decades: teleoperation requires a human operator and cannot be automated. Robotics, however, explicitly tries to automate processes without the need for human operators. Ergo, teleoperation was written off as a dead end.

Since around 2010, remote-controlled robots have experienced a revival, which can be traced back to a better understanding of semi-autonomous systems. Earlier concerns about remote-controlled systems have been refuted: teleoperation can very well be automated, as shall be explained briefly.

Suppose a remote-controlled warehouse robot understands a command like "drive to room B and bring me box #4". Strictly speaking, this is not an autonomous robot; because of the command it is only a remote-controlled robot. The interaction, however, can easily be automated with a script. The script contains the following sequence:

1. "drive to room B and bring me box #4"
2. "drive to room A and bring me box #2"
3. "drive to room B and bring me box #8"
4. "drive to room D and bring me box #12"
5. "drive to room C and bring me box #1"

After executing this sequence, the robot has fetched 5 different boxes from different locations without further interaction. It was in operation for several minutes and behaved almost like an autonomous system. When the commands are formulated at a high abstraction level, as in the example above, it is no longer classical joystick teleoperation; such systems strongly resemble autonomous robots.

Technically, the robot still receives commands from a human operator and does not think for itself. But the above command sequence can also be stored in a script or macro, so no human operator is needed at all: the human creates a script once, and with it the robot becomes autonomous.

The question is less whether a system works autonomously, semi-autonomously or remote-controlled; the real question is at which abstraction level the remote control takes place. If low-level control signals are sent via a joystick, they are indeed hard to script. The joystick commands cannot simply be recorded and replayed, because the robot starts from a different initial position each time. If, however, the remote control happens through a text interface, the commands can easily be recorded and reused. It is even possible to write programs that generate such scripts automatically.
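The idea can be sketched in a few lines: high-level text commands are generated by a program and fed into the robot's text interface. The send_command function below is a hypothetical stub, and the command phrasing is an assumption for illustration.

```python
# Sketch: scripted teleoperation. A program generates high-level text
# commands and replays them through the robot's text interface.

def send_command(command, log):
    """Stub for the robot's text interface; a real robot would execute it."""
    log.append(command)

def generate_script(orders):
    """Generate a command script from (room, box_id) pairs."""
    return [f"drive to room {room} and bring me box #{box}"
            for room, box in orders]

log = []
for command in generate_script([("B", 4), ("A", 2), ("B", 8)]):
    send_command(command, log)

print(log[0])  # drive to room B and bring me box #4
```

Because the commands are plain text at a high abstraction level, the script survives changes in the robot's starting position, which joystick recordings do not.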

Modern remote-controlled robotics since around 2010 primarily tries to reach a high abstraction level in human-machine communication through text interfaces. This comes very close to the goal of artificial intelligence.

April 08, 2026

Twelve-tone music at a student party


 

The need for human to robot communication

Layered communication with a DIKW pyramid solves a simple problem: how a human can submit commands to a robot. For example, the human operator submits "bring me the red box" and the robot will fetch the box.

The unsolved question is why such a human-to-robot interaction is needed at all. In the classical understanding of AI until the 2000s, this kind of pattern was interpreted as a dead end. Artificial intelligence was described as an autonomous system which doesn't need to communicate with humans. A robot that doesn't communicate with humans is a closed system: it runs a computer program written in C or a neural network algorithm, but it isn't using human language because there is no need to do so.

Grounded language, realized in AI systems like SHRDLU (1968), Vitra (1987) and the M.I.T. Ripley robot (2003), is only needed if human-to-machine interaction is intended. A possible explanation for this paradigm shift has to do with the weakness of closed AI systems. Existing attempts to build autonomous robots have failed because closed systems are overwhelmed by a complex environment. Even if the robot's software consists of 100k lines of C/C++ code, this code won't match a warehouse robot's task, because of ambiguity and vague goals. Classical programming languages work only in a predictable environment, like sorting an array or showing pixels on a monitor. Computer programs are the internal language of machines, but the code can't store the knowledge for a robot's task.

Before the advent of human-to-machine interaction, there was another attempt to build more powerful software for robots, based on ontologies. Instead of storing the world knowledge in computer code, the goal was to capture knowledge in a Cyc-like mental map. Unfortunately, this concept has failed too. Even OWL ontologies are not powerful enough to store domain knowledge. Only teleoperation matches high complexity: a teleoperated robot arm can do any demanding task, including dexterous grasping and novel trajectories never seen before.

During teleoperation the AI problem gets outsourced from the robot itself to an external source: the human operator. The teleoperation interface allows a man-to-machine interaction which translates the external knowledge into robot movements.

April 07, 2026

Introduction to the Rust Programming Language: Safety, Speed, and Sovereignty

The landscape of systems programming is undergoing a generational shift. For decades, C and C++ were the undisputed kings of performance, but they carried a heavy burden: memory unsafety. Most Common Vulnerabilities and Exposures (CVEs) in modern software are caused by memory-safety bugs such as buffer overflows and "use-after-free" errors in legacy C/C++ code. Enter Rust, a language designed to provide the performance of C with the safety guarantees of a managed language.
Origins and Philosophy

Rust was created by Graydon Hoare (who also contributed significantly to the development of Apple’s Swift). Because of this shared lineage, developers often notice a familiar, modern syntax between the two languages. Unlike languages that rely on a Garbage Collector (GC)—a process that periodically scans and removes unused memory—Rust introduces a revolutionary Ownership Model.

In C, developers must manually manage memory using malloc and free, often relying on external tools like Valgrind to hunt down leaks. C++ improved this with smart pointers, but Rust takes it a step further at the compiler level.
The Power of Ownership

The heart of Rust is its ownership system, governed by three strict rules:

1. Each value in memory has a variable that's called its owner.
2. There can only be one owner at a time.
3. When the owner goes out of scope, the value is automatically dropped (cleared from memory).

This prevents memory leaks and ensures that a "use-after-free" error will not even compile. This level of safety is why giants like Microsoft are planning to transition significantly to Rust by 2030, and why even the Linux Kernel has begun integrating Rust code (currently sitting at roughly 0.3%).
Ecosystem and Tooling

The developer experience in Rust is centered around Cargo, its highly praised build system and package manager.

- Project initialization: Use cargo init --bin to start a new project.
- Dependency management: Simply edit the Cargo.toml file to add a "crate" (Rust's term for a library).
- Execution: Run cargo run to compile and launch your application.

Rust is also being used to build entire operating systems, such as Redox OS. Redox is a Unix-like OS written entirely in Rust, featuring its own display server called Orbital and supporting the NetSurf browser.
Learning Through Graphics and Games

Many developers find that the best way to master Rust's strict compiler is through visual projects. The Piston engine is a long-standing choice for game development, but for those seeking simplicity, Macroquad is an excellent library for 2D/3D graphics.

Developer Note: If you encounter errors like "floating point arithmetic is not allowed in constant functions" while using Macroquad on Linux Mint, ensure your compiler is up to date by running rustup update stable.
References and Further Reading

- From C# to Rust: A 42-Day Challenge https://woodruff.dev/from-c-to-rust-a-42-day-developer-challenge/
- Microsoft: Rust to Replace C/C++ by 2030 https://www.martinsfeld.de/blog/microsoft-rust-ersetzt-c-cpp-2030/
- Game Development in Rust with Macroquad https://mq.agical.se/
- The Rust Programming Language (Official Book) https://doc.rust-lang.org/book/
    
  

Grounded language in robotics

The symbol grounding problem, and especially the detail of grounded language, is a very new subject in computer science. It needs to be explained because it's the key element of artificial intelligence. As an introductory example, a warehouse robot has stored a json file:

{
  "knowledge": "pick up the red box in kitchen room",
  "information": ["box", "obstacle", "entrance", "room", "pickup", "drop", "red", "blue", "yellow", "kitchen", "corridor", "dining room"],
  "data": {"pos": [40, 10], "rgb_color": [100, 70, 20], "dist": 20, "trajectory": "traj3", "direction": [40, 10], "battery": 0.7}
}


This json file shows 3 of the 4 layers of a DIKW pyramid. It is the current situation of the robot, also known as the game state. According to the DIKW pyramid, this current situation is stored in different layers with different levels of abstraction. The lowest data layer stores numerical values gathered from hardware sensors, while the information layer stores the vocabulary and the knowledge layer stores a concrete instruction.

The main task of the robot's artificial intelligence is to translate between these layers. The human enters a command and the robot understands the command because it is translated into the low-level data layer. This translation process is called grounded language.

In contrast to former natural language processing (NLP), the goal is not to check the grammar of the input instruction, e.g. to verify that the word "room" is a noun or that "red" is an adjective. The question is not what language is by itself; the problem is converting from high-level abstraction to low-level abstraction within the DIKW layers.
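The grounding step can be sketched as a two-stage lookup: first extract the information-layer vocabulary from the sentence, then bind each symbol to data-layer values. The vocabulary is taken from the json file above in simplified form; the symbol-to-value bindings are invented for illustration.

```python
# Sketch of grounded language: knowledge layer -> information layer -> data layer.
# Vocabulary is simplified from the json example; the bindings are assumptions.

VOCABULARY = {"box", "obstacle", "entrance", "room", "pickup", "drop",
              "red", "blue", "yellow", "kitchen", "corridor"}

GROUNDING = {  # information symbol -> data-layer values (hypothetical)
    "red": {"rgb_color": (255, 0, 0)},
    "kitchen": {"pos": (40, 10)},
    "box": {"dist": 20},
}

def ground(instruction):
    """Decompose an English instruction and bind its symbols to sensor values."""
    tokens = [w for w in instruction.lower().split() if w in VOCABULARY]
    data = {}
    for token in tokens:
        data.update(GROUNDING.get(token, {}))
    return tokens, data

tokens, data = ground("pick up the red box in kitchen room")
print(tokens)       # ['red', 'box', 'kitchen', 'room']
print(data["pos"])  # (40, 10)
```

The interesting part is that grammar never appears in the code: the sentence is treated purely as a container of symbols that have to be resolved into the data layer.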