February 24, 2022

3d2 Grounding with a cost function

 < 3d1 Pipeline for robot programming

 

In the AI domain there is one problem that is discussed less frequently than it deserves: the symbol grounding problem. Solving it has practical relevance because it makes it possible to program robots. To understand why grounding is important, we first have to describe why robots in the past have failed.
Suppose someone has built a robot in hardware which consists of servo motors, sensors, and an electric battery pack. In addition, the user has access to an advanced supercomputer which can process millions of operations per second. The supercomputer is equipped with the latest Unix operating system, which includes all sorts of programming tools and databases.
But the robot won't move a single step. Something is missing to connect the robot with the computer, and it is not a wire or a wifi connection. What is missing is the transformation of the robot problem into a computational problem. Computers are great at solving problems, but before they can apply algorithms to a problem, somebody has to define what the problem is.
It is difficult to circumscribe the grounding problem in technical terms. It is part of the problem that grounding is still not available to modern robotics. But there are some hints about how grounding works in theory. A robot domain consists of features, for example the current position. These features have to be converted into a cost function, and a cost function can be minimized by a computer with algorithms.
So it seems that grounding has to do with converting features into a cost function. The problem is that the terms “feature” and “cost function” are very general and can mean anything and nothing. Perhaps it makes sense to give an example. Suppose the robot is 50 cm away from the goal. Then the feature is “distance_to_goal” and the value is 50. A possible cost function would be: cost = distance_to_goal / 100.
In this concrete example, it is possible to determine the score for the robot. The cost would be: cost = 50 / 100 = 0.5. The goal is to bring this value down to zero. Then the robot has won the game.
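The example can be written down as a minimal Python sketch. The function names and the one-dimensional world with steps of 1 cm are assumptions made for illustration only.

```python
def cost(distance_to_goal_cm):
    """Map the feature 'distance_to_goal' (in cm) to a cost; 0 means the goal is reached."""
    return distance_to_goal_cm / 100


def best_step(position_cm, goal_cm, steps=(-1, 0, 1)):
    """Pick the step (in cm) that results in the lowest cost (hypothetical 1D world)."""
    return min(steps, key=lambda s: cost(abs(goal_cm - (position_cm + s))))


print(cost(50))          # cost for a robot that is 50 cm away from the goal
print(best_step(0, 50))  # the step that reduces the cost
```

Once the feature has been turned into a number, choosing an action is nothing more than comparing costs.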

 


February 23, 2022

3d1 Pipeline for robot programming

< 3d The bottleneck in robot programming

The following simple pipeline makes it possible, in principle, to create fully autonomous robots for any domain:
1. Create a simulation with object oriented programming
2. Control the robot in the simulation with teleoperation
3. Extract features like distance to goal and angle of the robot
4. Convert the features into a cost function
A cost function in a simulation makes it possible to solve the concrete domain. If such a cost function is available, the domain has been grounded.
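Steps 3 and 4 of the pipeline could look roughly like the following sketch. The robot state (x, y, heading), the feature names and the weights are assumptions for illustration; a real simulation would provide its own state.

```python
import math


def extract_features(x, y, heading, goal_x, goal_y):
    """Step 3: turn the raw simulation state into two features."""
    dx, dy = goal_x - x, goal_y - y
    distance = math.hypot(dx, dy)
    bearing = math.atan2(dy, dx)
    # wrap the heading error into the range [-pi, pi]
    error = math.atan2(math.sin(bearing - heading), math.cos(bearing - heading))
    return {"distance_to_goal": distance, "angle_to_goal": abs(error)}


def cost(features, w_distance=1.0, w_angle=0.5):
    """Step 4: weighted sum of the features (the weights are hypothetical)."""
    return (w_distance * features["distance_to_goal"]
            + w_angle * features["angle_to_goal"])


features = extract_features(x=0, y=0, heading=0.0, goal_x=3, goal_y=4)
print(features, cost(features))
```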
 

3e Cognitive simulation with text adventures

< 3d The bottleneck in robot programming

Most existing robotics has been realized in a spatial environment. That means the robot has a position in 2D space and can move forward and backward. The interaction with the robot works by providing numerical action signals; for example, the human operator presses the forward key, which moves the robot 1 cm ahead. Even if the robot can do useful things, the man-machine interface works only on a low level without using natural language.
Cognitive architectures try to introduce a more human-like interaction. The assumption is that human thinking and human decision making work with natural language as a descriptive layer. From a technical perspective, such an interface can be programmed as a point&click interface similar to what was introduced in the game Maniac Mansion a long time ago.
Point&click graphical adventures work by selecting action verbs and mentioning the objects by name. The robot isn't simply moving 1 cm forward; instead it executes a skill which has a name, and it can reach a waypoint which also has a name formulated in English.

 

Understanding cognitive models and cognitive architectures 

What is symbolic AI? 

February 21, 2022

3a1g Playing Tetris with a cost function

 < 3a1f Reward maximization in a production line

Similar to many AI domains, the game of Tetris is an NP-hard optimization problem. In practice this means that searching exhaustively through all future move sequences would keep a desktop PC busy far longer than the time available before the next action has to be determined. To overcome the bottleneck a heuristic is needed. To be more precise, it is a heuristic cost function, or in short “a cost function”, which is used by an AI to play Tetris.
Such an attempt to solve the game might sound a bit surprising, because most programmers assume that Tetris provides a cost function by default and that there is no need to define the rewards. The problem with the built-in default cost function is that it works with a delay. The player recognizes that he has lost only when it is too late. So what is needed is a modified cost function which provides a continuous and precise feedback signal on whether a certain action makes sense or not.
In the game of Tetris, the human player can make two decisions: at which position the next block is placed and with which rotation. The interesting point is that it is possible to judge the decision even before the block has fallen down. This makes it possible to determine a score before the official score is determined.
On the basis of this cost score, a receding horizon planner can determine the optimal action. What the planner does is minimize the cost, which is the same as maximizing the reward. The open question is which costs exactly should be calculated for a certain game situation. This depends on the heuristics. It is possible to invent a simplified cost model or a more advanced scoring system. Even a reasonably accurate cost function allows a desktop PC to play Tetris at a superhuman level without occupying too many CPU resources.
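A simplified cost model for a Tetris board could look like the sketch below. The three features (stack height, holes, bumpiness) and their weights are assumptions chosen for illustration, not the heuristic of a particular Tetris AI. The board is a list of rows, row 0 is the top, and 1 marks an occupied cell.

```python
def column_heights(board):
    """Height of the stack in every column."""
    rows = len(board)
    heights = []
    for col in range(len(board[0])):
        height = 0
        for row in range(rows):
            if board[row][col]:
                height = rows - row
                break
        heights.append(height)
    return heights


def count_holes(board):
    """Empty cells that have an occupied cell somewhere above them."""
    holes = 0
    for col in range(len(board[0])):
        seen_block = False
        for row in range(len(board)):
            if board[row][col]:
                seen_block = True
            elif seen_block:
                holes += 1
    return holes


def tetris_cost(board, w_height=0.5, w_holes=4.0, w_bumpiness=1.0):
    """Weighted sum of the features; lower is better (weights are hypothetical)."""
    heights = column_heights(board)
    bumpiness = sum(abs(a - b) for a, b in zip(heights, heights[1:]))
    return (w_height * sum(heights)
            + w_holes * count_holes(board)
            + w_bumpiness * bumpiness)


flat = [[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 0, 0]]
holey = [[0, 0, 0, 0], [1, 1, 0, 0], [1, 0, 0, 0]]
print(tetris_cost(flat), tetris_cost(holey))
```

A planner would simulate every position and rotation of the next block, evaluate the resulting board with such a function and pick the placement with the lowest cost.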


February 20, 2022

3a1f Reward maximization in a production line

 < 3a1e Improved production line robot



The example with the production line game was introduced in a previous post. Even if a human operator can interact with the game, the more exciting task is to program a robot which can solve the game. Because the game is already grounded with an elaborate reward function, such a robot solver is pretty easy to realize. For the concrete game, only 30 lines of code in the Python language were enough to create such a robot.
What the machine does after it gets started is simple: it maximizes the reward. That means the solver has a clear understanding of the goal of the game, and it can plan the optimal action sequence. The resulting performance is much higher than what a human operator has to offer. Not a single mistake was made and the number of picks per second is high.
The reward grows constantly over time, and the robot also visits the charging station regularly. This sensible behavior is not the result of the onboard AI itself; it has to do with the reward function. The reward function defines which actions are good and which are not, and this guides the search in the state space.
Perhaps the most surprising insight was that it is not the robot that is intelligent; rather, the game has this feature built in. The opposite is a game which has no reward function. Without a reward it is not possible to play the game automatically.
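How a reward function guides the search can be shown with a small sketch. This is not the solver from the post. The toy game below (a line with five cells, a charging station at cell 0 and a pick location at cell 4) and all its numbers are assumptions for illustration.

```python
import itertools

ACTIONS = (-1, 0, 1)  # move left, stay, move right


def step(state, action):
    """Hypothetical toy game; returns (next_state, reward)."""
    position, energy = state
    position = max(0, min(4, position + action))
    energy -= 1
    reward = 0
    if position == 4:   # pick location: one pick per time step
        reward += 1
    if position == 0:   # charging station refills the battery
        energy = 10
    if energy <= 0:     # empty battery is punished by the referee
        reward -= 5
        energy = 0
    return (position, energy), reward


def plan(state, horizon=3):
    """Receding horizon planner: try all action sequences, return the best first action."""
    best_total, best_first = float("-inf"), 0
    for sequence in itertools.product(ACTIONS, repeat=horizon):
        current, total = state, 0
        for action in sequence:
            current, reward = step(current, action)
            total += reward
        if total > best_total:
            best_total, best_first = total, sequence[0]
    return best_first


print(plan((2, 10)))  # full battery: head for the pick location
print(plan((1, 1)))   # battery almost empty: head for the charger
```

The planner itself contains no knowledge about batteries or picks. Both behaviors come out of the reward that the game returns.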

 

3a1e Improved production line robot

 < 3a1d Programming an assembly line robot



Compared to the previous simulation game, the GUI was improved a lot. The robot has to sort the tokens, but this time the task is more complicated. The performance evaluation was also improved, so the robot needs to adjust its movements.
One thing remains the same. The robot isn't controlled by an AI; the game is played by human intervention. That means the operator has to press the arrow keys, and this triggers the actions of the robot. The AI is located in the referee, that means the box with the performance information on the lower left side is generated by software on the fly.
For example, if the robot worker has placed the token into the wrong bin, its reward is reduced by 4. Also, if the robot keeps on walking even though the energy level is low, a negative reward is the result. This performance evaluation is needed for grounding reasons. The task of navigating a robot in a factory is translated into a score. That means it is not important what exactly the robot is doing; the only thing that matters is the measured reward.
At the end of the game it is pretty easy to judge what the robot has done. For example, it has executed 16 picks, has reached an overall score of 11 and took 92 seconds. All the quality criteria are available as integer values, which means they can be stored in a computer program very well. In contrast, the original domain, which has to do with sorting tokens by color and placing them at the correct position, is hard or even impossible for a computer to understand. So we can say that the shown game is an example of a grounded Artificial Intelligence.
 
Let us take a look at the game itself. The incoming line on the left side delivers new tokens in random order. The task for the robot worker is to sort them by putting them on the outgoing lines on the right side. So it is some sort of pick&place task.
The interesting point is that the robot worker isn't controlled by a sophisticated Artificial Intelligence but works in teleoperated mode. That means the shown simulation is a normal computer game. After the keys are pressed, the robot does something. The new and advanced element is that the virtual referee determines very precisely whether an action makes sense or not. That means every action of the robot is tracked, monitored and translated into a reward score. The robot worker is under total surveillance and has to account for everything. So it is a highly accurate scoring system to determine the performance of the robot.
At first glance such a game doesn't look very pleasant, because human workers don't like the idea of being monitored. But from the perspective of Artificial Intelligence it makes a lot of sense, because such a domain is a great testbed for an optimal control algorithm. The domain has possible actions (up, down, left, right), and it provides feedback stored in the total reward. The feedback is a numerical value, so the task for the model predictive control algorithm is pretty easy. The goal is to maximize the reward. This is equal to winning the game.
Here is the pipeline in short:
Teleoperation -> simulation -> features -> cost function
 
Playing the game with an algorithm
How humans play such games is easy to explain. They look at the monitor and decide which actions should be done next. With a bit of training a human operator will reach a better performance, but he will surely still make some minor mistakes when repeating the actions over and over again.
The more interesting question is how an optimal control algorithm will play the game. The algorithm doesn't have any sort of human intelligence, so it has to focus on the information box on the lower left side. The box contains the variables: elapsed time, energy, picks, total reward.
From a computer perspective these variables are stored in an integer array which consists of 4 elements. In addition, the computer needs another variable to store the possible actions for the robot (0=left, 1=up, and so on). The interesting thing is that with this minimalist information a computer is able to play the game. There are many existing algorithms available to determine the optimal action sequence. Most of them work with graph search which is guided by the reward information.
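The data layout described above could be sketched as follows. Only the codes 0=left and 1=up are given in the text; the other two action codes, the order of the array and the energy value are assumptions for illustration.

```python
from enum import IntEnum


class Action(IntEnum):
    LEFT = 0
    UP = 1
    RIGHT = 2  # assumed, the text only says "and so on"
    DOWN = 3   # assumed


# index layout of the 4-element integer array (the order is an assumption)
ELAPSED_TIME, ENERGY, PICKS, TOTAL_REWARD = range(4)

# time, picks and reward as in the example game; the energy value is made up
observation = [92, 40, 16, 11]

print(observation[PICKS], observation[TOTAL_REWARD], Action(0).name)
```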
 
The inner working of the AI
In the concrete example it is possible to explain how the AI works. The usual conception is that the robot is doing something and should determine by its own intelligence which action is the best one. So the question is how exactly the AI knows that the robot has to go to the charging station or place the token at the correct place.
The surprising answer is that the robot doesn't know. What is available instead is a virtual referee. The referee determines the score during the game. He determines whether an action was good or not. Such a referee is used in most video games, for example to detect the collision of a player with walls. Here in the production line simulation the referee is more advanced and determines many other details. The interesting point is that the referee can judge human players and AI-controlled robots alike. So the new understanding is that the robot doesn't need an onboard AI, but the game needs a virtual referee.
This referee makes it possible to ground a game. Grounding means converting the 320x200 pixel map into the small list of variables shown on the lower left. This small number of variables is used by a solver to play the game automatically.



February 19, 2022

3d The bottleneck in robot programming

 < 3c The misconception about bottom up robotics

Contrary to a famous myth, the bottleneck is not programming the robot itself. It has to do with defining the cost function. The initial situation is that a teleoperated robot is available, for example a robot arm or a remote controlled car. What is missing is a numerical judgement of whether the actions of the robot are desirable or not.
In most cases this decision is made by humans and not by an algorithm. And this missing scoring system prevents the task from being automated. So the challenge is to create software which judges a teleoperated robot. For example, if the car collides with an obstacle the costs are high, and if the car stays in the lane the costs are low.
Such an algorithm or software is mostly a mathematical equation. It takes features as input and determines the cost value as output. This processing step is the core element of a virtual referee and the bottleneck in today's robotics. It makes a large difference whether such a scoring system is available or missing.
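For the car example, such an equation could look like the sketch below. The two features (a collision flag and the lateral offset from the lane center) and all constants are assumptions for illustration.

```python
def car_cost(collided, lateral_offset_m, lane_half_width_m=1.5):
    """Virtual referee for a teleoperated car; lower cost is better."""
    if collided:
        return 100.0  # a collision is the worst case
    offset = abs(lateral_offset_m)
    if offset <= lane_half_width_m:
        return offset / lane_half_width_m  # inside the lane: cost between 0 and 1
    return 10.0  # outside the lane but no collision


print(car_cost(False, 0.0), car_cost(False, 3.0), car_cost(True, 0.0))
```

The function judges the car no matter whether a human or a program is steering it.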

3a1d Programming an assembly line robot

< 3a1c Game design with petri nets



The picture shows an assembly line simulation game. The user can control a robot in the middle, and the task is to sort incoming tokens. The robot has a battery level and all the actions are scored. If the robot puts the wrong token on the outgoing conveyor, a certain amount of error cost is created. So the overall objective of the game is to reduce the costs.
That doesn't sound very complicated, right? The AI is located in the game engine. The game engine determines the score and simulates pick&place actions in the game. The shown game can be played by a human player very well.
The interesting point is that such a grounded domain can be automated easily. All that is needed is to solve the given optimization problem. The goal is to minimize the costs for the robot, and the costs are calculated by the game.
 
Let us try to elaborate on the situation a bit. In bottom up robotics the idea is to program the robot in such a way that it solves a task. Such a program is not needed here and it wasn't implemented. The idea of top down robotics is that the AI is equal to the virtual referee. The virtual referee monitors a game and determines the score for a player.
The example simulation allows the robot to do certain actions. It can pick a token, it can place a token, it can walk around and it can charge the battery at the lower position. All these actions have consequences. For example, if the battery level is below a certain threshold, the costs for the robot grow fast. So it is a classical video game, except that a strong emphasis was put on the scoring function.
The interesting point is that after starting the game from the command line the robot won't do anything. The reason is that it wasn't the objective to program the robot. Instead, the core element is the scoring function, which is located inside the physics engine. What this scoring function is able to do is judge the actions. It converts possible behaviors in the game into a score. This score is shown on the top left of the screen. It is numerical feedback about the meaning of actions. Only actions which generate low costs make sense.
The principle has to do with social roles. There is an actor, which is the robot. In theory, the robot can do anything, which includes putting the token on the wrong conveyor. In the game, such actions produce higher costs. The virtual referee is sometimes called a critic because he judges the robot.

February 16, 2022

3a1c Game design with petri nets

 < 3a1b Top down robotics



The most practical way of realizing top down robotics is to invent a game which has to be played by the AI. Simple games take place on a graph and can be visualized with Petri nets.
What the robot can do inside the game is execute actions. The game engine monitors the possible game states and determines the score for the robot.
Once such a game has been created, it is possible to search for a path. The solver needs constraints and a goal as input and determines the action sequence to reach this goal.
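A path search on such a game graph can be sketched with a breadth-first search. The graph is a plain state graph rather than a full Petri net, and its states and actions are invented for illustration.

```python
from collections import deque

# hypothetical game graph: state -> {action: next_state}
GAME = {
    "start": {"pick": "holding"},
    "holding": {"walk": "at_conveyor", "drop": "start"},
    "at_conveyor": {"place": "done"},
    "done": {},
}


def find_actions(game, start, goal):
    """Breadth-first search; returns the shortest action sequence or None."""
    queue = deque([(start, [])])
    visited = {start}
    while queue:
        state, actions = queue.popleft()
        if state == goal:
            return actions
        for action, next_state in game[state].items():
            if next_state not in visited:
                visited.add(next_state)
                queue.append((next_state, actions + [action]))
    return None


print(find_actions(GAME, "start", "done"))
```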
sources :
Jensen, Kurt. Coloured Petri nets: basic concepts, analysis methods and practical use. Vol. 1. Springer Science & Business Media, 1997.

3c The misconception about bottom up robotics

< 3b Micromouse 

Bottom up robotics works with certain assumptions about what the world looks like. The idea is that some sort of problem is given and then the robot has to be programmed to solve the challenge. For example, there is a line following challenge and the robot has to stay on the line.
The misconception is to assume that solving a well defined problem is hard or has something to do with artificial intelligence. Nope, the real challenge is located somewhere else. The real problem is that most robotics challenges aren't formalized enough. Suppose there is a well defined hierarchical problem which includes a cost function, subgoals and possible actions. Then it's trivial to control the robot autonomously. Such a problem is not an AI problem; it has to do with solving optimization problems.
Controlling a robot isn't very complicated if the problem has been defined very well. The reason is that the actions of the robot have to do with maximizing its own reward. This can be realized with cost-guided search algorithms like A* or RRT. Reinforcement learning is able to fulfill the same task.
But if robot programming from a bottom up perspective is easy, why are real robots struggling to solve tasks? Because it's very complicated to define in software what the problem is about. A human sees immediately if a robot has struggled with a problem, but a virtual referee, which is a computer program, has more trouble recognizing the issue. In game programming this module is often called a game engine, and for most domains like grasping or self driving cars no such game engine is available.

February 14, 2022

3a1b Top down robotics

< 3a1a What is a robot?

A less common approach to realizing robots is the top down method. In contrast to bottom up robotics, intelligence is not defined as the ability to solve problems; it is the judgment about the behavior of others who have to solve problems. Top down robotics is equal to implementing a referee in software who decides which player has won the game.
Perhaps the most often used example of a top down evaluation function is the one for the 15 puzzle game. The 15 puzzle is implemented together with the Manhattan distance scoring function to measure how far a certain position is away from the goal state. It is up to the player in the game how to minimize the costs.
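The Manhattan distance for the 15 puzzle fits into a few lines of Python. The board encoding (a tuple of 16 numbers in row-major order, 0 for the blank, goal state 1 to 15 followed by the blank) is an assumption for this sketch.

```python
def manhattan_distance(board, size=4):
    """Sum of the horizontal and vertical distances of all tiles to their goal cells."""
    total = 0
    for index, tile in enumerate(board):
        if tile == 0:  # the blank is not counted
            continue
        goal_index = tile - 1
        total += abs(index // size - goal_index // size)
        total += abs(index % size - goal_index % size)
    return total


solved = tuple(range(1, 16)) + (0,)
one_move_away = tuple(range(1, 15)) + (0, 15)  # tile 15 moved one cell to the right

print(manhattan_distance(solved), manhattan_distance(one_move_away))
```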
 

3b Micromouse

< 3a Programming robots with learning from demonstration

The micromouse competition is a concrete example of what narrow AI is about. There is a clearly defined goal and even non-experts can judge whether a robot has reached the goal or not. The details of how the robot does so can be ignored, and it is up to the programmer how to build the robot.
Most examples of successfully created robots work with a combination of a path planning algorithm and a steering algorithm. This principle is called bottom up robotics because the robot has to fulfill an external objective, which is reaching the goal. This paradigm is so widespread that it is hard to explain potential alternative approaches.
Another, more elaborate strategy to program a micromouse robot is the top down method. Here the idea is that it doesn't matter whether the robot has reached the goal; the more interesting question is which score the robot has reached. For example, if the robot collides with the wall it gets a -1 reward, and if it has reached a waypoint it gets rewarded with +1.
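The scoring rule from the example can be written as a small referee function. The event flags and the sample episode are assumptions for illustration.

```python
def micromouse_reward(hit_wall, reached_waypoint):
    """Reward for one time step: -1 for a wall collision, +1 for a waypoint."""
    reward = 0
    if hit_wall:
        reward -= 1
    if reached_waypoint:
        reward += 1
    return reward


# hypothetical episode: one (hit_wall, reached_waypoint) pair per time step
episode = [(False, False), (True, False), (False, True), (False, True)]
score = sum(micromouse_reward(hit, waypoint) for hit, waypoint in episode)
print(score)
```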

February 13, 2022

3a1a What is a robot?

< 3a1 Reward function

The terms robot and robotics are used frequently in this blog without giving a precise definition. The only thing which is relatively sure is how to create robots from a hardware perspective. There is a large industry which delivers components like servo motors, microcontrollers and gear boxes. Remote control toy cars can be seen as robots as well.
The more interesting and unsolved problem is how to program such devices. What is available today is an endless amount of academic publications about the question of how to program micromouse robots, automate something with pick&place robots and make biped robots walk.
A relatively new and very advanced technique is learning from demonstration. LfD stands in contrast to the former bottom up robotics. Bottom up robotics was invented in the late 1980s and is the blueprint for today's robot challenges. But bottom up robotics is, similar to the BEAM method, strongly focused on the robot itself and ignores the underlying game rules in which the machine has to operate.

3a1 Reward function

< 3a 

As an entry level task, a robot can be realized by defining a reward function. A reward function isn't about programming the robot itself; it defines the game. Or to be more specific, it's a scoring algorithm to evaluate the actions of the players within the game. The player's role can be filled by humans and robots alike. For example, in the famous Pong game, the scoring mechanism decides that player 1 gets a +1 score if the opponent wasn't able to block the ball.
A good scoring mechanism provides a continuous reward. The game can be paused at any moment and the algorithm determines who is winning the game with floating point precision. Visualizing a reward function, or a cost function as well, is usually done with a potential field, which is a heat map.
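A potential field can be sketched as a grid in which every cell stores the cost of standing there. The grid size, the goal cell and the use of the Manhattan distance as cost are assumptions for illustration; the numbers are printed as text instead of colors.

```python
def potential_field(width, height, goal):
    """Grid of costs; every cell holds its Manhattan distance to the goal cell."""
    goal_x, goal_y = goal
    return [[abs(x - goal_x) + abs(y - goal_y) for x in range(width)]
            for y in range(height)]


field = potential_field(5, 3, goal=(4, 1))
for row in field:
    print(" ".join(f"{cell:2d}" for cell in row))
```

Moving from any cell to a neighbor with a smaller number leads downhill to the goal, which is what a heat map shows at a glance.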

3a Programming robots with learning from demonstration

< 3 Artificial Intelligence

Before a robot can be programmed, there is a need to define what the problem is about. The problem is not to invent yet another programming language or to speak to the robot on a hardware level. Also the problem is not about creating databases or programming a servo controller in assembly language. All these topics are solved already and are being analyzed by computer experts and programmers. But they are located outside of AI.
The AI side of a robot has to do with keyframe models for animation, defining a reward function and tracing human actions with motion capture suits. These tools sound a bit uncommon because they have nothing to do with classical robots. They are located within AI and have to do with how to build future technology.

3 Artificial Intelligence

< About this blog

AI can be split into narrow AI and AGI. Narrow AI is easier to grasp because it has to do with building micromouse robots, in-game AI, pick&place robots and chatbots.
The interesting point is that the subject of AI is a large domain within science. Around 5 million papers and an endless amount of books have been written about the subject. So the question is not what AI itself is, but what has been published about the subject.

2 About this blog

This blog is about Artificial Intelligence and robotics. Even though it contains articles about programming and operating systems like Linux, the focus is on AI. AI means thinking machines, game AI, neural networks, and of course practical applications like building line following robots.
As of today, the blog consists of 587 posts. The earliest ones were created in 2018 and are about the Forth language (“Ext4 filesystem with Forth”) and typesetting a paper in LaTeX (“Typesetting LaTeX for somebody else”).

February 08, 2022

What is a Zettelkasten?

First of all, an index card is something which has become obsolete since the advent of computers. Index cards were used until the 1960s and were later replaced by electronic databases and word processing software. It is an outdated non-electronic medium which can't compete with the Internet.
So it is a bit surprising to see that for a while now the index card has become a popular technology again. Similar to the analog vinyl record, there is a huge interest in this medium. This interest can be seen in the books sold about the topic at Amazon. The book at position #1 on the bestseller list is about index cards and the underlying Zettelkasten method.
Before we can judge the topic, we have to explain briefly what an index card note-taking system is about. The main principles are that the amount of text on each card is limited to around 500 bytes and that the cards are numbered in a complicated system. These elements are present in electronic and analog Zettelkasten systems alike, so we can assume that they are the blueprint for the system.
The main contrast between analog and electronic storage systems is that analog media are limited in space. The US letter format has a size of 8.5x11 inches while an index card has a size of 3x5 inches. This limitation does not exist for electronic media. An HTML page has no fixed border; its width depends on the web browser. A window can have a width of 800 pixels, or only 200 pixels on a smartphone. The direct result of a fixed size, for example of an index card, is that the number of bytes which can be stored is limited. Writing an index card by hand results in a size of approximately 500 bytes but not more.
Let me give a concrete example to explain why this is important. Suppose a writer likes to make notes. So he creates a text file, writes the content into the file, and after storing the file on the hard drive he recognizes that the file is 300 kB in size.
In contrast, if the same author uses index cards to make notes, he will quickly recognize that after writing down the first sentences the index card runs out of space. That means it is not possible to squeeze 300 kB onto a single index card. The logical consequence is to use a second and third card to write the text.
For the writing process this limitation has a huge impact and leads to a different sort of text. If many index cards are used to store information, some sort of sorting mechanism is needed, which is mostly a numbering system. The combination of a size limit of 500 bytes plus a numbering system to overcome the bottleneck results in the well known Zettelkasten method.
There are some attempts to simulate index cards on a computer. In the easiest case, the MS Word software is used and a table holds the index cards. The table has three columns for the id, title and text, and then the user can write the content he likes. If he takes care that each entry stays within a limit of 500 bytes, it is a simulated Zettelkasten.
The open question is how to use such a table to make notes. In the past this question was never asked, because since the advent of the PC there has been no such thing as a size limit of 500 bytes. That means in a normal Word file it is possible to write down an unlimited amount of words or bytes. A carefully formatted table or a physical Zettelkasten is a step back to the time before the advent of the computer. This is what the Zettelkasten community is talking about. They want to explore how note taking was done with a size limit for an index card.
Outliner software
State of the art programs for making notes on a computer are outliner programs. At first glance they have much in common with a Zettelkasten, but a closer look shows that the principle works differently. In an outliner there are hierarchical sections, and there is no space limit for a single section. Let me give an example. In a typical outliner the user creates some sections like:
section 1
____section B
section 2
____section 4
____section 5
And then he can write down a text or a table in each section. The principle has much in common with creating files in a folder, but an outliner provides the features under a single GUI. So it is a very comfortable way of writing. The interesting point is that writing with index cards is the opposite of an outliner, because index cards have limitations while an outliner has none.
Or let me formulate it differently: in an outliner there is no need to use the Zettelkasten method or to learn how to use index cards.
Sorting cards in a box
Suppose somebody has written text onto a handful of index cards. The next step is to archive the cards in a box. But how? There are at least two competing ways of doing so. One easy to understand option is to give every index card a title and then sort the cards alphabetically.
A more complex approach is to number the cards and then use the number for sorting them. The Zettelkasten method prefers the second approach. The reason why is a bit complicated to explain. A numbering system is mostly equal to a topic based category system. For example, index cards about sport start with 1, index cards about music with 2 and so on. The number is shorthand for a position in a category tree. In the case of the Zettelkasten method a more complex system is used in which the numbers are determined on the fly without defining the category tree first. The idea is that a number can't be changed later and that the position of the card in the sequential list is important.
The Zettelkasten elite
In its self-understanding, the Zettelkasten community has much in common with owners of a fixed gear bicycle. A fixie bike works differently from a normal bike. It has no brakes, so a certain technique called skidding is needed to stop the bike. A similar approach is used in the case of a Zettelkasten. If the amount of stored information gets larger, a workaround is needed to stay in control of the information.
A non-Zettelkasten user would ask why not simply use a computer to store the notes and use full text search to find something in the database. This is not an option for the community. They are using index cards because they want to manage the disadvantages. Similar to fixed gear bicycles, there is a certain sort of social control to handle the situation. For example, it is not allowed to use these bicycles on a normal road, and it is strictly forbidden to use a physical Zettelkasten to write a book or paper. This rule is so obvious that there is no need to formulate it, but everybody knows that Zettelkasten and bicycles without brakes are not wanted anymore. The result is the advent of a subculture which has defined its rules in opposition to the mainstream understanding.
Public roads are not the right place for fixed gear bicycles, and libraries are the wrong place to make notes on a 3x5 index card. What is used instead to manage information is a modern laptop in combination with an outliner or a citation manager. This is what all the students are using.
There is an easy way to identify which tools are mainstream and which are not. In so-called library prank videos the idea is to create a funny unusual scene and record the reactions of the audience. If a student enters a library with a physical typewriter, the other users will smile about it because a typewriter is recognized as an outdated technology. If the same student uses index cards, he is using another unusual technology.
The reason why index cards were abandoned in libraries is simple. They have too many disadvantages compared to notebooks. A single notebook can store more information and is easier to use than a slip box. Especially the combination of a pen plus index cards is a very costly way of storing information.
Modern information technology
A mainstream-compatible technology for storing and retrieving information is the laptop. These devices have a huge disk space. They can easily store one gigabyte and more of information. The software allows full text search in the information, so it is a great approach to handle high demand. Students especially use laptops to create bibliographic databases and write texts.
From a laptop perspective there is no need for index cards. Current programs like outliners and word processing software are more than capable of supporting academic writing. The rise of the Zettelkasten method has to be interpreted as some sort of counter-movement which questions whether full text databases are needed and which alternative information storage options are available.
Footnotes with index cards
There is a YouTube video which explains how to create footnotes: “The Note Card System for Research”, https://www.youtube.com/watch?v=ZNH2Nubcfh4 The idea is to store the footnotes of an academic text on index cards. The index card is sometimes called a note card, which is short for “footnote card”. Once the paper has been written, the cards get archived in a box. This is the basic principle in academic writing.