Robotics and Artificial Intelligence: June 2019

June 30, 2019

Linux is rejected because it's an operating system

Most Linux advocates try to defend Linux against criticism with the aim of convincing the other side of its advantages. The problem with this line of argument is that any potential alternative to Linux is defined as the wrong choice and then ignored. The better idea is to understand why somebody argues against Linux.

Linux is the legitimate successor to Unix, and Unix was invented as an operating system. Exactly this is the dominant reason why the world doesn't like the concept very much. The reality is that the concept of an operating system is unusual for the computer industry. By definition, a standardized operating system is able to run on multiple kinds of hardware, for example smartphones, desktop PCs and servers, and it provides the programmer with an abstract platform on which he can write software. Exactly for these features Linux is hated by Apple, Microsoft and IBM.

But doesn't Apple provide an operating system itself? No, it doesn't. Mac OS X is not an operating system but something different. The reason is that Mac OS X doesn't run on multiple kinds of hardware and doesn't provide the programmer with an abstract platform. Instead, the potential developer has to install the Xcode development platform. The main idea behind Xcode is that the number of programmers is reduced, the set of allowed programming languages is smaller, and Apple controls what the standard is. In the case of Microsoft the situation is the same.

Instead of arguing for or against Linux we have to argue for or against an operating system. Does it make sense to invent a middle layer between hardware and software? Does the middle layer have to run on different hardware? Is there a need for the programmer to choose his preferred language? Somebody who argues against an operating system will also criticize Linux.

Let us describe a world in which no operating system is available. If somebody likes to program a computer, he has to ask the manufacturer of this computer. The software is written with a certain machine in mind, for example only for an Intel-based CPU, which has its own particular set of registers. The software won't run on a different platform, for example an ARM CPU. This makes migrating the software to a different computer more expensive. It slows down the development of new programs and increases the costs for the end user. Who profits from such a situation? The surprising answer is that in today's computing world a lot of companies profit from it. If software development costs more money, then the profit is higher. It's the same situation as in the time before the spinning jenny was invented (by James Hargreaves in England). The clothing manufacturers had no interest in modern technology because it would destroy their traditional business model. An operating system, especially a highly powerful one, is a kind of innovation which increases productivity. And exactly for this reason Linux is boycotted everywhere, especially by large companies which have a long history. They have understood what an operating system is, and they are sure that they don't want it.

History of operating systems

In the past, it was common to develop hundreds of different so-called operating systems. Even if they were called by that term, in most cases they were an improved BIOS firmware. The best example is the kernel of the Commodore 64, which runs only on that machine. The early IBM System/360 software and the Mac GUI can be treated the same way. It seems that the different companies had a great need to develop an endless amount of system software but made only little attempt to develop a standardized operating system.

The typical firmware of the past ran on exactly one computer type, was limited to a certain programming language and restricted the number of programmers who were able to write software for it. From an economic standpoint this development is surprising, because a standardized operating system which runs on many platforms and which is open to everyone would reduce the costs of software development drastically. And exactly for this reason a standard OS was never invented. That means the software industry loves a highly unproductive world in which the wheel is reinvented again and again.

What notebook manufacturers and game console producers do very often is produce a unique operating system. Not because there is a technical need for doing so, but because it is the selling point. A non-standardized firmware helps to lock in the customer, gives the hardware company a greater amount of control and makes the resulting product unique. That means the notebook of the first generation has a different firmware than the notebook of the second generation, and the customer has to pay additional money because he has to buy all the software again. The modern computer industry isn't interested in programming advanced software; it wants to earn money.

Let us describe the situation from an economic point of view. How much more productive and cost-effective is Linux than a proprietary firmware like Mac OS X? By my estimate the costs are 10x smaller or more. That means the same software can be produced with less manpower, has fewer errors and becomes 10x more affordable. The comparison with the spinning jenny makes a lot of sense. Linux (or any other operating system) improves productivity drastically, and this makes the former way of developing software obsolete.

Alternatives to Linux

Suppose a hardware company doesn't like Linux very much. What can it do if it wants to do exactly the opposite? The first thing is to invent a new kind of CPU which isn't compatible with the x86 standard. Then it builds a new laptop around the CPU and programs a BIOS and a firmware on top of the system. This firmware is able to execute only software which was written for this unique hardware.

From a technical point of view, all these design decisions are useless. On the other hand, the chance is great that the company is able to sell the product to some customers, because what it is doing is widely accepted. If it explains that all the parts of the notebook, like the CPU and the firmware, are handmade, the company is able to ask for a higher price. If the customer on the other side is not an expert and is happy to get access to a computer at all, he will pay the higher price.

The decision to use a new CPU, to program a new firmware which is not an operating system, and to allow only certain software to run on the machine is the best-practice method if somebody likes to resist technology. Mostly, he is selling a story in which any kind of standard is declared evil and an operating system isn't mentioned at all. Instead, the plot is about happy customers who get access to a PC because they are rich enough to spend thousands of US dollars on a handmade product. Such a laptop is equal to a custom-tailored men's suit which wasn't manufactured on efficient machines but produced in hundreds of man-hours without any modern technology. The underlying idea is that neo-Luddism makes sense, that a computer is an expensive good and that one's own system is different from all other products.

Weakness of Linux

Linux has some features which can become problematic under certain constraints. One of the disadvantages is that a single program has the tendency to become executable on any computer hardware. That means the same piece of source code will run on x86 PCs, smartphones and servers. All that is needed is a simple make command, and sometimes the binary file will even run out of the box. Another disadvantage is that Linux software has the tendency to produce a small profit or no profit at all. That means it is impossible to sell a game for 100 US$ to the Linux community. The end user is not used to paying such a price. The only thing that makes sense to Linux users is a game provided for free or for a small amount of money, such as 0.99 US$ for an AAA title.

Both disadvantages are part of the inner structure and can't be overcome. If the goals are the opposite, then Linux is a bad choice.

June 29, 2019

Reinventing agriculture with more knowledge

At first look, the topic of food production on a mass scale is not very interesting, because humans have been practising agriculture for thousands of years and everything seems to be researched already. Surprisingly, only the general idea of producing food is known; what is unclear is how to reduce the costs of doing so. A common misunderstanding about plants and agriculture concerns the importance of the sun. What flowers need above all is direct sunlight. For a human it would be dangerous to sunbathe for 3 hours in the midday sun; in contrast, the plants on the farm do so every day of the growing season.

In a typical agricultural region the sun shines up to 16 hours a day in summer and there is no kind of sun protection. That means plants tolerate ultraviolet radiation and have a natural need for a large amount of light energy from the sky. They use the sunlight to produce leaves.

Another misconception in agriculture concerns the importance of water supply. Farmers in the past often used no additional water on the field because water is an expensive good, or they utilized garden sprinklers in the summer but couldn't afford them in the long run. The answer to the problem of water supply is called a drip irrigation system. It reduces the amount of water needed to a minimum. The main feature is that it takes many hours until a small amount of water is put onto the field, which means that a typical drip system allows a throughput of only 0.5 litres per hour. Additionally, the time at which the system is active is important for reducing the waste of water.
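The throughput figure can be turned into a rough daily water budget. The following is only a small sketch: the 0.5 litres per hour comes from the text above, while the number of outlets and the hours of operation are assumptions made up for illustration, and the function name water_per_day is hypothetical.

```python
def water_per_day(flow_l_per_h: float, hours_active: float, outlets: int = 1) -> float:
    """Return the litres of water delivered per day.

    flow_l_per_h: throughput of one drip outlet (0.5 l/h in the text)
    hours_active: hours per day the system is switched on (assumed value)
    outlets:      number of drip outlets on the field (assumed value)
    """
    return flow_l_per_h * hours_active * outlets


if __name__ == "__main__":
    # Assumed scenario: 100 outlets running 4 hours per day.
    litres = water_per_day(0.5, 4, outlets=100)
    print(f"{litres} litres per day")
```

The point of the calculation is that the daily amount depends on the running time as much as on the throughput, which is why the time window of the system matters.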

June 28, 2019

Group management made easy

Working in groups is different from working alone. Most groups are organized around a higher instance, which is the manager. The good news is that the details of group work can be researched with a scientific method called total customer orientation (TCO). TCO is often referred to as a management principle to be applied in practice, but the more powerful scenario is to use TCO as a theoretical framework for explaining group work in general.

Let us describe a group situation. The manager orders the subordinate to do a certain action. Why does the subordinate obey? What will happen if not the manager but a different person gives the same order? Total customer orientation has the answer to all of these questions. Groups work because certain speech acts make sense to them and others don't. A group consists of the people in the circle plus an external customer. The customer can be an individual person or the abstract market. The funny thing is that the group will do everything the customer wants. All the manager has to do is translate the customer's wishes into tasks for the subordinates in the group.

A typical example of group coordination is given in the following dialogue.

Customer to the manager: I'd like to buy a pair of shoes.

Manager to the group: My customer needs shoes.

Group to the manager: Which ones? We have them in black and in white.

Manager to the customer: Which color do you like, black or white?

Customer to the manager: The white ones.

Manager to the group: Give me the white shoes.

Group: Here, please.

And so on.

This kind of dialogue makes sense. The best example is the direct order from the manager to the group (“Give me the white shoes”). The reason why the group obeys, and even likes to hear this order, is that the speech act makes sense to the group. The group knows that the manager is in a talk with the customer, and if they provide the shoes in white, the chance is high that the customer will buy the product in the end. That is the reason why the group allows the manager to give them a direct order.

The simple explanation of what management is: the manager acts as an interface between a group and an external customer. He does not give orders of his own but translates speech back and forth. He translates the group's speech into the customer's speech and the other way around. If a person likes to become a manager, he has to identify this social role in reality. He has to put himself between an existing group and the environment. In that position he is allowed to give orders to the group.

A concrete example of an ancient bazaar is given in the image. The merchant is trying to sell fish at the market. Two customers are at the stand already, and they have management privileges. What does that mean? It means that customer 1 is allowed to formulate a speech act such as:

“Give me the smoked eel and make me a good price.”

Customer 1 can give this statement as a direct order to the fish stand, and for the other side this order makes sense. Sense is important because only commands which have a meaning can be fulfilled. In response to the command, the manager of the fish stand has to translate it into the language of the employee. He can formulate a statement like:

“Put the eel into a package and give it to me!”

This order also makes sense to the employee, because this kind of speech act can be expected at the market. The chance is close to 100% that the employee at the fish stand will obey. The general question is: why can one person give another person a command? Because he is higher in the hierarchy. Each request is started by a customer. That means the customer sits on top of the pyramid. From a macroeconomic perspective this is called a market-driven economy. The manager of the fish stand sits below the customer. He can only give an order to the employee if the order has something to do with the customer. Otherwise the employee will ask why he should do it.
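The rule that an order is only obeyed when it can be traced back to a customer can be written down as a tiny model. This is a sketch under my own assumptions, not a formal part of total customer orientation: the class names Manager and Employee and their methods are hypothetical and only illustrate the communication flow.

```python
class Employee:
    """Obeys an order only if it is backed by a customer request."""

    def handle(self, order, customer_request):
        if customer_request is None:
            # No customer behind the order: the speech act makes no sense.
            return "Why should I do that?"
        return f"done: {order}"


class Manager:
    """Interface between the customers and the employee."""

    def __init__(self, employee):
        self.employee = employee
        self.open_requests = []  # requests collected from customers

    def receive(self, request):
        """A customer talks to the manager, never to the employee."""
        self.open_requests.append(request)

    def order(self, text):
        """Pass an order on to the employee, linked to the latest request."""
        request = self.open_requests[-1] if self.open_requests else None
        return self.employee.handle(text, request)


if __name__ == "__main__":
    manager = Manager(Employee())
    print(manager.order("Put the eel into a package"))  # no customer yet
    manager.receive("smoked eel for a good price")
    print(manager.order("Put the eel into a package"))  # backed by a customer
```

The employee never sees the customer directly. Whether an order is accepted depends only on the requests the manager has collected, which is the position of the manager as described in the text.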

It's important to locate the position of the manager. He is the interface between the customer and the employee and talks to both sides. In contrast, the customer never talks to the employee within the company and vice versa. The communication flow defines who is in the position of a manager and who is not.

The drawing also explains how the manager can get more influence. He has to talk with the second customer as well. If he has collected two fish orders from the customers, his authority is much higher than before. Both customers together will pay 50 US$ for the fish, which is more than a single customer will pay. As a result, the employee will obey the manager more quickly, because the order makes more sense to him.

As I mentioned in the beginning, total customer orientation is not only a management principle but a theoretical framework for understanding group-related actions. It can explain how a fish market operates and who has to fulfill which order.

MS-DOS is the real gaming console

Many tributes to old computer systems are available in books and videos. Some examples of old-school hardware are the Commodore 64, the Sega Mega Drive and the Amiga 500 home computer. Sure, many interesting games were published for these machines, but one computer is superior to all of them: the classical MS-DOS PC. MS-DOS was an operating system from Microsoft, and on top of MS-DOS many games were released. The first examples from the late 1980s were not very advanced. The Tetris game of that time looked ugly, but the programmers learned fast how to develop advanced software.

What most people don't know is that typical arcade games like Street Fighter or Turrican were released for MS-DOS as well, and the graphics were comparable to gaming consoles. The most advanced DOS games were programmed around the year 1996. In that era, DOS was replaced by Windows and the commercial publishers stopped developing more games. What makes MS-DOS different from other gaming platforms is that amateurs and professional programmers alike can write software for it. All the newbie needs is a decent C compiler and a graphics library, and he can write any game he likes.

MS-DOS combines a gaming platform with an open architecture. This makes the environment superior to the Commodore 64 or the Sega Mega Drive. An outdated MS-DOS machine is the ideal gaming system and can tell a lot about computing history.

Do we have operating systems at all?

A commonly told story is that many different operating systems are available and the customer can choose between them. Examples of operating systems are Mac OS, Windows 10, iOS and AIX. But are these software products real operating systems? Sure, it's a rhetorical question which asks what the difference is between proprietary firmware which runs only on specific hardware and a true operating system which runs on different hardware from different manufacturers.

Let us make a simple experiment: we take the Windows 10 firmware and try to install it on an IBM Z system, the latest mainframe, which runs on IBM's own z/Architecture processors. Will it work out of the box? Or let us take the Mac OS X firmware and install it on a Raspberry Pi single-board computer: do we see the boot logo from Apple? Or let us take the Xbox One firmware and install it on an outdated Pentium III PC: can we do it?

Somebody who is a bit familiar with the different hardware systems will answer that it's not possible, and that even the attempt shows that we haven't understood what the purpose of Mac OS and the other programs is. And indeed we have interpreted their purpose the wrong way. The idea was that Windows 10, Mac OS X or the Xbox One firmware is an operating system and as a result is able to run on different hardware. We have ignored that the term operating system doesn't fit these programs. That means Windows 10 and the other examples were never designed as a standard operating system which makes things easier for the customer; they were created as a hardware-dependent, company-dependent firmware which, similar to AmigaOS, is useless once the underlying company is dead.

The interesting fact is that no commercial operating system is available to the customer. We are living as in the 1950s, when the operating system wasn't invented yet, and as a weak alternative each company has its own firmware which is not documented publicly.

And here Linux comes into the game. Linux is not the best operating system in the world, it's the only operating system. Linux is the only software which is able to run on an x86 PC, on a smartphone, on the IBM Z machine, on the Xbox and even on microcontrollers. This is the minimum requirement before a system can call itself an operating system. It makes no sense to argue whether Linux is good or bad, because no competitor is available. Let us make the point clearer. We take an outdated Amiga 500 with the Kickstart firmware and compare AmigaOS with Linux. Which is better? The answer is more difficult than it looks at first. The Amiga Kickstart is a classical firmware. It was designed to run on the Amiga and only on this computer. That means it is not possible to run normal software on Kickstart, only games and applications which were tailored to the system's needs. In contrast, the Linux kernel is a real operating system which acts as a layer between hardware and software.

The question for or against Linux is a question for or against an operating system. If the user comes to the conclusion that a proprietary firmware is everything he needs, then he will argue against Linux. If the user thinks that he needs a layer on top of the hardware which makes programming easier, then he will argue for Linux. The question which remains open is whether anybody needs an operating system.

It seems that today's computing market has answered the question with no. Mac users, Windows users and Xbox gamers don't need a high-level operating system; instead they depend on the hardware specification. The result of this collective ignorance is that software development becomes more costly. The typical Windows application costs around 200 US$, while the normal Xbox game costs 50 US$. The reason is that the software has to be programmed for the target platform and can't be migrated to other platforms. The situation looks similar to the Betamax video format in the 1980s. At that time the customers paid a lot of money for Betamax cassettes and were not able to use them in a different video player. They didn't even know that a common standard would make life easier, and the manufacturers were smart enough to say nothing.

June 26, 2019

Introduction to learning from demonstration

I would like to introduce the topic with a small example. Suppose an artificial intelligence should play Lemmings, because, quote, “The game of Lemmings has been offered as a new Drosophila for AI research” [1]. The naive approach would be to understand the Lemmings game as a kind of search problem in which the solver has to find the actions for winning the game. Because the state space is very large, this attempt will fail. To cut things short, the better alternative is to invent a plan language which guides the search process. But let us go into the details.

A simple level of Lemmings is shown in the figure.

The entrance is at the top in the middle and the lemmings must master the drop, then a stopper has to be set on the right side, and the wall needs a digger before all the lemmings are allowed to go to the exit at the bottom left. The arrows in the map demonstrate what the solution is. The interesting point is that these markers are equal to a walkthrough tutorial. That means the level is already solved and the lemmings only have to follow the guidance. Exactly this aspect is typical for learning from demonstration. Before the software gets started, the walkthrough is available, formalized as a plan.

Learning from demonstration means at its core formalizing a plan language. If the plan language exists, a solver can calculate the subactions. Now let us imagine what will happen on a different Lemmings map. No plan is available and the lemmings don't know what to do. The funny thing is that the artificial intelligence is no longer forced to control the game in all details; the only missing thing is a plan. The overall pipeline consists of two items: finding a plan for a map, and using the plan to control the lemmings.

The plan source can be a human, but it can also be a path planner which works on an abstract level. Even if the human provides the plan, the system works somewhat autonomously, because the human only has to draw the plan into the map and the system does the rest. From a certain standpoint this can be called cheating, because the AI only follows the walkthrough tutorial which is already there.

And exactly here “learning from demonstration” comes into the game. The main principle is to invent a plan notation for a domain. This plan notation is used for recording and playback of demonstrations. The sad news is that no standard is available for a plan language. It can be a graphical notation, a text-based language or a trajectory produced by dynamic movement primitives. What is important to know is that LfD is located between a teleoperated robot and an autonomous robot. The link between both consists of an abstract plan language.

Let us observe what a Lemmings AI does if the plan is known. The input for the system is the figure which contains the arrows. The plan is stored internally as a waypoint list with some small annotations. For example, the first step in the level is annotated with “parachute” while the second step is annotated with “stopper”. The AI solver takes this plan and converts it into low-level actions. It has to make sure that the plan is fulfilled. Each step is equal to a subgoal. The solver is given a subproblem which is only some seconds long and in which the general idea is provided by the plan. The solver only has to figure out the detailed adjustments to the plan. This can be done on a standard PC without much effort.
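The idea of an annotated waypoint list which is handed to a low-level solver can be sketched in a few lines of Python. This is not a real Lemmings engine: the grid world, the move names, the Waypoint class and the function execute_plan are assumptions made up for illustration, and the local solver is an ordinary breadth-first search which only has to bridge the short distance between two waypoints.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Waypoint:
    x: int
    y: int
    annotation: str  # e.g. "parachute", "stopper", "digger", "exit"


def shortest_moves(grid, start, goal):
    """Breadth-first search on a small grid; '#' cells are walls.

    Returns the list of moves from start to goal, or None if unreachable.
    """
    moves = {"L": (-1, 0), "R": (1, 0), "U": (0, -1), "D": (0, 1)}
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        (x, y), path = queue.popleft()
        if (x, y) == goal:
            return path
        for name, (dx, dy) in moves.items():
            nx, ny = x + dx, y + dy
            inside = 0 <= ny < len(grid) and 0 <= nx < len(grid[0])
            if inside and grid[ny][nx] != "#" and (nx, ny) not in seen:
                seen.add((nx, ny))
                queue.append(((nx, ny), path + [name]))
    return None


def execute_plan(grid, start, plan):
    """Treat every waypoint as a subgoal and solve only the short segments."""
    actions = []
    pos = start
    for wp in plan:
        segment = shortest_moves(grid, pos, (wp.x, wp.y))
        if segment is None:
            raise ValueError(f"subgoal {wp} is unreachable")
        actions.extend(segment)
        actions.append(f"skill:{wp.annotation}")  # annotation becomes a skill
        pos = (wp.x, wp.y)
    return actions


if __name__ == "__main__":
    grid = ["....",
            ".##.",
            "...."]
    plan = [Waypoint(3, 0, "parachute"),
            Waypoint(3, 2, "stopper"),
            Waypoint(0, 2, "exit")]
    print(execute_plan(grid, (0, 0), plan))
```

The search never sees the whole level as one problem. Each call works on a single segment between two waypoints, which keeps the state space small, while the order of the subgoals comes entirely from the plan.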

The funny thing is that the same principle can be transferred to any game. No matter if it's Sokoban, a grasping robot, Lemmings or RC car control, in all these domains a plan notation is used as an intermediate layer between the walkthrough knowledge provided by humans and the low-level solver which executes the plan.

Human guided teleoperation

The term “learning from demonstration” is a bit misleading, because it suggests that some sort of machine learning algorithm takes place. The more exact terminology is to call such systems human-guided teleoperation. Teleoperation means that in the basic setup a human operator controls the robot with a joystick. Plan-guided teleoperation replaces the joystick with a plan notation which allows the system to be controlled from an abstract level. In both cases the knowledge of how to solve a robot problem comes from the outside, either from direct human commands or from an abstract plan which is also provided by a human.

The overall system doesn't have much in common with a classical AI planner; it's more of a human-to-robot interface. The plan notation is similar to a joystick: an input device which translates human knowledge into machine-readable information.

[1] Kendall, Graham, and Kristian Spoerer. "Scripting the game of Lemmings with a genetic algorithm." Proceedings of the 2004 Congress on Evolutionary Computation (IEEE Cat. No. 04TH8753). Vol. 1. IEEE, 2004.

Real and fake operating systems

Sometimes GNU/Linux is compared with other operating systems like Windows 10 or Mac OS X with the aim of explaining the pros and cons. The assumption is that Mac OS X is an operating system which can be compared with Linux. Suppose Linux is the only operating system available: what exactly are Windows 10, Mac OS X and OpenVMS then? To describe it we have to go back into computer history.

Let us take a look at pseudo operating systems from the past. The Amiga 500 had a preinstalled Kickstart firmware which was able to display a GUI on the screen. The Atari ST was equipped with TOS and the GEM desktop, the IBM PC had a preinstalled MS-DOS, Apple computers were delivered with Mac OS, and IBM's Unix servers use the AIX operating system. All these pseudo operating systems work on the same principle. First, the source code has no GPL license, and second, it's technically not possible to install the software on a different hardware platform.

Let us take some examples. MS-DOS was an operating system in the 1980s. But it was not possible to use MS-DOS on the Amiga 500 or on an IBM mainframe. That means it was not a universal operating system which was able to manage any kind of hardware; it was designed for the IBM PC's needs. The same is true for the modern Mac OS X system, which can be installed neither on a Raspberry Pi 3 nor on an IBM mainframe. That is the reason why this software is not a real operating system. Sure, it is a layer between application programs and the underlying hardware, but it is not universal across many different hardware platforms; it was designed as a firmware for a certain computer.

The same is true for the operating system built into the Nintendo Wii gaming console or the one which runs on most routers. In all these cases the hardware manufacturer has developed software for its own device which is not open source and which can't be installed on a different device. The router firmware from device 1 won't run on device 2. The current situation in the computer industry is that hundreds of different firmware systems are available. None of these can be called an operating system. The only true operating system, which fulfills the minimum requirement of being open source and running on many platforms, is Linux.

That means it makes no sense to compare Linux with Windows 10, or Linux with the Amiga 500 Kickstart system, because Linux is an operating system and the other type of software is a proprietary firmware of the hardware manufacturer.

Somebody may argue that there is no need for an operating system. For example, the Mac OS X firmware is well suited to handle the underlying hardware, and the same is true for a router firmware. This line of argument results in today's landscape, which consists of incompatible firmwares, missing standards and some kind of rant against the only operating system available, which is Linux.

Let us think about possible alternatives to Linux. Which kind of software is available which runs everywhere? Right, exactly this is the bottleneck. There is no Linux alternative available. What is used on most computers and hardware devices are firmware systems which are maintained by a single hardware manufacturer and which are equal to the Betamax system used on video recorders in the 1980s before VHS became the standard. From the point of view of an Apple user, Mac OS X is great. It boots the system and runs any application. But what will happen if the user has advanced needs? Is Mac OS X able to run Linux applications? Can the user plug in an external hard drive not certified by Apple? No, he can't. Mac OS X, the Amiga 500 Kickstart and all the other Betamax-like systems are developed for the needs of the hardware manufacturers, not with the customer in mind.

Somebody may argue that Windows 10 is equal to an operating system standard because 95% of all PCs are running this firmware. But let us take a look at the limits of Windows 10. Is the system able to boot server hardware? Will it run on older PCs? Can it be installed on a microcontroller? Will Windows 10 run on a PlayStation? No, it won't. Windows 10 was developed for the needs of the PC industry. It runs on the x86 platform and only on this hardware. Additionally, its source code is not available as open source and it doesn't fulfill professional needs.

Windows 10 and Mac OS X are great examples of well-designed firmware. They are optimized for a special purpose and helped to sell a lot of computer hardware to the masses. But they are not different from firmware of the past, for example SunOS or Palm OS, which means that in the long term they can be ignored. If the IBM-compatible PC is gone and if Apple is bankrupt, nobody will use this software anymore. Let us take a look at what happened to the Kickstart ROM of the Amiga 500. After Commodore went bankrupt, the project stopped. No more updates were programmed, because outside of the Commodore world nobody was interested in an Amiga 500 firmware.

Requirements for a standard operating system

Instead of analyzing existing pseudo operating systems, the better idea is to define first what purpose a standard operating system has to fulfill. The minimum requirement is that it runs on all the hardware which is available: desktop PC, notebook, gaming console, tablet PC, smartphone, router, microcontroller, single server, cluster server. To make the operating system a standard, the source code has to be available as open source, so that anybody can contribute to the development.

The GNU/Linux project is the only available operating system which fulfills these requirements, at least in part. And if certain problems exist, for example that Linux-based Android is not fast enough on smartphones, then these detail problems can be solved within the existing ecosystem. The funny thing is that Linux has no real competitor which is also trying to become a standard operating system that runs on all hardware. That means Windows, Mac OS X, HP-UX and proprietary router firmware don't see themselves as competing with Linux for the title of best operating system. Instead, this firmware was designed for different needs. If Microsoft Windows had been planned as an alternative to Linux, then the software would have been developed in a different direction. But Microsoft has decided that Windows 10 is not a standard operating system but a proprietary, hardware-dependent firmware which runs on x86 hardware only.

Robotic lawn mower in comparison

Together with window cleaning robots, the robot lawn mower is the first mass-produced robotic device which is sold in large quantities. Unlike other robotics examples such as the Unimation hospital robot (produced in the 1990s) and the Baxter grasping robot, the domain of lawn mowing is handled very well by robotic devices. Most customers are happy with the device because they get something in return for the money spent. The typical price range for a robot mower is 1000 US$ to 5000 US$. For this price, the customer gets a two-wheeled electric robot which is able to cut the grass and find its way back to the garage automatically.

The only thing that can be criticized is that today's consumer products work too well, which makes it hard to invent something new. Technology never stands still, and the first task is to create a problem which can't be solved by today's generation of robots. Suppose the customer wants the grass to be cut only by a legged robot. Then the Husqvarna product won't fulfill the need, because all its devices have two or more wheels. A more elaborate mower doesn't need wheels; it walks like a dog. That means future, not yet released devices will need legs and advanced software which controls the legs. This would bring the technology to the next level. A legged mower can handle more complex situations.

A Plan language for Sokoban

In classical computing, before a software application can be realized, an algorithm has to be identified which can solve the problem. In the case of Artificial Intelligence no dedicated AI algorithms are available, and the existing ones, for example A* graph search, do not work very well. The best-practice method for solving an AI problem is to invent a plan language as the first step. A plan notation is a human-machine interface and can be used for plan recognition and for plan execution as well.

The example game, Sokoban, is an easy one. It consists of a single robot plus a single box which can be pushed by the robot. There are no obstacles in the maze. A possible planning language has the following syntax:

1. moveto (100,100)
2. pushto (100,200)
3. pushto (200,200)
4. moveto (0,0)

The plan consists of a sequence of single steps, each of which is either moveto or pushto. In addition, the pixel coordinates are given. Such a plan is stored in a Python list; the first parameter is the action name and the second one is the point coordinate. The surprising fact is that apart from this plan notation no further algorithms are needed to solve Sokoban, because the plan notation itself is powerful enough. We can do two different things with the plan. The first is to observe a human player and write down which plan he executes. This is equal to a formalized game log recording. The second is to use the given plan as a subgoal list for replaying the actions of the robot.
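A minimal sketch of how such a plan could be held in a Python list may look like this. The tuple layout (action name first, coordinate second) follows the description above; the helper names `record_step` and `format_plan` are assumptions made only for illustration.

```python
# Sketch only: each plan step is a tuple (action_name, (x, y)).
# The helper names are hypothetical and not part of any library.

def record_step(plan, action, point):
    """Append one observed step to the plan (formalized game log recording)."""
    if action not in ("moveto", "pushto"):
        raise ValueError(f"unknown action: {action}")
    plan.append((action, point))
    return plan


def format_plan(plan):
    """Render the plan in the numbered notation used in the text."""
    return [
        f"{number}. {action} ({x},{y})"
        for number, (action, (x, y)) in enumerate(plan, start=1)
    ]


# Write down what a human player did, step by step.
plan = []
record_step(plan, "moveto", (100, 100))
record_step(plan, "pushto", (100, 200))
record_step(plan, "pushto", (200, 200))
record_step(plan, "moveto", (0, 0))

for line in format_plan(plan):
    print(line)
```

The same list serves both purposes mentioned above: it is filled while observing a player, and it can later be read back as a subgoal list.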

The plan can't be executed like a normal computer program, because the information is vague and doesn't fit exactly to the given map. But each step in the plan can be interpreted as a subgoal which guides the solver. If we know that the robot should go to (100,100), then a solver can figure out how to do this task exactly. In the case of Sokoban the plan notation is not very complicated. It consists of two commands plus a coordinate. Different domains will require a different plan notation. What we can say for sure is that any game can be solved much more easily if a plan notation is available, for example Lemmings, Mario AI, Starcraft or Pong.
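How a solver might turn a subgoal into exact movements can be sketched as follows. This is a deliberately naive example for an empty map: the step size, the x-first movement order and the function names are assumptions, and pushing the box is left out.

```python
# Sketch only: a naive solver for "moveto" subgoals on a map without obstacles.
# Step size and movement order (x axis first, then y axis) are assumptions.

def expand_moveto(position, target, step=50):
    """Return the intermediate positions on the way from position to target."""
    x, y = position
    tx, ty = target
    path = []
    while x != tx:
        # Move at most `step` pixels toward the target on the x axis.
        x += max(-step, min(step, tx - x))
        path.append((x, y))
    while y != ty:
        # Then do the same on the y axis.
        y += max(-step, min(step, ty - y))
        path.append((x, y))
    return path


def follow_subgoals(start, subgoals, step=50):
    """Visit every subgoal in order and collect the low-level path."""
    position = start
    path = []
    for target in subgoals:
        path.extend(expand_moveto(position, target, step))
        position = target
    return path


low_level = follow_subgoals((0, 0), [(100, 100), (100, 200)])
print(low_level)
```

The plan only names the two subgoals; the six intermediate positions are produced by the solver.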

The plan notation is a step in between a teleoperated robot which is controlled by a human and an autonomous AI which works without a human. It helps to make this transition possible:

1. Teleoperated robot

2. Teleoperated robot which recognizes plans from humans

3. Teleoperated robot which can execute plans partly

4. Fully autonomous robot

A plan-guided teleoperated system is located between a vanilla human-controlled system and a fully autonomous robot.

June 25, 2019

Alternative to Linux

Every operating system has to ask itself what a potential alternative is. Linux is no exception; it makes sense to ask which software is available that has a different philosophy and works better. The number of alternatives is very high. They have two principles in common: the typical Linux competitor is bundled with hardware, and its source code is closed.

Let me give some examples: Mac OS X, HP-UX, AIX, SunOS, VMS, Windows 10 and many more can be categorized as Linux alternatives. The reason why these operating systems are better is that they fulfill the needs of the hardware manufacturer better. Let us take one example from the list, HP-UX, and ask why this operating system was developed. The reason was that HP had developed a new hardware architecture, and to sell the system to customers it needed an operating system on top of the hardware. Calling this operating system “Unix-like” doesn't describe HP-UX very well, because the idea was not to emulate Unix; the idea was to educate the consumers in a certain way. The main idea behind HP-UX is that the customer is not allowed to use any other kind of software. Even though he owns the hardware, he is not allowed to replace the operating system with something else, because then he will lose the warranty.

The interesting question is: why did HP develop HP-UX? Why did they not use an off-the-shelf FreeBSD system which would run great on the hardware? This question is hard to answer. We can note that HP was not the only company which made this decision. Microsoft, Apple and IBM also decided to invent the wheel twice. They have all programmed their own operating system which fits their special hardware, and they did not use existing software. One possible explanation is that this kind of extra work is the business of these companies. Apple, for example, has a large software development department whose job is to write lots of source code. Suppose the software department consists of 5000 programmers and they decided not to write a new operating system but to use code already written by somebody else. Does this fit the needs of Apple? No, it would not be a good idea. From the perspective of Apple, HP and many other companies it makes sense to program their own operating system which runs only on their hardware. In the next step the customer is forced to use this software, and the company can educate the customer about why this program is great.

We can say that in the software industry it is a common principle to rewrite existing code. That means the companies are doing extra work even if the problem was solved by somebody else in the past. HP writes its own file system, Microsoft has written its own file system, and so have Apple and IBM. And all of these programs are sold to the customers. As a result, the price of software is high because so much energy was invested into the product. It is a kind of individual product which was tailored to a concrete piece of hardware. It's important to explain that it is not the customer who has the advantage but the company which made such programs.

From the perspective of the customer, the ideal world would consist of only one operating system under which every program runs out of the box and which is available for free. This is equal to a customer paradise. Computer companies have no motivation to build such a universal operating system because they won't earn much money in this market.

Now we can answer why Linux is special. The first feature is that Linux runs on all hardware. No matter if it's a PC, an Apple, a server or a smartphone, Linux runs everywhere. The second feature is that the software is delivered as open source. The combination of both is ideal for the customers but a nightmare for the hardware companies. Linux is something which is not wanted by HP, IBM or Apple. Linux is a kind of standard which fits the needs of the mainstream customers. And exactly this is the reason why Linux is hated by IBM, Apple and others: the interests of a company are different from the needs of the customer.

Let us analyze the market share of Linux. Technically it's possible to install Linux on an IBM mainframe. It's also possible to install the operating system on an Apple computer. In both cases the customer will profit from it because he no longer depends on the underlying hardware. But let us focus on the needs of the hardware companies. What is the feeling at IBM if the customer prefers Linux? What will Apple say if the end user installs Ubuntu on the iMac? In both cases, they are not amused. They will forbid it either by technical restrictions, by warranty conditions or by indoctrination. They are trying to convince the customer to behave in a certain way. Or, to explain it from the other point of view: if the Apple user installs Linux on Apple hardware he will feel guilty, as if he has done something wrong. The relationship to the Apple company is broken.

Most customers are afraid of such a broken relationship. They want to be the friend of the hardware company, and as a consequence they won't install the Linux operating system. They act this way because their position is weak. They are not informed very well about the technical side, and they are afraid of doing something wrong.

Some YouTube videos are available in which an end user has installed Linux on an Apple computer. But this is done only for outdated computers, and mostly the customer feels bad about it. The reason is that he has done something which is against the interests of the Apple company. Using Linux on a computer is equal to becoming the king of the hardware and ignoring the hardware company.

From plan recognition to autonomous robots

The transition from teleoperation to a fully autonomous robot consists of many steps in between. These steps are necessary to realize more complex operations. Suppose a plan recognition parser was added to a teleoperated robot. How can this capability help to realize a fully autonomous robot?

Let me give an example: an RC car which is navigating in a maze. The human operator controls the car, and the plan recognizer prints all the actions to the console on a semantic level. For example, the output is that at timecode 0 the car is in the middle of the maze, then it moves north, then the driver makes a left turn and then the car stops. At first glance, the annotated game log is not exactly the desired behavior of the AI system, because we need a robot which can drive on its own and not a chatbot which comments on what a human driver is doing. But let us slow down a bit. A recorded game log is equal to a plan. The plan describes how to bring the system from the initial state to the goal state. And this answers the question of how to realize a fully autonomous system. Such a robot control system takes a plan as input and generates the low-level commands as output.

A plan is equal to a list of keyframes. They describe what the overall trajectory will look like, and it's up to the solver to realize the plan in reality. The plan provides the reward, the subgoals and the timing. It's not very complicated to transform a given plan into low-level actions.
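One simple way to go from keyframes to low-level setpoints is linear interpolation over time. The sketch below assumes that a keyframe is a tuple (time, x, y) with strictly increasing times; this format, the function names and the example values are illustrative assumptions, not a fixed definition.

```python
# Sketch only: keyframes are assumed to be (time, x, y) tuples,
# sorted by time, with strictly increasing time stamps.

def interpolate(keyframes, t):
    """Return the (x, y) position on the planned trajectory at time t."""
    if t <= keyframes[0][0]:
        return keyframes[0][1:]
    for (t0, x0, y0), (t1, x1, y1) in zip(keyframes, keyframes[1:]):
        if t0 <= t <= t1:
            ratio = (t - t0) / (t1 - t0)
            return (x0 + ratio * (x1 - x0), y0 + ratio * (y1 - y0))
    # After the last keyframe the car stays at the final position.
    return keyframes[-1][1:]


def sample_trajectory(keyframes, dt):
    """Sample the plan every dt seconds to get low-level setpoints."""
    start = keyframes[0][0]
    end = keyframes[-1][0]
    steps = int((end - start) / dt)
    return [interpolate(keyframes, start + i * dt) for i in range(steps + 1)]


# Start in the middle, drive north, turn left (west), then stop.
keyframes = [(0.0, 0.0, 0.0), (2.0, 0.0, 10.0), (4.0, -5.0, 10.0)]
setpoints = sample_trajectory(keyframes, dt=1.0)
print(setpoints)
```

A real controller would still have to steer toward each setpoint and react to obstacles; the keyframes only fix where the car should be and when.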

Let us examine what a robot control system looks like if no plan and no plan notation are given. Usually the system is in the initial state and should reach the goal state, and the planner has to plan everything from scratch. This can be very complicated because the state space is large. In the easier case a plan is already there, and planning only means transforming the plan into smaller activities.

Sure, the plan can't be executed directly because of minor obstacles. The plan is only the general plot which has to be followed. It is a kind of subgoal list which reduces the state space. The resulting real sequence will look very similar to the original plan.

Linux against everybody

Linux was started with the aim of replacing every other operating system. But what exactly are the alternatives to Linux? The best-practice method for developing a non-Linux operating system is defined by the hardware manufacturers. They use two principles in combination: first, the operating system runs only on their own hardware and never on the competitor's hardware, and second, the operating system is published under a proprietary license. This strategy was common 40 years ago, and even today it's what the hardware companies normally do. Does it make sense? For the companies it is useful, but for the customer it's a nightmare.

Let us give some examples: in the 1960s IBM released the mainframe operating system OS/360 for its System/360 machines. This software was only able to run on IBM hardware and nowhere else. Apple has released a NeXTSTEP-based operating system which is tailored to Apple hardware. Sony has invented its own operating system for the PlayStation, the same is true for the SunOS workstations in the 1980s and the HP workstations in the 1990s, and the famous Windows operating system is likewise tied to the PC platform.

So why do all these companies have an individually tailored operating system? Because standardized software would reduce their income. If the Sony PlayStation were able to play Xbox games, if Apple software ran on IBM PCs, then the world would end. Which means the profit for the companies would be lower and the customer would become more important. This is what the hardware manufacturers are trying to avoid. They argue that the customers don't want a standardized operating system, but this is not the wish of the customer; it is the need of the manufacturer.

Let us analyze what the plan of IBM, Apple, Microsoft and all the other tech companies is. Do they want to produce innovative software which fulfills the needs of the mainstream? No, they want to make a profit. And this is only possible against the customer. They invent lots of incompatible operating systems which are restricted to their own hardware and which are sold for money. They are able to do so because the customer is in a weak position. It's the same situation as in the early video recorder market, in which VHS, Betamax and Video 2000 competed against each other. If the customer has bought a Betamax system he can't play the media on a Video 2000 recorder. So he has to buy the Video 2000 hardware too, which increases the overall sales.

In short, this is what Apple, Microsoft, IBM and all the others are doing. Linux would solve the issue; Linux is a standard which runs everywhere and is very consumer friendly. And for this reason the companies hate Linux. Let us take a look at the websites of the larger hardware companies. They ignore Linux or they explain why Linux is a bad idea. This position has no technical reason; it has to do with the company's policy. Boycotting Linux is a long-term strategy which is followed by every hardware company. It's done against the customer's needs. Let us compare the situation in a 1:1 table.

Linux: open source, runs on different hardware

non-Linux: closed source, runs on a single platform

The Microsoft Windows operating system is located somewhere in the middle. It is more open than the Mac OS X operating system, because Windows 10 can be installed on different hardware. At the same time, it is not open enough to compete with Linux. Microsoft Windows can be interpreted as a transition from the hardware-controlled operating systems which were common in the IBM and SunOS era to open systems such as Linux. From the point of view of very closed hardware companies, Microsoft Windows is too open. Apple does not encourage the user to install Windows on Apple hardware, even though it's possible from a technical point of view. The same is true for some proprietary workstations made by HP. Only the HP-UX system is allowed to boot these machines. Linux can be imagined as a kind of improved Windows operating system. It runs on more hardware platforms and the costs are lower. Additionally, the source code is available for free.

The question remains why the customers do not argue against the hardware manufacturers. The reason is that a company like Apple doesn't only sell a computer; it sells an ideology. The customer identifies with the needs of the company. He thinks that after buying the latest Mac Pro he is part of the Apple family. And then the customer defends the needs of the company in public. That means the perfect Apple customer is no longer an end user; he is interested in making Apple a success. He will argue that the proprietary Mac OS X operating system is superior to an open source system which runs on every piece of hardware. The customer has adopted the ideology of the hardware company, and he feels stronger with this ideology.

In contrast, Linux fanboys are normal customers. They argue against the hardware company. Linux users are not interested in a certain computer from Apple, HP or Dell; they are interested in running software on the system. From an abstract point of view, today's computer market is driven by customers who identify themselves with the companies. This is very similar to the hi-fi market of the 1980s, in which the customers of Pioneer were different from the customers of Kenwood. What the customer does after buying a product is to transform himself from a customer who argues against a company into a loyal part of the family. As a consequence the customer will defend decisions of the company, for example the decision that the operating system runs only on a single hardware platform and nowhere else.

What I want to explain is that Apple, IBM and HP don't have real customers, because the end user doesn't argue against the manufacturer. The customers are usually on the same side as the manufacturer, and this prevents them from installing Linux on the hardware. Linux can be understood as a customer ideology which is directed against a certain hardware manufacturer. The famous statement of Linus Torvalds against the Nvidia company is a good example. An Apple customer would never argue against the hardware in that way.

Let us define what a true operating system is. A true operating system is independent of the computer hardware, fulfills the needs of the end user, costs nothing, is available as open source and is equal to a standard. The only such operating system which is available today is Linux. All the other software on the market (Windows, HP-UX, AIX, Mac OS X) is not an operating system but an ideology which allows the customer to identify with the needs of the hardware manufacturer.

June 24, 2019

Plan based robot control

The classical approach of robot builders is to focus on the software which controls the robot. The idea is to develop a so-called AI which is able to drive the machine on its own. What happens in reality is that these AI systems are usually broken. The user presses the start button, but the robot is not able to follow the line on the ground, and the assumption is that something is wrong with the software. But the real problem is located somewhere else. The idea of creating an AI is itself the problem. In the following blog post I'd like to describe an alternative development method in which the plan notation stands in the middle.

Suppose the aim is to drive an RC car around a course under the constraint that it must not collide with other cars and must respect the normal traffic rules. The best-practice method for building such a system is to start with a normal teleoperated robot. That means a human driver is in control, and he has to accelerate and steer the car. Such a human-controlled RC car will drive with nearly 100% accuracy. That means the task is fulfilled. What is different from normal RC cars is that all the actions are recorded. A motion tracking system records the car, the other cars and the actions of the human. On top of the motion recording additional features are calculated, for example the distance to another car or the direction in which the car moves.

The next step is to build a game-log parser. That is an engine which takes the input data and generates a symbolic plan description. That means it converts low-level information into a high-level description. For example, it prints out that an obstacle car is in front of the RC car, or that the human driver stops because of the red traffic light. Plan recognition is equal to sense making. The game logs are interpreted semantically. The interesting point is that for classical robot engineers such a plan recognition system is not important, and in most cases it's not available. The recommendation is to give this software element a much greater priority. It's important to know that a plan recognizer is not able to control the car. Even if the software works great, it won't replace the human user. It will only track his actions in the current situation.
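A very small game-log parser in this spirit could look as follows. The frame fields (`speed`, `distance_to_car`, `traffic_light`), the 5.0 distance threshold and the wording of the events are assumptions chosen for the example; a real recognizer would work on motion tracking data.

```python
# Sketch only: field names, threshold and event texts are hypothetical.

def describe_frame(frame):
    """Map one low-level frame to a set of symbolic descriptions."""
    events = set()
    if frame["distance_to_car"] < 5.0:
        events.add("obstacle car in front")
    if frame["speed"] == 0.0 and frame["traffic_light"] == "red":
        events.add("stopped at red traffic light")
    return events


def parse_game_log(frames):
    """Return (time, event) pairs for every event that newly appears."""
    plan = []
    active = set()
    for frame in frames:
        current = describe_frame(frame)
        # Report an event only in the frame where it starts.
        for event in sorted(current - active):
            plan.append((frame["time"], event))
        active = current
    return plan


frames = [
    {"time": 0, "speed": 8.0, "distance_to_car": 30.0, "traffic_light": "green"},
    {"time": 1, "speed": 6.0, "distance_to_car": 4.0, "traffic_light": "green"},
    {"time": 2, "speed": 0.0, "distance_to_car": 4.0, "traffic_light": "red"},
    {"time": 3, "speed": 0.0, "distance_to_car": 4.0, "traffic_light": "red"},
]
for time, event in parse_game_log(frames):
    print(time, event)
```

The parser only comments on what the driver does; it contains no code that could steer the car.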

The funny thing is that plan recognition and model tracking are the precondition for constructing any sort of robot control system. If no plan recognizer is available, it makes no sense to talk about a control system. The idea is to invent some sort of plan notation language. Such a language is described with a grammar or ontology, similar to a domain-specific language. It consists of functions (stop, speed slower, steer left) and objects (own car, other car, crossing, traffic light). If the human operator starts a new lap, the plan notation formalizes the situation. It is a handy way of storing motion capture data on the hard disk.
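As a rough illustration, the functions and objects listed above can be turned into a tiny vocabulary with a checker. The statement syntax `<function> <object>` and the underscore spelling of the names are assumptions; the text does not fix a concrete grammar.

```python
# Sketch only: the vocabulary comes from the text, the statement syntax
# "<function> <object>" is a hypothetical choice for this example.

FUNCTIONS = {"stop", "speed_slower", "steer_left"}
OBJECTS = {"own_car", "other_car", "crossing", "traffic_light"}


def parse_statement(line):
    """Parse one statement of the form '<function> <object>'."""
    parts = line.split()
    if len(parts) != 2:
        raise ValueError(f"expected '<function> <object>', got: {line!r}")
    function, obj = parts
    if function not in FUNCTIONS:
        raise ValueError(f"unknown function: {function}")
    if obj not in OBJECTS:
        raise ValueError(f"unknown object: {obj}")
    return (function, obj)


def parse_plan(text):
    """Parse a multi-line plan, skipping empty lines."""
    return [parse_statement(line) for line in text.splitlines() if line.strip()]


plan_text = """
speed_slower own_car
stop own_car
steer_left own_car
"""
print(parse_plan(plan_text))
```

Anything outside the vocabulary is rejected, which is what makes the notation usable as a storage format for recorded drives.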

A language for describing robot plans is sometimes called an interface. That means it can't control a robot but is a communication device for human-robot interaction. Sure, there are many robot languages described in the literature. The only problem is that the number of languages is too low. That means the concept of a language is the right idea, and only detail improvements are necessary to invent the perfect robot language.

In the literature the concept is described under the keyword “learning from demonstration”, but it's important to know that such systems are not able to learn and that the system cannot repeat the action by itself. The more interesting part of LfD is the plan notation. Before learning from demonstration can be realized, a plan language has to be invented first which converts the raw mocap demonstration into symbolic plans. This kind of interface is the interesting part of the overall architecture. Such a system can be imagined as similar to a computer language parser. It's based on a context-free grammar and a formal domain description. The human user is allowed to do some tasks, and the parser is able to identify these tasks. Constructing the parser is more complicated than building a C++ parser, which is the reason why the amount of literature is low.

A mocap-to-plan parser: As input the system takes the raw data of a camera and as output it prints to the screen what the robot is doing.[1]

Trajectory encoding

A more general description of what learning from demonstration is was given in the paper [2]. On the lower level it is a trajectory encoding language, and on the higher level it is a symbolic plan encoding which is equal to the PDDL syntax.[2] The term learning shouldn't be interpreted as machine learning; instead the idea is that a plan notation language is available which acts as an interface. The aim is that during the execution of a task by a human operator, the AI system is able to recognize the actions and match them with the plan language.

There is a bit of confusion about what LfD really is. In most projects around the topic, the demonstration phase is followed by an autonomous robot which does the task alone. But the ability to replay an action is not the most important one. Sure, in the long run the aim is to program a robot, but this feature can be ignored as a minor problem. The more important aspect is that the plan notation and the plan recognizer are working. That means a normal teleoperated robot can be realized as a learning from demonstration framework.

Let us analyze the precondition for so-called robot teaching. Before the human operator can guide the robot arm to do an action, a task model is needed. That is a plan description language which converts low-level actions into semantic descriptions. The goal is not that the human operator moves the robot arm; the goal is to invent a robot planning language which can store these movements. It's correct to say that LfD is the same as “game log recording”.

Another synonym is trace annotation.[3] The actions on the screen are recorded to a log file, and the log file is extended with additional information. Nevertheless, the term “trace annotation” isn't used very frequently in the literature; the more common keyword is “learning from demonstration”.

The number of techniques for realizing a plan notation is huge. In the simple case, it is done with a path notation. That means the trajectory is stored as a list of waypoints. More sophisticated forms of notation are PDDL, grammar-based domain-specific languages and ontologies, and for low-level trajectories the dynamic movement primitive concept is often referenced. All these techniques have in common that a plan which is executed by a human is stored on the hard drive. The parser converts the pure mocap information into a semantic description. Such a parser is an add-on to an existing teleoperated robot. That means the robot is under the control of a human, and in the background the parser is tracking what the human is doing.

The importance of such a plan interface can't be overstated. It can be used for analyzing the performance of humans, but it's also the starting point for constructing, on top of the plan notation, an automatic solver which can replay the action on its own.

Trajectory annotation

A teleoperated robot will in most cases produce a trajectory. A car-like robot will drive in a 2D state space, and a robot arm will generate a trajectory as well. Recording the raw data is a normal task which can be fulfilled with well-known techniques. The more demanding problem is to annotate the given trajectory. Annotation means enriching the raw data with additional information. The GPS trajectory of a car alone makes no sense, but if the trajectory is drawn into a map, and if important waypoints are marked on the trajectory, it will provide a lot of information.

Unfortunately trajectory annotation is not defined precisely; there are lots of possibilities for doing so. In the example of a car, the trajectory results in a speed profile, which is the speed over the time axis, and different segments can be identified, for example:[4]

- car exits highway

- loop

- left turn

- traffic light

As I mentioned before, annotation software is not able to drive the car on its own. The trajectory is the result of a human operator who drives the car manually. The trajectory parser is able to explain the human behavior. It identifies a segment in the trajectory and adds the information that the car was in a traffic light situation, which results in a certain profile of speed, steering wheel and obstacle detection. A well-annotated trajectory can be stored in a database for further investigation. For example, it is possible to search for all “traffic light situations”. The process of trajectory annotation is equal to model building, which means the driving patterns are interpreted in an abstract description. This description consists of words and parameters, and is equal to a plan notation.
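The idea of cutting a speed profile into labeled segments and searching them afterwards can be sketched in a few lines. The example uses only two labels, "stop" and "drive", and an assumed speed threshold; it is not the method of [4], which works with richer driving patterns.

```python
from itertools import groupby

# Sketch only: a two-label annotation with an assumed speed threshold.

def annotate_speed_profile(speeds, threshold=0.5):
    """Split a speed profile into labeled segments.

    Returns a list of (label, first_index, last_index) tuples.
    """
    labels = ["stop" if speed < threshold else "drive" for speed in speeds]
    segments = []
    index = 0
    # groupby merges consecutive samples that carry the same label.
    for label, group in groupby(labels):
        length = len(list(group))
        segments.append((label, index, index + length - 1))
        index += length
    return segments


def find_segments(segments, label):
    """Search the annotated trajectory for all segments with a given label."""
    return [segment for segment in segments if segment[0] == label]


speeds = [12.0, 9.0, 3.0, 0.0, 0.0, 0.0, 4.0, 10.0]
segments = annotate_speed_profile(speeds)
print(segments)
print(find_segments(segments, "stop"))
```

Each segment is a word plus parameters, so the annotated trajectory already has the shape of a plan notation.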

Sources

[1] Yang, Yezhou, et al. "Robot learning manipulation action plans by 'Watching' unconstrained videos from the World Wide Web." Twenty-Ninth AAAI Conference on Artificial Intelligence. 2015.

[2] Cubek, Richard, and Wolfgang Ertel. "Learning and application of high-level concepts with conceptual spaces and pddl." PAL 2011 3rd Workshop on Planning and Learning. 2011.

[3] Mehta, Manish, et al. "Authoring behaviors for games using learning from demonstration." Proceedings of the Workshop on Case-Based Reasoning for Computer Games, 8th International Conference on Case-Based Reasoning (ICCBR 2009), L. Lamontagne and PG Calero, Eds. AAAI Press, Menlo Park, California, USA. 2009.

[4] Moosavi, Sobhan, et al. "Annotation of car trajectories based on driving patterns." arXiv preprint arXiv:1705.05219 (2017).

Why Unix is hated by everybody

Sometimes Unix and the modern Linux are described as the most powerful operating system ever, one which is especially preferred by programmers. But in reality the situation is quite the opposite. Unix was never the favorite operating system, and the only one who loved the system was the end user. Let us take a look at what the more popular operating system is. A hardware company like HP or Apple prefers its own customized operating system, such as Mac OS X, HP-UX or OpenVMS. The idea is to program a dedicated OS for the hardware which runs only on this computer. The manufacturer has total control over the OS, and he decides which programs run under the system. If the customer needs additional software he can buy the program directly from the company, which means additional profit.

A similar approach was chosen by IBM. The customer bought a mainframe, for example the System/360, and on top of the system the customer received an operating system. The customer is not allowed to delete the original OS, because then the product won't work anymore. This is the common standard for operating systems. IBM, HP, Apple and many other hardware companies are operating under this principle today.

And now comes something which is not wanted. The customer would like to become independent of the hardware company. He is not interested in the standard software but wants to install his own programs. He doesn't ask HP if he is allowed to install a new kind of operating system on the server; he does it on his own initiative. This is the basic idea behind Unix/Linux. It is a technology used by customers against the normal standard OS. The customer deletes the HP-UX software and installs a Linux system. Apart from the customer nobody encourages this plan. HP will hate the idea very much because the company will lose control over the customer.

The same principle is visible in the modern Linux ecosystem. What the Linux fanboys do is buy a normal desktop PC which comes with a customized Windows OS. The company, for example Dell, has invested lots of hours into making Windows run great on its hardware. But the customer decides to delete Windows and installs a Linux system which has no warranty. From the perspective of the hardware manufacturer, the customer has done something wrong. He will lose the warranty on the device. That means Dell didn't give the end user permission to install Linux on the system, especially not a Linux system which was compiled from scratch.

What I want to explain is twofold. First, Unix/Linux is the natural enemy of hardware companies, and second, it modifies the relationship between a company and the end user. Somebody may ask why all the hardware companies are programming their own operating system. Wouldn't it be easier to establish a common standard? The only one who prefers standards is the end user. If he has an operating system which runs on IBM, Apple, HP and Dell hardware he is in a better position. But in the classical market the wishes of the customer are not important. Instead, the need of the hardware companies is to control the market, and a standard would make this attempt more difficult. That means Apple is not interested in the end user deleting Mac OS X; Apple is interested in the customer staying within the normal OS which is controlled by Apple.

June 23, 2019

The essence of Learning from demonstration

In the literature, Learning from demonstration (LfD) is not precisely defined. Instead, a mixture of dynamic movement primitives, reinforcement learning and direct manipulation of robot grippers is presented under this term. The first step is to define what LfD means at its core. The basic idea can be summarized as “plan following”. A plan is fed into the system in a high-level language. One option (but not the only one) is to create a plan by teaching. That means, the human moves the robot gripper to a goal. But the plan can be created with a text interface as well.

Let us go a step backward and start with a system which is easier to describe. A teleoperated robot lacks any kind of AI. The system is controlled with a joystick and can't do anything autonomously. To improve the system, a plan formalization is needed. That is a plan language and a concrete plan in that language. All learning from demonstration systems are based on a plan. Possible ways of writing one down are a natural language vocabulary, waypoint trajectories, photographed keyframes or a function in the state space. The idea of a plan is to reduce the state space: it explains to the robot what to do next.

Learning from demonstration usually means to track a plan, to think about the plan language, to replay a given plan autonomously and to annotate a demonstration with plan elements. The simplest form of a plan notation is a waypoint list, for example (100,100), (100,150), (200,100). The plan is equal to an abstract description for solving a task. That means, the AI isn't able to find the solution on its own; instead the robot has a walkthrough tutorial and executes it.
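
Replaying such a waypoint list can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions: the function name follow_waypoints, the start position (0, 0) and the step size of 10 units are made up for the example, and the robot is reduced to a point which moves in straight lines.

```python
import math

def follow_waypoints(start, waypoints, step=10.0):
    """Replay a waypoint plan: drive in straight lines, at most `step` units per tick."""
    x, y = start
    visited = [(x, y)]
    for wx, wy in waypoints:
        while (x, y) != (wx, wy):
            dx, dy = wx - x, wy - y
            dist = math.hypot(dx, dy)
            if dist <= step:
                x, y = wx, wy          # close enough: snap onto the waypoint
            else:
                x += dx / dist * step  # otherwise advance one step towards it
                y += dy / dist * step
            visited.append((x, y))
    return visited

plan = [(100, 100), (100, 150), (200, 100)]  # the waypoint list from the text
visited = follow_waypoints((0, 0), plan)     # start position is an assumption
print(visited[-1])  # (200, 100)
```

The robot doesn't decide anything here. It only executes the walkthrough, which is exactly the limitation described above.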

The open question is what exactly a plan language should look like. In the given example, the plan language is a simple point list. But more demanding tasks like grasping will need a more elaborate formalization which goes in the direction of a domain specific language. Somebody may argue that a plan is fixed and isn't flexible enough. That is correct, Learning from demonstration is restricted to a concrete domain. If the situation changes, the plan becomes useless. But that is not a problem, because the idea is that a robot can do a narrow AI task like opening a bottle, which is always the same.

Plan recognition for a line following robot

Most line following robot competitions are created with autonomy as a goal. The idea is that after pressing the run button, the robot will drive on its own along the line on the ground. The more interesting way of fulfilling the challenge is to let a human operator control the robot and track whether he is able to follow the line. That means, there is a line given and the operator has to move along the line. Does this have anything to do with robotics at all? Yes, because it's a plan recognition challenge. The plan is given by the line and the system has to track whether the human operator fulfills the plan or not.

Such a system starts with the assumption that no robot control system at all is available and the teleoperation mode is the only available technology. An activity tracker / plan recognition system is put on top of a working teleoperation controller with the aim to improve the overall software. The idea is that the transition from a teleoperated system to a fully autonomous one consists of many steps in between, and the quest is to explore them slowly.

From a formal perspective the line on the ground is equal to a plan. It explains to the robot and to the human what the goal is. The plan is equal to a 2d trajectory. It's a spline which goes through different waypoints. The robot can move on the line or outside the line.
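
The check whether the teleoperated robot is on the line or outside of it could look roughly like the following Python sketch. It rests on some assumptions of mine: the spline is approximated by straight segments between the waypoints, the tolerance of 5 units is arbitrary, and the names distance_to_segment and on_line are invented for the example.

```python
import math

def distance_to_segment(p, a, b):
    """Shortest distance from point p to the straight segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:                       # degenerate segment: a == b
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp the projection to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def on_line(position, line, tolerance=5.0):
    """True if the position is within `tolerance` of any segment of the line."""
    return min(distance_to_segment(position, a, b)
               for a, b in zip(line, line[1:])) <= tolerance

line = [(0, 0), (100, 0), (100, 100)]              # the plan: a line on the ground
track = [(10, 1), (50, 3), (90, 20), (100, 60)]    # positions driven by the operator
report = [on_line(p, line) for p in track]
print(report)  # [True, True, False, True]
```

The third position cuts the corner and is reported as a deviation from the plan. Nothing in the sketch steers the robot, it only tracks what the human is doing.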

June 21, 2019

Some reasons why C++ is great

In contrast to a common misconception, C++ is a beginner friendly language. The only thing the user should avoid is trying to understand C++ in all its details, because the amount of information is large and new features are added quite often. The better idea is to solve a certain problem and ask how this can be realized in C++. Let me give an example. The user would like to program a computer game which works with OpenGL. The game runs two threads in parallel, has 3d graphics, uses object oriented programming and should use the CPU efficiently. Now the user has to search for C++ standards which support these requirements and he will find them.

This problem seems to be trivial, but many other programming languages don't support the user in this way. For example, if the user would like to realize the project in Python he will have the problem that the support for multi-threading is not that great. Threads exist, but in the standard CPython interpreter the global interpreter lock prevents them from executing Python code on several CPU cores at the same time. And if he tries to program the game in plain C he will have trouble finding support for object oriented features like classes and inheritance.

It's rare or even impossible that somebody wants to program a certain application and gets no support from the C++ community. All the operating system libraries have at least a C++ interface and the number of C++ language standards is huge. No matter if the user tries to read files, use a database, play a sound file or spawn a process on a CPU cluster, he can do so in C++ easily. The disadvantage of a general purpose programming language is that learning all of C++ doesn't make much sense, because the number of libraries is too large and there are more than 100 kinds of applications somebody can write in C++.

Plan recognition for a kitchen robot

The desired behavior of a kitchen robot is that the machine can do something useful on its own. The human only has to press the start button and the robot will cook something. Unfortunately, such an AI system isn't available yet. The overall architecture is complicated and many scientists have failed in building such robots. What can be realized is a weaker form of a kitchen robot which is teleoperated. Teleoperation means that the human operator has to cook the meal, and he does so with a dataglove controlled robot.

From an economic standpoint teleoperation is not very productive. The overall workflow will take longer than without the robot in the loop. But it will help to make some AI related topics visible, especially the task of plan recognition. What does that mean? If the human operator cooks the meal he will perform some actions, for example “grasp the bottle”, “cut the apple with a knife” and so on. Formalizing these actions in a plan description language is a first but important step towards robot autonomy. The sad news is that even a perfectly working plan recognition system isn't able to repeat the task on its own. The human operator has to control the robot with the teleoperation interface. The extra service is that at the same time the executed plan is made visible on the screen.

The proposed plan description language consists of a plan library which holds the actions: grasp, open, close, ungrasp, cut and so forth. And it consists of subplans, for example “open bottle” means to approach the object, put the finger on the top side and move the closure to the left. That means, a plan consists of hierarchical actions which are taken from a library.
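
A possible way to store such a library is sketched below in Python. The identifiers approach, touch_top and turn_left are my own short names for the three steps of “open bottle”, and the subplan “serve drink” is a hypothetical example which is only there to show a second level of the hierarchy.

```python
# Primitive actions of the plan library (the last three names are assumptions).
PRIMITIVES = {"grasp", "open", "close", "ungrasp", "cut",
              "approach", "touch_top", "turn_left"}

# Subplans refer to primitives or to other subplans.
SUBPLANS = {
    "open bottle": ["approach", "touch_top", "turn_left"],
    "serve drink": ["open bottle", "grasp", "ungrasp"],  # hypothetical example
}

def expand(action):
    """Resolve an action into the flat list of primitives it stands for."""
    if action in PRIMITIVES:
        return [action]
    if action in SUBPLANS:
        steps = []
        for sub in SUBPLANS[action]:
            steps.extend(expand(sub))
        return steps
    raise ValueError(f"unknown action: {action}")

print(expand("serve drink"))
# ['approach', 'touch_top', 'turn_left', 'grasp', 'ungrasp']
```

The expansion goes from the abstract plan down to the primitives. Plan recognition has to go the other way, from observed primitives up to the names in the library.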

I'd like to describe such a system in action. The first thing the human operator does is to put his hand into the dataglove. This gives him control over the robot hand. If the human operator opens the fingers, the robot will do the same. All the dataglove does is to transmit the actions to the robot arm as fast as possible. This allows the operator to manipulate the scene. The second element of the system is a plan recognition system in the background. It checks what the human is doing and matches the actions with the plan description language. The generated plan is similar to the real actions of the robot. The human operator is doing something, and at the same time the textual description is shown on the monitor.
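
The matching step in the background can be sketched as a very simple parser. The Python code below assumes that the dataglove stream has already been segmented into primitive action names, which is the hard part in reality and is not shown. The subplan “pick up” and the primitive names are hypothetical, and the greedy longest-match strategy is just one possible choice.

```python
# Known subplans, written as sequences of primitive actions (names are assumptions).
SUBPLANS = {
    "open bottle": ["approach", "touch_top", "turn_left"],
    "pick up": ["approach", "grasp"],
}

def recognize(log):
    """Label a log of primitive actions with subplan names, longest match first."""
    candidates = sorted(SUBPLANS.items(), key=lambda kv: len(kv[1]), reverse=True)
    labels = []
    i = 0
    while i < len(log):
        for name, steps in candidates:
            if log[i:i + len(steps)] == steps:
                labels.append(name)      # a whole subplan was observed
                i += len(steps)
                break
        else:
            labels.append(log[i])        # no subplan fits: keep the raw action
            i += 1
    return labels

log = ["approach", "touch_top", "turn_left", "approach", "grasp", "cut"]
print(recognize(log))  # ['open bottle', 'pick up', 'cut']
```

The output is the textual description which would be shown on the monitor while the operator is working.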

The interesting point is that more AI related technology is not needed for the moment. The operator can do every task and, with a bit of luck, the system will recognize the subactions with a parser. The combination of a remote controlled robot arm plus a plan recognizer is a useful introduction to the subject of robotics. It will not result in a fully autonomous household robot, but it can bring Artificial Intelligence forward. I believe it's important to identify such low hanging fruits. They are located between a teleoperated system and a fully autonomous robot. Somewhere in between is the demand for AI related research. The goal is to realize the steps in between in software. The fallback mode is always the teleoperated robot. Teleoperation is something which always works. Even if no Artificial Intelligence at all is available, it's possible to control a robot with a joystick or a dataglove. It's similar to playing a computer game the normal way, which means that the human presses buttons and moves the mouse.

Artificial Intelligence is everything which goes beyond this minimum requirement. It can be a plan recognition system, a learning from demonstration framework or, in the maximum degree, an autonomous robot which can handle the task on its own.

The steps on this path are unknown, and so is the technology to realize them. Which means, it's unexplored land and the probability of failure is high. In most cases, an autonomous robot won't work. After starting the system with “run” nothing will happen, because something is wrong with the AI. This is a hint that a major step from teleoperation to full autonomy is missing. Engineers have to answer the question which step in between is needed. This missing step explains the reason for the failure. I would give a rough outline of how the transition can be described in detail:

1. Teleoperation

1a: plan recognition

1b: hierarchical plan recognition with subactions

1c: learning from demonstration

1d: sketch based goal formulation

1e: plan creation and monitoring

2. Fully autonomous robot

The steps in between are not complete. It's only a general description of what the missing steps are. Most failed robot projects can be located on the scale between step 1, teleoperation, and step 2, full autonomy. The overall task is very similar to building a bridge. The left side (teleoperation) is well known. The technology for doing so is available out of the box. A dataglove, a microcontroller and a robot hand are sold in most electronics stores. The other side of the bridge (the autonomous robot) is not available. It is only a vision, known from movies. The question is what are the steps in between? How to connect the bridge?

The reason why teleoperation is equal to the baseline is that it's reproducible. If somebody has made a YouTube video in which a teleoperated robot manipulator is shown, it's obvious how to build such a system from scratch. It's mostly a hardware problem of connecting the joystick to the robot gripper, and then the signals are transmitted over the wire. There is no magic, it's normal engineering.

In contrast, if somebody has shown a self-working robot which doesn't need teleoperation, it's a mystery, because this technology hasn't been invented yet. The engineer has invented something new. Such a system may work or it may not. The details have to be figured out and perhaps the robot can't be reproduced.

The reason why there is a difference between teleoperation and fully autonomous robots is that in the first case, the data doesn't contain semantic information. Teleoperation usually works by transmitting raw signals from the input interface to the robot arm. That means, the joystick is pressed upward, and the robot arm does the same action. The problem is that “upward” has no meaning. What is missing is a domain model in which a certain low level action makes sense. A normal teleoperated robot doesn't have such a model, and this is the reason why each action has to be controlled by a human in the loop. The human has the overall plan and he knows what the current task is. The challenge is to make parts or even all of the hidden knowledge of the human visible to the computer. This would allow improving teleoperation into something better.

JSON Format

planlanguage = {
"lowlevel action": ["left", "right", "up", "down"],
"highlevel action": ["open gripper", "close gripper", "walkto"]
}

A convenient way of storing the plan language grammar is a JSON dictionary. Such a data structure allows creating a hierarchical string list. In the example only two layers are available and a small number of skills. In contrast to the PDDL format and in contrast to a BNF grammar, the JSON dictionary can be parsed easily in most programming languages.

The purpose of the JSON dictionary is to restrict the allowed dataset which stores the game log. All the actions in the game log belong to the JSON dictionary. The actions of the human are stored in a predefined format in the log file. Somebody may argue that a game log file and a dictionary to parse the file are useless by themselves, and he is right, because it's not possible to control a robot with such a log file. The purpose is to define a standard for how a plan will look. The overall system is a teleoperated system which is enhanced by a plan specification language and a log file.
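
How the dictionary restricts the log can be shown with Python's json module. The sketch stores each layer as a list of strings, which is an assumption about the layout, because a JSON object keeps only one value per key. The log entries and the helper names load_allowed_actions and invalid_entries are invented for the example.

```python
import json

# Plan language with one list of action names per layer (layout is an assumption).
GRAMMAR_JSON = """
{
  "lowlevel action": ["left", "right", "up", "down"],
  "highlevel action": ["open gripper", "close gripper", "walkto"]
}
"""

def load_allowed_actions(text):
    """Parse the grammar and collect the action names of all layers in one set."""
    grammar = json.loads(text)
    return {action for layer in grammar.values() for action in layer}

def invalid_entries(log, allowed):
    """Return the log entries which are not part of the plan language."""
    return [entry for entry in log if entry not in allowed]

allowed = load_allowed_actions(GRAMMAR_JSON)
game_log = ["left", "left", "open gripper", "jump", "walkto"]  # made-up log
bad = invalid_entries(game_log, allowed)
print(bad)  # ['jump']
```

The entry “jump” is rejected because it isn't part of the plan language. The check doesn't control anything, it only enforces the standard for the log file.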

Plan recognition as the transition from teleoperation to autonomous robots

The main problem in robotics is to extend a given teleoperated system into an autonomous system, which is equal to lowering the workload of the human operator. The baseline for highly complex tasks with a robot is teleoperation, which means that the human uses a joystick to control the gripper. Such a technique works great but is not very advanced. The more elaborate way of robot control would be software which works alone. One way of realizing such a system is with the help of plan recognition.

Plan recognition means to formalize actions on a higher level. The simplest form of a plan is a path through a maze. If a plan is already there it is much easier to solve the task. In most cases, the robot only has to follow the plan. In most situations the problem is not to generate a plan with a planner, but to have a formal plan at all. Most domains like dexterous manipulation have the problem that a formal description in a plan language is difficult. That means, it's not possible to give the system a predefined plan because the notation is unknown.

Let us introduce some plan notations. For a maze game, the plan is equal to a 2d trajectory. Which means it consists of points which are connected with edges. The plan is equal to a spline. In case of a pick&place scenario the plan is built with a plan library. Each node in the planning graph is equal to a skill, for example “open gripper”.

If a robot is controlled with a teleoperation system, the human user is executing a plan. He has a rough idea how to solve a task and the robot is used for executing the plan. The main challenge is to convert between plans and actions in reality. A given plan can be translated into control actions, and executed actions can be recognized as a plan.

Planning language

In the trivial case of a maze solving robot, the plan is equal to a 2d trajectory. It can be stored in a table in the form:

p1, p2, p3, p4, p5

The plan is that the robot moves across the waypoints. He drives from p1 to p2, then to p3 and so on. The more formal description includes the moveto action in the plan:

start, moveto p1, moveto p2, moveto p3, moveto p4, moveto p5

In more demanding domains like a manipulation task the plan is more complicated. Instead of following points the robot is doing something at each point in time. This requires a more elaborate plan description language. An example would be:

start, opengripper, moveto p1, graspobject o1, closegripper, moveto p2, end

Similar to a maze solving robot, the plan describes the single steps. But this time, the steps are more complicated than only a trajectory in the 2d space. Instead the actions are given by natural language instructions. The term “opengripper” describes a certain motion primitive which is executed by the robot's gripper. During execution the robot doesn't move to a different place, but he stands still and is doing something with the hand.

I'd like to describe the minimum requirement for a plan recognition system. Such a system is not working autonomously, it can only monitor the execution of a human's plan. The teleoperated robot hand is trying to solve a pick&place task. The plan is known in advance. And the AI is monitoring whether the human is following his own plan. According to the plan, the first step is to open the gripper. Does the human operator do so in reality? If not, then the AI can print out a warning message.
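
A minimal monitor of this kind could look like the following Python sketch. It assumes that the observed actions already arrive as strings in the same notation as the plan. The observed sequence is made up, and comparing the two lists step by step is the simplest possible matching, not a robust one.

```python
def monitor(plan, observed):
    """Compare the observed actions with the plan and collect warning messages."""
    warnings = []
    for step, (expected, actual) in enumerate(zip(plan, observed)):
        if expected != actual:
            warnings.append(
                f"step {step}: expected '{expected}', observed '{actual}'")
    if len(observed) < len(plan):
        missing = len(plan) - len(observed)
        warnings.append(f"plan incomplete: {missing} step(s) missing")
    return warnings

# The pick&place plan from the text.
plan = ["start", "opengripper", "moveto p1", "graspobject o1",
        "closegripper", "moveto p2", "end"]

# Made-up observation: the operator forgot to open the gripper.
observed = ["start", "moveto p1", "moveto p1", "graspobject o1",
            "closegripper", "moveto p2", "end"]

warnings = monitor(plan, observed)
for message in warnings:
    print(message)  # step 1: expected 'opengripper', observed 'moveto p1'
```

The monitor never moves the gripper. The human stays in control and only gets a warning if he deviates from the plan.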

The difference between a Domain specific language and a library

At first glance it seems that both are similar. If somebody would like to realize a game he needs some kind of game library for painting lines on the screen. And if he likes he can call the library a Domain specific language (DSL) which contains the same functions. But there is a large difference which can be explained with the programming game “Karel the robot”. Karel is a robot in a maze which has to be programmed, and Karel works with a domain specific language. The interesting point is that the script which controls Karel runs inside the main program. Which means that first the environment is started and then the karel1 script gets started, which moves the robot 4 steps upwards and 2 steps sidewards. The Karel script is executed by an abstract machine.

As a consequence a DSL is something which runs in a box separate from the main program. The question is whether Karel the robot can be realized as a library. The problem would be that the Karel program can't be stopped independently from the game simulator, because both are connected together. In most implementations, Karel runs as a thread independent from the main program. The working thesis is that a DSL is equal to a subthread which runs in a program.
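
The working thesis can be tried out with a toy interpreter. The Python sketch below is not the real Karel language. The two-word commands like “up 4”, the class name KarelInterpreter and the stop() method are assumptions made for the illustration. The script is executed by a small abstract machine inside its own thread, and the main program can stop it without stopping itself.

```python
import threading

class KarelInterpreter(threading.Thread):
    """Runs a toy Karel-like script in a thread separate from the main program."""

    MOVES = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}

    def __init__(self, script, origin=(0, 0)):
        super().__init__()
        self.script = script
        self.position = origin
        self._stop_requested = threading.Event()

    def stop(self):
        """Ask the interpreter to halt before the next command."""
        self._stop_requested.set()

    def run(self):
        for line in self.script.splitlines():
            if self._stop_requested.is_set():
                break
            parts = line.split()
            if not parts:
                continue                       # skip empty lines
            direction, count = parts[0], int(parts[1])
            dx, dy = self.MOVES[direction]
            x, y = self.position
            self.position = (x + dx * count, y + dy * count)

script = "up 4\nright 2"       # 4 steps upwards and 2 steps sidewards
karel = KarelInterpreter(script)
karel.start()                  # the main program keeps running meanwhile
karel.join()
print(karel.position)  # (2, 4)
```

With a plain library call the movement commands would be executed directly by the main program, and there would be no separate box which can be started and stopped.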

Dlang vs. C++

The D language project is the attempt to replace C++ source code. Similar to C++, it's a compiled language and it comes with a large number of libraries out of the box. The execution speed is a bit slower, but the syntax is easier to read. So what is the deal, will D replace the C++ standard?

No, it won't, for two reasons. The first one is that dlang is not compatible with C++, which means that the Dlang compiler can't read C++ source code. As a consequence, D stands in competition to C++, which is a hard job because the C++ universe is large, powerful and has a large user base. The second problem is that C++ is more advanced than it looks at first impression. It's correct that outdated C++ code from the year 1993 looks obsolete to today's eye, but new standards like C++11 and especially the upcoming C++20 standard will introduce lots of improvements. For example, the C++20 standard introduces modules, which makes importing code easier. Very similar to dlang, an import statement takes the place of the old #include. The single “import std;” for the whole standard library is not part of C++20 itself and was added in C++23.

The chance that a programming language is able to replace C++ in the near future is low. The reasons are that the C++ compilers produce some of the fastest code available, that the community is working on improvements and that a large amount of production code is written right now in C++. Smaller attempts to replace C++ like Java, C#, Dlang or Go will fail. In some minor aspects they work better than the C++ standard, but in general they are not a replacement for the queen of all programming languages.

To understand why C++ is the queen of programming languages it makes sense to take a look into the compiler infrastructure. There are at least three major compilers available: GCC, Clang and Microsoft Visual C++ (part of Visual Studio). Additionally, Intel also has a C++ compiler. These compilers are not small projects but they are the backbone of professional computing. It's hard or even impossible to compete with these projects. Calling C++ the standard in modern programming is an understatement. It's the most influential language and the most powerful project available. All the other programming languages from the TIOBE index like PHP, Python, Java, Perl, Javascript and Swift are only an addition to C++. These are niche languages, but C++ is the main development platform for the important code.

Let us understand why C++ is criticized. A common argument against C++ is that the language has too many features and the syntax is changing over time, which makes it hard to work with existing code and increases the complexity. To address this issue we have to go a step back and describe what modern programming is exactly. The idea is that standard libraries are available, and that the language provides object-oriented features to build new software on top of these libraries. This can be realized in the C++ universe very well and results in a highly productive workflow. What the average C++ programmer does is search for a game library, write some C++ classes from scratch, test the code and then compile the code into a binary file which can be deployed. So, what is the problem? Right, there is no problem, it's the best practice method for creating software and C++ is doing a great job. Problems which do exist can be solved within the C++ ecosystem. They are minor problems in a healthy environment which can be fixed in the next C++ standard or with individual workarounds. That means, if somebody thinks that the pointer system in C++ doesn't make much sense and takes this as a reason to switch over to Java, he has taken a small issue for arguing against a powerful language. Sure, the pointer situation in Java is better than in C++, but Java is not better than C++.