June 27, 2018

Why dogs are aggressive


Healthy communication between humans and dogs has to do with reading the mind of the other through language. It is two-way communication: the human talks to his dog, and the dog talks back. For example, the human might read a book aloud in front of the dog to fill its mind with the right ideas, but the opposite direction is just as important. Every dog sends out signals, and it is up to the human to interpret them.
The most important question is perhaps: why are dogs so aggressive? The answer is simple: they are defending limited resources. A dog usually lives together with other animals in the wild and needs certain goods: a place to sleep, something to eat, a position in the hierarchy, and so on. All of these resources are finite. If dog 1 is eating the fish, dog 2 can't eat the same fish. The result is a conflict over who owns the meal, and conflicts are settled with aggression. That is the reason why dogs have developed a large repertoire of aggressive behaviors and signals to warn, inform, and fight with other dogs.
To provoke aggressive behavior in dogs, the easiest approach is to limit the available resources: take a fish away from the dog, reduce the amount of space it has, disturb its play. From an abstract point of view, this amounts to making the limits of the resources obvious. As a consequence, the dog makes a decision; in most cases it decides to fight for its resources. A wild dog will attack immediately, while a dog trained by humans can choose a non-aggressive behavior instead, for example begging. That means it asks politely whether it can have the fish back, its space back, and so on.
Begging for food is a very common behavior. It is a learned behavior which transforms a formerly aggressive fight over scarce resources into a socially accepted behavior with the same aim: to gain control of the limited food. Perhaps dogs are so fascinating to humans because humans use the same technique. Like dogs, they face the fundamental problem that food, space, and social rank are limited resources, so they have invented strategies to gain control of them.
To make the point clear: the precondition for any dog behavior is resources. If there is no conflict over a limited amount of food, space, or rank, the dog will do nothing. It is bored and ignores the situation. In other words, if nobody has stolen its fish, there is no problem. Perhaps the dog reflects on experiences from last week to improve its behavior in the future, but in most cases it has forgotten the episode. That is the difference between dogs and humans: humans can take notes on their laptop, which is beyond the scope of animals.
What I want to explain is that dogs themselves are not aggressive. Aggression is not hard-wired into their brain; their behavior always arises from the games they play. The situation lies outside of the dog, for example when dog 1 snatches the food of dog 2. The game consists of two players, a limited resource, and a social hierarchy. It is up to the individual to play the game, and this results in a certain behavior.
So-called alpha dogs have developed strategies for acquiring the most resources. Either they are super-aggressive or they are super-cute. In both cases they get most of the high-value resources: free space, high-quality food, social rank, fresh water, and so on. What dogs communicate to each other is how to develop such strategies, and once they have acquired a behavior, they will use it in practice.

Painting is easier than expected


How does art work? It is mostly a mystery. The average non-artist imagines that making art depends on being a certain kind of person: someone who discovered his talent for drawing early and became better and better over the years. So it is a long and demanding journey, right? No, it is not. Painting is very easy and can be mastered by everybody.
In a previous blog post I explained how to use GIMP for tracing the contours of an image. The idea itself pointed in the right direction, but the technique needs a bit of improvement. In that post I suggested making the original visible on the left screen while opening the GIMP drawing software on the right side, in order to trace the lines by eye. But GIMP has an extremely useful feature called layers. Here is the tutorial for the absolute beginner.
Step 1 is to ask Google image search for a painting which already exists. It can show a certain motif, and it can be realistic. Step 2 is to open the JPEG file in GIMP; by default it is opened as layer0. Step 3 is to create a second layer which we call “trace”. Via the window menu it is possible to switch between the layers and make each one visible or invisible. And now comes the trick: when creating the new layer, the dialog asks whether we want a transparent background. Of course we do. In step 4 we have both layers stacked on top of each other and draw the contours of the original image with the pencil. That is surprisingly simple, even without advanced input devices; a normal mouse or trackpad is enough. In the last step we fill the trace with colors similar to the original and export our layer as a new JPEG file.
Let us reflect on the technique a bit. According to this description, painting amounts to drawing the contour lines of an image or photo which already exists. And the GIMP layer tool is the perfect choice, because it allows even beginners to create their own masterpiece in under five minutes of work. No, this is not a joke. The image at the top of this posting was created in under five minutes and without much effort. Is it art? I don't know, probably yes.
Somebody may object that tracing the contours of an image is not painting, it is simply making a bad copy. So what is the difference between tracing the lines and taking a photo? There is a huge difference, because our copy looks completely different from the original. If we do not tell the public what the template was, they will never recognize it. And there is another trick available. Suppose we do not want to trace a real photo or an existing image, but want to create everything from scratch. There is a way to do that, too. All we need is a small artist's mannequin. The mannequin gets clothes, a photo is taken, and then we trace this photo. And voilà: every part of our image was created truly from scratch without copying anything. But to be honest, even this painting technique is a kind of copying, because the workflow always consists of opening the original in layer0 and making the trace on layer1.
Impress the non-artist
If we look into art schools and books about painting, we will never find a similar explanation of the workflow. That means the artists who paint realistic images do not reveal that they simply trace images with the layer feature of GIMP. Question 1 is: do they work this way? And question 2 is: if yes, why don't they say so? Answering these questions is easy. If everybody can create art, nobody needs artists anymore. It may also be the case that in former times it was not widely recognized that painting always means copying something. What most art schools of the past taught is a certain type of copying. For example, the students sit in a room and all paint a flower which stands on a table. But painting has nothing to do with this situation, because the example contains many elements which are not important for the creation of the image itself. First, it is not important to paint in a group; somebody who is alone in the room with the flower will get the same result. Second, an art school is not a precondition either. And finally, a real flower is also superfluous. What I have tried to describe in this blog post is a kind of minimal artist setup: a workflow which results in art but uses a minimal set of resources. The workflow consists of:
- in GIMP, layer0 holds the original image, which can be a painting or a photograph of a mannequin
- in layer1, the artist draws the contour lines and fills in the colors with the aim of copying the original. He can add some noise to make the difference more obvious.

C++, the language of the future


For 1-2 years now, articles have been appearing which introduce C++ as the best programming language ever, superior to Java, C#, and Python. Are these articles right? To answer the question we must look back to the year 1995. The advantage is that this history is well understood and plenty of material is available. What was the programming-language situation back then? Programming in C++ was possible, but there were many pitfalls. First, the Borland C++ compiler cost a lot of money and needed huge resources. If somebody just wanted to write a simple Hello World program, such a compiler was not the best choice; the better idea was to use a BASIC interpreter on MS-DOS or the Commodore 64.
But it was clear, at least in 1995, that C++ was an advanced high-end programming language, because it contained a lot of features: compiling source code results in fast applications, object-oriented programming is highly productive, and templates allow the same algorithm to be written with less code. Suppose somebody was familiar with C++, had a lot of money, and also had a fast developer workstation; then C++ was the way to go.
Since then, a lot has changed. State-of-the-art C++ compilers like GCC are available for free, a cheap consumer PC can serve as a workstation, and library features like std::vector make it possible to use C++ almost like Python. C++ works the same as in 1995, but without the pitfalls. And that is the reason why some people argue that C++ is the language of the future: basically it is the same language as in 1995, but today the costs are much lower. Perhaps the most interesting aspect is that C++ scales to many demands. First, any kind of application can be written with it: web applications, command-line programs, GUI applications, games, compilers, and operating systems. Second, a C++ compiler can be used in many ways. The beginner can program in C++ as if it were a Python interpreter, writing a hello-world function which contains nothing more interesting than a loop and an if-statement, while the expert programmer can define his own class library and use template metaprogramming. It is even possible to extend C++ into a stack-based programming language, called UNconventional Threaded Interpretative Language (UNTIL), invented by Norman E. Smith.
A look back at 1995 is also useful for judging the importance of GUI libraries. In 1995 there were two major libraries: the Object Window Library (OWL) from Borland, and the Microsoft Foundation Classes (MFC) from Microsoft. Such a library, together with an object-oriented language, is a powerful tool for creating complex applications in a short amount of time. And this perhaps answers the question of which programming language is right for today. In 2018 there are also two important open questions: which language is right, and which GUI library is right?
Let us look at a bottleneck of today's C++ development. Under MS-Windows, C++ works great, or to put it more precisely: C++ GUI libraries exist there. But a user who wants to program a GUI application on Linux will run into trouble. The gtkmm library is available and can be installed for free, but it is poorly documented and there are no introductory tutorials. That means a newbie is currently not able to program a C++ GUI application under Linux, while the situation on Windows is much better. I would guess that the lack of a well-documented C++ GUI library on an open-source operating system is the major bottleneck in today's C++ development. So if somebody argues that he uses C# under MS-Windows instead of C++ because it has the better library, he is probably right.

June 24, 2018

How to use a scientific bookstore?


An academic bookshop specializes in educational material which is used in universities. The staff can search databases from Springer and Elsevier for an article, and it is possible to buy the standard books of a subject. They also offer conference proceedings from the last year. The customer usually gets 150 pages for 40 US$, which is very cheap; a rare proceedings volume may cost 100 US$ for only 50 pages.
In recent times, some customers are not aware of what an academic bookstore is. They do not want to buy books; instead they ask how they can create a profile with their name, photo, and a list of publications. The problem is that the academic bookstore was never invented to support such demands. It is only possible to buy printed books and e-books. The customer does not get a personalized profile page on the internet, because a bookstore is a place to buy information, not to drop off personalized marketing material.

June 21, 2018

First look at Glade


Programming a GUI application on Linux is hard. There are lots of competing options (GTK+, Qt, MonoDevelop, Python toolkits) and it is unclear which of them is here to stay. My hypothesis is that the combination of C++, gtkmm, and the Glade GUI designer is the best-practice method for developing your own apps easily. But I am not sure how this works in detail. Today I only want to take an introductory look at the Glade GUI designer and evaluate what is possible with the tool.
The installation is easy: a simple “dnf install glade” is enough to fetch the 3 MB package onto the local Fedora machine. After starting the program, the user sees a screen like this one:

In the main window, the user can drag & drop his GUI application together, which works like a painting program. And he can execute the preview function to see how the GUI will look.


But there is one part which I haven't figured out yet: how to use the GUI in a C++ program. As far as I know, the file is stored in the working directory with the “.glade” extension. It is an XML-formatted file which can be used from Python, C, or C++ apps.
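For orientation, a minimal hand-written sketch of such a .glade file might look like the following (the widget ids, the property values, and the handler name are invented for this example; Glade generates the real file automatically):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<interface>
  <object class="GtkWindow" id="mainwindow">
    <property name="title">Hello Glade</property>
    <child>
      <object class="GtkButton" id="pressbutton">
        <property name="label">Press me</property>
        <!-- the signal is wired to a handler function in the host program -->
        <signal name="clicked" handler="on_pressbutton_clicked"/>
      </object>
    </child>
  </object>
</interface>
```

The host program (in C, C++, or Python) loads this description at runtime and connects the named signal handlers to its own functions, which is why the same file can serve several languages.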

But the details are unclear right now. Nevertheless, the Glade tool itself seems very powerful. The example GUI window above was created in under five minutes with simple point & click mouse movements. The user first selects the window type, specifies a grid, and then moves the elements (a menu bar, the text field, and the button) into the window. Then he can assign names for activation events, for example “pressbutton”. These events have to be handled later in the C++ program.

Learning C++ fast


C++ itself is an awful language. Concepts like object-oriented programming, pointers, templates, and low-level assembly are not the right choice for a future-ready programming language. The best way to deal with C++ is to ignore it and get help from other programming languages which fit the beginner better. The interesting aspect is that everything the newbie learns outside of the C++ community helps him write better C++ later. For example, suppose somebody starts programming with Python and becomes familiar with the object-oriented paradigm. Then he is able to create class diagrams in a programming language, and perhaps he sees the advantage of inheritance. If he switches from Python to C++ he can apply this knowledge in the same way, because the OOP features of C++ are very similar to Python's.
Or if somebody is familiar with the Forth programming language and is an expert at writing factored code, which means writing short functions of about three lines and using stack-based method calls, this knowledge is also useful for writing more efficient C++ code. If somebody comes from the C# community and loves the absence of pointers, he is also welcome in the C++ language, because the new C++17 standard tries to convince the programmer to stop using pointers.
What I want to explain is that it is possible to learn something about programming outside of C++ and use this knowledge later to write better C++ code. Let us take an example. We take the best-practice methods from C#, Forth, and Python and mix them together, then write C++ code in this style. Such a project is very readable, compiles to fast machine code, and can be seen as a wonderful example of C++ as it should be. I think C++ is a kind of melting pot for all the other communities.
The best example is perhaps the transition from Python to C++. If somebody has written Python code in the past and uses this knowledge for writing C++, then his C++ programs will look like Python. That means they use object-oriented features and have no pointers, because that is the Python writing style. Typical Python source code is very elegant: it has no complicated language features and is not written for performance; rather, Python is a kind of pseudo-code with pedagogical intent. The interesting aspect is that a C++ compiler allows programming in the same style. A C++ program can be formatted so that it looks like Python.
But why should a Python programmer use a C++ compiler if he can run his code on the Python 2 interpreter? That is almost a joke question, because from a computational point of view the Python 2 interpreter is not very efficient: it can't generate machine code and it doesn't support threading very well. And to be honest, Python 2/3 will never be as fast as a modern C++ compiler like GCC; Python simply was not designed as a compiled language. Most languages outside of C++ have similar bottlenecks. For example, C# is also a wonderful language which in some aspects is superior to C++. But C# has the problem that it is rarely used in the Linux environment and doesn't compile very well for Android devices.
But why should all programmers move to C++, why not to another general-purpose language like Java? The answer has to do with potential alternatives to a language. C++ is the only language which can't be replaced by something else; that means it is not possible that C++ will be dead in 30 years. Sure, it is possible to invent new programming languages, for example PHP, Go, or D. But none of them can replace C++.

June 19, 2018

Reasons against Forth


Finding reasons not to use Forth is not easy. First, we must admit that from a technical point of view Forth is superior: it allows writing small operating systems and efficient BIOS routines. But there are some reasons to use not Forth, but a programming language called C together with the MINIX operating system. An example homebrew project is the MAGIC-1 computer: http://www.homebrewcpu.com/overview.htm
Why is Forth not used on that device? From a technical point of view it would be possible to implement a Forth interpreter on that machine; I would guess that Forth would even be more efficient than a MINIX port. On the other hand, Forth is in practice a closed-source ecosystem, while Minix is open. What does that mean? Suppose we boot Minix on the system. Then it is possible to compile any C program which is available under the GNU license for that computer. That means we have access to a huge amount of software written in the past. In the case of Forth we have no access to such software, because Forth is a programming language, while Minix is a social movement.
The question is not whether an algorithm contains a stack or a for-loop; the question is about the license of the software. The term license is a legal term and describes the copyright situation. That means we are able to download GNU software for free from the internet, while most Forth software is not freely downloadable.
In a diary, http://www.homebrewcpu.com/minix_port.htm, the inventor of the MAGIC-1 computer describes how he installed Minix and also a webserver on the system. The most interesting aspect is that he himself is not the author of that software. That means he simply used the Minix OS, written by somebody else, and he needs no idea of how to program a webserver. And that is the major difference between a Forth-driven CPU and a Minix-driven homebrew CPU. Minix is a kind of social biotope in which knowledge is distributed freely, while Forth is about protecting electronics knowledge behind a software license. If a Forth programmer has no idea about TCP/IP, he is not able to run a webserver: he cannot download ready-to-run software from the internet, because first there is no webserver available and second it is not under the GPL license.
In a YouTube video, the inventor of the MAGIC-1 homebrew CPU explains how the Minix operating system works on his computer: https://www.youtube.com/watch?v=0jRgpTp8pR8 The surprising fact is that, according to the video, the MAGIC-1 computer has solved the software crisis. That means that after installing the MINIX operating system, the computer not only works on a technical level, but the user also has access to a huge software library. The video demonstrates the Minix system itself, which contains a /bin directory with lots of preinstalled software. The more interesting aspect is that it is possible to compile much more software from source code and run it on the machine as well: software which was already written and is available right now. That is the major difference between the MAGIC-1 and a homebrew computer programmed in Forth.
Role playing games
Right now there are two famous examples of homebrew CPUs available on the internet: the Mark 1 FORTH computer (a Forth-based CPU) and the MAGIC-1 (a Minix-based homebrew project). What is the difference? The hardware is nearly the same: in both cases a computer able to run software was built from scratch, and both projects can be seen as replicas of early real computers. And here comes the difference, because the two projects are enacting different social role-playing games.
The Mark 1 Forth CPU can be seen as a second run of the Microsoft idea, while the MAGIC-1 project has a lot in common with the PDP-11, which was a Unix machine. The question is not what the machine itself can do; the question is what follows after the machine is realized. Microsoft was based on the idea of programming commercial software, which means the source code is restricted and books about the inner workings of the operating system are not available. That is the same idea behind the Mark 1 Forth project and the later iForth compiler, which is a commercial Forth compiler. In contrast, the Unix development around the PDP-11 machine was realized as an open system, which means that the source code is free.
Let us describe some of the ideas behind UNIX. A striking observation is that the amount of disk space consumed by the programs is huge: a Unix machine in the 1980s needed around 200 MB of hard-drive space, while today's machines need around 10 GB. The reason is that the number of programs is huge and there are often many programs for the same purpose. For example, under Unix at least 100 different text editors are available, written by people from all over the world. The reason why so much software is available for UNIX has to do with the low cost of creating new software.
The original tapes with BSD Unix were distributed for free, and even beginners were able to write their own text editor in the C programming language in a short amount of time. In contrast, writing software for a Forth system is hard, and the number of people who are able to do so is limited. The typical Forth-style programmer works for a commercial company, for example for Microsoft. And that is the reason why the software quality of Microsoft is higher than that of a Linux system.
Easy and hard tasks
Building a computer and a BIOS system from scratch is hard, right? No, it is not. Wiring some transistors together to form a CPU and booting the system into a Forth prompt is an easy task. On the hardware level some low-cost components are enough, and porting a Forth integrated development system to the new CPU is also very easy. Easy means here that both parts (hardware and software) can be realized by a single person in under a year: the amount of work needed is measured in man-months, not years.
That this is not only a theoretical assumption but reality can be seen in the endless number of homebrew CPU projects. Usually they are completed successfully; at the end, at least a Forth system, and sometimes a BASIC interpreter, greets the user at the boot prompt. But there is another task which is far more complicated: writing software for such a computer. That is a task which needs much more energy. A single programmer can write only a limited amount of code in one year. That means a single amateur is not able to program a Linux-like operating system from scratch in his garage, even if he does nothing else. His project will fail.
Even large companies have problems developing large software repositories from scratch. A company which takes 10 highly skilled programmers and lets them work for one year on a computer game will probably fail. The reason is that the number of man-hours needed to develop a complex game is greater than the output of only 10 programmers. This is not a problem of understanding the inner workings of a computer; usually the programmers are familiar with programming languages, compilers, stacks, and reverse Polish notation. The problem lies somewhere else. It is called productivity, and it means that a single person cannot write more than about 10 lines of finished code per day. If the new game should have 10 million lines of code, the number of required work-hours can be calculated directly. That is the reason why huge software projects fail, and for the same reason we have a software crisis.
An easy task is anything which needs a low number of man-hours, for example building a homebrew CPU from scratch or porting a 2 KB Forth interpreter to a new CPU. The amount of work can be predicted in advance, and it is low: the average expert needs less than a year. In contrast, difficult tasks are problems which need far more man-hours; even if the programmer is an expert in his field, he is not able to present a result in under a year. Let me give an example. Suppose we need an operating system which consists of 1000 lines of code. With an average productivity of 10 lines per day, such software can be programmed in 100 days by a single person. That means it is an easy task, because the programmer will present a working result in under a year.
Now suppose we need an operating system which contains 1 million lines of code. If the same programmer with the same productivity starts his job right now, he will be finished in about 274 years. That means the project fails. What is the reason for the failure? It is not a misunderstanding of what a computer is; it is not missing programming skills. The problem has to do with management. To realize huge software projects, many programmers have to work in parallel, and the programming work has to be distributed among them. And this is the reason why Linux is superior to Forth: not for technical reasons, but because of management issues.
The question is not how an operating system works from a technical perspective; the main question is how to manage large groups of programmers so that a huge codebase can be created and improved in teams. The reason why the C programming language, and not Forth, is used for big applications has to do with this question.
Z80 UNIX
For a while now there have been projects trying to port the UNIX operating system to Z80 CPUs: https://www.youtube.com/watch?v=1WG8zopGzaA The Z80 can be seen as the standard CPU of the homebrew computer community. But what is so special about this effort? Usually, homebrew computers are not driven by dedicated operating systems; instead, CP/M- or Forth-like system kernels are used. Installing Forth or CP/M on a homebrew computer effectively means installing no operating system at all, because the amount of RAM is low and the assumption is that no OS is needed to run the machine. Indeed, many Z80 projects from the past did not need an operating system; the proof of concept dealt with the computer itself, meaning a simple Forth prompt without any further features was enough.
The difference between using a dedicated operating system like UNIX and not using any OS has to do with the number of lines of code. A system which has only a BIOS-like interface contains no more than 100, perhaps 200, lines of code. The focus is not on a software-engineering task but on the computer itself. From the perspective of a user, such systems are useless. This has nothing to do with understanding the computer, but with using the computer to do useful things. The idea of installing UNIX-like operating systems on small 8-bit machines is a new development, something which breaks with the past. The insight is that a Z80 computer, even if it works great, is useless on its own, because a computer without software can't be used in practice. The user is not interested in registers, memory addresses, or interrupts; he is interested in running games, webservers, and databases on the machine. Such a use case is a bit exotic because it lies outside of core computer science; in the literature it is called practical computing and software engineering.
The main idea behind “UNIX on the Z80” is simple: it is not the C programming language or the UNIX guidelines that matter; such a project is about lines of code. A typical UNIX implementation contains at least 10,000 lines of code, sometimes more, and the number of features is proportional to the lines of code.
To understand what is new about the FUZIX operating system, we must focus on how Z80 computers were used in the past. Usually they were used with raw assembly code or with a slim operating system written in assembly, for example CP/M or MS-DOS. The idea was that the software is an add-on to the CPU and less important. That means the typical homebrew Z80 computer was delivered without operating system software, or with a small Forth interpreter which fits in 1 KB of ROM. The assumption was that the average user doesn't need more. But in reality, the reason this was common has to do with the amount of energy which has to be invested: programming an operating system with many features is far more complicated than writing a simple BIOS routine in Forth, and to save time, it simply wasn't done.
Nowadays, little has changed. In spite of technological progress, it is as hard as ever to write an operating system from scratch, especially one with features like TCP/IP and a GUI.

What is wrong with post-publication peer-review?


Classical academic journals work with pre-publication peer review. That means the author submits his manuscript to the journal, the journal takes six months to read the paper, and then it rejects the manuscript because of its low quality and missing innovation. Why has such a system remained remarkably stable for at least 100 years? It has mainly to do with the workflow by which journals and authors create new manuscripts. The technology they use is print-based: the author uses a mechanical typewriter, and the journal uses a steam-driven printing press.
Now let's take a look at how the typical PhD dissertation is written in the year 2018. The surprising news is that the workflow is very old-school. The PhD candidate uses MS Word, prints out every iteration to proofread it manually off-screen, and the final dissertation is printed out in a copy shop and gets a leather binding which alone costs 100 US$. Is it possible to build something different from pre-publication peer review on top of this outdated workflow? No way.
Now I want to describe the alternative. At first, the creation process has to be adapted. A digital only manuscript is the result. That means, the paper is written entirely onscreen, and no draft version are printed out. Also the final manuscript isn't printed out, it is simply a pdf file on the computer. Then PDF file is send to the publisher, which also never print it out, but put the information direct on the webserver, so that everybody can read it. This wonderful state-of-the art workflow has only a minor problem. It is not used in reality. It is not the way how current phd-candidates or current academic journals are working. They are preventing modern technology because they are not familiar with it, or because they are in fear, that there is much resistance against trying out new things.
The funny thing is that an all-electronic workflow and post-publication peer review amount to the same thing. If neither the author nor the journal prints out the manuscript, they can put the content on the webserver first and only ask in a second step whether the quality is high enough. In my opinion, only post-publication peer review is real peer review, because only if the potential peers are aware that a manuscript is under way can they decide to read it. The problem is not that a wrong paper gets published; the problem is that classical academic publishing has no plan for handling wrong information. It is simply not possible for Elsevier to publish a manuscript first and retract the paper later because something is wrong with it. Instead, every manuscript which is published is treated as valid.
If we read carefully through the papers published from 1950–2010 we will notice that nearly 100% of all citations are positive citations. That means author 1 cites author 2 because he has the same opinion and adds only minor comments. All the authors and publishers are basically saying the same thing and have the same understanding of science. There is no debate; instead, academic publishing works like an ivory tower.
Electronic publishing and post-publication peer review are both disruptive technologies. That means they work differently from what was known in the past. As far as I know, there is no purely electronic publisher out there; it is something which hasn't been invented yet. The only place in which post-publication peer review and electronic publication work right now is an imaginary future called Open Science, which is a plot available only in PowerPoint presentations but not in reality. Open Science is indeed electronic and works with post-publication peer review. But it is not something we have today; it is something which may be realized in 20 years, in 30 years, or never.
The debate around Open Science has the goal of preventing such a world. On the one hand, the PhD candidates print out their manuscripts and the publishers do the quality check before the manuscript gets published, and at the same time the public is told how wonderful the opposite would be. The best example is perhaps a printed journal behind a paywall which publishes lots of manuscripts about the future of academic publication. That is a very safe way to debate the future, as long as it is not realized in reality.
Let us listen carefully to so-called Open Access discussions. Are these talks about the current situation in academic publishing? No. Open Access and Open Science are mainly requirements which the researchers have. That means they hope that Open Science will come true and they explain why they need it; for example, if the paper is electronic, the costs are lower. These requirements are not fulfilled yet. There is a need for new Open Access publishers which are not available right now. The funny thing is that even the publishers argue in the same way. That means Elsevier, for example, explains its requirements for Open Access publication, and the public, that means somebody other than Elsevier, is supposed to fulfill the demand.

The MAGIC-1 homebrew computer


http://www.homebrewcpu.com/overview.htm is a website about an amateur CPU project called MAGIC-1. It is not the first attempt to build a computer from scratch, but it is one of the most interesting, not because of the hardware (it is made from TTL chips and some RAM) but because of the software stack.
The inventor describes this part as very demanding:
“Although the hardware design and construction of Magic-1 usually gets the most attention, the largest part of the project (by far) has been developing/porting the software.“
That means developing the CPU itself, wiring the cables and making the current flow was the easy part. Only programming an operating system is really difficult. And the inventor is right: the main problem with homebrew computers is not the electronics part of the project. Most of the man-hours are usually invested in the software stack.
The MAGIC-1 project is interesting because it is not running CP/M but a port of Minix. Minix is a Unix-like operating system which can be recompiled for the machine and has a built-in TCP/IP stack.
But why is developing the hardware for a computer the easy task and programming the software the hard one? Let us reduce the situation to a synthetic game. First we build the computer itself. We do not take real transistors but program a Brainfuck interpreter. Such an interpreter can be realized in under 100 lines of code without any problems: https://github.com/kgabis/brainfuck-c/blob/master/brainfuck.c If we start the interpreter we get a fully functional computer. It has memory and can execute commands. This is equivalent to building a computer in hardware: after switching the device on, it is ready and can execute any program.
Now comes the difficult part. We want to program our Brainfuck interpreter. The surprise is that even if we had invented the Brainfuck interpreter ourselves and know every detail of it, we are not able to program the machine easily. For example, if the task is to program a simple operating system or a string-search algorithm, we need many weeks until the software is ready. And this situation is not only true for the Brainfuck soft-emulator; it is the normal case for any homebrew CPU project. The computer itself (no matter whether it is an emulator or real hardware) can be realized in a short amount of time and is described in the literature in detail. But programming a game, an operating system or anything else on the computer is very hard.
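To make the "computer in under 100 lines" claim concrete, here is a minimal sketch of such a Brainfuck interpreter in C++ (my own illustration, not the linked brainfuck-c code; it assumes a well-formed program with matched brackets):

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Run a Brainfuck program and return everything it printed.
// The tape is the machine's memory; the program string is the "ROM".
std::string run_bf(const std::string& prog, const std::string& input = "") {
    std::vector<unsigned char> tape(30000, 0);
    std::map<std::size_t, std::size_t> jump;   // matching [ ] positions
    std::vector<std::size_t> stack;
    for (std::size_t i = 0; i < prog.size(); ++i) {   // pre-compute bracket pairs
        if (prog[i] == '[') stack.push_back(i);
        else if (prog[i] == ']') {
            jump[stack.back()] = i;
            jump[i] = stack.back();
            stack.pop_back();
        }
    }
    std::string out;
    std::size_t ptr = 0, in = 0;
    for (std::size_t pc = 0; pc < prog.size(); ++pc) {
        switch (prog[pc]) {
            case '>': ++ptr; break;                    // move data pointer right
            case '<': --ptr; break;                    // move data pointer left
            case '+': ++tape[ptr]; break;              // increment current cell
            case '-': --tape[ptr]; break;              // decrement current cell
            case '.': out += static_cast<char>(tape[ptr]); break;
            case ',': tape[ptr] = in < input.size() ? input[in++] : 0; break;
            case '[': if (!tape[ptr]) pc = jump[pc]; break;  // skip loop
            case ']': if (tape[ptr])  pc = jump[pc]; break;  // repeat loop
        }
    }
    return out;
}
```

Starting this interpreter really does give a "working computer": for example, the program "++++++++[>++++++++<-]>+." computes 8*8+1 = 65 in a cell and prints the letter 'A'. Writing the interpreter is the easy half; writing non-trivial programs in its instruction set is the hard half the text describes.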

June 16, 2018

Why Forth is used by homebrew computer enthusiasts


Forth seems to be a remarkable language because it is so different from mainstream programming languages. Its main features are the stack and reverse Polish notation. The reason why Forth is a great programming language can be understood if we focus on a special sub-discipline of computing: the homebrew computer scene. The experts there create their own CPUs and their own operating systems from scratch. That means they ignore standard x86 CPUs from Intel, they are not interested in overclocking Raspberry Pi hardware, and they reject even open-source operating systems like Linux. The reason is "not invented here", which here stands for the wish to understand a system from scratch. For example, the Linux kernel consists of 100 MB of source code written by others, not by the homebrew computer scene, so it is not acceptable to use that software on their own system.
Let us go into a bit more detail. Suppose we have created a simple 4-bit computer out of 2000 transistors. What comes next? Right, we need a programming language and an operating system. The simplest way to implement a programming language from scratch is reverse Polish notation. The general idea is given by a calculator program which is able to parse and execute calculator statements: https://rosettacode.org/wiki/Parsing/RPN_calculator_algorithm#C.2B.2B
As input the system gets a string like "3 4 2 *" and then it prints out the result. The reason why RPN notation is used is that the resulting source code is very small. The examples from the Rosetta Code challenge are no longer than 100 lines of code. No matter which programming language is used (C++, Python or whatever), it is very easy to write an RPN parser. If we want to parse not only calculator statements but complete computer code, the situation is the same. Writing a parser for an RPN programming language can be done in fewer lines of code than writing a normal parser which parses expressions in infix order.
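The smallness of an RPN parser is easy to demonstrate. Below is a minimal sketch of such an evaluator in C++ (my own illustration, not the Rosetta Code version): there is no grammar, no precedence table, no parse tree; numbers are pushed on a stack and each operator pops its two operands.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Evaluate a whitespace-separated RPN expression such as "3 4 2 * +".
double eval_rpn(const std::string& expr) {
    std::istringstream in(expr);
    std::vector<double> stack;
    std::string tok;
    while (in >> tok) {
        if (tok == "+" || tok == "-" || tok == "*" || tok == "/") {
            double b = stack.back(); stack.pop_back();   // right operand
            double a = stack.back(); stack.pop_back();   // left operand
            if (tok == "+") stack.push_back(a + b);
            if (tok == "-") stack.push_back(a - b);
            if (tok == "*") stack.push_back(a * b);
            if (tok == "/") stack.push_back(a / b);
        } else {
            stack.push_back(std::stod(tok));             // plain number: push it
        }
    }
    return stack.back();   // result is the remaining top of stack
}
```

The whole evaluator fits in roughly 20 lines, which is exactly the property that makes RPN attractive for machines with a few kilobytes of memory: eval_rpn("3 4 2 * +") yields 11 without any lookahead or operator-precedence logic.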
The constraints under which the homebrew computer scene operates have to do with limitations. They want the smallest possible computer and the smallest possible operating system. They are not interested in a 64-bit Intel CPU with billions of transistors or in an operating system with a size of 10 GB; they want a tiny 4-bit CPU made of simple transistors and an operating system which is smaller than 1 kb overall.
Does this make sense? Yes and no. On the one hand, RPN-based programming languages and parsers are small and powerful at the same time. On the other hand, it is not possible to use a homebrew computer for many useful purposes. For example, it is not possible to watch a YouTube video on such a CPU, or to use Forth for object-oriented source code with a huge number of lines. That is the reason why homebrew CPUs and RPN notation have been forgotten by the mainstream: they were replaced by more advanced systems. A modern CPU is faster than a 4-bit CPU, and Linux is more advanced than a self-made Forth OS of under 1 kb.
In theory it is possible to extend a Forth system into a much larger operating system, for example by adding features like TCP/IP and a graphical user interface. In reality, such projects do not exist. That means, somebody who is interested in a powerful hardware/software combination prefers the C programming language together with a COTS CPU.
We can assume that homebrew CPUs and Forth will never die, because as the complexity of mainstream hardware and software rises, there is at the same time a need for a low-complexity system for educational purposes. That means, from a teaching perspective, a 4-bit homebrew CPU together with a 1 kb Forth operating system is a good starting point to describe a computer from scratch.
Without any doubt, Forth is the simplest programming language to parse. That means software which is able to execute a Forth program needs less disk space than an interpreter for C++. On the other hand, Forth is not the language of the future. Modern software will not be written in Forth, because it is not comfortable to use. A so-called high-level programming language like C++ together with hundreds of libraries is very complicated to parse. For example, to get a system which is able to execute a hello-world GTK+ program we need at least a computer with 4 GB of RAM and 10 GB of hard disk. The number of man-hours needed to program a C++ compiler together with the GTK+ libraries is huge. But using C++ in daily life is easier than using Forth. That means the software industry invests the effort to produce something more powerful than Forth: not on a theoretical level, because any problem can be solved in Forth too, but from the community aspect, because real users need real computer software.
Let us describe the situation from the consumer perspective. Suppose company A delivers a Forth computer. The system has 5 kb of RAM, an integrated Forth system which is error-free and tested, and a slim user manual which looks very similar to the handbook of the Jupiter ACE home computer. In contrast, company B delivers a 64-bit computer with lots of RAM, a preinstalled Fedora operating system, and software which has lots of bugs and needs an update every day. Which product will be more interesting for the home user? Right, the second one. It has faster hardware, more software and is more advanced.
If Forth and 4-bit microcontrollers are so great, why have we seen a software crisis since the 1970s? Let us go back to that time. In principle, a 4-bit CPU together with Forth is able to run any program. It is a working computer and it is possible to create error-free software for it. In reality, such an attempt will fail. The mismatch between the end user's demand for advanced software and the inability of the software industry to deliver such a product is called the software crisis. That means, if a company can only deliver a 4-bit microcontroller plus 3 kb of RAM which has to be programmed in Forth, this is a good example of a software crisis. The end user isn't interested in learning Forth; he wants to play games in high resolution, watch videos and type text in a GUI. The software crisis of the 1970s was solved with better hardware and software: with CPUs that have 32 or even 64 bits, and with software like C compilers, graphical operating systems and object-oriented programming. Sure, we can ask whether we really need a 64-bit CPU and whether we really need a mouse to scroll through the internet, but this would ignore the problem called software crisis. The argument would be that with a 4-bit microcontroller and an RPN parser everything is great, and only the consumer is the problem, because he doesn't understand it.
Microsoft and Forth
There are many similarities between Forth and Microsoft. It may be a bit surprising, but most Forth enthusiasts use MS Windows as their main operating system to program their custom-built Forth CPUs. To find the reason we must go back to the early days of Microsoft. The company was founded with the idea of bringing homebrew computers to the masses. The early Apple II and Altair computers are examples of low-cost hardware which was extended with Microsoft software into a personal computer. The motivation behind Microsoft in the 1970s and behind many homebrew CPU projects today is the same: build a simple computer, write a compiler, and ship it to the customer.
On the one hand, Microsoft software and the Forth programming language solve a lot of problems, for example the problem of how to boot the computer. That means the hardware is powered on, the user sees a prompt and can enter a short BASIC or Forth program. Every homebrew CPU proves something: it proves that it is possible to build a computer and run software on it. Usually the demonstration that a new homebrew project was a success is to run a hello-world program on it, in most cases a prime-number algorithm or a mini Tetris-like game. Once this demonstration is done, the work of the homebrew CPU enthusiast is finished; he is out of the loop. He has built up the 8-bit CPU from transistors, programmed the BIOS, and also provided software for programming the device.
What today's homebrew CPU fans are doing is replicating the work of Microsoft in the 1970s. The assumption is that a home computer is not available and that the task is to build such a system from scratch, which needs a certain amount of knowledge in electronics and also a bit of knowledge about compilers.
There is only one problem. Microsoft and other homebrew CPU projects leave out one important point: what comes after the project? The example of a homebrew Forth CPU, and also of a computer running Microsoft Windows, shows that it is not enough to simply build and deliver a home computer. In theory it is possible to use such a device for something useful; in reality the more demanding problem is to write additional software for it. That means, even if the homebrew CPU project was a great success and the system works (including the Forth interpreter), it is not possible to use such a device as a database server, a LAMP server or a graphics workstation.
In an earlier paragraph I introduced the term software crisis. Homebrew CPUs and Microsoft have produced the software crisis. A software crisis is a situation in which the hardware works and the computer prints out the hello-world prompt, but something else the user needs is missing. A software crisis does not mean that the computer is broken. No, the hardware works, and the Forth interpreter too. A software crisis means that on top of this minimal system no other software is available. And this is the reason why Linux is something different from Forth.
Linux is not about building a homebrew CPU or writing the BIOS. Linux, and especially the GNU movement, is about creating high-level software, which means end-user applications like a word processor, databases, programming-language compilers and so on. The precondition of a GNU system is that the computer works. That means the hardware must boot up, and some kind of simple BIOS must be there. Linux is not able to answer the question of how to build a computer out of transistors, nor does it answer the question of how to implement a BASIC on that system. Linux is about how to solve the software crisis, that means to provide high-level software.
Surprisingly, this task was never fulfilled by the homebrew computer scene. Microsoft's answer to the customer was that he should pay for a word processor or a webserver if he needs such software. And the answer of the Forth community is the same. That means neither Microsoft nor the Forth community provides high-level software to the customer for free. They are not solving the software crisis.
Microsoft and Forth are both technically motivated. The idea is to realize a working computer system, and this knowledge is used in a commercial model to write high-level software. The consequence is that the current iForth distribution costs money and that MS Word costs money. In contrast, the GNU movement is a non-technical ideology. The idea is that the source code should be free, and the language the source code was written in is not important. The Linux kernel was written in C, some high-level programs like Google Chrome are written in C++, and the LaTeX system was written in yet another language. To be honest, the GNU movement has no favorite, because the GNU movement isn't interested in technology. It wants to solve the software crisis.
Let us make a concrete example. We take an Altair 8800 computer from the museum and boot the system to the BASIC prompt. From a technical point of view, the system works great. That means the current flows through the wires, the BASIC prompt shows that everything is ready, and if the user enters a small hello-world program he can see the result. So we have a wonderful computer on the desk and the Microsoft BASIC works great. What's next? And exactly this is the problem: there is nothing the user can do with the machine. The machine itself is great; from a technical point of view the system is error-free. On the hardware and the software level everything works fine. And this is the best example of why the problem is called a "software crisis": because there is no software available. Only the microcode in the Altair and the BASIC interpreter are there. That means the total amount of software is around 2 kb, perhaps 4 kb.
Can we use these 4 kb of software to do something useful with the machine, for example typing a LaTeX document? No. The 4 kb BIOS only provides an interface to the hardware; to use the computer for any purpose we must program the machine first. If we are not able to program it, if we have no preprogrammed software on magnetic tape, and if we have no money to buy such software in the store, we see the software crisis live and in color. That means a working Altair computer with a minimalistic operating system is the solution to one problem and the precondition for the next.
Let us describe what the problem is not. The problem is not how to wire transistors together into a CPU. The problem is not to program a Forth interpreter as a BIOS, and the problem is not booting the machine. All of these problems were solved. The problem is that we as consumers need something more than a Forth, a Microsoft BASIC and a working homebrew CPU.
Now we can describe better what a possible answer to the software crisis looks like. In the case of the Altair computer it would be a box full of magnetic tapes labeled "word processor", "database application", "game 1", "Mario game", "Tetris game", "webserver", "Pascal compiler", "spreadsheet application" and so forth. What these products have in common is that they have nothing to do with computing itself, but with legal aspects. Software is usually copyright-protected, and it can't be created out of nothing. To create such software a company is needed, and many man-hours too. That means the magnetic tape with the Tetris game has nothing to do with reverse Polish notation, 8-bit assembly instructions or a graphics card; it has to do with the questions of who is the author of the software, which software company programmed the game, what is the price of the software, what is the license, which features the software has, and so on.
A homebrew Forth CPU can be called a technical movement, and the GNU manifesto can be called a social movement. A Forth CPU has to do with stacks, address spaces and memory consumption, while the GNU movement has to do with the GPL license, making the world better, and patents.

June 14, 2018

Operating systems for homebrew CPUs


With the advent of the internet, a huge number of websites about homebrew CPUs were created. These are self-made computers in 8-bit style, often built from 2000 transistors or fewer, with the aim of fully understanding what computing is. Such projects go into the details and build not only the ALU in hardware, but also the memory and the microcode. A side discussion on homebrew CPUs is the question of which operating system such a system can run. Usually one of the following is used: Contiki (an 8-bit TCP/IP system), BASIC (the classical 1980s home-computer choice) or Forth (a stack-based language and operating system).
The general idea behind this software is very similar. The constraint is that a homebrew CPU has only limited resources, which means it is an 8-bit or sometimes a 4-bit CPU with a small amount of physical RAM. The idea behind Contiki, BASIC and Forth is to run under such conditions. At the same time, the disadvantages become obvious. For example, the Contiki system is well programmed, but compared to a full operating system like Linux or Android it lacks features. That means it is not possible to run LaTeX under Contiki or use it as a video encoder.
The main difference between operating systems for homebrew CPUs and their full-blown equivalents is the amount of code. That means the Linux kernel needs 100 MB while Contiki is happy with 2 kb. To assume that Contiki can do everything Linux is able to do is wrong. That means it makes no sense to install Contiki on a standard PC; such a system would be inferior to a real Linux server.
The main purpose of homebrew CPUs, BASIC, Contiki, Forth and other small systems is education. A homebrew CPU together with Forth is usually not created because somebody needs a computer or wants to run software, but because he wants to learn to program a computer and design it from scratch. A 2000-transistor CPU together with Contiki is a training environment for learning and teaching certain elements of computing.
A typical example of a BASIC interpreter for homebrew CPUs is EhBASIC. It is a modern BASIC which runs on 6502-like CPUs. Its purpose is similar to that of a Forth system: with EhBASIC the user gets a minimalistic environment in which he can write further programs. The EhBASIC system serves to demonstrate the working hardware. If the self-made CPU boots the BASIC interpreter and runs a short prime-number program, everything works fine.
The most dominant aspect of EhBASIC is that it can't be extended. That means it is not possible to grow the system into a full-blown operating system by adding some extra source code; it was designed with the specific purpose of being a small system. Let us compare EhBASIC with the Android operating system. What is the difference? First, the demand for CPU and RAM resources is higher in the Android world. Second, it is easier to write software for Android, because the operating system has a lot of working APIs for graphics and network interfaces. Somebody who boots a BASIC interpreter perhaps asks why Android is so huge. Isn't it possible to write such an operating system in under 5 kb? No, it is not. If someone reinvents Android or Linux from scratch he will always end up with the same 1 GB of source code (or more) which needs lots of RAM to run.

Using C++ like MS-Excel


Microsoft Excel and the open-source clone LibreOffice Calc are very good programs for prototyping spreadsheets. A new Excel file is created quickly, and the user can try out some calculations. The major problem is that the amount of programming possible inside an Excel sheet is not very high; that means it is impossible to use MS Excel for prototyping GUI games or a simple search algorithm.
What is possible is to use the MS Excel metaphor to describe a workflow in the C++ programming language. Let us imagine that C++ is like Excel and we want to create a software prototype. How do we do this?
First, we create a working directory and inside the folder we execute "git init". In the folder we create a project subfolder called "project1". In this folder we store small .cpp files. Every .cpp file corresponds to an MS Excel tab: the project1 folder is the Excel file, and the .cpp files are the tabs in that file.
Now we can enter some content. Like in Java, each file may contain only one class: file1.cpp has class1, file2.cpp class2 and so on. As an additional constraint, the maximum length of each file is limited to 100 lines of code. That means the files must be small. And as a last constraint, the sum of all files should stay under 500 lines of code; otherwise we must create a new project folder.
The overall idea of this structure is to use C++ as a prototyping language for trying out new source code. We can switch between the parts of our program by browsing through the files, and the idea is to use only well-known statements like std::string, for loops and if statements. That means reducing C++ to a Python-like language which knows nothing of pointers, template metaprogramming and other complicated stuff. Instead we imagine that C++ is our MS Excel spreadsheet and the idea is not to write software but to do some trial-and-error work.
A new project folder is created not for writing stable software, but to make a new iteration of the software prototype. If the limit of 500 lines of code is reached, the old project is stopped and a new project is created from scratch. The idea is that the lower the number of lines of code, the easier it is to understand the code.
Working with git and C++ in such a way is possible from a technical point of view. It is only a question of whether the user is aware of it. Most users think that git needs branches and that C++ projects need to be very complicated and contain a lot of pointers. No, they don't. There is no need for making things complicated. Small is beautiful.
I think C++ and git work well for prototyping, as a replacement for MS Excel. The use case is to write new algorithms quickly and throw everything away when the user gets bored.
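A sketch of what one such "tab" could contain (file1.cpp and the class name are the hypothetical names from the layout above, not prescribed ones): one small class, plain std::string and std::vector, loops and if statements, well under the 100-line limit.

```cpp
#include <string>
#include <vector>

// file1.cpp style: exactly one small class, no pointers, no clever tricks.
class WordSearch {
public:
    // Return the index of the first word equal to `needle`, or -1 if absent.
    int find(const std::vector<std::string>& words, const std::string& needle) {
        for (int i = 0; i < static_cast<int>(words.size()); ++i) {
            if (words[i] == needle) return i;   // simple linear scan
        }
        return -1;
    }
};
```

Such a file is cheap to write and cheap to throw away, which is the point of the spreadsheet metaphor: the class is a scratch cell, not an architecture.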

June 13, 2018

Natural language interface for solving the frame-problem with a layered game-engine

Title: Natural language interface for solving the frame-problem with a layered game-engine

Author: Manuel Rodriguez

Date: 13. June 2018

Abstract: Planning algorithms like A* and RRT work efficiently only for small problem spaces, while most robotics problems have a huge problem space. The answer to this mismatch is a domain model enriched with heuristics. In the early history of AI this problem was discussed under the term "frame problem" and means describing the action model in an object-oriented programming language. For the examples of a text adventure and parking a car, a game engine is presented which uses natural-language commands for storing domain-specific knowledge.

Table of Contents

1 AI Planning
1.1 Knowledge based planning
1.2 The GIVE challenge
1.3 Automatic creating of a pddl file
1.4 Theorem proving for beginners
1.5 Storing domain knowledge in a game-engine
1.6 Plan stack in HTN-planning
1.7 Building a simulator for HTN-planning
1.8 Combining autonomous and remote-control
2 Example
2.1 HTN Planner
2.2 Car physics
References

1 AI Planning

Mindmap

1.1 Knowledge based planning

Artificial Intelligence in simple games like chess and more complex robotics domains can be realized with AI-planning:

“Planning is the most generic AI technique to generate intelligent behaviour for virtual actors.“ [8]

What a planning system looks like for a chess-like game is already known: it is a brute-force search in the game tree. More complex games can be planned too, but the algorithm is more complicated. The standard procedure is a combination of a hierarchical planning system with a machine-readable knowledge representation. It means storing the game on different layers in a formal model so that it can be searched in real time. Well-known forms of storing knowledge are STRIPS and PDDL. Both are languages for formulating a symbolic game engine; they are used for the high-level layer of a planning domain.

But let us go a step back. The only known algorithm for solving a planning task is a search in the game tree. The algorithm is called A*, or in newer literature RRT. RRT alone is not able to solve complex tasks; it must be applied on different layers at the same time. The layers depend on the game; they represent the domain-specific knowledge.
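As a reminder of what this base-level search looks like, here is a minimal A* sketch in C++ on a small 2D grid (my own illustration of the game-tree search, not the paper's planner; it uses the Manhattan distance as heuristic and returns the length of the shortest path):

```cpp
#include <cmath>
#include <functional>
#include <queue>
#include <tuple>
#include <vector>

// A* on a grid: cells with value 1 are obstacles. Returns the number of
// steps of the shortest path from (sx,sy) to (gx,gy), or -1 if unreachable.
int astar_steps(const std::vector<std::vector<int>>& grid,
                int sx, int sy, int gx, int gy) {
    int h = grid.size(), w = grid[0].size();
    auto heur = [&](int x, int y) { return std::abs(x - gx) + std::abs(y - gy); };
    std::vector<std::vector<int>> dist(h, std::vector<int>(w, -1));
    using Node = std::tuple<int, int, int, int>;   // (f = g + h, g, x, y)
    std::priority_queue<Node, std::vector<Node>, std::greater<Node>> open;
    open.emplace(heur(sx, sy), 0, sx, sy);
    dist[sy][sx] = 0;
    const int dx[] = {1, -1, 0, 0}, dy[] = {0, 0, 1, -1};
    while (!open.empty()) {
        auto [f, g, x, y] = open.top();
        open.pop();
        if (x == gx && y == gy) return g;          // goal popped: g is optimal
        for (int d = 0; d < 4; ++d) {
            int nx = x + dx[d], ny = y + dy[d];
            if (nx < 0 || ny < 0 || nx >= w || ny >= h) continue;
            if (grid[ny][nx] == 1) continue;       // blocked cell
            if (dist[ny][nx] != -1 && dist[ny][nx] <= g + 1) continue;
            dist[ny][nx] = g + 1;
            open.emplace(g + 1 + heur(nx, ny), g + 1, nx, ny);
        }
    }
    return -1;
}
```

The point of the surrounding text is precisely that this kind of flat search stops working once the state space is no longer a tiny grid: the layered, domain-specific knowledge exists to keep each individual search this small.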

The bottleneck is that for most games no machine-readable domain description is available. For example, in a pick&place task for a robot it is unknown what the story looks like and what the outcome of an action is. The programmer is not searching for a solving strategy; instead he is searching for the game. That means he must enrich the given game with detailed rules. Only these rules can be solved by the planner.

[12] calls this process "domain knowledge engineering" and describes a software tool (GIPO) which makes it easier to create a planning game from scratch. On page 4 an example is given. The original game is the "docker worker robot", and the programmer must define potential states of the robot like "load", "unload", "at-base", "busy" and so forth to enrich the game with knowledge. The result is a PDDL file which can be used in a planning task.

LfD & HTN

A human operator has an implicit domain model; he knows that before he can grasp an object, first the gripper has to be opened. This domain model is not available as source code and has to be programmed first. To close the gap, "learning from demonstration" is the right choice for constructing a "hierarchical task network" from scratch.[3][6] The human operator has a GUI in which he records a manual motion, and from this demonstration a machine-readable ontology is created. This task model can be used by the HTN planner for handling the task autonomously.

The aim is to transfer the knowledge from the human operator to the agent. The knowledge is similar to a walkthrough tutorial for a game: a description of how to play it. In a hierarchical task network the knowledge is formalized as a symbolic game engine. That is a software module which can predict future game states, e.g. robot-grasp-object -> object-isin-hand. Usually the description is based on natural language. Instead of using simple variables in the game engine like:

bool a,b,c
int d,e,f

the variables have sounding names like “bool object-isin-hand true/false” or “int distance-between-object-and-gripper”. The reason is that the domain model is not primarily programmed for a machine but as an aid for the software-engineering process. Creating a domain model is not a mathematical algorithm like A* but a software-engineering task like UML and agile development. The result of this step is not a PDDL file or executable code; it is a human-readable paper which is called the specification of the game. The algorithm does not learn the game by itself; instead a version control system like git is used to bring the domain model to the next version.
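The contrast between anonymous variables and sounding names can be sketched as a small C++ struct. The member names below are hypothetical, chosen only to illustrate the naming idea; the planner itself treats them no differently than a, b, c:

```cpp
#include <string>

// Hypothetical domain state for a pick&place task. The descriptive
// member names serve the human engineer reading the specification;
// for the machine they are interchangeable with anonymous a, b, d.
struct DomainState {
    bool object_is_in_hand = false;        // instead of "bool a"
    bool gripper_is_open = true;           // instead of "bool b"
    int distance_object_gripper_mm = 120;  // instead of "int d"
};
```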

Grounding means to combine tracking control with natural language.[9] The idea is to describe a task not only with “subtask1”, “subtask2” and so on, but with the correct words, e.g. “heat the water”, “fill the cup”. It is not possible to describe a domain without using natural language. From a machine perspective it perhaps is, because the words have no meaning for the computer; they are only labels. But the human engineers who maintain a task model need natural language. That means a task model is foremost a dictionary and not a mathematical algorithm.

Language model

Under the assumption that every task model is based on natural language, the question to investigate is: what does the language model look like for a certain domain? The GIVE challenge tries to answer this by generating natural-language instructions to guide a human through a task.[2] The idea is that, on the one hand, a dictionary is coded in computer code and can be executed like a game engine, and the output of the engine is used to solve a task.

Perhaps an example will make the point clear. In [7, page 6], the map of a game is shown at the top of the page. It is a normal maze game in which the player moves around and can press buttons. Below the image the PDDL description is given, which can be seen as a natural-language game engine. It provides commands like “move”, “turn-left” and “manipulate-button”. The PDDL description consists of two important aspects:

• natural language words, for example “turn-left” instead of a simple “action2”
• state-action pairs, that means by activating an action, the system enters a new state

The overall system can be seen as a living dictionary. It contains, on the one hand, words and action names, and these can be executed by the user, which brings the system into a new state. The PDDL file contains knowledge about the domain.
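The idea of a living dictionary can be sketched as a lookup table pairing each action word with the state it produces. The action names and state labels below are taken from the maze example; the function name is an assumption for illustration:

```cpp
#include <map>
#include <string>

// A minimal "living dictionary": each entry pairs a natural-language
// action name with the state the system enters after executing it.
// This is the state-action-pair aspect of the PDDL description,
// reduced to a plain C++ map.
std::map<std::string, std::string> makeDictionary() {
    return {
        {"move",              "agent-on-new-field"},
        {"turn-left",         "agent-rotated"},
        {"manipulate-button", "button-pressed"}
    };
}
```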

1.2 The GIVE challenge

The term “Generating Instructions in Virtual Environments” (GIVE) describes a programming challenge with the aim of generating natural language for a domain.[5] The output itself is usually produced by a PDDL solver; that means, for a given current/goal pair the solver tries to find a plan through the domain. As a precondition, the solver needs the PDDL domain model.

A more colloquial description of the challenge is to compare it with a text adventure game which is enriched by a 3D map on top of the GUI. It is comparable to the early point&click adventures of the 1990s, in which the human operator has some actions like “go north” and must reach a certain point in the map while doing subtasks. The interesting aspect of the GIVE challenge is that, from a graphical point of view, everything is minimalist; the focus is instead the task model and its potential application to Artificial Intelligence. Another important aspect of the challenge is that natural language is at the center. That means the idea is not only to let an agent play a game, but to generate natural language which guides a human who is already capable of understanding English.

From a programming point of view, the GIVE challenge is one of the easier tasks. It needs less effort to solve than Starcraft AI or Robocup. The task is not as easy as programming a normal 2D computer game, but it can be mastered by beginners in AI.

1.3 Automatic creation of a PDDL file

A domain model consists of natural language. From the technical side, the domain model is stored in a symbolic planning language like PDDL, OPL or ABPL. The first (PDDL) is a classical language, while the other two have object-oriented features. The easiest way to create a domain model is to program the domain description by hand, with the same technique used to create a text adventure. A more sophisticated idea is to create the language model automatically from event logs.[13]

The idea is to record all events in a section called “agent memory” and then construct the PDDL/ABPL description from the relationships. That means, at the beginning the agent has no domain model; he has to build one from scratch while gathering new experiences. In the literature the term “action model learning” is used. An action model is a symbolic game engine which can be stored in a PDDL file.

Instead of explaining how to realize such a system, we must first describe how to evaluate an action model. First, we need a working symbolic game engine, for example an instance of a text adventure. This game engine produces a stream of natural language: the user can input text, and the dialogue system gives feedback, also in natural language. On the right screen there is an empty prototype. The prototype has the obligation to emulate the working game engine. That means the prototype observes the events, stores everything and after a while acts in the same way. The goal is to reverse engineer a symbolic game engine.
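A minimal sketch of such an “agent memory”, under the assumption that each observation is a (state before, action, state after) triple; the class and method names are invented for illustration. The prototype records what the working engine did and can later replay the same answer:

```cpp
#include <string>
#include <vector>

// One observed transition of the working game engine.
struct Event {
    std::string before;
    std::string action;
    std::string after;
};

// Sketch of the "agent memory": it observes the working engine and,
// after enough events, can emulate it for already-seen situations.
class AgentMemory {
public:
    void observe(const std::string& before, const std::string& action,
                 const std::string& after) {
        log.push_back({before, action, after});
    }
    // What did the original engine answer when this action was
    // executed in this state? "unknown" if never observed.
    std::string predict(const std::string& before,
                        const std::string& action) const {
        for (const Event& e : log)
            if (e.before == before && e.action == action) return e.after;
        return "unknown";
    }
private:
    std::vector<Event> log;
};
```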

1.4 Theorem proving for beginners

In the early history of Artificial Intelligence, theorem proving played an important role. In the context of the STRIPS planning system, such systems were capable of proving mathematical statements. But what is theorem proving exactly? Why is it so hard?

Theorem proving basically means to take a puzzle, for example Rubik's cube, and search for a sequence. The cube has a starting pattern, certain operations are possible, and the goal is to bring the system into a goal situation, for example to make every side a single color, or to restore only one side.

Mathematical theorem proving works with the same idea in mind. There is a starting equation, a number of allowed operations and a goal situation. As in the Rubik's cube example, the idea is to search for a plan; if such a sequence is found, the theorem is proven. In reality, theorem proving is equal to game playing. A game is a system which has allowed moves, and it is up to the player to decide which moves he wants to execute. Automatic theorem proving works surprisingly simply: the so-called SAT solver uses brute-force search, and that's all. If the problem is small, as in the Rubik's cube example, the STRIPS program is successful; in a much larger state space, for example “a theorem prover for chess”, it is much harder to realize, because the number of possible plans is higher.

In the historic paper [4] of 1971, Nilsson discusses the so-called “frame problem”. This means basically that the STRIPS language is only a simple planning language and has no object-oriented features for describing more complex problems. More recent planning languages like “A Better Planning Language” (ABPL) can overcome the frame problem.

Example

From school mathematics we know equations plus rules which can be applied to these equations:

a+4=7
a+4=7 |-4
a=7-4
a=3

The starting situation was an equation, and we applied an operator to it. After the action, the equation is in a new condition. In the example, we selected the action manually, but it is also possible to formulate the problem in the STRIPS language. Such a feature is integrated in most computer algebra systems: the so-called solver plays around with the equation to fulfill a certain condition.
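The manual derivation above can be automated with a brute-force search, in the spirit of a STRIPS-style solver. The sketch below encodes the equation a + k = r, allows the single operation “subtract n on both sides”, and searches for the move that reaches the goal state k = 0 (which isolates a); the struct and function names are invented for illustration:

```cpp
// Brute-force "theorem prover" for the school equation a + k = r.
// The allowed operation is "subtract n on both sides"; the goal
// state is k == 0, after which the right side equals a.
struct Equation { int k; int r; };  // encodes: a + k = r

int solveForA(Equation e) {
    // Sample every allowed move up to a search bound, exactly like
    // a planner samples actions in the game tree.
    for (int n = 1; n <= 100; ++n) {
        Equation next{e.k - n, e.r - n};  // apply "|-n" to both sides
        if (next.k == 0) return next.r;   // goal reached: a = r
    }
    return -1;  // no plan found within the bound
}
```

For the example a + 4 = 7, the search finds the move n = 4 and returns a = 3, matching the hand derivation.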

The “frame problem”

With STRIPS, a powerful planning language is available for proving theorems. The problem is now: how can a robotics problem be formulated in the STRIPS syntax? The question is discussed in the literature as the frame problem, because the assumption is that frame-based, i.e. object-oriented, programming is part of the solution.

A more precise formulation of the problem is given under the term “General game playing”. The challenge here is to invent a game from scratch. That means, as input the system gets a plan trace of checkers, or a text adventure, and the system has to construct all the game rules and code them in the Game Description Language. The sad news is that until now “General game playing” hasn't worked very well: it is not possible to construct real games from scratch. But that is not a real bottleneck, because it is always possible to program manually in STRIPS, GDL or any other language. It is not necessary to use automatic programming to realize a robot.

In reality, the frame problem is equal to a software crisis. That term describes a situation in which there is demand for software but no source code is available, for many reasons. The software crisis in the area of operating systems was solved with Open Source software, and the software crisis in the special domain of game playing and planning languages will be solved with Open Science. For example, if programmer A describes in a paper a UML chart, an ontology and executable code for implementing a pick&place robot, then programmer B is able to reproduce the result and use it as the basis for a more sophisticated system.

The frame problem, the grounding problem and the general game playing challenge can all be solved with better science communication, which works manually with Open Access papers and Open Source software.

1.5 Storing domain knowledge in a game-engine

Domain knowledge has to be stored in machine-readable form. Most of the literature asks how exactly the data storage should look, for example in the PDDL format, in ontologies or in semantic networks. More important is the question of what the result will look like once the domain knowledge is available. It will look like the game engine of a text adventure. That means it is possible to send a command to the engine, and the engine will output the future state.

A symbolic game engine which is controlled by natural language is able to predict future states. For example, a command like “grasp apple” results in the output “apple is in hand”. This logical reasoning has to be implemented in a part of the software called the game engine. A game engine can be programmed in a text adventure markup language, in Javascript and even with ontologies. In the easiest form a text adventure is programmed in Python with object-oriented features. But in general that is only a minor issue; more important is the insight that domain knowledge and a working game engine are equal.

Let us describe what is usually done at the so-called ICAPS conference. In most cases a domain like Blocksworld is converted into a PDDL description. But that is not the inner goal. The more precise description is that a text adventure for the Blocksworld domain was created, and apart from PDDL this can be done in any other programming language. In the end, a game engine must be implemented which can be fed with user commands.

But why is a text adventure so important; isn't it possible to write a normal game engine with a graphical interface? The term grounding means to connect natural language with actions. Grounding means that the engine can parse a command like “grasp apple”. Every grounding results in a text adventure, because a text adventure is about the understanding of natural language. The only open question is how to program such an adventure for a certain domain.

Existing text adventures, which have been programmed since the 1980s, contain domain knowledge in a machine-readable form. They have a clean interface: it is possible to send a request, and the game engine calculates the following state. A text adventure can be seen as the inner core of a working robot control system. It is the part in which the domain knowledge is stored.

In the context of Artificial Intelligence, the general name for domain knowledge stored in a text adventure is “dialogue system”. Dialogue systems like TRAINS and FrOz are usually programmed to support the human player. They have a planning feature out of the box; that means the engine was programmed with the goal that a solver can find a path through it.[1]

1.6 Plan stack in HTN-planning

[10] describes on page 8 the plan stack of an HTN planning system. A so-called plan stack is a list of high-level commands:

1. task1
2. task2
3. task3

and so forth. The term “stack” is correct but a bit misleading, because the data structure for storing a plan is not very important. Any other data structure, for example an SQL table or a CSV file, would also work. The more important aspect is that before building the plan stack, the programmer needs to know the names of the tasks. In the cited example, the tasks belong to the soccer domain. Their names represent elements of the soccer game, for example “pass-the-ball” or “shoot”. The knowledge is not stored in the HTN planner directly, but in the domain model which is used by the HTN planner.
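How trivial the plan stack itself is can be shown in a few lines of C++; the class name is invented for illustration, and the soccer task names follow the cited example. The whole “stack” is just an ordered list of task names drawn from the domain model:

```cpp
#include <string>
#include <vector>

// A plan stack reduced to its essence: an ordered list of task names.
// As the text notes, the exact data structure is unimportant; a CSV
// file or SQL table would serve equally well.
class PlanStack {
public:
    void push(const std::string& task) { tasks.push_back(task); }
    std::string next() {  // remove and return the next task to execute
        std::string t = tasks.front();
        tasks.erase(tasks.begin());
        return t;
    }
    bool empty() const { return tasks.empty(); }
private:
    std::vector<std::string> tasks;
};
```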

I would guess that the stack and the HTN solver are the least important parts of the AI system. That means, storing a list of task names in computer memory is a trivial programming task and can be implemented easily. The bottleneck in hierarchical task networks is somewhere else. As mentioned above, it is the domain model. In general we can say that the domain model is equal to a high-level game engine, some kind of text adventure. The user has commands which he can send to the game engine, and the engine executes them. The game engine has the obligation to predict future game states. That means, after the command “pass-the-ball”, the game engine changes the position of the ball to the receiver.

Basically, the domain model of an HTN planner is equal to a computer game. Because of its high-level nature it is often realized as a text adventure, but it can have additional graphics to visualize the internal states. Let us go into the details for the Robocup case. A game for playing Robocup can be programmed on many levels. First, it is possible to play the game physically with real robots, while in the 3D Simulation league only a physics engine is used. In the 2D Simulation a different kind of physics engine (perhaps a 2D one) is in the loop, and it is also possible to simplify the engine further to a more abstract domain model, which works without a realistic engine but only with an idealized one.

For example, it is possible to use the SFML graphics library to program a soccer game which is not very accurate. The game is not realistic but similar to a Pacman clone. That means the players have limited action possibilities and the simulation would look like an Atari 2600 game. The trick with an HTN planner is to combine different game engines in layers: for example, on top a high-level text adventure, in the middle a graphical representation and at the bottom a realistic 3D physics engine. Optimizing these different layers is the key factor for a successful HTN planner.

Model acquisition

Surprisingly many papers discuss the automatic creation of action models for HTN planning. The idea is to use plan traces and a very complicated algorithm which generates the action model. To be honest, automatic generation doesn't work. If the aim is to get a real system, for example to play a game, only handcrafted models can be used. Realizing an action model for an HTN planner has to be done like programming in general. That means lots of man-years have to be invested, and some kind of version control system is needed. The action model cannot be generated autonomously; it has to be programmed by hand.

There are many working examples available, for example from the gaming industry or from the Robocup challenge. In all cases the workflow was hand-crafted. That means a team of 10 programmers has written the documentation, painted UML charts and implemented the action model in a simulator. The process can be called a “software engineering task”. Plan traces may be useful, but there is no algorithm, only a project, which can transform them into an executable action model.

What can be done automatically is to use a given action model with an automatic solver. If it is clear that the task contains the subtasks “pass” and “shoot”, and if it is clear what the follow-up state is, then an automatic solver is able to generate the best plan. It is the same strategy used in computer chess to generate the next move: it is done entirely by the computer, and human intervention is not needed.

What is possible is to use an existing serious game (which has an API) and run an HTN planner on top. If the game contains the action model, the tasks and the events, it is possible to calculate the plan. The programming effort is only minimal in such cases. But here the programming effort had to be invested by somebody else, who programmed the original game. That means the action model is not generated from scratch; it was programmed in an external software engineering project.

1.7 Building a simulator for HTN-planning

HTN planning itself is easy to understand: a given “action model” is sampled by a solver, and the found plan brings the system into the goal state. The more demanding task is to create the action model. Let us describe in detail what it looks like.

On a programming level, every “action model” is realized with an ontology. That means there are some C++ classes containing methods and attributes, and they can call each other. The purpose of the ontology is to realize a simulator, that is, a piece of software which mimics reality. A well-known simulator used in the Robocup domain is a physics-realistic simulator. It is realized with a dedicated physics engine and calculates what will happen if the player kicks the ball. But a physics simulation is not enough; in the context of hierarchical task networks, many layers of simulators are combined. There is also a need for a high-level simulator, a tactical simulator and so on. All of these software modules can be realized with ontologies, better known as UML classes. The question is only how to transfer a certain domain, for example a dexterous grasping task, into a simulator.

Again, let us imagine what the benefits are. Suppose we have a layered simulator for a dexterous grasping task. On the low level it is a physics engine, on the mid level a simplified 2D game, and on the high-level layer a text adventure which can be controlled in natural language. If such a detailed model is available, it is very easy to solve the game: all the HTN planner has to do is generate random actions on the different levels and search for a certain situation which is called the goal. The only open problem is that for most domains such a layered simulator isn't available. And without such a system, the HTN planner has nothing to do.

The open question is: how to program, for a certain domain, a layered simulator in an object-oriented programming language? Programming only a realistic physics engine is not enough; what an HTN solver needs is a mixture of different simulators, because this will speed up the search process. Let us make an example for the Robocup domain.

In the high-level simulator we can execute a command like “move-to-ball”. After executing the command, the engine places the player directly at the ball. The CPU consumption for doing so is practically zero. That means the game engine simply moves the player position, and that is all. There is no collision detection, no path planning and no check whether the battery is empty. On a second layer, this command is resolved into detail commands which are more realistic, and only on the low-level layer is a realistic 3D physics engine, including collision detection, needed. Such a layered, ontology-based simulator can be called the main part of an HTN planner. It is a useful tool for generating a plan for an agent.
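The zero-cost high-level layer can be sketched in a few lines; the struct name, command string and coordinates are invented for illustration. Note that “move-to-ball” is a single assignment, with no path planning or collision detection:

```cpp
#include <string>

// High-level layer of a soccer simulator: commands change the state
// symbolically. "move-to-ball" simply teleports the player to the
// ball; the realistic lower layers would later fill in the details.
struct HighLevelSoccer {
    int playerX = 0, playerY = 0;
    int ballX = 30, ballY = 15;

    void execute(const std::string& command) {
        if (command == "move-to-ball") {
            // zero-cost state change: overwrite the position directly
            playerX = ballX;
            playerY = ballY;
        }
    }
};
```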

layered abstraction

Because the layer architecture is fundamental for HTN planning, I want to give an example. Let us suppose we are again in the Robocup domain. A normal simulator would implement a physics engine in 3D. For HTN planning this is not enough. The idea is to search for a plan on different abstraction levels, especially on layers which are computationally inexpensive. An accurate 3D physics simulation is very CPU-demanding; it is the worst choice for testing out random plans. The better approach is to first construct an artificial game on top of the simulator. We can call this a simplified soccer simulator. It is only 2D-based and has no dedicated physics engine. Instead it works like a board game: the pieces on the table can be moved freely without any constraint. If agent #1 should go to a certain place, we can manipulate his position directly. That means the action “move” needs no CPU time; it is executed directly. On this simplified board game it is possible to figure out basic strategy. For example, if we want 3 players at the same place, we must move all of them there. The abstract game engine can answer certain questions, all of an abstract nature. It is a game on top of the original game.

Now we can combine both layers. On top is the 2D abstract game with a simplified mechanic, and at the bottom the elaborated 3D simulator with an accurate physics engine. Before we ever try out a move in the 3D simulator, we first ask the high-level layer what the result will be. The overall domain must be implemented in at least 2 layers of different game engines. That is the basic concept of an action model behind an HTN planner.

The high-level layer doesn't replace the former 3D physics engine. An accurate physics engine is great for planning low-level actions; for example, if we need to know how strongly we must kick the ball so that it reaches a certain velocity, there is no alternative to a physics engine. It gives the exact result back. But only a small share of the questions an autonomous agent faces deal with such detail. In many other cases the agent needs advice on whether he should kick the ball to position A or B, no matter how exactly this is done. This high-level decision can't be answered by an accurate physics engine.

In reality, a domain is modelled by different game engines which are arranged in layers. At least 2 layers are needed, but more layers are better. This raises the problem of how exactly the different layer engines have to be implemented. Manual programming is the only working choice, because the question is similar to any other software engineering project. As input we have a system specification, for example “program a software module which simulates the strategy aspects of a soccer game”, and the result, after some man-years, is working code which fulfills the specification.

The concept of “action model learning” is sometimes discussed in the literature, but it is only a future vision, not a technology which is available today. The hope that an action model can be derived from plan traces without handwritten code will be disappointed.

1.8 Combining autonomous and remote-control

It is obvious that writing a software package which controls a robot is possible, because a robot is a machine which can be driven by commands, and software can generate these commands. It is only a detail question what exactly such a program will look like. In real robotics projects this is bottleneck number one. That means, on a theoretical level it is often clear what the robot should do, but the users are not good at programming or don't have enough time to realize the software themselves.

A possible answer is a semi-autonomous control system. The human operator first takes a joystick to control an underwater vehicle, and only in the second step does the robot follow the demonstration.[11] The system described in the paper is first of all a teleoperated underwater vehicle. That is, according to the definition, not a real robot; it is more a remote-controlled toy boat. The advantage is that writing the software for a human-in-the-loop simulation is relatively easy. The second part of the project (autonomous control) is postponed to step #2. That means, even if this subproject fails, it is always possible for a human to control the system manually.

The second advantage of this stepwise programming is that a concept like “learning from demonstration” answers the question of how the Artificial Intelligence will work. The task can be specified under the term tracking: the robot has the obligation to reproduce the human task. Implementing this in software is not easy, but it is possible. In the above-cited paper the authors use the PDDL language for the high-level planner and DMPs (parametrized dynamic movement primitives) for the low-level part.

But let us go into the details. The boat can be controlled manually. That means the human operator has a joystick and lots of keys he can press, and the boat has a certain task, for example “docking”. The process is the same as playing a computer game: the operator must press some keys, the robot reacts, and the water in the tank increases the difficulty because the fluid produces disturbances. It usually takes more than one attempt until the human operator has mastered the task.

The strategy of the autonomous system can be loosely described as a decision support system. That means the AI is not really capable of controlling the robot; instead it is a support for the human operator. Sometimes the concept is called supervised autonomy, because the human operator is always in the loop and must observe what the robot is doing. How exactly does the system work? It is mostly a controller, that is, a piece of software which generates control signals for the robot. The controller consists of a low-level and a high-level part. It is similar to a GOAP planner, which is used in computer games for controlling non-player characters, but is integrated into the overall robot monitoring system. The basic idea behind a controller is a planning task: from the current situation a plan is generated to bring the system into a goal state. It is the same principle by which a chess engine works: there is a game tree, different branches and an evaluation criterion. The difference is that a robot controller is more complicated than a chess-playing software. It contains more submodules, which carry out the hierarchical planning process.

Another interesting feature of mixed teleoperation is that from the software engineering side the project is relatively easy. That means a teleoperated robot is mostly a hardware problem, which is understood very well. If hardware is available, like a robot arm, a camera, an object and a joystick, all the devices have to be connected and the system is ready. The joystick can be replaced by a data glove and the manipulator by a larger model. That means it is not necessary to write complicated control software for the task, or to have a theoretical understanding of robotics, to pick up the ball. The main idea is to postpone the complicated task of writing the software to a later stage, namely the learning from demonstration. It is mostly the result of playing around with the system and trying to improve its functionality a bit. Or, to make the point clear: it is difficult to fail at a telerobotics project. The reason is that a human operator is capable of executing nearly every task. Remote-controlling a machine is a robust way of interacting with the environment.

Suppose the manual teleoperation works; what is the next step? The next, more advanced form is to put a PDDL planner in the loop. A task model is formulated in the PDDL language, and this calculates some decisions in real time. Such a system is not a real robot system, because PDDL is only a basic form of planning, but it is a good transition from a purely human-controlled system to a semi-autonomous system.

2 Example

2.1 HTN Planner

The difference between a hierarchical task network and a behavior tree is not easy to understand. Both concepts are about Artificial Intelligence and game programming, but in general an HTN planner is superior. Perhaps a simple example in source code helps to grasp the idea.

The figure [fig:Textadventure] shows a compact C++ class which implements a text adventure. The user can send different commands to the engine, like “init”, “open-door” and so on. After the command is parsed, the engine changes internal variables. In the concrete example, the user must first open the door before he can enter the room. The C++ class is not a behavior tree which dictates which commands must be executed in sequence; it is a game engine which accepts commands and prints out the internal state. The user can play around with the game and send different commands to the engine in the hope that he will reach the goal. It is not the HTN planner itself, but the domain model which can be used by an HTN planner. The question is: which commands must be executed to bring the system into a goal state?

#include <iostream>
#include <string>

class Textadventure {
public:
  std::string door;
  std::string position;
  void action(std::string name) {
    std::cout<<"> "<<name<<"\n";
    if (name=="open-door") door="open";
    if (name=="close-door") door="close";
    if (name=="go-in") {
      if (door=="open")
        position="inside-room";
    }
    if (name=="go-out") {
      if (door=="open")
        position="outside-room";
    }
    if (name=="init") {
      door="close";
      position="outside-room";
    }
    if (name=="show") {
      std::cout<<"* door="<<door;
      std::cout<<", position="<<position;
      std::cout<<"\n";
    }
  }
  Textadventure() {
    action("init");
    action("show");
    action("go-in");     // fails: the door is still closed
    action("show");
    action("open-door");
    action("show");
    action("go-in");     // succeeds
    action("show");
  }
};


Textadventure visual

The interesting aspect is the hierarchy of actions. On the lower level the player can open and close the door, and on the high-level layer he can enter and leave the room. Let us make a practical example. The game starts with a fresh instance, and the agent wants to enter the room. He types in “action("go-in");”, but it doesn't work, because the game engine prevents the player from manipulating his position directly. Instead there is a built-in game mechanic: on the low level the player must first open the door before he can execute the high-level “go-in” command.

The problem is not to solve the game automatically; this can be done by a brute-force sampler in under a second. The problem is to describe the domain in a machine-readable form, that means to program the game engines on the lower level and on the higher level. A game engine is a software module which specifies what will happen if the agent executes an action.
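Such a brute-force sampler can be sketched against a self-contained copy of the door/room mechanics; the struct and function names are invented for illustration. It enumerates short action sequences until the goal “inside-room” is reached:

```cpp
#include <string>
#include <vector>

// Self-contained copy of the door/room mechanics for the sketch.
struct RoomState {
    std::string door = "close";
    std::string position = "outside-room";
    void apply(const std::string& a) {
        if (a == "open-door") door = "open";
        if (a == "go-in" && door == "open") position = "inside-room";
    }
};

// Brute-force sampler: try every two-step action sequence against a
// throwaway engine instance; for this tiny domain that is enough.
std::vector<std::string> findPlan() {
    const std::vector<std::string> actions = {"open-door", "go-in"};
    for (const auto& a1 : actions)
        for (const auto& a2 : actions) {
            RoomState s;
            s.apply(a1);
            s.apply(a2);
            if (s.position == "inside-room") return {a1, a2};
        }
    return {};  // no plan found
}
```

The sampler finds the plan “open-door”, “go-in” instantly; the hard part, as argued above, was writing the RoomState engine, not the search.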

2.2 Car physics

Suppose we want to program a car-parking controller; what is the best-practice method? First, we need a simulator with a top-down physics engine. The idea is not to use a real car, but to test the controller in a computer game. A realistic physics engine is also helpful because it allows us to detect collisions. But there is one problem: a working simulator isn't equal to a working controller. A simulator only means that we can control the car with a joystick, but the aim was an autonomous car.

Let us investigate what the problem is. If the car is visible in the simulator, we can test out different plans, for example a trajectory for parking. The question is: what does the trajectory for reaching a certain goal look like? This is usually answered with a brute-force sampling planner; that means we test out a million trajectories and evaluate what happens, so we can generate the parking trajectory the way a chess engine works. There is only one problem: the CPU consumption would be very high. Even with a modern, efficient physics engine like Box2D or Bullet, a normal computer cannot evaluate more than 100 trajectories per second. That means it is not possible to test out millions of trajectories; it would take far too long.

[Figure: Car-parking scene]


The answer to the problem is called a “hierarchical task network”. It is a planning technique which constructs a layered simulator and plans on different levels. Let us make an example. Figure [fig:Car-parking-scene] contains three scenes of a parking maneuver. The standard technique for implementing the game is a realistic physics engine, because it is highly accurate. The alternative is to program a simplified physics engine from scratch. This engine has no collision check and no sophisticated force calculation; instead it works on a symbolic level.

The game starts with the init keyframe: there is a car and a parking lot. The user has different commands he can enter. The command “u-turn” does not activate a complex AI controller which performs the u-turn maneuver along a detailed trajectory. No, “u-turn” only means that the direction of the car is changed directly: in the game, the positions of the car's lamps are moved to the new orientation, and that is all. If the user enters “u-turn” the game engine simply flips the car. The next command the user can enter is “parking”. Like before, it is a very basic command: if the user enters “parking” the position of the car is simply changed to the parking lot. This physics engine works in a reduced form; it is similar to a PDDL specification.
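The reduced engine can be written as a small state machine. The state names and commands below are illustrative assumptions; the essential property is that each command just rewrites the symbolic state, like the effect list of a PDDL operator:

```python
class AbstractCarGame:
    """Symbolic car-parking engine: no forces, no collision checks.
    Commands rewrite the state directly, like PDDL effects."""
    def __init__(self):
        # init keyframe: a car on the street, facing north
        self.state = {"direction": "north", "at": "street"}

    def command(self, name):
        if name == "u-turn":
            # flip the car: only the symbolic direction changes
            self.state["direction"] = (
                "south" if self.state["direction"] == "north" else "north")
        elif name == "parking":
            # move the car into the lot -- no trajectory is computed
            self.state["at"] = "parking-lot"

game = AbstractCarGame()
for cmd in ["u-turn", "parking"]:  # the whole high-level plan
    game.command(cmd)
```

Playing the two commands “u-turn”, “parking” in sequence leaves the car flipped and inside the lot, which is exactly the subgoal sequence described in the text.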

As a consequence we have a car-parking game, but a very abstract kind of game. Playing around with the game does not generate the detailed trajectory, only the subgoals. It is a way of storing knowledge and enabling high-level planning. Let us now investigate how “parking” works. The user enters two commands: “u-turn”, “parking”. That's all. With the first command he flips his car, and with the second command he moves the car into the lot.

Until now the idea may look a bit useless, because we need a detailed motion controller and not a subgoal generator. But the described abstract game is an important step in that direction. We can use the high-level game for constructing the low-level controller. If we know what the subgoal is, we can do the planning in the realistic physics engine. The question is no longer “what is the overall trajectory?” but only “how do we realize a u-turn?” That means the high-level commands are equivalent to skill primitives. Calculating the correct trajectory for executing one of these skills is easier than planning the complete domain. It is possible to use a realistic physics engine like Box2D for reaching such subgoals.
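The resulting division of labor can be sketched as follows. The `refine` function here is a hypothetical stub standing in for the low-level planner; in a real system it would run a sampling search inside a physics engine such as Box2D to find a trajectory for one skill primitive at a time:

```python
def high_level_plan():
    """Subgoals obtained by playing the abstract game."""
    return ["u-turn", "parking"]

def refine(skill):
    """Stub for the low-level planner: a real implementation would
    search a realistic physics engine for a trajectory that realizes
    this single skill primitive."""
    return ["trajectory-for-" + skill]

# Plan each skill separately: searching for one maneuver at a time
# is far cheaper than searching the whole parking domain at once.
full_trajectory = []
for subgoal in high_level_plan():
    full_trajectory += refine(subgoal)
```

Each call to `refine` only has to solve a short, local problem, which is why the layered approach avoids the brute-force cost of planning the complete maneuver in the realistic engine.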
