October 29, 2019

How to control online communities like Stackoverflow, Wikipedia and even Arxiv

In the domain of robotics there is a term called "underactuated system". It describes a control problem in which parts of the system behave passively. In reality there are two sorts of ships: the first category is driven by a diesel engine, and the captain alone defines the direction of the boat. In contrast, a sailboat can only be controlled partly. The next action depends on the environment. That means both the wind direction and the action of the captain have to be understood.

The technical term for controlling such systems is model predictive control. It means that the user can't bring the system directly into a goal state; the more interesting question is what the system will do without the user's action. Let us give an example. If the wind isn't blowing, the sailboat won't move forward. The captain is not able to give the command "move ahead". The same principle is at work in online communities on the internet. What a single user can do is very little. The more interesting question is what the overall community will do. Controlling such systems depends on a realistic prediction. If somebody is able to describe an online community in advance, he is able to control the system. He can answer what-if questions.
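To make the what-if idea concrete, here is a minimal Python sketch of one-step model predictive control over a toy community model. The transition rules and all function names are invented for illustration; they are not a validated model of any real community.

def predict_community(state, user_action):
    """Roll the toy community model one step forward for a given action."""
    # Hypothetical rules: an on-topic post gains acceptance, an off-topic
    # post triggers a moderator flag, doing nothing changes nothing.
    if user_action == "post_on_topic":
        return {**state, "acceptance": state["acceptance"] + 1}
    if user_action == "post_off_topic":
        return {**state, "moderator_flags": state["moderator_flags"] + 1}
    return state

def pick_action(state, candidate_actions, goal_key="acceptance"):
    """Simulate every candidate action and pick the best predicted outcome."""
    return max(candidate_actions,
               key=lambda a: predict_community(state, a)[goal_key])

state = {"acceptance": 0, "moderator_flags": 0}
print(pick_action(state, ["post_on_topic", "post_off_topic", "wait"]))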

What makes the situation more complicated is that most users have a certain awareness of what will happen next, but this knowledge is not made explicit. It's stored only in their minds, which makes it hard to reproduce the steps. For realizing Artificial Intelligence, the ability to reproduce knowledge is very important. Only knowledge which is stored in a computer program can control a system. If the aim is to describe the inner workings of online communities from an AI perspective, there is a need to formalize those inner workings.

In the easiest case this is done with a walkthrough tutorial, known from computer games. This is full-text information written for humans which explains what to do next in a game. Writing such a tutorial for online communities like Stackoverflow and others is hard, but possible. One interesting question is who decides about the rules on these websites. The interesting answer is that the existing rules do not describe the inner workings of an online community; they are only the help section provided by the website itself. That means the Wikipedia help section provides a large amount of text which explains to newbies what they should do and what not. All these guidelines are correct in the sense that they form the official Wikipedia help section. But apart from the help section there is a need for an additional tutorial which describes the inner workings more realistically. A typical example of why this is needed are edit conflicts. In the Wikipedia help section the possibility of an edit conflict is described only vaguely. According to the help section, users should avoid producing such conflicts, and if they are unsure they are allowed to ask.

But this description doesn't describe the phenomenon from an academic point of view. A better source of information is an existing academic paper with a title like "Edit conflicts in Wikipedia", plus an extensive literature list with 100 entries. Such a paper is the more elaborate way of describing edit conflicts.

To get a realistic understanding of how an online website works, it makes sense to observe the situation from a neutral point of view. Suppose a newbie who is not familiar with the project signs up for a new account and posts his first text. What will happen next? It's the obligation of the tutorial to explain what will happen then. I'd like to give an easy example. Suppose somebody asks a question about the Java programming language at Stackoverflow, but he has chosen the tag C#. What will happen next, with 100% certainty, is that an admin of the website will change the tag to the correct one.

Or let me give a second example. Suppose a programmer from Japan asks a question at the normal Stackoverflow website in Japanese. It's 100% sure that the post gets deleted, because the only allowed language is English. It's possible to describe what-if scenarios from lots of domains. What all the use cases have in common is that the fictional newbie user isn't giving orders to the admins. He doesn't even know them. Instead he is doing his action, and the Stackoverflow admin will do his job. That's the reason why such systems are called partly controllable. On the one hand the user has no control over the situation, but at the same time he has full control.
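The two Stackoverflow scenarios can be written down as an explicit prediction model. The following Python sketch encodes them as a small rule table; the rule set and all names are hypothetical illustrations, not an official Stackoverflow policy engine.

MODERATION_RULES = [
    (lambda post: post["language"] != "en",
     "post gets deleted: only English is allowed"),
    (lambda post: post["tag"] not in post["actual_topics"],
     "an admin retags the post to the correct tag"),
]

def predict_reaction(post):
    """Return the predicted community reaction for a newbie post."""
    for condition, reaction in MODERATION_RULES:
        if condition(post):
            return reaction
    return "post stays as it is"

# Example: a Java question wrongly tagged as C#
print(predict_reaction({"language": "en", "tag": "c#",
                        "actual_topics": ["java"]}))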

The amount of control which can be achieved correlates with the understanding of an online community. Expert users of Wikipedia know in detail what the other side will do in response to a certain edit, while newbies have no such prediction model. That's the main reason why some users are successful at Wikipedia while others are not. The reason is not located within a certain user account, in the sense that postings from a particular long-term user are accepted. It depends on the prediction model of the user.

That means in detail: it's not relevant whether content gets posted from a normal account or from an anonymous user. Wikipedia and other websites will act the same way. What is important is the question of which kinds of edits make sense and which do not.

Crying for a broken broomstick

customer witch: “I don't know how this was happening, but the new broomstick can't fly anymore.”

helpdesk: “Please hold”

helpdesk to grand witch: “I have a crying customer on the phone. It's the second one this day.”

grand witch: “Very good. I can create a ticket.”

helpdesk: "This will solve her problem for sure."

grand witch: “Why don't you create the ticket?”

helpdesk: "I will tell you why. Either you find a solution for how to make a broken broomstick fly within 1 minute, or we are done."

grand witch: [tears are running over her face]

helpdesk to customer: "Fixing a broken broomstick doesn't fulfill the minimum standards. It's your fault. Bye."

October 28, 2019

A bewitched helpdesk

In the following recording of a fantasy helpdesk, the situation escalates quickly because of many unsolved problems. The participants communicate in the wrong way back and forth, and this makes the workplace a toxic one.

customer witch: "Last week I bought a broomstick, and it's broken."

Helpdesk: “One second, I will search for the grand witch.”

Helpdesk to Adrianna: “The new collection of broomsticks has a malfunction. I have an upset client on the other line.”

Adrianna to Helpdesk: "Perhaps I should create a ticket?"

Helpdesk to Adrianna: “No, it's ok, thank you”

Helpdesk to customer witch: “Sorry, but I'm not qualified to give an answer. Have a nice day.”

next customer: “Hi, my name is Gwen, the good witch from the yellow land. I have a small problem with the magic spell for the candle.”

Helpdesk to Gwen: “Please hold”

Helpdesk to Adrianna: “Hi Adrianna. Do you have a minute for me?”

Adrianna to Helpdesk: “What is the problem?”

Helpdesk to Adrianna: "I only want to explain that there is a second issue, but this time it's a missing magic spell. Can you help me?"

Adrianna to Helpdesk: "Sure, if you can provide some detailed information about which spell exactly is needed."

Helpdesk to Adrianna: "Are you telling me that each spell is handled differently?"

Adrianna to Helpdesk: [Hangs up]

Helpdesk to Gwen: "Sorry, I don't know what the spell is. Bye."

How to infiltrate and take over an existing wiki

Wiki systems are a powerful tool for collaborative work. They manage a version history, provide user accounts and make it easy to format text with wiki syntax. During the rise of the first wiki systems the technology was understood from a technical perspective. Computer experts discussed which kind of Linux server is a good choice for hosting a wiki software, and how to improve the efficiency of the source code. These tutorials are still available today, but the more interesting approach is to describe a wiki from its social perspective.

The assumption is that the underlying hardware and software work properly, that the installation is protected against remote execution of terminal commands through PHP scripts, and that some users have already provided content for the wiki. To infiltrate an existing wiki it's important to first define which goal has to be achieved. It has to do with acquiring a needed resource which is usually not available to new users. This resource is not root access to the underlying server; within the social community it's the ability to delete content in the wiki.

Each wiki installation has rules for how to delete content in the system. In most cases it's done with a deletion debate. Winning this debate is equal to reaching the first goal in the walkthrough tutorial. Let me explain it from a different perspective. In a wiki there are two sorts of users: users who are allowed by the community to delete content, and users who are not. Perhaps a simple example will explain the idea.

Infiltration of a wiki works only by partly fulfilling the expectations of the opponent. The new user has to pretend that he is interested in contributing his effort to the wiki project, but in reality he has different plans. If the community believes that the new user is doing something important for the wiki project, they will trust him. The first step for the newbie is to search the wiki for an article which looks outdated or is maybe spam. Deleting such content would improve the overall quality and produce added value for the wiki. So the infiltrator has to put the page up for deletion discussion and provide extra information on why this page can be deleted.

In response to the action, the wiki community, which includes the admins, has two options. Either they recognize the deletion request itself as spam, or they accept it. If they accept it, the page gets deleted. This is equal to the community voting for the newbie. It's a positive interaction.

In the next step the deletion process is repeated. As a result the user gains a certain status in the community as somebody who flags spam, which is a productive contribution. To improve the situation, the user has to observe which kind of content in the wiki is wanted and which is not. He has to anticipate the judgment of the overall community.
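As a rough illustration of how repeated deletion requests build status, here is a toy reputation model in Python. The gain and loss values are invented assumptions; the only point is that accepted requests slowly accumulate trust, while rejected ones cost more.

def update_reputation(reputation, request_accepted, gain=2, loss=5):
    """Accepted deletion requests build trust slowly, rejected ones cost more."""
    return reputation + gain if request_accepted else reputation - loss

def simulate(outcomes, start=0):
    """Replay a sequence of deletion-debate outcomes (True = accepted)."""
    reputation = start
    for accepted in outcomes:
        reputation = update_reputation(reputation, accepted)
    return reputation

# Four accepted requests and one rejected one
print(simulate([True, True, False, True, True]))  # -> 3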

Some critics would argue that putting content on the deletion list is not so important, and that the more valuable contribution is adding new content. The funny thing is that most users (and sometimes even admins) are not aware of the importance of deletion. They believe it's a minor task which can be ignored. But in reality, deleting content has top priority in a wiki. Most wikis don't delete any content, because the overall project is small and the fear is that if some pages are gone, the wiki project has failed. The opposite is true. A healthy wiki project has a lot of deletion requests and the corresponding discussions. Not because of technical requirements, but because this is how a group consensus is created. If certain content gets deleted, this is equal to a quality judgment. That means the wiki is moderated, and this is an advantage. Wiki systems work interactively. The question is not what a single user likes to post, but what the purpose of the overall wiki is.

In the easiest case, the user is allowed to delete a page which is clearly identified as unwanted. But suppose all the poor-quality content has already been deleted. In the next step, the user has to convince the community to change their opinion about normal content. That means the wiki consists of 10000 pages, and all of them are well written. The user tries to delete one of them, and before he can do so, he needs the approval of other users. Otherwise it's a one-man show which gets identified as vandalism by a single user. Instead of convincing the group that a certain page has low quality, the better idea is to stay passive and scan the discussion pages for arguments against existing pages. For example, if a certain page was criticized in the past by two long-term users as low quality, it makes sense to put exactly this article on the deletion list. The antipathy against the content is already anchored in the group, but it wasn't made explicit. The probability is high that the group will approve the deletion as well. This increases the status of the infiltrator further.

In the third step, the new user becomes more self-confident and tries to manipulate the target wiki directly. This can be done on the discussion page of an article. The question is, what should the newbie write on the page? To answer the question we have to describe what a wiki is trying to achieve. The goal is to mirror and aggregate knowledge which is already available on the Internet. Between a page in the wiki and the overall Internet there is a knowledge gap: the wiki knows only part of the overall information. This gap can be utilized in the discussion. A valuable contribution would be to post a link to a useful external source and explain how to integrate the information into the wiki. The alternative would be to explain why the wiki is wrong and how to reformulate existing content. It's important to produce only a few edits and give the community time to read the comment. The reason is that a new user has no reputation, and as a result his actions are observed carefully.

October 24, 2019

The bottleneck in desktop microfactories

In recent years a new technology has matured, called MEMS (microelectromechanical systems). It covers ultrasmall motors and sensors, up to microfactories the size of a table. A well-known consumer product which uses such ultrasmall actuators is a DLP projector, whose chip contains about 2 million micromirrors, each controlled by its own tiny actuator. From a mechanical standpoint, MEMS technology is available at mass scale and for a small price. It is possible to print an array of motors and sensors.

The bottleneck isn't located on the hardware side but has to do with controlling the motors. Suppose there is a factory which consists of an unlimited number of work cells, transport systems and assembly stations. Each of the units has 30 degrees of freedom. A mechanical test will show that the factory is working great: after pressing a switch, the motor drives to the correct position. The problem is that without a control program the MEMS factory is useless. It is the same situation as when somebody creates a robot in a mechanical simulator but doesn't know what the correct program for the robot is.

In contrast to larger mechanical systems, for example a real factory or a car, it's not possible to control a nanofactory with human intervention. Somebody may argue that this is not a problem, because the software does the same as a printer driver: it sends a signal to the motors and then the factory is working. Unfortunately this kind of software is very hard to program. The problem is equal to controlling a real-time strategy game. The difference is that a microfactory is much more complex than a Starcraft AI simulation.

The good news is that the bottleneck can be localized precisely. It's not the MEMS hardware itself which makes trouble, but the missing software for controlling millions of servo motors in a game-like environment. The problem is solved if algorithms are available which can play very complex real-time strategy games. That means the player has to direct many hundreds of units at the same time to fulfill a certain task. From the computer science perspective it's not clear which kind of software is needed here. The only thing that is known is that some kind of narrow AI has to solve the task. That means the narrow AI gets the sensor readings as input and produces the control signals as output. But what is inside the software is an open problem.
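The input/output contract of such a narrow AI can at least be written down, even if its inner workings remain the open problem. The following Python sketch shows a hypothetical controller interface; the decide method is only a placeholder and does not solve anything.

from typing import List

class NarrowAIController:
    """Maps sensor readings to control signals for an actuator array."""

    def __init__(self, num_actuators: int):
        self.num_actuators = num_actuators

    def decide(self, sensor_readings: List[float]) -> List[float]:
        # Placeholder policy: command every actuator to stay put.
        # A real controller would have to coordinate all actuators
        # toward a manufacturing goal, which is the unsolved part.
        return [0.0] * self.num_actuators

controller = NarrowAIController(num_actuators=1000)
commands = controller.decide(sensor_readings=[0.0] * 1000)
print(len(commands))  # one control signal per actuator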

Let us go back to the MEMS array in a DLP projector. Such a device is sold for under 10 US$ in an electronics store. In exchange for the money the user gets a huge number of motors which can be adjusted individually. For the task of mirroring the light in the red, green and blue colors it's obvious what the advantage is. But it's not possible to use these motors for anything else. Not because of the hardware, which is great. The problem is that before the motors can work together they need an input signal, and this input signal has to be determined by an algorithm. For testing purposes it's possible to send a random signal to the motors. The problem is that they won't produce anything useful with it. It's the same situation as if a programmer tries to control a line-following robot with a random generator. It won't work. The robot won't do what it should. If the number of servo motors is higher, the problem becomes more complicated. At the end, the programmer sits in front of hundreds of servo motors which work great technically, but the overall robotics project has failed.

To understand the problem in detail, we have to focus on a much smaller problem. Suppose a hardware engineer has built a mini robot with only 2 servo motors. One is mounted on the left side, the other on the right side. At first glance, the robotics project is done. It looks great. Surprisingly, the robot can't do anything. Creating the software on top is not a detail problem; it's the only problem left. That means wiring the motors together and applying electric current is the easiest part of the game. It will take less than 5 minutes. Writing the software that moves the robot along a straight line is the harder part, which can take 20 years and longer, and in most cases it fails.
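Even the seemingly trivial goal of driving straight already needs a feedback loop on top of the two working motors. The sketch below assumes a differential-drive robot with a heading sensor; the gain value and the command convention are invented for illustration.

def straight_line_step(heading_error, base_speed=0.5, kp=0.8):
    """Return (left, right) motor commands that correct a heading error."""
    # Proportional control: the larger the drift, the stronger one wheel
    # speeds up and the other slows down to steer the robot back.
    correction = kp * heading_error
    left = base_speed - correction
    right = base_speed + correction
    return left, right

left, right = straight_line_step(heading_error=-0.1)
print(round(left, 2), round(right, 2))  # left wheel faster, right wheel slower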

It's some kind of paradox. On the one hand, the engineers have developed advanced hardware. They are able to build robots at a size of 10 centimeters, 10 millimeters and even 10 micrometers. They can use normal metal as material, but it's also possible to build robots out of wood. What is missing is the part which is not visible: the software, better known as narrow AI. The missing software is the reason why MEMS nanofactories are not available.

October 20, 2019

Will the railroad struggle in the future?

A look into the numbers shows that the US owns a small number of locomotives: 26546. In comparison, around 250 million cars are on the road in the United States. Suppose the future for the railroad looks bad; by how much can the number of locomotives shrink? Right, there is not much room at the bottom. The number of locomotives can't become negative; 0 is the lowest possible number of rail-guided vehicles.

How can it be that the railroad has a low priority while all the money is spent on buying new cars? Does the car have a higher productivity, can it be operated without human drivers, does the car use electric energy? It's exactly the opposite. The car is preferred in the US because it produces so many problems. The car industry is some kind of pyramid-building game. The idea is to produce extra work which is not needed. Instead of using a single freight train, hundreds of trucks are used. Each of them needs an operator, the trucks have to be built first and they need a lot of repair. Additionally, all the trucks together need more energy than a single locomotive. It's a very luxurious game in which all the money and all the manpower is thrown away for nothing.

Inefficiency is a common feature of a state-controlled economy. If a society prefers the car over the railway, it's a sign of missing incentives. That means the stakeholders in the game have no motivation to reduce fuel consumption or to do the same task with less manpower. This prevents technological advancement. The car has become a symbol for missing progress. Producing more cars basically means that nothing will change and that efficiency has a low priority.

According to the latest statistics, the US car industry has around 8 million employees. In contrast, the railway industry employs only about 1/30 of that number. That means roughly 97% of the money and the manpower goes into the car industry. The funny thing is that no matter which kind of technology gets invented in the next 30 years, it's not possible to increase this value above 100%.

It's interesting how the car industry has become so powerful that this ratio has stayed constant over decades. Nobody has asked whether all the money invested into cars makes sense; it has become common sense that new cars are needed.

Let us listen to the car advocates and how they would like to solve future demands in transportation. The idea is that the car industry is not big enough. The number of 8 million employees is too low. So there is a need to put more money into the sector. The ratio of 1:30 (railway vs. car) can be adjusted to 1:100, which means that 99% of the overall resources are used to buy new cars and build new roads. This won't solve the logistics demand, but it will make the car industry stronger.

Suppose the hidden agenda is to waste all the manpower and all the money on an outdated transportation system. Then the car industry is here to stay. It is not possible to make a car more efficient in terms of fuel consumption or reduced human labor. Every single truck needs a human driver and lots of fuel. It's the most luxurious transportation device ever. Only rich countries have enough resources to use cars for everything.

Increasing the capacity

Suppose a freight load has to be transported from place A to place B. The dominant vehicle today is the truck. This kind of technology is so common that nobody asks whether it makes sense. Before a truck can start, a truck driver is needed.

Transporting the freight by railroad works a bit differently, because many trailers can be added to a single locomotive. No additional driver is needed, except the single one who is already on board. The extra amount of freight won't need extra human manpower. The funny fact is that under the condition of autonomous driving, nothing in this calculation will change. From a technical perspective it's not possible to build self-driving trucks. Automated cars will always need a human driver in the loop.

That means, especially from the perspective of the next 20 years, the train has a big advantage over trucks: it needs less manpower.

Peer review works by deleting content

It's a bit hard to define what an academic peer review is. A good starting point for an answer is to compare different kinds of existing reviews in other domains. One example is the deletion debate on Wikipedia, another well-working system is the review section on Stackoverflow, and last but not least, a peer-reviewed academic paper works on the same principle.

Let us go through the examples step by step to emphasize the inner workings. On Wikipedia as well as on Stackoverflow, the idea is that an admin deletes an article from the system, and before he can do so, he needs 4 or more users who have pressed the deletion button as well. What the group is allowed to do is to collectively identify and remove unwanted content. Somebody may ask why this makes sense if the server space of Wikipedia is large enough to hold all the information. The reason for deleting content is given by the quality requirements.

An academic paper works on the same principle. A group of people decides in a vote which content gets deleted. The result is always the same: the original author of the Wikipedia article, the Stackoverflow question or the academic paper will become upset. A deletion is an escalating process which produces a conflict. For this reason, a deletion vote will split the community. Somebody argues for deletion, while others are trying to save the content.

To simplify the description it makes sense to focus only on the technical side. In the Internet age, all the content is stored in an SQL database. Removing an entry from a table needs administrator privileges. If the post or the paper was deleted, it is no longer available on the webserver. The bottleneck of the overall decision process is that some users are needed who vote for deletion, and other users have to submit the content which gets deleted.

With this background knowledge in mind, it's possible to describe a generalized academic journal. What is needed is the ability to create new user accounts, to submit new content and to vote for the deletion of old content.
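Such a generalized journal can be sketched directly in code. The Python class below follows the three abilities named above; the class name and the threshold of 4 deletion votes are illustrative assumptions taken from this text, not the interface of any real platform.

class GeneralizedJournal:
    DELETION_THRESHOLD = 4  # votes required before the content is removed

    def __init__(self):
        self.users = set()
        self.submissions = {}      # submission_id -> text
        self.deletion_votes = {}   # submission_id -> set of voters

    def create_account(self, user):
        self.users.add(user)

    def submit(self, submission_id, text):
        self.submissions[submission_id] = text
        self.deletion_votes[submission_id] = set()

    def vote_deletion(self, user, submission_id):
        self.deletion_votes[submission_id].add(user)
        if len(self.deletion_votes[submission_id]) >= self.DELETION_THRESHOLD:
            del self.submissions[submission_id]  # peer review by removal

journal = GeneralizedJournal()
journal.submit("paper-1", "Edit conflicts in Wikipedia")
for reviewer in ["a", "b", "c", "d"]:
    journal.vote_deletion(reviewer, "paper-1")
print("paper-1" in journal.submissions)  # False: the paper was removed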

October 18, 2019

How to solve the traffic problem in India

At first glance, India has a lot of trouble with traffic accidents. The roads are crowded and many accidents take place. Because the problem is so big, it makes sense to think about how to overcome the issue. A deeper look into India's traffic situation shows that all sorts of transportation are used on a daily basis: bicycles, privately owned cars, trucks and of course the railroad. India has not only crowded cities, it also owns one of the largest railroad networks in the world. How can it be that the basic logistics problems remain unsolved?

For complex problems it is known that it is too simple to argue that more investment is necessary, because the money is not available. Instead the question has to do with the right priorities. That means a fixed amount of resources has to be distributed to different places, and the question is what the ideal investment in logistics looks like.

Before we can answer the issue, a short look into the statistics makes sense. Unfortunately, only worldwide numbers are available. The worldwide number of cars is about 1.3 billion, while the number of locomotives is only around 65000. Most of the locomotives are located in the U.S., the U.K., Germany and France. The number of locomotives in India is very small. But why is the total number of cars so big, while the number of locomotives is so small? Sure, a locomotive costs more than a car, but if we measure the vehicles by their price tags, the current investment in private cars is roughly 50x bigger than for locomotives. And exactly this is the problem. The relationship between cars and locomotives is wrong.

What the world in general, and India in particular, has to do is invest more money in locomotives and reduce the investment in cars. This will solve the transportation problem. The reason is that a locomotive has a better cost-to-benefit ratio than any other transport vehicle. The railroad needs less energy, can transport more freight and needs less human manpower than cars. Surprisingly, most countries including India live in the luxury situation of ignoring the railroad and investing the available resources into cars.

From an abstract point of view, the logistics problem of a country follows the same rules as a computer game. The player has to deal with limited resources, the amount of money he can spend is restricted, and he has to decide whether he wants to buy 1 locomotive or 400 new cars. If he invests the money in the wrong ratio he will lose the game. Losing means that a country is not able to transport all the freight to its destination and too many accidents take place.

In a PDF report the exact number of locomotives in India is given.[1] According to page 3 the total number is 11461. About 50% of them are diesel engines, and 50% are electrically driven. In comparison to the population this number is very small, and compared to the number of cars in India it is also small. That means the railroad has, in India (as in many countries in the world), an extraordinarily low priority. In contrast, all the money is put into privately owned cars. Does this make sense? No, it doesn't, and as a result many problems can be observed in reality.

Now we can imagine what will happen if some of the locomotives need repair. This situation is common for all technical machines, and as a result the number of available locomotives becomes much smaller. Are the remaining locomotives able to transport passengers and freight in a large country like India? No, they are not. That means the capacity is too low.

India is not the only country which gave its train fleet a low priority. It is a bit difficult to get exact numbers, but according to Google the total number of locomotives in China is only about 21000. The problem with the Chinese railroad network is known. In the 1980s they invested nearly nothing into the railroads, and since then the situation hasn't changed that much. Today, the railroad system in China has a low priority, while privately owned cars have become the top priority. According to the latest statistics, China has around 250 million cars. Perhaps it makes sense to compare the priorities in financial terms (a quick check of the numbers follows the list below):

- 21000 locomotives, each costs 10 million US$ = 210 billion US$

- 250 million cars, each costs 25000 US$ = 6250 billion US$

- the relationship is 1:30, which means that out of 100 US$ in total, only about 3 US$ are spent on the railroad, while the rest is invested in car-based transportation
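Here is a quick Python check of these figures, using the same assumed prices as the list (10 million US$ per locomotive, 25000 US$ per car):

locomotives = 21_000 * 10_000_000                # 210 billion US$
cars = 250_000_000 * 25_000                      # 6250 billion US$

ratio = cars / locomotives                       # roughly 30
rail_share = locomotives / (locomotives + cars)  # roughly 3 percent

print(f"ratio 1:{ratio:.0f}, rail share {rail_share:.1%}")
# -> ratio 1:30, rail share 3.3%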

Solving the crisis

... has to do with adjusting the priorities. There is no need to spend more money on transportation; the priorities have to be shifted in the direction of the railroad. The question which has to be answered is how many US$ out of 100 US$ in total should be spent on locomotives and how many on cars. The current situation is that the railroad gets nearly nothing, and almost all the money, around 97%, goes into cars.

[1] INDIAN RAILWAYS, FACTS & FIGURES 2016-17, http://www.indianrailways.gov.in/railwayboard/uploads/directorate/stat_econ/IRSP_2016-17/Facts_Figure/Fact_Figures%20English%202016-17.pdf

Why a robotic workcell makes sense

The forefront of industrial automation is the robotic workcell. It's a cage in which a robot does its task. Combining many cells results in an assembly line which can do more complex work. Right now, robot workcells are seldom used in reality. The reason is that it's unclear whether the concept is the right one. To investigate the pros and cons of a cell we have to look at the robot itself. It doesn't look like a real robot; its movements are very mechanical. In contrast to a real robot, the machine is not able to interact with humans, and everything the robot does looks unnatural.

This is the most dominant reason why robotic workcells are rejected by engineers. Their goal is to program more advanced systems which look human-like and can interact with humans. The open question is whether this kind of ambition makes sense, or whether it slows down the technology. In general, there are two different sorts of robots available. The first one is hidden in a cage, which means that humans can't interact directly with the machine, while the second category of robots is able to interact with humans.

The interesting fact is that only non-interactive robots can be used for factory automation. If the costs of the assembly line should be reduced, and if the aim is to replace human workers with machines, the only way of achieving that goal is robots in a cage. These systems are designed with autonomy in mind. The idea is that the human operator presses the start button, and then the machinery does all the work.

The other group of robots, which work outside the cage next to humans, are not designed for maximizing productivity, but they are useful for storytelling. A typical example are social robots which are used in a screenplay as an actor next to a human actor. This kind of machine needs to be interactive. That means it works in the same space as a human, and it makes no sense to start this machine if no humans are in the environment. At first glance, interactive robots are more advanced, but they have a poor performance in producing goods.

Let us suppose the goal for a robot is fixed and it's equal to producing something on an assembly line. The workcell, which is a robot in a cage, is the only sensible design for this purpose. The main advantage is that the robot works in a simplified environment. Every object is in the right position. This allows software to be created which does the same task over and over again. The result is a typical robot movement which is boring to observe. Most cage-centric robots are hidden in a box. That means the human no longer monitors the movements; for the human only the output is relevant. The system works similarly to a 3d printer. It gets some input and produces some output, and what the robot does in detail is not important. This kind of automation pipeline is here to stay. That means robotic workcells are the best-practice method for factory automation. There is no need to program the robot in a human-like way or to allow the program to become more flexible.

There is no need for humans and robots to become friends. This allows the robot to do what robots do best, which is executing the same task with high precision. Nobody would ask whether an office printer, a cooler or an elevator can become more human-like, because there is no need to communicate with these machines. They can be hidden behind a wall, and the only interesting question is whether the output of the machine has a high quality. That means the inner workings of a printer are less important; only the produced sheet of paper is under investigation.

In the case of interactive robots which work outside of a cage, the situation is the other way around. For a human-like robot it is very important how the arms look and whether the movement is natural. In most cases, highly interactive robots are equal to social robots. They are designed as a cute animal which has eyes, emotions and the ability to understand natural language. This improves the human-machine interaction. Or let me explain it from another perspective. A social robot without eyes, without a coat and without emotions doesn't make much sense. The social robot is itself the interface to a human. That means the human approaches the robot, and the robot should smile and come closer to him.

It's important to know that social robots are not useful for factory automation. A negative example would be an office printer which has emotions. It can decide on its own which kind of quality is required. These kinds of skills are not needed for an office printer; they would lower its performance. In most cases, a robot design has to be fixed in advance. A robot can become a cage robot or a social robot, but not both at the same time. This design decision has to do with the requirements. For example, if the requirement is that the machine should print 100 pages per minute in high quality, the resulting machine can only be realized with a certain design. To reach this high output, the machine will work mainly in a repetitive mode which is very robotic. The problem is that this prevents the machine from interacting with humans or having the ability to express emotions.

Industrial automation

Industrial automation is not about the robot itself; a robot gripper is only a tool on the assembly line. An industrial workcell which operates on the assembly line looks different from a robot. The main difference is that all the tasks are repetitive. There is no need to make decisions in a complex environment, because the factory provides a simpler task. All the objects have a defined position and the program is always the same. The term robotics doesn't fit such tasks; it's an automation problem. There are different machines connected by an assembly line, and the task is to program these machines for maximum throughput.

October 17, 2019

Do self-driving cars make sense without a driver?

A naive understanding of self-driving cars is that the human operator is no longer in charge and the car can steer on its own. Thinking this idea further results in a car which is able to drive to the destination without a human driver. That means the owner can stay at home and send his car a goal position somewhere in the world, and the car will do all the actions on its own.

This description emphasizes the economic benefit. The idea behind autonomous driving is, according to this description, to utilize existing vehicles more efficiently. The idea is to replace human drivers with software, and this will reduce the costs for trucking and taxi companies. It's important to know that this kind of economic outlook can't be realized with the technology. Self-driving cars work very differently. They are not improving the car; they are motivating humans to become better drivers.

A practical use case in which a self-driving car can show its full potential is when the vehicle is occupied by 4 persons at the same time: 2 on the front seats and 2 in the back. The task for the humans is to learn driving, and the person who has the most problems is asked to take the position behind the wheel. Basically spoken, it's a situation known from a driving school. The self-driving car is a computer game which is more entertaining than a PC-based driving school. It allows checking one's own skills in a live situation. This is especially important for questions about the allowed speed, the distance to other vehicles, and which of the vehicles on the street is allowed to drive first.

What the humans in the car are able to learn is whether they are familiar with the traffic rules. They can compare their own decisions with the car's decisions, and open problems can be discussed during the ride. The bottleneck in the game is not the car, because the car is only a machine which cares about nothing. The only thing that is important are the humans in the car and their ability to make the right decisions.

Let us take a look at why every year around 1 million people die in traffic accidents. It's not because of external reasons located in the technology; the main reason for nearly all of the accidents is that the humans are driving the wrong way. They drive too fast, they don't respect the traffic lights and they are not focused on safe driving. Basically spoken, the human drivers are not well educated in safe driving. And exactly this issue can be solved with self-driving cars.

A self-driving car is a personal driving school which provides feedback on the driving style of the human. The car rewards the human if he drives at the right speed and does not crash into other vehicles. It's a serious game with the goal of educating the human to drive safely.

Crash statistics

A common statistic to measure the crash probability is to ask how many miles a certain car has driven before an accident happens. The problem is that exact statistics are not known, because a crash is always an exception. According to a blog entry, a normal car has about 4 crashes per 1 million miles, while a self-driving car has about 9 crashes. https://carsurance.net/blog/self-driving-car-statistics/

But on the internet there are also other statistics available in which the relationship is the other way around. A valid statistic is known for sports cars: according to the facts, they have a higher crash rate. The reason is that sports cars are driven more riskily. The question is now whether the self-driving cars available today have to be categorized as sports cars or not.

In general it makes sense to assume that it's not up to the car how safe it is; it depends on the driver. Even a sports car can be driven very safely; unfortunately this is seldom done. So we have to ask under which conditions the human drives carefully and under which conditions not. Let us make a simple thought experiment. Suppose the idea is to drive 1 million miles without a single crash. In theory this is possible: all the driver has to do is drive at normal speed and watch the traffic lights. The open question is how to educate humans so that they drive in this safety-first style. If all the human drivers in the world could be convinced to drive safely, the crash rate would become much lower. The interesting question for self-driving cars is whether they motivate the human driver to drive more aggressively or more carefully.

A nearly universal robot control system

Most robotics problems can be summarized as a number of given joints which have to be controlled according to sensor inputs. This description fits well for a single robot arm, but it also describes the situation for a self-driving car and a complete nanofactory at the same time. The difference is the number of joints which can be moved freely. A complex domain like a fully automated factory has many hundreds of joints.

The problem is that, from a technical point of view, this kind of description is not the answer to the problem; it is the problem, because it's unknown how to program the software which is able to control the robot's joints. A possible path for solving the issue is object-oriented programming. The idea is to describe a domain with the help of classes which are arranged in a UML chart. This object-oriented system is not able to control the robot, but a system which is easier to realize is only capable of annotating the actions of a human. Let us make a short example.

The task is a robot arm which has to fulfill a pick-and-place task. At first, a new Python file is created on the hard disk which contains all the object-oriented classes: a class for the trajectory, the high-level sequence, the list of motion primitives, the position of the object and so on. Now the human operator sends action signals to the robot simulation. The robot control system has to identify these signals according to the UML diagram. It converts low-level input into a semantic description.

The problem is to formalize a domain into a UML diagram. For classical software engineering the idea of using classes is widespread. But for modeling a domain like a pick&place task, the idea is new.
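To show the flavor of such an annotation layer, here is a minimal Python sketch with a hypothetical class layout. The classes do not control the robot; they only attach a semantic label to the operator's low-level signal, and the labeling rule is a made-up placeholder.

from dataclasses import dataclass, field
from typing import List, Tuple

@dataclass
class MotionPrimitive:
    name: str                      # e.g. "approach", "close_gripper"
    joint_targets: List[float]

@dataclass
class Trajectory:
    waypoints: List[Tuple[float, float, float]] = field(default_factory=list)

@dataclass
class AnnotatedAction:
    primitive: MotionPrimitive
    label: str                     # semantic description of the signal

def annotate(signal: List[float]) -> AnnotatedAction:
    """Map a raw operator signal to a semantic label (toy rule)."""
    primitive = MotionPrimitive(name="move", joint_targets=signal)
    label = "close_gripper" if signal and signal[-1] < 0 else "move_arm"
    return AnnotatedAction(primitive=primitive, label=label)

print(annotate([0.2, 0.5, -1.0]).label)  # -> close_gripper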

Benefits of self-driving vehicles

Autonomous driving doesn't mean that the car can drive on its own; it's a safety feature which helps to reduce the accident rate. In cars of the past, only the human driver was in charge of adjusting the speed. The result is that sometimes humans overestimate their ability to handle the car, and they drive faster in urban areas than is allowed. A crash with other vehicles is the result.

Self-driving cars monitor the current speed and compare the value with the allowed speed. This allows feedback to be given to the human driver. That means the car recognizes if the human driver is too fast and can ask him to reduce the speed. This won't prevent every mistake, but it helps to educate the person behind the wheel.
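The core of this feedback loop is small enough to sketch in a few lines of Python. The warning text and the tolerance value are invented; a real driver-assistance system would take the speed limit from map data or traffic-sign recognition.

def speed_feedback(current_kmh: float, limit_kmh: float,
                   tolerance_kmh: float = 3.0) -> str:
    """Compare the measured speed with the allowed speed and give advice."""
    if current_kmh > limit_kmh + tolerance_kmh:
        return f"Too fast: {current_kmh:.0f} km/h in a {limit_kmh:.0f} zone."
    return "Speed is fine."

print(speed_feedback(current_kmh=62, limit_kmh=50))
# -> Too fast: 62 km/h in a 50 zone.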

If somebody is an expert driver he won't need a self-driving car. But if the driver is not familiar with driving at all, it makes sense to use the additional help of a computer-based tutoring system. It will tell the driver what the correct position in the lane is and what the distance to other vehicles should be.

The disadvantage of self-driving cars is that the feature is very expensive. To provide the mentioned guidance, a lot of cameras and lidar sensors are needed. Additionally, a full-blown onboard computer with the ability to interpret the raw data in real time needs a lot of energy. Only expensive cars are equipped with such capabilities. The question for the engineers is how to reduce the overall costs, so that a driverless feature can be realized in any car on the market.

The surprising fact is that an autonomous car without a human driver doesn't make much sense, because the car already knows what the allowed speed in a city is. The bottleneck is the human who doesn't know exactly how fast he is allowed to drive. Instead of riding in a self-driving car alone, the better idea is to invite 3 friends to join, because they can be educated as well. A self-driving car is some kind of mobile driving school in which the humans in the car can discuss which kind of behavior is right.

Does the overall society have a need for educating humans in driving? Yes, there is a demand. Most of the roughly 1 million fatal driving accidents each year are the result of human error. That means the drivers behind the steering wheel are not familiar with the traffic rules. They were driving too fast, the distance to other vehicles was too small and they didn't see the red traffic light. If human drivers are educated much better, vision zero can be realized.

October 15, 2019

A circuit for Nanorobotics

The difference between nanoparticles and a nanorobot is that the robot can be programmed. Programming means converting a domain into a computational one; that means neither biology nor chemistry is the right environment for describing the phenomena, but computer science. The question is how to transfer the world of atoms into the world of computer science.

Before something can be programmed, there is a need for a circuit. A circuit is the building block for a computer, and for computer-less function blocks as well. Let me give an example. Suppose the idea is to program a prime number checker. The source code in the Python language is known (see the sketch below). To convert the source code into hardware, we can translate the program into the VHDL language, which is needed for building a circuit, or we can compile the source code into binary code with the aim of running it on an existing computer. The computer itself is also realized as a hardware circuit.
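For reference, the kind of prime number checker meant here can be written as a short Python sketch; this is the high-level description that would then be translated into VHDL or compiled into binary code.

def is_prime(n: int) -> bool:
    """Trial division: check whether n has a divisor up to sqrt(n)."""
    if n < 2:
        return False
    divisor = 2
    while divisor * divisor <= n:
        if n % divisor == 0:
            return False
        divisor += 1
    return True

print([x for x in range(20) if is_prime(x)])  # -> [2, 3, 5, 7, 11, 13, 17, 19]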

In both cases, a computer program can only be executed in the form of a circuit. This reduces the problem to a search for hardware which allows building circuits. One option for doing so are classical electric circuits. The referenced prime number checker can be converted into wires which are powered by electric current. The other option is to build circuits with the help of biological components. This is called a genetic circuit. Apart from that, it's possible to build optical circuits in which the building blocks are powered by light.

The precondition for programming a nanorobot is that a logic gate is available. In the logic gate, the working program is stored. This can execute a Turing-like sequence. A full-blown computer which includes registers and the ability to call subroutines is only a special case of a circuit. It is used for executing longer programs.

Before a cellular automaton can be realized in a living cell, some preconditions have to be fulfilled. One of them is, quote, "Sending and receiving signals between cells."[1]

Somebody may ask why a cellular automaton was chosen, if a classical computer which consists of registers can be built much more easily. The reason is that living cells do not provide wires and electric current; their natural property is decentralized information which is sent back and forth. The principle is decentralized, which is different from how an electronic computer works. The cellular automaton is the intuitive mechanism for taking advantage of this property.

Perhaps it makes sense to explain the situation from the other perspective. The goal is to program something. Before it's possible to do so, hardware is needed, which is equal to a circuit. And now we can investigate which options are available to build reliable circuits in reality. One option is to use resistors, another option is to use living cells, and so on.

If a cellular automaton is programmed and executed, a lattice gas automaton can be the result. Such a lattice gas automaton consists of cells which can be virtual pixels in a computer game, or living cells in the domain of synthetic biology. It does the same kind of task as the Intel 4004 processor, except for the fact that the cellular automaton is distributed by default. That means it is not a single CPU which is fed by a program, but a decentralized supercomputer.

To understand cellular automata better, a short look into the Esolang wiki might help. We can read: "Any cellular automaton can be considered a programming language"[2] The mentioned programming languages are very similar to the Brainfuck language, and it is possible to convert an existing program written in C into such a language. That means, at first we need a high-level program written in C, then the program is converted into a cellular automaton programming language, and this code can be executed on the physical cellular automaton.
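To make the idea of computation by local rules concrete, here is a minimal one-dimensional cellular automaton in Python. Rule 110, used below, is known to be Turing complete, which is why such a simple update scheme can in principle run arbitrary programs.

def step(cells, rule=110):
    """Apply one update: each cell looks at itself and its two neighbors."""
    n = len(cells)
    new = []
    for i in range(n):
        left, center, right = cells[i - 1], cells[i], cells[(i + 1) % n]
        index = (left << 2) | (center << 1) | right
        new.append((rule >> index) & 1)
    return new

cells = [0] * 31 + [1]            # start with a single live cell
for _ in range(10):
    print("".join("#" if c else "." for c in cells))
    cells = step(cells)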

[1] Sakakibara, Yasubumi, et al. "Implementing in vivo cellular automata using toggle switch and inter-bacteria communication mechanism." 2007 2nd Bio-Inspired Models of Network, Information and Computing Systems. IEEE, 2007.

[2] https://esolangs.org/wiki/Cellular_automaton

October 13, 2019

Nanomedicine for beginners

The topic of nanotechnology and nanomedicine is an interesting subject which is hard for newbies to understand. The published blogs and books are either pure speculation or they contain lots of expert vocabulary. Instead of explaining what nanotechnology is, the more interesting question is how to reduce the entry barrier to a level that anyone can understand.

What I have learned from a first survey is that nanotechnology basically means building a 3d printer farm which is able to replicate itself. Some youtube videos about 3d printing farms are available. In most cases it's a shelf of 4-8 printers which work at the same time. If the printers were programmed to build parts for new 3d printers, the setup would be able to grow quickly. From an abstract perspective, the printing farm asks for electricity, filament and a human worker, and in exchange it produces physical objects.

In the case of a molecular assembler the idea is to replace the human work with robots. That means the 3d printing farm becomes fully autonomous and doesn't need human intervention anymore. This allows, in theory, building a structure similar to what the Star Trek replicator is capable of.
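The decisive property of such a self-replicating farm is that it grows exponentially instead of linearly. The toy Python model below assumes, purely for illustration, that every printer builds exactly one new printer per replication cycle.

def farm_size(initial_printers: int, cycles: int) -> int:
    """Each replication cycle, every printer builds one new printer."""
    printers = initial_printers
    for _ in range(cycles):
        printers *= 2  # idealized doubling, ignoring failures and supplies
    return printers

print([farm_size(4, c) for c in range(6)])  # -> [4, 8, 16, 32, 64, 128]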

October 08, 2019

Software design for a grasping robot

[Mindmap figure: software design for a grasping robot]

Programming a pick&place robot is at first glance a problem for Artificial Intelligence. It has to do with creating an algorithm which is capable of learning grasping poses. A closer look at the problem will show that AI isn't needed in this domain. Instead, it's an engineering project which has to do with programming a simulation.

The mindmap at the top of the posting shows a rough concept of the idea. All the terms used in the chart are domain-specific. It's an attempt to formalize the grasping workflow. The mindmap can't be executed on a computer, but it's part of a software engineering process. The idea is to program a prototype in the Python language, and the mindmap helps to identify subparts of the project.

The chart is not complete, because a real grasping robot system involves many more requirements and design principles. Even though the task of pick&place doesn't look very complicated, it can be a demanding project to write a simulator for this purpose.

On the other hand the potential benefit is great. If the grasping domain can be realized in software, this is equal to automation. In many segments of the economy, the same task is done millions of times. All supermarkets, all container terminals, all warehouses and most agricultural production facilities are confronted with the simple problem of grasping an object and releasing it at the target location. Today, most of the work is done by humans, not by robots. The reason is that reliable grasping robots are difficult to program. The task is not a toy problem which can be realized in 300 lines of code; it's a large-scale software project which needs a lot of heuristics preprogrammed into the system.

But let us go into the details. The main idea is to focus on a simulator which is realized with the object-oriented paradigm. The domain of "robot grasping" is converted into a UML chart which consists of many classes. The classes are used to store information about the events, the grasp pose, the trajectory of the robot arm, the position of the objects on the table, the result of the vision system and the planned high-level actions. Right now it's unclear how many classes are needed to model the overall domain. I would guess 100 classes are the minimum requirement for this complex domain.
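To give an impression of the object-oriented decomposition, here is a hedged Python sketch of a few of the classes named above. The class names follow the text, the fields are assumptions, and a real simulator would need far more than this.

from dataclasses import dataclass, field
from typing import List, Tuple

Vec3 = Tuple[float, float, float]

@dataclass
class GraspPose:
    position: Vec3
    approach_direction: Vec3

@dataclass
class ObjectOnTable:
    name: str
    position: Vec3

@dataclass
class Trajectory:
    waypoints: List[Vec3] = field(default_factory=list)

@dataclass
class HighLevelAction:
    verb: str                      # e.g. "pick" or "place"
    target: ObjectOnTable
    grasp: GraspPose

cup = ObjectOnTable("cup", (0.4, 0.1, 0.0))
action = HighLevelAction("pick", cup, GraspPose((0.4, 0.1, 0.1), (0, 0, -1)))
print(action.verb, action.target.name)  # -> pick cup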

The problem is that the pick&place task consists of many subproblems. One of them is called inverse kinematics. Inverse kinematics has to do with controlling a robot gripper indirectly through the joint angles. Even if the inverse kinematics problem is solved, lots of other problems remain, for example the gripper speed during the grasp phase, or what to do if the robot gripper loses the object during the transit.

The overall grasping robot will fail if only one of the subproblems isn't handled well enough. So there is a need to structure the overall task hierarchically. I think that for creating the prototype, it makes sense to get an overview with the help of a mindmap. This helps to identify subdomains of the grasping pipeline which can be solved separately. The idea is to handle the robot task similarly to the problem of programming an operating system for an IBM PC: every detail has to be handled with source code, and if the project consists of millions of code lines, the resulting software will run great.

From the perspective of Artificial Intelligence this sounds a bit disappointing, because the AI community is interested in building simple but powerful systems. The secret goal is to program in 500 lines of code an Artificial Intelligence which can learn by itself, which makes software engineering obsolete. This kind of vision can't be realized in reality. Robot projects in reality have the tendency to become complicated and to look similar to normal software engineering projects from game development or application programming.



To minimize the failure probability of the project, it's important to define some constraints in advance and make the domain easy to realize. The first question which has to be answered is which kind of hardware layout makes sense for a grasping robot. The most reliable layout is a portal crane, which is used in reality at container terminals. The advantage is that the overall system can transport heavy loads, it has been tested many times on real problems and it's conservative by default.

Other examples of robot grasping systems, for example a robot arm or a delta robot, are interesting for research projects, but they are harder to make reliable for this purpose. Usually these designs are utilized for exploring new paths and opening new research fields. This kind of open-ended problem is not needed here.

The second issue which can be reduced in complexity is grasping in cluttered scenes. For the beginning it's much easier to avoid these requirements and define that the robot only has to grasp normal objects which are aligned in advance. So it's not a universal grasping robot, but a simple container crane which works in a repetitive mode.

The resulting mindmap looks clear, the number of open tasks is small and it's possible to program the prototype with a low number of code lines. It's important to mention that even with these simplifications it won't become a toy problem, but a large-scale software project. That means the task of programming a simulator for a portal crane is still highly complex.

Using Baidu the right way

Everything in China is a mystery. The country is very large, and in the last 10 years many new things have been developed. In traditional business, China has become very successful, but is China able to outperform Silicon Valley as well? The most dominant example of a service hosted in the Bay Area is the Google search engine. It's the core infrastructure which is used by nearly everybody. The Chinese counterpart is available under the URL baidu.com

The website is hard to use for Western eyes, because everything is written in Chinese. The best way of getting comfortable with the service is not to ask for an English translation but to learn a bit of Chinese. So we enter into a dictionary the English phrase "model railroad" and get informed that the Chinese translation is 铁路模型. This can be copy & pasted into the Baidu search box, and after pressing the button, which is hopefully the search button, a long list of results is shown on the screen.

Again, most of the information is written in Chinese, but some videos are available too. It takes a bit until the video is loaded, but then it plays very well in the web browser. It seems that the Baidu website has an integrated video hosting site, similar to youtube, but programmed for the Chinese market.

What can be seen in the video is an all-Chinese model railroad community. They have created a large table with lots of trains. Everybody is speaking Chinese, and all the trains were made in China. It looks very similar to Western model railroad clubs, and they are using digital control everywhere.

Unfortunately, the Chinese companies didn't only copy the good things of the Western world; they are imitating the bad things as well. In the video portal Youku.com the user has to watch a commercial advertisement before the clip gets started. And it's not possible to skip the commercial break. The quality of the model railroad clip itself looks reasonably good. The video quality is on the same level known from youtube, and the Chinese-speaking voice in the background explains what is shown in the video. The user sees some trains which are driving around, and a train station is modelled realistically.

One advantage of the Baidu search engine over Google.com is that Baidu doesn't index all the English-language websites. This is a great service, because English is no longer the world language, but is used only by a minority for distributing the lyrics of pop songs.

October 07, 2019

Mindforth is a simulator, but not an AI

Mindforth is a long-running project which is marketed in different online forums and chat groups under the keyword “AI has been solved”. Most users are irritated by the source code, which consists of 10k lines of code written in the Perl language, https://ai.neocities.org/perlmind.txt Other versions of the Mindforth program were written in Forth and JavaScript as well.

Because the number of users who have tried to understand the details is high, and because Mindforth has a long history and even its own FAQ page, it amounts to a cultural phenomenon which has to be researched a bit in detail. The short explanation for everybody who is in a hurry is that Mindforth can't solve puzzles, but it is itself equal to existing puzzles like the board game Scrabble.

The longer explanation has to do with some definitions. At first, there is a need to describe precisely what the difference is between an environment and an AI bot which acts in the environment. An environment is equal to a domain, a simulator and a game. Typical examples are the toy problem blocksworld, the mentioned Scrabble game, or a game of Pac-Man. All these puzzles have in common that the user has to act inside the given action space. In the Scrabble game the user has to figure out which words he can put down, and he is only allowed to use the pieces on his own rack. The user accepts the fixed rules because he wants to play the game, and the interesting question is which kind of actions result in winning the game.

In contrast, an AI bot, or a strategy used by a human player, has to obey the given rules, and it allows the player to reach a certain state in the game. If he decides carefully he can win, and vice versa. Automating the task of decision making is researched under the term Artificial Intelligence. The idea is to write a computer program which can act in an environment autonomously.

After this short introduction we have to determine which kind of situation is present in the Mindforth project. The surprising information is that Mindforth is not able to play existing games, but is itself equal to an environment. Basically speaking, Mindforth is a game in which the player has to maximize his reward. He can do so by entering words on the command line. The input is parsed by the Mindforth engine and then a result is shown on the screen. To play the Mindforth game, it's possible to enter words manually or with a script. Mindforth will receive the input, parse it by its own rules and give feedback back to the user. The idea is that the user enters a certain sentence to maximize the reward.

How exactly the user can maximize his score is unclear, because the game rules are a bit chaotic, and a variable like “score” or “reward” isn't available in the Mindforth software. But in general we have to assume that the intention is not to use Mindforth as an AI which solves existing problems; the idea is rather that Mindforth simulates a domain which is hard for the user to solve.

Does this make sense? Yes and no at the same time. Most people who are interested in Artificial Intelligence are interested in computer programs which are able to play games, because many games like Tic-Tac-Toe or Scrabble are already available and the question is how to solve them. They are disappointed if they are confronted with the fact that Mindforth can't help them to play games. On the other hand, how to create a simulation is an interesting problem. According to the documentation, Mindforth is a cognitive simulator. That is a certain kind of game which emulates psychological processes. It's important to mention that a cognitive simulator is not built to solve thinking games; it's a domain-specific language for creating new problems.

I know the description is highly vague, so let me give an example. In a simple guessing game, the user has to enter a number between 0 and 10. If the number is equal to the number imagined by the computer, he has won. The source code is shown here:

#!/usr/bin/env python3
# Simple number guessing game: the player wins if he guesses
# the randomly chosen number.
import random
random.seed()
goal = random.randint(0, 10)
a = int(input("number (0-10)? "))  # convert the input string to an integer
if a == goal:
  print("you won")
print(a, goal)

This kind of source code formulates a domain. The game has an inner structure, rules, and asks the user to do something. If the user enters the right number, he has won the game. Creating such games is not very hard; many formalized puzzles are available. What they have in common is that for solving these puzzles the user has to think about the case. In the number guessing game, the user might enter the number in the middle, which is 5, in the hope that this maximizes his reward. It depends on the game which strategy is the right one. The ability to solve existing games is what is meant by Artificial Intelligence.

It's important to make clear that the listed Python source code isn't equal to an Artificial Intelligence. Even if the program contains no errors and was programmed well, it is only a puzzle. That means the program provides the rules and a potential reward, and for solving it the user has to be intelligent or he needs some luck.

How to pass the Mindforth Turing test?

The Mindforth program is available in Perl https://ai.neocities.org/perlmind.txt, Forth and JavaScript http://mind.sourceforge.net/Mind.html All the programs operate with the same idea: they play a game with the user, called a Turing test or captcha. The idea is that the user has to convince the Mindforth engine that he is a human and not a machine. He can do so by entering words in subject-verb-object notation, which is known as a triple store. If the user has entered the right words, Mindforth comes to the conclusion that the user behind the screen is a human.
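As a minimal sketch, and assuming nothing about the actual Mindforth parser, input in subject-verb-object notation could be stored as triples like this:

#!/usr/bin/env python3
# Hypothetical sketch of a subject-verb-object triple store.
# This is not the parsing logic used by Mindforth itself.

triples = []  # each entry is a (subject, verb, object) tuple

def parse_sentence(sentence):
  """Split a three-word sentence into a triple and store it."""
  words = sentence.strip().split()
  if len(words) != 3:
    return None
  triple = tuple(words)
  triples.append(triple)
  return triple

print(parse_sentence("robots need electricity"))
print(triples)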

October 05, 2019

Creating a simulation but not an Artificial Intelligence

A common understanding of Artificial Intelligence is that a certain type of software is able to control robots. This kind of software is referred to as a neural network or a cognitive agent. Unfortunately, it's unclear how to use a certain framework the right way, so the attempt at building an AI fails. But what if AI isn't the result of an algorithm, but can be built as a simulator? Let us make a small thought experiment. The idea is to program a self-driving car. We are doing so not with AI frameworks; the only allowed tool is creating a simulation.

The first step is to program a car simulator. That means we are modelling the surroundings of a self-driving car with the help of an object-oriented programming language. A class is created for the road, a second for the lane on the road, a third class for other cars, another class for pedestrians, a class for the traffic light and so on. In short, it amounts to programming a car simulator game.
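A minimal sketch of such a class model, with purely illustrative class and attribute names, could look like this:

#!/usr/bin/env python3
# Hypothetical sketch of the class model for a car simulator.
# The class and attribute names are assumptions for illustration.

class Road:
  def __init__(self, length):
    self.length = length

class Pedestrian:
  def __init__(self, position):
    self.position = position

class TrafficLight:
  def __init__(self):
    self.state = "red"

class Car:
  def __init__(self, position, speed):
    self.position = position
    self.speed = speed

  def step(self, dt):
    """Advance the car along the road by one time step."""
    self.position += self.speed * dt

if __name__ == "__main__":
  road = Road(length=500)
  ego = Car(position=0.0, speed=10.0)
  ego.step(dt=0.1)
  print(ego.position)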

There is no need for the simulator to act on its own; it's enough to provide the environment in which the human user can press buttons. After a while, a realistic simulation has been created, but the original problem of building an AI remains unsolved. What we have done is formalize the domain in a computer simulation. In the next step, the simulation gets improved. Instead of creating an AI character in the given simulation, the working assumption is that the simulation is not realistic enough, so the programmer has to think about how to improve the car simulation game.

Most existing simulation games can be improved by high-level simulation layers. In the case of a car game, a high-level layer can simulate a crash. That means, if the driver leaves his own section of the road, the simulator shows a warning text message. It's interesting to know that this warning message is not the result of sophisticated Artificial Intelligence, but part of the simulator. Another option is to show visually that the speed of the car is too high. The resulting improved simulation game looks like a driving school: on the screen there is a car simulation, and a virtual teacher is visualized at the same time.
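Such a layer can be a handful of rule checks on top of the simulator state; the parameter names and thresholds below are arbitrary assumptions:

# Hypothetical high-level warning layer on top of the car simulator state.
# Parameter names and thresholds are arbitrary assumptions.

def check_warnings(lateral_offset, speed, lane_width=3.5, speed_limit=13.9):
  """Return warning messages, e.g. lane departure or excessive speed."""
  warnings = []
  if abs(lateral_offset) > lane_width / 2:
    warnings.append("You left your lane!")
  if speed > speed_limit:
    warnings.append("You are driving too fast!")
  return warnings

print(check_warnings(lateral_offset=2.1, speed=15.0))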

The working hypothesis is that an AI character can be replaced by a highly realistic simulation. If more details are included in the game, no AI at all is needed. The sad news is that it's complicated to program a realistic simulator for complex domains. Creating a car simulator which includes all the elements of real traffic is a large-scale software project. From a technical perspective it's known how to realize such a project, but it needs a large amount of manpower.

Why cognitive architectures can't solve real world problems

Under the term “good old-fashioned AI” many attempts were made in the past to develop a universal Artificial Intelligence, that is, a computer program which mimics the thought process and the behavior of a human. A well-known example of a cognitive architecture is the BDI paradigm, which is a framework for developing agent systems. At first glance it's unclear why cognitive architectures are not able to solve real-world problems, because if software like AgentSpeak or SOAR consists of a working memory, an inference engine and a sensory buffer, it should be well prepared for all sorts of problems.

It has to do with a missing understanding of environments versus agents. The connection between the two is explained in the literature as grounding, but the definition is not precise enough. The terms agent and cognitive simulation are often used for the same purpose. For solving practical problems in robotics, the simulation part is the more important one, and the AI can be ignored. If the domain has been converted into a simulation, the problem is solved. That means a simulation doesn't need a sophisticated Artificial Intelligence.

But let us take a deeper look into the BDI framework. The belief-desire-intention concept is often described as software for creating AI agents. But in reality, a BDI agent has first and foremost the obligation to represent the problem. If the agent was designed for a RoboCup-like game, the agent will consist of procedures and variables from the soccer domain. That means it provides a variable ball, a function “moveto” and an event like “lost the ball”. According to a strict border between agents and simulations, these terms are not located in the agent but are part of the environment. That means the variable ball doesn't belong to a certain agent who likes to play the game; it is provided by the game engine of the domain.

The question is not how to play a given game, but how to create a formalized game for a domain. Before a software program can be implemented which kicks the ball in RoboCup, there is a need to write a simulator which allows agents to play the game. If the simulator has more features and was programmed well, it becomes much easier to write an AI for it.

A typical mistake of robotics engineers is to leave out the step of programming a simulator. They use a robot in hardware, for example an Arduino board, and the idea is that after pressing the on button the robot is able to play the RoboCup game. The beginner assumes that the robot itself needs a certain amount of intelligence and knowledge to understand the game and determine the next action. This assumption leaves out the importance of a simulator:

RoboCup domain -> agent plays the game

The robot which plays the game was programmed with an agent architecture. Such a project will fail: the agent, aka the AI, is not able to interact with the domain in a meaningful way. The more elaborate workflow is:

RoboCup domain -> simulator -> agent

To understand why the second pipeline is more efficient, we assume that the agent is equal to a random generator. It can't infer anything; the robot is producing random numbers all the time. At first glance this strategy will fail to solve the RoboCup game. Surprisingly, it works well if the underlying simulator has already been programmed. The simulator provides meaningful motion primitives like “take ball” and “kick ball”. If the agent sends random numbers to the simulator, it's possible that the agent plays the game reasonably well.
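A minimal sketch of this idea, with a made-up simulator API, shows how little the agent itself has to do:

#!/usr/bin/env python3
# Hypothetical sketch: a random agent on top of a simulator that
# exposes motion primitives. The simulator class is an assumption.
import random

class SoccerSimulator:
  """Minimal stand-in for a RoboCup-like simulator API."""
  primitives = ["take ball", "kick ball", "move to ball", "pass ball"]

  def execute(self, primitive):
    print("executing:", primitive)

sim = SoccerSimulator()
for step in range(5):
  # the "intelligence" lies in the primitives, not in the random choice
  action = random.choice(SoccerSimulator.primitives)
  sim.execute(action)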

The intelligence is not located within the robot but in the simulator. The domain simulator converts a domain into a machine-readable API. This is called grounding, and it is the most important part of an Artificial Intelligence system.

Creating complex simulations

There are some techniques available for creating complex, realistic simulations of a domain. In traditional software engineering, object-oriented programming languages were invented. They can be utilized for creating hierarchical object-oriented models, that is, a UML chart which contains lots of classes distributed over hierarchical layers. Sometimes the Protégé tool is recommended for designing such an object-oriented model. The idea behind Protégé is that the user can create classes which describe a domain.

Object-oriented programming and the Protégé tool are used for creating simulations. A given domain, for example a soccer game, is mapped onto objects. All the allowed sensor rules, actions and events are formalized in an object hierarchy. Game programmers call this a game engine, or rule engine, because it holds the game itself.

A short but readable tutorial explains how to use object-oriented programming to create video games, https://gamedevelopment.tutsplus.com/tutorials/quick-tip-intro-to-object-oriented-programming-for-game-development--gamedev-1805 Three different domains were given: Asteroids, Tetris and Pac-Man. All the games can be realized by creating objects which have attributes and methods. This is equal to creating a game simulator, i.e. a computer program which executes a certain game. After the game objects are created, it's possible to interact with the game engine. For example, in the Asteroids game it's possible to send a “turning” command to the spaceship, which will modify the thrust variable.
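A minimal Python rendition of that idea, not the tutorial's own code and with purely illustrative names, could look like this:

#!/usr/bin/env python3
# Minimal sketch of an Asteroids-like spaceship object.
# This is not the tutorial's code; names are illustrative assumptions.
import math

class Spaceship:
  def __init__(self):
    self.x, self.y = 0.0, 0.0
    self.heading = 0.0   # degrees
    self.thrust = 0.0

  def turn(self, degrees):
    """The 'turning' command changes the heading of the ship."""
    self.heading = (self.heading + degrees) % 360

  def accelerate(self, amount):
    self.thrust += amount

  def step(self, dt):
    """Advance the ship according to heading and thrust."""
    rad = math.radians(self.heading)
    self.x += math.cos(rad) * self.thrust * dt
    self.y += math.sin(rad) * self.thrust * dt

ship = Spaceship()
ship.turn(90)
ship.accelerate(5.0)
ship.step(0.1)
print(ship.x, ship.y, ship.heading)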

It's interesting to know that without a simulation written in an object-oriented language it's not possible to play the game at all. It also doesn't make sense to discuss a possible AI which could play the game autonomously.

A convenient way of accessing lots of UML models for games is the search query “site:genmymodel.com pacman”. It's possible to ask for a certain game, and the result is a list of UML diagrams which are used as a game engine. Easy games contain no more than 10 classes, which are connected on the same hierarchical level. More complex games are realized by a hierarchical object model which can hold 50 and more classes. It's interesting to know that no object model is available for the problem of simulating an AI or a robot, because this kind of game is too general. Instead, only concrete domains like Pong, Pac-Man, soccer and RPG games are available.

From a simulation to an AI

At first glance, an existing UML diagram for a video game or the written source code of the game engine doesn't answer the question of how this game can be played autonomously by an AI, because the question is not how to program a game, but how to realize the Artificial Intelligence. It's interesting to know that both are connected. A well-written game engine can be scripted easily. A script is a short computer program which sends commands to the game API. If the game API has more features, it's much easier to write a script, and vice versa.

Apart from scripts there are other options for utilizing an existing game engine API, for example neural networks, a random generator, reinforcement learning and so on. All these AI techniques become more powerful if they are not used from scratch, but produce commands for a given game API. That means the intelligence of the resulting non-player character isn't located in the neural network, but in the game engine which provides the allowed actions for the neural network.

Let me explain this strange situation with a concrete example. Suppose a real-time strategy game has already been programmed. The game engine supports the creation of new buildings, and it's possible to move units on the screen. The only missing part is the AI. Instead of programming a dedicated AI, the user writes a 10-line Python script which uses a random generator to produce a number between 0 and 100, and then a random action from the game API is executed.

The resulting AI will produce only sensible actions. It will first build some buildings, move the units and build more buildings, very similar to what a human user would do. The reason is that the game API transforms the randomly generated numbers into semantically correct behaviors. If the strength of the AI is too low, it's not the AI that has to be improved, but the game engine.
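A minimal sketch of such a script, with a hypothetical game API, could look like the following:

#!/usr/bin/env python3
# Hypothetical sketch of the described random "AI" script for an RTS game.
# GameAPI is an assumption; a real game engine would provide it.
import random

class GameAPI:
  actions = ["build_barracks", "build_farm", "move_unit", "gather_resources"]

  def execute(self, action):
    print("game executes:", action)

api = GameAPI()
for turn in range(10):
  number = random.randint(0, 100)               # the random generator
  action = GameAPI.actions[number % len(GameAPI.actions)]
  api.execute(action)                           # the API keeps the action valid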

October 03, 2019

Linked lists vs object oriented programming

A linked list is a good example for comparing classical structured programming with the modern object-oriented style. For the programming language C, a linked list is one of the most advanced forms of data storage. It is superior to an array of structs, because a linked list allocates memory dynamically. A linked list implementation in C works in a similar way to a Python dictionary: the programmer can create a hierarchical database and then store values in it. Many advanced C programs use a linked list for internal storage, either with a self-implemented retrieval algorithm or with a library.

The interesting point is that for object-oriented languages like C++, Java and Python a linked list isn't very important. It is seldom used, and in most cases it's a sign of bad design. The reason is that the concept of a class, and the ability to store objects in an array, is more powerful than a linked list. Let me give an example:

Suppose a game written in C has to store obstacles on a map. Each obstacle consists of a position, a size and a color. The natural way of realizing such a data structure in C is a linked list. There is no need to define the number of obstacles in advance; the list can grow on the fly. It's easy to store new items in the list and retrieve existing ones. The combination of the C language plus a linked list allows the creation of a fast video game.

A C++ programmer will realize the same feature with a vector of objects. He first defines a class for the obstacle and then adds objects to an empty vector. The same technique is used by Python programmers and Java coders. The difference is that in an object-oriented programming language the programmer is not machine oriented, but describes a domain. How the compiler converts the list of classes into a memory representation is not important for the programmer.
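In Python, a minimal sketch of this style could look like the following; the attributes are the ones mentioned above:

#!/usr/bin/env python3
# Minimal sketch of the object-oriented variant in Python:
# a list of obstacle objects instead of a hand-written linked list.

class Obstacle:
  def __init__(self, position, size, color):
    self.position = position
    self.size = size
    self.color = color

obstacles = []                       # grows on the fly, like a linked list
obstacles.append(Obstacle((10, 20), 5, "red"))
obstacles.append(Obstacle((40, 15), 3, "blue"))

for obstacle in obstacles:
  print(obstacle.position, obstacle.size, obstacle.color)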

Now it's possible to compare linked lists with a vector of class objects. The main difference is that a linked list holds only the data but not the program code. In a C program there are maybe 20 subroutines which all have access to the linked list. This is equal to a centralized data class in C++: the programmer defines a class for information storage, but the routines for updating the data are located elsewhere. This simplifies the interchange of information between different parts of the program.

According to the OOP paradigm, this kind of global information interchange has to be prevented. The idea behind creating different classes is that they can't modify external information. So a linked list and an object-oriented design are opposites. The reason why OOP has become successful is that larger software projects can be implemented more easily. That means a linked list works well only for a small amount of code, while C++-like languages support bigger software projects.

Symbolic AI explained

The amount of literature about classical AI is large, so it's important to extract the dominant ideas. Symbolic AI is often mentioned together with cognitive architectures. This is equal to a cognitive simulation; it means mimicking the behavior of a human in a computer program. Realizing a cognitive simulation in computer code can be done with an agent-based simulation.

Agent-based programming languages like AgentSpeak work with the belief-desire-intention model. This is shorthand for: 1. recognize the current state, 2. define the future state, 3. create a plan. It's interesting to know that this kind of workflow can be implemented with a STRIPS-like planner. A STRIPS solver needs the current world state as input, it also needs the goal state, and then STRIPS is able to plan the actions in between.
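A minimal sketch of such a STRIPS-like planner, using a made-up toy domain of a robot picking up a ball, could look like this:

#!/usr/bin/env python3
# Minimal sketch of a STRIPS-like planner (breadth-first search over actions).
# The domain below (a robot picking up a ball) is a made-up toy example.
from collections import deque

# each action: name, preconditions, add list, delete list
ACTIONS = [
  ("move_to_ball", {"at_start"}, {"at_ball"}, {"at_start"}),
  ("pick_up_ball", {"at_ball"},  {"has_ball"}, set()),
]

def plan(state, goal):
  """Return a list of action names leading from state to goal."""
  queue = deque([(frozenset(state), [])])
  visited = {frozenset(state)}
  while queue:
    current, steps = queue.popleft()
    if goal <= current:                      # all goal facts are satisfied
      return steps
    for name, pre, add, delete in ACTIONS:
      if pre <= current:                     # preconditions hold
        successor = frozenset((current - delete) | add)
        if successor not in visited:
          visited.add(successor)
          queue.append((successor, steps + [name]))
  return None

print(plan({"at_start"}, {"has_ball"}))      # -> ['move_to_ball', 'pick_up_ball']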

STRIPS and the more recent declarative AI language PDDL are examples of a cognitive architecture. They are used to simulate the behavior of a human. Unfortunately, it has not been answered yet how to convert a certain domain into STRIPS notation or into an agent-simulation language like Golog. Most practical projects within the robotics domain are not focused on simulating human thinking, but are domain oriented. That means converting the Mario AI game into STRIPS notation is done with actions and goals from within the Mario game, while converting a Snake game into an agent simulation has to be done with the typical motion primitives of Snake.

So the term cognitive architecture is a bit misleading. In reality it's more of a domain-specific simulation, in which not the human is simulated, but the task he is trying to solve.

Agentspeak = qualitative simulation

AgentSpeak defines itself as a cognitive architecture written in Java. The idea is that the programmer gets a tool for creating a multi-agent system. It's interesting to know that the programmer doesn't define the agent itself, but writes down the domain in a machine-readable form. Converting a domain into the AgentSpeak syntax is equal to programming a qualitative simulation.

Let me give an example. Suppose the domain has to do with a robot in a maze. For simulating the robot, a symbolic game has to be written. The game consists of a game state and possible actions. In the game programming literature this is described as a game engine or rule engine. The robot has a position (variable pos), it can move in 4 directions (action move), and it can collide with an obstacle (event collision).
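A minimal sketch of this rule engine, with an illustrative grid layout and made-up wall positions, could look like this:

#!/usr/bin/env python3
# Minimal sketch of the maze domain as a rule engine.
# Grid layout and method names are illustrative assumptions.

class MazeEngine:
  def __init__(self):
    self.pos = (1, 1)                       # variable pos
    self.walls = {(0, 1), (2, 2)}           # obstacle cells

  def move(self, direction):
    """Action move: shift the robot one cell in the given direction."""
    dx, dy = {"up": (0, 1), "down": (0, -1),
              "left": (-1, 0), "right": (1, 0)}[direction]
    target = (self.pos[0] + dx, self.pos[1] + dy)
    if target in self.walls:
      return "collision"                    # event collision
    self.pos = target
    return "ok"

engine = MazeEngine()
print(engine.move("right"), engine.pos)
print(engine.move("up"), engine.pos)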

Where exactly is the difference between a cognitive architecture written in AgentSpeak and a game engine? Right, there is no difference; it's the same thing. AgentSpeak isn't really utilized for creating agents; it's a game construction kit. From an abstract point of view, an agent simulation is utilized for simulating a system. The term system doesn't refer to an Artificial Intelligence, nor to a cognitive agent, but to a domain. A possible domain is the Super Mario game, the Snake game, or a robot in a maze. The question is how to convert these domains into a computer simulation. This process is called agent programming, or game engine programming. From an engineering perspective the correct term is forward modelling or system identification. It means converting a given domain into a computer simulation.

The most interesting aspect is that system identification doesn't need an intelligent agent in the loop. If the movement of a rigid body is formalized by ordinary differential equations, it's certain that the rigid body isn't able to think; it simply follows physical laws. The same is true for implementing a racing car. The game engine which calculates the current position of the car in response to the speed value isn't an agent, nor an Artificial Intelligence; it's the normal game engine. The result of the game engine is visualized by the graphics engine of the program.

That means that in practical projects, no separate agent can be identified in a given domain. Once the domain has been transferred into a simulation, all the work is done.

October 01, 2019

Pyglet hello world program



#!/usr/bin/env python3
import pyglet
from pyglet.window import key

class GameWindow(pyglet.window.Window):
  def __init__(self, *args, **kwargs):
    super().__init__(*args, **kwargs)
    self.pos=(200,200)
    self.angle=0
    self.movement=20
    imagetrain = pyglet.resource.image('locomotive.png')
    self.spritetrain = pyglet.sprite.Sprite(imagetrain)
    self.label = pyglet.text.Label('Hello, world',x=100,y=100,color=(100,100,255,255))
  def on_mouse_press(self, x, y, button, modifiers):
    pass
  def on_mouse_release(self, x, y, button, modifiers):
    pass
  def on_mouse_motion(self, x, y, dx, dy):
    pass
    #print("move mouse")
  def on_key_press(self, symbol, modifiers):
    #print("key was pressed",symbol)
    if symbol == key.A: self.angle-=10
    elif symbol == key.B: self.angle+=10
    elif symbol == key.ENTER: pass
    elif symbol == key.LEFT: 
      self.pos=(self.pos[0]-self.movement,self.pos[1])
    elif symbol == key.RIGHT: 
      self.pos=(self.pos[0]+self.movement,self.pos[1])
    elif symbol == key.UP: 
      self.pos=(self.pos[0],self.pos[1]+self.movement)
    elif symbol == key.DOWN: 
      self.pos=(self.pos[0],self.pos[1]-self.movement)
    self.spritetrain.update(x=self.pos[0],y=self.pos[1],rotation=self.angle)
  def update(self, dt):
    pass
    
  def on_draw(self):
    self.clear()  
    pyglet.gl.glClearColor(1,1,1,1)
    pyglet.graphics.draw(2,pyglet.gl.GL_LINES,
      ('v2i', (10, 15, 130, 135)),
      ('c3B', (100,100,255)*2), # one color per vertex
    )
    self.spritetrain.draw()
    self.label.draw() 

if __name__ == "__main__":
  window = GameWindow(600, 400, "pyglet",resizable=False)
  pyglet.clock.schedule_interval(window.update, 1/30.0)
  pyglet.app.run() 

Update

In a second version of the program, a dedicated class for the physics state was created. This simplified the programming a bit. The remaining commands for drawing the lines and the sprites become easier, because once that code is written, it draws all the elements to the screen.

Secondly, the exact CPU consumption was measured. At 100 frames per second, the GUI app needs 15% of the total CPU.

ps -p 31339 -o %cpu,%mem,cmd
%CPU %MEM CMD
15.1  1.3 python3 1.py

If the framerate is increased to 200 fps, the CPU consumption is 18%. It's important to know that the only objects on the screen are one sprite, one line and a small “hello world” text. Does this mean that Python programs in general run slowly? No, that is not the case, because a program written in C++ with the SFML library shows the same performance behavior. It seems that updating the screen 200 times per second is simply demanding for computers in general, no matter in which programming language the code was written.

Low hanging fruits for Artificial Intelligence

The subject of Artificial Intelligence offers many interesting domains. Speech recognition, self-driving cars, biped walking robots and cognitive architectures are only a few examples. Especially for beginners it's important to know which domains can be ignored because they are too complicated and not well understood. This allows one to focus on the remaining problems within Artificial Intelligence which can be solved with success.

One example of an easy-to-automate task is a metro monitoring device. That's an onboard computer which creates a logfile for a metro or subway in a city. This kind of task rests on several assumptions. The first one is that "automatic train operation" domains are much easier to formalize than self-driving cars or biped robots on an unexplored island. The train network in a city is fixed, well known and won't change in the next 20 years. Additionally, there are no obstacles and everything is defined by rules. Secondly, instead of programming software which can operate the train automatically, it's much easier to monitor a human operator who is doing the task. That means the software doesn't control anything; its only task is to observe the human.

From a technical point of view, the real-time task is about recording existing raw data in a machine-readable format. In each second the subway has a speed value, a position and a status, for example whether the doors are closed or open. The overall situation can be monitored similarly to the EEG of a human body. The next step is to parse the data stream and convert it into semantic information, for example to detect whether the train is fulfilling the timetable, whether there is an emergency and so on.
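A minimal sketch of such a recorder, with assumed field names and an arbitrary example rule, could look like this:

#!/usr/bin/env python3
# Hypothetical sketch of a train event recorder: one log record per second
# plus a simple rule that turns raw data into a semantic event.
# Field names and the rule are assumptions for illustration.
import json, time

def record_sample(speed, position, doors_open):
  """Return one machine-readable log entry."""
  return {"timestamp": time.time(), "speed": speed,
          "position": position, "doors_open": doors_open}

def detect_events(sample):
  """Convert raw data into semantic events."""
  events = []
  if sample["doors_open"] and sample["speed"] > 0:
    events.append("emergency: doors open while moving")
  return events

sample = record_sample(speed=12.5, position="station_3+200m", doors_open=False)
print(json.dumps(sample))
print(detect_events(sample))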

The most interesting aspect of this task is that the technical side isn't too advanced, and at the same time it's a useful application. That means the real subway will profit from the project. Such a black box is called a train event recorder.

Rush hour

From the mechanical side, a subway train is not very interesting. The technology was invented decades ago and is sold off the shelf. What makes subway trains interesting is a special use case, called rush hour. In the morning and in the evening, lots of people crowd the platform. They are waiting for the incoming train, and they leave the train once they have arrived.

The problem is not located in the train itself; it has to do with capacity. The number of people requesting the service is higher than the number of free standing places in the subway, and as a result there is a traffic jam. The commuters lose their spare time and have to wait together with random strangers until a random train comes in. Most people at the station don't know the technical details of the subway. They are not aware of how much electricity the subway needs, whether the system is older or newer, or what the maximum speed is. The thing they are asking is why they have to wait, why it takes so long, and why so many people are traveling at the same time on the same train.

Formalizing Artificial Intelligence

In addition to a previous blog post, I'd like to describe a general strategy for creating a game-playing AI. The first step is to select a certain domain which has to be solved, for example controlling a self-driving car in a simulator. The next step is to divide the problem into subproblems. For example, a car needs a vision system, a path planner, a distance control, a speed control and so on. The categories depend on the game domain; a self-driving car has different subproblems than a real-time strategy game. In the next step, a system identification task is realized for each category.

System identification means predicting the future game state; it's a what-if algorithm. Let me give an example. The chosen subproblem is the abstract path planner for the car. The option available to the car driver is to drive to a certain node in the graph. As a result, the game is in a new state. Sometimes the strategy is called model predictive control, but system identification is only a part of it: it means only anticipating the future, without selecting an action.
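A minimal sketch of such a what-if function, with a made-up road graph and travel times, could look like this:

#!/usr/bin/env python3
# Minimal sketch of system identification for the path-planner subproblem:
# a what-if function that predicts the next state without choosing an action.
# The road graph and travel times are made-up values.

ROAD_GRAPH = {"A": {"B": 30, "C": 50},   # travel time in seconds per edge
              "B": {"C": 20},
              "C": {}}

def predict(state, action):
  """What happens if the car in 'state' drives to node 'action'?"""
  node, elapsed = state
  if action not in ROAD_GRAPH[node]:
    return None                          # the action is not possible here
  return (action, elapsed + ROAD_GRAPH[node][action])

state = ("A", 0)
print(predict(state, "B"))               # -> ('B', 30)
print(predict(state, "C"))               # -> ('C', 50)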

As a result, the AI prototype can be realized. It consists of submodules for path planning, speed control and wheel control, and each submodule is able to predict the future. The last step is to create an action generator which uses the existing information for controlling the car. This kind of problem-solving technique can be repeated for any domain. If the domain is more complex, for example a real-time strategy game, the number of subproblems grows quickly and it becomes harder to predict future states.

A general strategy for handling complex problems is to see them as a software engineering task. It can be solved by lots of people who create commits in a version control system. In parallel they have to write the documentation and test whether the system is working correctly, and exactly this is perhaps the largest bottleneck. The described general AI system isn't a library written in C++; it's only a software engineering pattern, which is equal to a guideline for building new AI systems from scratch. The workflow can't be automated; that means the AI won't evolve by itself, and it's up to the programmer to write the source code.

For simple domains, like a Snake game in 2D, a single programmer is able to realize such a project. He can handle all the tasks by himself: first he defines the subproblems, then he creates the system identification for each subproblem, and at the end he writes the controller and tests it in the game. For complex domains like a self-driving car, a UAV or a humanoid robot, it's not possible for a single programmer to realize such a project. Existing robotics projects are usually the result of a group of programmers who have invested many years and written millions of lines of code.

The reason why robotics and Artificial Intelligence are so complicated is that the written code can't be reused for a different project. That means each domain needs a different kind of source code. It's not possible to write a general system identification module which can predict all domains, or to define in general which subproblems a game has. The only thing which is fixed is the programming language: the same Python interpreter and the same git version control software can be used to manage all sorts of AI projects.

Are Python dictionaries an important technique?

Python dictionaries are a lesser-known technique for creating software. A dictionary is basically a luxury version of a linked list. The user doesn't have to handle pointers and hash values, but can store and retrieve data directly from the dictionary. Additionally, it's possible to create nested sub-dictionaries so that large amounts of information can be stored in main memory.

The disadvantages of the idea should be mentioned as well. A Python dictionary is similar to a data class in object-oriented programming. The idea is to create a centralized data store which holds all the data from the game, and all the classes have access to that store. The classes are not forced to communicate with each other, because they have access to the data by default.

Usually, a centralized data store is an anti-pattern in object-oriented programming because it bypasses OOP entirely. The problem is that the resulting structure looks like the following: there is a centralized dictionary with 50 entries, and around the dictionary there are 20 functions with a total of 500 lines of code. Technically it will run fine, but nobody likes to debug such source code, because there is no structure available and it's unclear which function is doing what.

In general, it's a good idea to avoid such global dictionaries and use Python classes as an alternative. Each class stores only a part of the information and also holds the functions for manipulating that data. This results in source code which is easy to debug.
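A minimal sketch of the two styles, with arbitrary field names, could look like this:

#!/usr/bin/env python3
# Minimal sketch of the two styles; the field names are arbitrary examples.

# style 1: centralized dictionary which every function can touch
game_data = {"player": {"x": 0, "y": 0}, "score": 0}

def move_player(dx, dy):
  game_data["player"]["x"] += dx
  game_data["player"]["y"] += dy

# style 2: a class that owns its part of the data and the code to change it
class Player:
  def __init__(self):
    self.x, self.y = 0, 0

  def move(self, dx, dy):
    self.x += dx
    self.y += dy

move_player(1, 2)
player = Player()
player.move(1, 2)
print(game_data["player"], (player.x, player.y))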