April 27, 2020

How to reduce a blog's traffic to zero

The first step is to avoid the larger blogging platforms like WordPress and Medium and to pick Google Blogspot as the blogging website. Google Blogspot carries no commercial advertising, which makes the space unattractive for a larger audience. Additionally, Google web search does not index the blogspot subdomain, so the newly created blog is nearly invisible on the internet. After creating some blog posts the user has to rename the blog, which can be done in the settings menu. The result is that even if the blog was discovered by an external search engine like Bing, the old URL becomes obsolete and the newly created URL has to be discovered again. This will drop the daily pageview counter down to zero for sure.

The open question is what to do with a blog which has no audience. This is outside the scope of this tutorial; here it was only explained how to create such a C- weblog. The only thing that is certain is that no comments were written by external internet users, and that no search engine will find the blog.

First look into Debian 10

Debian 10 was released in mid 2019. Installing it in a virtual machine makes sense, but the more interesting idea is to install it on a physical notebook. I have done so and it works reasonably well. The biggest obstacle was the installation procedure itself. After booting the USB stick, the user has the choice between a Gnome-based installer, a graphical installer and a text installer. I chose the graphical installer, but it was a bit complicated to browse through the options. After reading the manual a bit, an installation was possible, but newbies in particular will find the installer difficult to use.

The more elaborate way is the modern Gnome-based installer, which works better but needs more system resources. For the next installation it is the better choice. After the system was installed on the PC, the first bootup used the wrong graphics settings. The user has to manually install the non-free driver which fits the graphics card. Selecting the correct package has to be done by hand. That means the user has to know that the resolution problem can be solved with a non-free driver, and then he has to read the wiki to identify which driver needs to be installed. Similar to the installer itself, this step is a bit hard for newbies.

After mastering this step the system runs great. All the programs are available, which means that Firefox, spreadsheet programs, Python 3 and all the other open source software run out of the box. The system requirements are on the same level as Fedora and Arch Linux, which means that an idle PC will need around 2 GB of RAM and the Debian OS occupies around 15 GB on the hard drive. Compared to early Linux systems, for example Slackware, the hardware requirements are high, but in comparison with Windows 10 it's a midsize system with moderate requirements.

What is important to know is that before the user will accept the Debian philosophy, he has to understand the advantages of a stable system over a rolling release distribution. If the user isn't familiar with the details of git branches and how a stable branch is monitored for security issues, he won't like the Debian philosophy very much. The reason is that a short look into the version history of all the software will show that Debian is outdated: Firefox is six months behind, the Linux kernel is an older one and Python is not the current version.

The main advantage of Debian over other Linux distributions is that all Debian users have installed the same software. On 2020-04-27 it is Debian 10.3, which means that the user needs a handbook, security patches and updates for exactly this version. This makes it more likely that after installing an update the system will run without interruption. In contrast, the situation with Gentoo Linux and Fedora is that every user has installed a slightly different Linux system, which makes it hard to trace errors back. Therefore, Debian has much in common with Windows 10, in which all users have installed the same version. This is important for blaming the right opponent. That means, if the user has installed Debian 10.3 and something isn't working, it is the fault of the Debian project and not of the single user.

April 25, 2020

Debian Release management

A short look at the Debian release schedule https://wiki.debian.org/DebianReleases provides helpful information for end users. In general, each version is maintained for two and a half years, which means the user has to install the operating system once and can use it for a very long time. When the 2.5 years are over, the user can update to the next release with a simple command line. This puts the user in a very comfortable situation.
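As an illustration, here is a minimal sketch of such a release upgrade, written as Python calls to the usual command line tools. It assumes root privileges and the standard path from buster to the next stable release; the suite names and the security line should be checked against the official release notes before running anything like this.

    import subprocess

    def run(cmd):
        # Run a command and stop on the first error.
        subprocess.run(cmd, check=True)

    # Point APT at the next stable release (buster -> bullseye) ...
    # (the security entry may need manual attention, see the release notes)
    run(["sed", "-i", "s/buster/bullseye/g", "/etc/apt/sources.list"])
    # ... then refresh the package lists and perform the full upgrade.
    run(["apt", "update"])
    run(["apt", "full-upgrade", "-y"])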

What is not answered in the release chart is how all the software is developed. A Linux operating system consists of hundreds of programs, and it's unclear how these programs work together. From the user's perspective this is a minor problem, but it's up to the developers to program and test the code.

Perhaps this is the most obvious difference between Debian and other operating systems like Arch Linux. In Debian there are two conflicting social roles: normal users vs. developers. The social role of the user is to install the software and then use all the code without doing anything in return, while the social role of the programmer has to do with fixing issues, compiling source code and monitoring security issues.

The simple explanation is that two conflicting roles are the mark of a professional Linux distribution. If a Linux distribution has only one role, which is the developer, it can't be called a serious distribution. A wrong assumption is that the conflicting roles only exist in a closed source ecosystem in which the user pays 100 US$ for the operating system while the other side takes the money and programs the code. The surprising situation is that the same social roles can be imitated in the open source world as well. The difference is that Debian users aren't paying money. The result is that the quality of the software is lower. The latest Debian 10 release has a weaker quality than the latest Windows 10 version: in Windows 10 the graphics card works better, the PC needs less energy and the installation works more smoothly. These disadvantages have to be accepted by Debian users because they get the .iso file for free.

The advantage of Debian over other Linux distributions like Gentoo, OpenSUSE and Fedora is that in Debian it's possible to become a normal end user. An end user is somebody who doesn't write source code and doesn't fill out bug reports, but simply uses the software. This is very similar to what most users are doing with Wikipedia: they type the address into the URL bar and read content which was written by others.

Most Debian experts argue that their operating system has greater stability than other Linux distributions. What are they talking about? A naive assumption is that stable means the software is secure or has no bugs. This is only partly true. The current Debian 10 stable has a lot of security issues, and some minor bugs too. What stable means more concretely is that the development is done in a stable branch. That means Debian consists of an unstable branch and a stable branch. The existence of a stable branch allows publishing long-term versions which run for 2.5 years on the computer. In contrast, Linux distributions which are not stable, like Arch Linux or Fedora, need to be updated once a week, and in case of doubt the system won't boot after the update.

The term stable refers to a publishing schedule in which every 2.5 years a new release becomes available, and during this period the user runs the same version on the computer. The stable branch is the technique that provides this release schedule.

Some arguments for Arch Linux

Arch Linux plays a unique role among all Linux distributions, because the distribution can be explained very easily: the latest version of each piece of software is compiled and installed on the user's PC. The Arch Linux wiki and the pacman package manager support this workflow very well. Most users understand the idea behind Arch Linux, so they use it at least for playing around.

From a more abstract perspective, Arch Linux is a developer-friendly distribution. It supports the idea of agile software development. If a certain subsystem has a problem, a bug report is created, the source code is improved, and with a delay of less than 24 hours the updated binary version can be downloaded from the server. No matter whether the source code of Firefox, the Linux kernel, a text editor or a game was improved, the source code gets compiled into binary packages and the user can download them from the server.
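For comparison, pulling all of those freshly built packages onto an Arch system is a single pacman command; the small sketch below just wraps it in Python and assumes root privileges.

    import subprocess

    # Synchronize the package databases and upgrade every installed package
    # to the latest binary build provided by the Arch mirrors.
    subprocess.run(["pacman", "-Syu", "--noconfirm"], check=True)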

Unfortunately, the Arch Linux project has some limits. It is not used very often in production environments. In theory it's possible to do so: an Arch Linux system can be installed on a vserver as well as on a laptop, but only few people are doing so. The exact reason is not clearly defined. Sometimes the explanation provided is that the Arch Linux system occasionally won't boot after an update. But with the recent improvements to pacman this is seldom the case: in most cases the boot process works fine, and if not, the manual intervention is minimal. Another explanation why Arch Linux isn't used in practice is that the concept is too new. That means the concept of agile development and a rolling release doesn't fit the well-known waterfall software cycle, so it's hard to convince a larger audience to use the software in production.

The more realistic description of why Arch Linux isn't used in production environments has to do with the conflicting needs of developers and normal users. Arch Linux was developed by coders for coders. The project is located in the upstream and claims that the upstream is equal to the downstream: everybody is a programmer, and in exchange he gets the most secure software ever programmed. This story doesn't fit reality. The first thing is that most users are not interested in creating software; they want to use it as normal users. Arch Linux ignores the idea of software quality checks.

Let us describe the preconditions behind the Arch Linux workflow. The idea is that the upstream never makes a mistake. If the Linux kernel was improved from version 1.0 to 1.1, this improvement makes sense and there is no time to argue about the reason why. The problem is that most software is written by amateurs, and they are not programming the software for the normal users but for other reasons. Especially in the open source ecosystem, most software projects are started because the developer team likes to try something out. For example, somebody wants to learn how the C language works, and therefore he starts a gaming project in which the C language is used.

The average user assumes that the upstream has programmed malware which is spying on the user's data. In contrast, the upstream assumes that the normal user has no experience with computers at all and therefore needs predefined settings. The consequence is that no trust at all exists between upstream and downstream. This problem is ignored by Arch Linux, which assumes that there is no conflict between upstream and downstream.

Update over the Internet

Rolling release distributions like Arch Linux have become successful since the advent of fast internet connections. If the users are equipped with a stable internet connection, it's possible to update the software every week. This narrative reduces the comparison between rolling release and stable release to a file transfer problem. The more elaborate comparison focuses on the development process. The bottleneck is located at the upstream level: before a piece of software can be installed, somebody has to write the code. Software development is done with the git version control system, in which a team of programmers writes lines of code. The software development process has to be organized in a certain way, and the management of writing code can be realized with a rolling release or a stable release.

Rolling release is equal to a single branch model, which is trunk. It is the same principle used in a Wikipedia article: there is only one current version of the article and everybody is allowed to modify it. It is surprising to learn that in reality most software projects don't work with a single branch model, because software development is more complicated than creating a Wikipedia article.

The first reason is that the number of commits is higher. The average Wikipedia article receives only 20 edits over a timespan of one year, while the average software project contains thousands of commits. The second problem in software development is that different tasks have to be solved in parallel: it's possible to create new features, improve the security, update the documentation, and fix existing bugs at the same time. The best practice method for doing so is to use two branches or more.

The problem with models of two and more branches is that there is no longer a single current version. A current version would mean that all the branches are merged into a single one, which is not the case. Instead, the average software project has many current versions at the same time:

• a current testing version

• a current security version

• a current stable version

• a current bugfix version

• and so on

The additional problem is that these versions are improved independently from each other. This is the major advantage but also the major disadvantage of the git version control system. A rolling release only makes sense if the development model is based on a single trunk branch.

Let us describe a common three-branch software development model. If a one-man project or a small team starts a new project at GitHub, they will create three branches: stable-branch, issue-branch, testing-branch. If a developer likes to fix an issue from the bugtracker, he commits to the issue-branch; if the maintainer of the project likes to aggregate different bugfixes into the testing version, he merges the issue-branch into the testing-branch; and if a new stable version should be created, the testing-branch is copied into the stable-branch.
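A minimal sketch of this three-branch setup, driving the git command line from Python; the repository name, commit messages and branch names are only the placeholders from the paragraph above, not a prescription.

    import subprocess

    REPO = "demo-project"

    def git(*args):
        # Thin wrapper around the git command line, always acting on REPO.
        subprocess.run(["git", "-C", REPO] + list(args), check=True)

    subprocess.run(["git", "init", REPO], check=True)
    git("config", "user.name", "demo")
    git("config", "user.email", "demo@example.org")
    git("commit", "--allow-empty", "-m", "initial commit")
    for branch in ("stable-branch", "issue-branch", "testing-branch"):
        git("branch", branch)

    # A developer fixes an issue from the bugtracker on the issue-branch ...
    git("checkout", "issue-branch")
    git("commit", "--allow-empty", "-m", "fix issue from the bugtracker")

    # ... the maintainer aggregates the bugfixes into the testing-branch ...
    git("checkout", "testing-branch")
    git("merge", "issue-branch")

    # ... and a new stable version is created from the tested state.
    git("checkout", "stable-branch")
    git("merge", "testing-branch")
    git("tag", "v1.0")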

This three-branch model is some sort of best practice method in software development. The surprising information is that it's not a rolling release. Instead, the new versions in the stable branch are produced with a time lag. That means, in January 2019 the bugfix was created, in March 2019 the testing branch was updated, and in June 2019 the new stable version was created. In this example it took six months until the bugfix was available in the stable version. This time lag can't be reduced. The reason is that the amount of resources in a project is limited. For example, if the GitHub project was created by 2 programmers, the maximum amount of written code is only 10x2=20 lines per day.

Let us make a small example. Suppose the team likes to improve the software with 3000 additional lines of code. According to the math, they will need 3000/20=150 days for the task. If they start today, they are finished in about six months. This delay produces the time lag in the release workflow. The only way to reduce the time between the occurrence of a bug and its fix in the stable version is to increase the number of programmers. If the team has access to 200 programmers, they can reduce the time lag drastically.
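The back-of-the-envelope calculation can be written down explicitly. The productivity figure of 10 lines per programmer per day is the assumption from the paragraph above, not a measured value.

    # Assumption from the text: roughly 10 written lines per programmer per day.
    LINES_PER_PROGRAMMER_PER_DAY = 10
    BACKLOG_LINES = 3000

    def days_needed(programmers):
        return BACKLOG_LINES / (programmers * LINES_PER_PROGRAMMER_PER_DAY)

    print(days_needed(2))    # 150.0 days, roughly six months
    print(days_needed(200))  # 1.5 days, the time lag nearly disappears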

Freezing the upstream

In the first example, a rolling release software project is described. It consists of a trunk branch which is updated once a day. The normal user is asked to always install the latest version, because it contains all the improvements and security fixes.

In the second example, a stable release is described. It is created by freezing the trunk branch. That means, at a certain point in time, a copy of the source code is created in a different folder, and then the copy is improved to fulfill the needs of the normal user. Freezing the upstream is done in addition to the normal upstream development. At the same time, the upstream trunk branch gets improved without interruption. That means the stable team is able to create the freeze independently from the upstream developers.
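In its simplest form, the freeze really is nothing more than a dated copy of the trunk folder. The sketch below shows this with Python's standard library; "project-trunk" is only a placeholder path.

    import shutil
    from datetime import date

    # Freeze today's state of the trunk into a separate release folder.
    release = f"project-stable-{date.today().isoformat()}"
    shutil.copytree("project-trunk", release)

    # From now on the two folders evolve independently: the trunk keeps
    # receiving daily updates, while the frozen copy only gets the fixes
    # that matter to the end user.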

It depends on the concrete software project how complicated it is to freeze the trunk branch. In most cases the point release is created together with a handbook, security updates and bug reports against the stable version. The only thing that is sure is that an additional stable branch needs more effort than only improving the trunk branch. A trunk branch has to do with the software project itself, which is focused on the source code and its improvements, while a stable branch has to do with the needs of the normal users.

Wikipedia edits made easy

Creating Wikipedia edits is a very complex task, and lots of papers were written about the topic in the past. The good news is that Wikipedia edits can be formalized and repeated over and over again. The common Wikipedia edit consists of two steps: creating keypoints and writing prose text.

Unfortunately, most long-term Wikipedia authors combine these steps into a single edit. They submit the changes to an article without further comments. This behavior makes it harder for newbies to create edits on their own. The better idea is to assume that the newbie has no experience with Wikipedia at all and likes to contribute to the project in a predictable way. That means, without producing conflicts and without getting banned because of bad edits.

An easy to follow edit strategy consists of the described two-step pipeline. In step 1 the newbie posts keypoints for an article on the discussion page, and in step 2 he converts these keypoints into full sentences. The good news is that this strategy has been described for decades under the term “creating a PowerPoint presentation”. A PowerPoint presentation consists of the same steps. In step 1 the author prepares the presentation at home and writes down the keypoints on the slides. In step 2 he holds the presentation, which means converting the self-created keypoints into natural speech consisting of full sentences. The second step is done in front of the audience. That means the lecturer not only reads the keypoints aloud, but uses the keypoints to talk about the subject.

It's interesting to know that the two-step pipeline for creating a PowerPoint presentation is the international de facto standard. All the presentations in the world consist of written keypoints stored in the slides plus the oral presentation in which the speaker formulates full sentences. It's not possible to make a presentation in a different way.

It's a bit surprising that in the Wikipedia ecosystem this two-step workflow is not known. In the official help section the steps are not mentioned. Instead, the official tutorials assume that an edit is the smallest item which can't be divided into subtasks. This assumption is wrong. A Wikipedia edit means submitting prose text to Wikipedia which is annotated with bibliographic references. Before this prose text can be added to an article, the author needs a preparation step: he has to read through the existing information and make some keypoints about what he has read in the papers.

Most authors store this pre-step either on their local hard drive, or they are trained well enough not to need such a step. For newbies the recommendation is to submit the created keypoints to the talk page, because this helps to get a better overview. Newbies are allowed to make mistakes, which can be located in two steps: either the newbie struggles with making notes while reading existing information, or the newbie isn't able to formulate the self-created keypoints into prose text. Getting feedback on which step exactly introduced the error will help a lot.

That means it is not enough to judge that a certain edit is wrong. The more elaborate question is whether the problem was the creation of the keypoints or the transfer of the keypoints into prose text.

Well written articles

Let us analyze the existing articles in Wikipedia. What they have in common is that they are written for the end user. They are formulated in prose text and they are equipped with bibliographic references. A Wikipedia article and a recorded PowerPoint presentation have much in common: they can be read or listened to from start to end, and in most cases the text makes sense.

What the average Wikipedia article does not show are the pre-steps until the article was created. A naive assumption is that an article is created by smaller edits. But this definition hides the fact that the individual authors use their local hard drive to prepare the edits. The preparation work on the local hard drive is never uploaded to Wikipedia, and therefore it's much harder for newbies to reproduce the steps for creating articles on their own.

The interesting point is that in normal tutorials about creating academic texts, the pre-step of notetaking is described in detail. In nearly 100% of the manuals in which the process of creating academic presentations and academic papers is described, the user is asked to first create the keypoints and then formulate the prose text. It's not very difficult to transfer this tutorial to creating Wikipedia articles. The reason why this is not done in existing Wikipedia tutorials is that the average long-term Wikipedia author is already familiar with academic note taking. For the Wikipedia expert there is no need to talk about creating notes, because this step is assumed. This untold assumption makes it harder for newbies to do the same thing Wikipedia experts are doing. What the newbies are doing is not making notes; they think it's possible to create on-the-fly edits.

Let us describe the imaginary on-the-fly edit in detail. On the fly means that an edit can't be divided into substeps: somebody reads a fact in a book and adds this fact to Wikipedia by submitting an edit. This workflow is what the existing Wikipedia tutorials describe. The problem is that in reality it won't work, especially not for newbies. The reason is that the subjects are too complicated, the quality standards in Wikipedia are too high and the newbie isn't familiar with academic writing. The result is that an on-the-fly edit becomes the reason why the newbie gets banned.

Two step edit pipeline

The recommended edit workflow is much easier to master. Creating keypoints from existing academic papers is not very complicated: the user has to write down important facts and make a note of which paper the information comes from. Training this behavior is not very advanced. The second step in the overall pipeline is also easy to master. Taking existing keypoints and converting them into prose text is something which has to do with academic writing. The facts and the literature are given in advance, and what the user has to do is formulate the facts in an easy-to-read paragraph.
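The two steps can even be mocked up in a few lines of code. The keypoints below are placeholders taken from this blog itself; the point is the shape of the pipeline, not the content.

    # Step 1: keypoints, each with a note about where the information comes from.
    keypoints = [
        ("Debian 10 was released in mid 2019", "Debian wiki, DebianReleases"),
        ("Each release is maintained for about two and a half years",
         "Debian wiki, DebianReleases"),
    ]

    # Step 2: convert the keypoints into prose text with reference markers.
    def to_prose(notes):
        sentences = [f"{fact} [{i}]." for i, (fact, _) in enumerate(notes, 1)]
        refs = [f"[{i}] {source}" for i, (_, source) in enumerate(notes, 1)]
        return " ".join(sentences) + "\n\n" + "\n".join(refs)

    print(to_prose(keypoints))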

The interesting point is that after combining both steps the result is a high quality Wikipedia edit which is accepted by the admins. A win-win situation means that such an edit makes sense for the newbie and the admins at the same time. The newbie is proud because he has edited Wikipedia, while the admin is happy because existing articles were updated.

A good starting exercise for Wikipedia newbies is to focus only on making notes. The task is to take 10 existing articles and post keypoints to the talk pages. That means the newbie isn't creating real edits, but only posting keypoints to the talk page. The most interesting effect is that none of these keypoints gets deleted from the talk page, because the talk page is the perfect place for storing the preparation notes.

April 24, 2020

Understanding the concept of long-term releases

In the open source software development model there are long-term stable versions available, for example Firefox ESR, Debian stable, Ubuntu LTS and the Linux kernel LTS. What most users don't know is what the difference to the normal version is. In most cases the long-term version is older, which means it is not the latest update but an outdated one. If the aim is to update the system to the latest source code, LTS versions don't make much sense. On the other hand, Firefox ESR and Linux LTS are offered as an option, so why are LTS versions available if they are a poor choice?

To understand the situation we have to take a look at closed source software development. All commercial software is published as a long-term version. The Windows XP operating system is available only in an LTS version, and the same is true for Windows 10, Mac OS X and so on. The most obvious difference between a rolling release and a long-term release version is that in the LTS version some components are missing. That means the Firefox ESR browser is a feature-reduced version of the latest Firefox browser. If the normal Firefox software has a menu with 10 menu points, the ESR version is equipped with only 8.

The decision for or against an LTS version has to do with the missing features. It's the difference between 8 menu points and 10 menu points. From the development perspective the interesting question is why the missing 2 menu points are not available in the LTS version. It has to do with stability. Improving a piece of software with additional features needs a lot of work. If the manpower in terms of programmers is small, it will take some time until the missing 2 menu points are added.

All long-term versions have in common that, with a time delay, all the missing features are added. If the end user waits a bit, the imaginary Firefox ESR version will have 10 menu points too. The only question is how long it takes. In most projects the time gap is 1-2 years. That means the developers have already programmed all 10 menu points, but they are only available in the trunk branch and not in the production version. If a software project provides a long-term version plus a trunk version at the same time, it's a positive situation. It means that the development team is improving the software and another team is testing the changes for stability and security. On the other hand, if a software project consists only of a trunk branch and doesn't offer an LTS version, it's a toy project. That means the software is programmed just for fun but isn't used in reality.

Let us analyze this hypothesis in detail. Suppose a development team is trying to improve the security, the stability and the amount of features of a piece of software. The only way to do so is by creating additional branches: one feature branch, one security branch, one stable branch and so on. After these branches are created, a time lag is the result. That means the security team will analyze version 1 of the software while the feature team is implementing version 2. The security team is testing something different from what the development team is building.

Let us imagine the software project consists of a single branch which is called trunk. It is not possible to establish quality control on a single branch development model. The reason is that all the latest updates are committed into the trunk branch, and it makes no sense to ask whether a certain version is secure, because the version changes twice a day. The result is that a trunk-only repository isn't tested for stability or security. Instead, it's the development version, which isn't recommended for productive usage.

Let us investigate why some software projects have no long-term version. The reason is that the software developers want to minimize their effort. It's a one-man or two-man team which is programming the source code, and no quality control is available. Such a software development cycle isn't used for commercial software, because most customers pay money for the software and in return they expect high quality. Only open source developers are bold enough to publish the trunk version of the software.

Point releases

Software development consists of two conflicting roles. The end user of a software product likes to install the program on the PC, needs documentation, and is interested in a bug-free system. The needs of the software developer are the opposite: a software developer likes to modify the code twice a day, tests out new modules and doesn't like to waste his time writing documentation.

Two conflicting roles means that if the end user is happy, the developer isn't, and vice versa. Only one social role in the game can win. The reason why long-term versions were invented is to fulfill both needs at the same time. Let us first describe the end user's perspective. The end user downloads the LTS version of the software. This version is bug-free, is well documented and runs out of the box.

From the developer's perspective the LTS version is useless. The developer has no need for documentation because he has written the source code himself, and he has no need for an easy installation, because he compiles the source code from scratch. What the developer prefers is a trunk branch in which new commits can be applied. Because of the different needs, there are two versions of the same software available: the LTS version, which is addressed to the end user, and the trunk version, which is the playground for the developers. Between both versions there is a time lag. That means they are out of sync. This allows both social roles to be happy at the same time.

Switching to Debian isn't easy

After experimenting with the Debian 10 operating system in a qemu environment, I decided to install the software on a physical machine. Unfortunately, the installation menu was a bit complicated. The first thing to mention is that the touchpad wasn't recognized, so all the settings had to be made with the keyboard only.

The second problem was that after the first boot-up the display resolution was wrong. Only a VESA mode was shown, which was below the normal resolution. After experimenting with different grub settings (none of them worked), the answer was found in the Debian wiki https://wiki.debian.org/AtiHowTo What the user has to do is install a non-free AMD graphics driver. After the next boot-up the normal display resolution is shown.
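For reference, a sketch of those steps with Python's subprocess module; it assumes root privileges, that the non-free component has already been added to /etc/apt/sources.list, and that firmware-amd-graphics plus the amdgpu X driver are the right packages for the card in question, which is exactly what has to be checked in the wiki.

    import subprocess

    def apt(*args):
        subprocess.run(["apt", *args], check=True)

    # firmware-amd-graphics lives in the non-free component, the X driver in main.
    apt("update")
    apt("install", "-y", "firmware-amd-graphics", "xserver-xorg-video-amdgpu")
    # A reboot afterwards brings up the display in its native resolution.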

It's unclear how other Linux distributions handle the situation. Perhaps they install the non-free package in the background without asking back. So nice, the resolution is now improved.

One thing which isn't working yet in Debian is the edge scrolling of the touchpad. It seems that with the new Wayland session the settings aren't recognized. Perhaps it's possible to find a workaround, or in the worst case the touchpad will keep working below its optimal quality.

Let us describe a unique feature of Debian which has to do with installing outdated software. The standard web browser in Debian 10 is Firefox 68.7 ESR. In contrast to the normal Firefox software, this version was released in 2019 and then improved slightly. So what is the difference? The interesting situation is that for most users the ESR version makes more sense. The story told in the version history looks predictable: a year ago the software was programmed, and then it was improved with security updates.

Now it is possible to compare this story with the trunk branch of Firefox and Chrome. In the trunk branch the story is that the user has to check twice a week for an update, and if an improvement is available he has to install the latest version of Chrome within 24 hours, otherwise the system becomes vulnerable to attacks. Or let me tell the story a bit differently. Suppose a user has installed the latest Chrome browser and hasn't updated the software for a week. From the perspective of the Chrome development team, the user has done something wrong. He was advised to check for updates twice a week, he wasn't doing so, and as a result the user has made a mistake.

Rolling release web browsers blame the user if the system becomes vulnerable. In contrast, long-term versions like Firefox ESR blame the upstream. That means, if something is wrong with Firefox 68 ESR, it's up to Mozilla to fix the issue. And if Mozilla isn't able to fix the problem, the next question is why a component which needs updates so frequently was introduced into the ESR version in the first place.

Touchpad in XFCE4

After playing around with a different desktop environment, the problem with the edge scrolling has been solved. In XFCE4 the touchpad can be configured differently than in Gnome, which even allows scrolling the content on the screen with the touchpad itself.

April 18, 2020

Creating a peer reviewed academic journal from scratch

There are some peer reviewed academic journals available. They have in common that the published information has a high quality, and most of them have a long tradition in the classical university ecosystem. Since the advent of Open Access there is a need to start new academic journals. The open question is how to combine the Open Access philosophy with a peer review pipeline.

The technical side of an Open Access journal is very easy. In most cases it's enough to upload a pdf document to a webserver, and the paper can be read by a worldwide audience. We can discuss the details, that is, which software produces the pdf format and which sort of webspace is the right one for hosting a journal, but in general this kind of pipeline will result in a high quality journal. That means the document can be displayed on any device, and the webserver will deliver the information to any reader in the world within seconds.

The more advanced and seldom discussed issue is how to create a peer reviewed journal. A normal Open Access journal doesn't have a peer review; it's some sort of pdf hosting website. That means the admin of the journal uploads the pdf file, but the paper was never read by anyone before the publication. In the classical academic publication system there is some sort of pre-publication peer review available which increases the quality, but it's unclear how to reproduce this workflow in an Open Access journal.

The current situation is that some journals are experimenting with overlay journals, open peer review systems and community driven peer review. One option is that somebody is only allowed to upload a new paper if he has peer reviewed an existing one. Another option is to skip peer review entirely and allow the normal reader to comment on newly published information. This will result in some sort of arxiv website which is extended with a comment section.

A true peer review system works a bit differently. The framework for explaining the details is located in the software industry. The git version control system has a built-in peer review feature. This feature can be activated with a dual branch workflow.

But let us go a step backward. Software development with git usually works with a single branch model: in the trunk branch the changes are submitted to a remote server. A single branch workflow doesn't have a peer review. A peer review has to do with creating two branches which are out of sync. A stable branch and an unstable branch are needed as the minimum requirement.

Peer review and merging two out-of-sync branches are the same thing. The amazing feature of merging two branches is that it will produce a conflict in any case. This kind of conflict creates a need for the stakeholders to negotiate about the issue. This negotiation is equal to a peer review. It is very different from commenting on a paper from the reader's perspective, because a branch merge is done in the pre-publication step.

Now it makes sense to transfer this philosophy to an Open Access journal. A minimal peer reviewed Open Access journal consists of two sections: an unstable upstream and a stable downstream. In the upstream section the incoming papers are stored, very similar to the arxiv repository. In the stable downstream section the next issue of the journal is created. The interesting point is that the stable section doesn't reference the upstream section; a complete copy is created. It's the same principle as in a git version control system: the stable branch and the unstable branch can be edited independently from each other. That means the paper in the upstream section can be modified without affecting the paper in the stable section.

Peer review can be realized with a dual branch model which is out of sync. To sync the branches a negotiation is required. Negotiation means discussing the next issue of the journal with colleagues. An interesting side effect is that the social roles in each branch are different from each other. That means an author is allowed to upload a paper to the upstream section, but this doesn't mean that the paper gets published in the downstream section.

Let us create a single example. The author uploads paper1 to the upstream section of the journal. The journal editor reads the paper and comes to the conclusion that the quality is too low. He decides not to publish this paper in the next issue. It is available in the upstream section, but it doesn't get copied into the downstream section. This produces a communication conflict, because the journal editor sends a rejection notice to the original author. This sort of communication is typical for all peer reviewed journals. What is available is a conflict between the different social roles of the journal. These conflicting roles are attractive for normal readers, because they make the publication system more robust against wrong information.

From a technical point of view, there are many options for realizing a dual branch system. One option is to use a GitHub project for hosting an academic journal. An even easier system is to skip the git tool altogether and store the branches in different sections of a wiki. That means the upstream branch is section 1 and the downstream branch is section 2. This allows creating a peer reviewed academic journal on a single wiki page.
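A toy model of this two-section setup in a few lines of Python; the paper title, the author name and the accept decision are invented placeholders.

    # Section 1 is the upstream (every submission lands here), section 2 is the
    # downstream (the frozen content of the next issue).
    journal = {"upstream": [], "downstream": []}

    def submit(paper):
        journal["upstream"].append(paper)

    def decide(paper, accepted):
        # Only accepted papers are copied into the downstream section;
        # the upstream copy stays untouched either way.
        if accepted:
            journal["downstream"].append(dict(paper))

    submit({"title": "paper1", "author": "author1"})
    decide(journal["upstream"][0], accepted=False)  # rejection notice goes out
    print(journal["downstream"])                    # [] -> not in the next issue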

Peer review

Let us describe, in the context of the branch model, what peer review is about. If the Open Access journal was created with the described workflow, it consists of two sections: unstable upstream and stable downstream. The interesting point is that this outline doesn't solve problems; it creates many new tasks. One of them is the question which of the papers should be copied into the downstream section. That means, from the perspective of the journal editor the situation is that some papers are available in the upstream section, but it's not clear which of them will fit into the next issue of the journal.

There is more than a single option to address this question. A naive attempt is to use a die and decide with a random generator which of the papers fulfill the quality standards of the journal. A second, more elaborate decision making strategy is that the journal editor decides by himself which of the upstream papers is well suited. And the best practice method is that the journal editor delegates this question to a group of peer reviewers.

Because this point is equal to peer review, it makes sense to describe the process in detail. The starting point is that the journal has two sections (upstream and downstream). To copy a paper into the downstream section, a decision about its quality is needed. This decision is delegated to a group of people. What the group can do in response is to peer review the paper or not. In the worst case the journal editor doesn't find an external peer reviewer, so he has to decide by himself if the paper fulfills the needs of the readers.

But even in this case it's a peer reviewed journal, because a decision was taken. The decision whether a paper fulfills the standards or not is only needed in a two branch model. In a normal repository there is no need to judge a paper.

In the software industry this principle has a long history. In the git tool a so-called branch can be created easily. Creating a branch means copying a folder, and then it's possible to edit the copy without altering the original folder. Sometimes it's called a fork, because the same source code is available in two places at the same time. The interesting point is that after creating a branch, both branches will get out of sync. That means a user can edit branch1 and branch2 isn't affected. This principle is very powerful and allows dividing software development tasks into subproblems.

Branches are used in the open source world for many things. There are feature branches to fix a problem, and there are stable branches to update complete operating systems. What comes very close to a peer reviewed academic journal is a stable release Linux distribution like Debian. It's the same principle. Debian is peer reviewed software; that means the Debian ISO file is different from the Debian upstream branch.

Freezing the upstream

Every open source project starts with an upstream branch. The upstream is a repository which stores the source code on a server. In most cases the upstream is equal to a GitHub folder, but the upstream can also be located on an SVN server or an FTP server. The upstream repository allows the creator of the software to update the project: he can upload new files and alter existing ones. In the case of content, the upstream is equal to a WordPress blog: it's a place on the internet in which information is stored.

The interesting point in open source projects is that apart from the upstream repository a second action is needed, which is called freezing. Freezing means converting the source code in the upstream into a release which can be delivered to the normal user. The interesting point behind freezing is that from the programmer's perspective this step has a low priority. What the software author is trying to do is improve the software with new updates. He isn't interested in stopping this update cycle. The only one who needs a frozen stable release is the end user.

In the Linux ecosystem there is a long-running discussion about whether the normal user needs a frozen version or whether he can use a rolling release version. Rolling release means that no freeze is available and the normal user installs the same version as provided by the upstream. The interesting point is that rolling release was never a success for real projects. All the major software systems like Debian, Windows 10, Red Hat, Apple Mac OS and Android are delivered in a release version which is frozen. So-called nightly build versions are only available as an alternative, but they are not installed on productive systems.

The reason why it makes sense to analyze the open source development model is that the concept of freezing the upstream has been available for many years and is discussed in the literature. It's the best practice method in open source software development. The same concept can be adapted to scholarly paper writing. Freezing a paper is equal to creating a peer review. The shared principle is that the original author of a paper isn't interested in freezing it, because this is equal to losing control over the content.

In reality a peer review is something which works against a paper. A peer review is desired by the readers. A peer reviewed journal acts as an intermediary between authors and readers.

April 15, 2020

Building a modern robot from scratch

The main reason why Artificial Intelligence has failed in real robotics projects in the past is that it is focused on computer science but not on the underlying domain. The untold assumption is that NP-hard problems have to be solved with a certain algorithm which should be implemented in a programming language. After executing the program it will solve a certain AI problem, for example grasping an object with a dexterous hand.

The reason why this assumption won't result in a grasping robot is that nobody knows how the algorithm can solve the task. In contrast to sorting an array, so-called AI tasks have nothing to do with computing itself; they have to do with driving a car, the shape of objects and communicating in natural language.

The better idea for realizing AI systems is to start with a teleoperated robot which is later extended with a database of trajectories. In the first step the human operator controls the robot arm with a joystick. This allows him to grasp an object. In step 2 the pipeline is extended with grounded natural language and a learning-from-demonstration motion database. Both modules are located neither in classical computer science nor in mathematics; they have to do with applications of Artificial Intelligence.

Perhaps it makes sense to go into the details. Suppose the human operator is able to grasp an object with a joystick. In theory he can do so many hundreds of times, but the goal is to transfer the task into software for higher productivity. One important step in this direction is to repeat the same action and record the trajectory. The result is a motion capture database. If the scene in front of the robot fits a recorded scene, the recorded action is reproduced in playback mode. An interpolation between different trajectories will increase the accuracy.

The next step towards advanced robotics is to tag the trajectory database with grounded language. That means the database is annotated with labels like “open gripper”, “close gripper” and “push object”. This makes it easier to search the database. For example, if the next task is about “push object”, an SQL query to the motion database will return all the trajectories from this domain. Then the solver selects some of them and creates the interpolated trajectory which is executed on the robot.
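A minimal sketch of such a labeled motion database, using Python's built-in sqlite3 module; the table layout, the waypoint values and the plain averaging used as "interpolation" are only illustrative assumptions, not a finished control system.

    import json
    import sqlite3

    # Store each recorded trajectory as a label plus a list of (x, y, z) waypoints.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE motions (label TEXT, waypoints TEXT)")
    demo1 = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.1), (0.2, 0.0, 0.2)]
    demo2 = [(0.0, 0.0, 0.0), (0.1, 0.1, 0.1), (0.2, 0.2, 0.2)]
    db.execute("INSERT INTO motions VALUES (?, ?)", ("push object", json.dumps(demo1)))
    db.execute("INSERT INTO motions VALUES (?, ?)", ("push object", json.dumps(demo2)))

    # Grounded language as a query: fetch every demonstration tagged "push object".
    rows = db.execute("SELECT waypoints FROM motions WHERE label = ?",
                      ("push object",)).fetchall()
    demos = [json.loads(r[0]) for r in rows]

    # Crude "interpolation": average the matching demonstrations point by point.
    blended = [tuple(sum(p[i] for p in points) / len(points) for i in range(3))
               for points in zip(*demos)]
    print(blended)  # the trajectory that would be replayed on the robot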

The combination of teleoperated robotics, learning from demonstration and natural language grounding is a powerful technique for realizing robotics projects which can be used in reality. That means the system is not only an academic project to teach students how they should do something; the robot can be used for solving practical tasks.

The reason why this approach is nearly unknown in mainstream robotics and AI is that it's easy and very complex at the same time. The described method combines artifacts from different domains: it has to do with motion capture (which is used in movie production), with grounded language (which is used in natural language processing) and with spline interpolation, which is located in regression analysis. Combining all these subjects into a single project is not common in normal computer science. What computer scientists have done in the past is solve a single problem. For example, they want to search a database for a value. This limited problem is analyzed in depth, and the algorithm is created in a high level programming language. Unfortunately, this problem solving strategy fails in AI domains.

A good starting point for all sorts of AI applications are teleoperated robots. Teleoperation means that the machine has human-level capabilities by default. The idea is that a human operator is in charge of the system all the time. He is not allowed to leave the joystick, because then the robot will fail to solve the task. If this teleoperated paradigm is working, the next step is to think about how to reduce the workload of the operator, so that he can control the robot hand more easily and relax a bit.

Trajectory replay

The interesting effect of a trajectory replay is that on the first trial it won't work. If the robot repeats the prerecorded trajectory in a new situation, it isn't able to reach the goal. But this failure doesn't show that the idea is wrong; it shows that trajectory replay isn't the answer to the problem, but the problem itself. The question is how to program a trajectory replay system which can adapt to different situations. Learning from demonstration is the kind of challenge which has to be addressed with modern algorithms.

What is the current status of commercial robotics?

Artificial Intelligence is, from a technical point of view, a complex domain. There are lots of papers with theories about neural networks and all sorts of robot control systems available. Even experts are not sure which of them are well written papers and which provide boring information. A better idea is to judge the status of robotics by taking a look at commercially available products. Some brand names are available, like Agility Robotics, Boston Dynamics, Waymo and Moley Robotics. They have in common that more than a single YouTube video is available in which the engineers demonstrate what is possible today.

The latest series from Agility Robotics is able to walk on two legs, can climb stairs and can hold a box in its hands. The Waymo car is able to drive on the street alone, while the latest robot from Moley is able to cook a meal. All of the videos have something in common: they are very new, which means they were uploaded in the last 2 years, and all of them show robotics not available before. That means the technology has made big progress and it seems that all the difficulties are solved.

What will happen if only the robots from the mentioned companies are used in reality? Lots of human work can be done with these machines. There is no need for human truck drivers, for a human cooking chef or for a human postal service. The main problem with this development is that it is much faster than even experts have assumed. And what will happen in 2 years or even 4 years from now? Under the assumption that the trend is valid, the robots get improved and the result is that the robots in reality have more skills than the robots from the movies.

The interesting situation is that the current society isn't prepared for this kind of technology. What is expected by science fiction authors is that in around 30 years from now some small progress is made towards fully autonomous production. But if the revolution arrives within 2 years, it will come too fast for the world. The only hope for critics of technology is that the videos of Boston Dynamics and Agility Robotics are fake. That means the robots perform great in a staged scenario but struggle on real tasks. This would make it impossible to use robots under real life conditions.

Whether robots are useful in practical applications can be analyzed by monitoring two parameters: first, the price of goods, and second, how important human work is for a society. If a robot revolution is there, the price of goods will drop towards zero and human work becomes a low priority, because all the work is done by robots which work at lower cost. If the engineers struggle to introduce robots in the real world, the price of goods remains constant and human labor can't be replaced by machines.

What is a fully automated factory?

There is a myth about the so-called perpetual motion machine. That is a wonder machine which works without interruption after pressing the on button. Most engineers have come to the conclusion that such a machine would violate the physical laws, or to be more specific the laws of thermodynamics. What the engineers are not aware of is that perpetual motion machines were not invented as real machines; they are the subject of stories about automation.

The concept is about a technology which doesn't need human work but runs without human labor. The question is not how to build such a machine within the laws of physics; such a machine has to fit into an economic context. To be more specific, a perpetual motion machine is something which the owner of a factory likes to buy to reduce the costs. What the owner of a factory is trying to achieve is to produce a maximum output with a minimum amount of work. This will increase his profit.

What is available in reality are examples of factory automation. The typical machine needs a lot of energy and repeats the same task over and over again. A printing machine is an example, but a pizza making assembly line is also a good example. From a physical standpoint these machines are the opposite of a perpetual motion machine, but from an economic standpoint they are one. What automated factories have in common is that the costs for the factory owner are low, and at the same time the factory produces lots of pizzas, each of which costs nearly nothing. It's surprising at what low price it's possible to produce goods if all the steps are fully automated.

Suppose a fully automated pizza line is installed in a factory and the raw materials are available. From the user's perspective such a device is a miracle. The user can enter how many pizzas he likes to eat, and after entering the number of 100k the start button is pressed. The machine won't stop until all the units are created. No further interaction is needed. The most interesting point is that such a workflow is available in reality. That means real pizza making factories can be visited, and they are used to produce food for the population.

To understand fully autonomous machines better, we have to ask about potential bottlenecks. What all these devices have in common is that they need something as input: electricity plus raw materials. If no energy is available and no cheese is there, the machine won't work. The interesting point is that these input materials are endless. Producing energy at low cost is an easy task, and producing tons of cheese is also a solved task. If the production of raw materials is combined with fully autonomous assembly lines, the result is a fully automated economy. Literally spoken, such a system can produce an endless amount of goods for zero cost.

April 12, 2020

OpenRA servers hit new record in numbers of players

The OpenRA game has been available for many years. It's an open source clone of the famous Command & Conquer series. What makes the software interesting is that it runs under Windows as well as Linux. In contrast to other real time strategy games, no costs are charged to the users. For two weeks now the number of players on the servers has been hitting new records. In the past it was difficult to find enough players to fill a map, but for a while now this has not been a problem.

Today there are more than 300 players at the same time on the servers, and new games are starting all the time. Technically this was possible in the past as well; what was missing in the year 2019 was a larger number of players. It seems that some newbies have discovered the game and are participating in the matches. On the website there is a statistic available https://www.openra.net/players/ which shows the increase of active players since April 2020.

Perhaps it makes sense to introduce the game itself. What the user has to do is manage a large number of units at the same time. In contrast to a simple jump'n'run game, there is not only a single character on the screen; the user is in charge of 50 and more units at the same time. It's some sort of chess, but much faster. In the typical 4 vs. 4 match the situation becomes chaotic very soon. That means each player is in control of 50 and more units, and the map shows hundreds of sprites at the same time which are doing something or not.

The OpenRA game has much in common with World of Warcraft. The difference is that the graphics are only 2d and no background story is told. The result is that the game engine itself is very small: the .exe file needs only 18 MB on the hard drive, which includes all the graphics, the multiplayer mode and even a replay mode to analyze games played in the past.

April 08, 2020

From a burnout society to an open society

In the sociology literature it was observed that modern societies are affected by the burnout problem. If a burnout doesn't affect a single individual but a larger group, it's called a burnout epidemic. That means the situation is out of control and the question is how to handle the stress level of the group.

A naive assumption is that the problem of stress, burnout and burnout epidemics will disappear without external intervention, and that in the future the tasks for the individual will become easier to solve, not more complicated. A look into reality shows that since the advent of the Internet and the demanding complexity of the macro economy, the problems become bigger but never smaller. That means future societies will be affected more by the stress problem, not less.

The good news is that an answer to the situation is available. It was first introduced for the software industry but can be adapted to other domains as well. The answer is to transfer existing processes into open processes. Instead of creating proprietary software, the idea is to write open source software. Instead of creating paywall-protected academic papers, the idea is to publish papers as Open Access. And instead of managing a society as a closed society, the better idea is to establish an open society culture.

But what does the term Open mean in reality? First and foremost it has to do with a different relationship between the consumer and the producer of a good. From an economic standpoint, open source software is an example of a consumer-first ideology. The consumer, which is the end user of the software, gets the latest security updates and the most advanced software, and he doesn't have to pay anything for it. All the Debian users who have installed the software on their PC never pay anything in return. They simply get the latest Linux kernel, the LaTeX tool and the powerful GIMP graphics program, and they have to provide nothing in exchange. The same is true for the Open Access ecosystem, in which the end user can read as many high-quality PDF papers on Google Scholar as he likes without paying a single cent.

On the other hand, somebody has to produce all these goods. The open source software has to be written, and the PDF papers too. The interesting point is that this problem is left to the individual producer and isn't managed by a company.

Today, the software industry and academic content creation are the only domains in which the term Open has been introduced. In all other domains of the economy, for example logistics, the retail industry and the medical sector, the paradigm is focused on the classical closed economy model. Closed economy means that the consumer of a good has to pay the price, and the producer of the good gets a monthly salary. There is a reason why open source has been available since the 1980s while open logistics has not: software can be distributed over the Internet, but logistics services cannot. With so-called telerobotics this can be changed. Suppose there is a truck which can be controlled remotely. There is no need for the driver to be physically located in the truck; anybody who has access to the Internet is able to control it. Under such constraints it's possible to manage the task of cargo transportation as a game. Very similar to writing software, it can be handled with the open paradigm. From the consumer's perspective the situation is pleasant. Suppose a consumer needs a load transported from A to B. Similar to all open services, he never pays a price for the task; he asks if somebody is able to do it for free.

This sounds a bit uncommon, so let us take a step back into the domain of the software industry. The current situation for end users who are familiar with Linux is that they are searching for a piece of software, for example a file manager, and the additional criterion is that this software needs a GPL license. All proprietary file managers are ignored by the consumer. He defines the GNU license as mandatory.
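
In code, this consumer-side filter is nothing more than a license check over a list of candidates. The package names and license strings in the following Python sketch are hypothetical examples, not the result of a real repository query.

# Hypothetical list of candidate file managers; names and licenses are
# illustrative only, not a real package database.
candidates = [
    {"name": "filer-a", "license": "GPL-3.0"},
    {"name": "filer-b", "license": "proprietary"},
    {"name": "filer-c", "license": "GPL-2.0"},
]

# The consumer treats a GNU license as mandatory and ignores the rest.
acceptable = [pkg["name"] for pkg in candidates
              if pkg["license"].startswith("GPL")]
print(acceptable)   # ['filer-a', 'filer-c']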

Now it's easier to imagine what future consumers of logistics services will do. They define as a criterion that the transportation has to be handled without any costs. It's up to the provider how he reduces his costs down to zero. If somebody isn't able to do so, he doesn't get the task.

Today's economy isn't powerful enough for the open society ideology. If somebody asks for a truck for free, he won't find a single example. But with advanced technology, and especially with remote control of machines, it will be possible to realize this ideology in the future. A possible toy example is a cheap drone which is controlled over the Internet, with the advanced feature that the consumer doesn't have to pay for the service; it's financed with advertisement or something else, similar to how Open Source and Open Access are financed.

Open society means extending the terms Open Source, Open Access and Open Science to society in general. That means that all services which are offered are provided without cost to anybody. This sounds a bit like socialism, but it's the opposite. It can be described as an advanced form of capitalism in which the stress level is increased.

Language patterns in crisis communication

In role playing games, video games and online forums there is a special sort of communication pattern. It's a conflict-laden language which is escalated by individuals. The interesting point is that conflicts and crisis communication are the gold standard for mastering the game, and suppressing such a language style won't help in solving real problems.

Let us imagine what would happen if Stack Overflow prevented downvoting of existing answers, if Wikipedia stopped banning newbies, and if the participants of a video game were only allowed to send greetings to their team players but not allowed to criticize each other. This amounts to an anti-crisis communication. That means no problems exist and no conflicts have to be solved. Such a situation is equal to not playing the game at all.

All the issues on Stack Overflow, all the edits in Wikipedia and all the existing multiplayer games on the Internet have to do with solving problems. That means at first there is some sort of issue, and different users have to interact to solve it. They do so with a crisis communication which is equal to a panic mode. Players who have learned to use such a language are able to become successful in such a game, while players who are not able to cope with the stress get excluded from the game or resign by themselves.

The best example is perhaps the Wikipedia game. It's a website in which the users are creating an encyclopedia. Everybody who is familiar with Wikipedia will describe the situation in the talk sections as stressful. Wikipedia-internal conflicts are created and solved with a panic-based natural language. In the easiest case, an admin comes to the conclusion that an edit of a newbie doesn't make sense. But long-term Wikipedia editors criticize each other in the same tone. Does this crisis communication show that Wikipedia has failed and the project will become obsolete within 2 months? No, it's the opposite, because at the same time the Wikipedia articles which are presented as a frontend to the reader have a higher quality than ever.

That means a stable communication system consists of crisis communication and a relaxed appearance at the same time. Let us observe a conflict-ridden computer game from the outside. The different players in the game are communicating against each other. They are escalating conflicts and are not motivated to lower their voices. At the same time, the game is running great. That means the experts are playing the game and success is guaranteed. This sort of mixed impression can be found in all complex group-oriented games. For example, on Stack Overflow thousands of downvotes and negative comments are posted each day. At the same time, the answer quality of the website is high. That means if somebody has a problem with programming in a certain language, he will very likely find the answer on this single website.

The reason why conflicts and complex problem solving belong together has to do with asymmetric information. The typical situation in Wikipedia is that user1 is an expert for a domain while user2 is not. From a technical point of view, both users are not able to work together, because their knowledge doesn't fit together. The same is true for most multiplayer online games on the Internet. Player1 is a newbie, player2 is an expert, and they have never played this game together before. The result is that they don't understand each other. The interesting thing is that the game starts despite this bad situation. The result is that during the game the users communicate wrongly and make mistakes. After recognizing the mistakes, they blame each other for not playing well enough.

This situation isn't caused by a certain player; it's the general pattern for all online games and for all online forums. The starting situation is that, from an objective point of view, the newly created group isn't prepared and shouldn't work together. But this is never a barrier. No matter which players join a game server, the game will start in every case.

The reason why so many conflicts arise is that the players are different. The conflicts become greater if the backgrounds of the players don't fit together. The users argue about the same subject from very different points of view, and the conflict is a clearing mechanism for negotiating with each other, especially in a complex domain.

To speed things up, it makes sense to assume that in every multiplayer video game conflicts will become visible, and the only question is how the group will solve them. Solving the conflict means that the individual needs are matched to the needs of the group. For example, a successful interaction with Wikipedia means that an individual is allowed to post something and at the same time the Wikipedia project profits from it. It's some sort of win-win situation.

If a group or an individual struggles to solve issues, it becomes a lose-lose situation. That means the user's edit gets rejected and at the same time Wikipedia loses an important contributor who won't contribute anymore.

April 01, 2020

Recent developments in the SE.AI website

The SE.AI website https://ai.stackexchange.com/ is the dominant AI-related online forum on the Internet. It contains 6k questions and is part of the larger Stack Exchange network. For a while now, major changes have been taking place on the website. The situation in the past was that the moderator played a minor role. He didn't post many comments or answers; his role was of an administrative nature. He was some kind of technical administrator but wasn't involved in running the website.
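
The question count can be checked against the public Stack Exchange API. The sketch below assumes the /2.3/info route with the site parameter "ai" returns an "items" list containing total_questions and total_answers, and that responses may arrive gzip-compressed; it is an illustration, not part of the site itself.

# Sketch: query the public Stack Exchange API for site statistics of
# ai.stackexchange.com (assumed route and field names, see lead-in above).
import gzip
import json
import urllib.request

url = "https://api.stackexchange.com/2.3/info?site=ai"
req = urllib.request.Request(url, headers={"Accept-Encoding": "gzip"})
with urllib.request.urlopen(req) as resp:
    raw = resp.read()
    # The API usually compresses its responses; decompress if needed.
    if resp.headers.get("Content-Encoding") == "gzip":
        raw = gzip.decompress(raw)

stats = json.loads(raw)["items"][0]
print(stats["total_questions"], "questions,", stats["total_answers"], "answers")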

Since 2019 the situation has changed drastically. The new agenda on SE.AI is that the moderator is the top-rated user in the forum. That means the current moderator has posted the most answers and has earned the most reputation points. In other words, the moderator knows the most about Artificial Intelligence and at the same time he is solving conflicts in the community.

Before we can judge this development it makes sense to describe this management style from an abstract point of view. It's equivalent to electing the best player in a soccer team as the team leader. The result is that the team leader is the weak point in the overall system. Let us construct an example. Suppose a difficult question is asked in the forum. Only the moderator is able to provide the answer, because he has the most experience of all the users. The other users in the community have a weaker position and not enough skills in Artificial Intelligence. That means knowledge and power are distributed unequally.

This management style has some advantages but also disadvantages. The advantage is that it minimizes the conflicts in the team. The moderator is accepted for two reasons: first, his social role is strong, and secondly, his knowledge is strong. The disadvantage is that a moderator-driven community is vulnerable to a takeover. If the single point of failure makes a mistake, the entire group gets into trouble. The second problem is that the stress level for the moderator is higher. To defend his strong position he has to create the most postings and has to know everything. The danger is that the moderator won't be able to do this in the long term.

In management theory two different principles are discussed: top-down moderation and bottom-up moderation. Top-down moderation is what SE.AI uses. It's the classical form of group organization. The group has a strong internal cohesion but fails to adapt to the environment. The team is fixed, new members aren't welcome, and the moderator is not allowed to leave the group.

It's a bit difficult to predict the future development of SE.AI. One option is that the moderator is able to handle the disadvantages of the top-down management style and remains in an active position for the next 10 years. The second option is that the group isn't able to adapt to future needs, for example that a request from the outside isn't answered correctly. Basically, SE.AI is running an experiment to investigate whether top-down leadership works in reality.