November 20, 2024

Uptime with Obstacles

It was a perfectly ordinary Wednesday morning in the IT department of MegaCorp when Kevin, the freshly minted intern, burst through the door with a broad grin and a paper cup full of energy drink.

"Morning, Herr Schmidt! I've got a killer idea!" he called out to the sysadmin, who was just rummaging through his sacred card index.

Herr Schmidt, a man with more years of service than Kevin had years of life, peered skeptically over his reading glasses. "Good morning, Kevin. What's so urgent?"

Kevin hopped excitedly from one foot to the other. "So, I did a bit of research last night and found out that our WLAN router is totally outdated! It's been running for 500 days without a reboot. Just imagine what we're missing out on!"

Herr Schmidt raised an eyebrow. "And that's bad because...?"

"Well, because there's an update! OpenWRT has released a new version. With it we could surely speed up the internet by at least 5%!"

The sysadmin sighed deeply. "Kevin, my boy, if something has been running for 500 days without problems, you don't touch it. That is the golden rule of IT."

But Kevin wouldn't let it go. With the persuasiveness of a used-car salesman and the tenacity of a terrier, he finally managed to talk Herr Schmidt into it.

"Fine," the sysadmin grumbled, "but only because hardly anyone is in the office today anyway. And you're doing the backup!"

Kevin beamed from ear to ear. "Sure, already prepared!"

Together they got to work. Kevin typed like a champion while Herr Schmidt nervously sorted his index cards.

"There, now just reboot and-" POOF! The screen went black.

Silence.

Kevin laughed nervously. "Oops, just a loose contact. One moment..."

Five minutes later: still black.

Herr Schmidt turned pale. "Kevin, what have you done?"

"Don't panic! I'll just quickly try something else." Kevin hammered wildly on the keyboard.

An hour later the router was still offline. Kevin was sweating and muttering something unintelligible while Herr Schmidt stoically leafed through his index cards.

"Um, Herr Schmidt? I think we have a problem."

The sysadmin looked up. "We?"

At that moment the telephone rang. It was the CEO. "Schmidt! Why the hell isn't the internet working?"

Herr Schmidt shot Kevin a meaningful look. "Well, sir, let's put it this way: our intern had a 'killer idea'..."

As the workday came to an end, Kevin was still sitting in front of the dead router, surrounded by empty energy drink cans. Herr Schmidt patted him on the shoulder.

"Lesson learned, my boy?"

Kevin nodded sheepishly.

"Good. Tomorrow I'll show you how to update a router properly. Now off you go home, but take the back exit. The CEO is waiting out front with a pitchfork."

As Kevin slunk away, Herr Schmidt muttered: "500 days of uptime. Damn, that was a good run."

He pulled out an index card and noted: "20.11.2024: Interns are like updates. Sometimes they bring progress, mostly chaos."

With a smirk he slipped the card back into the box. It was time to go home and have a beer. Or ten.

November 19, 2024

The Overeager Intern

Markus, a 22-year-old computer science student, started his internship at TechSolutions GmbH full of enthusiasm. Armed with the latest knowledge from university, he was determined to prove his skills.

On his third day he noticed an OpenWRT router with an impressive uptime of 400 days. Fascinated, he turned to Klaus, the experienced sysadmin.

"Klaus, have you seen this? The router has been running for over a year without a reboot!" Markus exclaimed excitedly.

Klaus nodded calmly. "Yes, the old darling. Runs like clockwork."

Markus frowned. "But that means it hasn't received any security updates for over a year. Shouldn't we update it urgently?"

Klaus sighed. "Markus, in theory you're right. But this router is the heart of our network. As long as it runs, we don't touch it."

But Markus wouldn't let it go. For days he bombarded Klaus with articles about security vulnerabilities and the advantages of the latest firmware. Eventually Klaus gave in.

"Fine, you pest. We can try it tonight after work. But I'm warning you, if anything goes wrong..."

Markus could hardly believe his luck. At 6 p.m. they met in the server room. With trembling hands, Markus downloaded the new firmware.

"Last chance to abort," Klaus warned.

"Everything will be fine!" Markus assured him and clicked "Start update".

The minutes crawled by. The progress bar moved agonizingly slowly. At 99% Markus held his breath.

Suddenly the screen went black.

"Uh, Klaus? Is that normal?" Markus asked in a trembling voice.

Klaus' expression did not bode well. "No, it is not."

They tried everything: reboot, reset, even a prayer to the IT gods. Nothing helped. The router stayed silent.

At midnight Klaus gave up. "That's it, kid. The router is bricked. Tomorrow morning we're in for a hell of a ride."

Markus slumped. "I'm so sorry, Klaus. I only wanted to help."

Klaus put a hand on his shoulder. "I know. But sometimes the risk of an update is greater than the potential benefit. You learn that with time."

The next morning, chaos reigned. Without the central router, half the network was down. Markus spent the day apologizing to everyone while Klaus tried to improvise an emergency solution.

When the workday was finally over, the CEO called Markus into his office. He entered the room with his head down, certain that his internship was over.

To his surprise, the CEO smiled. "Markus, I heard what happened. It was a costly mistake, but one we can all learn from. We are going to rethink our update processes and create an emergency plan. And you are going to help with that."

Relieved and grateful, Markus nodded. He had learned a valuable lesson: in the world of IT, caution sometimes beats blind faith in progress.

Introduction to the frozen release model

 


The term isn't widely used; what is more common are rolling release systems like Arch Linux and stable release distributions like Debian. In technical terms, a frozen release means that the end of life (EOL) falls on the same date as the deployment of the software.
Let me give an example: a new embedded device is sold to the market on day 1, and the vendor declares that the EOL for this product is also day 1. That means the product will never receive any updates, neither security fixes nor feature improvements.
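As a minimal sketch (in Python, with made-up device names and dates), the definition can be expressed directly in data: the only thing that distinguishes a frozen release from a stable one is that the EOL date equals the deployment date.

from datetime import date

# Sketch only: a device record whose release model is encoded purely by its dates.
# In a frozen release, eol == released, so an update window never exists.
class Device:
    def __init__(self, name, released, eol):
        self.name = name
        self.released = released  # date the software is deployed/sold
        self.eol = eol            # last date on which updates are provided

    def can_receive_update(self, today):
        return self.released <= today <= self.eol

frozen = Device("IoT sensor", date(2024, 11, 20), date(2024, 11, 20))
stable = Device("Debian box", date(2023, 6, 10), date(2028, 6, 10))

print(frozen.can_receive_update(date(2024, 11, 21)))  # False: EOL was on day 1
print(stable.can_receive_update(date(2024, 11, 21)))  # True: inside the support window
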
The open question is whether such a strategy is a best practice or an anti-pattern. There are many real-world examples of frozen release models, like Slackware Linux (which has no dedicated package manager), FreeBSD (similar to Slackware), and, most importantly, the majority of existing embedded devices, which operate as frozen releases without stating it explicitly.
From a technical standpoint a microcontroller often has no update mechanism at all: the installed software is a slimmed-down version of an operating system with no ability to change itself. In many cases the software is stored on ROM chips which cannot be modified. The only way to update such an embedded device is to physically replace the hardware.
It is a fact that such systems are not the exception but the normal situation in today's IoT landscape; billions of such devices are in use. There are two possible ways to frame the situation. The first assumption is that computer security is only possible through frequent software updates. This is the Arch Linux philosophy, which implies that the system becomes more stable with every update. The counter-argument is that the frozen release is the strategy that is here to stay, meaning that existing embedded systems which cannot be updated are more stable than Arch Linux and other rolling operating systems.
The mismatch between frozen release and rolling release has to do with the difference between software development and the production environment. Software development works with a rolling release model: a given git repository is modified every day by the programmers, who edit the source code, introduce new features and remove other parts. These source code repositories are useless for the end user, who only needs the executable binary that provides useful functionality. In the case of Debian-like stable release management, the gap is bridged with a fixed schedule: every two years a copy of the source code is taken, compiled into binary files and deployed to the end user. That means Debian users won't get new upstream versions between two releases.
In a frozen release model, the freeze duration is extended indefinitely: the production system stays static and only the upstream source code keeps changing. In the case of an embedded system the software is fused with the hardware, so the only way to get a new software version is to replace the hardware.
The interesting point is that it is possible to design frozen release software in a secure way. The developers need to be aware that no further changes will be possible, which means they have to write the software with security in mind and bug-free from day 1. The software is then deployed much like a ROM cartridge in the 1980s. This doesn't mean that the software is unsafe; it simply means that a different development model was applied.

November 11, 2024

Classical AI programming until 2010

Before the advent of large language models, grounded language and interactive Turing machines, there was a different understanding of how to program intelligent computers. This naive paradigm deserves to be explained in detail, because it helps to identify the potential bottlenecks of outdated Artificial Intelligence research.
Until the year 2010 the shared assumption was that artificial intelligence is a subdiscipline of existing computer science and could be categorized with the established terminology of software, hardware and algorithm design. A practical example was a chess-playing artificial intelligence. The typical powerful chess program until 2010 was written in the C language for performance reasons, implemented a powerful alpha-beta pruning algorithm, ran on a multicore CPU and was able to figure out the optimal move by itself.
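To recall what that paradigm looked like in code, here is a minimal sketch of alpha-beta pruning (in Python rather than C, over a generic game-tree interface, purely for illustration; a real engine would plug in its own move generator and evaluation function):

import math

def alpha_beta(state, depth, alpha, beta, maximizing, children, evaluate):
    # children(state) yields successor states, evaluate(state) scores a
    # position from the maximizing player's point of view.
    succ = list(children(state))
    if depth == 0 or not succ:
        return evaluate(state)
    if maximizing:
        value = -math.inf
        for child in succ:
            value = max(value, alpha_beta(child, depth - 1, alpha, beta, False, children, evaluate))
            alpha = max(alpha, value)
            if alpha >= beta:   # beta cutoff: the opponent will never allow this branch
                break
        return value
    else:
        value = math.inf
        for child in succ:
            value = min(value, alpha_beta(child, depth - 1, alpha, beta, True, children, evaluate))
            beta = min(beta, value)
            if beta <= alpha:   # alpha cutoff
                break
        return value

# Tiny demo tree: internal nodes are lists, leaves are scores.
tree = [[3, 5], [2, 9]]
best = alpha_beta(tree, 2, -math.inf, math.inf, True,
                  children=lambda s: s if isinstance(s, list) else [],
                  evaluate=lambda s: s)
print(best)  # 3: the maximizer picks the subtree whose worst case is largest
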
The only difference between chess programs was which preference was chosen within this category space. For example, some older chess programs were written in Pascal instead of C, which was perceived as less efficient, and sometimes a new algorithm was implemented that could search a larger time horizon into the future. 99% of the chess programs were built within such a paradigm, because it was the only available discourse space for talking about artificial intelligence and about computer chess in detail.
The reason why a certain chess engine was more powerful than another lay in improved technology, which might be a faster programming language, an improved algorithm, or better hardware. All these criteria belong to computer science; it is the same vocabulary used to describe other computer science topics like databases, operating systems, or video games. So we can say that artificial intelligence until 2010 was seen as a detail problem within computer science, one concerned with hardware, software and algorithm design.
Surprisingly, the discourse space changed drastically after the year 2010. A look at recent papers published from 2010-2024 shows that a different terminology was introduced for Artificial Intelligence research. Performance criteria from the past, such as a fast programming language like C or a fast multicore CPU, seem to be no longer relevant. Many AI-related projects have been realized in Python, which is known as a very slow programming language not tailored for number-crunching problems. Also, many AI papers introduce multimodal datasets which have nothing to do with computer science at all but originate in statistical computing. So we can say that AI projects after the year 2010 try to overcome the limitations of computer science in favor of external disciplines like computational linguistics, biomechanics and even the humanities. This shift in focus has introduced a lot of new vocabulary. For example, if a computer is used to create a motion capture dataset of a walking human, a certain biomechanical vocabulary is used which has nothing to do with computers.
 
A possible explanation why the classical computer-science-oriented description of AI topics has fallen out of fashion is that it failed to solve any notable problem. Even for board games, a highly optimized C program cannot traverse the full state space once the branching factor grows; chess and Go are the classic examples. And for more advanced problems such as kinodynamic planning in robotics, a classical focus on programming languages and faster hardware does not solve anything.
Classical computer science, which works within a discourse space of hardware, software and algorithms, handles only trivial problems, not the more demanding robotics problems, which are typically NP-hard. It seems that the AI problems themselves were the reason why the classical computer-oriented discourse space fell out of fashion and was replaced with a different paradigm oriented toward non-computational requirements. Since the advent of large language models, this new discourse space amounts to chatbot-driven AI.

November 05, 2024

Development time for programming a video game

 

Comparing programming languages is traditionally done by measuring machine performance. The usual questions are how small the binary file is and how fast it runs on the hardware. Sometimes it is asked how good the source code looks, but this depends on individual preferences.
A more practical approach is to compare how long it takes to program the same software in different languages. A rough estimate for coding a Pong-like video game, including graphics, is:
  • Assembly language, 14 days
  • Forth language, 7 days
  • C language, 3 days
  • Python, 1 day
Such a table might explain why one language is popular while another is not. In the 1980s the first two languages on the list were very popular; for many 8-bit computers there was no alternative to Assembly and Forth. Both languages need only a small amount of RAM and work fine without any hard drive. In the 1990s the C language became the standard for programming, not because C is better than Assembly but because it takes less time to write a program. A good starting point for creating a Pong-like video game in C is to use an existing graphics library like SDL, which allows the project to be finished in around 3 days. Since the year 2010 the Python language has become more popular. This might be a surprise from a technical perspective, because Python is known for its slow performance, especially in game development.
Nevertheless, Python allows complex software to be created in less time. Finding bugs in a Python program is much faster, and the source code is dramatically shorter than the C counterpart. It is not very hard to predict what will replace Python in 10 years: large language models, which require only a text prompt as input and can generate the game by themselves. This will reduce the time needed to program a new game further, down to hours.
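To make the one-day estimate concrete, here is a minimal sketch of a Pong-style skeleton in Python using pygame (the SDL binding for Python); the structure is only an assumption of how such a prototype could look, with scoring, a second player and sound left out:

import pygame

# Minimal Pong-style skeleton: one paddle, one bouncing ball.
pygame.init()
screen = pygame.display.set_mode((640, 480))
clock = pygame.time.Clock()

paddle = pygame.Rect(20, 200, 10, 80)
ball = pygame.Rect(320, 240, 10, 10)
ball_vx, ball_vy = 4, 3

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False

    keys = pygame.key.get_pressed()
    if keys[pygame.K_UP]:
        paddle.move_ip(0, -6)
    if keys[pygame.K_DOWN]:
        paddle.move_ip(0, 6)
    paddle.clamp_ip(screen.get_rect())      # keep the paddle on screen

    ball.move_ip(ball_vx, ball_vy)
    if ball.top <= 0 or ball.bottom >= 480:
        ball_vy = -ball_vy                   # bounce off top/bottom walls
    if ball.colliderect(paddle) or ball.right >= 640:
        ball_vx = -ball_vx                   # bounce off paddle or right wall
    if ball.left <= 0:
        ball.center = (320, 240)             # ball missed: reset to the middle

    screen.fill((0, 0, 0))
    pygame.draw.rect(screen, (255, 255, 255), paddle)
    pygame.draw.rect(screen, (255, 255, 255), ball)
    pygame.display.flip()
    clock.tick(60)

pygame.quit()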

October 31, 2024

Chronological notetaking and its problems


The easiest way to take notes is to append new information at the end. There is no need for a ring binder or note cards; a normal notebook works fine. A possible entry would be "date: March 10, headline, body of text". The note is written down on paper and can be read again at any later time.

There is a big problem with this note-taking technique: chronological notetaking scrambles the notes in the long run. Notes about mathematics follow notes about history and language, and the only sorting order available is the date. This makes it hard to read the notes again. Let me give an example.
Suppose the user has written down, over a span of two weeks, multiple notes on different subjects and now wants to retrieve all the notes about math. The problem is that the math notes are scattered across different sections of the notebook. Some math notes were written down on March 10, others are from March 16, and the notes in between have nothing to do with prime numbers but belong to different subjects.
Even if the notes themselves are well written, it is practically impossible to retrieve them because they are sorted only by date, not by topic. To sort notes by topic, a different principle than simply appending at the end is needed: some sort of ring binder or Zettelkasten. With such a tool it is possible to add new math notes right after the old math notes.
The main advantage is that it becomes much easier to read the notes again. All the math notes are collected together, and the user doesn't have to read the entire notebook, only the math section. This makes it likely that the notes are consulted multiple times. The disadvantage of ring binders and card catalogs is that creating the notes is more demanding: before a new note can be added to the system, the correct position has to be located, which slows down the writing process.

October 22, 2024

Pipeline for developing chatbot-based robotics

Since the introduction of large language models in the year 2023, the term "artificial intelligence" has acquired a very clear definition: AI simply means that the user interacts with a chatbot in natural language. The chatbot can answer questions, draw pictures, and also control a robot.

From the user's perspective, the software is surprisingly easy to explain. The user enters a sentence like "open left hand" and submits the text to the chatbot. The chatbot parses the sentence and executes the command, which means the robot will indeed open its hand. More complex actions, like moving to a table and washing all the dishes, are the result of more advanced text prompts that contain a list of sub-actions.

The unsolved issue is how to program such advanced chatbots in software. Before a user can talk to a robot this way, somebody has to program the chatbot first, which can be done in C++, Java, and so on. The basic element of any chatbot isn't a particular programming library like nltk, and it isn't a particular operating system like Windows or Unix; the essential building block is a dataset. Or, to be more specific, a dataset which maps language to perception and language to action.

Such a mapping is realized with multiple columns, because the dataset is always a table. In the simplest example the table consists of images in the first column and nouns in the second column:

[picture1.jpg], apple
[picture2.jpg], banana
[picture3.jpg], table
[picture4.jpg], spoon

The chatbot software uses such a dataset to understand a text prompt from the user. For example, if the user types "take apple" into the textbox, the word apple is converted into [picture1.jpg], and this picture makes it possible to find the apple with the camera sensor.

More complex interactions, like generating entire motion sequences, are handled the same way. A verb like "grasp" is converted into a motion capture trajectory with the following table:

[trajectory1.traj], open
[trajectory2.traj], grasp
[trajectory3.traj], standup
[trajectory4.traj], sitdown
[trajectory5.traj], moveto

Let me give a longer example to make the point clear. Suppose the user enters the command "moveto table. grasp apple". This command sequence is converted into:
1. [trajectory5.traj], moveto
2. [picture3.jpg], table
3. [trajectory2.traj], grasp
4. [picture1.jpg], apple

In the next step of the parsing pipeline, the referenced jpeg images and .traj files are converted into search patterns and motion pipelines. This allows a sentence to be converted into robot actions.
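A minimal sketch of this lookup-and-parse step in Python (the file names and tables are the ones from the example above; the whitespace-and-period tokenization rule is an assumption made purely for illustration):

# Hypothetical sketch of the lookup step: verbs map to trajectory files,
# nouns map to image files, exactly as in the two tables above.
VERBS = {
    "open": "trajectory1.traj",
    "grasp": "trajectory2.traj",
    "standup": "trajectory3.traj",
    "sitdown": "trajectory4.traj",
    "moveto": "trajectory5.traj",
}

NOUNS = {
    "apple": "picture1.jpg",
    "banana": "picture2.jpg",
    "table": "picture3.jpg",
    "spoon": "picture4.jpg",
}

def parse_command(prompt):
    # Convert a prompt like "moveto table. grasp apple" into (file, word) pairs.
    plan = []
    for sentence in prompt.split("."):
        for word in sentence.split():
            if word in VERBS:
                plan.append((VERBS[word], word))
            elif word in NOUNS:
                plan.append((NOUNS[word], word))
    return plan

print(parse_command("moveto table. grasp apple"))
# [('trajectory5.traj', 'moveto'), ('picture3.jpg', 'table'),
#  ('trajectory2.traj', 'grasp'), ('picture1.jpg', 'apple')]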

There are multiple techniques for programming a chatbot in detail. It is possible to use ordinary programming languages or more advanced deep neural networks. What these methods have in common is that they always require a dataset in the background. Somebody has to create a table of pictures of objects and annotate the pictures, and a dataset with mocap data is needed as well. Such a dataset allows a chatbot to convert a short sentence into something meaningful. Meaning amounts to a translation task from a word to a picture and from a word to a trajectory.

So we can say that the chatbot is the frontend of an AI while the dataset is the backend.

October 11, 2024

Introduction to grounded language

Roughly translated, the term means something like guided or structured language. The idea is to describe reality with the help of a questionnaire in order to increase machine readability. Here is an example:

Suppose a traffic count is to be carried out. In the simplest case you start without much preparation and keep a tally sheet. The better alternative is to design a form before counting, in which variables are defined that describe exactly what is being counted. The following variables are suitable for recording car traffic:
Direction of travel: from the left / from the right
Car color: black / blue / green / red / other
Vehicle type: car / truck / bus / other
Time of day

Such a counting form is far more detailed than a simple tally sheet because it yields many valuable details about the traffic. The statistical variables listed above are exactly what grounded language is. Natural language is used in such a way that a vector space emerges within which something can then be counted. At the end of the traffic count you can say how many red cars came from the left or how many trucks used the road in total.
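A minimal sketch of such a counting form in Python (the categories are the ones listed above; the recorded observations are made up for illustration):

from collections import Counter

# The grounded language "form": each variable has a fixed set of allowed values.
FORM = {
    "direction": {"left", "right"},
    "color": {"black", "blue", "green", "red", "other"},
    "vehicle": {"car", "truck", "bus", "other"},
}

tally = Counter()

def record(direction, color, vehicle):
    # Add one observation to the tally, rejecting values outside the form.
    assert direction in FORM["direction"]
    assert color in FORM["color"]
    assert vehicle in FORM["vehicle"]
    tally[(direction, color, vehicle)] += 1

# Made-up observations
record("left", "red", "car")
record("left", "red", "car")
record("right", "blue", "truck")

# Questions a plain tally sheet could not answer:
red_cars_from_left = tally[("left", "red", "car")]
trucks_total = sum(n for (d, c, v), n in tally.items() if v == "truck")
print(red_cars_from_left, trucks_total)   # 2 1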

In statistics and sociology, such survey forms are well known and have been used for decades. They serve as a tool to structure complex questions and capture them numerically. What is new, however, is that the same method can be used to make the problems of robotics and artificial intelligence machine-readable.

Essentially, a robot that is supposed to move through a maze only needs to be equipped with a form containing grounded language. Through this form, the environment can be transferred into a symbolic vector space, i.e. stored in a machine-mathematical way. A category on the form such as "color=blue" can be either true or false; the tally sheet either contains a mark or it doesn't. On this basis the robot can keep a log and make decisions, which compresses the action space. Instead of querying sensors at the hardware level, the robot has a conceptual understanding of its environment. The result is a high-level sensor that works with the data of the questionnaire, including the language used in it.
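A minimal sketch of such a high-level sensor (the raw sensor names and thresholds are hypothetical, only meant to illustrate how hardware readings are turned into a form record):

# Hypothetical raw readings from the robot's hardware sensors.
raw = {"color_rgb": (30, 40, 200), "distance_front_cm": 12.0}

def fill_form(raw):
    # Turn raw sensor values into a boolean grounded-language record.
    r, g, b = raw["color_rgb"]
    return {
        "color=blue": b > max(r, g) + 50,        # dominant blue channel (assumed threshold)
        "wall_ahead": raw["distance_front_cm"] < 20.0,
    }

form = fill_form(raw)
print(form)  # {'color=blue': True, 'wall_ahead': True}

# The robot decides on the symbolic record, not on the raw numbers.
if form["wall_ahead"]:
    action = "turn"
else:
    action = "forward"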