November 14, 2021

Fully justified text in LaTeX documents

 

The LaTeX software is known as a high quality typesetting program. The reason why LaTeX-generated documents are described as high quality is that they look the same as hand-typeset documents created 200 years ago. To understand LaTeX better, we have to describe what classical hand typesetting is about.
Typesetting 200 years ago was first of all the art of creating fully justified text. The interesting fact is that this formatting style is only seldom described in the literature. MS-Word has a simple button to activate it, and LaTeX uses this formatting style by default. So the user gets no further explanation of why it is needed.
Fully justified text basically means adjusting the words on a line so that the right and left edges are straight. In hand typesetting this art was very complicated to realize, and it was the main reason why it took so long to sort the metal characters onto a page. The main trick for creating adjusted text lines is to use glue between the words.
In the past, all newspapers were typeset in this formatting style. What manual typesetters were trained to do was to realize this unique shape. From a technical perspective it would have been possible to create flush left pages with manual typesetting as well. It would even have been a bit more economical. But nobody did so. A closer look into old newspapers will show that 100% of them were typeset with fully justified text.
The idea of LaTeX is to imitate this formatting style. A standard LaTeX-generated document will look like a newspaper which was typeset by hand in the past.
It is not very hard to guess what the opposite of fully justified text is. Flush left text is equal to absent typesetting. A flush left formatted text looks very different from what was used in newspapers in the past. The main reason why LaTeX-generated texts all look the same is the absence of flush left formatting. The typical LaTeX user assumes that it is prohibited to use a flush left formatting style.
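Flush left formatting is in fact available in LaTeX. A minimal sketch (the body text is a placeholder):

    \documentclass{article}
    % the ragged2e package gives a flush left layout that still hyphenates;
    % the document option applies it to the whole text
    \usepackage[document]{ragged2e}
    \begin{document}
    Some placeholder body text that will be set flush left
    instead of fully justified.
    \end{document}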

November 11, 2021

Programming larger projects with Forth

 

The sad news is that it is technically not possible to run existing C source code on a Forth CPU. The reason is that Forth CPUs have no registers but only a minimal stack. And no C compiler in the world is able to generate the assembly instructions for such a machine. Even for mainstream CPUs like the 6510 it is hard to write a C compiler, because this CPU also has a low number of registers. And on a Forth CPU the situation is even more dramatic.
So the only practical method to write software for the GA144 and other such CPUs is to hand code the software in Forth. Unfortunately, Forth is known as one of the hardest programming languages ever, next to Dyalog APL. An easier way to learn the Forth language is explained in the following blog post.
Instead of trying to understand the idea of a push-down stack, the more beginner friendly approach to writing something in Forth is to use modular programming. Modular programming is a powerful software engineering method which is supported by most languages, including Pascal, C, Python, C++ and of course Forth. The idea is that each file contains 4-5 functions plus a handful of variables. The functions can access the variables, and the module can be included by other modules. The concept is some sort of stripped-down version of a class.
Modular programming in Forth is a bit more complicated than doing the same in C, but writing such programs is not impossible. It is mostly a question of gaining experience in writing stack based functions. The concept of modular programming is the same. It allows the source code to grow to a size of 1000 lines of code and more, while each single file has a limited size of not more than 100 lines of code. If a file gets bigger, the programmer has to move some of the routines out and create interfaces to communicate with other modules.
The interesting situation is that Forth based modular programming works well with existing Forth CPUs. The only things a Forth system needs to support are the ability to execute a word and the ability to read and write variables.
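To make the idea concrete, a minimal sketch of such a module in Forth might look like this (the file and word names are illustrative):

    \ counter.fs -- a module-like file: one variable plus a few words
    variable counter                  \ module-local state
    : counter-reset ( -- )  0 counter ! ;
    : counter-inc   ( -- )  1 counter +! ;
    : counter-get   ( -- n )  counter @ ;

Another file can load it, for example with Gforth's include, and call the words without knowing anything about the internal variable:

    include counter.fs
    counter-reset counter-inc counter-inc counter-get .   \ prints 2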

November 10, 2021

Understanding the 1990s in terms of AI

 

In terms of ordinary computer technology the 1990s and the year 2021 are not very different. In the 1990s workstation computers had already been invented, including the Unix operating system. Video gaming consoles were widely available, and letters were of course written with a computer, not with a typewriter. So the only new thing today is that the CRT monitors were replaced by flat screens and the computer is a bit faster. But this is not a revolution, only the normal minor improvements of the last 30 years.
In the context of Artificial Intelligence the situation is much more different. The situation in the 1990s and the situation today are nearly opposites. The people in the 1990s had only a rough understanding of what AI is, and no practical demonstrations were available. Instead of describing what the situation today is, let us take a closer look into the past.
The first thing to mention is that in the 1990s the internet was at an early stage. It wasn't the dominant medium which absorbs all the other media; existing media technology like the cinema, television, books and printed journals were used instead. What the people learned about Artificial Intelligence was published in these media. That means the understanding of robots and AI was shaped by movies and books about the subject. Robots played only a small role in all the existing media, and it was unclear whether such technology could be realized soon. There was a big gap between robots in the movies and robots realized in reality. And this gap was projected into the future. Basically, AI was something not available in the 1990s. It was used as part of a science fiction plot in which robots take over the world, or it was explained in a computer science book that solving NP-hard problems is not possible.
This public understanding is very different from today's situation. Today the average user of the internet will see lots of videos of working robots, including the source code to recreate them from scratch. This includes walking robots, self driving cars and autonomous Mario-playing bots. Such a library of existing software, including textual tutorials, wasn't available in the 1990s. In the past it was difficult to find even one book entirely about the subject of Artificial Intelligence, and even the advanced ones described the situation only very roughly. Another problem in the 1990s was that modern programming languages like Python hadn't been invented. That means it was not possible for a newbie to create a line following robot, because for doing so he had to learn languages like C++ first. Basically, creating such a robot was in the 1990s a multi million dollar project for a research lab at a university, not an amateur weekend project to learn something.
Perhaps these examples have shown that since the 1990s many things have changed. The dominant new situation is that the formerly NP-hard problem seems to be solved. That means the number of papers in which mathematicians prove that AI can't be realized within the next 1000 years has dropped drastically. It was replaced by the understanding that each month a new robot is presented which is a bit more powerful than the previous one.
What would happen if a person from the 1990s got teleported into today's time? In the case of ordinary computing the situation would be relaxed. He can use his former knowledge of how to use a mouse and a keyboard to control a computer. He will find the same MS-DOS prompt and the same Unix command line, and sending e-mails is the same as 30 years ago. But in one subject the teleported individual would get strongly confused, which is the current state of AI. After sitting down in front of a computer and watching a YouTube playlist, he won't believe what he sees. The shown examples in robotics are so different from what he knows that the person wouldn't understand them anymore. It would simply be too much to see a biped robot, self driving cars, autonomous drones or the poor hitchBOT waiting on a park bench at night.
The development in robotics wasn't an abrupt event but a continuous flow of changes. But in the end, the year 2021 and the year 1990 have nothing in common in terms of AI. There is a huge gap between both years, and not a single revolution but many of them have occurred. Most of today's AI technology like biped robots, deep learning and game playing AI wasn't available in the 1990s. It has entered the world without any warning, and there is no sign that the development will stop someday. It seems that the subject of AI was the single driver which has changed the world into a futuristic one.

Modular programming with any programming language

In the past it was some kind of game to compare different languages against each other. C programmers are convinced that their language is the fastest one, Python programmers emphasize how easy it is to write code, and Forth programmers are proud of the low energy consumption of a single instruction. It is hard or even impossible to bridge these communities.
On the other hand, there is one element which all programming languages have in common. This feature is more powerful than stack based computing and easier to realize than object oriented programs. This single useful feature is the ability to write programs in a modular fashion. The term modular programming is a bit uncommon and sounds outdated, so it makes sense to explain it further.
A module is a file which contains 5-8 procedures and 5-8 variables. The procedures are allowed to manipulate the variables, and the idea has much in common with a class. Modules were used in programming before the advent of OOP. The Pascal language knows the unit statement, the C language can include prototype files, and Fortran programmers can create modules as well. So we can say that modular programming is the single paradigm which is available in all languages, including Forth, which has modular capabilities too.
Modular programming does not look very interesting at first glance, but it is the key element for writing larger programs. A larger program contains 1000 and more lines of code. Such projects can only be realized in a modular fashion, because it allows structuring the program into logical chunks which are programmed independently from each other.
The perhaps most interesting situation is that with modular programming all languages, including Forth and C++, are easy to master. What the programmer has to do is follow the rules strictly. That means, if the program gets bigger, he has to create a new file and put the procedures into this file. Programming means managing procedures and variables. This paradigm allows solving any problem in software.
Let us take a closer look into some larger Forth projects on GitHub. Instead of explaining how a stack based language works, let us focus only on modular programming. What we can see in these projects is that there are some files, and each file has the same structure. At the top some variables are initialized, and at the bottom some functions are written down which access the variables. So the concept is the same one C programmers use, and Java programmers too when they create a new file for a new class. And yes, the principle makes sense because it allows writing longer programs which have more features.
The interesting situation is that modular programming has no limit in code size, because newly created submodules can be included in other modules, and in the end there are 50 and more files available which each have 100 lines of code. So the overall structure is highly hierarchical. It is less powerful than real object oriented programming, but it comes close to the idea.
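A minimal sketch in Python of how such a hierarchy can be built (the module names are illustrative): a low-level module is imported by a mid-level module, which in turn is imported by the main program.

    # vector.py -- lowest layer: two variables, a reset function
    x = 0
    y = 0

    def reset():
        global x, y
        x, y = 0, 0

    # physics.py -- middle layer: builds on the vector module
    import vector

    def step():
        vector.x += 1    # updates the shared state one level below

    # main.py -- top layer: only talks to the physics module
    import physics

    physics.step()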
It sounds a bit trivial to explain this programming style in detail, because C programmers have been doing so since the 1970s and it is the most common coding style ever. On the other hand, the number of books which describe the principle is low. Most introductory C books don't even have a chapter about creating programs with more than a single source code file. As with other languages, the main focus of a tutorial is to explain the language itself, which includes the statements, but this knowledge doesn't allow creating useful software.
So we can say that the individual programming language is less important. The fact that the for loop in Forth works differently from a for loop in Python can be ignored. What the programmer needs to know instead is how to write programs distributed over units.

Safe artificial intelligence in the past

From a technical perspective it is very hard to prove that robotics isn't possible. It is even much harder to show that AI is limited to a certain level which can't be exceeded in the future. Also, it is impossible to slow down the progress or reverse the development. There is no such thing as a frame to control the development of Artificial Intelligence. What is possible instead is to ignore the future and take a look into the past, which is much easier to understand.
The major advantage of the 1990s compared to the situation now is that in the past robots hadn't been invented yet, and the number of books about the topic was small. That means the 1990s were a time in which the problem of an AI uprising wasn't there.
The 1990s were the decade before the rise of the internet. The dominant media in this time were the television for the masses and the printed book for the educated scholars. The amount of information was small and no search engines were used. From the perspective of AI the situation was also at a very early stage. A robot like Honda's ASIMO, which is able to walk upstairs, hadn't been invented. A company like Boston Dynamics wasn't there, and at the beginning of the 1990s the world's strongest chess player was a human, not a computer. Also, the subject of deep learning hadn't been invented. The only things available were ordinary perceptrons with not more than 20 neurons. These systems were realized in software for the Windows operating system, but no useful application was known.
Self driving cars were also not invented in the 1990s, for many reasons. What was available instead were lots of movie robots like the K.I.T.T. car or the Data android in Star Trek. And the perhaps most surprising fact is that the people in the 1990s were not convinced that Artificial Intelligence would become possible one day. A widespread belief was that simple biped robots would be available around 300 years in the future, or that they won't be possible at all. The underlying theoretical concept to prove this assumption was NP-completeness, which basically says that AI problems can't be solved on a computer because of the large state space. A similar idea was introduced in the Lighthill report in the 1970s.
A widespread philosophical interpretation of Artificial Intelligence in the 1990s was that since the beginning in the 1950s the AI researchers had promised many things, but none of the goals were reached. So the rational understanding was that AI is too complicated in general and it is not possible to build such machines. Nobody in the 1990s was able to disprove this assumption, so it was equal to common shared knowledge.
So we can say that the 1990s were the last real AI winter. AI winter means that the subject was seen as impossible to realize and that research on the subject had stopped. What the computer scientists did instead was program normal software, for example games, operating systems and, very important, network protocols for the rising internet. That means AI in the 1990s was an esoteric discipline, not recognized very much.
Suppose a person from the year 2021 travels back to the 1990s and explains to the audience what sort of robots will be available in only 30 years. He will say that biped robots are possible, that kitchen robots can be built, that self driving cars can be realized with neural networks, and he will explain that the source code of Tetris-playing AI bots is distributed as open source to everybody. The audience won't believe any of these words. The audience will say that it is impossible. If the time traveler were to go into the details and explain how this software can be realized, the audience would leave the room because it is beyond their horizon. It would be too much for a person of the 1990s to hear what reinforcement learning is about, or that chess software can beat a human player.

November 09, 2021

The 1990s from the perspective of Artificial Intelligence

 

Describing the current situation in AI and robotics is nearly impossible. There is an endless amount of projects and documentation available, and at least once a week a new robot is shown on YouTube which looks more human like than the one from the week before. Advanced topics like biped walking, speech understanding and the playing of video games are solved already, or the chance is high that within the next 12 months such a success story will become visible, and fiction and reality have merged into a complex media campaign. It is not clear whether a certain robot head is remote controlled or driven by an advanced algorithm, or whether a certain walking robot was able to do the steps in reality or was drawn into the picture with animation technology.
If the current world is too complex, it makes sense to increase the distance and observe something which has a clear frame around it. The 1990s are a great decade for describing artificial intelligence. The major advantage is that the amount of published books is known and that the level of technology in the 1990s was low. Biped robots weren't invented, chess machines were not able to beat the best human player (with a single exception, built by IBM) and most researchers were not sure whether AI could be realized in the future.
Computer technology in the 1990s was at an early stage, and most problems were located in slow hardware and bugs in the software. Windows 95 was a common operating system during that period, and the internet was at its very beginning. This setup makes it easy to give a full overview of all robots and AI projects during this time.
Basically, AI wasn't realized in the 1990s; it was a philosophical topic. The question was how realistic experts in computer science thought AI would be in the future. That means 95% of the population wasn't informed about the subject at all, and the few computer experts who used neural networks and expert systems in reality were not able to demonstrate practical applications.
Artificial Intelligence in the 1990s was mostly a subject for movies and science fiction books. Lots of stories about intelligent robots were available in this period. Even though it was not possible to build real robots, it was within reach to imagine a future in which positronic brains and other technology would allow humans to do so.

November 08, 2021

The 1990s from the robotics perspective

 

... were great, because no such innovation was available. All the technology available today hadn't been invented in this period. What was common instead was normal computer technology, plus some philosophical books about how to realize Artificial Intelligence in theory. It was even unclear whether a chess computer would be able to win against the best human player. The general understanding of AI during the 1990s was the same as formulated in the famous Lighthill report. Basically, the idea was that even though some expensive AI projects were started at US universities, none of them resulted in something useful.
The only place in which robots were available in the 1990s was the cinema. In blockbuster movies and of course in the Star Trek series, many examples were shown. But again, none of these things were built by researchers.
From today's perspective it is known that during the late 1980s the Honda company was developing humanoid robots. But during the 1990s the internet was not widely available, so even computer experts were not aware of it. It was common sense that no one had tried to build walking machines because it is too complicated to realize. What was known in the 1990s were some examples of expert systems. Some of them were described in mainstream computer journals. It was also known that for playing chess or tic-tac-toe some sort of Artificial Intelligence is needed. But it was unclear how exactly such technology could be realized.
From today's perspective the 1990s have much in common with the stone age. AI was something not invented yet, and it was imagined that it would take 100 years or longer to build it. Just to get the figures: the plot of Star Trek TNG plays around 300 years in the future. And exactly this was the estimated duration until biped robots would be built.
Perhaps one surprising insight is that in the early 1990s robotics competitions were not common. From a technical perspective such robots could have been realized easily with 1990s technology. But at that time, no one saw a need for doing so. What was common instead was to write smaller programs in Prolog and Lisp. For all home computers and PCs a compiler was available, and some books about the subject were available as well. What such Prolog programs in the 1990s were able to do was nearly nothing. Some more advanced software was able to solve logic games, but most of the programs were created as hello world examples.
With today's knowledge it is possible to identify some advanced projects in the 1990s not mentioned yet. For example, the MIT Leg Lab built lots of walking machines in the 1990s. But again, the internet wasn't established, so no one was aware of it. That means, even though they built these machines and published papers about them, computer experts and hobby programmers alike simply never noticed. Or let me explain it the other way around. Suppose a time traveler visits a larger university library outside of the U.S. in the 1990s and reads all the books. He won't find a single piece of information about the MIT robots, the Honda ASIMO project or any other advanced AI project from this time. Sure, if the imagined time traveler visits the MIT library and knows which paper he needs to read, then he will find the information. But without such an advantage he stays completely clueless.

November 05, 2021

Symmetric typesetting with LaTeX


At first glance the LaTeX ecosystem looks highly complex. There is an endless amount of packages, tutorials and guidelines available. The surprising fact is that it is possible to reduce the idea behind LaTeX to a single screenshot. What the LaTeX community tries to achieve is shown at the top of the image. The typeset paragraph looks similar to a poem. The headline is centered and the paragraph is fully justified. The justification was created by adjusting the glue between the words, plus minor adjustments by the microtype package.
In contrast, the picture at the bottom shows what the TeX community tries to prevent. In such a situation the symmetry is missing, because everything was formatted flush left.
The interesting situation is that this preference has nothing to do with the LaTeX software but with typography in general. So we have to understand why formatting something by centering it is perceived as beautiful. The idea is that the written word follows the principles of art. It is not only a book but a graphical creation. Or to be more specific, the reader of the text should get the impression of looking at a work of art. From an author's perspective it is trivial to format something in centering mode.

Let us take a closer look into the shown paragraphs. It is the same text, the same font, and the software for rendering was in both cases the latest version of the LuaLaTeX engine. The only difference is that in the second case the formatting style “flush left” was applied. Now the question to answer is which of the cases gets a higher score in terms of typographic quality. The simple answer is that the top example gets the quality judgment “beautiful example of typesetting”, while the bottom case gets the judgment “low quality, or even no typographic quality at all”.
This judgment is not based on any objective criteria; it is simply a preference for or against flush left. How can it be that the established LaTeX software, including the high quality Latin Modern Roman font, can create low quality typesetting? Because the untold assumption is that only symmetric text, which includes the centered headline and the justified paragraph, is beautiful, and everything else is wrong.
The underlying reason why this rule is valid is that it takes more effort to create fully justified text. If the text isn't created with an algorithm but with a hot metal printing press, it takes an endless amount of time to fully justify the text. What the typesetter has to do is determine the size of the words and calculate how much glue is needed. In contrast, putting the letters for the bottom example together can be done much faster.

November 04, 2021

Modular programming with Python – a tutorial for creating a game

Writing a small game with Python is not very complicated. The existing pygame library allows even newbies to do so. Most tutorials assume that the game is created with the object oriented paradigm. That means there are classes for the GUI, for the physics and for the main program. This assumption makes sense, because since the advent of the C++ and C# languages nearly all games are created this way.
A seldom mentioned technique for creating larger software programs was invented before the advent of C++. What was used before is called modular programming, and it can be realized with Python as well. The interesting situation is that modular programming allows, similar to OOP, dividing a larger project into chunks which are created individually. The first thing to do is to create the physics module, which consists of a single file.
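A minimal sketch of such a physics module (the variable and function names are illustrative):

    # physics.py -- module-level state plus functions, no class statement
    x = 200        # horizontal position of the circle
    y = 200        # vertical position of the circle
    speed = 5      # pixels per key press

    def moveleft():
        global x
        x = x - speed

    def moveright():
        global x
        x = x + speed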

The formatting has similarity to a class, but the class statement is missing. The next step is to create the main module, which also has no classes but a variable for drawing the window and two functions. The interesting situation is that the physics module is not instantiated as an object; it is only imported, and then the main module sends messages to it.
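A matching sketch of the main module, assuming the physics module from above:

    # main.py -- window setup, input handling and drawing
    import pygame
    import physics

    pygame.init()
    window = pygame.display.set_mode((400, 400))
    clock = pygame.time.Clock()

    def inputhandling():
        keys = pygame.key.get_pressed()
        if keys[pygame.K_LEFT]:
            physics.moveleft()
        if keys[pygame.K_RIGHT]:
            physics.moveright()

    def draw():
        window.fill((0, 0, 0))
        pygame.draw.circle(window, (255, 255, 255), (physics.x, physics.y), 10)
        pygame.display.update()

    running = True
    while running:
        for event in pygame.event.get():
            if event.type == pygame.QUIT:
                running = False
        inputhandling()
        draw()
        clock.tick(30)
    pygame.quit()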


The game itself consists of a small circle on the screen which can be moved with the left and right arrow keys.


November 03, 2021

Object oriented programming without objects

 

The Python programming language provides a seldom explained feature, which is the ability to use modules. A module is different from the class concept and has much in common with header files in the C language. The idea is to avoid classes in Python entirely and distribute the code over many files which are included with the import statement.
From a pure Python perspective the advantage is low. But a Python script which uses only modular programming is much easier to convert into plain C code. That means the written program or game can be made faster in the future by translating the Python code into C code. The interesting situation is that using modules instead of classes works surprisingly well. What the programmer does is create small files with less than 100 lines of code. Each file consists of some functions and some variables at the top. The variables are only accessed from within the Python module, not from the outside.
What modules don't allow is inheritance. Also, it is not possible to put different modules into the same file. Each module is stored in a different file. This will increase the number of files drastically. But thanks to the ability to import modules hierarchically, it is possible to create very complex applications, with dozens of modules which are combined into submodules.
The chance is high that C programmers are using the same technique. The C language has the disadvantage that it is more complicated to create such modules, because in addition a header file is needed which forms the interface to the outside. But in theory this concept can replace object oriented classes. That means there is no need to convert C code into C++ classes.
Modular programming has fallen out of fashion since the advent of C++. Today the situation is that more complicated programs are mostly realized with the OOP paradigm. Also, the UML notation knows only classes but not modules. But in theory both ideas allow creating larger projects with thousands of lines of code.
The only thing that does not work is avoiding classes and modules alike. If somebody writes down 20 variables and 40 functions in the same file, it is hard to determine which functions get access to which variable. Such code can't be maintained. So it is important to divide the code into smaller chunks with less than 100 lines of code. Such a file can be analyzed easily by a human programmer.

Upgrade to Debian 11 causes trouble

 

On some blogs on the internet it was suggested that the upgrade from Debian 10 to 11 is very easy. No, it isn't. The first problem is to fix the /etc/apt/sources.list file. There are different tutorials on how to do so. The user has to change the name of the distribution, but also the URL of the security mirror server. But suppose the user has carefully recreated the file and has run the “apt upgrade” command. What he can expect then is that only the kernel is updated. After rebooting the machine, important programs like the e-mail software Evolution or the matplotlib library don't work anymore. But at least the GNOME environment seems to be stable. So the user will surely run the apt full-upgrade command, but this makes things worse. After the next reboot the icons on the desktop are missing and it is not even possible to start the terminal program. That means a simple attempt to upgrade the operating system has caused a system wide failure.
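For reference, a minimal sources.list for Debian 11 might look like the following; note that the security suite was renamed from buster/updates to bullseye-security:

    deb http://deb.debian.org/debian bullseye main
    deb http://deb.debian.org/debian bullseye-updates main
    deb http://security.debian.org/debian-security bullseye-security main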
Only a manual login into a text-only session next to the GNOME X11 session and updating the system again solved the issue. After many reboots, autoremove commands and fixing lots of minor problems, it is some sort of miracle that the GNOME session is running normally. That means the new Debian 11 system is booting, and at least major programs like the terminal and the Firefox browser can be started.
What we can say for sure is that the upgrade to Debian 11 works only with lots of trouble, and compared to Windows 10 it is not recommended for beginning computer users. Linux (especially Debian) remains a construction site, and the chance is high that after rebooting the machine the user can't log in anymore.

Understanding the LaTeX typesetting system for realizing justified text

 

There are many tutorials available on how to create papers and even dissertation projects with the help of LaTeX. What these manuals have in common is that they don't explain in detail why LaTeX is the better choice over Word. In most cases the argument is that the rendering quality of LaTeX is much higher because of better internal algorithms. To understand what this means in detail, we have to read existing dissertation documents and analyze how they are formatted.
What dissertation documents have in common, no matter which software was used to create them, is that all of them are formatted with the justified layout. This formatting style is so obvious, so frequently used and has such a long tradition that it isn't mentioned explicitly. The typical dissertation is formatted in a symmetric way. That means the title headline is of course centered, and in the main body text the left and right edges form straight lines, which typographers call fully justified text.
It depends on the author how exactly this style is realized. A common option is to use the MS-Word software, disable the hyphenation feature and then format the entire text fully justified. The result is that many wide spaces are visible between the words.
Another option used by Word authors is to activate the hyphenation feature first; then the rendered justified text has a smaller amount of empty space. And exactly this situation is the reason why LaTeX is recommended as a Word replacement, because the LaTeX word wrapping algorithm is able to reduce the empty spaces further. The same text looks different in LaTeX, because LaTeX uses an optimized word wrap algorithm and, very importantly, the microtype package. So the result has much in common with the output of the InDesign software, which is also able to create high quality text.
So what LaTeX is doing is simple: it creates fully justified text with the help of hyphenation and intelligent word wrapping, so that the amount of white space is minimized. This ability is labeled by the LaTeX community as high quality output.
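A minimal preamble which activates exactly these two ingredients might look like this (the body text is a placeholder):

    \documentclass{article}
    \usepackage[english]{babel}  % loads the hyphenation patterns
    \usepackage{microtype}       % character protrusion and font expansion
    \begin{document}
    Some placeholder body text; by default it is set fully justified
    with hyphenation, and microtype reduces the remaining white space.
    \end{document}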
So let us go a step backward and ask a simple question: why exactly is a dissertation formatted as fully justified text, and what about the flush left formatting style? Nobody knows. The question is so unusual that it is hard to answer it. Basically, the paradigm is that the only allowed formatting style is symmetric, in the sense that the headline is centered and that longer texts are fully justified.
This kind of rule is much bigger than the LaTeX community. The rule is valid for other programs like InDesign and MS-Word as well. The rule was also valid for dissertations written before the advent of the PC. So it has to do with typesetting in general.
From a technical point of view the LaTeX software can produce better justified text than MS-Word. This is not a subjective interpretation; a 1:1 comparison will show it. In contrast, the difference between LaTeX and InDesign is small; both are able to create optimized justified text. The open question is whether full justification makes sense in general. Is there a need to produce centered, symmetric documents?
In history there are two main exceptions to this rule. Letters are usually created flush left, and internet HTML pages are also formatted flush left. Everything else, especially books, journals and dissertations, is typeset in justified mode.
Perhaps it makes sense to explain the situation from a more positive perspective. A typical introduction to the LaTeX software starts with a direct comparison with MS-Word. On the left side a document formatted with Word is shown, and on the right side the same document formatted with LaTeX. Of course the LaTeX-rendered PDF document looks better, because the text density is higher. It has little or no white space and the page looks similar to a printed book. Because of this ability of LaTeX to generate high quality output, the software is frequently used for academic purposes.
What is not answered in this comparison is the problem of formatting in general. The untold assumption was that both examples (Word and LaTeX) have to format the paragraph in fully justified mode. In this restricted domain, LaTeX is much better. If the paragraph setting is changed to flush left, the result is the same.

November 02, 2021

The case of raggedright in the TeX community

 

It might be surprising, but about the most important issue in typesetting the amount of available information is small. Most books about LaTeX explain what typography is and why TeX is great, but they do not question the difference between left-justified and fully justified text. One of the few exceptions was made in the TeX journal TUGboat [1].
Such a debate is more general than the common question of how to realize a certain layout with LaTeX, because fully justified text and typography are strongly connected:
“For centuries, book printing has applied justification to nearly all paragraphs” [2]
And yes, the statement is correct; a short look into the history of typography will show exactly this preference. Not only do documents generated with LaTeX look mostly the same, but all the academic books and journals of the last 300 years have the same standard layout. It consists of two columns which are typeset with justified text. This produces a perfectly straight edge and is described by typographers as good typography. That means the assumption is that everything which is not fully justified is equal to low quality typography.
It is impossible to find pro or con arguments here, but what is possible instead is to describe the situation as it is. It is very hard to find books and journals which are typeset with a flush left layout. One example: recent issues of the PLOS One journal are set this way. PLOS One is a non-traditional, digital-only academic journal. And perhaps this gives a hint of what the idea is. Fully justified text is a sign of printed traditional documents, while a flush left layout is typical for online-only publications.
References
[1] Are justification and hyphenation good or bad for the reader? First results. TUGboat, Volume 37 (2016), No. 2, https://tug.org/TUGboat/tb37-2/tb116akhmadeeva.pdf
[2] Udo Wermuth: An attempt at ragged-right typesetting. TUGboat, Volume 41 (2020), No. 1, https://tug.org/TUGboat/tb41-1/tb127wermuth-ragged.pdf

What Python can learn from C

 

The Python language has become famous because of its object oriented features. This was the main improvement over previous scripting languages like Perl. On the other hand, Python has a seldom described feature which can be used as a replacement for classes. The concept of a module is to put all the functions in a single file and include this file somewhere else.
Somebody may argue that a class should be stored in a file, and that including a module is therefore equal to including a class. The situation is more complicated, because modules can be used without classes. This programming style is what C programmers are doing: they create modules in which the scope of variables and functions is limited, and then the module gets included into the main program.
Apart from modules, Python also supports packages. A package is a directory which contains many modules. This allows creating larger programs, and very importantly, they can be converted easily into C code. That means the original Python program doesn't use classes; therefore the similarity to plain C is bigger.
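A sketch of what such a package might look like on disk (the names are illustrative):

    game/
        __init__.py    # marks the directory as a package
        physics.py
        graphics.py
    main.py            # uses e.g. 'from game import physics'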
The real problem in programming is not learning a certain language; the problem is how to write programs which are larger than 100 lines of code. An amount of 100 LoC fits well into a single file. It is equal to defining a small number of variables and applying 4-5 functions to them. The problem is that real programs need many more code lines and hundreds of functions. Organizing them in a single file is technically possible, but it becomes unlikely that the programmer is able to maintain such a file. Object oriented programming provides one possible answer to this problem, but modular programming has an answer as well.
Let us describe how to write a module in Python without using classes. The idea is to create a new file and import this file in the main program. The new file contains variables similar to class variables and functions which can modify these variables. Some of the functions can be accessed from the outside, so that the module communicates with the main program. In contrast to the C language the workflow is simplified drastically: there is no need to create dedicated header files, and the program runs without a compilation step.
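A minimal sketch of this workflow (the module and function names are illustrative):

    # score.py -- variables on top, functions below, no class statement
    points = 0

    def add(n):
        global points
        points = points + n

    def total():
        return points

    # main.py -- the module is imported, never instantiated
    import score

    score.add(10)
    score.add(5)
    print(score.total())   # prints 15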

Comparison between flush left and justified text

 

The interesting situation is that there is no real debate about which of the formatting styles is better. Other problems in typography can be discussed, but everybody knows that it is off-topic to question the idea of justified text. The reason is that the problem of arranging words on a line is seen as equal to typography in general. That means typography is about justified text.
Nearly all professional books and journals of the last 400 years were printed with justified text. This sounds incredible, because in the past no software was used but mechanical typesetting machines. But it seems that the machine experts of the past solved the problem of justifying words over a line. The standard mode of the Monotype typesetting machine was to produce justified text and nothing else. The only question which was asked was how to realize this formatting style.
From a technical perspective it is possible to create left justified text with the Monotype typesetting machine. And technically, the LaTeX software can do the same for an academic paper. But in reality nobody does so.
To understand the situation we have to go back 100 years and take a look at what a human operator at the Monotype machine was doing. He gets a manuscript which was either written on a typewriter or by hand. And the objective is to convert this manuscript into justified text stored on a punched tape. The idea behind the Monotype machine is to justify the text. This is realized by counting the words and measuring the spaces, and then the amount of free space is determined. This procedure is repeated for every line.
The interesting fact is that “justification” is only one possible formatting style. The text can be read very well even if this step is missing. So why is this step done? Mostly because of tradition. The idea is that the text is so important and will reach so many readers that the typesetting has to be made with the highest possible quality. And this is equated with fully justified formatting.
This self-understanding sounds similar to how books were created before the advent of printing machines. The idea was that the content of a book is so important that it has to be copied by hand. Creating a copy by hand takes longer, and this shows the reverence for the content. Basically, creating a copy of a book by hand is more costly than using a printing press. And creating fully justified text is more expensive than creating only left-justified paragraphs.
Left justified text basically means lowering the quality of the typography. Let us make a simple calculation. What happens if the Monotype machine is used to create left justified text? Right: the time for typesetting the document is shorter. The human operator needs to calculate less, and he can typeset more manuscripts in the same time.

November 01, 2021

Justified text in comparison


To investigate the mystery around justified text, the LaTeX software was used to render three possible formatting styles: flush left, flush left with hyphenation, and fully justified text. The third picture was created with the microtype package to minimize the white space in a line. The open question is which of the examples looks better.
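For reference, a sketch of how the three variants can be produced in LaTeX (the body text is a placeholder):

    \documentclass{article}
    \usepackage{ragged2e}   % \RaggedRight: flush left with hyphenation
    \usepackage{microtype}  % improves the fully justified case
    \begin{document}
    {\raggedright Placeholder text ... \par}  % flush left, hyphenation practically off
    {\RaggedRight Placeholder text ... \par}  % flush left with hyphenation
    Placeholder text ...                      % default: fully justified
    \end{document}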
What we can say for sure is that the fully justified paragraph stands in the tradition of typography in general. Most books and academic papers of the past were typeset in this formatting style, while using flush left formatting is a bit unusual. The interesting situation is that all three examples need 11 lines to show the content. That means even activated hyphenation won't save that much space.
A second fact is that thanks to the LaTeX internal algorithm, the example with the justified text looks very good. That means the white spaces between the words are not obtrusive and the text looks well formatted. Nevertheless, the difference between flush left text and justified text is small. It is mostly a question of personal preference. And most texts on the internet and on smartphone displays are formatted in flush left mode, except PDF documents, which are mostly formatted in justified style.
The perhaps most interesting question is whether flush left paragraph formatting is something which has to be fixed. Or let me ask it differently: is there a need for Firefox and other browsers to display text justified, or is the existing flush left rendering future-proof? The situation for screen layout is the opposite of that for printed layout. Screen formatting worked from the beginning with the flush left mode only, and it is very unlikely that this will change.
In theory the CSS specification has the ability to render text in justified mode (text-align: justify), but almost nobody uses this option.