December 31, 2021

How to make hierarchical notes



Creating a text file on the Linux operating system is a well-known task for every computer programmer. In addition, it is possible to create text files which are structured hierarchically. This is done with the markdown syntax. A good text editor is able to render the structure in an outline pane, which makes it easy to jump to any position in the text file.

Somebody may ask what the reason is for creating hierarchical notes instead of a normal text file. The reason is located in the left pane. Similar to the list of methods in a Python file, the left pane shows the overview. All the details are hidden and only the sections and subsections are shown. This is some sort of mind-mapping feature in which a problem is split into sub-topics. In contrast to a drawing on a sheet of paper, a markdown-formatted text file is much easier to create and modify.

Another interesting feature is that English prose can be mixed with short notes. There is no need to enter full sentences; a section can also be filled with lines of only 3-4 words, as in the sketch below.
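A minimal markdown sketch of such a hierarchical note file (the section names are made up for illustration):

# Project notes
## Hardware
- cable ordered
- check power supply
## Software
- driver bug open
- test on Debian

An editor with an outline pane lists only the # and ## headings, so the short notes underneath stay hidden until a section is selected.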

December 29, 2021

Creating professional documents with LaTeX

LaTeX is a great choice for professional formatting. The following tutorial is a bit unusual, but it results in a high-quality typographic layout. The first step is to write the text itself. The best way to do this is with the markdown syntax. The main advantage is that hierarchical sections can be defined. These sections are displayed in a text editor as an outline, so there is no need for a dedicated outline editor; a normal text editor works pretty well for this job.

Once the text has been written and proofread, it has to be converted into the TeX format:
pandoc -o paperlatex.tex -s paper.md

The next step is to modify the header of the LaTeX file so that the output looks like a document from MS-Word or LibreOffice:

\usepackage{wordlike} % Word-like fonts and TOC
\usepackage[none]{hyphenat} % deactivate hyphenation
\setlength{\parskip}{8pt} % fixed paragraph spacing, no stretchable glue
\setlength{\topskip}{0pt} % no glue above the first line of a page
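Put together, the header of the converted file might look like the following minimal sketch; the body is whatever pandoc generated:

\documentclass{article}
\usepackage{wordlike} % Word-like fonts and TOC
\usepackage[none]{hyphenat} % deactivate hyphenation
\setlength{\parskip}{8pt} % fixed paragraph spacing
\setlength{\topskip}{0pt} % no glue above the first line
\begin{document}
% ... the text converted from paper.md ...
\end{document}

Running pdflatex or lualatex on this file produces the final PDF.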




December 23, 2021

How to avoid LaTeX altogether

LaTeX solves a simple document conversion problem. The initial situation is that somebody has a folder with some files, which are text files, vector graphics and photos, and he would like to combine the content into a single file in the PDF format. According to the LaTeX community, the only way to handle this task is to install an entire ecosystem of 15 GB in total, which consists of lots of fonts, outdated macro packages, hundreds of documentation files and obsolete DVI preview programs.
Nevertheless, LaTeX is able to handle the task very well, but it is not the only program on earth which can do so. A good alternative is the LibreOffice program. The most basic feature of LibreOffice is that the user can insert multiple images, arrange them on a US-letter page and export the result into a PDF file. Even though LibreOffice wasn't designed as an ImageMagick alternative, it is pretty good at this task. The user can even decide to reduce the resolution from 300 dpi to 100 dpi to save a lot of disk space.
Let us go back to the initial problem. There is a folder which contains different files, and the task is to convert these files into the PDF format. And yes, LibreOffice can handle this problem very well. Admittedly, the software has some bugs; one of them is that the program is a bit slow, especially with many images. But these are only minor problems and can be fixed in future versions.
The interesting situation is that LibreOffice can combine existing content much faster than LaTeX can. The reason is that combining existing content into a single PDF file is not very complicated. The most time-consuming step is to write the text file itself, which may have a size of 500 kb. Creating such a plain text file with an editor can take weeks. The creation of high-quality JPEG, EPS and SVG images can also become very time consuming. If someone likes to paint images with the GIMP program, he will also need weeks until the image is rendered into the JPEG format.
But if the content is already there and the only task is to combine it into a single document, be it a newspaper, a book or a presentation, such a step can be realized in under an hour. Let us take a closer look at how LibreOffice works. After starting the program, the user can select “insert -> text from file”. This inserts the plain text file. In the next step, the style sheets for the sections are assigned, and of course this step is done manually. But the average document has only a small number of such sections, so it is finished after a short time. In the next step, some pictures are added. This is also done with the insert menu. Either the picture can be added directly as a picture, or it can be added in a larger frame which allows the position to be adjusted more precisely.
Then a short click on “export -> pdf” renders the document into a single file. The overall workflow always results in a high-quality PDF file, and in contrast to LaTeX the user doesn't need to know a complicated markup language or install a large amount of software he doesn't need. The interesting situation is that LibreOffice can handle a huge number of images in a single document. This is realized with “insert image as link”. The original LibreOffice file stays small, and the images are stored externally in the document folder.
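As a side note, the export step can also be scripted. A minimal sketch, assuming LibreOffice is installed with its command line tools (the file name is made up):

libreoffice --headless --convert-to pdf paper.odt

This renders the same PDF without opening the GUI, which is handy when the document has to be regenerated after every change.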
So let us go a step backward. The most time-consuming part of creating a technical document is the typing of the text and the painting of the images. The task of combining this information into a print-ready PDF file is something which can be handled on the fly. There is no need to automate this task, because then the user would lose control. The most disappointing moment for LaTeX users is when they recognize that they can't place a figure where they want. This is marketed as an advantage, but in reality the user prefers to make this decision himself. Using LaTeX is not recommended anymore, because more powerful alternative software is available.
The screenshot shows a concrete example. The first thing to mention is that the display of the images was deactivated in the options menu. This improves the performance of the program. The second thing to know is that both images were anchored to the page. This setting allows the images to be positioned freely, similar to layout software. The user can show the result in the print preview window or export it into the PDF format. Somebody may argue that this ability of LibreOffice is not very exciting. Oh, it is. LaTeX isn't able to do this, and usually only advanced layout software can handle images that way. What most open source programs like GIMP and ImageMagick can do is handle a single image file, but not two or more of them.

December 22, 2021

Text vs layout software

 

There are, roughly speaking, two different approaches available for creating PDF files: Scribus vs. LaTeX. Scribus is graphically oriented while LaTeX is text oriented. To answer which one is better, we have to take a closer look into a working directory. Suppose the idea is to create a book. Then the working directory contains some files for the graphics, a single text file and more graphics files in a vector format.
It is very likely that the file sizes are very different. The text file has maybe a size of 500 kb while the graphics files combined are around 20 MB. Judged by size, the graphics files are more important because they are bigger. And here is the problem: graphics-oriented layout software puts a high priority on the graphics files. The idea is that these files are handled first, while the text file, which has only 500 kb, is treated with low priority. The assumption is that nobody cares what is written in the text and the audience is interested only in the graphics.
At least this is the assumption behind the Scribus software. The problem is that in reality it is the other way around. Creating a 500 kb text file takes a long time. A trained author will need weeks or months until he has written such a document. In contrast, creating vector graphics, and especially photos with a camera, takes only hours or minutes.
Let us take a look at how LaTeX sees the world. LaTeX assumes that the most interesting content is located in the text file. The text file is rendered into a PDF document, and in the first run the draft mode is activated. That means the images are not visible, only the text. This allows the author to proofread the text and the new sections. LaTeX is not the only software which has a focus on text. For example, the markdown format and the MS-Word software have a similar approach. In all these cases the idea is that graphics are an add-on to the text, not the other way around. That means text without images makes sense, but images without text do not.
This understanding is not only a subjective judgment; it has to do with how long it takes to create a certain sort of content. Suppose somebody is using a photo camera and takes some photos on a single day. How much content can this person produce? A very large amount of information. He is able to create 500 photos of 3 MB each, so 1500 MB in total. In contrast, it is not possible to create a plain text file of this size in a single day. The reason is that the typing speed is much lower. And proofreading a text takes longer than applying a filter to a photo.
The initial question was which sort of software is best for creating books. The answer is that it should be text-oriented software in which graphics can't be displayed, or are handled with a low priority.

Pros and cons of the cherrytree program

 



Cherrytree is outliner software and the only example of its kind in the Debian repository. Outliner tools are a relatively recent development in computing and were developed in parallel to word processing software, which got all the mainstream attention.
Especially the idea of a two-pane outliner has been around since the 2000s. The cherrytree software is one example of such a tool. Unfortunately, the software quality is low. The menus don't make much sense, and the chance is high that the number of users is low. But let us go through the program itself. The main feature is that the user can create sections in the pane on the left side and then enter a longer text on the right. So it can be compared with the navigator pane in LibreOffice.
The other functions of cherrytree are limited. The user can import and export some files, and he can also insert images and tables. Nevertheless, the idea of an outliner is so interesting that it makes sense to analyze the program in detail. An outliner is some sort of mind map in full-text mode. The user is not required to enter a linear text from top to bottom; instead he jumps between the nodes, similar to what programmers do when they create multiple classes. In addition, the user can enter only keywords.
So the program has its strength in creating to-do lists, short notes and, very importantly, book scripts. Outliner tools have become the de facto standard for writing screenplays because of their ability to handle complex hierarchical texts.
Let us talk about the cons of the cherrytree software in detail. Suppose the user has created a longer document and wants to export it into the PDF format. Technically, a PDF file is created. But the margins can't be adjusted and the image resolution can't be changed. So it is more a preview function than a PDF generator. If the user wants to create a readable PDF paper, he will need additional software like the previously mentioned LibreOffice.
So the question is why the user should create the notes in cherrytree first if LibreOffice Writer has a similar feature under the term navigator. This would explain why the number of users is low: if someone wants to create hierarchical notes and format them as a PDF paper, he can do so much better with LibreOffice Writer. Another program which works on a similar principle is LyX. The disadvantage of LyX is that for formatting purposes it needs a full-blown LaTeX installation, which, including the fonts, takes around 15 GB on the hard drive.
But let us go back to the cherrytree program. The best way to imagine the idea is to take an existing text editor like gedit and enhance it with the ability to insert images, plus a pane on the left side which shows the hierarchical structure. The result is cherrytree.

Creating a newspaper with libreoffice

 



The LibreOffice suite has an interesting feature which hasn't been recognized by the public yet. The feature is simply called text box or frame, and it allows LibreOffice to be transformed into software similar to Scribus. But let us slow down the workflow a bit. Suppose the idea is not to create a plain text but a newspaper with multiple columns. The existing LibreOffice or LaTeX software is not well suited for this task, because a newspaper works visually.
Instead of using a different piece of software, the idea is to build the document structure around the frame concept. The interesting fact is that the frames in LibreOffice can be linked together, similar to what Scribus provides. So the user can define different frames and let the text flow through this structure.


The screenshot shows the rendered PDF file. The interesting situation is that the file size is very small (34 kb, which includes the PNG image) and, very importantly, the user can create as many columns as he likes. He has the freedom of layout software but can handle text in the frames.

December 21, 2021

Academic typesetting with layout oriented programs

 

In the LaTeX community there is an obvious knowledge gap about how to create a PDF paper without the TeX engine. Sure, it is known that apart from TeX some other programs are available, like QuarkXPress, Scribus and layout software in general, but it remains unclear how exactly such tools are used to process text. Instead of explaining how one specific application works, the following blog post will introduce how to use layout-oriented software in general.
A good general-purpose program which works with the frame paradigm and is preinstalled on most Linux systems is LibreOffice Draw. The surprising fact is that its inner workings have much in common with how Scribus and other DTP layout programs work. After starting the program, the user can create a new text box on the page and fill it with example text.


On the right of the screen a pane is shown which allows the properties of the box to be adjusted, including the font size and the paragraph justification. With a bit of intervention the user can make the text box look different. An image can also be added.


The idea is that the user creates each page as a separate file and saves it into the vector-based PDF format. In the end, an entire newspaper or book is created this way. The main advantage is that the user has maximum control over the layout; he can arrange the items, add headlines and so on.
So let us explain what the difference is between LibreOffice Draw and the Scribus software. There are only minor improvements. In Scribus it is possible to link two text boxes together, and it is also possible to hyphenate the content of a text box. But the general interaction with the software is the same. So we can say that LibreOffice Draw is a great choice for creating a multi-column newspaper which contains lots of graphics.

Scribus for LaTeX users

LaTeX is a widely known technical authoring tool. It has a community in the computer science departments and in mathematics. It has been in use since the 1980s and can generate large, complex PDF files mostly automatically. Even if the software is this great, there is a need to investigate which potential alternative programs are available. Most users are familiar with the LibreOffice program already, but a less described program is Scribus. Scribus is a QuarkXPress clone and works, like other layout software, with the frame concept. The user can position text frames on a page.
The following blog post explains how exactly this workflow is done to create a longer academic paper. The first thing to do is to make sure that the LaTeX file is available and all the images are available in either the PNG or the EPS format. And now it is up to the Scribus newbie to use this content to create a professional-looking PDF file.
The first problem is that Scribus can't import the LaTeX markup format. What is possible instead is to use plain text as the source. So the user has to tag all the sections in the file again. This is of course done manually, but after a while all the sections are formatted this way.
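The plain text source can be produced from the existing LaTeX file with pandoc, which was already used above; a minimal sketch (the file names are examples):

pandoc -o paper.txt -t plain paper.tex

The result loses all markup, which is exactly the form Scribus expects for its text frames.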
A problem unique to the Linux version of Scribus is that some dialog boxes can't be resized, so in the very important style menu it is not possible to update the style properties. This makes it harder to assign a certain font to all the sections. What the user can do instead is format the words manually, which is technically possible but takes extra time.
The text is spread over many frames. Each page has a new frame, and they are linked together. For the sake of simplicity I selected the single-column mode. Once the text has been entered, the next step is to add the figures. In contrast to LaTeX, there are no floating images; the user has to position the pictures manually on the page. The advantage is that he can make the page layout more convenient. For example, it is possible to place a wrapped figure at the top of the page, which is not possible in LaTeX. A table was also added.
Then a major problem occurred. Scribus can't create a table of contents. According to the documentation this is very complicated, so I left out this step and made only a manual table of contents. And then the long-awaited moment arrives: the document gets rendered into the PDF format. The output of Scribus is much bigger than that of LaTeX. The same document needs 2.5 MB vs. 400 kb.
A subjective judgment comes to the conclusion that the PDF quality of Scribus is worse than that of LaTeX. Technically it is possible to create PDF files with Scribus, but it can't replace LaTeX. Nevertheless, it was an interesting experience to see how frame-oriented layout software works.


Content vs layout
After this report, let us discuss the most obvious difference between both programs. LaTeX has a focus on text structure. It is some sort of advanced outline editor. The user creates sections and subsections, and this makes it easy to add new content to an existing text. In contrast, Scribus is based on the idea of a page. The user has to create new boxes and arrange them to fulfill certain layout requirements. Adding a new subsection is not a simple operation, because it causes a complete reformatting of the project.

December 20, 2021

Some applause for the lyx software

 

It is some sort of mystery why the LaTeX ecosystem has so many followers. The only time in which LaTeX was new and shiny was in the 1980s, but since then many other powerful programs have been created, and it seems that the TeX community doesn't recognize the change.
Suppose the idea is that LaTeX, and especially the LyX frontend, has become old and outdated and the user would like to use a better alternative. Which sort of software comes close to the TeX experience? There are three possible replacements available:
  1. text based formatting like lout and markdown
  2. WYSIWYG like Libreoffice writer
  3. newspaper layout programs like indesign and scribus
Let us start with the first category, which has to do with markdown and HTML. In theory, these tools work similarly to TeX. The problem is that it remains hard to convert a markdown file into the PDF format. The well-known pandoc renderer isn't able to do so natively; it uses LaTeX as a backend. And rendering an HTML file into PDF results in lower quality.
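A minimal sketch of that dependency (the file names are examples):

pandoc -o paper.pdf paper.md

Even though the command mentions no TeX at all, pandoc silently calls a LaTeX engine to produce the PDF; if no TeX installation is present, the command fails.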
What most non-TeX users prefer for writing longer documents is perhaps LibreOffice Writer. The current version is 7.0. But LibreOffice has some problems. The first one is that after inserting more than 3 images, the GUI becomes very slow. The recommended workaround is to deactivate the display of the images. Indeed, this provides the maximum performance, but then the user can't see what the figure is about.
A second problem in LibreOffice is that images can't simply be scaled to 50%. After resizing the frame with the mouse, the new size will affect the ratio, and the dialog box has no ability to maintain a certain ratio. Also, LibreOffice doesn't support floating images, and its ability to create fully justified text is limited.
At the other end of the spectrum, there are sophisticated layout programs available like Scribus. Sometimes Scribus and InDesign are recommended to fulfill professional needs. But a closer look at these programs shows that they do the opposite of what a technical author needs. The main idea is that the user creates frames on the page, but this is not what a text author wants to do. A book or an academic paper isn't defined by its visual structure but by hierarchical sections. That means a book works with 1. as a section, 1.2 as a subsection and so on. This system isn't supported in Scribus. Even the creation of a table of contents is practically impossible.
We can reduce the comparison to a small question: can LibreOffice Writer replace LyX? The answer is no. Even though LibreOffice has reached the high version number 7, the current version is not able to process lots of images in a technical document. That means there is a reason why LaTeX, LyX and TeXnicCenter exist. All these programs have a measurable advantage over LibreOffice. The main idea behind TeX is very easy to explain: there is somewhere a plain text file which is less than 500 kb. This text file is edited in the software of choice, and after running the LaTeX software, a PDF file is rendered which can contain an endless number of figures, tables and high-resolution pictures. The user won't get any problems creating such large documents, nor does he get a slowdown from the software while entering the text.

Comparison between Lyx and Libreoffice for technical writing

 



The good news is that the number of possible programs for creating larger technical documents is low. Software for creating entire newspapers, like Scribus and Adobe InDesign, is not suited for the task. These programs have a strength in arranging photos, but they support neither textual writing nor the formatting of bibliographies. The remaining programs which are in theory suitable for technical authors are LaTeX, LyX, LibreOffice and MS-Word.
So there is some sort of contrast between LaTeX on the one hand and MS-Word, which represents WYSIWYG software, on the other. The open question is: which one is better?
The screenshot at the top of this posting shows both philosophies in direct comparison. What is shown in detail is the document structure which is later rendered into a PDF file. In both cases the user enters a hierarchical text, similar to an outline tool, and this is converted into a linear text document.
Even if the LyX frontend looks a bit cleaner, the winner of the comparison is LibreOffice Writer. The reason is that the software can be used for many things besides technical writing and has the larger community. In contrast, the combination of LyX and LaTeX needs a very large amount of disk space, and the number of people who use the software has become smaller than in the past.
From a historical standpoint, the LaTeX software was the dominant program for academic writing, because the program was available in the 1980s, when even MS-Word didn't exist. And even today, LaTeX's ability to create mathematical formulas works very well.
On the other hand, with a bit of guidance, LibreOffice Writer, or of course MS-Word, can be used very well for technical authoring. All that is needed is to activate the outline / navigator mode so that the user sees the document structure in the left pane. This makes it easy to add new sections alongside the existing ones.

December 19, 2021

Writing high quality documents with Libreoffice

There is some sort of shared knowledge that existing WYSIWYG text processors are not able to create academic texts, especially if they are longer and the quality requirements are high. And of course, the more powerful alternatives are LaTeX, LyX and other authoring tools like FrameMaker.
The reason why especially the LyX software is a great choice for creating technical documents is that it combines an outline editor for creating the content on a textual level with the powerful LaTeX backend to render the document into a high-quality PDF version.
But sometimes there is a need to create documents without LaTeX, and the question is whether this is possible at all. The following blog post tries to explain how to create technical documents with only the LibreOffice program, which is preinstalled in the Debian operating system.


After starting the LibreOffice software for the first time, it will look like the screenshot. The user sees a blank page and a blinking cursor. The user also sees lots of icons and formatting symbols for changing the font, which is perhaps the main reason why experienced LaTeX users don't like such software very much. In LibreOffice there is no clear separation between content and layout; the user can modify both in the text, which makes it hard to focus only on the text.
The first step is to modify the GUI a bit so that it looks like the LyX GUI. The LibreOffice program has a feature called navigator which can be compared with the outline view of LyX. The navigator shows the document structure in a side tab.


Suppose the user has entered some text and wants to change the font. In LaTeX, and in LyX as well, the only way of doing so is to adjust the global font parameter, which affects the entire document. The LibreOffice program has a similar feature, which is hidden in styles -> edit style -> font.


Getting a preview of the PDF file is a bit complicated. The first thing to do is to switch to the normal view; then the print preview button is visible.


The main problem with the software is that there is an endless number of buttons and menus available. The user has to learn to ignore most of them. In contrast, LaTeX and LyX have a more minimalistic approach. But from a technical perspective, LibreOffice can replace LyX very well. It also has the ability to act as an outline editor, and it is also possible to render the document into a PDF file.


Let us focus on the core element of an outline editor, which is the outline pane. Unfortunately, the LibreOffice tool window looks a bit messy. It contains a lot of icons and the size is fixed. This makes the interaction difficult compared with the LyX software. But it is not impossible to use the software. In theory, the user is able to create and edit outline-oriented text content in LibreOffice similar to other authoring software like Emacs org-mode or the Scrivener app.


December 16, 2021

How to write longer technical documents

 

There is some sort of mystery about how to create longer documents and especially academic books. At first glance the problem is easy to manage, because there are word processing tools like MS-Word available, plus dedicated publishing tools like FrameMaker. The problem is that most authors are not interested in using a certain piece of software; they want to know in general how to write a document.
The first thing which is important to know is that writing a document has nothing to do with formatting it. Formatting a document can be realized by pressing CTRL+A to select the document and then selecting a font in the menu. So the formatting task is the easier one. The more serious question is how to create the text. This is done with a software tool called an outline editor. Without an outline editor it is not possible to write longer technical documents.
Let us take a closer look at how the existing programs work. Programs used for writing books are MS-Word, FrameMaker, LibreOffice, LaTeX, LyX and many others. What all these programs have built in is an outline editor. The interesting situation is that this feature isn't mentioned in the handbooks, and most authors don't know what an outline editor is. But they are using this mode all the time. Let me give an example.
The famous program TeXnicCenter is an IDE for the LaTeX program, and judging by the amount of documentation, TeXnicCenter is used to write real papers. There is a single feature in the program which matters here: the ability to show a pane on the left side. This pane can be used to run TeXnicCenter in outline mode. That means the user can create notes or longer text and order them in a hierarchical fashion.
The interesting situation is that other state-of-the-art programs like MS-Word, LyX or Emacs have a similar built-in feature. Writing a text basically means typing the text into an outline editor. What this editor is called doesn't matter. And it is also not very important whether it uses LaTeX or not. The only important thing is that some sort of outline editor is available. Let us take a look at a seldom-mentioned program available as open source: Cherrytree.
Cherrytree is a normal outline editor without additional formatting options. It can be seen as a stripped-down version of LyX. LyX needs around 10 GB on the hard drive, which includes the TeX backend, but cherrytree works fine with 10 MB of space. According to the self-description in the Debian repository, “CherryTree is a hierarchical note taking application.” And yes, the program is more than suited to creating longer technical documents.
The interesting fact is that Cherrytree doesn't belong to the category of word processing software. It is not a WYSIWYG program like Word, and it is not a LaTeX clone for text-based markup. At the same time, the program is the core component of technical authoring software.

December 10, 2021

Writing documents for the Academia.edu website

 

The Academia.edu website provides lots of academic papers for download. But sometimes there is a need to upload something. The reason is that it is fun, that somebody can learn how to write a paper, or that somebody likes to share his knowledge. The only problem is that creating an academic paper from scratch is a bit hard. The following style guide explains the basics for newbie authors.
The first thing which is important is that a complicated two-column layout is not the best option for the Academia.edu website. The reason is that documents are displayed by default in the HTML mode and something goes wrong with the rendered images, so the chance is high that precisely formatted documents won't look great on the screen. The better idea is to use a minimalist layout which consists of a single column and a larger font size. It is also recommended to deactivate any sort of kerning and fully justified text, and to use only the normal wrapping mode of the document software. Such a layout can be realized in many ways, for example with Word, markdown or LaTeX. My personal favorite is the LyX software, but it depends on the user's experience.
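For the LaTeX case, a minimal sketch of such a layout could look like this, assuming the hyphenat and ragged2e packages are installed:

\documentclass[12pt]{article} % larger font size, single column
\usepackage[none]{hyphenat} % deactivate hyphenation
\usepackage[document]{ragged2e} % normal wrapping instead of full justification
\begin{document}
% ... the paper text ...
\end{document}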
The more serious problem is what to write in the document. Academia.edu is a scientific website, so it makes sense to choose one of the disciplines like mathematics, physics, literature or psychology as the subject. It is also recommended to reference external sources, which is a sign that the document was created for academic purposes. Writing the text itself is much harder to explain. A good workflow is to take existing content written in the past which was stored on the local hard drive and then convert it into an academic paper. If the user hasn't written any draft documents in the past and the local hard drive doesn't contain self-written notes or documents, then it is not possible to write an academic paper.
 
After uploading a document to Academia.edu, something interesting happens next. Contrary to expectations, the number of readers for academic documents is low, or even zero. Some blog authors ask themselves why their traffic is so low. On Academia.edu there is no problem with traffic at all, because the average traffic for a document is 0 views after 6 months. That means the author will put a lot of effort into writing a document, and then it is ignored by the world.
The problem is not located in the website itself; it has to do with academic content in general. That means academic papers in general produce no traffic at all. Even though Academia.edu has an advanced tool for measuring the daily traffic, the chance is high that the charts will show nothing but zero readership in the world.

December 08, 2021

Understanding the line breaking algorithm in TeX

 



By default, TeX formats text in the fully justified mode. In contrast, the screenshot shows the opposite, which is called left-justified text. The open question is: how big is the difference? At first glance there are some white spaces at the right edge, but not very many. It seems that a normal line-wrapping algorithm makes sure that, on average, the right side looks smooth. But let us take a closer look at the example.
Most of the paragraphs are formatted correctly. That means no visible space, or only a small one, appears at the right edge. In contrast, some paragraphs show clearly visible gaps. The cause is that a word was wrapped to the next line and no hyphenation or character kerning was applied. But this affects only 2-3 paragraphs, not all of them. So we have to ask what exactly the advantage is if the entire text gets formatted fully justified.
The interesting situation is that things become more complicated if the text gets blurred. If the reader looks at the layout from a distance, it is harder to see the advantage of fully justified text.

December 06, 2021

Is fully justified text really needed?



The LaTeX community claims indirectly that only fully justified text looks clean. The TeX engine tries very hard to create such an appearance. Advanced hyphenation, spacing and even intra-word spacing are used to make sure that each line has a straight edge on both the left and the right side.
The interesting situation is that the average reader doesn't care about this feature, and a casual look at a document can't tell whether it was formatted in fully justified mode or not. The example screenshot shows a document which was formatted left-justified without any hyphenation. In spite of this absence of high-quality formatting, the right edge looks reasonably well adjusted. Sure, it is not forming a perfect line, but the document doesn't look uncommon or even chaotic. The thesis is that the goal of creating fully justified text is perhaps no longer important.
In the current example an unsharpness filter was applied to the image so that it is no longer possible to recognize individual words. Instead, the reader perceives the document from a distant view. The interesting situation is that the right edge looks reasonably well formatted, even though no formal fully justified mode was applied. It seems that, on average, a simple left-justified line-wrapping algorithm results in a nearly straight right edge, even without adjusting each line with interword spacing. Is the TeX-internal algorithm for word spacing an example of over-engineering, in the sense that in reality nobody needs documents formatted this way?
From a technical perspective, a left-justified document has the same appearance as text rendered in a browser. That means the spacing between the words is always the same, and if the line is full, the word gets wrapped onto the next line. This sounds a bit untypographic, but it results in an easy-to-read text.
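Reproducing the left-justified variant in LaTeX takes a single command; a minimal sketch:

\documentclass{article}
\begin{document}
\raggedright
Every paragraph after this point keeps fixed interword spaces, and a
word that doesn't fit simply wraps to the next line.
\end{document}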

November 14, 2021

Fully justified text in LaTeX documents

 

The LaTeX software is known as a high-quality typesetting program. The reason why LaTeX-generated documents are described as high quality is that they look the same as hand-typeset documents created 200 years ago. To understand LaTeX better, we have to describe what classical hand typesetting is about.
Typesetting 200 years ago was first of all the art of creating fully justified text. The interesting fact is that this formatting style is only seldom described in the literature. MS-Word has a simple button to activate it, and LaTeX uses this formatting style by default. So the user gets no further explanation of why it is needed.
Fully justified text basically means adjusting the words on a line so that the right and left edges are straight. In hand typesetting this art was very complicated to realize and the main reason why it took so long until the metal characters were sorted on a page. The main trick for creating adjusted text lines is to use glue between the words.
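TeX models this glue directly: a rubber length has a natural size plus a stretch and a shrink component, and the engine widens or narrows the interword spaces within these limits until both edges of a line are straight. The same notation appears in any LaTeX length, for example:

\setlength{\parskip}{8pt plus 2pt minus 1pt} % roughly 7pt to 10pt, 8pt preferred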
In the past, all newspapers were typeset in this formatting style. What the manual typesetters were trained to do was realize this unique shape. From a technical perspective, it would have been possible to create flush-left pages with manual typesetting as well. It would even have been a bit more economical. But nobody did so. A closer look into old newspapers shows that 100% of them were typeset with fully justified text.
The idea of LaTeX is to imitate this formatting style. A standard LaTeX-generated document will look like a newspaper which was typeset by hand in the past.
It is not very hard to guess what the opposite of fully justified text is. Flush-left text is equated with missing typesetting. A flush-left formatted text looks very different from what was used in newspapers in the past. The main reason why LaTeX-generated texts all look the same is the absence of flush-left formatting. The typical LaTeX user assumes that it is prohibited to use a flush-left formatting style.

November 11, 2021

Programming larger projects with Forth

 

The sad news is that it is technically not possible to run existing C source code on a Forth CPU. The reason is that Forth CPUs have no registers but only a minimal stack, and no C compiler in the world is able to generate the assembly instructions for such a machine. Even for mainstream CPUs like the 6510 it is hard to write a C compiler, because this CPU also has a low number of registers. And with a Forth CPU the situation is even more dramatic.
So the only practical method for writing software for the GA144 and other Forth CPUs is to hand-code the software in Forth. Unfortunately, Forth is known as one of the hardest programming languages ever, next to Dyalog APL. The following blog post explains an easier way to learn the Forth language.
Instead of trying to understand the idea of a push-down stack, the more beginner-friendly approach to writing something in Forth is to use modular programming. Modular programming is a powerful software engineering method which is supported by most languages, like Pascal, C, Python, C++ and of course Forth. The idea is that each file contains 4-5 functions plus a handful of variables. The functions can access the variables, and the module can be included by other modules. The concept is some sort of stripped-down version of a class.
Modular programming in Forth is a bit more complicated than doing the same in C, but it is not impossible to write such programs. It is mostly a question of gaining experience in writing stack-based functions. The concept of modular programming is the same. It allows the source code to grow to a size of 1000 lines of code and even more, while each single file has a limited size of not more than 100 lines of code. If a file gets bigger, the programmer has to outsource some of the routines and create interfaces to communicate with the other modules.
The interesting situation is that Forth-based modular programming works well with existing Forth CPUs. The only things a Forth system needs to support are the ability to execute a word and the ability to read and write variables.
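A minimal sketch of such a module, written for gforth (the word names are made up for illustration):

\ counter.fs -- one module: variables at the top, words below
variable counter \ module-level state
: counter-reset ( -- ) 0 counter ! ;
: counter-inc ( -- ) 1 counter +! ;
: counter-get ( -- n ) counter @ ;

Another file can load the module with “include counter.fs” and use only the three words as its interface; the variable itself stays an internal detail by convention.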

November 10, 2021

Understanding the 1990s in terms of AI

 

In the case of ordinary computer technology, the 1990s and the year 2021 are not very different. In the 1990s, workstation computers had already been invented, which includes the Unix operating system. Video gaming consoles were widely available, and letters were of course created with a computer, not with a typewriter. So the only new thing today is that the CRT monitors were replaced by flat screens and the computer is a bit faster. But this is not a revolution, only the normal minor improvements of the last 30 years.
In the context of Artificial Intelligence, the situation is much more different. The situation in the 1990s and the situation today are nearly opposites. The people in the 1990s had only a rough understanding of what AI is, and no practical demonstrations were available. Instead of describing what the situation today is, let us take a closer look into the past.
The first thing to mention is that in the 1990s the internet was at an early stage. It wasn't the dominant medium which subsumes all the other media; existing media technology like the cinema, television, books and printed journals was used instead. What the people learned about Artificial Intelligence was published in these media. That means the understanding of robots and AI was influenced by movies and books about the subject. Robots played only a small role in all the existing media, and it was unclear whether such technology could be realized soon. There was a big gap between robots in the movies and robots realized in reality. And this gap was projected into the future. Basically spoken, AI was something not available in the 1990s. It was used as part of a science fiction plot in which robots take over the world, or it was explained in a computer science book that solving an NP-hard problem is not possible.
This public understanding is very different from today's situation. Today the average user of the internet will see lots of videos of working robots, including the source code for the software to recreate them from scratch. This includes walking robots, self-driving cars and autonomous Mario-playing bots. Such a library of existing software, including textual tutorials, wasn't available in the 1990s. In the past, it was difficult to find an entire book about the subject of Artificial Intelligence, and even the advanced ones described the situation only roughly. Another problem in the 1990s was that modern programming languages like Python weren't widespread. That means it was not possible for a newbie to create a line-following robot, because to do so he had to learn languages like C++ first. Basically spoken, creating such a robot was in the 1990s a multi-million-dollar project for a research lab at a university, not an amateur weekend project for learning something.
Perhaps these examples have shown that since the 1990s many things have changed. The dominant new situation is that the former NP-hard barrier seems to be broken. That means the number of papers in which mathematicians prove that AI can't be realized within the next 1000 years has dropped drastically. It was replaced by the understanding that each month a new robot is presented which is a bit more powerful than the previous one.
What would happen if a person from the 1990s were teleported into today's time? In the case of normal computing, the situation would be relaxed. He could use his former knowledge of how to use a mouse and a keyboard to control a computer. He would find the same MS-DOS prompt and the same Unix command line, and sending e-mails is the same as 30 years ago. But in one subject the teleported individual would get strongly confused, which is the current state of AI. After sitting down in front of a computer and watching a YouTube playlist, he wouldn't believe what he sees. The shown examples in robotics are so different from what he knows that the person wouldn't understand them anymore. It would simply be too much to see a biped robot, self-driving cars, autonomous drones or the poor hitchBOT waiting on a park bench at night.
The development in robotics wasn't an abrupt event but a continuous flow of changes. But in the end, the years 2021 and 1990 have nothing in common in terms of AI. There is a huge gap between both years, and not a single revolution but many of them have occurred. Most of today's AI technology, like biped robots, deep learning and game-playing AI, wasn't available in the 1990s. It has entered the world without any warning, and there is no sign that the development will stop someday. It seems that the subject of AI was the single driver which has changed the world into a futuristic one.

Modular programming with any programming language

In the past it was some kind of game to compare different languages against each other. C programmers are convinced that their language is the fastest one, Python programmers emphasize how easy it is to write code, and Forth programmers are proud of the low energy consumption of a single instruction. It is hard or even impossible to bridge these communities.
On the other hand, there is a unique element which all programming languages have in common. This feature is more powerful than stack-based computing and easier to realize than object-oriented programs. This single useful feature is the ability to write programs in a modular fashion. The term modular programming is a bit uncommon and sounds outdated, so it makes sense to explain it further.
A module is a file which contains 5-8 procedures and 5-8 variables. The procedures are allowed to manipulate the variables, and the idea has much in common with a class. Modules were used in programming before the advent of OOP. The Pascal language knows the unit statement, the C language can include prototype files, and Fortran programmers can create modules as well. So we can say that modular programming is the single paradigm which is available in all languages, including Forth, which also has modular capabilities.
Modular programming doesn't look very interesting at first glance, but it is the key element for writing larger programs. A larger program contains 1000 and more lines of code. Such projects can only be realized in a modular fashion, because it allows the program to be structured into logical chunks which are programmed independently of each other.
The perhaps most interesting situation is that with modular programming all the languages, including Forth and C++, are easy to master. What the programmer has to do is follow the rules strictly. That means, if the program gets bigger, he has to create a new file and put the procedures into this file. Programming means managing procedures and variables. This paradigm allows any problem to be solved in software.
Let us take a closer look at some larger Forth projects on GitHub. Instead of explaining how a stack-based language works, let us focus only on modular programming. What we can see in these projects is that there are some files, and each file has the same structure. At the top some variables are initialized, and at the bottom some functions are written down which access the variables. So the concept is the same as what C programmers use, and what Java programmers use when they create new files for a new class. And yes, the principle makes sense, because it allows longer programs with more features to be written.
The interesting situation is that modular programming has no limit on the code size, because newly created submodules can be included in other modules, and in the end there are 50 and more files available, each with 100 lines of code. So the overall structure is highly hierarchical. It is less powerful than real object-oriented programming, but it comes close to the idea.
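In gforth the inclusion mechanism is a single word; a minimal sketch in which all file and word names are made up for illustration:

\ main.fs -- top of the module hierarchy
include buffer.fs \ submodule with its own variables and words
include parser.fs \ may itself include further submodules
: main ( -- ) parse-input print-buffer ;

Each include simply loads another file, so the 50-file hierarchy described above is built from nothing more than this word.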
It sounds a bit trivial to explain this programming style in detail, because C programmers have been doing it since the 1970s and it is the most common coding style ever. On the other hand, the number of books that describe the principle is low. Most introductory C books don't even have a chapter about creating programs with more than a single source code file. As with other languages, the main focus of a tutorial is to explain the language itself, which includes the statements, but this knowledge doesn't allow the reader to create useful software.
So we can say that the individual programming language is less important. The fact that the for loop in Forth works differently from a for loop in Python can be ignored. What the programmer needs to know instead is how to write programs distributed over units.

Safe artificial intelligence in the past

From a technical perspective it is very hard to prove that robotics isn't possible. It is even much harder to show that AI is limited to a certain level which can't be exceeded in the future. It is also impossible to slow down the progress or reverse the development. There is no such thing as a framework to control the development of Artificial Intelligence. What is possible instead is to ignore the future and take a look into the past, which is much easier to understand.
The major advantage of the 1990s compared to the situation today is that in the past robots hadn't been invented yet, and the number of books about the topic was small. That means the 1990s were a time in which the problem of an AI uprising wasn't there.
The 1990s were the decade before the rise of the internet. The dominant media in this time were the television for the masses and the printed book for educated scholars. The amount of information was small, and no search engines were used. From the perspective of AI, the situation was also at a very early stage. A robot like Honda's Asimo, which is able to walk upstairs, hadn't been invented. A company like Boston Dynamics wasn't there, and at the beginning of the 1990s the world's strongest chess player was a human, not a computer. The subject of deep learning hadn't been invented either. The only things available were normal perceptrons with not more than 20 neurons. These systems were realized in software for the Windows operating system, but no useful application was known.
Self-driving cars were also not invented in the 1990s, for many reasons. What was available instead were lots of movie-related robots like the K.I.T.T. car or the Data android in Star Trek. And the perhaps most surprising fact is that the people in the 1990s were not convinced that Artificial Intelligence would become possible one day. A widespread belief was that simple biped robots would be available around 300 years in the future, or that they wouldn't be possible at all. The underlying theoretical concept used to support this assumption was NP completeness. NP completeness basically says that AI problems can't be solved on a computer because of the large state space. A similar idea was introduced in the Lighthill report in the 1970s.
A widespread philosophical interpretation of Artificial Intelligence in the 1990s was that since the beginning in the 1950s, AI researchers had promised many things, but none of the goals were reached. So the rational understanding was that AI is too complicated in general and it is not possible to build such machines. Nobody in the 1990s was able to disprove this assumption, so it was equal to common shared knowledge.
So we can say that the 1990s were the last real AI winter. AI winter means that the subject was seen as impossible to realize and that the research on the subject had stopped. What the computer scientists did instead was program normal software, for example games, operating systems and, very importantly, network protocols for the emerging internet. That means AI in the 1990s was an esoteric discipline, not recognized very much.
Suppose a person from the year 2021 travels back to the 1990s and explains to the audience which sorts of robots will be available in only 30 years. He will say that biped robots are possible, that kitchen robots can be built, that self-driving cars can be realized with neural networks, and he will explain that the source code for Tetris-playing AI bots is distributed as open source to everybody. The audience won't believe any of these words. The audience will say that it is impossible. If the time traveler tries to give the details and explain how this software can be realized, the audience will leave the room, because it is outside their horizon. It would be too much for a person of the 1990s to hear what reinforcement learning is about or that chess software can beat a human player.

November 09, 2021

The 1990s from the perspective of Artificial Intelligence

 

Describing the current situation in AI and robotics is nearly impossible. There is an endless number of projects and documentation available, and at least once a week a new robot is shown on YouTube which looks more human-like than the one from the week before. Advanced topics like biped walking, speech understanding and the playing of video games are solved already, or the chance is high that within the next 12 months such a success story will become visible, and fiction and reality have merged into a complex media campaign. It is not clear whether a certain robot head is remote controlled or driven by an advanced algorithm, or whether a certain walking robot did the steps in reality or was drawn into the picture with animation technology.
If the current world is too complex, it makes sense to increase the distance and observe something which has a clear frame around it. The 1990s are a great decade for describing artificial intelligence. The major advantage is that the number of published books is known and that the level of technology in the 1990s was low. Biped robots hadn't been invented, chess machines were not able to beat the best human player (with the single exception of the machine built by IBM), and most researchers were not sure whether AI could be realized in the future.
Computer technology in the 1990s was at an early stage, and most problems were located in slow hardware and bugs in the software. Windows 95 was a common operating system during that period, and the internet was at its very beginning. This setup makes it easy to give a full overview of all robots and AI projects of this time.
Basically spoken, AI wasn't realized in the 1990s; it was a philosophical topic. The debate was carried out by experts in computer science who thought about how realistic AI would be in the future. That means 95% of the population wasn't informed about the subject at all, and the few computer experts who used neural networks and expert systems in reality were not able to demonstrate practical applications.
Artificial Intelligence in the 1990s was mostly a subject for movies and science fiction books. Lots of stories about intelligent robots were available in this period. Even though it was not possible to build real robots, it was within reach to imagine a future in which positronic brains and other technology would allow humans to do so.

November 08, 2021

The 1990s from the robotics perspective

 

... were great, because no such innovation was available. All the technology available today hadn't been invented in this period. What was common instead was normal computer technology, plus some philosophical books about how to realize Artificial Intelligence in theory. It was even unclear whether a chess computer would be able to win against the best human player. The general understanding of AI during the 1990s was the same as formulated in the famous Lighthill report. Basically spoken, the idea was that even though some expensive AI projects had been started at US universities, none of them had resulted in something useful.
The only place in which robots were available in the 1990s was the cinema. In blockbuster movies, and of course in the Star Trek series, many examples were shown. But again, none of these things were built by researchers.
From today's perspective it is known that during the late 1980s the Honda company developed humanoid robots. But during the 1990s the internet was not widely available, so even computer experts were not aware of it. It was common sense that no one had tried to build walking machines, because it is too complicated to realize. What was known in the 1990s were some examples of expert systems. Some of them were described in mainstream computer journals. It was also known that for playing chess or tic-tac-toe some sort of Artificial Intelligence is needed. But it was unclear how exactly such technology could be realized.
From today's perspective, the 1990s have much in common with the stone age. AI was something not yet invented, and it was imagined that it would take 100 years or longer to build it. Just to give a figure: the plot of Star Trek TNG plays around 300 years in the future, and exactly this was the estimated duration until biped robots would be built.
Perhaps one surprising insight is that in the early 1990s robotics competitions were not common. From a technical perspective such robots could easily have been realized with 1990s technology, but at the time no one saw a need for doing so. What was common instead was to write smaller programs in Prolog and Lisp. For all home computers and PCs a compiler was available, and some books were available about the subject. What such Prolog programs in the 1990s were able to do was nearly nothing. Some more advanced software was able to solve logic games, but most of the programs were created as hello-world examples.
With today's knowledge it is possible to identify some advanced projects in the 1990s not mentioned yet. For example, the MIT Leg Lab built lots of walking machines in the 1990s. But again, the internet wasn't widespread, so no one was aware of it. That means, even though they built these machines and published papers about them, computer experts and hobby programmers alike simply never noticed. Or let me explain it the other way around: suppose a time traveler would visit a larger university library outside of the U.S. in the 1990s and read all the books. He won't find a single piece of information about the MIT robots, the Honda Asimo project or any other advanced AI project from this time. Sure, if the imagined time traveler visits the MIT library and knows which paper he needs to read, then he will find the information. But without such an advantage he stays completely clueless.

November 05, 2021

Symmetric typesetting with LaTeX


At first look the LaTeX ecosystem appears highly complex. There is an endless amount of packages, tutorials and guidelines available. The surprising fact is that it is possible to reduce the idea behind LaTeX to a single screenshot. What the LaTeX community tries to achieve is shown at the top of the image. The typeset paragraph looks similar to a poem: the headline is centered and the paragraph is fully justified. The justification was created by adjusting the glue between the words, plus minor adjustments by the microtype package.
In contrast, the picture at the bottom shows what the TeX community tries to prevent. In such a situation the symmetry is missing because everything was formatted flush left.
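To make the comparison concrete, here is a minimal LaTeX sketch of both variants; the paragraph text is only a placeholder:

\begin{center}\bfseries A centered headline\end{center}
Some paragraph text which LaTeX justifies by default by
stretching the glue between the words.

{\raggedright
The same paragraph again, this time set flush left, so the
right edge stays uneven.\par}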
The interesting situation is that this preference has nothing to do with the LaTeX software but with typography in general. So we have to understand why symmetric, centered formatting is perceived as beautiful. The idea is that the written word follows the principles of art: the page is not only a book but a graphical creation, and the reader should get this impression. From an author's perspective it is trivial to format something in centered mode.

Let us take a closer look at the shown paragraph. It is the same text, the same font, and the software for rendering it was in both cases the latest version of the LuaLaTeX engine. The only difference is that in the second case the formatting style “flush left” was applied. Now the question to answer is which of the two cases has the higher score in terms of typographic quality. The simple answer is that the top example gets the quality judgment “beautiful example of typesetting”, while the bottom case gets the judgment “low quality, or even no typographic quality at all”.
This judgment is not based on any objective criteria; it is simply a preference for or against flush left. How can it be that the established LaTeX software, including the high quality Latin Modern Roman font, can create low quality typesetting? Because the untold assumption is that only symmetric text (a centered headline plus a fully justified paragraph) is beautiful, and everything else is wrong.
The underlying reason why this rule is valid is that it takes more effort to create fully justified text. If the text isn't created with an algorithm but with a hot metal printing press, it takes an endless amount of time to fully justify the text. What the typesetter has to do is determine the size of the words and calculate how much glue is needed. In contrast, putting the letters together for the bottom example can be done much faster.

November 04, 2021

Modular programming with Python – a tutorial for creating a game

Writing a small game with Python is not very complicated. The existing pygame library allows even newbies to do so. Most tutorials assume that the game is created with the object-oriented paradigm. That means there are classes for the GUI, for the physics and for the main program. This assumption makes sense because since the advent of the C++ and C# languages nearly all games have been created this way.
A seldom mentioned technique for creating larger software programs was invented before the advent of C++. What was used before is called modular programming, and it can be realized with Python as well. The interesting situation is that modular programming, similar to OOP, allows dividing a larger project into chunks which are created individually. The first thing to do is to create the physics module, which consists of a single file.
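The original listing is not reproduced here, but a minimal sketch of such a physics module could look like the following; the file name physics.py, the coordinates and the speed value are assumptions:

# physics.py -- all state lives at module level, no class statement (sketch)
x = 320      # horizontal position of the circle, in pixels (assumed start value)
y = 240      # vertical position of the circle, in pixels
speed = 5    # movement per update step, in pixels

def move_left():
    global x
    x -= speed

def move_right():
    global x
    x += speed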

The formatting has similarity to a class, but the class statement is missing. The next step is to create the main module, which also has no classes but a variable for the window and two functions. The interesting situation is that the physics module is not instantiated as an object; it is only imported, and the main module then sends messages to it.
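Again as a sketch under the same assumptions, the main module imports physics directly and calls its functions; the window variable and the two functions match the description above:

# main.py -- no classes: one window variable plus two functions (sketch)
import pygame
import physics   # imported as a module, never instantiated as an object

pygame.init()
window = pygame.display.set_mode((640, 480))
clock = pygame.time.Clock()

def handle_input():
    keys = pygame.key.get_pressed()
    if keys[pygame.K_LEFT]:
        physics.move_left()
    if keys[pygame.K_RIGHT]:
        physics.move_right()

def draw():
    window.fill((0, 0, 0))
    pygame.draw.circle(window, (255, 255, 255), (physics.x, physics.y), 20)
    pygame.display.flip()

running = True
while running:
    for event in pygame.event.get():
        if event.type == pygame.QUIT:
            running = False
    handle_input()
    draw()
    clock.tick(60)   # limit the game loop to 60 updates per second

pygame.quit()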


The game itself consists of a small circle on the screen which can be moved with the left and right cursor arrows.


November 03, 2021

Object oriented programming without objects

 

The Python programming language provides a seldom explained feature, which is the ability to use modules. A module is different from the class concept and has much in common with header files in the C language. The idea is to avoid classes in Python entirely and distribute the code over many files which are included with the import statement.
From a Python perspective the advantage is small. But a Python script which uses only modular programming is much easier to convert into plain C code. That means the written program or game can be made faster in the future by translating the Python code into C code. The interesting situation is that using modules instead of classes works surprisingly well. What the programmer can do is create small files with less than 100 lines of code. Each file consists of some functions and some variables at the top. The variables are only accessed from within the module, not from the outside.
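As a minimal sketch of this style, assuming a hypothetical file counter.py, the variable on top plays the role of a private attribute:

# counter.py -- hypothetical module; the variable on top acts as private state
_count = 0           # only touched by the functions in this file

def increment():
    global _count
    _count += 1

def get_count():
    return _count

Another file can then simply write import counter and call counter.increment(); no class and no object is involved, yet the state stays encapsulated in its own file.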
What modules don't allow is inheritance. It is also not possible to put different modules into the same file; each module is stored in a separate file. This will increase the number of files drastically. But thanks to the ability to import modules hierarchically, it is possible to create very complex applications with dozens of modules which are combined into submodules.
The chance is high that many programmers are already using the same technique. The C language has the disadvantage that creating such modules is more complicated, because an additional header file is needed as an interface to the outside. But in theory this concept can replace object-oriented classes. That means there is no need to convert C code into C++ classes.
Modular programming has fallen out of fashion since the advent of C++. Today more complicated programs are mostly realized with the OOP paradigm. Also, the UML notation knows only classes, not modules. But in theory both ideas allow creating larger projects with thousands of lines of code.
The only thing that doesn't work is to avoid classes and modules at the same time. If somebody writes 20 variables and 40 functions into the same file, it is hard to determine which function gets access to which variable. Such code can't be maintained. So it is important to divide the code into smaller chunks with less than 100 lines of code each. Such a file can be analyzed easily by a human programmer.

Upgrade to Debian 11 causes trouble

 

On some blogs on the internet it was suggested that the upgrade from Debian 10 to 11 is very easy. No, it isn't. The first problem is to fix the /etc/apt/sources.list file. There are different tutorials on how to do so. The user has to change the name of the distribution, but also the URL of the security mirror server. But suppose the user has carefully recreated the file and has run the “apt upgrade” command. What he can expect then is that only the kernel is updated. After rebooting the machine, important programs like the e-mail software Evolution or the matplotlib library don't work anymore. But at least the GNOME environment seems to be stable. So the user will surely run the apt full-upgrade command, but this makes things worse. After the next reboot the icons on the desktop are missing and it is not even possible to start the terminal program. That means a simple attempt to upgrade the operating system has caused a system-wide failure.
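For reference, a sketch of what the rewritten sources.list can contain for Debian 11; the mirror URLs shown are the common defaults and may differ on a given machine:

deb http://deb.debian.org/debian bullseye main
deb http://deb.debian.org/debian bullseye-updates main
deb http://security.debian.org/debian-security bullseye-security main

Note that the security line changed its format in Debian 11: the suite is now called bullseye-security instead of the old buster/updates naming.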
Only a manual login into a text-only session next to the GNOME X11 session and updating the system again solved the issue. After many reboots, autoremove commands and fixes for lots of minor problems, it is some sort of miracle that the GNOME session runs normally again. That means the new Debian 11 system boots, and at least major programs like the terminal and the Firefox browser can be started.
What we can say for sure is that the upgrade to Debian 11 works only with lots of trouble, and compared to Windows 10 it is not recommended for beginner computer users. Linux (especially Debian) remains a construction site, and the chance is high that after rebooting the machine the user can't log in anymore.

Understanding the LaTeX typesetting system for realizing justified text

 

There are many tutorials available on how to create papers and even dissertation projects with the help of LaTeX. What these manuals have in common is that they don't explain in detail why LaTeX is the better choice over Word. In most cases the argument is that the rendering quality of LaTeX is much higher because of better internal algorithms. To understand what this means in detail, we have to read existing dissertation documents and analyze how they are formatted.
What dissertation documents have in common, no matter which software was used to create them, is that all of them are formatted with a justified layout. This formatting style is so obvious, so frequently used and has such a long tradition that it isn't mentioned explicitly. The typical dissertation is formatted in a symmetric way. That means the title headline is of course centered, and the main body forms a symmetric block as well: its left and right edges are straight lines, which typographers call fully justified text.
It depends on the author how exactly this style is realized. A common option is to use the MS-Word software, disable the hyphenation feature and then format the entire text fully justified. The result is that many wide gaps are visible between the words.
Another option used by Word authors is to activate the hyphenation feature first; then the rendered justified text has a smaller amount of empty space. And exactly this situation is the reason why LaTeX is recommended as a Word replacement: the LaTeX word-wrapping algorithm is able to reduce the empty spaces even further. The same text looks different with LaTeX, because LaTeX uses an optimized word-wrap algorithm and, very importantly, the microtype package. So the result has much in common with the output of the InDesign software, which is also able to create high quality text.
So what LaTeX does is simple: it creates fully justified text with the help of hyphenation and intelligent word wrapping so that the amount of white space is minimized. This ability is labeled by the LaTeX community as high quality output.
So let us go a step backward and ask a simple question: why exactly is a dissertation formatted as fully justified text, and what about the flush left formatting style? Nobody knows. The question is so unusual that it is hard to answer it at all. Roughly speaking, the paradigm is that the only allowed formatting style is symmetric, in the sense that the headline is centered and longer texts are fully justified.
This kind of rule is much bigger than the LaTeX community. The rule is valid for other programs like InDesign and MS-Word as well. The rule was also valid for dissertations written before the advent of the PC. So it has to do with typesetting in general.
From a technical point of view the LaTeX software can produce better justified text than MS-Word. This is not a subjective interpretation; a 1:1 comparison will show it. In contrast, the difference between LaTeX and InDesign is small; both are able to create optimized justified text. The open question is whether full justification in general makes sense. Is there a need to produce centered, symmetric documents?
In history there are two main exceptions to this rule. Letters are usually created flush left, and internet-based HTML pages are also formatted flush left. Everything else, especially books, journals and dissertations, is typeset in justified mode.
Perhaps it makes sense to explain the situation from a more positive perspective. A typical introduction to the LaTeX software starts with a direct comparison with MS-Word. On the left side the document is shown formatted with Word, and on the right side the same document is formatted with LaTeX. Of course the LaTeX-rendered PDF document looks better because the text density is higher: it has little or no extra white space, and the page looks similar to a printed book. Because of this ability of LaTeX to generate high quality output, the software is frequently used for academic purposes.
What is not answered in this comparison is the problem of formatting in general. The untold assumption was that both examples (Word and LaTeX) have to format the paragraph in fully justified mode. In this restricted domain, LaTeX is much better. If the paragraph setting is changed to flush left, the result is the same.
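As a sketch, trying this comparison needs only one extra package in the preamble; the ragged2e package with its document option switches the whole body text to flush left:

\documentclass{article}
\usepackage{microtype}           % micro-adjustments for the justified case
\usepackage[document]{ragged2e}  % comment out this line to get the default justified layout
\begin{document}
Some body text which is now set ragged right instead of fully justified.
\end{document}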

November 02, 2021

The case of raggedright in the TeX community

 

It might be surprising, but about the most important issue in typesetting the amount of available information is small. Most books about LaTeX explain what typography is and why TeX is great, but they do not question the difference between flush left and fully justified text. One of the few exceptions was made in the TeX journal TUGboat [1].
Such a debate is more general than the common question of how to realize a certain layout with LaTeX, because fully justified text and typography are strongly connected:
“For centuries, book printing has applied justification to nearly all paragraphs” [2]
And yes, the statement is correct: a short look into the history of typography shows exactly this preference. Not only do documents generated with LaTeX mostly look the same; all academic books and journals of the last 300 years share the same standard layout. It consists of two columns which are typeset with justified text. This produces a perfectly straight edge and is described by typographers as good typography. That means the assumption is that everything which is not fully justified equals low quality typography.
It is impossible to find pro or con arguments here, but what is possible instead is to describe the situation as it is. It is very hard to find books and journals which are typeset with a flush left layout. One of the rare examples is recent issues of the PLOS One journal. PLOS One is a non-traditional, digital-only academic journal, and perhaps this gives a hint of what the idea is: fully justified text is a sign of printed traditional documents, while a flush left layout is typical for online-only publications.
References
[1] Are justification and hyphenation good or bad for the reader? First results. TUGboat, Volume 37 (2016), No. 2, https://tug.org/TUGboat/tb37-2/tb116akhmadeeva.pdf
[2] Udo Wermuth: An attempt at ragged-right typesetting. TUGboat, Volume 41 (2020), No. 1, https://tug.org/TUGboat/tb41-1/tb127wermuth-ragged.pdf