May 31, 2023

Creating beautiful papers with LaTeX

The LaTeX community judges the layout of a paper in an unusual way. Anything that looks like a wall of text counts as excellent typographic style, while documents with a lot of white space, images and key points are treated as low-quality papers. In the following Lorem Ipsum comparison both pages were created with LaTeX, yet the layouts look completely different.
The page on the right side wasn't created with MS Word; everything was laid out by the LaTeX engine. The difference is that the line spacing was increased, ragged-right mode was activated, more paragraphs were created and two images were added. The judgment about good vs. bad typography comes down to a preference for a wall of text vs. accessible typography.
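The settings listed for the right-hand page can be sketched in a few lines of LaTeX. This is only a minimal sketch: the packages (setspace, ragged2e, lipsum), the spacing values and the placeholder image are assumptions chosen for illustration, not the settings actually used for the comparison.

```latex
\documentclass[11pt]{article}
\usepackage{graphicx}  % \includegraphics
\usepackage{setspace}  % \onehalfspacing
\usepackage{ragged2e}  % \RaggedRight (ragged right with hyphenation)
\usepackage{lipsum}    % Lorem Ipsum placeholder text

\onehalfspacing                          % increased line spacing
\setlength{\parskip}{0.8\baselineskip}   % visible gap between paragraphs
\setlength{\parindent}{0pt}              % no first-line indent

\begin{document}
\RaggedRight                             % switch off full justification

\section{First topic}
\lipsum[1][1-4]                          % short paragraph: sentences 1 to 4

\begin{figure}[h]
  \centering
  % example-image-a is a placeholder shipped with the mwe package (assumed installed)
  \includegraphics[width=0.6\linewidth]{example-image-a}
  \caption{Placeholder image}
\end{figure}

\lipsum[2][1-4]
\end{document}
```

Removing the three layout lines in the preamble and the `\RaggedRight` switch brings the page back to the default justified look.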
The page on the left side is a typical example of a LaTeX-formatted wall of text. It is the reason why MS Word users argue that every LaTeX document looks the same. There is an endless amount of text but no visible anchor where the reader can rest. The content is hard to grasp, a visual layout is missing, and there are no additional sections or pictures. The surprising point is that, from the perspective of LaTeX, the example on the left side is the one that is here to stay. This is how an academic text has to look.
The open question is why exactly a wall of text is perceived as high-quality typography. Isn't typography the same as easy-to-read text? This is perhaps the most obvious misconception. Creating a document with LaTeX doesn't mean that it contains images or has a lot of white space to rest in. It means following the best-practice formatting style that has been used in academia for decades. The assumption in academia is that one writes about the world in a highly abstract language. Abstraction is the opposite of using images, examples and tables; it means formulating endlessly long sentences with lots of hard-to-explain specialized vocabulary. In other words, academic texts are hard to read.
The advice to improve the readability of an academic text with special formatting and images ignores the principle of a book. By definition there is a difference between a book and a PowerPoint presentation. A book always consists of long sentences and only seldom has images. A PowerPoint presentation works on the opposite principle. Every presentation divides the subject into easy-to-grasp sections and every slide contains at least one image. Creating a presentation without images and with full sentences is not recommended.
The main reason why the academic community prefers LaTeX for typesetting books and papers is that LaTeX is the king of formatting a wall of text. It compresses the text into the page. Every line looks the same. The bottom edges of both columns are aligned.
This visual appearance is achieved with advanced typographic algorithms that adjust the fully justified paragraphs and the vertical space between sections. The microtype extension available in the pdfTeX program increases the homogeneous effect further. MS Word and even InDesign can't compete with this visual appearance, and the result is that LaTeX has been the document formatting software of choice at universities worldwide for decades.
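The ingredients named above (two columns, full justification, aligned column bottoms, microtype) fit into a very small document. The following is a minimal sketch under the assumption of the standard article class and pdfLaTeX; the font size and the amount of placeholder text are arbitrary.

```latex
\documentclass[10pt,twocolumn]{article}
\usepackage{microtype}  % character protrusion and font expansion under pdfTeX
\usepackage{lipsum}     % Lorem Ipsum placeholder text

\begin{document}
\flushbottom            % stretch vertical space so both columns end on the same line
\lipsum[1-12]           % twelve justified paragraphs, no sections, no images
\end{document}
```

Full justification is the LaTeX default, so no extra command is needed for it.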



May 21, 2023

Advantages of a wall of text

 

Most writing tutorials explain to the reader that a wall of text is an antipattern which should be avoided at all costs. In contrast, a well-structured text enhanced with images is preferred because it makes reading much easier for the audience.
This blog post explains the opposite perspective. The starting point is the assumption that avoiding a wall of text is the trivial case. A well-structured text which is easy to read can be realized with a PowerPoint presentation. By its own definition a PowerPoint presentation consists only of key terms plus visual gimmicks like tables and images, and doesn't provide prose. In addition, each slide has a title and can be grasped in a short amount of time.
Or let me explain it the other way around. All existing PowerPoint presentations, including Beamer presentations generated with LaTeX, avoid a wall of text. There is no shortage of easy-to-read documents, but there is a difference between a presentation and prose. The working thesis is that a so-called longer text, also known as a paper, has to be a wall of text; otherwise it is not a paper but something else. A novel, which is a fictional text, is also a wall of text, because otherwise it would be a comic book or a movie, which is a different category.
It is simply unfair to complain that a 500-page fictional book is a wall of text, because this layout pattern is what defines a book. It is not possible to write a text of many thousands of words without formatting it as a wall of text.
Rejecting a wall of text is equal to rejecting the idea of a book in general. With this judgment in mind, all newspapers, academic papers and books would have to be dismissed because they contain an endless amount of text, and only non-books would be allowed. This would include PowerPoint presentations, comic strips, dialogues between people and TV shows. In these cases there is no longer text available.
The term typography, in its core sense, usually means formatting a longer amount of text.

May 19, 2023

Wall of text, or: the beauty of LaTeX

In the existing debate around MS Word vs. LaTeX one important aspect is missing. Everybody knows that LaTeX documents all look the same, but the open question is why somebody should prefer such a style. To answer this question we have to state the requirement that LaTeX fulfills with ease. The challenge is to create a text-intensive document which is unreadable and basically a wall of text. Such a requirement sounds a bit paradoxical because it is seldom formulated this explicitly, but let us assume that this is the task for a typographer.
The typographer isn't asked to create a PowerPoint presentation with pictures on every page plus the important key points. The task is to create a homogeneous, endless document without any images and without any subsections. Now it is possible to discuss how to do so in detail.
The first Lorem Ipsum document was created with LibreOffice Writer. The document can be read easily: there is a picture, there are different subsections, there is enough space between the paragraphs, and there are key points which make it easy to grasp the content. In a word, the LibreOffice example fulfills the criteria of an accessible document.
The example on the right was of course created with LaTeX. All reader-friendly style elements like the picture and the bullet points were removed, and only a long sequence of paragraphs is visible. This kind of text-only layout can be realized very well with LaTeX. The document fulfills the requirement easily and its shape is certainly unreadable. In theory, text walls can be created with LibreOffice too, but the internal algorithms of LaTeX are much better suited to this purpose. The spaces between the words are tighter, which makes the page look like a single block.
The document on the right side looks very scientific. The only way to improve the style is to reduce the font size from 10pt to 9pt and add some esoteric mathematical equations. In other words, creating a wall of text is the meaning of LaTeX.
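The suggested "improvement" can be sketched as follows. The standard article class only offers 10pt, 11pt and 12pt, so this sketch assumes the extarticle class from the extsizes bundle; the equation is an arbitrary example and not taken from the original document.

```latex
\documentclass[9pt]{extarticle}  % extsizes bundle: adds 8pt, 9pt and other sizes
\usepackage{lipsum}              % Lorem Ipsum placeholder text

\begin{document}
\lipsum[1]
\begin{equation}
  \int_{-\infty}^{\infty} e^{-x^{2}}\,dx = \sqrt{\pi}
\end{equation}
\lipsum[2]
\end{document}
```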


May 18, 2023

The meaning of LaTeX

The LaTeX community uses a certain typographic style for creating academic journals and books. This style is codified in the TeX engine itself, and online forums explain how to use the software. The question that has remained open is why this style is applied.
The assumption is that even the LaTeX community itself doesn't know why it formats PDF documents in a certain way. The only thing that is certain is that a LaTeX-generated paper looks very different from a Word-generated document. There is a single working thesis for why LaTeX documents all look the same. In one sentence: it is about creating long-winded text which forms a typographic desert.
Let us describe the situation from a bird's-eye perspective and ignore the LaTeX ecosystem including the underlying algorithms. Suppose the idea is to format the text in the most boring fashion. The text should look like a monoculture in agriculture. There is no structure; all the lines, characters and paragraphs look the same. That means the font size is the same, the line width is the same and, very importantly, there are no images and no subsections. It is simply an endless amount of characters without any orientation. The reader won't see a starting point and will hardly find any lighthouse. Of course, such a text is unreadable. It has much in common with how newspapers looked until the 1970s. During that period pictures were technically difficult to use and even tables were rare. Instead, a newspaper article was a long homogeneous text block.
With this requirement in mind, the next logical step is to investigate the different options for creating such a document with a computer. One option is MS Word; the much better alternative is the LaTeX engine. Both programs are able to create an image-free, justified text without any subsections. The reader will surely lose orientation, because there is only an endless amount of text in a tiny font. There is no visible structure because a single paragraph occupies multiple pages. In other words, such a text is the opposite of an easy-to-read document.
The surprising point is that this antipattern in typography is exactly what LaTeX is trying to achieve. The goal is that the text is hard to read. The internal algorithms in TeX, which are treated as equal to typography, try to create a situation in which the reader is lost in the text. The goal is not to provide waypoints and structure; the opposite is the case. Everything looks the same. The meaning of a text isn't provided by its visual appearance but only by the words themselves. Without reading the text, a fictional book and a non-fictional academic text look the same. If the text were printed mirrored, it would be impossible to guess what the book is about.
LaTeX works with sophisticated algorithms for justifying paragraphs and adjusting white space. These algorithms have a simple purpose: they reduce any visible structure. The goal is that every line looks the same and that the resulting text is hard to read. The eye has no visible anchor points. There are no subsections, there are no images, and at the end of a line there is no white space. Instead, the text looks like an endless ocean of characters which form words never seen before.
There is a possible explanation for why this unusual formatting makes sense. A book is different from a comic book and different from a television show. A book is by definition equal to text. Typography is not about increasing the number of pictures in a book; it is about making the text look like a desert. A book has to confuse the reader by presenting an endless number of pages which all look the same.
 
There is no need to use LaTeX to create long-winded documents. MS Word can fulfill the same purpose with some modifications. The only thing the author has to do is reduce the font size, format the text fully justified, remove all images, and remove all white space and subsections from the text. The result will be a bit different from the very homogeneous LaTeX rendering, but it comes close to the expectations. Typography isn't a subjective decision; it is the art of creating hard-to-read text.

Why LaTeX is great

 

Many attempts have been made to compare the LaTeX engine with possible alternatives like MS Word. In most cases, LaTeX advocates claim that their typesetting software delivers higher typographic quality. This kind of judgment is subjective because it hides the criteria by which one layout is better than another.
The underlying reason why LaTeX is the superior rendering engine is its ability to create long-winded text. Hard-to-read text without any images or sections is generated easily with the TeX engine. Here is an example:

The only thing that is certain is that such a text is the opposite of accessible typography. It doesn't contain any pictures and all the lines look the same. LaTeX makes it easy to create such typography. To emphasize the situation, I have reduced the line spacing to 0.95, which means the vertical space between the lines is smaller than normal and the text is even harder to grasp.
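A line spacing factor of 0.95 can be set with a single command. The sketch below assumes the kernel command `\linespread` was used; the same effect could also be reached with the setspace package, so this is one possible reconstruction and not necessarily the original source.

```latex
\documentclass[10pt]{article}
\usepackage{lipsum}  % Lorem Ipsum placeholder text

\linespread{0.95}    % scale the distance between baselines to 95% of the default

\begin{document}
\lipsum[1-6]         % justified paragraphs without pictures or sections
\end{document}
```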
To understand the content the reader has to go through the text line by line. This is much harder than watching television or reading a comic; acquiring knowledge from the text is hard work. This makes the layout perfect for a scientific paper. The only things missing from the document are some footnotes and non-English vocabulary, which would increase the reading difficulty further. In other words, academic publishing is mostly about creating inaccessible content.

May 04, 2023

LaTeX revisited

LaTeX is known as the standard tool for academic publication, and lots of online forums and external software are available in its ecosystem. The main problem is that the promise of LaTeX doesn't match reality. The following blog post explains in detail what the problem with LaTeX is.

Let us start with the main claim of the TeX ecosystem. Its self-understanding is that the output quality of LaTeX exceeds that of alternative programs, especially MS Word. The interesting point is that no measure is given for judging MS Word vs. LaTeX. To make the situation more concrete, let us take a closer look at a PDF file generated with LaTeX.

The surprising finding is that such a LaTeX PDF file doesn't contain PDF tags by default, and the file doesn't use the standard PostScript fonts such as Times and Helvetica. And last but not least, it is hard to convert a pdflatex file back into HTML or to have it read aloud with the JAWS screen reader. In abstract terms, the PDF file created by LaTeX has almost no accessibility. And there is a reason for this unusual behavior.

First it should be mentioned that this problem can't be fixed simply by setting a certain parameter or adding a new LaTeX package. It has to do with the self-understanding of LaTeX that PDF documents are not accessible. The reason is that LaTeX is a sort of advanced printer driver. Its main purpose is to generate a fixed page, much like a bitmap picture such as a TIFF image, with a well-defined size and a well-defined position for every glyph. The page cannot be reflowed to another size or converted easily into another format; it is static.

This behavior can be explained by the origin of LaTeX. In the late 1970s TeX, the engine underneath LaTeX, was a pre-processor for offset printing devices. These machines need an image as input, and the objective is to print this image in a large number of copies. This makes LaTeX a great tool for creating newspapers and printed journals, but at the same time a poor choice for creating office documents or HTML pages.

Office documents and HTML files operate with different assumptions. They do not assume a fixed-size A4 page as the target output; the assumption is that each user prefers a different size. The same HTML file is rendered on a smartphone display, can be printed on a US Letter page, or is rendered on a desktop screen. This kind of flexibility is not available with LaTeX.

The LaTeX community ignores the problem. Its users assume that there is no need to have a LaTeX file read aloud in JAWS, and that every PDF file gets printed. This assumption worked fine in the 1980s, but it produces a reality gap in the 2020s. Most internet traffic isn't generated by desktop users; smartphones are the preferred display devices. In addition, it is very important that a PDF document can be converted into other formats like HTML, because users like to render the information on their own.

The only thing LaTeX does really well is to provide a static image which contains justified text. It looks as if it was scanned from a book printed in the 1960s, and the LaTeX community assumes that this format is the only valid layout.

LaTeX-free text editing

 

The LaTeX typesetting system has been used in the past for creating all sorts of academic texts. Its advantages are that it separates the content from the layout and that the output quality of the PDF document is high. A lot of external software has been created around the TeX ecosystem, such as LyX, TeXstudio and open source fonts, which can simplify the text creation workflow, especially for larger documents.

Apart from LaTeX there are some alternatives available. The Markdown format especially has the potential to replace an existing LaTeX workflow with a thinner alternative. The screenshot shows how a text file is edited. To make the sections visible, the gedit program was extended with an outline plugin which allows jumping to each section in the text. Combined with the built-in spell checking, this emulates the standard LyX editor very well. Even though the document format is not LaTeX, the workflow shares many similarities. It is very easy to create longer documents.