May 04, 2023

LaTeX revisited

LaTeX is known as the standard tool for academic publishing, and plenty of online forums and external software are available in its ecosystem. The main problem is that the promise of LaTeX does not match reality, and this blog post explains in detail what the problem with LaTeX is.

Let us start with the main claim of the TeX ecosystem. Its self-image is that the output quality of LaTeX exceeds that of any alternative program, especially MS Word. The interesting point is that no measure is given for how to judge MS Word against LaTeX. To make the discussion more concrete, let us take a closer look at a PDF file generated with LaTeX.

The surprising finding is that such a LaTeX PDF file contains no PDF tags, and it does not use the default PostScript fonts such as Times and Helvetica. And last but not least, it is impossible to convert a pdflatex file back into HTML or to have it read aloud with the JAWS screen reader. In abstract terms, the LaTeX-generated PDF file has no accessibility at all. And there is a reason for this unusual behavior.
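The claim about missing tags can be checked roughly with a few lines of Python. The sketch below is only a heuristic and not a PDF parser: it searches the raw bytes of a file for the names /StructTreeRoot and /MarkInfo, which a tagged PDF declares in its document catalog. The function names are made up for this post. A hit is a strong hint that the file is tagged, but a miss is not a proof, because the catalog can sit inside a compressed object stream where a plain byte search will not see it.

```python
import sys

# Names that a tagged PDF declares in its document catalog.
TAG_MARKERS = (b"/StructTreeRoot", b"/MarkInfo")


def find_tag_markers(pdf_bytes):
    """Return the marker names that occur literally in the raw PDF bytes."""
    return [m.decode("ascii") for m in TAG_MARKERS if m in pdf_bytes]


def looks_tagged(pdf_bytes):
    """Heuristic: True if at least one tagged-PDF marker is visible."""
    return bool(find_tag_markers(pdf_bytes))


if __name__ == "__main__":
    # Usage: python check_tags.py file1.pdf file2.pdf ...
    for path in sys.argv[1:]:
        with open(path, "rb") as fh:
            data = fh.read()
        found = find_tag_markers(data)
        if found:
            print(path + ": markers found: " + ", ".join(found))
        else:
            print(path + ": no markers visible (untagged, or catalog is compressed)")
```

Running the script on a pdflatex output file and on a PDF exported from an office suite gives a first impression of the difference. For a reliable answer, a real PDF library or a validator is the better tool.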

First of all, this problem cannot be fixed by simply setting a certain parameter or adding a new LaTeX package. Rather, it follows from the self-image of LaTeX that its PDF documents are not accessible. The reason is that LaTeX is some sort of advanced printer driver. Its main purpose is to produce a fixed page, much like a bitmap picture such as a TIFF image, with a well-defined size and a well-defined position for every element. The content cannot be reflowed, rescaled to a different page size or easily converted into another format; the page is static.

This behavior can be explained by the origin of LaTeX. Its underlying engine, TeX, was created in the late 1970s as a preprocessor for offset printing devices. These machines need an image as input, and the objective is to print this image in a large number of copies. This makes LaTeX a great tool for creating newspapers and printed journals, but at the same time it is a poor choice for creating office documents or HTML pages.

Office documents and HTML files operate under different assumptions. They do not assume a fixed-size A4 page as the target output; instead, the assumption is that each user prefers a different size. The same HTML file can be rendered on a smartphone display, printed on a US Letter page or rendered on a desktop screen. This kind of flexibility is not available with LaTeX.

The LaTeX community ignores the problem. Its users assume that there is no need to have a LaTeX file read aloud in JAWS, and they assume that every PDF file gets printed. This assumption worked fine in the 1980s, but it produces a reality gap in the 2020s. Most internet traffic is no longer generated by desktop users; smartphones are the preferred display devices. In addition, it is very important that a PDF document can be converted into other formats like HTML, because users like to render the information in their own way.

The only thing LaTeX can do really well is to provide a static image consisting of justified text. It looks as if it was scanned from a book created in the 1960s, and the LaTeX community assumes that this format is the only valid layout.