October 14, 2021

Reinventing LaTeX

 

The current LaTeX project is a mess. The LuaTeX implementation that ships with most Linux distributions consists of around 1 million lines of code in total. About 50% of the code is written in C, the rest in other languages such as shell script and C#. The unsolved question is how to reinvent the well-known LaTeX software.
Some projects have tried to do so in the past, namely “lout” and “rinohtype”. The obvious difference is that these projects are much smaller. For example, “rinohtype” was announced as a Python project consisting of only 6500 lines of code. It is no wonder that such a small project doesn't meet the needs of today's publishers. So the question is how to do better.
A very basic LaTeX replacement has, similar to the gnuplot software, a command interpreter at its core. That means that after starting the binary, the user gets a command prompt in which commands can be entered. Each command affects how the text is rendered on the screen. For example, after entering the command “fontsize 10”, the document shown on the screen is redrawn in the new font size. The idea is that the user can try out different options at the command prompt and, if the result looks fine, copy the commands into the original file.
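The prompt described above can be sketched in a few lines of Python. This is only an illustration under assumptions: the class name `Session`, the default font size of 12 and the rule that every command takes exactly one whole-number argument are all invented here, and no real rendering takes place.

```python
class Session:
    """Holds layout settings and a log of the commands that were accepted."""

    def __init__(self):
        self.settings = {"fontsize": 12}  # assumed default, for illustration only
        self.history = []  # accepted commands, ready to be copied into the file

    def execute(self, line):
        """Parse one prompt line such as 'fontsize 10' and apply it."""
        parts = line.split()
        if len(parts) != 2 or parts[0] not in self.settings:
            return f"unknown command: {line!r}"
        name, value = parts
        if not value.isdigit():
            return f"{name} expects a whole number, got {value!r}"
        self.settings[name] = int(value)
        self.history.append(line.strip())
        return f"{name} set to {value}"


session = Session()
print(session.execute("fontsize 10"))  # accepted and stored
print(session.execute("colour red"))   # rejected, not part of the vocabulary
print(session.history)                 # only the accepted command is kept
```

A real tool would redraw the preview after each accepted command. The `history` list stands in for the step where the tested commands are copied back into the original file.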
The only problem with this idea is that it is very complicated to write a parser that accepts thousands of different commands. Typesetting is a very complex field, and there is a need to fine-tune every detail. The existing LaTeX project is so big because it fulfills many requirements. Chances are high that even a reprogrammed LaTeX will consist of one million lines of code.
 
**Command line interpreter**
The basic functionality of a LaTeX-like program is to parse a mixture of plain text and formatting commands. The typical hello world example for a LaTeX file is “Hello world \TeX”. The command after the backslash triggers a certain rendering action. A LaTeX document is mainly a long list of commands that start with a backslash, from which the output is rendered.
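Splitting such input into plain text and commands could look roughly like the following Python sketch. The function name `tokenize` and the token format are assumptions made for this example, and the rule "a command is a backslash followed by letters" is a strong simplification of what TeX actually accepts.

```python
import re

# A command is a backslash followed by letters; everything else is plain text.
# Simplification: a backslash that is not followed by a letter is skipped.
TOKEN = re.compile(r"\\([A-Za-z]+)|([^\\]+)")


def tokenize(source):
    """Return a list of ('text', ...) and ('command', ...) pairs."""
    tokens = []
    for match in TOKEN.finditer(source):
        command, text = match.groups()
        if command is not None:
            tokens.append(("command", command))
        else:
            tokens.append(("text", text))
    return tokens


print(tokenize(r"Hello world \TeX"))
# [('text', 'Hello world '), ('command', 'TeX')]
```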
The Overleaf project shows very well what the idea is. The user can copy and paste the input text into the left window, and the output is rendered in the right window. The assumption is that this principle is a powerful idea and should be implemented in a potential LaTeX replacement.
The assumed project would start with a small vocabulary of only 100 possible commands, and the idea is then to program many more. Formatting complex documents is only possible if 1000 or more LaTeX commands are parsed and rendered into graphical output. The idea is that the layout engine becomes more powerful as more commands are provided.
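One way to let the vocabulary grow from 100 to 1000 commands without touching the parser is a lookup table of handlers. The sketch below is hypothetical: the names `COMMANDS`, `command` and `render` are invented for illustration, and the "rendering" is plain text instead of graphical output.

```python
COMMANDS = {}  # command name -> handler function


def command(name):
    """Decorator that registers a handler under the given command name."""
    def register(func):
        COMMANDS[name] = func
        return func
    return register


@command("TeX")
def tex_logo():
    return "TeX"


@command("LaTeX")
def latex_logo():
    return "LaTeX"


def render(tokens):
    """Turn ('text', ...) and ('command', ...) pairs into an output string."""
    out = []
    for kind, value in tokens:
        if kind == "text":
            out.append(value)
        elif value in COMMANDS:
            out.append(COMMANDS[value]())
        else:
            out.append("[unknown \\" + value + "]")
    return "".join(out)


print(render([("text", "Hello world "), ("command", "TeX")]))
# Hello world TeX
```

Adding a command then means adding one decorated function. The parser and the `render` loop stay unchanged, which matters once the command list becomes long.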
Contrary to a common assumption, this kind of interaction isn't difficult to master, because it is possible to write a command reference similar to the one known from gnuplot. That means that the documentation provides a hierarchical list of scenarios, each of which shows the usage of a command. The user has to find the needed command in the documentation and can copy and paste the example into the window. This makes it possible to create longer documents.
The underlying idea behind LaTeX is a text window into which the user can insert a short snippet. This text snippet is converted into an image, which is rendered to a PDF file. An entire paper or document consists of the text plus the LaTeX commands that start with a backslash. The open question is which commands are needed and how many different commands are useful.