October 18, 2021

Understanding the core idea behind LaTeX

 

LaTeX is often described as the most powerful typesetting system available. The open question is which of its principles are future-proof and which are not. Let us first describe some elements of TeX that can be ignored. The first problem is that the ecosystem is huge. The TeX Live distribution is not a single binary but a collection of hundreds of different programs, amounting to millions of lines of code written over decades. It is hard to tell which parts of that code are still relevant.
The second problem with TeX is that the original program, “tex.web”, was written with literate programming in mind, using the WEB system, which is based on Pascal. Both ideas might be interesting as separate projects, but they have nothing to do with typesetting.
After this negative description, let us look for the elements that work well in TeX. The first is the idea of using a markup language to create the layout. LaTeX provides around 250 commands, which are rendered into a PDF document. This concept is similar to what gnuplot does, where a plain-text command file describes the output, and it is very powerful.
The second interesting feature of TeX is that the rendering mechanism works with boxes. A box is a rectangular 2D region that can contain a single character, a paragraph, or an entire page, and boxes can be nested inside other boxes. The renderer arranges these boxes in the PDF document.
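To make the box idea more concrete, here is a minimal sketch in Python. The names Box, hbox and vbox are borrowed loosely from TeX terminology, but the classes and functions themselves are my own assumptions for illustration and not how TeX is implemented. Each box has a width, a height above the baseline and a depth below it. Glue and line spacing are left out.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Box:
    """A rectangle with TeX-style dimensions (arbitrary units, hypothetical class)."""
    width: float
    height: float          # extent above the baseline
    depth: float = 0.0     # extent below the baseline
    children: List["Box"] = field(default_factory=list)


def hbox(children):
    """Place boxes side by side on a shared baseline."""
    return Box(
        width=sum(c.width for c in children),
        height=max((c.height for c in children), default=0.0),
        depth=max((c.depth for c in children), default=0.0),
        children=list(children),
    )


def vbox(children):
    """Stack boxes vertically; the baseline is taken from the last child."""
    total = sum(c.height + c.depth for c in children)
    depth = children[-1].depth if children else 0.0
    return Box(
        width=max((c.width for c in children), default=0.0),
        height=total - depth,
        depth=depth,
        children=list(children),
    )


# Characters form a line, lines form a page.
a = Box(width=5, height=7)             # a letter without a descender
g = Box(width=5, height=5, depth=2)    # a letter with a descender
line = hbox([a, g])
page = vbox([line, line])
print(line.width, line.height, line.depth)   # 10 7 2
print(page.width, page.height, page.depth)   # 10 16 2
```

The same Box type is used for a character, a line and a page, which is the point of the box model: the renderer only needs to know the outer dimensions of whatever it is placing.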
Reimplementing LaTeX would require software that understands around 250 different TeX commands and uses boxes to create a .png image or a PDF file. The LaTeX markup is the input, it gets converted into boxes, and the output is then generated as an image file. So the open question is how to implement such software.
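The first half of that pipeline, from markup to boxes, could be sketched roughly as follows. This is only an illustration under strong assumptions: the tokenizer knows nothing about TeX's real category codes, only the single command \textbf is handled, unknown commands are ignored, and every character is assumed to be 6 units wide. The function names tokenize and to_boxes are hypothetical.

```python
import re

# A command is a backslash followed by letters or by one other character.
TOKEN_RE = re.compile(r"\\([A-Za-z]+|.)|([{}])|([^\\{}]+)")
CHAR_WIDTH = 6  # assumed fixed width per character


def tokenize(source):
    """Split markup into (kind, value) pairs: command, brace or text."""
    tokens = []
    for m in TOKEN_RE.finditer(source):
        if m.group(1) is not None:
            tokens.append(("command", m.group(1)))
        elif m.group(2) is not None:
            tokens.append(("brace", m.group(2)))
        else:
            tokens.append(("text", m.group(3)))
    return tokens


def to_boxes(tokens):
    """Turn tokens into one box per word, tracking bold inside brace groups."""
    boxes = []
    bold = False
    saved = []       # style saved at each open brace
    pending = None   # style requested by a command, applied at the next "{"
    for kind, value in tokens:
        if kind == "command":
            if value == "textbf":
                pending = True
            # every other command is ignored in this sketch
        elif kind == "brace":
            if value == "{":
                saved.append(bold)
                if pending is not None:
                    bold, pending = pending, None
            elif saved:
                bold = saved.pop()
        else:
            for word in value.split():
                boxes.append(
                    {"text": word, "width": len(word) * CHAR_WIDTH, "bold": bold}
                )
    return boxes


boxes = to_boxes(tokenize(r"Hello \textbf{big world} again"))
for box in boxes:
    print(box)
```

A real implementation would replace the single if statement with a table that maps each of the supported commands to a handler, so that new commands can be added without touching the loop.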
One possible approach would be an interactive prototype with an input window on the left. The user types in a paragraph, and the rendered boxes are shown as an image on the right. The underlying renderer creates new boxes and decides where to place them on the screen.
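The placement decision can be sketched with a simple greedy rule: put each word box on the current line until it no longer fits, then start a new line. This is deliberately not the line-breaking algorithm TeX uses, which optimizes over the whole paragraph. The functions layout and to_svg, the fixed character width and the SVG output are all assumptions chosen to keep the example short.

```python
def layout(words, line_width, char_width=6, space=6, line_height=12):
    """Greedy first-fit placement: returns (word, x, y, width) per box."""
    placed = []
    x, y = 0, 0
    for word in words:
        w = len(word) * char_width
        # Start a new line if the box would overflow (unless the line is empty).
        if x > 0 and x + w > line_width:
            x = 0
            y += line_height
        placed.append((word, x, y, w))
        x += w + space
    return placed


def to_svg(placed, line_width, line_height=12):
    """Draw each placed box as an outlined rectangle in an SVG string."""
    total_height = max((y for _, _, y, _ in placed), default=0) + line_height
    rects = "".join(
        f'<rect x="{x}" y="{y}" width="{w}" height="{line_height}" '
        f'fill="none" stroke="black"/>'
        for _, x, y, w in placed
    )
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{line_width}" height="{total_height}">{rects}</svg>'
    )


placed = layout("the quick brown fox".split(), line_width=60)
for item in placed:
    print(item)
# ('the', 0, 0, 18)
# ('quick', 24, 0, 30)
# ('brown', 0, 12, 30)
# ('fox', 36, 12, 18)
svg = to_svg(placed, line_width=60)
```

In the interactive prototype, the left window would feed its text into layout on every keystroke, and the right window would display the resulting SVG.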