October 20, 2021

Understanding the LaTeX ecosystem

 



Before LaTeX can be reprogrammed from scratch, it is necessary to understand what LaTeX actually does. In most cases the goal is to produce a nicely formatted PDF document as output, but a PostScript file is acceptable as well, because PostScript can easily be converted into PDF with the Ghostscript tool (for example, with its ps2pdf wrapper).
The remaining question is how to convert a LaTeX file into PostScript. Let us first take a closer look at what PostScript is. PostScript describes, at a low level, the position of every element in a document. A simple two-column page is given as an example:
%!PS-Adobe-3.0
/Times-Roman findfont
9 scalefont setfont

/column1 {
0.5 setlinewidth
40 300 250 400 rectstroke
40 690 moveto 
(Hello World!) show
40 670 moveto 
(This is an example text to demonstrate how the Postscript language) show
40 660 moveto 
(is working internally. This is an example text to demonstrate how) show
40 650 moveto 
(the Postscript language is working internally.) show
} def

/column2 {
0.5 setlinewidth
310 300 250 400 rectstroke
} def

%------main---------
column1
column2

showpage
The interesting point is that PostScript has no line-wrapping command; every line of text has to be provided individually, together with its own position. What a LaTeX compiler does, in effect, is generate such a PostScript file automatically, because creating a PostScript document by hand takes too long.
It seems that a LaTeX-like processor needs to fulfill the following requirements:
1. convert a LaTeX file into a PostScript file
2. use boxes to place the information on the page
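To make the missing line wrapping concrete, here is a minimal Python sketch of what such a generator could look like. It is not how LaTeX breaks lines: it wraps greedily at a fixed number of characters (the value 66 is an assumption that roughly fits the 9 pt font and the 250 pt column above) instead of using real font metrics. The function names wrap_to_postscript and escape_ps are made up for this illustration.

```python
import textwrap


def escape_ps(text):
    """Escape the characters that are special inside a PostScript string."""
    return text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")


def wrap_to_postscript(text, x=40, y_top=670, leading=10, max_chars=66):
    """Break text into lines and emit one moveto/show pair per line.

    max_chars is a crude stand-in for real font metrics (assumption).
    leading is the vertical distance between two baselines in points.
    """
    commands = []
    for i, line in enumerate(textwrap.wrap(text, width=max_chars)):
        y = y_top - i * leading  # each new line sits one leading lower
        commands.append(f"{x} {y} moveto")
        commands.append(f"({escape_ps(line)}) show")
    return "\n".join(commands)


paragraph = ("This is an example text to demonstrate how the Postscript "
             "language is working internally. ") * 2
print(wrap_to_postscript(paragraph))
```

The printed commands can be pasted into the body of column1 in place of the hand-written moveto/show pairs.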
Let us go into the details. In the PostScript file, two boxes are defined. As written, both rectstroke commands take absolute page coordinates, and the (0,0) position is at the bottom left. So the first question to answer is: where exactly is the x/y position of a box? Right now, only LaTeX knows this. It depends on the text and a bit of mathematical calculation. But in theory a software program can do these calculations automatically, so that the boxes for a longer document are created within milliseconds.
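The calculation for the two column boxes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page width of 600 points, the margin of 40 and the gutter of 20 are chosen so that the result reproduces the x positions 40 and 310 from the example (a real A4 page is roughly 595 points wide), and the names column_boxes and boxes_to_postscript are hypothetical.

```python
def column_boxes(page_width=600, margin=40, gutter=20, columns=2,
                 y_bottom=300, height=400):
    """Return (x, y, width, height) for each column box, in points.

    All defaults are illustrative assumptions picked to match the
    hand-written example; the origin (0,0) is the bottom left corner.
    """
    usable = page_width - 2 * margin - (columns - 1) * gutter
    width = usable / columns
    return [(margin + i * (width + gutter), y_bottom, width, height)
            for i in range(columns)]


def boxes_to_postscript(boxes, line_width=0.5):
    """Emit one rectstroke command per box."""
    lines = [f"{line_width} setlinewidth"]
    for x, y, w, h in boxes:
        lines.append(f"{x:g} {y:g} {w:g} {h:g} rectstroke")
    return "\n".join(lines)


print(boxes_to_postscript(column_boxes()))
```

Changing columns, margin or gutter recomputes every position, which is the part that is tedious to do by hand.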