June 07, 2019

Programming with libraries


The classical comparison between programming languages focuses on syntax. Python programmers love their indentation, while C# experts are fascinated by the virtual machine. The more important criterion for selecting a language, however, is its libraries. A library is often treated as a plugin that extends a language, but in reality the library is the core of each language.
Let us take a look at the well-known Rosetta Code website, which compares solutions to the same task across many languages. When a language needs only 5 lines of code to solve a problem, the reason is usually that an import / include statement at the top of the source code references an existing library. If no library is available, the programmer has to write all the code by hand. We can say that the availability of a large and efficient library is the most important criterion for selecting a language.
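To make the effect of a single import visible, here is a minimal sketch in Python. It solves the same small task (counting words) twice: once by hand and once with `collections.Counter` from the standard library. The sample text and the function names are only illustrative assumptions, they are not taken from Rosetta Code.

```python
from collections import Counter

# Illustrative sample text (an assumption, not from any real task page).
TEXT = "the library is the core of the language"


def count_words_by_hand(text):
    """Count words with a plain dictionary and an explicit loop."""
    counts = {}
    for word in text.split():
        if word in counts:
            counts[word] += 1
        else:
            counts[word] = 1
    return counts


def count_words_with_library(text):
    """The same task, delegated to a ready-made library class."""
    return Counter(text.split())


if __name__ == "__main__":
    print(count_words_by_hand(TEXT))
    print(count_words_with_library(TEXT).most_common(1))
```

The second function is a one-liner only because somebody else already wrote and tested `Counter`.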
Let us take a look at the famous C++ language. Many programmers argue that C++ is the queen of all languages. And they are right, C++ is a well-designed, powerful language. But what happens if it is not allowed to use existing C/C++ libraries? First, the source code will read very differently, because the programmer has to write all the details by hand. Second, doing so costs a lot of time. Without libraries, the C++ language becomes practically unusable.
It is a paradox that libraries are not researched very well. In most languages, like C#, Java or Python, the libraries make up the largest part of the language. They are more important than the compiler or the runtime environment. And if a piece of software runs faster, in many cases an updated library is the reason. One reason why libraries are treated as a minor issue is that many of them are part of the operating system, and the source code of a commercial OS is proprietary and protected by copyright. Whoever controls the library controls the entire ecosystem.
The best-known case in which the system libraries can be read in source code is Linux. The glibc library is licensed under the LGPL, which means that the user can use the code and can also look inside it. In the case of Apple and Microsoft the situation is very different. The so-called “API documentation” describes only the interface, not the source code of the library. Libraries like kernel32.dll or gdi32.dll are the core of the Windows operating system, and user applications are built on top of them. They provide the functionality of the operating system, yet their source code is not public. We can say that the collective misunderstanding of OS libraries is not a technical issue but an economic one. How a library is licensed influences how well it is documented in books.
To understand the importance of libraries we have to go back to the early days of microcomputers, a time in which hardly any libraries were available. In the early 1980s home computers were often programmed in assembly language. This means that the only available commands were the ones built into the CPU, and apart from instructions such as LDA and PUSH there were no higher-level commands. An operating system like CP/M, which dates back to the mid-1970s, changed that. The advantage of CP/M was that it provided a simple set of library functions, for example to print a character to the screen. In addition, C compilers became available for these machines and brought their own library routines. This simplified programming drastically.
In later operating systems and compilers, like MS-DOS and the Borland compilers, the advantage of libraries became obvious. Today it is possible to write a GUI application which includes a database and animated graphics in under 1000 lines of code. This is not because programmers are experts in assembly language, but because of the libraries provided by the operating system and the programming language. How many lines of code are needed to program a 3D game in pure assembly language without any library? Lots of code. I would guess that at least 1 million lines are needed, and the resulting software would be buggy and not very interesting.
But what exactly is a library? A library is a list of available commands. Its main feature is that the list is very long. Even a small library like glibc provides many thousands of new words, very similar to a dictionary in a natural language. All the example programs on Rosetta Code use one or more existing libraries. A hello world GUI program in Java starts with an “import javax.swing.JButton;” statement, while the same program in C# needs a “using System.Windows.Forms;” to work. But how does the hello world program look in pure assembly language? In the easiest case, the assembly program is also allowed to use a library. Then the source code is compact, and the window can be drawn in around 100 lines of code. But if assembly is used without an existing library, the code becomes very complex.
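The idea of a library as a long list of words can be checked directly in a language with introspection. The following sketch, in Python, lists the public names that a few standard library modules export. The helper name `public_names` is my own invention, and the exact counts depend on the Python version and the operating system, so no numbers are promised here.

```python
import math
import os
import string


def public_names(module):
    """Return the sorted names a module exports, without private ones."""
    return sorted(name for name in dir(module) if not name.startswith("_"))


if __name__ == "__main__":
    for module in (math, string, os):
        names = public_names(module)
        # Print the size of the "dictionary" plus a few sample words.
        print(f"{module.__name__}: {len(names)} public names, e.g. {names[:5]}")
```

Each printed name is one "word" that the programmer gets for free after the import.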
What modern programming languages like C++ provide to the user is more than a compiler that translates source code into machine code. C++ is first and foremost a library collection which gives the programmer access to thousands of existing frameworks. Writing, improving and documenting the underlying libraries is the main task in modern programming. The reason why some languages, like Python, are used more frequently than others can be explained by the existence of a large library collection.
Assembly
Suppose we are in the 1980s and no software has been written yet. Only the CPU is available. The best practice for creating an application or game is to first create a library in assembly language. We need routines for handling strings, for drawing lines and for accessing the floppy disk. The problem with assembly language is that the same library looks different for the 6510 and for an Intel CPU, so we have to write the code twice. The alternative is to write the library once in C and compile it for every CPU type.
This last idea (using C as the language for creating a library) was the preferred method in the 1980s. It results in a fast development cycle. Once the library is ready, the application can be programmed on top of it.
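The "library first, application second" workflow can be sketched in a few lines. The example below is written in Python rather than in C or assembly, purely for readability, and it draws on a grid of characters instead of real video memory. The library part contains a line routine based on Bresenham's algorithm; all function names are hypothetical.

```python
# --- library part: written once, reused by every application ---

def make_canvas(width, height, fill="."):
    """Create a character grid that stands in for the screen."""
    return [[fill] * width for _ in range(height)]


def draw_line(canvas, x0, y0, x1, y1, ink="#"):
    """Draw a line with Bresenham's integer-only algorithm."""
    dx = abs(x1 - x0)
    dy = -abs(y1 - y0)
    step_x = 1 if x0 < x1 else -1
    step_y = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        canvas[y0][x0] = ink
        if x0 == x1 and y0 == y1:
            break
        doubled = 2 * err
        if doubled >= dy:
            err += dy
            x0 += step_x
        if doubled <= dx:
            err += dx
            y0 += step_y


def render(canvas):
    """Turn the grid into a printable string."""
    return "\n".join("".join(row) for row in canvas)


# --- application part: short, because the library does the work ---

def draw_box(canvas, left, top, right, bottom):
    """Draw a rectangle outline out of four library line calls."""
    draw_line(canvas, left, top, right, top)
    draw_line(canvas, right, top, right, bottom)
    draw_line(canvas, right, bottom, left, bottom)
    draw_line(canvas, left, bottom, left, top)


if __name__ == "__main__":
    screen = make_canvas(12, 5)
    draw_box(screen, 1, 1, 10, 3)
    print(render(screen))
```

The application function never touches single pixels. It only combines words that the library already provides, which is the point of building the library first.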
Let us look at how libraries were actually used in the MS-DOS era. The DJGPP compiler shipped with a large C library by default, http://www.delorie.com/djgpp/doc/libc/ It consists of many useful functions, for example for calculating the sine, handling strings and getting the current date. The DJGPP library can be seen as an early, minimalistic version of today's glibc. It was programmed and documented to make life easier. Combined with a 2D graphics library, it makes it possible to program a small MS-DOS game without much effort. This is very similar to what modern programming is about.
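The three kinds of functions mentioned above (sine, string handling, current date) exist in nearly every standard library. The sketch below uses the Python equivalents, not the DJGPP C functions themselves, to show how little application code remains once they are available. The function `status_line` and its output format are made up for this illustration.

```python
import math
from datetime import date


def status_line(player, angle_deg, today):
    """Build a status text from three standard library facilities."""
    sine = math.sin(math.radians(angle_deg))   # math routine
    name = player.upper()                      # string handling
    stamp = today.isoformat()                  # date handling
    return f"{name} | sin({angle_deg}) = {sine:.3f} | {stamp}"


if __name__ == "__main__":
    # date.today() asks the operating system for the current date.
    print(status_line("player one", 30, date.today()))
```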
The first thing to do before an application can be programmed is to look for a library. This is true for all programming languages, no matter whether somebody likes to program in assembly, C, Pascal, C# or whatever. If a library is already there, programming becomes much easier.
Programming libraries in Forth
It is sometimes asked why C and Python became successful in mainstream computing while Forth was ignored and forgotten by most programmers. In the early 1980s the situation wasn't that clear. At that time Forth was a viable alternative to normal C programming, but later the difference became more obvious. The reason is not Forth vs C as a programming language; the answer has to do with the number of C libraries vs Forth libraries.
Let us take a look at one of the few general-purpose Forth libraries, the Forth Foundation Library (FFL). It is comparable with the C standard library and provides, for example, a module for handling strings. The concept is very similar to what C programmers are using. With one command it is possible to create a new string, with another to compare two different strings. The most interesting point about the FFL is that most Forth tutorials do not mention the library at all. It is treated as something extra to the language. But is a library really a plugin that extends a language, or is a library the core of a language?
Let us try to install the FFL on the hard drive. Compared to Forth programs in general, its size is very large. Around 719 kB are occupied on the hard drive for the Forth source code alone, which provides all the modules. The string module str.fs is around 19 kB in size, and it provides only a minimal set of functions. In general, the Forth Foundation Library looks like a shrunken version of the C standard library. Sure, it is possible to write a program with the FFL, but it is not very comfortable. If the idea is to program a game, the library has to be extended.
What I want to explain is that the difference between Forth and C is smaller than most people expect. The programmer is not using a certain language syntax; he is programming on top of a library. This is especially true for Forth-like languages. Perhaps we should take a detailed look at the Forth Foundation Library. What the user gets after downloading the source code is a large number of modules. He can parse HTML files, manipulate text strings, and hash a string with the SHA-1 algorithm. It is not very complicated to build a small program on top of the FFL which combines these modules into a useful script. This is very similar to what Python programmers are doing.
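For comparison, here is roughly what such a combined script looks like on the Python side. This is a sketch with the Python standard library (`html.parser` and `hashlib`), not with the FFL, and the names `TextExtractor` and `page_fingerprint` are hypothetical. It parses a piece of HTML, cleans up the text and computes its SHA-1 hash.

```python
import hashlib
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect only the text between the HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def page_fingerprint(html_source):
    """Return the visible text of a page and the SHA-1 hash of that text."""
    parser = TextExtractor()
    parser.feed(html_source)
    parser.close()
    # String handling: collapse all runs of whitespace into single spaces.
    text = " ".join("".join(parser.chunks).split())
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    return text, digest


if __name__ == "__main__":
    sample = "<html><body><h1>Hello</h1> <p>library   world</p></body></html>"
    text, digest = page_fingerprint(sample)
    print(text)
    print(digest)
```

Three modules from a library (HTML parsing, string handling, hashing) are glued together by about ten lines of the programmer's own code. A Forth script on top of the FFL would follow the same pattern with different words.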
And if the Forth library is extended with more primitives, for example for drawing a GUI window, the resulting applications become more powerful. But before a user can take advantage of a library, somebody has to program it first. The Factor language (factorcode.org) can be understood as a Forth system with a built-in library. The disadvantage is that it is no longer a minimalist language: Factor takes 30 MB on the hard drive. Most of that is needed for the large-scale library.
The reason why modern mainstream languages like Python and C# are loved by programmers, and the reason why they need so much space on the hard drive, is their extensive libraries. The power of a language is located in the library. This is especially true for Forth-like languages. The funny fact is that it is not possible to make a library smaller. The FFL is programmed very efficiently, but at the same time it needs a lot of resources.
The question is whether it is possible to program a useful application without using a library. IMHO it is not. The only alternative is to restrict oneself to toy problems. If the aim is to write a powerful, large application which includes a GUI, there is automatically a need for a complex standard library. The difference between Forth and C is smaller than it looks at first glance.
The size of the Forth Foundation Library can be measured very precisely. It consists of 719 kB of source code and provides 1677 new words. Each word is a function provided by one of the modules in the library. The FFL is not very powerful. It provides only a minimal set of features, comparable to a stripped-down version of the C library. If the aim is to write a large-scale, complex application, the first step would be to extend the FFL with new words. A reasonably complete environment would consist of 500k words, which would occupy about 210 MB. Such a mega-Forth library would not be small and minimalistic; it would look very similar to what normal C programmers are using.
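The 210 MB figure follows from a simple extrapolation, which the short Python calculation below reproduces. It assumes that the source size grows linearly with the number of words and that 1 MB equals 1024 kB; both are simplifications made only for this estimate.

```python
# Measured values for the Forth Foundation Library, taken from the text.
FFL_SOURCE_KB = 719
FFL_WORDS = 1677

# Size of the hypothetical "mega-Forth" library, also from the text.
TARGET_WORDS = 500_000

# Assumption: every word costs the same amount of source code on average.
kb_per_word = FFL_SOURCE_KB / FFL_WORDS
bytes_per_word = kb_per_word * 1024

# Assumption: 1 MB = 1024 kB.
projected_mb = kb_per_word * TARGET_WORDS / 1024

if __name__ == "__main__":
    print(f"average source size per word: {bytes_per_word:.0f} bytes")
    print(f"projected size for {TARGET_WORDS} words: {projected_mb:.0f} MB")
```

With these assumptions one word costs a bit more than 400 bytes of source code, and 500k words land slightly below 210 MB, which matches the rounded number above.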